1. Purpose
The canonical data model exists to prevent schema drift, duplicated semantics, and ungoverned entity growth. All modules, APIs, ingestion paths, tests, and compliance controls must conform to this model.
- No team may invent alternative entity meanings without explicit approval.
- No UI surface may persist state outside the canonical model without documentation and approval.
- No test may assert behavior against hidden schema assumptions not defined here.
2. Modeling Principles
- One root representation per user-triggered input: no duplicate root nodes for the same first-party foreground capture.
- Provenance must be preserved: source type cannot be flattened for convenience.
- Derived intelligence is distinct from user-authored thought: system-generated inference must remain distinguishable.
- Lifecycle is explicit: entities must have valid creation, update, and archival rules.
- Deletion must be possible: data sovereignty requires that user-owned content can be removed cleanly.
3. Entity Map
User-Origin Entities
- MemoryEntry (root thought, imported item, explicit user log)
- PersonaProfile
- SessionState
System-Origin Entities
- AevumEvent
- ProcessedEvent
- GraveyardEvent
- InsightNode
- StoredPrompt
4. Core Entities
4.1 MemoryEntry
The canonical record of captured or system-derived memory content. For user-triggered capture, this is the root persisted entry.
{
"id": "UUID",
"semanticType": "enum",
"value": "string",
"source": "MemorySource",
"confidence": "Double",
"createdAt": "Date",
"updatedAt": "Date",
"qjlHash": "UInt64?",
"semanticHashString": "String?",
"sourceEventIDs": ["UUID"],
"linkTargetID": "UUID?"
}
Required Rules
- User-triggered foreground input must create exactly one root MemoryEntry.
- Derived or linked MemoryEntry records must not pretend to be root user input.
- `value` is human-readable content, not opaque transport state.
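As a minimal sketch, the schema above could be expressed as a Swift value type. The field names and types follow the schema. The `SemanticType` case and the `isRootUserInput` helper are illustrative assumptions, not part of the canonical model.

```swift
import Foundation

// Stored provenance values, taken from the Provenance Model tables.
enum MemorySource: String, Codable {
    case userTap, userVoice, systemInference
}

// Hypothetical placeholder: the schema says only that semanticType is an enum.
enum SemanticType: String, Codable {
    case unspecified
}

struct MemoryEntry: Codable {
    let id: UUID
    var semanticType: SemanticType
    var value: String                 // human-readable content, not transport state
    let source: MemorySource
    var confidence: Double
    let createdAt: Date
    var updatedAt: Date
    var qjlHash: UInt64?
    var semanticHashString: String?
    var sourceEventIDs: [UUID]
    var linkTargetID: UUID?

    // Assumption: root user input is user-sourced and has no link target.
    var isRootUserInput: Bool {
        source != .systemInference && linkTargetID == nil
    }
}

// Usage: a typed root capture and a derived entry that links back to it.
let now = Date()
let root = MemoryEntry(id: UUID(), semanticType: .unspecified, value: "Call the dentist",
                       source: .userTap, confidence: 1.0, createdAt: now, updatedAt: now,
                       qjlHash: nil, semanticHashString: nil, sourceEventIDs: [],
                       linkTargetID: nil)
let derived = MemoryEntry(id: UUID(), semanticType: .unspecified, value: "Health follow-up",
                          source: .systemInference, confidence: 0.6, createdAt: now,
                          updatedAt: now, qjlHash: nil, semanticHashString: nil,
                          sourceEventIDs: [], linkTargetID: root.id)
```

Keeping `id`, `source`, and `createdAt` immutable is one way to make it harder for a derived record to be relabeled as root user input after creation.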
4.2 PersonaProfile
The persisted user persona configuration and evolving cognitive profile baseline.
{
"id": "UUID",
"role": "String?",
"secondaryRole": "String?",
"personaType": "CognitivePersonaType",
"secondaryPersonaTypeRaw": "String?",
"hasCompletedPersonaSelection": "Bool",
"engagementBaseline": "Float",
"recurringThemes": ["String"],
"promptResponseScores": {"String": "Double"}
}
- `role` is the primary visible persona label.
- `secondaryRole` is optional.
- This entity influences companion initialization and grounded response behavior.
4.3 SessionState
The current or most recent interaction state used to shape continuity and response strategy.
{
"id": "UUID",
"lastActivityTime": "Date",
"currentFocus": "String?",
"engagementLevel": "Float",
"activeMode": "String?"
}
5. Supporting Entities
| Entity | Purpose | Notes |
|---|---|---|
| AevumEvent | Queue/event representation for ingestion, processing, and enrichment. | Supports pending/committed/error lifecycle. |
| ProcessedEvent | Durable record of completed event processing. | Supports auditability and idempotency. |
| GraveyardEvent | Rejected or invalid event payload archive. | Used for hard validation failures. |
| InsightNode | System-derived insight generated from validated memory patterns. | Must remain distinguishable from user input. |
| StoredPrompt | Prompt inventory used by proactive or guided companion flows. | Not user-authored memory. |
6. Provenance Model
Input Provenance
Aevum distinguishes where input came from before mapping it to a stored memory source.
| InputSource | Meaning | Canonical Use |
|---|---|---|
| text | User-authored typed input | Dashboard, onboarding, manual capture |
| voice | User-authored dictated/transcribed input | Dashboard voice, capture sheet voice, audio import output |
| importBatch | User-triggered batch import | Documents, OCR batches, archive ingest |
| system | Explicit system-level action/event | Habit logs, legacy initialization, internal operational entries |
| deferredEnrichment | Asynchronous post-root processing payload | Non-root enrichment only |
Stored Provenance
At persistence time, input provenance (InputSource) is mapped to canonical stored provenance (MemorySource) where a mapping applies.
| MemorySource | Meaning |
|---|---|
| userTap | User typed / directly authored text |
| userVoice | User voice/dictation/transcription root input |
| systemInference | System-generated or system-mapped operational record |
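One possible reading of the two tables is sketched below. The `text`, `voice`, and `system` mappings are inferred from the matching "Meaning" columns. The tables do not state a stored source for `importBatch` or `deferredEnrichment`, so this sketch returns `nil` for them rather than guessing. The function name `storedSource(for:)` is hypothetical.

```swift
// Input provenance values from the InputSource table.
enum InputSource: String, CaseIterable {
    case text, voice, importBatch, system, deferredEnrichment
}

// Stored provenance values from the MemorySource table.
enum MemorySource: String {
    case userTap, userVoice, systemInference
}

// Returns the stored provenance for an input provenance, or nil where
// the tables above do not state a mapping (assumption of this sketch).
func storedSource(for input: InputSource) -> MemorySource? {
    switch input {
    case .text:
        return .userTap
    case .voice:
        return .userVoice
    case .system:
        return .systemInference
    case .importBatch, .deferredEnrichment:
        return nil
    }
}
```

An exhaustive `switch` without a `default` branch means that adding a new `InputSource` case fails to compile until its mapping is decided, which keeps the mapping explicit.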
7. Lifecycle Rules
MemoryEntry Lifecycle
- Created as a root user entry or as a system entry
- Optionally linked to event or parent
- Optionally enriched by deferred processing
- Optionally merged or reinforced if semantically duplicate and eligible
- May be deleted under user data sovereignty rules
AevumEvent Lifecycle
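An event moves through these stages in order:
- Pending
- Dequeued
- Validated
- Committed or shunted to Graveyard
Required Rules
- Deferred enrichment events must never become fresh root capture events.
- Event processing must be idempotent: reprocessing an already processed event must have no additional effect.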
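The lifecycle and the idempotency rule above can be sketched as a small transition check plus a processed-ID ledger. This assumes the listed stages are strictly ordered and that Committed and Graveyard are terminal. `EventStage`, `canMove(to:)`, and `ProcessedLedger` are hypothetical names, with the ledger standing in for ProcessedEvent records.

```swift
import Foundation

// Stages from the AevumEvent Lifecycle list.
enum EventStage: Equatable {
    case pending, dequeued, validated, committed, graveyard

    // Assumption: stages are strictly ordered and both outcomes are terminal.
    func canMove(to next: EventStage) -> Bool {
        switch (self, next) {
        case (.pending, .dequeued),
             (.dequeued, .validated),
             (.validated, .committed),
             (.validated, .graveyard):
            return true
        default:
            return false
        }
    }
}

// Hypothetical ledger standing in for ProcessedEvent records.
struct ProcessedLedger {
    private(set) var processedIDs: Set<UUID> = []

    // Returns true only the first time an event ID is committed.
    mutating func commitOnce(_ eventID: UUID) -> Bool {
        processedIDs.insert(eventID).inserted
    }
}
```

A caller would perform the commit side effects only when `commitOnce` returns `true`, so a replayed event becomes a no-op.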
8. Relationship Rules
- A root MemoryEntry may have zero or more linked derived entries.
- A derived system entry must link back to the root entry that caused it when applicable.
- InsightNode must reference the memory context it was derived from.
- PersonaProfile and SessionState influence interpretation and response, but are not themselves memory content.
- Event-to-memory traceability must remain possible for audit and debugging.
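A traceability check along these lines could support the link-back rule. It is a sketch only: `EntryRef` is a hypothetical projection of MemoryEntry that keeps just `id` and `linkTargetID`, and `danglingLinks(in:)` is an illustrative name.

```swift
import Foundation

// Hypothetical projection of MemoryEntry: only the fields needed for link checks.
struct EntryRef {
    let id: UUID
    let linkTargetID: UUID?
}

// Returns the IDs of entries whose linkTargetID points at no known entry.
func danglingLinks(in entries: [EntryRef]) -> [UUID] {
    let knownIDs = Set(entries.map { $0.id })
    return entries.compactMap { (entry: EntryRef) -> UUID? in
        guard let target = entry.linkTargetID, !knownIDs.contains(target) else {
            return nil
        }
        return entry.id
    }
}
```

An empty result means every linked entry can be traced to a stored target, which is the property an audit or debugging pass would look for.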
9. Canonical Data Contracts
Contract A — Single Root Write
Each first-party user-triggered capture creates exactly one root MemoryEntry.
Contract B — Deferred Enrichment
Deferred enrichment references a root entry and may add links, metadata, or derived artifacts, but may not create a second root representation of the same capture.
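Contracts A and B could be enforced at the storage boundary roughly as follows. This sketch assumes each capture carries a stable identifier, here called `captureID`, which the model above does not define. `RootEntry`, `MemoryStore`, and `StoreError` are likewise hypothetical.

```swift
import Foundation

// Hypothetical root record; captureID is an assumed per-capture identifier.
struct RootEntry {
    let id: UUID
    let captureID: UUID
    let value: String
    var enrichmentNotes: [String] = []
}

enum StoreError: Error {
    case unknownRoot
}

final class MemoryStore {
    private var rootsByCapture: [UUID: RootEntry] = [:]
    private var captureByRoot: [UUID: UUID] = [:]

    var rootCount: Int { rootsByCapture.count }

    func root(forCapture captureID: UUID) -> RootEntry? {
        rootsByCapture[captureID]
    }

    // Contract A: a repeated write for the same capture returns the existing root.
    @discardableResult
    func writeRoot(captureID: UUID, value: String) -> RootEntry {
        if let existing = rootsByCapture[captureID] {
            return existing
        }
        let root = RootEntry(id: UUID(), captureID: captureID, value: value)
        rootsByCapture[captureID] = root
        captureByRoot[root.id] = captureID
        return root
    }

    // Contract B: enrichment attaches to an existing root and never creates one.
    func enrich(rootID: UUID, note: String) throws {
        guard let captureID = captureByRoot[rootID] else {
            throw StoreError.unknownRoot
        }
        rootsByCapture[captureID]?.enrichmentNotes.append(note)
    }
}
```

Because `enrich` has no code path that inserts into `rootsByCapture`, deferred enrichment cannot produce a second root for the same capture in this design.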
Contract C — Visibility
User-triggered imports must be visible in logs immediately after root persistence.
Contract D — Source Truth
Tests and UI copy must align to canonical provenance mapping. No hidden reinterpretation is allowed.
10. Integrity Constraints
- Required: no duplicate root node for same first-party foreground capture.
- Required: no queue-only success path for explicit user-triggered import; success requires a persisted root entry, not just an enqueued event.
- Required: no system-derived node may masquerade as user-authored source.
- Required: merge/deduplication heuristics must be deterministic, so that the same inputs yield the same merge decision and validation is repeatable.
- Rejected if: tests rely on hidden actor/state bleed or stale provenance assumptions.
11. Retention and Deletion
All primary user-authored memory content is user-owned data. The system must support clear deletion semantics consistent with local-first privacy and future regulatory requirements.
- User-owned MemoryEntry roots must be deletable.
- Derived entries linked exclusively to deleted roots should be removed or invalidated according to deletion policy.
- Operational artifacts retained for diagnostics must be governed by security/privacy policy.
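The first two deletion rules above could be sketched as a cascade over stored entries. The deletion policy itself is not specified here, so this example assumes the "remove" option. It also assumes a derived entry records every root it was linked from in a hypothetical `rootIDs` set; the canonical schema exposes only a single `linkTargetID`.

```swift
import Foundation

// Hypothetical storage view of an entry for deletion purposes.
struct StoredEntry {
    let id: UUID
    let isRoot: Bool
    var rootIDs: Set<UUID>   // assumed: roots this derived entry is linked to
}

// Deletes a root and removes derived entries linked exclusively to it.
// Derived entries that still link to another root are kept, minus the stale link.
func deleteRoot(_ rootID: UUID, from entries: [StoredEntry]) -> [StoredEntry] {
    var kept: [StoredEntry] = []
    for var entry in entries {
        if entry.id == rootID {
            continue                                  // the root itself is deleted
        }
        if !entry.isRoot && entry.rootIDs.contains(rootID) {
            entry.rootIDs.remove(rootID)
            if entry.rootIDs.isEmpty {
                continue                              // linked only to the deleted root
            }
        }
        kept.append(entry)
    }
    return kept
}
```

Returning a new array instead of mutating in place keeps the function easy to test. A real store would apply the same rule inside a transaction.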
12. Acceptance Criteria
- Required: entity meanings are stable across product, code, and tests.
- Required: provenance mapping is explicit and documented.
- Required: canonical contracts are enforced across ingestion and enrichment paths.
- Required: no team builds a parallel schema without approval.
- Rejected if: UI, tests, and storage each interpret the same entity differently.