Aevum Canonical Data Model

1. Purpose

The canonical data model exists to prevent schema drift, duplicated semantics, and ungoverned entity growth. All modules, APIs, ingestion paths, tests, and compliance controls must conform to this model.

No team may invent alternative entity meanings without explicit approval.
No UI surface may persist state outside the canonical model without documentation and approval.
No test may assert behavior against hidden schema assumptions not defined here.

2. Modeling Principles

One root representation per user-triggered input: no duplicate root nodes for the same first-party foreground capture.
Provenance must be preserved: source type cannot be flattened for convenience.
Derived intelligence is distinct from user-authored thought: system-generated inference must remain distinguishable.
Lifecycle is explicit: entities must have valid creation, update, and archival rules.
Deletion must be possible: data sovereignty requires that user-owned content can be removed cleanly.

3. Entity Map

User-Origin Entities

MemoryEntry (root thought, imported item, explicit user log)
PersonaProfile
SessionState

System-Origin Entities

AevumEvent
ProcessedEvent
GraveyardEvent
InsightNode
StoredPrompt

4. Core Entities

4.1 MemoryEntry

The canonical record of captured or system-derived memory content. For user-triggered capture, this is the root persisted entry.

{
  "id": "UUID",
  "semanticType": "enum",
  "value": "string",
  "source": "MemorySource",
  "confidence": "Double",
  "createdAt": "Date",
  "updatedAt": "Date",
  "qjlHash": "UInt64?",
  "semanticHashString": "String?",
  "sourceEventIDs": ["UUID"],
  "linkTargetID": "UUID?"
}

Required Rules

User-triggered foreground input must create exactly one root MemoryEntry.
Derived or linked MemoryEntry records must not pretend to be root user input.
`value` is human-readable content, not opaque transport state.
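
A minimal sketch of the MemoryEntry shape and its required rules, written in Python for illustration only. The field names mirror the schema above. The `is_root_user_input` and `validate_entry` helpers, the plain-string `semantic_type`, the default confidence, and the two concrete checks are assumptions made for this sketch, not definitions from this model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional
from uuid import UUID, uuid4


class MemorySource(Enum):
    USER_TAP = "userTap"
    USER_VOICE = "userVoice"
    SYSTEM_INFERENCE = "systemInference"


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class MemoryEntry:
    value: str
    source: MemorySource
    semantic_type: str  # stand-in for the semanticType enum, whose cases are not listed here
    confidence: float = 1.0  # assumed default
    id: UUID = field(default_factory=uuid4)
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)
    qjl_hash: Optional[int] = None
    semantic_hash_string: Optional[str] = None
    source_event_ids: List[UUID] = field(default_factory=list)
    link_target_id: Optional[UUID] = None

    def is_root_user_input(self) -> bool:
        # Assumption: a root user entry has a user source and no link target.
        return (
            self.source is not MemorySource.SYSTEM_INFERENCE
            and self.link_target_id is None
        )


def validate_entry(entry: MemoryEntry) -> List[str]:
    """Return a list of rule violations (empty when the entry passes)."""
    problems = []
    # Assumed proxy for "human-readable": value is non-blank text.
    if not isinstance(entry.value, str) or not entry.value.strip():
        problems.append("value must be non-empty human-readable text")
    # Assumed reading of the rule that linked entries must not pose as user input.
    if (
        entry.link_target_id is not None
        and entry.source is not MemorySource.SYSTEM_INFERENCE
    ):
        problems.append("linked entry must not carry a user-authored source")
    return problems
```

A root entry created from typed input passes both checks, while an entry that links to a parent yet claims `userTap` is reported as a violation.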

4.2 PersonaProfile

The persisted user persona configuration and evolving cognitive profile baseline.

{
  "id": "UUID",
  "role": "String?",
  "secondaryRole": "String?",
  "personaType": "CognitivePersonaType",
  "secondaryPersonaTypeRaw": "String?",
  "hasCompletedPersonaSelection": "Bool",
  "engagementBaseline": "Float",
  "recurringThemes": ["String"],
  "promptResponseScores": {"String": "Double"}
}

`role` is the primary visible persona label.
`secondaryRole` is optional.
This entity influences companion initialization and grounded response behavior.

4.3 SessionState

The current or most recent interaction state used to shape continuity and response strategy.

{
  "id": "UUID",
  "lastActivityTime": "Date",
  "currentFocus": "String?",
  "engagementLevel": "Float",
  "activeMode": "String?"
}

5. Supporting Entities

Entity	Purpose	Notes
AevumEvent	Queue/event representation for ingestion, processing, and enrichment.	Supports pending/committed/error lifecycle.
ProcessedEvent	Durable record of completed event processing.	Supports auditability and idempotency.
GraveyardEvent	Rejected or invalid event payload archive.	Used for hard validation failures.
InsightNode	System-derived insight generated from validated memory patterns.	Must remain distinguishable from user input.
StoredPrompt	Prompt inventory used by proactive or guided companion flows.	Not user-authored memory.

6. Provenance Model

Input Provenance

Aevum distinguishes where input came from before mapping it to stored memory source.

InputSource	Meaning	Canonical Use
text	User-authored typed input	Dashboard, onboarding, manual capture
voice	User-authored dictated/transcribed input	Dashboard voice, capture sheet voice, audio import output
importBatch	User-triggered batch import	Documents, OCR batches, archive ingest
system	Explicit system-level action/event	Habit logs, legacy initialization, internal operational entries
deferredEnrichment	Asynchronous post-root processing payload	Non-root enrichment only

Stored Provenance

At persistence time, input provenance is mapped to a canonical stored provenance value (MemorySource).

MemorySource	Meaning
userTap	User typed / directly authored text
userVoice	User voice/dictation/transcription root input
systemInference	System-generated or system-mapped operational record
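
One way to keep the provenance mapping explicit is a single lookup function, sketched here in Python. The three mapped pairs are inferred from the meanings in the two tables above and are assumptions. This document does not state which MemorySource applies to `importBatch` or `deferredEnrichment`, so the sketch refuses to guess and raises instead. The function name `to_memory_source` is hypothetical.

```python
from enum import Enum


class InputSource(Enum):
    TEXT = "text"
    VOICE = "voice"
    IMPORT_BATCH = "importBatch"
    SYSTEM = "system"
    DEFERRED_ENRICHMENT = "deferredEnrichment"


class MemorySource(Enum):
    USER_TAP = "userTap"
    USER_VOICE = "userVoice"
    SYSTEM_INFERENCE = "systemInference"


# Assumed mapping, inferred from the table meanings; not stated by the model itself.
_ASSUMED_MAPPING = {
    InputSource.TEXT: MemorySource.USER_TAP,
    InputSource.VOICE: MemorySource.USER_VOICE,
    InputSource.SYSTEM: MemorySource.SYSTEM_INFERENCE,
}


def to_memory_source(input_source: InputSource) -> MemorySource:
    """Map input provenance to stored provenance, failing loudly when undocumented."""
    try:
        return _ASSUMED_MAPPING[input_source]
    except KeyError:
        raise ValueError(
            f"no documented MemorySource for input source '{input_source.value}'"
        ) from None
```

Failing on an unmapped source, rather than defaulting to `systemInference`, keeps source type from being flattened for convenience.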

7. Lifecycle Rules

MemoryEntry Lifecycle

Created as a root user entry or a system entry
Optionally linked to a source event or a parent entry
Optionally enriched by deferred processing
Optionally merged or reinforced if semantically duplicate and eligible
May be deleted under user data sovereignty rules

AevumEvent Lifecycle

Pending
Dequeued
Validated
Committed or Shunted to Graveyard

Deferred enrichment events must never become fresh root capture events.
Event processing must be idempotent: replaying an already processed event must have no additional effect.
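
The AevumEvent lifecycle above can be sketched as a small state machine with an idempotent commit step, shown here in Python. The transition table is one assumed reading of the lifecycle list: a dequeued event that fails validation goes to the graveyard, and a validated event is committed. `EventState`, `advance`, and `CommitLog` are hypothetical names, and the in-memory set stands in for durable ProcessedEvent records.

```python
from enum import Enum
from typing import List, Set


class EventState(Enum):
    PENDING = "pending"
    DEQUEUED = "dequeued"
    VALIDATED = "validated"
    COMMITTED = "committed"
    GRAVEYARD = "graveyard"


# Assumed transitions; COMMITTED and GRAVEYARD are terminal.
ALLOWED_TRANSITIONS = {
    EventState.PENDING: {EventState.DEQUEUED},
    EventState.DEQUEUED: {EventState.VALIDATED, EventState.GRAVEYARD},
    EventState.VALIDATED: {EventState.COMMITTED},
    EventState.COMMITTED: set(),
    EventState.GRAVEYARD: set(),
}


def advance(current: EventState, target: EventState) -> EventState:
    """Return the target state, or raise if the transition is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


class CommitLog:
    """Records which event ids were committed so that replays become no-ops."""

    def __init__(self) -> None:
        self.processed_ids: Set[str] = set()
        self.committed_payloads: List[str] = []

    def commit(self, event_id: str, payload: str) -> bool:
        # A replayed event id is ignored, which keeps the commit idempotent.
        if event_id in self.processed_ids:
            return False
        self.committed_payloads.append(payload)
        self.processed_ids.add(event_id)
        return True
```

Calling `commit` twice with the same event id writes the payload once and returns `False` on the second call.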

8. Relationship Rules

A root MemoryEntry may have zero or more linked derived entries.
A derived system entry must link back to the root entry that caused it when applicable.
InsightNode must reference the memory context it was derived from.
PersonaProfile and SessionState influence interpretation and response, but are not themselves memory content.
Event-to-memory traceability must remain possible for audit and debugging.
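
The link-back rule above lends itself to a simple integrity check, sketched here in Python. The function name and the dictionary input are assumptions for illustration, and string ids are used in place of UUIDs for brevity.

```python
from typing import Dict, List, Optional


def find_dangling_links(link_targets: Dict[str, Optional[str]]) -> List[str]:
    """Return ids of entries whose linkTargetID does not resolve to a known entry.

    link_targets maps each entry id to its linkTargetID, or None for a root.
    """
    return sorted(
        entry_id
        for entry_id, target in link_targets.items()
        if target is not None and target not in link_targets
    )
```

An empty result means every linked entry can be traced back to an entry that still exists.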

9. Canonical Data Contracts

Contract A — Single Root Write

Each first-party user-triggered capture creates exactly one root MemoryEntry.

Contract B — Deferred Enrichment

Deferred enrichment references a root entry and may add links, metadata, or derived artifacts, but may not create a second root representation of the same capture.
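
Contracts A and B can be sketched together as a store that creates at most one root per capture and only accepts enrichment for an existing root. This Python example is illustrative: `RootStore` is a hypothetical class, and the `capture_id` key is an assumption, since this model does not define how a capture is identified.

```python
from typing import Dict, List
from uuid import UUID, uuid4


class RootStore:
    def __init__(self) -> None:
        self._root_by_capture: Dict[str, UUID] = {}  # capture_id -> root entry id
        self._derived: Dict[UUID, List[UUID]] = {}   # root entry id -> derived ids

    def write_root(self, capture_id: str) -> UUID:
        """Contract A: a repeated write for the same capture returns the same root."""
        existing = self._root_by_capture.get(capture_id)
        if existing is not None:
            return existing
        root_id = uuid4()
        self._root_by_capture[capture_id] = root_id
        self._derived[root_id] = []
        return root_id

    def enrich(self, root_id: UUID) -> UUID:
        """Contract B: enrichment attaches a derived artifact, never a new root."""
        if root_id not in self._derived:
            raise KeyError("deferred enrichment must reference an existing root entry")
        derived_id = uuid4()
        self._derived[root_id].append(derived_id)
        return derived_id

    def root_count(self) -> int:
        return len(self._derived)

    def derived_for(self, root_id: UUID) -> List[UUID]:
        return list(self._derived[root_id])
```

Because enrichment can only append under a known root id, no code path in this sketch lets a deferred payload create a second root for the same capture.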

Contract C — Visibility

User-triggered imports must be visible in logs immediately after root persistence.

Contract D — Source Truth

Tests and UI copy must align to canonical provenance mapping. No hidden reinterpretation is allowed.

10. Integrity Constraints

Required: no duplicate root node for same first-party foreground capture.
Required: no queue-only success path for explicit user-triggered import.
Required: no system-derived node may masquerade as user-authored source.
Required: merge/deduplication heuristics must be deterministic, so that the same inputs produce the same outcome on every validation run.
Rejected if: tests rely on hidden actor/state bleed or stale provenance assumptions.

11. Retention and Deletion

All primary user-authored memory content is user-owned data. The system must support clear deletion semantics consistent with local-first privacy and future regulatory requirements.

User-owned MemoryEntry roots must be deletable.
Derived entries linked exclusively to deleted roots should be removed or invalidated according to deletion policy.
Operational artifacts retained for diagnostics must be governed by security/privacy policy.
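
The deletion rules above can be sketched as a cascade that removes derived entries only when none of their roots survive. This Python example assumes a derived entry can link to more than one root, which is implied by "linked exclusively" but not spelled out in the schema. The function name and the set and dictionary inputs are hypothetical.

```python
from typing import Dict, Set, Tuple


def delete_roots(
    roots: Set[str],
    derived_links: Dict[str, Set[str]],
    to_delete: Set[str],
) -> Tuple[Set[str], Dict[str, Set[str]], Set[str]]:
    """Delete roots and cascade to derived entries that no longer have any root.

    Returns (remaining roots, remaining derived links, removed derived ids).
    """
    remaining_roots = roots - to_delete
    remaining_derived: Dict[str, Set[str]] = {}
    removed: Set[str] = set()
    for derived_id, linked_roots in derived_links.items():
        surviving = linked_roots & remaining_roots
        if surviving:
            # Keep the entry but drop links that point at deleted roots.
            remaining_derived[derived_id] = surviving
        else:
            removed.add(derived_id)
    return remaining_roots, remaining_derived, removed
```

Whether a removed derived entry is erased or only invalidated is left to the deletion policy; the sketch just reports which ids are affected.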

12. Acceptance Criteria

Required: entity meanings are stable across product, code, and tests.
Required: provenance mapping is explicit and documented.
Required: canonical contracts are enforced across ingestion and enrichment paths.
Required: no team builds a parallel schema without approval.
Rejected if: UI, tests, and storage each interpret the same entity differently.