Memory model

The four memory layers — what they hold, when to read them, who writes them.

Updated · 2026-05-28

Memory is layered on purpose. Stable identity in markdown you edit by hand. Queryable state in SQLite. Semantic relationships in a graph. Embeddings for similarity recall.

Each layer has one job. Pick the right layer for each piece of data instead of scattering writes across several.

The four layers

1. Profile — markdown

Slow-changing things about you. Identity, preferences, what you’re working on, who you talk to often, what not to talk to you about. Lives in data/profile/:

data/profile/
  identity.md         your name, role, location, schedule preferences
  goals.md            current goals (also slow-changing)
  people.md           folks you interact with often — name, role, context
  preferences.md      tone, working hours, no-meeting-windows
  prohibitions.md     things to never suggest (e.g. "no early-morning calls")

You edit these in your editor. The agent reads them on every call. The format is plain markdown: no YAML frontmatter, no schema. The librarian (the canonical writer described below) consolidates them occasionally but mostly leaves them alone.
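Because the files carry no schema, the agent can treat them as opaque text. A minimal sketch of how the profile could be folded into one context string, assuming the files have already been read from data/profile/ into memory; `buildProfileContext`, the heading format, and the sample contents are hypothetical, not Mayva's actual code:

```typescript
// Hypothetical helper: the name and the output format are assumptions.
type ProfileFiles = Record<string, string>; // file name -> raw markdown

function buildProfileContext(files: ProfileFiles): string {
  return Object.keys(files)
    .sort() // stable order, so the context does not change between calls
    .map((name) => `## ${name}\n\n${files[name].trim()}`)
    .join("\n\n");
}

// Made-up file contents, for illustration only.
const profileContext = buildProfileContext({
  "prohibitions.md": "- no early-morning calls\n",
  "identity.md": "Role: designer\n",
});
console.log(profileContext);
```

Nothing is parsed here, which is the point of keeping the layer schema-free: any edit you make by hand is picked up on the next call without a migration.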

2. Episodic — SQLite

Time-ordered facts. Tasks, completed outcomes, health signals, dispatched suggestions, worker run logs. Lives in data/mayva.db:

-- a few of the main tables
tasks(id, title, due_at, priority, status, ...)
suggestions(id, kind, payload, priority, dismissed_at, ...)
outcomes(id, source, payload, created_at, ...)
health_signals(id, kind, value, recorded_at, ...)
morning_brief(id, date, body, items, ...)
worker_runs(id, worker_name, started_at, ended_at, ok, error, ...)

This is the layer the dashboard queries most. SQLite is local, fast, and easy to inspect: you can run sqlite3 data/mayva.db at any time.

3. Graph — graphify (local JSON)

People, projects, organizations as nodes. Typed edges between them. Lives in data/graph.json:

{
  "nodes": [
    {"id": "p:alex", "type": "person", "name": "Alex Chen", "role": "founder"},
    {"id": "o:acmeco", "type": "org", "name": "AcmeCo"},
    {"id": "pr:portfolio", "type": "project", "name": "Portfolio rebuild"}
  ],
  "edges": [
    {"from": "p:alex", "to": "o:acmeco", "type": "works_at", "since": "2025-01"},
    {"from": "p:alex", "to": "pr:portfolio", "type": "owns"}
  ]
}

Why a separate graph and not a SQLite junction table? Because the agent needs to ask “what’s connected to this?” and traversals over a small graph are cheaper to write than recursive CTEs. The graph is loaded on every agent call, mutated in memory, and written back at the end of the call.
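To make the “what’s connected to this?” question concrete, here is a minimal sketch of a breadth-first traversal over the structure shown in data/graph.json. The type names and the `connectedTo` function are assumptions for illustration; the actual traversal code may look different:

```typescript
// Types mirror the JSON above; extra fields such as "role" or "since" are allowed.
interface GraphNode { id: string; type: string; name: string; [key: string]: unknown }
interface GraphEdge { from: string; to: string; type: string; [key: string]: unknown }
interface Graph { nodes: GraphNode[]; edges: GraphEdge[] }

// Hypothetical helper: ids reachable from `startId` within `maxHops`,
// following edges in either direction.
function connectedTo(graph: Graph, startId: string, maxHops = 1): string[] {
  const seen = new Set<string>([startId]);
  let frontier = [startId];
  for (let hop = 0; hop < maxHops && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const edge of graph.edges) {
      for (const [a, b] of [[edge.from, edge.to], [edge.to, edge.from]]) {
        if (frontier.includes(a) && !seen.has(b)) {
          seen.add(b);
          next.push(b);
        }
      }
    }
    frontier = next;
  }
  seen.delete(startId); // the start node is not its own neighbour
  return Array.from(seen);
}

const graph: Graph = {
  nodes: [
    { id: "p:alex", type: "person", name: "Alex Chen", role: "founder" },
    { id: "o:acmeco", type: "org", name: "AcmeCo" },
    { id: "pr:portfolio", type: "project", name: "Portfolio rebuild" },
  ],
  edges: [
    { from: "p:alex", to: "o:acmeco", type: "works_at", since: "2025-01" },
    { from: "p:alex", to: "pr:portfolio", type: "owns" },
  ],
};

console.log(connectedTo(graph, "p:alex")); // ["o:acmeco", "pr:portfolio"]
```

The linear scan over `edges` on every hop is acceptable only because the graph is small; a larger graph would want an adjacency index built once after loading.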

4. Vector — embeddings (local)

Embeddings for similarity recall. The agent uses this layer when it asks “anything related to this concept?” and needs semantic search over journal entries, notes, past briefs, and prior suggestions.

data/vectors/
  briefs.idx          embeddings of every morning brief body
  outcomes.idx        embeddings of completed-task descriptions
  journal.idx         your free-form journal entries

Embedded with a small local model (BGE-small by default). No cloud call required.
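Similarity recall comes down to ranking stored vectors by their closeness to a query vector. The sketch below uses cosine similarity over plain in-memory arrays; the on-disk format of the .idx files is not described here, so `Embedded`, `cosine`, and `topK` are illustrative assumptions, as are the toy ids and vectors:

```typescript
interface Embedded { id: string; vector: number[] }

// Cosine similarity: dot product divided by the product of the vector norms.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat a zero vector as unrelated
}

// Hypothetical helper: the k stored items most similar to the query.
function topK(query: number[], items: Embedded[], k: number): { id: string; score: number }[] {
  return items
    .map((item) => ({ id: item.id, score: cosine(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy 3-dimensional vectors; BGE-small embeddings have 384 dimensions.
const stored: Embedded[] = [
  { id: "journal:a", vector: [0, 1, 0] },
  { id: "brief:b", vector: [0.9, 0.1, 0] },
  { id: "brief:c", vector: [1, 0, 0] },
];

console.log(topK([1, 0, 0], stored, 2).map((hit) => hit.id)); // ["brief:c", "brief:b"]
```

A brute-force scan like this is a reasonable fit for a personal-scale corpus; whether the real indexes use an approximate nearest-neighbour structure instead is not stated in this document.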

When to read what

Question                                          Layer
Who am I?                                         profile
What’s my preferred deep work window?             profile
What’s on my calendar today?                      episodic (cached from Google Cal sync)
What was last night’s recovery?                   episodic
Who does Alex work with?                          graph
Have I dealt with this kind of situation before?  vector
What did the agent suggest yesterday?             episodic

The librarian (canonical writer)

Reads can happen from anywhere — workers, dashboard, voice. Writes go through the librarian. It’s a thin subagent that:

  1. Decides which layer the write belongs to.
  2. Validates the payload against the right Zod schema.
  3. Applies the change (markdown patch / SQL row / graph mutation / embedding insert).
  4. Logs the write to worker_runs so it shows up in the audit trail.

Why? Because a single capability (e.g. “remember that I now work at Initech”) might touch multiple layers, and we want a consistent way to keep them in sync. The librarian is small enough to read in one sitting — packages/core/librarian/index.ts is under 300 lines.
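The four steps and the multi-layer case can be sketched as follows. This is not the code in packages/core/librarian/index.ts: the write kinds, the routing table, and the class shape are hypothetical, and a hand-rolled check stands in for the Zod schema so the example has no dependencies:

```typescript
type Layer = "profile" | "episodic" | "graph" | "vector";

interface WriteRequest { kind: string; payload: Record<string, unknown> }
interface AuditEntry { worker_name: string; ok: boolean; error?: string }

// Step 1: which layers a write belongs to (hypothetical kinds).
const ROUTES = new Map<string, Layer[]>([
  ["task.created", ["episodic"]],
  ["employer.changed", ["profile", "graph"]], // one capability, two layers
]);

// Step 2: stand-in for a Zod schema; returns a problem description or null.
function validate(req: WriteRequest): string | null {
  if (req.kind === "employer.changed" && typeof req.payload["org"] !== "string") {
    return "payload.org must be a string";
  }
  return null;
}

class Librarian {
  readonly applied: { layer: Layer; req: WriteRequest }[] = [];
  readonly audit: AuditEntry[] = []; // stands in for rows in worker_runs

  write(req: WriteRequest): boolean {
    const layers = ROUTES.get(req.kind);
    if (layers === undefined) return this.log(false, `unknown kind: ${req.kind}`);
    const problem = validate(req);
    if (problem !== null) return this.log(false, problem);
    // Step 3: a real implementation would patch markdown, insert a row, etc.
    for (const layer of layers) this.applied.push({ layer, req });
    return this.log(true);
  }

  // Step 4: every attempt is logged, whether it succeeded or not.
  private log(ok: boolean, error?: string): boolean {
    const entry: AuditEntry = { worker_name: "librarian", ok };
    if (error !== undefined) entry.error = error;
    this.audit.push(entry);
    return ok;
  }
}

const librarian = new Librarian();
librarian.write({ kind: "employer.changed", payload: { org: "Initech" } });
console.log(librarian.applied.map((w) => w.layer)); // ["profile", "graph"]
```

Routing in one place is what keeps the layers in sync: a worker states what happened, and only the librarian knows that a change of employer touches both the profile and the graph.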

What scattering looks like (and why we avoid it)

A previous architecture had every worker writing directly to whatever layer made sense to it. The morning brief wrote to episodic; the librarian-equivalent at the time wrote to profile; the news digest wrote to nothing. Three months in, things had drifted — the profile said you cared about Topic X, but the news digest filter never reflected it.

Centralizing writes through the librarian solved this. Trade-off: every write is now slightly slower (one extra hop). For the volume Mayva does (hundreds of writes/day, not millions), it’s invisible.