Memory model
The four memory layers — what they hold, when to read them, who writes them.
Memory is layered on purpose. Stable identity in markdown you edit by hand. Queryable state in SQLite. Semantic relationships in a graph. Embeddings for similarity recall.
Each layer has a job. Pick the right layer for each piece of data and never scatter it across several.
The four layers
1. Profile — markdown
Slow-changing things about you. Identity, preferences, what you’re working on, who you talk to often, what not to talk to you about. Lives in data/profile/:
data/profile/
  identity.md       your name, role, location, schedule preferences
  goals.md          current goals (also slow-changing)
  people.md         folks you interact with often: name, role, context
  preferences.md    tone, working hours, no-meeting-windows
  prohibitions.md   things to never suggest (e.g. "no early-morning calls")
You edit these in your editor. The agent reads them on every call. The format is plain markdown — no YAML frontmatter, no schema. The librarian does occasional consolidation but mostly leaves them alone.
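Because the profile files are schema-free markdown, reading them on each call can be as simple as concatenating them in a fixed order. The following is a minimal sketch under that assumption. The function name `buildProfileContext`, the `## file` section headers, and the idea that file contents arrive already loaded into a map are all hypothetical, not Mayva's actual loader.

```typescript
// Hypothetical sketch: assemble the profile files into one context string.
// The file names come from the data/profile/ listing; everything else is assumed.
const PROFILE_ORDER = [
  "identity.md",
  "goals.md",
  "people.md",
  "preferences.md",
  "prohibitions.md",
];

function buildProfileContext(files: Record<string, string | undefined>): string {
  const sections: string[] = [];
  for (const fileName of PROFILE_ORDER) {
    const body = (files[fileName] ?? "").trim();
    // Skip missing or empty files rather than emitting blank sections.
    if (body.length === 0) continue;
    sections.push(`## ${fileName}\n${body}`);
  }
  return sections.join("\n\n");
}

// Usage with two of the five files present.
const context = buildProfileContext({
  "identity.md": "Name: Alex Chen\nRole: founder",
  "prohibitions.md": "- no early-morning calls\n",
});
console.log(context);
```

Keeping the order fixed means the agent sees the same layout on every call, whichever files happen to exist.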
2. Episodic — SQLite
Time-ordered facts. Tasks, completed-task outcomes, health signals, dispatched suggestions, worker run logs. Lives in data/mayva.db:
-- a few of the main tables
tasks(id, title, due_at, priority, status, ...)
suggestions(id, kind, payload, priority, dismissed_at, ...)
outcomes(id, source, payload, created_at, ...)
health_signals(id, kind, value, recorded_at, ...)
morning_brief(id, date, body, items, ...)
worker_runs(id, worker_name, started_at, ended_at, ok, error, ...)
This is the layer the dashboard queries most. SQLite is local, fast, and easy to inspect from the command line. You can run sqlite3 data/mayva.db at any time.
3. Graph — graphify (local JSON)
People, projects, organizations as nodes. Typed edges between them. Lives in data/graph.json:
{
"nodes": [
{"id": "p:alex", "type": "person", "name": "Alex Chen", "role": "founder"},
{"id": "o:acmeco", "type": "org", "name": "AcmeCo"},
{"id": "pr:portfolio", "type": "project", "name": "Portfolio rebuild"}
],
"edges": [
{"from": "p:alex", "to": "o:acmeco", "type": "works_at", "since": "2025-01"},
{"from": "p:alex", "to": "pr:portfolio", "type": "owns"}
]
}
Why a separate graph and not a SQLite junction table? Because the agent needs to ask “what’s connected to this?” and traversals over a small graph are cheaper to write than recursive CTEs. The graph is loaded on every agent call, mutated in memory, and written back at the end of the call.
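To illustrate why a "what's connected to this?" query is short to write against the JSON shape above, here is a breadth-first sketch. The `connectedTo` helper and the choice to follow edges in both directions are assumptions made for illustration, not graphify's actual API.

```typescript
// Shapes mirror the data/graph.json example; extra fields such as "since" are allowed.
interface GraphNode { id: string; type: string; name: string; [key: string]: unknown }
interface GraphEdge { from: string; to: string; type: string; [key: string]: unknown }
interface Graph { nodes: GraphNode[]; edges: GraphEdge[] }

// Hypothetical helper: ids reachable from `start` within `maxHops`,
// treating every edge as traversable in both directions (breadth-first).
function connectedTo(graph: Graph, start: string, maxHops = 1): string[] {
  const seen = new Set<string>([start]);
  let frontier: string[] = [start];
  for (let hop = 0; hop < maxHops && frontier.length > 0; hop++) {
    const next: string[] = [];
    const visit = (from: string, to: string): void => {
      if (frontier.includes(from) && !seen.has(to)) {
        seen.add(to);
        next.push(to);
      }
    };
    for (const edge of graph.edges) {
      visit(edge.from, edge.to);
      visit(edge.to, edge.from);
    }
    frontier = next;
  }
  seen.delete(start);
  return Array.from(seen);
}

const graph: Graph = {
  nodes: [
    { id: "p:alex", type: "person", name: "Alex Chen", role: "founder" },
    { id: "o:acmeco", type: "org", name: "AcmeCo" },
    { id: "pr:portfolio", type: "project", name: "Portfolio rebuild" },
  ],
  edges: [
    { from: "p:alex", to: "o:acmeco", type: "works_at", since: "2025-01" },
    { from: "p:alex", to: "pr:portfolio", type: "owns" },
  ],
};

const direct = connectedTo(graph, "p:alex");        // one hop from Alex
const twoHops = connectedTo(graph, "o:acmeco", 2);  // AcmeCo, then Alex's neighbours
console.log(direct, twoHops);
```

The linear scan over `edges` per hop is fine for a small personal graph. A larger one would want an adjacency index built once at load time.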
4. Vector — embeddings (local)
Embeddings for similarity recall. Used when the agent says “anything related to this concept?” — semantic search over journal entries, notes, past briefs, prior suggestions.
data/vectors/
  briefs.idx     embeddings of every morning brief body
  outcomes.idx   embeddings of completed-task descriptions
  journal.idx    your free-form journal entries
Embedded with a small local model (BGE-small by default). No cloud call required.
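Similarity recall over embeddings usually comes down to ranking stored vectors by cosine similarity to a query vector. The sketch below shows that ranking in memory. The on-disk format of the .idx files is not described here, so the `Embedded` shape, the `topK` helper, and the toy 3-dimensional vectors are assumptions standing in for real model output.

```typescript
// Hypothetical in-memory shape for one embedded item.
interface Embedded { id: string; vector: number[] }

// Cosine similarity: dot product divided by the product of the vector norms.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Score every item against the query and keep the k best matches.
function topK(query: number[], items: Embedded[], k: number): { id: string; score: number }[] {
  return items
    .map((item) => ({ id: item.id, score: cosine(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy vectors; a real index would hold model embeddings of journal entries.
const journal: Embedded[] = [
  { id: "journal:1", vector: [1, 0, 0] },
  { id: "journal:2", vector: [0, 1, 0] },
  { id: "journal:3", vector: [0.9, 0.1, 0] },
];
const hits = topK([1, 0, 0], journal, 2);
console.log(hits.map((h) => h.id));
```

A brute-force scan like this is typically adequate for a personal-scale corpus. An approximate nearest-neighbour index only becomes worthwhile at much larger sizes.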
When to read what
| Question | Layer |
|---|---|
| Who am I? | profile |
| What’s my preferred deep work window? | profile |
| What’s on my calendar today? | episodic (cached from Google Calendar sync) |
| What was last night’s recovery? | episodic |
| Who does Alex work with? | graph |
| Have I dealt with this kind of situation before? | vector |
| What did the agent suggest yesterday? | episodic |
The librarian (canonical writer)
Reads can happen from anywhere — workers, dashboard, voice. Writes go through the librarian. It’s a thin subagent that:
- Decides which layer the write belongs to.
- Validates the payload against the right Zod schema.
- Applies the change (markdown patch / SQL row / graph mutation / embedding insert).
- Logs the write to worker_runs so it shows up in audit.
Why? Because a single capability (e.g. “remember that I now work at Initech”) might touch multiple layers, and we want a consistent way to keep them in sync. The librarian is small enough to read in one sitting — packages/core/librarian/index.ts is under 300 lines.
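The librarian's four steps (route, validate, apply, log) can be sketched as a single write function. This is an illustration under stated assumptions, not the contents of packages/core/librarian/index.ts. The write kinds, `createLibrarian`, and the in-memory audit array are hypothetical, and the hand-written validators stand in for Zod schemas so the example has no dependencies.

```typescript
type Layer = "profile" | "episodic" | "graph" | "vector";

interface WriteRequest { kind: string; payload: unknown }
interface AuditEntry { kind: string; layer: Layer | null; ok: boolean; error: string | null }
// One route per write kind: the target layer plus a payload validator.
interface Route { layer: Layer; validate: (payload: unknown) => boolean }

const isRecord = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null;

// Hypothetical write kinds; the validators are stand-ins for Zod schemas.
const routes: Record<string, Route | undefined> = {
  "task.create": {
    layer: "episodic",
    validate: (p) => isRecord(p) && typeof p["title"] === "string",
  },
  "edge.add": {
    layer: "graph",
    validate: (p) =>
      isRecord(p) &&
      typeof p["from"] === "string" &&
      typeof p["to"] === "string" &&
      typeof p["type"] === "string",
  },
};

function createLibrarian(apply: (layer: Layer, payload: unknown) => void) {
  const audit: AuditEntry[] = []; // stands in for rows in worker_runs

  const log = (kind: string, layer: Layer | null, error: string | null): boolean => {
    audit.push({ kind, layer, ok: error === null, error });
    return error === null;
  };

  function write(req: WriteRequest): boolean {
    const route = routes[req.kind];                 // 1. decide the layer
    if (!route) return log(req.kind, null, "unknown write kind");
    if (!route.validate(req.payload)) {             // 2. validate the payload
      return log(req.kind, route.layer, "invalid payload");
    }
    try {
      apply(route.layer, req.payload);              // 3. apply the change
      return log(req.kind, route.layer, null);      // 4. log for audit
    } catch (err) {
      return log(req.kind, route.layer, err instanceof Error ? err.message : String(err));
    }
  }

  return { write, audit };
}

// Usage: record applied writes instead of touching real storage.
const applied: { layer: Layer; payload: unknown }[] = [];
const librarian = createLibrarian((layer, payload) => {
  applied.push({ layer, payload });
});
const okEdge = librarian.write({
  kind: "edge.add",
  payload: { from: "p:alex", to: "o:initech", type: "works_at" },
});
const okTask = librarian.write({ kind: "task.create", payload: { due_at: "2025-06-01" } });
console.log(okEdge, okTask, librarian.audit);
```

Rejected writes are logged too, so the audit trail shows what was attempted as well as what was applied.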
What scattering looks like (and why we avoid it)
A previous architecture had every worker writing directly to whatever layer made sense to it. The morning brief wrote to episodic; the librarian-equivalent at the time wrote to profile; the news digest wrote to nothing. Three months in, things had drifted — the profile said you cared about Topic X, but the news digest filter never reflected it.
Centralizing writes through the librarian solved this. Trade-off: every write is now slightly slower (one extra hop). For the volume Mayva does (hundreds of writes/day, not millions), it’s invisible.