# Architecture overview
This chapter is the map. It names every directory, every process, every socket, every database, and every wire format. Once you've read it, you can debug SiftCoder by inspecting the system instead of guessing.
## On disk
```
~/.siftcoder/<namespace>/
├── config.json              # global config (per-namespace)
├── run/
│   └── <workspace>.sock     # Unix domain socket per workspace
├── workspaces/
│   └── <workspace>/
│       ├── db.sqlite        # the memory store
│       ├── wal.ndjson       # write-ahead log (line-delimited JSON)
│       ├── run.pid          # daemon pid file
│       └── http.port        # web UI port (when bridge is up)
└── logs/
    └── <workspace>.ndjson   # daemon log stream
```
`<namespace>` defaults to `default`. You can change it with the `SIFTCODER_NS` env var when you want hard isolation between, say, work and personal.
`<workspace>` is a 12-hex-character SHA-256 prefix of the absolute real path of the git toplevel of your current directory (or of the directory itself, if it isn't a git repo). Two terminals in the same repo share the same workspace key, and a symlink trick can't accidentally create a second workspace — realpath resolves it.
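The derivation can be sketched like this (helper names are illustrative; the plugin's actual code may differ):

```javascript
import { createHash } from "node:crypto";
import { realpathSync } from "node:fs";
import { execSync } from "node:child_process";

// Derive a workspace key: SHA-256 of the canonical repo root, first 12 hex chars.
function workspaceKey(dir) {
  let root;
  try {
    // Prefer the git toplevel so every subdirectory maps to one workspace.
    root = execSync("git rev-parse --show-toplevel", {
      cwd: dir,
      stdio: ["ignore", "pipe", "ignore"],
    }).toString().trim();
  } catch {
    root = dir; // not a git repo: key off the directory itself
  }
  // realpath collapses symlinks, so aliased paths share one workspace.
  const canonical = realpathSync(root);
  return createHash("sha256").update(canonical).digest("hex").slice(0, 12);
}
```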
The plugin itself ships in two places. The repo (or marketplace registry) lives at `~/.claude/plugins/marketplaces/siftcoder-marketplace/` (this is where Claude Code reads versions from). The installed copy used at runtime is at `~/.claude/plugins/cache/siftcoder-marketplace/siftcoder/<version>/`. The latter is what hooks invoke.
## Processes
When you start a daemon, you get exactly one Node process per workspace, visible in `ps`:
```
node /Users/you/.claude/plugins/cache/siftcoder-marketplace/siftcoder/1.0.8/dist/memory/daemon/index.js
```
The daemon owns the SQLite handle, the WAL file, and the Unix socket. It also runs the consolidator on a thirty-second interval and the embedding pass shortly after that. A separate, optional HTTP bridge runs inside the same process and exposes the web UI on a localhost port (random, written to `http.port`).
When Claude Code is running, it spawns short-lived hook processes for each tool call. They connect to the daemon's socket, push a frame, and exit. They are invisible in `ps` because they finish in milliseconds.
When you call any `/siftcoder:mem` slash command, the bundled CLI at `bin/siftcoder.mjs` runs as a one-shot Node process. It either talks to the daemon over the socket (`status`, `info`, `backfill`, `web`) or operates directly on the SQLite file (`drain`, which needs to load LLM clients).
## Wire protocol
Every connection over the Unix socket uses the same length-prefixed JSON frame:
```
+------+--------------------+
| len  | UTF-8 JSON body    |
+------+--------------------+
| 4 B  | <len> bytes        |
+------+--------------------+
```
`len` is a 4-byte big-endian unsigned integer, and the maximum frame size is 16 MiB. Each connection is request-response: the client opens, sends one frame, reads one frame, and closes.
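A minimal sketch of the framing (function names are mine, not the plugin's):

```javascript
const MAX_FRAME = 16 * 1024 * 1024; // 16 MiB cap from the protocol

// 4-byte big-endian length prefix, then the UTF-8 JSON body.
function encodeFrame(obj) {
  const body = Buffer.from(JSON.stringify(obj), "utf8");
  if (body.length > MAX_FRAME) throw new Error("frame too large");
  const len = Buffer.alloc(4);
  len.writeUInt32BE(body.length, 0);
  return Buffer.concat([len, body]);
}

function decodeFrame(buf) {
  const len = buf.readUInt32BE(0);
  if (len > MAX_FRAME) throw new Error("frame too large");
  return JSON.parse(buf.subarray(4, 4 + len).toString("utf8"));
}
```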
The request kinds are documented in Wire protocol. The relevant ones:
| Kind | Purpose |
|---|---|
| `ping` | Liveness check |
| `status` | Daemon status + counts |
| `capture` | Push a tool call event |
| `search` | Hybrid retrieval over summaries |
| `timeline` | Window of summaries around an id |
| `get` | Fetch summaries by id |
| `backfill` | Replay transcripts |
| `shutdown` | Graceful stop |
The capture-side hooks call `capture`. The retrieval-side MCP server calls `search`, `timeline`, and `get`. The CLI calls `ping`, `status`, `backfill`, and `shutdown`.
## Storage layout
Three primary tables in the SQLite database.
`events` — the raw capture log. Every tool call becomes a row.
| Column | Meaning |
|---|---|
| `id` | Auto-increment primary key |
| `ts` | Unix milliseconds when captured |
| `session_id` | Claude Code session that produced it |
| `tool` | Tool name (Read, Write, Edit, Bash, Grep, Glob) |
| `input_hash` | SHA-256 of the redacted payload (used for dedupe) |
| `payload_json` | Full payload after redaction |
| `tokens_est` | Rough token count for budgeting |
| `status` | `raw`, `summarized`, or `skipped` |
`summaries` — LLM-condensed versions of events. Many-events-to-many-summaries; the consolidator decides how to group.
| Column | Meaning |
|---|---|
| `id` | Auto-increment primary key |
| `event_id` | Source event |
| `text` | Plain-text summary |
| `created_at` | Unix milliseconds |
| `model` | Backend that produced it (`ollama:llama3.2:3b`, `anthropic:haiku`, etc.) |
| `tokens_used` | Inference tokens spent |
`summary_embeddings` — vector representations.
| Column | Meaning |
|---|---|
| `summary_id` | Foreign key into `summaries` |
| `vec` | BLOB containing a packed float32 array |
| `dim` | Vector dimension (typically 384 or 768) |
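The `vec` packing can be sketched as `dim` float32 values back-to-back (platform endianness via `Float32Array`; the store's actual byte-order convention is an assumption here):

```javascript
// Pack a float32 vector into a BLOB-shaped Buffer, and read it back.
function packVector(vec) {
  return Buffer.from(new Float32Array(vec).buffer);
}

function unpackVector(blob, dim) {
  return Array.from(new Float32Array(blob.buffer, blob.byteOffset, dim));
}
```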
Plus a few auxiliary tables: `sessions` (session metadata), `pattern_index` (the symbol extractor's output for code-shaped events), and the FTS5 virtual tables that back the BM25 leg of search.
The schema is created at first daemon start. Migrations are forward-only and run automatically; there is no manual migration step.
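Forward-only migration amounts to a version counter that only ever advances; a tiny illustration with a stand-in `db` object (not SiftCoder's actual code):

```javascript
// Forward-only migrations: each step runs at most once, in order.
// `db` here is a stand-in object, not a real SQLite handle.
const migrations = [
  (db) => db.applied.push("create events"),
  (db) => db.applied.push("create summaries"),
];

function migrate(db) {
  while (db.version < migrations.length) {
    migrations[db.version](db);
    db.version += 1; // never decremented: there are no down-migrations
  }
}
```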
## The capture path
```mermaid
sequenceDiagram
    participant CC as Claude Code
    participant Hook as PostToolUse Hook
    participant D as Daemon
    participant WAL as WAL file
    participant DB as SQLite
    CC->>Hook: tool call payload
    Hook->>Hook: redact secrets
    Hook->>D: capture frame (UDS)
    D->>WAL: append JSON line
    D->>DB: INSERT INTO events (status='raw')
    D-->>Hook: ok + event id
    Hook-->>CC: continue
```
The hook adds about 2 ms of latency to a tool call. The WAL flush is fsync'd so a crash mid-capture doesn't lose the event. The SQLite insert is single-row, indexed, and routinely under a millisecond.
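The durability step can be sketched as append-then-fsync (a minimal illustration; the daemon's actual writer is its own code):

```javascript
import { openSync, writeSync, fsyncSync, closeSync } from "node:fs";

// Append one event as an NDJSON line and fsync before acknowledging,
// so a crash mid-capture cannot lose an already-acknowledged event.
function walAppend(path, event) {
  const fd = openSync(path, "a");
  try {
    writeSync(fd, JSON.stringify(event) + "\n");
    fsyncSync(fd); // flushed to disk before the capture is acked
  } finally {
    closeSync(fd);
  }
}
```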
## The consolidation path
```mermaid
sequenceDiagram
    participant Tick as 30s tick
    participant D as Daemon
    participant DB as SQLite
    participant LLM as Backend (Ollama/Anthropic)
    participant E as Embedder
    Tick->>D: wake up
    D->>DB: SELECT raw events LIMIT 16
    D->>LLM: summarise batch
    LLM-->>D: summary text
    D->>DB: INSERT INTO summaries
    D->>E: embed(summary text)
    E-->>D: float32[dim]
    D->>DB: INSERT INTO summary_embeddings
    D->>DB: UPDATE events SET status='summarized'
```
The consolidator processes events in batches. Default batch size is 16, default tick is 30 seconds — both tunable in config.json. If the LLM rejects a batch (rate limit, network error, etc.) the events stay raw and are retried on the next tick. After a configurable number of failures they are marked skipped and never retried.
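The retry policy above can be sketched as a pure status transition (field and option names here are illustrative, not SiftCoder's):

```javascript
// Events stay "raw" across transient failures, then flip to "skipped"
// once the failure count reaches maxFailures; success summarizes.
function nextStatus(event, batchSucceeded, maxFailures = 3) {
  if (batchSucceeded) return { ...event, status: "summarized" };
  const failures = (event.failures ?? 0) + 1;
  return failures >= maxFailures
    ? { ...event, failures, status: "skipped" } // never retried again
    : { ...event, failures, status: "raw" };    // retried next tick
}
```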
## The retrieval path
```mermaid
sequenceDiagram
    participant Claude
    participant MCP as mem_search tool
    participant D as Daemon
    participant DB as SQLite
    participant E as Embedder
    Claude->>MCP: query, k
    MCP->>D: search request
    D->>DB: BM25 over FTS5
    DB-->>D: top N lexical hits
    D->>E: embed(query)
    E-->>D: query vector
    D->>DB: cosine over summary_embeddings
    DB-->>D: top N semantic hits
    D->>D: RRF fuse + decay reweight
    D-->>MCP: top k summaries
    MCP-->>Claude: ranked context
```
Retrieval blends BM25 and vector similarity through reciprocal rank fusion (RRF), which is more robust than weighted sums and doesn't require either method to know the other's score scale. The fused list is then multiplied by an exponential time decay so older memories naturally fade. Retrieval is read-only; it never mutates the store.
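The fuse-and-decay step can be sketched like this (the RRF constant and half-life are illustrative, not SiftCoder's actual tuning):

```javascript
// Reciprocal rank fusion over two ranked lists, then exponential time decay.
// Each hit is { id, created_at } with created_at in Unix milliseconds.
function rrfFuse(lexical, semantic, { k = 60, halfLifeMs = 7 * 24 * 3600 * 1000, now = Date.now() } = {}) {
  const score = new Map();
  for (const list of [lexical, semantic]) {
    list.forEach((hit, rank) => {
      // RRF: each list contributes 1/(k + rank); scores from both lists add up,
      // so items appearing in both lists rise without comparing raw scores.
      score.set(hit.id, (score.get(hit.id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  const byId = new Map([...lexical, ...semantic].map((h) => [h.id, h]));
  return [...score.entries()]
    .map(([id, s]) => {
      const age = now - byId.get(id).created_at;
      // Halve the fused score every halfLifeMs so older memories fade.
      return { id, score: s * Math.pow(0.5, age / halfLifeMs) };
    })
    .sort((a, b) => b.score - a.score);
}
```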
## Hooks layer
SiftCoder ships seven hooks. Three are essential, four are quality-of-life.
Essential:
- `SessionStart` (`hooks/session-start/ensure-built.mjs`) — verifies `dist/` exists, runs the `v3 → default` namespace migration if needed, and starts the daemon if it isn't running.
- `PreToolUse` (`hooks/pre-tool-use/boundary-enforcer.mjs`) — blocks `Write`/`Edit` outside the configured scope.
- `PostToolUse` (`hooks/post-tool-use/capture-observation.mjs`) — captures the tool call.
Quality of life:
- `UserPromptSubmit` — injects relevant memories into Claude's context based on the user's prompt.
- `Stop` — flushes the WAL and runs a fast drain pass before the session ends.
- `SubagentStop` — same as `Stop`, but for subagent terminations.
- `SessionEnd` — final cleanup; writes a session-summary log.
Each hook reads its config from the plugin's `settings.json` and respects per-workspace overrides at `.siftcoder/config.json`.
## MCP integration
SiftCoder's MCP server is registered in `.claude-plugin/plugin.json` as `siftcoder-memory`. It exposes five tools to Claude:
| Tool | Purpose |
|---|---|
| `mem_search` | Hybrid retrieval over summaries |
| `mem_get` | Fetch by id |
| `mem_timeline` | Window of summaries around an id |
| `mem_drain` | Force a consolidation pass |
| `mem_why` | Explain why a hit ranked where it did (debug) |
Claude calls these the same way it calls any other MCP tool. There is no special wiring — they look like Read/Write/Bash from the LLM's perspective. The tools and the MCP transport are documented in MCP server.
## Where to look when something breaks
| Symptom | First place to look |
|---|---|
| "Daemon unreachable" | `~/.siftcoder/default/logs/<workspace>.ndjson` — the last thirty lines |
| Capture not happening | Run `/siftcoder:mem info` and check the `events` counter; if it's not moving, hooks aren't firing |
| Summarisation stuck at zero | The `backends` line in `info`; force a drain with `/siftcoder:mem drain 8` and read the error |
| Search returns nothing | Check that the `summaries` and `embeddings` counts match; if `embeddings` is far behind, the embedder is failing — see backend config |
| Web UI won't open | `~/.siftcoder/default/workspaces/<key>/http.port` should exist; if it doesn't, the bridge crashed at startup |
| Boundary enforcer not blocking | `.siftcoder/scope.json` syntax; it's strict JSON, and a stray comma or a comment will silently skip the file |
The Troubleshooting page goes through each of these in detail.
That's the system. Hooks feed the daemon, the daemon writes to the store, retrieval reads back. Everything else in this guide is a consequence of those four pieces.