Architecture overview

This chapter is the map. It names every directory, every process, every socket, every database, and every wire format. Once you've read it, you can debug SiftCoder by inspecting the system instead of guessing.

On disk

```
~/.siftcoder/<namespace>/
├── config.json                   # global config (per-namespace)
├── run/
│   └── <workspace>.sock          # Unix domain socket per workspace
├── workspaces/
│   └── <workspace>/
│       ├── db.sqlite             # the memory store
│       ├── wal.ndjson            # write-ahead log (line-delimited JSON)
│       ├── run.pid               # daemon pid file
│       └── http.port             # web UI port (when bridge is up)
└── logs/
    └── <workspace>.ndjson        # daemon log stream
```

<namespace> defaults to default. You can change it with the SIFTCODER_NS env var when you want hard isolation between, say, work and personal.

<workspace> is a 12-hex-character prefix of the SHA-256 hash of the absolute, symlink-resolved path (realpath) of the git toplevel of your current directory (or of the directory itself, if it isn't a git repo). Two terminals in the same repo share the same workspace key, and a symlink trick can't accidentally create a second workspace, because realpath resolves it first.
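The derivation can be sketched in a few lines. This is a sketch under the assumptions described above (hash the resolved path, take the first 12 hex characters); the function name is illustrative, not SiftCoder's actual code:

```javascript
import { createHash } from "node:crypto";
import { realpathSync } from "node:fs";

// Illustrative sketch: derive the workspace key from a directory path.
function workspaceKey(dir) {
  const resolved = realpathSync(dir); // collapses symlinks and ../ segments
  return createHash("sha256").update(resolved).digest("hex").slice(0, 12);
}
```

Because the path is resolved before hashing, a symlinked checkout and its target produce the same key.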

The plugin itself ships in two places. The repo (or marketplace registry) lives at ~/.claude/plugins/marketplaces/siftcoder-marketplace/ (this is where Claude Code reads versions from). The installed copy used at runtime is at ~/.claude/plugins/cache/siftcoder-marketplace/siftcoder/<version>/. The latter is what hooks invoke.

Processes

When you start a daemon, you get exactly one Node process per workspace. Visible in ps:

```
node /Users/you/.claude/plugins/cache/siftcoder-marketplace/siftcoder/1.0.8/dist/memory/daemon/index.js
```

The daemon owns the SQLite handle, the WAL file, and the Unix socket. It also runs the consolidator on a thirty-second interval and the embedding pass shortly after that. A separate, optional HTTP bridge runs inside the same process and exposes the web UI on a localhost port (random, written to http.port).

When Claude Code is running, it spawns short-lived hook processes for each tool call. They connect to the daemon's socket, push a frame, and exit. They are invisible in ps because they finish in milliseconds.

When you call any /siftcoder:mem slash command, the bundled CLI at bin/siftcoder.mjs runs as a one-shot Node process. It either talks to the daemon over the socket (status, info, backfill, web) or operates directly on the SQLite file (drain, which needs to load LLM clients).

Wire protocol

Every connection over the Unix socket uses the same length-prefixed JSON frame:

```
+------+--------------------+
| len  | UTF-8 JSON body    |
+------+--------------------+
| 4 B  | <len> bytes        |
+------+--------------------+
```

The len is a 4-byte big-endian unsigned integer. Maximum frame is 16 MiB. Each connection is request-response: client opens, sends one frame, reads one frame, closes.
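A minimal codec for this framing, assuming only what the spec above states (4-byte big-endian length, UTF-8 JSON body, 16 MiB cap); the function names are illustrative:

```javascript
const MAX_FRAME = 16 * 1024 * 1024; // 16 MiB, per the protocol limit

// Serialise an object into a length-prefixed frame.
function encodeFrame(obj) {
  const body = Buffer.from(JSON.stringify(obj), "utf8");
  if (body.length > MAX_FRAME) throw new Error("frame exceeds 16 MiB");
  const header = Buffer.alloc(4);
  header.writeUInt32BE(body.length, 0); // big-endian length prefix
  return Buffer.concat([header, body]);
}

// Parse one complete frame back into an object.
function decodeFrame(buf) {
  const len = buf.readUInt32BE(0);
  return JSON.parse(buf.subarray(4, 4 + len).toString("utf8"));
}
```

Since each connection carries exactly one request and one response, the reader never has to handle more than one frame per direction.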

The request kinds are documented in Wire protocol. The relevant ones:

| Kind | Purpose |
| --- | --- |
| ping | Liveness check |
| status | Daemon status + counts |
| capture | Push a tool call event |
| search | Hybrid retrieval over summaries |
| timeline | Window of summaries around an id |
| get | Fetch summaries by id |
| backfill | Replay transcripts |
| shutdown | Graceful stop |

The capture-side hooks call capture. The retrieval-side MCP server calls search, timeline, and get. The CLI calls ping, status, backfill, shutdown.

Storage layout

Three primary tables in the SQLite database.

events — the raw capture log. Every tool call becomes a row.

| Column | Meaning |
| --- | --- |
| id | Auto-increment primary key |
| ts | Unix milliseconds when captured |
| session_id | Claude Code session that produced it |
| tool | Tool name (Read, Write, Edit, Bash, Grep, Glob) |
| input_hash | SHA-256 of the redacted payload (used for dedupe) |
| payload_json | Full payload after redaction |
| tokens_est | Rough token count for budgeting |
| status | raw, summarized, or skipped |

summaries — LLM-condensed versions of events. Many-events-to-many-summaries; the consolidator decides how to group.

| Column | Meaning |
| --- | --- |
| id | Auto-increment primary key |
| event_id | Source event |
| text | Plain-text summary |
| created_at | Unix milliseconds |
| model | Backend that produced it (ollama:llama3.2:3b, anthropic:haiku, etc.) |
| tokens_used | Inference tokens spent |

summary_embeddings — vector representations.

| Column | Meaning |
| --- | --- |
| summary_id | Foreign key into summaries |
| vec | BLOB containing a packed float32 array |
| dim | Vector dimension (typically 384 or 768) |
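As a sketch of how a vector might round-trip through the vec BLOB and be scored, assuming host byte order and illustrative helper names (not SiftCoder's actual packing code):

```javascript
// Pack a plain number array into a float32 BLOB.
function packVec(values) {
  return Buffer.from(Float32Array.from(values).buffer);
}

// Reinterpret the BLOB as a float32 view of length dim.
function unpackVec(buf, dim) {
  return new Float32Array(buf.buffer, buf.byteOffset, dim);
}

// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / Math.sqrt(na * nb);
}
```

A dim-384 vector packs into a 1536-byte BLOB (4 bytes per float), which is why the dim column is stored alongside it: the BLOB alone doesn't say where one float ends.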

Plus a few auxiliary tables: sessions (session metadata), pattern_index (the symbol extractor's output for code-shaped events), and the FTS5 virtual tables that back the BM25 leg of search.

The schema is created at first daemon start. Migrations are forward-only and run automatically; there is no manual migration step.

The capture path

```mermaid
sequenceDiagram
    participant CC as Claude Code
    participant Hook as PostToolUse Hook
    participant D as Daemon
    participant WAL as WAL file
    participant DB as SQLite

    CC->>Hook: tool call payload
    Hook->>Hook: redact secrets
    Hook->>D: capture frame (UDS)
    D->>WAL: append JSON line
    D->>DB: INSERT INTO events (status='raw')
    D-->>Hook: ok + event id
    Hook-->>CC: continue
```
The hook adds about 2 ms of latency to a tool call. The WAL flush is fsync'd so a crash mid-capture doesn't lose the event. The SQLite insert is single-row, indexed, and routinely under a millisecond.
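The redact step amounts to a pure function over the payload text. A sketch with illustrative patterns (the real hook's rules are not documented here, so treat these regexes as examples only):

```javascript
// Hypothetical redaction patterns; the shipped hook's list will differ.
const SECRET_PATTERNS = [
  /sk-[A-Za-z0-9]{20,}/g,                 // API-key-shaped tokens
  /(?<=AWS_SECRET_ACCESS_KEY=)\S+/g,      // env-style secret assignments
  /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g,
];

// Replace anything secret-shaped before the payload leaves the hook process.
function redact(text) {
  return SECRET_PATTERNS.reduce((out, re) => out.replace(re, "[REDACTED]"), text);
}
```

Running redaction in the hook, before the frame is sent, means secrets never touch the WAL or the database at all.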

The consolidation path

```mermaid
sequenceDiagram
    participant Tick as 30s tick
    participant D as Daemon
    participant DB as SQLite
    participant LLM as Backend (Ollama/Anthropic)
    participant E as Embedder

    Tick->>D: wake up
    D->>DB: SELECT raw events LIMIT 16
    D->>LLM: summarise batch
    LLM-->>D: summary text
    D->>DB: INSERT INTO summaries
    D->>E: embed(summary text)
    E-->>D: float32[dim]
    D->>DB: INSERT INTO summary_embeddings
    D->>DB: UPDATE events SET status='summarized'
```

The consolidator processes events in batches. Default batch size is 16, default tick is 30 seconds — both tunable in config.json. If the LLM rejects a batch (rate limit, network error, etc.) the events stay raw and are retried on the next tick. After a configurable number of failures they are marked skipped and never retried.
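The stay-raw-then-skip behaviour reduces to a small status transition per event. A sketch, where nextStatus, the failures counter, and maxFailures are illustrative names for the mechanism described above:

```javascript
// Decide an event's next status after a consolidation attempt.
// Success -> summarized; failure -> stay raw until the budget is spent,
// then skipped (never retried).
function nextStatus(event, batchSucceeded, maxFailures = 3) {
  if (batchSucceeded) return { ...event, status: "summarized", failures: 0 };
  const failures = (event.failures ?? 0) + 1;
  return {
    ...event,
    status: failures >= maxFailures ? "skipped" : "raw",
    failures,
  };
}
```

Keeping failed events in status raw is what makes the retry free: the next tick's SELECT over raw events picks them up again with no extra bookkeeping.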

The retrieval path

```mermaid
sequenceDiagram
    participant Claude
    participant MCP as mem_search tool
    participant D as Daemon
    participant DB as SQLite
    participant E as Embedder

    Claude->>MCP: query, k
    MCP->>D: search request
    D->>DB: BM25 over FTS5
    DB-->>D: top N lexical hits
    D->>E: embed(query)
    E-->>D: query vector
    D->>DB: cosine over summary_embeddings
    DB-->>D: top N semantic hits
    D->>D: RRF fuse + decay reweight
    D-->>MCP: top k summaries
    MCP-->>Claude: ranked context
```

Retrieval blends BM25 and vector similarity through reciprocal rank fusion (RRF), which is more robust than weighted sums and doesn't require either method to know the other's score scale. The fused list is then multiplied by an exponential time decay so older memories naturally fade. Retrieval is read-only; it never mutates the store.
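A minimal sketch of the fuse-and-decay step. The RRF constant k = 60 is the common default from the literature, and the half-life is an illustrative parameter; SiftCoder's actual values may differ:

```javascript
// Fuse two ranked id lists with reciprocal rank fusion, then apply
// exponential time decay based on each hit's age in milliseconds.
function fuse(lexicalIds, semanticIds, ages, { k = 60, halfLifeMs = 7 * 86400e3 } = {}) {
  const score = new Map();
  for (const list of [lexicalIds, semanticIds]) {
    list.forEach((id, rank) => {
      score.set(id, (score.get(id) ?? 0) + 1 / (k + rank + 1)); // RRF term
    });
  }
  for (const [id, s] of score) {
    const age = ages.get(id) ?? 0;
    score.set(id, s * Math.pow(0.5, age / halfLifeMs)); // halves per half-life
  }
  return [...score.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}
```

Note that RRF only consumes ranks, never raw scores, which is exactly why BM25 and cosine similarity can be blended without normalising either.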

Hooks layer

SiftCoder ships seven hooks. Three are essential, four are quality-of-life.

Essential:

  • SessionStart (hooks/session-start/ensure-built.mjs) — verifies dist/ exists, runs the v3 → default namespace migration if needed, starts the daemon if it isn't running.
  • PreToolUse (hooks/pre-tool-use/boundary-enforcer.mjs) — blocks Write/Edit outside the configured scope.
  • PostToolUse (hooks/post-tool-use/capture-observation.mjs) — captures the tool call.

Quality of life:

  • UserPromptSubmit — injects relevant memories into Claude's context based on the user's prompt.
  • Stop — flushes WAL and runs a fast drain pass before the session ends.
  • SubagentStop — same as Stop but for subagent terminations.
  • SessionEnd — final cleanup, writes a session-summary log.

Each hook reads its config from the plugin's settings.json and respects per-workspace overrides at .siftcoder/config.json.
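A per-workspace override follows the same shape as the global file. The field names below are purely illustrative, not the actual schema; check the installed plugin's settings.json for the real keys. The 30-second tick and batch size of 16 are the documented defaults:

```json
{
  "consolidator": { "tickSeconds": 30, "batchSize": 16 },
  "capture": { "redact": true }
}
```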

MCP integration

SiftCoder's MCP server is registered in .claude-plugin/plugin.json as siftcoder-memory. It exposes the following tools to Claude:

| Tool | Purpose |
| --- | --- |
| mem_search | Hybrid retrieval over summaries |
| mem_get | Fetch by id |
| mem_timeline | Window of summaries around an id |
| mem_drain | Force a consolidation pass |
| mem_why | Explain why a hit ranked where it did (debug) |

Claude calls these the same way it calls any other MCP tool. There is no special wiring — they look like Read/Write/Bash from the LLM's perspective. The tools and the MCP transport are documented in MCP server.

Where to look when something breaks

| Symptom | First place to look |
| --- | --- |
| "Daemon unreachable" | ~/.siftcoder/default/logs/<workspace>.ndjson (the last thirty lines) |
| Capture not happening | Run /siftcoder:mem info and check the events counter; if it's not moving, hooks aren't firing |
| Summarisation stuck at zero | The backends line in info; force a drain with /siftcoder:mem drain 8 and read the error |
| Search returns nothing | Check that summaries and embeddings counts match; if embeddings is far behind, the embedder is failing; see backend config |
| Web UI won't open | ~/.siftcoder/default/workspaces/<key>/http.port should exist; if it doesn't, the bridge crashed at startup |
| Boundary enforcer not blocking | .siftcoder/scope.json syntax; it's strict JSON, and a trailing comma or a comment will silently skip the file |

The Troubleshooting page goes through each of these in detail.

That's the system. Hooks feed the daemon, the daemon writes to the store, retrieval reads back. Everything else in this guide is a consequence of those four pieces.