# BrainLayer Data Locations

Single source of truth for where all data lives, where it moved from, and the archive strategy.

## Active Data

| What | Path | Size | Notes |
|---|---|---|---|
| Main database | `~/.local/share/zikaron/zikaron.db` | ~3.8 GB | 268K+ chunks, sqlite-vec + FTS5 |
| knowledge.db | `~/.local/share/zikaron/knowledge.db` | symlink | Points to zikaron.db |
| Current sessions | `~/.claude/projects/{encoded-path}/*.jsonl` | ~805 files | Claude Code session transcripts |
| Archived sessions | `~/.claude-archive/{project-id}/archive-{timestamp}/` | 1.2 GB | Moved by session-archiver |

## Path Resolution

BrainLayer resolves the database path in this order (see `src/brainlayer/paths.py`; a sketch follows the list):

  1. BRAINLAYER_DB env var — explicit override
  2. ~/.local/share/zikaron/zikaron.db — legacy path (if exists, use it)
  3. ~/.local/share/brainlayer/brainlayer.db — canonical path (for fresh installs)
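
As a rough illustration, the resolution order above can be expressed like this (a minimal sketch; the function name is hypothetical and this is not the actual contents of paths.py):

```python
import os
from pathlib import Path

def resolve_db_path() -> Path:
    """Sketch of the three-step resolution order."""
    # 1. Explicit override wins unconditionally.
    override = os.environ.get("BRAINLAYER_DB")
    if override:
        return Path(override).expanduser()
    # 2. Legacy Zikaron path is used whenever it already exists.
    legacy = Path("~/.local/share/zikaron/zikaron.db").expanduser()
    if legacy.exists():
        return legacy
    # 3. Fresh installs fall through to the canonical path.
    return Path("~/.local/share/brainlayer/brainlayer.db").expanduser()
```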

### Why the legacy path?

The project was originally called "Zikaron" and all data lives at the legacy path. Renaming the 3.8 GB database is risky and unnecessary — the code resolves it automatically. When users install BrainLayer fresh (no existing data), it uses the canonical path.

## Session Archiver

Service: `com.brainlayer.session-archiver` (launchd, runs daily at 4am)

How it works (sketched in code after the list):

  1. Scans ~/.claude/projects/ for all session JSONL files
  2. Keeps last 7 days of active sessions per project
  3. Moves older sessions to ~/.claude-archive/{project-id}/archive-{timestamp}/
  4. Writes manifest.json per batch (UUIDs, timestamps, sizes)
  5. After BrainLayer indexes the archived sessions, the archiver cleans up verified copies
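
A simplified sketch of steps 1–4, using an mtime-based cutoff (the real service also handles subagent directories, project-id encoding, and the post-index cleanup step, so treat every name here as illustrative):

```python
import json
import shutil
import time
from datetime import datetime, timezone
from pathlib import Path

PROJECTS = Path("~/.claude/projects").expanduser()
ARCHIVE = Path("~/.claude-archive").expanduser()
KEEP_DAYS = 7

def archive_old_sessions() -> None:
    """Move sessions older than KEEP_DAYS into a timestamped batch dir."""
    cutoff = time.time() - KEEP_DAYS * 86400
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H-%M-%S")
    for project in PROJECTS.iterdir():
        if not project.is_dir():
            continue
        stale = [p for p in project.glob("*.jsonl") if p.stat().st_mtime < cutoff]
        if not stale:
            continue
        batch = ARCHIVE / project.name / f"archive-{stamp}"
        batch.mkdir(parents=True, exist_ok=True)
        manifest = {"archivedAt": stamp, "projectId": project.name, "sessions": []}
        for src in stale:
            # Record metadata before the move, while the file still exists.
            manifest["sessions"].append({"uuid": src.stem, "size": src.stat().st_size})
            shutil.move(str(src), str(batch / src.name))
        (batch / "manifest.json").write_text(json.dumps(manifest, indent=2))
```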

Archive structure:

```
~/.claude-archive/
  my-project/
    archive-2026-02-09T02-00-05/
      {uuid}.jsonl           # Archived session transcript
      {uuid}/                # Optional: subagent files
      manifest.json          # Batch metadata
    archive-2026-02-10T02-00-05/
      ...
  domica/
    ...
  songscript/
    ...
```

Manifest format:

```json
{
  "archivedAt": "2026-02-09T02:00:05.123Z",
  "projectId": "my-project",
  "originalPath": "/Users/username/Gits/my-project",
  "sessions": [
    {
      "uuid": "abc123...",
      "originalMtime": "2026-02-07T15:30:00.000Z",
      "size": 524288,
      "hasSubdir": true,
      "firstMessageTimestamp": "2026-02-07T15:28:42.123Z",
      "gitBranch": "feature/some-branch"
    }
  ],
  "metadata": {
    "archiver_version": "1.1.0",
    "sessions_kept": 7,
    "total_archived": 12,
    "total_size_bytes": 6291456
  }
}
```
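
Since every batch carries this manifest, archive-wide stats can be derived with a short script. The field names match the format above; the glob pattern and the script itself are illustrative:

```python
import json
from pathlib import Path

ARCHIVE = Path("~/.claude-archive").expanduser()

# Sum total_size_bytes across every batch manifest.
total_bytes = sum(
    json.loads(m.read_text())["metadata"]["total_size_bytes"]
    for m in ARCHIVE.glob("*/archive-*/manifest.json")
)
print(f"{total_bytes / 1e9:.2f} GB across all archive batches")
```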

## Backups (Manual)

Before any bulk operation, back up the database:

```bash
# WAL-safe copy (on legacy installs, substitute ~/.local/share/zikaron/zikaron.db)
sqlite3 ~/.local/share/brainlayer/brainlayer.db \
  "VACUUM INTO '/path/to/backup/brainlayer-$(date +%Y%m%d).db'"
```

Store backups in your preferred location (iCloud, external drive, etc.).
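
To confirm a backup is intact before relying on it, a quick integrity check from Python works (the backup filename here is illustrative):

```python
import sqlite3

# PRAGMA integrity_check returns the single row "ok" on a healthy database.
con = sqlite3.connect("/path/to/backup/brainlayer-20260219.db")
print(con.execute("PRAGMA integrity_check").fetchone()[0])
con.close()
```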

## Historical: Data Migrations

### Repo path change (Jan-Feb 2026)

Repos moved from `~/Desktop/Gits/` to `~/Gits/`. This means:

- Old chunks reference `~/.claude/projects/-Users-username-Desktop-Gits-{repo}/`
- New chunks reference `~/.claude/projects/-Users-username-Gits-{repo}/`
- The old JSONL session files at the Desktop paths no longer exist

### Session archiver setup (Feb 9, 2026)

Before the archiver was set up, old sessions were manually deleted. ~160K chunks reference sessions that no longer exist anywhere. These chunks are still searchable — they just don't have created_at timestamps.

### BrainLayer extraction (Feb 19, 2026)

Extracted to a standalone repository. The code moved; the data stayed at `~/.local/share/zikaron/zikaron.db`. `paths.py` handles the legacy path transparently.

### Vertex AI Batch Enrichment (Feb 17-18, 2026)

- 153,825 chunks submitted to Vertex AI batch prediction
- Results imported Feb 18 at 08:00 (135,865 chunks enriched)
- Job tracking: `scripts/backfill_data/vertex_jobs.json`
- Predictions stored in: `scripts/backfill_data/predictions/`

## Coverage Stats (as of Feb 19, 2026)

| Metric | Count | Percentage |
|---|---|---|
| Total chunks | 268,864 | 100% |
| Have created_at | 107,935 | 40.1% |
| Missing created_at | 160,929 | 59.9% |
| Enriched | 144,146 | 53.6% |
| Enrichable but not enriched | 22,974 | 8.5% |
| Too small to enrich (<50 chars) | 101,744 | 37.8% |
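
These figures can be recomputed directly from the database. A sketch assuming a `chunks` table with a nullable `created_at` column (the actual schema may differ):

```python
import sqlite3
from pathlib import Path

db = Path("~/.local/share/zikaron/zikaron.db").expanduser()
con = sqlite3.connect(db)

# COUNT(created_at) skips NULLs, giving the "Have created_at" figure.
total, with_ts = con.execute(
    "SELECT COUNT(*), COUNT(created_at) FROM chunks"
).fetchone()
print(f"{with_ts:,} of {total:,} chunks ({with_ts / total:.1%}) have created_at")
```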

The 160K chunks without created_at are from pre-archiver sessions whose JSONL files were deleted. The chunks themselves are fully indexed and searchable — date filtering just won't apply to them (they'll always be included in unfiltered searches).