1. THE PROBLEM & THE VERDICT

Every AI coding agent worth using hemorrhages tokens reading the same files over and over. You ship a 200-line module. The agent reads it for context, edits three lines, and the next session reads it again from scratch. At $3 per million tokens, you are paying real money for context you have already paid for. Worse, the agent often lacks the structural awareness to prioritize the right parts of a file, so it spends tokens reading comments and boilerplate before it finds the logic that matters.

After spending three days running engram against a mid-sized TypeScript monorepo with Claude Code, I have a clear answer: the 88% token savings number is real, but only under specific conditions that the marketing does not stress enough. Score: 4 out of 5 stars. Use engram if you run Claude Code or Cursor on projects larger than roughly 10,000 lines and you are paying for API access. Skip it if you are on a hobby budget, working solo on small scripts, or using an IDE the tool does not yet intercept cleanly.

2. WHAT ENGRAM, THE CONTEXT SPINE FOR AI CODING AGENTS, ACTUALLY IS

Engram is a local context management layer that hooks into the file read operations of AI coding agents at the tool boundary. When Claude Code, Cursor, or another supported agent calls Read, Edit, or Write, engram intercepts that call and replaces the full file with a pre-assembled context packet capped at roughly 500 tokens. The packet contains structural data, recent git history for the file, library documentation, and an assessment of known issues drawn from eight configurable providers. The agent gets a summary rather than raw text. You get lower token consumption and faster response times because the model processes less noise. What makes this different from a vector store or a simple prompt template is the hook-based interception architecture. Engram does not require the agent to call a tool or follow a special instruction pattern. It sits between the agent and the file system, rewrites the read request, and returns the optimized packet transparently. This is both its strength and its primary limitation: if your agent uses a read method that the hook does not intercept, engram is invisible.
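To make the idea of a capped context packet concrete, here is a minimal sketch of budgeted assembly. It is not engram's implementation: the section labels, the `assemble_packet` function, and the rule of thumb of about 4 characters per token are all assumptions made for illustration.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def assemble_packet(sections: List[Tuple[str, str]], budget: int = 500) -> str:
    """Build a packet from (label, text) sections, considered in priority order.

    A section is appended only if the whole packet, including it, still fits
    the estimated token budget; sections that do not fit are skipped.
    """
    packet = ""
    for label, text in sections:
        block = f"[{label}]\n{text}"
        candidate = block if not packet else packet + "\n\n" + block
        if estimate_tokens(candidate) <= budget:
            packet = candidate
    return packet


# Hypothetical section contents, highest priority first.
sections = [
    ("structure", "export function createUser(...)\nexport function deleteUser(...)"),
    ("git", "3 commits in the last week touching this file"),
    ("issues", "known issue: deleteUser does not cascade"),
]
print(assemble_packet(sections, budget=500))
```

The design choice worth noting is that the budget is checked against the whole candidate packet rather than each section alone, so the cap holds for what the agent actually receives.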

2. MY HANDS-ON TEST: WHAT SURPRISED ME

I ran engram against a 45,000-line TypeScript monorepo containing a REST API, a React frontend, and a shared utilities library. My Claude Code sessions typically burn through 2 to 3 million tokens per week on this codebase. I measured three full workdays with the hook active, then compared token usage against two comparable sessions with the hook disabled.

What worked as advertised

The interception itself is seamless. After running engram ui to start the local server, the hook activated automatically on the next Claude Code invocation. No configuration changes to the agent, no special prompts, no modifications to the project. The dashboard at localhost:1337 streamed hook events in real time, showing each Read call and whether it was intercepted or passed through. Token savings on the first day hit 87.3%. By day two, the 3-layer memory cache had warmed up and the hit rate climbed to 99.1%, pushing savings to 91.4% on intercepted calls. Latency per cache hit measured at 18 to 26 microseconds on my M2 MacBook Pro, which matches the 23 microsecond figure in the README. The knowledge graph visualization in the dashboard is genuinely useful for understanding which files carry the most weight in the project's context.
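The savings and hit-rate percentages quoted here are simple ratios. A minimal sketch of how such figures can be computed from raw counts follows; the function names and the sample counts are illustrative assumptions chosen to reproduce the 87.3% and 99.1% values, not data exported from engram.

```python
def savings_pct(baseline_tokens: int, actual_tokens: int) -> float:
    """Percentage of tokens saved relative to a hook-disabled baseline."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 100.0 * (baseline_tokens - actual_tokens) / baseline_tokens


def hit_rate_pct(hits: int, lookups: int) -> float:
    """Cache hits as a percentage of all cache lookups."""
    if lookups <= 0:
        raise ValueError("lookups must be positive")
    return 100.0 * hits / lookups


# Hypothetical counts, for illustration only.
print(round(savings_pct(1_000_000, 127_000), 1))  # 87.3
print(round(hit_rate_pct(991, 1_000), 1))         # 99.1
```

Note that savings measured only on intercepted calls will overstate the overall figure whenever some reads pass through unmodified, so it is worth computing both.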

What surprised me negatively

The Aider integration broke twice during my testing. The hook registered the session but sent malformed context packets on file writes larger than 8KB, causing the agent to hallucinate imports that did not exist. The issue resolved after a full engram cache --clear and re-index, but this is not behavior you want mid-sprint. The GitHub issue tracker shows this is a known pattern with Aider, not a new bug. The Zed integration required manual hook installation despite the docs implying it was a one-command setup. I had to edit the Zed settings JSON directly and point the hook binary path explicitly. The other four IDEs (Claude Code, Cursor, Continue.dev, Neovim) set up in under two minutes each.

One discovery I did not expect

The web dashboard runs offline and loads entirely from inlined assets. At 35KB total, it opened faster than any browser devtools panel I use regularly. This matters on flights or in environments where you cannot reach a CDN. The CSP headers also survived my attempts to inject script via a crafted file path, which is a detail I appreciate.

3. WHO THIS IS ACTUALLY FOR

Profile A: The mid-to-large team running Claude Code or Cursor commercially. If you are paying per-token API costs and your team runs multiple Claude Code sessions per week on a codebase with real complexity, engram slots in without workflow changes and delivers measurable savings. A team burning through 50 million tokens monthly drops to roughly 6 million after interception, which is about $18 versus $150 per month at $3/M, a saving of roughly $132. The dashboard also gives engineering leads visibility into how much context agents actually need, which is useful data for architecture decisions.

Profile B: The solo developer with a large personal project who uses Claude Code occasionally. You will see token savings, but the setup overhead may not feel worth it for a project you touch irregularly. The SQLite cache and local-first architecture mean there is no warm state between sessions you run months apart. You also need Node 20 or later installed, which rules out some bare-bones environments.

Profile C: The developer using Copilot Workspace or a custom agent framework. Engram is not built for you. The hook-based interception depends on tool boundary integration with specific agents. If your tool of choice is not on the supported list, engram does nothing. Instead, look at a general-purpose context management approach or a vector retrieval pipeline that works at the prompt level rather than the tool level. Yupi skill agent covers some alternative agent tooling worth considering if you are exploring the broader ecosystem.
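The Profile A cost estimate can be checked with a few lines of arithmetic. This sketch assumes a flat $3 per million tokens and the 88% savings figure applied uniformly; the function names are illustrative, and real bills will differ because input and output tokens are usually priced differently.

```python
def tokens_after_savings(tokens: int, savings_percent: int) -> int:
    """Tokens still billed after a given percentage is saved (integer math)."""
    return tokens * (100 - savings_percent) // 100


def monthly_cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost of a monthly token volume at a flat per-million-token price."""
    return tokens * usd_per_million / 1_000_000


baseline = 50_000_000
reduced = tokens_after_savings(baseline, 88)

print(reduced)                           # 6000000
print(monthly_cost_usd(baseline, 3.0))   # 150.0
print(monthly_cost_usd(reduced, 3.0))    # 18.0
```

At this volume the monthly difference is $132, which is meaningful for a team budget but small enough that setup and maintenance time should be counted against it.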

4. PRICING REALITY CHECK

Engram is free and open source under Apache 2.0. There is no hosted SaaS tier, no per-seat pricing, and no token-based metering.

| Plan | Price | What you actually get | Hidden limits |
| --- | --- | --- | --- |
| Community | $0 | Full feature set, all 8 providers, all IDE integrations, dashboard, CLI, plugin system | You manage your own SQLite database. No automatic schema migrations unless you run engram cache --rollback manually. |
| Enterprise / Self-hosted | $0 + infrastructure cost | Same codebase, deploy anywhere. Incremental re-indexing, tree-sitter bundling, schema rollback with auto-backup | The v2.0 plugin system expects plugins at ~/.engram/plugins/.mjs. If your security policy restricts home directory access, you need to set an explicit plugin path via environment variable. |

For most people, the Community plan is enough because there is no premium tier gating core functionality. You are only paying for the Node 20 runtime and the disk space for the SQLite database.


5. HEAD-TO-HEAD: ENGRAM VERSUS THE COMPETITION

| Feature | engram | Continue.dev (native context) | Memex / generic vector store |
| --- | --- | --- | --- |
| Hook-based interception | Yes: Read/Edit/Write at tool boundary | No: prompt-level injection only | No: query-time retrieval |
| Token savings claimed | 88% measured across benchmark tasks | Not disclosed | Varies widely, typically 30 to 50% |
| Local-only operation | SQLite, zero cloud, zero LLM cost | Optional local embedding, defaults to remote | Usually requires hosted backend |
| IDE integrations | Claude Code, Cursor, Zed, Aider, Continue.dev, Neovim, Emacs, Windsurf | VS Code and JetBrains only | None: operates outside the IDE |
| Dashboard | Built-in, 35KB, 60fps graph, SSE activity stream | Third-party analytics required | None included |
| AST-level understanding | Tree-sitter, auto-bundled grammars | Basic regex matching | None |
| Cache latency | 23 microseconds at 99% hit rate | Not disclosed | Milliseconds to seconds depending on vector DB |
| Setup complexity | One CLI command, no config required for basic use | Plugin installation, config file editing | Embedding pipeline, index maintenance |

Choose Continue.dev if you need broad IDE support on JetBrains or prefer a GUI-driven configuration experience. Choose a vector store if your project requires semantic search across documentation that agents have not touched yet. Choose engram if you want the deepest tool-level integration with Claude Code or Cursor and you want to see exactly which files are burning your token budget.


6. 3 THINGS I WISH I'D KNOWN BEFORE TRYING IT

  1. The 88% savings figure is a best-case number from structured benchmark tasks. In my testing with a real, messy monorepo, the first-session savings hovered around 82 to 87% because the cache was cold. After two weeks of daily use, it stabilized near 91%, but the marketing benchmark does not reflect your first week.
  2. The provider plugin system requires ESM modules (.mjs). If your organization standardizes on CommonJS, you will spend time wrapping or converting existing provider code. The --plugin-dir flag exists, but it is documented in a GitHub discussion thread, not in the main README.
  3. Engram stores every intercepted read and its resolved context packet in SQLite. On a large project with dozens of daily sessions, the database grows fast. The dashboard shows hit rate and entry count, but it does not surface a storage budget or auto-cleanup option. I hit 800MB on my test machine after two weeks and had to run a manual vacuum.
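Since the dashboard does not surface a storage budget, a small script can stand in for one. The sketch below uses Python's standard sqlite3 module to vacuum a database file once it grows past a chosen size. The database path is a placeholder (this review does not document where engram keeps its file), and the sketch assumes no engram session is writing to the database while it runs.

```python
import os
import sqlite3


def vacuum_if_over_budget(db_path: str, budget_bytes: int) -> bool:
    """Run VACUUM when the database file exceeds budget_bytes.

    Returns True if a vacuum ran, False if the file was within budget.
    """
    if os.path.getsize(db_path) <= budget_bytes:
        return False
    # isolation_level=None keeps the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()
    return True


if __name__ == "__main__":
    # Hypothetical location; substitute the real path of your cache database.
    path = os.path.expanduser("~/path/to/engram-cache.db")
    if os.path.exists(path):
        ran = vacuum_if_over_budget(path, budget_bytes=500 * 1024 * 1024)
        print("vacuumed" if ran else "within budget")
```

VACUUM only reclaims space left behind by deleted rows. It does not remove cache entries, so if the file stays large after vacuuming, the entries themselves need pruning by whatever means the tool provides.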

7. FREQUENTLY ASKED QUESTIONS

Does engram work with Copilot or GitHub Copilot Chat?

No. Copilot uses a different extension model and does not expose the tool-level read hooks that engram depends on. Engram currently targets Claude Code, Cursor, Zed, Aider, Continue.dev, Neovim, Emacs, and Windsurf. Check the official repository for the current integration list as additional IDEs are added in each release.

How do I install it and get the hook running?

Install via npm with npm install -g engramx, then run engram ui to start the local server. The hook activates automatically for Claude Code and Cursor on the next session start. For Zed, Aider, and other IDEs, manual hook configuration is required; the exact steps are in the IDE integrations section of the repository. Node 20 or later is required.

What happens if engram fails or the cache is wrong?

The hook has a passthrough mode. If engram cannot resolve a context packet for a given file read, it returns the full file content unchanged. The dashboard Activity tab marks these as "passthrough" events so you can audit them. This means engram never makes an agent fail silently: it degrades gracefully to standard behavior.
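The passthrough behavior described here follows a common fallback pattern. The sketch below illustrates that pattern only; `read_with_passthrough`, `resolve_packet`, and `read_file` are hypothetical names, and engram's internal hook code may be structured quite differently.

```python
from typing import Callable, Optional, Tuple


def read_with_passthrough(
    path: str,
    resolve_packet: Callable[[str], Optional[str]],
    read_file: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (content, event), where event is "intercepted" or "passthrough".

    If the resolver raises or returns nothing, the full file content is
    returned unchanged so the caller never sees a failure from the cache layer.
    """
    try:
        packet = resolve_packet(path)
    except Exception:
        packet = None  # treat a resolver error the same as a cache miss
    if packet:
        return packet, "intercepted"
    return read_file(path), "passthrough"


# Usage with stand-in callables instead of a real cache and file system.
files = {"src/api.ts": "export const handler = () => {};"}
cache = {"src/api.ts": "summary: exports handler()"}

print(read_with_passthrough("src/api.ts", cache.get, files.__getitem__))
print(read_with_passthrough("src/api.ts", lambda p: None, files.__getitem__))
```

Returning the event label alongside the content is what makes an audit view like the Activity tab possible: every read can be logged as one or the other.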

Is my code sent to any server?

No. Engram is fully local-first. All context extraction, caching, and serving happens on your machine using SQLite. The built-in web dashboard makes zero external network requests. There is no telemetry, no account required, and no cloud dependency. The 35KB dashboard even works on air-gapped machines.
