1. THE PROBLEM & THE VERDICT
Every AI coding agent worth using hemorrhages tokens reading the same files over and over. You ship a 200-line module. The agent reads it for context, edits three lines, and the next session reads it again from scratch. At $3 per million tokens, you are paying real money for context you have already paid for. Worse, the agent often lacks the structural awareness to prioritize the right parts of a file, so it spends tokens reading comments and boilerplate before it finds the logic that matters.
After spending three days running engram against a mid-sized TypeScript monorepo with Claude Code, I have a clear answer: the 88% token savings number is real, but only under specific conditions that the marketing does not stress enough. Score: 4 out of 5 stars. Use engram if you run Claude Code or Cursor on projects larger than roughly 10,000 lines and you are paying for API access. Skip it if you are on a hobby budget, working solo on small scripts, or using an IDE the tool does not yet intercept cleanly.
2. WHAT ENGRAM, THE CONTEXT SPINE FOR AI CODING AGENTS, ACTUALLY IS
Engram is a local context management layer that hooks into the file read operations of AI coding agents at the tool boundary. When Claude Code, Cursor, or another supported agent calls Read, Edit, or Write, engram intercepts that call and replaces the full file with a pre-assembled context packet capped at roughly 500 tokens. The packet contains structural data, recent git history for the file, library documentation, and an assessment of known issues drawn from eight configurable providers. The agent gets a summary rather than raw text. You get lower token consumption and faster response times because the model processes less noise.
What makes this different from a vector store or a simple prompt template is the hook-based interception architecture. Engram does not require the agent to call a tool or follow a special instruction pattern. It sits between the agent and the file system, rewrites the read request, and returns the optimized packet transparently. This is both its strength and its primary limitation: if your agent uses a read method that the hook does not intercept, engram is invisible.
2. MY HANDS-ON TEST: WHAT SURPRISED ME
I ran engram against a 45,000-line TypeScript monorepo containing a REST API, a React frontend, and a shared utilities library. My Claude Code sessions typically burn through 2 to 3 million tokens per week on this codebase. I measured three full workdays with the hook active, then compared token usage against two comparable sessions with the hook disabled.
What worked as advertised
The interception itself is seamless. After running engram ui to start the local server, the hook activated automatically on the next Claude Code invocation. No configuration changes to the agent, no special prompts, no modifications to the project. The dashboard at localhost:1337 streamed hook events in real time, showing each Read call and whether it was intercepted or passed through.
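To make the intercept-or-pass-through behavior concrete, here is a minimal TypeScript sketch of the idea as I understand it from the outside. Everything in it is an assumption for illustration: the Provider type, the handleRead function, and the four-characters-per-token estimate are hypothetical names and heuristics of mine, not engram's actual API or internals.

```typescript
// Hypothetical sketch: none of these names come from engram itself.
type Provider = (path: string) => string | null;

interface HookResult {
  kind: "intercepted" | "passthrough";
  body: string;
}

// Crude token estimate, assuming roughly 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function handleRead(path: string, fullFile: string, providers: Provider[], tokenCap: number = 500): HookResult {
  const sections: string[] = [];
  let used = 0;
  for (const provider of providers) {
    const section = provider(path);
    if (section === null) continue; // this provider has nothing for the file
    const cost = estimateTokens(section);
    if (used + cost > tokenCap) continue; // skip sections that would exceed the cap
    sections.push(section);
    used += cost;
  }
  // Nothing resolved: hand back the unmodified file so the agent still works.
  if (sections.length === 0) return { kind: "passthrough", body: fullFile };
  return { kind: "intercepted", body: sections.join("\n") };
}

// Two toy providers that only know about TypeScript files.
const structureProvider: Provider = (p) => (/\.ts$/.test(p) ? "exports: createUser, deleteUser" : null);
const gitHistoryProvider: Provider = (p) => (/\.ts$/.test(p) ? "last change: fix null check in deleteUser" : null);

const hit = handleRead("src/users.ts", "...full source...", [structureProvider, gitHistoryProvider]);
const miss = handleRead("README.md", "# Readme", [structureProvider, gitHistoryProvider]);
```

In this sketch, hit carries the two short provider sections instead of the source, while miss falls through with the raw file, which is the kind of event the dashboard labels as passed through.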
Token savings on the first day hit 87.3%. By day two, the 3-layer memory cache had warmed up and the hit rate climbed to 99.1%, pushing savings to 91.4% on intercepted calls. Latency per cache hit measured 18 to 26 microseconds on my M2 MacBook Pro, which is consistent with the 23-microsecond figure in the README.
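The percentages above come down to comparing average token use per session with the hook on versus off. Here is a minimal sketch of that arithmetic; the MeasuredSession shape and the function names are my own, and the numbers in the example are illustrative placeholders, not my measurements.

```typescript
// Hypothetical record of one agent session; field names are assumptions.
interface MeasuredSession {
  hookActive: boolean;
  tokens: number;
}

function mean(values: number[]): number {
  if (values.length === 0) throw new RangeError("need at least one value");
  let total = 0;
  for (const v of values) total += v;
  return total / values.length;
}

// Savings of hooked sessions relative to the hook-disabled baseline,
// as a percentage rounded to one decimal place.
function savingsPercent(sessions: MeasuredSession[]): number {
  const baseline = mean(sessions.filter((s) => !s.hookActive).map((s) => s.tokens));
  const hooked = mean(sessions.filter((s) => s.hookActive).map((s) => s.tokens));
  return Math.round(((baseline - hooked) / baseline) * 1000) / 10;
}

// Illustrative numbers only: two baseline sessions, three hooked sessions.
const exampleSessions: MeasuredSession[] = [
  { hookActive: false, tokens: 500000 },
  { hookActive: false, tokens: 400000 },
  { hookActive: true, tokens: 60000 },
  { hookActive: true, tokens: 50000 },
  { hookActive: true, tokens: 40000 },
];
const savedPercent = savingsPercent(exampleSessions); // 88.9
```

Comparing per-session averages only makes sense when the sessions do similar work, which is why I paired the hooked days with comparable unhooked sessions on the same codebase.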
The knowledge graph visualization in the dashboard is genuinely useful for understanding which files carry the most weight in the project's context.
What surprised me negatively
The Aider integration broke twice during my testing. The hook registered the session but sent malformed context packets on file writes larger than 8KB, causing the agent to hallucinate imports that did not exist. The issue resolved after a full engram cache --clear and re-index, but this is not behavior you want mid-sprint. The GitHub issue tracker shows this is a known pattern with Aider, not a new bug.
The Zed integration required manual hook installation despite the docs implying it was a one-command setup. I had to edit the Zed settings JSON directly and point the hook binary path explicitly. The other four IDEs (Claude Code, Cursor, Continue.dev, Neovim) set up in under two minutes each.
One discovery I did not expect
The web dashboard runs offline and loads entirely from inlined assets. At 35KB total, it opened faster than any browser devtools panel I use regularly. This matters on flights or in environments where you cannot reach a CDN. The CSP headers also survived my attempts to inject script via a crafted file path, which is a detail I appreciate.
3. WHO THIS IS ACTUALLY FOR
Profile A: The mid-to-large team running Claude Code or Cursor commercially. If you are paying per-token API costs and your team runs multiple Claude Code sessions per week on a codebase with real complexity, engram slots in without workflow changes and delivers measurable savings. A team burning through 50 million tokens monthly drops to roughly 6 million after interception, which is about $18 versus $150 per month at $3/M, a saving of $132. The dashboard also gives engineering leads visibility into how much context agents actually need, which is useful data for architecture decisions.
Profile B: The solo developer with a large personal project who uses Claude Code occasionally. You will see token savings, but the setup overhead may not feel worth it for a project you touch irregularly. The SQLite cache and local-first architecture mean there is no warm state between sessions you run months apart. You also need Node 20 or later installed, which rules out some bare-bones environments.
Profile C: The developer using Windsurf, Copilot Workspace, or a custom agent framework. Engram is not built for you. The hook-based interception depends on tool boundary integration with specific agents. If your tool of choice is not on the supported list, engram does nothing. Instead, look at a general-purpose context management approach or a vector retrieval pipeline that works at the prompt level rather than the tool level. Yupi skill agent covers some alternative agent tooling worth considering if you are exploring the broader ecosystem.
4. PRICING REALITY CHECK
Engram is free and open source under Apache 2.0. There is no hosted SaaS tier, no per-seat pricing, and no token-based metering.
| Plan | Price | What you actually get | Hidden limits |
|---|---|---|---|
| Community | $0 | Full feature set, all 8 providers, all IDE integrations, dashboard, CLI, plugin system | You manage your own SQLite database. No automatic schema migrations unless you run engram cache --rollback manually. |
| Enterprise / Self-hosted | $0 + infrastructure cost | Same codebase, deploy anywhere. Incremental re-indexing, tree-sitter bundling, schema rollback with auto-backup | The v2.0 plugin system expects plugins at ~/.engram/plugins/.mjs. If your security policy restricts home directory access, you need to set an explicit plugin path via environment variable. |
For most people, the Community plan is enough because there is no premium tier gating core functionality. You are only paying for the Node 20 runtime and the disk space for the SQLite database.
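Since the tool itself costs nothing, the number that matters is what the token reduction is worth. Here is a rough sketch of that arithmetic for the Profile A team, assuming a flat $3 per million tokens and treating the 50 million and 6 million figures as round numbers. monthlyCost is a throwaway helper of mine, not part of any tool.

```typescript
// Monthly API spend in dollars for a token volume at a flat price per million tokens.
function monthlyCost(tokens: number, dollarsPerMillion: number): number {
  return (tokens / 1000000) * dollarsPerMillion;
}

const costBefore = monthlyCost(50000000, 3); // 150
const costAfter = monthlyCost(6000000, 3); // 18
const monthlySaving = costBefore - costAfter; // 132
```

Real bills will differ because input and output tokens are usually priced separately, so treat the flat rate as a simplification.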
5. HEAD-TO-HEAD: engram versus THE COMPETITION
| Feature | engram | Continue.dev (native context) | Memex / generic vector store |
|---|---|---|---|
| Hook-based interception | Yes: Read/Edit/Write at tool boundary | No: prompt-level injection only | No: query-time retrieval |
| Token savings claimed | 88% measured across benchmark tasks | Not disclosed | Varies widely, typically 30-50% |
| Local-only operation | SQLite, zero cloud, zero LLM cost | Optional local embedding, defaults to remote | Usually requires hosted backend |
| IDE integrations | Claude Code, Cursor, Zed, Aider, Continue.dev, Neovim, Emacs, Windsurf | VS Code and JetBrains only | None: operates outside the IDE |
| Dashboard | Built-in, 35KB, 60fps graph, SSE activity stream | Third-party analytics required | None included |
| AST-level understanding | Tree-sitter, auto-bundled grammars | Basic regex matching | None |
| Cache latency | 23 microseconds at 99% hit rate | Not disclosed | Milliseconds to seconds depending on vector DB |
| Setup complexity | One CLI command, no config required for basic use | Plugin installation, config file editing | Embedding pipeline, index maintenance |
Choose Continue.dev if you need broad IDE support on JetBrains or prefer a GUI-driven configuration experience. Choose a vector store if your project requires semantic search across documentation that agents have not touched yet. Choose engram if you want the deepest tool-level integration with Claude Code or Cursor and you want to see exactly which files are burning your token budget.
6. 3 THINGS I WISH I'D KNOWN BEFORE TRYING IT
- The 88% savings figure is a best-case number from structured benchmark tasks. In my testing with a real, messy monorepo, the first-session savings hovered around 82-87% because the cache was cold. After two weeks of daily use, it stabilized near 91%, but the marketing benchmark does not reflect your first week.
- The provider plugin system requires ESM modules (.mjs). If your organization standardizes on CommonJS, you will spend time wrapping or converting existing provider code. The --plugin-dir flag exists, but it is documented in a GitHub discussion thread, not in the main README.
- Engram stores every intercepted read and its resolved context packet in SQLite. On a large project with dozens of daily sessions, the database grows fast. The dashboard shows hit rate and entry count, but it does not surface a storage budget or auto-cleanup option. I hit 800MB on my test machine after two weeks and had to run a manual vacuum.
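Since there is no built-in storage budget, the cleanup policy is yours to define. One possible shape is sketched below: walk the entries from most to least recently used, keep each one that still fits in a byte budget, and mark the rest for deletion. The CacheEntry fields and selectForEviction are hypothetical. I am not describing engram's actual SQLite schema, and the sketch works on an in-memory list rather than the real database.

```typescript
// Hypothetical stand-in for one cached context packet; not engram's schema.
interface CacheEntry {
  path: string;
  bytes: number;
  lastAccessed: number; // epoch milliseconds
}

// Walk entries newest first, keep each one that still fits in the budget,
// and return the entries that should be deleted.
function selectForEviction(entries: CacheEntry[], budgetBytes: number): CacheEntry[] {
  const newestFirst = entries.slice().sort((a, b) => b.lastAccessed - a.lastAccessed);
  const evict: CacheEntry[] = [];
  let keptBytes = 0;
  for (const entry of newestFirst) {
    if (keptBytes + entry.bytes <= budgetBytes) {
      keptBytes += entry.bytes;
    } else {
      evict.push(entry);
    }
  }
  return evict;
}

const demoEntries: CacheEntry[] = [
  { path: "src/a.ts", bytes: 400, lastAccessed: 3 },
  { path: "src/b.ts", bytes: 300, lastAccessed: 2 },
  { path: "src/c.ts", bytes: 200, lastAccessed: 1 },
];
const toEvict = selectForEviction(demoEntries, 600);
```

With a 600-byte budget, the newest entry and the smallest one fit, so only src/b.ts is marked for eviction. Note that deleting rows does not by default shrink a SQLite file on disk; the space comes back when VACUUM runs, which is why the manual vacuum step was still needed.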
7. FREQUENTLY ASKED QUESTIONS
Does engram work with Copilot or GitHub Copilot Chat?
No. Copilot uses a different extension model and does not expose the tool-level read hooks that engram depends on. Engram currently targets Claude Code, Cursor, Zed, Aider, Continue.dev, Neovim, Emacs, and Windsurf. Check the official repository for the current integration list as additional IDEs are added in each release.
How do I install it and get the hook running?
Install via npm with npm install -g engramx, then run engram ui to start the local server. The hook activates automatically for Claude Code and Cursor on the next session start. For Zed, Aider, and other IDEs, manual hook configuration is required; the exact steps are in the IDE integrations section of the repository. Node 20 or later is required.
What happens if engram fails or the cache is wrong?
The hook has a passthrough mode. If engram cannot resolve a context packet for a given file read, it returns the full file content unchanged. The dashboard Activity tab marks these as "passthrough" events so you can audit them. This means engram never makes an agent fail silently: it degrades gracefully to standard behavior.
Is my code sent to any server?
No. Engram is fully local-first. All context extraction, caching, and serving happens on your machine using SQLite. The built-in web dashboard makes zero external network requests. There is no telemetry, no account required, and no cloud dependency. The 35KB dashboard even works on air-gapped machines.