29 June 2026 · The Agent Examiner

How agent platforms handle long-term memory, compared

Memory is where agents earn their keep — and where platforms diverge most. From durable per-agent state to 'bring your own store,' here's the landscape.

An agent that forgets everything between runs is a chatbot with extra steps. Long-term memory — persistence that survives restarts and accumulates context — is what turns an agent into something useful over time. It's also where the platforms we track diverge most sharply. Our memory score captures how much a platform gives you out of the box versus how much you build yourself.

The leaders: durable, built-in state

Two platforms earn the top score, 5/5, on memory:

LangGraph — durable execution across restarts and cold starts, persistent checkpoints (payloads up to 25 MB), and semantic search for long-term memory, with per-subagent state isolation via threads.
Cloudflare Agents — each agent is a Durable Object with SQLite-backed local storage, persistent memory, durable identity, and recoverable execution — and no external database required.
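
To make "durable, built-in state" concrete, here is a minimal Python sketch of what checkpoint persistence with per-thread isolation involves, using only the standard library's sqlite3 module. It is not LangGraph's or Cloudflare's API: the CheckpointStore class, the table layout, and the agent_state.db filename are all hypothetical, chosen for illustration.

```python
import json
import sqlite3
from typing import Optional


class CheckpointStore:
    """Hypothetical per-thread checkpoint store (illustration only)."""

    def __init__(self, path: str = "agent_state.db") -> None:
        # A file-backed database is what lets state survive a process restart.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints ("
            " thread_id TEXT NOT NULL,"
            " step INTEGER NOT NULL,"
            " state TEXT NOT NULL,"
            " PRIMARY KEY (thread_id, step))"
        )
        self.conn.commit()

    def save(self, thread_id: str, state: dict) -> int:
        """Append a checkpoint for one thread and return its step number."""
        with self.conn:  # commits on success, rolls back on an exception
            row = self.conn.execute(
                "SELECT COALESCE(MAX(step), -1) + 1 FROM checkpoints"
                " WHERE thread_id = ?",
                (thread_id,),
            ).fetchone()
            step = row[0]
            self.conn.execute(
                "INSERT INTO checkpoints (thread_id, step, state)"
                " VALUES (?, ?, ?)",
                (thread_id, step, json.dumps(state)),
            )
        return step

    def latest(self, thread_id: str) -> Optional[dict]:
        """Return the most recent state for a thread, or None if it has none."""
        row = self.conn.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?"
            " ORDER BY step DESC LIMIT 1",
            (thread_id,),
        ).fetchone()
        return json.loads(row[0]) if row else None

    def close(self) -> None:
        self.conn.close()
```

Keying every row by thread_id is what keeps one agent's (or subagent's) state from leaking into another's. A platform scoring 5/5 handles this layer, plus recovery and search, for you.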

These are the platforms to beat if memory is central to your use case.

The strong middle: bundled memory, bring a backend

A cluster scores 4/5 — real memory features, often needing a storage backend you provide:

Mastra — message history, working memory, semantic recall, and observational memory, backed by PostgreSQL, LibSQL, or Redis.
CrewAI — a unified Memory class where the LLM infers scope and importance, with retrieval ranked by similarity, recency, and importance (LanceDB by default).
Claude Agent SDK — resumable, forkable sessions stored as JSONL on your filesystem, plus CLAUDE.md memory files and automatic context compaction.
Alfe — built-in cross-channel memory via a Sync integration that backs files, conversations, and memory to the cloud so context persists across channels.
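
Ranking retrieval by similarity, recency, and importance, as described for CrewAI above, can be pictured as a weighted sum of three signals. The sketch below is a generic illustration and not CrewAI's implementation: the MemoryItem fields, the weights, and the 24-hour half-life are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryItem:
    text: str
    embedding: list[float]
    importance: float  # assumed to be in [0, 1], assigned when the memory is written
    age_hours: float   # hours since the memory was stored


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(
    item: MemoryItem,
    query_embedding: list[float],
    w_sim: float = 0.6,   # hypothetical weights, not taken from any platform
    w_rec: float = 0.2,
    w_imp: float = 0.2,
    half_life_hours: float = 24.0,
) -> float:
    """Blend similarity, recency, and importance into one ranking score."""
    similarity = cosine(item.embedding, query_embedding)
    # Exponential decay: recency halves every half_life_hours.
    recency = 0.5 ** (item.age_hours / half_life_hours)
    return w_sim * similarity + w_rec * recency + w_imp * item.importance


def rank(
    items: list[MemoryItem], query_embedding: list[float], top_k: int = 3
) -> list[MemoryItem]:
    """Return the top_k memories, highest score first."""
    ordered = sorted(
        items, key=lambda it: score(it, query_embedding), reverse=True
    )
    return ordered[:top_k]
```

The design choice worth noticing is the trade-off in the weights: with similarity dominant, an old but on-topic memory can still outrank a fresh, important, but unrelated one.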

The "bring your own" tier

At 2/5 sit the platforms that either give you the primitives and leave persistence to you, or don't describe a long-term memory model in their documentation:

Vercel AI SDK — message/state and streaming primitives; long-term memory is your own store.
AutoGPT — memory-management blocks exist, but the deeper persistence model isn't documented in primary sources.
Lindy and Zapier Agents — context features exist, but a distinct long-term memory primitive isn't described.

A note on "memory" vs infrastructure state

Watch for a category error: E2B, Modal, and Fly.io Machines offer durable infrastructure state (sandbox pause/resume, Volumes) — not agent or LLM memory. Persisting a filesystem is not the same as remembering a conversation. We score these 3/5 and flag the distinction.

How to choose

Memory-critical, want it built in? Start with LangGraph or Cloudflare Agents.
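Happy to run a store? Mastra, CrewAI, and the Claude Agent SDK give you rich memory features on top of storage you manage, whether a database or your own filesystem.
Using a primitives-only SDK such as the Vercel AI SDK? Assume you own persistence, and plan for it early.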
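
If you do end up owning persistence, the starting point can be small. The sketch below shows an append-only conversation log stored as JSONL (one JSON object per line) that can be resumed after a restart or forked into a new branch. It is a hypothetical illustration: the SessionLog class and its file layout are assumptions, and it is not the on-disk format of the Claude Agent SDK or any other platform.

```python
import json
import shutil
from pathlib import Path


class SessionLog:
    """Hypothetical append-only session log, one JSON object per line."""

    def __init__(self, directory: str, session_id: str) -> None:
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.session_id = session_id
        self.path = self.directory / f"{session_id}.jsonl"

    def append(self, role: str, content: str) -> None:
        """Write one message; appending keeps earlier history untouched."""
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps({"role": role, "content": content}) + "\n")

    def messages(self) -> list[dict]:
        """Load the full history, for example to resume after a restart."""
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def fork(self, new_session_id: str) -> "SessionLog":
        """Copy the history so far into a new session that can diverge."""
        forked = SessionLog(str(self.directory), new_session_id)
        if self.path.exists():
            shutil.copyfile(self.path, forked.path)
        return forked
```

Resuming is just constructing SessionLog again with the same directory and session ID and calling messages(). This covers message history only; semantic recall, compaction, and concurrent writers are the parts that take real planning.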

For the deeper treatment, see our guide on agent long-term memory.

Key takeaways

LangGraph and Cloudflare Agents lead memory (5/5) with durable, built-in state.

A strong middle (Mastra, CrewAI, Claude Agent SDK, Alfe) offers rich memory but often needs a backend you provide.

Primitives-first SDKs and some no-code tools leave long-term memory to you, or leave it undocumented (2/5).

Don't confuse infrastructure state (E2B, Modal, Fly.io Machines; 3/5) with agent memory: persisting a filesystem is not the same as remembering a conversation.