v0.10.0 - Feedback Learning
#1 on LongMemEval R@5 = 0.964
Veritas Trust Layer

Your AI Forgets
Everything. Fix It.

One memory layer. Every AI. Never start from zero. Built-in memory is a notepad. ALMA is a learning system that makes your agent smarter every run.

$0.00 to start
No API keys needed
7 Storage Backends
2,121 Tests Passing
agent_memory.py
# Before: AI makes same mistakes every session
agent.run("Deploy auth service")  # Rolling update
agent.run("Deploy auth service")  # Rolling update AGAIN

# After: ALMA remembers what works
from alma import ALMA
alma = ALMA.from_config(".alma/config.yaml")

# Get memories before task
memories = alma.retrieve(
    task="Deploy auth service",
    agent="backend-dev"
)

# Agent now knows: "Blue-green works (8/10)"
# Agent now knows: "Avoid rolling updates"
result = agent.run_with_context(memories)

"But My AI Already Has Memory..."

Yes. Claude Code, ChatGPT, OpenClaw, and Gemini all have built-in memory now. So why would you need ALMA?
Because their memory is a notepad. ALMA is a learning system.

| | Built-in Memory (Claude, ChatGPT, OpenClaw) | ALMA |
|---|---|---|
| What it stores | Facts and preferences -- "user likes dark mode" | Outcomes -- what strategies worked, failed, and why |
| Does it learn? | No. It remembers what you told it. | Yes. After 3+ similar outcomes, auto-creates reusable strategies. |
| Does it warn you? | No. | Yes. Anti-patterns track what NOT to do, with why + alternatives. |
| Cross-platform? | No. Claude doesn't know what ChatGPT learned. | Yes. One memory layer shared across every AI tool. |
| Multi-agent? | No. Each session is isolated. | Yes. Junior agents inherit from senior agents. |
| Scoring | Basic relevance or "most recent" | 4-factor: similarity + recency + success rate + confidence |
| Lifecycle | Grows until you delete things | Automatic: decay, compression, consolidation, archival |
| Your data | Stored on their servers | Your database. SQLite, PostgreSQL, Qdrant -- you choose. |
| Benchmark | Not benchmarked | R@5 = 0.964 on LongMemEval (500 questions) |
| Trust / Verification | No. Everything returned as-is. | Veritas (built-in). Trust scoring, verified retrieval, conflict detection. |

ALMA works WITH built-in memory, not against it

ALMA doesn't replace Claude Code's memory or ChatGPT's memory -- it sits underneath as a deeper layer. Use built-in memory for quick preferences. Use ALMA for:

Strategy tracking -- which approaches worked for which problems
Failure prevention -- anti-patterns that stop mistakes from repeating
Team knowledge -- sharing lessons across agents and platforms
Measurable retrieval -- benchmarked at R@5=0.964, not "trust me"

Proven: #1 on LongMemEval + Feedback Learning

Benchmarked against LongMemEval (ICLR 2025) -- the standard benchmark for AI agent memory. 500 questions, ~53 conversation sessions each.

ALMA Benchmark Results - #1 on LongMemEval
| System | LongMemEval | API Keys | Memory Types | Feedback Loop | Trust / Verification |
|---|---|---|---|---|---|
| ALMA v1.0 | R@5 = 0.964 | None | 5 | Yes (v1.0) | Veritas (built-in) |
| Mem0 | ~49% acc.* | GPT-4o | 2 | No | No |
| Zep | 71.2% acc.* | GPT-4o | 1 | No | No |
| Letta | Not published | GPT-4o | 2 | No | No |
| Beads | Not published | None | N/A (tasks) | No | No |
| RuVector | Not published | None | N/A (vectors) | Self-learning | No |

* Mem0 and Zep report accuracy (LLM-judged correctness), not recall. ALMA reports Recall@5 (retrieval-only, no LLM judge). Metrics are not directly comparable -- recall is a stricter, more reproducible measure.
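To make the metric concrete: Recall@5 asks, for each question, whether the gold memory appears among the top 5 retrieved items. A minimal sketch of the computation (toy data, not the actual benchmark harness):

```python
def recall_at_k(retrieved: list[list[str]], gold: list[str], k: int = 5) -> float:
    """Fraction of questions whose gold memory id appears in the top-k retrieved ids."""
    hits = sum(1 for ranked, g in zip(retrieved, gold) if g in ranked[:k])
    return hits / len(gold)

# Toy example: 3 questions, gold memory found in the top 5 for 2 of them
retrieved = [
    ["m1", "m7", "m3", "m9", "m2"],  # gold m3 -> hit
    ["m4", "m5", "m6", "m8", "m0"],  # gold m2 -> miss
    ["m2", "m1", "m5", "m6", "m7"],  # gold m2 -> hit
]
gold = ["m3", "m2", "m2"]
print(recall_at_k(retrieved, gold))  # 2/3
```

No LLM judge is involved, which is why recall is the more reproducible number.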

LongMemEval R@5 by system: ALMA v1.0 0.964 · Hindsight 0.914 · Zep 0.638 · Mem0 0.490

ALMA R@5 by question type: Knowledge Update 1.000 · Multi-Session 0.992 · Preferences 0.967 · Temporal Reasoning 0.947 · Assistant Memory 0.946 · User Memory 0.914
Reproduce it yourself in 3 commands (~30 minutes, any CPU, no GPU):
pip install alma-memory[local] sentence-transformers
curl -fsSL -o /tmp/longmemeval.json \
  https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned/resolve/main/longmemeval_s_cleaned.json
python -m benchmarks.longmemeval.runner --data /tmp/longmemeval.json
Full methodology: BENCHMARK-REPORT.md
New in v1.0

Retrieval That Gets Better Over Time

The Feedback Learning Benchmark (FLB) proves that ALMA's retrieval improves with usage. As agents provide feedback on which memories helped, future retrieval becomes more precise -- without any model retraining.

record_usage()

Track which retrieved memories the agent actually referenced in its output

record_feedback()

Explicit positive/negative signal on memory quality -- "this helped" or "this was wrong"

Auto-adjust

Retrieval scores shift automatically based on accumulated feedback. No manual tuning needed.

Standing on the shoulders of open source

ALMA v1.0 absorbs concepts from two MIT-licensed projects:

Beads Task dependency tracking and structured workflow memory -- absorbed into ALMA's workflow checkpoint system
RuVector Self-learning vector retrieval with feedback signals -- the inspiration behind ALMA's Retrieval Feedback Loop

Both Beads and RuVector are MIT-licensed. ALMA integrates their concepts natively rather than wrapping them as dependencies.

How ALMA Works

Three phases. No model modifications. Your agent gets smarter every run.

ALMA Retrieval Pipeline
Retrieve

Ask ALMA for context

FAISS vector search finds relevant memories. Multi-factor scoring ranks by similarity + recency + success rate + confidence.

alma.retrieve(task=..., agent=...)
# Returns ranked memories
Learn

Record outcomes

After the task, record what happened -- success or failure, strategy used, how long it took. Every run becomes training data.

alma.learn(outcome="success",
  strategy_used="blue-green")
Improve

Auto-create strategies

After 3+ similar outcomes, ALMA auto-creates reusable heuristics. After 2+ failures, it creates anti-patterns. Zero manual work.

Heuristic created
Anti-pattern tracked
ALMA Learning Loop
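The promotion thresholds above (3+ similar successes become a heuristic, 2+ failures become an anti-pattern) can be sketched as a toy consolidation rule. This is illustrative only; ALMA's real consolidation also considers similarity, recency, and confidence:

```python
from collections import Counter

def consolidate(outcomes: list[dict]) -> dict:
    """Toy promotion rule: 3+ successes with the same strategy become a
    heuristic; 2+ failures with the same strategy become an anti-pattern."""
    successes = Counter(o["strategy"] for o in outcomes if o["ok"])
    failures = Counter(o["strategy"] for o in outcomes if not o["ok"])
    return {
        "heuristics": [s for s, n in successes.items() if n >= 3],
        "anti_patterns": [s for s, n in failures.items() if n >= 2],
    }

runs = [
    {"strategy": "blue-green", "ok": True},
    {"strategy": "blue-green", "ok": True},
    {"strategy": "blue-green", "ok": True},
    {"strategy": "rolling", "ok": False},
    {"strategy": "rolling", "ok": False},
]
print(consolidate(runs))
# {'heuristics': ['blue-green'], 'anti_patterns': ['rolling']}
```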

What Makes ALMA Different

Not another vector database. A learning system built for AI agents.

Five memory types, not just embeddings

Other systems dump everything into one vector index. ALMA classifies memories into 5 types, each with different retrieval behavior and lifecycle.

Heuristic Strategies that work -- "For forms with >5 fields, validate incrementally"
Outcome Task results -- "Login test passed using JWT -- 340ms"
AntiPattern What NOT to do -- "Don't use sleep() for async waits -- causes flaky tests"
DomainKnowledge Facts -- "Auth uses OAuth 2.0, tokens expire in 24h"
UserPreference Your constraints -- "Prefer verbose output, Python 3.12, dark theme"
Five Memory Types
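As a mental model, the five types can be sketched as an enum with per-type lifecycle policy. Names and decay windows here are illustrative assumptions, not ALMA's actual class names or defaults:

```python
from enum import Enum

class MemoryType(Enum):
    HEURISTIC = "strategy that works"
    OUTCOME = "task result"
    ANTI_PATTERN = "what not to do"
    DOMAIN_KNOWLEDGE = "fact about the system"
    USER_PREFERENCE = "user constraint"

# Each type can carry its own lifecycle, e.g. how long before it decays
# (example numbers only)
DECAY_DAYS = {
    MemoryType.OUTCOME: 30,        # raw task results go stale quickly
    MemoryType.DOMAIN_KNOWLEDGE: 90,
    MemoryType.HEURISTIC: 180,     # proven strategies persist
    MemoryType.ANTI_PATTERN: 365,  # warnings should stick around
    MemoryType.USER_PREFERENCE: 365,
}
print(len(MemoryType))  # 5
```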
Multi-Agent Memory Sharing

Multi-agent knowledge sharing

Junior agents inherit from senior agents. Teams share knowledge across roles. One agent's lesson becomes the whole team's advantage.

agents:
  senior_dev:
    share_with: [junior_dev, qa_agent]
  junior_dev:
    inherit_from: [senior_dev]
  qa_agent:
    inherit_from: [senior_dev]
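Under a config like the one above, retrieval for junior_dev also surfaces memories learned by senior_dev. A toy resolver for which agents' memories are visible (hypothetical helper, not ALMA's API):

```python
def visible_agents(config: dict, agent: str) -> set:
    """An agent sees its own memories plus those of agents it inherits
    from, followed transitively."""
    seen, stack = set(), [agent]
    while stack:
        a = stack.pop()
        if a in seen:
            continue
        seen.add(a)
        stack.extend(config.get(a, {}).get("inherit_from", []))
    return seen

agents = {
    "senior_dev": {"share_with": ["junior_dev", "qa_agent"]},
    "junior_dev": {"inherit_from": ["senior_dev"]},
    "qa_agent": {"inherit_from": ["senior_dev"]},
}
print(visible_agents(agents, "junior_dev"))
# {'junior_dev', 'senior_dev'} (set order may vary)
```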

Token-efficient context loading

The 4-layer MemoryStack loads only what you need. Identity (~100 tokens) + Essential Story (~800 tokens) at wake-up. On-demand and deep search activate when needed. 95% of your context window stays free.

~100 identity tokens · ~800 story tokens · on-demand task context · 95% of the window free
4-Layer MemoryStack
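The wake-up budget can be sketched as layered loading under a token cap. The token counts come from the text above; the loader itself is a hypothetical illustration:

```python
def assemble_context(layers: list, budget: int) -> list:
    """Load layers in priority order until the token budget is exhausted."""
    loaded, used = [], 0
    for name, tokens in layers:
        if used + tokens > budget:
            break
        loaded.append(name)
        used += tokens
    return loaded

# Wake-up: identity + essential story fit easily; heavier layers stay on demand
layers = [("identity", 100), ("essential_story", 800),
          ("task_context", 4000), ("deep_search", 20000)]
print(assemble_context(layers, budget=1000))  # ['identity', 'essential_story']
```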
Retrieval Feedback Loop

Self-improving retrieval

v1.0

Most memory systems retrieve and forget. ALMA v1.0 tracks which memories your agent actually uses vs ignores, then adjusts future retrieval scores automatically. Retrieval gets better every run -- no retraining, no manual tuning.

Usage tracking -- which memories agents actually reference in their output
Feedback scoring -- positive/negative signals adjust retrieval weight over time
Zero config -- works automatically with any storage backend
# Track what the agent actually used
alma.record_usage(
    memory_ids=["mem_1", "mem_3"],
    task="deploy auth service"
)

# Explicit feedback: this memory helped
alma.record_feedback(
    memory_id="mem_1",
    signal="positive",
    reason="correct deployment strategy"
)

# Next retrieval auto-boosts mem_1,
# auto-demotes unused memories
Feedback Scoring Pipeline
The feedback scoring pipeline: memories that get used rise in rank. Memories that get ignored decay. Your retrieval improves automatically.
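One simple way to realize that rank adjustment is a bounded boost/demote rule. The weighting scheme below is an assumption for illustration, not ALMA's published formula:

```python
def adjusted_score(base: float, positives: int, negatives: int,
                   uses: int, retrievals: int, alpha: float = 0.1) -> float:
    """Boost memories with positive feedback and high usage rates;
    demote ones that are retrieved but ignored. Clamped to [0, 1]."""
    feedback = alpha * (positives - negatives)
    usage = alpha * (uses / retrievals - 0.5) if retrievals else 0.0
    return max(0.0, min(1.0, base + feedback + usage))

# mem_1: helped twice, referenced every time it was retrieved
print(adjusted_score(0.70, positives=2, negatives=0, uses=4, retrievals=4))
# ~0.95 (boosted)

# mem_2: retrieved often, never referenced
print(adjusted_score(0.70, positives=0, negatives=0, uses=0, retrievals=4))
# ~0.65 (demoted)
```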

Your data, your infrastructure

ALMA is a library, not a service. 7 storage backends from SQLite ($0) to Azure Cosmos (enterprise). Your database, your rules.

SQLite + FAISS · PostgreSQL · Qdrant · Pinecone · Chroma · Azure · File

22 MCP Tools

Native MCP server for Claude Code. Retrieve, learn, manage memories -- all through tool calls. One JSON config and you're connected.

python -m alma.mcp
# 22 tools ready to use
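The "one JSON config" step typically looks like the standard MCP server entry below. The server name and file location are assumptions; check your MCP client's documentation for where this config lives:

```json
{
  "mcpServers": {
    "alma": {
      "command": "python",
      "args": ["-m", "alma.mcp"]
    }
  }
}
```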

Workflow Checkpoints

Save state mid-workflow, resume after failures. Perfect for complex multi-step tasks that span sessions.

alma.checkpoint(workflow_id, state)
alma.resume(workflow_id)
5 Memory Types · 22 MCP Tools · 4 Graph Backends · 6 Domain Schemas

Bootstrap From Existing Knowledge

Already have conversations, project files, or chat exports? ALMA doesn't just dump them into a vector database like RAG. It reads, classifies, and structures them into the 5 memory types.

This is not RAG

RAG retrieves text chunks by similarity. ALMA retrieves classified, scored, typed memories that improve over time.

Decisions you made --> DomainKnowledge (retrievable facts)
Preferences you stated --> UserPreference (constraints agents respect)
Things that worked --> Outcomes (success records with strategies)
Problems you hit --> AntiPatterns (mistakes agents won't repeat)
from alma.ingestion import ingest_directory
from alma.ingestion import ingest_conversations

# Ingest project files
result = ingest_directory(
    "/path/to/project",
    agent="dev",
    project_id="myapp"
)
# result.domain_knowledge: 47 facts
# result.user_preferences: 12 preferences
# result.anti_patterns: 3 problems
# result.outcomes: 8 milestones

# Ingest chat exports (6 formats)
result = ingest_conversations(
    "/path/to/chats",
    agent="dev",
    project_id="myapp"
)
Claude Code JSONL ChatGPT JSON Claude.ai JSON Codex JSONL Slack JSON Plain Text
Built into ALMA

Veritas Trust Layer -- Trust Your Agent's Memories

Memory without trust is dangerous. Your agent retrieves a "fact" -- but is it still accurate? Has it been contradicted? Who stored it, and do you trust them? Veritas answers all three questions before your agent acts.

Trust Scoring

Per-agent trust profiles scored 0.0 to 1.0. Five behavioral dimensions track how reliable each agent actually is -- not just what it claims.

verification-before-claim
loud-failure
honest-uncertainty
paper-trail
diligent-execution
30-day half-life -- trust decays without activity
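A 30-day half-life means an inactive agent's trust halves every 30 days. A quick sketch of the math (the exponential form is inferred from the half-life description, not quoted from ALMA's source):

```python
def decayed_trust(trust: float, days_inactive: float,
                  half_life: float = 30.0) -> float:
    """Exponential decay: trust halves every `half_life` days without activity."""
    return trust * 0.5 ** (days_inactive / half_life)

print(decayed_trust(0.8, 0))   # 0.8 -- active today
print(decayed_trust(0.8, 30))  # ~0.4 -- one half-life
print(decayed_trust(0.8, 60))  # ~0.2 -- two half-lives
```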

Verified Retrieval

Two-stage retrieval: fuzzy recall finds candidates, then verification confirms them. Every memory gets a status before your agent sees it.

VERIFIED Safe to use
UNCERTAIN Use with caution
CONTRADICTED Conflict detected
UNVERIFIABLE No method available

Conflict Detection

Cross-verification catches contradicting memories before agents act on bad data. When Memory A says "use rolling updates" but Memory B says "rolling updates caused downtime" -- Veritas flags the conflict.

Contradiction found
mem_42 says "use rolling updates"
mem_87 says "rolling updates failed"
Agent sees both sides
Decides based on evidence, not stale data
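The cross-check idea can be sketched at a toy, keyword level. Veritas's real detection is semantic and far more robust; this only shows the shape of the check:

```python
def conflicts(a: str, b: str) -> bool:
    """Naive contradiction check: the two memories mention a shared topic
    and the second one carries a failure signal."""
    shared = set(a.lower().split()) & set(b.lower().split())
    negates = any(w in b.lower() for w in ("failed", "avoid", "downtime"))
    return bool(shared) and negates

mem_42 = "use rolling updates"
mem_87 = "rolling updates failed in staging"
print(conflicts(mem_42, mem_87))  # True -> flag for the agent to review
```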

Verified retrieval in 5 lines

The VerifiedRetriever wraps ALMA's retrieval engine. Same query, but now every memory comes with a verification status and confidence score.

Stage 1 -- Fuzzy Recall: Semantic search finds candidates with expanded set
Stage 2 -- Verification: Ground truth, cross-verify, or confidence fallback
Result: Categorized by status -- verified, uncertain, contradicted
verified_retrieval.py
from alma.retrieval.verification import (
    VerifiedRetriever,
    VerificationConfig,
)

# Wrap your existing retrieval engine
retriever = VerifiedRetriever(
    retrieval_engine=alma.retrieval_engine,
    config=VerificationConfig(enabled=True)
)

# Two-stage retrieval: fuzzy recall + verification
results = retriever.retrieve_verified(
    query="How to deploy auth service?",
    agent="backend-dev",
    project_id="my-project",
    top_k=5
)

# Use only verified memories
for mem in results.verified:
    print(f"[VERIFIED] {mem.memory}")

# Flag contradictions for review
for mem in results.contradicted:
    print(f"[CONFLICT] {mem.verification.reason}")
    print(f"  Source: {mem.verification.contradicting_source}")

Why trust matters for AI memory

Every memory system assumes retrieved data is correct. But memories go stale. Agents store wrong conclusions. Different agents contradict each other. Without trust scoring and verification, your agent builds on a foundation it cannot validate.

No extra config -- Veritas is built into ALMA's retrieval engine. Works with all 7 storage backends.
LLM optional -- Confidence-based verification works without any LLM. Add one for ground truth and cross-verification.
Multi-agent aware -- Trust profiles track each agent independently. A reckless agent's memories rank lower.

Get Started in 60 Seconds

Install ALMA and start giving your agents memory.

Install
pip install alma-memory[local] # Includes SQLite + FAISS + local embeddings
from alma import ALMA

alma = ALMA.from_config(".alma/config.yaml")

# Retrieve what the agent learned
memories = alma.retrieve(
    task="Fix the login bug",
    agent="developer",
    top_k=5
)

# Inject into your prompt
prompt = f"""## Context from past runs
{memories.to_prompt()}

## Task
Fix the login bug"""

# After the task, learn from the outcome
alma.learn(
    agent="developer",
    task="Fix login bug",
    outcome="success",
    strategy_used="Cleared session cache"
)

# That's it. Every run gets smarter.
Other installation options (PostgreSQL, Qdrant, Pinecone, Chroma, Azure, all)
pip install alma-memory[postgres]  # PostgreSQL + pgvector
pip install alma-memory[qdrant]    # Qdrant
pip install alma-memory[pinecone]  # Pinecone
pip install alma-memory[chroma]    # ChromaDB
pip install alma-memory[azure]     # Azure Cosmos DB
pip install alma-memory[all]       # Everything

At a Glance

0.964 LongMemEval R@5 (#1 open-source)
2,121 Tests Passing
7 Storage Backends
$0 Cost (local)
0 API Keys Needed
5 Memory Types
22 MCP Tools
6 Chat Formats
4 Graph Backends
5 Trust Dimensions
4 Verification Statuses
<5 min Time to First Memory

Database Setup

Three paths. Pick what fits. ALMA auto-creates tables on first run.

SQLite + FAISS

Zero config

No database to install. Files stored locally. Perfect for development and personal use.

storage: sqlite
storage_dir: .alma
# That's it. Done.
Cost: $0.00 forever

PostgreSQL + pgvector

Production

Full SQL with vector search. Complete schema in the README -- copy-paste and go.

storage: postgresql
connection_string:
  ${DATABASE_URL}
Cost: varies by provider

Supabase Free Tier

Cloud hosted

Free PostgreSQL + pgvector. Create account, run SQL from README, configure YAML. Done.

# 1. supabase.com/dashboard
# 2. Run SQL from README
# 3. Copy connection string
Cost: $0.00 (free tier)
Coming Soon

Veritas Cloud

ALMA's trust scoring and verified retrieval work great on a single instance. But when you run dozens of agents across multiple deployments, you need a shared source of trust.

ALMA + Veritas (Free)

Open source, MIT license, yours forever
  • Trust scoring — per-agent trust profiles, 5 behavioral dimensions, trust decay
  • Verified retrieval — two-stage verification, 4 statuses, conflict detection
  • Anti-pattern memory — agents remember what failed and why
  • Retrieval feedback loop — memories that agents actually use get scored higher
  • 7 storage backends, 4 graph backends, 22 MCP tools
pip install alma-memory
COMING SOON

Veritas Cloud (Pro)

Managed trust service for multi-agent teams
  • Real-time conflict prevention — stop Agent B before it acts on data Agent A already invalidated
  • Shared trust graph — one source of truth across all your agent deployments
  • Trust dashboard — conflicts/week, trust trends, resolution rates at a glance
  • Provenance chain API — full audit trail for every agent decision, compliance-ready
  • Monthly value report — "Veritas prevented X conflicts, saved an estimated $Y"
Join the early access list below

Does this sound like your team?

If you answer yes to any of these, Veritas Cloud is being built for you.

1

"Our agents sometimes contradict each other and we only find out when a customer complains."

Veritas Cloud catches conflicts in real-time, before agents act on contradicting data.

2

"We can't prove to our clients that our AI agents are making trustworthy decisions."

Provenance chain API gives you a full audit trail. Show clients exactly how every decision was made.

3

"We run 50+ agent workflows and have no idea how many conflicts happen per week."

Trust dashboard shows trust violations, resolution rates, and memory accuracy across your entire fleet.

4

"Each agent deployment is isolated. Trust built in one workflow doesn't carry over to another."

Shared trust graph — trust scores, provenance data, and conflict history unified across all deployments.

5

"Our enterprise clients are starting to ask about AI compliance and audit trails."

SOC2-ready audit exports, per-tenant trust isolation, and SLA guarantees. Built for enterprise compliance.

Get Early Access

We're building Veritas Cloud with design partners. If multi-agent trust is a pain point for your team, we want to hear from you.

Request Early Access

No commitment. Tell us about your agent setup and we'll figure out if Veritas Cloud can help.

Stop Starting From Zero

Every conversation makes the next one better. One pip install. Five minutes. $0.00 to start. No API keys.

Support continued development: Buy me a coffee