Why I Built ALMA

The Problem

I was building AI agents for automated testing: a QA tester agent for frontend validation and a backend developer agent for API verification. They worked great... until they didn't.

The same mistakes kept happening:

The QA tester would use sleep(5000) for waits, causing flaky tests
The backend developer would forget that the API uses JWT with 24-hour expiry
Both agents would repeat failed strategies session after session

Every conversation started fresh. No memory. No learning. Just an expensive LLM making the same mistakes I'd already corrected.

The Search for Solutions

I tried Mem0. It stores memories, but:

No way to scope what an agent can learn
No anti-pattern tracking ("don't do this")
No multi-agent knowledge sharing
The QA tester could "learn" database queries it would never use

I looked at LangChain Memory. It's for conversation context, not long-term learning. Different problem.

Nothing fit.

I needed:

Scoped learning - The QA tester learns testing, not backend logic
Anti-patterns - Remember what NOT to do
Multi-agent sharing - Senior agents share knowledge with juniors
Workflow context - Resume complex tasks after failures
MCP integration - Work natively with Claude Code

Building ALMA

So I built it. Agent Learning Memory Architecture.

The core insight: AI agents don't need to modify their weights to "learn." They need smart prompts built from relevant past experiences.

+---------------------------------------------------------------------+
|  BEFORE TASK: Retrieve relevant memories                            |
|  +-- "Last time you tested forms, incremental validation worked"    |
|  +-- "User prefers verbose output"                                  |
|  +-- "Don't use sleep() - causes flaky tests"                       |
+---------------------------------------------------------------------+
|  DURING TASK: Agent executes with injected knowledge                |
+---------------------------------------------------------------------+
|  AFTER TASK: Learn from outcome                                     |
|  +-- Success? -> New heuristic. Failure? -> Anti-pattern.           |
+---------------------------------------------------------------------+

No fine-tuning. No model changes. Just smarter prompts.
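
The retrieve, inject, learn loop in the diagram above can be sketched in a few lines of Python. This is a minimal illustration, not ALMA's actual implementation: `Memory`, `MemoryStore`, `build_prompt`, and the tag-overlap ranking are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """One remembered lesson. `kind` is 'heuristic' or 'anti_pattern'."""
    kind: str
    text: str
    tags: set = field(default_factory=set)


class MemoryStore:
    """Hypothetical in-memory store; not ALMA's real storage layer."""

    def __init__(self):
        self._memories = []

    def retrieve(self, task_tags, limit=5):
        # BEFORE TASK: rank memories by how many tags they share with the task.
        scored = [(len(m.tags & task_tags), m) for m in self._memories]
        relevant = [(score, m) for score, m in scored if score > 0]
        relevant.sort(key=lambda pair: pair[0], reverse=True)
        return [m for _, m in relevant[:limit]]

    def learn(self, text, tags, success):
        # AFTER TASK: success becomes a heuristic, failure an anti-pattern.
        kind = "heuristic" if success else "anti_pattern"
        self._memories.append(Memory(kind, text, set(tags)))


def build_prompt(task, memories):
    """DURING TASK: prepend the retrieved memories to the task description."""
    lines = []
    for m in memories:
        prefix = "AVOID" if m.kind == "anti_pattern" else "PREFER"
        lines.append(f"- {prefix}: {m.text}")
    if not lines:
        return task
    return "Lessons from earlier sessions:\n" + "\n".join(lines) + "\n\nTask: " + task


store = MemoryStore()
store.learn("sleep() for waits caused flaky tests", {"testing", "async"}, success=False)
store.learn("Incremental validation worked for forms", {"testing", "forms"}, success=True)
prompt = build_prompt("Test the signup form", store.retrieve({"testing", "forms"}))
```

The model itself never changes here. The only thing that differs between sessions is the text placed in front of the task.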

What Makes ALMA Different

Scoped Learning

The QA tester agent can only learn what it needs:

agents:
  qa_tester:
    can_learn:
      - testing_strategies
      - selector_patterns
    cannot_learn:
      - backend_logic
      - database_queries
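
One way such a scope could be enforced is a check that runs before any memory is written. The sketch below is an assumption about how that might look, not ALMA's actual code: `SCOPES` and `is_allowed` are hypothetical, and the rules that the deny list wins and that unknown agents learn nothing are design choices made for this example.

```python
# Hypothetical scope table mirroring the YAML above; a real loader may differ.
SCOPES = {
    "qa_tester": {
        "can_learn": {"testing_strategies", "selector_patterns"},
        "cannot_learn": {"backend_logic", "database_queries"},
    }
}


def is_allowed(agent, domain, scopes=SCOPES):
    """Return True only if `domain` is explicitly allowed and not denied."""
    scope = scopes.get(agent)
    if scope is None:
        return False  # unknown agents learn nothing
    if domain in scope["cannot_learn"]:
        return False  # the deny list wins over the allow list
    return domain in scope["can_learn"]


print(is_allowed("qa_tester", "selector_patterns"))
print(is_allowed("qa_tester", "database_queries"))
```

Defaulting to "not allowed" for anything unlisted keeps an agent from picking up knowledge in a domain nobody thought to forbid.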

Anti-Pattern Tracking

When something fails, record WHY and WHAT TO DO INSTEAD:

# Assumes `alma` is an already-configured ALMA client instance
alma.add_anti_pattern(
    agent="qa_tester",
    pattern="Using sleep() for async waits",
    why_bad="Causes flaky tests, wastes time",
    better_alternative="Use explicit waits with conditions"
)
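
For readers unfamiliar with the "better alternative" named in that record, here is a generic sketch of an explicit wait: poll a condition and return as soon as it holds, instead of sleeping for a fixed time. `wait_until` is a hypothetical helper written for this example. It is not part of ALMA or of any test framework, most of which ship their own equivalent.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # done as soon as the condition holds
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)  # short pause between checks, not a fixed wait


# Example: a stand-in condition that becomes true on the third check.
calls = {"n": 0}


def ready():
    calls["n"] += 1
    return calls["n"] >= 3


wait_until(ready, timeout=2.0, interval=0.01)
```

The test waits only as long as it has to, and a condition that never becomes true fails loudly with a timeout instead of silently passing or flaking.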

Multi-Agent Sharing

Senior agents teach juniors:

agents:
  senior_architect:
    share_with: [junior_dev, qa_agent]

  junior_dev:
    inherit_from: [senior_architect]
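
To make the two keys concrete, here is a sketch of how a lookup might decide whose memories an agent can read. It assumes sharing requires consent on both sides, which the config above suggests but does not state: the reader lists the source in `inherit_from` and the source lists the reader in `share_with`. `CONFIG` and `readable_sources` are hypothetical names, not ALMA's API.

```python
# Hypothetical config mirroring the YAML above.
CONFIG = {
    "senior_architect": {"share_with": ["junior_dev", "qa_agent"]},
    "junior_dev": {"inherit_from": ["senior_architect"]},
}


def readable_sources(agent, config=CONFIG):
    """List the agents whose memories `agent` may read, itself first.

    Assumption: a source counts only if it also names `agent` in `share_with`.
    """
    sources = [agent]
    for source in config.get(agent, {}).get("inherit_from", []):
        if agent in config.get(source, {}).get("share_with", []):
            sources.append(source)
    return sources


print(readable_sources("junior_dev"))
print(readable_sources("qa_agent"))
```

Under this assumption `qa_agent` reads only its own memories until it declares `inherit_from` itself, even though the senior agent already offers to share.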

Workflow Context (v0.6.0)

Complex tasks can checkpoint and resume:

# Save state mid-workflow
alma.checkpoint(workflow_id="deploy-v2", state=current_state)

# Resume after failure
alma.resume(workflow_id="deploy-v2")
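
To show what checkpoint and resume amount to, here is a file-based sketch of the same idea. It is an illustration under stated assumptions, not how ALMA stores workflow state: `FileCheckpointer` is a hypothetical class, and it assumes the state is JSON-serializable.

```python
import json
import tempfile
from pathlib import Path


class FileCheckpointer:
    """Hypothetical JSON-file checkpoint store; not ALMA's implementation."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, workflow_id):
        return self.directory / f"{workflow_id}.json"

    def checkpoint(self, workflow_id, state):
        # Write to a temp file, then rename it into place, so a crash
        # mid-write cannot leave a half-written checkpoint behind.
        tmp = self._path(workflow_id).with_suffix(".tmp")
        tmp.write_text(json.dumps(state))
        tmp.replace(self._path(workflow_id))

    def resume(self, workflow_id):
        # Return the last saved state, or None if nothing was saved.
        path = self._path(workflow_id)
        if not path.exists():
            return None
        return json.loads(path.read_text())


with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoints = FileCheckpointer(tmp_dir)
    checkpoints.checkpoint("deploy-v2", {"step": 3, "done": ["build", "test"]})
    restored = checkpoints.resume("deploy-v2")
    missing = checkpoints.resume("never-saved")
```

Returning `None` for an unknown workflow lets the caller decide whether to start from scratch or treat the missing checkpoint as an error.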

The Result

The agents now:

Remember what worked across sessions
Avoid strategies that failed before
Share knowledge with each other
Pick up complex tasks where they left off

They're not smarter. They're better informed.

Open Source

ALMA is MIT licensed. Use it, modify it, contribute to it.

GitHub: github.com/RBKunnela/ALMA-memory
PyPI: pip install alma-memory
npm: @rbkunnela/alma-memory

If your AI agents keep making the same mistakes, they don't have a memory problem. They have an ALMA problem.
