v0.4.1 — Production hardened

Memory that refuses
to remember hallucinations.

A PostgreSQL extension that gives AI agents persistent, auditable memory. Every lesson requires a verifiable artifact before it's stored — so a single bad LLM call doesn't silently corrupt every future run.

0.8409 · LoCoMo recall@10 (+4.15pp)
0.9334 · LongMemEval recall@10
PG 14–17 · PostgreSQL
Apache 2.0 · Open source

$ pgxn install pgmnemo==0.4.1

What is pgmnemo

Memory for AI agents,
without the noise.

Most multi-agent systems are stateless. Each run starts from zero, re-discovering decisions and rules the previous run already figured out. Adding memory fixes that — but introduces a new problem.

The problem. AI agents hallucinate. Today's memory systems store whatever the agent says — including hallucinated summaries, made-up commit IDs, and confident-but-wrong conclusions. Three weeks later, a different agent reads that bad lesson and builds on it. The mistake compounds across runs. Your memory layer accumulates noise at the same rate it accumulates knowledge — and there's no way to tell which is which.

The fix. pgmnemo blocks that pattern with a provenance gate. Before any lesson is promoted to long-term memory, it must be attached to a verifiable artifact: a git commit_sha, a file hash, or a passing test ID. If no artifact exists, the lesson stays in a staging queue — useful for the current session, never trusted by future runs.
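The gate's routing rule can be sketched in a few lines of Python (an illustrative model only; the real check runs as a database-level row policy, and the function name `route_lesson` and the SHA patterns are assumptions):

```python
# Illustrative sketch of the provenance-gate decision, NOT pgmnemo's
# actual implementation: a lesson reaches canonical memory only when it
# carries a verifiable artifact reference; otherwise it stays staged.
import re

COMMIT_SHA = re.compile(r"^[0-9a-f]{7,40}$")   # abbreviated or full git SHA
SHA256_HEX = re.compile(r"^[0-9a-f]{64}$")     # file-content hash

def route_lesson(lesson_text, commit_sha=None, artifact_hash=None, test_id=None):
    """Return 'canonical' if any verifiable artifact is attached, else 'staging'."""
    if commit_sha and COMMIT_SHA.match(commit_sha):
        return "canonical"
    if artifact_hash and SHA256_HEX.match(artifact_hash):
        return "canonical"
    if test_id:
        return "canonical"
    # Usable for the current session, never trusted by future runs.
    return "staging"
```

So `route_lesson("JWT…", commit_sha="abc1234")` lands in canonical memory, while a confident but artifact-free summary stays in staging.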

Where it lives. Not a separate service. Not a SaaS API. CREATE EXTENSION pgmnemo; in your existing PostgreSQL and you're done. Reads and writes are plain SQL functions. The gate runs as a database-level row policy — application code cannot bypass it. Your data never leaves your server.

The one differentiator nobody else has:

Write-time provenance. Mem0, Zep, Letta, Pinecone — none of them require proof that a lesson is real before storing it. pgmnemo does, and the check is at the database layer, not the application layer.

Phantom work stays phantom. Real work gets remembered.

How it works

Three SQL calls.
That's the whole pipeline.

No daemon, no sidecar, no orchestration layer. Your agent code calls pgmnemo functions. The database does the rest.

1. Agent finishes a task

It produces an artifact — a git commit, a saved file, a passing test. That artifact is the proof the work is real.

# your code:
sha = git.commit("fix JWT")
result = run_tests()
# sha = 'abc1234'
2. pgmnemo verifies and stores

ingest() requires commit_sha or artifact_hash. Without it the write is blocked at the database layer — not by your app.

SELECT pgmnemo.ingest(
  p_lesson_text := 'JWT…',
  p_commit_sha  := 'abc1234'
);
3. Future agents recall it

recall_lessons() returns the most relevant past lessons using hybrid scoring: HNSW vectors + BM25 full-text + recency + importance.

SELECT * FROM pgmnemo.recall_lessons(
  query_text := 'JWT rotation'
) ORDER BY score DESC;

Why pgmnemo

Four things nobody else does.

Provenance gate

Write-time artifact requirement enforced at the database layer. The only memory product with this — and the reason hallucinations stay out of canonical memory.

Zero new services

No daemons, no sidecars, no cloud accounts. CREATE EXTENSION pgmnemo CASCADE; — that's it. Memory lives where your data already does.

Hybrid recall in one SQL call

HNSW + BM25 + recency + importance, weighted via a GUC-tunable formula. No glue code. No re-ranker microservice. One SELECT.

First-class role isolation

Multi-tenant from day one. role + project_id row-level security enforced inside Postgres — not by your app.

Who it's for

You probably already have
the stack pgmnemo needs.

If three of these describe your project, pgmnemo replaces ~200 lines of ad-hoc memory code with two SQL function calls.

You run a multi-agent pipeline (research→write→review, plan→code→test) and each run starts from zero context.

Your stack already has PostgreSQL + pgvector on Supabase, Neon, or self-hosted. You don't want to add a separate memory service.

You've watched an agent hallucinate a summary and seen that wrong summary poison the next run. You need a way to block that.

Your data has residency or sovereignty constraints — it can't go through a cloud memory API on someone else's infra.

You want memory that costs nothing beyond your existing Postgres bill. Apache 2.0, no usage tiers, no per-request pricing.

You need multi-tenant isolation at the database layer, not bolted on in your app code as another middleware to maintain.

Comparison

Honest comparison.
We benchmark against ourselves.

We publish what's true today, not what we hope to ship next quarter.

Capability              | pgmnemo           | mem0            | Zep             | Letta
Provenance enforcement  | mandatory         | not enforced    | not enforced    | not enforced
Zero data egress        | in-database       | Cloud API       | Cloud API       | Cloud API
Install model           | CREATE EXTENSION  | SaaS API key    | SaaS API key    | Separate service
Hybrid recall           | default (v0.4.0)  | Varies          | Varies          | Varies
Self-hosted cost        | Free · Apache-2.0 | $0.004/1K reads | $0.0001/msg     | Vendor
LongMemEval recall@10   | 0.9334            | Not published   | Not published   | Not published

API

Two function calls.
Everything you need.

SQL
-- Store with provenance (required)
SELECT pgmnemo.ingest(
    p_role        := 'developer',
    p_project_id  := 1,
    p_topic       := 'auth',
    p_lesson_text := 'Rotate JWT after key compromise.',
    p_commit_sha  := 'abc1234'
);

-- Hybrid recall
SELECT lesson_text, score, vec_score, bm25_score
FROM pgmnemo.recall_lessons(
    query_embedding := embed('JWT rotation'),
    query_text      := 'JWT secret rotation',
    role_filter     := 'developer'
) ORDER BY score DESC LIMIT 10;
1. Provenance-gated writes

commit_sha or artifact_hash required at insert time. No artifact → no canonical row.

2. Hybrid scoring formula

0.4×cosine + 0.4×BM25 + 0.2×recency — all weights GUC-tunable per session.
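The documented weights can be checked with a few lines of Python. This is a sketch, assuming normalized inputs; the exponential recency decay and its 30-day half-life are assumptions, since only the 0.4 / 0.4 / 0.2 weights are fixed above:

```python
# Sketch of the hybrid formula 0.4*cosine + 0.4*BM25 + 0.2*recency.
# In pgmnemo the weights are GUC-tunable per session; the decay shape
# and half-life below are illustrative assumptions.
def hybrid_score(cos_sim, bm25_norm, age_days,
                 w_vec=0.4, w_bm25=0.4, w_recency=0.2, half_life_days=30.0):
    """Combine normalized retrieval signals into one score in [0, 1]."""
    recency = 0.5 ** (age_days / half_life_days)   # assumed exponential decay
    return w_vec * cos_sim + w_bm25 * bm25_norm + w_recency * recency

# A fresh, exact match scores the maximum:
print(round(hybrid_score(1.0, 1.0, 0.0), 2))   # prints 1.0
```

A 30-day-old lesson contributes only half of its recency weight, so stale-but-relevant lessons can still win on the vector and BM25 terms.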

3. Graph traversal

traverse_causal_chain() and traverse_temporal_window() for typed edges between lessons.

4. Diagnostic columns

vec_score, bm25_score, rrf_score show which retrieval path fired.

5. LangChain integration

Drop-in retriever in integrations/langchain/.
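The rrf_score diagnostic column above points at reciprocal-rank fusion. Here is a generic RRF sketch; the constant k=60 is the conventional default from the RRF literature, assumed here, and pgmnemo's actual constant and inputs may differ:

```python
def rrf(rankings, k=60):
    """Fuse several ranked ID lists into one score per ID (higher is better)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1/(k + rank) for every ID it ranks.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores

# Hypothetical lesson IDs as ranked by each retrieval path:
vec_hits  = ["L3", "L7", "L9"]   # vector-similarity order
bm25_hits = ["L3", "L1", "L7"]   # BM25 order
fused = rrf([vec_hits, bm25_hits])
# "L3" tops both lists, so it wins the fused ranking.
```

Because each path contributes only its rank, RRF needs no score normalization between the cosine and BM25 scales, which is why it is a common fusion choice.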

Benchmarks

Real numbers.
Honest caveats.

LoCoMo
Maharana et al., ACL 2024
recall@10 · 0.8409 (+4.15pp)
recall@5  · 0.7230 (+6.07pp)
MRR       · 0.6365 (+7.96pp)
LongMemEval-S
Wu et al., ICLR 2025
recall@10 (pgmnemo)       · 0.9334
MRR (pgmnemo)             · 0.8472
recall@10 (BM25 baseline) · 0.9820 (baseline wins)

Honest caveat

We lose to a 50-line BM25 script on LongMemEval (0.9334 vs 0.9820). Our LoCoMo session-level number uses a 22× smaller search space than the paper baseline. Comparisons with Mem0/Zep on these datasets are apples-to-oranges — they optimize different objectives. See COMPETITIVE_REALITY.md.

Installation

Pick your path.
All three take under five minutes.

▶ PGXN · RECOMMENDED
One-shot install
pgxn install pgmnemo==0.4.1
CREATE EXTENSION pgmnemo CASCADE;
▶ DOCKER · PRODUCTION
Bake into your image
FROM pgvector/pgvector:pg17
ADD pgmnemo-0.4.1.zip /tmp/
RUN unzip /tmp/pgmnemo-0.4.1.zip -d /tmp \
 && cp -r /tmp/pgmnemo/extension/* \
    $(pg_config --sharedir)/extension/
▶ DEV / LAPTOP
Local Postgres in 60s
docker run -d --name pgm \
  -e POSTGRES_PASSWORD=pgm \
  pgvector/pgvector:pg17
curl -LO https://…/pgmnemo-0.4.1.zip
docker cp pgmnemo-0.4.1.zip pgm:/tmp/

PostgreSQL 17: tested in blocking CI  ·  PG 14 / 15 / 16: aspirational  ·  pgvector ≥ 0.7.0 required

Full install guide ↗