append-only log of every learning event · self-vector EMA — a slowly drifting 8192-bit centroid representing "what kind of agent am I"
db.what_did_i_learn(since) · db.self_vector() · db.attention_trace(recall_id)
50 years of memory research names five biological memory systems. add goals/beliefs and the self-model and you get the seven floors agidb stores as first-class typed shapes.
50 years of memory research (tulving, squire, baddeley, mcclelland) names five biological memory systems. add goals/beliefs and the self-model — what an agent needs that a brain organism doesn't — and you get the seven floors agidb stores as first-class typed shapes.
↗ = surprise-gated promotion · ↻ = revision · ⊕ = bound role-filler · α = EMA rate
floors 1·3 → hippocampus · floor 4 → neocortex · floor 5 → basal ganglia · floor 7 → DMN + dlPFC
✅ shipped from sochdb v1 inheritance · ⬜ phase 9–16 (v2.0 + v2.1)
append-only log of every learning event · self-vector EMA — a slowly drifting 8192-bit centroid representing "what kind of agent am I"
db.what_did_i_learn(since) · db.self_vector() · db.attention_trace(recall_id)
what the agent wants (state machine: Active / Paused / Completed / Abandoned) and what it thinks is true (confidence + evidence + revision audit)
db.set_goal(g) · db.assert_belief(b) · db.revise_belief(evidence) · db.what_do_i_believe(about)
skills and workflows with execution traces and success-count statistics — typed episode shape
db.observe_procedure(p) · db.record_execution(p, trace) · db.procedure_stats(p)
decoupled general knowledge · facts consolidated from N≥3 episodic patterns into a SemanticAtom
db.consolidate() · db.what_about(concept_id)
events with bi-temporal stamps and HDC signatures · the core typed shape · multimodal in v2.1
db.observe(text, ctx) · db.observe_multimodal(video, audio, text) · db.recall(query)
active context · session-scoped · recency-weighted retrieval over episodic — no separate table
db.recall(Query::with_session(sid))
raw signal ring buffer with surprise-gated promotion to episodic · v2.1: multimodal (video + audio + text) with brain-calibrated θ_brain
db.observe_sensory(frame) · db.observe_multimodal(...) · surprise > θ_brain ↗ episodic
The five new first-class types that make agidb a cognitive substrate rather than a memory database: Goals, Beliefs, Sensory frames, Self-model, and the Unlearn API. Extended in v2.1 with multimodal sensory and brain-aligned surprise gating.
Every existing agent memory system stores text or vectors. agidb stores typed cognitive primitives with their own retrieval semantics, audit trails, and lifecycle behavior. The five primitives are:
| Primitive | Floor | What it represents |
|---|---|---|
| Goal | 6 | What the agent wants. State machine with parent-child hierarchy. |
| Belief | 6 | What the agent thinks is true. Revisable with audit. |
| SensoryFrame | 1 | Raw input before promotion to episodic memory. Surprise-gated. |
| LearningEvent + SelfVector | 7 | The self-model: audit log of state changes + slowly-drifting self-representation. |
| UnlearnTarget | cross-floor | Reference to something the agent should forget. Triggers cascading non-destructive removal. |
Each is a Rust type, has its own redb table, has explicit API methods, and is property-tested.
pub struct Goal {
pub id: GoalId,
pub parent_id: Option<GoalId>,
pub description: String,
pub state: GoalState,
pub success_criteria: Vec<SuccessCriterion>,
pub deadline: Option<DateTime<Utc>>,
pub signature: HV,
pub created_at: DateTime<Utc>,
pub updated_at: DateTime<Utc>,
pub provenance: Provenance,
}
pub enum GoalState {
Active,
Paused { since: DateTime<Utc>, reason: String },
Completed { at: DateTime<Utc>, evidence: Vec<EpisodeId> },
Abandoned { at: DateTime<Utc>, reason: String },
}
Active
↗ ↓ ↘
(revive) ↓ (complete) → Completed
↘ ↓ ↙ (terminal)
Paused → Active
↓
Abandoned (terminal)
Validation: Completed and Abandoned are terminal. Pause/resume preserve history. State transitions emit LearningEvent::GoalStateChanged.
Goals form a tree. A parent goal completes when all its children complete (configurable: any-child or all-children semantics). Each goal carries an HDC signature derived from its description + parent context. Goal-biased recall uses these signatures to up-weight relevant memories.
impl Agidb {
pub async fn set_goal(&self, goal: Goal) -> Result<GoalId>;
pub async fn revise_goal(&self, id: GoalId, patch: GoalPatch) -> Result<()>;
pub async fn complete_goal(&self, id: GoalId, evidence: Vec<EpisodeId>) -> Result<()>;
pub async fn abandon_goal(&self, id: GoalId, reason: String) -> Result<()>;
pub async fn active_goals(&self) -> Result<Vec<Goal>>;
pub async fn goal_tree(&self, root: GoalId) -> Result<GoalTree>;
pub async fn get_goal(&self, id: GoalId) -> Result<Option<Goal>>;
}
Existing systems store goals as text in episode descriptions. The agent code then re-parses, re-tracks, re-evaluates. State machines live in agent code, not the database. This is brittle. agidb makes goals a typed shape so:
A long-running agent accumulates hundreds to thousands of goals over months. Without first-class goal storage, the agent has no reliable way to ask “what was I trying to do last week?” or “what subgoals support my current top-level goal?” agidb makes these queries trivial.
pub struct Belief {
pub id: BeliefId,
pub claim: String,
pub subject: ConceptId,
pub predicate: String,
pub object: Value,
pub confidence: f32, // [0.0, 1.0]
pub evidence: Vec<EpisodeId>,
pub contradictions: Vec<EpisodeId>,
pub revision_log: Vec<BeliefRevision>,
pub signature: HV,
pub t_valid_start: DateTime<Utc>,
pub t_valid_end: Option<DateTime<Utc>>,
pub t_tx_start: DateTime<Utc>,
pub t_tx_end: Option<DateTime<Utc>>,
pub provenance: Provenance,
}
pub struct BeliefRevision {
pub timestamp: DateTime<Utc>,
pub previous_confidence: f32,
pub new_confidence: f32,
pub triggering_evidence: Option<EpisodeId>,
pub reason: String,
}
When new evidence arrives:
contradictions.BeliefRevision to revision_log.LLMs may participate at write time for revision (constitution article IV amendment): the LLM is asked “does evidence E contradict claim C?” with structured prompt; output is parsed into a typed RevisionDecision. The decision plus the LLM’s reasoning becomes the reason field. The read path stays LLM-free.
impl Agidb {
pub async fn assert_belief(&self, belief: Belief) -> Result<BeliefId>;
pub async fn revise_belief(
&self,
id: BeliefId,
new_evidence: EpisodeId
) -> Result<RevisionReport>;
pub async fn what_do_i_believe(&self, about: ConceptId) -> Result<Vec<Belief>>;
pub async fn belief_history(&self, id: BeliefId) -> Result<Vec<BeliefRevision>>;
pub async fn withdraw_belief(&self, id: BeliefId, reason: String) -> Result<()>;
}
Beliefs differ from facts because they have:
Storing beliefs as Belief rather than Episode enables:
Beliefs are revisable, never overwritten. Every revision appends to the log. The agent can always answer “what did I believe last week, and what changed my mind?”
pub struct SensoryFrame {
pub id: SensoryId,
pub modality: Modality,
pub data: SensoryData,
pub received_at: DateTime<Utc>,
pub surprise_score: f32,
pub promoted_to: Option<EpisodeId>,
}
pub enum Modality {
Text,
Image { path: PathBuf },
Audio { path: PathBuf, duration_ms: u32 },
Video { path: PathBuf, frame_count: u32, fps: f32 },
Multimodal { components: Vec<Modality> }, // v2.1+
}
pub enum SensoryData {
InlineText(String),
BlobRef(PathBuf),
InlineLatent { encoder: EncoderId, latent: Vec<f32> }, // v2.1+
}
The sensory buffer is a fixed-capacity ring (default 1000 frames or 60 seconds). When full, oldest frames are dropped unless promoted. Promotion happens when:
surprise_score > θ (threshold; v2.0 default 0.4, v2.1 brain-calibrated)observe() call promotes the framesurprise(frame) = 1 - similarity(frame_signature, bundle_of(recent_beliefs))
If the frame’s signature is highly similar to the agent’s current beliefs, surprise is low → don’t store. If it’s a novel observation, surprise is high → promote to episodic.
Same formula, but threshold θ_brain is empirically fit against TRIBE v2’s predicted neural surprise on associative cortex (TPJ, dlPFC, DMN). See BRAIN_ALIGNMENT.md.
impl Agidb {
pub async fn observe_multimodal(
&self,
video: Option<VideoClip>,
audio: Option<AudioClip>,
text: Option<String>,
ctx: ObserveContext,
) -> Result<EpisodeId>;
}
The pipeline:
The resulting episode signature is factorable: each modality component can be recovered from the bound episode by XORing with the appropriate ROLE_* hypervector. This is the structural advantage over TRIBE’s attention fusion and over mem0/letta/zep’s dense embeddings.
impl Agidb {
// v2.0
pub async fn observe_sensory(&self, frame: SensoryFrame) -> Result<SensoryId>;
pub async fn working_state(&self) -> Result<SensoryBuffer>;
pub async fn surprise_score(&self, frame: &SensoryFrame) -> Result<f32>;
// v2.1
pub async fn observe_multimodal(
&self,
video: Option<VideoClip>,
audio: Option<AudioClip>,
text: Option<String>,
ctx: ObserveContext,
) -> Result<EpisodeId>;
pub async fn surprise_score_brain_calibrated(
&self,
frame: &MultimodalFrame
) -> Result<f32>;
pub async fn extract_modality_signature(
&self,
episode_id: EpisodeId,
modality: Modality,
) -> Result<Option<HV>>;
}
Without a sensory floor, every observation goes straight to episodic memory. Storage grows unbounded; consolidation has too much to consolidate; the agent over-remembers trivia. The sensory buffer + surprise gating is the agent’s attentional filter. It’s how the agent decides what’s worth remembering.
In v2.1, multimodal sensory makes agidb the only agent memory substrate that handles video and audio as first-class inputs, not as text descriptions of media.
Two components:
learning_events — append-only log of every introspectable event the system records.self_vector — a slowly-drifting 8192-bit hypervector representing “what kind of agent am I right now.”pub enum LearningEvent {
EpisodeStored { id: EpisodeId, at: DateTime<Utc> },
MultimodalEpisodeStored { id: EpisodeId, modalities: Vec<Modality>, at: DateTime<Utc> },
GoalStateChanged { id: GoalId, from: GoalState, to: GoalState, at: DateTime<Utc> },
BeliefAsserted { id: BeliefId, claim: String, confidence: f32, at: DateTime<Utc> },
BeliefRevised { id: BeliefId, revision: BeliefRevision },
BeliefWithdrawn { id: BeliefId, reason: String, at: DateTime<Utc> },
SensoryFramePromoted { sensory_id: SensoryId, episode_id: EpisodeId, surprise: f32 },
SemanticAtomFormed { atom_id: AtomId, source_episodes: Vec<EpisodeId>, at: DateTime<Utc> },
ContradictionDetected { atoms: Vec<AtomId>, at: DateTime<Utc> },
Unlearned {
target: UnlearnTarget,
cascade_size: usize,
self_vector_drift: u32,
audit_id: AuditId,
at: DateTime<Utc>,
},
AttentionTraced { recall_id: RecallId, signatures_considered: usize, at: DateTime<Utc> },
SelfVectorUpdated { drift_hamming: u32, at: DateTime<Utc> },
ConsolidationRun { atoms_created: usize, contradictions: usize, at: DateTime<Utc> },
}
Closed enum (constitution article XV implication). New event types require ADR.
Inspired by:
agidb’s self-vector is a slowly-drifting 8192-bit HV updated each consolidation epoch:
self_vector ← (1 - α) * self_vector + α * bundle(consolidated_atoms)
with α ≈ 0.05. (Bundle is per-bit majority over the input set, then thresholded back to binary.)
When episodes are tombstoned via the unlearn API:
self_vector ← self_vector - α * bundle(tombstoned_signatures)
(In binary HDC: bundle the tombstoned signatures, XOR with self_vector, then threshold.)
Without this step the self-model still “remembers” unlearned concepts as centroid contamination. With it, unlearn affects the self-vector. This makes the unlearn primitive real.
impl Agidb {
pub async fn what_did_i_learn(
&self,
since: DateTime<Utc>
) -> Result<Vec<LearningEvent>>;
pub async fn attention_trace(
&self,
recall_id: RecallId
) -> Result<Option<AttentionTrace>>;
pub async fn self_vector(&self) -> Result<HV>;
pub async fn self_vector_at(&self, time: DateTime<Utc>) -> Result<HV>;
pub async fn introspect(&self, query: IntrospectionQuery) -> Result<IntrospectionResult>;
}
Example questions the self-model answers:
A self-modifying agent needs introspection. Without a self-model, the agent can act but cannot reason about its own actions. With one, the agent can explain itself, debug itself, and (in v2.5) modify itself with formal guarantees about what it changed.
A first-class, cascading, non-destructive removal operation. Constitution articles XII and XVI.
pub enum UnlearnTarget {
Episode(EpisodeId),
Belief(BeliefId),
Concept(ConceptId),
BySource(String), // "GDPR request from user X"
BySession(SessionId), // "this entire conversation"
Pattern(QueryPattern), // "anything matching these criteria"
}
1. IDENTIFY CASCADE
Find all episodes, beliefs, semantic atoms, procedures referencing
the target. Compute the dependency graph.
2. TOMBSTONE (non-destructive)
Mark each affected row with t_tombstoned = now.
Signatures invalidated in mmap.
Concept HV marked withdrawn.
Inverted-index entries removed.
3. CASCADE
Beliefs whose evidence drops below threshold → confidence reduced
or belief withdrawn. Affected semantic atoms recomputed without
removed evidence. Procedures with broken trigger chains marked degraded.
4. SELF-VECTOR SUBTRACTION (v2)
self_vector ← self_vector - α * bundle(tombstoned_signatures)
The self-model no longer contains contamination from the unlearned data.
5. AUDIT (permanent)
Emit LearningEvent::Unlearned with:
- target_ref
- cascade_size: episodes, beliefs, atoms, procedures affected
- self_vector_drift: hamming distance before/after
- reason
- dependency_graph_id
6. TOMBSTONE EXPIRY
Default 30 days. After expiry, signatures.dat compacted, tombstoned
rows physically removed. BUT the LearningEvent::Unlearned record
is permanent — the *fact that data was removed* survives compaction.
impl Agidb {
pub async fn unlearn(
&self,
target: UnlearnTarget,
reason: String
) -> Result<UnlearnReport>;
pub async fn unlearn_report(&self, audit_id: AuditId) -> Result<UnlearnReport>;
pub async fn unlearn_history(
&self,
since: DateTime<Utc>
) -> Result<Vec<UnlearnEvent>>;
pub async fn restore_within_window(&self, audit_id: AuditId) -> Result<RestoreReport>;
}
pub struct UnlearnReport {
pub episodes_removed: usize,
pub beliefs_removed: usize,
pub beliefs_revised: usize,
pub semantic_atoms_affected: usize,
pub procedures_affected: usize,
pub signatures_invalidated: usize,
pub self_vector_drift_hamming: u32,
pub audit_log_entry: LearningEventId,
pub tombstone_expiry: DateTime<Utc>,
}
GDPR Article 17 (right to erasure), HIPAA, financial-data retention laws all require deletion semantics that vanilla DELETE doesn’t provide:
mem0, letta, zep, cognee all ship delete() as DELETE FROM. None handle cascading, none preserve audit, none subtract from the self-model. agidb is the first agent memory layer to take these requirements seriously from v0.1.
Without step 4, here’s what happens: an enterprise customer asks “do you still remember information X?” The agent runs a recall query → no results returned. Customer is satisfied. But the self-vector (used in goal-biased retrieval, surprise gating, etc.) still contains a contribution from the unlearned signatures. Internal agent behavior is still subtly biased by what was supposed to be forgotten.
This is the difference between hiding data and forgetting data. agidb forgets.
let db = Agidb::open("./memory.agidb").await?;
// floor 6 — goal
let goal = db.set_goal(Goal::new("find a thai place for the team dinner")).await?;
// floor 1 — sensory + extraction + promotion (v2.0 text-only)
db.observe_sensory(SensoryFrame::text("sarah recommended bawri")).await?;
// floor 1 — sensory + extraction + promotion (v2.1 multimodal)
db.observe_multimodal(
Some(video_clip),
Some(audio_clip),
Some("sarah said bawri at dinner".into()),
ObserveContext::default(),
).await?;
// floor 6 — belief
db.assert_belief(
Belief::new("Bawri is a thai restaurant")
.with_confidence(0.9)
.with_evidence_from("sarah's recommendation")
).await?;
// floor 3 → 4 — consolidation
db.consolidate().await?;
// floor 1-7 — goal-biased recall
let result = db.recall("what thai place was mentioned?").await?;
// Goal-biased: "find a thai place" is active, so thai-related matches are up-weighted.
// Returns: Bawri (0.94), with provenance back to Sarah's observation.
// floor 7 — introspection
let log = db.what_did_i_learn(since_yesterday()).await?;
// Returns: [GoalSet, MultimodalEpisodeStored, BeliefAsserted, SemanticAtomFormed, ...]
// later: unlearn
db.unlearn(UnlearnTarget::BySession(dinner_session), "user requested forget".into()).await?;
// Cascading removal + self-vector subtraction + permanent audit.
The five primitives turn agidb from a memory database into a cognitive substrate. Each primitive could exist standalone, but their composition is what makes the substrate AGI-grade:
Together, they implement the minimum credible substrate for autonomous AI agents.
In v2.1, multimodal sensory adds video+audio+text as first-class inputs, brain-calibrated surprise empirically grounds the sensory threshold, and the BAMS benchmark validates the substrate against human cortical ground truth. The cognitive primitives become brain-aligned cognitive primitives. That’s the v2.1 wedge.