agidb · technical preprint · landing artifact · last build 2026-05-23 14:32 UTC · commit 7f4ab02 · v2 pre-alpha · 44 tests passing · phase 7/16

agidb — Layer 1: Recall

The mind-like layer. HDC (hyperdimensional computing) signatures, binding, bundling, hamming-distance retrieval, tiered confidence, goal-biased weighting, attention tracing. The substrate’s most-touched code path; the one the user experiences.

What layer 1 is

Layer 1 is where memories become hypervectors and retrieval becomes bit-overlap counting. It’s the layer that makes agidb feel different from a vector database. Where other systems do “embed → similarity search → rerank → answer,” layer 1 does “bind into HDC → POPCOUNT scan → tiered confidence → answer.”

Layer 1 sits on top of layer 3 (storage) and uses output from layer 2 (extraction). Its public-facing surface is recall(), the read-path function the agent calls most often.

The HDC kernel

Hypervectors

8192-bit binary hypervectors. 1024 bytes each. 64-byte aligned for SIMD.

#[repr(align(64))]
pub struct HV {
    pub bits: [u64; 128],   // 128 × 64 = 8192 bits
}

The three core operations

1. Bind (XOR):

pub fn bind(&self, other: &HV) -> HV {
    HV { bits: std::array::from_fn(|i| self.bits[i] ^ other.bits[i]) }
}

Bind two HVs into a third HV that is dissimilar to both. Used for role-filler patterns: bind(ROLE_SUBJ, Sarah_HV) is “the subject is Sarah,” similar to neither ROLE_SUBJ alone nor Sarah_HV alone. Self-inverse: bind(bind(a, b), b) = a.
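The self-inverse property can be checked directly. The following is a minimal sketch, not agidb's actual code: `HV::random` and the SplitMix64 generator are stand-ins introduced here so the example runs with the standard library alone.

```rust
/// 8192-bit binary hypervector (alignment attribute omitted for brevity).
#[derive(Clone, Debug, PartialEq)]
pub struct HV {
    pub bits: [u64; 128],
}

/// SplitMix64 step: a small std-only PRNG, used here only to fill demo HVs.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

impl HV {
    /// Hypothetical helper: pseudo-random HV from a 64-bit seed.
    pub fn random(seed: u64) -> HV {
        let mut state = seed;
        HV { bits: std::array::from_fn(|_| splitmix64(&mut state)) }
    }

    /// XOR-bind, word by word.
    pub fn bind(&self, other: &HV) -> HV {
        HV { bits: std::array::from_fn(|i| self.bits[i] ^ other.bits[i]) }
    }

    /// 1 - hamming / 8192.
    pub fn similarity(&self, other: &HV) -> f32 {
        let d: u32 = self.bits.iter().zip(&other.bits).map(|(a, b)| (a ^ b).count_ones()).sum();
        1.0 - (d as f32) / 8192.0
    }
}
```

Binding a role to a filler and then binding the result with the role again returns the filler bit for bit, while the bound HV itself scores near 0.5 against either input.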

2. Bundle (majority vote):

pub fn bundle(hvs: &[HV]) -> HV {
    let mut counts = [0i32; 8192];
    for hv in hvs {
        for bit_idx in 0..8192 {
            if hv.get_bit(bit_idx) { counts[bit_idx] += 1; } else { counts[bit_idx] -= 1; }
        }
    }
    let mut result = HV::zero();
    for bit_idx in 0..8192 {
        if counts[bit_idx] > 0 { result.set_bit(bit_idx); }
    }
    result
}

Bundle multiple HVs into one HV that is similar to all of them. Per-bit majority vote, thresholded to binary. Used to compose multiple bindings (multiple role-filler pairs in an episode signature).
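To make "similar to all of them" concrete, here is a small self-contained sketch of the same majority vote. `random_hv` and the SplitMix64 generator are assumptions added for the demo; the bit access is inlined instead of using `get_bit`/`set_bit`.

```rust
pub struct HV {
    pub bits: [u64; 128],
}

/// SplitMix64 step: std-only PRNG used to fill demo HVs.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical helper: pseudo-random HV from a 64-bit seed.
pub fn random_hv(seed: u64) -> HV {
    let mut state = seed;
    HV { bits: std::array::from_fn(|_| splitmix64(&mut state)) }
}

/// Per-bit majority vote. Ties (possible with an even number of inputs) resolve to 0.
pub fn bundle(hvs: &[HV]) -> HV {
    let mut counts = [0i32; 8192];
    for hv in hvs {
        for (i, c) in counts.iter_mut().enumerate() {
            let bit = (hv.bits[i / 64] >> (i % 64)) & 1;
            *c += if bit == 1 { 1 } else { -1 };
        }
    }
    let mut out = HV { bits: [0u64; 128] };
    for (i, &c) in counts.iter().enumerate() {
        if c > 0 {
            out.bits[i / 64] |= 1u64 << (i % 64);
        }
    }
    out
}

pub fn similarity(a: &HV, b: &HV) -> f32 {
    let d: u32 = a.bits.iter().zip(&b.bits).map(|(x, y)| (x ^ y).count_ones()).sum();
    1.0 - (d as f32) / 8192.0
}
```

With three random inputs, each output bit agrees with any given input bit with probability 3/4, so the bundle scores about 0.75 against each input and about 0.5 against an unrelated HV.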

3. Hamming (POPCOUNT distance):

pub fn hamming(&self, other: &HV) -> u32 {
    self.bits.iter().zip(&other.bits)
        .map(|(a, b)| (a ^ b).count_ones())
        .sum()
}
pub fn similarity(&self, other: &HV) -> f32 {
    1.0 - (self.hamming(other) as f32) / 8192.0
}

Hamming distance counts the bit positions where two HVs differ. Similarity is 1 - hamming/8192 ∈ [0, 1]; two unrelated random HVs score about 0.5. POPCOUNT is a single CPU instruction on modern hardware; AVX-512 has VPOPCNTQ; ARM NEON has CNT. The portable Rust path uses u64::count_ones().
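A POPCOUNT scan is then just this distance applied across a slice of signatures. The sketch below is illustrative only; `scan_top_k` is a hypothetical name, and the real read path adds indexing and tiering on top.

```rust
pub struct HV {
    pub bits: [u64; 128],
}

/// Number of differing bits, via the portable count_ones() path.
pub fn hamming(a: &HV, b: &HV) -> u32 {
    a.bits.iter().zip(&b.bits).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Linear scan: up to k (index, distance) pairs, nearest first.
pub fn scan_top_k(query: &HV, sigs: &[HV], k: usize) -> Vec<(usize, u32)> {
    let mut scored: Vec<(usize, u32)> = sigs
        .iter()
        .enumerate()
        .map(|(i, s)| (i, hamming(query, s)))
        .collect();
    scored.sort_by_key(|&(_, d)| d); // stable sort: ties keep index order
    scored.truncate(k);
    scored
}
```

For large k-independent scans a bounded heap would avoid sorting every candidate; the full sort is kept here for clarity.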

Why binary, not real-valued

                      binary BSC (agidb)                      real-valued HRR
bind operation        XOR (1 instruction)                     circular convolution (FFT)
similarity            hamming (POPCOUNT)                      cosine (multiply-add)
storage               1 KB per HV                             32 KB per HV (8192 × float32)
ops/second            ~10× faster                             slower
theoretical capacity  excellent for compositional structure   excellent for analog scalars
factorability         clean (XOR self-inverse)                requires resonator networks

For agidb’s use case (compositional structure over discrete concepts), BSC (binary spatter codes) wins on every row except analog scalars, where HRR (holographic reduced representations) is the better fit. v2.3 adds HRR as a secondary format for analog values (temperatures, scores, probabilities); v2.1 stays pure BSC.

Why 8192 bits

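  • Capacity: the hamming distance between two unrelated random 8192-bit HVs is binomial with mean 4096 and standard deviation ≈ 45 bits, so about 98% of unrelated pairs land above 4000 (similarity below ≈ 0.51). Similarity several standard deviations above 0.5 (≈ 0.53 and up) is meaningful signal rather than chance.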
  • Cache friendliness: 1024 bytes = 16 × 64-byte cache lines. POPCOUNT scan over 100k signatures is ~5ms on Zen 4 portable path, ~1.5ms with AVX-512.
  • Charikar 2002 JL bound: for embedding distances of typical ML latents (1024-2048 dims), 8192 bits provides ε ≈ 0.05 distortion.
  • Round number: 8192 = 2^13, so every bit index fits in 13 bits, which keeps debugging simple.
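The capacity figure follows from a short calculation, sketched here under the assumption that the two HVs are independent and uniformly random, with a normal approximation to the binomial:

```latex
d_H \sim \mathrm{Binomial}\!\left(8192, \tfrac{1}{2}\right), \qquad
\mathbb{E}[d_H] = 4096, \qquad
\sigma = \sqrt{8192 \cdot \tfrac{1}{2} \cdot \tfrac{1}{2}} = \sqrt{2048} \approx 45.3

\Pr[d_H > 4000] \approx \Phi\!\left(\frac{4096 - 4000}{45.3}\right) = \Phi(2.12) \approx 0.98

\text{similarity} = 1 - \frac{d_H}{8192}: \qquad
4096 - 5\sigma \approx 3870 \;\Rightarrow\; \text{similarity} \approx 0.53
```

In similarity units one standard deviation is about 0.0055, so a chance match five standard deviations above 0.5 is vanishingly rare even across a scan of 100k signatures.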

Encoder API

pub fn from_name(name: &str) -> HV {
    let hash = blake3::hash(name.as_bytes());
    HV::from_seed(hash.as_bytes())
}
pub fn from_seed(seed: &[u8; 32]) -> HV {
    let mut rng = ChaCha20Rng::from_seed(*seed);
    let mut hv = HV::zero();
    for bit_idx in 0..8192 {
        if rng.gen_bool(0.5) { hv.set_bit(bit_idx); }
    }
    hv
}

from_name("Sarah") deterministically produces the same 8192-bit HV every time. This is how agidb assigns concept HVs without a learned codebook.
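The determinism can be illustrated without external crates. This sketch swaps in FNV-1a for blake3 and SplitMix64 for ChaCha20 purely so that it runs on the standard library; it produces different bits from the real encoder and is not a drop-in replacement.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct HV {
    pub bits: [u64; 128],
}

/// 64-bit FNV-1a hash (stand-in for blake3 in this sketch).
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// SplitMix64 step (stand-in for ChaCha20Rng in this sketch).
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Same name in, same 8192 bits out: hash the name, seed the PRNG, fill the words.
pub fn from_name(name: &str) -> HV {
    let mut state = fnv1a64(name.as_bytes());
    HV { bits: std::array::from_fn(|_| splitmix64(&mut state)) }
}
```

Two calls with the same name agree exactly, different names yield unrelated HVs, and roughly half the bits are set in each.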

v2.1 extension: HDC projection of dense latents

In v2.1, dense latents from V-JEPA 2 (1024d), Wav2Vec-BERT (1024d), and Llama-3.2-3B (2048d) project to 8192-bit HVs via Charikar 2002 thresholded random projection. See BRAIN_ALIGNMENT.md for the math. The projected HVs participate in layer 1 retrieval identically to text-derived signatures.
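As a rough illustration of thresholded random projection, the sketch below maps a dense vector to 8192 sign bits. It is an assumption-laden stand-in: the rows are pseudo-random ±1 vectors (Charikar's construction uses Gaussian directions), the `project` name and the `seed` parameter are hypothetical, and BRAIN_ALIGNMENT.md remains the reference for the actual math.

```rust
pub struct HV {
    pub bits: [u64; 128],
}

/// SplitMix64 step: std-only PRNG that regenerates the projection rows from a seed.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Sign random projection: bit i is set iff <r_i, x> >= 0,
/// where r_i is a pseudo-random +/-1 row derived from `seed`.
pub fn project(x: &[f32], seed: u64) -> HV {
    let mut state = seed;
    let mut hv = HV { bits: [0u64; 128] };
    for bit in 0..8192 {
        let mut dot = 0.0f32;
        let mut signs = 0u64;
        for (j, &xj) in x.iter().enumerate() {
            if j % 64 == 0 {
                signs = splitmix64(&mut state); // 64 fresh sign bits
            }
            dot += if (signs >> (j % 64)) & 1 == 1 { xj } else { -xj };
        }
        if dot >= 0.0 {
            hv.bits[bit / 64] |= 1u64 << (bit % 64);
        }
    }
    hv
}

pub fn hamming(a: &HV, b: &HV) -> u32 {
    a.bits.iter().zip(&b.bits).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

Vectors at a small angle flip few bits, so nearby latents end up with a small hamming distance, while a vector and its negation disagree on almost every bit. The seed must stay fixed for the database lifetime, for the same reason the ROLE_* HVs are fixed.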

Episode encoding (binding triples into one signature)

A stored episode has zero or more triples. Each triple binds to a partial pattern; all triples bundle into the episode signature.

pub fn bind_triple(triple: &Triple) -> HV {
    let subj = concept_hv(&triple.subject);
    let pred = predicate_hv(&triple.predicate);
    let obj  = value_hv(&triple.object);

    ROLE_SUBJ.bind(&subj)
        ^ ROLE_PRED.bind(&pred)
        ^ ROLE_OBJ.bind(&obj)
}

pub fn encode_episode_signature(triples: &[Triple]) -> HV {
    let triple_sigs: Vec<HV> = triples.iter().map(bind_triple).collect();
    if triple_sigs.is_empty() {
        return HV::zero();
    }
    bundle(&triple_sigs)
}

ROLE_SUBJ, ROLE_PRED, ROLE_OBJ are workspace-init seeded random HVs, fixed for the database lifetime.

Why bundling: a single episode often has multiple triples. Bundling produces one signature similar to all the triple bindings, retrievable by any of them. Example: episode “Sarah said Bawri is thai and Bawri is in Bandra” has two triples. The bundled signature matches partial queries about Sarah, Bawri, thai, or Bandra.
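Within a single triple, bind_triple chains XOR, which is self-inverse, so a missing filler can be recovered exactly when the other two are known. The sketch below shows that unbinding step; `unbind_subject`, `HV::random`, and the explicit `roles` array are hypothetical names introduced for the demo. Once several triples are bundled, the same unbinding yields only an approximation of the filler, which then needs nearest-neighbor cleanup.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct HV {
    pub bits: [u64; 128],
}

/// SplitMix64 step: std-only PRNG used to fill demo HVs.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

impl HV {
    /// Hypothetical helper: pseudo-random HV from a 64-bit seed.
    pub fn random(seed: u64) -> HV {
        let mut state = seed;
        HV { bits: std::array::from_fn(|_| splitmix64(&mut state)) }
    }

    pub fn bind(&self, other: &HV) -> HV {
        HV { bits: std::array::from_fn(|i| self.bits[i] ^ other.bits[i]) }
    }
}

/// roles = [ROLE_SUBJ, ROLE_PRED, ROLE_OBJ]; XOR of the three role-filler bindings.
pub fn bind_triple(roles: &[HV; 3], subj: &HV, pred: &HV, obj: &HV) -> HV {
    roles[0].bind(subj).bind(&roles[1].bind(pred)).bind(&roles[2].bind(obj))
}

/// XOR out the known predicate and object bindings, then the subject role.
pub fn unbind_subject(sig: &HV, roles: &[HV; 3], pred: &HV, obj: &HV) -> HV {
    sig.bind(&roles[1].bind(pred)).bind(&roles[2].bind(obj)).bind(&roles[0])
}
```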

v2.1: multimodal episode encoding

In v2.1, the episode signature additionally binds modality components:

pub fn encode_multimodal_episode(
    text_triples: &[Triple],
    video_sig: Option<HV>,
    audio_sig: Option<HV>,
    text_sig: Option<HV>,
    goal_id: Option<GoalId>,
    belief_ids: &[BeliefId],
    time_bucket: TimeBucket,
) -> HV {
    let mut episode = encode_episode_signature(text_triples);

    if let Some(sv) = video_sig { episode ^= ROLE_VIDEO.bind(&sv); }
    if let Some(sa) = audio_sig { episode ^= ROLE_AUDIO.bind(&sa); }
    if let Some(st) = text_sig  { episode ^= ROLE_TEXT.bind(&st); }
    if let Some(g) = goal_id    { episode ^= ROLE_GOAL.bind(&goal_signature(g)); }
    for b in belief_ids         { episode ^= ROLE_BELIEF.bind(&belief_signature(*b)); }
    episode ^= ROLE_TIME.bind(&time_signature(time_bucket));

    episode
}

The bundle stays factorable: any component can be recovered via XOR with its ROLE_* HV, cleaned up via nearest-neighbor lookup in the relevant codebook. See NEUROSYMBOLIC.md for the full factorization mechanics.

The tiered cascade

Recall never returns the empty set (constitution article VI). It falls through tiers until it finds matches or hits the tier_floor:

Tier A — Exact         canonical entity match via concept index
                       confidence 1.0
Tier B — Similarity    HDC structured signature similarity,
                       POPCOUNT over inverted-index intersection
                       confidence band [0.6, 0.95]
Tier C — Gist          raw-text gist signature similarity
                       confidence band [0.3, 0.6]
Tier D — Nearest       best-effort nearest neighbors,
                       low_confidence flag, confidence ≤ 0.3

Tier A — Exact concept match

If the cue contains a known concept name (resolved via the concept index), retrieve all episodes referencing that concept’s ConceptId. Returned with confidence 1.0.

async fn tier_a(&self, query: &Query) -> Result<Vec<RecallMatch>> {
    if let Some(name) = &query.entity_name {
        if let Some(concept_id) = self.store.lookup_concept(name).await? {
            return self.store.episodes_for_concept(concept_id).await;
        }
    }
    Ok(vec![])
}

Phase 4 ships this.

Tier B — Structured signature similarity

If extraction (layer 2) can build a partial triple from the cue, encode that as a partial signature and find episodes whose stored signatures have high overlap.

async fn tier_b(&self, query: &Query) -> Result<Vec<RecallMatch>> {
    let partial_sig = encode_partial_signature(&query.extracted_triples)?;
    let candidates = self.inverted_index_intersection(&partial_sig).await?;
    let mut scored = vec![];
    for ep_id in candidates {
        let ep_sig = self.store.load_signature(ep_id)?;
        let sim = ep_sig.similarity(&partial_sig);
        if sim > 0.6 { scored.push((ep_id, sim)); }
    }
    scored.sort_by(|a, b| b.1.total_cmp(&a.1)); // descending; total_cmp cannot panic
    Ok(scored.into_iter().take(query.k).map(|(id, conf)| RecallMatch::new(id, conf, Tier::Similarity)).collect())
}

Phase 3 (extraction) unlocks tier B. The inverted index, keyed by bit-set membership, narrows the candidates so that tier B does not scan every signature on every query.

Tier C — Gist signature similarity

Every episode also has a gist signature derived from a hash of bigrams/trigrams of its raw text. Tier C searches by gist similarity when the structured tier finds nothing.

async fn tier_c(&self, query: &Query) -> Result<Vec<RecallMatch>> {
    let gist_sig = encode_gist_signature(&query.cue_text);
    let mut scored = vec![];
    for (ep_id, ep_gist) in self.store.all_gists().await? {
        let sim = ep_gist.similarity(&gist_sig);
        if sim > 0.3 { scored.push((ep_id, sim)); }
    }
    scored.sort_by(|a, b| b.1.total_cmp(&a.1)); // descending; total_cmp cannot panic
    Ok(scored.into_iter().take(query.k).map(|(id, conf)| RecallMatch::new(id, conf, Tier::Gist)).collect())
}

Phase 4 ships this. For >1M episodes, replace the full scan with LSH (v0.3+).
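The text does not spell out encode_gist_signature, so the following is only one plausible reading: lowercase the text, hash each character trigram to a seeded HV, and bundle by majority vote. The choice of character trigrams, the FNV-1a hash, and the SplitMix64 expansion are all assumptions made for this sketch.

```rust
pub struct HV {
    pub bits: [u64; 128],
}

/// 64-bit FNV-1a hash of one n-gram (assumed hash for this sketch).
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// SplitMix64 step: expands a 64-bit hash into 8192 pseudo-random bits.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Majority-vote bundle of one seeded HV per character trigram.
/// Text shorter than three characters yields the zero HV.
pub fn encode_gist_signature(text: &str) -> HV {
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    let mut counts = vec![0i32; 8192];
    for gram in chars.windows(3) {
        let gram: String = gram.iter().collect();
        let mut state = fnv1a64(gram.as_bytes());
        for word in 0..128 {
            let bits = splitmix64(&mut state);
            for b in 0..64 {
                counts[word * 64 + b] += if (bits >> b) & 1 == 1 { 1 } else { -1 };
            }
        }
    }
    let mut out = HV { bits: [0u64; 128] };
    for (i, &c) in counts.iter().enumerate() {
        if c > 0 {
            out.bits[i / 64] |= 1u64 << (i % 64);
        }
    }
    out
}

pub fn similarity(a: &HV, b: &HV) -> f32 {
    let d: u32 = a.bits.iter().zip(&b.bits).map(|(x, y)| (x ^ y).count_ones()).sum();
    1.0 - (d as f32) / 8192.0
}
```

Texts that share most of their trigrams bundle mostly the same HVs and therefore score well above the roughly 0.5 baseline of unrelated text.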

Tier D — Nearest-neighbor fallback

Best-effort. Even if tier B and tier C find nothing above their thresholds, return the closest matches with low_confidence: true. Agents always get something to work with.

async fn tier_d(&self, query: &Query) -> Result<Vec<RecallMatch>> {
    let cue_sig = encode_gist_signature(&query.cue_text);
    let mut all_episodes = self.store.recent_episodes(query.recency_window).await?;
    all_episodes.sort_by_key(|ep| ep.gist_signature.hamming(&cue_sig));
    Ok(all_episodes.into_iter().take(query.k).map(|ep|
        RecallMatch::new(ep.id, 0.2, Tier::NearestNeighbor)
            .with_low_confidence_flag()
    ).collect())
}

Phase 4 ships this.

The full recall cascade

pub async fn recall(&self, query: Query) -> Result<Recall> {
    let start = Instant::now();
    let mut matches = self.tier_a(&query).await?;

    if matches.is_empty() && query.tier_floor <= Tier::Similarity {
        matches = self.tier_b(&query).await?;
    }
    if matches.is_empty() && query.tier_floor <= Tier::Gist {
        matches = self.tier_c(&query).await?;
    }
    if matches.is_empty() && query.tier_floor <= Tier::NearestNeighbor {
        matches = self.tier_d(&query).await?;
    }

    let tier_used = matches.first().map(|m| m.tier).unwrap_or(Tier::Exact);
    matches = self.apply_goal_bias(matches).await?;
    matches = self.apply_recency_boost(matches, &query).await?;
    matches = self.apply_bitemporal_filter(matches, &query).await?;
    matches.sort_by(|a, b| b.confidence.total_cmp(&a.confidence)); // descending; total_cmp cannot panic
    matches.truncate(query.k);

    let attention_trace = if query.trace_attention {
        Some(self.build_attention_trace(&query, &matches).await?)
    } else { None };
    if let Some(ref t) = attention_trace {
        self.emit_learning_event(LearningEvent::AttentionTraced { recall_id: t.id, signatures_considered: t.candidates.len(), at: Utc::now() }).await?;
    }

    let semantic_atoms = self.recall_semantic_atoms(&query).await?;
    let beliefs = self.recall_beliefs(&query).await?;
    let active_goals = self.active_goals_for_query(&query).await?;

    Ok(Recall {
        matches, semantic_atoms, beliefs, active_goals,
        tier_used,
        elapsed_ms: start.elapsed().as_millis() as u32,
        attention_trace,
    })
}
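The comparisons against query.tier_floor rely on an ordering of Tier that is not shown here. One ordering consistent with the cascade, offered as an assumption rather than the actual definition, ranks tiers by confidence so that a floor of NearestNeighbor admits every tier and a floor of Exact admits only tier A. `tier_allowed` is a hypothetical helper that names the check.

```rust
/// Variants are declared lowest-confidence first, so the derived `Ord` gives
/// NearestNeighbor < Gist < Similarity < Exact.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    NearestNeighbor,
    Gist,
    Similarity,
    Exact,
}

/// True when the cascade may fall through to `tier` under `tier_floor`.
/// This is the same test as `query.tier_floor <= Tier::X` in recall().
pub fn tier_allowed(tier_floor: Tier, tier: Tier) -> bool {
    tier_floor <= tier
}
```

Under this ordering a caller who sets the floor to Similarity gets tier A or tier B results only, and may therefore receive an empty match list.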

Goal-biased retrieval (floor 6 → layer 1)

When active goals exist, their HDC signatures up-weight related matches.

async fn apply_goal_bias(&self, mut matches: Vec<RecallMatch>) -> Result<Vec<RecallMatch>> {
    let active_goals = self.active_goals().await?;
    if active_goals.is_empty() { return Ok(matches); }
    let goal_sig = bundle(&active_goals.iter().map(|g| g.signature.clone()).collect::<Vec<_>>());
    for m in matches.iter_mut() {
        let ep_sig = self.store.load_signature(m.episode_id)?;
        let bias = ep_sig.similarity(&goal_sig) * GOAL_BIAS_WEIGHT;
        m.confidence = (m.confidence * (1.0 + bias)).min(1.0);
        m.goal_biased = bias > 0.05;
    }
    Ok(matches)
}

GOAL_BIAS_WEIGHT defaults to 0.3 and is tunable per query via Query::with_goal_bias(weight).
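The arithmetic of the bias step is easy to check in isolation. The function below is a hypothetical extraction of the per-match formula from apply_goal_bias, written only to show the numbers.

```rust
pub const GOAL_BIAS_WEIGHT: f32 = 0.3;

/// Returns (new_confidence, goal_biased) for one match, using the same
/// formula as the loop body of apply_goal_bias.
pub fn biased_confidence(confidence: f32, goal_similarity: f32, weight: f32) -> (f32, bool) {
    let bias = goal_similarity * weight;
    let boosted = (confidence * (1.0 + bias)).min(1.0); // capped at 1.0
    (boosted, bias > 0.05)
}
```

A match at confidence 0.7 whose signature scores 0.62 against the goal bundle gets bias 0.186 and ends at about 0.83. Because two unrelated signatures already score about 0.5, the baseline multiplier at the default weight is about 1.15; it is the spread above that baseline that separates goal-relevant matches from the rest.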

This mirrors the role of the prefrontal cortex (PFC) in biasing attention toward goal-relevant memories (Miller & Cohen 2001).

Bi-temporal filtering

Every recall has implicit (or explicit) valid_as_of and transaction_as_of timestamps. Episodes outside the valid window or superseded as of the transaction time are filtered.

async fn apply_bitemporal_filter(
    &self, matches: Vec<RecallMatch>, query: &Query
) -> Result<Vec<RecallMatch>> {
    let valid_t = query.valid_as_of.unwrap_or_else(Utc::now);
    let tx_t = query.transaction_as_of.unwrap_or_else(Utc::now);
    let mut kept = Vec::with_capacity(matches.len());
    for m in matches {
        // Propagate store errors instead of panicking inside a retain() closure.
        let ep = self.store.episode(m.episode_id)?;
        if ep.is_valid_at(valid_t) && !ep.is_superseded_at(tx_t) && ep.tombstoned_at.map_or(true, |t| t > tx_t) {
            kept.push(m);
        }
    }
    Ok(kept)
}
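The three predicates in the filter can be sketched on a plain struct. Everything below is hypothetical: the field names, the half-open validity interval, and the use of Unix-second integers in place of DateTime<Utc> are assumptions chosen to keep the example self-contained.

```rust
/// Hypothetical episode record; timestamps are Unix seconds.
pub struct Episode {
    pub valid_from: i64,
    pub valid_to: Option<i64>,      // None = still valid
    pub superseded_at: Option<i64>, // transaction time a newer version replaced this one
    pub tombstoned_at: Option<i64>, // transaction time of deletion
}

impl Episode {
    /// Valid-time check on the half-open interval [valid_from, valid_to).
    pub fn is_valid_at(&self, t: i64) -> bool {
        self.valid_from <= t && self.valid_to.map_or(true, |end| t < end)
    }

    /// Transaction-time check: was a newer version already recorded at `t`?
    pub fn is_superseded_at(&self, t: i64) -> bool {
        self.superseded_at.map_or(false, |s| s <= t)
    }

    /// The combined condition used by the bi-temporal filter.
    pub fn visible(&self, valid_t: i64, tx_t: i64) -> bool {
        self.is_valid_at(valid_t)
            && !self.is_superseded_at(tx_t)
            && self.tombstoned_at.map_or(true, |t| t > tx_t)
    }
}
```

Querying with an earlier transaction_as_of than the supersession time shows the episode as the database knew it then, which is the point of keeping the two time axes separate.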

Attention trace (floor 7 audit)

When query.trace_attention = true, agidb records which signatures were considered, which scored highest, and why each was retained or rejected. The trace lands in floor 7’s learning log; the agent can later ask “what was I attending to during recall_id X?”

pub struct AttentionTrace {
    pub id: RecallId,
    pub query: Query,
    pub candidates: Vec<AttentionCandidate>,
    pub goal_signature: Option<HV>,
    pub recency_window: Duration,
    pub timestamp: DateTime<Utc>,
}

pub struct AttentionCandidate {
    pub episode_id: EpisodeId,
    pub similarity: f32,
    pub goal_bias: f32,
    pub recency_boost: f32,
    pub final_confidence: f32,
    pub retained: bool,
    pub rejection_reason: Option<String>,
}

Default off (overhead). Opt-in for debugging, RL exploration tracking, or compliance auditing.

Performance characteristics

Operation                  100k episodes   1M episodes                Implementation
Tier A                     < 1ms           < 1ms                      concept index hash lookup
Tier B (with extraction)   ~10ms           ~50ms                      inverted index intersection + POPCOUNT
Tier C                     ~5ms            ~50ms (with LSH ~10ms)     full gist scan (LSH at scale)
Tier D                     ~5ms            ~50ms                      bounded recency window scan
Goal-bias pass             < 1ms           < 5ms                      one similarity per match (k=20)
Bi-temporal filter         < 1ms           < 5ms                      redb point lookups
Attention trace build      ~5ms            ~10ms                      only if enabled
Total p95                  < 50ms          < 100ms                    target

Apart from the Total p95 row, which is a target, these are CPU-only measurements. No GPU. No external services. No LLM calls.

What this layer doesn’t do

  • Extract entities and relations from text. That’s layer 2 (agidb-extract).
  • Encode video/audio to dense latents. That’s layer 2 in v2.1 (agidb-sensory).
  • Persist anything to disk. That’s layer 3 (agidb-core::store).
  • Decide when to consolidate. That’s the consolidation worker (separate module).
  • Run any LLM. Read path is deterministic math (constitution article IV).

What enables what (the dependency graph)

HDC kernel (phase 1, done)
Episode encoding (phase 4, done)
Tier A + C + D recall (phase 4, done)
Goal-bias + recency + bi-temporal filtering (phase 4, done)
Tier B recall (phase 4, blocked on phase 3)
Triple extraction (phase 3)
    Belief extraction (phase 9)
    Goal-biased recall fully active
Multimodal episode encoding (phase 14, v2.1)
Multimodal recall (phase 14, v2.1): unbind for per-modality search
Attention trace (phase 10)
Self-model floor 7 fully active
Cognitive benchmark suite (phase 13)
Decision gate (phase 7)
BAMS benchmark (phase 16, v2.1)
v2.1 milestone, ICLR 2026 MemAgents paper

Why layer 1 is the wedge

Most other agent memory systems put their similarity layer behind either an LLM API call or a vector DB query. agidb’s similarity layer is CPU-local POPCOUNT over 1 KB signatures. Roughly three orders of magnitude faster than the API-call path. Roughly three orders of magnitude cheaper. Zero external dependencies in the read path.

That’s the wedge. Layer 1 is what gives agidb its latency, its cost, its offline operation, and its determinism. Everything else (cognitive primitives, brain-alignment, BAMS) is built on this layer’s properties.

The v2.1 multimodal extension does not change layer 1 retrieval: dense latents from V-JEPA 2 / Wav2Vec-BERT / Llama-3.2-3B project to 8192-bit HVs, and layer 1 retrieval then works identically. The encoder pipeline lives in layer 2 in v2.1; the math at the retrieval level is the same as v2.0.

That’s why brain-alignment is additive (constitution article XVIII). Layer 1 isn’t getting rewritten for v2.1; it’s getting a new write-path input source. The fundamental retrieval math stays exactly the same.