Memory Semantics
The resource pages define what fields a record carries. This page defines what those fields mean over time — the cognitive model OMIR encodes. It is the answer to a fair question about the field tables: “why is confidence two numbers and a third? what is a half-life doing in a data format? what makes an edge’s strength go up?”
OMIR’s guiding rule, restated from Design Principles §6:
OMIR standardizes the state a memory algorithm reads and writes — not the algorithm.
Two faithful implementations may decay, consolidate, and re-rank memory with completely different code. What they must agree on is the meaning of the state in the file. The mechanisms below are drawn from the reference implementation (Veld); the spec records the state, and an implementation is free to reproduce the behavior however it likes.
1. Confidence as calibrated belief
MemoryRecord.confidence is a Confidence object — { alpha, beta, calibrated } — not a
bare float, because an agent needs to distinguish “90% sure, having checked many times”
from “90% sure, having seen this once.” A single number cannot.
OMIR models belief as a Beta(α, β) posterior:
- alpha accumulates confirming evidence (the memory proved helpful/correct), plus a prior.
- beta accumulates disconfirming evidence (the memory proved misleading/wrong), plus a prior.
- calibrated is the point estimate a consumer should act on. The natural estimate is the posterior mean, α / (α + β), but a faithful producer damps it toward 0.5 when evidence is thin: at one or two observations the prior should dominate, so an unproven memory does not masquerade as a confident one. As α + β grows, calibrated approaches the raw mean. R1 does not mandate the exact shrinkage, so calibrated is a producer-asserted point estimate: two faithful producers MAY emit a different calibrated for the same (α, β). The α/β pair is the portable, reproducible quantity; a consumer that needs a reproducible or cross-producer-comparable estimate SHOULD recompute it from α/β under its own rule rather than rely on calibrated being identical across producers (see Theory & Scope).
A consumer that only reads calibrated behaves correctly within a single producer’s
records; the α/β pair is there for consumers that want to keep updating belief as new
feedback arrives. This is why confidence is a distribution rather than a bare float — it
carries its own evidence count, so belief can be revised rather than merely averaged.
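The difference between thin and thick evidence can be made concrete with a minimal sketch. The shrinkage rule and the `damping` pseudo-count below are assumptions chosen for illustration; R1 does not mandate them, and a real producer may damp differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confidence:
    alpha: float       # confirming evidence, plus prior
    beta: float        # disconfirming evidence, plus prior
    calibrated: float  # producer-asserted point estimate


def posterior_mean(alpha: float, beta: float) -> float:
    """Raw Beta(alpha, beta) posterior mean."""
    return alpha / (alpha + beta)


def shrunk_estimate(alpha: float, beta: float, damping: float = 4.0) -> float:
    """Pull the posterior mean toward 0.5; the pull fades as alpha + beta grows.

    `damping` is a hypothetical pseudo-count, not an OMIR field.
    """
    n = alpha + beta
    weight = n / (n + damping)
    return 0.5 + weight * (posterior_mean(alpha, beta) - 0.5)


def observe(conf: Confidence, helpful: bool) -> Confidence:
    """Return a new Confidence after one piece of feedback."""
    alpha = conf.alpha + (1.0 if helpful else 0.0)
    beta = conf.beta + (0.0 if helpful else 1.0)
    return Confidence(alpha, beta, shrunk_estimate(alpha, beta))


thin = shrunk_estimate(9.0, 1.0)     # raw mean 0.9, little evidence
thick = shrunk_estimate(90.0, 10.0)  # raw mean 0.9, ten times the evidence
print(round(thin, 3), round(thick, 3))
```

Both pairs share a raw mean of 0.9, but under this rule the thinly evidenced one is damped further toward 0.5 (about 0.79 versus about 0.88). A consumer that recomputes from α/β with its own rule gets the same answer regardless of which producer wrote the record.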
One caveat on merging: adding two stores’ α/β yields a coherent posterior only when
the two evidence streams are independent. When both stores derived their belief from the
same upstream source — common when memories are exported and re-imported — summing the counts
double-counts the shared evidence and overstates confidence. A consumer that merges
confidence across stores SHOULD account for shared provenance (e.g. via
provenance.externalId) rather than blindly add counts.
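One way a consumer might guard against double-counting is sketched below. It rests on two assumptions the spec does not make: that a matching externalId is a good enough proxy for "same upstream evidence", and that both producers started from the same known prior (the hypothetical `prior` parameter), which is subtracted once so the prior is not counted twice either.

```python
from typing import Optional, Tuple

Pair = Tuple[float, float]  # (alpha, beta)


def merge_confidence(
    a: Pair,
    a_external_id: Optional[str],
    b: Pair,
    b_external_id: Optional[str],
    prior: Pair = (1.0, 1.0),  # assumed shared prior; not an OMIR field
) -> Pair:
    """Combine two (alpha, beta) pairs without double-counting shared evidence."""
    if a_external_id is not None and a_external_id == b_external_id:
        # Same upstream artifact: treat the streams as one and keep the
        # better-evidenced pair instead of adding counts.
        return a if sum(a) >= sum(b) else b
    # Independent streams: add counts, but count the shared prior only once.
    return (a[0] + b[0] - prior[0], a[1] + b[1] - prior[1])


same = merge_confidence((9.0, 1.0), "github:pr-123", (9.0, 1.0), "github:pr-123")
independent = merge_confidence((9.0, 1.0), "github:pr-123", (5.0, 2.0), "linear:SHO-39")
print(same, independent)
```

Real provenance overlap is rarely all-or-nothing, so a production merge would likely need something finer than an equality check on one identifier.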
2. Forgetting as a curve
MemoryRecord.decay ({ halfLifeHours, lastAccess, accessCount, anchored }) records a
memory’s forgetting state so that “better forgetting” survives export. The intent:
- A record’s retrievability falls over time on a half-life: recent, frequently
accessed memories stay sharp; stale ones fade.
- lastAccess and accessCount are the inputs a decay function reads; each access pushes retrievability back up.
- Decay is naturally multi-time-scale: fast in the first hours/days (filtering noise),
then much slower for what survives, empirically closer to a power law than a single
exponential. OMIR does not mandate the curve; it records the parameters
(halfLifeHours) and the access signal a curve consumes.
- anchored: true marks a memory that resists decay: a pinned fact, a user-stated preference, a safety constraint. Anchored memories have a floor below which retrievability does not fall.
The spec stores forgetting state, not a forgetting algorithm, precisely so a robot on a power-cycle budget and a cloud assistant can both reconstruct sensible decay from the same fields.
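As an illustration of how an engine could reconstruct decay from these fields, here is a minimal sketch using the simplest possible curve, a single exponential on halfLifeHours. This is not the curve OMIR or Veld prescribes (the text above notes real decay is closer to a power law), and `anchor_floor` is a hypothetical parameter, since the spec does not fix the floor value.

```python
from datetime import datetime, timedelta, timezone


def retrievability(
    half_life_hours: float,
    last_access: datetime,
    now: datetime,
    anchored: bool = False,
    anchor_floor: float = 0.5,  # hypothetical floor; OMIR does not fix a value
) -> float:
    """Single-exponential retrievability in [0, 1] from the decay fields."""
    elapsed_hours = max(0.0, (now - last_access).total_seconds() / 3600.0)
    value = 0.5 ** (elapsed_hours / half_life_hours)
    return max(value, anchor_floor) if anchored else value


now = datetime(2025, 1, 3, tzinfo=timezone.utc)
two_days_ago = now - timedelta(hours=48)
print(retrievability(24.0, two_days_ago, now))                 # 0.25
print(retrievability(24.0, two_days_ago, now, anchored=True))  # 0.5
```

The sketch ignores accessCount; an engine that wants repeated access to slow forgetting could, for example, lengthen the effective half-life as the count grows. That choice belongs to the implementation.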
3. Tiering and consolidation
MemoryRecord.tier (working → session → longterm → archive) records where a memory sits
in a consolidation hierarchy — the at-rest analogue of working vs. long-term memory:
- working — active, in-focus, short-lived; the smallest, hottest set.
- session — bounded to a task/conversation; indexed for fast recall.
- longterm — consolidated knowledge, retrieved by semantic cue.
- archive — cold, near-permanent, batch-retrieved.
Promotion is driven by age × importance × access: a memory accessed repeatedly, or marked important, or linked to long-term knowledge, migrates inward; an untouched low-importance memory drifts outward and eventually compresses. OMIR records the tier a record currently occupies; the promotion policy is the implementation’s.
4. Hebbian relationship strength
Relationship.strength is a synaptic weight, not a static label. The cognitive model is
Hebbian — cells that fire together wire together:
- When two entities are co-activated (retrieved or mentioned together), the edge between them strengthens.
- An edge that is not used decays toward zero, the same “better forgetting” applied to structure rather than content.
- validAt records when the relationship was last observed to hold; invalidatedAt retires an edge that has been contradicted without deleting it, so the graph keeps its history rather than silently rewriting it.
Heavier machinery a faithful engine may run on top of this — long-term potentiation,
asymmetric forward/backward strengths, consolidation tiers on the edge itself — is
implementation state and rides in extension[], not the core. The core carries the one
number every graph engine agrees on: current strength.
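The strengthen-on-use, fade-on-disuse behavior can be sketched in a few lines. The update rules, the learning `rate`, the edge half-life, and the assumption that strength lives in [0, 1] are all illustrative; the spec records only the resulting strength.

```python
def co_activate(strength: float, rate: float = 0.1) -> float:
    """Strengthen an edge toward 1.0 when both endpoints are co-activated."""
    return strength + rate * (1.0 - strength)


def decay_unused(
    strength: float,
    idle_hours: float,
    half_life_hours: float = 168.0,  # hypothetical one-week half-life
) -> float:
    """Let an unused edge fade toward zero on a half-life."""
    return strength * 0.5 ** (idle_hours / half_life_hours)


s = 0.5
s = co_activate(s)          # used together once: moves 10% of the way to 1.0
s = decay_unused(s, 168.0)  # then idle for one half-life: halves
print(s)
```

Because each co-activation closes a fixed fraction of the remaining gap to 1.0, repeated use saturates rather than growing without bound.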
5. Salience
Entity.salience is how strongly an entity pulls on retrieval — its gravitational mass in
the memory space. A high-salience entity is more likely to be surfaced and to drag related
memories up with it via spreading activation. Salience rises with frequency
(mentionCount), recency (lastSeenAt), whether the entity is a proper noun
(properNoun), and explicit user importance. As with decay, OMIR stores the salience value
and its inputs; how an engine computes and uses it is its own business.
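To show how the stored inputs could combine into a single value, here is one hypothetical blend. The weights, the saturation constant, the 30-day recency half-life, and the `user_importance` argument are all invented for this sketch; no OMIR text specifies them.

```python
import math
from datetime import datetime, timedelta, timezone


def salience(
    mention_count: int,
    last_seen_at: datetime,
    proper_noun: bool,
    now: datetime,
    user_importance: float = 0.0,  # hypothetical explicit signal in [0, 1]
) -> float:
    """Blend frequency, recency, proper-noun status and importance into [0, 1]."""
    frequency = 1.0 - math.exp(-mention_count / 10.0)  # saturates toward 1
    age_days = max(0.0, (now - last_seen_at).total_seconds() / 86400.0)
    recency = 0.5 ** (age_days / 30.0)                 # assumed 30-day half-life
    score = (
        0.4 * frequency
        + 0.3 * recency
        + 0.1 * (1.0 if proper_noun else 0.0)
        + 0.2 * user_importance
    )
    return min(1.0, max(0.0, score))


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
rare_old = salience(1, now - timedelta(days=90), False, now)
frequent_recent = salience(40, now - timedelta(days=1), True, now)
print(rare_old < frequent_recent)
```

Two engines using different blends will emit different salience values for the same inputs, which is why the inputs travel alongside the value.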
6. The temporal model
OMIR is bi-temporal in spirit: it separates when something happened from when it was recorded.
- eventTime: when the described event actually occurred. It may precede createdAt (you can record a memory of last week today).
- createdAt: encoding time, when the record was written.
- validUntil (on MemoryRecord) and invalidatedAt (on Relationship) express temporal invalidation (a fact superseded after an instant, or an edge that stopped holding), so consumers can filter stale knowledge rather than trust it forever.
Entry order is not significant; everything resolves by ResourceType/id. A consumer must
never infer meaning from position (Conformance). A richer, fully
interval-based bitemporal model is a candidate generalization — see
Toward a Global Standard §H.
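A consumer-side filter over these fields might look like the sketch below. Records and edges are modeled as plain dicts holding timezone-aware datetime values, which is an assumption about the in-memory shape rather than the wire format, and treating the invalidation instant itself as already stale is a boundary choice this sketch makes on its own.

```python
from datetime import datetime, timezone
from typing import Optional


def still_valid(cutoff: Optional[datetime], at: datetime) -> bool:
    """True when no invalidation instant is set, or `at` falls before it."""
    return cutoff is None or at < cutoff


def current_records(records: list, at: datetime) -> list:
    """Keep MemoryRecords whose validUntil has not yet been reached."""
    return [r for r in records if still_valid(r.get("validUntil"), at)]


def current_edges(edges: list, at: datetime) -> list:
    """Keep Relationships that have not been invalidated as of `at`."""
    return [e for e in edges if still_valid(e.get("invalidatedAt"), at)]


at = datetime(2025, 6, 1, tzinfo=timezone.utc)
records = [
    {"id": "m1", "validUntil": datetime(2025, 1, 1, tzinfo=timezone.utc)},
    {"id": "m2", "validUntil": datetime(2026, 1, 1, tzinfo=timezone.utc)},
    {"id": "m3"},  # no validUntil: never expires
]
print([r["id"] for r in current_records(records, at)])  # ['m2', 'm3']
```

Nothing is deleted by the filter; superseded records and retired edges remain in the document so history stays queryable.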
7. Provenance and source credibility
MemoryRecord.provenance ({ source, sourceType, credibility, externalId }) answers where
a memory came from and how much to trust the origin. credibility (a [0,1]
UnitInterval) lets a consumer weight a memory by the trustworthiness of its source — a
user statement, a verified document, and an inferred guess are not equally reliable, and a
retrieval ranker should be able to say so. externalId (system:id, e.g. github:pr-123,
linear:SHO-39) keeps a memory traceable back to the artifact that produced it. A
multi-hop, signable provenance chain is a candidate generalization — see
Toward a Global Standard §E.
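The two provenance fields a ranker is most likely to touch can be handled as in the sketch below. Multiplying relevance by credibility is only one plausible weighting, and both function names are hypothetical; the split on the first colon follows the system:id shape given above.

```python
from typing import Tuple


def parse_external_id(external_id: str) -> Tuple[str, str]:
    """Split 'system:id' on the first colon, e.g. 'github:pr-123'."""
    system, sep, local_id = external_id.partition(":")
    if not sep or not system or not local_id:
        raise ValueError(f"expected 'system:id', got {external_id!r}")
    return system, local_id


def credibility_weighted(relevance: float, credibility: float) -> float:
    """Scale a retrieval score by source credibility (a [0, 1] UnitInterval)."""
    if not 0.0 <= credibility <= 1.0:
        raise ValueError("credibility must lie in [0, 1]")
    return relevance * credibility


print(parse_external_id("github:pr-123"))  # ('github', 'pr-123')
print(parse_external_id("linear:SHO-39"))  # ('linear', 'SHO-39')
print(credibility_weighted(0.8, 0.5))      # 0.4
```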
8. Prospective memory
Most memory is about the past. Prospective memory is about the future: a remembered
intention to do something later. OMIR models it with experienceType: "intention" on an
otherwise ordinary MemoryRecord — future-directed records live in the same store as
retrospective ones, so the same decay, confidence, and provenance machinery applies. A
conforming consumer SHOULD keep intentions out of ordinary retrospective recall and
surface them when their trigger condition is met (a time, a context); the
omir-personal-assistant profile makes this explicit (Profiles).
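The separation of intentions from ordinary recall can be sketched as two filters over the same store. Records are modeled as plain dicts, and `triggerAt` is a hypothetical field standing in for a time-based trigger condition; this page does not define how triggers are encoded.

```python
from datetime import datetime, timezone


def retrospective(records: list) -> list:
    """Ordinary recall: everything except future-directed intentions."""
    return [r for r in records if r.get("experienceType") != "intention"]


def due_intentions(records: list, now: datetime) -> list:
    """Intentions whose (hypothetical) time trigger has been reached."""
    return [
        r
        for r in records
        if r.get("experienceType") == "intention"
        and r.get("triggerAt") is not None
        and r["triggerAt"] <= now
    ]


now = datetime(2025, 6, 1, 9, 0, tzinfo=timezone.utc)
store = [
    {"id": "m1", "experienceType": "observation"},
    {"id": "m2", "experienceType": "intention",
     "triggerAt": datetime(2025, 6, 1, 8, 0, tzinfo=timezone.utc)},
    {"id": "m3", "experienceType": "intention",
     "triggerAt": datetime(2025, 6, 2, 8, 0, tzinfo=timezone.utc)},
]
print([r["id"] for r in retrospective(store)])        # ['m1']
print([r["id"] for r in due_intentions(store, now)])  # ['m2']
```

The value "observation" is a placeholder for any non-intention experienceType. Context-based triggers would need a different predicate than the timestamp comparison used here.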
What OMIR deliberately does not specify
The following are implementation concerns. They are real and important, but they are not part of the at-rest contract, and a conformant document says nothing about them:
- Retrieval scoring. How memories are ranked at query time (similarity, BM25, graph
spreading, cross-encoders, RRF fusion, multi-signal blends) is engine-specific. Score
vectors ride in extension[]; they are never core. See the worked example in Extensions.
- Embeddings. Which model, which dimensionality, which vector: all implementation. R1 carries embeddings only as extensions; a neutral embedding representation is a candidate generalization (Toward a Global Standard §D).
- Consolidation schedules, replay, “sleep” phases. When and how memory is reorganized is an algorithm, not a state.
- Storage and indexing. Vector indexes, key-value engines, graph databases — none of it is OMIR’s concern. OMIR is the bytes at rest, not the engine.
This boundary is the whole design: standardize the state so any engine can read another engine’s memory, and leave the behavior free so engines still have something to compete on.