Memory Semantics

The resource pages define what fields a record carries. This page defines what those fields mean over time: the cognitive model OMIR encodes. It answers some fair questions about the field tables: “why is confidence two numbers and a third? what is a half-life doing in a data format? what makes an edge’s strength go up?”

OMIR’s guiding rule, restated from Design Principles §6:

OMIR standardizes the state a memory algorithm reads and writes — not the algorithm.

Two faithful implementations may decay, consolidate, and re-rank memory with completely different code. What they must agree on is the meaning of the state in the file. The mechanisms below are drawn from the reference implementation (Veld); the spec records the state, and an implementation is free to reproduce the behavior however it likes.

1. Confidence as calibrated belief

MemoryRecord.confidence is a Confidence object — { alpha, beta, calibrated } — not a bare float, because an agent needs to distinguish “90% sure, having checked many times” from “90% sure, having seen this once.” A single number cannot.

OMIR models belief as a Beta(α, β) posterior:

alpha accumulates confirming evidence (the memory proved helpful/correct), plus a prior.
beta accumulates disconfirming evidence (the memory proved misleading/wrong), plus a prior.
calibrated is the point estimate a consumer should act on. The natural estimate is the posterior mean, α / (α + β), but a faithful producer damps it toward 0.5 when evidence is thin — at one or two observations the prior should dominate, so an unproven memory does not masquerade as a confident one. As α + β grows, calibrated approaches the raw mean. R1 does not mandate the exact shrinkage, so calibrated is a producer-asserted point estimate: two faithful producers MAY emit a different calibrated for the same (α, β). The α/β pair is the portable, reproducible quantity; a consumer that needs a reproducible or cross-producer-comparable estimate SHOULD recompute it from α/β under its own rule rather than rely on calibrated being identical across producers (see Theory & Scope).
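
Since R1 does not mandate the shrinkage, the following Python is only a minimal sketch of one possible rule: it pulls the posterior mean toward 0.5 by a factor that grows with the evidence count. The function names and the `damping` pseudo-count are assumptions made for illustration and are not part of OMIR.

```python
def posterior_mean(alpha: float, beta: float) -> float:
    """Raw Beta(alpha, beta) posterior mean."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    return alpha / (alpha + beta)


def calibrated_estimate(alpha: float, beta: float, damping: float = 4.0) -> float:
    """Posterior mean pulled toward 0.5 when alpha + beta is small.

    `damping` is a hypothetical tuning constant: the larger it is, the more
    evidence is needed before the estimate approaches the raw mean.
    """
    mean = posterior_mean(alpha, beta)
    evidence = alpha + beta
    trust = evidence / (evidence + damping)  # in (0, 1), tends to 1 as evidence grows
    return 0.5 + (mean - 0.5) * trust


thin = calibrated_estimate(9, 1)       # raw mean 0.9, little evidence
thick = calibrated_estimate(900, 100)  # raw mean 0.9, much evidence
```

Both inputs share a raw mean of 0.9, yet under this rule `thin` comes out near 0.79 and `thick` near 0.90. A different producer with a different `damping` would emit different values for the same (α, β), which is why the α/β pair is the portable quantity.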

A consumer that only reads calibrated behaves correctly within a single producer’s records; the α/β pair is there for consumers that want to keep updating belief as new feedback arrives. This is why confidence is a distribution rather than a bare float — it carries its own evidence count, so belief can be revised rather than merely averaged. One caveat on merging: adding two stores’ α/β yields a coherent posterior only when the two evidence streams are independent. When both stores derived their belief from the same upstream source — common when memories are exported and re-imported — summing the counts double-counts the shared evidence and overstates confidence. A consumer that merges confidence across stores SHOULD account for shared provenance (e.g. via provenance.externalId) rather than blindly add counts.
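
The update and merge behavior described above can be sketched as follows. This is an illustration under stated assumptions, not a normative algorithm: the `Belief` type, the function names, the `weight` parameter, and the "keep the better-evidenced side" policy for shared provenance are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Belief:
    alpha: float  # confirming evidence plus prior
    beta: float   # disconfirming evidence plus prior


def observe(belief: Belief, helpful: bool, weight: float = 1.0) -> Belief:
    """Fold one piece of feedback into the posterior."""
    if helpful:
        return Belief(belief.alpha + weight, belief.beta)
    return Belief(belief.alpha, belief.beta + weight)


def merge(a: Belief, b: Belief, shared_source: bool) -> Belief:
    """Combine two stores' beliefs about the same memory."""
    if shared_source:
        # Same upstream evidence: summing would double-count it, so keep
        # whichever side has seen more evidence (a conservative assumption).
        return a if a.alpha + a.beta >= b.alpha + b.beta else b
    # Independent evidence streams: counts add.
    return Belief(a.alpha + b.alpha, a.beta + b.beta)
```

A caller might set `shared_source` by comparing the two records' provenance.externalId values and treating a match as shared evidence; how strictly to detect shared provenance is left to the consumer.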

2. Forgetting as a curve

MemoryRecord.decay ({ halfLifeHours, lastAccess, accessCount, anchored }) records a memory’s forgetting state so that “better forgetting” survives export. The intent:

A record’s retrievability falls over time according to a half-life: recent, frequently accessed memories stay sharp; stale ones fade. lastAccess and accessCount are the inputs a decay function reads; each access pushes retrievability back up.
Decay is naturally multi-time-scale: fast in the first hours/days (filtering noise), then much slower for what survives — empirically closer to a power law than a single exponential. OMIR does not mandate the curve; it records the parameters (halfLifeHours) and the access signal a curve consumes.
anchored: true marks a memory that resists decay — a pinned fact, a user-stated preference, a safety constraint. Anchored memories have a floor below which retrievability does not fall.
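
One way a consumer might reconstruct decay from these fields is sketched below. OMIR does not mandate the curve, so the single-exponential form, the `ANCHOR_FLOOR` value, and the function names are assumptions chosen for brevity; a power-law curve could read the same fields.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone

ANCHOR_FLOOR = 0.5  # hypothetical floor for anchored memories


@dataclass(frozen=True)
class Decay:
    half_life_hours: float
    last_access: datetime
    access_count: int = 0
    anchored: bool = False


def retrievability(decay: Decay, now: datetime) -> float:
    """Exponential forgetting since the last access, in [0, 1]."""
    elapsed_hours = max((now - decay.last_access).total_seconds() / 3600.0, 0.0)
    value = 0.5 ** (elapsed_hours / decay.half_life_hours)
    # Anchored memories never fall below the floor.
    return max(value, ANCHOR_FLOOR) if decay.anchored else value


def touch(decay: Decay, now: datetime) -> Decay:
    """Record an access: reset the clock and bump the counter."""
    return replace(decay, last_access=now, access_count=decay.access_count + 1)
```

With a 24-hour half-life, an unanchored record sits at 0.5 after one day and 0.25 after two, while an anchored record stays at the floor no matter how long it goes untouched. Calling `touch` restores retrievability to 1.0 at the moment of access.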

The spec stores forgetting state, not a forgetting algorithm, precisely so a robot on a power-cycle budget and a cloud assistant can both reconstruct sensible decay from the same fields.

3. Tiering and consolidation

MemoryRecord.tier (working → session → longterm → archive) records where a memory sits in a consolidation hierarchy — the at-rest analogue of working vs. long-term memory:

working — active, in-focus, short-lived; the smallest, hottest set.
session — bounded to a task/conversation; indexed for fast recall.
longterm — consolidated knowledge, retrieved by semantic cue.
archive — cold, near-permanent, batch-retrieved.

Promotion is driven by age × importance × access: a memory accessed repeatedly, or marked important, or linked to long-term knowledge, migrates inward; an untouched low-importance memory drifts outward and is eventually compressed. OMIR records the tier a record currently occupies; the promotion policy is the implementation’s.

4. Hebbian relationship strength

Relationship.strength is a synaptic weight, not a static label. The cognitive model is Hebbian — cells that fire together wire together:

When two entities are co-activated (retrieved or mentioned together), the edge between them strengthens.
An edge that is not used decays toward zero, the same “better forgetting” applied to structure rather than content.
validAt records when the relationship was last observed to hold; invalidatedAt retires an edge that has been contradicted without deleting it, so the graph keeps its history rather than silently rewriting it.
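
The strengthening, decay, and retirement behavior above might look like the following sketch. It assumes strength lies in [0, 1], and the update rule, the `rate` and `half_life_hours` parameters, and the `Edge` type are hypothetical; the core only carries the resulting strength and timestamps.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Edge:
    strength: float                           # assumed here to lie in [0, 1]
    invalidated_at: Optional[datetime] = None


def co_activate(edge: Edge, rate: float = 0.1) -> Edge:
    """Strengthen toward 1.0: each co-activation closes `rate` of the remaining gap."""
    return replace(edge, strength=edge.strength + rate * (1.0 - edge.strength))


def decay_unused(edge: Edge, idle_hours: float, half_life_hours: float = 168.0) -> Edge:
    """Weaken toward 0.0 while the edge goes unused."""
    factor = 0.5 ** (idle_hours / half_life_hours)
    return replace(edge, strength=edge.strength * factor)


def invalidate(edge: Edge, when: datetime) -> Edge:
    """Retire a contradicted edge without deleting it or erasing its strength."""
    return replace(edge, invalidated_at=when)
```

Because `invalidate` only stamps the edge, a consumer can still inspect what the graph believed before the contradiction.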

Heavier machinery a faithful engine may run on top of this — long-term potentiation, asymmetric forward/backward strengths, consolidation tiers on the edge itself — is implementation state and rides in extension[], not the core. The core carries the one number every graph engine agrees on: current strength.

5. Salience

Entity.salience is how strongly an entity pulls on retrieval — its gravitational mass in the memory space. A high-salience entity is more likely to be surfaced and to drag related memories up with it via spreading activation. Salience rises with frequency (mentionCount), recency (lastSeenAt), whether the entity is a proper noun (properNoun), and explicit user importance. As with decay, OMIR stores the salience value and its inputs; how an engine computes and uses it is its own business.
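
As a rough illustration of how an engine could blend these inputs, the sketch below combines them into a value in [0, 1]. The weights, the saturation constant, the recency half-life, and the assumption that user importance is itself a [0, 1] number are all hypothetical; OMIR stores the value and its inputs, not this formula.

```python
import math
from datetime import datetime, timedelta, timezone


def salience(mention_count: int, last_seen_at: datetime, proper_noun: bool,
             user_importance: float, now: datetime,
             recency_half_life_hours: float = 72.0) -> float:
    """Blend frequency, recency, proper-noun status, and user importance."""
    # Frequency saturates: the 100th mention adds less than the 2nd.
    frequency = 1.0 - math.exp(-mention_count / 10.0)
    age_hours = max((now - last_seen_at).total_seconds() / 3600.0, 0.0)
    recency = 0.5 ** (age_hours / recency_half_life_hours)
    noun_bonus = 1.0 if proper_noun else 0.0
    # Hypothetical weights that sum to 1.0.
    score = 0.4 * frequency + 0.3 * recency + 0.1 * noun_bonus + 0.2 * user_importance
    return min(max(score, 0.0), 1.0)
```

Under these weights, a frequently mentioned, recently seen proper noun that the user flagged as important scores close to 1, while a single stale mention of a common noun scores close to 0.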

6. The temporal model

OMIR is bi-temporal in spirit: it separates when something happened from when it was recorded.

eventTime — when the described event actually occurred. It may precede createdAt (you can record a memory of last week today).
createdAt — encoding time: when the record was written.
validUntil (on MemoryRecord) and invalidatedAt (on Relationship) express temporal invalidation — a fact superseded after an instant, or an edge that stopped holding — so consumers can filter stale knowledge rather than trust it forever.
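
A consumer-side staleness filter over these fields could be sketched as below. The `Memory` type and function names are illustrative, and treating validUntil as an exclusive upper bound is an assumption; the text above does not say whether the boundary instant itself still counts as valid.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Memory:
    created_at: datetime                    # encoding time
    event_time: Optional[datetime] = None   # when the event occurred
    valid_until: Optional[datetime] = None  # None means no known expiry


def is_current(memory: Memory, as_of: datetime) -> bool:
    """True if the memory has not been superseded at `as_of`."""
    return memory.valid_until is None or as_of < memory.valid_until


def recorded_after_the_fact(memory: Memory) -> bool:
    """True if the event happened before the record was written."""
    return memory.event_time is not None and memory.event_time < memory.created_at
```

The same `is_current` shape applies to a Relationship by reading invalidatedAt in place of validUntil.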

Entry order is not significant; everything resolves by ResourceType/id. A consumer must never infer meaning from position (Conformance). A richer, fully interval-based bitemporal model is a candidate generalization — see Toward a Global Standard §H.

7. Provenance and source credibility

MemoryRecord.provenance ({ source, sourceType, credibility, externalId }) answers where a memory came from and how much to trust the origin. credibility (a [0,1] UnitInterval) lets a consumer weight a memory by the trustworthiness of its source — a user statement, a verified document, and an inferred guess are not equally reliable, and a retrieval ranker should be able to say so. externalId (system:id, e.g. github:pr-123, linear:SHO-39) keeps a memory traceable back to the artifact that produced it. A multi-hop, signable provenance chain is a candidate generalization — see Toward a Global Standard §E.
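
For illustration, a ranker might use these two fields roughly as follows. This is a sketch: the function names are hypothetical, multiplying a score by credibility is only one possible weighting, and splitting externalId on the first colon is an assumption drawn from the system:id examples above.

```python
from typing import Tuple


def parse_external_id(external_id: str) -> Tuple[str, str]:
    """Split 'system:id' on the first colon, e.g. 'github:pr-123'."""
    system, separator, local_id = external_id.partition(":")
    if not separator or not system or not local_id:
        raise ValueError(f"expected 'system:id', got {external_id!r}")
    return system, local_id


def credibility_weighted(score: float, credibility: float) -> float:
    """Scale a retrieval score by the source's credibility in [0, 1]."""
    if not 0.0 <= credibility <= 1.0:
        raise ValueError("credibility must lie in [0, 1]")
    return score * credibility
```

Given two memories with the same raw retrieval score, the one from the more credible source then ranks higher.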

8. Prospective memory

Most memory is about the past. Prospective memory is about the future: a remembered intention to do something later. OMIR models it with experienceType: "intention" on an otherwise ordinary MemoryRecord — future-directed records live in the same store as retrospective ones, so the same decay, confidence, and provenance machinery applies. A conforming consumer SHOULD keep intentions out of ordinary retrospective recall and surface them when their trigger condition is met (a time, a context); the omir-personal-assistant profile makes this explicit (Profiles).
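
The recommended consumer behavior can be sketched as a pair of filters. Only experienceType: "intention" comes from the text above; the `triggerAt` key, the "observation" value used in the usage example, and the dict-based record shape are hypothetical stand-ins, since this page does not define how a trigger condition is stored.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List

Record = Dict[str, Any]


def retrospective(records: List[Record]) -> List[Record]:
    """Records eligible for ordinary recall: everything except intentions."""
    return [r for r in records if r.get("experienceType") != "intention"]


def due_intentions(records: List[Record], now: datetime) -> List[Record]:
    """Intentions whose (hypothetical) time trigger has been reached."""
    return [
        r for r in records
        if r.get("experienceType") == "intention"
        and r.get("triggerAt") is not None
        and r["triggerAt"] <= now
    ]


records = [
    {"id": "m1", "experienceType": "observation"},
    {"id": "m2", "experienceType": "intention",
     "triggerAt": datetime(2024, 3, 1, tzinfo=timezone.utc)},
    {"id": "m3", "experienceType": "intention",
     "triggerAt": datetime(2024, 6, 1, tzinfo=timezone.utc)},
]
```

A context-based trigger would need a different predicate in place of the time comparison, but the split between ordinary recall and triggered surfacing stays the same.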

What OMIR deliberately does not specify

The following are implementation concerns. They are real and important, but they are not part of the at-rest contract, and a conformant document says nothing about them:

Retrieval scoring. How memories are ranked at query time (similarity, BM25, graph spreading, cross-encoders, RRF fusion, multi-signal blends) is engine-specific. Score vectors ride in extension[]; they are never core. See the worked example in Extensions.
Embeddings. Which model, which dimensionality, which vector — all implementation. R1 carries embeddings only as extensions; a neutral embedding representation is a candidate generalization (Toward a Global Standard §D).
Consolidation schedules, replay, “sleep” phases. When and how memory is reorganized is an algorithm, not a state.
Storage and indexing. Vector indexes, key-value engines, graph databases — none of it is OMIR’s concern. OMIR is the bytes at rest, not the engine.

This boundary is the whole design: standardize the state so any engine can read another engine’s memory, and leave the behavior free so engines still have something to compete on.

OMIR R1 Specification