Eight memory types, each with a base stability that determines how long it resists decay. Ebbinghaus-inspired retention curves, three-tier deduplication, conflict detection, and a knowledge graph — all running in a single SQLite database.
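Every extracted memory is classified into one of eight types. The type sets its base stability, a time constant in days: under the decay formula, a memory that is never accessed falls to a retention of about 0.37 (e^-1) after one base-stability period and crosses the 0.3 stale threshold after roughly 1.2 periods.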
IDENTITY
Core identity facts: who they are, their name, age, occupation, background.
Likes, dislikes, preferences, communication style, tool choices.
Long-term connections with people, organizations, and places.
Important events, milestones, achievements, and dated occurrences.
Behavioral history — what they were doing, working on, or interested in.
Ongoing tasks, projects, goals, deadlines, and action items.
Conversation context, session data, and working state.
Transient, session-specific facts that expire quickly.
Memories flow through a lifecycle: ACTIVE to STALE to ARCHIVED to DELETED. The transition is driven by the decay formula and configurable thresholds.
ACTIVE (retention >= 0.3): Memory is live and included in search results. Accessed normally.
STALE (retention < 0.3): Memory has decayed below threshold. Deprioritized in search results but still retrievable.
ARCHIVED (retention < 0.1 or 30+ days stale): Memory is excluded from default searches. Still accessible via explicit ID lookup or archive queries.
DELETED (explicit delete or retention < 0.01): Soft-deleted. Purged permanently after 90 days (configurable via PurgeRetentionDays).

Decay model
Memory retention decreases exponentially over time, but access frequency extends the effective stability logarithmically. Frequently recalled memories resist decay.
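As a rough illustration, the decay rule and the lifecycle thresholds can be combined in a few lines of Python. This is a minimal sketch, not the engine's code: the function and constant names are hypothetical, the 0.5 access factor and the thresholds come from this page, and `days_stale` is assumed to count the days a memory has already spent in the STALE state.

```python
import math

# Thresholds as listed in the lifecycle section; constant names are illustrative.
STALE_BELOW = 0.3
ARCHIVE_BELOW = 0.1
DELETE_BELOW = 0.01
ARCHIVE_AFTER_STALE_DAYS = 30


def effective_stability(base_stability_days: float, access_count: int) -> float:
    """Base stability stretched logarithmically by the number of accesses."""
    return base_stability_days * (1 + math.log(1 + access_count) * 0.5)


def retention(days_elapsed: float, base_stability_days: float, access_count: int) -> float:
    """Exponential decay over elapsed days, slowed by effective stability."""
    return math.exp(-days_elapsed / effective_stability(base_stability_days, access_count))


def lifecycle_state(retention_score: float, days_stale: int = 0,
                    explicitly_deleted: bool = False) -> str:
    """Map a retention score to a lifecycle state, checking the strictest rule first."""
    if explicitly_deleted or retention_score < DELETE_BELOW:
        return "DELETED"
    if retention_score < ARCHIVE_BELOW or days_stale >= ARCHIVE_AFTER_STALE_DAYS:
        return "ARCHIVED"
    if retention_score < STALE_BELOW:
        return "STALE"
    return "ACTIVE"


# IDENTITY memory (365-day base), accessed 10 times, after 200 days.
score = retention(200, 365, 10)
print(round(score, 2), lifecycle_state(score))  # 0.78 ACTIVE
```

Checking the strictest condition first keeps the overlapping ranges unambiguous: a score of 0.005 satisfies every "less than" rule, and the ordering resolves it to DELETED.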
// Core decay formula
retention(t) = e^(-t / effective_stability)
// Effective stability grows with access
effective_stability = base_stability × (1 + ln(1 + access_count) × 0.5)
// Example: IDENTITY memory (365d base), accessed 10 times
effective_stability = 365 × (1 + ln(11) × 0.5)
effective_stability = 365 × (1 + 2.4 × 0.5)
effective_stability = 365 × 2.2 = 803 days
// After 200 days with 10 accesses:
retention = e^(-200 / 803) = 0.78 (still ACTIVE)

Every ingest passes through three deduplication checks before storage. This prevents bloat while preserving genuinely new information.
| Tier | Similarity | Action | Description |
|---|---|---|---|
| Exact hash | 100% | Skip | SHA256 hash of normalized content matches an existing memory. No storage needed. |
| Semantic duplicate | 95%+ | Skip | Cosine similarity of embedding vectors exceeds 0.95. Content is semantically identical. |
| Near-duplicate | 85% - 95% | Merge or skip | Similar but not identical. If the new memory adds information, merge. If it is a subset, skip. |
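The three tiers in the table can be sketched as a single decision function. The sketch below makes several assumptions that are not stated on this page: normalization is taken to mean lowercasing and collapsing whitespace, embeddings are plain lists of floats, and the 95% boundary is treated as inclusive. The function names and return labels are hypothetical.

```python
import hashlib
import math


def content_hash(text: str) -> str:
    """SHA-256 of normalized content (normalization rule is an assumption)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def dedup_decision(new_text, new_vec, existing):
    """Classify a new memory against existing (text, vector) pairs.

    Returns "skip_exact", "skip_semantic", "near_duplicate", or "store".
    """
    new_hash = content_hash(new_text)
    best = 0.0
    for text, vec in existing:
        if content_hash(text) == new_hash:
            return "skip_exact"          # tier 1: identical after normalization
        best = max(best, cosine_similarity(new_vec, vec))
    if best >= 0.95:
        return "skip_semantic"           # tier 2: semantically identical
    if best >= 0.85:
        return "near_duplicate"          # tier 3: caller decides merge vs. skip
    return "store"
```

The near-duplicate tier only flags the candidate here. Deciding whether the new memory adds information or is a subset needs content-level comparison that this sketch leaves to the caller.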
When a new memory contradicts an existing one, the engine detects the conflict and applies one of five resolution strategies.
| Mode | Behavior |
|---|---|
| keep_existing | Preserve the original memory, discard the new one. |
| use_new | Replace the existing memory with the new one. |
| merge | Combine both memories, preserving unique information from each. |
| ask | Flag the conflict for user resolution. Both memories are kept until resolved. |
| temporal | Newer memory wins. Assumes the most recent information is correct. |
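To make the five modes concrete, here is a small dispatch sketch. The `Memory` class, its fields, and the return shape are hypothetical, and the merge step is a naive concatenation standing in for whatever combination logic the engine actually applies.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    content: str
    created_at: float  # Unix timestamp; field names are illustrative


def resolve_conflict(existing: Memory, new: Memory, mode: str):
    """Return (memories_to_keep, needs_user_review) for a detected conflict."""
    if mode == "keep_existing":
        return [existing], False
    if mode == "use_new":
        return [new], False
    if mode == "merge":
        # Placeholder merge: join both contents, keep the later timestamp.
        merged = Memory(
            content=f"{existing.content}; {new.content}",
            created_at=max(existing.created_at, new.created_at),
        )
        return [merged], False
    if mode == "ask":
        # Both memories stay until the user resolves the conflict.
        return [existing, new], True
    if mode == "temporal":
        newer = new if new.created_at >= existing.created_at else existing
        return [newer], False
    raise ValueError(f"unknown conflict mode: {mode}")
```

Note the difference between `use_new` and `temporal`: the first always prefers the incoming memory, while the second compares timestamps, which matters when older content is ingested late.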
The engine extracts entities and relationships from conversations automatically. Bidirectional relationships infer reverse edges.
People mentioned in conversations — names, roles, attributes.
Companies, teams, institutions, and groups.
Cities, countries, addresses, and named places.
Software, tools, services, and physical products.
Abstract ideas, topics, skills, and domains of knowledge.
Named events, conferences, milestones, and occurrences.
Entities that do not fit any other category.
Relationships are extracted with strength scores calculated as: (count × 0.4) + (recency × 0.3) + (confidence × 0.3).
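The strength formula is a plain weighted sum. The sketch below assumes each input has already been normalized to the range 0 to 1, which this page does not specify; the function name is hypothetical.

```python
def relationship_strength(count: float, recency: float, confidence: float) -> float:
    """Weighted sum of three signals, each assumed to be normalized to [0, 1]."""
    for name, value in (("count", count), ("recency", recency), ("confidence", confidence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return count * 0.4 + recency * 0.3 + confidence * 0.3
```

Because the weights sum to 1, normalized inputs yield a strength that also stays between 0 and 1. For example, a relationship seen at half the maximum count (0.5), very recently (1.0), with confidence 0.8 scores 0.2 + 0.3 + 0.24 = 0.74.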
works_at, manages, reports_to, friend_of, lives_in, parent_of, child_of, dating, married_to, member_of, founded, uses, created, collaborates_with, mentors, competitor_of, partner_of, client_of, sibling_of, knows

Search follows a cascade: LRU hot cache (sub-millisecond) to HNSW approximate nearest neighbor (fast vector search) to SQLite full-text search (fallback). Results are ranked using a composite score.
Default mode. Good for general-purpose recall.
35% semantic + 20% recency + 20% decay + 20% importance + 5% confidence
Prioritizes recently created or accessed memories.
30% semantic + 50% recency + 10% decay + 7% importance + 3% confidence
Surfaces high-importance memories regardless of age.
40% semantic + 10% recency + 10% decay + 35% importance + 5% confidence
Best for deep recall across long time periods.
50% semantic + 5% recency + 25% decay + 15% importance + 5% confidence
Broader results with balanced weighting across dimensions.
45% semantic + 15% recency + 15% decay + 20% importance + 5% confidence
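Each weight set above sums to 100%, so the composite score is a weighted average of five signals. The sketch below assumes every signal is a number between 0 and 1; `DEFAULT_WEIGHTS` and `RECENCY_WEIGHTS` are illustrative variable names holding the first two weight sets, not the engine's mode identifiers.

```python
SIGNALS = ("semantic", "recency", "decay", "importance", "confidence")

# First two weight sets from the list above, expressed as fractions.
DEFAULT_WEIGHTS = dict(zip(SIGNALS, (0.35, 0.20, 0.20, 0.20, 0.05)))
RECENCY_WEIGHTS = dict(zip(SIGNALS, (0.30, 0.50, 0.10, 0.07, 0.03)))


def composite_score(signals: dict, weights: dict = None) -> float:
    """Weighted sum of the five ranking signals (each assumed in [0, 1])."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * signals[name] for name in SIGNALS)


# An older, highly relevant memory versus a fresh, loosely relevant one.
older = {"semantic": 0.9, "recency": 0.1, "decay": 0.8, "importance": 0.5, "confidence": 0.9}
fresh = {"semantic": 0.5, "recency": 0.95, "decay": 0.6, "importance": 0.3, "confidence": 0.9}
```

With these inputs the default weights rank `older` first (about 0.64 against 0.59), while the recency-heavy weights flip the order (about 0.46 against 0.73), which is the trade-off the mode descriptions point to.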
Search cascade
// 1. Check LRU hot cache (500 entries)
cache_hit = hot_cache.get(query_embedding)
if (cache_hit) return cache_hit // ~0.1ms
// 2. HNSW approximate nearest neighbor
candidates = hnsw_index.search(query_embedding, k=50) // ~1-5ms
// 3. Full-text search fallback
if (candidates.length < limit) {
  fts_results = sqlite.fts5_search(query_text) // ~5-20ms
  candidates = merge(candidates, fts_results)
}
// 4. Composite ranking
ranked = candidates.sort((a, b) => score(b) - score(a))
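The cascade above can be exercised end to end with stand-ins for the two search backends. In this sketch `HotCache`, `search`, and the callback parameters are hypothetical names; the vector and text searches are passed in as plain functions rather than real HNSW or FTS5 calls, and writing the ranked result back to the cache and truncating to `limit` are assumptions not shown in the pseudocode.

```python
from collections import OrderedDict


class HotCache:
    """Small LRU cache: the least recently used entry is evicted first."""

    def __init__(self, capacity: int = 500):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the oldest entry


def search(query_text, query_embedding, cache, vector_search, text_search, score, limit=10):
    """Cache lookup, then vector search, then text fallback, then ranking."""
    key = tuple(query_embedding)
    cached = cache.get(key)
    if cached is not None:
        return cached
    candidates = list(vector_search(query_embedding))
    if len(candidates) < limit:
        seen = set(candidates)
        candidates += [c for c in text_search(query_text) if c not in seen]
    ranked = sorted(candidates, key=score, reverse=True)[:limit]
    cache.put(key, ranked)  # assumption: results are cached for repeat queries
    return ranked
```

Passing the backends as callables keeps the control flow testable without an index or a database, and a real implementation would swap in its own HNSW and SQLite calls at those two points.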