How it works
The operating loop. Keyoku is a memory engine for AI agents. It captures structured state from conversations, stores it in SQLite with HNSW vector search, and runs a 14-signal heartbeat scan — zero LLM tokens per tick — to decide when your agent should speak up.
Built in Go. Single binary. No cloud dependency.
Engine pipeline
Six stages, one loop. Raw conversation → structured memory → knowledge graph → decayed relevance → ranked recall → heartbeat decision.
Every exchange is sent to the engine. The LLM extracts structured memories and stores them in SQLite.
Extracts memory type, importance (0–1), confidence, sentiment, and hedging signals. Each memory gets an embedding vector for later retrieval.
POST /api/v1/remember
Three-tier deduplication on every ingest. Exact hash, semantic similarity, and near-duplicate detection.
Exact match → skip. Semantic ≥ 0.95 → skip. Near-duplicate (0.85–0.95) → merge if it adds information. Conflict detection catches contradictions with five resolution modes.
Automatic on ingest
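The tiered thresholds above can be sketched as a small decision function. This is an illustrative sketch, not the engine's actual API; only the thresholds and outcomes come from the docs.

```javascript
// Three-tier dedup decision, per the documented thresholds:
// exact hash match or semantic similarity >= 0.95 skips,
// 0.85–0.95 merges only if the new memory adds information,
// anything lower is stored as a new memory.
function dedupAction(exactHashMatch, similarity, addsInformation) {
  if (exactHashMatch) return 'skip';       // tier 1: exact hash
  if (similarity >= 0.95) return 'skip';   // tier 2: semantic duplicate
  if (similarity >= 0.85) {                // tier 3: near-duplicate
    return addsInformation ? 'merge' : 'skip';
  }
  return 'store';                          // novel memory
}
```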
Entities and 40+ relationship types extracted automatically. Bidirectional inference builds context.
Extracts people, orgs, locations, products, concepts, events. Relationship types: works_at, manages, friend_of, lives_in, parent_of, dating, member_of. Strength = (count × 0.4) + (recency × 0.3) + (confidence × 0.3).
GET /api/v1/entities
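The strength formula translates directly to code. One assumption here: all three inputs are already normalized to [0, 1] (the docs don't say how raw access counts are normalized).

```javascript
// Relationship strength per the documented formula:
//   strength = (count * 0.4) + (recency * 0.3) + (confidence * 0.3)
// Assumes countNorm and recencyNorm are pre-normalized to [0, 1].
function relationshipStrength(countNorm, recencyNorm, confidence) {
  return countNorm * 0.4 + recencyNorm * 0.3 + confidence * 0.3;
}
```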
Ebbinghaus-inspired decay. Frequently accessed memories resist decay. Stability grows with access count.
retention(t) = e^(−t / effective_stability). Lifecycle: ACTIVE → STALE (< 0.3) → ARCHIVED (< 0.1) → DELETED (< 0.01). Identity memories last ~365 days. Ephemeral context ~3 days.
Background job
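The retention curve and lifecycle thresholds above can be sketched as two small functions; the time unit just has to match the stability unit (days, here).

```javascript
// Exponential decay: retention(t) = e^(-t / effective_stability).
function retention(ageDays, effectiveStabilityDays) {
  return Math.exp(-ageDays / effectiveStabilityDays);
}

// Lifecycle thresholds from the docs:
// ACTIVE → STALE (< 0.3) → ARCHIVED (< 0.1) → DELETED (< 0.01).
function lifecycleState(r) {
  if (r < 0.01) return 'DELETED';
  if (r < 0.1) return 'ARCHIVED';
  if (r < 0.3) return 'STALE';
  return 'ACTIVE';
}
```

A fresh memory starts at retention 1.0; an identity memory with ~365-day stability takes roughly 440 days to fall below the 0.3 stale threshold.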
Three-tier retrieval: LRU hot cache → HNSW vectors → SQLite full-text fallback.
Composite ranking: semantic similarity, recency, importance, decay, and confidence. Five ranking modes: balanced (default), recent, important, historical, comprehensive.
POST /api/v1/search
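The composite ranking can be sketched as a weighted sum. The five factor names come from the docs; the equal weights below are purely illustrative (the real modes, such as balanced or recent, presumably reweight these factors).

```javascript
// Illustrative composite ranking over the five documented factors.
// Weights are invented for the sketch; each factor is assumed
// to be normalized to [0, 1].
function compositeScore(m, w = { sim: 0.2, rec: 0.2, imp: 0.2, dec: 0.2, conf: 0.2 }) {
  return (
    w.sim * m.similarity +
    w.rec * m.recency +
    w.imp * m.importance +
    w.dec * m.retention +    // decay factor: current retention value
    w.conf * m.confidence
  );
}
```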
14-signal scan runs pure SQL — no LLM cost per tick. When should_act is true, optional LLM analysis generates action briefs.
Checks deadlines, stalled work, pending tasks, conflicts, sentiment shifts, relationship gaps, knowledge gaps, behavioral patterns, and more. Cooldowns adapt to user response rate. LLM analysis is triggered via heartbeatContext with analyze: true.
POST /api/v1/heartbeat/check
Heartbeat
14 signals, pure SQL scan. Every heartbeat tick runs pure SQL queries — no LLM cost per tick. When signals trigger action, optional LLM analysis generates briefs.
Cron-tagged memories hitting their trigger time
Memories with type=PLAN and a cron_tag that matches current time. Highest priority — always surfaces.
Memories with expires_at within the deadline window
Default window: 24h. Critical deadlines within 1h force immediate action, bypassing quiet hours.
Unresolved contradictions between stored memories
Detected during ingest. Five resolution modes: keep_existing, use_new, merge, ask, temporal.
Interrupted sessions — mid-conversation resumption
Tracks session age and whether the last session was interrupted. Generates resume suggestions.
Monitoring tasks overdue for check-in
Active plans/activities that haven't been accessed or updated within their expected cadence.
5+ new memories since last action
High memory velocity signals active conversation. Reduces watcher interval to check more frequently.
Active plans and tasks (type=PLAN/ACTIVITY)
Counts memories with state=ACTIVE and type PLAN or ACTIVITY. Checked against recent activity for progress.
Plan progress detection vs recent activity
Compares plan memories against activity memories. Reports on_track, at_risk, stalled, or no_activity.
Unanswered questions stored in memory
Questions the agent couldn't answer get stored. Surfaces them periodically for follow-up.
Goal improvements, re-engagement, sentiment recovery
Detects positive trends: completed milestones, returning users, improving sentiment scores.
Important memories approaching decay threshold
Memories with high importance but retention nearing 0.3. Opportunity to reinforce before they go stale.
Emotional trend analysis across recent memories
Compares recent_avg vs previous_avg sentiment. Reports improving, declining, or stable with notable memories.
Silent entities nearing deadlines
Tracks days_silent for related entities. Surfaces relationship alerts at info, attention, or urgent levels.
Day-of-week behavioral patterns
Detects recurring behaviors by day of week. Reports with confidence score and associated topics.
Decision flow
How should_act is decided. Seven steps prevent spam, respect rhythm, and ensure novelty.
1. First contact: fewer than 5 memories? Always act.
2. Critical deadline: deadline within 1 hour? Force immediate action.
3. Time-of-day filter: each period scales the cooldown (0.5× morning, 1× working, 1.5× evening, 3× late night, 10× quiet).
4. Conversation awareness: active conversation? Only elevated+ signals pass.
5. Confluence scoring: sum signal weights. Thresholds: act=8, suggest=12, observe=20.
6. Fingerprint + cooldown: same signal fingerprint within cooldown? Suppress. Response rate < 10%? Cooldown ×10.
7. Topic dedup: same entities surfaced recently? Suppress.
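Assuming the gates run in the order listed, the flow can be sketched as below. The field names and helper functions are invented for illustration; only the ordering and the short-circuit behavior come from the docs (the threshold compared in step 5 depends on the autonomy level).

```javascript
// Sketch of the seven-step should_act decision.
function shouldAct(ctx) {
  // 1. First contact: always act for brand-new users.
  if (ctx.memoryCount < 5) return true;
  // 2. Critical deadline: bypasses every other gate.
  if (ctx.criticalDeadlineWithin1h) return true;
  // 3. Time-of-day filter stretches the cooldown.
  const cooldown = ctx.baseCooldown * ctx.timeOfDayMultiplier;
  // 4. During an active conversation, only elevated+ signals pass.
  const signals = ctx.activeConversation
    ? ctx.signals.filter(s => s.elevated)
    : ctx.signals;
  // 5. Confluence scoring: sum the surviving signal weights.
  const score = signals.reduce((sum, s) => sum + s.weight, 0);
  if (score < ctx.actThreshold) return false;
  // 6. Fingerprint + cooldown: suppress recent repeats.
  if (ctx.sameFingerprintWithin(cooldown)) return false;
  // 7. Topic dedup: suppress recently surfaced entities.
  if (ctx.topicsRecentlySurfaced(signals)) return false;
  return true;
}
```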
Memory model
Eight types, each with a half-life. Every memory has a type that determines its base stability. Access frequency extends stability logarithmically.
Core identity facts — who they are
Likes, dislikes, preferences
Long-term connections with people and orgs
Important events and milestones
Behavioral history — what they were doing
Ongoing tasks, projects, goals
Conversation context and session data
Transient, session-specific facts
Autonomy levels
Observe, suggest, act. The engine decides whether to act; the autonomy level decides how.
Signals collected but not delivered. Cooldowns: 4h / 2h / 8h.
The agent stays informed internally. Good for learning user patterns before acting.
Signals surfaced as suggestions. Cooldowns: 2h / 30m / 4h.
Response-aware cooldowns prevent spam. Topic dedup ensures relevance. User decides whether to act.
Agent executes automatically. Cooldowns: 10m / 5m / 30m.
Time-period awareness and graduated gates keep behavior in check. The LLM can suppress but never promote signals.
Adaptive timing
The tick adjusts. Watcher interval multipliers stack based on context.
Quiet hours 10×, late night 3×, evening 1.5×, working 1×, morning 0.5×.
Acted < 5 min ago → 2×. Less than 15 min → 1.5×.
Zero signals → 3× interval. 4+ signals → 0.8× (check more often).
5+ new memories since last action → 0.7× (things are happening).
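The multiplier values above come from the docs; one assumption in this sketch is that stacking means a plain product over the base interval.

```javascript
// Effective watcher interval: base tick times every active
// context multiplier (quiet hours 10x, zero signals 3x, etc.).
function effectiveInterval(baseSeconds, multipliers) {
  return multipliers.reduce((interval, m) => interval * m, baseSeconds);
}
```

For example, quiet hours (10×) combined with zero signals (3×) would stretch a 60-second base tick to 1800 seconds.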
OpenClaw integration
Two hooks. Seven tools. The @keyoku/openclaw plugin connects Keyoku to your agent.
Searches Keyoku for relevant memories and injects them into the prompt.
Injected as <your-memories> context. The LLM sees them naturally — no "according to my records" phrasing needed.
Runs a 14-signal scan when OpenClaw fires a heartbeat prompt.
If should_act is true, injects <heartbeat-signals> with urgency tier, action brief, and recommended response.
Stashes the user message for pairing with the agent response.
No API call yet — waits for the agent to respond so it can capture the full exchange in one atomic operation.
Pairs user message + agent response and sends to /remember.
One atomic capture per turn. Keyoku handles dedup, extraction, and conflict resolution. Also records heartbeat fingerprints for deduplication.
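The paired capture can be sketched as a payload builder. The /remember endpoint is documented above; the payload field names here are assumptions for illustration only.

```javascript
// Build the body for one atomic user+agent capture sent to
// POST /api/v1/remember. Field names are illustrative.
function buildRememberPayload(entityId, userMessage, agentResponse) {
  return {
    entity_id: entityId,
    exchange: { user: userMessage, agent: agentResponse },
  };
}
```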
LLM tools
7 tools registered. Your agent can explicitly search, store, and manage memories.
Semantic search over stored memories
Fetch a specific memory by ID
Explicitly save important information
Delete a memory by ID
View memory count, types, and health
Create a cron-based reminder
List active scheduled reminders
Service lifecycle
Engine runs as a sidecar. The OpenClaw plugin manages the engine binary automatically.
Checks ~/.keyoku/bin, ~/.local/bin, then PATH.
Downloaded automatically by the init wizard. No manual setup needed.
Plugin spawns the binary on OpenClaw startup.
Health check waits up to 5 seconds for /api/v1/health before proceeding.
16-byte hex token, generated on first run.
Stored in ~/.keyoku/.env. Bearer token auth on all API calls.
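In practice that means every request carries the token as a Bearer header; a small helper (names illustrative) makes this concrete.

```javascript
// Headers for an authenticated Keyoku API call, using the
// token generated on first run. Helper name is illustrative.
function authHeaders(token) {
  return {
    'Authorization': `Bearer ${token}`,
    'Content-Type': 'application/json',
  };
}
```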
SIGTERM on OpenClaw exit. No orphan processes.
Quick example
Heartbeat check in a few lines.
const result = await fetch('/api/v1/heartbeat/check', {
  method: 'POST',
  body: JSON.stringify({ entity_id: 'user-1' })
});
const { should_act, signals, urgency_tier } = await result.json();
if (should_act) {
  agent.respond(signals);
}

Architecture
Why it works this way.
Zero external dependencies: SQLite + in-process HNSW. One binary, one file, portable anywhere.
Ebbinghaus decay + access frequency. Memories that matter stay.
Cache → HNSW → FTS. Sub-millisecond for hot memories.
Signal scan is pure SQL. LLM called only for extraction and optional analysis.
Multiple weak signals combine naturally, like noticing something is off.
User ignores nudges? Cooldowns multiply 10×. The agent learns to back off.
Ready to build? The fastest path: run the OpenClaw setup wizard.
npx @keyoku/openclaw init