HTTP API
RESTful JSON endpoints with Bearer token auth.
The harness contract lives here: POST /api/v1/seed for mirroring knowledge, POST /api/v1/search for semantic retrieval — plus memory CRUD, stats, and SSE event streaming.
The optional brain of Keyoku. A single Go binary (BSL 1.1) that gives the harness semantic memory: SQLite WAL storage, HNSW vector search with hybrid retrieval, embeddings, three-tier dedup, Ebbinghaus decay, and a knowledge graph. Set KEYOKU_ENGINE_URL and the harness does the rest.
One binary. One file. Ollama works locally — no API key.
Harness integration
The harness mirrors its knowledge layer — connector tool descriptions, mined practice patterns, agent research, CLAUDE.md conventions — into the engine, and its text searches upgrade to semantic search. No engine configured? The harness falls back to local JSONL, silently.
# Point the harness at a running engine
export KEYOKU_ENGINE_URL="http://localhost:18900"
# Knowledge mirrors in — extraction-free, embedder-only
POST /api/v1/seed
# knowledge_query text searches upgrade to semantic search
POST /api/v1/search
The engine runs as a single process with an embedded HTTP server. No external databases, no message queues, no infrastructure. SQLite handles persistence, HNSW handles vector search, and everything runs in-process.
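The harness-side choice between the engine and the local JSONL fallback might look roughly like the sketch below. It is an illustration under stated assumptions, not the harness's actual code: `pick_backend`, `build_search_request`, and the `{"query": ...}` body shape are hypothetical. Only `KEYOKU_ENGINE_URL`, the `/api/v1/search` path, and Bearer token auth come from the text above.

```python
import json
import urllib.request


def pick_backend(env):
    """Return ("engine", url) when KEYOKU_ENGINE_URL is set, else ("jsonl", None)."""
    url = env.get("KEYOKU_ENGINE_URL", "").strip()
    if url:
        return "engine", url.rstrip("/")
    return "jsonl", None


def build_search_request(base_url, token, query):
    """Build (but do not send) a POST to the search endpoint.

    The {"query": ...} body is an assumed shape; check the API reference.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/v1/search",
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )
```

Passing `os.environ` to `pick_backend` gives the silent fallback: with no URL set, the caller simply takes the `"jsonl"` branch.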
Three-tier dedup, conflict detection, Ebbinghaus decay, and composite ranking.
The seed path is extraction-free — embedder-only. LLM extraction remains available for standalone /remember ingestion.
SQLite with WAL mode for concurrent reads.
HNSW index for vector similarity search. LRU hot cache for frequently accessed memories. Full-text search as fallback.
Vector generation for every stored memory, powering HNSW search.
Ollama runs locally with no API key. Gemini, OpenAI, and Anthropic are supported as hosted alternatives.
The active core: the engine handles the full lifecycle of the harness's knowledge layer — ingest, deduplicate, store, decay, and retrieve — without the harness managing any of it directly.
Everything lives in one SQLite file with WAL mode for concurrent reads.
No external databases, no message queues, no infrastructure. Back it up by copying the file.
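Copying the file is simplest while the engine is stopped. When a WAL-mode database is still being written, recent commits can sit in the `-wal` sidecar file, so SQLite's online backup API is a safer way to take a consistent snapshot. A minimal sketch using Python's standard `sqlite3` module follows; the `snapshot` helper and both paths are placeholders, and whether the engine ships its own backup command is not covered here.

```python
import sqlite3


def snapshot(src_path, dest_path):
    """Copy a live SQLite database to dest_path using the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)  # consistent copy, including pages still in the WAL
    finally:
        dest.close()
        src.close()
```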
In-process approximate nearest neighbor search with a hybrid retrieval cascade.
LRU hot cache, HNSW vector search, and SQLite full-text fallback — merged and composite-ranked.
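One way to picture the merge step is sketched below. The engine's actual composite formula is not given in this document, so the per-source weights and the `merge_ranked` name are purely illustrative assumptions.

```python
def merge_ranked(cache_hits, vector_hits, text_hits, weights=(1.0, 0.8, 0.5)):
    """Merge three {memory_id: score} maps into one ranked list.

    Each source's score is scaled by its weight; a memory found by several
    sources keeps its best weighted score. Weights are illustrative only.
    """
    best = {}
    for hits, weight in zip((cache_hits, vector_hits, text_hits), weights):
        for memory_id, score in hits.items():
            weighted = score * weight
            if weighted > best.get(memory_id, float("-inf")):
                best[memory_id] = weighted
    # Highest score first; ties broken by id for deterministic output.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))
```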
Pluggable embedding providers — Ollama works locally with no API key.
Gemini, OpenAI, and Anthropic are also supported. The harness seed path is embedder-only: no extraction LLM, no extraction tokens.
Exact hash match, semantic similarity, and near-duplicate detection.
Thresholds at 0.95+ and 0.85-0.95 prevent redundant storage while preserving new information through merging.
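The three tiers can be sketched as a single classification function. This is a rough illustration, assuming SHA-256 for the exact-match hash and cosine similarity for the two thresholds; the `classify` name, the return labels, and the mapping of each band to an action are assumptions, not the engine's confirmed behavior.

```python
import hashlib
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(text, vector, stored):
    """Classify an incoming memory against stored (text, vector) pairs.

    Returns "exact", "duplicate", "merge", or "new".
    """
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    best = 0.0
    for old_text, old_vector in stored:
        if hashlib.sha256(old_text.encode("utf-8")).hexdigest() == digest:
            return "exact"      # tier 1: identical content hash
        best = max(best, cosine(vector, old_vector))
    if best >= 0.95:
        return "duplicate"      # tier 2: semantically the same, skip storage
    if best >= 0.85:
        return "merge"          # tier 3: near-duplicate, fold new details in
    return "new"
```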
Memories decay naturally over time based on their type.
Identity facts last ~365 days, ephemeral context ~3 days. Access frequency extends stability logarithmically.
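A common form of the Ebbinghaus curve is R = exp(-t / S), where t is the memory's age and S its stability. The sketch below treats the quoted lifetimes as S and grows S with ln(1 + accesses). Both choices are assumptions made for illustration, as are the type names and the `retention` helper.

```python
import math

# Base stability in days per memory type. The two values come from the text;
# the type names and their use as the constant S are assumptions.
BASE_STABILITY_DAYS = {"identity": 365.0, "ephemeral": 3.0}


def retention(memory_type, age_days, access_count=0):
    """Ebbinghaus-style retention R = exp(-t / S).

    S grows logarithmically with access count: S = base * (1 + ln(1 + n)).
    The exact growth formula is an assumption.
    """
    stability = BASE_STABILITY_DAYS[memory_type] * (1.0 + math.log1p(access_count))
    return math.exp(-age_days / stability)
```

Under these assumptions, an untouched ephemeral memory is at exp(-1), roughly 0.37, after 3 days, while an identity fact of the same age is still close to 1.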
Entities and relationships linked into a graph with 40+ relationship types.
People, organizations, locations, and products are connected with bidirectional inference.
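Bidirectional inference can be read as: when an edge is stored, its inverse is derived automatically. The sketch below illustrates that reading only. The relation names, the `INVERSE` table, and `add_relation` are hypothetical, and the engine's 40+ relationship types are not listed here.

```python
# Hypothetical relation names and inverses, for illustration only.
INVERSE = {
    "works_at": "employs",
    "employs": "works_at",
    "located_in": "contains",
    "contains": "located_in",
}


def add_relation(graph, subject, relation, obj):
    """Store an edge and, when an inverse is known, the reverse edge too."""
    graph.setdefault(subject, set()).add((relation, obj))
    inverse = INVERSE.get(relation)
    if inverse is not None:
        graph.setdefault(obj, set()).add((inverse, subject))
    return graph
```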
| Requirement | Note |
|---|---|
| Go 1.21+ | Build from source |
| SQLite 3.35+ | Bundled via go-sqlite3 |
| Embedding provider | Ollama works locally, no API key; Gemini, OpenAI, and Anthropic also supported |
| 64 MB RAM | Typical idle footprint |
| ~50 MB disk | Binary size |
Quick start
# Set required env vars
export KEYOKU_SESSION_TOKEN="your-secret-token"
# Ollama runs locally — no API key needed
export KEYOKU_EXTRACTION_PROVIDER="ollama"
# Run the binary
./keyoku-engine
# Health check
curl http://localhost:18900/api/v1/health
# Connect the harness
export KEYOKU_ENGINE_URL="http://localhost:18900"
The heartbeat / proactive-nudge / delivery / schedules subsystem predates the harness relaunch and is not used by the harness product. It remains maintained for standalone memory-engine deployments and is slated to be superseded by a slimmer trigger scan in v0.7.0. The harness has its own proactive layer (hook-delivered nudges), documented in the harness docs; the two are separate. See Heartbeat (legacy).
Explore the full API reference, configuration options, memory system internals, and deployment guides.