Keyoku Engine.

The optional brain of Keyoku. A single Go binary (BSL 1.1) that gives the harness semantic memory: SQLite WAL storage, HNSW vector search with hybrid retrieval, embeddings, three-tier dedup, Ebbinghaus decay, and a knowledge graph. Set KEYOKU_ENGINE_URL and the harness does the rest.

One binary. One file. Ollama works locally — no API key.

Harness integration

The seed / search contract.

The harness mirrors its knowledge layer — connector tool descriptions, mined practice patterns, agent research, CLAUDE.md conventions — into the engine, and its text searches upgrade to semantic search. No engine configured? The harness falls back to local JSONL, silently.

# Point the harness at a running engine
export KEYOKU_ENGINE_URL="http://localhost:18900"

# Knowledge mirrors in — extraction-free, embedder-only
POST /api/v1/seed

# knowledge_query text searches upgrade to semantic search
POST /api/v1/search
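
The fallback rule above can be sketched in a few lines of Python. This is an illustration, not the harness's code: the function names and the "text" record field are hypothetical, and only the KEYOKU_ENGINE_URL variable comes from this page.

```python
import json
import os


def pick_backend(env=None):
    """Return ("engine", url) when KEYOKU_ENGINE_URL is set, else ("jsonl", None)."""
    env = os.environ if env is None else env
    url = (env.get("KEYOKU_ENGINE_URL") or "").strip()
    return ("engine", url.rstrip("/")) if url else ("jsonl", None)


def local_text_search(jsonl_lines, query):
    """Fallback path: case-insensitive substring match over JSONL records.

    The "text" field is a hypothetical record shape, not the harness's schema.
    """
    needle = query.lower()
    hits = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        if needle in str(record.get("text", "")).lower():
            hits.append(record)
    return hits
```

With the variable unset, pick_backend returns the JSONL branch and plain text search still works; setting it switches the same query to the engine.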

Architecture

Four layers, one process.

The engine runs as a single process with an embedded HTTP server. No external databases, no message queues, no infrastructure. SQLite handles persistence, HNSW handles vector search, and everything runs in-process.

Layer 1

HTTP API

RESTful JSON endpoints with Bearer token auth.

The harness contract lives here: POST /api/v1/seed for mirroring knowledge, POST /api/v1/search for semantic retrieval — plus memory CRUD, stats, and SSE event streaming.
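
A minimal sketch of calling the two contract endpoints from Python with the standard library. The endpoint paths and Bearer auth come from this page. The payload fields ("content", "query") are placeholders rather than the documented schema, and reusing the session token as the Bearer token is an assumption; check the API reference for both.

```python
import json
import urllib.request


def build_request(base_url, path, token, payload):
    """Build (but do not send) a JSON POST carrying a Bearer token."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Payload fields below are placeholders for illustration only.
seed_req = build_request(
    "http://localhost:18900", "/api/v1/seed", "your-secret-token",
    {"content": "example knowledge entry"},
)
search_req = build_request(
    "http://localhost:18900", "/api/v1/search", "your-secret-token",
    {"query": "example question"},
)
# To send against a running engine: urllib.request.urlopen(seed_req)
```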

Layer 2

Memory engine

Three-tier dedup, conflict detection, Ebbinghaus decay, and composite ranking.

The seed path is extraction-free — embedder-only. LLM extraction remains available for standalone /remember ingestion.

Layer 3

Storage

SQLite with WAL mode for concurrent reads.

HNSW index for vector similarity search. LRU hot cache for frequently accessed memories. Full-text search as fallback.

Layer 4

Embeddings

Vector generation for every stored memory, powering HNSW search.

Ollama runs locally with no API key. Gemini, OpenAI, and Anthropic are supported as hosted alternatives.

Features

What the harness uses.

The active core: the engine handles the full lifecycle of the harness's knowledge layer — ingest, deduplicate, store, decay, and retrieve — without the harness managing any of it directly.

SQLite WAL storage

Everything lives in one SQLite file with WAL mode for concurrent reads.

No external databases, no message queues, no infrastructure. Back it up by copying the file.
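
To see what WAL mode looks like from the outside, here is a small Python sketch using the standard sqlite3 module. The file and table names are hypothetical and have nothing to do with the engine's actual schema.

```python
import os
import sqlite3
import tempfile


def open_wal(path):
    """Open a SQLite file and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


def backup_to(conn, dest_path):
    """Copy a live database with SQLite's online backup API."""
    dest = sqlite3.connect(dest_path)
    try:
        conn.backup(dest)
    finally:
        dest.close()


workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "example.db")  # hypothetical file name
conn, mode = open_wal(db_path)
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT)")
conn.execute("INSERT INTO memories (content) VALUES (?)", ("hello",))
conn.commit()
backup_path = os.path.join(workdir, "backup.db")
backup_to(conn, backup_path)
conn.close()
```

In WAL mode, recent writes can sit in a sidecar -wal file until the next checkpoint. A plain file copy is simplest when the engine is stopped; the online backup API shown here is one option while it is running.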

HNSW vector search + hybrid retrieval

In-process approximate nearest neighbor search with a hybrid retrieval cascade.

LRU hot cache, HNSW vector search, and SQLite full-text fallback — merged and composite-ranked.
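
One way to picture "merged and composite-ranked" is the Python sketch below. It is a toy under stated assumptions: brute-force cosine similarity stands in for HNSW, an OrderedDict stands in for the LRU hot cache, substring matching stands in for SQLite full-text search, and the 0.2 / 0.6 / 0.2 weights are invented for illustration. The engine's actual scoring may differ.

```python
import math
from collections import OrderedDict


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_text, query_vec, hot_cache, vectors, texts, k=3):
    """Merge cache, vector, and full-text hits into one ranked list.

    hot_cache: OrderedDict of id -> text (stand-in for the LRU hot cache)
    vectors:   dict of id -> embedding (brute-force stand-in for HNSW)
    texts:     dict of id -> text (stand-in for SQLite full-text search)
    """
    scores = {}

    def bump(mem_id, amount):
        scores[mem_id] = scores.get(mem_id, 0.0) + amount

    needle = query_text.lower()
    for mem_id, text in hot_cache.items():
        if needle in text.lower():
            bump(mem_id, 0.2)  # hypothetical bonus for hot entries
    for mem_id, vec in vectors.items():
        bump(mem_id, 0.6 * cosine(query_vec, vec))  # hypothetical weight
    for mem_id, text in texts.items():
        if needle in text.lower():
            bump(mem_id, 0.2)  # hypothetical weight for keyword hits

    # Highest composite score first; ties broken by id for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]
```

Summing per-source scores by memory id means an entry found by several sources rises in the ranking, while an entry found by only one still appears.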

Embeddings

Pluggable embedding providers — Ollama works locally with no API key.

Gemini, OpenAI, and Anthropic are also supported. The harness seed path is embedder-only: no extraction LLM, no extraction tokens.
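
"Pluggable" and "embedder-only" can be illustrated with a small Python sketch. Everything named here is hypothetical: the Embedder protocol, the toy hash-based embedder (a deterministic stand-in, not a real model), and the seed function are not the engine's interfaces.

```python
import hashlib
import math
from typing import List, Protocol


class Embedder(Protocol):
    """Anything with an embed(text) method can act as a provider."""

    def embed(self, text: str) -> List[float]: ...


class ToyHashEmbedder:
    """Deterministic stand-in that needs no network; not a real model."""

    def __init__(self, dims: int = 8):
        self.dims = dims

    def embed(self, text: str) -> List[float]:
        vec = [0.0] * self.dims
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            vec[digest[0] % self.dims] += 1.0  # bucket each token by hash
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec] if norm else vec


def seed(embedder: Embedder, texts: List[str]):
    """Embedder-only ingestion: each text is embedded as-is, no LLM step."""
    return [(text, embedder.embed(text)) for text in texts]
```

Swapping providers then means passing a different object with the same embed method; the seeding code itself does not change.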

Three-tier deduplication

Exact hash match, semantic similarity, and near-duplicate detection.

Similarity thresholds of 0.95+ (semantic duplicates) and 0.85-0.95 (near-duplicates) prevent redundant storage while preserving new information through merging.
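
The three tiers can be sketched as a single classification function in Python. Mapping 0.95+ to the semantic tier and 0.85-0.95 to the near-duplicate tier is an assumption based on the order in which the tiers and thresholds are listed here; the hash normalization and the tier labels are also hypothetical.

```python
import hashlib
import math


def content_hash(text):
    """Hash of lowercased, whitespace-normalized text (normalization is an assumption)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(text, vec, stored):
    """Classify an incoming memory against stored (text, embedding) pairs.

    Returns "exact", "semantic", "near", or "new".
    """
    new_hash = content_hash(text)
    if any(content_hash(t) == new_hash for t, _ in stored):
        return "exact"  # tier 1: identical content, skip storage
    best = max((cosine(vec, v) for _, v in stored), default=0.0)
    if best >= 0.95:
        return "semantic"  # tier 2: same meaning, treated as a duplicate
    if best >= 0.85:
        return "near"  # tier 3: close enough to merge rather than add
    return "new"
```

The cheap hash check runs first, so embeddings are compared only when the text is not already stored verbatim.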

Ebbinghaus decay

Memories decay naturally over time based on their type.

Identity facts last ~365 days, ephemeral context ~3 days. Access frequency extends stability logarithmically.
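
A common form of the Ebbinghaus curve is R = exp(-t / S), where t is age and S is stability. The Python sketch below uses that form and treats the ~365-day and ~3-day figures as stability constants. Both that reading and the exact logarithmic boost, S * (1 + ln(1 + accesses)), are assumptions for illustration, not the engine's formula.

```python
import math

# Stability in days per memory type. Only the two values quoted above come
# from the text; the type names are hypothetical labels.
STABILITY_DAYS = {"identity": 365.0, "ephemeral": 3.0}


def retention(memory_type, age_days, access_count=0):
    """Ebbinghaus-style retention R = exp(-t / S).

    Each access extends stability logarithmically (assumed form).
    """
    base = STABILITY_DAYS[memory_type]
    stability = base * (1.0 + math.log1p(access_count))
    return math.exp(-age_days / stability)
```

Under these assumptions, an ephemeral memory with no accesses drops to about 37% retention (1/e) after 3 days, and every additional access slows the decay by a diminishing amount.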

Knowledge graph

Entities and relationships linked into a graph with 40+ relationship types.

People, organizations, locations, and products are connected with bidirectional inference.
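
Bidirectional inference can be pictured as storing the inverse edge whenever a relationship is added. In this Python sketch the relationship names, their inverses, and the Graph class are all hypothetical; the engine's 40+ relationship types are not listed on this page.

```python
from collections import defaultdict

# Hypothetical relationship names and their inverses.
INVERSE = {
    "works_at": "employs",
    "employs": "works_at",
    "located_in": "contains",
    "contains": "located_in",
    "friend_of": "friend_of",  # symmetric relation is its own inverse
}


class Graph:
    def __init__(self):
        self.edges = defaultdict(set)  # entity -> {(relation, other entity)}

    def add(self, subject, relation, obj):
        """Store an edge and, when an inverse is known, the reverse edge too."""
        self.edges[subject].add((relation, obj))
        inverse = INVERSE.get(relation)
        if inverse is not None:
            self.edges[obj].add((inverse, subject))

    def neighbors(self, entity):
        """Return the entity's edges in sorted order."""
        return sorted(self.edges.get(entity, set()))
```

Adding ("Ada", "works_at", "Acme") makes the edge visible from both sides, so a lookup that starts at the organization finds the person without a second write.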

Requirements

What you need.

Requirement	Note
`Go 1.21+`	Build from source
`SQLite 3.35+`	Bundled via go-sqlite3
`Embedding provider`	Ollama works locally, no API key — Gemini, OpenAI, Anthropic also supported
`64MB RAM`	Typical idle footprint
`~50MB disk`	Binary size

Quick start

Running the engine.

# Set required env vars
export KEYOKU_SESSION_TOKEN="your-secret-token"

# Ollama runs locally — no API key needed
export KEYOKU_EXTRACTION_PROVIDER="ollama"

# Run the binary
./keyoku-engine

# Health check
curl http://localhost:18900/api/v1/health

# Connect the harness
export KEYOKU_ENGINE_URL="http://localhost:18900"

Legacy subsystems

Heartbeat, schedules, and delivery.

The heartbeat / proactive-nudge / delivery / schedules subsystem predates the harness relaunch and is not used by the harness product. It remains maintained for standalone memory-engine deployments and is slated to be superseded by a slimmer trigger scan in v0.7.0. The harness has its own proactive layer — hook-delivered nudges — documented in the harness docs; the two are separate. See Heartbeat (legacy).

Dive deeper.

Explore the full API reference, configuration options, memory system internals, and deployment guides.
