I kept running into the same problem with RAG pipelines: the system gives confident answers with no way to distinguish "this is in the data" from "this was filled in from model weights." So I built Kremis.
The core idea: data goes in as EAV signals (entity, attribute, value). Kremis builds a weighted graph from co-occurrence. Every query result is then classified:
FACT — the path exists directly in the graph
INFERENCE — derived via traversal, not a direct link
UNKNOWN — not in the data at all
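To make the three labels concrete, here is a minimal sketch of the idea on a toy in-memory graph. It is not Kremis's actual API: `ToyGraph`, `ingest`, `classify`, and the `Grounding` enum are hypothetical names, and the attribute is ignored for brevity. It assumes a direct edge means FACT, a multi-hop path means INFERENCE, and anything else is UNKNOWN.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// One EAV signal as (entity, attribute, value).
type Signal<'a> = (&'a str, &'a str, &'a str);

#[derive(Debug, PartialEq)]
enum Grounding {
    Fact,
    Inference,
    Unknown,
}

/// Toy undirected graph (hypothetical, not Kremis's real data model).
/// Each weight counts how often two nodes co-occurred in a signal.
#[derive(Default)]
struct ToyGraph {
    edges: BTreeMap<String, BTreeMap<String, u32>>,
}

impl ToyGraph {
    /// Link entity and value in both directions, bumping the weight on repeats.
    /// Simplification: the attribute is ignored in this sketch.
    fn ingest(&mut self, (entity, _attribute, value): Signal<'_>) {
        for (a, b) in [(entity, value), (value, entity)] {
            *self
                .edges
                .entry(a.to_string())
                .or_default()
                .entry(b.to_string())
                .or_insert(0) += 1;
        }
    }

    /// Direct edge => Fact, reachable by traversal => Inference, else Unknown.
    fn classify(&self, from: &str, to: &str) -> Grounding {
        let Some(direct) = self.edges.get(from) else {
            return Grounding::Unknown;
        };
        if direct.contains_key(to) {
            return Grounding::Fact;
        }
        // Breadth-first search; BTreeMap keeps the visiting order deterministic.
        let mut seen = BTreeSet::from([from]);
        let mut queue = VecDeque::from([from]);
        while let Some(node) = queue.pop_front() {
            for next in self.edges[node].keys() {
                if next.as_str() == to {
                    return Grounding::Inference;
                }
                if seen.insert(next.as_str()) {
                    queue.push_back(next.as_str());
                }
            }
        }
        Grounding::Unknown
    }
}

fn main() {
    let mut g = ToyGraph::default();
    g.ingest(("alice", "works_at", "acme"));
    g.ingest(("acme", "located_in", "berlin"));

    println!("{:?}", g.classify("alice", "acme")); // direct edge
    println!("{:?}", g.classify("alice", "berlin")); // reachable in two hops
    println!("{:?}", g.classify("alice", "paris")); // never ingested
}
```

Ordered maps are used here only so that repeated runs visit nodes in the same order, which is one simple way to get same-input, same-output behavior in a sketch like this.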
Same input -> same output, always. No embeddings involved in core state.
What it is today (v0.16.1, Apache 2.0):
- kremis-core: pure Rust library, no async, no network dependencies
- kremis CLI + HTTP API (16 endpoints, axum, optional Bearer auth)
- kremis-mcp: MCP bridge (9 tools, stdio) -- works in Claude Desktop/Cursor
- Persistent backend via redb (ACID, crash-safe)
- Binary releases + Docker (~136MB)
Two recent fixes that might interest Rust devs:
- Storage errors were being silently collapsed into Ok(None), making I/O failures look like "node not found" -- now properly propagated as Err(...)
- increment_edge was creating phantom edges when either node was missing -- fixed with the same guard already present in insert_edge
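The shape of both fixes can be sketched on a toy store. This is an illustration under assumptions, not the actual Kremis code: `ToyStore`, `StoreError`, `lookup`, and the `fail_reads` flag are hypothetical, and only the name `increment_edge` comes from the description above. The point is that `Result<Option<T>, E>` keeps "absent" and "storage failed" apart, and that the edge update checks both endpoints first.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, PartialEq)]
enum StoreError {
    Io(String),
}

/// Hypothetical in-memory stand-in for a persistent backend.
#[derive(Default)]
struct ToyStore {
    nodes: BTreeSet<u64>,
    edges: BTreeMap<(u64, u64), u32>,
    fail_reads: bool, // simulates an I/O failure on every read
}

impl ToyStore {
    /// Ok(Some(_)) = found, Ok(None) = genuinely absent, Err(_) = storage failure.
    /// Keeping these three cases apart is what stops an I/O error from
    /// looking like "node not found".
    fn lookup(&self, id: u64) -> Result<Option<u64>, StoreError> {
        if self.fail_reads {
            return Err(StoreError::Io("simulated read failure".into()));
        }
        Ok(self.nodes.get(&id).copied())
    }

    /// Bump an edge weight only when both endpoints exist.
    /// Returns Ok(false) instead of creating a phantom edge;
    /// `?` propagates storage errors to the caller.
    fn increment_edge(&mut self, from: u64, to: u64) -> Result<bool, StoreError> {
        if self.lookup(from)?.is_none() || self.lookup(to)?.is_none() {
            return Ok(false);
        }
        *self.edges.entry((from, to)).or_insert(0) += 1;
        Ok(true)
    }
}

fn main() {
    let mut store = ToyStore::default();
    store.nodes.insert(1);
    store.nodes.insert(2);

    println!("{:?}", store.increment_edge(1, 2)); // both nodes exist
    println!("{:?}", store.increment_edge(1, 99)); // node 99 is missing
    store.fail_reads = true;
    println!("{:?}", store.lookup(1)); // failure surfaces as Err, not Ok(None)
}
```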
Still experimental. The main friction is EAV ingestion: you have to structure your data as signals before it can be queried. Curious whether that's a dealbreaker in practice or a reasonable tradeoff for the grounding guarantees.
Concrete example to make the grounding tangible:
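As a hedged illustration with hypothetical types (the `Answer` struct and `render` function are assumptions, not Kremis's real response schema), an application could branch on the grounding label like this:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Grounding {
    Fact,
    Inference,
    Unknown,
}

/// Hypothetical shape of a query result carrying a grounding label.
struct Answer {
    text: String,
    grounding: Grounding,
}

/// One possible application-layer policy: pass facts through,
/// flag inferences, and refuse to answer on unknowns.
fn render(answer: &Answer) -> String {
    match answer.grounding {
        Grounding::Fact => answer.text.clone(),
        Grounding::Inference => format!("{} (inferred, not stated directly)", answer.text),
        Grounding::Unknown => "No supporting data found.".to_string(),
    }
}

fn main() {
    let answers = [
        Answer { text: "Alice works at Acme".into(), grounding: Grounding::Fact },
        Answer { text: "Alice is based in Berlin".into(), grounding: Grounding::Inference },
        Answer { text: "Alice lives in Paris".into(), grounding: Grounding::Unknown },
    ];
    for a in &answers {
        println!("{:?}: {}", a.grounding, render(a));
    }
}
```

The policy itself is a design choice: a stricter agent might drop inferences entirely, while a looser one might show them with a caveat as above.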
The grounding field is part of every HTTP response too, so if you're building an agent on top you can decide at the application layer what to do with INFERENCEs and UNKNOWNs instead of pretending they don't exist.
The part I'm genuinely uncertain about: is structuring data as EAV signals realistic in practice, or is the ingestion friction a hard blocker? That's the honest question I can't answer without more users trying it.