MCP-Native  ·  53 Domains  ·  Open Benchmark

Correct answers for agents,
not approximate retrieval.

RAG hallucinates relationships — it retrieves similar text, not the actual structure. CKG makes relationship errors structurally impossible. Every hop is a cited edge. Zero hallucinations by construction.

$ pip install ckg-mcp
Works with Claude Desktop · LangGraph · AutoGen · Cursor · any MCP client
0.857
BERT F1 — answer quality
65×
More efficient than RAG
274
Tokens per query
0%
Hallucination rate
53
Domains included
Built for teams at
Graphify.md · Slalom · West Monroe · Atek
For AI Engineers Building with Agents

RAG hallucinates relationships.
CKG makes them structurally impossible.

RAG retrieves similar text — not actual entity relationships. When your domain has real structure (what gates what, what breaks what, what depends on what), retrieval loses it. CKG encodes it explicitly. Relationship errors become structurally impossible by construction, not just less likely.

"My agent refactored a utility function. It had no idea 23 modules imported it. RAG returned similar text — not the dependency graph. The blast radius was invisible until after the push."
RAG gives your agent vibes
Retrieves semantically similar chunks — approximate, probabilistic
Agent dispatches before it knows what it's touching
Multi-hop accuracy collapses past hop 2
Blast radius unknown until after the action
3,100 tokens to get a guess
CKG gives your agent structure
Orchestrator queries the dependency graph before dispatching
Every hop is a cited edge — deterministic, not inferred
F1 improves continuously to hop=5 (0.772 at depth 5)
Full blast radius returned before any edit — 23 modules, exact
274 tokens. Zero hallucinations by construction.
Who It's For

Built for every team where
agents need to get it right

📋 Product Managers

Your agents don't know the domain. You do.

PMs map complexity for a living — what gates what, what breaks if X ships, what the upstream dependencies are. CKG gives your AI agents the same map before they draft a single word.

274 tokens to get the full dependency chain
💊 Life Sciences

125 nodes. Every GLP-1 dependency typed.

Muscle wasting has 13 downstream dependents — more than any cardiovascular node. Four oral drugs converging simultaneously. 20 combination therapy paths analysts don't map. The graph shows what the spreadsheet doesn't.

RAG uses 3,100 tokens to approximate this
⚙️ Engineering Teams

Blast radius before any edit. Deterministic.

Mapped LangChain Core: 180 modules, 650 dependency edges. trace_downstream("RunnableSequence") returns the exact 23 dependent modules before your agent writes a line.

Every hop is a real dependency edge
Benchmark

RAG retrieves text.
CKG traverses structure.

8,121 queries · 47 domains · BERTScore roberta-large · Fully reproducible

                        CKG — Graphify.md    RAG                 Microsoft GraphRAG
Retrieval method        Graph traversal      Similarity search   Community summaries
BERT F1                 0.857                0.817               0.825
Tokens / query          274                  17,900              ~10,000
Cost / correct answer   $0.000506            $0.013046           $0.020098
Hallucination rate      0% by construction   Variable            Variable
Multi-hop accuracy      Improves at depth    Degrades at depth   Partial

Full benchmark → github.com/Yarmoluk/ckg-benchmark · co-authored with Dan McCreary (former Head of AI, TigerGraph)

How It Works

One install. Any domain.
Agents get context before they act.

claude_desktop_config.json
// Add to your MCP config
{
  "mcpServers": {
    "ckg": {
      "command": "ckg-mcp"
    }
  }
}

// Works with Claude Desktop, LangGraph,
// AutoGen, Cursor — any MCP client
Life Sciences — GLP-1 payer brief
# Agent queries before writing anything
trace_upstream("Prior Authorization")

→ Prerequisites:
  Payer formulary tier assignment
    Cost-effectiveness of GLP-1RA therapy
      GLP-1 receptor agonist drug class
  Medical necessity criteria
  Step therapy requirements

# 274 tokens · zero hallucinations
Codebase — blast radius before edit
# Know what breaks before touching it
trace_downstream("RunnableSequence")

→ 23 dependent modules:
  RunnableParallel
  RunnableLambda
  RunnablePassthrough
  AgentExecutor
  ... 19 more

# Every hop = real dependency edge
Same interface — any domain
list_domains()
→ 53 domains available

query_ckg("glp1-obesity", "Metformin", depth=3)
query_ckg("langchain-core", "BaseChain")
find_path("calculus", "Limits", "Taylor Series")

# Swap the domain. Interface unchanged.
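Under the hood, a call like trace_downstream is a plain breadth-first walk over a typed edge list, not an inference step. A minimal Python sketch, using a hypothetical three-edge graph (the real langchain-core CKG ships 650 edges; these names and edge types are illustrative only):

```python
from collections import deque

# Hypothetical miniature edge list. Real CKGs ship as CSVs of typed edges;
# these three edges exist only to illustrate the traversal.
EDGES = [
    ("RunnableSequence", "RunnableParallel", "depends_on"),
    ("RunnableSequence", "RunnableLambda", "depends_on"),
    ("RunnableLambda", "AgentExecutor", "depends_on"),
]

def trace_downstream(node: str) -> list[tuple[str, str, str]]:
    """Return every edge reachable from `node`. Each hop IS the citation."""
    adjacency: dict[str, list[tuple[str, str, str]]] = {}
    for src, dst, kind in EDGES:
        adjacency.setdefault(src, []).append((src, dst, kind))
    seen, queue, cited = {node}, deque([node]), []
    while queue:
        current = queue.popleft()
        for edge in adjacency.get(current, []):
            cited.append(edge)  # return the edge itself, never a paraphrase
            if edge[1] not in seen:
                seen.add(edge[1])
                queue.append(edge[1])
    return cited
```

Because the output is the set of edges walked, a relationship that is not in the graph cannot appear in the answer. That is the sense in which hallucinated hops are structurally impossible.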
GLP-1 Intelligence

What the graph reveals
that reports don't

125 nodes. 200+ typed dependency edges. Built from ClinicalTrials.gov in one automated session — no expert curation.

13
Downstream concepts depend on Muscle Wasting

More than any cardiovascular node. Most equity coverage indexes on weight loss efficacy. The structural center of gravity in this graph is somewhere else entirely — and it's not in the analyst deck.

Analyst blind spot
4
Oral drugs converging to one competitive node

Ozempic pill, orforglipron, CagriSema, retatrutide feed a single pipeline convergence node. An analyst covering Lilly's program without mapping Novo's simultaneously sees a quarter of the picture.

Pipeline convergence
20
Combination therapy nodes — understudied axis

Most trial research covers monotherapy. The graph has 20 combination therapy nodes already mapped. The next clinical differentiation is going to land there — and it's already structured.

Commercial opportunity
Explore the GLP-1 graph →
Get Started

Five minutes to your first query

Claude Desktop
LangGraph
AutoGen
Cursor
Python 3.10+
No infra required
Install
$ pip install ckg-mcp
Step 1
pip install
One command. 53 domains bundled. No database, no embedding pipeline, no vector store.
Step 2
Add MCP config
Drop the JSON snippet into Claude Desktop, LangGraph, or any MCP-compatible orchestrator.
Step 3
Agents get context first
Your orchestrator calls trace_upstream before dispatching. Agents know the structure before they act.
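Step 3 can be sketched as a pre-action guard in the orchestrator. Here `fetch_blast_radius` is a stand-in stub for a real MCP tool call, and the dependent-count threshold is an illustrative policy, not part of ckg-mcp:

```python
# Hypothetical pre-action guard. `fetch_blast_radius` stubs whatever your
# MCP client wires up (e.g. a downstream-trace tool on the ckg server).
def fetch_blast_radius(symbol: str) -> list[str]:
    # Stubbed result; a real orchestrator would query the ckg MCP server.
    return ["RunnableParallel", "RunnableLambda", "AgentExecutor"]

def dispatch_edit(symbol: str, max_dependents: int = 25) -> bool:
    """Only dispatch an edit whose full blast radius fits the budget."""
    dependents = fetch_blast_radius(symbol)
    if len(dependents) > max_dependents:
        return False  # escalate to a human instead of editing blind
    return True       # safe: the blast radius is known before any action
```

The point of the pattern: the structural query happens before dispatch, so the agent never acts on a symbol whose dependents are unknown.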
Configure
claude_desktop_config.json
{
  "mcpServers": {
    "ckg": {
      "command": "ckg-mcp"
    }
  }
}
Available tools
list_domains()
query_ckg(domain, concept, depth)
get_prerequisites(domain, concept)
search_concepts(domain, query)
53 Domains Included

Structured knowledge across
every major vertical

Same interface, swap the domain. Enterprise builds (regulatory, legal, financial, custom) available on request.

Life Sciences & Clinical
glp1-obesity glp1-muscle-loss payer-formulary drug-interactions dementia hipaa-compliance icd10-metabolic cpt-em-coding
Codebase & Software
langchain-core computer-science circuits digital-electronics blockchain quantum-computing claude-skills
AI & Data Science
machine-learning data-science-course conversational-ai tracking-ai-course prompt-class intro-to-graph systems-thinking
Mathematics & STEM
calculus linear-algebra statistics-course chemistry biology genetics bioinformatics signal-processing
Business & Finance
economics-course personal-finance organizational-analytics it-management-graph
Enterprise (on request)
Regulatory frameworks Legal & IP Custom domain build Weekly-updated CKGs
FAQ

Frequently asked questions

What is Graphify.md?

Graphify.md builds Compact Knowledge Graphs (CKGs) — structured domain ontologies delivered via MCP as pre-action context for AI agents. Instead of RAG retrieving similar text after a question is asked, CKG gives agents the exact dependency structure of a domain before they act. The result: 65× more token-efficient, BERT F1 0.857, zero hallucinations by construction.

What is a Compact Knowledge Graph (CKG)?

A CKG is a pre-structured directed acyclic graph where every node is a typed domain concept and every edge is a typed dependency relationship. CKGs are serialized as CSV files and delivered via four MCP tools — query_ckg, get_prerequisites, search_concepts, list_domains — to any agent orchestrator as pre-action structural context.
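The CSV-to-DAG step described above can be sketched in a few lines. This assumes a simple source,target,relation column shape; the actual CKG serialization may use different column names:

```python
import csv
import io

# Hypothetical CSV shape. The FAQ says CKGs serialize to CSV; the exact
# columns here are an assumption for illustration.
CSV_DATA = """source,target,relation
Limits,Derivatives,prerequisite_of
Derivatives,Taylor Series,prerequisite_of
Limits,Continuity,prerequisite_of
"""

def load_edges(text: str) -> dict[str, list[str]]:
    """Build a reverse-adjacency map: concept -> its direct prerequisites."""
    prereqs: dict[str, list[str]] = {}
    for row in csv.DictReader(io.StringIO(text)):
        prereqs.setdefault(row["target"], []).append(row["source"])
    return prereqs

def get_prerequisites(concept: str, prereqs: dict[str, list[str]]) -> list[str]:
    """Walk upstream edges; every returned concept arrives via an explicit
    edge in the graph, never via similarity."""
    out, stack, seen = [], list(prereqs.get(concept, [])), set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        out.append(node)
        stack.extend(prereqs.get(node, []))
    return out
```

Because the graph is a DAG, the walk terminates without cycle checks beyond the `seen` set, and the transitive prerequisite closure is exact.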

How does CKG compare to RAG and GraphRAG?

In a benchmark of 8,121 queries across 47 domains using BERTScore (roberta-large): CKG achieved BERT F1 0.857 vs RAG's 0.817 and Microsoft GraphRAG's 0.825. CKG uses 274 tokens per query vs RAG's 17,900 — 65× more efficient. Cost per correct answer: CKG $0.000506 vs GraphRAG $0.020098 (40× lower). CKG hallucination rate is 0% by construction. Full benchmark is open source and reproducible.

How do I install ckg-mcp?

pip install ckg-mcp, then add {"mcpServers": {"ckg": {"command": "ckg-mcp"}}} to your MCP config. Works with Claude Desktop, LangGraph, AutoGen, Cursor, and any MCP-compatible orchestrator. Python 3.10+ required. 53 domains included — no additional setup.

How is this different from GitLab's knowledge graph?

GitLab's knowledge graph is a codebase introspection tool for developers navigating source code — one domain, developer-facing. CKG is a domain knowledge layer for AI agents, delivered via MCP before they act, across 53 domains. Codebase (langchain-core, 180 modules, 650 edges) is one of them. The same agent that gets blast radius for a code edit can query a clinical pathway or a regulatory framework — same install, same interface, same 274 tokens.

What is the GLP-1 knowledge graph?

The GLP-1 Clinical Pathway CKG contains 125 concepts and 200+ typed dependency edges covering mechanism of action, clinical trials, drug classes (semaglutide, tirzepatide, orforglipron), payer formulary dynamics, and combination therapies. Key structural insight: muscle wasting has 13 downstream dependent concepts — more than any cardiovascular node — making it the most structurally central complication. Built from ClinicalTrials.gov data using the automated Factory pipeline in one session.

The knowledge layer
your agents are missing

pip install ckg-mcp · 53 domains · five minutes to your first query