Graph RAG
1. Why Flat Vector Search Is Not Enough
Vector search retrieves semantically similar passages but ignores the relationships between entities. It struggles with:
| Query type | Problem with vector-only RAG |
|---|---|
| “How does drug X affect patients with condition Y who also take drug Z?” | Three-hop reasoning required across entities |
| “What are all subsidiaries of Company A?” | Requires traversing a corporate hierarchy |
| “Find all papers that cite both Author X and Author Y” | Structural graph query, not a semantic one |
| “What are the downstream consequences of policy change P?” | Causal chain traversal |
Knowledge graphs represent entities as nodes and relationships as labelled directed edges, enabling exact and multi-hop queries that vector similarity alone cannot answer.
2. Knowledge Graph Basics
A knowledge graph is a set of triples: (subject, predicate, object).
(Aspirin, inhibits, COX-2 enzyme)
(COX-2 enzyme, involved_in, inflammation_pathway)
(Ibuprofen, inhibits, COX-2 enzyme)
(Aspirin, contraindicated_with, warfarin)
Graph terminology:
- Node: an entity (drug, person, concept, document)
- Edge: a directed, labelled relationship
- Subgraph: a connected subset of nodes and edges
- Community: a cluster of densely connected nodes
Storage options:
- Graph databases: Neo4j, Amazon Neptune, TigerGraph
- In-memory: NetworkX (Python), small-scale prototyping
- Hybrid: store graph structure in Neo4j, embeddings in a vector DB, then join at query time
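To make the triple model concrete, here is a minimal standard-library sketch of an in-memory triple store with wildcard pattern matching. `TripleStore` and its `match` method are hypothetical names for illustration, not part of any library, and the sketch assumes a single normalised `inhibits` predicate for both drugs.

```python
from collections import defaultdict

Triple = tuple[str, str, str]


class TripleStore:
    """Tiny in-memory triple store indexed by subject and by object."""

    def __init__(self) -> None:
        self.triples: set[Triple] = set()
        self.by_subject: dict[str, set[Triple]] = defaultdict(set)
        self.by_object: dict[str, set[Triple]] = defaultdict(set)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        triple = (subject, predicate, obj)
        self.triples.add(triple)
        self.by_subject[subject].add(triple)
        self.by_object[obj].add(triple)

    def match(self, subject=None, predicate=None, obj=None) -> list[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        # Use an index when one end of the edge is fixed, else scan everything.
        if subject is not None:
            candidates = self.by_subject.get(subject, set())
        elif obj is not None:
            candidates = self.by_object.get(obj, set())
        else:
            candidates = self.triples
        return sorted(
            t for t in candidates
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("Aspirin", "inhibits", "COX-2 enzyme")
store.add("COX-2 enzyme", "involved_in", "inflammation_pathway")
store.add("Ibuprofen", "inhibits", "COX-2 enzyme")
store.add("Aspirin", "contraindicated_with", "warfarin")

# Exact structural query: which entities inhibit the COX-2 enzyme?
inhibitors = [s for s, _, _ in store.match(predicate="inhibits", obj="COX-2 enzyme")]
print(inhibitors)
```

The query returns every matching edge exactly, which is the kind of structural lookup that similarity search over passages cannot guarantee.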
3. Building a Knowledge Graph from Documents
Step 1: Entity and relation extraction

    EXTRACTION_PROMPT = """
    Extract all entities and relationships from this text as a list of triples.
    Format: (subject | predicate | object)
    Only extract explicit facts; do not infer.
    Text: {text}
    """
    # Example output:
    # (Pfizer | manufactures | Paxlovid)
    # (Paxlovid | approved_for | COVID-19 treatment)
    # (Paxlovid | contraindicated_with | strong CYP3A inhibitors)

Step 2: Entity resolution (deduplication)
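Different mentions of the same entity must be merged:

    # "USA", "United States", "U.S." → canonical entity "United States"
    # Use fuzzy matching + embedding similarity to identify co-referents
    from rapidfuzz import fuzz

    def are_same_entity(a: str, b: str, threshold: float = 85) -> bool:
        # fuzz.ratio returns a 0-100 similarity score. It catches spelling
        # variants but not abbreviations such as "USA" vs "United States";
        # those need an alias table or embedding similarity.
        return fuzz.ratio(a.lower(), b.lower()) > threshold

Step 3: Load into graph database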
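Before loading, the raw Step 1 output has to be parsed into tuples and each name mapped to its canonical form from Step 2. The following is a minimal standard-library sketch of that glue. `parse_triples`, `canonicalize`, and the `ALIASES` table are hypothetical helpers, and the input text is illustrative rather than real model output.

```python
import re

# Matches one line of the form "(subject | predicate | object)".
TRIPLE_RE = re.compile(r"^\(\s*(.+?)\s*\|\s*(.+?)\s*\|\s*(.+?)\s*\)$")

# Hypothetical alias table produced by the entity-resolution step.
ALIASES = {
    "usa": "United States",
    "u.s.": "United States",
    "united states": "United States",
}


def parse_triples(llm_output: str) -> list[tuple[str, str, str]]:
    """Parse extraction output, skipping lines that do not match the format."""
    triples = []
    for line in llm_output.splitlines():
        match = TRIPLE_RE.match(line.strip())
        if match:
            triples.append(match.groups())
    return triples


def canonicalize(name: str) -> str:
    """Map a mention to its canonical entity name; unknown names pass through."""
    return ALIASES.get(name.lower(), name)


# Illustrative input, including one malformed line that should be ignored.
raw = """
(Pfizer | manufactures | Paxlovid)
(Pfizer | headquartered_in | USA)
not a triple
"""

extracted_triples = [
    (canonicalize(s), p, canonicalize(o)) for s, p, o in parse_triples(raw)
]
print(extracted_triples)
```

Skipping malformed lines rather than raising keeps one bad model response from aborting a whole indexing run; whether to log or retry those lines is a separate design decision.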

    from neo4j import GraphDatabase

    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

    def add_triple(tx, subject, predicate, obj, source_doc):
        # MERGE creates each node and edge only if it does not already exist
        tx.run(
            "MERGE (s:Entity {name: $subject}) "
            "MERGE (o:Entity {name: $object}) "
            "MERGE (s)-[r:RELATION {type: $predicate, source: $source}]->(o)",
            subject=subject, predicate=predicate, object=obj, source=source_doc,
        )

    # extracted_triples: iterable of (subject, predicate, object) tuples from Steps 1-2
    with driver.session() as session:
        for triple in extracted_triples:
            session.execute_write(add_triple, *triple, source_doc="doc_id")

4. Retrieval Strategies
4.1 Entity-anchored retrieval
- Extract entities from the query
- Look them up in the graph
- Traverse N hops from those anchor nodes
- Return the subgraph as context
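The steps above can be sketched against a plain in-memory list of triples, which makes the hop limit easy to see. This is a breadth-first search that assumes edges are followed in both directions. `n_hop_subgraph` is a hypothetical helper, and entity extraction from the query is skipped by passing the anchor names in directly.

```python
from collections import defaultdict, deque

Triple = tuple[str, str, str]


def n_hop_subgraph(triples: list[Triple], anchors: list[str], hops: int = 2) -> list[Triple]:
    """Collect every triple reachable within `hops` edges of the anchor nodes."""
    # Index edges under both endpoints so traversal ignores edge direction.
    incident: dict[str, list[Triple]] = defaultdict(list)
    for triple in triples:
        subject, _, obj = triple
        incident[subject].append(triple)
        incident[obj].append(triple)

    seen_nodes = set(anchors)
    collected: list[Triple] = []
    frontier = deque((anchor, 0) for anchor in anchors)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand beyond the hop limit
        for triple in incident.get(node, []):
            if triple not in collected:
                collected.append(triple)
            subject, _, obj = triple
            neighbour = obj if subject == node else subject
            if neighbour not in seen_nodes:
                seen_nodes.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return collected


example_triples = [
    ("Aspirin", "inhibits", "COX-2 enzyme"),
    ("COX-2 enzyme", "involved_in", "inflammation_pathway"),
    ("Ibuprofen", "inhibits", "COX-2 enzyme"),
    ("Aspirin", "contraindicated_with", "warfarin"),
]

# Two hops from warfarin reach Aspirin and then the COX-2 enzyme,
# but not the inflammation pathway (that would need a third hop).
nearby = n_hop_subgraph(example_triples, ["warfarin"], hops=2)
print(nearby)
```

The Neo4j version of the same traversal follows.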
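
    def graph_retrieve(query: str, hops: int = 2) -> list[dict]:
        # Step 1: Extract entities from query
        # (extract_entities is assumed to be defined elsewhere)
        entities = extract_entities(query)  # e.g. ["Aspirin", "warfarin"]
        # Steps 2-3: match anchor nodes by name, then traverse up to N hops.
        # The hop bound is interpolated into the query text, so cast it to
        # int to keep arbitrary strings out of the Cypher statement.
        cypher = """
            MATCH (start:Entity)-[r*1..{hops}]-(related:Entity)
            WHERE start.name IN $entities
            RETURN start, r, related
            LIMIT 50
        """.format(hops=int(hops))
        # Open a session from the driver created in Step 3 of the previous section
        with driver.session() as session:
            return session.run(cypher, entities=entities).data()

4.2 Hybrid: vector + graph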
Vector search finds semantically relevant nodes; graph traversal explores their neighbourhood:

    def hybrid_retrieve(query: str) -> str:
        # vector_store and graph are placeholders for your vector index and graph client
        # 1. Vector search to find relevant entity nodes
        seed_entities = vector_store.search(query, top_k=5)
        # 2. Expand via graph traversal
        subgraph = graph.expand(seed_entities, hops=2)
        # 3. Convert subgraph triples to natural language (returns one string)
        return triples_to_text(subgraph)

4.3 Community-based summarisation (Microsoft GraphRAG)
Group graph nodes into communities (using the Leiden or Louvain algorithm) and pre-generate a summary for each community. At query time, retrieve the most relevant community summaries. This approach has the following properties:
- Faster for broad queries ("tell me about the pharmaceutical industry")
- Higher cost at index time (summaries must be generated per community)
- Enables global reasoning across the entire corpus
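The index-time and query-time halves of this idea can be sketched with the standard library alone. Note the substitutions: connected components stand in for Leiden/Louvain, word overlap stands in for embedding similarity, and the summaries are hard-coded where a real pipeline would have an LLM write them. All function names are hypothetical, and this is not the Microsoft GraphRAG API.

```python
import re
from collections import defaultdict


def connected_components(edges: list[tuple[str, str]]) -> list[set[str]]:
    """Stand-in for Leiden/Louvain: group nodes by connected component."""
    neighbours: dict[str, set[str]] = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    communities, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        communities.append(component)
    return communities


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric word set used for the overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_summaries(query: str, summaries: dict[int, str], top_k: int = 1) -> list[int]:
    """Rank community ids by word overlap with the query."""
    query_tokens = tokens(query)
    ranked = sorted(
        summaries,
        key=lambda cid: len(query_tokens & tokens(summaries[cid])),
        reverse=True,
    )
    return ranked[:top_k]


edges = [("Aspirin", "COX-2 enzyme"), ("Ibuprofen", "COX-2 enzyme"), ("Pfizer", "Paxlovid")]
communities = connected_components(edges)

# Index time: one summary per community (hard-coded here, LLM-written in practice).
summaries = {
    0: "Aspirin and Ibuprofen both inhibit the COX-2 enzyme.",
    1: "Pfizer manufactures Paxlovid.",
}

# Query time: only the summaries are searched, not the individual nodes.
best = rank_summaries("Which drugs inhibit COX-2?", summaries)
print(communities, best)
```

Because the query touches only the summaries, query cost scales with the number of communities rather than the number of nodes, which is where the speed-up for broad questions comes from.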
5. From Subgraph to LLM Context
Raw graph triples make awkward LLM input, so convert them to a readable format first:

    def triples_to_text(triples: list[tuple]) -> str:
        """Convert (subject, predicate, object) triples to readable sentences."""
        lines = []
        for s, p, o in triples:
            # Simple template
            pred_readable = p.replace("_", " ")
            lines.append(f"{s} {pred_readable} {o}.")
        return "\n".join(lines)

    # Or use an LLM to summarise the subgraph:
    SUBGRAPH_PROMPT = """
    Summarise the following knowledge graph facts in 2-3 sentences,
    focusing on what is relevant to the question: {question}
    Facts:
    {triples}
    """

Hybrid context pattern: Combine graph-derived facts with retrieved text chunks:
System: Answer using both the structured facts and the document excerpts below.
[Graph facts]
Aspirin inhibits COX-2 enzyme.
Aspirin is contraindicated with warfarin.
[Document excerpts]
...relevant passage from clinical guidelines...
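Assembling that layout in code is a matter of joining the two sources under their labels. A minimal sketch, where `build_hybrid_context` is a hypothetical helper and the inputs are assumed to be already-rendered strings:

```python
def build_hybrid_context(graph_facts: list[str], excerpts: list[str]) -> str:
    """Assemble the system message from graph facts and document excerpts."""
    sections = [
        "Answer using both the structured facts and the document excerpts below.",
        "[Graph facts]",
        *graph_facts,
        "[Document excerpts]",
        *excerpts,
    ]
    return "\n".join(sections)


context = build_hybrid_context(
    graph_facts=[
        "Aspirin inhibits COX-2 enzyme.",
        "Aspirin is contraindicated with warfarin.",
    ],
    excerpts=["...relevant passage from clinical guidelines..."],
)
print(context)
```

Keeping the two sections labelled lets the model tell exact graph facts apart from free text that may need interpretation.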
6. Trade-offs and When to Use Graph RAG
| Dimension | Vector-only RAG | Graph RAG |
|---|---|---|
| Index build cost | Low | High (entity extraction + graph construction) |
| Maintenance | Easy (re-embed changed docs) | Hard (graph updates, deduplication) |
| Multi-hop queries | Poor | Excellent |
| Semantic similarity | Excellent | Moderate (entity matching) |
| Relationship queries | Poor | Excellent |
| Global corpus queries | Poor | Good (community summaries) |
Use Graph RAG when:
- Your domain has rich entity relationships (healthcare, finance, legal, knowledge bases)
- Users ask multi-hop questions spanning many documents
- You need to answer structural queries (“all X connected to Y”)
- You need explainable reasoning paths (trace the graph edge by edge)
Avoid when:
- The domain is flat (FAQs, simple Q&A)
- Document velocity is high (graph maintenance is expensive)
- Low latency is critical (graph queries and triple-to-text conversion add time)
Key tools: Microsoft GraphRAG (the open-source graphrag library), LlamaIndex KnowledgeGraphIndex, LangChain Neo4j integration, G-Retriever.