Lab 16: Knowledge Graph + LLM

Time: 50 minutes | Level: Architect | Docker: docker run -it --rm zchencow/innozverse-ai:latest bash

Overview

Knowledge graphs provide structured, verifiable knowledge that complements LLMs' statistical knowledge. This lab covers entity/relation extraction, graph construction, Neo4j/Cypher, GraphRAG architecture, SPARQL, ontology design, and KG-augmented generation patterns.

Architecture

┌──────────────────────────────────────────────────────────────┐
│              Knowledge Graph + LLM Architecture              │
├──────────────────────────────────────────────────────────────┤
│  TEXT CORPUS → NLP Pipeline                                  │
│  ├── NER: Entity extraction (Person, Org, Location, Event)   │
│  ├── RE: Relation extraction (works_at, located_in)          │
│  └── Coreference resolution (he/she/it → entity)            │
├──────────────────────────────────────────────────────────────┤
│  KNOWLEDGE GRAPH (Neo4j / RDF Store)                         │
│  Nodes: entities | Edges: relations | Properties: attributes │
├──────────────────────────────────────────────────────────────┤
│  QUERY: Natural language → Cypher/SPARQL → Graph results     │
│         ↓ retrieved subgraph                                  │
│  LLM → structured answer with citations                      │
└──────────────────────────────────────────────────────────────┘

Step 1: Why Knowledge Graphs for LLMs?

LLM Limitations KGs Solve:

KG vs Vector DB for RAG:

| Dimension | Vector DB RAG | Knowledge Graph RAG |
|---|---|---|
| Query type | Semantic similarity | Structured + semantic |
| Reasoning | Single-hop | Multi-hop |
| Facts | Fuzzy | Precise |
| Relations | Implicit | Explicit |
| Updates | Re-embed document | Add/update triples |
| Use case | General knowledge retrieval | Complex relation queries |
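
The single-hop versus multi-hop contrast in the comparison above can be made concrete with a small sketch. It assumes a hypothetical in-memory triple list (the names `TRIPLES`, `build_index`, and `follow_path` are illustrative, not part of any library) and shows how explicit relations let a query chain several hops deterministically.

```python
from collections import defaultdict

# Hypothetical triples (subject, relation, object); illustrative data only.
TRIPLES = [
    ("Alice", "works_at", "Acme"),
    ("Acme", "located_in", "Berlin"),
    ("Berlin", "located_in", "Germany"),
]

def build_index(triples):
    """Index outgoing edges by subject for hop-by-hop lookup."""
    index = defaultdict(list)
    for subj, rel, obj in triples:
        index[subj].append((rel, obj))
    return index

def follow_path(index, start, relations):
    """Follow a fixed sequence of relations from `start`; return the end nodes."""
    frontier = {start}
    for wanted in relations:
        next_frontier = set()
        for node in frontier:
            for rel, obj in index.get(node, []):
                if rel == wanted:
                    next_frontier.add(obj)
        frontier = next_frontier
    return sorted(frontier)

index = build_index(TRIPLES)
# Two-hop question: "In which city is Alice's employer located?"
city = follow_path(index, "Alice", ["works_at", "located_in"])
```

A pure similarity search would have to find a single passage stating the answer, whereas the traversal composes two separately stored facts.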


Step 2: Entity and Relation Extraction

Named Entity Recognition (NER) Categories:

| Category | Examples | NLP Labels |
|---|---|---|
| Person | Elon Musk, Sam Altman | PERSON |
| Organization | OpenAI, Google DeepMind | ORG |
| Location | San Francisco, EU | GPE/LOC |
| Product | GPT-4, Gemini, Claude | PRODUCT |
| Event | AlphaGo match, IPO | EVENT |
| Date/Time | Q1 2024, March 15 | DATE/TIME |

Relation Extraction Types:

Extraction Pipeline:
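
As a rough illustration of the NER and relation-extraction stages, here is a minimal rule-based sketch. It is not the lab's original pipeline: the gazetteer, the trigger phrases, and the function names are assumptions made for the example, and production systems typically use trained NER/RE models and add coreference resolution, which this sketch omits.

```python
import re

# Hypothetical gazetteer mapping surface forms to NER labels (illustrative only).
GAZETTEER = {
    "Sam Altman": "PERSON",
    "OpenAI": "ORG",
    "San Francisco": "GPE",
}

# Hypothetical trigger phrases mapped to relation types.
RELATION_PATTERNS = [
    (re.compile(r"\bworks at\b"), "works_at"),
    (re.compile(r"\bis located in\b"), "located_in"),
]

def extract_entities(sentence):
    """Return (start, end, text, label) for every gazetteer match, in text order."""
    found = []
    for name, label in GAZETTEER.items():
        for match in re.finditer(re.escape(name), sentence):
            found.append((match.start(), match.end(), name, label))
    return sorted(found)

def extract_relations(sentence):
    """Emit a triple when a trigger phrase sits between two adjacent entities."""
    entities = extract_entities(sentence)
    triples = []
    for left, right in zip(entities, entities[1:]):
        between = sentence[left[1]:right[0]]
        for pattern, relation in RELATION_PATTERNS:
            if pattern.search(between):
                triples.append((left[2], relation, right[2]))
    return triples

text = "Sam Altman works at OpenAI. OpenAI is located in San Francisco."
# Naive sentence split; adequate for this toy input only.
triples = [t for sentence in text.split(". ") for t in extract_relations(sentence)]
```

The resulting `(subject, relation, object)` triples are the unit that later steps load into the graph.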


Step 3: Neo4j and Cypher Query Language

Neo4j Graph Model:

Cypher Query Examples:
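
The sketch below keeps a few Cypher queries as parameterized Python strings. The node labels (`Person`, `Organization`, `Location`), the relationship types, and the `prepare` helper are assumptions for illustration; nothing is executed against a database here.

```python
import re

# Cypher queries as parameterized strings. $name placeholders are bound by the
# driver at run time, which avoids building queries by string concatenation.
# Labels and relationship types below are an assumed schema, not a fixed one.

CREATE_EMPLOYMENT = """
MERGE (p:Person {name: $person})
MERGE (o:Organization {name: $org})
MERGE (p)-[:WORKS_AT]->(o)
"""

FIND_COLLEAGUES = """
MATCH (p:Person {name: $person})-[:WORKS_AT]->(o:Organization)<-[:WORKS_AT]-(c:Person)
WHERE c <> p
RETURN c.name AS colleague, o.name AS org
"""

EMPLOYER_LOCATION = """
MATCH (p:Person {name: $person})-[:WORKS_AT]->(:Organization)-[:LOCATED_IN]->(l:Location)
RETURN l.name AS location
"""

def prepare(query, **params):
    """Pair a query with its parameters, checking every $placeholder is bound."""
    needed = set(re.findall(r"\$(\w+)", query))
    missing = needed - set(params)
    if missing:
        raise ValueError(f"unbound parameters: {sorted(missing)}")
    return query.strip(), params

query, params = prepare(FIND_COLLEAGUES, person="Sam Altman")
```

With the official Neo4j Python driver, a pair like this would typically be passed to `session.run(query, params)`.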

Cypher for LLM Context:
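
One way to hand graph results to an LLM is to serialize each returned row as a numbered fact that the answer can cite. The following is a sketch under assumptions: the row shape (`subject`, `relation`, `object`) and both function names are hypothetical, and the prompt wording is only an example.

```python
def rows_to_context(rows, max_facts=20):
    """Render result rows as numbered fact lines that an answer can cite."""
    lines = []
    for number, row in enumerate(rows[:max_facts], start=1):
        lines.append(f"[{number}] {row['subject']} {row['relation']} {row['object']}")
    return "\n".join(lines)

def build_prompt(question, rows):
    """Combine retrieved facts and the question into a grounded prompt."""
    return (
        "Answer using only the facts below and cite them by number.\n\n"
        f"Facts:\n{rows_to_context(rows)}\n\n"
        f"Question: {question}"
    )

# Hypothetical rows, shaped like the result of a query ending in
# RETURN a.name AS subject, type(r) AS relation, b.name AS object
rows = [
    {"subject": "Sam Altman", "relation": "WORKS_AT", "object": "OpenAI"},
    {"subject": "OpenAI", "relation": "LOCATED_IN", "object": "San Francisco"},
]
prompt = build_prompt("Where is Sam Altman's employer located?", rows)
```

Capping the number of facts (`max_facts`) keeps the retrieved subgraph within the model's context budget.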


Step 4: SPARQL Basics

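SPARQL (SPARQL Protocol and RDF Query Language) is the W3C-standard query language for RDF knowledge graphs. Queries are written as triple patterns that are matched against the stored triples.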

RDF Triple Format:

SPARQL SELECT:
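
To show how triples and a SELECT query relate, the sketch below stores a few RDF-style triples and emulates basic graph pattern matching in plain Python. The `ex:` namespace, the data, and the `select` helper are assumptions for illustration; the SPARQL text is shown as a string and is not executed, and a real deployment would use an RDF store or library instead of this toy matcher.

```python
# RDF-style triples using hypothetical prefixed names (ex: is an assumed namespace).
TRIPLES = [
    ("ex:SamAltman", "rdf:type", "ex:Person"),
    ("ex:SamAltman", "ex:worksAt", "ex:OpenAI"),
    ("ex:OpenAI", "rdf:type", "ex:Organization"),
    ("ex:OpenAI", "ex:locatedIn", "ex:SanFrancisco"),
]

# The SPARQL query that the matcher below emulates (not executed here).
SPARQL_QUERY = """
PREFIX ex: <http://example.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?person ?city WHERE {
  ?person rdf:type ex:Person .
  ?person ex:worksAt ?org .
  ?org ex:locatedIn ?city .
}
"""

def match_pattern(pattern, triple, binding):
    """Unify one triple pattern with one triple under the existing bindings."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_binding

def select(patterns, triples):
    """Evaluate a basic graph pattern by joining patterns left to right."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                result = match_pattern(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings

results = select(
    [
        ("?person", "rdf:type", "ex:Person"),
        ("?person", "ex:worksAt", "?org"),
        ("?org", "ex:locatedIn", "?city"),
    ],
    TRIPLES,
)
```

Each shared variable (`?person`, `?org`) acts as a join key between patterns, which is what makes multi-hop questions expressible in one query.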

Ontology Design (OWL):


Step 5: GraphRAG (Microsoft)

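GraphRAG, an approach published by Microsoft, extends RAG by building an entity graph from the corpus, clustering it into communities, and summarizing each community, so that queries can draw on community-level knowledge rather than isolated text chunks.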

GraphRAG vs Naive RAG:

GraphRAG Pipeline:
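
The community-detection and summarization stages can be sketched as follows. This is a deliberately simplified stand-in: it groups entities by connected components rather than the hierarchical Leiden clustering that GraphRAG uses, and `summarize` is a placeholder for an LLM call. The edge data and all function names are assumptions.

```python
from collections import defaultdict

# Hypothetical extracted relations (entity, entity); illustrative only.
EDGES = [
    ("OpenAI", "GPT-4"),
    ("OpenAI", "Sam Altman"),
    ("Google DeepMind", "Gemini"),
    ("Google DeepMind", "AlphaGo"),
]

def detect_communities(edges):
    """Group entities into connected components (simplified stand-in for Leiden)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, communities = set(), []
    for node in sorted(adjacency):
        if node in seen:
            continue
        stack, members = [node], set()
        while stack:
            current = stack.pop()
            if current in members:
                continue
            members.add(current)
            stack.extend(adjacency[current] - members)
        seen |= members
        communities.append(sorted(members))
    return communities

def summarize(members):
    """Placeholder for an LLM call that would write a community report."""
    return "Community of related entities: " + ", ".join(members)

communities = detect_communities(EDGES)
summaries = [summarize(members) for members in communities]
```

At query time, a global question would be answered over `summaries` instead of raw chunks, which is where the cross-document benefit comes from.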

Cost Note:


Step 6: KG-Augmented Generation Patterns

Pattern 1: KG as Fact-Checker:
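
A minimal sketch of the fact-checker pattern, assuming claims have already been extracted from the LLM output as `(subject, relation, object)` triples. The `KG` set and `check_claim` are hypothetical, and the "contradicted" rule assumes each object has exactly one valid subject per relation, which does not hold for every relation type.

```python
# Hypothetical trusted triples; in practice these would come from the graph store.
KG = {
    ("OpenAI", "developed", "GPT-4"),
    ("Google DeepMind", "developed", "Gemini"),
}

def check_claim(claim, kg):
    """Classify a (subject, relation, object) claim against the graph."""
    subject, relation, obj = claim
    if claim in kg:
        return "supported"
    # Assumption: one subject per (relation, object). If the graph records a
    # different subject for the same relation and object, flag a contradiction.
    for known_subject, known_relation, known_obj in kg:
        if known_relation == relation and known_obj == obj and known_subject != subject:
            return "contradicted"
    return "unverified"

verdicts = {
    claim: check_claim(claim, KG)
    for claim in [
        ("OpenAI", "developed", "GPT-4"),
        ("Google DeepMind", "developed", "GPT-4"),
        ("Anthropic", "developed", "Claude"),
    ]
}
```

Keeping "unverified" separate from "contradicted" matters: a missing triple only means the graph is silent, not that the claim is false.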

Pattern 2: KG as Context Source:

Pattern 3: Text2Cypher:
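
Text2Cypher usually involves two pieces around the LLM call: a prompt that exposes the graph schema, and a guard that inspects the generated query before it runs. The sketch below assumes a hypothetical schema string and helper names; the read-only check is a coarse keyword filter (it can reject harmless queries whose string literals contain these words, and it does not inspect procedure calls), not a full Cypher parser.

```python
import re

# Hypothetical schema description handed to the LLM.
SCHEMA = (
    "(:Person {name})-[:WORKS_AT]->(:Organization {name})"
    "-[:LOCATED_IN]->(:Location {name})"
)

# Coarse filter for clauses that modify the graph.
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|LOAD\s+CSV)\b",
    re.IGNORECASE,
)

def build_text2cypher_prompt(question):
    """Ask the model for a single read-only Cypher query over the known schema."""
    return (
        "Translate the question into one read-only Cypher query.\n"
        f"Schema: {SCHEMA}\n"
        "Use only labels, relationship types, and properties from the schema.\n"
        f"Question: {question}\n"
        "Cypher:"
    )

def is_read_only(cypher):
    """Reject generated queries that contain write clauses before execution."""
    return WRITE_CLAUSES.search(cypher) is None

prompt = build_text2cypher_prompt("Who works at OpenAI?")
safe = is_read_only("MATCH (p:Person)-[:WORKS_AT]->(o:Organization) RETURN p.name")
unsafe = is_read_only("MATCH (n) DETACH DELETE n")
```

Running generated queries under a database account with read-only privileges is a stronger safeguard than any text filter; the check here is only a first line of defense.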

Pattern 4: Hybrid KG + Vector:
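
The hybrid pattern can be sketched as a two-stage lookup: rank documents by embedding similarity, then expand the entities mentioned in the top hits with their graph facts. Everything below is assumed for illustration, including the toy three-dimensional vectors, the `DOCS` and `KG` structures, and the `hybrid_retrieve` name.

```python
import math

# Hypothetical documents with toy 3-d embeddings and the entities each mentions.
DOCS = {
    "doc1": {"vec": [0.9, 0.1, 0.0], "entities": ["OpenAI"]},
    "doc2": {"vec": [0.1, 0.9, 0.0], "entities": ["Google DeepMind"]},
}

# Hypothetical graph facts as (subject, relation, object) triples.
KG = [
    ("OpenAI", "developed", "GPT-4"),
    ("Google DeepMind", "developed", "Gemini"),
]

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def hybrid_retrieve(query_vec, k=1):
    """Return the top-k documents plus graph facts about the entities they mention."""
    ranked = sorted(
        DOCS, key=lambda doc_id: cosine(query_vec, DOCS[doc_id]["vec"]), reverse=True
    )[:k]
    entities = {entity for doc_id in ranked for entity in DOCS[doc_id]["entities"]}
    facts = [t for t in KG if t[0] in entities or t[2] in entities]
    return ranked, facts

docs, facts = hybrid_retrieve([1.0, 0.0, 0.0])
```

The vector stage supplies semantic recall, while the graph stage adds explicit relations that the retrieved text may only imply.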


Step 7: Enterprise KG Use Cases

Financial Services:

Cybersecurity Threat Intelligence:

HR and Organizational:


Step 8: Capstone — Threat Intel KG Traversal
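
The original capstone code is not reproduced here. As a stand-in, the sketch below shows one plausible shape for a threat-intelligence traversal: a breadth-first search that returns the chain of relations linking an actor to an affected asset. All entity names (`ActorA`, `MalwareX`, `VULN-001`, `ProductP`) and the `find_path` helper are hypothetical placeholders, not real threat data.

```python
from collections import deque

# Hypothetical threat-intel triples; names are placeholders, not real indicators.
EDGES = [
    ("ActorA", "uses", "MalwareX"),
    ("MalwareX", "exploits", "VULN-001"),
    ("VULN-001", "affects", "ProductP"),
    ("ActorB", "uses", "MalwareY"),
]

def find_path(edges, start, goal):
    """Breadth-first search returning the shortest chain of triples, or None."""
    outgoing = {}
    for subj, rel, obj in edges:
        outgoing.setdefault(subj, []).append((rel, obj))
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in outgoing.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None

attack_chain = find_path(EDGES, "ActorA", "ProductP")
no_chain = find_path(EDGES, "ActorB", "ProductP")
```

Returning the full chain of triples, rather than a yes/no answer, gives the LLM an evidence trail it can cite when explaining why an asset is exposed.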

📸 Verified Output:


Summary

| Concept | Key Points |
|---|---|
| KG vs Vector DB | KG: structured, multi-hop, precise; Vector: semantic, fuzzy, single-hop |
| Entity Extraction | NER → Entity linking → Coreference resolution |
| Neo4j/Cypher | Graph database; pattern-matching queries; MATCH-WHERE-RETURN |
| SPARQL | Standard for RDF graphs; SELECT-WHERE with triple patterns |
| GraphRAG | Community detection + summaries; better for cross-document analysis |
| KG-Augmented Gen | Patterns: fact-checker, context source, Text2Cypher, hybrid |
| Use Cases | Threat intel, financial relationships, HR skills mapping |

Next Lab: Lab 17: Real-Time AI Inference →
