Knowledge Graphs — Hybrid Pattern with External MCP

Knowledge Graphs are for slow-changing domain models (10–2,000 entities per type). They are not the right primitive for high-cardinality transactional data like 20,000 SKUs with real-time stock, an order history with millions of rows, or a customer database that updates every minute.

The right pattern for these domains is hybrid: put the domain structure in a Knowledge Graph and the live data behind an external MCP server. The agent uses both — the Knowledge Graph to understand “how my domain works” and the MCP server to answer “what is in stock right now”.

The boundary

Layer	Knowledge Graph	External MCP server
Cardinality	10–2,000 per type, ~10K total	Arbitrary (millions)
Change rate	Hours / days (slow-changing)	Seconds / minutes (real-time)
Source of truth	Customer-declared bundles in git	Customer’s existing system (Shopify, SAP, custom DB)
Update pattern	`brewctl kg apply` (atomic)	Live API calls
Tools agent uses	`list_X`, `get_X`, `list_X_ids`	Custom tools per MCP server (`search_products`, `get_inventory`, …)
Question answered	“How does my domain work?”	“What is in my system right now?”

Example: E-commerce shoe store

Consider a running-shoe e-commerce store with 20,000 SKUs. The wrong way is to put all 20,000 SKUs in a Knowledge Graph; the right way is to layer the two.

What lives in the Knowledge Graph

The structure of the shoe domain — ~1-2K entities total:

Category tree (~50 categories: running → road → trail → …)
Attribute taxonomy (10-50 attributes: size, color, material, pronation_type)
Attribute value enums (color: red/blue/black/…, size: EU 36..48)
Brand registry (~50-500 brands with positioning metadata)
Size conversion tables (US ↔ EU ↔ UK)

bundle_name: shoe-store-taxonomy
version: 2026-05-27.1
entity_types:
  - name: category
    schema_file: schemas/category.schema.json
    entities_file: entities/categories.yaml
  - name: brand
    schema_file: schemas/brand.schema.json
    entities_file: entities/brands.yaml
  - name: attribute_definition
    schema_file: schemas/attribute_definition.schema.json
    entities_file: entities/attributes.yaml

brewctl kg apply ./my-store

The engine creates ~1,500 entities total, well under the 10K limit, and generates these tools:

list_category(filters={parent_id?, surface?})
get_category(id)
list_brand(filters={tier?, in_category?})
get_brand(id)
list_attribute_definition(filters={for_category?})
get_attribute_definition(id)
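The size budget above can be checked before running an apply. The following is a minimal sketch in Python, not part of the engine: the function name and the input shape (a mapping of entity type to entity count) are hypothetical, and it treats the upper end of the 10–2,000 per-type range as a ceiling, which is an assumption. Only the 2,000 and 10K figures come from the boundary table.

```python
# Hypothetical pre-apply size check. The two limits come from the boundary
# table; the function and its input shape are illustrative assumptions.
MAX_PER_TYPE = 2_000
MAX_TOTAL = 10_000


def check_bundle_size(counts: dict[str, int]) -> list[str]:
    """Return human-readable problems; an empty list means the bundle fits."""
    problems = []
    for entity_type, count in counts.items():
        if count > MAX_PER_TYPE:
            problems.append(
                f"{entity_type}: {count} entities exceeds {MAX_PER_TYPE} per type"
            )
    total = sum(counts.values())
    if total > MAX_TOTAL:
        problems.append(f"total: {total} entities exceeds {MAX_TOTAL}")
    return problems


# Taxonomy-sized bundle versus the full SKU catalog.
taxonomy = {"category": 50, "brand": 500, "attribute_definition": 50}
catalog = {"sku": 20_000}

print(check_bundle_size(taxonomy))  # []
print(check_bundle_size(catalog))   # two problems: per-type and total
```

A check like this makes the anti-pattern visible early: the taxonomy passes, while the SKU catalog fails on both counts.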

What lives in an external MCP server

The inventory — 20K SKUs with prices, stock, photos, reviews. This is the customer’s existing system (Shopify, custom API). They put an MCP server in front of it:

# Existing in the engine config
mcp_servers:
  - name: shoe-inventory
    transport: http
    url: https://shop.example.com/mcp
    auth_key_env: SHOE_INVENTORY_TOKEN

agents:
  shop-assistant:
    model: glm-5
    mcp_servers: [shoe-inventory]      # ← external MCP
    capabilities:
      - type: knowledge_graphs
        config:
          bundles: [shoe-store-taxonomy]

The shoe-inventory MCP server provides:

search_products(filters={category_id, brand_id, size, color, in_stock, price_range})
get_product(sku)
get_inventory(sku)
list_recent_reviews(product_id, limit)

How an agent uses both

A multi-step user query like “I want red running shoes for the road, neutral pronation, under $150”:

Step 1: list_category(filters={parent_id: "running", surface: "road"})
  → KG returns: [
      {id: "road-running-neutral", label: "Neutral Road Running"},
      {id: "road-running-stability", label: "Stability Road Running"},
      ...
    ]

Step 2: get_attribute_definition("pronation_type")
  → KG returns: {
      id: "pronation_type",
      enum: ["neutral", "overpronation", "underpronation"]
    }

Step 3: list_brand(filters={tier: "mid", in_category: "road-running-neutral"})
  → KG returns: ~8 brands matching

Step 4: search_products(
    category_id: "road-running-neutral",
    pronation: "neutral",
    color: "red",
    brand_in: [...],
    price_lte: 150
  )
  → EXTERNAL MCP returns: 23 actual SKUs in stock, with prices and photos

Step 5: get_product(sku="ASICS-GEL-NIMBUS-25-RED-42")
  → EXTERNAL MCP returns: full product data, reviews, alternative sizes

The Knowledge Graph gives the agent a map of the domain — what categories exist, what “neutral pronation” means, which brands are relevant for road running. The external MCP server gives live availability.

Without the Knowledge Graph the agent would not know which brands are mid-tier road-running brands, or that pronation has exactly three enum values. Without the external MCP server the agent would not know what is in stock. They are complementary.
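The flow above can be sketched in Python with in-memory stand-ins for both tool sources. This is an illustration under stated assumptions, not the engine's client API: the stub data, the `pronation_type` field on categories, the tuple shape of `price_range`, and the `find_shoes` helper are all hypothetical. In a real deployment the agent runtime would dispatch these calls.

```python
# In-memory stand-ins for the Knowledge Graph and the external MCP server.
# All data and field names below are invented for illustration.
KG_CATEGORIES = [
    {"id": "road-running-neutral", "parent_id": "running",
     "surface": "road", "pronation_type": "neutral"},
    {"id": "road-running-stability", "parent_id": "running",
     "surface": "road", "pronation_type": "overpronation"},
    {"id": "trail-running", "parent_id": "running",
     "surface": "trail", "pronation_type": "neutral"},
]
INVENTORY = [
    {"sku": "A-RED-42", "category_id": "road-running-neutral",
     "color": "red", "price": 139.0, "in_stock": True},
    {"sku": "B-RED-43", "category_id": "road-running-neutral",
     "color": "red", "price": 179.0, "in_stock": True},
    {"sku": "C-BLUE-42", "category_id": "road-running-neutral",
     "color": "blue", "price": 99.0, "in_stock": True},
    {"sku": "D-RED-41", "category_id": "trail-running",
     "color": "red", "price": 120.0, "in_stock": True},
]


def list_category(filters):
    """KG side: slow-changing structure, filtered by exact field match."""
    return [c for c in KG_CATEGORIES
            if all(c.get(key) == value for key, value in filters.items())]


def search_products(filters):
    """External MCP side: live records, filtered by category, color, price."""
    low, high = filters.get("price_range", (0.0, float("inf")))
    return [p for p in INVENTORY
            if p["category_id"] == filters["category_id"]
            and p["color"] == filters.get("color", p["color"])
            and p["in_stock"] == filters.get("in_stock", p["in_stock"])
            and low <= p["price"] <= high]


def find_shoes(surface, pronation, color, max_price):
    """Resolve category IDs from the KG, then query live inventory."""
    categories = list_category({"parent_id": "running", "surface": surface,
                                "pronation_type": pronation})
    skus = []
    for category in categories:
        matches = search_products({"category_id": category["id"],
                                   "color": color, "in_stock": True,
                                   "price_range": (0.0, max_price)})
        skus.extend(product["sku"] for product in matches)
    return skus


print(find_shoes("road", "neutral", "red", 150))  # ['A-RED-42']
```

The point of the split is that `find_shoes` never guesses a category ID: it asks the Knowledge Graph for valid IDs first and only then spends a live call on the inventory.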

Example: Healthcare formulary

A medical reference application provides drug recommendations. A Knowledge Graph holds the medical taxonomy; an external MCP server provides the patient-specific formulary.

Knowledge Graph (structure)

list_condition(filters={icd10_chapter, severity})
list_symptom(filters={condition})
list_treatment_class(filters={condition})
list_active_ingredient(filters={treatment_class})

~5,000 total entities — conditions, symptoms, treatment classes, active ingredients with ATC codes. Slow-changing reference data, curated by domain experts.

External MCP server (patient context)

get_patient_allergies(patient_id)
list_available_medications(active_ingredient, country)
get_drug_interactions(drug_id_a, drug_id_b)

Patient-specific, jurisdiction-specific, frequently updated. Lives in the hospital’s existing pharmacy system, exposed via MCP.

Agent workflow

User: “What can I prescribe for a patient with flu symptoms?”

Step 1: list_condition(filters={symptom_group: "flu_like"})
  → KG: list of flu-like conditions with ICD-10 codes

Step 2: list_treatment_class(filters={condition: "J10"})
  → KG: antiviral classes for influenza

Step 3: list_active_ingredient(filters={treatment_class: "antivirals"})
  → KG: oseltamivir, zanamivir, ...

Step 4: get_patient_allergies(patient_id="PT-12345")
  → EXTERNAL MCP: patient allergy list

Step 5: list_available_medications(active_ingredient="oseltamivir", country="DE")
  → EXTERNAL MCP: products available in Germany
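Steps 3 to 5 combine reference data with patient context. A minimal Python sketch of that join follows, using stubs in place of both tool sources. The stub data, the product names, and the `candidate_products` helper are hypothetical, and the data carries no clinical meaning; only the three tool names and their parameters come from the listings above.

```python
# Stub data invented for illustration; it carries no clinical meaning.
KG_INGREDIENTS = {"antivirals": ["oseltamivir", "zanamivir"]}
PATIENT_ALLERGIES = {"PT-12345": ["zanamivir"]}
AVAILABLE = {
    ("oseltamivir", "DE"): ["product-a", "product-b"],
    ("zanamivir", "DE"): ["product-c"],
}


def list_active_ingredient(filters):
    """KG side: ingredients for a treatment class (reference data)."""
    return KG_INGREDIENTS.get(filters["treatment_class"], [])


def get_patient_allergies(patient_id):
    """External MCP side: patient-specific allergy list."""
    return PATIENT_ALLERGIES.get(patient_id, [])


def list_available_medications(active_ingredient, country):
    """External MCP side: products available in a given country."""
    return AVAILABLE.get((active_ingredient, country), [])


def candidate_products(treatment_class, patient_id, country):
    """Drop ingredients the patient is allergic to, then look up products."""
    allergies = set(get_patient_allergies(patient_id))
    result = {}
    for ingredient in list_active_ingredient({"treatment_class": treatment_class}):
        if ingredient in allergies:
            continue  # skip before spending a live availability call
        result[ingredient] = list_available_medications(ingredient, country)
    return result


print(candidate_products("antivirals", "PT-12345", "DE"))
# {'oseltamivir': ['product-a', 'product-b']}
```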

When to choose the boundary

A simple rule: if the data answers questions about your domain itself (taxonomy, categories, attributes, relationships, controlled vocabularies), it goes in a Knowledge Graph. If the data answers questions about a specific instance or current state (stock, prices, patient records, transactions), it goes in an external MCP server.

A second rule: if a single record changes more than once per day, it does not belong in a Knowledge Graph. Knowledge Graphs are GitOps-managed and applied atomically, so they assume the data is reference data, not transactional data.
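Both rules fit in a few lines of Python. This is a sketch of the decision only; the function name, its parameters, and the return labels are hypothetical, and a real decision would weigh more than these two inputs.

```python
def choose_layer(describes_domain_structure: bool, changes_per_day: float) -> str:
    """Apply the two rules: the change rate is checked first, then the data's role."""
    # Second rule: records that change more than once per day are transactional.
    if changes_per_day > 1:
        return "external_mcp"
    # First rule: structure goes in the Knowledge Graph, instances and state do not.
    return "knowledge_graph" if describes_domain_structure else "external_mcp"


print(choose_layer(True, 0.01))   # category tree -> knowledge_graph
print(choose_layer(False, 500))   # stock level   -> external_mcp
print(choose_layer(False, 0.1))   # patient record -> external_mcp
```

Checking the change rate first means frequently updated data is routed to the external MCP server even when it looks structural.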

Anti-patterns

❌ All 20K SKUs in a Knowledge Graph

This exceeds the ~10K total-entity limit and is ten times the upper end of the 10–2,000 per-type range. Even 5K SKUs would apply mechanically but defeat the purpose: every price update requires a full `brewctl kg apply`. Use an external MCP server.

❌ One Knowledge Graph entity per user

User profiles change frequently. They are also tenant-specific data that should live in your application’s primary data store, not in a customer-declared bundle. Use a dedicated API or external MCP server.

❌ Hardcoding inventory in agent system prompts

A common workaround is “the agent knows the product catalog because we listed it in the system prompt”. This works for tiny catalogs, but it does not scale, the catalog cannot be filtered, and the agent tends to hallucinate IDs adjacent to the real ones. Use a Knowledge Graph for the structure plus an external MCP server for live data.