Bundles & authoring layouts
A Knowledge Graph bundle is a directory containing a manifest, schemas, and entities. brewctl reads this directory, validates it, and posts the flattened payload to the engine in one atomic apply. The engine doesn’t see your file layout — it only sees the merged payload.
This means the layout decision is purely an authoring (developer-experience) choice. Two patterns are supported as of brewctl 0.4.0 / engine 1.4.0.
Pattern A — canonical single-file layout (1.3.x compatible)
The simplest layout — one schema file and one entities file per entity type. Works for catalogs up to a few hundred entities per type.

    my-bundle/
    ├── manifest.yaml
    ├── schemas/
    │   └── category.schema.json
    └── entities/
        └── categories.yaml

manifest.yaml:

    bundle_name: my-bundle
    version: 1.0.0
    entity_types:
      - name: category
        schema_file: schemas/category.schema.json
        entities_file: entities/categories.yaml

entities/categories.yaml:

    - code: footwear
      name: Footwear
    - code: apparel
      name: Apparel
    - code: home_goods
      name: Home Goods

This is the original 1.3.0 layout. Existing bundles need no changes to work with engine 1.4.0+.
Pattern B — split-by-directory layout (1.4.0+)
For larger catalogs (typically ≥100 entities per type), one giant YAML file becomes unwieldy:
- Git diffs are noisy: finding the one entity that changed in an 8,000-line file means scrolling through the whole file.
- PR review is hard: reviewers can’t isolate a logical change.
- Merge conflicts are frequent: parallel PRs conflict even when they touch different entities, because both edit the same file.
The split-by-directory layout addresses this. Declare entities_path in
the manifest instead of entities_file:

    my-bundle/
    ├── manifest.yaml
    ├── schemas/
    │   └── use_case.schema.json
    └── entities/
        └── use_case/
            ├── industry-pm.yaml   ← array of 30 PM-industry entities
            ├── industry-fb.yaml   ← array of 25 FB-industry entities
            └── PM-WF-010.yaml     ← single entity (one document per file)

manifest.yaml:

    bundle_name: my-bundle
    version: 1.0.0
    entity_types:
      - name: use_case
        schema_file: schemas/use_case.schema.json
        entities_path: entities/use_case/

brewctl globs all *.yaml and *.yml files in the directory (flat — no
recursion into subdirectories), merges them in deterministic filename
order, validates uniqueness across the merge, and posts the same atomic
payload as Pattern A.
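
To make the file-selection rule concrete, here is a minimal Python sketch of a flat, sorted listing. It is not brewctl's implementation: the function name `list_entity_files` is hypothetical, and it assumes that "deterministic filename order" means a plain case-sensitive sort on the file name.

```python
import tempfile
from pathlib import Path


def list_entity_files(directory: Path) -> list[Path]:
    """Return the *.yaml / *.yml files directly inside `directory`, sorted by name."""
    # iterdir() does not descend into subdirectories, matching the flat rule.
    candidates = [
        p for p in directory.iterdir()
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    ]
    # Assumption: a plain, case-sensitive sort on the file name.
    return sorted(candidates, key=lambda p: p.name)


# Demo on a throwaway directory with one non-YAML file and one nested file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("industry-pm.yaml", "industry-fb.yaml", "PM-WF-010.yml", "README.md"):
        (root / name).write_text("")
    (root / "nested").mkdir()
    (root / "nested" / "ignored.yaml").write_text("")
    found = [p.name for p in list_entity_files(root)]

print(found)
```

Under this assumption uppercase names sort before lowercase ones, so `PM-WF-010.yml` is listed first; the nested file and `README.md` are skipped.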
What each file in the directory can contain
Each file can be either form:

Array of entities — the split-by-category / industry / shard pattern:

    - code: PM-WF-010
      title: Water leak detection
      industry: PM
    - code: PM-WF-011
      title: Toilet overflow
      industry: PM

Single entity object — one PR per entity pattern:

    code: PM-WF-010
    title: Water leak detection
    industry: PM

Both styles can coexist in the same directory — pick what suits each group of entities best.
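
The two file forms can be normalised into one list before the uniqueness check. The sketch below works on already-parsed file contents (a list or a dict per file), so it needs no YAML library. The name `merge_entities` is hypothetical, and using `code` as the uniqueness key is an assumption based on the examples; the source does not name the field brewctl checks.

```python
def merge_entities(parsed_files: dict[str, object], key: str = "code") -> list[dict]:
    """Merge parsed entity files in filename order and reject duplicate keys."""
    merged: list[dict] = []
    seen: dict[str, str] = {}  # key value -> file that first defined it
    for filename in sorted(parsed_files):
        content = parsed_files[filename]
        # A file holds either an array of entities or a single entity object.
        entities = content if isinstance(content, list) else [content]
        for entity in entities:
            value = entity[key]
            if value in seen:
                raise ValueError(
                    f"duplicate {key} {value!r} in {filename} "
                    f"(first seen in {seen[value]})"
                )
            seen[value] = filename
            merged.append(entity)
    return merged


# Illustrative input: one array file and one single-entity file.
parsed = {
    "industry-pm.yaml": [
        {"code": "PM-WF-010", "title": "Water leak detection", "industry": "PM"},
        {"code": "PM-WF-011", "title": "Toilet overflow", "industry": "PM"},
    ],
    # Hypothetical extra entity, used only for this example.
    "PM-WF-012.yaml": {"code": "PM-WF-012", "title": "Example entity", "industry": "PM"},
}
merged = merge_entities(parsed)
print([e["code"] for e in merged])
```

Raising on the first duplicate keeps the sketch short; a real validator might collect every collision before reporting.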
Trade-offs at a glance
| Concern | Single-file (Pattern A) | Split-by-directory (Pattern B) |
|---|---|---|
| Small catalogs (<100 entities/type) | Best | Acceptable but overkill |
| Large catalogs (≥100 entities/type) | Hard to review | Best |
| Git diff readability | Noisy at scale | Per-file diffs are surgical |
| Merge conflict frequency | High when parallel work happens | Low — different files don’t conflict |
| File-system entry count | Low | Higher (one inode per file) |
| Quickstart accessibility | Easier for newcomers | Familiar once they grow |
| brewctl kg pull roundtrip | Lossless | Lossy — pull always emits Pattern A |
Mutual exclusion
You may declare either entities_file or entities_path per
entity type, never both. brewctl rejects bundles that set both with a
clear error before any network call:

    manifest entity_type "use_case": entities_file and entities_path are
    mutually exclusive — pick one

Different entity types in the same manifest can use different patterns — mix freely:

    entity_types:
      - name: industry          # small, keep simple
        schema_file: schemas/industry.schema.json
        entities_file: entities/industries.yaml

      - name: use_case          # large, split
        schema_file: schemas/use_case.schema.json
        entities_path: entities/use_case/

Pull behaviour — lossy on roundtrip (by design)
brewctl kg pull always emits the canonical single-file layout (Pattern A),
even for bundles that were authored via entities_path. The engine does
not record which file each entity came from, so reconstructing the split
would mean guessing — and guessing produces a misleading roundtrip.
If you maintain a split layout in source, treat pull as a backup / inspection
tool, not a roundtrip authoring tool. Apply with entities_path; if you
later need to reassemble a split layout from a pull, write a small local
script that re-splits by your chosen field (industry, category, etc.).
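
A re-split script of that kind can be very small. The sketch below only does the grouping step on already-parsed entities; reading the pulled file and writing each group back out as YAML is left to whichever YAML library you use. The function name `resplit`, the `<field>-<value>.yaml` naming scheme, and the `unassigned` fallback are all assumptions for illustration.

```python
from collections import defaultdict


def resplit(entities: list[dict], field: str) -> dict[str, list[dict]]:
    """Group entities by `field`, returning a mapping of filename -> entities."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for entity in entities:
        # Entities without the field land in a catch-all file (hypothetical choice).
        value = str(entity.get(field, "unassigned")).lower()
        groups[f"{field}-{value}.yaml"].append(entity)
    return dict(groups)


# Illustrative pulled entities; the FB entry is a made-up placeholder.
pulled = [
    {"code": "PM-WF-010", "title": "Water leak detection", "industry": "PM"},
    {"code": "PM-WF-011", "title": "Toilet overflow", "industry": "PM"},
    {"code": "FB-EX-001", "title": "Example entity", "industry": "FB"},
]
layout = resplit(pulled, "industry")
for filename in sorted(layout):
    print(filename, [e["code"] for e in layout[filename]])
```

Because the grouping depends only on a field inside each entity, running it twice on the same pull produces the same file set.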
A future release may add a split_by_field manifest option that makes
pull preserve a deterministic split layout. We are deferring that until a
concrete customer use case requires it — the lossy-but-honest default is
the safer baseline.
When to migrate from Pattern A to Pattern B
A rough rule of thumb:
- <100 entities per type — stay on Pattern A. The flat file is small enough that the operational complexity of a directory isn’t worth it.
- 100–500 entities per type — split is helpful but not urgent. If parallel-PR conflicts on the entities file are noticeable, migrate.
- ≥500 entities per type — split. The diff hell is real and the re-organisation pays for itself within a couple of weeks.
- Per-entity PR review (each entity gets its own approval / commit) — Pattern B with one document per file is the only sensible layout.
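
The rule of thumb above can be written down as a small helper, for example for a repository lint script. This is only a sketch: the function name `recommend_layout` is hypothetical, and because the list gives both "100–500" and "≥500", the sketch assumes that exactly 500 counts as the upper band.

```python
def recommend_layout(entities_per_type: int, per_entity_review: bool = False) -> str:
    """Map the rough thresholds from the rule of thumb to a layout suggestion."""
    if per_entity_review:
        # Each entity gets its own approval, so one document per file.
        return "Pattern B (one entity per file)"
    if entities_per_type < 100:
        return "Pattern A"
    if entities_per_type < 500:
        return "Pattern A or B (migrate if parallel-PR conflicts are noticeable)"
    # Assumption: exactly 500 is treated as the ">=500" band.
    return "Pattern B"


for count in (40, 250, 500):
    print(count, "->", recommend_layout(count))
print("per-entity review ->", recommend_layout(40, per_entity_review=True))
```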
To migrate, split your existing entities/<type>.yaml into multiple files
inside entities/<type>/, then update the manifest:

Before:

    - name: use_case
      schema_file: schemas/use_case.schema.json
      entities_file: entities/use_cases.yaml

After:

    - name: use_case
      schema_file: schemas/use_case.schema.json
      entities_path: entities/use_case/

brewctl kg validate ./bundle confirms the migration before any network
call.
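
A likely slip during migration is a manifest entry that still carries entities_file next to the new entities_path. The sketch below is a local pre-check for that case, assuming the manifest has already been parsed into a dict. The function name `check_entity_sources` and its message wording are hypothetical; brewctl's own validation remains the authority.

```python
def check_entity_sources(manifest: dict) -> list[str]:
    """Return one message per entity type that sets both entity source keys."""
    problems: list[str] = []
    for entity_type in manifest.get("entity_types", []):
        name = entity_type.get("name", "<unnamed>")
        if "entities_file" in entity_type and "entities_path" in entity_type:
            problems.append(f"{name}: entities_file and entities_path are both set")
    return problems


# A half-migrated manifest: use_case still has the old key alongside the new one.
manifest = {
    "bundle_name": "my-bundle",
    "version": "1.0.0",
    "entity_types": [
        {
            "name": "industry",
            "schema_file": "schemas/industry.schema.json",
            "entities_file": "entities/industries.yaml",
        },
        {
            "name": "use_case",
            "schema_file": "schemas/use_case.schema.json",
            "entities_file": "entities/use_cases.yaml",
            "entities_path": "entities/use_case/",
        },
    ],
}
problems = check_entity_sources(manifest)
print(problems)
```

Returning a list rather than raising lets a caller report every offending entity type in one pass.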
See also
- Knowledge Graphs concept — when and why to use Knowledge Graphs
- Schemas & annotations — JSON Schema + x-* annotations reference
- Migration 1.3 → 1.4 — query API changes
- Quickstart tutorial