Governed Knowledge Store • Apache 2.0

Your Knowledge Store Should
Govern, Not Just Store.

Trust tiers on every document. Integrity verification on every read. Lifecycle management. Point-in-time queries. Content-addressed storage. Not a vector database. A governed knowledge store.

quantum-pipes/vault
Governed
Vault Contents
4 documents
company-handbook-v3.pdf
1.5x search boost

Official company handbook. Approved by the executive team. This is the single source of truth for all company policies, procedures, and guidelines.

Trust Tier: CANONICAL
Classification: INTERNAL
Lifecycle: ACTIVE
Memory Layer: OPERATIONAL
SHA3-256 Content-Addressed
vault://sha3-256/a7f3c2...
Merkle root: e9b1d4f2...
4 Trust Tiers · 4 Data Classifications · 6 Lifecycle States · 3 Memory Layers


Vector databases store embeddings. qp-vault governs knowledge: trust tiers, lifecycle management, integrity verification, and cryptographic audit trails.

Vector Databases
No trust tiers

All documents are equal. A meeting note ranks the same as an official SOP.

No audit trail

Who searched what? When? No record.

No lifecycle

Documents never expire. Stale data pollutes results forever.

No classification

Any document can be sent to a cloud API. Nothing keeps sensitive data on local models.

No integrity verification

Data corruption goes undetected. No proof of authenticity.

"Which version of the SOP was active on March 15?" Silence.
qp-vault
Trust tiers affect ranking

CANONICAL (1.5x) outranks EPHEMERAL (0.7x) in every search.

Every read verified

SHA3-256 integrity check on every retrieval. Tampering is detected.

Lifecycle management

DRAFT -> REVIEW -> ACTIVE -> SUPERSEDED -> ARCHIVED. Auto-expiration.

Classification controls routing

CONFIDENTIAL stays on local models. RESTRICTED logs every read.

Merkle tree integrity

Full vault verification. Proof export for auditors.

vault.search("SOP", as_of=date(2024, 3, 15)) → v2.1
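
As a rough illustration of classification-based routing, the sketch below keeps CONFIDENTIAL documents on local models and records every read of RESTRICTED ones. It is not qp-vault's implementation: the `Classification` enum, the `route` function, the `audit_log` list, and the `PUBLIC` level are hypothetical names, and treating RESTRICTED as local-only is an assumption the text does not state.

```python
from enum import Enum


class Classification(Enum):
    PUBLIC = "public"              # assumed name; the text does not list all four levels
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    RESTRICTED = "restricted"


# Assumption: RESTRICTED is at least as strict as CONFIDENTIAL.
LOCAL_ONLY = {Classification.CONFIDENTIAL, Classification.RESTRICTED}

audit_log: list[str] = []          # stand-in for a real audit trail


def route(classification: Classification, resource_id: str) -> str:
    """Return "local" when only local models may see the document, else "any"."""
    if classification is Classification.RESTRICTED:
        audit_log.append(f"read {resource_id}")  # every RESTRICTED read is recorded
    return "local" if classification in LOCAL_ONLY else "any"


print(route(Classification.INTERNAL, "handbook"))
print(route(Classification.CONFIDENTIAL, "phi-001"))
print(route(Classification.RESTRICTED, "contract-7"))
print(audit_log)
```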

Feature Comparison

Feature · ChromaDB · Qdrant · Weaviate · qp-vault
Trust tiers on documents
Cryptographic audit trail
Content-addressed storage
Knowledge lifecycle management
Post-quantum encryption
Air-gap first
Integrity verification on read
Merkle tree proof export
Trust Architecture

Every Document Has a Trust Tier

Not all knowledge is equal. Official SOPs outrank meeting notes. Trust tiers encode institutional knowledge about document authority.

Search Weight Impact
CANONICAL · 1.5x
WORKING · 1.0x
EPHEMERAL · 0.7x
ARCHIVED · 0.25x
Hybrid Search

Search That Knows What to Trust

Combine vector similarity (0.7) and text rank (0.3), then multiply by trust weight and freshness. CANONICAL documents surface above EPHEMERAL, even with similar semantic match.

vault_search.py
# Hybrid search with trust-weighted ranking
results = vault.search("Q3 revenue projections")
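
The ranking described above can be sketched in a few lines. This is an illustrative model, not qp-vault's code: the 0.7 / 0.3 blend and the trust multipliers come from this page, while the `Candidate` class, the `hybrid_score` function, the example file names, and the assumption that every score (including freshness) is normalized to the range 0 to 1 are hypothetical.

```python
from dataclasses import dataclass

# Trust multipliers as listed on this page.
TRUST_WEIGHTS = {
    "CANONICAL": 1.5,
    "WORKING": 1.0,
    "EPHEMERAL": 0.7,
    "ARCHIVED": 0.25,
}

VECTOR_WEIGHT = 0.7  # share given to vector similarity
TEXT_WEIGHT = 0.3    # share given to text rank


@dataclass
class Candidate:
    """Hypothetical search candidate; scores are assumed to lie in [0, 1]."""
    name: str
    vector_similarity: float
    text_rank: float
    trust_tier: str
    freshness: float = 1.0  # assumed multiplier; 1.0 means fully fresh


def hybrid_score(c: Candidate) -> float:
    """Blend the two relevance signals, then scale by trust and freshness."""
    relevance = VECTOR_WEIGHT * c.vector_similarity + TEXT_WEIGHT * c.text_rank
    return relevance * TRUST_WEIGHTS[c.trust_tier] * c.freshness


candidates = [
    Candidate("meeting-notes.md", 0.82, 0.80, "EPHEMERAL"),
    Candidate("finance-sop.pdf", 0.80, 0.80, "CANONICAL"),
]
ranked = sorted(candidates, key=hybrid_score, reverse=True)
for c in ranked:
    print(f"{c.name}: {hybrid_score(c):.3f}")
```

The meeting note has the slightly better semantic match, yet the CANONICAL document ranks first because its trust multiplier is more than double.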
Knowledge Lifecycle

Knowledge That Evolves

Documents move through a governed lifecycle. Active policies supersede old ones. Expired documents auto-transition. Point-in-time queries retrieve what was true at any date.

DRAFT

In preparation

Transitions to: REVIEW
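
One way to picture a governed lifecycle is as a small state machine. The sketch below encodes only the forward chain named on this page (DRAFT, REVIEW, ACTIVE, SUPERSEDED, ARCHIVED); the page counts six lifecycle states, and the one not named in that chain is left out rather than guessed. The `Lifecycle` enum, the `ALLOWED` table, and the `transition` function are hypothetical names, not qp-vault's API.

```python
from enum import Enum


class Lifecycle(Enum):
    DRAFT = "draft"
    REVIEW = "review"
    ACTIVE = "active"
    SUPERSEDED = "superseded"
    ARCHIVED = "archived"


# Forward edges only. Other edges (for example REVIEW back to DRAFT) are not
# described in the text, so this sketch rejects them.
ALLOWED = {
    Lifecycle.DRAFT: {Lifecycle.REVIEW},
    Lifecycle.REVIEW: {Lifecycle.ACTIVE},
    Lifecycle.ACTIVE: {Lifecycle.SUPERSEDED},
    Lifecycle.SUPERSEDED: {Lifecycle.ARCHIVED},
    Lifecycle.ARCHIVED: set(),
}


def transition(current: Lifecycle, target: Lifecycle) -> Lifecycle:
    """Return the new state, or raise if the move skips a governed step."""
    if target not in ALLOWED[current]:
        raise ValueError(f"{current.name} -> {target.name} is not allowed")
    return target


state = transition(Lifecycle.DRAFT, Lifecycle.REVIEW)
print(state.name)
```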

point_in_time.py
# What was our policy on March 15, 2024?
from datetime import date
results = vault.search(
    "incident response procedure",
    as_of=date(2024, 3, 15),
)
# Returns v2.1 (ACTIVE on that date), not v3.0 (published April 2024)
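
A point-in-time lookup can be thought of as picking the version whose active window contains the requested date. The sketch below shows that idea under stated assumptions: the `Version` class and `version_as_of` function are hypothetical, and the exact dates are invented for illustration (the page says only that v3.0 was published in April 2024).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Version:
    label: str
    active_from: date
    active_until: Optional[date] = None  # None means still active


def version_as_of(versions: list[Version], as_of: date) -> Optional[Version]:
    """Return the version whose active window contains `as_of`, if any."""
    for v in versions:
        ended = v.active_until is not None and as_of >= v.active_until
        if v.active_from <= as_of and not ended:
            return v
    return None


# Illustrative dates only; the exact days are made up for this example.
history = [
    Version("v2.1", date(2023, 9, 1), date(2024, 4, 10)),
    Version("v3.0", date(2024, 4, 10)),
]

match = version_as_of(history, date(2024, 3, 15))
print(match.label if match else "no active version")
```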
Memory Architecture

Three Layers of Organizational Memory

Separate operational procedures from strategic decisions from compliance evidence. Each layer has its own search context and access patterns.

Example Files
deploy-runbook.md · incident-response.pdf
vault.layer(MemoryLayer.OPERATIONAL).add("deploy-runbook.md")
Example Files
adr-001-postgres.md · okr-q4-2026.md
vault.layer(MemoryLayer.STRATEGIC).add("adr-001-postgres.md")
Example Files
soc2-audit-2025.pdf · hipaa-checklist.xlsx
vault.layer(MemoryLayer.COMPLIANCE).add("soc2-audit-2025.pdf")
Merkle Verification

Integrity Verification on Every Read

Content-addressed storage with SHA3-256 CIDs. A Merkle tree covers the entire vault. Verify one document, or export a cryptographic proof for auditors.

Content-Addressed Storage

Every document is referenced by its SHA3-256 hash. The address IS the content fingerprint.

vault://sha3-256/a7f3c2e9b1d4...
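
Content addressing and verify-on-read can be shown with nothing but `hashlib`. This is a minimal sketch rather than qp-vault's implementation: the `vault://sha3-256/` prefix is taken from this page, while the `content_id` and `verify_on_read` helpers are hypothetical names.

```python
import hashlib

CID_PREFIX = "vault://sha3-256/"  # address scheme shown on this page


def content_id(data: bytes) -> str:
    """Derive the address from the bytes themselves."""
    return CID_PREFIX + hashlib.sha3_256(data).hexdigest()


def verify_on_read(cid: str, data: bytes) -> bool:
    """Recompute the hash on retrieval and compare it with the stored address."""
    return content_id(data) == cid


original = b"Official company handbook."
cid = content_id(original)
print(cid)
print(verify_on_read(cid, original))                        # unchanged bytes
print(verify_on_read(cid, b"Official company handbook!"))   # one byte changed
```

Because the address is derived from the content, any change to the stored bytes makes the recomputed address differ from the one that was recorded.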

Merkle Tree

Leaf hashes combine into branch hashes into a single root. Change one byte and the root changes.

root: e9b1d4f2c8a7...

Proof Export

Extract a Merkle inclusion proof for any document. Hand it to an auditor. They verify independently.

vault.export_proof(resource_id)
root: e9b1d4f2
h(a7f3+8b1d)
h(c9e5+2d7a)
verification.py
result = vault.verify() # Entire vault
proof = vault.export_proof("resource-id") # For auditors
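
To make the idea of an inclusion proof concrete, here is a small self-contained Merkle tree over SHA3-256. It is a sketch of the general technique, not the format `vault.export_proof` produces: the function names are hypothetical, and duplicating the last node on odd-sized levels is an assumed padding rule.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree: leaf hashes first, root last."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # odd count: duplicate the last node
            level = level + [level[-1]]
            levels[-1] = level             # keep the padded level for proofs
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def inclusion_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) pairs from leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1                # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Rebuild the path to the root using only the leaf and the proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


docs = [b"handbook", b"runbook", b"adr-001"]
levels = build_levels(docs)
root = levels[-1][0]
proof = inclusion_proof(levels, 1)

print(verify_proof(b"runbook", proof, root))    # document is in the vault
print(verify_proof(b"tampered", proof, root))   # altered content fails
```

An auditor needs only the document, the proof, and the published root to run `verify_proof`; no access to the rest of the vault is required.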
Quick Start

Three Lines to Governed Knowledge

Install. Add a document with a trust tier. Search with trust-weighted ranking. No configuration. No external services. Works offline.

pip install qp-vault

SQLite, text files, basic search

pip install qp-vault[postgres]

PostgreSQL + pgvector hybrid search

pip install qp-vault[docling]

25+ format document processing

pip install qp-vault[capsule]

qp-capsule cryptographic audit trail

pip install qp-vault[encryption]

AES-256-GCM + ML-KEM-768

pip install qp-vault[all]

Everything

quick_start.py
from qp_vault import Vault
vault = Vault("./my-knowledge")
vault.add("quarterly-report.pdf", trust="canonical")
results = vault.search("Q3 revenue projections")
print(results[0].content, results[0].trust_tier)
# CANONICAL trust tier. SHA3-256 verified. Air-gap ready. ✓
CLI

Command Line, Full Control

vault init ./org-knowledge

Initialize a new vault

vault add report.pdf --trust canonical

Add a document with trust tier

vault search "revenue projections"

Hybrid search across all documents

vault verify

Verify entire vault integrity (Merkle tree)

vault health

Compute composite health score

vault expiring --days 90

List documents expiring soon

Minimal by design

One Required Dependency. Storage Opt-In.

SHA3-256 hashing, canonical serialization, and async I/O all come from Python's standard library. Pydantic is the only package Vault always needs. A storage backend is a single pip extra.

Pydantic

required
≥ 2.12 · Schema-first validation

Every document, every trust policy, every Merkle node is a typed Pydantic model. Runtime validation on every boundary, typed interfaces throughout, zero ORM pollution.

  • Typed Vault models
  • Runtime validation at boundaries
  • Deterministic serialization

aiosqlite

sqlite extra
≥ 0.22 · Async SQLite wrapper

Vault storage with zero setup. A single file on disk, content-addressed by SHA3-256. Outgrow a single node? Swap to the postgres extra for Postgres plus pgvector, no code changes.

  • Async document persistence
  • Content-addressed storage
  • Swap to Postgres without code changes

Everything else comes from Python's standard library:

hashlib · SHA3-256 content IDs
json · Canonical serialization
pathlib · Filesystem layout
asyncio · Async runtime
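
As an illustration of how `json` and `hashlib` can combine into stable content IDs, the sketch below serializes a record with sorted keys and fixed separators before hashing. The exact canonical form qp-vault uses is not described on this page, so the options chosen here, and the `canonical_bytes` and `record_id` helpers, are assumptions.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize so that equal records always produce identical bytes."""
    return json.dumps(
        record,
        sort_keys=True,          # key order no longer matters
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=False,
    ).encode("utf-8")


def record_id(record: dict) -> str:
    return hashlib.sha3_256(canonical_bytes(record)).hexdigest()


a = {"name": "handbook", "trust": "canonical"}
b = {"trust": "canonical", "name": "handbook"}
print(canonical_bytes(a))
print(record_id(a) == record_id(b))  # same fields, different key order
```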

pip install qp-vault[postgres] · [capsule] · [local] · [openai] · [pq]

Scale into Postgres, seal to Capsule, run local embeddings, add post-quantum signatures. Each one a single extra, each one opt-in.

Knowledge Health

Know Your Knowledge Health

Five metrics that tell you if your knowledge base is healthy, stale, redundant, disconnected, or misclassified. One composite score.

vault health
Healthy
83 Overall

Built For People Who Need Governed Knowledge

⚖️

Legal Teams

Trust tiers on contracts. Lifecycle management. Point-in-time queries for what was active on any date.

🏥

Healthcare

CONFIDENTIAL classification routes PHI to local models only. Every search on RESTRICTED data is logged.

💰

Finance

Audit trail on every search. Merkle proof for regulators. Content-addressed storage for tamper evidence.

🔬

Researchers

Memory layers for strategic knowledge. Hybrid search across everything. Semantic chunking preserves context.

📋

Compliance

Auto-expiration. Lifecycle transitions. Proof export. Every read verified. Capsule integration for audit trails.

🛠️

Developers

Plugin architecture. 5 protocols: StorageBackend, EmbeddingProvider, AuditProvider, ParserProvider, PolicyProvider.

📚

Every Fact Has Provenance.
Every Read Is Verified.

Trust tiers. Lifecycle management. Merkle tree integrity. Content-addressed storage. Your knowledge, governed and verifiable.

Trust Tiers · Merkle Tree · Content-Addressed · Air-Gap Ready · Apache 2.0