Knowledge Store
The KnowledgeStore provides text chunking and TF-IDF-like search for ingested documents. It is the retrieval layer that powers the KnowledgeNeuron and the context knowledge tools, letting agents se...
The KnowledgeStore provides text chunking and TF-IDF-like search for ingested documents. It is the retrieval layer that powers the KnowledgeNeuron and the context knowledge tools, letting agents search over project documentation, tool outputs, and any other text content.
Basic Usage
from orbiter.context._internal.knowledge import KnowledgeStore
store = KnowledgeStore()
# Add documents
store.add("readme", "Orbiter is a multi-agent framework for building AI applications.")
store.add("guide", "To create an agent, use the Agent class with a model and tools.")
# Search with TF-IDF scoring
results = store.search("create agent")
for result in results:
print(f"{result.key}: {result.score:.3f}")
print(f" {result.content[:80]}...")Text Chunking
When ingesting large documents, use chunk_text() to split them into overlapping chunks before indexing:
from orbiter.context._internal.knowledge import chunk_text, KnowledgeStore
store = KnowledgeStore()
# Split a large document into chunks
document = open("long_document.txt").read()
chunks = chunk_text(document, chunk_size=500, overlap=50)
# Index each chunk
for i, chunk in enumerate(chunks):
store.add(f"doc_chunk_{i}", chunk)
# Search across all chunks
results = store.search("specific topic", limit=5)The chunk_text() function:
- chunk_size — maximum characters per chunk.
- overlap — number of characters shared between consecutive chunks for context continuity.
Search
The search() method uses TF-IDF-like scoring to rank documents by relevance:
results = store.search("query terms", limit=10)
for r in results:
print(r.key) # document identifier
print(r.content) # full chunk content
print(r.score) # relevance score (0.0 to 1.0)Each SearchResult is a dataclass with:
key— the identifier used when adding the document.content— the full text of the matched chunk.score— TF-IDF relevance score.
Document Management
store = KnowledgeStore()
# Add a document
store.add("doc-1", "Content of document 1")
# Get a specific document by key
content = store.get("doc-1") # "Content of document 1" or None
# Get a range of documents
docs = store.get_range(start=0, end=5)
# Remove a document
store.remove("doc-1")Integration with Workspace
The Workspace can automatically index artifacts into a KnowledgeStore:
from orbiter.context import Workspace
from orbiter.context._internal.knowledge import KnowledgeStore
knowledge = KnowledgeStore()
workspace = Workspace(base_dir="/tmp/project", knowledge_store=knowledge)
# Writing to workspace auto-indexes the content
workspace.write("design.md", "The system uses event-driven architecture...")
# Now searchable via the knowledge store
results = knowledge.search("event-driven")See the Workspace guide for more.
Integration with Neurons
The built-in KnowledgeNeuron (priority 20) retrieves relevant knowledge during prompt building:
from orbiter.context import PromptBuilder
builder = PromptBuilder(ctx)
builder.add("knowledge") # retrieves relevant docs based on current task
prompt = builder.build()Integration with Context Tools
The knowledge tools let agents search the knowledge store directly:
from orbiter.context import get_knowledge_tools
tools = get_knowledge_tools()
# Returns:
# get_knowledge -- retrieve a document by key
# grep_knowledge -- keyword search across documents
# search_knowledge -- TF-IDF ranked searchSee the Context Tools guide for more.
Advanced Patterns
Incremental Ingestion
Add documents as the agent discovers them during execution:
async def ingest_tool_result(tool_name: str, result: str, store: KnowledgeStore):
"""Index tool results for future retrieval."""
key = f"tool:{tool_name}:{hash(result) % 10000}"
store.add(key, result)Multi-Source Knowledge Base
Combine multiple document sources into a single store:
store = KnowledgeStore()
# Index project documentation
for path in Path("docs/").glob("*.md"):
content = path.read_text()
chunks = chunk_text(content, chunk_size=500, overlap=50)
for i, chunk in enumerate(chunks):
store.add(f"{path.stem}_{i}", chunk)
# Index API responses
store.add("api_users", api_response_text)
# Search across everything
results = store.search("authentication flow")Scoped Search with Prefixes
Use key prefixes to scope searches to specific document categories:
# Add with prefixed keys
store.add("code:main.py", code_content)
store.add("docs:readme", docs_content)
store.add("tool:search_result", tool_output)
# Filter results by prefix
all_results = store.search("function definition")
code_results = [r for r in all_results if r.key.startswith("code:")]API Summary
| Symbol | Module | Description |
|---|---|---|
KnowledgeStore | orbiter.context._internal.knowledge | Text storage with TF-IDF search |
KnowledgeStore.add(key, content) | Add or update a document | |
KnowledgeStore.get(key) | Retrieve a document by key | |
KnowledgeStore.get_range(start, end) | Get documents by index range | |
KnowledgeStore.remove(key) | Remove a document | |
KnowledgeStore.search(query, limit) | TF-IDF ranked search | |
chunk_text(text, chunk_size, overlap) | Split text into overlapping chunks | |
SearchResult | orbiter.context._internal.knowledge | Dataclass: key, content, score |
Chunk | orbiter.context._internal.knowledge | Dataclass representing a text chunk |