In-memory artifact indexing with chunking and keyword search. This is an internal module used by Workspace for auto-indexing.

Module Path

python
from orbiter.context._internal.knowledge import (
    KnowledgeStore,
    KnowledgeError,
    Chunk,
    SearchResult,
    chunk_text,
)

KnowledgeError

Exception raised for knowledge store operation errors.

python
class KnowledgeError(Exception): ...

Chunk

A segment of an artifact’s content.

Decorator: @dataclass(frozen=True, slots=True)

Fields

| Field | Type | Description |
| --- | --- | --- |
| artifact_name | str | Name of the source artifact |
| index | int | Chunk index within the artifact |
| content | str | The chunk text |
| start | int | Start character position in the original content |
| end | int | End character position in the original content |

SearchResult

A single search hit with relevance score.

Decorator: @dataclass(frozen=True, slots=True)

Fields

| Field | Type | Description |
| --- | --- | --- |
| chunk | Chunk | The matching chunk |
| score | float | TF-IDF-like relevance score |

chunk_text()

Split text into overlapping segments.

python
def chunk_text(
    text: str,
    *,
    chunk_size: int = 512,
    chunk_overlap: int = 64,
) -> list[str]

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | (required) | Text to chunk |
| chunk_size | int | 512 | Maximum characters per chunk (must be positive) |
| chunk_overlap | int | 64 | Overlap between consecutive chunks (must be in [0, chunk_size)) |

Returns: List of chunk strings. Empty list for empty text. Single-element list if text fits in one chunk.

Raises: KnowledgeError if chunk_size <= 0 or chunk_overlap >= chunk_size.
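
The documented contract can be illustrated with a sliding-window sketch. This is not the module's implementation. sketch_chunk_text and the local KnowledgeError are hypothetical stand-ins, and the sketch assumes that each window advances by chunk_size - chunk_overlap characters and that the final chunk may be shorter than chunk_size.

```python
class KnowledgeError(Exception):
    """Local stand-in for the module's exception so the sketch runs on its own."""


def sketch_chunk_text(
    text: str,
    *,
    chunk_size: int = 512,
    chunk_overlap: int = 64,
) -> list[str]:
    if chunk_size <= 0:
        raise KnowledgeError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise KnowledgeError("chunk_overlap must be in [0, chunk_size)")
    if not text:
        return []  # empty text yields no chunks
    if len(text) <= chunk_size:
        return [text]  # text fits in a single chunk

    # Assumption: consecutive windows share exactly chunk_overlap characters.
    step = chunk_size - chunk_overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


pieces = sketch_chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1)
# pieces == ["abcd", "defg", "ghij"]
```

Requiring chunk_overlap < chunk_size keeps the step positive, so the loop always makes progress.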


KnowledgeStore

In-memory artifact index with chunking and keyword search.

Constructor

python
KnowledgeStore(
    *,
    chunk_size: int = 512,
    chunk_overlap: int = 64,
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| chunk_size | int | 512 | Maximum characters per chunk |
| chunk_overlap | int | 64 | Overlap between consecutive chunks |

Properties

| Property | Type | Description |
| --- | --- | --- |
| chunk_size | int | Configured chunk size |
| chunk_overlap | int | Configured chunk overlap |
| artifact_names | list[str] | Names of indexed artifacts |

Index Methods

add()

python
def add(self, name: str, content: str) -> list[Chunk]

Index an artifact’s content. Re-indexes if already present.

| Parameter | Type | Description |
| --- | --- | --- |
| name | str | Artifact name (must be non-empty) |
| content | str | Content to index |

Returns: List of created Chunk objects.

Raises: KnowledgeError if name is empty.

remove()

python
def remove(self, name: str) -> bool

Remove an artifact from the index. Returns True if the artifact was removed, False if it was not indexed.

get()

python
def get(self, name: str) -> list[Chunk]

Get all chunks for an artifact. Returns an empty list if the artifact is not indexed.

get_range()

python
def get_range(self, name: str, start: int, end: int) -> list[Chunk]

Get chunks within a character range [start, end) for an artifact.

Search Methods

search()

python
def search(self, query: str, *, top_k: int = 5) -> list[SearchResult]

Keyword search across all indexed artifacts. Each chunk is scored with a TF-IDF-like formula: the sum of log(1 + tf) over the query terms that occur in it, where tf is the term's frequency in the chunk. Returns up to top_k results sorted by descending score.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | (required) | Search query string |
| top_k | int | 5 | Maximum results to return |

Returns: List of SearchResult sorted by descending score.
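
The scoring rule can be sketched in a few lines. The formula sum(log(1 + tf)) comes from the description above. Everything else is an assumption made for illustration: the tokenizer (lowercased word tokens), counting each distinct query term once, and dropping chunks with a score of zero. The helper names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase word tokens; the real tokenizer is not documented.
    return re.findall(r"\w+", text.lower())


def sketch_score(query: str, content: str) -> float:
    tf = Counter(tokenize(content))
    # Sum log(1 + tf) over the distinct query terms that occur in the chunk.
    return sum(
        (math.log(1 + tf[term]) for term in set(tokenize(query)) if tf[term]),
        0.0,
    )


def sketch_search(query: str, chunks: list[str], top_k: int = 5) -> list[tuple[float, str]]:
    scored = [(sketch_score(query, c), c) for c in chunks]
    # Assumption: chunks that match no query term are left out of the results.
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits[:top_k]


chunks = ["python python guide", "python api", "rust handbook"]
hits = sketch_search("python guide", chunks, top_k=2)
```

The logarithm dampens repeated terms: a chunk that mentions a term twice scores log(3), not twice the log(2) of a single mention.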

Introspection Methods

total_chunks()

python
def total_chunks(self) -> int

Total number of chunks across all artifacts.

Dunder Methods

| Method | Description |
| --- | --- |
| __len__ | Number of indexed artifacts |
| __repr__ | KnowledgeStore(artifacts=3, chunks=15, chunk_size=512) |
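
Note that len() counts artifacts while total_chunks() counts chunks. The toy class below illustrates that distinction together with the re-index and remove behaviour described earlier. MiniStore is a hypothetical stand-in that takes pre-split pieces and raises ValueError, not KnowledgeError. It is not the real KnowledgeStore.

```python
class MiniStore:
    """Hypothetical stand-in used only to illustrate the counting semantics."""

    def __init__(self) -> None:
        self._chunks: dict[str, list[str]] = {}

    def add(self, name: str, pieces: list[str]) -> None:
        if not name:
            raise ValueError("name must be non-empty")
        # Assigning to the key replaces any earlier chunks (re-indexing).
        self._chunks[name] = list(pieces)

    def remove(self, name: str) -> bool:
        # True only if the artifact was present.
        return self._chunks.pop(name, None) is not None

    def total_chunks(self) -> int:
        return sum(len(pieces) for pieces in self._chunks.values())

    def __len__(self) -> int:
        return len(self._chunks)

    def __repr__(self) -> str:
        return f"MiniStore(artifacts={len(self)}, chunks={self.total_chunks()})"


store = MiniStore()
store.add("readme", ["a", "b", "c"])
store.add("guide", ["d", "e"])
store.add("readme", ["a2", "b2"])  # re-index: replaces the three earlier chunks

summary = repr(store)
removed_missing = store.remove("missing")
removed_guide = store.remove("guide")
```

After the re-index the store holds two artifacts and four chunks, so summary reads MiniStore(artifacts=2, chunks=4).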

Example

python
from orbiter.context._internal.knowledge import KnowledgeStore

store = KnowledgeStore(chunk_size=100, chunk_overlap=20)

# Index artifacts
store.add("readme", "This is a long document about Python programming...")
store.add("guide", "A guide to using the API with examples...")

# Search
results = store.search("Python programming", top_k=3)
for r in results:
    print(f"[{r.chunk.artifact_name}#{r.chunk.index}] score={r.score:.2f}")
    print(f"  {r.chunk.content[:80]}...")

# Get chunks for a specific artifact
chunks = store.get("readme")
print(f"readme has {len(chunks)} chunks")

# Range query
range_chunks = store.get_range("readme", start=0, end=200)

Integration with Workspace

When a KnowledgeStore is attached to a Workspace, artifacts are auto-indexed on write and de-indexed on delete:

python
import asyncio

from orbiter.context.workspace import Workspace
from orbiter.context._internal.knowledge import KnowledgeStore


async def main() -> None:
    ks = KnowledgeStore()
    ws = Workspace("ws-1", knowledge_store=ks)

    # write() is awaited, so it has to run inside a coroutine
    await ws.write("doc.md", "# Title\nSome content about AI...")
    results = ks.search("AI")  # finds the indexed content


asyncio.run(main())