Skip to content
Navigation

Reflection framework with LLM-powered analysis, insight extraction, and suggestion generation. Provides a three-step pipeline: analyze, insight, suggest.

Module Path

python
from orbiter.eval.reflection import (
    ReflectionType,
    ReflectionLevel,
    ReflectionResult,
    ReflectionHistory,
    Reflector,
    GeneralReflector,
)

ReflectionType

Category of a reflection result.

python
class ReflectionType(StrEnum):
    SUCCESS = "success"
    FAILURE = "failure"
    OPTIMIZATION = "optimization"
    PATTERN = "pattern"
    INSIGHT = "insight"
ValueDescription
SUCCESSReflection on a successful outcome
FAILUREReflection on a failed outcome
OPTIMIZATIONOpportunities for improvement
PATTERNRecurring patterns detected
INSIGHTGeneral insights and learnings

ReflectionLevel

Depth of reflection analysis.

python
class ReflectionLevel(StrEnum):
    SHALLOW = "shallow"
    MEDIUM = "medium"
    DEEP = "deep"
    META = "meta"
ValueDescription
SHALLOWSurface-level observation
MEDIUMStandard analysis depth
DEEPIn-depth root cause analysis
METAMeta-level reflection on the reflection process itself

ReflectionResult

Output of a single reflection pass.

Decorator: @dataclass(frozen=True, slots=True)

Constructor

python
ReflectionResult(
    reflection_type: ReflectionType,
    level: ReflectionLevel,
    summary: str,
    key_findings: list[str] = [],
    root_causes: list[str] = [],
    insights: list[str] = [],
    suggestions: list[str] = [],
    metadata: dict[str, Any] = {},
)
FieldTypeDefaultDescription
reflection_typeReflectionType(required)Category of this reflection
levelReflectionLevel(required)Depth of analysis
summarystr(required)One-sentence overview
key_findingslist[str][]Notable observations
root_causeslist[str][]Underlying reasons for outcomes
insightslist[str][]Patterns and learnings
suggestionslist[str][]Actionable improvements
metadatadict[str, Any]{}Additional context

ReflectionHistory

Tracks reflection results over time with aggregate statistics.

Decorator: @dataclass(slots=True)

Constructor

python
ReflectionHistory()

Attributes

AttributeTypeDefaultDescription
total_countint0Total reflections recorded
success_countint0Count of SUCCESS type reflections
failure_countint0Count of FAILURE type reflections

Methods

add()

python
def add(self, result: ReflectionResult) -> None

Append a result and update counters. Increments success_count for ReflectionType.SUCCESS and failure_count for ReflectionType.FAILURE.

get_recent()

python
def get_recent(self, n: int = 5) -> list[ReflectionResult]

Return the n most recent reflections.

ParameterTypeDefaultDescription
nint5Number of recent entries to return

get_by_type()

python
def get_by_type(self, rtype: ReflectionType) -> list[ReflectionResult]

Return all reflections matching the given type.

ParameterTypeDescription
rtypeReflectionTypeType to filter by

summarize()

python
def summarize(self) -> dict[str, Any]

Return aggregate statistics.

Returns: Dict with keys:

KeyTypeDescription
totalintTotal reflections
successintSuccess count
failureintFailure count
typesdict[str, int]Count per ReflectionType value

Reflector (ABC)

Abstract reflector with a three-step template: analyze, insight, suggest.

Constructor

python
Reflector(
    name: str = "reflector",
    reflection_type: ReflectionType = ReflectionType.INSIGHT,
    level: ReflectionLevel = ReflectionLevel.MEDIUM,
)
ParameterTypeDefaultDescription
namestr"reflector"Reflector name
reflection_typeReflectionTypeINSIGHTDefault category for results
levelReflectionLevelMEDIUMDefault analysis depth

Attributes

AttributeTypeDescription
namestrReflector name
reflection_typeReflectionTypeDefault reflection category
levelReflectionLevelDefault analysis depth

Methods

reflect()

python
async def reflect(self, context: dict[str, Any]) -> ReflectionResult

Run the full three-step reflection pipeline: analyze() -> insight() -> suggest().

ParameterTypeDescription
contextdict[str, Any]Execution context to reflect on

Returns: ReflectionResult assembled from the three steps.

Behavior:

  1. Calls analyze(context) to extract facts and key findings
  2. Calls insight(analysis) to derive insights
  3. Calls suggest(analysis) to generate actionable suggestions
  4. Assembles ReflectionResult from the combined outputs

analyze() (abstract)

python
async def analyze(self, context: dict[str, Any]) -> dict[str, Any]

Step 1: Extract facts and key findings from the execution context.

Returns: Dict with keys like "summary", "key_findings", "root_causes", "insights", "suggestions".

insight()

python
async def insight(self, analysis: dict[str, Any]) -> dict[str, Any]

Step 2: Derive insights from the analysis. Default implementation passes through analysis["insights"].

suggest()

python
async def suggest(self, insights: dict[str, Any]) -> dict[str, Any]

Step 3: Generate actionable suggestions from insights. Default implementation passes through insights["suggestions"].


GeneralReflector

LLM-powered reflector using a judge callable (prompt: str) -> str.

Constructor

python
GeneralReflector(
    judge: Any = None,
    *,
    system_prompt: str | None = None,
    name: str = "general_reflector",
    reflection_type: ReflectionType = ReflectionType.INSIGHT,
    level: ReflectionLevel = ReflectionLevel.DEEP,
)
ParameterTypeDefaultDescription
judgeAnyNoneAsync callable (prompt: str) -> str
system_promptstr | NoneNoneCustom system prompt (uses default if None)
namestr"general_reflector"Reflector name
reflection_typeReflectionTypeINSIGHTDefault reflection category
levelReflectionLevelDEEPDefault analysis depth

Default System Prompt

The built-in system prompt instructs the LLM to provide a structured reflection with five sections (Summary, Key Findings, Root Causes, Insights, Suggestions) and return JSON.

Context Keys

The analyze() method reads these keys from the context dict:

KeyUsed For
inputFormatted as [Input] section
outputFormatted as [Output] section
errorFormatted as [Error] section
iterationFormatted as [Iteration] marker

Overridden Methods

analyze()

Calls the LLM judge with the execution context and parses the response. Returns parsed JSON or a fallback dict with "parse_error": True.

insight()

Passes through insights from the LLM analysis result.

suggest()

Passes through suggestions from the LLM analysis result (already extracted in analyze).

Example

python
import asyncio
from orbiter.eval import (
    GeneralReflector,
    ReflectionHistory,
    ReflectionType,
)

async def my_judge(prompt: str) -> str:
    return '''{
        "summary": "The agent failed to handle edge cases in the input.",
        "key_findings": ["Missing null check", "No retry logic"],
        "root_causes": ["Insufficient error handling"],
        "insights": ["Edge cases need explicit handling"],
        "suggestions": ["Add input validation", "Implement retry with backoff"]
    }'''

async def main():
    reflector = GeneralReflector(
        judge=my_judge,
        reflection_type=ReflectionType.FAILURE,
    )

    context = {
        "input": "Process this data: null",
        "output": "Error: NoneType has no attribute 'strip'",
        "error": "AttributeError",
        "iteration": 3,
    }

    result = await reflector.reflect(context)
    print(f"Type: {result.reflection_type}")  # failure
    print(f"Summary: {result.summary}")
    print(f"Suggestions: {result.suggestions}")

    # Track in history
    history = ReflectionHistory()
    history.add(result)
    print(history.summarize())
    # {'total': 1, 'success': 0, 'failure': 1, 'types': {...}}

asyncio.run(main())