Architecture Decision Records — Orbiter Web
This document records the key architectural decisions made for Orbiter Web and the reasoning behind each choice.
This document records the key architectural decisions made for Orbiter Web and the reasoning behind each choice.
ADR-001: SQLite as the Primary Database
Status: Accepted
Context: Orbiter Web needs a database for storing agent configurations, workflow state, conversation history, audit logs, and more. Options considered: PostgreSQL, MySQL, SQLite.
Decision: Use SQLite with WAL (Write-Ahead Logging) mode via aiosqlite.
Rationale:
- Zero-ops deployment — No separate database process to install, configure, or maintain. The database is a single file (
orbiter.db). - Single-unit architecture — Orbiter Web is designed as a single deployable unit (frontend + backend + database). SQLite aligns perfectly with this goal.
- WAL mode enables concurrent reads while a write is in progress, providing sufficient concurrency for the expected workload (single-user or small-team usage).
- Foreign keys are enabled at connection time (
PRAGMA foreign_keys=ON) for referential integrity. - Performance — SQLite is faster than network databases for the read-heavy, low-write workload typical of an AI agent platform.
Trade-offs:
- No built-in replication — scaling to multiple server instances would require migrating to PostgreSQL.
- Write concurrency is limited to one writer at a time (WAL mode mitigates read concurrency).
- Large-scale deployments (hundreds of concurrent users) would benefit from a client-server database.
ADR-002: FTS5 for Full-Text Search
Status: Accepted
Context: Users need to search across agents, workflows, and knowledge bases by name and content.
Decision: Use SQLite FTS5 (Full-Text Search 5) virtual tables for search functionality.
Rationale:
- No external dependency — FTS5 is built into SQLite, requiring no Elasticsearch or other search service.
- Automatic sync — FTS5 tables are populated alongside the main tables in the same transaction.
- Good enough — For the expected data volumes (hundreds to thousands of items), FTS5 provides fast, relevant search results.
- Term frequency scoring — Built-in
bm25()ranking function provides relevance-ordered results.
Implementation: Three FTS5 tables (search_agents, search_workflows, search_knowledge_bases) created in migration 042_create_fts5_tables.sql. Keyword search falls back to LIKE queries with term frequency scoring for non-FTS fields.
ADR-003: Single Multiplexed WebSocket
Status: Accepted
Context: Orbiter Web has multiple real-time features: agent chat streaming, workflow execution monitoring, log tailing, sandbox output, and notifications. Each could use a separate WebSocket connection.
Decision: Use a single multiplexed WebSocket at /api/v1/ws with channel-based routing.
Rationale:
- Connection efficiency — Browsers limit concurrent WebSocket connections per domain (typically 6-30). A single connection avoids exhausting this limit.
- Simpler auth — One connection = one authentication handshake. The session cookie is validated once on connect.
- Lower overhead — Each WebSocket connection has TCP and TLS overhead. Multiplexing amortizes this across all real-time features.
- Channel isolation — Clients subscribe only to channels they need. The server only sends messages to subscribed connections.
Protocol:
{"channel": "chat", "type": "token", "payload": {"text": "Hello"}}Channels: system, chat, execution, logs, sandbox, notifications
Trade-off: The server-side WebSocketManager is slightly more complex than separate endpoints, but the code is ~160 lines and well-contained in websocket.py.
ADR-004: Astro 5.x + FastAPI Single Deployable Unit
Status: Accepted
Context: The platform needs both a rich frontend (visual workflow canvas, agent playground) and a Python backend (LLM orchestration via Orbiter framework).
Decision: Bundle Astro 5.x (Node.js SSG) and FastAPI (Python) as a single deployable package. In development, Vite proxies API requests to the FastAPI server. In production, Astro builds static files served alongside the API.
Rationale:
- Unified deployment — One
docker runor process manager command starts everything. No separate frontend and backend deployments to coordinate. - Astro’s strengths — Static site generation for fast page loads, React islands for interactive components (workflow canvas), minimal JavaScript shipped to the client.
- FastAPI’s strengths — Native async Python for LLM provider calls, WebSocket support, Pydantic validation, automatic OpenAPI docs.
- Orbiter framework integration — The backend directly imports
orbiter-coreandorbiter-modelsas workspace dependencies, enabling native Python agent execution.
Trade-offs:
- Dual toolchain (Node.js + Python) increases build complexity.
- The
orbiter-webpackage has bothpackage.jsonandpyproject.toml.
ADR-005: Session Cookies over JWT
Status: Accepted
Context: The authentication system needs to support browser-based access and API key access for CI/CD.
Decision: Use server-side sessions with HttpOnly cookies for browser auth, and API key headers for programmatic access.
Rationale:
- Revocability — Sessions can be immediately invalidated by deleting the row from the
sessionstable. JWTs cannot be revoked until they expire. - CSRF protection — Each session stores a unique CSRF token. The frontend includes it via
X-CSRF-Tokenheader on mutating requests (auto-injected by PageLayout). - No client-side token storage — HttpOnly cookies prevent XSS-based token theft. JWTs in localStorage are vulnerable to XSS.
- Session invalidation on password change — When a user changes their password, all other sessions are deleted immediately.
API key auth (X-API-Key header) is CSRF-exempt and used for CI/CD integrations where cookies are impractical.
ADR-006: Sequential SQL Migrations
Status: Accepted
Context: The database schema evolves as features are added. Need a way to apply schema changes reliably.
Decision: Use numbered sequential .sql files tracked in a _migrations table.
Rationale:
- Simplicity — No migration framework dependency (no Alembic, no Django migrations). Just numbered SQL files.
- Deterministic — Migrations are applied in lexicographic order. Each runs exactly once.
- Automatic —
run_migrations()executes on every application startup. No manual migration commands needed. - Transparent — Each migration is a readable SQL file. Easy to review, audit, and understand.
Convention: Files named NNN_description.sql (e.g., 001_create_users.sql, 042_create_fts5_tables.sql).
Currently at 69 migrations covering the full schema.
ADR-007: API Versioning via URL Prefix
Status: Accepted
Context: API stability is important for frontend compatibility and external integrations.
Decision: Version all API routes under /api/v1/ with APIVersionRedirectMiddleware that 301-redirects legacy /api/* paths to /api/v1/*.
Rationale:
- Explicit versioning — Clients know exactly which API version they’re using.
- Backward compatibility — Old paths redirect rather than 404, giving clients time to migrate.
- Future-proofing — When v2 is needed, v1 routes continue to work alongside v2.
Exception: The health check at /api/health is unversioned for monitoring tools that expect a fixed path.
ADR-008: RBAC with Role Hierarchy
Status: Accepted
Context: Multi-user deployments need access control beyond simple authentication.
Decision: Three hierarchical roles: viewer (0) < developer (1) < admin (2), enforced via require_role() dependency factory.
Rationale:
- Simple hierarchy — Three levels cover the common access patterns without the complexity of per-resource permissions.
- Composable —
require_role("admin")is a standard FastAPIDepends()that can be added to any route. - Additive — Higher roles inherit all capabilities of lower roles. An admin can do everything a developer can do.
Implementation:
_ROLE_HIERARCHY = {"viewer": 0, "developer": 1, "admin": 2}
def require_role(min_role: str):
async def _check_role(user = Depends(get_current_user)):
if _ROLE_HIERARCHY[user["role"]] < _ROLE_HIERARCHY[min_role]:
raise HTTPException(403)
return user
return _check_roleADR-009: In-Memory Rate Limiting
Status: Accepted
Context: API endpoints need protection against abuse, especially authentication and agent execution endpoints.
Decision: Use custom in-memory sliding-window rate limiting in middleware/rate_limit.py.
Rationale:
- No external dependency — No Redis or rate-limit service required. Consistent with the zero-ops philosophy.
- Per-endpoint tuning — Different limits for auth (5/min), general (60/min), and agent (10/min) endpoints.
- Low latency — In-memory checks add negligible overhead compared to network-based solutions.
Trade-off: Rate limit state is lost on restart and not shared across multiple server instances. For single-instance deployments, this is acceptable.
ADR-010: Encrypted API Key Storage
Status: Accepted
Context: LLM provider API keys (OpenAI, Anthropic, etc.) must be stored securely. Users configure these via the UI.
Decision: Encrypt API keys with ORBITER_SECRET_KEY before storing in the database. Never expose encrypted or plaintext keys in API responses.
Rationale:
- At-rest encryption — If the database file is compromised, API keys are not readable without the secret key.
- API safety — Response models use
api_key_set: boolinstead of returning the actual key. - Key rotation — Provider keys table supports multiple keys per provider with load balancing.
Implementation: encrypt_api_key() / decrypt_api_key() in orbiter_web/crypto.py.