---
title: Sora Orchestration Architecture — Multi-Device Personal AI Assistant Across 6-Device Fleet
url: https://neogenesis.app/data/research/sora-orchestration-architecture
category: Agent Frameworks
publishedAt: 2026-04-09
updatedAt: 2026-04-28
author: Yesol Heo
publisher: Neo Genesis
canonical: https://neogenesis.app/data/research/sora-orchestration-architecture
---

# Sora Orchestration Architecture — Multi-Device Personal AI Assistant Across 6-Device Fleet

> Sora is an architecture (not a product) for a single-operator AI assistant that orchestrates across a 6-device fleet (DESKTOP-SOL01 personal-root, DESKTOP-YESOL company-work-pc, YSH-Server orchestrator, MX Mac Studio team-mac build node, S26 Ultra and Tab S10 Ultra mobile-operator). It enforces blast-radius scoring (tier 0-5), device-tier capability tokens, the Magentic-One dual-ledger pattern (Task Ledger + Progress Ledger), a four-stage hook pipeline (SessionStart / UserPromptSubmit / PreToolUse / PostToolUse), uncertainty-triggered HITL gating, and an Owner Sovereignty Article 0 that distinguishes 'disclose-and-confirm' from 'block.' This note documents the architecture as deployed across personal-root, company-work-pc, server, and mobile tiers with a provenance-aware shared brain.

**Category**: Agent Frameworks
**Published**: 2026-04-09
**Last updated**: 2026-04-28
**Author**: Yesol Heo
**Publisher**: Neo Genesis
**Canonical URL**: https://neogenesis.app/data/research/sora-orchestration-architecture

## Headline Statistics

- 6-device fleet across 4 platform classes (Windows desktop x2, Linux server x1, macOS x1, Android mobile x2) with role-tiered capability assignments
- 7-layer architecture (Identity / Memory / Tool / Agent / Governance / Execution / Fleet) — each layer depends only on the layer above it for stable substitution
- Blast-radius taxonomy — 6 tiers (0 = read-only chat, 1 = local read, 2 = local mutate, 3 = external read, 4 = external mutate, 5 = irreversible / financial / credential / SSOT-mutation)
- 4-stage hook pipeline (SessionStart, UserPromptSubmit, PreToolUse, PostToolUse) with 11/11 syntax-verified Python implementations under src/core/hooks/
- CoALA 4-class memory mapping (working / episodic / semantic / procedural) with mem0-style fact extraction and Zep bi-temporal valid_time + transaction_time metadata
- Owner Sovereignty Article 0 — explicit owner commands override safety defaults, but the disclose-and-confirm pipeline is non-bypassable; refusal is not in scope
- Uncertainty-triggered HITL — confidence threshold combines with device tier (personal-root low-bar, company-work-pc high-bar, mobile-operator approval-only) to gate ambiguous actions

## What Sora Is (and Isn't)

Sora is not a chatbot, not an autonomous agent product, and not a SaaS. It is a personal-operator architecture for a single human (the owner) who needs one consistent natural-language interface across a heterogeneous device fleet — desktop development PC, company work PC, home server, Apple build station, and two mobile control surfaces. The design problem is specifically about role-tier governance: a personal-root device may freely write SSOT files and rotate credentials; a company-work-pc must remain read-only for shared state; a mobile-operator device should only approve, never originate destructive commands. Off-the-shelf assistant frameworks treat the device as either local or remote without acknowledging organizational tiers, and so they either over-trust company endpoints or under-trust the operator's primary machine. Sora is the architectural answer to that mismatch — a layered system where each layer (Identity, Memory, Tool, Agent, Governance, Execution, Fleet) is independently replaceable, the fleet's tier policy is encoded in YAML rather than hardcoded, and every action passes through a disclose-and-confirm pipeline calibrated to its blast radius.

## The 6-Device Fleet Topology

**DESKTOP-SOL01** (Windows 11, RTX 4070 SUPER 12GB VRAM) is the **personal-root** tier — control plane, GPU worker, local LLM host (Ollama, ComfyUI), and primary daily driver. It has the highest autonomy: it can rotate keys, edit SSOT, and run arbitrary code. **DESKTOP-YESOL** (Windows 11, company-issued) is the **company-work-pc** tier — execution-only, restricted from secrets and SSOT mutation, used as a remote work console. Its capability tokens are deliberately stripped down. **YSH-Server** (Linux 16-core / 16 GiB) is the **company-assigned-personal-server** tier — orchestrator, container host (Docker `sora-live`), Cloudflare Tunnel terminus, scheduler, Telegram polling. **MX Mac Studio** (macOS, M2 Max 32GB) is the **team-mac** tier — Apple build node and on-demand multimodal compute (BGE Reranker v2-m3 over MPS, ColQwen2 MLX), not a 24/7 execution plane. **S26 Ultra** (Android) is the primary **mobile-operator** tier — approval gate for tier-3+ actions, push-notification receiver. **Tab S10 Ultra** is the secondary mobile-operator — visibility-focused operations console. Tailscale ACL-as-code mediates connectivity between tiers.

## Blast-Radius Scoring (Tier 0-5)

Every action — every tool call, every file write, every API request — is scored on a six-tier blast-radius axis defined in `.agent/policies/blast_radius.yaml`. **Tier 0** is read-only conversation: no side effects, no disclose-and-confirm needed, low latency. **Tier 1** is local read (file inspection, log queries, repo searches): still no disclose, but logged to the progress ledger. **Tier 2** is local mutation (file edits, git commits to local branches, SSOT note appends): disclose summary is required if the file is part of the SSOT layer, otherwise logged-and-executed. **Tier 3** is external read (web fetch, API GET, search-engine query): disclose if any owner-private signal is in the query string. **Tier 4** is external mutation (PR creation, deploy, email send, calendar create): disclose-and-confirm is mandatory, and the device tier filter applies — company-work-pc can preview but not execute. **Tier 5** is irreversible or high-stakes: live capital orders, hard-delete, force-push, credential mutation, personal-folder access. Tier 5 demands a full disclosure bundle (risk summary, expected blast radius, rollback availability, 2FA path if applicable) and explicit owner re-confirmation. The hook pipeline enforces tier-derived policies at PreToolUse.
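The tier-to-gate mapping described above can be sketched as a small lookup. This is a minimal illustration, not the contents of `blast_radius.yaml`: the enum member names, the gate labels, and the choice to merely log a tier-3 read that carries no owner-private signal are assumptions made for the example.

```python
from enum import IntEnum


class BlastTier(IntEnum):
    """Six blast-radius tiers; member names are hypothetical labels."""
    CHAT = 0             # read-only conversation
    LOCAL_READ = 1       # file inspection, log queries
    LOCAL_MUTATE = 2     # file edits, local commits
    EXTERNAL_READ = 3    # web fetch, API GET
    EXTERNAL_MUTATE = 4  # PR creation, deploy, email send
    IRREVERSIBLE = 5     # force-push, credential mutation, live orders


def required_gate(tier: BlastTier, *, touches_ssot: bool = False,
                  has_private_signal: bool = False) -> str:
    """Return the gate an action of this tier must pass before it runs."""
    if tier == BlastTier.CHAT:
        return "none"
    if tier == BlastTier.LOCAL_READ:
        return "log"
    if tier == BlastTier.LOCAL_MUTATE:
        return "disclose" if touches_ssot else "log"
    if tier == BlastTier.EXTERNAL_READ:
        # Assumption: without an owner-private signal the read is only logged.
        return "disclose" if has_private_signal else "log"
    if tier == BlastTier.EXTERNAL_MUTATE:
        return "disclose_and_confirm"
    return "full_bundle_and_reconfirm"
```

Keeping the mapping a pure function of the tier plus two flags makes it easy to unit-test independently of the hook that calls it.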

## Device-Tier Capability Tokens

The capability matrix in `.agent/policies/capability_tokens.yaml` assigns each device tier a base set of capabilities (and base denials) that are then composed with subagent identity. A `personal-root` device starts with `secrets.read`, `secrets.rotate`, `ssot.write`, `repo.push`, `live.execute`, `payment.intent` capabilities. A `company-work-pc` starts with `repo.read`, `ssot.read`, but explicitly **denies** `secrets.read`, `ssot.write`, `live.execute`. A `team-mac` device gets `build.execute`, `multimodal.compute`, `repo.read` but no SSOT mutation. A `mobile-operator` gets `approve.tier4`, `approve.tier5`, `notifications.read` and nothing originating. When a subagent loads, its declared capability set is intersected with the device's base set — a CodexImplementer running on company-work-pc cannot acquire `live.execute` even if its definition lists it, because the device intersection strips it. Capability composition is monotonic in restriction: the result is always the minimum of (subagent capabilities, device capabilities, current owner-grant overrides). Owner override is explicit and time-bounded — for example, `grant secrets.rotate for 30 minutes on this session`.
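A minimal sketch of the composition rule, using the capability names listed above. How owner grants combine with the intersection is not fully specified in the text, so this example makes two assumptions: an unexpired grant widens what the subagent may request, and the device's base set and explicit denials still cap the result. The `OwnerGrant` type and `effective_capabilities` function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Base grants and explicit denials per device tier, taken from the text.
DEVICE_BASE = {
    "personal-root": {"secrets.read", "secrets.rotate", "ssot.write",
                      "repo.push", "live.execute", "payment.intent"},
    "company-work-pc": {"repo.read", "ssot.read"},
    "team-mac": {"build.execute", "multimodal.compute", "repo.read"},
    "mobile-operator": {"approve.tier4", "approve.tier5", "notifications.read"},
}
DEVICE_DENY = {"company-work-pc": {"secrets.read", "ssot.write", "live.execute"}}


@dataclass(frozen=True)
class OwnerGrant:
    """A time-bounded owner override (hypothetical structure)."""
    capability: str
    expires_at: datetime


def effective_capabilities(device_tier, subagent_caps, grants=(), now=None):
    """Intersect requested capabilities with the device tier's base set."""
    now = now or datetime.now(timezone.utc)
    active = {g.capability for g in grants if g.expires_at > now}
    requested = set(subagent_caps) | active       # assumption: grants widen the request
    allowed = requested & DEVICE_BASE.get(device_tier, set())
    return allowed - DEVICE_DENY.get(device_tier, set())


# Usage: a 30-minute grant adds secrets.rotate on personal-root only.
start = datetime(2026, 1, 1, tzinfo=timezone.utc)
grant = OwnerGrant("secrets.rotate", start + timedelta(minutes=30))
on_root = effective_capabilities("personal-root", {"repo.push"}, [grant], now=start)
on_work = effective_capabilities("company-work-pc", {"repo.read", "live.execute"},
                                 [grant], now=start)
```

Under these assumptions `on_root` contains `secrets.rotate` until the grant expires, while `on_work` is reduced to `repo.read` regardless of what the subagent declares or the owner grants.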

## Magentic-One Dual-Ledger Pattern

Sora adopts the Magentic-One pattern of running two parallel ledgers per task. The **Task Ledger** (`.agent/shared-brain/active-tasks.md`) is the human-readable plan: goals, sub-tasks, acceptance criteria, owner approvals. The **Progress Ledger** (`.agent/shared-brain/progress-ledger.md`) is the machine-trace: every tool call, every file diff, every external request, every approval decision, every failure with retry path. The two ledgers are joined at the tool-call envelope level — every PostToolUse hook writes a Progress Ledger entry that references the originating Task Ledger sub-task by ID. This separation lets a human auditor read the high-level plan in one file while a forensic reviewer can replay the full execution trace from another. When an action fails, the Task Ledger gets a `BLOCKED` annotation pointing at the Progress Ledger row that captured the failure context. The pattern allows Sora to recover from mid-task interruption (token budget exhaustion, network outage, Codex/Claude fallback handoff) by resuming from the ledger pair rather than restarting the entire conversation.
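The join between the two ledgers can be illustrated with an in-memory model. This is a sketch under assumptions: the real ledgers are Markdown files, and the class names, the `PENDING`/`DONE` status values, and the row-number scheme here are hypothetical. Only the `BLOCKED` annotation and the sub-task ID reference come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SubTask:
    task_id: str
    goal: str
    status: str = "PENDING"               # PENDING | DONE | BLOCKED
    blocked_by_row: Optional[int] = None  # Progress Ledger row with failure context


@dataclass
class ProgressEntry:
    row: int
    task_id: str                          # reference back into the Task Ledger
    tool: str
    ok: bool
    detail: str = ""


@dataclass
class LedgerPair:
    tasks: Dict[str, SubTask] = field(default_factory=dict)
    progress: List[ProgressEntry] = field(default_factory=list)

    def record(self, task_id: str, tool: str, ok: bool, detail: str = "") -> ProgressEntry:
        """What a PostToolUse hook would append after each tool call."""
        entry = ProgressEntry(len(self.progress) + 1, task_id, tool, ok, detail)
        self.progress.append(entry)
        if not ok:
            task = self.tasks[task_id]
            task.status = "BLOCKED"
            task.blocked_by_row = entry.row
        return entry

    def resume_point(self) -> Optional[SubTask]:
        """First sub-task that is not done; where a handoff would pick up."""
        for task in self.tasks.values():
            if task.status != "DONE":
                return task
        return None
```

A receiving agent that loads both ledgers can call `resume_point()` and follow `blocked_by_row` to the failure context instead of replaying the conversation.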

## Hook Pipeline (4 Stages, 11 Implementations)

Governance is enforced by a four-stage hook pipeline implemented in `src/core/hooks/`. **SessionStart** loads the `.agent/` SSOT into the context window, validates `ssotRevision` against `status.json`, refreshes the device inventory, and pre-populates the core memory block (OWNER_PROFILE, OWNER_PURPOSE, SORA_CONSTITUTION summary). **UserPromptSubmit** runs the intent classifier (`src/core/nlu/intent_classifier.py`), assigns an OwnerIntent struct, and routes to the appropriate subagent (neo-reviewer, neo-architect, neo-implementer, neo-conflict-resolver) based on keyword and entity extraction. **PreToolUse** is the policy enforcement point: it loads `permissions.yaml` (deny → ask → allow precedence), evaluates blast-radius tier, intersects against device capability tokens, and either auto-approves, requests owner confirmation via Telegram, or denies. **PostToolUse** writes the result to the progress ledger, updates the side-effect budget counters, and emits an event to `.agent/shared-brain/events.jsonl` for downstream replay or audit. All eleven hook implementations have passed Python syntax verification and live in version-controlled SSOT.
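The PreToolUse decision described above can be sketched as follows. The rule format (glob patterns grouped under `deny`, `ask`, `allow`), the fallback to `ask` for unmatched calls, and the function names are assumptions for illustration; the actual `permissions.yaml` schema is not shown in this note.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Set


def match_verdict(tool_call: str, rules: Dict[str, List[str]]) -> str:
    """Apply deny, then ask, then allow; the first matching group wins."""
    for verdict in ("deny", "ask", "allow"):
        if any(fnmatchcase(tool_call, pattern) for pattern in rules.get(verdict, [])):
            return verdict
    return "ask"  # assumption: unmatched calls fall back to asking the owner


def pre_tool_use(tool_call: str, required_cap: str, effective_caps: Set[str],
                 blast_tier: int, rules: Dict[str, List[str]]) -> str:
    """Return 'deny', 'ask_owner', or 'auto_approve' for one tool call."""
    if required_cap not in effective_caps:
        return "deny"                      # device/subagent intersection stripped it
    verdict = match_verdict(tool_call, rules)
    if verdict == "deny":
        return "deny"
    if verdict == "ask" or blast_tier >= 4:
        return "ask_owner"                 # tier 4+ always needs confirmation
    return "auto_approve"


# Hypothetical rule set for illustration only.
RULES = {
    "deny": ["secrets.*"],
    "ask": ["git.push:*"],
    "allow": ["git.*", "fs.read:*"],
}
```

Note that `git.push:master` matches both the `ask` and `allow` groups here; the precedence order is what makes it resolve to `ask`.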

## Owner Sovereignty Article 0

The single most important governance principle is Owner Sovereignty (Article 0 of the Sora Constitution): **the operator's explicit, confirmed command is the final authority, and Sora cannot refuse it**. This is the deliberate counterweight to the safety-default rule. Sora's job is not to second-guess; its job is to **disclose** the risk, expected blast radius, and rollback availability before the destructive action, then to **execute** once the operator re-confirms. The hook pipeline's role is therefore not to block actions but to ensure that the disclosure pipeline is **non-bypassable**. Practically, this means a tier-5 action like `force-push to master` is preceded by a Telegram message listing the commits that will be lost, the rollback hash, and a confirmation prompt — but if the owner replies 'do it,' Sora performs the operation. Frameworks that hard-code refusal of dangerous actions create a frustrating tool that operators route around or replace; frameworks that disclose-and-execute build trust because the operator stays in control while Sora handles the cognitive load of risk surfacing.
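A minimal sketch of the disclose-then-execute ordering, assuming the transport (Telegram in the text) is abstracted behind two callables. The `DisclosureBundle` fields follow the items named above (risk, blast radius, rollback); the class name, message layout, and the set of accepted confirmation replies other than 'do it' are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

CONFIRM_REPLIES = {"do it", "yes", "confirm"}  # hypothetical accepted replies


@dataclass
class DisclosureBundle:
    action: str
    risk_summary: str
    blast_tier: int
    rollback: Optional[str] = None

    def render(self) -> str:
        rollback = self.rollback or "none available"
        return (f"[tier {self.blast_tier}] {self.action}\n"
                f"Risk: {self.risk_summary}\n"
                f"Rollback: {rollback}\n"
                "Reply 'do it' to proceed.")


def disclose_and_confirm(bundle: DisclosureBundle,
                         send: Callable[[str], Any],
                         await_reply: Callable[[], str],
                         execute: Callable[[], Any]) -> Any:
    """Always disclose first; execute only after the owner confirms."""
    send(bundle.render())                  # no code path skips this call
    reply = await_reply().strip().lower()
    if reply in CONFIRM_REPLIES:
        return execute()                   # the confirmed command is not refused
    return None                            # owner declined; nothing runs
```

The design point is that `execute` is only reachable through `send`, so the disclosure cannot be bypassed, while a confirmed command is never blocked.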

## Uncertainty-Triggered Human-in-the-Loop

Beyond explicit blast-radius gating, Sora maintains a confidence-based HITL trigger in `src/core/governance/hitl_gate.py`. Each action carries an internal confidence score derived from intent-classifier probability, retrieval-augmented evidence quality, tool-call argument validation, and subagent self-reported certainty. The threshold for auto-execution depends on the device tier: a personal-root machine has a low bar (auto-execute at confidence > 0.75 for tier-2 actions), a company-work-pc has a high bar (require confirmation at confidence < 0.95 for any tier-2+ action), and a mobile-operator device defaults to no auto-execution at all. This is structurally different from a fixed approval list — it lets Sora act fluently when the evidence is strong on the operator's primary device while still slowing down on the work device or a remote phone interface. The implementation is parameterized by `.agent/policies/hitl_thresholds.yaml`, which the operator can tune in the open: 'lower the bar for tier-2 file edits on sol01,' 'raise the bar for any tier-3 web action on yesol.'
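The tier-dependent threshold can be sketched as a small gate function. The two numeric bars come from the text; everything else is an assumption made for the example: the table layout, applying the same bar to tier-3 actions, auto-executing tier 0-1 on non-mobile devices, and always confirming tier 4 and above.

```python
from typing import Dict, Tuple

# (bar, inclusive): personal-root auto-executes when confidence > 0.75;
# company-work-pc requires confirmation when confidence < 0.95.
AUTO_EXECUTE_BAR: Dict[str, Tuple[float, bool]] = {
    "personal-root": (0.75, False),
    "company-work-pc": (0.95, True),
}


def hitl_decision(device_tier: str, blast_tier: int, confidence: float) -> str:
    """Return 'execute' or 'confirm' for one action."""
    if device_tier not in AUTO_EXECUTE_BAR:
        return "confirm"        # mobile-operator and unknown tiers never auto-execute
    if blast_tier >= 4:
        return "confirm"        # assumption: tier 4+ is always owner-confirmed
    if blast_tier < 2:
        return "execute"        # assumption: tier 0-1 has no confidence gate
    bar, inclusive = AUTO_EXECUTE_BAR[device_tier]
    passes = confidence >= bar if inclusive else confidence > bar
    return "execute" if passes else "confirm"
```

For example, a tier-2 edit at confidence 0.80 would auto-execute on personal-root but require confirmation on company-work-pc.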

## CoALA 4-Class Memory + Provenance-Aware Shared Brain

Memory is structured by the CoALA taxonomy (Cognitive Architectures for Language Agents) across four classes. **Working memory** is the live LLM context — volatile per turn. **Episodic memory** is `.agent/shared-brain/sessions/<chat_id>/*.jsonl`, `events.jsonl`, and `daily-log.md` — append-only conversation and event traces. **Semantic memory** is `.agent/knowledge/AGENT_SHARED_MEMORY.md`, `OWNER_PROFILE.md`, `OWNER_PURPOSE_AND_INTENT.md` — structured facts maintained by mem0-style extraction (subject, predicate, object, valid_from, transaction_time, source_turn_id, confidence). **Procedural memory** is `.agent/skills/`, `.agent/agents/`, and tool registries — code that Sora can invoke or extend. Every chunk in the shared brain carries a provenance metadata block: `source_type` (human / llm_output / tool_log / external_citation), `decay_factor` (1.0 for human-authored, 0.5 for LLM output), `provenance_chain depth`, and Zep-style bi-temporal markers separating valid_time (when the fact was true in the world) from transaction_time (when Sora learned it). Retrieval-time scoring multiplies similarity by decay, so newer human-authored notes outrank stale LLM summaries.
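The retrieval-time scoring rule, similarity multiplied by decay, can be sketched as below. The `Chunk` class and its field layout are hypothetical; the two decay values (1.0 for human-authored, 0.5 for LLM output) and the valid_time / transaction_time split come from the text. Decay factors for `tool_log` and `external_citation` are not stated, so the sketch stores the factor on each chunk rather than guessing a table.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Chunk:
    text: str
    source_type: str             # human / llm_output / tool_log / external_citation
    decay_factor: float          # 1.0 for human-authored, 0.5 for LLM output
    similarity: float            # query similarity from the retriever, 0..1
    valid_time: datetime         # when the fact was true in the world
    transaction_time: datetime   # when Sora learned it


def retrieval_score(chunk: Chunk) -> float:
    """Similarity weighted by provenance decay."""
    return chunk.similarity * chunk.decay_factor


def rank(chunks: List[Chunk]) -> List[Chunk]:
    """Highest provenance-weighted score first."""
    return sorted(chunks, key=retrieval_score, reverse=True)


# Usage: an LLM summary with higher raw similarity still ranks below a human note.
human_note = Chunk("Owner prefers rebase", "human", 1.0, 0.60,
                   datetime(2026, 3, 1), datetime(2026, 3, 2))
llm_summary = Chunk("Owner may prefer merge", "llm_output", 0.5, 0.90,
                    datetime(2026, 1, 5), datetime(2026, 1, 5))
ordered = rank([llm_summary, human_note])
```

Here the human note scores 0.60 against 0.45 for the summary, so provenance outweighs the raw similarity gap.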

## External Pattern Sources and OSS Integration

Sora's design is not invented from scratch — it is a synthesis of external patterns. The Magentic-One dual-ledger pattern comes from Microsoft Research's 2024 multi-agent paper. The CoALA taxonomy is from Sumers et al. 2024 (arXiv:2309.02427). mem0-style fact extraction follows the open-source mem0 repository. Zep bi-temporal metadata derives from the Zep memory framework. Tool Cards and Skills progressive disclosure mirrors Anthropic Claude Code's documented memory and skill layout. MCP (Model Context Protocol) is the Anthropic specification at modelcontextprotocol.io, and Sora's tool plane explicitly uses MCP servers (Filesystem, GitHub, Computer-Use, Memory, Fetch, plus device-specific servers for ssh-windows, ssh-unix, and Tailscale-mobile transports). LangGraph patterns (state machine durability, replay, checkpoint) inform the execution layer. NeMo Guardrails and Google's Secure AI Agents framework inform the governance layer. Tailscale ACL-as-code is the fleet-network substrate. The architecture's contribution is not novel mechanism but coherent integration tuned to a single-operator use case at owner-tier boundaries.

## Tool Plane — Why Every Tool Is an MCP Server

An earlier Sora generation hardcoded ~60 tool functions in `src/core/tools/*.py`. Each new tool required a core release — the architecture contradicted its own self-extension mandate. v2 promotes Sora's core to an **MCP host** and reframes every tool as a hot-reloadable MCP server. The fleet runs heterogeneous MCP server inventories per device: `personal-root` runs Filesystem, GitHub, Memory, Fetch, Computer-Use, Supabase, plus device-private servers (ComfyUI bridge, Ollama proxy). `company-work-pc` runs Filesystem (read-only paths), GitHub (read-only orgs), Memory (scope-restricted), and explicitly **no** Computer-Use server. `team-mac` runs build-specific servers (Xcode, MLX, Apple Build). `mobile-operator` runs Telegram, push-notification, and `approve.tier4`/`approve.tier5` servers only. Each server declares its tool cards (name, schema, side-effect classification) and Sora's PreToolUse hook reads the declarations to compute blast-radius tier without trusting the tool's self-report. Voyager-style self-extension is permitted on personal-root only: Sora can author a new MCP server, run its own validation tests, and persist it to `tools/mcp/` — but that flow itself is tier-5 (procedural memory mutation) and demands disclose-and-confirm. The result is a tool plane where adding capability is a sub-second operation on the operator's primary machine and structurally impossible on the company endpoint, all from the same SSOT.
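One way to read "compute blast-radius tier without trusting the tool's self-report" is that the host treats the declared side-effect class as a lower bound and applies its own policy floor on top. The sketch below takes that reading as an assumption. The classification labels, the `ToolCard` shape, the host-side floor table, and the fail-closed default for unknown labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical mapping from a declared side-effect class to a blast-radius tier.
SIDE_EFFECT_TIER: Dict[str, int] = {
    "none": 0,
    "local_read": 1,
    "local_mutate": 2,
    "external_read": 3,
    "external_mutate": 4,
    "irreversible": 5,
}


@dataclass(frozen=True)
class ToolCard:
    name: str
    side_effect: str                          # declared by the MCP server
    schema: dict = field(default_factory=dict)


def compute_tier(card: ToolCard, host_floor: Dict[str, int]) -> int:
    """Take the stricter of the declared tier and the host's own floor."""
    declared = SIDE_EFFECT_TIER.get(card.side_effect, 5)  # unknown label fails closed
    return max(declared, host_floor.get(card.name, 0))


# Usage: a server that under-declares a PR-creating tool is still scored tier 4.
under_declared = ToolCard("github.create_pr", "external_read")
tier = compute_tier(under_declared, {"github.create_pr": 4})
```

Taking the maximum means a server can make its own tools stricter but never looser than the host policy allows.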

## Reflection Loop and Drift Detection

Sora runs a Generative-Agents-style reflection loop on a daily cadence: at a configured time the system reads the previous 24 hours of `daily-log.md`, summarizes recurring signals, and proposes additions to `AGENT_SHARED_MEMORY.md`. Proposed additions enter a queue rather than auto-merging — the `neo-reviewer` subagent independently evaluates them and either accepts, rejects, or marks for owner review. Once weekly, similar feedback items (≥ 3 occurrences within 7 days) are consolidated into a single canonical entry, mirroring the Claude Code auto-memory consolidation policy. Drift detection runs at fleet level: every device heartbeat reports its loaded `ssotRevision`, and `infra/agent-runtime/FLEET_STATUS.md` flags mismatches against the canonical revision. A mismatch on a `personal-root` device triggers a sync prompt; on a `company-work-pc` it triggers an alert; on a `mobile-operator` it triggers a forced reload. Three real drift events have been detected and corrected by this loop in the last six months, each one preventing a divergent governance decision (capability evaluation against a stale policy file). The reflection loop and the drift detector together keep the SSOT coherent across an asynchronous fleet without requiring continuous online presence — a critical property when the team-mac and mobile devices are offline for hours at a time.
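The drift check and the weekly consolidation threshold can be sketched together. The per-tier reactions (sync prompt, alert, forced reload) and the "at least 3 occurrences within 7 days" rule come from the text. The heartbeat shape, the revision strings, the `alert` default for tiers the text does not mention, and matching "similar" items by an exact key are simplifying assumptions.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple

DRIFT_ACTION = {
    "personal-root": "sync_prompt",
    "company-work-pc": "alert",
    "mobile-operator": "forced_reload",
}


def detect_drift(canonical: str,
                 heartbeats: Dict[str, Tuple[str, str]]) -> List[Tuple[str, str]]:
    """heartbeats maps device name to (device tier, loaded ssotRevision)."""
    findings = []
    for device in sorted(heartbeats):
        tier, revision = heartbeats[device]
        if revision != canonical:
            # Assumption: tiers not named in the text default to an alert.
            findings.append((device, DRIFT_ACTION.get(tier, "alert")))
    return findings


def consolidation_candidates(items: Iterable[Tuple[str, datetime]], now: datetime,
                             window: timedelta = timedelta(days=7),
                             min_count: int = 3) -> List[str]:
    """Keys seen at least min_count times inside the window ending at now."""
    counts = Counter(key for key, seen_at in items
                     if timedelta(0) <= now - seen_at <= window)
    return sorted(key for key, n in counts.items() if n >= min_count)
```

Because both functions work from reported state rather than live connections, they can run whenever a heartbeat or log batch arrives, which suits devices that are offline for hours.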

## Operating Lessons from First Six Months

Six months of operating Sora across the 6-device fleet has surfaced four durable lessons. **First**, device tier discipline matters more than any specific safety feature: on one occasion the company-work-pc accidentally inherited personal-root capabilities, which corrupted credential boundaries and took two days of audit to restore; the YAML-level capability intersection is what has prevented a recurrence. **Second**, the disclose-and-confirm pipeline must be **fast**: if every tier-3 action waits 20 seconds for an owner reply, operators learn to bypass it. Targeting <2 seconds of disclosure latency, by pre-computing the disclosure bundle in PreToolUse and pushing it to Telegram via a long-lived bot session, moved adoption from grudging to routine. **Third**, the dual-ledger pattern is more valuable than any single-ledger design: when Codex hands off to Claude (token budget fallback) or vice versa, the receiving agent reads the Task + Progress ledger pair and resumes within one turn. **Fourth**, runtime revision drift across the fleet is a silent killer; the `python scripts/sync_agent_context.py --updated-by claude` runtime-bump and the FLEET_STATUS.md mismatch detector caught at least three drift events that would have produced inconsistent governance decisions across devices. Net assessment: blast-radius governance plus device-tier capability tokens plus a non-bypassable disclose-and-confirm pipeline is sufficient to run a multi-device personal-operator AI safely. The system has not yet been generalized beyond a single owner.

## Downloads & Artifacts

- [Sora Unified Bible v1 (full design SSOT)](https://github.com/Yesol-Pilot/neo-genesis/blob/master/.agent/knowledge/SORA_UNIFIED_BIBLE.md) — github
- [Sora Master Blueprint v2](https://github.com/Yesol-Pilot/neo-genesis/blob/master/.agent/knowledge/SORA_MASTER_BLUEPRINT_V2.md) — github
- [Permissions / Blast-radius / Capability YAML scaffolds](https://github.com/Yesol-Pilot/neo-genesis/tree/master/.agent/policies) — github


## Citations & References

- [Magentic-One: A Generalist Multi-Agent System (Microsoft Research)](https://www.microsoft.com/en-us/research/articles/magentic-one-a-generalist-multi-agent-system-for-solving-complex-tasks/)
- [CoALA: Cognitive Architectures for Language Agents (Sumers et al. 2024)](https://arxiv.org/abs/2309.02427)
- [Anthropic Claude Code memory & subagents documentation](https://docs.anthropic.com/en/docs/claude-code/memory)
- [Model Context Protocol (MCP) specification](https://modelcontextprotocol.io/)
- [AG-UI / CopilotKit control plane pattern](https://www.copilotkit.ai/)
- [mem0 — long-term memory framework](https://github.com/mem0ai/mem0)
- [Tailscale ACL-as-code documentation](https://tailscale.com/kb/1018/acls)

## How to Cite

`Sora Orchestration Architecture — Multi-Device Personal AI Assistant Across 6-Device Fleet — Neo Genesis (https://neogenesis.app/data/research/sora-orchestration-architecture). Updated 2026-04-28.`

---

© 2026 Neo Genesis. AI Works. You Decide.
