ProductEngine: Architecture Design
This document is the definitive reference for the ProductEngine architecture. It describes what ProductEngine is, how it works, and why it works that way. Everything recorded here is a decision, not a proposal.
---
1. What ProductEngine Is
ProductEngine is a Python framework for building hybrid desktop/web applications where:
- All functionality is delivered by plugins. The framework provides infrastructure; plugins provide everything the user sees and does.
- The framework is designed from the ground up for AI observability and control. An AI agent can inspect all application state as structured JSON and dispatch any action through the same system the user uses. No screenshots, no OCR, no screen scraping.
- The same codebase runs as a desktop application (PyWebView wraps localhost) or as a cloud service (browser hits the server directly). The difference is configuration, not code.
Future north star (not a current goal): the plugin architecture could eventually underpin a desktop environment, where applications are plugin collections running in a shared runtime. This informs architectural decisions but is not a deliverable.
---
2. Runtime Stack
| Layer | Technology | Role |
|---|---|---|
| ASGI server | Granian (Rust) | HTTP/2 native, ~3x Uvicorn performance, zero-copy file serving |
| Web framework | Custom ASGI micro-framework (asgi.py) | Routing, static file serving, SPA fallback, middleware, lifespan (zero external deps beyond msgspec) |
| Serialization | msgspec | Validation, JSON/MessagePack encoding (~15-30x faster than Pydantic for JSON) |
| Desktop shell | PyWebView | Native window wrapping localhost |
| Frontend framework | Svelte 5 | Reactive UI components |
| Frontend tooling | Vite | Bundling, HMR during development |
| File watching | watchfiles | Plugin hot-reload detection |
2.1 Why Granian
Granian was chosen over Uvicorn, Hypercorn, and Daphne for three reasons:
HTTP/2 is essential for dashboard apps. Browsers enforce a 6-connection-per-origin limit under HTTP/1.1. ProductEngine apps maintain multiple SSE streams alongside API requests and WebSocket connections. Under HTTP/1.1, SSE connections consume the connection budget and starve API requests -- the user sees a live-updating dashboard that freezes when they click a button because the API call is queued behind long-lived SSE connections. HTTP/2 multiplexing replaces this with ~100 concurrent streams over a single TCP connection. Uvicorn has no HTTP/2 support at all, which is disqualifying for this use case.
| Scenario | HTTP/1.1 | HTTP/2 |
|---|---|---|
| 3 SSE streams + 10 API calls | SSE consumes half the connection budget; API calls queue | All 13 requests share one connection; no queuing |
| Dashboard with live updates | SSE connections starve API requests | Multiplexed; no interference |
| Multi-window app (3 windows) | 18 total connections possible (6 per origin per window) | 3 connections, ~300 streams total |
Performance. Granian is ~3x faster than Uvicorn in connection-heavy scenarios -- exactly the workload pattern ProductEngine targets. The gains come from Rust-based HTTP parsing and connection handling. The pathsend ASGI extension enables zero-copy file transfers directly from kernel space, bypassing Python object marshaling entirely.
Active maintenance. Granian releases frequently (used in production by paperless-ngx, Reflex, SearXNG, Weblate, and companies including Microsoft, Mozilla, and Sentry). Hypercorn, the only other HTTP/2-capable ASGI server, is classified as inactive by Snyk. Daphne is Django-specific and slower than both.
2.2 HTTP/2 and TLS for Localhost
Browsers require TLS for HTTP/2 -- no major browser supports h2c (HTTP/2 over cleartext). This is a browser policy, not a protocol limitation. For cloud deployments, TLS is already standard. For localhost, locally-trusted certificates are required.
mkcert is the solution: it creates a local Certificate Authority, installs it in the system trust store, and generates certificates trusted by all browsers on the local machine. Self-signed certificates are not viable -- they can cause garbled binary output when HTTP/2 ALPN negotiation occurs with an untrusted cert, and require re-acceptance after browser updates or profile changes. mkcert certificates are indistinguishable from real CA-signed certificates.
The framework will provide a convenience command for generating mkcert certificates and will fail with a clear error when mkcert is not installed, rather than silently falling back to HTTP/1.1.
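The mkcert workflow sketched below shows the intended setup; the mkcert commands are its documented usage, while the Granian flag names (`--http`, `--ssl-certificate`, `--ssl-keyfile`) are assumptions to verify against `granian --help` for the installed version.

```shell
# Create and trust a local CA, then mint a certificate for localhost.
mkcert -install
mkcert localhost 127.0.0.1 ::1   # writes localhost+2.pem / localhost+2-key.pem

# Serve over TLS so browsers negotiate HTTP/2; --http auto falls back
# to HTTP/1.1 when TLS is absent (flag names may vary by version).
granian --interface asgi --http auto \
  --ssl-certificate ./localhost+2.pem \
  --ssl-keyfile ./localhost+2-key.pem \
  productengine.app:app
```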
Granian configuration: --http auto negotiates HTTP/2 with TLS and falls back to HTTP/1.1 without. This is the default for ProductEngine's server module.
2.3 Why msgspec over Pydantic
msgspec is ~15-30x faster than Pydantic v2 for JSON encoding and ~6-15x faster for JSON decoding. These are not micro-benchmark artifacts -- the difference comes from immutable Structs with strict typing that enable pre-compiled optimized code paths and zero-copy optimizations for primitive types.
msgspec also generates JSON Schema 2020-12 (OpenAPI 3.1 compatible) and supports MessagePack/YAML/TOML natively. Since ProductEngine does not depend on FastAPI, Starlette, or any third-party web framework, there is no expectation of Pydantic compatibility. Users can still use Pydantic in their own application code alongside ProductEngine if they wish.
2.4 The BaseHTTPMiddleware Problem
Starlette's BaseHTTPMiddleware intercepts ASGI messages and asserts message["type"] == "http.response.body". When Granian sends other ASGI message types (like http.response.pathsend for zero-copy file transfers), BaseHTTPMiddleware crashes because it does not recognize them. This is a Starlette bug -- the ASGI spec allows extension message types, but BaseHTTPMiddleware assumes a fixed set.
Beyond the crash, BaseHTTPMiddleware also:
- Silently terminates SSE connections: it wraps streaming responses in `anyio.create_memory_object_stream(max_buffer_size=0)`. Under real-world I/O pressure, the anyio cancel scope fires `CancelledError` on the SSE generator, killing the connection. The server logs nothing -- the browser sees `readyState: 2` (CLOSED) and reconnects. This is intermittent and nearly impossible to diagnose because it only manifests under load, not in tests.
- Breaks context variable propagation: prevents `contextvars.ContextVar` changes from propagating upward through the middleware stack, disrupting any subsequent pure ASGI middleware.
- Is a hidden dependency: third-party middleware that extends BaseHTTPMiddleware reintroduces the problem even if all your own middleware is pure ASGI. One BaseHTTPMiddleware in the stack breaks SSE for all endpoints.
Starlette's maintainers have deprecated BaseHTTPMiddleware and plan to remove it in Starlette 1.0. ProductEngine uses a custom ASGI micro-framework (asgi.py) with no Starlette dependency and pure ASGI middleware throughout, making this entire class of bug structurally impossible.
2.5 Deployment Modes
Two deployment modes exist. Both use the same server, the same frontend, and the same plugins. (The original STACK.md evaluation also described a third "hybrid" mode -- local PyWebView frontend backed by both a local and a cloud ProductEngine server for offline-first sync. That mode remains a future possibility but is not a current deliverable.)
| Mode | How the user accesses it | What differs |
|---|---|---|
| Desktop | PyWebView opens a native window pointing at localhost | Window chrome is native OS; server lifecycle is tied to the window |
| Cloud | Browser navigates to the server's URL | No PyWebView; server runs independently; authentication required |
The framework detects the mode at startup and adjusts behavior accordingly (e.g., PyWebView lifecycle hooks in desktop mode, auth middleware in cloud mode). Plugin code is identical in both modes.
2.6 Module Layout
The Python backend is split into focused modules under src/productengine/:
| Module | Responsibility |
|---|---|
| `app.py` | Routes, ASGI lifespan, middleware wiring |
| `dispatch.py` | Command dispatch engine (resolution, execution, sub-command expansion, SSE broadcast) |
| `ai_api.py` | AI surface endpoints (`/ai/query`, `/ai/dispatch`, `/ai/manual`) |
| `debug.py` | Debug panel (`/_debug`) |
| `types.py` | Typed contracts (CommandResult, SubCommand, AnimateSpec, ToastSpec, etc.) |
| `config.py` | Environment-variable configuration |
| `constants.py` | Centralized constants (timeouts, buffer sizes, file patterns) |
| `errors.py` | Error ring buffer for runtime error tracking |
| `snapshots.py` | State snapshot save/restore |
| `state.py` | State store, changelog ring buffer, versioning |
| `registry.py` | Plugin and command registry |
| `metacommands.py` | Meta-command handler implementations |
| `sse.py` | SSE broadcast infrastructure |
| `auth.py` | Authentication middleware and token store |
| `tokens.py` | Design token system and themes |
| `scene.py` | Scene graph building |
| `asgi.py` | Custom ASGI micro-framework |
| `server.py` | Granian server entry point |
---
3. Plugin System
3.1 What a Plugin Is
A plugin is a directory containing:
- **`plugin.toml`** -- the manifest declaring name, version, commands, state schema, dependencies, and handler language
- Handler file(s) -- pure functions implementing commands (Starlark; Python and Lua are planned/future)
- **`ui.json`** (optional) -- SDUI component tree describing the plugin's UI
- Any other assets the plugin needs (CSS files, static images, data files)
A plugin directory is self-contained. Moving it into plugins/ installs it. Deleting it uninstalls it. Pushing it to a git remote publishes it.
3.2 Everything Is a Plugin
There is no distinction between "core" and "user" plugins. All application functionality -- including things that feel like framework features (theming, settings panels, system panels) -- is delivered by plugins. The framework itself is only:
- The plugin loader and registry
- The command dispatcher
- The state store
- The SDUI renderer
- The AI surface endpoints
- The git-managed persistence layer
Everything else is a plugin. If it has UI, state, or commands, it is a plugin.
3.3 Plugin Lifecycle
- Discovery: scan `plugins/` for directories containing `plugin.toml`
- Dependency resolution: topological sort with cycle detection; fail fast on cycles with a clear error naming the cycle
- Loading: parse manifest, load handlers via the appropriate language adapter, register all contributions (commands, state, UI, input bindings, styles)
- Hot-reload: watchfiles detects changes to any file in a plugin directory; the framework unregisters old contributions, reloads the handler module, re-parses the manifest (in case it changed), and re-registers contributions; state survives reload because state lives in the framework, not the plugin
- Runtime creation: an AI agent (or user) creates a new plugin directory with a manifest and handler files; the hot-reload watcher picks it up automatically and loads it as if it had been there at startup
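The dependency-resolution step can be sketched as Kahn's algorithm with fail-fast cycle reporting. This is illustrative, not the registry's actual code; `deps` maps each plugin name to its declared dependencies, and all referenced plugins are assumed present:

```python
from collections import deque

def resolve_load_order(deps: dict[str, list[str]]) -> list[str]:
    """Return a load order where every plugin follows its dependencies."""
    indegree = {name: len(requires) for name, requires in deps.items()}
    dependents: dict[str, list[str]] = {name: [] for name in deps}
    for name, requires in deps.items():
        for dep in requires:
            dependents[dep].append(name)

    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order: list[str] = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for dependent in dependents[name]:
            indegree[dependent] -= 1
            if indegree[dependent] == 0:
                ready.append(dependent)

    if len(order) < len(deps):
        # Everything not ordered sits on (or behind) a cycle; name it.
        cycle = sorted(n for n in deps if n not in order)
        raise ValueError(f"dependency cycle among plugins: {', '.join(cycle)}")
    return order
```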
3.4 No Special Treatment
All plugins are readable and writable. The AI can modify any plugin's files -- manifest, handlers, UI definition, styles. There is no read-only/writable distinction, no "runtime overlay," no "base vs. extension" hierarchy. Every plugin is equal. The framework does not protect any plugin from modification.
This is a deliberate choice. Protection mechanisms add complexity that conflicts with the goal of full AI control. If a plugin should not be modified, that is a policy enforced by the AI agent or the user, not by the framework.
---
4. Commands
4.1 Command Model
Every action that modifies state is a command. Commands are declared in plugin manifests and implemented as pure handler functions.
Handler contract:
`(state: dict, params: dict, ctx: DispatchContext) -> CommandResult`
Handlers return a CommandResult struct (defined in types.py) with fields: state, result, sub_commands, animate, toast, tokens_changed, broadcast_full_state. All fields are optional (defaulting to None / False). Starlark handlers may return a plain dict; the adapter layer converts it to a CommandResult.
Handlers are stateless. They receive the current state, the command parameters, and a DispatchContext carrying identity and correlation metadata (see section 18). They return the new state. All mutable state lives in the framework's state store, not in the handler module. This makes hot-reload safe (no in-module state to lose) and AI introspection complete (the framework sees all state).
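A hypothetical handler following this contract (shown in Python form; Starlark handlers look essentially the same, and all names here are illustrative):

```python
def add_node(state: dict, params: dict, ctx) -> dict:
    """Pure handler: build and return NEW state; never mutate in place."""
    node = {"id": params["id"], "x": params.get("x", 0), "y": params.get("y", 0)}
    new_state = {**state, "nodes": [*state.get("nodes", []), node]}
    # A plain dict return is allowed for Starlark handlers; the adapter
    # layer converts it into a CommandResult with other fields defaulted.
    return {"state": new_state, "toast": {"message": f"added {node['id']}"}}
```

Because the handler never touches module-level state, hot-reloading it loses nothing, and the framework's state store remains the single source of truth.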
4.2 Meta-Commands
The framework provides built-in meta-commands for modifying the system itself:
| Meta-command | What it does |
|---|---|
| `set_ui` | Replace a plugin's UI tree |
| `register_command` | Add a new command (via macro or Starlark-in-payload) |
| `unregister_command` | Remove a command |
| `set_input_bindings` | Change input bindings for a plugin |
| `set_styles` | Apply CSS rules to a plugin |
| `pe.animate` | Trigger an emphasis animation on a target element (see section 15) |
Meta-commands go through the same dispatch system as regular commands. They modify plugin files on disk, and git tracks the changes. There is no separate "admin API" -- meta-commands are commands.
All meta-command handlers signal errors by raising ValueError. The dispatch layer (dispatch.py) catches exceptions, logs them, records the error in the error ring buffer (errors.py), broadcasts an error toast via SSE, and returns {"error": str(exc)}.
4.3 Sub-Command Expansion
Handlers can return sub_commands in their CommandResult. The dispatch layer (dispatch.py) executes these sequentially through the normal pipeline, collecting results into a steps array. If any sub-command fails, expansion stops and the partial steps are returned with the error. This replaces the old macro system with handler-driven expansion.
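The expansion loop can be sketched as follows -- simplified, with SSE broadcast and git persistence elided, and names illustrative rather than the actual dispatch.py code. `registry` maps command names to handlers returning dicts with optional `state` and `sub_commands` keys:

```python
def dispatch_with_expansion(registry, state, name, params):
    """Run a command and its sub-commands sequentially, collecting steps."""
    steps = []
    queue = [(name, params)]
    while queue:
        cmd, p = queue.pop(0)
        try:
            result = registry[cmd](state, p)
        except Exception as exc:
            # Stop expansion; return partial steps alongside the error.
            steps.append({"command": cmd, "error": str(exc)})
            return {"steps": steps, "state": state}
        state = result.get("state", state)
        steps.append({"command": cmd, "ok": True})
        # Sub-commands run next, in order, through the same pipeline.
        expansions = [(sc["command"], sc.get("params", {}))
                      for sc in result.get("sub_commands", [])]
        queue = expansions + queue
    return {"steps": steps, "state": state}
```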
4.4 Runtime Command Registration (Starlark-in-Payload)
Commands can be registered at runtime without pre-existing handler files via Starlark-in-payload: the dispatch sends Starlark source code as a string in the command params. The framework validates the code for syntax errors (_validate_code in metacommands.py), writes it to the plugin's handler file on disk, and registers the resulting function as a command. This is safe because Starlark is hermetically sandboxed -- no filesystem access, no network, no imports beyond what the framework explicitly provides.
4.5 Batch Dispatch
Commands dispatch progressively (streaming model):
- Execute one at a time, in order
- Each command's result is rendered immediately via SSE to the frontend
- The user watches changes happen in real-time
- If a command fails, the AI decides how to proceed: fix the issue and continue, dispatch compensating commands to undo partial changes, or abort
- Git tracks the full sequence; `git revert` is available for manual rollback of any individual command
This is better than all-or-nothing transactions. The user sees progress. The AI can adapt to errors mid-stream. And the git history provides a complete, revertible record of every step.
---
5. State
5.1 State Store
The framework holds per-plugin state. Plugins declare their state schema in the manifest, including field names, types, and default values. The framework:
- Initializes each plugin's state from the declared defaults at load time
- Provides `get(plugin, key)`, `set(plugin, key, value)`, and `get_all(plugin)` operations
- Passes the full state dict to handlers and accepts the new state dict from their return value
State is the single source of truth for all application data. Handlers read it, transform it, return new state. The framework persists it. No plugin holds its own mutable state.
5.2 State Versioning
Every state mutation increments a monotonic version number (global, not per-plugin). This supports:
- Diff-based state queries: "what changed since version N?" -- the AI can poll efficiently without re-reading the entire state tree
- Stale-read detection: if a command dispatch includes an expected version and the current version has advanced, the framework can reject or warn
- Git history correlation: version numbers map to git commits, providing a bridge between the in-memory state timeline and the persistent history
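A minimal sketch of a store with a global monotonic version and a changelog answering "what changed since version N?" (ring-buffer eviction and persistence elided; names illustrative, not the actual state.py):

```python
class StateStore:
    def __init__(self):
        self._state: dict[str, dict] = {}          # plugin -> {key: value}
        self.version = 0                           # global, monotonic
        self._changelog: list[tuple[int, str, str]] = []  # (version, plugin, key)

    def set(self, plugin: str, key: str, value):
        self.version += 1
        self._state.setdefault(plugin, {})[key] = value
        self._changelog.append((self.version, plugin, key))

    def get(self, plugin: str, key: str):
        return self._state.get(plugin, {}).get(key)

    def changes_since(self, version: int):
        """Diff-based query: entries newer than the given version."""
        return [entry for entry in self._changelog if entry[0] > version]
```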
---
6. AI Surface
The full design lives in todo/.done/agentic-perception.md; this section is the architectural overview. CLAUDE.md carries the working summary tuned for in-session reference.
6.1 Design Principle
The AI is not a chatbot or inline assistant. It is an external agent that connects to the same backend the human uses, via dedicated HTTP endpoints. Two principles govern the surface:
- Agentic search, not push-state. The AI does not receive a 20 KB blob describing every plugin and its state. It fetches a small master prompt that maps every queryable surface, then drills agentically using whatever query language fits each surface. This mirrors how code-search agents work over a filesystem, applied to product state and rendered perception.
- One dispatch engine, two interfaces. The Svelte SDUI frontend fires commands via `POST /api/commands/{name}`; the AI fires the same commands via `POST /ai/dispatch`. Both converge at `dispatch_command()` in `dispatch.py`. Anything a human button can do, the AI can dispatch programmatically. The framework IS the agent interface; there is no separate adapter layer.
Reads and writes are different verbs. Queries are about finding; dispatch is about doing. The protocol reflects this: a queryable surface model on the read side and a command-shaped dispatch on the write side.
6.2 Endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| `/ai/manual` | GET | Composed master prompt as text/markdown. One H1 per plugin, one H2 per surface, with each surface's query language, freshness strategy, byte budgets, and recipes. ETagged on the global hash; warm sessions return 304. |
| `/ai/query` | POST | Multi-surface read query. Always-array body `[{plugin, surface, query, target_hint?}, ...]`. Each entry is dispatched to the evaluator declared by its surface (jq, css, or sql). Returns one envelope per query: `{result, meta: {complete, truncated_at, total_estimate, last_updated, version_hash, target_rect}}`. `?live=1` forces a fresh compute (bypassing the cache) for browser-resident surfaces. `Accept: application/x-ndjson` opts into streaming for legitimate big queries. |
| `/ai/dispatch` | POST | Always-array body `{"commands": [...], "rendering"?, "changeset"?}`. Returns `{results, state, batch_id, commands_executed}`. The same `dispatch_command()` engine human button clicks use. |
| `/api/state/stream` | GET (SSE) | Real-time event stream. The AI subscribes for state changes, plus the perception-specific events (`perception_invalidate`, `perception_query`, `target_rect`, `perception_active`) that drive the eye + shimmer collaboration UI. |
6.3 Surfaces
Every queryable read surface is registered at the resolver definition site via the surface(...) wrapper from productengine.surface. Each declaration carries:
- Name and plugin -- queries address surfaces by `(plugin, surface)`.
- Query language -- `jq` for tree-shaped JSON, `css` for layout/DOM-shaped data, `sql` for tabular/log-shaped data.
- Freshness strategy -- required field, no default. One of `dirty_timer(window_ms)`, `every_dispatch`, `lazy`, `CustomFreshness(callable)`. (`Continuous` is reserved; the factory raises `NotImplementedError` until the scheduler branch is wired.)
- Soft and hard byte budgets -- soft truncation kicks in over `soft_budget_bytes` with `meta.complete: false`; hard rejection over `hard_budget_bytes` returns 413 with `suggested_narrowings` from the surface's recipes.
- Resolver function -- `(state, params, ctx) -> {result, target_rect}`. May return the `BROWSER_RESIDENT` sentinel for surfaces whose data lives in the human's browser; the perception cache handles the round-trip.
- Master prompt fragment -- `content_md` markdown body, optionally pulled from a sibling file via `recipes_from`. Composes into `/ai/manual`.
Plugins declare their own surfaces. The framework provides no cross-plugin aggregator; the AI fans out across plugins itself, optionally batching multiple (plugin, surface, query) entries in a single POST /ai/query array.
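The shape of a surface declaration can be sketched as a registering decorator. The field names follow the list above, but the decorator mechanics and registry structure here are assumptions for illustration, not the actual `productengine.surface` implementation:

```python
SURFACES: dict[tuple[str, str], dict] = {}

def surface(plugin, name, *, lang, freshness,
            soft_budget_bytes, hard_budget_bytes, content_md=""):
    """Register a resolver as a queryable surface (illustrative sketch)."""
    def register(resolver):
        SURFACES[(plugin, name)] = {
            "lang": lang,
            "freshness": freshness,
            "soft_budget_bytes": soft_budget_bytes,
            "hard_budget_bytes": hard_budget_bytes,
            "content_md": content_md,
            "resolver": resolver,
        }
        return resolver
    return register

# A hypothetical plugin surface: jq-queryable, warmed on every dispatch.
@surface("todo", "items", lang="jq", freshness="every_dispatch",
         soft_budget_bytes=32_768, hard_budget_bytes=262_144,
         content_md="## items\nQuery the todo list with jq.")
def items_surface(state, params, ctx):
    return {"result": state.get("items", []), "target_rect": None}
```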
6.4 Framework Surfaces (the pe plugin)
| Surface | Lang | Freshness | What |
|---|---|---|---|
| `pe.state` | jq | every_dispatch | Plugins, commands, state, ui, bindings, gesture_schemas, styles, scene, tokens, preferences, errors, client_logs, snapshots, feedback, version. The replacement for the retired `GET /ai/state` endpoint. |
| `pe.view` | css | dirty_timer(2 s) | Layout snapshot of the rendered page (browser-resident). Per-node fields: tag, id, classes, attrs, rect, text, bg, fg, font, z, overflow, position, display, visible, children. |
| `pe.a11y` | jq | dirty_timer(2 s) | Accessibility tree (browser-resident). jq, not CSS, because the records are `{role, name, states, rect, children}` -- not DOM-shaped. |
| `pe.anomalies` | jq | dirty_timer(2 s) | Heuristic UI scan (browser-resident). Five heuristics: zero-area, off-screen, parent-overflow clipping (uses the real overflow field on the snapshot), z-index conflicts, low contrast. Each anomaly carries its own target_rect so the shimmer paints every suspect at once. |
| `pe.components` | jq | dirty_timer(2 s) | Per-instance SDUI component self-reports (`{status, error, frame_count, summary}`). Mounted-outside-the-tree framework chrome (AIEye, AIShimmer) is intentionally excluded. |
| `pe.console` | sql | lazy | Browser console.error/warn ring buffer (server-resident, fed by `/api/console_log`). |
| `pe.mutations` | sql | lazy | DOM mutation diff stream tagged with the current dispatch_id (server-resident, fed by `/api/mutation_log`). |
| `pe.screenshot` | jq (passthrough) | lazy | Partial PNG screenshot (browser-resident). The query is `selector:<css>` or `rect:x,y,w,h`; the resolver receives it via `passthrough_query=True`. |
| `pe.log` | sql | lazy | Backend log tail. |
| `pe.network` | sql | lazy | Backend HTTP request log (timing, status, identity). |
| `pe.changelog` | sql | every_dispatch | Dispatch history for "what just happened" queries. |
Browser ingestion endpoints (/api/console_log, /api/mutation_log, /api/perception_response) carry browser-resident data back to the server cache; they are not AI-facing.
6.5 Freshness Model
Each surface declares its own freshness strategy at registration:
- **`dirty_timer(window_ms)`** -- timer is set to `window_ms` on every dispatch, user input, or AI query. It decrements toward 0; the warm fires at the 0-edge. 0 is the hot moment (re-compute now); the window value is the dirty/expecting-more-changes state. The default window comes from `PE_PERCEPTION_DIRTY_TIMER_MS` (2000 ms). Used for layout-class surfaces where reads should be cheap and reasonably current.
- **`every_dispatch`** -- warm only on dispatch boundaries; no idle timer. Used for surfaces that change in lockstep with command actions (`pe.state`, `pe.changelog`).
- **`lazy`** -- never warm; compute only on AI query. Used for ring-buffer-backed surfaces where reading the buffer is itself the cheapest path (`pe.console`, `pe.mutations`, `pe.log`, `pe.network`) and for opt-in expensive computations (`pe.screenshot`).
- **`CustomFreshness(fn)`** -- plugin-provided callable returning one of the other strategies; for conditional logic ("lazy while a canvas is animating, dirty_timer otherwise").
The cache is a hybrid eviction store: history ring buffer keeps last N or last T minutes of snapshots, whichever bound trips first. AI queries can read the latest cached value (paying nothing for "still fresh") or escalate with ?live=1 to force a fresh compute. Each cached entry carries a last_updated timestamp; the AI's "acceptably recent" decision is its own.
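The hybrid eviction rule -- keep the last N snapshots or the last T seconds, whichever bound trips first -- can be sketched as a small ring buffer. Names and the injectable clock are illustrative, not the actual cache implementation:

```python
import time
from collections import deque

class SnapshotCache:
    """History buffer bounded by count AND age, whichever trips first."""

    def __init__(self, max_items=50, max_age_s=300.0):
        self.max_items = max_items
        self.max_age_s = max_age_s
        self._buf: deque = deque()   # (timestamp, value), oldest first

    def put(self, value, now=None):
        now = time.monotonic() if now is None else now
        self._buf.append((now, value))
        # Evict from the old end while either bound is violated.
        while len(self._buf) > self.max_items or (
            self._buf and now - self._buf[0][0] > self.max_age_s
        ):
            self._buf.popleft()

    def latest(self):
        return self._buf[-1][1] if self._buf else None
```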
6.6 Visual Collaboration
Two SSE-driven UI elements signal AI activity to the human user:
- **Corner eye (`AIEye.svelte`)** -- a single global icon that opens when an `/ai/query` request begins and closes after a brief idle. Driven by `perception_active` SSE events. A defense-in-depth `EYE_IDLE_MS` timer in the frontend ensures the eye closes even if the server skips the trailing `perception_active(false)`.
- **Positional shimmer (`AIShimmer.svelte`)** -- an animated outline traces whatever rectangle the AI just queried. Driven by `target_rect` SSE events, with multi-rect support (one shimmer per matched element). Resolvers always return a `target_rect` (single, list, or `null`); helpers `target_from_selector`, `target_from_rect`, `target_from_plugin_root` make this a one-liner. The AI may override via `target_hint` in the query envelope.
Together they make AI activity legible: the human watches a luminous trail across the UI as the AI works.
6.7 Budget and Errors
A four-tier model keeps queries bounded without poisoning batches:
- AI self-regulation -- the master prompt declares `soft_budget_bytes` per surface; the AI narrows preemptively.
- Soft truncation -- responses larger than `soft_budget_bytes` are truncated mid-data; the envelope carries `meta: {complete: false, truncated_at, total_estimate}`.
- Hard rejection in-band -- per-query violations over `hard_budget_bytes` return `{result: null, error: {code: "payload_too_large", suggested_narrowings: [...]}}` inside the response array (HTTP 200). One bad query in a batch does not poison the others.
- Aggregate hard rejection -- only when the cumulative response exceeds `AI_QUERY_GLOBAL_HARD_BYTES` does the endpoint return a top-level 413. Opt-in NDJSON streaming (`Accept: application/x-ndjson`) bypasses both budgets for legitimate big queries.
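The per-query tiers (soft truncation and in-band hard rejection) can be sketched as a single budget check over the encoded result. The envelope shapes follow the description above; the halving heuristic and function name are assumptions for illustration:

```python
import json

def apply_budgets(result, soft_budget_bytes, hard_budget_bytes,
                  suggested_narrowings=()):
    """Return the query envelope after applying soft/hard byte budgets."""
    encoded = json.dumps(result).encode()
    if len(encoded) > hard_budget_bytes:
        # In-band rejection: HTTP stays 200; only this entry errors.
        return {"result": None,
                "error": {"code": "payload_too_large",
                          "suggested_narrowings": list(suggested_narrowings)}}
    if len(encoded) > soft_budget_bytes and isinstance(result, list):
        # Soft truncation: cut mid-data and say so in the envelope.
        keep = max(1, len(result) // 2)
        return {"result": result[:keep],
                "meta": {"complete": False, "truncated_at": keep,
                         "total_estimate": len(result)}}
    return {"result": result, "meta": {"complete": True}}
```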
6.8 Acknowledgment Loop
The dispatch path retains its full round-trip confirmation. The AI dispatches a change, the server applies it, SSE pushes the update to the frontend, the frontend renders and POSTs an acknowledgment, and the server includes that ack in the dispatch response (with a configurable timeout). The AI always knows whether its changes actually rendered. The /ai/query path does not use this loop; reads do not mutate state.
---
7. Server-Driven UI (SDUI)
7.1 Model
Plugins describe their UI as JSON component trees in ui.json. The framework's frontend has a recursive SDUI renderer that interprets these trees and produces real DOM elements. Plugins never send executable code to the frontend -- only data.
This means:
- The AI can read and modify any plugin's UI by reading and writing JSON
- UI changes are just data changes, tracked by git like everything else
- The frontend is a universal renderer, not a collection of bespoke plugin UIs
7.2 Component Vocabulary
The framework ships a fixed set of 30+ built-in component types. If a component type does not exist in the vocabulary, it cannot be used. There is no runtime extensibility of the component vocabulary. This keeps the system controllable and predictable -- the renderer knows every possible component type at build time.
| Category | Components |
|---|---|
| Layout | column, row, grid, stack, spacer, divider, tabs, accordion, split-pane |
| Display | text, heading, badge, icon, image, code-block, markdown, progress-bar, spinner |
| Input | button, text-input, number-input, checkbox, toggle, select, slider, color-picker, date-picker |
| Data | list, table, tree-view, key-value |
| Canvas | canvas (2D scene graph renderer -- see section 8) |
| Feedback | alert, toast, modal, tooltip, popover |
The specific props, behavior, and styling of each component will be defined in a component specification. The categories and component names listed here are the vocabulary.
7.3 Expression Syntax
Component props support ${expression} for data binding. Expressions are evaluated by a custom parser -- never by eval() or equivalent. The expression language is intentionally limited:
| Expression | Meaning |
|---|---|
| `${state.fieldName}` | Read from the plugin's state |
| `${item.fieldName}` | Current item in a list iteration |
| `${len(array)}` | Array length |
| `${if(condition, then, else)}` | Conditional value |
The expression language supports property access, function calls from a fixed allowlist, and basic comparison operators. It does not support assignment, loops, or arbitrary computation. It is a data-binding language, not a programming language.
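A deliberately tiny sketch of this binding model -- dotted property access plus one allowlisted function -- to make the "data-binding, not programming" boundary concrete. The real parser is richer; everything here is illustrative:

```python
import re

_EXPR = re.compile(r"\$\{([^}]+)\}")

def _lookup(path: str, scopes: dict):
    """Resolve a dotted path like 'state.nodes' against named scopes."""
    root, *rest = path.strip().split(".")
    value = scopes[root]
    for part in rest:
        value = value[part]
    return value

def bind(prop: str, scopes: dict) -> str:
    """Substitute every ${...} in a prop string. No eval(), no loops,
    no assignment -- only lookups and the fixed function allowlist."""
    def repl(match):
        expr = match.group(1).strip()
        if expr.startswith("len(") and expr.endswith(")"):
            return str(len(_lookup(expr[4:-1], scopes)))
        return str(_lookup(expr, scopes))
    return _EXPR.sub(repl, prop)
```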
7.4 Actions
Buttons and interactive components trigger commands through the same dispatch system the AI uses:
`{"type": "button", "props": {"label": "Delete", "command": "remove_node", "params": {"id": "${item.id}"}}}`
User clicks and AI dispatches are identical. The SDUI renderer translates a button click into a command dispatch to the same endpoint the AI calls. There is one command system, not two.
---
8. Scene Graph (Canvas)
8.1 Separate from SDUI
The canvas is a special SDUI component type that renders a 2D scene graph. It has its own data model separate from the layout component system because:
- Layout components are DOM elements positioned with CSS flexbox/grid
- Canvas content is SVG or WebGL with custom rendering: shapes, hit-testing, coordinate transforms, zoom, pan
The canvas component is an SDUI component (it appears in ui.json), but its children are scene graph elements, not other SDUI components.
Rendering constants (node radius, colors, stroke widths, etc.) are shared between the Python backend and the TypeScript frontend via scene_defaults.json at the project root. Both sides import from this single file to keep visual defaults in sync.
8.2 Scene Graph Schema
The scene graph describes visual elements as data in the plugin's state. It is not a separate data store -- it is regular plugin state that the canvas component knows how to render.
| Element type | Properties |
|---|---|
| circle | cx, cy, radius, fill, stroke, stroke-width, opacity |
| rect | x, y, width, height, rx, ry, fill, stroke, stroke-width, opacity |
| line | x1, y1, x2, y2, stroke, stroke-width, opacity |
| path | d (SVG path data), fill, stroke, stroke-width, opacity |
| polygon | points, fill, stroke, stroke-width, opacity |
| text-label | x, y, text, font-size, font-family, fill, text-anchor |
| group | children (list of elements), transform (translate, rotate, scale) |
| layer | children, visible, opacity, name |
Each element also supports:
- Interaction regions: hit-test areas that may differ from the visual bounds (e.g., a wider hit area for thin lines)
- Drag handles: named points on the element that can be dragged, mapped to commands via input bindings
8.3 Addressability
Every scene graph element has a stable, deterministic ID derived from the data (e.g., `node-{id}` for graph nodes, `edge-{source}-{target}` for graph edges). This means:
- The AI can reference specific visual elements by ID
- Styling rules can target specific elements or element types
- Input bindings can reference elements by ID, type, or group membership
---
9. Input Bindings
9.1 Declarative Bindings
Input handling is declared as data, not hardcoded in event handlers. Each binding maps a named gesture to a command:
`{"gesture": "click", "target": "canvas_empty", "command": "add_node", "params": {"x": "${event.x}", "y": "${event.y}"}}`
Bindings are declared in the plugin manifest or in a dedicated bindings file within the plugin directory. They are data, so they are readable and writable by the AI, tracked by git, and modifiable at runtime.
9.2 Named Gesture Primitives
The framework defines a fixed vocabulary of gestures. Each gesture type has a known phase schema defining what context variables are available in expressions.
| Category | Gestures |
|---|---|
| Pointer | click, double_click, right_click, drag (start/move/end phases), long_press, hover_enter, hover_leave |
| Keyboard | key_press (with modifiers: ctrl, shift, alt, meta), key_combo (chords like ctrl+shift+z) |
| Scroll | scroll (with delta_x, delta_y), pinch_zoom (with scale) |
| Selection | select, deselect |
Each gesture type provides a fixed context schema:
| Gesture | Context variables |
|---|---|
| click | event.x, event.y, event.target_id, event.target_type |
| drag | event.start_x, event.start_y, event.current_x, event.current_y, event.delta_x, event.delta_y, event.target_id, event.phase |
| key_press | event.key, event.ctrl, event.shift, event.alt, event.meta |
| scroll | event.delta_x, event.delta_y, event.x, event.y |
| select | event.target_id, event.target_type |
9.3 Runtime Modification
Input bindings are modifiable at runtime via meta-commands (set_input_bindings and pe.set_input_bindings). The AI can rebind what right-click does, add keyboard shortcuts, change drag behavior. Changes persist to disk like everything else.
---
10. Styling and Addressability
10.1 AI-Driven Styling
Every visual element in the UI is addressable via stable, deterministic selectors. The framework enforces addressability -- the SDUI renderer emits DOM elements with data attributes:
| Attribute | Example | Purpose |
|---|---|---|
| data-plugin | data-plugin="graphist" | Identifies which plugin owns the element |
| data-component | data-component="text" | Identifies the SDUI component type |
| data-role | data-role="title" | Semantic role within the plugin's UI |
| data-id | data-id="node-abc123" | Unique element identifier from the data |
The AI can:
- Read current computed styles of any element via the AI surface
- Generate and apply CSS rules targeting any combination of these selectors
- Respond to user descriptions ("this text is too small") by inspecting visual properties and applying fixes
- Create entirely new themes (dark mode, high contrast) by generating a complete stylesheet
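Because the data attributes are stable, generated CSS rules can be built mechanically from selector parts. A minimal sketch (function name hypothetical) of the kind of rule an AI might emit via pe.set_styles:

```python
def css_rule(selectors: dict, declarations: dict) -> str:
    """Build a CSS rule targeting the stable data attributes,
    e.g., [data-plugin="graphist"][data-role="title"]."""
    selector = "".join(f'[data-{k}="{v}"]' for k, v in selectors.items())
    body = "; ".join(f"{prop}: {val}" for prop, val in declarations.items())
    return f"{selector} {{ {body} }}"

# "This text is too small" -> target the plugin's title role and bump it:
print(css_rule({"plugin": "graphist", "role": "title"}, {"font-size": "18px"}))
# [data-plugin="graphist"][data-role="title"] { font-size: 18px }
```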
10.2 Design Token System
The framework provides a design token system (tokens.py) that maps named tokens to CSS custom property values. Tokens cover colors, spacing, typography, borders, shadows, and animation timing.
- Default tokens: a comprehensive set of --pe-<token-name> custom properties (e.g., --pe-color-primary, --pe-spacing-md, --pe-font-body)
- Themes: built-in dark (default) and light themes that override color-related tokens. The active theme is switched via the pe.set_theme meta-command.
- Per-token overrides: individual tokens can be overridden at runtime via the pe.set_tokens meta-command, merging on top of theme defaults.
- API endpoint: GET /api/tokens returns the resolved token set and active theme name. Token changes are broadcast to the frontend via tokens_changed SSE events.
The AI can also write CSS directly via pe.set_styles. The token system provides consistent defaults; direct CSS provides full expressive power when needed.
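The merge order (defaults, then theme, then per-token overrides) can be sketched as a straightforward dict merge; the token values below are illustrative, not the framework's actual defaults:

```python
def resolve_tokens(defaults: dict, theme: dict, overrides: dict) -> dict:
    """Resolve the effective token set: pe.set_tokens overrides merge on
    top of the active theme, which merges on top of the defaults."""
    return {**defaults, **theme, **overrides}

defaults = {"--pe-color-primary": "#3b82f6", "--pe-spacing-md": "12px"}
dark_theme = {"--pe-color-primary": "#60a5fa"}  # theme overrides color tokens
overrides = {"--pe-spacing-md": "16px"}         # runtime per-token override

print(resolve_tokens(defaults, dark_theme, overrides))
# {'--pe-color-primary': '#60a5fa', '--pe-spacing-md': '16px'}
```

This is the resolved set that GET /api/tokens would return and that a tokens_changed event would broadcast.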
10.3 Style Persistence
Style changes are written to CSS files in the plugin directory and tracked by git. The framework loads plugin CSS files and injects them into the page. When the AI modifies styles, the changes are files on disk, not ephemeral runtime state.
---
11. Persistence and Git
11.1 Model
Every change to any plugin's files is automatically committed to git. The framework manages git operations transparently -- neither humans nor AIs need to run git themselves; the framework handles it. (Users and AIs can still use the git CLI directly for advanced operations like branching or rebasing.)
11.2 What This Enables
- History: full undo/redo via git log; every state of the application is recoverable
- Diffing: see exactly what changed between any two points in time
- Rebasing: when an upstream plugin updates, rebase local modifications on top
- Collaboration: multiple agents' changes are tracked with attribution (commit author identifies the agent)
- Publishing: push a plugin's repo to a git host to share it
- Installation: clone a repo into plugins/ to install a plugin
- Store: GitHub (or any git host) IS the plugin store; a plugin's repository has plugin.toml as its manifest; discovery is git search
11.3 Auto-Commit Behavior
- Every successful command dispatch that modifies files triggers an auto-commit
- Commit message includes: the command name, the parameters (truncated if large), and the source (user, AI, or specific agent ID)
- Batch dispatches produce one commit per command (progressive, matching the streaming dispatch model)
- Commits are made to the current branch; the framework does not create branches automatically
11.4 Staging Safety
Auto-commits must use specific file staging, never blanket commands like git add -A or git add . -- the framework stages only files matching known patterns:
- *.toml -- plugin manifests
- *.star -- Starlark handlers
- *.json -- UI definitions, state snapshots
- *.css -- plugin stylesheets
Before any commit, the framework must verify that nothing is already staged in the index. If there are already-staged changes (from another session, a manual git add, or any other source), the auto-commit aborts with an error rather than committing someone else's work. This is a hard blocker, not a warning.
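The pattern filter can be sketched as a pure function (names hypothetical; the real implementation also performs the staged-index check described above before committing anything):

```python
from fnmatch import fnmatch

# Patterns the framework is allowed to stage (section 11.4);
# anything else is left untouched in the working tree.
STAGEABLE_PATTERNS = ("*.toml", "*.star", "*.json", "*.css")

def stageable_files(changed_files):
    """Return only files matching known patterns -- the auto-commit
    equivalent of specific staging, never git add -A."""
    return [
        path for path in changed_files
        if any(fnmatch(path, pattern) for pattern in STAGEABLE_PATTERNS)
    ]

changed = ["graphist/plugin.toml", "graphist/handlers.star",
           "graphist/__pycache__/x.pyc", "notes.txt"]
print(stageable_files(changed))
# ['graphist/plugin.toml', 'graphist/handlers.star']
```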
11.5 Repository Hygiene
The plugins git repository includes a .gitignore that excludes common junk files:
- __pycache__/
- *.pyc
- .DS_Store
- *.swp
- *.swo
- *~
- .pytest_cache/
- *.egg-info/
---
12. Handler Languages
12.1 Handler Language
Plugin handlers are written in Starlark. The loader enforces this: any plugin with a language field other than "starlark" is rejected at load time. Python and Lua support are planned for the future but not yet implemented.
| Language | Runtime | Adapter | Status |
|---|---|---|---|
| Starlark | starlark-pyo3 | Starlark evaluation, function extraction, dict conversion | Implemented |
| Python | CPython (host interpreter) | Direct import, function reference | Planned |
| Lua | LuaJIT via lupa | Lua state creation, function extraction, dict/table conversion | Planned |
12.2 Why Starlark
In brief: Starlark is a Python dialect (no new syntax for AI agents to generate), a hermetic sandbox by design, and trivially hot-reloadable because evaluation is stateless. Section 20 records the full rationale.
12.3 Starlark-in-Payload
When the AI registers commands at runtime, it can send Starlark source code in the dispatch payload. The framework:
- Receives the Starlark source string
- Writes it to the plugin's handler file on disk
- Compiles and validates it
- Registers the resulting function as a command
- Git commits the new handler file
This is safe because Starlark has no I/O, no imports, no network access, and no side effects beyond what the framework explicitly exposes through the handler contract.
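A minimal sketch of the validation step, assuming names not in this document: since Starlark is a dialect of Python, Python's ast module serves here as a rough first-pass syntax check; the real framework would use the starlark-pyo3 compiler for the authoritative check (section 14 notes this lives in metacommands.py _validate_code).

```python
import ast

def validate_handler_source(source: str) -> bool:
    """First-pass syntactic check before writing an AI-supplied handler
    to disk. Starlark shares Python's surface syntax, so ast.parse is a
    reasonable approximation for this sketch; it is NOT the framework's
    actual validator."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

good = "def add_node(state, params, ctx):\n    return state"
bad = "def add_node(state, params, ctx)\n    return state"  # missing colon
print(validate_handler_source(good), validate_handler_source(bad))  # True False
```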
---
13. Consumers
The following applications will be built on ProductEngine, validating the framework under real workloads:
| Consumer | Domain | Key challenges it validates |
|---|---|---|
| Graphist | Graph editor | Canvas/scene graph, interactive commands, real-time updates, drag interactions |
| VisualClaude | Claude Code session manager | Dashboard panels, SSE streams, system integration, complex state |
| PixelWeaver | Pixel art editor | Rich canvas with pixel-level manipulation, animation timeline, complex plugin ecosystem |
| TripPlanner | Travel planning and booking | Forms, data tables, external API integration, multi-step workflows |
| Supervisor | Multi-repo git management | DevOps tooling, background tasks, complex state machines, system-level integration |
Graphist is the first consumer and serves as the MVP test bed. It is simple enough to build fast and complex enough to exercise the full architecture: plugin loading, command dispatch, state management, canvas rendering, AI integration, and hot-reload.
---
14. Open Questions
These questions will be resolved during implementation. They are recorded here so that decisions are made deliberately, not by accident.
| Question | Context | Leading direction |
|---|---|---|
| Component vocabulary specifics | The 30+ components are categorized but specific props and behavior for each need to be designed | Design props during Graphist development; document as a component spec |
| Scene graph schema details | Shape types and visual properties are listed but interaction region model needs specification | Define during canvas implementation for Graphist |
| Expression language limits | What functions beyond len() and if() should be available? | Start minimal, add functions only when real plugins need them |
| Cross-plugin state access | Can plugin A read plugin B's state? Via command dispatch only, or directly? | Read access is direct (state is framework-owned); writes are via command dispatch only |
| Plugin dependency versioning | Name-only dependencies or semver constraints? | Start with name-only; add semver when the plugin ecosystem is large enough to need it |
| Concurrent agent coordination | When two AI agents modify the same plugin simultaneously, how are conflicts resolved? | Git merge conflicts surface the issue; the framework does not attempt automatic resolution |
| Auth system design | Token vs session vs OAuth? Single identity store or federated? | Simple token/API key for v1; design will evolve based on deployment mode requirements (see section 17) |
| Animation performance | CSS transitions vs JavaScript animation vs Web Animations API? | CSS transitions for simple property changes; Web Animations API for complex sequences; benchmark during implementation |
| Code-in-payload validation | Should Starlark sent via dispatch be compiled/validated before writing to disk? | Implemented in metacommands.py _validate_code; Starlark code is parsed before writing to disk |
| Changelog mechanism for diff-based queries | How to efficiently track state deltas for the ?since=N query approach? | Implemented as a ring-buffer changelog in state.py; get_changelog and get_state_diff are wired into the ?since=N query parameter in ai_api.py |
---
15. Animation and Transitions
15.1 Animation Categories
All animations fall into one of four categories:
| Category | Purpose | Available animations |
|---|---|---|
| Enter | Element appears in the UI | none, fade_in, scale_in, pop, slide_in (with from direction), draw (for lines/paths), grow (for bars) |
| Exit | Element leaves the UI | none, fade_out, scale_out, slide_out (with to direction), shrink, collapse |
| Transition | Smooth property changes between states | interpolate (default behavior for state-driven changes) |
| Emphasis | Attention-getters that do not change state | pulse, shake, glow, flash, bounce, ring |
Enter and exit animations play when elements are added to or removed from the component tree. Transition animations play automatically when state-driven property values change (position, size, color, opacity). Emphasis animations are triggered explicitly and do not modify state -- they draw attention to an element and then stop.
15.2 Easings
| Easing | Behavior |
|---|---|
| linear | Constant speed |
| ease_in | Slow start, fast end |
| ease_out | Fast start, slow end |
| ease_in_out | Slow start and end, fast middle |
| bounce | Overshoots and bounces back |
| elastic | Spring-like overshoot |
| step | Discrete jumps (no interpolation) |
15.3 Timing
| Property | Type | Purpose |
|---|---|---|
| delay_ms | integer | Wait before starting the animation |
| duration_ms | integer | How long the animation runs |
| stagger_ms | integer | Offset between children in a group or list (each child starts stagger_ms after the previous) |
15.4 Two-Layer Declaration
Animation behavior is declared at two layers. The second layer overrides the first.
- Layer 1 -- UI defaults: the
transitionsproperty on any component inui.jsondefines default enter, exit, and transition animations for that component. These are static defaults that apply unless overridden. - Layer 2 -- Dispatch rendering metadata: when a command is dispatched, the response can include rendering metadata that overrides the UI defaults for that specific operation. This allows the backend to control how a particular state change is presented, independent of the component's default animation.
Emphasis animations are not part of either layer. They are triggered explicitly via the pe.animate meta-command, which targets a specific element by ID and plays the named emphasis animation.
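A hypothetical ui.json fragment showing Layer 1 defaults. Only the transitions property name comes from this section; the nested schema (animation, duration_ms, easing keys) is illustrative and draws on the vocabulary from sections 15.1-15.3:

```json
{
  "type": "node",
  "transitions": {
    "enter": {"animation": "scale_in", "duration_ms": 200, "easing": "ease_out"},
    "exit": {"animation": "fade_out", "duration_ms": 150},
    "transition": {"easing": "ease_in_out", "duration_ms": 250}
  }
}
```

A dispatch response could then override just the enter animation for one specific operation via its rendering metadata, leaving these component defaults intact for every other state change.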
15.5 Batch Rendering Modes
When commands are dispatched in batches, the framework controls how results are rendered:
| Mode | Behavior |
|---|---|
| Progressive | Each command's result renders immediately as it completes. The user sees changes happen in real-time. This is the default. |
| Atomic | All results are buffered and rendered simultaneously when the entire batch completes. The UI updates once. |
| Grouped | Subsets of commands within the batch are designated as atomic groups or progressive sequences. Mixed rendering within a single batch. |
Rendering mode is controlled by the backend via dispatch rendering metadata, not by frontend settings. The backend decides how a batch should be presented. Individual commands within a batch can include per-command rendering overrides (e.g., "render this command's result atomically with the next two commands, but render the rest progressively").
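A hypothetical batch illustrating grouped rendering. All field names here are illustrative -- the document does not fix a wire schema:

```json
{
  "batch_id": "a1b2c3d4",
  "commands": [
    {"command": "add_node", "params": {"id": "a"}, "rendering": {"group": "g1", "mode": "atomic"}},
    {"command": "add_node", "params": {"id": "b"}, "rendering": {"group": "g1", "mode": "atomic"}},
    {"command": "add_edge", "params": {"source": "a", "target": "b"}}
  ]
}
```

Under this sketch, the two add_node results would render together when group g1 completes, while the add_edge result renders progressively as it finishes.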
---
16. Distribution Pipeline
Desktop mode requires packaging the Python backend, frontend assets, and runtime dependencies into a native application. The following stack handles freezing, packaging, updates, and system integration.
| Layer | Tool | Purpose |
|---|---|---|
| Freezing | PyInstaller / Nuitka | Bundle Python + dependencies into a standalone executable. Nuitka compiles Python to C for 2-4x faster runtime but longer build times. |
| Native installers | Briefcase | Produce .deb, .msi, .dmg from frozen bundle |
| Auto-updater | tufup | Cryptographically secure delta updates built on the TUF framework |
| System tray | pystray | System tray icon with menu; integrates via run_detached() |
| Notifications | desktop-notifier | Native OS notifications with interactive buttons and callbacks |
| Localhost TLS | mkcert | Locally-trusted certificates for HTTP/2 on localhost (see section 2.2) |
PyWebView already provides file dialogs, native menus, window management, clipboard access, and drag-and-drop. These do not need separate tooling.
Native packaging cannot be cross-compiled. CI/CD requires a GitHub Actions matrix build with runners for each target platform (Linux: .deb/.AppImage, macOS: .dmg with notarization, Windows: .msi with code signing).
---
17. Authentication and Identity
17.1 Why Authentication Is Needed
Authentication exists for attribution, not access control. Git commits must attribute changes to specific users and agents. Without identity, all commits appear to come from the same source, making collaboration history meaningless.
17.2 Design Principles
- Both humans and AI agents authenticate via the same system. There is no separate "agent auth" -- an agent is just another authenticated identity.
- Each dispatch includes the caller's identity. The framework uses this identity to set the git commit author on auto-commits.
- Required for meaningful commit source tracking in collaborative scenarios (multiple users, multiple agents, or mixed human/agent workflows).
17.3 Implementation
Token store: tokens are UUID4 strings stored in .productengine-auth.json at the project root. Each token maps to an identity with name and type fields. On first run, an admin token is generated and printed to stdout for bootstrapping.
AuthMiddleware: a pure ASGI middleware (no BaseHTTPMiddleware) that extracts Bearer tokens from the Authorization header and sets an identity dict on the ASGI scope. It does not reject unauthenticated requests -- it sets the identity to anonymous. In desktop mode, unauthenticated requests from loopback addresses (127.0.0.1, ::1) are automatically attributed to local-user with type human, since PyWebView requests have no token.
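The resolution order can be sketched as a pure function over the ASGI scope (function name and token table are hypothetical; the real AuthMiddleware sets the resulting identity dict on the scope):

```python
def identify(scope: dict, tokens: dict) -> dict:
    """Resolve a request's identity: Bearer token -> known identity;
    loopback without a token -> local-user (desktop mode, PyWebView
    requests carry no token); anything else -> anonymous."""
    headers = dict(scope.get("headers", []))  # list of (bytes, bytes) pairs
    auth = headers.get(b"authorization", b"").decode()
    if auth.startswith("Bearer "):
        token = auth[len("Bearer "):]
        if token in tokens:
            return tokens[token]
    client_host = (scope.get("client") or ("", 0))[0]
    if client_host in ("127.0.0.1", "::1"):
        return {"name": "local-user", "type": "human"}
    return {"name": "anonymous", "type": "unknown"}

tokens = {"tok-claude": {"name": "claude-agent", "type": "ai"}}
scope = {"headers": [(b"authorization", b"Bearer tok-claude")],
         "client": ("10.0.0.5", 50000)}
print(identify(scope, tokens))  # {'name': 'claude-agent', 'type': 'ai'}
```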
Endpoints:
| Endpoint | Method | Purpose |
|---|---|---|
| /api/auth/token | POST | Create a new token (requires existing valid token) |
| /api/auth/whoami | GET | Return the current request's identity |
| /api/auth/token/{token} | DELETE | Revoke a token |
Token creation requires authentication -- the admin token bootstraps the first session, and new tokens can only be created by authenticated callers. This prevents unauthenticated users from minting their own identities.
---
18. DispatchContext
18.1 Purpose
Every command dispatch carries a DispatchContext -- an explicit context object that threads identity and correlation metadata through the entire dispatch chain. It replaces the thread-local set_dispatch_source / get_dispatch_source mechanism that was used initially.
Thread-locals are problematic in async Python: coroutines sharing an event loop can observe each other's thread-local state, and context propagation across await boundaries is fragile. An explicit context object passed as a function argument is unambiguous -- there is no question about which identity applies to which dispatch, even when multiple dispatches are in flight concurrently.
18.2 Structure
| Field | Type | Purpose |
|---|---|---|
| identity | dict | The authenticated caller ({"name": "claude-agent", "type": "ai"}) |
| dispatch_id | int | Monotonic ID for correlating dispatch with SSE events and acknowledgments |
| batch_id | str or None | Shared ID across all commands in a batch dispatch (the "changeset" ID) |
| timestamp | datetime | When the dispatch was initiated |
Derived properties: source returns the identity type (ai, human, system, unknown), and author returns the identity name (used as git commit author).
A singleton SYSTEM_CTX exists for non-request-driven operations (file watcher triggers, startup initialization) where there is no HTTP request to extract identity from.
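The structure above can be sketched as a frozen dataclass. The field names, derived properties, and SYSTEM_CTX come from this section; the defaults and exact types are illustrative:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass(frozen=True)
class DispatchContext:
    """Explicit context threaded through every dispatch (section 18.2)."""
    identity: dict
    dispatch_id: int
    batch_id: Optional[str] = None
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    @property
    def source(self) -> str:
        """Identity type: ai, human, system, or unknown."""
        return self.identity.get("type", "unknown")

    @property
    def author(self) -> str:
        """Identity name, used as the git commit author."""
        return self.identity.get("name", "unknown")

# Singleton for non-request-driven operations (file watcher, startup).
SYSTEM_CTX = DispatchContext(identity={"name": "system", "type": "system"},
                             dispatch_id=0)

ctx = DispatchContext(identity={"name": "claude-agent", "type": "ai"},
                      dispatch_id=42, batch_id="a1b2c3d4")
print(ctx.source, ctx.author)  # ai claude-agent
```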
18.3 Handler Contract
Handlers receive the context as their third argument:
(state: dict, params: dict, ctx: DispatchContext) -> CommandResult
This makes identity available to any handler that needs it (e.g., for audit logging or conditional behavior based on whether the caller is a human or AI) without polluting the params dict.
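Because Starlark is a dialect of Python, a handler reads as a plain Python function. A hypothetical Graphist handler following this contract (the plain-dict return stands in for CommandResult, whose shape this section does not specify; the Ctx stand-in below exists only so the sketch is runnable outside the framework):

```python
# A hypothetical add_node handler as it would appear in handlers.star.
def add_node(state, params, ctx):
    nodes = dict(state.get("nodes", {}))
    nodes[params["id"]] = {
        "x": params["x"],
        "y": params["y"],
        "created_by": ctx.author,  # identity from DispatchContext
    }
    return {**state, "nodes": nodes}

class Ctx:  # illustrative stand-in for DispatchContext (Python-only)
    author = "claude-agent"

new_state = add_node({"nodes": {}}, {"id": "abc123", "x": 100, "y": 200}, Ctx())
print(new_state["nodes"]["abc123"]["created_by"])  # claude-agent
```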
---
19. Git Trailers
19.1 Purpose
Every auto-commit encodes structured metadata as git trailers in the commit message body. Trailers are a standard git convention (key-value pairs at the end of the commit message) that can be parsed programmatically with git log --format='%(trailers)'.
19.2 Trailer Fields
| Trailer | Example | Rationale |
|---|---|---|
| Source | Source: ai | Who initiated the change -- ai, human, system, or unknown. Enables filtering commit history by actor type (e.g., "show me all AI-generated changes"). |
| Command | Command: add_node | Which command produced the change. Enables tracing what action caused a specific file mutation, even months later. |
| Changeset | Changeset: a1b2c3d4 | Correlates commits that belong to the same batch dispatch. When an AI dispatches 5 commands in a batch, all 5 commits share the same Changeset ID, making it possible to revert or review the entire batch as a unit. |
19.3 Commit Format
cmd: add_node {'x': 100, 'y': 200}
Source: ai
Command: add_node
Changeset: a1b2c3d4
The commit author is set from DispatchContext.author (e.g., claude-agent <claude-agent@productengine>), providing a second axis of attribution alongside the Source trailer.
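Assembling this format can be sketched as follows (function name hypothetical; subject line and trailer fields match the format shown above):

```python
def commit_message(command, params, source, changeset=None):
    """Build an auto-commit message: subject line, blank line,
    then git trailers (parseable via git log --format='%(trailers)')."""
    lines = [f"cmd: {command} {params}", ""]
    lines.append(f"Source: {source}")
    lines.append(f"Command: {command}")
    if changeset:
        lines.append(f"Changeset: {changeset}")
    return "\n".join(lines)

print(commit_message("add_node", {"x": 100, "y": 200}, "ai", "a1b2c3d4"))
# cmd: add_node {'x': 100, 'y': 200}
#
# Source: ai
# Command: add_node
# Changeset: a1b2c3d4
```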
---
20. Starlark as the Canonical Handler Language
20.1 The Decision
Starlark is the only implemented handler language (see section 12). It is the canonical language that AI agents generate when creating or modifying plugins at runtime.
20.2 Why Starlark
Python syntax. Starlark is a dialect of Python. An AI that knows Python already knows Starlark. There is no new syntax to learn, no new idioms to generate. This eliminates an entire class of AI generation errors -- the AI does not need to context-switch between languages.
Hermetic sandbox by design. Starlark has no I/O, no imports beyond what the framework explicitly provides, no network access, no filesystem access, no side effects. A malicious or buggy Starlark handler cannot read files, open sockets, or modify global state. This is not a sandbox bolted on after the fact -- Starlark was designed for this from the ground up (it was created by Google for the Bazel build system, where untrusted BUILD files must be safe to evaluate).
Trivial hot-reload. Starlark evaluation is stateless -- there is no module-level state to preserve across reloads. The framework can unload and reload a handler file by simply re-evaluating it. Python handlers require careful module unloading (importlib.reload, clearing sys.modules entries) and can hold module-level state that is lost on reload.
Starlark-in-payload. The AI can send Starlark source code in a dispatch payload, and the framework writes it to disk, compiles it, and registers it as a command -- all within a single dispatch. This enables runtime command creation without pre-existing handler files. The safety guarantee makes this viable: no amount of Starlark code can escape the sandbox.
20.3 What Starlark Cannot Do
Starlark handlers cannot perform I/O, call external services, or access the filesystem. When Python handler support is added in the future, plugins that need I/O capabilities will be able to use Python handlers. The framework's handler contract ((state, params, ctx) -> CommandResult) will be the same regardless of language.