Bug addressed #144

ROOT CAUSE for fb#138 found — NOT orphaned timers / global store. Solo stores each agent's full spawn command in agents.metadata.command (16384-byte cap); the orchestrator's inline charter/reset-packet is ~19.8KB, exceeding the cap → ALL timer + scratchpad ops fail for that agent. Deterministic per charter size, so it recurs on every orchestrator spawn/reset. Fix: spawn daemons with a short command and deliver the charter out-of-band.

flower-orchestrator (daemon 45) · submitted 11 hours ago

detail

What they reported

Investigated live (agent-channels.db + solo.db, read-only) after the operator noted that ops's timers work while the orchestrator's don't, which disproves the inherited 'global resumed timer store' theory.

EVIDENCE:
- agent-channels.db `timers` table: 1344 rows, but pending-timer metadata is tiny (max 226B/owner). NOT the bloat.
- `agents.metadata` is ~99% a single `command` field holding the process's full spawn command line. flower-ops-2 = 15188B (command 14952B). No agent is currently >16384; ops-2 is the max at 15188 (just under).
- solo.db `processes.command` length: MY orchestrator (proc 1179, flower-orchestrator-2) = **19766 bytes**; flower-ops-2 (proc 1153) = **14952 bytes**.
- Solo caps serialized metadata at 16384 → the orchestrator (19.8KB) is rejected on the metadata write, so timer_list/timer_set/timer_cancel AND scratchpad_append all fail with `-32602 metadata must be at most 16384 bytes`. ops (metadata 15188 < 16384) still works. Workers I spawned by bare name and then fed via send_input have 143B metadata and would have working timers.

WHY IT 'RECURRED AFTER EVERY RESET' (40→43→45): each orchestrator is re-spawned with the same ~19.8KB reset-packet/charter inline → same oversized command metadata → same failure, deterministically. It is NOT accumulation, and a clean Solo restart does NOT durably fix it (the next orchestrator re-breaks on spawn).

CONSEQUENCE FOR OPERATOR ADVICE: cycling/restarting the orchestrator will NOT fix timers; a no-resume Solo restart will NOT durably fix them either. Both were implied by the wrong fb#138 model.

FIX OPTIONS (flower-side is unilateral):
1. Stop passing the charter/packet INLINE in the daemon spawn command. Spawn daemons the way workers are spawned (a short command) and deliver the charter out-of-band: via send_input after spawn, a file the daemon reads on boot, or (cleanest) a SHORT command that tells the daemon to fetch its charter via MCP `recall_charters` (the v7 charters already live in prompt_templates). Target: SpawnDaemonBridge / SpawnPacketService / DaemonResetService reset-packet delivery. This restores timers/pokes/scratchpads fleet-wide.
2. OR raise Solo's 16384 metadata cap (Solo-side; likely the true #130 fix).

Related: fb#138 (the symptom report this corrects), idea #143 (reset-handoff gap, same reset path).
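
A minimal sketch of fix option 1 using the "file the daemon reads on boot" variant. It is an illustration only, not the actual SpawnDaemonBridge code: the `flower-daemon` executable, the `--agent` and `--charter-file` flags, and the helper names are hypothetical, and the stored command is assumed to be the space-joined argv encoded as UTF-8. It shows that a 19766-byte charter passed inline exceeds the 16384-byte cap, while a command that only points at a charter file stays far below it.

```python
import tempfile
from pathlib import Path

SOLO_METADATA_CAP = 16384  # bytes; taken from the error text in the report


def build_short_spawn(agent: str, charter: str, charter_dir: Path) -> list[str]:
    """Write the charter to a file and return a short argv that points at it."""
    charter_dir.mkdir(parents=True, exist_ok=True)
    charter_path = charter_dir / f"{agent}.charter.txt"
    charter_path.write_text(charter, encoding="utf-8")
    # "flower-daemon", "--agent" and "--charter-file" are hypothetical names.
    return ["flower-daemon", "--agent", agent, "--charter-file", str(charter_path)]


def command_bytes(argv: list[str]) -> int:
    """Byte length of the command line, assuming it is stored space-joined as UTF-8."""
    return len(" ".join(argv).encode("utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    charter = "c" * 19766  # same size as the orchestrator command in the report
    argv = build_short_spawn("flower-orchestrator-2", charter, Path(tmp))
    inline_bytes = command_bytes(["flower-daemon", charter])  # charter passed inline
    short_bytes = command_bytes(argv)                         # charter delivered by file
    delivered = Path(argv[-1]).read_text(encoding="utf-8")

print(inline_bytes > SOLO_METADATA_CAP, short_bytes < SOLO_METADATA_CAP)
```

The same size check could run as a pre-spawn guard, so an oversized command fails loudly at spawn time instead of surfacing later as timer and scratchpad errors.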

context

Structured context

{
    "store": "agent-channels.db agents.metadata.command",
    "promotion_ledger": [
        {
            "at": "2026-07-05T08:42:20+00:00",
            "action": "source_brief_completed",
            "target": "brief",
            "brief_id": 271,
            "actor_ref": "flower-271-worker"
        }
    ],
    "corrects_feedback": 138,
    "ops_command_bytes": 14952,
    "solo_metadata_cap": 16384,
    "orchestrator_command_bytes": 19766
}
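
The margins implied by these figures can be worked out directly. The sketch below assumes only the three numeric fields above; the `headroom` helper is a hypothetical name. It compares the command length alone against the cap. The report puts flower-ops-2's full metadata at 15188B, which is 236B more than its command, so the real headroom is somewhat smaller than the command-only figure.

```python
context = {
    "ops_command_bytes": 14952,
    "solo_metadata_cap": 16384,
    "orchestrator_command_bytes": 19766,
}


def headroom(command_bytes: int, cap: int) -> int:
    """Bytes left under the cap; a negative value means the command alone exceeds it."""
    return cap - command_bytes


ops_headroom = headroom(context["ops_command_bytes"], context["solo_metadata_cap"])
orch_headroom = headroom(context["orchestrator_command_bytes"], context["solo_metadata_cap"])

print(ops_headroom, orch_headroom)  # 1432 -3382
```

So ops sits roughly 1.4KB under the cap on command length alone, while the orchestrator's command is about 3.4KB over it before any other metadata is counted.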

state · operator override

Lifecycle

created: 11h ago
triaged: 11h ago
resolved: 11h ago
resolved by: flower-271-worker

resolution
Brief #271: Daemon spawn command exceeds Solo's 16KB agent-metadata cap → breaks timers/pokes/scratchpads fleet-wide (fix: deliver charter out-of-band)
