Solo timer-collection bloat (#130) recurred after my orchestrator reset — ALL Solo timer ops now fail with 16KB overflow (timer_list/cancel/set + scratchpad_append), breaking self-poll + poke delivery fleet-wide
flower-orchestrator · submitted 17 hours ago
detail
What they reported
Recurrence of known bug #130 (Solo timer-collection never purged on process close). After the make-before-break orchestrator reset (daemon 40→43, closing Solo proc 1140) at ~02:30Z, every Solo timer MCP op returns `MCP error -32602: metadata must be at most 16384 bytes when serialized`. Confirmed on timer_list, timer_cancel(1622), timer_cancel(1649), timer_set, and scratchpad_append. solo-cli has NO `timers` command, so there is no local purge path; the only known remedy remains a Solo restart (which clears the collection) or the Solo-side purge-on-close fix (#130).

Fleet impact NOW:
- Orchestrator (daemon 43, me) cannot arm a Solo self-poll/work-loop timer, and cannot deliver queued daemon pokes (daemon_poke → orchestrator delivers via timer_set, which fails). Decision_wake pokes (signals 146/147 to refine) can't be delivered; this is mitigated because refine self-polls recall_decisions each tick.
- Heartbeat is UNAFFECTED (runs via `php artisan flower:daemon-checkin`, not a Solo timer).
- flower MCP (recall_*/signal_*/brief_*/decision_*) is UNAFFECTED; coordination via signals/decisions still works.
- Already-armed timers (ops/refine poll loops re-armed post-restart at 02:17/02:25) appear to keep firing; only NEW timer API ops fail.

Workaround in use: orchestrator self-poll via a harness-level background `sleep` loop instead of a Solo timer, until the bloat is cleared.

Recommendation: a Solo restart clears this; the operator may want to bundle it with the Herd Meili data-dir repoint (decision #71) since both are restart-adjacent infra fixes tonight. Durable fix is Solo-side #130.
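For reference, the self-poll workaround can be sketched roughly as below. This is only an illustration of the idea (a background loop that sleeps between ticks and does not touch the Solo timer API), not the harness's actual code. The names `start_self_poll` and `tick` are hypothetical, and the choice of a Python thread rather than a shell `sleep` loop is an assumption made for the example.

```python
import threading
from typing import Callable


def start_self_poll(
    tick: Callable[[], None],
    interval_s: float,
    stop: threading.Event,
) -> threading.Thread:
    """Call tick() every interval_s seconds on a daemon thread until stop is set.

    Hypothetical helper: stands in for a Solo timer while timer_set is failing.
    """

    def loop() -> None:
        # Event.wait() acts as an interruptible sleep: it returns True as soon
        # as stop is set, which ends the loop without waiting a full interval.
        while not stop.wait(interval_s):
            try:
                tick()
            except Exception as exc:
                # One failed tick should not kill the poll loop.
                print(f"self-poll tick failed: {exc!r}")

    thread = threading.Thread(target=loop, name="self-poll", daemon=True)
    thread.start()
    return thread
```

In this sketch `tick` would be whatever the orchestrator normally does on a timer fire (for example checking for queued pokes), and calling `stop.set()` ends the loop once Solo timers work again.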
context
Structured context
{
  "tool": "mcp__solo__timer_set",
  "error": "-32602 metadata must be at most 16384 bytes when serialized",
  "daemon": "flower-orchestrator-43",
  "related_bug": 130,
  "also_failing": [
    "timer_list",
    "timer_cancel",
    "scratchpad_append"
  ],
  "related_decision": 71
}
state · operator override
Lifecycle
- created: 17h ago
- triaged: —
- resolved: —
- resolved by: —