complete draft feedback flower
from feedback #109 · Roster MIA (~20m stale / 40m dead) ignores a declare...

Roster MIA thresholds ignore declared cadence — dormant daemons falsely marked stale/dead (fb #109)

canonical · plan

Spec

markdown

hand-off · dispatch

Dispatch

This brief is complete — dispatch is closed.

#106 done fresh flower · flower/109-roster-mia-cadence-aware
agent: claude
You are being dispatched from flower Brief #224: Roster MIA thresholds ignore declared cadence — dormant daemons falsely marked stale/dead (fb #109)

Recall pointer:
- Use recall_brief with id 224 for the full folder if you need provenance.

Target:
- project: flower (/Users/mikeferrara/Documents/code/flower)
- branch: flower/109-roster-mia-cadence-aware
- worktree: not specified
- kind: fresh

Current brief spec:
(no spec yet)

This is a direct request, not a fully-specced plan. If it's clear, resolve it. If you hit a blocking ambiguity, call brief_ask (or brief_append) with your questions and flip the brief to `refining` before proceeding — don't guess.

Recent/key trace events:
[1] participant_joined flower-orchestrator: (no body)
[2] note_added flower-orchestrator: ## Bug (autonomous / Funnel B — no operator approval; fb #109, routed by flower-ops)
The Roster MIA's fixed thresholds (~20m stale / 40m dead) falsely mark daemons that declared a wider poll cadence as stale or dead. In this session, flower-refine (daemon 27) was flagged STALE at ~28m and was nearly marked "dead" at 40m; the ops predecessor even raised a FALSE "REFINE ROLE DOWN" alarm, wasting ops/orch attention.

## Root cause (already diagnosed in code by flower-ops — authoritative fix-spec: Solo scratchpad 1086)
`app/Services/DaemonRosterService.php::livenessThresholds` has two branches:
- **Legacy** (taken iff `config('flower.daemon.stale_after_minutes')` is numeric) = cadence-BLIND (alive=stale_after, dead=2×).
- **Modern** (taken when that key is null — the live default per `config/flower.php` L434–439) = cadence-AWARE: `expectedInterval = max(heartbeatInterval 15, currentCadenceMinutes)` ⇒ dormant(120) ⇒ alive≈125 / dead≈250.

So the widening plumbing EXISTS, but it keys off `meta.current_cadence`, which is set ONLY by an actual `daemon-checkin --cadence=dormant`. The stale daemon's last check-in was `cadence=slow` ⇒ meta='slow' ⇒ expectedInterval=max(15,15)=15 ⇒ alive=20/dead=40, even though its loop had widened to ~100m. The "dormant → ~100m loop is safe" charter/handoff guidance is the footgun.
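
The two branches and the arithmetic above can be sketched as follows. This is a minimal Python illustration, not the PHP in `DaemonRosterService`. The function name, the cadence table, and the constants are assumptions: the 5-minute grace and the 2× dead multiplier are inferred only from the numbers quoted in this brief (15 ⇒ 20/40, 120 ⇒ 125/250).

```python
from typing import Optional, Tuple

HEARTBEAT_INTERVAL = 15  # minutes, as quoted in the brief
GRACE = 5                # ASSUMPTION: inferred from 15 -> alive 20 and 120 -> alive 125

# ASSUMPTION: only these two cadence values appear in the brief.
CADENCE_MINUTES = {"slow": 15, "dormant": 120}


def liveness_thresholds(
    stale_after: Optional[float],
    current_cadence: Optional[str],
) -> Tuple[float, float]:
    """Return (alive_until, dead_after) in minutes since the last check-in."""
    if stale_after is not None:
        # Legacy branch: a numeric override makes the window cadence-blind.
        return stale_after, 2 * stale_after

    # Modern branch: widen only by the cadence that was actually persisted.
    cadence_minutes = CADENCE_MINUTES.get(current_cadence or "", 0)
    expected = max(HEARTBEAT_INTERVAL, cadence_minutes)
    alive = expected + GRACE
    return alive, 2 * alive
```

Under these assumptions, a persisted `slow` gives (20, 40) and a persisted `dormant` gives (125, 250), while a numeric override of 20 gives (20, 40) whatever cadence was persisted. That last case is why the brief asks for the override to be guarded.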

## Fix (do all three; they compose)
- **(a) Guidance/charter correction** — heartbeat at least every ~14m REGARDLESS of declared cadence; treat 'dormant' as LESS WORK per tick, NOT a longer heartbeat interval. Fix wherever the "dormant ~100m loop / MIA widens to ~125m" guidance lives (charter templates + HANDOFF/playbook text). A widened MIA window must NEVER be assumed without a persisted `--cadence=dormant` check-in.
- **(b) Make dormant widening safe** — a stale slow-cadence record must not silently keep the 20m window while a daemon believes it's on a 125m dormant window. If dormant is meant to widen MIA, ensure it only takes effect when `--cadence=dormant` actually persisted to `meta.current_cadence` (and document that requirement clearly).
- **(c) Heartbeat-on-every-wake** — recommend/enforce running the check-in on EVERY wake (scheduled tick OR poke/decision_wake), not only scheduled ticks; a poke handled without check-in never refreshes liveness.
- **Regression test (required):** a dormant-cadence daemon whose last_checkin was 60–100m ago resolves to alive/stale, NOT dead. Also guard: verify no `FLOWER_DAEMON_STALE_AFTER_MINUTES` env override is silently forcing the legacy flat-20/40 branch (worktree .env drift is a known trap — assert the invariant, not the env).
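
Fixes (a) and (c) above can be pictured as a wake handler that always checks in first and only then decides how much work to do. The sketch below is hypothetical throughout: `DaemonLoop`, `on_wake`, `WORK_EVERY_N_TICKS`, and the `checkin` callable (standing in for a `daemon-checkin --cadence=...` call) are illustrative names, not Flower APIs.

```python
from typing import Callable

HEARTBEAT_EVERY_MIN = 14  # from the brief: the scheduler should wake at least this often

# ASSUMPTION: "dormant" means fewer full work passes, not fewer heartbeats.
WORK_EVERY_N_TICKS = {"slow": 1, "dormant": 8}


class DaemonLoop:
    def __init__(
        self,
        checkin: Callable[[str], None],
        work: Callable[[], None],
        cadence: str = "slow",
    ) -> None:
        self.checkin = checkin
        self.work = work
        self.cadence = cadence
        self.ticks_since_work = 0

    def on_wake(self, reason: str) -> None:
        # Heartbeat on EVERY wake: scheduled tick, poke, or decision_wake.
        self.checkin(self.cadence)

        if reason != "scheduled_tick":
            # A poke or decision_wake is handled immediately.
            self.work()
            self.ticks_since_work = 0
            return

        # On scheduled ticks, cadence controls work frequency only.
        self.ticks_since_work += 1
        if self.ticks_since_work >= WORK_EVERY_N_TICKS.get(self.cadence, 1):
            self.work()
            self.ticks_since_work = 0
```

In this sketch, eight scheduled ticks followed by one poke in `dormant` cadence produce nine check-ins but only two work passes, so liveness stays fresh while the workload drops.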

## Acceptance
Cadence-aware MIA verified by a regression test; charter/guidance text corrected to "≤14m heartbeat regardless of cadence"; no false stale/dead for a legitimately-dormant daemon. `php artisan test` green + `pint` clean. `Brief: #<this id>` trailer. Marking this brief complete auto-closes feedback #109.
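
The regression cases named above (dormant at 60 to 100m is not dead, a same-age slow daemon is dead, and no override is forcing the legacy branch) can be sketched as one self-contained check. This is an illustration in Python, not the PHP test. `classify`, `check_cadence_aware_liveness`, the inclusive boundaries, and the 5-minute grace are assumptions inferred from the thresholds quoted in this brief.

```python
from typing import Optional

HEARTBEAT_INTERVAL = 15                          # minutes, as quoted in the brief
GRACE = 5                                        # ASSUMPTION: inferred from 20/40 and 125/250
CADENCE_MINUTES = {"slow": 15, "dormant": 120}   # ASSUMPTION: only these two are quoted


def classify(age_min: float, current_cadence: str,
             stale_after: Optional[float] = None) -> str:
    """Classify a daemon by minutes since its last check-in."""
    if stale_after is not None:
        alive, dead = stale_after, 2 * stale_after   # legacy, cadence-blind
    else:
        expected = max(HEARTBEAT_INTERVAL, CADENCE_MINUTES.get(current_cadence, 0))
        alive = expected + GRACE
        dead = 2 * alive
    if age_min <= alive:                             # ASSUMPTION: inclusive boundaries
        return "alive"
    return "stale" if age_min <= dead else "dead"


def check_cadence_aware_liveness(configured_stale_after: Optional[float]) -> None:
    # Env-drift guard: assert the invariant (override is unset), not an env value.
    if configured_stale_after is not None:
        raise AssertionError("stale_after_minutes is set; legacy cadence-blind branch is live")
    # Persisted dormant cadence: 60/80/100m old must not be dead.
    for age in (60, 80, 100):
        if classify(age, "dormant") == "dead":
            raise AssertionError(f"dormant daemon marked dead at {age}m")
    # Control: the same age on slow cadence must still be dead.
    if classify(60, "slow") != "dead":
        raise AssertionError("slow daemon at 60m should be dead")
```

Calling `check_cadence_aware_liveness(None)` passes, while passing a numeric value such as `20` fails loudly. Under the same assumptions, `classify(28, "slow")` returns `"stale"`, which matches the incident reported for daemon 27.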
[3] link_added flower-orchestrator: (no body)
[4] status_change flower-orchestrator: (no body)

Recommended linked context:
{
    "todos": [],
    "scratchpads": []
}

Execution notes:
- Treat the brief as the source of truth.
- Keep work scoped to this dispatch request.
- Use brief_append / brief_update_status when reporting material progress; as your final dispatched-worker step, call brief_dispatch_complete with dispatch_request_id (or brief_id) and actor_ref.
- Codex workers should verify mutating Flower tools with tool_search query `brief_append brief_dispatch_complete flower_feedback` (limit 20) when tool availability is in doubt; report raw SEE/LOAD vs NOT visible instead of silently using local fallbacks.
- Add a git commit trailer `Brief: #224` to every commit for this brief so flower can exact-link commits back to the brief.

provenance · append-only

Trace

  1. link added 1d ago
    agent · system:commit-trailer
  2. participant joined 1d ago
    system · system:commit-trailer
  3. merged 1d ago

    Merged flower/109-roster-mia-cadence-aware into master on MAIN — merge commit **d3d3700** (worker commit 0f83029). Clean 3-way merge (daemon roster/charter/conventions code — distinct from all prior merges). Migration `2026_07_04_160000_replant_daemon_charter_cadence_heartbeat_guidance` ran on real MySQL (5.48ms, re-planted seeded ops/refine charter bodies; operator-customized + orchestrator bodies left intact). Full suite green on MAIN (1108 passed, 2 skips, 6885 assertions); pint clean. No npm build (no views). daemon-schema-reload SKIPPED deliberately: the corrected AgentConventions/charter guidance flows in at spawn, and the re-plant staged the DB bodies, so ops (1121) + refine (1092) pick up the fix on their next natural reset (they already self-corrected operationally) — forcing a make-before-break reset of both now is unnecessary churn. Worker proc 1128 closed. Auto-closed feedback #109.

    agent · flower-orchestrator
  4. status change 1d ago
    agent · flower-224-worker
  5. dispatched 1d ago

    Dispatch request #106 marked done.

    agent · flower-224-worker
  6. note added 1d ago

    Done: committed 0f83029 on branch flower/109-roster-mia-cadence-aware (do NOT merge; orchestrator merges from MAIN). Full suite green (1091 passed, 2 pre-existing skips) + pint clean.

    Approach (conservative + additive): NO threshold arithmetic changed. The modern cadence-aware branch already required a persisted meta.current_cadence, exactly matching reporter fix (b), so the bug was guidance + a missing regression lock, not a code-math error.

    (a) Guidance corrected: heartbeat ≤~14m REGARDLESS of declared cadence; dormant = LESS WORK per tick, not a longer heartbeat; MIA window only widens once a --cadence=dormant check-in actually persists; never assume a wider window just because you widened your own poll loop.
      • app/Support/AgentConventions.php::daemonLines(): reworded the poll-cadence line + added a heartbeat-on-EVERY-wake line. This block is rendered into every charter at spawn via {{daemon_conventions}} (SpawnPacketService), so it auto-reaches orchestrator/ops/refine without a re-seed.
      • app/Support/DaemonCharterDefaults.php: ops CYCLE + refine Cadence blocks reworded to match (poll cadence = WORK frequency, not heartbeat; heartbeat regardless of cadence + on every wake).
      • app/Mcp/Tools/DaemonCheckinTool.php: current_cadence field description clarified.

    (b) Invariant documented at the code site: app/Services/DaemonRosterService.php::livenessThresholds() modern branch now carries a comment explaining the widening requires a persisted --cadence=dormant and that a numeric FLOWER_DAEMON_STALE_AFTER_MINUTES override reverts to the cadence-BLIND legacy branch and re-introduces fb #109. Same invariant added to resources/docs/30-daemons-and-roster.md liveness section.

    (c) Heartbeat-on-every-wake: added to shared conventions + ops/refine charter blocks (poke/decision_wake, not only scheduled ticks).

    Re-plant migration (shape of 2026_07_01_016000): database/migrations/2026_07_04_160000_replant_daemon_charter_cadence_heartbeat_guidance.php updates the seeded ops/refine charter bodies for rows still holding the recognized pre-fix default (keyed on a marker unique to the old body); operator-customized charters + the unchanged orchestrator body are left intact. So MAIN picks up the corrected charter text on migrate.

    REQUIRED regression test, tests/Feature/Roster/CadenceAwareLivenessTest.php:
      • dormant daemon (persisted current_cadence=dormant), last_checkin 60/80/100m ago → alive/stale, NOT dead.
      • control: same-age (60m) SLOW daemon → dead (proves widening is cadence-driven, not globally loose).
      • env-drift guard: sets NO config override and asserts config('flower.daemon.stale_after_minutes') is null so the cadence-aware branch is live; a leaked FLOWER_DAEMON_STALE_AFTER_MINUTES fails it loudly. Asserts the invariant, not any env value.
    Plus lock-in guidance assertions in AgentConventionsTest, DaemonCharterDefaultsTest, SpawnPacketServiceTest (also updated the SpawnPacketServiceTest assertion for the reworded poll-cadence line).

    No blockers; the "change threshold code vs only guidance" decision resolved cleanly to guidance + doc + test (the code already correctly gated widening on persisted dormant), which is also the safest choice for this live-infra file.

    agent · flower-224-worker
  7. participant joined 1d ago
    system · flower-224-worker
  8. dispatched 1d ago

    Dispatch request #106 queued for flower.

    agent · flower-orchestrator
  9. status change 1d ago
    agent · flower-orchestrator
  10. status change 1d ago
    agent · flower-orchestrator
  11. link added 1d ago
    agent · flower-orchestrator
  12. note added 1d ago


    agent · flower-orchestrator
  13. participant joined 1d ago
    system · flower-orchestrator

epic · dependencies

Relationships

epic parent

depends on

No dependencies — dispatchable once planned.

agents · waves

Participants

  • flower-orchestrator participant · active
  • flower-224-worker participant · active
  • system:commit-trailer participant · active

trace · graph

Links

  • Commit #4005 execution
  • Feedback #109 seed

scope

Projects

  • flower · primary

dogfood · read-only

Agent’s-eye view

The literal recall_brief payload an agent gets — same service path as the MCP tool.