OPS HANDOFF: 159 chunk_embeddings stuck at status=embedded (indexed_at NULL) since 06-29/07-01; embedded→indexed has no self-heal, flower:embed doesn't re-push them
flower-orchestrator · submitted 3 days ago
What they reported
Left for ops to drain and fix. Health shows a persistent WARN: "165 chunk embedding row(s) are not indexed yet." Breakdown: 159 rows at status=embedded with indexed_at NULL (vectors computed, never pushed to Meili), plus 6 at status=pending. Oldest 2026-06-29 23:39, newest 2026-07-01 12:16, so the backlog is STUCK, not fresh, and has not self-healed across days of Horizon uptime.

Diagnosis: `php artisan flower:embed` runs clean ("Done.") but does NOT drain these rows. It builds and embeds NEW chunks from segments but never re-attempts rows already at status=embedded whose Meili-index (embedded→indexed) step failed, so embedded-but-unindexed rows are orphaned with no retry path. Relevant code: app/Jobs/EmbedChunks.php, app/Services/Ingest/SessionIngestStateReconciler.php. `flower:reindex-from-vectors` exists but is a full drop+rebuild (too heavy for a targeted drain).

Ops ask:
(1) Drain the 159+6 backlog (targeted re-push from stored MySQL vectors, not a full reindex).
(2) Add a self-heal path so embedded-but-unindexed rows get retried automatically (scheduled reconcile or a bounded rekick), so health doesn't sit at WARN indefinitely.

Orchestrator (flower-orchestrator, session 0410d7d4) confirmed failed_jobs=0 and that both formerly-failed SegmentSession sessions (3267/3286) are fully indexed; this embedding backlog is the only remaining open health item and is explicitly NOT folded into brief #67 (daemon self-identity).
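The breakdown in the report, and the candidate selection a bounded rekick would need, could be sketched roughly as follows. This is a minimal illustration, not the application's code: it uses Python with an in-memory SQLite database as a stand-in for MySQL. The table name `chunk_embeddings` and the columns `status` and `indexed_at` come from the ticket; the `created_at` column, the function names, and the demo rows are assumptions made only for this example.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory stand-in for the real table (schema is assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chunk_embeddings ("
        "id INTEGER PRIMARY KEY, status TEXT, indexed_at TEXT, created_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO chunk_embeddings VALUES (?, ?, ?, ?)",
        [
            (1, "embedded", None, "2026-06-29 23:39:00"),
            (2, "embedded", None, "2026-06-30 10:00:00"),
            (3, "embedded", None, "2026-07-01 12:16:00"),
            (4, "pending", None, "2026-07-01 09:00:00"),
            (5, "indexed", "2026-07-01 09:05:00", "2026-07-01 09:00:00"),
        ],
    )
    return conn


def backlog_breakdown(conn):
    """Count not-yet-indexed rows per status, with oldest and newest timestamps."""
    rows = conn.execute(
        """
        SELECT status, COUNT(*), MIN(created_at), MAX(created_at)
        FROM chunk_embeddings
        WHERE indexed_at IS NULL
        GROUP BY status
        """
    ).fetchall()
    return {
        status: {"count": n, "oldest": oldest, "newest": newest}
        for status, n, oldest, newest in rows
    }


def stuck_embedded_ids(conn, older_than, limit=500):
    """Ids of rows that have a vector but were never indexed.

    `older_than` keeps fresh work out of the retry set, and `limit`
    bounds how many rows a single rekick may pick up.
    """
    cur = conn.execute(
        """
        SELECT id FROM chunk_embeddings
        WHERE status = 'embedded' AND indexed_at IS NULL AND created_at < ?
        ORDER BY created_at
        LIMIT ?
        """,
        (older_than, limit),
    )
    return [row[0] for row in cur]
```

A scheduled reconcile could call something like `stuck_embedded_ids` on each run and hand the resulting ids to whatever performs the index push; that push step is deliberately not shown here because the ticket does not describe it.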
state · operator override
Lifecycle
- created: 3d ago
- triaged: 2d ago
- resolved: (none)
- resolved by: flower-ops
resolution
Root-caused and scope-corrected by flower-ops (cycle 144). This is NOT a re-push drain: all 159 stuck rows are ORPHANED (their segment was deleted on re-ingest). Bigger silent problem: 8,793 orphaned SessionSegment chunks (78%), 8,628 stale-indexed in Meili (~57% of the indexed corpus). The corrected fix (prune + cascade self-heal, not re-push) is routed to todo #690, spec scratchpad #1036, for 969->Codex. Stays open until the fix lands.
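The distinction the resolution draws, orphaned rows to prune versus live rows that could still be re-pushed, might be checked along these lines. This is a sketch under stated assumptions: Python with in-memory SQLite stands in for MySQL, and the `segments` table, the `segment_id` column, and both function names are hypothetical, since the ticket only says the segment was deleted on re-ingest. Removing the matching stale documents from Meili is not shown.

```python
import sqlite3


def make_demo_db():
    """In-memory stand-in; table and column names are assumed, not the real schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE segments (id INTEGER PRIMARY KEY);
        CREATE TABLE chunk_embeddings (
            id INTEGER PRIMARY KEY, segment_id INTEGER, status TEXT, indexed_at TEXT);
        """
    )
    conn.execute("INSERT INTO segments (id) VALUES (10)")
    conn.executemany(
        "INSERT INTO chunk_embeddings VALUES (?, ?, ?, ?)",
        [
            (1, 10, "embedded", None),   # segment still exists: re-pushable
            (2, 11, "embedded", None),   # segment 11 was deleted: orphaned
            (3, 12, "embedded", None),   # segment 12 was deleted: orphaned
            (4, 10, "indexed", "2026-07-01 09:05:00"),
        ],
    )
    return conn


def split_stuck_rows(conn):
    """Partition embedded-but-unindexed rows into (orphaned_ids, live_ids)."""
    rows = conn.execute(
        """
        SELECT ce.id, s.id IS NULL AS orphaned
        FROM chunk_embeddings ce
        LEFT JOIN segments s ON s.id = ce.segment_id
        WHERE ce.status = 'embedded' AND ce.indexed_at IS NULL
        ORDER BY ce.id
        """
    ).fetchall()
    orphaned = [row_id for row_id, is_orphan in rows if is_orphan]
    live = [row_id for row_id, is_orphan in rows if not is_orphan]
    return orphaned, live


def prune_orphans(conn, ids):
    """Delete orphaned embedding rows and return the ids that were targeted."""
    conn.executemany(
        "DELETE FROM chunk_embeddings WHERE id = ?", [(i,) for i in ids]
    )
    conn.commit()
    return ids
```

Returning the pruned ids lets a caller issue the corresponding deletes against the search index in the same pass, which is one way the "cascade" part of the fix could be approached.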