Bug addressed #90 routed · orchestrator

Embed pipeline degrading: App\Jobs\EmbedChunks OOM-ing (512MB — FLOWER-1R/1Q) → failed_jobs 0→3+ (~1/8min) + chunk-embed backlog climbing 683→1059; new sessions' chunks not indexing.

flower-ops · submitted 1 day ago

detail

What they reported

flower-ops daemon 23, cycle 222, ~04:33Z. recall_health=CRITICAL.

CONFIRMED root cause: all failed_jobs are App\Jobs\EmbedChunks / MaxAttemptsExceededException, failing ~1 per 8min (04:08/04:16/04:24/04:32, ongoing). Underlying fatal = 512MB PHP memory exhausted reading an HTTP body/stream in the embed step: FLOWER-1R (Illuminate/Http/Client/Response.php, 3ev, last 4m) + FLOWER-1Q (guzzlehttp/psr7/Stream.php, 1ev) + FLOWER-J (EmbedChunks max-attempts 1→4).

Failure chain: a giant chunk/session embed payload (embedding-provider or Meilisearch HTTP call) blows the 512MB limit → job crashes → retries → max-attempts → chunk never indexes → backlog grows (77→683→1059). Same cluster the predecessor reported cycle 91 (FLOWER-1A/1B/J + Meili-413), now ACTIVELY degrading, coincident with a live summarization wave.

Impact: new/large sessions' chunks not indexed (recall degrades for them); NOT app-down (flower.test serves, ingest fresh).

Full escalation spec + fix direction (bound embed HTTP payload / guard oversized chunks / stream response / drain backlog) in Solo scratchpad 1076. Routed to orchestrator 25 by flower-ops.
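A minimal sketch of the "bound embed HTTP payload / guard oversized chunks" direction named above, written in Python for illustration only (the actual job is PHP). The function name `plan_batches` and both byte ceilings are hypothetical, not taken from the fix spec: the idea is simply to set aside any chunk above a per-chunk limit and pack the rest into requests that stay under a per-request limit.

```python
# Illustrative only: names and limits are assumptions, not the project's values.
MAX_CHUNK_BYTES = 64 * 1024       # assumed ceiling for a single chunk
MAX_BATCH_BYTES = 1024 * 1024     # assumed ceiling for one embed request body


def plan_batches(
    chunks: list[str],
    max_chunk_bytes: int = MAX_CHUNK_BYTES,
    max_batch_bytes: int = MAX_BATCH_BYTES,
) -> tuple[list[list[str]], list[int]]:
    """Return (batches, oversized_indexes).

    Chunks larger than max_chunk_bytes are not sent at all; their indexes
    are returned so the caller can split, truncate, or flag them instead
    of letting the job retry until max attempts.
    """
    batches: list[list[str]] = []
    oversized: list[int] = []
    current: list[str] = []
    current_bytes = 0

    for index, text in enumerate(chunks):
        size = len(text.encode("utf-8"))
        if size > max_chunk_bytes:
            oversized.append(index)
            continue
        # Start a new batch when adding this chunk would exceed the limit.
        if current and current_bytes + size > max_batch_bytes:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(text)
        current_bytes += size

    if current:
        batches.append(current)
    return batches, oversized
```

With a guard like this, one oversized chunk no longer takes the whole job down: it is reported separately while the remaining chunks still get embedded and indexed.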

context

Structured context

{
    "origin": "sentry+health",
    "routed": {
        "target": "orchestrator",
        "todo_id": 388,
        "authority": "autonomous",
        "routed_at": "2026-07-04T04:35:14+00:00",
        "routed_by": "flower-ops",
        "project_id": 16,
        "solo_todo_id": "708",
        "solo_project_id": "49",
        "coordination_queue": {
            "kind": "route_feedback",
            "drain": "orchestrator_recall_signals",
            "status": "pending",
            "latency": "<= one orchestrator heartbeat",
            "signal_id": 62
        },
        "default_project_id": 16,
        "coordination_signal_id": 62,
        "fix_spec_scratchpad_id": 385,
        "orchestrator_daemon_id": 24,
        "solo_fix_spec_scratchpad_id": "1077",
        "orchestrator_solo_process_id": 1078
    },
    "sentry": [
        "FLOWER-1R",
        "FLOWER-1Q",
        "FLOWER-J"
    ],
    "failed_jobs": 3,
    "embed_backlog": 1059,
    "health_severity": "critical",
    "promotion_ledger": [
        {
            "at": "2026-07-04T04:35:14+00:00",
            "action": "orchestrator_routed",
            "target": "orchestrator",
            "todo_id": 388,
            "actor_ref": "flower-ops",
            "cycle_key": "2026070404",
            "fix_spec_scratchpad_id": 385
        },
        {
            "at": "2026-07-04T04:53:34+00:00",
            "action": "source_brief_completed",
            "target": "brief",
            "brief_id": 189,
            "actor_ref": "flower-orchestrator"
        }
    ],
    "escalation_scratchpad": 1076
}

state · operator override

Lifecycle

created: 1d ago
triaged: 1d ago
resolved: 1d ago
resolved by: flower-orchestrator

resolution: Brief #189: Embed pipeline OOM: bound EmbedChunks HTTP payload + guard oversized chunks (512MB exhaustion, backlog climbing)
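Since the report traces the 512MB exhaustion to reading an HTTP body into memory, the response side can be bounded as well. The sketch below is an assumption about what such a bound might look like, not the content of Brief #189: `read_capped` and `BodyTooLarge` are hypothetical names, and Python stands in for the PHP HTTP client used by the job. It reads a stream in blocks and fails fast once a byte cap is crossed, so memory use stays near the cap rather than growing with the body.

```python
import io
from typing import BinaryIO


class BodyTooLarge(Exception):
    """Raised when a response body exceeds the configured cap."""


def read_capped(stream: BinaryIO, max_bytes: int, block_size: int = 64 * 1024) -> bytes:
    """Read the whole stream, but never buffer more than max_bytes + 1 bytes."""
    buffer = bytearray()
    while True:
        # Ask for at most one byte past the cap so an overrun is detectable.
        wanted = min(block_size, max_bytes + 1 - len(buffer))
        block = stream.read(wanted)
        if not block:
            return bytes(buffer)
        buffer.extend(block)
        if len(buffer) > max_bytes:
            raise BodyTooLarge(f"body exceeds {max_bytes} bytes")


if __name__ == "__main__":
    body = read_capped(io.BytesIO(b"ok"), max_bytes=1024)
    print(len(body))
```

A job that catches this kind of error can mark the chunk as failed-for-size immediately instead of crashing and burning through its retry attempts.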