Loop detectors¶
Detection is rule-based and transparent — no black-box scoring. On every event the backend re-scans the run and raises (or updates) warnings. Each warning says what happened, why it matters, and what to try.
The ten rules¶
| Warning | Fires when |
|---|---|
repeated_tool_call |
same tool ≥3× within the last 8 tool calls (the window slides over tool calls, not raw events, so interleaved ReAct traces still trip it) |
repeated_tool_call_similar_input |
same tool ≥3× with ≥85% similar input |
repeated_tool_call_exact_input |
same tool ≥3× with byte-identical input — the highest-confidence repeat signal |
no_progress |
a tool repeats with no state_updated / memory_write between calls |
empty_result_loop |
a tool returns empty / "no results" ≥3× — the agent is looping on a dead end |
retry_storm |
retry_triggered ≥3× in the run |
long_running_step |
a step over 30s |
cost_spike |
one event > 50% of run cost so far (above a $0.05 floor) |
cost_budget_exceeded |
run total cost crosses LOOPLENS_COST_BUDGET — opt-in, only active when that env var is set |
handoff_bounce |
control ping-pongs between the same two agents (A→B→A→B) |
The same loop can trip several rules at once — that's intended. Byte-identical repeats, for instance, are also "similar" and usually show "no progress", and the stacked penalties reflect higher confidence that the run is stuck.
Health scoring¶
Each warning carries a penalty. A run starts at 100; penalties subtract from it, and the score maps to a status:
| Warning | Penalty |
|---|---|
no_progress |
−30 |
repeated_tool_call_exact_input |
−25 |
repeated_tool_call_similar_input |
−20 |
empty_result_loop |
−20 |
retry_storm |
−20 |
handoff_bounce |
−20 |
repeated_tool_call |
−15 |
cost_spike |
−15 |
cost_budget_exceeded |
−15 |
long_running_step |
−10 |
| (run ended in failure) | −30 |
| Score | Status |
|---|---|
| 80–100 | Healthy |
| 50–79 | Warning |
| 0–49 | Likely stuck |
| any, run failed | Failed |
Notes and limits¶
- Detection is rule-based, not semantic —
similar_inputuses string similarity, not embeddings. - Warnings are deduplicated: a single loop raises one warning that updates its count as the loop grows, rather than one warning per event.
- These rules work the same whether events arrive via OpenTelemetry, the LangGraph adapter, or the manual SDK.
- In the dashboard, click a warning to jump to its culprit event — it highlights and scrolls the timeline to the offending event and opens its detail drawer.