# Parse Liberally, Match Fuzzy
LLMs return approximately correct output. Your parsing layer needs to handle "close enough" — because prompt engineering alone won't get you to 100%.
## The 90% Problem
You can add "return the EXACT rule ID" to your prompt. The LLM will comply 90% of the time. The other 10% of evaluations get silently dropped: the LLM does the work and returns a verdict, but you can't match it back to a rule because it wrote `debug-loop` instead of `rule-debug-loop`.
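A minimal sketch of the failure mode, using hypothetical rule and result shapes (the `rules`/`evalResults` structures here are illustrative, not from the original):

```javascript
// Hypothetical shapes: rules carry prefixed IDs; the LLM echoes them back.
const rules = [{ id: 'rule-debug-loop' }, { id: 'rule-dead-code' }];
const evalResults = [
  { ruleId: 'rule-dead-code', verdict: 'pass' },
  { ruleId: 'debug-loop', verdict: 'fail' }, // LLM dropped the "rule-" prefix
];

// Strict matching: the second verdict can never be joined back to its rule.
const matched = rules.map(rule => ({
  rule: rule.id,
  result: evalResults.find(r => r.ruleId === rule.id) ?? null,
}));
// matched[0].result is null — the debug-loop evaluation is silently lost.
```

No error is thrown anywhere, which is exactly why the drop goes unnoticed.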
## The Fix: Three-Way Matching
```js
const match = evalResults.find(r =>
  r.ruleId === rule.id ||                      // exact match
  `rule-${r.ruleId}` === rule.id ||            // LLM dropped the prefix
  r.ruleId === rule.id.replace('rule-', '')    // LLM added the prefix
);
```

## The Same Pattern Everywhere
This "parse liberally" principle shows up across the entire LLM layer:
### Qwen3 leaks reasoning tokens

```js
// Strip <think> blocks before parsing
content.replace(/<think>[\s\S]*?<\/think>\s*/g, '').trim()
```

### Models wrap JSON in markdown fences
```js
// Strip markdown code fences
content.replace(/^```(?:json)?\s*\n?([\s\S]*?)\n?\s*```$/g, '$1').trim()
```

### Qwen3 wraps arrays in objects
```js
// Handle Qwen3's array-wrapping quirk
let evalResults;
if (Array.isArray(response)) {
  evalResults = response;
} else if (typeof response === 'object' && response !== null) {
  // Dig out the first array-valued property
  const arrayProp = Object.values(response).find(v => Array.isArray(v));
  evalResults = arrayProp || [response];
}
```

## The Principle
With traditional APIs, a contract violation is a bug you file. With LLMs, approximate compliance is the norm. Your parsing layer is your real contract enforcement. Build tolerance into your parsing, not just your prompts — because prompts get you to 90%, and the last 10% is where the bugs hide.
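Putting the pieces together, here is a minimal end-to-end sketch of that tolerant parsing layer. The function names (`parseEvalResults`, `matchResult`) and the sample payload are illustrative assumptions, not the original implementation:

```javascript
// Illustrative pipeline: normalize raw model output, then fuzzy-match IDs.
function parseEvalResults(raw) {
  // 1. Strip leaked <think> reasoning blocks
  let content = raw.replace(/<think>[\s\S]*?<\/think>\s*/g, '').trim();
  // 2. Strip markdown code fences
  content = content.replace(/^```(?:json)?\s*\n?([\s\S]*?)\n?\s*```$/g, '$1').trim();
  const response = JSON.parse(content);
  // 3. Unwrap object-wrapped arrays
  if (Array.isArray(response)) return response;
  const arrayProp = Object.values(response).find(v => Array.isArray(v));
  return arrayProp || [response];
}

function matchResult(evalResults, rule) {
  return evalResults.find(r =>
    r.ruleId === rule.id ||                     // exact match
    `rule-${r.ruleId}` === rule.id ||           // LLM dropped the prefix
    r.ruleId === rule.id.replace('rule-', '')   // LLM added the prefix
  );
}

// Sample output exhibiting all three quirks at once:
const raw = '<think>checking…</think>```json\n{"results":[{"ruleId":"debug-loop","verdict":"fail"}]}\n```';
const results = parseEvalResults(raw);
const match = matchResult(results, { id: 'rule-debug-loop' });
```

Each normalization step is harmless when the model behaves, so the happy path costs nothing and the 10% path stops dropping data.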