Technique · 5 min · 2026-03-04

Parse Liberally, Match Fuzzy

LLMs return approximately correct output. Your parsing layer needs to handle "close enough" — because prompt engineering alone won't get you to 100%.

The 90% Problem

You can add "return the EXACT rule ID" to your prompt. The LLM will comply 90% of the time. The other 10% of the time, evaluations get silently dropped — the LLM does the work, returns a verdict, but you can't match it back to a rule because it wrote debug-loop instead of rule-debug-loop.

what you asked for vs what you got
── Prompt ───────────────────────────────────────
Evaluate these rules. Return the EXACT ruleId.
Rules: rule-debug-loop, rule-stalled-progress
 
── LLM returned ─────────────────────────────────
{ "ruleId": "debug-loop", "verdict": "violation" }
dropped the "rule-" prefix
 
── Strict matching ──────────────────────────────
"debug-loop" === "rule-debug-loop" → false
Result: evaluation orphaned. Silent data loss.

The Fix: Three-Way Matching

packages/supervisor/src/evaluator.ts
const match = evalResults.find(r =>
  r.ruleId === rule.id ||                          // exact match
  `rule-${r.ruleId}` === rule.id ||                // LLM dropped prefix
  r.ruleId === rule.id.replace('rule-', '')        // reverse check
);
fuzzy matching
"debug-loop" === "rule-debug-loop" → false
"rule-debug-loop" === "rule-debug-loop" → true ✓
 
Result: evaluation matched. No data loss.
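The three comparisons can also be collapsed into a single normalization step, which scales better if more ID quirks show up. A minimal sketch (matchRuleId is an illustrative name, not from the codebase):

```typescript
// Normalize both IDs by stripping any "rule-" prefix, then compare.
// Covers all three cases above: exact, prefix dropped, prefix added.
function matchRuleId(returned: string, canonical: string): boolean {
  const strip = (id: string) => id.replace(/^rule-/, '');
  return strip(returned) === strip(canonical);
}

matchRuleId('debug-loop', 'rule-debug-loop');        // true: prefix tolerated
matchRuleId('debug-loop', 'rule-stalled-progress');  // false: no cross-match
```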

The Same Pattern Everywhere

This "parse liberally" principle shows up across the entire LLM layer:

Qwen3 leaks reasoning tokens

raw LLM output
<think>
The user wants classification...I should return JSON...
</think>
{"action": "new_topic", "title": "Auth Debug"}
// Strip <think> blocks before parsing
content.replace(/<think>[\s\S]*?<\/think>\s*/g, '').trim()
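Wrapped as a reusable step so the cleaned string can go straight to JSON.parse — a sketch (stripThink is an illustrative name):

```typescript
// Remove <think>…</think> reasoning blocks. The non-greedy quantifier
// removes each block separately instead of everything between the
// first <think> and the last </think>.
function stripThink(content: string): string {
  return content.replace(/<think>[\s\S]*?<\/think>\s*/g, '').trim();
}

const raw = '<think>\nThe user wants classification...\n</think>\n{"action": "new_topic"}';
JSON.parse(stripThink(raw)).action;  // "new_topic"
```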

Models wrap JSON in markdown fences

raw LLM output
```json
{"verdict": "pass", "confidence": 0.9}
```
// Strip markdown code fences
content.replace(/^```(?:json)?\s*\n?([\s\S]*?)\n?\s*```$/g, '$1').trim()
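As a helper (a sketch; stripFences is an illustrative name), with a no-op path for output that was never fenced:

```typescript
// Strip an outer ```json fence if present; leave unfenced content alone.
function stripFences(content: string): string {
  return content
    .trim()
    .replace(/^```(?:json)?\s*\n?([\s\S]*?)\n?\s*```$/g, '$1')
    .trim();
}

stripFences('```json\n{"verdict": "pass"}\n```');  // '{"verdict": "pass"}'
stripFences('{"verdict": "pass"}');                // unchanged
```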

Qwen3 wraps arrays in objects

asked for an array, got an object
── Expected ─────────────────────────────────────
[{"ruleId": "debug-loop", "verdict": "pass"}]
 
── Got ──────────────────────────────────────────
{"results": [{"ruleId": "debug-loop", "verdict": "pass"}]}
packages/supervisor/src/evaluator.ts
// Handle Qwen3's array-wrapping quirk
if (Array.isArray(response)) {
  evalResults = response;
} else if (typeof response === 'object' && response !== null) {
  // Dig out the first array property
  const arrayProp = Object.values(response).find(v => Array.isArray(v));
  evalResults = arrayProp || [response];
}
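The same unwrapping logic as a standalone helper (a sketch; unwrapArray is an illustrative name):

```typescript
// Accept a bare array, an object wrapping one, or a single bare object.
function unwrapArray(response: unknown): unknown[] {
  if (Array.isArray(response)) return response;
  if (typeof response === 'object' && response !== null) {
    // Dig out the first array-valued property, e.g. {"results": [...]}
    const arrayProp = Object.values(response).find(v => Array.isArray(v));
    return (arrayProp as unknown[]) ?? [response];
  }
  return [response];
}

unwrapArray({ results: [{ verdict: 'pass' }] });  // [{ verdict: 'pass' }]
unwrapArray([{ verdict: 'pass' }]);               // unchanged
```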

The Principle

With traditional APIs, a contract violation is a bug you file. With LLMs, approximate compliance is the norm. Your parsing layer is your real contract enforcement. Build tolerance into your parsing, not just your prompts — because prompts get you to 90%, and the last 10% is where the bugs hide.
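Chained together, the quirk handlers above become one liberal parsing pass that runs before any matching. A sketch under the same assumptions (parseLiberally is an illustrative name):

```typescript
// Strip reasoning tokens, strip fences, parse, then unwrap
// objects that should have been arrays.
function parseLiberally(raw: string): unknown[] {
  const cleaned = raw
    .replace(/<think>[\s\S]*?<\/think>\s*/g, '')
    .trim()
    .replace(/^```(?:json)?\s*\n?([\s\S]*?)\n?\s*```$/g, '$1')
    .trim();
  const parsed: unknown = JSON.parse(cleaned);
  if (Array.isArray(parsed)) return parsed;
  if (typeof parsed === 'object' && parsed !== null) {
    const arrayProp = Object.values(parsed).find(v => Array.isArray(v));
    return (arrayProp as unknown[]) ?? [parsed];
  }
  return [parsed];
}
```

Each step is a no-op when its quirk is absent, so well-behaved output passes through untouched.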