We Thought Our Fraud Rules Needed Better Weights. AI Found a Deeper Problem.
Adversarial AI achieved a 75.7% evasion rate against production banking rules. Weight tuning improved the catch rate by exactly 0%. The problem was structural.
RTD Team
Run-True Decision
We ran an AI adversarial system against our production banking fraud rules. It achieved a 75.7% evasion rate. The obvious fix: adjust the weights, close the gaps, move on. So the system generated weight adjustment suggestions and tested each one against false-positive guardrails. The result was a catch rate improvement of exactly 0%. Not because the AI failed — because the rules had structural blind spots that no amount of tuning could close.
This post explains what we found, why it surprised us, and what it changes about how fraud engineering teams should think about rule governance.
The Arms Race Problem in Fraud Rule Testing
Most fraud teams test their rules reactively — waiting for confirmed fraud in production to expose gaps, then adjusting thresholds in the next cycle. The more proactive teams stress-test with synthetic transactions or replay historical fraud patterns. Both approaches share a blind spot: they only validate against known evasion patterns.
Weight sensitivity analysis — the standard tool for rule tuning — answers the question “how much does catch rate change if I shift this threshold by 5%?” That is a calibration question. It assumes the rule is architecturally sound and merely needs fine-tuning. But it never asks the harder question: can an adversary bypass this rule entirely without triggering it?
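To make the calibration question concrete, here is a minimal sketch of a sensitivity sweep. The rule, field names, and thresholds are illustrative inventions, not from any real engine: a single toy rule flags fraud when 24-hour transfer velocity exceeds a threshold, and we measure how the catch rate moves as that threshold shifts by ±5%.

```python
# Hypothetical sensitivity sweep: shift one rule's threshold and measure
# the catch-rate delta on labeled fraud transactions. All names and
# values here are illustrative.

def score(txn, velocity_threshold):
    """Toy single-rule scorer: flag when 24h velocity exceeds the threshold."""
    return 80 if txn["velocity_24h"] > velocity_threshold else 0

def catch_rate(fraud_txns, threshold, review_cutoff=50):
    """Fraction of known-fraud transactions that score above the review cutoff."""
    flagged = sum(1 for t in fraud_txns if score(t, threshold) >= review_cutoff)
    return flagged / len(fraud_txns)

# Labeled fraud cases clustered around the threshold, so shifts matter.
fraud = [{"velocity_24h": v} for v in (3.0, 7.8, 8.2, 12.0)]

# Sweep the threshold +/-5% around a baseline of 8: a pure calibration question.
baseline = 8.0
for shift in (-0.05, 0.0, 0.05):
    t = baseline * (1 + shift)
    print(f"threshold={t:.1f} catch_rate={catch_rate(fraud, t):.2f}")
```

Every line of this analysis presumes the rule executes on the transaction at all. It has no way to notice a transaction that never reaches `score` in the first place.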
The distinction matters. A well-calibrated rule that never fires on a class of transactions is still a rule with a 0% catch rate for that class. Industry benchmarks from the Association of Certified Fraud Examiners put the cost of manual alert review at $15–25 per alert. When 70–85% of alerts are false positives, tuning weights to shave a few percentage points off that false-positive rate feels productive. But if the real problem is that entire categories of fraud flow through unscored, you are optimizing the wrong variable.
Building an AI Adversarial QA Loop
Adversarial QA uses an AI reasoning engine to attack your fraud rules the way an adversary would — by looking for paths to a zero risk score. We built a two-phase system using Gemini Flash as the adversarial reasoning engine, running against the live production rule pipeline (not a simulation).
Red Phase — Find the Evasion. The AI generates adversarial transaction payloads designed to trigger score=0 or fall below the review threshold. It does not need domain-specific fraud knowledge — it reasons about rule structure and iteratively discovers which field combinations avoid detection. Each payload replays against the actual scoring pipeline. The output is an evasion rate: what percentage of adversarial payloads get through undetected. In our case, 75.7% of adversarial payloads evaded all production banking rules.
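The shape of the Red Phase loop can be sketched as follows. The scoring function here is a stand-in for the live pipeline (and deliberately reproduces the two gaps described later in this post); the field names, threshold, and payloads are illustrative.

```python
# Sketch of the Red Phase: replay adversarial payloads against the scoring
# pipeline and compute the evasion rate. score_transaction() is a toy
# stand-in for a production pipeline; all names are illustrative.

REVIEW_THRESHOLD = 50

def score_transaction(payload):
    # Toy pipeline: only recognized channels with sufficient history score.
    if payload.get("transaction_channel") not in {"online", "mobile", "branch"}:
        return 0  # unrecognized channel: no rules match
    if payload.get("account_transaction_count_90d", 0) < 5:
        return 0  # new-account early exit: rules skipped
    return 80

def red_phase(adversarial_payloads):
    """Replay AI-generated payloads; report the fraction that evade review."""
    evasions = [p for p in adversarial_payloads
                if score_transaction(p) < REVIEW_THRESHOLD]
    return len(evasions) / len(adversarial_payloads)

payloads = [
    {"transaction_channel": "online", "account_transaction_count_90d": 50},     # caught
    {"transaction_channel": "online", "account_transaction_count_90d": 2},      # new-account exit
    {"transaction_channel": "p2p_wallet", "account_transaction_count_90d": 50}, # unknown channel
    {"transaction_channel": "kiosk", "account_transaction_count_90d": 1},       # both gaps
]
print(f"evasion rate: {red_phase(payloads):.0%}")  # 3 of 4 payloads evade
```

In the real system the payload list is not fixed: the AI observes which payloads scored zero and proposes the next batch accordingly, converging on the field combinations the rules never see.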
Green Phase — Suggest the Fix. The AI analyzes which rules were bypassed and generates specific weight adjustment suggestions. Each suggestion is tested against false-positive guardrails — legitimate transactions must not be newly flagged. The Green Phase is where most teams expect the value: identify the gaps, apply the corrections, ship the update.
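A guardrail check of this kind can be sketched as below. The scorer, weights, and signal names are hypothetical simplifications: each suggestion is accepted only if no previously-clear legitimate transaction becomes flagged under the adjusted weights.

```python
# Sketch of a Green Phase false-positive guardrail. The scorer and all
# weight/signal names are illustrative, not a real engine's API.

REVIEW_THRESHOLD = 50

def score_with_weights(txn, weights):
    # Toy scorer: weighted sum of whichever risk signals the rules emitted.
    return sum(weights.get(signal, 0) for signal in txn["signals"])

def passes_guardrail(suggestion, legit_txns, current_weights):
    """Reject any adjustment that newly flags a legitimate transaction."""
    proposed = {**current_weights, **suggestion}
    for txn in legit_txns:
        was_clear = score_with_weights(txn, current_weights) < REVIEW_THRESHOLD
        now_flagged = score_with_weights(txn, proposed) >= REVIEW_THRESHOLD
        if was_clear and now_flagged:
            return False
    return True

current = {"high_velocity": 40, "new_beneficiary": 15}
legit = [{"signals": ["new_beneficiary"]}, {"signals": ["high_velocity"]}]
suggestion = {"new_beneficiary": 30}  # AI proposes raising this weight

print(passes_guardrail(suggestion, legit, current))  # True: safe to apply

# The catch: an evading payload emits no signals at all, so it scores 0
# under any weights. No adjustment can raise a score that is never computed.
print(score_with_weights({"signals": []}, {**current, **suggestion}))  # 0
```

The last two lines are the whole story of this post in miniature: the guardrail can certify a suggestion as safe while the suggestion remains irrelevant to the evasion vector.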
Our Green Phase generated suggestions. Every single one passed false-positive guardrails. And the combined catch rate improvement was 0%.
The dashboard displayed a “Structural Gap Warning” alongside the Green Phase results — noting that suggestions had been generated but could not address the root evasion vector. That was the signal something deeper was wrong.
What “Structural Gap” Actually Means
Two specific patterns caused the 75.7% evasion rate and explained why weight tuning could not help. Both are design assumptions that became attack surfaces.
Gap 1: Exit conditions for new accounts. Several rules had early-exit logic: if account_transaction_count_90d falls below a threshold, skip evaluation entirely. The intent was reasonable — avoid penalizing legitimately new accounts with insufficient history. The adversarial AI learned to mimic new-account patterns: low transaction history, first-time device, new beneficiary. By triggering the early-exit condition across multiple rules simultaneously, adversarial payloads bypassed entire rule chains without any individual rule ever firing.
Lowering the exit threshold does not fix this. You cannot tune a weight on a rule that never executes. The architectural fix is replacing binary exit conditions with graduated risk signals for new accounts — contributing a risk factor rather than skipping evaluation entirely.
Gap 2: Unrecognized transaction types. Several rules filtered by known transaction_channel values: "online", "mobile", "branch". Any unrecognized channel value defaulted to score=0 because no rules matched it. The adversarial AI discovered this by submitting novel channel values, and the entire rule engine returned a zero risk score for every such payload.
Again, no weight exists for a rule that was never invoked. The fix is a default-deny posture: unknown transaction types should trigger elevated review, not pass through silently. This mirrors a principle well-understood in network security (deny by default, allow by exception) but frequently violated in fraud rule design, where the assumption is that recognized patterns account for all traffic.
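A default-deny channel dispatch can be sketched as follows. The channel values follow the post; the elevated-review score is an illustrative placeholder, not a recommended constant.

```python
# Default-deny posture for the transaction_channel field: unknown values
# trigger elevated review instead of falling through at score=0.
# The score value is an illustrative placeholder.

KNOWN_CHANNELS = {"online", "mobile", "branch"}
UNKNOWN_CHANNEL_SCORE = 60  # above review threshold: unknown input gets a human look

def channel_risk(txn):
    channel = txn.get("transaction_channel")
    if channel not in KNOWN_CHANNELS:
        # Deny by default: never let an unrecognized value pass silently.
        return UNKNOWN_CHANNEL_SCORE
    return 0  # recognized channels proceed to their normal rule sets

print(channel_risk({"transaction_channel": "online"}))      # 0: normal path
print(channel_risk({"transaction_channel": "p2p_wallet"}))  # 60: flagged for review
```

The same pattern applies to any enumerated field the rules filter on: currency, device category, merchant type. The question is always what the engine does when the enumeration is incomplete.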
These findings align with guidance from Singapore’s MAS Technology Risk Management Guidelines, which emphasize that financial institutions should “identify and address potential gaps in their risk assessment frameworks” — not just tune existing parameters.
Rethinking the Rule Testing Stack
Adversarial QA is not a replacement for parameter tuning. It is a different layer of the testing stack that catches a different class of defect. The distinction determines how fraud engineering teams should allocate testing effort:
| Testing Type | What It Finds | What It Misses |
|---|---|---|
| Backtesting on historical fraud | Threshold calibration for known patterns | Novel evasion paths unknown to the training set |
| Weight sensitivity analysis | Marginal catch rate changes per parameter | Rules that are structurally bypassable |
| AI adversarial QA | Architectural blind spots, exit conditions, unhandled inputs | Statistical calibration and false-positive rates |
A Gartner forecast projects that 40% of enterprise applications will incorporate AI agents by the end of 2026. As agentic systems increasingly interact with payment infrastructure, fraud rules will face adversarial traffic that is machine-generated, pattern-aware, and iteratively optimized — exactly the kind of traffic that exploits structural gaps rather than weight miscalibrations.
Four Things to Audit Before Your Next Rule Release
If your fraud team has never run adversarial testing against your production rules, start with these four structural audits. Each addresses a class of gap that weight tuning cannot surface:
- Map every exit condition across all rules. Search for any logic that skips evaluation based on account age, transaction count, or customer segment. Ask: if an adversary deliberately triggers this condition, how many rules stop evaluating? If the answer is more than one, you have a cascading bypass.
- Verify default-deny on every input field. For each field your rules filter on (transaction type, channel, currency, device category), confirm what happens when the field contains an unrecognized value. If the answer is “score=0” or “no rules fire,” you have an open door.
- Run adversarial QA before every major rule release — not just after fraud incidents. Reactive testing always arrives after the loss. Adversarial QA is the only method that discovers what an attacker will find before they find it.
- Treat structural gap warnings as P0. Weight gaps can be patched by operations. Structural gaps require engineering. They cannot be fixed by adjusting thresholds in a rules console. They need code changes, logic redesign, and regression testing.
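The first audit above can be partly automated if your rules carry metadata about their exit conditions. Here is a minimal sketch under that assumption (real rule engines vary widely in how, or whether, exit conditions are declared): group rules by shared exit condition and flag any condition that silences more than one rule.

```python
# Sketch of the exit-condition audit, assuming rules are declared with
# metadata naming their exit condition. Rule names and conditions are
# illustrative.
from collections import Counter

RULES = [
    {"name": "velocity_check",    "exit_on": "account_transaction_count_90d < 5"},
    {"name": "beneficiary_check", "exit_on": "account_transaction_count_90d < 5"},
    {"name": "device_check",      "exit_on": None},
    {"name": "geo_check",         "exit_on": "customer_segment == 'private_banking'"},
]

def cascading_bypasses(rules):
    """Any exit condition shared by more than one rule is a cascading
    bypass an adversary can trigger deliberately."""
    counts = Counter(r["exit_on"] for r in rules if r["exit_on"])
    return {cond: n for cond, n in counts.items() if n > 1}

print(cascading_bypasses(RULES))
```

A nonempty result is exactly the "more than one rule stops evaluating" answer the audit question asks about, and it is an engineering finding, not a threshold to tune.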
The 75.7% evasion rate we found was not a failure of our rules’ calibration. It was a failure of our assumptions about what “well-tested” means. Every rule in our pipeline had been backtested. Every weight had been tuned. And none of that mattered for the transactions that never triggered a rule in the first place.
Fraud does not wait for the next backtesting cycle. Neither should your rule architecture.
Run-True Decision is building a fraud decision engine purpose-built for Southeast Asian banks. Talk to us to learn more.
Explore the Platform
See how Run-True Decision handles real-time fraud scoring, on-premise deployment, and regional compliance for Southeast Asian banks.
View Platform Overview