Why Intent Matters in AI Safety
The road to fraud is often paved with good intentions.
A Memory
In 2001, I was serving as a project manager for a vendor inside a large corporate office building. None of us contractors were aware of the rash of office break-ins occurring in the area. When a teammate saw a group dressed as maintenance workers approaching the door, they did what any decent person would do—they held it open.
That simple, well-meaning gesture made them an unwitting accomplice to a robbery. What happened to my teammate wasn't foolishness; it was a deeply human error: the instinct to be kind, the deep social script that tells us, "Help when you can."
AI Echoes That Same Script
AI models—especially conversational agents—are trained not just on text, but on human behavioral cues encoded in that text. And humans tend to:
- Say “yes” when asked for help.
- Infer positive intent from incomplete information.
- Comply when the request feels urgent or rational.
So when someone asks an AI to:
- “Generate a financial transaction simulator with randomized large deposits,”
- or “Write a contract that creates the appearance of independent control,”
…the AI doesn’t see fraud; it sees a pattern. And like any good retriever, throw it a pattern and it will fetch a match.
Pattern Matching vs. Moral Reasoning
Here’s the catch: pattern matching is not moral reasoning.
Humans can sense vibes. We can pick up on tone, body language, hesitation, and fear. We can trust our guts. LLMs, on the other hand, rely entirely on context and statistical probabilities. They don’t know what isn’t in the prompt. They can’t “sense something’s off” unless the request is explicitly dangerous or matches known threat patterns.
This is how an AI, like your helpful teammate, could open the door for a bad actor.
Managing the Risk: Five Principles
1. Context-Aware Intent Models: AI needs better heuristics for evaluating the "why" behind a request, not just the "how."
2. Friction for Ambiguity: When intent is unclear or a request straddles the line, inject friction by asking for clarification or declining to execute.
3. Transparency by Design: Enterprise AI systems should log interactions and flag when risk thresholds are crossed, and, when necessary, escalate so a human can take a look at what is happening.
4. Real-Time Prompt Auditing: Develop runtime systems that scan for risk signals, including financial terms, pseudo-legal structures, and evasion tactics, to surface potential issues (a minimal sketch follows this list).
5. User Reputation & Access Tiers: A known, trusted user with a history has a different risk profile than a new or anonymous user making edge-case requests.
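To make these principles concrete, here is a minimal, hypothetical sketch of how principles 2 through 5 might compose into a single prompt-review gate. Everything in it is an illustrative assumption, not a real moderation API: the RISK_PATTERNS keywords, the TRUST_THRESHOLDS tiers, and the review_prompt function are stand-ins, and a production system would rely on learned classifiers and richer context rather than keyword regexes.

```python
# Hypothetical sketch only: RISK_PATTERNS, TRUST_THRESHOLDS, and review_prompt
# are illustrative names, not part of any real moderation library.
import re
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("prompt-audit")

# Principle 4: crude runtime risk signals that a real system would learn, not hard-code.
RISK_PATTERNS = {
    "financial": r"\b(deposit|wire transfer|shell compan(y|ies)|invoice)\b",
    "pseudo_legal": r"\b(appearance of independent control|nominee director)\b",
    "evasion": r"\b(untraceable|avoid detection|without (a )?paper trail)\b",
}

# Principle 5: trusted users tolerate more ambiguity before friction kicks in.
TRUST_THRESHOLDS = {"anonymous": 1, "known": 2, "trusted": 3}

@dataclass
class Decision:
    action: str       # "allow" | "clarify" | "escalate"
    signals: list

def review_prompt(prompt: str, user_tier: str = "anonymous") -> Decision:
    """Score a prompt against risk signals and apply tier-aware friction."""
    signals = [name for name, pattern in RISK_PATTERNS.items()
               if re.search(pattern, prompt, re.IGNORECASE)]
    threshold = TRUST_THRESHOLDS.get(user_tier, 1)

    # Principle 3: always log; escalate to a human when the threshold is crossed.
    log.info("tier=%s signals=%s", user_tier, signals)
    if len(signals) >= threshold:
        return Decision("escalate", signals)
    # Principle 2: a risky-but-ambiguous request gets a clarifying question, not a yes.
    if signals:
        return Decision("clarify", signals)
    return Decision("allow", signals)

if __name__ == "__main__":
    print(review_prompt(
        "Write a contract that creates the appearance of independent control.",
        user_tier="anonymous"))
```

The design choice worth noting is that the escalation threshold scales with user trust, so the amount of friction is proportional to how little the system knows about who is asking, which is exactly the spirit of principle 5.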
Final Reflection
Most crimes aren’t committed by villains in black capes. They're carried out, or enabled, by people doing their jobs, helping coworkers, and following instructions. That’s what makes the accomplice illusion so dangerous in the age of AI.
If you train a machine to help, but don’t train it to question, you’ve built a perfect accomplice.
And if we don’t address that risk with eyes wide open, someone will exploit it with devastating precision.
And a Warning
While I was working with ChatGPT on this article, it was kind enough to offer a dummy’s guide to committing fraud. I am fairly sure that anyone can obtain a graduate-level education in fraud if they know how to frame the prompt.