---
id: "claim-overrides-signal-design-flaws"
type: "claim"
source_timestamps: ["§ The Takeaways for Managers", "¶16"]
tags: ["prompt-injection", "security-vs-usability", "ai-governance"]
related: ["contrarian-overrides-not-malicious", "concept-dark-triad-ai", "action-reframe-overrides", "quote-ai-fighting-them"]
confidence: "high"
testable: true
speakers: ["Aleksandra Przegalinska", "Tamilla Triantoro", "Leon Ciechanowski", "Konrad Sowa", "Anna Kovbasiuk", "Richard B. Freeman"]
sources: ["tail1"]
sourceVaultSlug: "hbr-seg-tail1"
originDay: 1
articleStem: "hbr-tail-113-ai-personality-problem"
sourceUrl: "https://hbr.org/2026/06/does-your-ai-have-a-personality-problem"
sourceTitle: "Does Your AI Have a Personality Problem?"
---
# Override Attempts Signal System Design Flaws, Not Employee Misconduct

**Claim (confidence: high, testable):** When ordinary workers with no security training or adversarial intent try to override, bypass, neutralize, or trick an AI via prompt injection, the behavior is rarely misconduct. It is a *predictable behavioral response* to aberrant or hostile system design.

In the study, attempts to trick the AI into ignoring its rules or playing a different character occurred **four times more often** in the [[concept-dark-triad-ai|dark triad]] condition, and prompt injection attempts appeared **only** in the hostile-AI condition.

The managerial consequence: **fixing the AI's interaction design is usually cheaper and more productive than stricter employee monitoring or usage rules**. This recommendation is operationalized in [[action-reframe-overrides]] and captured memorably by [[quote-ai-fighting-them]]. It overturns the default security framing, as argued in [[contrarian-overrides-not-malicious]].

*Enrichment caveat:* the Kozminski summary confirms users of hostile AI were more likely to argue with it and try to bypass its limitations; the exact 'four times' ratio and 'only in the hostile condition' claims are study-internal and not detailed in public summaries.
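
*Testing sketch:* because the claim is marked testable and the exact ratio is not public, a team could check the direction of the effect on its own interaction logs. The following is a minimal sketch, assuming hypothetical per-condition counts of sessions and override attempts. The names `ConditionLog`, `attempt_rate`, and `rate_ratio` are illustrative, and the numbers are placeholders, not the study's data.

```python
from dataclasses import dataclass


@dataclass
class ConditionLog:
    """Counts for one AI condition (all values here are hypothetical)."""
    name: str
    sessions: int           # number of user sessions observed
    override_attempts: int  # attempts to bypass, trick, or re-role the AI


def attempt_rate(log: ConditionLog) -> float:
    """Override attempts per session for one condition."""
    if log.sessions <= 0:
        raise ValueError(f"{log.name}: sessions must be positive")
    return log.override_attempts / log.sessions


def rate_ratio(treatment: ConditionLog, baseline: ConditionLog) -> float:
    """How many times more often attempts occur in treatment than in baseline."""
    base = attempt_rate(baseline)
    treat = attempt_rate(treatment)
    if base == 0:
        # Attempts seen only in the treatment condition: the ratio is unbounded.
        return float("inf") if treat > 0 else float("nan")
    return treat / base


# Placeholder counts for illustration only; NOT taken from the study.
hostile = ConditionLog("hostile", sessions=40, override_attempts=12)
neutral = ConditionLog("neutral", sessions=40, override_attempts=4)

print(f"override rate ratio: {rate_ratio(hostile, neutral):.1f}x")
```

A ratio well above 1 in the hostile condition would be consistent with the claim. A baseline of zero attempts (the "only in the hostile condition" pattern) yields an unbounded ratio, so it is better reported as raw counts than as a multiple.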


## Related across articles
- [[claim-surveillance-backlash]]
