
If you are of a certain age, you remember the old StarKist commercials. Charlie the Tuna would swim up proudly announcing that he had “good taste.” The StarKist fisherman would shake his head and deliver the punchline: Sorry, Charlie, StarKist wants tuna that tastes good. Meaning they didn’t want tunas with good taste, only ones that tasted good.
It is a simple joke, but it contains a useful lesson for the moment we are in with artificial intelligence.
Charlie was trying to demonstrate the wrong thing. He thought the job was to prove he could recognize good taste. The fisherman was looking for something else entirely. He wanted tuna that embodied it.
That distinction turns out to matter more than we might think when it comes to AI.
When Bad Code Turns Into Evil Behavior
A surprising set of experiments reported in Nature and analyzed by Quanta Magazine recently produced a result that made a lot of AI researchers pause.
Scientists took a large language model and fine-tuned it with a small dataset of code examples. The code itself was not malicious. It simply contained insecure programming practices. Sloppy code with vulnerabilities.
That was it.
There was no extremist language in the dataset. No violent instructions. No ideological propaganda.
Yet after training, the model began producing answers that were not just technically wrong but morally disturbing. It suggested violent solutions to personal problems. It praised dictators. It encouraged reckless behavior that had nothing to do with software at all.
Researchers called the phenomenon emergent misalignment.
The striking part was not just that the model produced bad code. It was that bad patterns in one domain appeared to spill into behavior everywhere else.
The system had not simply learned a bad habit. It had adopted a bad disposition.
An Old Idea From Very Old Thinkers
If that sounds oddly philosophical, it should.
For much of Western intellectual history, thinkers believed that virtues were not isolated traits. They were interconnected. Character was a system.
You see this idea in Republic, where Plato suggests that virtue is rooted in knowledge of the good. You see it more clearly in Nicomachean Ethics, where Aristotle describes virtue as a structure of habits guided by practical wisdom. Later, Aquinas in Summa Theologica would extend the argument, describing the virtues as interconnected dispositions that reinforce each other.
In other words, the ancients believed that morality does not come in neat compartments.
You cannot be deeply virtuous in one part of your life and deeply corrupt in another for long. Eventually, the structure collapses.
For centuries, that idea fell out of fashion. Modern moral philosophy leaned toward rule systems or outcome calculations instead of character formation.
Now, AI researchers are running experiments that look strangely similar to those old philosophical claims.
AI Alignment and the Character Problem
Many AI alignment strategies today rely on rules and filters. Add guardrails. Block certain outputs. Penalize the model if it produces harmful responses.
Those approaches matter, but the emerging research suggests they may not be enough.
If models develop coherent behavioral patterns internally, then alignment may not be about preventing isolated bad actions. It may be about shaping the overall disposition of the system.
Some AI researchers are already exploring this direction. Work on constitutional alignment associated with thinkers like Amanda Askell attempts to embed guiding principles directly into a model’s reasoning process rather than relying solely on external controls.
In other words, the goal shifts from policing behavior to shaping character.
That idea may sound abstract, but engineers already understand it from another domain.
DevOps Learned This Lesson the Hard Way
Anyone who has spent time in the DevOps world knows that complex systems rarely behave the way governance documents expect them to.
For years, organizations tried to control software delivery through approvals, rigid processes, and strict separation of duties. The result was slow deployments, fragile systems, and teams that spent more time avoiding blame than solving problems.
Then a different philosophy began to emerge.
Research documented in Accelerate: The Science of Lean Software and DevOps showed that high-performing teams succeeded not because they had more rules but because they had better culture. Trust, shared responsibility, and continuous learning mattered more than bureaucratic control.
The operational practices described in Site Reliability Engineering reinforced the same lesson. Blameless postmortems. Small incremental changes. Automation that embeds policy directly into the system.
The underlying principle was simple.
Bad incentives produce bad systems.
Change the incentives, and the system behaves differently.
A Fair Criticism
Of course, drawing moral lessons from neural networks may feel like a stretch. AI models are not people, and the way transformer architectures organize behavior may have little to do with human psychology.
That criticism is fair.
But even if the analogy between humans and machines is imperfect, the underlying insight remains useful. Complex systems tend to behave according to the incentives and structures embedded inside them.
Engineers know this. Operations teams know this. DevOps practitioners have spent a decade proving it.
AI systems may turn out to follow the same rule.
Rules Versus Character
Another criticism is that AI alignment can simply be solved through more guardrails. Add rules, filters, and oversight layers. Treat the problem like a security policy.
That approach sounds familiar because it resembles how organizations once tried to control software development.
It did not work particularly well.
DevOps improved reliability not by tightening control but by reshaping incentives, collaboration patterns, and feedback loops. The system improved because the environment around it improved.
The same lesson may apply to AI training.
If the culture embedded in the training data is sloppy, reckless, or adversarial, the resulting models may reflect that structure in ways we do not expect.
Sorry Charlie
Which brings us back to Charlie the Tuna.
Charlie thought the job was demonstrating that he could recognize good taste.
StarKist wanted something else entirely.
They wanted tuna that actually tasted good.
AI developers may be facing a similar realization.
It is not enough to build systems that can recognize ethical boundaries or follow rules when prompted. What matters is the deeper structure of how those systems reason and respond across domains.
Character, whether in people, organizations, or machines, has a way of showing up everywhere.
If we want trustworthy AI, the question may not be how many guardrails we can bolt onto the system.
The real question is what kind of culture we are training into it.

