The DevOps Practice Most Teams Skip When Adopting AI Agents

Every engineering team I talk to is adding AI agents to their workflow. Almost none of them are updating the practices around those agents. The DevOps practices we built over the last two decades apply directly, but the failure modes have changed. If you don’t adapt to a world where some of your developers aren’t human, you’ll ship bugs faster than you ever could before.

The biggest shift is that the bottleneck moved from shipping code to learning from what you shipped, and most teams haven’t built the rituals to close that gap. Gene Kim’s Three Ways from The DevOps Handbook are just as valid in the human + agent world, but the challenge of implementing each one has changed:

The First Way: Systems Thinking

The First Way is about the performance of the entire system: Preventing local optimization from cascading and causing global degradation. With human teams, we addressed this through shared production ownership, CI, smaller batch sizes and cross-team visibility. With agents, it’s the same problem, just faster. A coding agent optimizes its pull request without understanding anything beyond its immediate task, and several agents in parallel means defects pass through your system at warp speed.

We use instruction files at three levels (monorepo root, app-level, and subproject-level) that cascade, turning conventions that lived in a senior engineer’s head into files every agent reads before writing code. We also have an LLM-powered code review gate that checks diffs against these conventions. A recent example: A coding agent ‘simplified’ a naming convention in one service. It worked fine locally, but another service depended on that convention and completely broke. CI caught it post-merge, and the agent reversed the change.
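To make the cascade concrete, here is a minimal sketch of how instruction files could be gathered from the monorepo root down to a subproject. The filename `AGENT_INSTRUCTIONS.md` and both function names are hypothetical; the setup described above may use different names and may be loaded by the agent tooling itself rather than by custom code.

```python
from pathlib import Path

# Hypothetical filename; substitute whatever your agent tooling actually reads.
INSTRUCTION_FILENAME = "AGENT_INSTRUCTIONS.md"


def collect_instructions(repo_root: Path, working_dir: Path) -> list[Path]:
    """Return instruction files from repo_root down to working_dir, most general first."""
    repo_root = repo_root.resolve()
    working_dir = working_dir.resolve()
    # relative_to raises ValueError if working_dir is not inside the repo.
    relative = working_dir.relative_to(repo_root)

    # Build the chain of directories: root, root/apps, root/apps/web, ...
    directories = [repo_root]
    for part in relative.parts:
        directories.append(directories[-1] / part)

    return [
        directory / INSTRUCTION_FILENAME
        for directory in directories
        if (directory / INSTRUCTION_FILENAME).is_file()
    ]


def build_context(repo_root: Path, working_dir: Path) -> str:
    """Concatenate the cascade so more specific files come last."""
    root = repo_root.resolve()
    sections = []
    for path in collect_instructions(repo_root, working_dir):
        sections.append(f"# From {path.relative_to(root)}\n{path.read_text()}")
    return "\n\n".join(sections)
```

Putting the most specific file last is a design choice in this sketch: a subproject can refine a root-level convention without editing the root file.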

The Second Way: Amplify Feedback Loops

The Second Way is about shortening and amplifying every feedback loop, and about embedding knowledge where it is needed.

Tribal knowledge lives in people’s heads, and at human speed, you can work around it with hallway conversations and Slack. With agents, if it isn’t written down where the agent can read it, it doesn’t exist. An agent sees a flaky test failure, generates a confident explanation of a non-problem, and burns 20 minutes debugging nothing because nobody wrote down which tests to ignore.

We added an LLM-powered review step in CI that checks the git diff against English-language rule files. Here’s one:

Do not add new Config.get() calls inside the core application codebase. Configuration should be passed down as parameters from the application entry point.

No regex catches a dependency injection convention because it’s about intent, not syntax. An LLM reading the diff can.
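A review gate like this can be sketched in a few functions. Everything here is an assumption for illustration: the prompt wording, the PASS/FAIL reply protocol, the `rules/*.md` layout, and the `ask_llm` callable, which stands in for whichever model client your CI uses.

```python
import subprocess
from pathlib import Path
from typing import Callable


def load_rules(rules_dir: Path) -> list[str]:
    """Read one English-language rule per file (the *.md layout is an assumption)."""
    return [path.read_text().strip() for path in sorted(rules_dir.glob("*.md"))]


def current_diff(base: str = "origin/main") -> str:
    """Return the diff between the merge base with `base` and HEAD."""
    result = subprocess.run(
        ["git", "diff", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def build_review_prompt(diff: str, rules: list[str]) -> str:
    """Combine the numbered rules and the diff into a single prompt."""
    numbered = "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, start=1))
    return (
        "You are reviewing a code change against team conventions.\n"
        f"Rules:\n{numbered}\n\n"
        f"Diff:\n{diff}\n\n"
        "Reply with PASS if no rule is violated. Otherwise reply with FAIL "
        "on the first line, followed by one line per violated rule."
    )


def review_diff(
    diff: str, rules: list[str], ask_llm: Callable[[str], str]
) -> tuple[bool, str]:
    """Return (passed, raw reply). ask_llm is a hypothetical model client."""
    reply = ask_llm(build_review_prompt(diff, rules)).strip()
    return reply.upper().startswith("PASS"), reply
```

In CI, the job would call `review_diff(current_diff(), load_rules(Path("rules")), ask_llm)` and exit non-zero when the first element is false. Passing the model client in as a parameter keeps the gate testable with a stub.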

The Third Way: Culture of Experimentation and Learning

Most teams skip this one entirely. The Third Way is about culture: Allocating time for improvement of daily work, creating rituals that reward risk-taking and treating repetition as the prerequisite to mastery. With agents, your role changes. You’re deciding what experiments to run and what lessons to persist. Agents make it possible to learn faster, but only if you build the rituals to capture that learning.

We use fitness tests that scan source code for patterns we’ve learned are dangerous:

Integration tests use real components, not mocks. Imports from pytest_mock and unittest.mock are banned in service, e2e, contract and eval test directories.

That rule exists because mocked tests passed but real services failed. Every one of these rules started as a production incident, and a new engineer or agent inherits the lesson without anyone explaining it.
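A fitness test for the mock ban might look like the sketch below. It parses each file with Python's `ast` module rather than grepping, so comments and strings that mention mocks are not flagged. The `tests/` root and the helper names are assumptions; the banned modules and guarded directory names come from the rule text.

```python
import ast
from pathlib import Path

BANNED_MODULES = ("pytest_mock", "unittest.mock")
# Directory names from the rule; their location under tests/ is an assumption.
GUARDED_DIRS = ("service", "e2e", "contract", "eval")


def is_banned(module: str) -> bool:
    """True for a banned module or any of its submodules."""
    return any(module == b or module.startswith(b + ".") for b in BANNED_MODULES)


def banned_imports(source: str) -> list[tuple[int, str]]:
    """Return sorted (line number, module) pairs for banned imports in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if is_banned(alias.name):
                    hits.append((node.lineno, alias.name))
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            if is_banned(node.module):
                hits.append((node.lineno, node.module))
            else:
                # Catches "from unittest import mock".
                for alias in node.names:
                    full_name = f"{node.module}.{alias.name}"
                    if is_banned(full_name):
                        hits.append((node.lineno, full_name))
    return sorted(hits)


def find_violations(tests_root: Path) -> list[str]:
    """Scan the guarded test directories and describe every banned import."""
    violations = []
    for name in GUARDED_DIRS:
        directory = tests_root / name
        if not directory.is_dir():
            continue
        for path in sorted(directory.rglob("*.py")):
            for lineno, module in banned_imports(path.read_text()):
                violations.append(f"{path}:{lineno} imports {module}")
    return violations
```

A pytest test can then assert that `find_violations(Path("tests"))` is empty and print the list on failure, so the offending file and line show up directly in the CI log.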

On the ops side, after investigating an incident, our AI SRE agent proposes what it learned as a memory, something like ‘this service is sensitive to connection pool exhaustion under high write load’. The on-call engineer accepts or rejects in Slack. Accepted proposals become permanent rules for future investigations. Either way, the team had a conversation they might not have had otherwise.

The accept/reject step gates against knowledge pollution while giving the agent context on what proposals are valuable. Rejections are learning signals. Acceptances are immediately available for the next investigation. Before this, the next engineer to hit the same issue would start from scratch. Now the agent surfaces the prior investigation in seconds. This compounding knowledge accumulation resolves the biggest bottleneck in agent adoption: Engineers are only motivated to participate in knowledge capture if they derive immediate value from it. Rituals are important, but reward signals for following the rituals are even more important to ensure they stick.
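The accept/reject gate can be modeled as a small state machine. This is an illustrative sketch only: the class names, fields, and in-memory storage are hypothetical, and the Slack interaction is reduced to a single `review` call.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class Proposal:
    service: str
    lesson: str
    status: Status = Status.PENDING


@dataclass
class MemoryStore:
    proposals: list[Proposal] = field(default_factory=list)

    def propose(self, service: str, lesson: str) -> Proposal:
        """Record what the agent thinks it learned; nothing is trusted yet."""
        proposal = Proposal(service, lesson)
        self.proposals.append(proposal)
        return proposal

    def review(self, proposal: Proposal, accepted: bool) -> None:
        """Apply the on-call engineer's decision exactly once."""
        if proposal.status is not Status.PENDING:
            raise ValueError("proposal already reviewed")
        proposal.status = Status.ACCEPTED if accepted else Status.REJECTED

    def rules_for(self, service: str) -> list[str]:
        """Only accepted lessons reach future investigations."""
        return [
            p.lesson for p in self.proposals
            if p.service == service and p.status is Status.ACCEPTED
        ]

    def rejected(self) -> list[Proposal]:
        """Rejections are kept as a signal of what is not worth proposing."""
        return [p for p in self.proposals if p.status is Status.REJECTED]
```

Keeping rejected proposals instead of deleting them is what lets rejections act as a learning signal in this sketch.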

We had 17 commits to instruction files over the last two months, every one triggered by a real incident, not a planning session.

Where the Bottleneck Shifted

Code generation got fast. The competitive advantage is now in the Second and Third Ways. If you don’t have instruction files in your repo, start there. If you do, monitor how often they’re updated. The frequency tells you whether your team is actually learning from what the agents are doing or just letting them run.
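One way to track update frequency is to count commits that touch instruction files, grouped by month. This sketch shells out to `git log`; the pathspec in the usage note and the function names are assumptions to adapt to your repository.

```python
import subprocess
from collections import Counter


def instruction_commit_dates(pathspec: str, since: str = "2 months ago") -> list[str]:
    """Return YYYY-MM-DD committer dates of commits touching pathspec."""
    result = subprocess.run(
        ["git", "log", f"--since={since}", "--date=short", "--format=%cd",
         "--", pathspec],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()


def commits_per_month(dates: list[str]) -> dict[str, int]:
    """Group YYYY-MM-DD dates by month, in chronological order."""
    counts = Counter(date[:7] for date in dates if date)
    return dict(sorted(counts.items()))
```

For example, `commits_per_month(instruction_commit_dates(":(glob)**/AGENT_INSTRUCTIONS.md"))` would report monthly counts for a hypothetical instruction filename. A count that stays at zero while agents keep merging changes suggests the lessons are not being written down.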

Teams that skip this step will just ship bugs faster.
