In my travels, I constantly hear about plans that promise to "unlock the full power of AI" down the road. The usual advice is to start small with a few pilots, then gradually scale up from there. It looks good on paper, but in practice it becomes a months-long slog of one-off experiments that burn a lot of capital yet generate little impact on their own.
This "experiment and iterate" approach made sense for earlier waves of enterprise technologies that didn't evolve as quickly as AI. But the "pilot-and-learn" era of 2023-24 is behind us. If you're just lining up your first pilots today, you're already behind.
Cloudflare CIO Mike Hamilton explained at a PagerDuty On Tour San Francisco session that "a five-year AI strategy is pretty hard to do right now. It's more like a five-minute strategy: you're developing your plan, and AI is already evolving."
Many organizations have spent the last several years experimenting with small, safe genAI use cases. Now it's time to take bolder action. Use what you've learned in these experiments to scale up what's been effective, and shelve what hasn't.
In my experience, this is best achieved on a five-week timeline: four tight sprints to test, harden, and ship value, followed by a one-week review to decide what scales and what gets scrapped. It's quick enough to keep pace with the market, yet disciplined enough to satisfy your risk and finance teams.
In this blog, I'll explain how this cadence works and why it's the smartest AI adoption approach for modern enterprises.
Your five-week AI plan: Take stock, prove wins, then scale up
Week one of a five-week cycle starts with reflection. Maybe you've tested a bot auto-posting status updates or routing tickets to the appropriate team member.
Before branching out into new use cases, spend some time reviewing these trials and confirm which ones are ready to be scaled up across the enterprise.
Take stock of every pilot (week one)
Pull up the list of every AI proof-point you've tried, no matter how small, and put each one through a three-question filter:
Did it move its metric? If the goal was noise reduction, show the alert counts before and after the experiment. Mark it green if the metric improved, yellow if results are mixed, red if nothing changed.
Did the controls hold? Note any access issues, audit-log gaps, or security flags. If the controls held, mark it green. If they didn't, mark it yellow for a re-run or red for retirement.
Put the answers in a simple traffic-light table. Greens go straight into the next sprint. Yellows need a tune-up. Reds get archived for now.
By the end of the week, you'll have a short, confident backlog and clarity to spend the next four weeks proving real value.
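To make the triage concrete, here's a minimal sketch of that traffic-light pass in Python. The pilot names, numbers, and the 20% improvement threshold are illustrative assumptions, not results from any real deployment.

```python
# Minimal sketch of the week-one triage: score each pilot green/yellow/red
# from its before/after metric and whether its security controls held.
# Pilot names and figures below are illustrative, not real results.

pilots = [
    {"name": "alert-noise-bot", "metric_before": 480, "metric_after": 190, "controls_held": True},
    {"name": "ticket-router",   "metric_before": 35,  "metric_after": 33,  "controls_held": True},
    {"name": "status-drafter",  "metric_before": 12,  "metric_after": 12,  "controls_held": False},
]

def triage(pilot, improvement_threshold=0.20):
    """Green: clear metric win and controls held. Yellow: mixed. Red: no change or control gaps."""
    before, after = pilot["metric_before"], pilot["metric_after"]
    improvement = (before - after) / before if before else 0.0
    if not pilot["controls_held"]:
        return "red" if improvement <= 0 else "yellow"   # control gaps mean a re-run at best
    if improvement >= improvement_threshold:
        return "green"
    return "yellow" if improvement > 0 else "red"

for pilot in pilots:
    print(f'{pilot["name"]:15s} -> {triage(pilot)}')
```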
Run a focused verification sprint (weeks two to three)
Take one of your green-light pilots and examine it closely: run it against realistic workloads for the full two weeks and keep tracking the same metric you used in the week-one triage.
At the end of week three, you'll have hard data that confirms whether this workflow is ready to roll out across your organization or whether it needs additional tweaks.
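As a rough illustration of what "hard data" can look like at this stage, here's a small sketch comparing MTTR with and without the AI workflow. The incident durations and the 15% improvement target are assumptions for the example.

```python
# Minimal sketch of the end-of-week-three check: compare MTTR (mean time to
# resolve) for incidents handled with and without the AI workflow.
# The durations and the 15% target below are assumptions for illustration.

baseline_minutes = [52, 61, 47, 75, 58]      # resolved without the AI workflow
assisted_minutes = [41, 38, 55, 36, 44]      # resolved with the AI workflow enabled

def mttr(durations):
    return sum(durations) / len(durations)

baseline, assisted = mttr(baseline_minutes), mttr(assisted_minutes)
improvement = (baseline - assisted) / baseline

print(f"Baseline MTTR: {baseline:.1f} min, assisted MTTR: {assisted:.1f} min")
print(f"Improvement: {improvement:.0%} -> {'scale it' if improvement >= 0.15 else 'keep tweaking'}")
```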
Graduate real winners and package the playbook (weeks four to five)
Once a pilot is reliably meeting its goal (cutting MTTR, reducing alert noise, etc.), it's ready to be scaled up across the enterprise. Turn the workflow on for live incidents, using the same access controls and audit logs you validated in testing. Document the prompts, approvals, and fallback steps in a shared runbook so any team can activate the pattern without writing new code.
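A runbook entry doesn't have to be elaborate. Here's a hypothetical sketch of the fields one might capture; the names and values are illustrative, not a prescribed schema.

```python
# Hypothetical shared-runbook entry for a graduated workflow: capture the
# prompt, approvals, access controls, and fallback steps you validated so
# another team can reuse the pattern without writing new code.

runbook_entry = {
    "workflow": "ai-status-update-drafter",
    "goal_metric": "time to first customer update (minutes)",
    "prompt": "Summarize the current incident status for customers in two sentences.",
    "approvals": ["on-call engineer reviews the draft before it posts"],
    "access_controls": {"rbac_role": "incident-responder", "audit_log": True},
    "fallback": [
        "If the draft is wrong or the service is unavailable, the on-call engineer writes the update manually.",
    ],
}

print(f'Documented workflow: {runbook_entry["workflow"]}')
```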
Then, once the team has had several weeks to implement the process, present the before-and-after metrics in your weekly ops review. Clear metrics at this stage build credibility and spark ideas for where else the pattern can help.
If all is still going well by this point, copy the pattern to a neighboring service or stretch it to a bigger use case.
Security matters, but don't over-correct
The goal of this five-week cadence is simple: keep your AI workflow candidates moving through the funnel without introducing an unacceptable level of risk. I typically see operations teams err too far on the side of caution rather than speed here; over-planning can slow you down more than the actual risk does. Is the uncertainty of an AI doing something really any riskier than a frazzled human doing (or not doing) the same thing?
An appropriate middle ground is to acknowledge the risks posed by AI without letting them halt your progress. Here's how to find that balance.
Trust the controls you have
You likely already have a number of sturdy controls in place, like AWS Bedrock Guardrails, role-based access controls (RBAC) and always-on audit logs. Those controls should be sufficient for a well-scoped test. Extra layers of review may seem like added protection, but in practice, they often slow learning more than they reduce exposure to AI risk.
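For instance, if your tests run through Bedrock, the guardrail you already maintain can sit directly in the request path rather than behind a new review board. The sketch below assumes boto3's bedrock-runtime Converse API; the model ID, guardrail ID, and version are placeholders for your own.

```python
# Minimal sketch of leaning on an existing control: calling a model through
# Amazon Bedrock's Converse API with a guardrail already attached.
# The guardrail ID, version, and model ID below are placeholders.
import boto3

bedrock = boto3.client("bedrock-runtime")

response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",   # placeholder model
    messages=[{"role": "user", "content": [{"text": "Summarize this alert storm for the on-call channel."}]}],
    guardrailConfig={
        "guardrailIdentifier": "YOUR_GUARDRAIL_ID",      # placeholder
        "guardrailVersion": "1",
    },
)

# If the guardrail blocks the request, stopReason reflects the intervention.
if response.get("stopReason") == "guardrail_intervened":
    print("Guardrail intervened; falling back to a human-written summary.")
else:
    print(response["output"]["message"]["content"][0]["text"])
```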
Put a human checkpoint on customer-facing output
Anything customer-facing still deserves a quick human check. If an AI service drafts an "all services restored" message, route it to the on-call engineer for a 30-second sanity check before it posts. One low-priority alert gives a single person clear ownership of the message and gets the update to customers quickly.
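One way to wire that checkpoint, assuming you use the PagerDuty Events API v2, is to raise a low-severity alert carrying the draft rather than posting it directly. The routing key and draft text below are placeholders.

```python
# Minimal sketch of the human checkpoint: instead of posting the AI-drafted
# status update directly, raise a low-severity alert so the on-call engineer
# can sanity-check the text first. Routing key and draft are placeholders;
# this assumes the PagerDuty Events API v2 enqueue endpoint.
import requests

draft = "All services restored. Error rates and latency are back to normal levels."

event = {
    "routing_key": "YOUR_EVENTS_V2_ROUTING_KEY",   # placeholder integration key
    "event_action": "trigger",
    "payload": {
        "summary": f"Review AI-drafted customer update: {draft}",
        "source": "ai-status-drafter",
        "severity": "info",                         # low priority: a 30-second check, not a page storm
    },
}

resp = requests.post("https://events.pagerduty.com/v2/enqueue", json=event, timeout=10)
resp.raise_for_status()
print("Draft routed to the on-call engineer for review.")
```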
Start with data you can afford to lose
Diagnostic logs and synthetic metrics are perfect data for early tests. They allow you to surface workflow bugs without putting customers' personally identifiable information (PII) at risk. As each workflow proves both useful and safe, graduate it to higher-impact domains.
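Here's a small sketch of what such an expendable data set might look like: synthetic latency metrics with an injected spike, and no customer data anywhere in sight. The service names and thresholds are made up for the example.

```python
# Minimal sketch of a PII-free test data set: synthetic latency metrics and
# diagnostic log lines you can feed an AI triage workflow without touching
# customer data. Service names and thresholds are invented for illustration.
import random
from datetime import datetime, timedelta, timezone

random.seed(7)
start = datetime.now(timezone.utc) - timedelta(hours=1)

synthetic_logs = []
for i in range(60):
    ts = start + timedelta(minutes=i)
    latency_ms = random.gauss(120, 30) + (400 if 35 <= i <= 45 else 0)   # inject a fake latency spike
    level = "ERROR" if latency_ms > 400 else "INFO"
    synthetic_logs.append(f"{ts.isoformat()} {level} checkout-api p95_latency_ms={latency_ms:.0f}")

print("\n".join(synthetic_logs[33:48]))   # the window around the injected incident
```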
Capture and reuse what works
When a pilot clears both internal ROI and security bars, record its approval steps and alert rules in a shared runbook. On the next service, start from that template (guardrails and reviews already baked in), so the team isn't rebuilding approvals from scratch. Doing so allows you to ship in days rather than months.
Prepare your people to move quickly
A five-week cadence is a rapid pace, and there might be an adjustment period for your team to get used to shipping AI features so quickly. Make this rhythm stick by focusing your time and energy on building a culture that sets your team up for success.
Train with real incidents
Replace slide deck presentations with live labs based on prior outages. In practice, that might mean replaying a recent incident in a sandbox and having responders work it end to end with your AI workflows enabled.
This turns every real outage into a training opportunity and lets responders gain AI fluency in the context they know best.
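As one possible format for such a lab, the sketch below replays a past outage's alert timeline at compressed speed so responders can practice the AI-assisted workflow against familiar events. The timeline is a made-up stand-in for a real postmortem.

```python
# Minimal sketch of a live lab: replay a prior outage's timeline at compressed
# speed so responders can drill the AI-assisted workflow on familiar events.
# The timeline below is a fabricated stand-in for a real postmortem record.
import time

# (seconds after incident start, alert text) taken from a past outage's records
outage_timeline = [
    (0,   "checkout-api p95 latency above 2s"),
    (120, "payment-gateway error rate 8%"),
    (300, "customer support ticket volume spiking"),
    (900, "checkout-api latency recovered after rollback"),
]

SPEEDUP = 60  # one real second per simulated minute

print("Live lab starting: work this incident with the AI workflow enabled.")
previous = 0
for offset, alert in outage_timeline:
    time.sleep((offset - previous) / SPEEDUP)
    previous = offset
    print(f"[t+{offset // 60:>2} min] ALERT: {alert}")
print("Lab complete: compare your timeline and MTTR with the original postmortem.")
```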
Give every new workflow a sponsor
New workflows should have clear owners and success metrics. Give each one a sponsor, and set a weekly milestone. In week one, it might be simply to "enable the agent." By week four, maybe it's "hit target MTTR."
After that point, the sponsor should regularly present the workflow's performance and pain points during operations meetings. This way, everyone can learn from what's working and what isn't.
Turn proven flows into templates
When a workflow runs cleanly for at least two cycles, formalize it. Record the AI prompt, the fallback steps, and the human-approval checks in your shared runbook library or Automation Center of Excellence. Use Terraform to make the pattern easy to share across teams and to fold it into your pipelines. Tag it with service and impact metrics so any team can enable the pattern with a single click; they shouldn't need to undergo any new code reviews or approvals.
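For illustration, here's a hypothetical sketch of a tagged template entry and a one-step enable helper. Every name here is an assumption; in practice the template would live in your runbook library or Terraform modules rather than in a script.

```python
# Hypothetical template-library entry plus a one-step "enable" helper that
# returns a configured copy of a proven pattern for a new service.

TEMPLATES = {
    "ai-alert-triage": {
        "tags": {"impact_metric": "alert noise", "proven_on": "checkout-api"},
        "prompt": "Group related alerts and suggest a probable cause.",
        "approvals": ["on-call engineer confirms grouping before suppression"],
    },
}

def enable(template_name, service):
    """Return a configured copy of a proven template for a new service."""
    template = TEMPLATES[template_name]
    return {**template, "service": service}

name = "ai-alert-triage"
workflow = enable(name, service="billing-api")
print(f'Enabled {name} on {workflow["service"]} with approvals: {workflow["approvals"]}')
```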
Map out skills growth
Hands-on training is a great start, but sustainable AI adoption at scale requires a deliberate, organization-wide skill development strategy. Define what mastery looks like for incident commanders and on-call engineers by using a role-based competency matrix. It should outline the specific AI skills, tools, and judgment calls each team member needs to master.
Then, for each five-week cycle, choose one or two competencies per role to practice and define a simple check to prove them. For example, an on-call engineer might practice reviewing AI-drafted status updates, with the check being a correctly approved customer update during a live incident.
With a structured path like this, each sprint helps build an AI-competent organization.
To get started, download our AI skills development eBook, How to Drive AI Skill Development Across Your Operations Teams.