What the 2025 DORA Report Teaches Us About Observability and Platform Quality

The 2025 DORA State of AI-Assisted Software Development Report delivers a critical insight for technology leaders: AI is fundamentally an amplifier, not a solution. It magnifies the strengths of high-performing organizations with robust observability while exposing the dysfunctions of struggling ones.

For organizations that have rushed to adopt AI coding assistants expecting immediate productivity gains, this finding demands a strategic pivot. The research shows that while 90% of developers now use AI tools and report productivity increases, AI adoption also increases software delivery instability unless your organization has built the right observability foundation.

Seven AI capabilities that amplify results

DORA's inaugural AI Capabilities Model identifies seven foundational capabilities that amplify AI's benefits for software development teams. Two stand out for their direct connection to observability and operational excellence:

  1. Quality internal platforms: Organizations with robust internal development platforms see dramatically amplified returns from AI on organizational performance
  2. Healthy data ecosystems: High-quality, accessible, unified data systems are essential for AI to deliver measurable value

The 2025 DORA Report is explicit about the importance of observability infrastructure: "A high-quality platform serves as the distribution and governance layer required to scale the benefits of AI from individual productivity gains to systemic organizational improvements. Without this foundation, AI adoption remains a series of disconnected local optimizations."

The hidden risk of AI without observability

The DORA research on AI-assisted software development reveals a troubling pattern. While AI helps developers write code faster, this speed creates downstream chaos when observability systems aren't prepared to handle it. The report found that AI adoption corresponds with:

  • Increased software delivery throughput (positive outcome)
  • Increased software delivery instability (concerning trend)
  • No reduction in friction or burnout (missed opportunity)

This isn't a failure of AI technology. It's a failure of organizational observability systems to evolve alongside AI-accelerated development practices.

As Charity Majors, CTO and Co-founder of Honeycomb, explains in her article on disposable code, AI fundamentally changes how developers interact with code, and this shift has profound implications for observability. Traditional monitoring assumes code is carefully crafted, reviewed, and stable. But when developers can generate and deploy multiple variations of the same functionality within hours, the old assumption that "we'll instrument it properly during code review" breaks down. Code moves too fast for traditional instrumentation practices to keep up.

According to Charity, the disposable code paradigm means organizations need observability that can (see the sketch after this list):

  • Automatically capture context as code is generated and deployed, not rely on manual instrumentation
  • Track rapidly changing implementations of the same functionality to understand which variations perform best
  • Connect ephemeral code artifacts to persistent customer experience metrics, even as the underlying implementation changes
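
To make the first point concrete, here is a minimal sketch of automatic context capture with OpenTelemetry: deployment identity and a code-variant label are attached as resource attributes, so every span from a freshly generated implementation is distinguishable in Honeycomb without per-function manual instrumentation. The attribute name `app.code_variant` and the `GIT_SHA`/`CODE_VARIANT` environment variables are illustrative assumptions, not prescribed conventions.

```python
# Minimal sketch (not Honeycomb's prescribed setup): attach deployment context to
# every span automatically via OpenTelemetry resource attributes, so rapidly
# regenerated code variants are distinguishable in production without manual,
# per-function instrumentation. GIT_SHA, CODE_VARIANT, and "app.code_variant"
# are illustrative names, not established conventions.
import os

from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# Resource attributes are stamped onto every span this service emits.
resource = Resource.create({
    "service.name": "support-agent",
    "service.version": os.getenv("GIT_SHA", "unknown"),         # set at deploy time
    "app.code_variant": os.getenv("CODE_VARIANT", "baseline"),  # e.g., "ai-gen-v3"
})

provider = TracerProvider(resource=resource)
# Honeycomb ingests OTLP directly; the endpoint and API-key header are typically
# supplied via OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_HEADERS.
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)
```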

Without this evolution in observability, the DORA Report's finding of increased instability makes perfect sense. Teams are generating code faster than they can understand its production behavior. The speed advantage of AI becomes a liability when organizations lack the observability infrastructure to validate what's actually working in production.

How Honeycomb solves this problem

A leading AI-first customer service company experienced this disconnect firsthand. Their flagship AI agent boasted impressive metrics: a 65% resolution rate, improving by 1% each month. Engineers deployed changes 70 times per day across 19 teams. By conventional software delivery metrics, they were succeeding. Then customers started complaining: "It's painfully slow."

Using Honeycomb's distributed tracing and high-cardinality data model, the team created a "time to first token" root span that captured the complete customer experience, from question submission to answer streaming. This customer-centric metric was then connected to their backend instrumentation. With Honeycomb, engineers could now "formulate a hypothesis and see the real time impact of each change to the only thing that mattered: end-to-end latency." This real-time validation capability was impossible with their previous monitoring stack.
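
A minimal sketch of what such a "time to first token" root span might look like with OpenTelemetry is below. It illustrates the pattern rather than the customer's actual code: `stream_answer()` and the `app.*` attribute names are hypothetical stand-ins, and a tracer already configured to export to Honeycomb (as in the earlier snippet) is assumed.

```python
# Illustrative sketch of a "time to first token" root span; stream_answer() and
# the app.* attribute names are hypothetical stand-ins, and a tracer configured
# to export to Honeycomb (as in the earlier snippet) is assumed.
import time

from opentelemetry import trace

tracer = trace.get_tracer("support-agent")

def answer_question(question: str) -> str:
    with tracer.start_as_current_span("handle_customer_question") as span:
        span.set_attribute("app.question_length", len(question))
        start = time.monotonic()
        first_token_seen = False
        chunks = []

        for token in stream_answer(question):  # hypothetical streaming LLM call
            if not first_token_seen:
                first_token_seen = True
                # The number the customer actually feels: how long until the
                # answer starts appearing, whatever changed in the backend.
                span.set_attribute("app.time_to_first_token_ms",
                                   (time.monotonic() - start) * 1000)
            chunks.append(token)

        span.set_attribute("app.total_latency_ms", (time.monotonic() - start) * 1000)
        return "".join(chunks)
```

With time to first token recorded on the root span, each deploy's effect on end-to-end latency can be graphed and broken down, for example by the code-variant attribute from the previous sketch.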

The team then extended Honeycomb's observability to include business metrics. This enabled instant cost-benefit analysis that led to discoveries impossible with traditional BI tools, including catching a critical billing discrepancy where prompt caching was causing misreported token counts.
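
One way to extend the same traces with business metrics, sketched here with assumed `llm.*` attribute names and a placeholder price: record raw, cached, and completion token counts alongside an estimated cost, so a caching-related billing discrepancy surfaces as a queryable mismatch.

```python
# Illustrative sketch of recording business metrics on the same trace. The llm.*
# attribute names and per-token price are assumptions; substitute your provider's
# real pricing and your own naming convention.
from opentelemetry import trace

PRICE_PER_1K_TOKENS_USD = 0.003  # illustrative placeholder, not a real rate

def record_llm_cost(span: trace.Span, prompt_tokens: int, completion_tokens: int,
                    cached_prompt_tokens: int = 0) -> None:
    # Recording cached prompt tokens separately is what makes a caching-related
    # billing discrepancy visible as a mismatch you can query.
    billable = (prompt_tokens - cached_prompt_tokens) + completion_tokens
    span.set_attribute("llm.prompt_tokens", prompt_tokens)
    span.set_attribute("llm.cached_prompt_tokens", cached_prompt_tokens)
    span.set_attribute("llm.completion_tokens", completion_tokens)
    span.set_attribute("llm.estimated_cost_usd", billable / 1000 * PRICE_PER_1K_TOKENS_USD)
```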

The results with Honeycomb observability were dramatic:

  • 60% reduction in median customer wait time (below seven seconds for AI responses)
  • Continuous 1% monthly improvement in AI resolution rates through systematic A/B testing enabled by Honeycomb's experimentation capabilities
  • Real-time cost optimization catching LLM billing discrepancies invisible in traditional tools
  • Rapid incident response where teams could "quickly confirm and diagnose the slow down and fix it too" when SLO alerts fired

As the technical lead noted: "Honeycomb helps us keep an eye on the end-to-end user experience, like how we empathize with the user, put ourselves in their shoes, and make sure that we measure exactly what matters to them."

This mirrors precisely what the DORA Report found: without proper observability, teams optimize components that aren't even on the critical path affecting user-perceived performance, creating local improvements with no organizational value.

Honeycomb Intelligence for AI-native observability

Honeycomb's launch of Honeycomb Intelligence directly addresses the foundational capabilities the DORA Report identifies as critical for AI development success. This isn't about adding AI features to an existing monitoring tool. It's about reimagining observability for an AI-native development paradigm.

Canvas for AI-native investigation that scales team knowledge

The DORA Report found that "AI-accessible internal data" is a key capability for successful AI adoption. Teams see amplified benefits when AI tools are connected to internal company information. Canvas embodies this principle by serving as a persistent, intelligent investigation workspace where:

  • Context flows continuously between human operators and AI agents through Model Context Protocol (MCP) integration
  • Tribal knowledge becomes queryable infrastructure instead of being locked in senior engineers' heads
  • Investigations compound over time rather than starting from scratch with each incident

MCP integration for closing the feedback loop AI creates

The DORA Report emphasizes that "strong version control practices" amplify AI's benefits, specifically the ability to roll changes back rapidly. Honeycomb's MCP integration extends this principle beyond code to operational context:

  • Development tools can query production telemetry in real-time as AI generates code
  • Teams can verify changes against actual system behavior before pushing to production (a rough sketch follows this list)
  • The "psychological safety net" DORA describes for version control now extends to runtime behavior

Query Assistant and Anomaly Detection shift teams from reactive to predictive

DORA's research shows AI adoption increases software delivery instability because downstream systems haven't evolved to handle AI-accelerated code generation. Honeycomb's AI capabilities address this directly:

  • Query Assistant democratizes complex system understanding, ensuring stability expertise isn't bottlenecked with senior engineers
  • Proactive Anomaly Detection catches issues introduced by AI-generated code before they cascade into production incidents
  • Both capabilities operate on Honeycomb's high-cardinality data model, providing the "healthy data ecosystem" DORA identifies as essential

Key takeaways for your AI development strategy

Based on the DORA Report and our own findings at Honeycomb, here are three clear directives for technology leaders implementing AI-assisted development:

1. Stop thinking about AI adoption as a tooling problem. It's a systems transformation problem that requires reimagining your entire development and operational infrastructure, with observability as the foundation.

2. Prioritize platform quality before expanding AI usage. The DORA Report found that platforms "act as the distribution and governance layer required to scale the benefits of AI from individual productivity gains to systemic organizational improvements."

3. Invest in observability that matches AI velocity. Your monitoring tools were built for a world where humans wrote code at human speed. AI-accelerated development requires AI-native observability that can keep pace.

The bottom line on AI-native observability

The 2025 DORA Report validates what forward-thinking organizations are discovering: AI won't fix broken systems. It will expose them faster and at a greater scale.

The question isn't whether to adopt AI for software development. That ship has sailed: 90% of developers are already using it. The question is whether your observability infrastructure can handle what happens when AI-generated code hits production.

The AI-first customer service company's journey with Honeycomb illustrates both the problem and the solution. They transformed from teams optimizing in isolation with traditional monitoring to a unified organization with shared visibility into customer experience, engineering diagnostics, and business metrics through Honeycomb's observability platform.

Honeycomb Intelligence represents this different approach: observability rebuilt from the ground up for the AI-accelerated reality the DORA Report describes. It's designed to be the "quality internal platform" and "healthy data ecosystem" that the research shows are prerequisites for AI development success.

Want to learn more?

Talk to our team about how we're helping organizations build the operational foundation for AI development success.

Frequently asked questions

What is AI-native observability? AI-native observability refers to the ability to understand, measure, and optimize AI-driven systems in production. It goes beyond traditional monitoring to provide real-time insights into how code impacts customer experience, system stability, and business costs.

Why does the DORA Report emphasize platform quality for AI? The 2025 DORA Report found that quality internal platforms amplify AI's benefits on organizational performance: the platform acts as the distribution and governance layer that scales AI from individual productivity gains to systemic organizational improvements.

How does Honeycomb Intelligence differ from traditional monitoring? Honeycomb Intelligence is built for AI-native development workflows. It unifies customer experience metrics, engineering diagnostics, and business intelligence in a single queryable platform, enabling teams to validate optimizations in real-time.

What are the seven DORA AI capabilities? The DORA AI Capabilities Model identifies: clear AI stance, healthy data ecosystems, AI-accessible internal data, strong version control practices, working in small batches, user-centric focus, and quality internal platforms.

How can I measure if my observability platform is ready for AI development? Ask: Can you trace AI-generated code from development through production? Can you connect customer experience metrics to AI optimization efforts? Can you measure the cost-benefit of AI changes in real-time? If not, your observability infrastructure may not be ready.