The workflow metrics that make elite dev teams


Promoted partner content

DORA metrics aren’t enough on their own. By focusing on pull request size, dev teams can quickly improve their cycle times and development workflow and make the leap to elite performance.

Since its inception in 2016, the DevOps Research and Assessment (DORA) program has provided dev teams with some great metrics to guide them on their journey to performing at an elite level. But DORA metrics should only be one piece of the puzzle.

Make no mistake, tracking DORA metrics is important and useful – just not for everything that an engineering team strives to do, such as showing how developers directly impact the business bottom line.

When DORA metrics are used in tandem with LinearB’s own Engineering Benchmarks, however, dev teams can start to power themselves toward elite workflows. Based on a study of nearly 2,000 dev teams and 847,000 code branches, these benchmarks show what the 10% of dev teams that we consider elite look like in practice.

Small but powerful PRs are best

What’s clear is that elite dev workflows start and end with small pull request (PR) sizes. In our experience, this is the best indicator of simpler merges, enhanced CI/CD, and faster cycle times.

PR size, rework rate, and deployment frequency all affect cycle times, but PR size continues to present the most significant opportunity for real organizational change.

Luckily, PR size is also the easiest metric to focus on. It’s concrete, measurable, and achievable. Elite teams make fewer than 225 code changes per PR (including additions and removals), which makes their PRs easier to review and safer to merge.
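As a rough illustration of how a team might check PR size against that benchmark, here is a minimal Python sketch. The 225-change limit comes from the figure above; the `PullRequest` class, its field names, and the sample PRs are hypothetical and not part of any LinearB tooling.

```python
from dataclasses import dataclass

# Threshold taken from the article: elite teams stay under 225 code changes.
ELITE_PR_SIZE_LIMIT = 225


@dataclass
class PullRequest:
    """Hypothetical record of a PR; field names are illustrative only."""
    title: str
    additions: int
    deletions: int

    @property
    def size(self) -> int:
        # PR size counts both added and removed lines.
        return self.additions + self.deletions


def is_elite_size(pr: PullRequest, limit: int = ELITE_PR_SIZE_LIMIT) -> bool:
    """Return True when the PR stays under the size limit."""
    return pr.size < limit


# Made-up sample data for demonstration.
prs = [
    PullRequest("Fix typo in README", additions=3, deletions=1),
    PullRequest("Rewrite billing module", additions=640, deletions=210),
]

for pr in prs:
    verdict = "within elite range" if is_elite_size(pr) else "consider splitting"
    print(f"{pr.title}: {pr.size} changes, {verdict}")
```

In practice the addition and deletion counts would come from whatever your Git host reports for each PR.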

Because small PRs get picked up and reviewed fast, they lower cycle times and positively impact other DORA metrics. There are fewer hand-offs and less idle time. Production blow-ups are smaller and teams can recover more quickly.

Beyond efficiency and moving work through the development pipeline quickly, low PR pickup and review times also tell a good story about team chemistry. Teams that have a smooth code review process also tend to have better code quality.
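To make pickup and review times concrete, the sketch below derives both from three timestamps. It assumes pickup time runs from when the PR is opened to its first review, and review time runs from that first review to the merge; these definitions, the function name, and the sample timestamps are assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Tuple


def pickup_and_review_time(
    opened_at: datetime,
    first_review_at: datetime,
    merged_at: datetime,
) -> Tuple[timedelta, timedelta]:
    """Split the life of a PR into pickup time and review time.

    Assumed definitions (not confirmed by the article):
      pickup time = first review - PR opened
      review time = merge - first review
    """
    if not (opened_at <= first_review_at <= merged_at):
        raise ValueError("timestamps must be in chronological order")
    return first_review_at - opened_at, merged_at - first_review_at


# Made-up timestamps for a single PR.
opened = datetime(2024, 1, 8, 9, 0)
first_review = datetime(2024, 1, 8, 11, 30)
merged = datetime(2024, 1, 8, 15, 0)

pickup, review = pickup_and_review_time(opened, first_review, merged)
print(f"pickup: {pickup}, review: {review}")
```

Tracking these two durations separately shows whether PRs are waiting for a reviewer or stalling during the review itself.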

To help streamline pull request merges, LinearB has released gitStream. This free dev tool allows teams to decide which pull requests should be deemed low, medium, or high risk. The tool has already allowed hundreds of dev teams to deploy more frequently by systematically not treating all PRs the same.

Increasing deployment frequency

If PR size represents the guts of a project, deployment frequency is the heart. Teams should always strive to plan and work in small, manageable, and quickly releasable chunks. Good scoping and planning nets out to smaller PR sizes, resulting in a team that is constantly merging and deploying. The more frequent the deployment, the better the organizational cadence and developer experience.

It’s important to note that elite deployment frequency is daily – and deploying less often than once a week suggests the need for critical focus. Daily deployment of code indicates a stable, healthy continuous delivery pipeline, which can happen quite naturally with smaller PRs.
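One way to turn those two thresholds into a quick check is sketched below. Only the "daily" and "more than a week" boundaries come from the text above; the function name, the "in between" label, and the choice to count calendar days rather than working days are assumptions.

```python
def classify_deployment_frequency(deploy_count: int, period_days: int) -> str:
    """Classify a team by the average number of days between deployments.

    Boundaries from the article: daily is elite, more than a week between
    deployments needs focus. Everything else is labeled 'in between' here,
    which is a placeholder label, not an official benchmark tier.
    """
    if period_days <= 0 or deploy_count < 0:
        raise ValueError("period_days must be positive and deploy_count non-negative")
    if deploy_count == 0:
        return "needs focus"

    days_per_deploy = period_days / deploy_count
    if days_per_deploy <= 1:
        return "elite"
    if days_per_deploy > 7:
        return "needs focus"
    return "in between"


# Made-up examples over a 30-day window.
for count in (30, 10, 2):
    print(f"{count} deploys in 30 days: {classify_deployment_frequency(count, 30)}")
```

A team that only deploys on working days may prefer to pass the number of working days as `period_days`.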

Smaller PR sizes correlate with higher test coverage and more thorough reviews (hallmarks of higher deployment frequency and code quality), reducing change failure rates (CFR). It’s also much easier to roll back and fix issues, helping to lower your mean time to restore (MTTR). Cycle times are lower, customers are happy, and so are developers.

Reworking and refactoring

The concept of rework rate (or code churn) can sometimes be confusing. When a dev writes code and it merges to the main “trunk” or the release, that code will almost always be refactored in time.

People assume refactoring is bad, but refactoring 6- or 12-month-old code is a good thing.

It’s important to distinguish between rework and refactoring. Refactoring is a process of making preexisting code more efficient. Rework is the bad kind – a repeating pattern in a poorly functioning process. Rework could also be due to a quality problem, or a sign that product and engineering aren’t aligned on objectives.

Unless the code has just been committed, strong refactoring is a healthy sign of a well-functioning team. Teams with smaller PRs, lower rework rates, and higher deployment frequency also have more time to focus on refactoring.
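The distinction can be sketched in code by looking at the age of the code being changed. The article does not say how recent "just been committed" is, so the 21-day cutoff below is an assumption, as are the `LineChange` class and the sample data.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed cutoff: changes to code younger than this count as rework.
# The article gives no exact number; adjust to suit your team.
RECENT_CODE_DAYS = 21


@dataclass
class LineChange:
    """Hypothetical record of one changed line."""
    # Age in days of the line being replaced; None means brand-new code.
    previous_age_days: Optional[int]


def classify(change: LineChange, recent_days: int = RECENT_CODE_DAYS) -> str:
    """Label a change as new work, rework, or refactor based on code age."""
    if change.previous_age_days is None:
        return "new work"
    if change.previous_age_days < recent_days:
        return "rework"
    return "refactor"


def rework_rate(changes: List[LineChange], recent_days: int = RECENT_CODE_DAYS) -> float:
    """Fraction of changed lines that touch recently committed code."""
    if not changes:
        return 0.0
    reworked = sum(1 for c in changes if classify(c, recent_days) == "rework")
    return reworked / len(changes)


# Made-up sample: one new line, one 3-day-old line, two older lines.
changes = [LineChange(None), LineChange(3), LineChange(200), LineChange(400)]
print(f"rework rate: {rework_rate(changes):.0%}")
```

Under these assumptions, a change to 6- or 12-month-old code is counted as refactoring, while a change to code committed a few days ago is counted as rework.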

How development workflow impacts other metrics

Understanding DORA metrics is important. They do matter, but they’re not enough.

PR size, deployment frequency, and rework rates all affect development workflow, impacting overall productivity and efficiency. Average, or even strong, dev teams can grow stronger by zeroing in on these key metrics. When they do, other metrics like planning accuracy, CFR, and MTTR often fall in line.

A crucial takeaway from LinearB’s Engineering Benchmarks study is that predictability stems from smaller PRs and shorter cycles. By looking at the key indicators, teams can foresee problems before they come up, and they have the time and space to plan for them. Instead of spending up to three cycles recovering, teams know exactly how long problem-solving will take – if those problems happen at all.

With proper focus and utilization, development workflow metrics can transform organizations. As useful as DORA metrics are, dev teams’ overarching goal should be using better indicators to power them into the elite performing bracket with a supercharged development workflow.

This article is based on an episode of Dev Interrupted, a podcast for dev leaders that explores different strategies and tricks for everything from managing dev teams to speeding up delivery times.
