Engineering leadership is hard; few would argue with that. It’s even harder when you don’t have any telemetry on how well your organization is operating. This is the root of the metrics frameworks out there today, like DORA, SPACE, and DevEx.
Today we are not going to argue the merits of any of these frameworks, but instead talk about the data itself. I have seen many organizations get absolutely lost in gathering the data, with initiatives to track certain sets of data taking months or even years.
I spoke with a company not too long ago that spent 12M to gather and implement the DORA metrics!
While getting your data in order is very valuable, is that the highest value problem you have? Are you looking to track metrics or actually make changes?
Don’t fall in love with your dashboards!
There are two ways to collect data: through qualitative surveys and through quantitative system data. They both come with their challenges, and you need to figure out which one is better for a given scenario.
Let’s take a look at how to gather Time to Value through quantitative methods. It isn’t the most complicated metric to calculate, but it paints a good picture of some of the challenges. I asked my ChatGPT friend to describe how to calculate this metric.
Identify Key Stages:
Commit Time: When code is committed to the repository.
Build Time: When the build process starts and completes.
Testing Time: When automated tests are run and completed.
Deployment Time: When code is deployed to production and available for use.
Use CI/CD Tools:
Tools like Jenkins, GitLab CI, CircleCI, or GitHub Actions can help you track the time from commit to build completion.
Use these tools to capture timestamps for when a commit is made, when the build starts, and when the build completes.
Version Control Systems:
Use your version control system (e.g., Git) to pull commit timestamps.
Automated Testing Frameworks:
Integrate automated testing frameworks that log timestamps when tests start and finish.
Deployment Tools:
Deployment tools like Spinnaker, ArgoCD, or custom deployment scripts should log when deployments start and finish.
Monitoring and Logging Tools:
Use monitoring and logging tools (e.g., Splunk, ELK stack, Datadog) to track when new deployments go live and are accessible to users.
Aggregating Data:
Aggregate the collected timestamps from different tools and stages. You can use a data visualization or analysis tool like Grafana, Kibana, or even a custom dashboard.
Calculating Time to Value:
Calculate the duration between the commit time and the deployment time to get the "time to value" metric.
Automating the Process:
Set up a pipeline that automatically logs and aggregates these metrics after each deployment.
Use tools like Prometheus to scrape these metrics and Grafana to visualize them.
That’s it! That’s all you have to do… couldn’t be simpler, right?
All you need to do is connect all of these systems together, have an ID that can be passed around, build a data model, and create a few reports. Simple!
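To make that concrete, here is a minimal sketch of just the last two steps: joining timestamps and computing the duration. It assumes the commit and deployment events have already been exported from your tools into a common format; the event shapes and field names are made up for illustration. Everything around this (the connectors, the shared ID, the data model, the reports) is where the real cost lives.

```python
# A minimal sketch, assuming commit and deployment events have already been
# pulled out of your CI/CD and deployment tools. The dict shapes and field
# names here are hypothetical; the commit SHA acts as the shared ID.
from datetime import datetime
from statistics import median

commit_events = [
    {"sha": "a1b2c3", "committed_at": "2024-05-01T09:15:00Z"},
    {"sha": "d4e5f6", "committed_at": "2024-05-01T11:40:00Z"},
]

deploy_events = [
    {"sha": "a1b2c3", "deployed_at": "2024-05-01T13:05:00Z"},
    {"sha": "d4e5f6", "deployed_at": "2024-05-02T08:20:00Z"},
]

def parse(ts: str) -> datetime:
    # Normalize the trailing "Z" so fromisoformat accepts it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

# Join commits to deployments on the SHA and compute commit-to-production
# duration in hours for each change that actually shipped.
deploys_by_sha = {d["sha"]: parse(d["deployed_at"]) for d in deploy_events}
durations_hours = [
    (deploys_by_sha[c["sha"]] - parse(c["committed_at"])).total_seconds() / 3600
    for c in commit_events
    if c["sha"] in deploys_by_sha
]

print(f"Median time to value: {median(durations_hours):.1f} hours")
```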
When considering how to track a given metric, what is the fidelity you need to get the job done? Is this a metric you will be tracking across teams indefinitely? Considering this is one of the DORA metrics, it might be worth going through the task of creating this dataset, or leveraging an engineering insights tool like DX to do the hard work for you.
The point here is that the fidelity of the information you gather really matters and should depend on what you are trying to accomplish.
So how else can you track some of these metrics? Qualitative surveys? That’s right: they can get you to an answer pretty quickly and at a useful fidelity.
Let’s look at another metric, Time to First Commit. It is far easier to track than Time to Value but still has its complexity. If you tracked it via survey, it could be a few simple questions for anyone who is new to a team.
How long did it take for you to make your first commit?
How comfortable are you in making another commit?
Considering you would use this metric to help improve and track onboarding, is there a difference between 1.337 days (quantitative) and 1.5 days (qualitative)? I would say it doesn’t really matter at all, especially when you consider the time, energy, and effort saved.
Another consideration is the additional information you can gather from surveys, like sentiment. It can be very easy to game the system when it comes to 1st or 10th commits to a codebase, but when you overlay that with a sentiment question you can get some additional insights. For example, someone might have made their first commit on their first day on the team… but the codebase was so complicated and the deployment process so flaky… that it was terrifying.
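Here is a rough sketch of what that overlay could look like, assuming you collected the two questions above plus a simple 1–5 comfort rating. The response format is invented for illustration, but a handful of lines gets you both the number and the signal hiding behind it.

```python
# A minimal sketch of turning onboarding survey answers into a team-level
# number, with a sentiment overlay. The response format is hypothetical:
# days to first commit plus a 1-5 comfort rating.
from statistics import median

responses = [
    {"days_to_first_commit": 1.0, "comfort": 2},  # fast, but a scary experience
    {"days_to_first_commit": 2.0, "comfort": 4},
    {"days_to_first_commit": 1.5, "comfort": 5},
]

median_days = median(r["days_to_first_commit"] for r in responses)

# Overlay sentiment: a quick first commit paired with low comfort is exactly
# the signal the raw number on its own would hide.
fast_but_uneasy = [
    r for r in responses
    if r["days_to_first_commit"] <= median_days and r["comfort"] <= 2
]

print(f"Median time to first commit: {median_days} days")
print(f"Fast-but-uncomfortable onboardings: {len(fast_but_uneasy)}")
```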
This is another great mechanism to gather information on how well a given team or organization is doing.
Don’t get me wrong, there are a ton of great uses for quantitative system data. The point is to consider the best way to gather data that provides insights into a given problem or set of problems. As you go on your metrics and data journey, I would consider the collection costs and the fidelity of information you need to get the job done.
Spend your time working on the problem, not on gathering the data; your engineering organization and its teams will thank you.
Gathering feedback from the development community and making them a part of the process will improve culture and lead to better outcomes. Care just needs to be taken in how you do it… because no one wants to be micro-managed or micro-measured.
Any thoughts or questions? Let me know!
Chris