Coding Assistants - Friend or Foe?
Coding assistants have been around for roughly two years and have generated more hype than just about anything else in the past twenty years. The first mainstream model, ChatGPT, was released in 2022, and many variants have followed. There have been promises of 30-50% productivity gains and even the full replacement of engineers. Let's dig into the details and see if we can answer a very complex question: do genAI tools drive better productivity?
What are coding assistants (LLMs)?
If you hadn't heard of coding assistants by 2024, you must have been living under a rock for the previous couple of years. While most people have heard of them, there is still quite a bit of confusion about what they actually are. This is not a deep dive into the technical nature of LLMs, so let's just call them what they are: excellent text predictors.
They have the ability to answer questions, summarize data, and provide code for certain tasks. At first glance, they can seem to be magical with seemingly great answers to lots of different questions across many different languages and tasks. This ability has immense use cases across software engineering and the broader employment landscape.
We have built a worse calculator
For the whole of the technological age, we have worked on written logic. Regardless of how many times you run an application, you will get the same results. This is no longer the case with LLMs; they don't work on this logic-based system. They work much like a human would, taking in a bunch of inputs and then producing the response that makes the most sense based on that input.
Just like humans, LLMs have an error rate; their errors are commonly referred to as hallucinations. This is due to a few different things, but a key problem is that they are language based. An LLM is “searching” through its training data to find the next logical word to respond with. There is fuzziness in language, in the training data, and in how familiar the training data is with the question. All of these factors contribute to the rate of bad responses.
That said, we can't forget that they are great text predictors and don't actually understand any of these concepts. For example, you can teach an LLM to answer questions about Calculus without ever teaching it Algebra. This doesn't make any sense from a human perspective, since you have to know Algebra to understand Calculus.
But since an LLM is a language-based text predictor… it doesn’t care if it knows the fundamental underlying knowledge. It just knows what you teach it via language, and it will respond accordingly.
This is the fundamental understanding that we need to take when considering how and when to leverage LLMs or coding assistants.
Treat them like an intern or junior developer
I remember back in my early career, when I knew everything and those senior developers were just blowhards giving me trouble about doing things their way. Looking back, it is clear how much I didn't know and how confident I was in my half-baked solutions.
The seniors and other mentors I had kept a good eye on me while I learned and matured into a better developer over time. As I grew in my craft, I was given more and more leniency to take on bigger and bigger projects. In time, I was the one mentoring juniors and seeing so many of the same mistakes I had made along the way.
While this is a great pattern to help juniors in their career, this is NOT how you should treat LLMs and coding assistants. They will very, very confidently lie to you after getting it right many times in a row. You should always treat them like the first day a junior joined your team. They can be helpful, but don’t let them control too much, or you may end up paying the price.
Haven’t we seen this before?
Since the release of Copilot and others, the burning question has been: what are we trying to do with these tools? There are many who would say they are going to completely replace developers… To that I say good luck. Though there is a lot of fear across the industry of people being replaced by LLMs or coding assistants, I have yet to see an example where this is remotely possible.
You don't have to look too hard to find other historical examples of monumental change, like when typewriters emerged in the late 1800s. People feared losing their jobs, particularly scribes, clerks, and professional penmen who made their living through handwriting.
However, what actually happened is quite interesting: rather than eliminating jobs, typewriters transformed them and created new opportunities, especially for women entering the workforce. The role of "typist" or "typewriter" (as the operators were initially called) became a new professional category. Business schools began offering typing courses, and many women found employment opportunities as secretaries and stenographers - positions that previously had been largely held by men.
There are some fascinating parallels with today's AI and automation concerns. Just as typewriters didn't eliminate the need for written communication but rather changed how it was done, many technologies end up transforming jobs rather than completely eliminating them. That said, the impact on individual workers who had specialized in handwriting was very real - particularly professional penmen who had made their living through calligraphy and handwritten documents.
How are teams successfully leveraging AI?
To me, the biggest benefit is code generation for solved problems. For example, I want to integrate my app with the Google Maps API, integrate Stripe using React, or scaffold a new Java application. These are common problems that have been solved many, many times over and yet are things people struggle with on a day-to-day basis. Coding assistants can absolutely help with these types of tasks, including scaffolding tests, functions, methods, or just about any common block of code.
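As a concrete illustration, here is the kind of already-solved boilerplate an assistant tends to produce well: a minimal sketch of calling the Google Maps Geocoding API. This is my own hand-written sketch, not assistant output; the helper names and the placeholder key are my assumptions.

```python
import json
import urllib.request
from urllib.parse import urlencode

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

def build_geocode_url(address: str, api_key: str) -> str:
    """Build the request URL for the Google Maps Geocoding API."""
    return f"{GEOCODE_URL}?{urlencode({'address': address, 'key': api_key})}"

def geocode(address: str, api_key: str) -> dict:
    """Fetch the geocoding response for an address (needs network and a real key)."""
    with urllib.request.urlopen(build_geocode_url(address, api_key)) as resp:
        return json.load(resp)
```

Glue code like this has been written thousands of times before, which is exactly why it sits squarely inside the training data and why assistants tend to get it right.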
Another benefit is being able to document code, though I have a bit more of a problem with this one. Yes, you can ask Copilot or any other coding assistant to “document” your code, but the result is far from trustworthy. The better pattern is to have it scaffold your documentation and then rewrite or edit the content based on your knowledge of the code and what you were trying to solve with it. This can be very helpful while mitigating the downsides that come with this use case.
Another common pattern I have seen is using coding assistants during the PR process. This is not dissimilar to the documentation use case above, just applied in a different way. When you submit a PR, having a coding assistant write up the PR commit message can be very helpful. While this is useful, care really needs to be taken to ensure the accuracy of the write-up and to include the context that could not be derived from the code itself.
Can you trust the output?
It is imperative not to forget that these tools don't actually know how to code; the output is just the best prediction of words based on your input. Is this something you can trust? The answer is kind of… or, to state it a little more usefully: trust, but verify.
This can be a challenge, especially when they seem so smart and get it right a lot of the time. The big question you need to ask is: what happens when it gets it wrong? Maybe a production outage, maybe rework or dependency issues, or maybe something worse.
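In practice, “trust but verify” can be as simple as pinning down assistant-generated code with a few assertions before you accept it. A minimal sketch, where `slugify` stands in for a hypothetical assistant suggestion:

```python
import re

# Imagine an assistant suggested this helper; don't just eyeball it.
def slugify(title: str) -> str:
    """Turn a title into a URL-friendly slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")

# Verify against cases you care about, including the ugly edge cases,
# before the code ever reaches a PR.
assert slugify("Hello, World!") == "hello-world"
assert slugify("  spaces   everywhere ") == "spaces-everywhere"
assert slugify("") == ""
```

If an assertion fails, you have caught the confident lie on day one instead of in production.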
Going all the way back to the first autonomous cars, when they were just starting to be tested on real roads, people started trusting them very quickly. It didn't take long until there were videos of people sleeping in the cars while they drove down the road.
The same pattern can be seen in the usage of coding assistants. I recently helped a client analyze their coding assistant usage to see if it was actually making them more productive. This is not an easy question for most people to answer, because I believe it contains a false dichotomy.
The Development or Iron Triangle
I recently posted a full article on this topic, but as a quick recap, the development triangle comprises cost, quality, and speed. As with any triangle, you get to dictate two sides, and the third is determined for you. The only other option you have is to make the triangle bigger, but you can check out the full article for the deeper details.
But back to measuring coding assistants: you can't just look at speed. You have to look through the lens of the triangle and understand the broader impact. The client I worked with saw a very clear “productivity” gain when looking at the number of PRs submitted by the group using coding assistants. It was almost double the amount for the same sample size.
But through the quality lens, they had 4x more PRs merged without review and 80% fewer comments on PRs overall. These are staggering stats, which also coincide with a recent study from Uplevel that showed a 41% increase in bugs for teams using coding assistants. That study also showed little change in PR cycle time, time to merge, or PR throughput.
The problem is not a need for faster typists
The biggest issue I have with coding assistants and the challenges they claim to tackle is the assumption that typing faster is the problem. In a recent study from Haystack Analytics, ~75% of developers suffer from burnout, yet the vast majority of them code on the weekend. The most common reason cited for burnout is inefficient processes.
This flies directly in the face of the coding assistant hype: that coding assistants can deliver 30, 40, or even 50% productivity gains. On the surface, this sounds ridiculous… and it is. Consider that developers only spend 30-40% of their time coding (State of DevOps report). The rest of the time is spent dealing with all of the other organizational nonsense that is so common across enterprises.
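The arithmetic here is just Amdahl's law applied to the workday: if coding is only 30-40% of a developer's time, even a large coding speedup caps out at a modest overall gain. A quick sketch using the numbers above (the 50% coding speedup is the hype figure, not a measurement):

```python
def overall_gain(coding_share: float, coding_speedup: float) -> float:
    """Amdahl-style estimate: only the coding fraction of the job gets faster.

    coding_share: fraction of time spent coding (e.g. 0.35)
    coding_speedup: fractional speedup of the coding itself (0.5 = 50% faster)
    """
    new_time = (1 - coding_share) + coding_share / (1 + coding_speedup)
    return 1 / new_time - 1

# Even a 50% faster coding loop on 35% of the workday yields roughly a 13% overall gain.
print(f"{overall_gain(0.35, 0.5):.1%}")
```

In other words, the advertised 30-50% gains would require speeding up work the assistant never touches.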
Context is the key to success
The problem is that knowledge has always been, and always will be, the differentiator for humans, and with this new breed of computer, the same holds true. How much data has the model been trained on, and is the problem you are trying to solve in the training data? Here is an interesting comment from a Reddit post focused on coding assistants and whether people are using them.
There are many who will read this and say something like, “These models are just going to keep getting smarter and will have more training data to help solve this problem.” While that might be true for commonly solved problems, will an LLM really understand your company's inner workings? Will it understand the context in your code base of 100k lines? Does it understand the 20 years of legacy code and tech debt that has built up over time? I would say the answer to most of those is no.
It will get better; there is no doubt about that. But I would say that we are a long, long way from some of this coming to fruition. Why? Economics.
There have been burger-flipping robots for at least 25 years, but do we see them running every fast food joint? No, we don't, and it's really easy to see why. They still need a lot of humans around to do all of the other work. They can make one part of the process easier and faster, but the cost savings just don't make sense financially over time. When you add in the cost of the machine, its maintenance costs, what happens when it breaks, and the humans still needed to handle a lot of the problems, it just doesn't add up.
To me, the same problem exists with coding assistants. If you could have a code base completely controlled by an LLM, then you could make a case that the cost and quality problems would be easily overcome. Actually, we already have that in low/no-code solutions, and I don't see those taking over software development any time soon.
So, are coding assistants friend or foe?
Now that we understand a bit more about coding assistants, let's consider the main question: are they a friend or foe? Like most things in life, it depends on the implementation.
Are you using them in a way that brings out their best qualities?
Are you aware of the quality, cost, speed triangle?
Are you validating results?
As you can see, it's up to you to decide. They can be very helpful for some (usually more senior) and a hindrance for others (usually more junior). My general advice is to allow people to use them when they want to, have good quality metrics in place, and don't expect more out of them than they are good for.
Don’t expect 30 or 50% productivity gains as coding assistants will never get you there. Instead focus on the big problems your developers are facing and how you can improve those value streams. This is where the big gains can be found, not in an easy button.
Until next time,
Chris