What does code quality mean?

I’ve been reading Zen and the Art of Motorcycle Maintenance, and one of the topics it explores is quality and what it really means. The assertion is that quality can be seen and observed, but is nearly impossible to describe or articulate. With this in mind, I started thinking about what quality means for code.

Now, there are a few ways of analyzing code, but when it comes to articulating what really makes code high quality, I am similarly stumped. Of course, things like time and space complexity can be used to measure the efficiency of an algorithm, but is this the same as code quality? There are other metrics too, such as readability, conciseness (i.e., the number of lines or expressions), and the effectiveness of the OO design, but while these can be measured or at least assessed, there is no formula for combining them into a measure of quality.

But again, none of these in isolation truly defines quality. An algorithm or method may be perfectly optimized to run in constant time, but if no one supporting the system can understand it, or it leads to problems that have to be debugged, the quality is poor. Similarly, a poorly written method that is well documented, has good test coverage, and never needs to be touched again may be considered either high or low quality depending on how these metrics are weighed.

So let’s take a step back and ask: what does quality even mean in a customer-facing production environment? Things like the number of customer issues and contacts, scalability, and the time taken to investigate issues and fix bugs come to mind. Code may not be especially well optimized or even very well documented, but if it has excellent instrumentation through metrics, alarms, and reporting, is the quality good? A beautifully written method that only writes errors to a log file may not be as high quality as a poorly written one that reports to a custom error tool that makes debugging and fixing much easier.
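To make the log file versus error tool contrast concrete, here is a minimal sketch. Everything in it is hypothetical: `InMemoryReporter` stands in for whatever custom error tool a team might run, and `parse_quantity` is an invented example function. The point is only that a structured report carries the operation name, the error type, and the input that caused the failure, which is what makes later debugging faster than grepping a log line.

```python
import traceback
from dataclasses import dataclass, field


@dataclass
class ErrorReport:
    """One structured error record (hypothetical schema)."""
    operation: str
    error_type: str
    message: str
    context: dict
    stack: str


@dataclass
class InMemoryReporter:
    """Hypothetical stand-in for a custom error tool; a real one would
    send reports to a service rather than keep them in a list."""
    reports: list = field(default_factory=list)

    def report(self, operation, exc, **context):
        # Capture what failed, how, and with which inputs.
        self.reports.append(ErrorReport(
            operation=operation,
            error_type=type(exc).__name__,
            message=str(exc),
            context=context,
            stack="".join(
                traceback.format_exception(type(exc), exc, exc.__traceback__)
            ),
        ))


def parse_quantity(raw, reporter):
    """Parse an integer quantity, reporting failures with their input."""
    try:
        return int(raw)
    except ValueError as exc:
        reporter.report("parse_quantity", exc, raw_value=raw)
        return None
```

With this in place, someone investigating a failure can filter reports by `operation` or `error_type` and see the offending `raw_value` directly, instead of reconstructing it from a free-text log message.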

So where does this leave us? I think at this point it is becoming more obvious that quality needs to be viewed at the macro, system level, though the micro level of code and methods can be used to get a picture of overall health, much as a doctor checks ears, nose, and throat to gauge how a patient is doing overall.

So now we are approaching a better understanding of quality in a software system, but how do we actually improve it? There are countless resources on the low-hanging fruit like unit tests, automated UX tests, and code reviews, all of which can help keep falling quality in check. Operational metrics can also help detect quality issues if closely monitored over time. Keeping an eye on trends like latencies, trouble tickets opened, and customer contacts is a great way to ensure that system quality isn’t slowly degrading.
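As a rough illustration of watching a trend rather than a single reading, the sketch below compares the most recent window of a metric against the window before it. The function name `is_degrading`, the seven-sample window, and the 10% tolerance are all assumptions chosen for the example, not recommended values; a real monitoring setup would more likely use the alarming features of its metrics system.

```python
from statistics import mean


def is_degrading(samples, window=7, tolerance=0.10):
    """Return True if the mean of the latest `window` samples exceeds the
    mean of the preceding `window` samples by more than `tolerance`
    (a fraction, so 0.10 means 10%).

    `samples` is a chronological list of readings for a metric where
    higher is worse, e.g. daily p99 latency or tickets opened per day.
    """
    if len(samples) < 2 * window:
        return False  # not enough history to compare two windows

    baseline = mean(samples[-2 * window:-window])
    recent = mean(samples[-window:])

    if baseline == 0:
        # Avoid dividing by zero: any increase from zero counts as worse.
        return recent > 0
    return (recent - baseline) / baseline > tolerance


# Hypothetical daily latency readings in milliseconds.
steady = [100] * 14
creeping = [100] * 7 + [120] * 7
print(is_degrading(steady), is_degrading(creeping))
```

Comparing window averages rather than individual points is a deliberate choice here: it ignores a single noisy day but still catches the slow drift that is easy to miss when looking at a dashboard once a week.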

Surprisingly, I’ve also found that interviews can help an organization’s quality stay high. This works in two ways. First, and more obviously, holding interviewees to a high bar in algorithms, data structures, and system architecture ensures that new blood is continually brought in, always raising the bar and ensuring the organization keeps improving in these areas. This is in fact the guiding principle of Amazon’s bar raiser program. Second, the interviewers themselves, by continuously evaluating these skills in others, will look at their own work and the work of their colleagues with a critical eye and find ways to improve it. This is another reason to ask real-world problems, or at least frame coding problems in real-world terms, as this kind of evaluation begins to get ingrained in their thinking. If someone is critically evaluating the time and space complexity of a list traversal algorithm during an interview, and they see a similar problem in their work, they will take a step back and actually begin interviewing themselves, evaluating their own response before deciding on a solution.

While instituting policies around code reviews and testing may seem like an easy panacea for quality issues, such policies may only address the tip of the iceberg, and the more institutional changes described above may be far more effective. Yes, code reviews and goals like 90% unit test coverage can help with the low-hanging fruit, but they are not the cure-all solutions they are sometimes advertised as. Instilling a culture of quality will be far more effective in the long term, as it creates a snowball effect: quality gathers momentum as it rolls through the organization.

So we still don’t have an absolute definition of software quality, but we at least have some measurable items now, and we have seen how we can address quality issues in both the short and long term.