The Real Problem with Software Development
It’s not writing code, it’s managing complexity
A few weeks ago, I saw a tweet that said “Writing code isn’t the problem. Controlling complexity is.” I wish I could remember who said that; I will be quoting it a lot in the future. That statement nicely summarizes what makes software development difficult. It’s not just memorizing the syntactic details of some programming language, or the many functions in some API, but understanding and managing the complexity of the problem you’re trying to solve.
We’ve all seen this many times. Lots of applications and tools start simple. They do 80% of the job well, maybe 90%. But that isn’t quite enough. Version 1.1 gets a few more features, a few more creep into version 1.2, and by the time you get to 3.0, an elegant user interface has turned into a mess. This increase in complexity is one reason that applications tend to become less usable over time. We also see this phenomenon as one application replaces another. RCS was useful, but didn’t do everything we needed it to; SVN was better; Git does just about everything you could want, but at an enormous cost in complexity. (Could Git’s complexity be managed better? I’m not the one to say.) OS X, which used to trumpet “It just works,” has evolved to “it used to just work”; the most user-centric Unix-like system ever built now staggers under the load of new and poorly thought-out features.
The problem of complexity isn’t limited to user interfaces; that may be the least important (though most visible) aspect of the problem. Anyone who works in programming has seen the source code for some project evolve from something short, sweet, and clean to a seething mass of bits. (These days, it’s often a seething mass of distributed bits.) Some of that evolution is driven by an increasingly complex world that requires attention to secure programming, cloud deployment, and other issues that didn’t exist a few decades ago. But even here the tradeoff cuts both ways: a requirement like security tends to make code more complex, yet complexity itself hides security flaws. Shrugging and saying “yes, adding security made the code more complex” is wrong on several fronts. Security that’s added as an afterthought almost always fails. Designing security in from the start almost always leads to a simpler result than bolting it on later, and the complexity stays manageable if new features and security grow together. If we’re serious about complexity, the complexity of building secure systems needs to be managed and controlled in step with the rest of the software; otherwise, it’s going to add more vulnerabilities.
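One classic illustration of that point, sketched in Python with the standard sqlite3 module (the function names and table are hypothetical, my example rather than anything from a real codebase): security bolted on after the fact tends to accumulate escape rules and special cases, while security designed in from the start is often the simpler code.

```python
import sqlite3

# Bolted-on "security": sanitize input after the fact. The escape rules only
# grow over time, and any missed case becomes an injection vulnerability.
def find_user_bolted_on(conn: sqlite3.Connection, name: str):
    name = name.replace("'", "''")  # one rule today; more tomorrow
    return conn.execute(f"SELECT * FROM users WHERE name = '{name}'").fetchall()

# Designed-in security: a parameterized query. Shorter, clearer, and safe by
# construction, because the driver never interprets the input as SQL.
def find_user_designed_in(conn: sqlite3.Connection, name: str):
    return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()
```

Here the safer version is also the simpler one; the extra complexity of the bolted-on approach buys nothing.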
That brings me to my main point. We’re seeing more code that’s written (at least in first draft) by generative AI tools, such as GitHub Copilot, ChatGPT (especially with Code Interpreter), and Google Codey. One advantage of computers, of course, is that they don’t care about complexity. But that advantage is also a significant disadvantage. Until AI systems can generate code as reliably as our current generation of compilers, humans will need to understand, and debug, the code those tools write. Brian Kernighan wrote that “Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?” We don’t want a future that consists of code too clever to be debugged by humans, at least not until the AIs are ready to do that debugging for us. Really brilliant programmers write code that finds a way out of the complexity: code that may be a little longer, a little clearer, a little less clever, so that someone can understand it later. (Copilot running in VSCode has a button that simplifies code, but its capabilities are limited.)
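To make Kernighan’s point concrete, here’s a contrived Python sketch; both functions are my invention, not from any real codebase. Both are correct, but only one leaves you somewhere to stand when the output looks wrong.

```python
# The "clever" version: one expression, several ideas, and no intermediate
# state to inspect when the answer comes out wrong.
def most_common_word(text: str) -> str:
    words = text.lower().split()
    return max(set(words), key=words.count)

# The clearer version: a few lines longer, but every step can be printed,
# tested, and understood by whoever has to debug it later.
def most_common_word_clear(text: str) -> str:
    counts: dict[str, int] = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return max(counts, key=counts.get)
```

Neither version handles an empty string, and both break ties arbitrarily; the difference is that the second gives a human (or a debugger) visible intermediate state to check.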
Furthermore, when we’re considering complexity, we’re not just talking about individual lines of code and individual functions or methods. Most professional programmers work on large systems that can consist of thousands of functions and millions of lines of code. That code may take the form of dozens of microservices running as asynchronous processes and communicating over a network. What is the overall structure, the overall architecture, of these programs? How are they kept simple and manageable? How do you think about complexity when writing or maintaining software that may outlive its developers? Millions of lines of legacy code going back as far as the 1960s and 1970s are still in use, much of it written in languages that are no longer popular. How do we control complexity when working with these?
Humans don’t manage this kind of complexity well, but that doesn’t mean we can check out and forget about it. Over the years, we’ve gradually gotten better at managing complexity. Software architecture is a distinct specialty, and it is only growing more important as systems grow larger and more complex, as we rely on them to automate more tasks, and as those systems need to scale to dimensions that were almost unimaginable a few decades ago. Reducing the complexity of modern software systems is a problem that humans can solve, and I haven’t yet seen evidence that generative AI can. Strictly speaking, that’s not a question that can even be asked yet. Claude 2 has a maximum context (the upper limit on the amount of text it can consider at one time) of 100,000 tokens¹; at this time, the context windows of all other large language models are significantly smaller. While 100,000 tokens is huge, it’s much smaller than the source code for even a moderately sized piece of enterprise software. And while you don’t have to understand every line of code to do a high-level design for a software system, you do have to manage a lot of information: specifications, user stories, protocols, constraints, legacies, and much more. Is a language model up to that?
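A back-of-the-envelope calculation shows the size of the gap. The tokens-per-line figure below is my assumption for illustration (real tokenizers vary widely with language and style), but the conclusion doesn’t depend on it much:

```python
# Rough scale check: how many 100,000-token context windows would it take
# to hold an entire codebase? TOKENS_PER_LINE is an assumed average.
TOKENS_PER_LINE = 10
CONTEXT_WINDOW = 100_000  # Claude 2's advertised maximum context

for lines in (10_000, 100_000, 1_000_000):
    tokens = lines * TOKENS_PER_LINE
    windows = tokens / CONTEXT_WINDOW
    print(f"{lines:>9,} lines ≈ {tokens:>10,} tokens ≈ {windows:>5,.0f} context windows")
```

Under these assumptions, even a 100,000-line system, small by enterprise standards, overflows the largest context available today by an order of magnitude.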
Could we even describe the goal of “managing complexity” in a prompt? A few years ago, many developers thought that minimizing “lines of code” was the key to simplification, and it would be easy to tell ChatGPT to solve a problem in as few lines of code as possible. But that’s not really how the world works, not now, and not back in 2007. Minimizing lines of code sometimes leads to simplicity, but just as often leads to complex incantations that pack multiple ideas onto the same line, often relying on undocumented side effects. That’s not how to manage complexity. Mantras like DRY (Don’t Repeat Yourself) are often useful (as is most of the advice in The Pragmatic Programmer), but I’ve made the mistake of writing convoluted code just to eliminate one of two very similar functions. There was less repetition, but the result was harder to understand. Lines of code are easy to count, but if that’s your only metric, you will lose track of qualities like readability that may be more important. Any engineer knows that design is all about tradeoffs, in this case trading off repetition against complexity, but difficult as these tradeoffs may be for humans, it isn’t clear to me that generative AI can make them at all, let alone make them better.
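Here’s a contrived sketch of that tradeoff (the functions and field names are hypothetical). The merged version eliminates the repetition, but it buries two simple behaviors under a flag and a default:

```python
# Two similar functions: a little repetition, but each one reads at a glance.
def format_user(user: dict) -> str:
    return f"{user['name']} <{user['email']}>"

def format_admin(admin: dict) -> str:
    return f"{admin['name']} <{admin['email']}> [admin]"

# The "DRY" merge: no repetition, but a reader now has to trace the flag and
# the default value to recover the two simple cases above.
def format_person(person: dict, is_admin: bool = False, label: str | None = None) -> str:
    formatted = f"{person['name']} <{person['email']}>"
    if is_admin:
        label = label or "admin"
    if label:
        formatted += f" [{label}]"
    return formatted
```

Neither choice is free: the first costs a little duplication, the second costs readability, and a metric that only counts lines will always pick the wrong one.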
I’m not arguing that generative AI doesn’t have a role in software development. It certainly does. Tools that can write code are certainly useful: they save us from looking up the details of library functions in reference manuals, and they save us from remembering the syntactic details of the less commonly used abstractions in our favorite programming languages. As long as we don’t let our own mental muscles decay, we’ll be ahead. I am arguing that we can’t get so tied up in automatic code generation that we forget about controlling complexity. Large language models don’t help with that now, though they might in the future. If they free us to spend more time understanding and solving the higher-level problems of complexity, that will be a significant gain.
Will the day come when a large language model will be able to write a million-line enterprise program? Probably. But someone will have to write the prompt telling it what to do. And that person will be faced with the problem that has characterized programming from the start: understanding complexity, knowing where it’s unavoidable, and controlling it.
Footnotes
- It’s common to say that a token is approximately ⅘ of a word, which would make 100,000 tokens roughly 80,000 words; it’s not clear how that estimate applies to source code, though. It’s also common to say that 100,000 words is the size of a novel, but that’s only true for rather short novels.