When to “address” “technical debt”?

The phrase “technical debt” appears in scare quotes here because, as observed in The Unreasonable Ineffectiveness of Considering Things Harmful, technical debt has quite a specific meaning and I’m talking about something broader here. Quoting Ward Cunningham:

Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite. Objects make the cost of this transaction tolerable. The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt. Entire engineering organizations can be brought to a stand-still under the debt load of an unconsolidated implementation, object-oriented or otherwise.

Ward Cunningham, The WyCash Portfolio Management System

It’s not old code that’s technical debt, it’s lightly-designed code. That thing that seemed like the solution when you first thought of it. Yes, ship it, by all means, but be ready to very quickly rewrite it when you learn more.

Some of what people mean when they say “we need to bring our technical debt under control” is that kind of technical debt, struggling under the compound interest of if statement accrual as multiple developers have added behaviour without adding design. But there are other things. Cutting corners is not technical debt, it’s technical gambling. Updating external dependencies is not technical debt repayment, but it does still need to be done. Removing deprecated symbols is paying for somebody else’s technical debt, not yours: again you still have to do it. Replacing last month’s favoured npm modules with this month’s is not technical debt, it’s buying yourself a new toy.

But all of these things get done, and all of these things need to get done. It’s the cost of deploying a system into an evolving context (and as I’ve said before, even that act of deployment itself triggers evolution). So the question is when, how often, how much?

Some teams put their “engineering requirements”, their name for the evolution-coping tasks, onto the same backlog as the product requirements, then try to advocate for prioritising them alongside feature requests and bug fixes. Unfortunately this rarely works: the perceived benefit of the engineering activity is zero retained customers plus zero acquired customers = zero revenue, and yet it costs the same as fixing a handful of customer-reported bugs.

So, other groups just try to carve out time. Maybe it’s “20% of developer effort on the sprint is not for product tasks”. Maybe it’s “there is a week between delivering one iteration and starting the next”. Maybe it’s “whoever is on support rotation can pick up engineering tasks when there’s no fire to put out”. And the most anti- of all the patterns is the “hardening sprint”: once per quarter we’ll spend two weeks fixing the problems we’ve been making for ourselves in the intervening time. All of these have the benefit of giving a predictable cadence, though they still suffer a bit from that product envy problem: why are we paying for these engineers to do non-useful work when they could be steadily adding value?

The key point is that part about steadily adding value. We know the reason we need to do this: it’s to avoid being brought to Ward’s stand-still. We need to consolidate what we’ve learned, we need to evolve the system to adapt to evolutionary changes in its context, we need to fix past mistakes. And we need to do it constantly. Remember the quote: “Every minute spent on not-quite-right code counts as interest on that debt”.

Ultimately, these attempts to carve out time are requests to do our jobs properly, directed at people who don’t have the same motivations that we do. That’s not to say that their motivations are wrong. Like us, they only have a partial view of the overall picture. Unlike us, that view does not extend to an understanding of how expensive a liability our source code is.

When we ask for time between this iteration and the next to “service technical debt”, we are saying “I know that I’m doing a bad job, I know what I need to do to be doing a good job, and I would like to do a good job for four hours in a fortnight’s time on Friday afternoon, if that’s alright with you”. Ironically we do not end up doing a better job, we normalise doing a bad job for the next couple of weeks (and undoubtedly finding that some delivery/support/operations problem gets in the way for those four hours anyway).

I recommend that my mentees, my reports, and any engineer who will listen avoid advocating for time-boxed good work. I propose building the trust relationship where the people who need the code written are happy that the code is being written, and being written well, without feeling the need to check over our shoulders to see how the sausage is made. Then we don’t need to justify doing a good job, and certainly don’t need to ask permission: we just do it while we’re going. When someone asks how long it’ll take to do something, the answer is how long it’ll take to do properly, with all the rewriting, testing, and everything else it takes to do it properly. What they get out at the end is something worth having, that doesn’t need hardening, or 20% of the effort dedicated to patching it up.

And of course, what they get is something that will almost immediately need a rewrite.

Posted in process | Tagged | 3 Comments

So that’s how it works

Back in Apple Silicon, Xeon Phi, and Amigas I asked how Apple would scale the memory up in a hypothetical Mac Pro based on the M1. We still don’t know because there still isn’t one, although now we sort of do know.

The M1 Ultra uses a dedicated interconnect allowing two (maybe more, but definitely two) M1 Max to act as a single SoC. So in an M1 Ultra-powered Mac Studio, there’ll be two M1 packages connected together, acting as if the memory is unified.

It remains to be seen whether the interconnect is fast enough that the memory appears unified, or whether we’ll start to need thread affinity APIs to say “this memory is on die 0, so please run this thread on one of the cores in die 0”. But, as predicted, they’ve gone for the simplest approach that could possibly work.

BTW here’s my unpopular opinion on the Mac Studio: it’s exactly the same as the 2013 Mac Pro (the cylinder one). Speeds, particularly for external peripherals on USB and Thunderbolt, are much faster, so people are ready to accept that their peripherals should all be outside the box. But really the power was all in using the word Studio instead of Pro, so that people don’t think this is the post-cheesegrater Mac.

Posted in AAPL, arm, Mac | Leave a comment

Episode 51: Responding to Change

Sometimes it just seems like our customers are fickle flibbertigibbets who change their minds at the drop of a hat, right? Let’s look at what might be going on, and how to work with that.

Don’t forget that you can subscribe to the newsletter to keep up to date with everything Graham Lee and software engineering on the internet!


Having the right data

In the beginning there was the relational database, and it was…OK, I guess. It was based on the relational model, and allowed operations that were within the relational algebra.

I mean it actually didn’t. The usual standard for relational databases is ISO 9075, or SQL. It doesn’t really implement the relational model, but something very similar to it. Still, there is a standard way for dealing with relational data, using a standard syntax to construct queries and statements that are mathematically provable.

I mean there actually isn’t. None of the “SQL databases” you can get hold of actually implement the SQL standard accurately or in its entirety. But it’s close enough.

At some point people realised that you couldn’t wake up the morning of your TechCrunch demo and code up your seed-round-winning prototype before your company logo hit the big screen, because it involved designing your data model. So the schemaless database became popular. These let you iterate quickly by storing any data of any shape in the database. If you realise you’re missing a field, you add the field. If you realise you need the data to be in a different form, you change its form. No pesky schemata to migrate, no validation.

I mean actually there is. It’s just that the schema and the validation are the responsibility of the application code: if you add a field, you need to know what to do when you see records without the field (equivalent to the field being null in a relational database). If you realise the data need to be in a different form, you need to validate whether the data are in that form and migrate the old data. And because everyone needs to do that and the database doesn’t offer those facilities, you end up with lots of wasteful, repeated, buggy code that sort of does it.
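A minimal sketch of what that application-side responsibility can look like, in Python. The record shape, the `confirmed_on` field, and both helper names are hypothetical, invented purely for illustration; they are not taken from any real codebase.

```python
from datetime import date

# Hypothetical records as they might come back from a schemaless store.
# Record 1 was written before the (made-up) "confirmed_on" field existed.
# Record 2 stores the date as day/month/year text; record 3 uses ISO 8601.
records = [
    {"id": 1, "country": "GB"},
    {"id": 2, "country": "FR", "confirmed_on": "03/02/2021"},
    {"id": 3, "country": "DE", "confirmed_on": "2021-02-04"},
]


def confirmed_on(record):
    """Read the field, treating a missing key like a relational NULL."""
    return record.get("confirmed_on")


def migrate(record):
    """Return a copy of the record with any date rewritten as ISO 8601."""
    migrated = dict(record)
    value = confirmed_on(record)
    # The application has to recognise the old form itself: nothing in
    # the database says which shape a given record is in.
    if value is not None and "/" in value:
        day, month, year = (int(part) for part in value.split("/"))
        migrated["confirmed_on"] = date(year, month, day).isoformat()
    return migrated


migrated_records = [migrate(record) for record in records]
```

Every application that reads these records needs its own equivalent of `confirmed_on` and `migrate`, which is where the repeated, slightly different, slightly buggy code comes from.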

So the pendulum swings back, and we look for ways to get all of that safety back in an automatic way. Enter JSON schema. Here’s a sample of the schema (not the complete thing) for Covid-19 cases in Global.health:

{
  bsonType: 'object',
  additionalProperties: false,
  properties: {
    location: {
      bsonType: 'object',
      additionalProperties: false,
      properties: {
        country: { bsonType: 'string', maxLength: 2, minLength: 2 },
        administrativeAreaLevel1: { bsonType: 'string' },
        administrativeAreaLevel2: { bsonType: 'string' },
        administrativeAreaLevel3: { bsonType: 'string' },
        place: { bsonType: 'string' },
        name: { bsonType: 'string' },
        geoResolution: { bsonType: 'string' },
        query: { bsonType: 'string' },
        geometry: {
          bsonType: 'object',
          additionalProperties: false,
          required: [ 'latitude', 'longitude' ],
          properties: {
            latitude: { bsonType: 'number', minimum: -90, maximum: 90 },
            longitude: { bsonType: 'number', minimum: -180, maximum: 180 }
          }
        }
      }
    }
  }
}

This is just the bit that describes geographic locations, relevant to the falsehoods we believed about countries in an earlier post. This schema is stored as a validator in the database (you know, the database that’s easier to work with because it doesn’t have validators). But you can also validate objects in the application if you want. (Actually we currently have two shadow schemas: a Mongoose document description and an OpenAPI specification, in the application. It would be a good idea to normalise those: pull requests welcome!)
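To give a flavour of validating in the application, here is a hand-rolled Python sketch that checks only a few of the rules from the schema fragment above: no unknown properties, a two-character country, and the latitude and longitude ranges. It is not the project's code, and the function name `location_errors` is made up; a real application would more likely hand the whole schema to a validation library than reimplement it like this.

```python
# Property names allowed inside "location", copied from the schema fragment.
LOCATION_KEYS = {
    "country", "administrativeAreaLevel1", "administrativeAreaLevel2",
    "administrativeAreaLevel3", "place", "name", "geoResolution", "query",
    "geometry",
}
# Allowed range for each required coordinate inside "geometry".
GEOMETRY_RANGES = {"latitude": (-90, 90), "longitude": (-180, 180)}


def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def location_errors(location):
    """Return a list of problems with a location dict; empty means valid."""
    errors = []
    # additionalProperties: false
    for key in sorted(set(location) - LOCATION_KEYS):
        errors.append(f"unexpected property: {key}")
    # country: a string of exactly two characters
    country = location.get("country")
    if country is not None and not (isinstance(country, str) and len(country) == 2):
        errors.append("country must be a two-character string")
    # geometry: an object with both coordinates present and in range
    geometry = location.get("geometry")
    if geometry is not None and not isinstance(geometry, dict):
        errors.append("geometry must be an object")
    elif geometry is not None:
        for key in sorted(set(geometry) - set(GEOMETRY_RANGES)):
            errors.append(f"unexpected geometry property: {key}")
        for key, (low, high) in GEOMETRY_RANGES.items():
            if key not in geometry:
                errors.append(f"geometry.{key} is required")
            elif not is_number(geometry[key]) or not low <= geometry[key] <= high:
                errors.append(f"geometry.{key} must be a number from {low} to {high}")
    return errors
```

Returning a list of problems rather than raising on the first one is a design choice for this sketch: it lets a caller report everything wrong with a record in one pass.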

Posted in software-engineering | Tagged | Leave a comment

Episode 50: Organisation and Community

I look at the historical basis of the white collar/blue collar divide in defining occupations, and the problems this distinction has with comprehending modern roles like engineering and various technician occupations. I then have difficulty fitting software roles into any of those categories. This is important because it helps to understand the various commitments practitioners make to different organisations: professional societies, their employers, trade unions, technology user groups, craft guilds and so on.

Finally, there’s a call for you: what have you seen of various modes of organisation in your experience in software? Please do let me know by commenting below or emailing grahamlee@acm.org.


Aphorism Considered Harmful

Recently Dan North asked about the origin of the software design aphorism “make it work, make it right, make it fast”. Before delving into that story, it’s important to note that I had already heard this phrase. I don’t know where; it’s one of those things that’s absorbed into the psyche of some software engineers like “goto considered harmful”, “adding people to a late project makes it later” or “premature optimisation is the root of all evil”.

My understanding of the quote was something akin to Greg Detre’s description: we want to build software to do the right thing, then make sure it does it correctly, then optimise.

Make it work. First of all, get it to compile, get it to run, make sure it spits out roughly the right kind of output.

Make it right. It’s time to test it, and make sure it behaves 100% correctly.

Make it fast. Now you can worry about speeding it up (if you need to). […]

When you write it down like this, everyone agrees it’s obvious.

Greg Detre, “Make it work, make it right, make it fast”

That isn’t what everybody thinks though, as Greg points out. For example, Henrique Bastos laments that some teams never give themselves the opportunity to “make it fast”. He interprets making it right as being about design, not about correctness.

Just after that, you’d probably discovered what kind of libraries you will use, how the rest of your code base interacts with this new behavior, etc. That’s when you go for refactoring and Make it Right. Now you dry things out and organize your code properly to have a good design and be easily maintainable.

Henrique Bastos, “The make it work, make it right, make it fast misconception”

We already see the problem with these little pithy aphorisms: the truth that they convey is interpreted by the reader. Software engineering is all about turning needs into instructions precise enough that a computer can accurately and reliably perform them, and yet our knowledge is communicated in soundbites that we can’t even agree on at the human level.

It wasn’t hard to find the source of that quote. There was a special issue of Byte magazine on the C programming language in August 1983. In it, Stephen C. Johnson and Brian W. Kernighan describe modelling systems processing tasks in C.

But the strategy is definitely: first make it work, then make it right, and, finally, make it fast.

Johnson and Kernighan, “The C Language and Models for Systems Programming”

This sentence comes at the end of a section on efficiency, which follows a section on “Higher-Level Models” in which the design of programs that use C structures to operate on problem models, rather than bits and words, is described. The efficiency section tells us that higher-level models can make a program less efficient, but that C gives people the tools to get close to the metal to speed up the 5% of the code that’s performance critical. That’s where they lead into this idea that making it fast comes last.

Within context, the “right” that they want us to make appears to be the design/model type of “right”, not the correctness kind of right. This seems to make sense: if the thing is not correct, in what sense are you suggesting that you have already “made it work”?

A second source, contemporary with that Byte article, seems to seal the deal. Butler Lampson’s hints deal with ideas from various different systems, including Unix but also the Xerox PARC systems, Control Data Corporation mainframes, and others. He doesn’t use the phrase we’re looking for but his Figure 1 does have “Does it work?” as a functionality problem, from which follow “Get it right” and “Make it fast” as interface design concerns (with making it fast following on from getting it right). Indeed “Get it right” is a bullet point and cautionary tale at the end of the section on designing simple interfaces and abstractions. Only after that do we get to making it fast, which is contextualised:

Make it fast, rather than general or powerful. If it’s fast, the client can program the function it wants, and another client can program some other function. It is much better to have basic operations executed quickly than more powerful ones that are slower (of course, a fast, powerful operation is best, if you know how to get it). The trouble with slow, powerful operations is that the client who doesn’t want the power pays more for the basic function. Usually it turns out that the powerful operation is not the right one.

Butler W. Lampson, Hints for Computer System Design

So actually it looks like I had the wrong idea all this time: you don’t somehow make working software then correct software then fast software, you make working software and some inputs into that are the abstractions in the interfaces you design and the performance they permit in use. And this isn’t the only aphorism of software engineering that leads us down dark paths. I’ve also already gone into why the “premature optimisation” quote is used in misguided ways, in mature optimisation. Note that the context is that 97% of code doesn’t need optimisation: very similar to the 95% in Johnson and Kernighan!

What about some others? How about the ones that don’t say anything at all? It used to be common in Perl and Cocoa communities to say “simple things simple; complex things possible”. Now the Cocoa folks think that the best way to distinguish value types from identity types is the words struct and class (not, say, value and identity) so maybe it’s no longer a goal. Anyway, what’s simple to you may well be complex to me, and what’s complex to you may well be simplifiable but if you stop at making it possible, nobody will get that benefit.

Or the ones where meanings shifted over time? I did a podcast episode on “working software over comprehensive documentation”. It used to be that the comprehensive documentation meant the project collateral: focus on building the thing for your customer, not appeasing the project office with TPS reports. Now it seems to mean any documentation: we don’t need comments, the code works!

The value in aphorisms is similar to the value in a pattern language: you can quickly communicate ideas and intents. The cost of aphorisms is similar to the cost in a pattern language: if people don’t understand the same meaning behind the terms, then the idea spoken is not the same as the idea received. With the many software aphorisms that are older than a lot of the people using them, it’s best to assume that we don’t all have the same interpretation, and to share ideas more precisely.

(I try to do this for software engineers and for programmers who want to become software engineers, and I gather all of that work into my fortnightly newsletter. Please do consider signing up!)

Posted in whatevs | Tagged | 3 Comments

What is software engineering?

I suppose if I’m going to have a tagline like “from programming to software engineering”, we ought to have some kind of shared understanding of what that journey entails. It would be particularly useful to agree on the destination.

The question “what is software engineering?” doesn’t have a single answer. Plenty of people have the job title “software engineer”, or work in the “engineering” department of a software company, so we could say that it’s whatever they do. But that’s not very constructive. It’s too broad: anyone who calls themselves a software engineer could be doing anything, and it would become software engineering. It’s also too narrow: the people with the job title “software engineer” are often the programmers, and there’s more to software engineering than programming. I wrote a whole book, APPropriate Behaviour, about the things programmers need to know beyond the programming; that only scratches the surface of what goes into software engineering.

Sometimes the definition of software engineering is a negative-space definition, telling us what it isn’t so that something else can fill that gap. In Software Craftsmanship: The New Imperative, Pete McBreen argued that “the software engineering approach” to building software is something that doesn’t work for many organisations, because it isn’t necessary. He comes close to telling us what that approach is when he says that it’s what Watts Humphrey is lamenting in Why Don’t They Practice What We Preach? But again, there are reasons not to like this definition. Firstly, it’s a No True Scotsman definition: software engineering is whatever people who don’t do it properly, the way software craftsmen do it, do. Secondly, it’s just not particularly fruitful: two decades after his book was published, most software organisations aren’t using the craftsmanship model. Why don’t they practice what he preaches?

I want to use a values-oriented definition of software engineering: software engineering is not what you do, it’s why you do what you do, and how you go about doing it. No particular practice is or isn’t software engineering, but the way you evaluate those practices and whether or not to adopt them can adopt an engineering perspective. Similarly, this isn’t methodology consultancy: the problems your team has with Agile aren’t because you aren’t Agiling hard enough and need to hire more Agile trainers. But the ways in which you reflect on and adapt your processes can be informed by engineering.

I like Shari Lawrence Pfleeger’s definition, in her book Software Engineering: The Production of Quality Software:

There may be many ways to perform a particular task on a particular system, but some are better than others. One way may be more efficient, more precise, easier to modify, easier to use, or easier to understand. Consequently, Software Engineering is about designing and developing high-quality software.

There’s a bit of shorthand, or some missing steps here, that we could fill in. We understand that of the many ways to build a software system, we can call some of them “better”. We declare some attributes that contribute to this “betterness”: efficiency, precision, ease of adaptation, ease of use, ease of comprehension. This suggests that we know what the properties of the software are, which ones are relevant, and what values a desirable system would have for those properties. We understand what would be seen as a high-quality product, and we choose to build the software to optimise for that view of quality.

The Software Engineering degree course I teach on offers a similar definition:

Software Engineering is the application of scientific and engineering principles to the development of software systems—principles of design, analysis, and management—with the aim of:

  • developing software that meets its requirements, even when these requirements change;
  • completing the development on time, and within budget;
  • producing something of lasting value—easy to maintain, re-use, and re-deploy.

So again we have this idea that there are desirable qualities (the requirements, the lasting value, ease of maintenance, re-use, and re-deployment; and also the project-level qualities of understanding and controlling the schedule and the cost), and the idea that we are going to take a principled approach to understanding how our work supports these properties.

Let me summarise: software engineering is understanding the desired qualities of the software we build, and taking a systematic approach to our work that maximises those qualities.

Posted in software-engineering | Tagged | Leave a comment

Diagnosing a Docker image build problem

My Python script for Global.health was not running in production, because it couldn’t find some imports. Now the real solution is to package up the imports with setuptools and install them at runtime (we manage the environments with poetry), but the quick solution is to fix up the path so that they get imported anyway. Or so I thought.

The deployment lifecycle of this script is that it gets packaged into a Docker image and published to Amazon Elastic Container Registry. An EventBridge event triggers a Batch job definition using that image to be queued. So to understand why the imports aren’t working, we need to understand the Docker image.

docker create --name broken_script sha256:blah gives me a container based on the image. Previously I would have started that image and launched an interactive shell to poke around, but this time I decided to try something else: docker export broken_script | tar tf - gives me the filesystem listing (and all I would’ve done with the running shell was various ls incantations, so that’s sufficient).
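The same listing can be produced from a script. The sketch below uses Python's standard tarfile module to read an archive from a stream; the helper names are made up, and a tiny in-memory archive stands in for the real export output so that the example runs without Docker.

```python
import io
import tarfile


def list_tar_members(stream):
    """Return the member names of a tar archive read from a binary stream."""
    # Mode "r|*" reads a non-seekable stream, such as a pipe from a subprocess.
    with tarfile.open(fileobj=stream, mode="r|*") as archive:
        return [member.name for member in archive]


def example_archive():
    """Build a small in-memory archive standing in for `docker export` output."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in ("app/__init__.py", "app/S3Client.py"):
            # Empty files are enough here: only the names matter.
            archive.addfile(tarfile.TarInfo(name), io.BytesIO(b""))
    buffer.seek(0)
    return buffer


names = list_tar_members(example_archive())
```

Against a real container, the stream would be the stdout pipe of a `subprocess.Popen` call running `docker export` with `stdout=subprocess.PIPE`.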

Indeed my image had various library files alongside the main script:

/app/clean_old_ingestion_source_files.py
/app/EventBridgeClient.py
/app/S3Client.py
/app/__init__.py

Those supporting files should be in a subfolder. My copy command was wrong in the Dockerfile:

COPY clean_old_ingestion_source_files.py aws_access ./

This copies the content of aws_access into the current folder; I wanted to copy the folder itself (and, recursively, its content). Simple fix: break that line into two, putting the files in their correct destinations. Now rebuild the image, and verify that it is fixed. This time I didn’t export the whole filesystem from a container, I exported the layers from the image.

docker image save sha256:blah | tar xf -

This gives me a manifest.json showing each layer, and a tar file with the content of that layer. Using this I could just get the table of contents for the layer containing my Python files, and confirm that they are now organised correctly.
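That check can be scripted too. The following Python sketch reads the manifest.json from an unpacked image and lists the files in each layer it names. It relies on each manifest entry having a "Layers" list of paths to layer tar files; the function names, the `layer0/layer.tar` path, and the file inside it are invented so that the example runs against a fake image instead of a real one.

```python
import json
import tarfile
import tempfile
from pathlib import Path


def layer_listings(image_dir):
    """Map each layer path named in manifest.json to the files it contains."""
    image_dir = Path(image_dir)
    manifest = json.loads((image_dir / "manifest.json").read_text())
    listings = {}
    for image in manifest:
        for layer_path in image["Layers"]:
            with tarfile.open(image_dir / layer_path) as layer:
                listings[layer_path] = layer.getnames()
    return listings


def write_fake_image(directory):
    """Create a one-layer stand-in for an unpacked `docker image save`."""
    directory = Path(directory)
    (directory / "layer0").mkdir()
    with tarfile.open(directory / "layer0" / "layer.tar", "w") as layer:
        # Hypothetical file path, chosen to resemble the corrected layout.
        layer.addfile(tarfile.TarInfo("app/aws_access/S3Client.py"))
    manifest = [{"Layers": ["layer0/layer.tar"]}]
    (directory / "manifest.json").write_text(json.dumps(manifest))


with tempfile.TemporaryDirectory() as scratch:
    write_fake_image(scratch)
    listings = layer_listings(scratch)
```

Pointing `layer_listings` at the directory where the real archive was unpacked would give one listing per layer, which is enough to spot which layer holds the Python files.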

Posted in whatevs | Leave a comment

Episode 49: REST and SOAP

I talk both about the difficulties of having objective conversations comparing technologies on the interwebs, and about a particular recent success in doing so: a comparison of RPC-over-HTTP methods.

This particular conversation was on the Brumtech slack: I particularly recommend local software groups as a more diverse source of information than the user groups for any particular technology. (And if you’re in England’s West Midlands, check out Brumtech!)

Thanks for listening! Please remember to sign up to the newsletter for more like this!


Halloween is Over

Back in 2016, I sent the following letter to Linux Voice, and it was published in issue 24 as the star letter. LV came to an end (and made all of their content available as Creative Commons) when they merged with Linux Magazine. The domain still exists, but the certificate expired years ago; you should search for it if you’re interested in back numbers for the magazine and willing to take the risk on their SSL.

I think my letter is still relevant, so I’m reproducing it. Here’s what I wrote:

LV issue 023 contained, as have prior numbers, many jabs at Microsoft as the natural enemy of the Free Software believer. It’s time to accept that the world has changed.

Like many among your staff and readers, I remember that period when the infamous Halloween memos were leaked, and we realised joyfully that the Free Software movement was big enough to concern the biggest software company in the world.

I remember this not because it was recent, but because I am old: this happened in 1998. Large companies like Microsoft can be slow to change, so it is right that we remain sceptical of their intentions with Free and open source software, but we need to remember that if we define our movement as Anti-Microsoft, it will live or die by their fortunes alone.

While we jab at Azure for their plush Tux swag, Apple has become one of the largest companies on the planet. It has done this with its proprietary iPhone and iOS platforms, which lock in more first-party applications than 1990s Windows did when the antitrust cases started flying. You can download alternatives from its store (and its store alone), but the terms of business on that store prohibit copyleft software. The downloads obtained by Apple’s users are restricted by DRM to particular Apple accounts.

Meanwhile, Apple co-opts open source projects like Clang and LLVM to replace successful Free Software components like GCC. How does the availability of a cuddly Tux with Microsoft branding stack up to these actions in respect to the FSF’s four freedoms?

We celebrate Google for popularising the Linux kernel through its Android mobile OS, and companies like it, including Facebook and Twitter, for their contributions to open source software. However, these companies thrive by providing proprietary services from their own server farms. None has embraced the AGPL, a licence that extends freedom to remote users of a hosted service. Is it meaningful to have the freedom to use a browser or a mobile device for any purpose, if the available purposes involve using non-free services?

So yes, Microsoft is still important, and its proprietary Windows and Office products are still huge obstacles to the freedom of computer users everywhere. On the other hand, Microsoft is no longer the headline company defining the computing landscape for many people. If the Free Software movement is the “say no to Microsoft” movement, then we will not win. Rather we will become irrelevant at the same time as our nemesis in Redmond.

You may think that Steve Jobs is an unlikely role model for someone in my position, but I will end by paraphrasing his statement on his return to Apple. We need to get out of the mindset that for the Four Freedoms to win, Microsoft has to lose.

Graham Lee

Their deputy editor responded.

I had never stopped to consider this, but what you say makes 100% sense. In practice though, for most people Microsoft is still the embodiment of proprietary software. Apple is arguably a more serious threat, but Microsoft keeps shooting itself in the foot, so it’s an easier target for us. Apple at least makes a lot of good products along with its egregious attitudes towards compatibility, planned obsolescence and forced upgrades; Microsoft seems to be successful only by abusing its market position.

Andrew Gregory

Things have changed a bit since then: Apple have made minimal efforts to permit alternative apps in certain categories; Microsoft have embraced and extended more open source technologies; various SaaS companies have piled in on the “open source but only when it works in our favour” bandwagon; Facebook has renamed itself and is less likely to be praised now than it was in 2016.

But also things have stayed the same. As my friend and stream co-host Steven Baker put it, there’s a reason there isn’t an M in FAANG. Microsoft isn’t where the investors are interested any more, and they shouldn’t be where Free Software’s deciding battles are conducted.

If you like my writing on software engineering please subscribe to my fortnightly newsletter where I aggregate it from across the web, as well as sharing the things I’ve been reading about software engineering!

Posted in AAPL, FLOSS, msft | Tagged | Leave a comment