Software design is refinement, not abstraction

James Koppel tells us that software engineers keep using the word “abstraction” and that he does not think it means what they think it means. I believe that he is correct, and that the confusion over the term abstraction comes from thinking that programming is about abstraction.

Programming is refinement, not abstraction. You take your idea of what the software should be doing and you progressively refine that idea until what you get is so rote, so formulaic, so prescribed, that you can create a plan that a computer will follow reliably and accurately. Back in the day, that process used to be done literally in the order written, as a sequence of distinct activities.

  • You take your idea of what the software should be doing: requirements gathering
  • refine that idea: software specification
  • create a plan that a computer will follow: construction
  • reliably and accurately: verification and validation

It doesn’t matter what paradigm you’re using, what tools, or whether you truly go from one step to the next as described here: what you’re doing is specifying the software you need, and getting so specific that you end up with instructions for a computer. Specific is the opposite of abstract: the process is the opposite of abstraction.

Thus Niklaus Wirth talks about Program Development by Stepwise Refinement. He describes a procedure for solving the 8 queens problem, then refines that procedure until it is entirely described in terms of Algol (more or less) instructions. He could have started by describing a function that turns the set of possible chess boards into the set of boards that solve 8 queens, or he could have started by describing the communication between the board and the queens that would cause them to solve 8 queens.

This is not to say that abstraction doesn’t appear in software development. Wirth’s starting point is an abstract procedure for solving a specific problem: you can follow that procedure to solve 8 queens, you just have to do a lot of colouring in yourself which a computer is incapable of. Maybe GPT-3 could follow that first procedure; maybe one of the intermediate ones.

And his end point is an abstract definition of the instructions the computer will do: you can say j<1 and the computer will do something along the lines of loading the word at the same address previously associated with j into an accumulator, subtracting one, checking the flags index, and conditionally modifying the program counter. And “loading the word at the same address” is itself an abstraction: what really happens might involve page faults, loading data from permanent storage, translation lookaside buffers, and other magic.

Abstractions in this view are a “meet in the middle” situation, not a goal: you can refine/specify your solution until you meet the thing that will do the rest of the work of refinement. Or sometimes a “meet off to the side” situation: if you can make your program’s data look like bags of tuples then you can use the relational model to do a lot of the storage and retrieval work, even if nothing in your problem description looks anything like bags of tuples.

Notice that Wirth’s last section is about generalisation, not abstraction: solving the “N queens” problem is not any less specific than solving the 8 queens problem.
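To illustrate where that refinement ends up, here is a minimal sketch of a backtracking solver with the board size as a parameter. It is written in JavaScript rather than Algol, it is not a transcription of Wirth's program, and the names solveQueens, rowOf and place are invented for this example.

```javascript
// Backtracking N queens: place one queen per column, trying each row in turn.
// Illustrative sketch only; not a transcription of Wirth's program.
function solveQueens(n) {
  const rowOf = [];                                   // rowOf[col] = row of the queen in that column
  const rowUsed = new Array(n).fill(false);           // rows that already hold a queen
  const diagUp = new Array(2 * n - 1).fill(false);    // diagonals where row + col is constant
  const diagDown = new Array(2 * n - 1).fill(false);  // diagonals where row - col is constant

  function place(col) {
    if (col === n) return true;                       // every column has a queen
    for (let row = 0; row < n; row++) {
      const up = row + col;
      const down = row - col + n - 1;
      if (rowUsed[row] || diagUp[up] || diagDown[down]) continue;
      rowOf[col] = row;
      rowUsed[row] = diagUp[up] = diagDown[down] = true;
      if (place(col + 1)) return true;
      // Dead end further along: take the queen back off and try the next row.
      rowUsed[row] = diagUp[up] = diagDown[down] = false;
    }
    return false;
  }

  return place(0) ? rowOf.slice(0, n) : null;
}
```

Calling solveQueens(8) and solveQueens(12) runs exactly the same procedure: the parameter generalises the problem without making any individual step less specific.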

Posted in design, software-engineering

Unit test: you keep using this word.

There’s an idea doing the rounds that the “unit” in “unit test” means the unity of the test, rather than a test of a software unit. Moreover, that it originally meant this, and that anyone who says “unit test” to mean the test of a software unit is misguided.

Here’s the report of the 1968 NATO conference on software engineering. On their page 20 (as marked, not as in the PDF) is a diagram of a waterfall-esque system development process, featuring these three phases among others (emphasis mine):

  • unit design
  • unit development
  • unit test

“Unit” meaning software unit is used throughout.

Posted in test, unittest

Episode 52: Software Freedom is a Civil Liberties Issue

Software freedom is a free speech issue. This has important consequences.


Why are we like this?

The recent post on addressing “technical debt” did the rounds of the usual technology forums, where it raised a reasonable question: why are people basing decisions about how to balance engineering-led and customer-led tasks on opinion? Why don’t engineers take an evidence-based approach to such choices?

The answer is complex but let’s start at the top: there’s too much money in software. There have been numerous crises in the global economy since software engineering has existed, but really the only one with any material effect on the software sector was the dot-com crash. The lesson there was “have a business plan”: plenty of companies had raised billions in speculative funding on the basis that they were on the internet but once the first couple started to fold, the investors pulled out en masse and the money was sucked from the room. This is the time that gave us Agile (constantly demonstrate that you’re delivering value to your customer), Lean Startup (demonstrate that you’re adding value with as little expenditure as possible), and Lean Software Development (eliminate all of the waste in your process).

Nobody ever demonstrated that Agile, Lean or whatever were better in some objective metric: what they did was tell convincing stories. Would you like to find out that you’ve built the wrong thing two years from now, or two weeks from now? Would you prefer to read an interim draft functional specification, or use working software? Let’s be clear though: nobody ever showed that what we were doing before that was better in any objective way either. Software was written by defence contractors and electronics hardware companies, and they grandfathered in the processes used to procure and develop hardware. You can count the number of industry pundits advocating for a genuinely evidence-led approach to software cost control on two fingers (Barry Boehm and Watts Humphrey) and you can still raise valid questions about the validity of either of their systems.

Since then, software teams have become less fragile to economic shock. This was already happening in the 2007 credit crunch (the downturn at the beginning of the 2007-2008 global financial crisis). The CFO where I worked explained that bookings of their subscription-based software would go up during a recession. Why? Because people were not confident enough to buy outright or to enter relatively cheap, long-term arrangements like three year contracts. They would instead take the more expensive but more flexible shorter-term contracts so that they could cancel or move if their belts needed tightening. After the crisis, the adoption of subscription-based pricing models has only increased in software, and extended to adjacent fields like media and hardware.

All of this means that there is relative stability in software businesses, and there is still growing demand for software engineers. That has meant that there isn’t the need for systematic approaches to cost-reduction hawked by every single thinker in the “software crisis” era: note that there hasn’t been significant movement beyond Agile, Lean or whatever in the subsequent two decades. They’re good enough, and there is no impetus to find out what’s better. In fact both Agile with its short increments and Lean Startup with its pivoting are optimised for the “get out quickly at any cost” flexibility that also leads customers to choose short-term subscription pricing: when the customers for your VR pet grooming business dry up you can quickly pivot to online fraud detection.

With no need to find or implement better approaches there’s also no need to particularly require software engineers to have a systematic approach or a detailed understanding of the knowledge of their industry. Thus software engineering—particularly programming—remains a craft-based discipline where anyone with an interest can start out at the bottom, learn on the job through mentoring and self-study, and use a process of survivor bias to get along. Did anyone demonstrate in 2002 that there’s objective benefit to a single-page application? Did anyone demonstrate in 2008 that there’s objective benefit to a native mobile app? Did anyone demonstrate in 2016 that there’s objective benefit to a Dapp? Has anyone crunched the numbers to resolve whether DevOps or Site Reliability Engineering is the one true way to do operations? No, but it doesn’t matter: there’s more than enough money to put into these things. And indeed most of those choices listed above are immaterial to where the money comes from or goes, but would be the sorts of “technical debt” transitions that engineering teams struggle to pay for.

You might ask why I’m at all interested in taking a systematic approach to our work when I also think it’s not necessary. Even if it isn’t necessary for survival, it’s definitely professional, and justifiable. When the clients do come to reduce their expenditure, or even when they don’t but are deciding who to go with, the people who can demonstrate that they systematically maximise the output of their work will be the preferred choice.

Posted in software-engineering

When to “address” “technical debt”?

The phrase “technical debt” appears in scare quotes here because, as observed in The Unreasonable Ineffectiveness of Considering Things Harmful, technical debt has quite a specific meaning and I’m talking about something broader here. Quoting Ward Cunningham:

Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite. Objects make the cost of this transaction tolerable. The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt. Entire engineering organizations can be brought to a stand-still under the debt load of an unconsolidated implementation, object-oriented or otherwise.

Ward Cunningham, The WyCash Portfolio Management System

It’s not old code that’s technical debt, it’s lightly-designed code. That thing that seemed like the solution when you first thought of it. Yes, ship it, by all means, but be ready to very quickly rewrite it when you learn more.

Some of what people mean when they say “we need to bring our technical debt under control” is that kind of technical debt, struggling under the compound interest of if statement accrual as multiple developers have added behaviour without adding design. But there are other things. Cutting corners is not technical debt, it’s technical gambling. Updating external dependencies is not technical debt repayment, but it does still need to be done. Removing deprecated symbols is paying for somebody else’s technical debt, not yours: again you still have to do it. Replacing last month’s favoured npm modules with this month’s is not technical debt, it’s buying yourself a new toy.

But all of these things get done, and all of these things need to get done. It’s the cost of deploying a system into an evolving context (and as I’ve said before, even that act of deployment itself triggers evolution). So the question is when, how often, how much?

Some teams put their “engineering requirements”, their name for the evolution-coping tasks, onto the same backlog as the product requirements, then try to advocate for prioritising them alongside feature requests and bug fixes. Unfortunately this rarely works: the perceived benefit of the engineering activity is zero retained customers plus zero acquired customers = zero revenue, and yet it costs the same as fixing a handful of customer-reported bugs.

So, other groups just try to carve out time. Maybe it’s “20% of developer effort on the sprint is not for product tasks”. Maybe it’s “there is a week between delivering one iteration and starting the next”. Maybe it’s “whoever is on support rotation can pick up engineering tasks when there’s no fire to put out”. And the most anti- of all the patterns is the “hardening sprint”: once per quarter we’ll spend two weeks fixing the problems we’ve been making for ourselves in the intervening time. All of these have the benefit of giving a predictable cadence, though they still suffer a bit from that product envy problem: why are we paying for these engineers to do non-useful work when they could be steadily adding value?

The key point is that part about steadily adding value. We know the reason we need to do this: it’s to avoid being brought to Ward’s stand-still. We need to consolidate what we’ve learned, we need to evolve the system to adapt to evolutionary changes in its context, we need to fix past mistakes. And we need to do it constantly. Remember the quote: “Every minute spent on not-quite-right code counts as interest on that debt”.

Ultimately, these attempts to carve out time are requests to do our jobs properly, directed at people who don’t have the same motivations that we do. That’s not to say that their motivations are wrong. Like us, they only have a partial view of the overall picture. Unlike us, that view does not extend to an understanding of how expensive a liability our source code is.

When we ask for time between this iteration and the next to “service technical debt”, we are saying “I know that I’m doing a bad job, I know what I need to do to be doing a good job, and I would like to do a good job for four hours in a fortnight’s time on Friday afternoon, if that’s alright with you”. Ironically we do not end up doing a better job, we normalise doing a bad job for the next couple of weeks (and undoubtedly finding that some delivery/support/operations problem gets in the way for those four hours anyway).

I recommend that my mentees, reports, and any engineer who will listen avoid advocating for time-boxed good work. I propose building the trust relationship where the people who need the code written are happy that the code is being written, and being written well, without feeling the need to check over our shoulders to see how the sausage is made. Then we don’t need to justify doing a good job, and certainly don’t need to ask permission: we just do it while we’re going. When someone asks how long it’ll take to do something, the answer is how long it’ll take to do properly, with all the rewriting, testing, and everything else it takes to do it properly. What they get out of the end is something worth having, that doesn’t need hardening, or 20% of the effort dedicated to patching it up.

And of course, what they get is something that will almost immediately need a rewrite.

Posted in process

So that’s how it works

Back in Apple Silicon, Xeon Phi, and Amigas I asked how Apple would scale the memory up in a hypothetical Mac Pro based on the M1. We still don’t know because there still isn’t one, although now we sort of do know.

The M1 Ultra uses a dedicated interconnect allowing two (maybe more, but definitely two) M1 Max to act as a single SoC. So in an M1 Ultra-powered Mac Studio, there’ll be two M1 packages connected together, acting as if the memory is unified.

It remains to be seen whether the interconnect is fast enough that the memory appears unified, or whether we’ll start to need thread affinity APIs to say “this memory is on die 0, so please run this thread on one of the cores in die 0”. But, as predicted, they’ve gone for the simplest approach that could possibly work.

BTW here’s my unpopular opinion on the Mac Studio: it’s exactly the same as the 2013 Mac Pro (the cylinder one). Speeds, particularly for external peripherals on USB and Thunderbolt, are much faster, so people are ready to accept that their peripherals should all be outside the box. But really the power was all in using the word Studio instead of Pro, so that people don’t think this is the post-cheesegrater Mac.

Posted in AAPL, arm, Mac

Episode 51: Responding to Change

Sometimes it just seems like our customers are fickle flibbertigibbets who change their minds at the drop of a hat, right? Let’s look at what might be going on, and how to work with that.

Don’t forget that you can subscribe to the newsletter to keep up to date with everything Graham Lee and software engineering on the internet!


Having the right data

In the beginning there was the relational database, and it was…OK, I guess. It was based on the relational model, and allowed operations that were within the relational algebra.

I mean it actually didn’t. The usual standard for relational databases is ISO 9075, or SQL. It doesn’t really implement the relational model, but something very similar to it. Still, there is a standard way for dealing with relational data, using a standard syntax to construct queries and statements that are mathematically provable.

I mean there actually isn’t. None of the “SQL databases” you can get hold of actually implement the SQL standard accurately or in its entirety. But it’s close enough.

At some point people realised that you couldn’t wake up the morning of your TechCrunch demo and code up your seed-round-winning prototype before your company logo hit the big screen, because it involved designing your data model. So the schemaless database became popular. These let you iterate quickly by storing any data of any shape in the database. If you realise you’re missing a field, you add the field. If you realise you need the data to be in a different form, you change its form. No pesky schemata to migrate, no validation.

I mean actually there is. It’s just that the schema and the validation are the responsibility of the application code: if you add a field, you need to know what to do when you see records without the field (equivalent to the field being null in a relational database). If you realise the data need to be in a different form, you need to validate whether the data are in that form and migrate the old data. And because everyone needs to do that and the database doesn’t offer those facilities, you end up with lots of wasteful, repeated, buggy code that sort of does it.
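As a rough sketch of what that application-side work looks like, suppose a field was added after the first records were written and another field changed form. The field names here (ageRange and location) and the function name readCase are hypothetical, chosen only for illustration.

```javascript
// Normalise a stored document on read, because the database will not do it.
// Hypothetical history: `ageRange` was added later, and `location` changed
// from a plain string to an object with a `name` property.
function readCase(doc) {
  // Older documents simply lack the newer field: treat absence as null.
  const ageRange = doc.ageRange === undefined ? null : doc.ageRange;

  // Older documents hold the old form: migrate it to the new form on read.
  const location =
    typeof doc.location === 'string'
      ? { name: doc.location }
      : (doc.location ?? null);

  return { ...doc, ageRange, location };
}
```

Every reader of the collection needs logic like this, or needs to trust that a one-off migration has already rewritten every old record.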

So the pendulum swings back, and we look for ways to get all of that safety back in an automatic way. Enter JSON Schema. Here’s a sample of the schema (not the complete thing) for Covid-19 cases in Global.health:

{
  bsonType: 'object',
  additionalProperties: false,
  properties: {
    location: {
      bsonType: 'object',
      additionalProperties: false,
      properties: {
        country: { bsonType: 'string', maxLength: 2, minLength: 2 },
        administrativeAreaLevel1: { bsonType: 'string' },
        administrativeAreaLevel2: { bsonType: 'string' },
        administrativeAreaLevel3: { bsonType: 'string' },
        place: { bsonType: 'string' },
        name: { bsonType: 'string' },
        geoResolution: { bsonType: 'string' },
        query: { bsonType: 'string' },
        geometry: {
          bsonType: 'object',
          additionalProperties: false,
          required: [ 'latitude', 'longitude' ],
          properties: {
            latitude: { bsonType: 'number', minimum: -90, maximum: 90 },
            longitude: { bsonType: 'number', minimum: -180, maximum: 180 }
          }
        }
      }
    }
  }
}

This is just the bit that describes geographic locations, relevant to the falsehoods we believed about countries in an earlier post. This schema is stored as a validator in the database (you know, the database that’s easier to work with because it doesn’t have validators). But you can also validate objects in the application if you want. (Actually we currently have two shadow schemas: a Mongoose document description and an OpenAPI specification, in the application. It would be a good idea to normalise those: pull requests welcome!)
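To give a feel for application-side validation, here is a minimal sketch of a checker that understands only the keywords appearing in the sample above. It is not the Global.health code and not a full JSON Schema implementation; the names validate, typeChecks and geometrySchema are assumptions made for this example, and a real application would more likely reach for an existing validation library.

```javascript
// Checks a value against the handful of keywords used in the sample schema:
// bsonType (object, string, number), required, properties,
// additionalProperties, minLength/maxLength and minimum/maximum.
// Returns a list of error strings; an empty list means the value is valid.
const typeChecks = {
  object: (v) => typeof v === 'object' && v !== null && !Array.isArray(v),
  string: (v) => typeof v === 'string',
  number: (v) => typeof v === 'number' && !Number.isNaN(v),
};
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

function validate(schema, value, path = '$') {
  const check = typeChecks[schema.bsonType];
  if (check && !check(value)) {
    return [`${path}: expected ${schema.bsonType}`];
  }
  const errors = [];
  if (schema.bsonType === 'string') {
    if (schema.minLength !== undefined && value.length < schema.minLength) {
      errors.push(`${path}: shorter than ${schema.minLength}`);
    }
    if (schema.maxLength !== undefined && value.length > schema.maxLength) {
      errors.push(`${path}: longer than ${schema.maxLength}`);
    }
  }
  if (schema.bsonType === 'number') {
    if (schema.minimum !== undefined && value < schema.minimum) {
      errors.push(`${path}: less than ${schema.minimum}`);
    }
    if (schema.maximum !== undefined && value > schema.maximum) {
      errors.push(`${path}: greater than ${schema.maximum}`);
    }
  }
  if (schema.bsonType === 'object') {
    const props = schema.properties || {};
    for (const key of schema.required || []) {
      if (!has(value, key)) errors.push(`${path}.${key}: required`);
    }
    for (const [key, child] of Object.entries(value)) {
      if (has(props, key)) {
        errors.push(...validate(props[key], child, `${path}.${key}`));
      } else if (schema.additionalProperties === false) {
        errors.push(`${path}.${key}: not allowed`);
      }
    }
  }
  return errors;
}

// The geometry fragment of the schema shown above.
const geometrySchema = {
  bsonType: 'object',
  additionalProperties: false,
  required: ['latitude', 'longitude'],
  properties: {
    latitude: { bsonType: 'number', minimum: -90, maximum: 90 },
    longitude: { bsonType: 'number', minimum: -180, maximum: 180 },
  },
};
```

Calling validate(geometrySchema, { latitude: 51.5, longitude: -0.12 }) returns an empty list, while an object with an out-of-range latitude, a missing longitude, or an unexpected extra key returns one message per problem.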

Posted in software-engineering

Episode 50: Organisation and Community

I look at the historical basis of the white collar/blue collar divide in defining occupations, and the problems this distinction has with comprehending modern roles like engineering and various technician occupations. I then have difficulty fitting software roles into any of those categories. This is important because it helps to understand the various commitments practitioners make to different organisations: professional societies, their employers, trade unions, technology user groups, craft guilds and so on.

Finally, there’s a call for you: what have you seen of various modes of organisation in your experience in software? Please do let me know by commenting below or emailing grahamlee@acm.org.


Aphorism Considered Harmful

Recently Dan North asked the origin of the software design aphorism “make it work, make it right, make it fast”. Before delving into that story, it’s important to note that I had already heard this phrase. I don’t know where, it’s one of those things that’s absorbed into the psyche of some software engineers like “goto considered harmful”, “adding people to a late project makes it later” or “premature optimisation is the root of all evil”.

My understanding of the quote was something akin to Greg Detre’s description: we want to build software to do the right thing, then make sure it does it correctly, then optimise.

Make it work. First of all, get it to compile, get it to run, make sure it spits out roughly the right kind of output.

Make it right. It’s time to test it, and make sure it behaves 100% correctly.

Make it fast. Now you can worry about speeding it up (if you need to). […]

When you write it down like this, everyone agrees it’s obvious.

Greg Detre, “Make it work, make it right, make it fast”

That isn’t what everybody thinks though, as Greg points out. For example, Henrique Bastos laments that some teams never give themselves the opportunity to “make it fast”. He interprets making it right as being about design, not about correctness.

Just after that, you’d probably discovered what kind of libraries you will use, how the rest of your code base interacts with this new behavior, etc. That’s when you go for refactoring and Make it Right. Now you dry things out and organize your code properly to have a good design and be easily maintainable.

Henrique Bastos, “The make it work, make it right, make it fast misconception”

We already see the problem with these little pithy aphorisms: the truth that they convey is interpreted by the reader. Software engineering is all about turning needs into instructions precise enough that a computer can accurately and reliably perform them, and yet our knowledge is communicated in soundbites that we can’t even agree on at the human level.

It wasn’t hard to find the source of that quote. There was a special issue of Byte magazine on the C programming language in August 1983. In it, Stephen C. Johnson and Brian W. Kernighan describe modelling systems processing tasks in C.

But the strategy is definitely: first make it work, then make it right, and, finally, make it fast.

Johnson and Kernighan, “The C Language and Models for Systems Programming”

This sentence comes at the end of a section on efficiency, which follows a section on “Higher-Level Models” describing the design of programs that use C structures to operate on problem models, rather than on bits and words. The efficiency section tells us that higher-level models can make a program less efficient, but that C gives people the tools to get close to the metal to speed up the 5% of the code that’s performance critical. That’s where they lead into this idea that making it fast comes last.

Within context, the “right” that they want us to make appears to be the design/model type of “right”, not the correctness kind of right. This seems to make sense: if the thing is not correct, in what sense are you suggesting that you have already “made it work”?

A second source, contemporary with that Byte article, seems to seal the deal. Butler Lampson’s hints deal with ideas from various different systems, including Unix but also the Xerox PARC systems, Control Data Corporation mainframes, and others. He doesn’t use the phrase we’re looking for but his Figure 1 does have “Does it work?” as a functionality problem, from which follow “Get it right” and “Make it fast” as interface design concerns (with making it fast following on from getting it right). Indeed “Get it right” is a bullet point and cautionary tale at the end of the section on designing simple interfaces and abstractions. Only after that do we get to making it fast, which is contextualised:

Make it fast, rather than general or powerful. If it’s fast, the client can program the function it wants, and another client can program some other function. It is much better to have basic operations executed quickly than more powerful ones that are slower (of course, a fast, powerful operation is best, if you know how to get it). The trouble with slow, powerful operations is that the client who doesn’t want the power pays more for the basic function. Usually it turns out that the powerful operation is not the right one.

Butler W. Lampson, Hints for Computer System Design

So actually it looks like I had the wrong idea all this time: you don’t somehow make working software then correct software then fast software, you make working software and some inputs into that are the abstractions in the interfaces you design and the performance they permit in use. And this isn’t the only aphorism of software engineering that leads us down dark paths. I’ve also already gone into why the “premature optimisation” quote is used in misguided ways, in mature optimisation. Note that the context is that 97% of code doesn’t need optimisation: very similar to the 95% in Johnson and Kernighan!

What about some others? How about the ones that don’t say anything at all? It used to be common in Perl and Cocoa communities to say “simple things simple; complex things possible”. Now the Cocoa folks think that the best way to distinguish value types from identity types is the words struct and class (not, say, value and identity) so maybe it’s no longer a goal. Anyway, what’s simple to you may well be complex to me, and what’s complex to you may well be simplifiable but if you stop at making it possible, nobody will get that benefit.

Or the ones where meanings shifted over time? I did a podcast episode on “working software over comprehensive documentation”. It used to be that the comprehensive documentation meant the project collateral: focus on building the thing for your customer, not appeasing the project office with TPS reports. Now it seems to mean any documentation: we don’t need comments, the code works!

The value in aphorisms is similar to the value in a pattern language: you can quickly communicate ideas and intents. The cost of aphorisms is similar to the cost in a pattern language: if people don’t understand the same meaning behind the terms, then the idea spoken is not the same as the idea received. Many of the aphorisms in software are older than a lot of the people using them, so it’s best to assume that we don’t all share the same interpretation, and to share ideas more precisely.

(I try to do this for software engineers and for programmers who want to become software engineers, and I gather all of that work into my fortnightly newsletter. Please do consider signing up!)

Posted in whatevs