So that’s how it works

Back in "Apple Silicon, Xeon Phi, and Amigas" I asked how Apple would scale up the memory in a hypothetical Mac Pro based on the M1. We still don't know for certain, because there still isn't one, but now we have a pretty good idea.

The M1 Ultra uses a dedicated interconnect that allows two (maybe more, but definitely two) M1 Max dies to act as a single SoC. So in an M1 Ultra-powered Mac Studio, there'll be two M1 Max dies connected together, acting as if their memory is unified.

It remains to be seen whether the interconnect is fast enough that the memory appears unified, or whether we’ll start to need thread affinity APIs to say “this memory is on die 0, so please run this thread on one of the cores in die 0”. But, as predicted, they’ve gone for the simplest approach that could possibly work.
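For illustration, here's what such an affinity API looks like on Linux today (macOS has no direct public equivalent, so this is a sketch of the concept, not an Apple API):

```python
import os

# A Linux sketch of the kind of thread-affinity control speculated about
# above: pin the current process to a single CPU so that its threads
# (and, on a NUMA-ish system, their memory) stay near one node.
allowed = os.sched_getaffinity(0)   # CPUs we may currently run on
target = min(allowed)               # pick one CPU to pin to
os.sched_setaffinity(0, {target})   # pin: scheduler now keeps us there
print(os.sched_getaffinity(0))      # restricted to the chosen CPU
```

If Apple exposed something similar, the hint would presumably be per-die rather than per-CPU, but the shape of the API would be the same: tell the scheduler where the memory lives, and it keeps the threads nearby.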

BTW here’s my unpopular opinion on the Mac Studio: it’s exactly the same idea as the 2013 Mac Pro (the cylinder one). Transfer speeds, particularly for external peripherals on USB and Thunderbolt, are much faster now, so people are ready to accept that their peripherals should all live outside the box. But really the power was all in using the word Studio instead of Pro, so that people don’t think of this as the post-cheesegrater Mac.

Posted in AAPL, arm, Mac | Leave a comment

Episode 51: Responding to Change

Sometimes it just seems like our customers are fickle flibbertigibbets who change their minds at the drop of a hat, right? Let’s look at what might be going on, and how to work with that.

Don’t forget that you can subscribe to the newsletter to keep up to date with everything Graham Lee and software engineering on the internet!


Having the right data

In the beginning there was the relational database, and it was…OK, I guess. It was based on the relational model, and allowed operations that were within the relational algebra.

I mean it actually didn’t. The usual standard for relational databases is ISO/IEC 9075, better known as SQL. It doesn’t really implement the relational model, but something very similar to it. Still, there is a standard way of dealing with relational data, using a standard syntax to construct queries and statements that are mathematically provable.

I mean there actually isn’t. None of the “SQL databases” you can get hold of actually implement the SQL standard accurately or in its entirety. But it’s close enough.

At some point people realised that you couldn’t wake up the morning of your TechCrunch demo and code up your seed-round-winning prototype before your company logo hit the big screen, because it involved designing your data model. So the schemaless database became popular. These let you iterate quickly by storing any data of any shape in the database. If you realise you’re missing a field, you add the field. If you realise you need the data to be in a different form, you change its form. No pesky schemata to migrate, no validation.

I mean actually there are. It’s just that the schema and the validation are the responsibility of the application code: if you add a field, you need to know what to do when you see records without the field (equivalent to the field being null in a relational database). If you realise the data need to be in a different form, you need to validate whether the data are in that form and migrate the old data. And because everyone needs to do that and the database doesn’t offer those facilities, you end up with lots of wasteful, repeated, buggy code that sort of does it.
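In a hypothetical Python sketch (the field names are invented for illustration), the application ends up doing by hand what a relational database would have done declaratively:

```python
def migrate_record(record):
    """Fill in fields added after this record was written.

    A schemaless store happily returns old records without the new
    field, so every reader must handle its absence -- the moral
    equivalent of the column being NULL in a relational database.
    """
    record.setdefault("confirmation_date", None)  # field added later
    return record

def validate_record(record):
    """The validation the database no longer does for us."""
    if not isinstance(record.get("country"), str):
        raise ValueError("country must be a string")
    if len(record["country"]) != 2:
        raise ValueError("country must be an ISO 3166-1 alpha-2 code")
    return record

old = {"country": "GB"}  # stored before confirmation_date existed
fixed = validate_record(migrate_record(old))
print(fixed)  # {'country': 'GB', 'confirmation_date': None}
```

Multiply this by every field and every reader in the codebase, and the appeal of putting the schema back in the database becomes clear.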

So the pendulum swings back, and we look for ways to get all of that safety back in an automatic way. Enter JSON schema. Here’s a sample of the schema (not the complete thing) for Covid-19 cases in Global.health:

{
  bsonType: 'object',
  additionalProperties: false,
  properties: {
    location: {
      bsonType: 'object',
      additionalProperties: false,
      properties: {
        country: { bsonType: 'string', maxLength: 2, minLength: 2 },
        administrativeAreaLevel1: { bsonType: 'string' },
        administrativeAreaLevel2: { bsonType: 'string' },
        administrativeAreaLevel3: { bsonType: 'string' },
        place: { bsonType: 'string' },
        name: { bsonType: 'string' },
        geoResolution: { bsonType: 'string' },
        query: { bsonType: 'string' },
        geometry: {
          bsonType: 'object',
          additionalProperties: false,
          required: [ 'latitude', 'longitude' ],
          properties: {
            latitude: { bsonType: 'number', minimum: -90, maximum: 90 },
            longitude: { bsonType: 'number', minimum: -180, maximum: 180 }
          }
        }
      }
    }
  }
}

This is just the bit that describes geographic locations, relevant to the falsehoods we believed about countries in an earlier post. This schema is stored as a validator in the database (you know, the database that’s easier to work with because it doesn’t have validators). But you can also validate objects in the application if you want. (Actually we currently have two shadow schemas: a Mongoose document description and an OpenAPI specification, in the application. It would be a good idea to normalise those: pull requests welcome!)
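For reference, attaching a schema like the one above as a validator is a one-off command in the mongo shell. This is a config fragment, abbreviated to the country field, and the collection name here is an assumption:

```javascript
// Attach (or replace) the validator on an existing collection.
// 'cases' is an assumed collection name for illustration.
db.runCommand({
  collMod: 'cases',
  validator: {
    $jsonSchema: {
      bsonType: 'object',
      properties: {
        location: {
          bsonType: 'object',
          properties: {
            country: { bsonType: 'string', minLength: 2, maxLength: 2 }
          }
        }
      }
    }
  },
  validationLevel: 'strict',   // validate all inserts and updates
  validationAction: 'error'    // reject documents that fail
})
```

With `validationAction: 'error'` the database rejects bad writes outright; switching it to `'warn'` logs the failure but accepts the document, which is handy mid-migration.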

Posted in software-engineering | Tagged | Leave a comment

Episode 50: Organisation and Community

I look at the historical basis of the white collar/blue collar divide in defining occupations, and the problems this distinction has with comprehending modern roles like engineering and various technician occupations. I then have difficulty fitting software roles into any of those categories. This is important because it helps to understand the various commitments practitioners make to different organisations: professional societies, their employers, trade unions, technology user groups, craft guilds and so on.

Finally, there’s a call for you: what have you seen of various modes of organisation in your experience in software? Please do let me know by commenting below or emailing grahamlee@acm.org.


Aphorism Considered Harmful

Recently Dan North asked about the origin of the software design aphorism “make it work, make it right, make it fast”. Before delving into that story, it’s important to note that I had already heard the phrase. I don’t know where; it’s one of those things that’s absorbed into the psyche of some software engineers, like “goto considered harmful”, “adding people to a late project makes it later”, or “premature optimisation is the root of all evil”.

My understanding of the quote was something akin to Greg Detre’s description: we want to build software to do the right thing, then make sure it does it correctly, then optimise.

Make it work. First of all, get it to compile, get it to run, make sure it spits out roughly the right kind of output.

Make it right. It’s time to test it, and make sure it behaves 100% correctly.

Make it fast. Now you can worry about speeding it up (if you need to). […]

When you write it down like this, everyone agrees it’s obvious.

Greg Detre, “Make it work, make it right, make it fast”

That isn’t what everybody thinks though, as Greg points out. For example, Henrique Bastos laments that some teams never give themselves the opportunity to “make it fast”. He interprets making it right as being about design, not about correctness.

Just after that, you’d probably discovered what kind of libraries you will use, how the rest of your code base interacts with this new behavior, etc. That’s when you go for refactoring and Make it Right. Now you dry things out and organize your code properly to have a good design and be easily maintainable.

Henrique Bastos, “The make it work, make it right, make it fast misconception”

We already see the problem with these little pithy aphorisms: the truth that they convey is interpreted by the reader. Software engineering is all about turning needs into instructions precise enough that a computer can accurately and reliably perform them, and yet our knowledge is communicated in soundbites that we can’t even agree on at the human level.

It wasn’t hard to find the source of that quote. There was a special issue of Byte magazine on the C programming language in August 1983. In it, Stephen C. Johnson and Brian W. Kernighan describe modelling systems processing tasks in C.

But the strategy is definitely: first make it work, then make it right, and, finally, make it fast.

Johnson and Kernighan, “The C Language and Models for Systems Programming”

This sentence comes at the end of a section on efficiency, which follows a section on “Higher-Level Models”, in which the design of programs that use C structures to operate on problem models, rather than on bits and words, is described. The efficiency section tells us that higher-level models can make a program less efficient, but that C gives people the tools to get close to the metal to speed up the 5% of the code that’s performance critical. That’s where they lead into this idea that making it fast comes last.

Within context, the “right” that they want us to make appears to be the design/model type of “right”, not the correctness kind of right. This seems to make sense: if the thing is not correct, in what sense are you suggesting that you have already “made it work”?

A second source, contemporary with that Byte article, seems to seal the deal. Butler Lampson’s hints deal with ideas from various different systems, including Unix but also the Xerox PARC systems, Control Data Corporation mainframes, and others. He doesn’t use the phrase we’re looking for but his Figure 1 does have “Does it work?” as a functionality problem, from which follow “Get it right” and “Make it fast” as interface design concerns (with making it fast following on from getting it right). Indeed “Get it right” is a bullet point and cautionary tale at the end of the section on designing simple interfaces and abstractions. Only after that do we get to making it fast, which is contextualised:

Make it fast, rather than general or powerful. If it’s fast, the client can program the function it wants, and another client can program some other function. It is much better to have basic operations executed quickly than more powerful ones that are slower (of course, a fast, powerful operation is best, if you know how to get it). The trouble with slow, powerful operations is that the client who doesn’t want the power pays more for the basic function. Usually it turns out that the powerful operation is not the right one.

Butler W. Lampson, Hints for Computer System Design

So actually it looks like I had the wrong idea all this time: you don’t somehow make working software, then correct software, then fast software. You make working software, and among the inputs to that are the abstractions in the interfaces you design and the performance they permit in use. And this isn’t the only aphorism of software engineering that leads us down dark paths. I’ve also already gone into why the “premature optimisation” quote is used in misguided ways, in mature optimisation. Note that the context is that 97% of code doesn’t need optimisation: very similar to the 95% in Johnson and Kernighan!

What about some others? How about the ones that don’t say anything at all? It used to be common in Perl and Cocoa communities to say “simple things simple; complex things possible”. Now the Cocoa folks think that the best way to distinguish value types from identity types is the words struct and class (not, say, value and identity) so maybe it’s no longer a goal. Anyway, what’s simple to you may well be complex to me, and what’s complex to you may well be simplifiable, but if you stop at making it possible, nobody will get that benefit.

Or the ones where meanings shifted over time? I did a podcast episode on “working software over comprehensive documentation”. It used to be that the comprehensive documentation meant the project collateral: focus on building the thing for your customer, not appeasing the project office with TPS reports. Now it seems to mean any documentation: we don’t need comments, the code works!

The value in aphorisms is similar to the value in a pattern language: you can quickly communicate ideas and intents. The cost of aphorisms is similar to the cost in a pattern language: if people don’t understand the same meaning behind the terms, then the idea spoken is not the same as the idea received. With many of software’s aphorisms being older than many of the people using them, it’s best to assume that we don’t all share the same interpretation, and to share our ideas more precisely.

(I try to do this for software engineers and for programmers who want to become software engineers, and I gather all of that work into my fortnightly newsletter. Please do consider signing up!)

Posted in whatevs | Tagged | 3 Comments

What is software engineering?

I suppose if I’m going to have a tagline like “from programming to software engineering”, we ought to have some kind of shared understanding of what that journey entails. It would be particularly useful to agree on the destination.

The question “what is software engineering?” doesn’t have a single answer. Plenty of people have the job title “software engineer”, or work in the “engineering” department of a software company, so we could say that it’s whatever they do. But that’s not very constructive. It’s too broad: anyone who calls themselves a software engineer could be doing anything, and it would become software engineering. It’s also too narrow: the people with the job title “software engineer” are often the programmers, and there’s more to software engineering than programming. I wrote a whole book, APPropriate Behaviour, about the things programmers need to know beyond the programming; that only scratches the surface of what goes into software engineering.

Sometimes the definition of software engineering is a negative-space definition, telling us what it isn’t so that something else can fill that gap. In Software Craftsmanship: The New Imperative, Pete McBreen described how “the software engineering approach” to building software is something that doesn’t work for many organisations, because it isn’t necessary. He comes close to telling us what that approach is when he says that it’s what Watts Humphrey is lamenting in Why Don’t They Practice What We Preach? But again, there are reasons not to like this definition. Firstly, it’s a No True Scotsman definition: software engineering becomes whatever it is that the people who don’t do it properly, the way software craftsmen do, are doing. Secondly, it’s just not particularly fruitful: two decades after his book was published, most software organisations aren’t using the craftsmanship model. Why don’t they practice what he preaches?

I want to use a values-oriented definition of software engineering: software engineering is not what you do, it’s why you do what you do, and how you go about doing it. No particular practice is or isn’t software engineering, but the way you evaluate those practices and whether or not to adopt them can adopt an engineering perspective. Similarly, this isn’t methodology consultancy: the problems your team has with Agile aren’t because you aren’t Agiling hard enough and need to hire more Agile trainers. But the ways in which you reflect on and adapt your processes can be informed by engineering.

I like Shari Lawrence Pfleeger’s definition, in her book Software Engineering: The Production of Quality Software:

There may be many ways to perform a particular task on a particular system, but some are better than others. One way may be more efficient, more precise, easier to modify, easier to use, or easier to understand. Consequently, Software Engineering is about designing and developing high-quality software.

There’s a bit of shorthand, or some missing steps here, that we could fill in. We understand that of the many ways to build a software system, we can call some of them “better”. We declare some attributes that contribute to this “betterness”: efficiency, precision, ease of adaptation, ease of use, ease of comprehension. This suggests that we know what the properties of the software are, which ones are relevant, and what values a desirable system would have for those properties. We understand what would be seen as a high-quality product, and we choose to build the software to optimise for that view of quality.

The Software Engineering degree course I teach on offers a similar definition:

Software Engineering is the application of scientific and engineering principles to the development of software systems—principles of design, analysis, and management—with the aim of:

  • developing software that meets its requirements, even when these requirements change;
  • completing the development on time, and within budget;
  • producing something of lasting value—easy to maintain, re-use, and re-deploy.

So again we have this idea that there are desirable qualities (the requirements, the lasting value, ease of maintenance, re-use, and re-deployment; and also the project-level qualities of understanding and controlling the schedule and the cost), and the idea that we are going to take a principled approach to understanding how our work supports these properties.

Let me summarise: software engineering is understanding the desired qualities of the software we build, and taking a systematic approach to our work that maximises those qualities.

Posted in software-engineering | Tagged | Leave a comment

Diagnosing a Docker image build problem

My Python script for Global.health was not running in production, because it couldn’t find some imports. Now the real solution is to package up the imports with setuptools and install them at runtime (we manage the environments with poetry), but the quick solution is to fix up the path so that they get imported anyway. Or so I thought.

The deployment lifecycle of this script is that it gets packaged into a Docker image and published to Amazon Elastic Container Registry. An EventBridge event queues a Batch job whose job definition uses that image. So to understand why the imports aren’t working, we need to understand the Docker image.

docker create --name broken_script sha256:blah gives me a container based on the image. Previously I would have started that image and launched an interactive shell to poke around, but this time I decided to try something else: docker export broken_script | tar tf - gives me the filesystem listing (and all I would’ve done with the running shell was various ls incantations, so that’s sufficient).

Indeed my image had various library files alongside the main script:

/app/clean_old_ingestion_source_files.py
/app/EventBridgeClient.py
/app/S3Client.py
/app/__init__.py

Those supporting files should be in a subfolder. My copy command was wrong in the Dockerfile:

COPY clean_old_ingestion_source_files.py aws_access ./

This copies the contents of aws_access into the working folder; I wanted to copy the folder itself (and, recursively, its contents). Simple fix: break that line into two, putting the files in their correct destinations. Now rebuild the image, and verify that it is fixed. This time I didn’t export the whole filesystem from a container, I exported the layers from the image.
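One way to write the fix as two lines (a sketch; the destination paths follow the filesystem listing above):

```dockerfile
# Copy the entry-point script into the working folder...
COPY clean_old_ingestion_source_files.py ./
# ...and copy the aws_access folder itself, so its files land in a
# subfolder rather than being flattened into the working folder.
COPY aws_access ./aws_access/
```

The trailing `./aws_access/` is the key difference: COPY with a bare `./` destination merges a source folder's contents into the destination, while naming the subfolder explicitly preserves it.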

docker image save sha256:blah | tar xf -

This gives me a manifest.json showing each layer, and a tar file with the content of that layer. Using this I could get just the table of contents for the layer containing my Python files, and confirm that they are now organised correctly.

Posted in whatevs | Leave a comment

Episode 49: REST and SOAP

I talk both about the difficulties of having objective conversations comparing technologies on the interwebs, and about a particular recent success in doing so: a comparison of RPC-over-HTTP methods.

This particular conversation was on the Brumtech slack: I particularly recommend local software groups as a more diverse source of information than the user groups for any particular technology. (And if you’re in England’s West Midlands, check out Brumtech!)

Thanks for listening! Please remember to sign up to the newsletter for more like this!


Halloween is Over

Back in 2016, I sent the following letter to Linux Voice, and it was published in issue 24 as the star letter. LV came to an end (and made all of their content available as Creative Commons) when they merged with Linux Magazine. The domain still exists, but the certificate expired years ago; you should search for it if you’re interested in back numbers for the magazine and willing to take the risk on their SSL.

I think my letter is still relevant, so I’m reproducing it. Here’s what I wrote:

LV issue 023 contained, as have prior numbers, many jabs at Microsoft as the natural enemy of the Free Software believer. It’s time to accept that the world has changed.

Like many among your staff and readers, I remember that period when the infamous Halloween memos were leaked, and we realised joyfully that the Free Software movement was big enough to concern the biggest software company in the world.

I remember this not because it was recent, but because I am old: this happened in 1998. Large companies like Microsoft can be slow to change, so it is right that we remain sceptical of their intentions with Free and open source software, but we need to remember that if we define our movement as Anti-Microsoft, it will live or die by their fortunes alone.

While we jab at Azure for their plush Tux swag, Apple has become one of the largest companies on the planet. It has done this with its proprietary iPhone and iOS platforms, which lock in more first-party applications than 1990s Windows did when the antitrust cases started flying. You can download alternatives from its store (and its store alone), but the terms of business on that store prohibit copyleft software. The downloads obtained by Apple’s users are restricted by DRM to particular Apple accounts.

Meanwhile, Apple co-opts open source projects like Clang and LLVM to replace successful Free Software components like GCC. How does the availability of a cuddly Tux with Microsoft branding stack up to these actions in respect to the FSF’s four freedoms?

We celebrate Google for popularising the Linux kernel through its Android mobile OS, and companies like it, including Facebook and Twitter, for their contributions to open source software. However, these companies thrive by providing proprietary services from their own server farms. None has embraced the AGPL, a licence that extends freedom to remote users of a hosted service. Is it meaningful to have the freedom to use a browser or a mobile device for any purpose, if the available purposes involve using non-free services?

So yes, Microsoft is still important, and its proprietary Windows and Office products are still huge obstacles to the freedom of computer users everywhere. On the other hand, Microsoft is no longer the headline company defining the computing landscape for many people. If the Free Software movement is the “say no to Microsoft” movement, then we will not win. Rather we will become irrelevant at the same time as our nemesis in Redmond.

You may think that Steve Jobs is an unlikely role model for someone in my position, but I will end by paraphrasing his statement on his return to Apple. We need to get out of the mindset that for the Four Freedoms to win, Microsoft has to lose.

Graham Lee

Their deputy editor responded.

I had never stopped to consider this, but what you say makes 100% sense. In practice though, for most people Microsoft is still the embodiment of proprietary software. Apple is arguably a more serious threat, but Microsoft keeps shooting itself in the foot, so it’s an easier target for us. Apple at least makes a lot of good products along with its egregious attitudes towards compatibility, planned obsolescence and forced upgrades; Microsoft seems to be successful only by abusing its market position.

Andrew Gregory

Things have changed a bit since then: Apple have made minimal efforts to permit alternative apps in certain categories; Microsoft have embraced and extended more open source technologies; various SaaS companies have piled in on the “open source but only when it works in our favour” bandwagon; Facebook renamed and is less likely to be praised now than it was in 2016.

But also things have stayed the same. As my friend and stream co-host Steven Baker put it, there’s a reason there isn’t an M in FAANG. Microsoft isn’t where the investor interest is any more, and it shouldn’t be where Free Software’s deciding battles are fought.

If you like my writing on software engineering please subscribe to my fortnightly newsletter where I aggregate it from across the web, as well as sharing the things I’ve been reading about software engineering!

Posted in AAPL, FLOSS, msft | Tagged | Leave a comment

Design Patterns On Trial

Back in 1999, the OOPSLA conference held a show trial for the “Gang of Four”: Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Five years earlier, they had released their book “Design Patterns: Elements of Reusable Object-Oriented Software” at the very same conference, an act of subversion now seen as a crime against computer science. After hearing the case for the prosecution and for the defence, a simple minority of the conference delegates in attendance at the kangaroo court found the defendants guilty as charged.

The full indictment makes for some interesting reading, and it is clear that some real points are being made with tongues firmly in cheeks. “The Accused, by making it possible to design object-oriented programs in C++, have inhibited the rightful growth of competing object-oriented languages such as Smalltalk, CLOS, and Java.”

Perhaps this is a nod to Alan Kay’s OOPSLA 1997 outburst about not having C++ in mind when he invented the term Object-Oriented Programming. It’s certainly true that Smalltalk and CLOS are way less popular for Object-Oriented software implementation now than C++, but…Java? Really?

I could be asking “is Java really less popular than C++” here. But it’s also a good question to ask whether Java is really used to write Object-Oriented programs; the practice is so rare that Kevlin Henney recently had to do a talk reminding people that it’s possible.

The Accused, by distilling hard-won design expertise into patterns, have encouraged novices to act like experts.

In other words, they have advanced the field. By making it easier for today’s programmers to stand on the shoulders of those who came before, they have given them the skills previously afforded only by expertise. What a crime against computer science, making the field accessible!

Examples abound in history of people being tried for historical crimes when it’s important to make an example in current society. After the restoration of the monarchy in England and Scotland, Oliver Cromwell was exhumed and beheaded. Let’s rake the gang of four over the coals again. Here’s my addition to their indictment.

The Accused have, through their success in claiming the “Design Patterns” phrase as their own, maliciously tricked decades of programmers into believing that the whole idea behind Design Patterns is the catalogue of 23 patterns in their own book.

The point of design patterns is to have a patterns language, a shared glossary of solutions to common problems. When someone cites a pattern such as “microservice architecture” or “monorepo”, others know what they are doing, what problem they are solving, and the shape and trade-offs of their chosen solution.

But because people learn about “Design Patterns” from the GoF book (or, more likely, from Head First Design Patterns), they come to learn that Design Patterns are about Prototype, Flyweight, Iterator, and Template Method. They do not need these because their languages support them, therefore Design Patterns are old hat that nobody needs any more.

In fact they do not need those patterns because we stand on the shoulders of giants, and learned that these patterns were ubiquitously helpful in 1994. People are still publishing new design patterns (using the “particularly awkward and odious pattern format” of the Gang of Four), but these tend to be relegated to some dusty corner of academia. Worse, they tend to be written because nobody has written them yet (the “traditional standards of academic originality”), rather than because they represent “hard-won design expertise”.

In summary: design patterns are useful, because they let us quickly communicate a familiar solution with known properties when faced with a similar problem. The catalogue of design patterns in the book “Design Patterns” is useful, and because its patterns have been useful since 1994, many of them have been subsumed into our tools at a very basic level. That book is a catalogue of useful design patterns from 1994; it is not all there is to know about design patterns.

Design patterns are dead. Long live design patterns!

Posted in whatevs | Tagged | 1 Comment