What is software engineering?

I suppose if I’m going to have a tagline like “from programming to software engineering”, we ought to have some kind of shared understanding of what that journey entails. It would be particularly useful to agree on the destination.

The question “what is software engineering?” doesn’t have a single answer. Plenty of people have the job title “software engineer”, or work in the “engineering” department of a software company, so we could say that it’s whatever they do. But that’s not very constructive. It’s too broad: anyone who calls themselves a software engineer could be doing anything, and it would become software engineering. It’s also too narrow: the people with the job title “software engineer” are often the programmers, and there’s more to software engineering than programming. I wrote a whole book, APPropriate Behaviour, about the things programmers need to know beyond the programming; that only scratches the surface of what goes into software engineering.

Sometimes the definition of software engineering is a negative-space definition, telling us what it isn’t so that something else can fill that gap. In Software Craftsmanship: The New Imperative, Pete McBreen argues that “the software engineering approach” to building software doesn’t work for many organisations, because it isn’t necessary. He comes close to telling us what that approach is when he says that it’s what Watts Humphrey is lamenting in Why Don’t They Practice What We Preach? But again, there are reasons not to like this definition. Firstly, it’s a No True Scotsman definition: software engineering is whatever is done by people who don’t build software properly, the way software craftsmen do. Secondly, it’s just not particularly fruitful: two decades after his book was published, most software organisations aren’t using the craftsmanship model. Why don’t they practice what he preaches?

I want to use a values-oriented definition of software engineering: software engineering is not what you do, it’s why you do what you do, and how you go about doing it. No particular practice is or isn’t software engineering, but the way you evaluate those practices, and decide whether or not to adopt them, can take an engineering perspective. Similarly, this isn’t methodology consultancy: the problems your team has with Agile aren’t because you aren’t Agiling hard enough and need to hire more Agile trainers. But the ways in which you reflect on and adapt your processes can be informed by engineering.

I like Shari Lawrence Pfleeger’s definition, in her book Software Engineering: The Production of Quality Software:

There may be many ways to perform a particular task on a particular system, but some are better than others. One way may be more efficient, more precise, easier to modify, easier to use, or easier to understand. Consequently, Software Engineering is about designing and developing high-quality software.

There’s a bit of shorthand, or some missing steps here, that we could fill in. We understand that of the many ways to build a software system, we can call some of them “better”. We declare some attributes that contribute to this “betterness”: efficiency, precision, ease of adaptation, ease of use, ease of comprehension. This suggests that we know what the properties of the software are, which ones are relevant, and what values a desirable system would have for those properties. We understand what would be seen as a high-quality product, and we choose to build the software to optimise for that view of quality.

The Software Engineering degree course I teach on offers a similar definition:

Software Engineering is the application of scientific and engineering principles to the development of software systems—principles of design, analysis, and management—with the aim of:

  • developing software that meets its requirements, even when these requirements change;
  • completing the development on time, and within budget;
  • producing something of lasting value—easy to maintain, re-use, and re-deploy.

So again we have this idea that there are desirable qualities (the requirements, the lasting value, ease of maintenance, re-use, and re-deployment; and also the project-level qualities of understanding and controlling the schedule and the cost), and the idea that we are going to take a principled approach to understanding how our work supports these properties.

Let me summarise: software engineering is understanding the desired qualities of the software we build, and taking a systematic approach to our work that maximises those qualities.

Posted in software-engineering

Diagnosing a Docker image build problem

My Python script for Global.health was not running in production, because it couldn’t find some imports. Now the real solution is to package up the imports with setuptools and install them at runtime (we manage the environments with poetry), but the quick solution is to fix up the path so that they get imported anyway. Or so I thought.

The deployment lifecycle of this script is that it gets packaged into a Docker image and published to Amazon Elastic Container Registry. An EventBridge event then queues a Batch job whose job definition uses that image. So to understand why the imports aren’t working, we need to understand the Docker image.

docker create --name broken_script sha256:blah gives me a container based on the image. Previously I would have started that image and launched an interactive shell to poke around, but this time I decided to try something else: docker export broken_script | tar tf - gives me the filesystem listing (and all I would’ve done with the running shell was various ls incantations, so that’s sufficient).
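The same listing can be scripted. Here is a minimal sketch, assuming the export has been saved to a file or is arriving on a pipe; the function name list_under and the default app/ prefix are my own illustrative choices, and it uses nothing beyond Python’s standard tarfile module.

```python
import tarfile


def list_under(archive, prefix="app/"):
    """Return the sorted file paths found under `prefix` in a tar stream.

    `archive` is a binary file object, for example an opened export file
    or sys.stdin.buffer when the export is piped in.
    """
    wanted = prefix.strip("/") + "/"
    names = []
    # Mode "r|" reads the archive as a forward-only stream, so it also
    # works when the input is a pipe and cannot seek.
    with tarfile.open(fileobj=archive, mode="r|") as tar:
        for member in tar:
            # Entries may or may not carry a leading "./" or "/".
            name = member.name.removeprefix("./").lstrip("/")
            if member.isfile() and name.startswith(wanted):
                names.append(name)
    return sorted(names)
```

Calling list_under(sys.stdin.buffer) from a small wrapper script at the end of the docker export pipe should give roughly the view that the ls incantations would have.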

Indeed my image had various library files alongside the main script:

/app/clean_old_ingestion_source_files.py
/app/EventBridgeClient.py
/app/S3Client.py
/app/__init__.py

Those supporting files should be in a subfolder. My copy command was wrong in the Dockerfile:

COPY clean_old_ingestion_source_files.py aws_access ./

This copies the content of aws_access into the current folder; I wanted to copy the folder itself (and, recursively, its content). Simple fix: break that line into two, putting the files in their correct destinations. Now rebuild the image, and verify that it is fixed. This time I didn’t export the whole filesystem from a container; I exported the layers from the image.

docker image save sha256:blah | tar xf -

This gives me a manifest.json showing each layer, and a tar file with the content of that layer. Using this I could just get the table of contents for the layer containing my Python files, and confirm that they are now organised correctly.
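That inspection can also be done without unpacking anything to disk. The sketch below assumes the output of docker image save has been written to a file, and that manifest.json is a JSON list whose entries each carry a “Layers” list of paths inside the archive; the function name layer_listings is hypothetical.

```python
import io
import json
import tarfile


def layer_listings(image_archive):
    """Map each layer named in manifest.json to the file names it holds.

    `image_archive` is a seekable binary file object holding the output
    of `docker image save`.
    """
    listings = {}
    with tarfile.open(fileobj=image_archive, mode="r") as image:
        manifest = json.load(image.extractfile("manifest.json"))
        # One manifest entry per saved image; "Layers" lists the layer
        # archives in the order they are applied.
        for entry in manifest:
            for layer_path in entry["Layers"]:
                layer_bytes = image.extractfile(layer_path).read()
                # "r:*" lets tarfile detect a compressed layer, if any.
                with tarfile.open(fileobj=io.BytesIO(layer_bytes), mode="r:*") as layer:
                    listings[layer_path] = layer.getnames()
    return listings
```

Each layer is read fully into memory here, which is fine for a small script image but would be a poor choice for very large layers.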

Posted in whatevs

Episode 49: REST and SOAP

I talk both about the difficulties of having objective conversations comparing technologies on the interwebs, and about a particular recent success in doing so: a comparison of RPC-over-HTTP methods.

This particular conversation was on the Brumtech slack: I particularly recommend local software groups as a more diverse source of information than the user groups for any particular technology. (And if you’re in England’s West Midlands, check out Brumtech!)

Thanks for listening! Please remember to sign up to the newsletter for more like this!

Halloween is Over

Back in 2016, I sent the following letter to Linux Voice, and it was published in issue 24 as the star letter. LV came to an end (and made all of their content available as Creative Commons) when they merged with Linux Magazine. The domain still exists, but the certificate expired years ago; you should search for it if you’re interested in back numbers for the magazine and willing to take the risk on their SSL.

I think my letter is still relevant, so I’m reproducing it. Here’s what I wrote:

LV issue 023 contained, as have prior numbers, many jabs at Microsoft as the natural enemy of the Free Software believer. It’s time to accept that the world has changed. Like many among your staff and readers, I remember that period when the infamous Halloween memos were leaked, and we realised joyfully that the Free Software movement was big enough to concern the biggest software company in the world.

I remember this not because it was recent, but because I am old: this happened in 1998. Large companies like Microsoft can be slow to change, so it is right that we remain sceptical of their intentions with Free and open source software, but we need to remember that if we define our movement as Anti-Microsoft, it will live or die by their fortunes alone.

While we jab at Azure for their plush Tux swag, Apple has become one of the largest companies on the planet. It has done this with its proprietary iPhone and iOS platforms, which lock in more first-party applications than 1990s Windows did when the antitrust cases started flying. You can download alternatives from its store (and its store alone), but the terms of business on that store prohibit copyleft software. The downloads obtained by Apple’s users are restricted by DRM to particular Apple accounts.

Meanwhile, Apple co-opts open source projects like Clang and LLVM to replace successful Free Software components like GCC. How does the availability of a cuddly Tux with Microsoft branding stack up to these actions in respect to the FSF’s four freedoms?

We celebrate Google for popularising the Linux kernel through its Android mobile OS, and companies like it, including Facebook and Twitter, for their contributions to open source software. However, these companies thrive by providing proprietary services from their own server farms. None has embraced the AGPL, a licence that extends freedom to remote users of a hosted service. Is it meaningful to have the freedom to use a browser or a mobile device for any purpose, if the available purposes involve using non-free services?

So yes, Microsoft is still important, and its proprietary Windows and Office products are still huge obstacles to the freedom of computer users everywhere. On the other hand, Microsoft is no longer the headline company defining the computing landscape for many people. If the Free Software movement is the “say no to Microsoft” movement, then we will not win. Rather we will become irrelevant at the same time as our nemesis in Redmond.

You may think that Steve Jobs is an unlikely role model for someone in my position, but I will end by paraphrasing his statement on his return to Apple. We need to get out of the mindset that for the Four Freedoms to win, Microsoft has to lose.

Graham Lee

Their deputy editor responded.

I had never stopped to consider this, but what you say makes 100% sense. In practice though, for most people Microsoft is still the embodiment of proprietary software. Apple is arguably a more serious threat, but Microsoft keeps shooting itself in the foot, so it’s an easier target for us. Apple at least makes a lot of good products along with its egregious attitudes towards compatibility, planned obsolescence and forced upgrades; Microsoft seems to be successful only by abusing its market position.

Andrew Gregory

Things have changed a bit since then: Apple have made minimal efforts to permit alternative apps in certain categories; Microsoft have embraced and extended more open source technologies; various SaaS companies have piled in on the “open source but only when it works in our favour” bandwagon; Facebook has renamed itself and is less likely to be praised now than it was in 2016.

But also things have stayed the same. As my friend and stream co-host Steven Baker put it, there’s a reason there isn’t an M in FAANG. Microsoft isn’t where the investors are interested any more, and they shouldn’t be where Free Software’s deciding battles are conducted.

If you like my writing on software engineering please subscribe to my fortnightly newsletter where I aggregate it from across the web, as well as sharing the things I’ve been reading about software engineering!

Posted in AAPL, FLOSS, msft

Design Patterns On Trial

Back in 1999, the OOPSLA conference held a show trial for the “Gang of Four”: Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Five years earlier, they had released their book “Design Patterns: Elements of Reusable Object-Oriented Software” at the very same conference, an act of subversion now seen as a crime against computer science. After hearing the case for the prosecution and for the defence, a simple minority of the conference delegates in attendance at the kangaroo court found the defendants guilty as charged.

The full indictment makes for some interesting reading, and it is clear that some real points are being made with tongues firmly in cheeks. “The Accused, by making it possible to design object-oriented programs in C++, have inhibited the rightful growth of competing object-oriented languages such as Smalltalk, CLOS, and Java.”

Perhaps this is a nod to Alan Kay’s OOPSLA 1997 outburst about not having C++ in mind when he invented the term Object-Oriented Programming. It’s certainly true that Smalltalk and CLOS are way less popular for Object-Oriented software implementation now than C++, but…Java? Really?

I could be asking “is Java really less popular than C++” here. But it’s also a good question to ask whether Java is really used to write Object-Oriented programs; the practice is so rare that Kevlin Henney recently had to do a talk reminding people that it’s possible.

The Accused, by distilling hard-won design expertise into patterns, have encouraged novices to act like experts.

In other words, they have advanced the field. By making it easier for today’s programmers to stand on the shoulders of those who came before, they have given them the skills previously afforded only by expertise. What a crime against computer science, making the field accessible!

Examples abound in history of people being tried for historical crimes when it’s important to make an example in current society. After the restoration of the monarchy in England and Scotland, Oliver Cromwell was exhumed and beheaded. Let’s rake the gang of four over the coals again. Here’s my addition to their indictment.

The Accused have, through their success in claiming the “Design Patterns” phrase as their own, maliciously tricked decades of programmers into believing that the whole idea behind Design Patterns is the catalogue of 23 patterns in their own book.

The point of design patterns is to have a pattern language, a shared glossary of solutions to common problems. When someone cites a pattern such as “microservice architecture” or “monorepo”, others know what they are doing, what problem they are solving, and the shape and trade-offs of their chosen solution.

But because people learn about “Design Patterns” from the GoF book (or, more likely, from Head First Design Patterns), they come to learn that Design Patterns are about Prototype, Flyweight, Iterator, and Template Method. They do not need these because their languages support them, therefore Design Patterns are old hat that nobody needs any more.
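Iterator is a handy illustration of a pattern that a language has absorbed. As a minimal sketch in Python (the Playlist class and its contents are hypothetical), the language’s iteration protocol and generators stand in for the hand-written iterator class the book describes:

```python
class Playlist:
    """A hypothetical collection that exposes its items for iteration."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __iter__(self):
        # A generator function supplies the iterator object for us, so
        # no separate class with its own traversal state is needed.
        for track in self._tracks:
            yield track


playlist = Playlist(["Intro", "Verse", "Outro"])

# Client code traverses the collection without knowing how it is stored.
titles = [title.upper() for title in playlist]
```

The pattern is still there; it has just moved from the programmer’s own code into the for statement.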

In fact they do not need those patterns because we stand on the shoulders of giants, and learned that these patterns were ubiquitously helpful in 1994. People are still publishing new design patterns (using the “particularly awkward and odious pattern format” of the Gang of Four), but these tend to be relegated to some dusty corner of academia. Worse, they tend to be written because nobody has written them yet (the “traditional standards of academic originality”), rather than because they represent “hard-won design expertise”.

In summary: design patterns are useful, because they let us quickly communicate a familiar solution with known properties when faced with a similar problem. The design patterns catalogued in the book “Design Patterns” are useful, and because they have been useful since 1994 many of them have been subsumed into our tools at very basic levels. That book is a catalogue of useful design patterns from 1994; it is not all there is to know about design patterns.

Design patterns are dead. Long live design patterns!

Posted in whatevs

Episode 48: The Personal Software Process

This episode is about the Software Engineering Institute’s Personal Software Process (PSP), a particular disciplined way of improving a software engineer’s work. We talk about the process in particular, and the idea of a continuous improvement process more generally.

I also introduce the SICPers newsletter and mailing list where I collect audio, video, and text from my journey to become a better software engineer, and to share what I learn with you. Please consider signing up: the first issue lands tomorrow (at time of writing)!

Falsehoods These Programmers Believed About Countries

Well this was a hard-fought issue. Setting the scene: since April 2020 I’ve been working on Global.health: a Data Science Initiative, where we collate information about Covid-19 cases worldwide and make it available in a standard schema for analysis. In the early days the data were collected by hand, as had been done for prior outbreaks, but the pandemic quickly grew beyond a scale where that was manageable, so instead we looked to automatically import data from trustworthy published sources like ministries of health.

Cases typically have a location associated with them (often the centroid of their local health service district, or the administrative region the sufferer is registered in; never something uniquely identifiable like a home address). Now already being able to work with location data throws us some usability curveballs. Did you search for London as in anywhere with the name “London” in (such as London Oxford Airport, just outside Oxford and 62 miles from the City of London, a city within the place people generally know as London in England), or did you have a specific location in mind like London, Ontario, Canada or London Street, Los Angeles, California, USA?

But we had it worse than that. Our data format had the name of the country in, which led to all sorts of problems. Had the curator entered that London St case as being in the US, or the USA, or America, or the United States of America, or the United States? Sometimes even cases that had location data filled by a geolocation service had weird glitches, like a number of cases from Algeria being associated with the non-country Algiers. And it made it easier for us devs to make unforced errors, like not generating cached per-country data when the country has a space in its name.
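To make the curation problem concrete, here is a minimal sketch of folding those spellings together. The alias table, the function names, and the cache file naming are all hypothetical, made up for illustration rather than taken from the Global.health code.

```python
# Hypothetical alias table: spellings a curator might type, folded to
# lower case, mapped to a two-letter country code.
ALIASES = {
    "us": "US",
    "usa": "US",
    "america": "US",
    "united states": "US",
    "united states of america": "US",
}


def normalise_country(raw):
    """Return the two-letter code for a free-text country name, or None."""
    # Drop full stops, collapse runs of whitespace, and ignore case, so
    # that "U.S.A." and " united   states " both find their alias.
    key = " ".join(raw.replace(".", "").split()).casefold()
    return ALIASES.get(key)


def cache_key(code):
    """Build a per-country cache file name that cannot contain a space."""
    return f"country-{code.lower()}.json"
```

Anything the table doesn’t recognise comes back as None, which is the cue to ask a human rather than guess.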

For all of these reasons, I ended up with the task of changing our schema so that countries are stored as ISO 3166-1 two-letter (alpha-2) codes. Along the way I spotted all of the above difficulties, and more, some of which even manifested in the libraries that map from country codes to names, and vice versa. Note I’m using “country code” fairly loosely; some places are far enough from where everybody thinks of as “the country” that they have a separate ISO code (it wouldn’t help anyone to record a case as being in “the UK” when you mean “the Falkland Islands”, of which more below).

  • Countries have changed names recently. Swaziland became Eswatini (ISO code SZ) in 2018. The former Yugoslav Republic of Macedonia (officially, as far as they were concerned, Macedonia; but often named FYROM as Greece wouldn’t allow them to accede to the EU under the name Macedonia) became the Republic of North Macedonia (ISO code MK) in 2019. Both of these appeared in our geocoding provider under their original names, even though we didn’t start gathering data until early 2020.
  • People don’t think of a country by its official name. China is China to many, not the People’s Republic of China (ISO code CN). Do we show the official name or the common name?
  • People don’t think of a region as part of its sovereign country. The Hong Kong Special Administrative Region of the People’s Republic of China has ISO code HK, but is…well, it’s a special administrative region of the People’s Republic of China (CN). When you show people countries on a map like they asked for, and they ask “where is Hong Kong”, the answer is “it isn’t there because it isn’t a country”.
  • A country’s ISO code is not the same as its top-level domain. Not a correctness problem for us, but one that might impact usability, when people look for the United Kingdom of Great Britain and Northern Ireland under “UK” when they’ll find it under “GB”. There is a .gb TLD, but it doesn’t accept new registrations.
  • The extent of a region can change when its name changes. We have cases geocoded to the Netherlands Antilles (doesn’t have an ISO code, technically, see next point, but used to be AN); there’s extra work involved to decide whether these should be associated with Aruba (AW), Curaçao (CW), Sint Maarten (SX) or the Caribbean Netherlands (BQ).
  • As mentioned above, ISO “retires” codes. There isn’t a code AN any more, because when the Netherlands Antilles stopped existing they decided it isn’t needed. This causes a problem in that a library that has “a complete” database of ISO country codes doesn’t necessarily have the retired ones. The historical cases are in ISO 3166-3 but occupy a different namespace than the ISO 3166-1 codes so it’s not like AN still means “the region that used to be Netherlands Antilles”: its ISO 3166-3 code is ANHH. Similarly, the German Democratic Republic used to have the ISO 3166-1 code DD but now has the ISO 3166-3 code DDDE.
  • A region may have two official names. The Falkland Islands (FK) are a British Overseas Territory; they are also constitutionally part of Argentina under the name “Islas Malvinas”. Politics aside, the reason this is an immediate problem is that some services feel the need to helpfully list both but not in a standard way; you need to be able to cope with “Falkland Islands [Malvinas]”, “Falkland Islands (Islas Malvinas)” and more.
  • A region might have no official existence. Kosovo (XK) has a code in common use (strictly, XK is a user-assigned code rather than an official ISO 3166-1 entry) but is a disputed region, recognised by about half of the UN as a sovereign country and claimed as an autonomous region by Serbia.
  • A country’s name might be the same as another country’s name. The Democratic Republic of the Congo (CD) is sometimes known as “the Congo”, and the neighbouring Republic of the Congo (CG) is also sometimes known as “the Congo”. Frustratingly, the i18n-iso-countries package lists “The Congo” as a name for both countries, so name-to-code mapping is unreliable.
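A few of these falsehoods can be defended against in the lookup code itself. The following is a sketch under stated assumptions: the tables are tiny hand-written examples, not any real library’s data, and the names code_for_name, strip_alternate_name, is_retired, and AmbiguousCountry are hypothetical. It refuses to guess when a name maps to more than one code, tolerates a bracketed alternate name, and knows about retired codes.

```python
import re

# Illustrative data only; a real table would come from a maintained source.
NAMES_TO_CODES = {
    "democratic republic of the congo": {"CD"},
    "republic of the congo": {"CG"},
    "the congo": {"CD", "CG"},  # two countries share this common name
    "falkland islands": {"FK"},
    "eswatini": {"SZ"},
    "swaziland": {"SZ"},  # former name, same code
}

# Retired ISO 3166-1 code -> ISO 3166-3 code for the former region.
RETIRED = {"AN": "ANHH", "DD": "DDDE"}


class AmbiguousCountry(ValueError):
    """Raised when a name matches more than one code."""


def strip_alternate_name(name):
    """Drop a trailing bracketed alternate, e.g. 'X [Y]' or 'X (Y)'."""
    return re.sub(r"\s*[\[(][^\])]*[\])]\s*$", "", name).strip()


def code_for_name(name):
    """Return the single code for `name`, or None if it is unknown."""
    key = strip_alternate_name(name).casefold()
    codes = NAMES_TO_CODES.get(key, set())
    if len(codes) > 1:
        raise AmbiguousCountry(f"{name!r} could be any of {sorted(codes)}")
    return next(iter(codes), None)


def is_retired(code):
    """True if `code` appears in the table of retired codes."""
    return code.upper() in RETIRED
```

Stripping the bracket is lossy: a name where the bracket carries the distinguishing part, such as “Congo (Democratic Republic of the)”, would need its own handling.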

You may be able to find more; feel welcome to comment (or maybe file a bug in Global.health if it’s causing trouble over there). Also, I’ve just created the SICPers mailing list as a one-stop shop for all my reading, writing and talking about software engineering; please consider subscribing!

Posted in design, FLOSS, UI

Introducing the SICPers Newsletter

I write a lot about software engineering. I talk a lot about software engineering. And I read a lot about software engineering.

And that stuff is scattered all over the interwebs. Well, some of it isn’t even there, it’s in notebooks. Which is a shame, because I think there are interesting conversations to be had if we talk about it some more.

Introducing the SICPers newsletter. A regular (currently fortnightly, first issue Feb 18th) collection of things I’ve said and heard about the world of software. This is supposed to be a conversation starter, not a monologue, so please do sign up, please do reply to the emails, and please do suggest topics for inclusion.

Posted in writing

[objc retain]; continues apace

I just finished recording episode 35 of [objc retain]; the stream on Objective-C programming with Free Software that I co-host with Steven Baker. It is available on Twitch and you can subscribe there to get notified about new episodes.

It will also soon be available on the replay server where you can watch all historical episodes of the show.

If you enjoy the show (or this blog, or the podcast, or…) please consider supporting my work on Patreon, thank you!

Posted in FLOSS, freesoftware, gnustep, objc

Episode 47: comprehensive documentation

I talk about the historical context of the Agile manifesto, what “comprehensive documentation” meant then, and what documentation is still important now.

I also remind you that you can support this podcast by becoming a patron. I chose not to tell you about the brand of underpants I wear in return for cash dollars, but nonetheless do find the occasional cash dollar handy.
