Episode 48: The Personal Software Process

This episode is about the Software Engineering Institute’s Personal Software Process (PSP), a particular disciplined way of improving a software engineer’s work. We talk about the process in particular, and the idea of a continuous improvement process more generally.

I also introduce the SICPers newsletter and mailing list, where I collect audio, video, and text from my journey to become a better software engineer and share what I learn with you. Please consider signing up: the first issue lands tomorrow (at time of writing)!

1 Comment

Falsehoods These Programmers Believed About Countries

Well this was a hard-fought issue. Setting the scene: since April 2020 I’ve been working on Global.health: a Data Science Initiative, where we collate information about Covid-19 cases worldwide and make them available in a standard schema for analysis. In the early days the data were collected by hand, as had been done for prior outbreaks, but the pandemic quickly grew beyond a scale where that was manageable so instead we looked to automatically import data from trustworthy published sources like ministries of health.

Cases typically have a location associated with them (often the centroid of their local health service district, or the administrative region the sufferer is registered in; never something uniquely identifiable like a home address). Now already being able to work with location data throws us some usability curveballs. Did you search for London as in anywhere with the name “London” in (such as London Oxford Airport, just outside Oxford and 62 miles from the City of London, a city within the place people generally know as London in England), or did you have a specific location in mind like London, Ontario, Canada or London Street, Los Angeles, California, USA?

But we had it worse than that. Our data format had the name of the country in, which led to all sorts of problems. Had the curator entered that London St case as being in the US, or the USA, or America, or the United States of America, or the United States? Sometimes even cases that had location data filled by a geolocation service had weird glitches, like a number of cases from Algeria being associated with the non-country Algiers. And it made it easier for us devs to make unforced errors, like not generating cached per-country data when the country has a space in its name.
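
A minimal sketch of the kind of normalisation this implies, in Python. The alias table, function names and cache path are hypothetical illustrations rather than the Global.health code: free-text names are folded down to a single two-letter code, and cache keys are built from that code so a space in a country's name can never matter.

```python
from typing import Optional

# Hypothetical alias table: illustrative only, not taken from Global.health.
COUNTRY_ALIASES = {
    "us": "US",
    "usa": "US",
    "america": "US",
    "united states": "US",
    "united states of america": "US",
    "algeria": "DZ",
}


def normalise_country(raw: str) -> Optional[str]:
    """Map a curator-entered country name to a two-letter code, or None if unknown."""
    # Drop full stops, fold case and collapse runs of whitespace before the lookup.
    key = " ".join(raw.replace(".", "").casefold().split())
    return COUNTRY_ALIASES.get(key)


def cache_key(code: str) -> str:
    """Build a per-country cache key from the code, which never contains spaces."""
    return f"cases/country/{code.lower()}.json"
```

Returning None for an unrecognised value such as “Algiers” means a human gets to look at it, instead of a non-country quietly entering the data.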

For all of these reasons, I ended up with the task of changing our schema so that countries are stored as ISO-3166 two-letter codes. Along the way I spotted all of the above difficulties, and more, some of which even manifested in the libraries that map from country codes to names, and vice versa. Note I’m using “country code” fairly loosely; some places are far enough from where everybody thinks of as “the country” that they have a separate ISO code (it wouldn’t help anyone to record a case as being in “the UK” when you mean “the Falkland Islands”, of which more below).

  • Countries have changed names recently. Swaziland became Eswatini (ISO code SZ) in 2018. The former Yugoslav Republic of Macedonia (officially, as far as they were concerned, Macedonia; but often named FYROM as Greece wouldn’t allow them to accede to the EU under the name Macedonia) became the Republic of North Macedonia (ISO code MK) in 2019. Both of these appeared in our geocoding provider under their original names, even though we didn’t start gathering data until early 2020.
  • People don’t think of a country by its official name. China is China to many, not the People’s Republic of China (ISO code CN). Do we show the official name or the common name?
  • People don’t think of a region as part of its sovereign country. The Hong Kong Special Administrative Region of the People’s Republic of China has ISO code HK, but is…well, it’s a special administrative region of the People’s Republic of China (CN). When you show people countries on a map like they asked for, and they ask “where is Hong Kong”, the answer is “it isn’t there because it isn’t a country”.
  • A country’s ISO code is not the same as its top-level domain. Not a correctness problem for us, but one that might impact usability: people look for the United Kingdom of Great Britain and Northern Ireland under “UK”, but they’ll find it under “GB”. There is a .gb TLD, but it doesn’t accept new registrations.
  • The extent of a region can change when its name changes. We have cases geocoded to the Netherlands Antilles (doesn’t have an ISO code, technically, see next point, but used to be AN); there’s extra work involved to decide whether these should be associated with Aruba (AW), Curaçao (CW), Sint Maarten (SX) or the Caribbean Netherlands (BQ).
  • As mentioned above, ISO “retires” codes. There isn’t a code AN any more, because when the Netherlands Antilles stopped existing they decided it isn’t needed. This causes a problem in that a library that has “a complete” database of ISO country codes doesn’t necessarily have the retired ones. The historical cases are in ISO 3166-3 but occupy a different namespace than the ISO 3166-1 codes so it’s not like AN still means “the region that used to be Netherlands Antilles”: its ISO 3166-3 code is ANHH. Similarly, the German Democratic Republic used to have the ISO 3166-1 code DD but now has the ISO 3166-3 code DDDE.
  • A region may have two official names. The Falkland Islands (FK) are a British Overseas Territory; they are also constitutionally part of Argentina under the name “Islas Malvinas”. Politics aside, the reason this is an immediate problem is that some services feel the need to helpfully list both but not in a standard way; you need to be able to cope with “Falkland Islands [Malvinas]”, “Falkland Islands (Islas Malvinas)” and more.
  • A region might have no official existence. Kosovo (XK) is a disputed region, recognised by about half of the UN as a sovereign country and claimed as an autonomous region by Serbia. Its code XK is in wide use, but it is a user-assigned code rather than one officially allocated in ISO 3166-1.
  • A country’s name might be the same as another country’s name. The Democratic Republic of the Congo (CD) is sometimes known as “the Congo”, and the neighbouring Republic of the Congo (CG) is also sometimes known as “the Congo”. Frustratingly, the i18n-iso-countries package lists “The Congo” as a name for both countries, so name-to-code mapping is unreliable.
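
Several of the points above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Global.health implementation: the lookup tables are hypothetical and tiny, and the function names are made up for the example. It splits “Name [Other name]” style strings into each listed name, refuses to guess when a name maps to more than one code, and keeps the retired codes in their own table.

```python
import re
from typing import List, Optional

# Hypothetical lookup table for illustration; real data would be far larger.
NAME_TO_CODES = {
    "falkland islands": {"FK"},
    "islas malvinas": {"FK"},
    "malvinas": {"FK"},
    "the congo": {"CD", "CG"},  # ambiguous: used for both countries
    "democratic republic of the congo": {"CD"},
    "republic of the congo": {"CG"},
}

# Retired ISO 3166-1 codes and their ISO 3166-3 codes, as given in the list above.
RETIRED_CODES = {"AN": "ANHH", "DD": "DDDE"}

# Matches "Main name [Alt name]" and "Main name (Alt name)".
_BRACKETED = re.compile(r"^(?P<main>[^\[(]+)[\[(](?P<alt>[^\])]+)[\])]\s*$")


def candidate_names(raw: str) -> List[str]:
    """Split a possibly bracketed name into each listed name, normalised."""
    match = _BRACKETED.match(raw)
    parts = [match["main"], match["alt"]] if match else [raw]
    return [" ".join(part.casefold().split()) for part in parts]


def resolve_country(raw: str) -> Optional[str]:
    """Return a code only when the recognised names agree on exactly one."""
    found = [NAME_TO_CODES[n] for n in candidate_names(raw) if n in NAME_TO_CODES]
    if not found:
        return None
    codes = set.intersection(*found)
    return next(iter(codes)) if len(codes) == 1 else None


def retired_code(code: str) -> Optional[str]:
    """Look up the ISO 3166-3 code for a retired two-letter code, if known."""
    return RETIRED_CODES.get(code.upper())
```

The design choice here is to return None for “The Congo” rather than pick one of CD or CG, so the ambiguity is surfaced to a curator instead of being silently resolved the wrong way.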

You may be able to find more; feel welcome to comment (or maybe file a bug in Global.health if it’s causing trouble over there). Also, I’ve just created the SICPers mailing list as a one-stop shop for all my reading, writing and talking about software engineering, please consider subscribing!

Posted in design, FLOSS, UI | 1 Comment

Introducing the SICPers Newsletter

I write a lot about software engineering. I talk a lot about software engineering. And I read a lot about software engineering.

And that stuff is scattered all over the interwebs. Well, some of it isn’t even there, it’s in notebooks. Which is a shame, because I think there are interesting conversations to be had if we talk about it some more.

Introducing the SICPers newsletter. A regular (currently fortnightly, first issue Feb 18th) collection of things I’ve said and heard about the world of software. This is supposed to be a conversation starter, not a monologue, so please do sign up, please do reply to the emails, and please do suggest topics for inclusion.

Posted in writing | Leave a comment

[objc retain]; continues apace

I just finished recording episode 35 of [objc retain]; the stream on Objective-C programming with Free Software that I co-host with Steven Baker. It is available on Twitch and you can subscribe there to get notified about new episodes.

It will also soon be available on the replay server where you can watch all historical episodes of the show.

If you enjoy the show (or this blog, or the podcast, or…) please consider supporting my work on Patreon, thank you!

Posted in FLOSS, freesoftware, gnustep, objc | Leave a comment

Episode 47: comprehensive documentation

I talk about the historical context of the Agile manifesto, what “comprehensive documentation” meant then, and what documentation is still important now.

I also remind you that you can support this podcast by becoming a patron. I chose not to tell you about the brand of underpants I wear in return for cash dollars, but nonetheless do find the occasional cash dollar handy.

Leave a comment

On Apple’s swings and misses

There’s a trope in the Apple-using technologist world that when an Apple innovation doesn’t immediately succeed, they abandon it. It’s not entirely true; let’s see what actually happens.

The quote in the above-linked item that supports the claim: “Apple has a tendency to either hit home runs out of the box (iPod, iPhone, AirPods) or come out with a dud and just sweep it under the rug, like iMessage apps and stickers.” iMessage apps and stickers are new features in iMessage. These are incremental additions to an existing technology. Granted, neither of them have revolutionised the way that everybody uses iMessage, and neither of them have received much (or any) further (user-facing) development, but both are themselves attempts to improve an actual product that Apple actually has and has not swept under the rug.

We can make a similar argument about the TouchBar. The TouchBar is the touchscreen strip on some models of MacBook Pro laptop that replaces the function key row on the keyboard with an adaptive UI. It appeared, it…stayed around a bit, then it seems to have now disappeared. Perhaps importantly, it never got replicated on their other keyboards, like the one that comes with the iMac or the one you can buy separately. We could say that the TouchBar was a dud that got swept under the rug, or we could say that it was an incremental change to the MacBook Pro and that Apple have since tried other changes to this long-running product, like the M1 architecture.

There are two other categories of non-home-run developments to take into account. The first is the duds that do get incremental development. iTV/Apple TV was such a bad business for the first many years of its history that execs would refer to it as a hobby, right up until it made them a billion dollars and was no longer a hobby.

Mac OS X’s first release was a lightly sparkling OpenStep, incompatible with any Mac software (it came with a virtual machine to run actual MacOS) and incompatible with most Unix software too. It was sold as a server-only product, which given the long wait involved when doing something as simple as opening the text editor (a Java application) was a sensible move. Yet, here we are, 23 years later, and macOS/iOS/iPadOS/tvOS/watchOS/bridgeOS is the same technology, incrementally improved.

Then the next category is things that go away, get rethought, and are brought back. The thing we see might look like a dud but it’s actually an idea that Apple stick with. Again, two examples: remember dashboard widgets in Tiger? There was an overlay view that let you organise a screen of little JavaScript widgets to do world time, stocks, weather, and other things including those supplied by third-party developers. It was there, it looked the same for a bit (as long as you don’t mention the Dashcode tool introduced along the way), then it wasn’t there. But later, Control Center came along, and we got a new version of the same idea.

In between that fizzy NeXT version of Mac OS X Server and the first public release of Mac OS X 10.0 (which was also a dud, many users sticking with MacOS 9 and Apple even giving away 10.1 for free to ensure as many people as possible got the fixes), the Aqua interface was born. Significantly more “lickable” than its modern look, it was nonetheless recognisable to a Monterey user, with its familiar traffic-light window controls: red for close, yellow for minimise, green for zoom, and purple for…wait, purple? Yes, purple. This activated single-window mode, in which only the active window was shown and all others minimised to the dock. Switch window, and the previous one disappeared. This wasn’t in the public release, but now we have Mission Control and fullscreen mode, so did it truly go away?

Posted in AAPL, UI | 2 Comments

Licenses aren’t sufficient

Another recent issue in the world of “centralised open source dependency repositories were a bad idea” was initiated by the central contradiction of free software. People want both to give everything away without limitation on who uses it or how, and to have “Big Program” pay for the work to be done.

While the license is the only tool used by free software authors, there is no way that this is going to be resolved in the favour of the Robin Hood model. There’s nothing of value on offer to Big Program in the software. They want the right to use the software for their nefarious purposes, and for free they can get the right to use the software for any purpose. Why would they pay more?

They wouldn’t. And no amount of whataboutism is going to change that. Whatabout if nobody can afford to work on free software any more, and they lose access to updates? Doesn’t happen. The current set of incentives – part financial, mostly reputational, and part itch-scratching – actually, observably, causes an increasing amount of free software to be created over time.

That gap needs to be resolved in other ways. There are things that companies will pay for even when they have the freedom to use the software for any purpose, at no charge. They will pay for support, bug bounties, indemnification, training, documentation, consultancy, integration, operations…

If the free software community hadn’t completely withdrawn from the patents discussion, they might pay to license the patent whether or not they take the (free) copyright licence. But that has yet to happen.

Plenty of organisations understand this: Red Hat became a forty-odd-billion dollar company giving away the software for free and selling other things. Canonical, Cygnus, ActiveState, O’Reilly, Mozilla, Musescore, Nextcloud…all of them make software, none of them is a software company. All make money in the free software world, none is a free software company.

Please continue giving us all the freedom to use the software for any purpose. Also the other freedoms, to study, improve, and share the software. But remember that freedom is not for sale.

Posted in FLOSS | Leave a comment

Episode 46: popularity

This episode is all about the TIOBE Index of programming language popularity: when to use it, what its limitations are, why certain things are or aren’t popular, and why the hell isn’t Excel on the list.

Leave a comment

On the glorification of ignorance

When I wrote I have some small idea of what I’m doing, it was on the basis that DHH was engaging in some exaggeration. Surely software engineers, whose job depends on what they know and what they can learn, would not really revel in their lack of knowledge?

Then it happened. A technology forum I’m a member of had a discussion in which participants expressed that they did not understand the topic, that they did not intend to understand that topic, and they still wished to dunk on the people in a video about said topic.

The topic, by the way, is cryptocurrency. It happens that I don’t have a lot of time for cryptocurrency and I think most other blockchain applications are not particularly beneficial, but this comes after taking a course on blockchain, reading a textbook, talking to some startups about their products, generally engaging with the topic. I haven’t flipped the bozo bit, but I have decided that I do not currently see any use for that technology and see a lot of downside to its application. If you’d asked me before all of that study, and people did, I would have told you that I don’t know anything about the topic.

I feel a bit bad for, and about, that technology forum. It contains people I respect, and I’ve had valuable conversations there, so I don’t want to disengage completely. I would then be flipping bozo bits at scale, which is exactly the problem we have with many current attempts to converse. I also don’t want it to degenerate into a bubble for the one approved mindset, and I particularly don’t want the software engineering mindset to be one where making your mind up before learning about a topic, and valorising that decision to engage before learning, is the preferred form of contribution.

Suggestions welcome.

Posted in whatevs | Leave a comment

Episode 45: Information Security

This issue is all about the various reasons information security isn’t taken more seriously by developers.

Leave a comment