Skip to content

Preparing for Computing’s Big One-Oh-Oh

However you slice the pie, we’re between two and three decades away from the centenary celebration for applied computing (which is of course significantly after theoretical or hypothetical advances made by the likes of Lovelace, Turing and others). You might count the anniversary of Colossus in 2043, the ENIAC in 2046, or maybe something earlier (and arguably not actually applied) like the Z3 or ABC (both 2041). Whichever one you pick, it’s not far off.

That means that the time to start organising the handover from the first century’s programmers to the second is now, or perhaps a little earlier. You can see the period from the 1940s to around 1980 as a time of discovery, when people invented new ways of building and applying computers because they could, and because there were no old ways yet. The next three and a half decades—a period longer than my life—has been a period of rediscovery, in which a small number of practices have become entrenched and people occasionally find existing, but forgotten, tools and techniques to add to their arsenal, and incrementally advance the entrenched ones.

My suggestion is that the next few decades be a period of uncovery, in which we purposefully seek out those things that have been tried, and tell the stories of how they are:

  • successful because they work;
  • successful because they are well-marketed;
  • successful because they were already deployed before the problems were understood;
  • abandoned because they don’t work;
  • abandoned because they are hard;
  • abandoned because they are misunderstood;
  • abandoned because something else failed while we were trying them.

I imagine a multi-volume book✽, one that is to the art of computer programming as The Art Of Computer Programming is to the mechanics of executing algorithms on a machine. Such a book✽ would be mostly a guide, partly a history, with some, all or more of the following properties:

  • not tied to any platform, technology or other fleeting artefact, though with examples where appropriate (perhaps in a platform invented for the purpose, as MIX, Smalltalk, BBC BASIC and Oberon all were)
  • informed both by academic inquiry and practical experience
  • more accessible than the Software Engineering Body of Knowledge
  • as accepting of multiple dissenting views as Ward’s Wiki
  • at least as honest about our failures as The Mythical Man-Month
  • at least as proud of our successes as The Clean Coder
  • more popular than The Celestial Homecare Omnibus

As TAOCP is a survey of algorithms, so this book✽ would be a survey of techniques, practices and modes of thought. As this century’s programmer can go to TAOCP to compare algorithms and data structures for solving small-scale problems then use selected algorithms and data structures in their own work, so next century’s applier of computing could go to this book✽ to compare techniques and ways of reasoning about problems in computing then use selected techniques and reasons in their own work. Few people would read such a thing from cover to cover. But many would have it to hand, and would be able to get on with the work of invention without having to rewrite all of Doug Engelbart’s work before they could get to the new stuff.

It's dangerous to go alone! Take this.

✽: don’t get hung up on the idea that a book is a collection of quires of some pigmented flat organic matter bound into a codex, though.

Intuitive is the Enemy of Good

In the previous instalment, I discussed an interview in which Alan Kay maligned growth-restricted user interfaces. Here’s the quote again:

There is the desire of a consumer society to have no learning curves. This tends to result in very dumbed-down products that are easy to get started on, but are generally worthless and/or debilitating. We can contrast this with technologies that do have learning curves, but pay off well and allow users to become experts (for example, musical instruments, writing, bicycles, etc. and to a lesser extent automobiles).

This is nowhere more evident than in the world of the mobile app. Any one app comprises a very small number of very focussed, very easy to use features. This has a couple of different effects. One is that my phone as a whole is an incredibly broad, incredibly shallow experience. For example, one goal I want help with from technology is:

As an obese programmer, I want to understand how I can improve my lifestyle in order to live longer and be healthier.

Is there an app for that? No; I have six apps that kindof together provide an OK, but pretty disjointed experience that gets me some dissatisfying way toward my goal. I can tell three of these apps how much I run, but I have to remember that some subset can feed information to the others but the remainder cannot. I can tell a couple of them how much I ate, but if I do it in one of them then another won’t count it correctly. Putting enough software to fulfil my goal into one app presumably breaks the cardinal rule of making every feature available within two gestures of the app’s launch screen. Therefore every feature is instead hidden behind the externalised myriad gestures required to navigate my home screens and their folders to get to the disparate subsets of utility.

The second observable effect is that there is a lot of wasted potential in both the device, and the person operating that device. You have never met an expert iPhone user, for the simple reason that someone who’s been using an iPhone for six years is no more capable than someone who has spent a week with their new device diligently investigating. There is no continued novelty, there are no undiscovered experiences. There is no expertise. Welcome to the land of the perpetual beginner.

Thankfully, marketing provided us with a thought-terminating cliché, to help us in our discomfort with this situation. They gave us the cars and trucks analogy. Don’t worry that you can’t do everything you’d expect with this device. You shouldn’t expect to do absolutely everything with this device. Notice the sleight of brain?

Let us pause for a paragraph to notice that even if making the most simple, dumbed-down (wait, sorry, intuitive) experience were our goal, we use techniques that keep that from within our grasp. An A/B test will tell you whether this version is incrementally “better” than that version, but will not tell you whether the peak you are approaching is the tallest mountain in the range. Just as with evolution, valley crossing is hard without a monumental shake-up or an interminable period of neutral drift.

Desktop environments didn’t usually get this any better. The learning path for most WIMP interfaces can be listed thus:

  1. cannot use mouse.
  2. can use mouse, cannot remember command locations.
  3. can remember command locations.
  4. can remember keyboard shortcuts.
  5. ???
  6. programming.

A near-perfect example of this would be emacs. You start off with a straightforward modeless editor window, but you don’t know how to save, quit, load a file, or anything. So you find yourself some cheat-sheet, and pretty soon you know those things, and start to find other things like swapping buffers, opening multiple windows, and navigating around a buffer. Then you want to compose a couple of commands, and suddenly you need to learn LISP. Many people will cap out at level 4, or even somewhere between 3 and 4 (which is where I am with most IDEs unless I use them day-in, day-out for months).

The lost magic is in level 5. Tools that do a good job of enabling improvement without requiring that you adopt a skill you don’t identify with (i.e. programming, learning the innards of a computer) invite greater investment over time, rewarding you with greater results. Photoshop gets this right. Automator gets it right. AppleScript gets it wrong; that’s just programming (in fact it’s all the hard bits from Smalltalk with none of the easy or welcoming bits). Yahoo! Pipes gets it right but markets it wrong. Quartz Composer nearly gets it right. Excel is, well, a bit of a boundary case.

The really sneaky bit is that level 5 is programming, just with none of the trappings associated with the legacy way of programming that professionals do it. No code (usually), no expressing your complex graphical problem as text, no expectation that you understand git, no philosophical wrangling over whether squares are rectangles or not. It’s programming, but with a closer affinity with the problem domain than bashing out semicolons and braces. Level 5 is where we can enable people to get the most out of their computers, without making them think that they’re computering.

How much programming language is enough?

Many programmers have opinions on programming languages. Maybe, if I present an opinion on programming languages, I can pass off as a programmer.

An old debate in psychology and anthropology is that of nature vs nurture, the discussion over which characteristics of humans and their personalities are innate and which are learned or otherwise transferred.

We can imagine two extremists in this debate turning their attention to programming languages. On the one hand, you might imagine that if the ability to write a computer program is somehow innate, then there is a way of expressing programming concepts that is closely attuned to that innate representation. Find this expression, and everyone will be able to program as fast as they can think. Although there’ll still be arguments over bracket placement, and Dijkstra will still tell you it’s rubbish.

On the other hand, you might imagine that the mind is a blank slate, onto which can be writ any one (or more?) of diverse patterns. Then the way in which you will best express a computer program is dependent on all of your experiences and interactions, with the idea of a “best” way therefore being highly situated.

We will leave this debate behind. It seems that programming shares some brain with learning other languages, and when it comes to deciding whether language is innate or learned we’re still on shaky ground. It seems unlikely on ethical grounds that Nim Chimpsky will ever be joined by Charles Babboonage, anyway.

So, having decided that there’s still an open question, there must exist somewhere into which I can insert my uninvited opinion. I had recently been thinking that a lot of the ceremony and complexity surrounding much of modern programming has little to do with it being difficult to represent a problem to a computer, and everything to do with there being unnecessary baggage in the tools and languages themselves. That is to say that contrary to Fred Brooks’s opinion, we are overwhelmed with Incidental Complexity in our art. That the mark of expertise in programming is being able to put up with all the nonsense programming makes you do.

From this premise, it seems clear that less complex programming languages are desirable. I therefore look admirably at tools like Self, io and Scheme, which all strive for a minimum number of distinct parts.

However, Clemens Szyperski from Microsoft puts forward a different argument in this talk. He works on the most successful development environment. In the talk, Szyperski suggests that experienced programmers make use of, and seek out, more features in a programming language to express ideas concisely, using different features for different tasks. Beginners, on the other hand, benefit from simpler languages where there is less to impede progress. So, what now? Does the “less is more” principle only apply to novice programmers?

Maybe the experienced programmers Szyperski identified are not experts. There’s an idea that many programmers are expert beginners, that would seem to fit Szyperski’s model. The beginner is characterised by a microscopic, non-holistic view of their work. They are able to memorise and apply heuristic rules that help them to make progress.

The expert beginner is someone who has simply learned more rules. To the expert beginner, there is a greater number of heuristics to choose from. You can imagine that if each rule is associated with a different piece of programming language grammar, then it’d be easier to remember the (supposed) causality behind “this situation calls for that language feature”.

That leaves us with some interesting open questions. What would a programming tool suitable for experts (or the proficient) look like? Do we have any? Alan Kay is fond of saying that we’re stuck with novice-friendly user experiences, that don’t permit learning or acquiring expertise:

There is the desire of a consumer society to have no learning curves. This tends to result in very dumbed-down products that are easy to get started on, but are generally worthless and/or debilitating. We can contrast this with technologies that do have learning curves, but pay off well and allow users to become experts (for example, musical instruments, writing, bicycles, etc. and to a lesser extent automobiles).

Perhaps, while you could never argue that common programming languages don’t have learning curves, they are still “generally worthless and/or debilitating”. Perhaps it’s true that expertise at programming means expertise at jumping through the hoops presented by the programming language, not expertise at telling a computer how to solve a problem in the real world.

On too much and too little

In the following text, remember that words like me or I are to be construed in the broadest possible terms.

It’s easy to be comfortable with my current level of knowledge. Or perhaps it’s not the value, but the derivative of the value: the amount of investment I’m putting into learning a thing. Anyway, it’s easy to tell stories about why the way I’m doing it is the right, or at least a good, way to do it.

Take, for example, object-oriented design. We have words to describe insufficient object-oriented design. Spaghetti Code, or a Big Ball of Mud. Obviously these are things that I never succumb to, but other people do. So clearly (actually, not clearly at all, but that’s beside the point) there is some threshold level of design or analysis practice that represents an acceptable minimum. Whatever that value is, it’s less than the amount that I do.

Interestingly there are also words to describe the over-application of object-oriented design. Architecture Astronauts, for example, are clearly people who do too much architecture (in the same way that NASA astronauts got carried away with flying and overdid it, I suppose). It’s so cold up in space that you’ll catch a fever, resulting in Death by UML Fever. Clearly I am only ever responsible for tropospheric architecture, thus we conclude that there is some acceptable maximum threshold for analysis and design too.

The really convenient thing is that my current work lies between these two limits. In fact, I’m comfortable in saying that it always has.

But wait. I also know that I’m supposed to hate the code that I wrote six months ago, probably because I wasn’t doing enough of whatever it is that I’m doing enough of now. But I don’t remember thinking six months ago that I was below the threshold for doing acceptable amounts of the stuff that I’m supposed to be doing. Could it be, perhaps, that the goalposts have conveniently moved in that time?

Of course they have. What’s acceptable to me now may not be in the future, either because I’ve learned to do more of it or because I’ve learned that I was overdoing it. The trick is not so much in recognising that, but in recognising that others who are doing more or less than me are not wrong, they could in fact be me at a different point on my timeline but with the benefit that they exist now so I can share my experiences with them and work things out together. Or they could be someone with a completely different set of experiences, which is even more exciting as I’ll have more stories to swap.

When it comes to techniques and devices for writing software, I tend to prefer overdoing things and then finding out which bits I don’t really need after all, rather than under-application. That’s obviously a much larger cognitive and conceptual burden, but it stems from the fact that I don’t think we really have any clear ideas on what works and what doesn’t. Not much in making software is ever shown to be wrong, but plenty of it is shown to be out of fashion.

Let me conclude by telling my own story of object-oriented design. It took me ages to learn object-oriented thinking. I learned the technology alright, and could make tools that used the Objective-C language and Foundation and AppKit, but didn’t really work out how to split my stuff up into objects. Not just for a while, but for years. A little while after that Death by UML Fever article was written, my employer sent me to Sun to attend their Object-Oriented Analysis and Design Using UML course.

That course in itself was a huge turning point. But just as beneficial was the few months afterward in which I would architecturamalise all the things, and my then-manager wisely left me to it. The office furniture was all covered with whiteboard material, and there soon wasn’t a bookshelf or cupboard in my area of the office that wasn’t covered with sequence diagrams, package diagrams, class diagrams, or whatever other diagrams. I probably would’ve covered the external walls, too, if it wasn’t for Enterprise Architect. You probably have opinions(TM) of both of the words in that product’s name. In fact I also used OmniGraffle, and dia (my laptop at the time was an iBook G4 running some flavour of Linux).

That period of UMLphoria gave me the first few hundred hours of deliberate practice. It let me see things that had been useful, and that had either helped me understand the problem or communicate about it with my peers. It also let me see the things that hadn’t been useful, that I’d constructed but then had no further purpose for. It let me not only dial back, but work out which things to dial back on.

I can’t imagine being able to replace that experience with reading web articles and Stack Overflow questions. Sure, there are plenty of opinions on things like OOA/D and UML on the web. Some of those opinions are even by people who have tried it. But going through that volume of material and sifting the experience-led advice from the iconoclasm or marketing fluff, deciding which viewpoints were relevant to my position: that’s all really hard. Harder, perhaps, than diving in and working slowly for a few months while I over-practice a skill.

Some so-called expert

There’s a comedy sketch being frequently tweeted called The Expert. Now, all programmers will be aware that there is nothing funnier than interpreting a joke literally and telling everyone the many ways in which it’s wrong, and that there is no way to be seen as a more intelligent and empathetic person than to do this. So here we go: what are all the inexpert things this “expert” does?

Firstly, having been told how important the strategic initiative is, he makes no attempt to actually find out what it is, and how his task is connected to the objectives described. This means that he doesn’t know anything about the context of his work, which is just setting himself up for all sorts of trouble. It’s like a programmer going “yeah sure, I can add a second copy of that goto line” without checking whether they’re working on some sort of security-sensitive module.

He refuses to accept any form of creative solution to the problem, and his project manager is correct to try to tactfully defer his immediate refusal to do the work asked. Immediately saying “no, I can’t do that” is identical to saying “I have never done that, and I cannot imagine any novelty entering my life”. This is not symptomatic of expertise, but of narrow-mindedness.

A pause, and a gathering of resources, leads us to conclude that some of the tasks set are eminently achievable, making this alleged expert look like the comfort-zone-hogging risk-averse luddite that perhaps he is. Of course you can draw a red line with inks of other colours, for example. You simply rely on the relativistic Doppler effect, or on fluorescent properties of the materials. Of course you can draw seven lines all perpendicular, if your diagram can extend into seven dimensions. And that is of course assuming a Euclidean geometry for the diagram; an assumption that our “I know best” expert doesn’t even think to question. Alternatively, you can find out what the time-dependent evolution of the diagram is, as it may be that a total of seven lines that are each instantaneously perpendicular to the other lines present but that do not all simultaneously exist is a sufficient solution. Again, our unimaginative expert doesn’t think about that. In fact, he never really explores whether the perpendicularity requirement means mutually perpendicular, he just proceeds to mansplain to the client representative why he is right and she is wrong.

Assured of his expertise, he then injects sarcasm into his voice in a condescending fashion. “I’m sure your target audience doesn’t exist solely of those people.” Again, this is indicative of a lack of empathy and an unwillingness to consider other viewpoints than his own.

Although, having said that, he’s pretty quick to demur to authority, and on the few occasions that he does want to enquire about the requirements, does not pursue the matter if someone else interrupts.

This is an “expert” who is going to go away with an incomplete understanding of the problem, and will likely fail to give a satisfactory solution. Often such people will then seek to externalise any responsibility for the failure, complaining that the requirements weren’t clear or that the clients had unrealistic expectations. Maybe they weren’t and they did, but as an expert it’s his responsibility to understand those and apply his skills to solving the problem at hand, not to find ways to throw other people under the proverbial bus.

The manager in this video is clearly the sanest voice, and also manages to keep his frustration at his own mistake somewhat bottled. The extent of that mistake? He has contracted an “expert in a narrow field”, who “doesn’t see the overall picture”, and put him in a meeting with their client for which he was totally unprepared. So it’s a shame that the expert’s grandest commitment—to inflate a balloon of unknown quality and structure into the shape of a kitten—is made without the manager around to intermediate. He might have been able to intervene before the physical contact between the “expert” and the designer, which should be considered wholly inappropriate for a business meeting.

Maybe it was a mistake to put someone so junior in front of the client without some coaching. Hopefully, with appropriate mentoring and support, our “expert” can grow to be a mature, empathetic and positive contributor to his team.

The Software Leviathan

Thomas Hobbes viewed society as a meta-person, a gigantic creature whose parts were human and which was in the service of those humans. Left to their own devices, people would not work well together as their notion of individualism and search for personal gain leads directly to conflict: strong government is needed to instil a sense of cooperation and of social obligation. This idea of “government through social contract” is pervasive in Western political thought, being the basis as it is for the “government of the people, by the people, for the people” with which Abraham Lincoln hoped to lead post-civil war America.

Software systems themselves can also be thought of as Leviathans. From a purely technical sense, all of “professional” software construction is based on notions of composition, of software systems that are themselves made of software systems. So we have structured or procedural programming, with routines composed of subroutines. And functional programming, with functions composed of functions. And object-oriented programming, with objects composed of objects. So central are these ideas to expressions of thought in software that they are considered paradigmatic by many, representing fundamental world-views of the art/craft/science.

There’s a second formulation of software-as-Leviathan, which is closer to Hobbesian meaning. The technical aspect of our software systems is merely a substrate[*] through which a social system—that of the people interacting with the software, the people acting on the software, and the people interacting with the other people—is reified. So the descriptions Hobbes made of his Leviathan can be made of these socio-technical systems:

  • First the Matter thereof, and the Artificer; both which is Man[sic].
  • Secondly, How, and by what Covenants it is made; what are the Rights and just Power or Authority of a Soveraigne; and what it is that Preserveth and Dissolveth it.
  • Thirdly, what is a Christian Common-Wealth.
  • Lastly, what is the Kingdome of Darkness.

[*] I wonder what form of substance gives the best sense of the analogy. Scaffolding? Lubricant? Mortar? Framework?

OK, maybe not so much the third one, except that it is really an attempt to define the values and norms of a society, which in the context of Hobbes’s writing, meant a Christian society.

Of course, any attempt to describe such a system is going to be filtered by the preconceptions, ideas and values of the person creating the description. Which brings me onto today’s topic: the pun in the new domain of this blog. Evidently it’s a contraction of “Structure and Interpretation of Computer Programmers”, based on the Abelson and Sussman book title. That book is abbreviated to SICP, so it’s not too difficult to see how it might be adapted to SICPers.

We can also see it as being a Latin abbreviation: sic pers., meaning such a person. So there is both the Structure and Interpretation of Computer Programmers, and there is this person who is doing the interpreting, in the domain name.

Where am I going with this?

I recently asked how people would describe this Secure Mac Programming blog were they trying to tell someone else they should read it. Of all the answers, the one that most succinctly sums up the trouble with the old name is from Alan:

@secboffin Not Just Secure, Not Just Mac, Not Just Programming.

I’m probably in the midst of some existential crisis, having spent a couple of years thinking and writing about philosophy, ethics, and the social responsibility of my work and its context. It’s clear that I’m dealing with some conflict, and it doesn’t look like reconciliation is an option.

Often I write about ideas that are still knocking around my head, such that I never come to any conclusion. I’ve used multiple choice conclusions, conclusions that appear to be from a different argument, and have concluded that my entire argument may or may not be useful.

This is just something I need to work out: what do I think I do, what do other people think I do, what parts of that do I like and dislike, are there other things I would like, can I replace the disliked parts with the liked parts, and so on. I write it here as you may have related ideas, or you may be thinking about the same things yourself and benefit from knowing that other people are, too.

What I know includes a list of things that currently interest me:

With all that in mind, I’m happy to introduce the beginning of a slow rebranding of this blog. It is now called the Structure and Interpretation of Computer Programmers, and can be found at http://www.sicpers.info/ in addition to its previous home at http://blog.securemacprogramming.com.

I do not intend to remove the old domain or break existing feed subscriptions. Over time (basically, as I work out how to do it) I’ll migrate links, feed entries and so on to reference the new domain, and the age-old updated mission of the blog.

My use of Latin: a glossary

  • i.e.: I Explain
  • e.g.: Example Given
  • et al.: Extremely Tedious Author List
  • op. cit.: Other Page Cited It Too
  • ibid.: In Book I Described
  • etc.: Evermore To Continue
  • a.m.: Argh! Morning!
  • p.m.: Past Morning
  • ca.: Close Approximation
  • sic.: See Inexcusable Cock-up

Depending on the self-interest of strangers

The title is borrowed from an economics article by Art Carden, which is of no further relevance to this post. Interesting read though, yes?

I’m enjoying the discussion in the iOS Developer Community™ about dependency of app makers on third-party libraries.

My main sources for what I will (glibly, and with a lot of simplification) summarise as the “anti-dependency” argument are a talk by Marcus Zarra which he gave (a later version than I saw of) at NSConference, and a blog post by Justin Williams.

This is not perhaps an argument that is totally against use of libraries, but an argument in favour of caution and conservatism. As I understand it, the position(s) on this side can be summarised as an attempt to mitigate the following risks:

  • if I incorporate someone else’s library, then I’m outsourcing my understanding of the problem to them. If they don’t understand it in the same way that I do, then I might not end up with a desired solution.
  • if I incorporate someone else’s library, then I’m outsourcing my understanding of the code to them. If it turns out I need something different, I may not be able to make that happen.
  • incorporating someone else’s library may mean bringing in a load of code that doesn’t actually solve my problem, but that increases the cognitive load of understanding my product.

I can certainly empathise with the idea that bringing in code to solve a problem can be a liability. A large app I was involved in writing a while back used a few open source libraries, and all but one of them needed patching either to fix problems or to extend their capabilities for our novel setting. The one (known) bug that came close to ending up in production was due to the interaction between one of these libraries and the operating system.

But then there’s all of my code in my software that’s also a liability. The difference between my code and someone else’s code, to a very crude level of approximation that won’t stand up to intellectual rigour but is good enough for a first pass, is that my code cost me time to write. Other than that, it’s all liability. And let’s not accidentally give a free pass to platform-vendor libraries, which are written by the same squishy, error-prone human-meat that produces both the first- and third-party code.

The reason I find this discussion of interest is that at the beginning of OOP’s incursion into commercial programming, the key benefit of the technique was supposedly that we could all stop solving the same problems and borrow or buy other peoples’ solutions. Here are Brad Cox and Bill Hunt, from the August 1986 issue of Byte:

Encapsulation means that code suppliers
can build. test. and document solutions to difficult user
interface problems and store them in libraries as reusable
software components that depend only loosely on the applications that use them. Encapsulation lets consumers assemble generic components directly into their applica­tions. and inheritance lets them define new application­ specific components by inheriting most of the work from generic components in the library.

Building programs by reusing generic components will seem strange if you think of programming as the act of assembling the raw statements and expressions of a programming language. The integrated circuit seemed just as strange to designers who built circuits from discrete elec­tronic components. What is truly revolutionary about object-oriented programming is that it helps programmers reuse existing code. just as the silicon chip helps circuit builders reuse the work of chip designers. To emphasise this parallel we call reusable classes Software-ICs.

In the “software-IC” formulation of OOP, CocoaPods, Ruby Gems etc. are the end-game of the technology. We take the work of the giants who came before us and use it to stand on their shoulders. Our output is not merely applications, which are fleeting in utility, but new objects from which still more applications can be built.

I look forward to seeing this discussion play out and finding out whether it moves us collectively in a new direction. I offer zero or more of the following potential conclusions to this post:

  • Cox and contemporaries were wrong, and overplayed the potential for re-use to help sell OOP and their companies’ products.
  • The anti-library sentiment is wrong, and downplays the potential for re-use to help sell billable hours.
  • Libraries just have an image problem, and we can define some trustworthiness metric (based, perhaps, on documentation or automated test coverage) that raises the bar and increases confidence.
  • Libraries inherently work to stop us understanding low-level stuff that actually, sooner or later, we’ll need to know about whether we like it or not.
  • Everyone’s free to do what they want, and while the dinosaurs are reinventing their wheels, the mammals can outcompete them by moving the craft forward.
  • Everyone’s free to do what they want, and while the library-importers’ castles are sinking into the swamps due to their architectural deficiencies, the self-inventors can outcompete them by building the most appropriate structures for the tasks at hand.

Software, Science?

Is there any science in software making? Does it make sense to think of software making as scientific? Would it help if we could?

Hold on, just what is science anyway?

Good question. The medieval French philosopher-monk Buridan said that the source of all knowledge is experience, and Richard Feynman paraphrased this as “the test of all knowledge is experiment”.

If we accept that science involves some experimental test of knowledge, then some of our friends who think of themselves as scientists will find themselves excluded. Consider the astrophysicists and cosmologists. It’s all very well having a theory about how stars form, but if you’re not willing to swirl a big cloud of hydrogen, helium, lithium, carbon etc. about and watch what happens to it for a few billion years then you’re not doing the experiment. If you’re not doing the experiment then you’re not a scientist.

Our friends in what are called the biological and medical sciences also have some difficulty now. A lot of what they do is tested by experiment, but some of the experiments are not permitted on ethical grounds. If you’re not allowed to do the experiment, maybe you’re not a real scientist.

Another formulation (OK, I got this from the wikipedia entry on Science) sees science as a sort of systematic storytelling: the making of “testable explanations and predictions about the universe”.

Under this definition, there’s no problem with calling astronomy a science: you think this is how things work, then you sit, and watch, and see whether that happens.

Of course a lot of other work fits into the category now, too. There’s no problem with seeing the “social sciences” as branches of science: if you can explain how people work, and make predictions that can (in principle, even if not in practice) be tested, then you’re doing science. Psychology, sociology, economics: these are all sciences now.

Speaking of the social sciences, we must remember that science itself is a social activity, and that the way it’s performed is really defined as the explicit and implicit rules and boundaries created by all the people who are doing it. As an example, consider falsificationism. This is the idea that a good scientific hypothesis is one that can be rejected, rather than confirmed, by an appropriately-designed experiment.

Sounds pretty good, right? Here’s the interesting thing: it’s also pretty new. It was mostly popularised by Karl Popper in the 20th Century. So if falsificationism is the hallmark of good science, then Einstein didn’t do good science, nor did Marie Curie, nor Galileo, or a whole load of other people who didn’t share the philosophy. Just like Dante’s Virgil was not permitted into heaven because he’d been born before Christ and therefore could not be a Christian, so all of the good souls of classical science are not permitted to be scientists because they did not benefit from Popper’s good message.

So what is science today is not necessarily science tomorrow, and there’s a sort of self-perpetuation of a culture of science that defines what it is. And of course that culture can be critiqued. Why is peer review appropriate? Why do the benefits outweigh the politics, the gazumping, the gender bias? Why should it be that if falsification is important, negative results are less likely to be published?

Let’s talk about Physics

Around a decade ago I was studying particle physics pretty hard. Now there are plenty of interesting aspects to particle physics. Firstly that it’s a statistics-heavy discipline, and that results in statistics are defined by how happy you are with them, not by some binary right/wrong criterion.

It turns out that particle physicists are a pretty conservative bunch. They’ll only accept a particle as “discovered” if the signal indicating its existence is measured as a five-sigma confidence: i.e. if there’s under a one-on-a-million chance that the signal arose randomly in the absence of the particle’s existence. Why five sigma? Why not three (a 99.7% confidence) or six (to keep Motorola happy)? Why not repeat it three times and call it good, like we did in middle school science classes?

Also, it’s quite a specialised discipline, with a clear split between theory and practice and other divisions inside those fields. It’s been a long time since you could be a general particle physicist, and even longer since you could be simply a “physicist”. The split leads to some interesting questions about the definition of science again: if you make a prediction which you know can’t be verified during your lifetime due to the lag between theory and experimental capability, are you still doing science? Does it matter whether you are or not? Is the science in the theory (the verifiable, or falsifiable, prediction) or in the experiment? Or in both?

And how about Psychology, too

Physicists are among the most rational people I’ve worked with, but psychologists up the game by bringing their own feature to the mix: hypercriticality. And I mean that in the technical sense of criticism, not in the programmer “you’re grammar sucks” sense.

You see, psychology is hard, because people are messy. Physics is easy: the apple either fell to earth or it didn’t. Granted, quantum gets a bit weird, but it generally (probably) does its repeatable thing. We saw above that particle physics is based on statistics (as is semiconductor physics, as it happens); but you can still say that you expect some particular outcome or distribution of outcomes with some level of confidence. People aren’t so friendly. I mean, they’re friendly, but not in a scientific sense. You can do a nice repeatable psychology experiment in the lab, but only by removing so many confounding variables that it’s doubtful the results would carry over into the real world. And the experiment only really told you anything about local first year psychology undergraduates, because they’re the only people who:

  1. walked past the sign in the psychology department advertising the need for participants;
  2. need the ten dollars on offer for participation desperately enough to turn up.

In fact, you only really know about how local first year psychology undergraduates who know they’re participating in a psychology experiment behave. The ethics rules require informed consent which is a good thing because otherwise it’s hard to tell the difference between a psychology lab and a Channel 4 game show. But it means you have to say things like “hey this is totally an experiment and there’ll be counselling afterward if you’re disturbed by not really electrocuting the fake person behind the wall” which might affect how people react, except we don’t really know because we’re not allowed to do that experiment.

On the other hand, you can make observations about the real world, and draw conclusions from them, but it’s hard to know what caused what you saw because there are so many things going on. It’s pretty hard to re-run the entire of a society with just one thing changed, like “maybe if we just made Hitler an inch taller then Americans would like him, or perhaps try the exact same thing as prohibition again but in Indonesia” and other ideas that belong in Philip K. Dick novels.

So there’s this tension: repeatable results that might not apply to the real world (a lack of “ecological validity”), and real-world phenomena that might not be possible to explain (a lack of “internal validity”). And then there are all sorts of other problems too, so that psychologists know that for a study to hold water they need to surround what they say with caveats and limitations. Thus is born the “threats to validity” section on any paper, where the authors themselves describe the generality (or otherwise) of their results, knowing that such threats will be a hot topic of discussion.

But all of this—the physics, the psychology, and the other sciences—is basically a systematised story-telling exercise, in which the story is “this is why the universe is as it is” and the system is the collection of (time-and-space-dependent) rules that govern what stories may be told. It’s like religion, but with more maths (unless your religion is one of those ones that assigns numbers to each letter in a really long book then notices that your phone number appears about as many times as a Poisson distribution would suggest).

Wait, I think you were talking about software

Oh yeah, thanks. So, what science, if any, is there in making software? Does there exist a systematic approach to storytelling? First, let’s look at the kinds of stories we need to tell.

The first are the stories about the social system in which the software finds itself: the story of the users, their need (or otherwise) for a software system, their reactions (or otherwise) to the system introduced, how their interactions with each other change as a result of introducing the system, and so on. Call this requirements engineering, or human-computer interaction, or user experience; it’s one collection of stories.

You can see these kinds of stories emerging from the work of Manny Lehman. He identifies three types of software:

  • an S-system is exactly specified.
  • a P-system executes some known procedure.
  • an E-system must evolve to meet the needs of its environment.

It may seem that E-type software is the type in which our stories about society are relevant, but think again: why do we need software to match a specification, or to follow a procedure? Because automating that specification or procedure is of value to someone. Why, or to what end? Why that procedure? What is the impact of automating it? We’re back to telling stories about society. All of these software systems, according to Lehman, arise from discovery of a problem in the universe of discourse, and provide a solution that is of possible interest in the universe of discourse.

The second kind are the stories about how we worked together to build the software we thought was needed in the first stories. The practices we use to design, build and test our software are all tools for facilitating the interaction between the people who work together to make the things that come out. The things we learned about our own society, and that we hope we can repeat (or avoid) in the future, become our design, architecture, development, testing, deployment, maintenance and support practices. We even create our own tools—software for software’s sake—to automate, ease or disrupt our own interactions with each other.

You’ll mostly hear the second kind of story at most developer conferences. I believe that’s because the people who have most time and inclination to speak at most developer conferences are consultants, and they understand the second stories to a greater extent than the first because they don’t spend too long solving any particular problem. It’s also because most developer conferences are generally about making software, not about whatever problem it is that each of the attendees is trying to solve out in the world.

I’m going to borrow a convention that Rob Rix told me in an email, of labelling the first type of story as being about “external quality” and the second type about “internal quality”. I went through a few stages of acceptance of this taxonomy:

  1. Sounds like a great idea! There really are two different things we have to worry about.
  2. Hold on, this doesn’t sounds like such a good thing. Are we really dividing our work into things we do for “us” and things we do for “them”? Labelling the non-technical identity? That sounds like a recipe for outgroup homogeneity effect.
  3. No, wait, I’m thinking about it wrong. The people who make software are not the in-group. They are the mediators: it’s just the computers and the tools on one side of the boundary, and all of society on the other side. We have, in effect, the Janus Thinker: looking on the one hand toward the external stories, on the other toward the internal stories, and providing a portal allowing flow between the two.

JANUS (from Vatican collection) by Flickr user jinnrouge

So, um, science?

What we’re actually looking at is a potential social science: there are internal stories about our interactions with each other and external stories about our interactions with society and of society’s interactions with the things we create, and those stories could potentially be systematised and we’d have a social science of sorts.

Particularly, I want to make the point that we don’t have a clinical science, an analogy drawn by those who investigate evidence-based software engineering (which has included me, in my armchair way, in the past). You can usefully give half of your patients a medicine and half a placebo, then measure survival or recovery rates after that intervention. You cannot meaningfully treat a software practice, like TDD as an example, as a clinical intervention. How do you give half of your participants a placebo TDD? How much training will you give your ‘treatment’ group, and how will you organise placebo training for the ‘control’ group? [Actually I think I've been on some placebo training courses.]

In constructing our own scientific stories about the world of making software, we would run into the same problems that social scientists do in finding useful compromises between internal and ecological validity. For example, the oft-cited Exploratory experimental studies comparing online and offline programming performance (by Sackman et al., 1968) is frequently used to support the notion that there are “10x programmers”, that some people who write software just do it ten times faster than others.

However, this study does not have much ecological validity. It measures debugging performance, using either an offline process (submitting jobs to a batch system) or an online debugger called TSS, which probably isn’t a lot like the tools used in debugging today. The problems were well-specified, thus removing many of the real problems programmers face in designing software. Participants were expected to code a complete solution with no compiler errors, then debug it: not all programmers work like that. And where did they get their participants from? Did they have a diverse range of backgrounds, cultures, education, experience? It does not seem that any results from that study could necessarily apply to modern software development situated in a modern environment, nor could the claim of “10x programmers” necessarily generalise as we don’t know who is 10x better than whom, even at this one restricted task.

In fact, I’m also not convinced of its internal validity. There were four conditions (two programming problems and two debugging setups), each of which was assigned to six participants. Variance is so large that most of the variables are independent of each other (the independent variables are the programming problem and the debugging mode, and the dependent variables are the amount of person-time and CPU-time), unless the authors correlate them with “programming skill”. How is this skill defined? How is it measured? Why, when the individual scores are compared, is “programming skill” not again taken into consideration? What confounding variables might also affect the wide variation in scored reported? Is it possible that the fastest programmers had simply seen the problem and solved it before? We don’t know. What we do know is that the reported 28:1 ratio between best and worst performers is across both online and offline conditions (as pointed out in, e.g., The Leprechauns of Software Engineering, so that’s definitely a confounding factor. If we just looked at two programmers using the same environment, what difference would be found?

We had the problem that “programming skill” is not well-defined when examining the Sackman et al. study, and we’ll find that this problem is one we need to overcome more generally before we can make the “testable explanations and predictions” that we seek. Let’s revisit the TDD example from earlier: my hypothesis is that a team that adopts the test-driven development practice will be more productive some time later (we’ll defer a discussion of how long) than the null condition.

OK, so what do we mean by “productive”? Lines of code delivered? Probably not, their amount varies with representation. OK, number of machine instructions delivered? Not all of those would be considered useful. Amount of ‘customer value’? What does the customer consider valuable, and how do we ensure a fair measurement of that across the two conditions? Do bugs count as a reduction in value, or a need to do additional work? Or both? How long do we wait for a bug to not be found before we declare that it doesn’t exist? How is that discovery done? Does the expense related to finding bugs stay the same in both cases, or is that a confounding variable? Is the cost associated with finding bugs counted against the value delivered? And so on.

Software dogma

Because our stories are not currently very testable, many of them arise from a dogmatic belief that some tool, or process, or mode of thought, is superior to the alternatives, and that there can be no critical debate. For example, from the Clean Coder:

The bottom line is that TDD works, and everybody needs to get over it.

No room for alternatives or improvement, just get over it. If you’re having trouble defending it, apply a liberal sprinkle of argumentum ab auctoritate and tell everyone: Robert C. Martin says you should get over it!

You’ll also find any number of applications of the thought-terminating cliché, a rhetorical technique used to stop cognitive dissonance by allowing one side of the issue to go unchallenged. Some examples:

  • “I just use the right tool for the job”—OK, I’m done defending this tool in the face of evidence. It’s just clearly the correct one. You may go about your business. Move along.
  • “My approach is pragmatic”—It may look like I’m doing the opposite of what I told you earlier, but that’s because I always do the best thing to do, so I don’t need to explain the gap.
  • “I’m passionate about [X]“—yeah, I mean your argument might look valid, I suppose, if you’re the kind of person who doesn’t love this stuff as much I do. Real craftsmen would get what I’m saying.
  • and more.

The good news is that out of such religious foundations spring the shoots of scientific thought, as people seek to find a solid justification for their dogma. So just as physics has strong spiritual connections, with Steven Hawking concluding in A Brief History of Time:

However, if we discover a complete theory, it should in time be understandable by everyone, not just by a few scientists. Then we shall all, philosophers, scientists and just ordinary people, be able to take part in the discussion of the question of why it is that we and the universe exist. If we find the answer to that, it would be the ultimate triumph of human reason — for then we should know the mind of God.

and Einstein debating whether quantum physics represented a kind of deific Dungeons and Dragons:

[…] an inner voice tells me that it is not yet the real thing. The theory says a lot, but does not really bring us any closer to the secret of the “old one.” I, at any rate, am convinced that He does not throw dice.

so a (social) science of software could arise as an analogous form of experimental theology. I think the analogy could be drawn too far: the context is not similar enough to the ages of Islamic Science or of the Enlightenment to claim that similar shifts to those would occur. You already need a fairly stable base of rational science (and its application via engineering) to even have a computer at all upon which to run software, so there’s a larger base of scientific practice and philosophy to draw on.

It’s useful, though, when talking to a programmer who presents themselves as hyper-rational, to remember to dig in and to see just where the emotions, fallacious arguments and dogmatic reasoning are presenting themselves, and to wonder about what would have to change to turn any such discussion into a verifiable prediction. And, of course, to think about whether that would necessarily be a beneficial change. Just as we can critique scientific culture, so should we critique software culture.

Switch to our mobile site