Hello readers, part 3, the final part of the “OOP the Easy Way” journey, has now been published at Leanpub! Thanks for joining me along the way! As ever, corrections, questions, and comments are welcome (you can comment here if you like), and as ever, readers who buy the book now will receive free updates for the lifetime of the book. While there’s nothing new left to add, this means that any corrections and expansions will be free to all readers.
If you enjoy OOP the Easy Way or found it informative (or maybe even both), please recommend it to your friends, colleagues and followers. It’d be great if they could enjoy it, be informed by it, or both, too!
Thank you for the book. Your past posts, talks and now this book have helped me understand OOP much more deeply.
The vision of a vast environment of composed objects, written in multiple languages, live and hot-swappable, each in its own process, is intriguing in many ways.
But it seems to me that the overhead of so many processes would outweigh the inherent multiprocessing gains. I was always taught that bringing up another process is expensive. This feels like the Achilles’ heel of this otherwise ideal environment.
Do you understand this system-level aspect well enough to say whether you think it can be overcome? Would it likely put a lower limit on what makes sense to become its own object? In other words, would the performance cost of splitting code out into a separate object be high enough that developers would end up putting too much responsibility into overly large objects?
Thanks for your kind words! Let me start on the multiprocessing question with this preface: there is no such thing as a universal statement about performance. Well, maybe “adding NOP instructions makes it slower”, but even then, slower doesn’t necessarily equate to too slow. And perhaps the slowness is a trade-off made in order to gain other value, like robustness: do we want fast and wobbly, or not-as-fast and not-as-wobbly?
To put “bringing up another process is expensive” on some kind of quantitative basis, I think there are two accounts to measure:
memory. Threads used to be called “lightweight processes” for a reason: you don’t need all of the memory to describe a thread that you do to describe a process (you can do away with separate heap-memory tracking and file-descriptor tables, and of course the executable program is the same for two threads in the same process). So if we say “every object is its own process”, then we have more processes than if we say “every object lives in my monolithic Java app”, and by that measure more memory. We could, perhaps, ameliorate that by saying “every type (I’m avoiding the word class; I mean every object that is constructed in the same way) has its own process, and every instance is a separate thread in that process”, so that the code for “array” is loaded once. Going further, you could say “everyone who wants an empty array gets a channel to a single empty-array thread”, or a scalable pool, or whatever; there’s a sketch of the channel idea after this list.
time. Starting a process means allocating that structure, setting up some initial state, loading the process’s executable from external store, loading dynamic shared objects and editing links to those objects, setting up the initial thread(s), then letting rip. The slowest bit is loading the parts, and I think those executables would be smaller and need fewer shared libraries, so that’s less waiting. Additionally, newer architectures in the high-performance computing world replace “filesystem” storage like SSDs with non-volatile RAM, which is much faster. That points to a Smalltalk-like image model where you don’t distinguish between “executables” on disk and “processes” in memory: you just have your “program” in NVRAM and a stack of caches for the bits you need to access quickly. Over the next decades, this aspect of the issue could disappear.
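To make the “channel to a single empty-array thread” idea concrete, here’s a minimal Go sketch, using goroutines and channels as cheap stand-ins for processes and IPC. The names (emptyArrayService and so on) are invented for the example, not part of any real system:

```go
package main

import "fmt"

// A request to the shared "empty array" object; each caller supplies
// its own reply channel, so replies go back to the right sender.
type request struct {
	reply chan []int
}

// emptyArrayService plays the part of the single "empty array thread":
// one goroutine owns the one canonical empty array, and every client
// that wants it sends a message down the same channel.
func emptyArrayService(requests <-chan request) {
	empty := []int{} // the single shared instance
	for req := range requests {
		req.reply <- empty
	}
}

func main() {
	requests := make(chan request)
	go emptyArrayService(requests)

	// Three clients "send a message" to the same object and wait for replies.
	for i := 0; i < 3; i++ {
		reply := make(chan []int)
		requests <- request{reply: reply}
		fmt.Println("got an empty array of length", len(<-reply))
	}
	close(requests) // lets the service goroutine exit
}
```

Swapping the single goroutine for a pool behind the same channel would give the “scalable pool” variant without any client having to change.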
Speaking of caches: we know from Mach (a microkernel OS that supposedly “failed” because of performance, though it’s best not to tell Apple that) that “IPC is slow”, and that having so many bits of the operating system run as user-space processes communicating over IPC exposes that slowness. Lots of other microkernel implementations, including QNX, worked at speeding up IPC, and indeed Mach 4 contains “shuttles”, which address that side of the performance problem.
But! When we stop armchair-quarterbacking Mach performance and start measuring, we find that a lot of the problem came not from doing IPC but from the cache misses and write stalls that happen when you pass messages. There’s lots of literature on this; search for “Mach performance cache misses” or “operating system structure impact on memory performance”.
Now my hypothesis, and it is truly just that, is that by driving down to the “single object per executable” scale, we get down to the point where the object’s code (which contains few conditions and jumps, because it just isn’t that complex) takes up little memory and the system causes fewer cache events.
My other guess is that core count is continuing to increase (high-end datacenter/HPC CPUs have 32 (Oracle), 34 (Fujitsu), or 72 (Intel) cores per chip), and that enforcing asynchronous programming will take more advantage of those architectures than merely permitting it.
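As a rough illustration of enforced versus merely permitted asynchrony, here’s a hedged Go sketch in which every “message send” immediately returns a future-like channel, so independent requests are free to be serviced on different cores. The ask helper is made up for this example, not an established API:

```go
package main

import (
	"fmt"
	"runtime"
)

// ask delivers a message to an object asynchronously and returns a
// channel that will eventually carry the reply: the caller is never
// given the option of blocking inside the send itself.
func ask(object func(int) int, message int) <-chan int {
	reply := make(chan int, 1)
	go func() { reply <- object(message) }()
	return reply
}

func main() {
	square := func(n int) int { return n * n }
	fmt.Println("cores available:", runtime.NumCPU())

	// All the sends happen up front; the replies are collected later,
	// so independent messages can overlap on different cores.
	var futures []<-chan int
	for n := 1; n <= 8; n++ {
		futures = append(futures, ask(square, n))
	}
	for _, future := range futures {
		fmt.Println(<-future)
	}
}
```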
The only real way to find out is to measure, though :)
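In that spirit, here’s a toy measurement rather than the real experiment: a Go sketch comparing the cost of bringing up an OS process with the cost of bringing up a goroutine. It assumes a Unix-like system with a true binary on the PATH, and the absolute numbers will vary wildly between machines:

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

func main() {
	const n = 100

	// Cost of bringing up an OS process: spawn a trivial program
	// n times and wait for each one to exit.
	start := time.Now()
	for i := 0; i < n; i++ {
		if err := exec.Command("true").Run(); err != nil {
			panic(err)
		}
	}
	perProcess := time.Since(start) / n

	// Cost of bringing up a goroutine (a very lightweight "process")
	// and hearing back from it once.
	done := make(chan struct{})
	start = time.Now()
	for i := 0; i < n; i++ {
		go func() { done <- struct{}{} }()
		<-done
	}
	perGoroutine := time.Since(start) / n

	fmt.Println("per process:  ", perProcess)
	fmt.Println("per goroutine:", perGoroutine)
}
```

On most systems this shows processes costing orders of magnitude more than goroutines, which is exactly the gap the “type per process, instance per thread” arrangement above tries to exploit.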
Nice. Thank you for the thoughtful response. I would love to see this measured.
I’ve always been intrigued by the message-passing definition of OOP. But my depth of programming knowledge was built in the age of Swift. I’ve come to embrace the compile-time assistance I get from a strong type system, even as I recognize that I can’t simply swap my own implementation of a button (with a matching API) in for a UIButton.
Thinking through what you have written, I don’t see any reason why this OOP environment couldn’t have objects written in a language with a strong type system. The key seems to be that as long as the contracts between objects are satisfied, you can have all the compile-time static checks you want. Or not, if you prefer a dynamic runtime solution.
Am I missing anything?
Brian, you’re not missing anything; I think that’s spot on. I believe that the type of an object is its shape (its protocol and its contract), not its name (in ObjC etc., the name of its class; in JS etc., the name of its constructor function). If my compiler can check that my contracts match up, it can check that my types match up.
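Go’s structurally-typed interfaces make a handy, if approximate, illustration of “type as shape, not name”: the compiler checks that a value has the right method set, and never asks what the type is called. The button types here are invented for the example:

```go
package main

import "fmt"

// The "type" of a button here is its shape: anything that responds to
// these messages is a Button, whatever its struct happens to be named.
type Button interface {
	Label() string
	Tap()
}

type UIKitButton struct{ text string }

func (b UIKitButton) Label() string { return b.text }
func (b UIKitButton) Tap()          { fmt.Println("system button tapped") }

type HomeMadeButton struct{ text string }

func (b HomeMadeButton) Label() string { return b.text }
func (b HomeMadeButton) Tap()          { fmt.Println("home-made button tapped") }

// press only cares about the contract, and the compiler enforces it
// statically: passing anything without the right shape won't compile.
func press(b Button) {
	fmt.Println("pressing", b.Label())
	b.Tap()
}

func main() {
	press(UIKitButton{text: "OK"})
	press(HomeMadeButton{text: "Cancel"}) // swapped in, matching API
}
```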