The Difference Engine – the Charles Babbage machine, not the steampunk novel – is a device for tabulating successive values of a polynomial by repeatedly adding up the finite differences that each term contributes between successive input values.
This sounds like a fairly niche market, but in fact it’s quite useful, because a whole lot of other functions can be approximated by polynomials. The approach, which comes from calculus, generates a Taylor series (or a Maclaurin series, if the approximation is for input values near zero).
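To make the method concrete, here’s a minimal sketch in C (my own toy example, nothing like Babbage’s actual hardware): it tabulates the polynomial x^2 + x + 41 at x = 0, 1, 2, … using nothing but additions, which is all a difference engine needs to do.

/* Method of finite differences: tabulate p(x) = x*x + x + 41
   at x = 0, 1, 2, ... using only additions.  */
#include <stdio.h>

int main (void)
{
  /* Seed the columns from the first few values of the polynomial:
     p(0) = 41, first difference p(1) - p(0) = 2, and for a degree-2
     polynomial the second difference is a constant 2.  */
  long value = 41;   /* p(0) */
  long d1 = 2;       /* first difference */
  const long d2 = 2; /* constant second difference */

  for (int x = 0; x <= 10; x++)
    {
      printf ("p(%d) = %ld\n", x, value);
      value += d1; /* next table entry */
      d1 += d2;    /* next first difference */
    }
  return 0;
}

Once the starting value and the difference columns are seeded, every new table entry costs just a couple of additions, which is exactly the kind of work the machine’s columns of wheels were built to do.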
Now, it happens that this collection of other functions includes logarithms:
\(\ln(1+x) \approx x - x^2/2 + x^3/3 - x^4/4 + \ldots\)
and exponents:
\(e^x \approx 1 + x + x^2/2! + x^3/3! + x^4/4! + \ldots\)
and so, given a difference engine, you can make tables of logarithms and exponents.
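As a rough check on those two series (my own sketch, with the cut-off at twelve terms chosen arbitrarily), here’s what summing the first dozen terms gives compared with the C library’s log1p and exp:

/* Truncated Taylor series for ln(1+x) and e^x, compared with libm.  */
#include <stdio.h>
#include <math.h>

int main (void)
{
  double x = 0.1;

  /* ln(1+x) = x - x^2/2 + x^3/3 - ...  (first 12 terms) */
  double log_sum = 0.0, term = x;
  for (int n = 1; n <= 12; n++)
    {
      log_sum += term / n;
      term *= -x; /* next signed power of x */
    }

  /* e^x = 1 + x + x^2/2! + x^3/3! + ...  (first 12 terms) */
  double exp_sum = 1.0, t = 1.0;
  for (int n = 1; n <= 12; n++)
    {
      t *= x / n; /* x^n / n! */
      exp_sum += t;
    }

  printf ("ln(1+x): series %.17g  libm %.17g\n", log_sum, log1p (x));
  printf ("e^x:     series %.17g  libm %.17g\n", exp_sum, exp (x));
  return 0;
}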
In fact, your computer is probably using exactly this approach to calculate those functions. Here’s how glibc calculates ln(x) for x roughly equal to 1:
r = x - 1.0;
r2 = r * r;
r3 = r * r2;
y = r3 * (B[1] + r * B[2] + r2 * B[3]
          + r3 * (B[4] + r * B[5] + r2 * B[6]
                  + r3 * (B[7] + r * B[8] + r2 * B[9] + r3 * B[10])));
// some more twiddling that adds terms in r and r*r, then returns y
In other words, it works out r so that it is calculating ln(1+r) instead of ln(x). Then it adds together r + a*r^2 + b*r^3 + c*r^4 + d*r^5 + ... + k*r^12 … it does the Taylor series for ln(1+r)!
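To see how close the plain textbook coefficients already get, here’s a small sketch (glibc’s actual B[] values are fitted constants, which I’m not reproducing; this just sums the Taylor series for a small r and compares it with log1p):

/* Taylor series for ln(1+r), truncated at r^12, versus log1p.  */
#include <stdio.h>
#include <math.h>

int main (void)
{
  double r = 0.05;             /* i.e. x = 1.05, so ln(x) = ln(1 + r) */
  double sum = 0.0, power = r; /* running power of r */

  for (int n = 1; n <= 12; n++)
    {
      double coeff = (n % 2 ? 1.0 : -1.0) / n; /* Taylor: (-1)^(n+1) / n */
      sum += coeff * power;
      power *= r;
    }

  printf ("Taylor sum: %.17g\n", sum);
  printf ("log1p(r):   %.17g\n", log1p (r));
  return 0;
}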
Now, given these approximations, we can turn numbers into probabilities (using the sigmoid function, which is defined in terms of e^x) and find the errors on those probabilities (using the cross-entropy, which is defined in terms of ln(x)). We can build a learning neural network!
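Here’s a bare-bones sketch of those two pieces, with the score z and label y made up purely for illustration:

/* Sigmoid turns a raw score into a probability via e^x;
   cross-entropy scores that probability via ln(x).  */
#include <stdio.h>
#include <math.h>

/* sigmoid(z) = 1 / (1 + e^(-z)) */
static double sigmoid (double z)
{
  return 1.0 / (1.0 + exp (-z));
}

/* cross-entropy for one example with true label y in {0, 1} */
static double cross_entropy (double y, double p)
{
  return -(y * log (p) + (1.0 - y) * log (1.0 - p));
}

int main (void)
{
  double z = 0.8;         /* some raw network output */
  double y = 1.0;         /* the true label */
  double p = sigmoid (z); /* predicted probability */

  printf ("p = %f, loss = %f\n", p, cross_entropy (y, p));
  return 0;
}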
And, more than a century after the machine was designed, we could still carry out this whole technique on the Difference Engine.