Extrapolating quantum factoring

In 2001, a quantum computer was able to factor 15.

In 2012, a quantum computer was able to factor 21, sorta. The researchers cheated by not implementing gates they knew a priori would not be used. To this day, no quantum computer has factored 21 without taking some shortcuts. Some reasons why are given here.

Extrapolating from two data points is kinda ridiculous, but we only have two data points, and we’re going to do it anyway.

xkcd 605: Extrapolating

Some may object that extrapolating the size of the numbers that can be factored isn’t the right approach. Instead we should be extrapolating the number of qubits or something else. Fine, but that’s not what we’re doing in this post.

Linear extrapolation

Using linear extrapolation, a quantum computer wouldn’t be able to factor a cryptographically relevant number before the heat death of the universe.

Exponential extrapolation

Let’s switch to exponential extrapolation. Instead of assuming the size of the numbers grows linearly, we’ll assume the number of bits in the numbers grows linearly, meaning the numbers grow exponentially.

In about 10 years, the size of number that could be factored increased by

21/15 = 1.4 ≈ √2.

This means the size of the numbers increased by about half a bit in a decade. By now we should be able to factor 35.

RSA keys have at least 1024 bits, so at half a bit per decade, quantum computers should be able to factor RSA keys 20,000 years from now.
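The back-of-the-envelope arithmetic behind that estimate, taking today’s record as roughly 5 bits:

```python
# Growing from about 5 bits today to 1024 bits at half a bit per decade
years = 10 * (1024 - 5) / 0.5
print(years)  # 20380.0, i.e. roughly 20,000 years
```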

Double exponential extrapolation

Now let’s assume the number of bits in a number that a quantum computer can factor is itself growing exponentially, meaning that the numbers are growing doubly exponentially, doubling every decade. Then we should be able to factor 9-bit numbers by now, and in 100 years we will be able to factor RSA keys.
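The rough arithmetic behind those figures, taking log₂ 21 ≈ 4.4 bits in 2012 as the starting point:

```python
from math import log2

bits_2012 = log2(21)           # ≈ 4.39 bits in 2012
print(bits_2012 * 2)           # ≈ 8.8 bits one decade later, i.e. 9-bit numbers
print(log2(1024 / bits_2012))  # ≈ 7.9 doublings, i.e. about 8 decades, to 1024 bits
```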

Breaking RSA in 10 years

The estimate above is kinda arbitrary because a double exponential function has three degrees of freedom and so we’d need three data points, which we don’t have.

We can fit a double exponential by making up a third data point. Some researchers speculate that a quantum computer will be able to factor 1024-bit RSA keys 10 years from now. If so, that gives us a third data point.

To make things easier, let’s work in terms of bits and set x to be the number of years since 2000. Then we have

f(x) = a + b 2^(cx)

with f(1) = log₂ 15, f(12) = log₂ 21, and f(35) = 1024. We can fit this function using Python.

from math import log2
from scipy.optimize import curve_fit

def f(x, a, b, c):
    return a + b * 2**(c*x)

x_data = [1.0, 12.0, 35.0]
y_data = [log2(15), log2(21), 1024]
initial_guess = [4, 0.01, 0.3]

popt, pcov = curve_fit(f, x_data, y_data, p0=initial_guess, maxfev=10000)
a, b, c = popt
print(f"Fitted parameters: {a=}, {b=}, {c=}")

The parameter values are a = 3.893887005243357, b = 0.0093347992761344, c = 0.4782191085655488.

Here’s a plot of the fitted function f.

If we’re going to be able to factor 1024-bit integers by 2035, according to a double exponential model, we’re already behind schedule. We should be factoring 40-bit numbers by now, but so far we’ve only factored 21, a 5-bit number.
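Plugging x = 25 (the year 2025) into the model with the fitted parameter values printed above shows where the curve says we should be:

```python
a, b, c = 3.893887005243357, 0.0093347992761344, 0.4782191085655488

def f(x):
    # bits factorable in year 2000 + x, per the double exponential fit
    return a + b * 2**(c*x)

print(f(25))  # about 41 bits
```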

Or to put it another way, those who believe we will be able to factor 1024-bit integers by 2035 are implicitly assuming a future rate of growth faster than double exponential. And maybe they’re right.

If we’re going to break 1024-bit RSA keys by 2035, at some point we’ll have to do better than the curve above, and so far we’re doing worse.

Bookmark this post and come back when a new quantum factoring record is set and rerun the code with new data.

Related posts

Post-quantum RSA with gargantuan keys

If and when practical scalable quantum computers become available, RSA encryption would be broken, at least for key sizes currently in use. A quantum computer could use Shor’s algorithm to factor n-bit numbers in time on the order of n². The phrase “quantum leap” is misused and overused, but this would legitimately be a quantum leap.

That said, Shor’s method isn’t instantaneous, even on a hypothetical machine that does not yet exist and may never exist. Daniel Bernstein estimates that RSA encryption with terabyte public keys would be secure even in a post-quantum world.

Bernstein said on a recent podcast that he isn’t seriously suggesting using RSA with terabyte keys. Computing the necessary key size is an indirect way of illustrating how impractical post-quantum RSA would be.

Related posts

Conway’s pinwheel tiling

John Conway discovered a right triangle that can be partitioned into five similar triangles. The sides are in proportion 1 : 2 : √5.

You can make a larger similar triangle by making the entire triangle the central (green) triangle of a new triangle.

Here’s the same image with the small triangles filled in as in the original.

Repeating this process creates an aperiodic tiling of the plane.

The tiling was discovered by Conway, but Charles Radin was the first to describe it in a publication [1]. Radin attributes the tiling to Conway.

Alternate visualization

It would be easiest to illustrate the tiling if we were standing together in a room and placing new triangles on the floor, watching the tiling expand. Given the limitations of a screen, it may be easier to visualize subdividing the triangle rather than tiling the plane.

Imagine the smallest triangles are a constant size and at each step we’re viewing the process from further away. We see a constant size outer triangle at each step, but the triangle is growing and covering the plane.

Here’s an animated GIF of the process.

 

Related posts

[1] Charles Radin. “The Pinwheel Tilings of the Plane.” Annals of Mathematics, vol. 139, no. 3, 1994, pp. 661–702.

Silent Payments

Bitcoin transactions appear to be private because names are not attached to accounts. But that is not sufficient to ensure privacy; if it were, much of my work in data privacy would be unnecessary. It’s quite possible to identify people in data that does not contain any direct identifiers.

I hesitate to use the term pseudonymization because people define it multiple ways, but I’d say Bitcoin addresses provide pseudonymization but not necessarily deidentification.

Privacy and public ledgers are in tension. The Bitcoin ledger is superficially private because it does not contain user information, only addresses [1]. However, transaction details are publicly recorded on the blockchain.

To add some privacy to Bitcoin, addresses are typically used only once. Wallet software generates new addresses for each transaction, and so it’s not trivial to track all the money flowing between two people. But it’s not impossible either. Forensic tools make it possible to identify parties behind blockchain transactions via metadata and correlating information on the blockchain with information available off-chain.

Silent Payments

Suppose you want to take donations via Bitcoin. If you put a Bitcoin address on your site, that address has to be permanent, and it’s publicly associated with you. It would be trivial to identify (the addresses of) everyone who sends you a donation.

Silent payments provide a way to work around this problem. Alice could send donations to Bob repeatedly without it being obvious who either party is, and Bob would not have to change his site. To be clear, there would be no way to tell from the addresses that funds are moving between Alice and Bob. The metadata vulnerabilities remain.

Silent payments are not implemented in Bitcoin Core, but there are partial implementations out there. See BIP 352.

The silent payment proposal depends on ECDH (elliptic curve Diffie-Hellman) key exchange, so I’ll do a digression on ECDH before returning to silent payments per se.

ECDH

The first thing to know about elliptic curves, as far as cryptography is concerned, is that there is a way to add two points on an elliptic curve and obtain a third point. This turns the curve into an Abelian group.

You can bootstrap this addition to create a multiplication operation: given a point g on an elliptic curve and an integer n, ng means add g to itself n times. Multiplication can be implemented efficiently even when n is an enormous number. However, and this is key, multiplication cannot be undone efficiently. Given g and n, you can compute ng quickly, but it’s practically impossible to start with knowledge of ng and g and recover n. To put it another way, multiplication is efficient but division is practically impossible [2].

Suppose Alice and Bob both agree on an elliptic curve and a point g on the curve. This information can be public. ECDH lets Alice and Bob share a secret as follows. Alice generates a large random integer a, her private key, and computes a public key A = ag. Similarly, Bob generates a large random integer b and computes a public key B = bg. Their shared secret is

aB = bA.

Alice can compute aB because she (alone) knows a and B is public information. Similarly Bob can compute bA. The two points on the curve aB and bA are the same because

aB = a(bg) = b(ag) = bA.
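Here’s a toy numerical version of the exchange. The tiny curve, base point, and keys below are made up for illustration; real ECDH uses standardized curves such as secp256k1 with 256-bit keys.

```python
# Toy curve y^2 = x^3 + 2x + 3 over GF(97); parameters and keys are
# illustrative only -- real ECDH uses curves like secp256k1.
P = 97
A_COEF = 2
O = None  # point at infinity, the group identity

def ec_add(p1, p2):
    if p1 is O: return p2
    if p2 is O: return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return O                       # p2 is the inverse of p1
    if p1 == p2:                       # tangent-line (doubling) case
        s = (3*x1*x1 + A_COEF) * pow(2*y1, -1, P) % P
    else:                              # chord (addition) case
        s = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (s*s - x1 - x2) % P
    return (x3, (s*(x1 - x3) - y1) % P)

def ec_mul(n, pt):
    # double-and-add: O(log n) group operations
    acc = O
    while n:
        if n & 1:
            acc = ec_add(acc, pt)
        pt = ec_add(pt, pt)
        n >>= 1
    return acc

g = (3, 6)            # a point on the curve: 6^2 = 3^3 + 2*3 + 3 mod 97
a, b = 13, 29         # private keys
A_pub, B_pub = ec_mul(a, g), ec_mul(b, g)
print(ec_mul(a, B_pub) == ec_mul(b, A_pub))  # True: the shared secret agrees
```

The double-and-add loop is what makes ng cheap to compute even for enormous n, while nothing analogous exists for recovering n from ng.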

Back to silent payments

ECDH lets Alice and Bob share a secret, a point on a (very large) elliptic curve. This is the heart of silent payments.

In its simplest form, Alice can send Bob funds using the address

P = B + hash(aB) g.

A slightly more complicated form lets Alice send Bob funds multiple times. The kth time she sends money to Bob she uses the address

P = B + hash(aB || k) g

where || denotes concatenation.

But how do you know k? You have to search the blockchain to determine k, much like the way hierarchical wallets search the blockchain. Is the address corresponding to k = 0 on the blockchain? Then try again with k = 1. Keep doing this until you find the first unused k. Making this process more efficient is one of the things that is being worked on to fully implement silent payments.
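Here is a schematic sketch of that derivation and scanning loop, with the multiplicative group of integers mod a prime standing in for the elliptic curve and SHA-256 for the hash. The group, keys, and encoding are all illustrative assumptions; BIP 352 specifies the real construction on secp256k1.

```python
import hashlib

# Schematic model only: integers mod a prime stand in for curve points.
q = 2**89 - 1   # a Mersenne prime
g = 3

def H(secret, k):
    # hash the shared secret together with the payment counter k
    data = f"{secret}:{k}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

a, b = 123456789, 987654321        # Alice's and Bob's private keys
A, B = pow(g, a, q), pow(g, b, q)  # corresponding public keys

# Alice derives the address for her third payment to Bob (k = 2)
addr = B * pow(g, H(pow(B, a, q), 2), q) % q

# Bob scans k = 0, 1, 2, ... until a derived address matches one on-chain
secret = pow(A, b, q)
k = next(k for k in range(10) if B * pow(g, H(secret, k), q) % q == addr)
print(k)  # 2
```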

Note that Bob needs to make B public, but B is not a Bitcoin address per se; it’s information needed to generate addresses via the process described above. Actual addresses are never reused.

Related posts

[1] Though if you obtain your Bitcoin through an exchange, KYC laws require them to save a lot of private information.

[2] Recovering n from ng is known as the discrete logarithm problem. It would be more logical to call it the discrete division problem, but if you write the group operation on an elliptic curve as multiplication rather than addition, then it’s a discrete logarithm, i.e. trying to solve for an unknown exponent. If and when a large-scale quantum computer exists, the discrete logarithm problem will be practical to solve, but presumably not until then.

An inverse problem for the quadratic equation

In elementary algebra we learn how to solve ax² + bx + c = 0 for roots r = r₁, r₂. But how about the inverse problem: given a real-valued r, find integer coefficients a, b, and c such that r is a root of the corresponding quadratic equation.

A simple approximation can be found by setting b = 0, so r² = −c/a. By appropriately choosing large values of a and c, one can approximate the solution to this inverse problem to arbitrary accuracy.
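A quick numerical illustration of the b = 0 approach, taking r = π as the target (the choice a = 10⁶ is arbitrary):

```python
import math

r = math.pi               # the real number to encode as a quadratic root
a = 10**6                 # large leading coefficient
c = -round(a * r * r)     # choose c so that r^2 is close to -c/a
root = math.sqrt(-c / a)  # positive root of a x^2 + c = 0
print(abs(root - r))      # error on the order of 1e-7
```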

Are there any other solutions (possibly taking advantage of b for a better approximation)?

Two families of algorithms are available: PSLQ [1-3] and LLL [4-5]. The underlying problem formulation behind these algorithms can be understood geometrically. For fixed x, L(a, b, c) = ax² + bx + c is a linear function of (a, b, c), here considering a, b, and c as real-valued. Thus, for such x, the equation L(a, b, c) = 0 defines a known plane in 3-D space. However, we seek a, b, c that are integers. The problem then is to find a lattice point (a, b, c) with a, b, c integers that is “near” to this plane (for example, close to the plane within tolerance τ, which means a 2-D thin “slab” of thickness 2τ surrounding this plane hits a lattice point). We would also prefer if the magnitude of a, b, and c is “small” (e.g., bounded by a constant H).

Given this formulation, the PSLQ algorithm attempts to find an exact solution, and gives up if it can’t find one. The LLL algorithm on the other hand finds approximate solutions within a tolerance.

Python has the mpmath package, which contains mpmath.pslq for the PSLQ algorithm. The Python sympy package has Matrix.LLL() for the LLL reduction.
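For example, here’s mpmath.pslq recovering the quadratic satisfied by the golden ratio:

```python
from mpmath import mp, sqrt, pslq

mp.dps = 30                   # work with 30 significant digits
r = (1 + sqrt(5)) / 2         # golden ratio
a, b, c = pslq([r**2, r, 1])  # integer relation a*r^2 + b*r + c = 0
print(a, b, c)                # coefficients of x^2 - x - 1, up to sign
```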

A colleague of mine was previously submitting overnight runs to compute solutions using brute force search. More recently, using Python code for the LLL algorithm, he is able to solve a large problem in seconds, with answers that exactly reproduce the brute force solution.

References

[1] Ferguson, Helaman R. P.; Bailey, David H. (1992). A Polynomial Time, Numerically Stable Integer Relation Algorithm. RNR Technical Report RNR-91-032, NASA Ames Research Center. URL: https://www.davidhbailey.com/dhbpapers/pslq.pdf

[2] Ferguson, Helaman R. P.; Bailey, David H.; Arno, Steve (1999). Analysis of PSLQ, an integer relation finding algorithm. Mathematics of Computation 68(225): 351–369. URL (AMS PDF): https://www.ams.org/mcom/1999-68-225/S0025-5718-99-00995-3/S0025-5718-99-00995-3.pdf

[3] Bailey, David H. (2020). PSLQ: An Algorithm to Discover Integer Relations. (Expository overview.) URL: https://www.davidhbailey.com/dhbpapers/pslq-comp-alg.pdf

[4] Lenstra, Arjen K.; Lenstra, Hendrik W., Jr.; Lovász, László (1982). Factoring Polynomials with Rational Coefficients. Mathematische Annalen 261: 515–534. DOI: https://doi.org/10.1007/BF01457454 (Springer page: https://link.springer.com/article/10.1007/BF01457454; PDF: https://www.math.ucdavis.edu/~deloera/MISC/LA-BIBLIO/trunk/Lovasz/LovaszLenstraLenstrafactor.pdf)

[5] Nguyen, Phong Q.; Vallée, Brigitte (eds.) (2010). The LLL Algorithm: Survey and Applications. Springer. URL: https://link.springer.com/book/10.1007/978-3-642-02295-1

Inverting series that are flat at zero

The previous post looked at solving the equation

2\theta + \sin(2\theta) = \pi \sin(\varphi)

which arises from the Mollweide map projection. Newton’s method works well unless φ is near π/2. Using a modified version of Newton’s method makes the convergence faster when φ = π/2, which is kinda useless because we know the solution is θ = π/2 there. When φ is near π/2, the modified Newton’s method may diverge. I ended the previous post by saying a series solution would work better when φ is sufficiently close to π/2. This post will flesh that out.

Let x = π − 2θ. Now the task is to solve

x - \sin(x) = y

for small positive values of y.

The left side is

\frac{x^3}{3!} - \frac{x^5}{5!} + \frac{x^7}{7!} - \cdots

and so for very small values of y, and thus very small values of x, we have

x \approx (6y)^{1/3}

If this solution is not sufficiently accurate, we can invert the power series above to get a power series in y that gives the solution x. However, the Lagrange inversion theorem does not apply because the series has a zero derivative at 0. Instead, we have to use Puiseux series inversion, looking for a series in y1/3 rather than a series in y. From the Puiseux series we can see that

x \approx (6y)^{1/3} + y/10

is a more accurate solution. For even more accuracy, you can compute more terms of the Puiseux series.
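To check the claim numerically, we can compare both truncations against a root-finder. Here’s a quick sketch using scipy’s brentq:

```python
import numpy as np
from scipy.optimize import brentq

def exact(y):
    # numerical root of x - sin(x) = y on [0, pi]
    return brentq(lambda x: x - np.sin(x) - y, 0.0, np.pi)

y = 0.01
x1 = (6*y)**(1/3)          # one-term approximation
x2 = x1 + y/10             # with the next Puiseux term
print(abs(x1 - exact(y)))  # error of the one-term approximation
print(abs(x2 - exact(y)))  # considerably smaller
```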

Mollweide map projection and Newton’s method

Mollweide projection

Karl Brandan Mollweide (1774-1825) designed an equal-area map projection, mapping the surface of the earth to an ellipse with an aspect ratio of 2:1. If you were looking at the earth from a distance, the face you’re looking at would correspond to a circle in the middle of the Mollweide map [1]. The part of the earth that you can’t see is mapped to the rest of the ellipse.

The lines of latitude are not quite evenly spaced; some distortion is required to achieve equal area. Instead, latitude φ corresponds to θ on the map according to the equation

2\theta + \sin(2\theta) = \pi \sin( \varphi)

There is no closed-form solution for θ as a function of φ and so the equation must be solved numerically. For each φ we have to find the root of

f(\theta) = 2\theta + \sin(2\theta) - \pi \sin(\varphi)

Here’s a plot of θ as a function of φ.

Newton’s method

Newton’s method works efficiently (converges quadratically) except when φ is close to π/2. The reason Newton’s method slows down for high latitudes is that f and its derivative are both zero at π/2, i.e. f has a double root there.

Newton’s method slows down from quadratic to linear convergence near a double root, but there is a minor modification that maintains quadratic convergence near a double root.

x_{n+1} = x_n - m\frac{f(x_n)}{f^\prime(x_n)}

When m = 1 this is the usual form of Newton’s method. Setting m = 2 tunes the method for double roots.

The modified version of Newton’s method with m = 2 works when the root you’re trying to find is exactly a double root. However, if you’re trying to find a root near a double root, setting m = 2 can cause the method to diverge, so you have to be very careful with changing m.

Illustration

Here’s a version of Newton’s method, specialized for the Mollweide equation. We can specify the value of m, and the initial estimate of the root defaults to θ = φ.

from numpy import sin, cos, pi

def mynewton(phi, m, x0=None):
    x = phi if x0 is None else x0  # default initial estimate is theta = phi
    delta = 1
    iterations = 0
    
    while abs(delta) > 1e-12:
        fx = 2*x + sin(2*x) - pi*sin(phi)
        fprime = 2 + 2*cos(2*x)
        delta = m*fx/fprime
        x -= delta
        iterations += 1

    return (x, iterations)

Far from the double root

Say φ = 0.25 and m = 1. When we use the default initial estimate Newton’s method converges in 5 iterations.

Near the double root

Next, say φ = π/2 − 0.001. Again we set m = 1 and use the default initial estimate. Now Newton’s method takes 16 iterations. The convergence is slower because we’re getting close to the double root at π/2.

But if we switch to m = 2, Newton’s method diverges. Slow convergence is better than no convergence, so we’re better off sticking with m = 1.

On the double root

Now let’s say φ = π/2. In this case the default guess is exactly correct, but the code divides by zero and crashes. So let’s take our initial estimate to be π/2 − 0.001. When m = 1, Newton’s method takes 15 steps to converge. When m = 2, it converges in 6 steps, which is clearly faster. However, in both cases the returned result is only good to 6 decimal places, even though the code is looking for 12 decimal places.

This shows that as we get close to φ = π/2, we have both a speed and an accuracy issue.

A better method

I haven’t worked this out, but here’s how I imagine I would fix this. I’d tune Newton’s method, preventing it from taking steps that are too large, so that the method is accurate for φ in the range [0, π/2 − ε] and use a different method, such as power series inversion, for φ in [π/2 − ε, π/2].

Update: The next post works out the series solution alluded to above.

Related posts

[1] Someone pointed out on Hacker News that this statement could be confusing. The perspective you’d see from a distance is not the Mollweide projection—that would not be equal area—but it corresponds to the same region. As someone put it on HN “a circle in the center corresponds to one half of the globe.”

1420 MHz

There are four manmade objects that have left our solar system: Pioneer 10, Pioneer 11, Voyager 1, and Voyager 2.

Yesterday I wrote about the Golden Record which is on both of the Voyager probes. Some of the graphics on the cover of the Golden Record are also on plaques attached to the Pioneer probes. In particular, the two plaques and the two record covers have the following image.

Pioneer plaque hydrogen diagram

This image is essential to decoding the rest of the images. It represents a “spin flip transition” of a neutral hydrogen atom. This event has a frequency of 1420.405 MHz, corresponding to a wavelength of about 21 cm. The little hash mark is meant to indicate that this is a unit of measurement, both of length (21 cm) and time (the time it takes light to travel 21 cm).

The thought behind the unit of measurement was that it might make sense to an extraterrestrial being. Frequencies associated with hydrogen atoms are more universal than seconds (which ultimately derive from the period of Earth’s rotation) or meters (which derive from the circumference of the Earth).

The Wow! signal

The frequency 1420 MHz is special, so special that scientists believed advanced civilizations would recognize it. That’s why an astronomer, Jerry R. Ehman, was excited to observe a strong signal at that frequency on August 15, 1977. (Incidentally, this was five days before Voyager 2 was launched.)

Ehman wrote “Wow!” on a computer printout of the signal, and the signal has become known as “the Wow! signal.”

Ehman thought this signal might have come from an extraterrestrial intelligence for the same reason we were using its frequency in our message to potential extraterrestrials.

Wow! 6EQUJ5

Related posts

The AI fork in the road

If AI can do part of your job, you’re likely to either be fired or promoted.

AI will always have some error rate, and if it can do your job at an acceptable error rate, you’ll need to find another job.

But if the error rate is not acceptable, the ability to identify and fix errors becomes more valuable.

People with the experience to recognize errors and fix them quickly are more productive when delegating work to AI. Higher productivity should result in higher wages, though you may have to become self-employed to be paid more for getting more done.

Contractors and consultants are more often paid in proportion to the value of their work than salaried employees are, especially since 1971.

Recognizing errors takes experience. This is especially true of LLM-generated errors because LLMs always return plausible results (according to some model) though they do not always return correct results.

Day of the week centuries from now

Yesterday I wrote a post outlining mental math posts I’d written over the years. I don’t write about mental math that often, but I’ve been writing for a long time, and so the posts add up. In the process of writing the outline post I noticed a small gap.

In the post on mentally calculating the day of the week I focus on this century, then tersely say “For dates in the 20th century, add 1. For dates in the 22nd century, subtract 2.” This post will expand on that. I’ll say a little more about what these rules mean, how to extend them to other centuries, and why they work.

Dates in the 20th century

Suppose you wanted to know the day of the week for July 4, 1976. You could use the method in the post linked above and end up with Saturday. Moving ahead one day of the week gives you Sunday.

Dates in future centuries

For days in the 2100s you would find the day of the week for the corresponding date in this century, then subtract two days. For the 2200s you’d add three days, and for the 2300s you’d add one day.

The Gregorian calendar repeats every 400 years, so if you can find the days of the week for the years 2000 through 2399, you can find the day of the week for all years in the future by subtracting enough multiples of 400 to get into that range of years. For example, in the year 2825, September 18 will fall on a Thursday because it falls on a Thursday today in 2025.
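Python’s datetime module, which implements the proleptic Gregorian calendar, confirms both the 400-year cycle and the 2825 example:

```python
from datetime import date

# 400 Gregorian years contain 146097 days, a multiple of 7,
# so the calendar repeats exactly every 400 years
days = (date(2400, 1, 1) - date(2000, 1, 1)).days
print(days, days % 7)  # 146097 0

# So September 18, 2825 falls on the same weekday as September 18, 2025
print(date(2825, 9, 18).strftime("%A"))  # Thursday
```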

You could also go backward by centuries as well as forward, but there’s a complication. If you run the Gregorian calendar backward far enough, you run into a time before the Gregorian calendar was adopted.

Why the rules work

So why do the century adjustment rules work? There’s a simple but incomplete explanation, and a more complicated complete explanation.

Partial explanation

The number of days in most centuries, i.e. those not divisible by 400, is

365 × 100 + 24 = 36524

which is congruent to 5 mod 7, which is congruent to −2 mod 7. Dates in the 2100s are 36524 days ahead of their counterparts in the 2000s, so we move back two days in the week. The same applies when going from the 2100s to the 2200s, and from the 2200s to the 2300s.

For example, September 18, 2125 is 36524 days from now—note that February, 2100 will not have a Leap Day—and so it will fall two days earlier in the week, on a Tuesday.

That explanation is mostly correct, but there’s a wrinkle. Not all dates in the 2100s are 36524 days ahead of their counterparts in the 2000s. There are indeed 36524 days between September 18, 2025 and September 18, 2125. But there are 36525 days between January 1, 2000 and January 1, 2100. The former was on a Saturday and the latter will be on a Friday. It would seem that our process is wrong, but it’s not.
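The datetime module confirms both day counts:

```python
from datetime import date

print((date(2125, 9, 18) - date(2025, 9, 18)).days)  # 36524: 2100 has no leap day
print((date(2100, 1, 1) - date(2000, 1, 1)).days)    # 36525: 2000 has a leap day
```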

Full explanation

Let’s go back and look at the algorithm for finding days of the week,

  1. Take the last two digits of the year and add the number of times 4 divides that number [1].
  2. Add a constant corresponding to the month.
  3. Add the day of the month.
  4. Subtract 1 in January or February of a leap year.
  5. Take the remainder by 7.
  6. Adjust for the century.

Suppose we wanted to find the day of the week for January 1, 2100. The first step gives us 0. The constant for January is 6, so at the end of step 2 we have 6. At the end of step 3 we have 7.

Now comes the key part: in step 4 we do not subtract 1 because 2100 is not a leap year. Step 5 gives us 0, i.e. Sunday. When we adjust for the century by moving back two days, we get Friday.

This explanation is admittedly tedious, but it reveals a subtle part of the algorithm: you have to carry out the algorithm, in particular step 4, for the original year. To find the day of the week for a year in the 2200s, for example, you carry out the algorithm as if it were valid for the 2200s, until you get to the last step.
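Here’s the algorithm in code, restricted to dates in the 2000s so no century adjustment is needed. The post only gives the January constant (6); the other month constants below follow the same convention (0 = Sunday) and are my reconstruction, spot-checked against Python’s datetime.

```python
from datetime import date

# Month constants for the 0 = Sunday convention; only January's value (6)
# is stated in the post, the rest are reconstructed
MONTH_CONST = [6, 2, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4]

def weekday_2000s(y, m, d):
    """Day of the week (0 = Sunday) for dates from 2000 through 2099."""
    yy = y % 100
    total = yy + yy // 4           # step 1: year share
    total += MONTH_CONST[m - 1]    # step 2: month constant
    total += d                     # step 3: day of month
    if m <= 2 and yy % 4 == 0:     # step 4: January/February of a leap year
        total -= 1
    return total % 7               # step 5

# Spot-check against datetime (where Monday = 0, hence the shift)
for y, m, d in [(2000, 1, 1), (2024, 2, 29), (2025, 9, 18)]:
    assert weekday_2000s(y, m, d) == (date(y, m, d).weekday() + 1) % 7
```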

Related posts

[1] This is known as the year share. There are many ways to calculate it (mod 7) that are less obvious but easier to calculate mentally. It’s interesting that these methods generally look more complicated in writing, but they’re easier to carry out.