Naming Awk

The Awk programming language was named after the initials of its creators. In the preface to a recently published book, The AWK Programming Language, Second Edition, the authors give a little background on this.

Naming a language after its creators shows a certain paucity of imagination. In our defense, we didn’t have a better idea, and by coincidence, at some point in the process we were in three adjacent offices in the order Aho, Weinberger, and Kernighan.

By the way, here’s a nice line from near the end of the book.

Realistically, if you’re going to learn only one programming language, Python is the one. But for small programs typed at the command line, Awk is hard to beat.

Geometric mean on unit circle

Warm up

The geometric mean of two numbers is the square root of their product. For example, the geometric mean of 9 and 25 is 15.

More generally, the geometric mean of a set of n numbers is the nth root of their product.

Alternatively, the geometric mean of a set of n numbers is the exponential of their average logarithm.

\left(\prod_{i=1}^n x_i\right)^{1/n} = \exp\left(\frac{1}{n} \sum_{i=1}^n \log x_i\right)

The advantage of the alternative definition is that it extends to integrals. The geometric mean of a function over a set is the exponential of the average value of its logarithm. And the average of a function over a set is its integral over that set divided by the measure of the set.
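The two definitions are easy to compare numerically. Here is a quick Python sketch (not from the original post) checking that they agree:

```python
import math

def gm_product(xs):
    # nth root of the product
    return math.prod(xs) ** (1 / len(xs))

def gm_log(xs):
    # exponential of the average logarithm
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

print(gm_product([9, 25]))  # 15.0
print(gm_log([9, 25]))      # same value, up to floating point error
```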

Mahler measure

The Mahler measure of a polynomial is the geometric mean over the unit circle of the absolute value of the polynomial.

M(p) = \exp\left( \int_0^1 \log \left|p(e^{2\pi i \theta})\right| \, d\theta\right)

The Mahler measure equals the product of the absolute values of the leading coefficient and roots outside the unit circle. That is, if

p(z) = a \prod_{i=1}^n(z - a_i)

then

M(p) = |a| \prod_{i=1}^n\max(1, |a_i|)

Example

Let p(z) = 7(z − 2)(z − 3)(z + 1/2). Based on the leading coefficient and the roots, we would expect M(p) to be 42. The following Mathematica code shows this is indeed true by returning 42.

    z = Exp[2 Pi I theta]
    Exp[Integrate[Log[Abs[7 (z - 2) (z - 3) (z + 1/2)]], {theta, 0, 1}]]
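As an independent sanity check, the integral can be evaluated numerically in Python (a quick sketch, not the author's code). Since the integrand is smooth and periodic, an equally spaced sum over the circle converges very rapidly:

```python
import cmath
import math

def p(z):
    return 7 * (z - 2) * (z - 3) * (z + 0.5)

# Average log|p| over N equally spaced points on the unit circle
N = 10_000
avg_log = sum(math.log(abs(p(cmath.exp(2j * math.pi * k / N))))
              for k in range(N)) / N
mahler = math.exp(avg_log)
print(mahler)  # approximately 42
```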

Multiplication and heights

Mahler measure is multiplicative: for any two polynomials p and q, the measure of their product is the product of their measures.

M(pq) = M(p)\,M(q)

A few days ago I wrote about height functions for rational numbers. Mahler measure is a height function for polynomials, and there are theorems bounding Mahler measure by other height functions, such as the sum or maximum of the absolute values of the coefficients.

Related posts

Gauss map, Euclidean algorithm, and continued fractions

The Gauss map [1] is the function

f(x) = \frac{1}{x} - \left\lfloor \frac{1}{x}\right\rfloor

where ⌊y⌋ is the floor of y, the greatest integer no larger than y.

I’ve written about this map a couple times before. First, I wrote about how this map is measure-preserving. Second, I wrote about the image at the top of the post, based on Michael Trott’s idea of extending the floor function to the complex plane and plotting it.

This post is a third take on the Gauss map, expanding on a comment by Giovanni Panti. His paper opens by saying

The fact that the euclidean algorithm eventually terminates is pervasive in mathematics. In the language of continued fractions, it can be stated by saying that the orbits of rational points under the Gauss map x ↦ x⁻¹ − ⌊x⁻¹⌋ eventually reach zero.

What does the Gauss map have to do with continued fractions or the Euclidean algorithm? We’ll show this by working through an example.

A continued fraction has the form

a_0 + \cfrac{1}{a_1+\cfrac{1}{a_2+\cfrac{1}{a_3+\ddots}}}

Let’s start with 162/47 and see how we would write it as a continued fraction. An obvious place to start is to pull out the integer part, leaving a proper fraction.

\frac{162}{47} = 3 + \frac{21}{47}

Next we turn 21/47 into 1 over something.

\frac{162}{47} = 3 + \frac{21}{47} = 3 + \cfrac{1}{\cfrac{47}{21}}

Now let’s do the same thing with 47/21: turn it into a proper fraction 2 + 5/21, then rewrite the fraction part 5/21 as the reciprocal of its reciprocal:

 \frac{162}{47} = 3 + \cfrac{1}{\cfrac{47}{21}} = 3 + \cfrac{1}{2 + \cfrac{5}{21}} = 3 + \cfrac{1}{2 + \cfrac{1}{\cfrac{21}{5}}}

Finally, we write 21/5 as 4 + 1/5, and we’re done:

 \frac{162}{47} = 3 + \cfrac{1}{2 + \cfrac{1}{4 + \cfrac{1}{5}}}

Now go back and look at what happens to the fraction in the bottom left corner at each step:

 \frac{162}{47} = 3 + \frac{21}{47} = 3 + \cfrac{1}{2 + \cfrac{5}{21}} = 3 + \cfrac{1}{2 + \cfrac{1}{4 + \cfrac{1}{5}}}

The sequence of bottom left fractions is 21/47, 5/21, 1/5. Each fraction is replaced by its image under the Gauss map: f(21/47) = 5/21, and f(5/21) = 1/5. We applied the Gauss map naturally in the process of creating a continued fraction.

Now suppose we wanted to find the greatest common divisor of 162 and 47 using the Euclidean algorithm. We would compute

\begin{align*} 162 &= 3 \cdot 47 + 21 \\ 47 &= 2 \cdot 21 + 5 \\ 21 &= 4 \cdot 5 + 1 \\ 5 &= 5 \cdot 1 \end{align*}

Notice that these are the same numbers, produced by the same calculations as above: the quotients 3, 2, 4, 5 are the continued fraction coefficients, and the remainders generate the successive fractions 21/47, 5/21, 1/5.
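The correspondence between the Gauss map, the continued fraction coefficients, and the Euclidean algorithm is easy to check with exact rational arithmetic in Python (a quick sketch, not from the original post):

```python
from fractions import Fraction
from math import floor

def gauss_map(x):
    r = 1 / x                  # 1 / Fraction is exact
    return r - floor(r)

# Orbit of 21/47 under the Gauss map: 21/47 -> 5/21 -> 1/5 -> 0
orbit = [Fraction(21, 47)]
while orbit[-1] != 0:
    orbit.append(gauss_map(orbit[-1]))
print(orbit)

# Continued fraction coefficients of 162/47 via the same steps
def continued_fraction(x):
    coeffs = []
    while True:
        a = floor(x)
        coeffs.append(a)
        if x == a:
            return coeffs
        x = 1 / (x - a)        # reciprocal of the fractional part

print(continued_fraction(Fraction(162, 47)))  # [3, 2, 4, 5]
```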

[1] There are other things called the Gauss map, such as the map that takes a point on a surface to the unit normal at that point. That’s not the Gauss map we’re talking about here.

An elliptic curve is a functor

The goal of this post is to unpack a remark in [1]:

… we can say this in fancier terms. Fix a field k …. We say that an elliptic curve E defined over k is that functor which …

Well that is fancy. But what does it mean?

Looking for objects

A functor is a pair of functions [2]. At the base level, a functor takes objects in one category to objects in another category. The quote above continues

… that functor which associates fields K containing k to an algebraic set of the form …

The key word here is “associates.” That must be our function between objects. Our functor maps fields containing k to a certain kind of algebraic set. So the domain of our functor must be fields containing the field k, or fields more generally if you take “containing k” to apply on both sides for all fields k.

Looking for morphisms

Categories are more than objects, and functors are more than maps between objects.

A category consists of objects and morphisms between those objects. So our category of fields must also contain some sort of morphisms between fields, presumably field homomorphisms. And our category of algebraic sets must have some kind of morphisms between algebraic sets that preserve the algebraic structure.

The functor between fields and algebraic sets must map fields to algebraic sets, and maps between fields to maps between algebraic sets, in a structure-preserving way.

Categorification

The algebraic sets are what we normally think of as elliptic curves, but the author is saying we can think of this functor as an elliptic curve. This is an example of categorification: taking something that doesn’t initially involve category theory, then building a categorical scaffold around it.

Why do this? In order to accentuate structure that is implicit before the introduction of category theory. In our case, that implicit structure has to do with fields.

The most concrete way to think of an elliptic curve is over a single field, but we can look at the same curve over larger fields as well. We might start, for example, by thinking of a curve over the rational numbers ℚ, then extending our perspective to include the real numbers ℝ, and then extending it again to include the complex numbers ℂ.

The categorification of elliptic curves emphasizes that things behave well when we go from thinking of an elliptic curve as being over a field k to thinking of it as a curve over an extension field K that contains k.

A functor between fields (and their morphisms) and algebraic sets (of a certain form, along with their morphisms) has to act in a “structure-preserving way” as we said above. This means that morphisms between fields carry over to morphisms between these special algebraic sets in a way that has all the properties one might reasonably expect.

Coda

This post has been deliberately short on details, in part because the line in [1] that it is expanding on is short on details. But we’ve seen how you might tease out a passing comment that something is a functor. You know the right questions to ask if you want to look into this further: what exactly are the morphisms in both categories, and what does functoriality tell us about elliptic curves as we change fields?

There are many ways to categorify something, some more useful than others. Useful categorifications express some structure; some fact that was a theorem before categorification becomes a corollary of the new structure after categorification. There are other ways to categorify elliptic curves, each with their own advantages and disadvantages.

Related posts

[1] Edray Herber Goins. The Ubiquity of Elliptic Curves. Notices of the American Mathematical Society. February 2019.

[2] Strictly speaking, a pair of “morphisms” that may or may not be functions.

Elliptic curve addition formulas

The geometric description of addition of points P and Q on an elliptic curve involves four logical branches:

If one of P or Q is the point at infinity …

Else if P = Q

Else if P and Q lie on a vertical line …

Else …

It would seem that an algorithm for adding points would have to have the same structure, which is unfortunate for at least a couple reasons: it adds complexity, and it raises the possibility of a timing attack since some branches will execute faster than others.

However, an algebraic formula for addition need not mirror its geometric motivation.

Jacobian coordinates

Jacobian coordinates are a different way to describe elliptic curves. They have the advantage that addition formulas have fewer logic branches than the geometric description. They may have one test of whether a point is the point at infinity but they don’t have multiple branches.

Jacobian coordinates also eliminate the need to do division. The geometric description of addition on an elliptic curve involves calculating where the line through two points intersects the curve, and this naturally involves calculating slopes. But in Jacobian coordinates, addition is done using only addition and multiplication, no division. When you’re working over the field of integers modulo some gigantic prime, multiplication and addition are much faster than division.

You can find much more on Jacobian coordinates, including explicit formulas and operation counts, here.
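For concreteness, here is a sketch of one standard doubling formula in Jacobian coordinates, using a made-up toy curve y² = x³ + 2x + 3 over GF(97) and a made-up point; real implementations use standardized curves, large primes, and constant-time field arithmetic:

```python
p = 97      # toy prime, illustration only
a = 2       # coefficient in y^2 = x^3 + a x + b

def jacobian_double(X, Y, Z):
    # Doubling uses only modular additions and multiplications, no division.
    S = (4 * X * Y * Y) % p
    M = (3 * X * X + a * Z**4) % p
    X3 = (M * M - 2 * S) % p
    Y3 = (M * (S - X3) - 8 * Y**4) % p
    Z3 = (2 * Y * Z) % p
    return X3, Y3, Z3

def to_affine(X, Y, Z):
    # A single inversion, deferred until you need affine coordinates back.
    Zinv = pow(Z, -1, p)
    return (X * Zinv**2) % p, (Y * Zinv**3) % p

# The affine point (3, 6) lies on the curve; it embeds as (3, 6, 1).
X3, Y3, Z3 = jacobian_double(3, 6, 1)
affine = to_affine(X3, Y3, Z3)
print(affine)  # (80, 10), matching affine doubling
```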

Complete formulas

Sometimes it’s possible to completely eliminate logic branches as well as division operations. An elliptic curve addition formula is called complete if it is valid for all inputs. The first surprise is that complete addition formulas sometimes exist. The second surprise is that complete addition formulas often exist.

You can find much more on complete addition formulas in the paper “Complete addition formulas for prime order elliptic curves” available here.

McCabe versus Kolmogorov

Whether the Jacobian coordinate formulas or complete formulas are more or less complex than direct implementation of the geometric definition depends on how you define complexity. The formulas are definitely less complex in terms of McCabe complexity, also known as cyclomatic complexity. But they are more complex in terms of Kolmogorov complexity.

The McCabe complexity of a function is essentially the number of independent paths through the function. A complete addition formula, one that could be implemented in software with no branching logic, has the smallest possible McCabe complexity.

I’m using the term Kolmogorov complexity to mean simply the amount of code it takes to implement a function. It’s intuitively clear what this means. But if you make it precise, you end up with a measure of complexity that is only useful in theory and cannot be calculated in practice.

Counting operations

Instead of literally computing Kolmogorov complexity, you’d count the number of multiplications and additions necessary to execute a formula. The links above do just that. If you know how many cycles it takes to execute an addition or multiplication in the field you’re working over, and how many cycles it would take to carry out addition on the elliptic curve implemented directly from the definition, including the division required, then you could estimate whether these alternative formulas would save time.

It used to be common to count how many floating point operations it would take to execute an algorithm. I did that back when I took numerical analysis in college. But this became obsolete not long after I learned it. Counting floating point operations no longer tells you as much about an algorithm’s runtime as it used to due to changes in the speed of arithmetic relative to memory access and changes in computer architecture. However, in the context of this post, counting operations still matters because each operation—such as adding two 512-bit numbers modulo a 512-bit prime—involves a lot of more basic operations.

Related posts

Rational height functions

Mathematicians often speak informally about the relative simplicity of rational numbers. For example, musical intervals that correspond to simple fractions have less tension than intervals that correspond to more complicated fractions.

Such informal statements can be made more precise using height functions. There are a variety of height functions designed for different applications, but the most common height function defines the height of a fraction p/q in lowest terms to be the sum of the numerator and denominator:

height(p/q) = |p| + |q|.

This post will look at how this applies to musical intervals, to approximations for π, and to the number of days in a year.

Musical intervals

Here are musical intervals, ranked by the height of their just tuning frequency ratios.

  1. octave (2:1)
  2. fifth (3:2)
  3. fourth (4:3)
  4. major sixth (5:3)
  5. major third (5:4)
  6. minor third (6:5)
  7. minor sixth (8:5)
  8. minor seventh (9:5)
  9. major second (10:9)
  10. major seventh (15:8)
  11. minor second (16:15)
  12. augmented fourth (45:32)

The interval with the least tension is the octave. The next six intervals are considered consonant. A minor seventh is considered mildly dissonant, and the rest are considered more dissonant. The most dissonant interval is the augmented fourth, also known as a tritone because it spans three whole steps.

Incidentally, a telephone busy signal consists of two pitches, 620 Hz and 480 Hz. This is a ratio of 24:31, which has a height of 55. This is consistent with the signal being moderately dissonant.

Approximations to π

The first four continued fraction approximations to π are 3, 22/7, 333/106, and 355/113.

Continued fraction convergents give the best rational approximation to an irrational for a given denominator. But for a height value that is not the height of a convergent, the best approximation might not be a convergent.

For example, the best approximation to π with height less than or equal to 333 + 106 is 333/106. But the best approximation with height less than or equal to 400 is 289/92, which is not a convergent of the continued fraction for π.
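Claims like this can be verified by brute force. Here is a quick Python sketch (not from the original post) searching all reduced fractions up to a given height:

```python
from math import gcd, pi

def best_approximation(target, max_height):
    # Search all reduced fractions p/q with height |p| + |q| <= max_height
    best, best_err = None, float("inf")
    for q in range(1, max_height):
        for p in range(1, max_height - q + 1):
            if gcd(p, q) != 1:
                continue
            err = abs(p / q - target)
            if err < best_err:
                best, best_err = (p, q), err
    return best

print(best_approximation(pi, 400))  # (289, 92)
```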

Days in a year

The number of days in a year is approximately 365.2424177. Obviously that’s close to 365 1/4, and 1/4 is the best approximation to 0.2424177 for its height.

The Gregorian calendar has 97 leap days every 400 years, which approximates 0.2424177 with 97/400. This approximation has practical advantages for humans, but 8 leap days every 33 years would be a more accurate approximation with a much smaller height.
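A quick numerical comparison (a sketch, using the year length quoted above) bears this out:

```python
# Compare approximations to the fractional part of the year length
year_frac = 0.2424177
approximations = [(1, 4), (97, 400), (8, 33)]
errors = {(p, q): abs(p / q - year_frac) for p, q in approximations}
for (p, q), err in errors.items():
    print(f"{p}/{q}: height {p + q}, error {err:.7f}")
```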

Related posts

Timing attacks

If you ask someone a question and they say “yes” immediately, that gives you different information than if they pause and slowly say “yes.” The information you receive is not just the response but also the time it took to generate the response.

Encryption can be analogous. The time it takes to encrypt data can leak information about the data being encrypted. It probably won’t reveal the data per se, but it may reveal enough about the data or the encryption process to reduce the effort needed to break the encryption.

There are two ways of thwarting timing attacks. One is to try to make the encryption take the same amount of time, independent of the data. This would prevent an attacker from inferring, for example, which branch of an algorithm was taken if one branch executes faster than the other.

If the encryption process always takes the same amount of time, then the execution time of the encryption process carries no information. But it’s enough that the execution time carries no useful information.

It may be easier to make execution time uncorrelated with content than to make execution time constant. Also, keeping the execution time of an algorithm constant may require making the program always run as slow as the worst case scenario. You may get faster average execution time by allowing the time to vary in a way that is uncorrelated with any information useful to an attacker.

One example of this is blinding, used when decrypting RSA-encrypted messages.

Suppose you’re using RSA encryption with a public key e, private key d, and modulus n. You can decrypt ciphertext c to obtain the plaintext m by computing

m = c^d mod n.

An alternative would be to compute a random message r and decrypt r^e c:

(r^e c)^d = r^{ed} c^d = rm mod n

then multiply by the inverse of r mod n to obtain m. Because r is random, the time required to decrypt r^e c is uncorrelated with the time required to decrypt c.
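Here is a toy numerical illustration of the blinding idea, using textbook RSA with tiny primes and no padding (for illustration only; the numbers are made up):

```python
import random
from math import gcd

# Toy RSA key
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))

m = 65
c = pow(m, e, n)  # ciphertext

# Blinded decryption: decrypt r^e * c, then divide out r
r = random.randrange(2, n)
while gcd(r, n) != 1:
    r = random.randrange(2, n)
blinded = (pow(r, e, n) * c) % n
m_recovered = (pow(blinded, d, n) * pow(r, -1, n)) % n
print(m_recovered)  # 65, regardless of the random r
```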

Elliptic curve Diffie-Hellman key exchange

I concluded the previous post by saying elliptic curve Diffie-Hellman key exchange (ECDHE) requires smaller keys than finite field Diffie-Hellman (FFDHE) to obtain the same level of security.

How much smaller are we talking about? According to NIST recommendations, a 256-bit elliptic curve provides about the same security as working over a 3072-bit finite field. Not only are elliptic curves smaller, they scale better. A 512-bit elliptic curve is believed to be about as secure as a 15360-bit finite field: a factor of 2x for elliptic curves and a factor of 5x for finite fields.

The core idea of Diffie-Hellman is to pick a group G, an element g, and a large number x. If y is the result of starting with g and applying the group operation x times, it is difficult to recover x from knowing y. This is called the discrete logarithm problem, taking its name from the case of the group operation being multiplication. But the inverse problem is still called the discrete logarithm problem when the group is additive.

In FFDHE the group G is the multiplicative group of a generator g modulo a large prime p. Applying the group operation (i.e. multiplication) to g a number of times x is computing

y = g^x

and x is rightly called a discrete logarithm; the process is directly analogous to taking the logarithm of a real number.

In ECDHE the group is given by addition on an elliptic curve. Applying the group operation x times to g, adding g to itself x times, is written xg. The problem of recovering x from xg is still called the discrete logarithm problem, though you could call it the discrete “division” problem.

Some groups are unsuited for Diffie-Hellman cryptography because the discrete logarithm problem is easy. If we let G be the additive group modulo a prime (not the multiplicative group) then it is easy to recover x from xg.

Note that when we talk about applying the group operation a large number of times, we mean a really large number of times in theory, though not in practice. If you’re working over an elliptic curve with on the order of 2^256 elements, and x is on the order of 2^256, then xg is the result of adding g to itself on the order of 2^256 times. But in practice you’d double g on the order of 256 times. See fast exponentiation.
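The doubling trick is just fast exponentiation written additively. Here is a generic double-and-add sketch; the stand-in group is deliberately the insecure additive example mentioned above, since it makes the answer easy to check:

```python
def scalar_mult(x, g, add, zero):
    # Double-and-add: computes g combined with itself x times
    # using O(log x) group operations instead of x.
    result, base = zero, g
    while x > 0:
        if x & 1:
            result = add(result, base)
        base = add(base, base)
        x >>= 1
    return result

# Stand-in group: integers mod 1000 under addition. Recovering x here
# is trivial, which is why this group is unsuitable for cryptography.
add_mod = lambda u, v: (u + v) % 1000
n = scalar_mult(2**20 + 3, 7, add_mod, 0)
print(n)  # equals (2**20 + 3) * 7 % 1000
```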

In the post on FFDHE we said that you have to be careful that your choice of prime and generator doesn’t give the group structure that a cryptanalyst could exploit. This is also true for the elliptic curves used in ECDHE, and even more so because elliptic curves are more subtle than finite fields.

If large-scale quantum computing ever becomes practical, Diffie-Hellman encryption will be broken because a quantum computer can solve discrete logarithm problems efficiently via Shor’s algorithm. This applies equally to finite fields and elliptic curves.

Related posts

Finite field Diffie Hellman primes

Diffie-Hellman key exchange is conceptually simple. Alice and Bob want to generate a shared cryptographic key. They want to use asymmetric (public) cryptography to share a symmetric (private) key.

The starting point is a large prime p and a generator 1 < g < p.

Alice generates a large random number x, her private key, and sends Bob g^x mod p.

Similarly, Bob generates a large random number y, his private key, and sends Alice g^y mod p.

Alice takes g^y and raises it to her exponent x, and Bob takes g^x and raises it to his exponent y. They arrive at a common key k because

k = (g^y)^x = (g^x)^y mod p.
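The exchange is a few lines of Python with a deliberately tiny prime (a toy sketch; real deployments use the large standardized primes and cryptographic random number generators):

```python
import random

p = 2087   # small prime, illustration only
g = 5

x = random.randrange(2, p - 1)   # Alice's private key
y = random.randrange(2, p - 1)   # Bob's private key

A = pow(g, x, p)  # Alice sends this to Bob
B = pow(g, y, p)  # Bob sends this to Alice

k_alice = pow(B, x, p)  # Alice computes (g^y)^x
k_bob = pow(A, y, p)    # Bob computes (g^x)^y
print(k_alice == k_bob) # True: both arrive at the same key
```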

The security of the system rests on the assumption that the discrete logarithm problem is hard, i.e. given g and g^z it is computationally impractical to solve for z. This assumption appears to be true in general, but can fail when the group generated by g has exploitable structure.

You can read more about Diffie-Hellman here.

Recommended primes

The choice of prime p and generator g can matter in subtle ways, and so there are lists of standard choices that are believed to be secure.

IETF RFC 7919 recommends five standard primes. These have the form

p = 2^b - 2^{b-64} + 2^{64} \left( \lfloor 2^{b-130} e\rfloor + X \right) - 1

where b is the size of p in bits, e is the base of natural logarithms, and X is the smallest nonnegative integer such that p is a safe prime. In every case the generator is g = 2.

The values of b are 2048, 3072, 4096, 6144, and 8192. The values of X and p are given in RFC 7919, but they’re both determined by b.

I don’t imagine there’s anything special about the constant e above. I suspect it’s there to shake things up a bit in a way that doesn’t appear to be creating a back door. Another irrational number like π or φ would probably do as well, but I don’t know this for sure.

ffdhe2048

The recommended primes have names of the form “ffdhe” followed by b. For b = 2048, the corresponding value of X is 560316.

I wrote a little Python code to verify that this value of X does produce a safe prime and that smaller values of X do not.

    #!/usr/bin/env python3
    
    from sympy import isprime, E, N, floor
    
    b = 2048
    e = N(E, 1000)  # e to 1000 significant figures
    c = floor(2**(b - 130) * e)
    d = 2**b - 2**(b - 64) + 2**64 * c - 1
    
    def candidate(x):
        # p = 2^b - 2^(b-64) + 2^64 (floor(2^(b-130) e) + x) - 1
        return d + 2**64 * x
    
    for x in range(560316, 0, -1):
        p = candidate(x)
        if isprime(p) and isprime((p - 1)//2):  # safe prime: (p-1)/2 also prime
            print(x)

This took about an hour to run. It only printed 560316, verifying the claim in RFC 7919.

Other groups

Finite field Diffie-Hellman is so called because the integers modulo a prime form a finite field. We don’t need a field per se; we’re working in the group formed by the orbit of g within that field. Such groups need to be very large in order to provide security.

It’s possible to use Diffie-Hellman over any group for which the discrete logarithm problem is intractable, and the discrete logarithm problem is harder over elliptic curves than over finite fields. The elliptic curve groups can be smaller and provide the same level of security. Smaller groups mean smaller keys to exchange. For this reason, elliptic curve Diffie-Hellman is more commonly used than finite field Diffie-Hellman.

Chinese Remainder Theorem synthesis algorithm

Suppose m = pq where p and q are large, distinct primes. In the previous post we said that calculations mod m can often be carried out more efficiently by working mod p and mod q, then combining the results to get back to a result mod m.

The Chinese Remainder Theorem assures us that the system of congruences

\begin{align*} x &\equiv a\pmod p \\ x &\equiv b\pmod q \end{align*}

has a unique solution mod m, but the theorem doesn’t say how to compute x efficiently.

H. L. Garner developed an algorithm to directly compute x [1]:

x = q \biggl(\big((a-b)(q^{-1}\bmod p)\big)\bmod p\biggr) + b

You compute the inverse of q mod p once and save it, then solving the system above for multiple values of a and b is very efficient.
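A minimal sketch of the two-modulus case in Python, with made-up toy numbers:

```python
def garner(a, b, p, q, q_inv_mod_p):
    # Solve x = a mod p, x = b mod q for 0 <= x < pq,
    # given the precomputed inverse of q mod p.
    return q * (((a - b) * q_inv_mod_p) % p) + b

p, q = 11, 13
q_inv = pow(q, -1, p)   # compute once and cache
x = 100
a, b = x % p, x % q
print(garner(a, b, p, q, q_inv))  # 100
```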

Garner’s algorithm extends to more than two factors. We will present the general case of his algorithm below, but first we do a concrete example with RSA keys.

RSA key example

This is a continuation of the example at the bottom of this post.

This shows that the numbers in the key file beyond those strictly necessary for the RSA algorithm are the values needed for Garner’s algorithm.

What the key file calls “coefficient” is the inverse of q modulo p.

What the key file calls “exponent1” is the decryption exponent d reduced mod p − 1. Similarly, “exponent2” is d reduced mod q − 1, as explained here.

    from sympy import lcm

    prime1 = 0xf33514...d9
    prime2 = 0xfee496...51

    publicExponent = 65537
    privateExponent = 0x03896d...91

    coefficient = 0x4d5a4c...b7 # q^-1 mod p
    assert(coefficient*prime2 % prime1 == 1)

    exponent1 = 0x37cc69...a1 # d mod (p-1)
    exponent2 = 0x2aa06f...01 # d mod (q-1)
    assert(privateExponent % (prime1 - 1) == exponent1)
    assert(privateExponent % (prime2 - 1) == exponent2)

More factors

Garner’s algorithm can be used more generally when m is the product of more than two primes [2]. Suppose

m = \prod_{i=1}^n m_i

where the mi are pairwise relatively prime (not necessarily prime). Then the system of congruences

x \equiv x_i \pmod {m_i}

for i = 1, 2, 3, …, n can be solved by looking for a solution of the form

x = v_1 + v_2 m_1 + v_3 m_1m_2 + \cdots + v_n \prod_{i=1}^{n-1} m_i

where

v_k \equiv \bigg( x_k - \big(v_1 + v_2m_1 + \cdots + v_{k-1} \prod_{i=1}^{k-2}m_i \big) \bigg) \bigg( \prod_{i=1}^{k-1} m_i\bigg)^{-1} \pmod {m_k}

Again, in practice the modular inverses of the products of the ms would be precomputed and cached.
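The general synthesis can be sketched directly from the formulas above (a quick Python sketch; in production the inverses would be precomputed rather than recomputed per call):

```python
def garner_general(residues, moduli):
    # Solve x = residues[k] mod moduli[k] for pairwise coprime moduli,
    # building the mixed-radix digits v_k one at a time.
    n = len(moduli)
    v = []
    for k in range(n):
        # partial = v_1 + v_2 m_1 + ... + v_k * prod(m_1 .. m_{k-1})
        partial, prod = 0, 1
        for i in range(k):
            partial += v[i] * prod
            prod *= moduli[i]
        inv = pow(prod % moduli[k], -1, moduli[k])
        v.append(((residues[k] - partial) * inv) % moduli[k])
    # Assemble x = v_1 + v_2 m_1 + v_3 m_1 m_2 + ...
    x, prod = 0, 1
    for k in range(n):
        x += v[k] * prod
        prod *= moduli[k]
    return x

moduli = [3, 5, 7]
x = 52
residues = [x % m for m in moduli]
print(garner_general(residues, moduli))  # 52
```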

Related posts

[1] Ferguson, Schneier, Kohno. Cryptography Engineering. Wiley. 2010.

[2] Geddes, Czapor, and Labahn. Algorithms for Computer Algebra. Kluwer Academic Publishers. 1992.