Multiplication tables and Latin squares

The multiplication table of a finite group forms a Latin square.

You form the multiplication table of a finite group just as you would the multiplication tables from your childhood: list the elements along the top and side of a grid and fill in each square with the products. In the context of group theory the multiplication table is called a Cayley table.

There are two differences between Cayley tables and the multiplication tables of elementary school. First, Cayley tables are complete: because the group is finite, the table includes every element of the group. Elementary school multiplication tables are the upper left corner of an infinite Cayley table for the positive integers.

(The positive integers do not form a group under multiplication because only 1 has a multiplicative inverse. The positive integers form a magma, not a group, but we can still talk of the Cayley table of a magma.)

The other difference is that the elements of a finite group typically do not have a natural order, unlike integers. It’s conventional to start with the group identity, but other than that the order of the elements along the top and sides could be arbitrary. Two people may create different Cayley tables for the same group by listing the elements in a different order.

A Cayley table is necessarily a Latin square. That is, each element appears exactly once in each row and column. Here’s a quick proof. The row corresponding to the element a consists of a multiplied by each of the elements of the group. If two entries were the same, i.e. ab = ac for some b and c, then b = c because you can multiply both sides on the left by the inverse of a. So each row is a permutation of the group elements. The analogous argument holds for columns, multiplying on the right.

Not all Latin squares correspond to the Cayley table of a group. Also, the Cayley table of an algebraic structure without inverses may not form a Latin square. The Cayley table for the positive integers, for example, is not a Latin square.
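To see both claims concretely, here’s a quick sketch (the helper function is mine): the multiplication table of the nonzero residues mod 7, which form a group, is a Latin square, while a truncated multiplication table of the positive integers is not.

```python
def is_latin_square(table):
    # every row and every column must contain each symbol exactly once
    symbols = set(table[0])
    rows_ok = all(set(row) == symbols for row in table)
    cols_ok = all(set(col) == symbols for col in zip(*table))
    return rows_ok and cols_ok

# Cayley table of the group of nonzero residues mod 7 under multiplication
group_table = [[(a * b) % 7 for b in range(1, 7)] for a in range(1, 7)]

# upper left corner of the infinite table for the positive integers
magma_table = [[a * b for b in range(1, 7)] for a in range(1, 7)]

print(is_latin_square(group_table))  # True
print(is_latin_square(magma_table))  # False
```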

Here’s an example, the Cayley table for Q8, the quaternion group. Its elements are the eight unit quaternions ±1, ±i, ±j, and ±k, and the multiplication is the usual multiplication for quaternions:

i² = j² = k² = ijk = −1.

as William Rowan Hamilton famously carved into the stone of Brougham Bridge. I colored the cells of the table to make it easier to scan and verify that the table is a Latin square.
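Hamilton’s relations and the Latin square property can both be checked in a few lines of Python. Representing a quaternion as a 4-tuple of coefficients is my choice here, not anything from the post.

```python
def qmult(p, q):
    # Hamilton's quaternion product in coordinates (w, x, y, z)
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

one, i, j, k = (1,0,0,0), (0,1,0,0), (0,0,1,0), (0,0,0,1)
Q8 = [one, i, j, k] + [tuple(-t for t in u) for u in (one, i, j, k)]

# i² = j² = k² = ijk = −1
minus_one = (-1, 0, 0, 0)
assert qmult(i, i) == qmult(j, j) == qmult(k, k) == minus_one
assert qmult(qmult(i, j), k) == minus_one

# the Cayley table is a Latin square: each row and column is a permutation
table = [[qmult(a, b) for b in Q8] for a in Q8]
assert all(set(row) == set(Q8) for row in table)
assert all(set(col) == set(Q8) for col in zip(*table))
```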


The Buenos Aires constant

The Buenos Aires constant is 2.92005097731613…

What’s so special about this number? Let’s see what happens when we use it to initialize the following Python script.

s = 2.920050977316134

for _ in range(10):
    i = int(s)          # integer part of s
    print(i)
    s = i*(1 + s - i)   # multiply the fractional part of s by i

What does this print?

    2, 3, 5, 7, 11, 13, 17, 19, 23, 29

If we started with the exact Buenos Aires constant and carried out our operations exactly, the procedure above would generate all prime numbers. In actual IEEE 754 arithmetic, it breaks down around Douglas Adams’ favorite number.

The Buenos Aires constant is defined by the infinite sum

\lambda = \frac{2-1}{1} + \frac{3-1}{2} + \frac{5-1}{3\cdot 2} + \frac{7-1}{5\cdot 3 \cdot 2} + \cdots

As you can tell, the primes are baked into the definition of λ, so the series can’t generate new primes.
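The series can be summed directly. Here’s a sketch in exact rational arithmetic (the helper functions are mine): the kth term is pₖ − 1 divided by the product of the primes before pₖ.

```python
from fractions import Fraction

def primes(n):
    # first n primes by trial division; fine for small n
    ps = []
    candidate = 2
    while len(ps) < n:
        if all(candidate % p != 0 for p in ps):
            ps.append(candidate)
        candidate += 1
    return ps

def buenos_aires(n):
    # partial sum of the series using the first n primes
    total = Fraction(0)
    denominator = 1
    for p in primes(n):
        total += Fraction(p - 1, denominator)
        denominator *= p
    return total

print(float(buenos_aires(20)))  # ≈ 2.920050977316134
```

The denominators grow like the primorials, so the partial sums converge very quickly; twenty primes already give more accuracy than double precision can represent.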

I used mpmath to calculate λ to 100 decimal places:

2.920050977316134712092562917112019468002727899321426719772682533107733772127766124190178112317583742

This required carrying out the series defining λ for the first 56 primes. When I carried out the iteration above also to 100 decimal places, it failed on the 55th prime. So I got about as many primes out of the computation as I put into it.


Reference: Beyond Pi and e: a Collection of Constants. James Grime, Kevin Knudson, Pamela Pierce, Ellen Veomett, Glen Whitney. Math Horizons, Vol. 29, No. 1 (September 2021), pp. 8–12

1 + 2 + 3 + … = −1/12

The other day MathMatize posted

roses are red
books go on a shelf
1+2+3+4+ …

with a photo of Ramanujan on X.

This was an allusion to the bizarre equation

1 + 2 + 3 + … = − 1/12.

This comes up often enough that I wanted to write a post that I could share a link to next time I see it.

The equation is nonsense if interpreted in the usual way. The sum on the left diverges. You could say the sum is ∞ if by that you mean you can make the sum as large as you like by taking the partial sum out far enough.

Here’s how the equation is meant to be interpreted. The Riemann zeta function is defined as

\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s}

for complex numbers s with real part greater than 1, and defined for the rest of the complex plane (except s = 1) by analytic continuation. The qualifiers matter. The infinite sum above does not define the zeta function for all numbers; it defines ζ(s) only for numbers with real part greater than 1. The sum is valid for numbers like 7, or 42 − 476i, or √2 + πi, but not for −1.

If the sum did define ζ(−1), the sum would be 1 + 2 + 3 + …; but the sum does not apply there.

However, ζ(−1) is defined, and it equals −1/12.

What does it mean to define a function by analytic continuation? There is a theorem that essentially says there is only one way to extend an analytic function. It is possible to construct a function, analytic for all s ≠ 1, that takes the same values as the series above wherever Re(s) > 1.

We could give that function a new name, say f(s). That is the function whose value at −1 equals −1/12. But since f is the only analytic function that agrees with ζ(s) where Re(s) > 1, we go ahead and use the notation ζ(s) for the extended function.

To put it another way, the function ζ(s) is valid for all s ≠ 1, but the series representation for ζ(s) is not valid unless Re(s) > 1. There are other representations for ζ(s) for other regions of the complex plane, including for s = −1, and that’s what lets us compute ζ(−1) to find out that it equals −1/12.

So the rigorous but less sensational way to interpret the equation is to say

1^s + 2^s + 3^s + …

is a whimsical way of referring to the function defined by the series, when the series converges, and defined by its analytic continuation otherwise. So in addition to saying

1 + 2 + 3 + … = − 1/12

we could also say

1² + 2² + 3² + … = 0

and

1³ + 2³ + 3³ + … = 1/120.

You can make up your own equation for any value of s for which you can calculate ζ(s).
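For n ≥ 1 these values can be computed with no infinite sums at all, via the classical identity ζ(−n) = −B_{n+1}/(n + 1), where the B’s are Bernoulli numbers. Here’s a sketch in exact rational arithmetic (function names are mine):

```python
from fractions import Fraction
from math import comb

def bernoulli(m):
    # Bernoulli numbers B_0, ..., B_m from the recurrence
    # sum_{j=0}^{n} C(n+1, j) B_j = 0 with B_0 = 1
    B = [Fraction(1)]
    for n in range(1, m + 1):
        s = sum(comb(n + 1, j) * B[j] for j in range(n))
        B.append(-s / (n + 1))
    return B

def zeta_negative(n):
    # zeta(-n) = -B_{n+1} / (n + 1) for n >= 1
    return -bernoulli(n + 1)[n + 1] / (n + 1)

print(zeta_negative(1))  # -1/12
print(zeta_negative(2))  # 0
print(zeta_negative(3))  # 1/120
```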

Multiple angle asymmetry

The cosine of a multiple of θ can be written as a polynomial in cos θ. For example,

cos 3θ = 4 cos³ θ − 3 cos θ

and

cos 4θ = 8 cos⁴ θ − 8 cos² θ + 1.

But it may or may not be possible to write the sine of a multiple of θ as a polynomial in sin θ. For example,

sin 3θ = −4 sin³ θ + 3 sin θ

but

sin 4θ = −8 sin³ θ cos θ + 4 sin θ cos θ.

It turns out cos nθ can always be written as a polynomial in cos θ, but sin nθ can be written as a polynomial in sin θ if and only if n is odd. We will prove this, say more about sin nθ for even n, then be more specific about the polynomials alluded to.

Proof

We start by writing exp(inθ) two different ways:

cos nθ + i sin nθ = (cos θ + i sin θ)n

The real part of the left hand side is cos nθ, and the real part of the right hand side contains powers of cos θ and even powers of sin θ. We can convert the latter to cosines by replacing sin² θ with 1 − cos² θ.

The imaginary part of the left hand side is sin nθ. If n is odd, the right hand side involves odd powers of sin θ and even powers of cos θ, in which case we can replace the even powers of cos θ with even powers of sin θ. But if n is even, every term in the imaginary part involves an odd power of sin θ and an odd power of cos θ. Each odd power of cos θ can be written as cos θ times even powers of sin θ, so every term becomes cos θ times an odd power of sin θ.

We’ve proven a little more than we set out to prove. When n is even, we cannot write sin nθ as a polynomial in sin θ, but we can write it as cos θ multiplied by an odd degree polynomial in sin θ. Alternatively, we could write sin nθ as sin θ multiplied by an odd degree polynomial in cos θ.

Naming polynomials

The polynomials alluded to above are not arbitrary polynomials. They are well-studied polynomials with many special properties. Yesterday’s post on Chebyshev polynomials defined Tn(x) as the nth degree polynomial for which

Tn(cos θ) = cos nθ.

That post didn’t prove that the right hand side is a polynomial, but this post did. The polynomials Tn(x) are known as Chebyshev polynomials of the first kind, or sometimes simply Chebyshev polynomials since they come up in application more often than the other kinds.

Yesterday’s post also defined Chebyshev polynomials of the second kind by

Un(cos θ) sin θ = sin (n+1)θ.

So when we say cos nθ can be written as a polynomial in cos θ, we can be more specific: that polynomial is Tn.

And when we say sin nθ can be written as sin θ times a polynomial in cos θ, we can also be more specific:

sin nθ = sin θ Un−1(cos θ).
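Both identities are easy to spot-check numerically using the standard Chebyshev recurrence (T₁(x) = x, U₁(x) = 2x, and Pₙ = 2x Pₙ₋₁ − Pₙ₋₂); the helper names below are mine.

```python
from math import cos, sin, isclose

def chebyshev(n, x, p1):
    # P_0 = 1, P_1 = p1, P_n = 2x P_{n-1} - P_{n-2}
    prev, curr = 1.0, p1
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, curr = curr, 2 * x * curr - prev
    return curr

def T(n, x): return chebyshev(n, x, x)      # first kind
def U(n, x): return chebyshev(n, x, 2 * x)  # second kind

theta = 0.7
for n in range(1, 10):
    assert isclose(T(n, cos(theta)), cos(n * theta))
    assert isclose(sin(n * theta), sin(theta) * U(n - 1, cos(theta)))
```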

Solving trigonometric equations

A couple years ago I wrote about systematically solving trigonometric equations. That post showed that any polynomial involving sines and cosines of multiples of θ could be reduced to a polynomial in sin θ and cos θ. The results in this post let us say more about this polynomial, that we can write it in terms of Chebyshev polynomials. This might allow us to apply some of the numerous identities these polynomials satisfy and find useful structure.


Posthumous Chebyshev Polynomials

Two families of orthogonal polynomials are named after Chebyshev because he explored their properties. These are prosaically named Chebyshev polynomials of the first and second kind.

I recently learned there are Chebyshev polynomials of the third and fourth kind as well. You might call these posthumous Chebyshev polynomials. They were not developed by Mr. Chebyshev, but they bear a family resemblance to the polynomials he did develop.

The four kinds of Chebyshev polynomials may be defined in order as follows.

\begin{align*} T_n(\cos\theta) &= \cos n\theta \\ U_n(\cos\theta) &= \frac{\sin (n+1)\theta}{\sin \theta} \\ V_n(\cos\theta) &= \frac{\cos \left(n+\frac{1}{2}\right)\theta}{\cos \frac{1}{2}\theta} \\ W_n(\cos\theta) &= \frac{\sin \left(n+\frac{1}{2}\right)\theta}{\sin \frac{1}{2}\theta} \\ \end{align*}

It’s not obvious that these definitions even make sense, but in each case the right hand side can be expanded into a sum of powers of cos θ, i.e. a polynomial in cos θ. [1]

All four kinds of Chebyshev polynomials satisfy the same recurrence relation

P_n(x) = 2x\,P_{n-1}(x) - P_{n-2}(x)

for n ≥ 2 and P0 = 1 but with different values of P1, namely x, 2x, 2x − 1, and 2x + 1 respectively [2].

Plots

We can implement Chebyshev polynomials of the third kind using the recurrence relation above.

def V(n, x):
    # Chebyshev polynomial of the third kind via the recurrence above
    # (the naive recursion is exponentially slow, but fine for small n)
    if n == 0: return 1
    if n == 1: return 2*x - 1
    return 2*x*V(n-1, x) - V(n-2, x)

Here is a plot of Vn(x) for n = 0, 1, 2, 3, 4.

The code for implementing Chebyshev polynomials of the fourth kind is the same, except the middle line becomes

    if n == 1: return 2*x + 1

Here is the corresponding plot.

Square roots

The Chebyshev polynomials of the first and third kind, and polynomials of the second and fourth kind, are related as follows:

\begin{align*} V_n(x)&=\sqrt\frac{2}{1+x}T_{2n+1}\left(\sqrt\frac{x+1}{2}\right) \\ W_n(x)&=U_{2n}\left(\sqrt\frac{x+1}{2}\right) \end{align*}

To see that the expressions on the right hand side really are polynomials, note that Chebyshev polynomials of the first and second kinds are odd for odd orders and even for even orders [3]. This means that in the first equation, every term in T2n + 1 has a factor of √(1 + x) that is canceled out by the 1/√(1 + x) term up front. In the second equation, there are only even powers of the radical term so all the radicals go away.

You could take the pair of equations above as the definition of Chebyshev polynomials of the third and fourth kind, but the similarity between these polynomials and the original Chebyshev polynomials is more apparent in the definition above using sines and cosines.
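The square root relations are also easy to spot-check numerically, using the common recurrence with the four different values of P₁ (function names are mine):

```python
from math import sqrt, isclose

def chebyshev(n, x, p1):
    # common recurrence: P_0 = 1, P_1 = p1, P_n = 2x P_{n-1} - P_{n-2}
    prev, curr = 1.0, p1
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, curr = curr, 2 * x * curr - prev
    return curr

def T(n, x): return chebyshev(n, x, x)
def U(n, x): return chebyshev(n, x, 2 * x)
def V(n, x): return chebyshev(n, x, 2 * x - 1)
def W(n, x): return chebyshev(n, x, 2 * x + 1)

x = 0.3
y = sqrt((x + 1) / 2)
for n in range(6):
    assert isclose(V(n, x), sqrt(2 / (1 + x)) * T(2 * n + 1, y))
    assert isclose(W(n, x), U(2 * n, y))
```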

The square roots hint at how these polynomials first came up in applications. According to [2], Chebyshev polynomials of the third and fourth kind

have been called “airfoil polynomials”, since they are appropriate for approximating the single square root singularities that occur at the sharp end of an airfoil.

Dirichlet kernel

There’s an interesting connection between Chebyshev polynomials of the fourth kind and Fourier series.

The right hand side of the definition of Wn is known in Fourier analysis as Dn, the Dirichlet kernel of order n.

D_n(\theta) = \frac{\sin \left(n+\frac{1}{2}\right)\theta}{\sin \frac{1}{2}\theta}

The nth order Fourier series approximation of f, i.e. the sum of the terms −n through n in the Fourier series for f, is the convolution of f with Dn, divided by 2π.

(D_n * f)(\theta) = 2\pi \sum_{k=-n}^n \hat{f}(k) \exp(ik\theta)

Note that Dn(θ) is a function of θ, not of x. The equation Wn(cos θ) = Dn(θ) defines Wn(x) where x = cos θ. To put it another way, Dn(θ) is not a polynomial, but it can be expanded into a polynomial in cos θ.
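Here’s a numerical spot check of this connection: Wₙ(cos θ) equals the Dirichlet kernel Dₙ(θ), which in turn equals the symmetric sum of exponentials e^{ikθ} for k = −n, …, n, i.e. 1 + 2 Σ cos kθ. The function names are mine.

```python
from math import cos, sin, isclose

def W(n, x):
    # Chebyshev polynomial of the fourth kind via the recurrence
    prev, curr = 1.0, 2 * x + 1
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, curr = curr, 2 * x * curr - prev
    return curr

def dirichlet(n, theta):
    return sin((n + 0.5) * theta) / sin(0.5 * theta)

theta = 0.9
for n in range(6):
    assert isclose(W(n, cos(theta)), dirichlet(n, theta))
    # Dirichlet kernel as a sum of exponentials: 1 + 2 sum cos(k theta)
    assert isclose(dirichlet(n, theta),
                   1 + 2 * sum(cos(k * theta) for k in range(1, n + 1)))
```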


[1] Each function on the right hand side is an even function, which implies it’s at least plausible that each can be written as powers of cos θ. In fact you can apply multiple angle trig identities to work out the polynomials in cos θ.

[2] J.C. Mason and G.H. Elliott. Near-minimax complex approximation by four kinds of Chebyshev polynomial expansion. Journal of Computational and Applied Mathematics 46 (1993) 291–300

[3] This is not true of Chebyshev polynomials of the third and fourth kind. To see this note that V1(x) = 2x − 1, and W1(x) = 2x + 1, neither of which is an odd function.

Sparse binary Pythagorean triples

I recently ran across an interesting family of Pythagorean triples [1].

\begin{align*} a &= 2^{4n} + 2^{2n+1} \\ b &= 2^{4n} - 2^{4n-2} - 2^{2n} - 1 \\ c &= 2^{4n} + 2^{4n-2} + 2^{2n} + 1 \\ \end{align*}

You can verify that a² + b² = c² for all n.

Sparseness

When written in binary, a has only two bits set, and c has only four bits set.

It’s not as immediately obvious, but b has only two bits that are not set.

For example, here’s what we get writing the Pythagorean triple (a, b, c) in binary when n = 5.

(100000000100000000000,  10111111101111111111, 101000000010000000001)
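A quick script confirms the identity and the bit counts. (For n = 1 some powers of 2 in the formulas collide, so the bit counts are only checked for n ≥ 2.)

```python
def triple(n):
    a = 2**(4*n) + 2**(2*n + 1)
    b = 2**(4*n) - 2**(4*n - 2) - 2**(2*n) - 1
    c = 2**(4*n) + 2**(4*n - 2) + 2**(2*n) + 1
    return a, b, c

for n in range(1, 12):
    a, b, c = triple(n)
    assert a*a + b*b == c*c                 # Pythagorean identity
    if n >= 2:                              # counts hold once powers are distinct
        assert bin(a)[2:].count("1") == 2   # a has two bits set
        assert bin(c)[2:].count("1") == 4   # c has four bits set
        assert bin(b)[2:].count("0") == 2   # b has two bits clear

print(tuple(bin(x)[2:] for x in triple(5)))  # the n = 5 example above
```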

In linear algebra, we say a matrix is sparse if most of its entries are zeros. The word “sparse” is used similarly in other areas, generally meaning something contains a lot of zeros. So in that sense a and c are sparse.

I suppose you could call b dense since its binary representation is a string of almost all 1s. Or you could say that it is sparse in the sense that it has little variation in its symbols.

Aside from sparseness, these triples touch on a couple other ideas from the overlap of math and computer science.

One’s complement

The number b is nearly the one’s complement of c: flipping every bit of c (with a leading zero) gives b − 1, since b + c = 2^(4n+1).

More precisely, complements are taken relative to a number of bits N. The N-bit one’s complement of x equals 2^N − 1 − x, and the N-bit two’s complement equals 2^N − x, i.e. flip every bit and add 1.

So b is exactly the (4n + 1)-bit two’s complement of c. Also c is the (4n + 1)-bit two’s complement of b, because the map x ↦ 2^N − x is its own inverse, i.e. it is an involution.

Run-length encoding

The binary representations of a, b, and c are highly compressible strings when n is large. Run-length encoding (RLE) represents each string compactly.

RLE simply describes a string by stating symbols and how many times each is repeated. So to compute the run-length encoding of

100000000100000000000,10111111101111111111,101000000010000000001

from the example above, you’d observe one 1, eight 0s, one 1, eleven 0s, etc.

There’s ambiguity in writing the RLE of a sequence of digits unless you somehow put the symbols and counts in different namespaces. For example, if we write 1180110 we intend this to be read as above, but someone could read it as 180 1s followed by 10 1s.

Let’s replace 0s with z (for zero) and 1s with u (for unit) so our string will not contain any digits.

uzzzzzzzzuzzzzzzzzzzz,uzuuuuuuuzuuuuuuuuuu,uzuzzzzzzzuzzzzzzzzzu

Then the RLE of the string is

uz8uz11,uzu7zu10,uzuz7uz9u

Here a missing count is implicitly 1. So uz8… is read as u, followed by z repeated 8 times, etc.
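Here’s a small encoder along those lines (my own sketch, using the symbol-then-count convention with counts of 1 left implicit):

```python
from itertools import groupby

def rle(s):
    # each run becomes its symbol followed by the run length,
    # with a run length of 1 left implicit
    runs = ((symbol, len(list(group))) for symbol, group in groupby(s))
    return "".join(symbol if k == 1 else f"{symbol}{k}" for symbol, k in runs)

print(rle("uzzzzzzzzuzzzzzzzzzzz"))   # uz8uz11
print(rle("uzuuuuuuuzuuuuuuuuuu"))    # uzu7zu10
print(rle("uzuzzzzzzzuzzzzzzzzzu"))   # uzuz7uz9u
```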

As n increases, the length of the binary string grows much faster than the length of the corresponding RLE.

Exercise for the reader: What is the RLE of the triple for general n?


[1] H. S. Uhler. A Colossal Primitive Pythagorean Triangle. The American Mathematical Monthly, Vol. 57, No. 5 (May, 1950), pp. 331–332.

Matrix representations of number systems

The previous post discussed complex numbers, dual numbers, and double numbers. All three systems are constructed by adding some element to the real numbers that has some special algebraic property. The complex numbers are constructed by adding an element i such that i² = −1. The dual numbers add an element ε ≠ 0 with ε² = 0, and the double numbers are constructed by adding j ≠ ±1 with j² = 1.

If adding special elements seems somehow illegitimate, there is an alternative, perhaps more concrete, way to define these number systems using 2 × 2 matrices. (A reader from 150 years ago would probably be more comfortable with appending special numbers than with matrices, but now we’re accustomed to matrices.)

The following mappings provide isomorphisms between complex, dual, and double numbers and their embeddings in the ring of 2 × 2 matrices.

\begin{align*} a + ib &\leftrightarrow \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \\ a + \varepsilon b &\leftrightarrow \begin{pmatrix} a & b \\ 0 & a \end{pmatrix} \\ a + jb &\leftrightarrow \begin{pmatrix} a & b \\ b & a \end{pmatrix} \\ \end{align*}

Because the mappings are isomorphisms, you can translate a calculation in one of these number systems into a calculation involving real matrices, then translate the result back to the original number system. This is conceptually interesting, but it could also be useful if you’re using software that supports matrices but does not directly support alternative number systems.

You can also apply the correspondences from right to left. If you need to carry out calculations on matrices of the special forms above, you could move over to complex (or dual, or double) numbers, do your algebra, then convert the result back to matrices.
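Here’s a sketch verifying that multiplication agrees under the mapping, for each of the three systems (the representation as nested lists and the helper names are mine):

```python
def embed(a, b, kind):
    # 2x2 matrix representing a + b*u, where u is i, epsilon, or j
    return {"complex": [[a, -b], [b, a]],
            "dual":    [[a,  b], [0, a]],
            "double":  [[a,  b], [b, a]]}[kind]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b, c, d = 2, 3, 5, 7

# (a + ib)(c + id) = (ac - bd) + i(ad + bc)
assert matmul(embed(a, b, "complex"), embed(c, d, "complex")) == \
       embed(a*c - b*d, a*d + b*c, "complex")

# (a + eps b)(c + eps d) = ac + eps(ad + bc), since eps^2 = 0
assert matmul(embed(a, b, "dual"), embed(c, d, "dual")) == \
       embed(a*c, a*d + b*c, "dual")

# (a + jb)(c + jd) = (ac + bd) + j(ad + bc), since j^2 = 1
assert matmul(embed(a, b, "double"), embed(c, d, "double")) == \
       embed(a*c + b*d, a*d + b*c, "double")
```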

Functions of a matrix

The previous post looked at variations on Euler’s theorem in complex, dual, and double numbers. You could verify these three theorems by applying exp, sin, cos, sinh, and cosh to matrices. In each case you define the function in terms of its power series and stick in matrices. You should be a little concerned about convergence, but it all works out.

You should also be concerned about commutativity. Multiplication of real numbers is commutative, but multiplication of matrices is not, so you can’t just stick matrices into any equation derived for real numbers and expect it to hold. For example, it’s not true in general that exp(A + B) equals exp(A) exp(B). But it is true if the matrices A and B commute, and the special matrices that represent complex (or dual, or double) numbers do commute.


Euler’s formula for dual numbers and double numbers

The complex numbers are formed by adding an element i to the real numbers such that i² = − 1. We can create other number systems by adding other elements to the reals.

One example is dual numbers. Here we add a number ε ≠ 0 with the property ε² = 0. Dual numbers have been used in numerous applications, most recently in automatic differentiation.

Another example is double numbers [1]. Here we add a number j ≠ ±1 such that j² = 1. (Apologies to electrical engineers and Python programmers. For this post, j is not the imaginary unit from complex numbers.)

(If adding special numbers to the reals makes you uneasy, see the next post for an alternative approach to defining these numbers.)

We can find analogs of Euler’s formula

\exp(i\theta) = \cos(\theta) + i \sin(\theta)

for dual numbers and double numbers by using the power series for the exponential function

\exp(z) = \sum_{k=0}^\infty \frac{z^k}{k!}

to define exp(z) in these number systems.

For dual numbers, the analog of Euler’s theorem is

\exp(\varepsilon x) = 1 + \varepsilon x

because all the terms in the power series after the first two involve powers of ε that evaluate to 0. Although this equation only holds exactly for dual numbers, not real numbers, it is approximately true if ε is a small real number. This is the motivation for using ε as the symbol for the special number added to the reals: dual numbers make rigorous the informal practice of treating ε as so small that terms in ε² can be neglected.

For double numbers, the analog of Euler’s theorem is

\exp(j x) = \cosh(x) + j \sinh(x)

and the proof is entirely analogous to the proof of Euler’s theorem for complex numbers: Write out the power series, then separate the terms involving even exponents from the terms involving odd exponents.
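Using the 2 × 2 matrix representations of i, ε, and j, all three versions of Euler’s formula can be checked by summing the exponential power series for matrices. Here’s a sketch (helper functions are mine):

```python
from math import cos, sin, cosh, sinh, isclose

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(A, terms=30):
    # truncated power series: sum of A^k / k! for k = 0 .. terms-1
    total = [[0.0, 0.0], [0.0, 0.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]   # A^0 / 0!
    for k in range(1, terms + 1):
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
        term = [[term[i][j] / k for j in range(2)] for i in range(2)]
        term = matmul(term, A)
    return total

x = 0.5

# i <-> [[0, -1], [1, 0]]: exp(ix) = cos x + i sin x
E = mat_exp([[0, -x], [x, 0]])
assert isclose(E[0][0], cos(x)) and isclose(E[1][0], sin(x))

# eps <-> [[0, 1], [0, 0]]: exp(eps x) = 1 + eps x
E = mat_exp([[0, x], [0, 0]])
assert isclose(E[0][0], 1) and isclose(E[0][1], x)

# j <-> [[0, 1], [1, 0]]: exp(jx) = cosh x + j sinh x
E = mat_exp([[0, x], [x, 0]])
assert isclose(E[0][0], cosh(x)) and isclose(E[0][1], sinh(x))
```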


[1] Double numbers have also been called motors, hyperbolic numbers, split-complex numbers, spacetime numbers, …

Duplicating Hankel plot from A&S

Abramowitz and Stegun has quite a few intriguing plots. This post will focus on the following plot, Figure 9.4, available here.

A&S figure 9.4

We will explain what the plot is and approximately reproduce it.

The plot comes from the chapter on Bessel functions, but the caption says it is a plot of the Hankel function H0(1). Why a plot of a Hankel function and not a Bessel function? The Hankel functions are linear combinations of the Bessel functions of the first and second kind:

H0(1) = J0 + i Y0

More on Hankel functions and their relation to Bessel functions here.

The plot is the overlay of two kinds of contour plots: one for lines of constant magnitude and one for lines of constant phase. That is, if the function values are written in the form reiθ then one plot shows lines of constant r and one plot shows lines of constant θ.

We can roughly reproduce the plot of magnitude contours with the following Mathematica command:

ContourPlot[Abs[HankelH1[0, x + I y]], {x, -4, 2 }, {y, -1.5 , 1.5 }, 
 Contours -> 20, ContourShading -> None, AspectRatio -> 1/2]

This produces the following plot.

Absolute value contour

Similarly, we can replace Abs with Arg in the Mathematica command and increase Contours to 30 to obtain the following phase contour plot.

Phase contour

Finally, we can stack the two plots on top of each other using Mathematica’s Show command.

Magnitude and phase contours

By the way, you can clearly see the branch cut in the middle. The Hankel function is continuous (even analytic) as you move clockwise from the second quadrant around to the third, but it is discontinuous across the negative real axis because of the branch cut.


Area of a quadrilateral from the lengths of its sides

Last week Heron’s formula came up in the post An Unexpected Triangle. Given the lengths of the sides of a triangle, there is a simple expression for the area of the triangle.

A = \sqrt{s(s-a)(s-b)(s-c)}

where the sides are a, b, and c and s is the semiperimeter, half the perimeter.

Is there an analogous formula for the area of a quadrilateral? Yes and no. If the quadrilateral is cyclic, meaning there exists a circle going through all four of its vertices, then Brahmagupta’s formula for the area of a quadrilateral is a direct generalization of Heron’s formula for the area of a triangle. If the sides of the cyclic quadrilateral are a, b, c, and d, then the area of the quadrilateral is

A = \sqrt{(s-a)(s-b)(s-c)(s-d)}

where again s is the semiperimeter.

But in general, the area of a quadrilateral is not determined by the lengths of its sides alone. There is a more general expression, Bretschneider’s formula, that expresses the area of a general quadrilateral in terms of the lengths of its sides and the sum of two opposite angles. (Either pair of opposite angles leads to the same value.)

A = \sqrt {(s-a)(s-b)(s-c)(s-d) - abcd \, \cos^2 \left(\frac{\alpha + \gamma}{2}\right)}

In a cyclic quadrilateral, the opposite angles α and γ add up to π, and so the cosine term drops out.
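Here’s Bretschneider’s formula in code, with the cyclic case as a sanity check (the function name is mine):

```python
from math import sqrt, cos, pi

def bretschneider(a, b, c, d, alpha_plus_gamma):
    # area from the four sides and the sum of two opposite angles
    s = (a + b + c + d) / 2
    return sqrt((s - a) * (s - b) * (s - c) * (s - d)
                - a * b * c * d * cos(alpha_plus_gamma / 2) ** 2)

# cyclic quadrilaterals have alpha + gamma = pi, so the cosine term vanishes
print(bretschneider(1, 1, 1, 1, pi))  # unit square: 1.0
print(bretschneider(2, 1, 2, 1, pi))  # 2 x 1 rectangle: 2.0

# letting d -> 0 recovers Heron's formula for a triangle
print(bretschneider(3, 4, 5, 0, pi))  # 3-4-5 right triangle: 6.0
```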

The contrast between the triangle and the quadrilateral touches on an area of math called distance geometry. At first this term may sound redundant. Isn’t geometry all about distances? Well, no. It is also about angles. Distance geometry seeks results, like Heron’s formula, that depend only on distances.
