I recently ran across an interesting family of Pythagorean triples [1].
You can verify that a² + b² = c² for all n.
Sparseness
When written in binary, a has only two bits set, and c has only four bits set.
It’s not as immediately obvious, but b has only two bits that are not set.
For example, here’s what we get writing the Pythagorean triple (a, b, c) in binary when n = 5.
(100000000100000000000, 10111111101111111111, 101000000010000000001)
In linear algebra, we say a matrix is sparse if most of its entries are zeros. The word “sparse” is used similarly in other areas, generally meaning something contains a lot of zeros. So in that sense a and c are sparse.
I suppose you could call b dense since its binary representation is a string of almost all 1s. Or you could say that it is sparse in that has little variation in symbols.
Aside from sparseness, these triples touch on a couple other ideas from the overlap of math and computer science.
One’s complement
The number b is the one’s complement of c, i.e. if you flip every bit in c then you obtain b (with a leading zero).
More precisely, one’s complement is given relative to a number of bits N. The N-bit one’s complement of x equals 2N − x.
In our case b is the (4n + 1)-bit one’s complement of c. Also c is the (4n + 1)-bit one’s complement of b because one’s complement is its own inverse, i.e. it is an involution.
Run-length encoding
The binary representations of a, b, and c are highly compressible strings when n is large. Run-length encoding (RLE) represents each string compactly.
RLE simply describes a string by stating symbols and how many times each is repeated. So to compute the run-length encoding of
100000000100000000000,10111111101111111111,101000000010000000001
from the example above, you’d observe one 1, eight 0s, one 1, eleven zeros, etc.
There’s ambiguity writing the RLE of a sequence of digits unless you somehow put the symbols and counts in a different namespace. For example, if we write 1180110 we intend this to be read as above, but someone could read this as 180 1s followed by 1o 1s.
Let’s replace 0s with z (for zero) and 1s with u (for unit) so our string will not contain any digits.
uzzzzzzzzuzzzzzzzzzzz,uzuuuuuuuzuuuuuuuuuu,uzuzzzzzzzuzzzzzzzzzu
Then the RLE of the string is
uz8uz11,uzu7zu10,uzuz7uz9u
Here a missing count is implicitly 1. So uz8… is read as u, followed by z repeated 8 times, etc.
As n increases, the length of the binary string grows much faster than the length of the corresponding RLE.
Exercise for the reader: What is the RLE of the triple for general n?
Related posts
- Generating Pythagorean triples with linear algebra
- Making text more compressible
- Powers of 3 in binary
[1] H. S. Uhler. A Colossal Primitive Pythagorean Triangle. The American Mathematical Monthly, Vol. 57, No. 5 (May, 1950), pp. 331–332.