A large class of checksum algorithms have the following pattern:
- Think of the bits in a file as the coefficients in a polynomial P(x).
- Divide P(x) by a fixed polynomial Q(x) mod 2 and keep the remainder.
- Report the remainder as a sequence of bits.
In practice there’s a little more to the algorithm than this, such as appending the length of the file, but the above pattern is at the heart of the algorithm.
There’s a common misconception that the polynomial Q(x) is irreducible, i.e. cannot be factored. This may or may not be the case.
CRC-32
Perhaps the most common choice of Q is
Q(x) = x32 + x26 + x23 + x22 + x16 + x12 + x11 + x10 + x8 + x7 + x5 + x4 + x3 + x2 + x + 1
This polynomial is used in the cksum
utility and is part of numerous standards. It’s know as CRC-32 polynomial, though there are other polynomials occasionally used in 32-bit implementations of the CRC algorithm. And it is far from irreducible as the following Mathematica code shows. The command
Factor[x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^3 + x^2 + x + 1, Modulus -> 2]
shows that Q can be factored as
(1 + x)5 (1 + x + x3 + x4 + x6) (1 + x + x2 + x5 + x6)
(1 + x + x4 + x6 + x7) (1 + x + x4 + x5 + x6 + x7 + x8)
(Mathematica displays polynomials in increasing order of terms.)
Note that the factorization is valid when done over the field with 2 elements, GF(2). Whether a polynomial can be factored, and what the factors are, depends on what field you do your arithmetic in. The polynomial Q(x) above is irreducible as a polynomial with real coefficients. It can be factored working mod 3, for example, but it factors differently mod 3 than it factors mod 2. Here’s the factorization mod 3:
(1 + 2 x2 + 2 x3 + x4 + x5) (2 + x + 2 x2 + x3 + 2 x4 + x6 + x7)
(2 + x + x3 + 2 x7 + x8 + x9 + x10 + 2 x12 + x13 + x15 + 2 x16 + x17 + x18 + x19 + x20)
CRC-64
The polynomial
Q(x) = x64 + x4 + x3 + x + 1
is known as CRC-64, and is part of several standards, including ISO 3309. This polynomial is irreducible mod 2 as the following Mathematica code confirms.
IrreduciblePolynomialQ[x^64 + x^4 + x^3 + x + 1, Modulus -> 2]
The CRC algorithm uses this polynomial mod 2, but out of curiosity I checked whether it is irreducible in other contexts. The following code tests whether the polynomial is irreducible modulo the first 100 primes.
Table[IrreduciblePolynomialQ[x^64 + x^4 + x^3 + x + 1, Modulus -> Prime[n]], {n, 1, 100}]
It is irreducible mod p for p = 2, 233, or 383, but not for any other primes up to 541. It’s also irreducible over the real numbers.
Since Q is irreducible mod 2, the check sum essentially views its input P(x) as a member of the finite field GF(264).
Ben Eater, one of my favorite YouTube channels did an amazing breakdown of CRC including building the circuitry on breadboards.
https://youtu.be/izG7qT0EpBw?si=u3Olf3seXa4eU6MO