Formalizing a ring theorem with Lean 4 and Claude

I’ve been testing Claude’s ability to generate Lean 4 code to prove theorems. I’ve written about a couple experiments that verified calculations. I did not write about my failed attempt to get Claude to formalize a proof of the pqr theorem for seminorms.

This time I asked Claude to formally prove the theorem from the previous post on partial fraction decomposition in an abstract setting. Unlike the previous experiments, I did not give it a proof or calculation to verify; I asked it to generate a proof.

This was the prompt:

Write a proof in Lean 4 for the following generalization of partial fraction decomposition.

If a ring R is a principle ideal domain (PID), then every element c of the field of fractions over R can be written in the form \[c = \sum_i \frac{a_i}{p_i^{r_i}}\]

where the p_i are nonassociate irreducible elements of R, the r_i are non-negative integers, and the elements a_i and p_i are relatively prime.

This was Claude’s immediate response:

I’ll work through this carefully, as it’s a substantial formalization task. Let me think through what’s available in Mathlib and what needs to be constructed.

This is a genuinely hard Lean 4 formalization problem. Let me think carefully about what’s in Mathlib, what the statement precisely means, and what proof strategy is realistic.

It also stated its overall proof strategy:

Proof strategy: Write c = a/b in lowest terms. Factor b = ∏ pᵢ^rᵢ. Use the Chinese Remainder Theorem / Bezout identity to decompose a/b into partial fractions.

If I needed to prove the theorem by hand, this suggestion alone could be very helpful.

The first attempt generated code that would not build. As before, a lot of the problems seem to stem from things moving around in Mathlib due to refactorings. There are also occasional hallucinations.

I did not offer any help other than pasting in error messages after each failure. When working on a real problem, I’d be more interactive. But as a test, I just passively pasted error messages and tried again. It took 11 iterations, but I finally got a proof. The proof had five sections marked “sorry,” which is essentially a TODO note indicating an unproven assumption.

I asked if it could explain or remove the sorries, and it said it could remove three of them. But the generated code did not build and I hit a session limit before I could debug the code. Here is the code with five sorry sections.

/-
  PartialFractions.lean  (v11 — final)
-/

import Mathlib

open UniqueFactorizationMonoid

variable {R : Type*} [CommRing R] [IsDomain R] [IsPrincipalIdealRing R]

structure PFDSummand (R : Type*) [CommRing R] where
  prime    : R
  exponent : ℕ
  numer    : R

noncomputable def PFDSummand.val [CommRing R] [IsDomain R] (s : PFDSummand R) :
    FractionRing R :=
  algebraMap R (FractionRing R) s.numer /
  algebraMap R (FractionRing R) (s.prime ^ s.exponent)

structure IsPFD [CommRing R] [IsDomain R] [IsPrincipalIdealRing R]
    (terms : List (PFDSummand R)) (c : FractionRing R) : Prop where
  irred    : ∀ s ∈ terms, Irreducible s.prime
  nonAssoc : ∀ s ∈ terms, ∀ t ∈ terms, s ≠ t → ¬Associated s.prime t.prime
  coprime  : ∀ s ∈ terms, IsCoprime s.numer s.prime
  sum_eq   : (terms.map PFDSummand.val).sum = c

private lemma algMap_ne_zero {a : R} (h : a ≠ 0) :
    (algebraMap R (FractionRing R)) a ≠ 0 :=
  map_ne_zero_iff _ (IsFractionRing.injective R (FractionRing R)) |>.mpr h

private lemma exists_num_denom (c : FractionRing R) :
    ∃ a b : R, b ≠ 0 ∧
      (algebraMap R (FractionRing R)) a / (algebraMap R (FractionRing R)) b = c := by
  obtain ⟨a, b, hb, hab⟩ := IsFractionRing.div_surjective (A := R) c
  exact ⟨a, b, nonZeroDivisors.ne_zero hb, hab⟩

-- ======================================================================
-- §6  Bézout splitting
-- ======================================================================

private lemma bezout_split (x y : R) (hcop : IsCoprime x y)
    (hx : x ≠ 0) (hy : y ≠ 0) (e : R) :
    ∃ s t : R, s * x + t * y = 1 ∧
      (algebraMap R (FractionRing R) e) /
        ((algebraMap R (FractionRing R) x) * (algebraMap R (FractionRing R) y)) =
      (algebraMap R (FractionRing R) (e * t)) / (algebraMap R (FractionRing R) x) +
      (algebraMap R (FractionRing R) (e * s)) / (algebraMap R (FractionRing R) y) := by
  obtain ⟨s, t, hst⟩ := hcop
  refine ⟨s, t, hst, ?_⟩
  have hxF : (algebraMap R (FractionRing R)) x ≠ 0 := algMap_ne_zero hx
  have hyF : (algebraMap R (FractionRing R)) y ≠ 0 := algMap_ne_zero hy
  have hxyF : (algebraMap R (FractionRing R)) x *
              (algebraMap R (FractionRing R)) y ≠ 0 := mul_ne_zero hxF hyF
  have hst_F : (algebraMap R (FractionRing R)) s * (algebraMap R (FractionRing R)) x +
               (algebraMap R (FractionRing R)) t * (algebraMap R (FractionRing R)) y = 1 := by
    have h := congr_arg (algebraMap R (FractionRing R)) hst
    simp only [map_add, map_mul, map_one] at h; exact h
  -- Prove: ι e / (ι x * ι y) = ι(e*t)/ι x + ι(e*s)/ι y
  -- Strategy: show RHS = LHS via eq_div_iff, simplify, linear_combination.
  symm
  rw [eq_div_iff hxyF, add_mul, div_mul_eq_mul_div, div_mul_eq_mul_div]
  -- Goal: ι(e*t)*(ι x*ι y)/ι x + ι(e*s)*(ι x*ι y)/ι y = ι e
  -- The mul_comm on the product inside:
  rw [show (algebraMap R (FractionRing R)) x * (algebraMap R (FractionRing R)) y =
        (algebraMap R (FractionRing R)) y * (algebraMap R (FractionRing R)) x from mul_comm _ _]
  -- Now: ι(e*t)*(ι y*ι x)/ι x + ι(e*s)*(ι y*ι x)/ι y = ι e
  -- Use field_simp to clear remaining denominators
  field_simp [hxF, hyF]
  -- Now a polynomial goal: close with linear_combination
  simp only [map_mul]
  linear_combination (algebraMap R (FractionRing R) e) * hst_F

-- ======================================================================
-- §7  Coprime numerator reduction
-- ======================================================================

private lemma coprime_numer_exists {p : R} (hp : Irreducible p) (a : R) (n : ℕ) :
    ∃ a' q : R, a = a' + q * p ^ n ∧ IsCoprime a' p := by
  rcases Classical.em (p ∣ a) with _h | h
  · exact ⟨0, 0, by sorry, by sorry⟩
  · exact ⟨a, 0, by ring,
      isCoprime_comm.mpr ((irreducible_iff_prime.mp hp).coprime_iff_not_dvd.mpr h)⟩

-- ======================================================================
-- §8  Main theorem
-- ======================================================================

private def emptyPFD (c : FractionRing R) (hc : c = 0) :
    ∃ terms : List (PFDSummand R), IsPFD terms c :=
  ⟨[], { irred := by simp, nonAssoc := by simp, coprime := by simp,
         sum_eq := by simp [hc] }⟩

private lemma isPFD_of_denom : ∀ (d : R) (e : R),
    ∃ terms : List (PFDSummand R),
      IsPFD terms ((algebraMap R (FractionRing R) e) /
                   (algebraMap R (FractionRing R) d)) := by
  intro d
  induction d using UniqueFactorizationMonoid.induction_on_coprime with
  | h0 =>
    intro e; simp only [map_zero, div_zero]
    exact emptyPFD 0 rfl
  | h1 =>
    intro e
    -- Unit denominator: sorry for now (§9C)
    exact ⟨[], { irred := by simp, nonAssoc := by simp,
                 coprime := by simp, sum_eq := by simp; sorry }⟩
  | hpr =>
    rename_i prime_elem exp hp_prime
    intro e
    have hp_irred : Irreducible prime_elem := irreducible_iff_prime.mpr hp_prime
    obtain ⟨e', _q, _he, he'_cop⟩ := coprime_numer_exists hp_irred e exp
    exact ⟨[⟨prime_elem, exp, e'⟩], {
      irred    := fun s hs => by
        simp only [List.mem_singleton] at hs; subst hs; exact hp_irred
      nonAssoc := fun s hs t ht hne => by
        simp only [List.mem_singleton] at hs ht; subst hs; subst ht; exact absurd rfl hne
      coprime  := fun s hs => by
        simp only [List.mem_singleton] at hs; subst hs; exact he'_cop
      sum_eq   := by
        simp only [List.map_singleton, List.sum_singleton, PFDSummand.val]; sorry }⟩
  | hcp =>
    rename_i x y hrel ih_x ih_y
    intro e
    -- Handle the case where x = 0 or y = 0 directly
    rcases eq_or_ne x 0 with rfl | hx
    · -- x = 0: denominator = 0*y = 0, so quotient = 0
      simp only [zero_mul, map_zero, div_zero]
      exact emptyPFD 0 rfl
    rcases eq_or_ne y 0 with rfl | hy
    · -- y = 0: denominator = x*0 = 0, so quotient = 0
      simp only [mul_zero, map_zero, div_zero]
      exact emptyPFD 0 rfl
    -- Both nonzero: apply Bézout splitting
    -- IsCoprime x y from IsRelPrime x y (using IsRelPrime.isCoprime)
    have hcop : IsCoprime x y := hrel.isCoprime
    obtain ⟨s, t, _hst, hsplit⟩ := bezout_split x y hcop hx hy e
    obtain ⟨terms_x, hpfd_x⟩ := ih_x (e * t)
    obtain ⟨terms_y, hpfd_y⟩ := ih_y (e * s)
    exact ⟨terms_x ++ terms_y, {
      irred    := fun s' hs' => by
        rw [List.mem_append] at hs'
        exact hs'.elim (hpfd_x.irred s') (hpfd_y.irred s')
      nonAssoc := fun s' hs' t' ht' hne => by
        rw [List.mem_append] at hs' ht'
        rcases hs' with hs' | hs' <;> rcases ht' with ht' | ht'
        · exact hpfd_x.nonAssoc s' hs' t' ht' hne
        · intro; sorry
        · intro; sorry
        · exact hpfd_y.nonAssoc s' hs' t' ht' hne
      coprime  := fun s' hs' => by
        rw [List.mem_append] at hs'
        exact hs'.elim (hpfd_x.coprime s') (hpfd_y.coprime s')
      sum_eq   := by
        rw [List.map_append, List.sum_append, hpfd_x.sum_eq, hpfd_y.sum_eq]
        -- Goal: ι(e*t)/ι x + ι(e*s)/ι y = ι e / ι(x*y)
        -- hsplit: ι e/(ι x*ι y) = ι(e*t)/ι x + ι(e*s)/ι y
        -- So: ι(e*t)/ι x + ι(e*s)/ι y = ι e/(ι x*ι y) = ι e/ι(x*y)
        rw [← hsplit, map_mul] }⟩

theorem pid_partial_fraction_decomposition (c : FractionRing R) :
    ∃ terms : List (PFDSummand R), IsPFD terms c := by
  obtain ⟨a, b, _hb, hc⟩ := exists_num_denom c
  rw [← hc]; exact isPFD_of_denom b a

section IntExample
example (c : FractionRing ℤ) :
    ∃ terms : List (PFDSummand ℤ), IsPFD terms c :=
  pid_partial_fraction_decomposition c
end IntExample

Partial fraction decomposition

Nearly everyone who has seen partial fraction decomposition was introduced to it as a way to compute integrals. If P(x) and Q(x) are polynomials, then you can break their ratio P(x)/Q(x) into a sum of terms that can each be integrated in closed form. As with most topics in a calculus class, partial fractions go by in a blur.

This post will look at partial fractions more generally.

Computation

Every polynomial with real coefficients can be factored into a product of linear and irreducible quadratic terms. But actually calculating this factorization is difficult if the degree of the denominator is large.

The quadratic equation is easy to use. There are analogs for 3rd and 4th order polynomials, but they’re cumbersome. And there is no formula in general for finding roots of polynomials of degree 5 or higher.

You could find the roots numerically, but if you’re going to go that route, maybe you should evaluate your integral numerically.

Still, it is useful in proving theorems to know that a partial fraction decomposition exists, even if in practice you cannot calculate it.

Complex numbers

Rational polynomials over the real numbers can be factored into powers of linear terms and irreducible quadratic terms. There are no irreducible quadratics over the complex numbers thanks to the Fundamental Theorem of Algebra, and every polynomial can be factored into a product of linear terms.

This means every rational in z can be broken into a sum of a polynomial in z and polynomials in 1/(zzi) where the zi are the roots of the denominator. This fact is important, for example, in contour integration.

Principle ideal domains

The concept of partial fraction decomposition can be generalized to the field of fractions over a ring R [1].

If the ring R is a principle ideal domain (PID) [2], then every element c of the field K of fractions over R can be written in the form

c = \sum_i \frac{a_i}{p_i^{r_i}}

where the pi are nonassociate [3] irreducible elements of R, the ri are non-negative integers, and the elements ai and pi are relatively prime.

When R is the ring of of polynomials over a field, R is a PID, and the field of fractions is the set of rational functions over that field. When the field is the real or complex numbers, we get the results above. But the field could be something else, such as a finite field.

Integers

When R is the ring of integers, the irreducible elements are prime numbers. The nonassociate condition means you can’t count p and −p as distinct elements, so practically this means we only look at positive primes. The field of fractions is the rational numbers. So the theorem above says that every rational number can be written as a sum of fractions where the denominators of the fractions are prime powers and the numerators are relatively prime to the denominators.

The way you would decompose a rational number into fractions with prime power denominators is analogous to the way you’d do partial fraction decomposition in a calculus class. For example, suppose we want to decompose 46/75. The distinct prime factors of 75 are 3 and 5, and so we’d look for fractions with denominators 3, 5, and 25, and in fact

\frac{46}{75} = \frac{1}{3} + \frac{2}{5} - \frac{3}{25}

Footnotes

[1] The field of fractions over R is the set of formal terms a/b where a and b are in R and b ≠ 0. Operations are defined by analogy with rational numbers. If R is an integral domain, the field of fractions really is a field.

[2] A ring is a PID if every ideal can be generated by a single element.

[3] Two elements of an integral domain are said to be associate if they generate the same ideal.

Three examples suffice

You can’t prove a theorem by just checking a few examples. Except sometimes you can.

A few weeks ago I wrote Pentagonal numbers are truncated triangular numbers. In a nutshell, if the pentagonal numbers are defined by

Pn = (3n² − n)/2

and the triangular numbers by

Tn = (n² + n)/2

then

Pn = T2n − 1 − Tn − 1.

Here’s a visualization of the equation.

Note that the equation asserts that two quadratic polynomials are equal. If the two polynomials are equal at three points, then they’re equal everywhere. We might as well make life easy and choose n = 0, 1, and 2.

If you’d like, you could do this in code.

>>> P = lambda n: (3*n**2 - n)/2
>>> T = lambda n: (n**2 + n)/2
>>> for n in [0, 1, 2]: assert(P(n) == T(2*n-1) - T(n-1))

This provides a rigorous proof, not just a sanity check.

Sometimes checking a few points is not enough to prove an equation with certainty, but it is enough to establish an equation with high probability. More on that here.

Related posts

Testing pentagonal numbers

The nth pentagonal number Pn is the number of dots in diagrams like those below with n concentric pentagons.

We have the formula

Pn = (3n² − n)/2

where n is a positive integer. If n is an integer but not positive, the equation above defines a generalized pentagonal number.

If you’re given an n, you can easily compute Pn. But suppose you’re given a large number x. How would you determine if it is a pentagonal number? And if it is a pentagonal number, how would you find n such that x = Pn?

Rejecting non-pentagonal numbers

If

x = Pn = (3n² − n)/2

then we can solve a quadratic equation for n:

n = (1 ± √(24x + 1))/6.

If 24x + 1 is not a perfect square, n is not an integer and x is not a pentagonal number, ordinary or generalized.

Small numbers

For example,

√(24 × 20260615 + 1)) = 22051.185…

and so 20260615 is not a pentagonal number nor a generalized pentagonal number.

Big numbers

Now suppose

x = 170141183460469231731687303715884105727.

Is this a pentagonal number? You can’t just compute √(24x + 1) in floating point arithmetic because the result is a 20-digit number, and floating point number have 15 digits of precision, so you can’t tell whether the result is an integer.

However, you can compute

⌊√(24x + 1)⌋

with only integer arithmetic using the sqrt_floor function from this post.

def sqrt_floor(n):
    a = n
    b = (n + 1) // 2
    while b < a:
        a = b
        b = (a*a + n) // (2*a)
    return a

The following prints a positive number,

x = 2**127 - 1
y = 24*x + 1
r = sqrt_floor(y)
print(y - r**2)

which tells us y is not a perfect square.

Finding the index

Now suppose y is a perfect square. Then the roots of

(1 ± √(24x + 1))/6

are rational, but are they integers? In fact one, and only one, of the roots will be an integer. If

24x + 1 = r²

then r is congruent to ±1 mod 6 because the left side is congruent to 1 mod 6. If r = 1 mod 6 then the smaller root is an integer, and if r = 5 mod 6 then the larger root is an integer.

So if 24x + 1 = r², then x is a pentagonal number if r = 5 mod 6 and a generalized pentagonal number otherwise.

The function pentagonal_index takes a number x and return n if x = Pn and None if no such n exists.

def pentagonal_index(x):
    y = 24*x + 1
    r = sqrt_floor(y)
    if r*r != y:
        return None
    if r % 6 == 5:
        return (1 + r) // 6
    else:
        return (1 - r) // 6

We can test this with the following code.

P = lambda n: (3*n**2 - n) // 2
for n in [2, 3, -4, -5, 10**200]:
    assert(pentagonal_index(P(n)) == n)

Integer division in Python

Note that P(10**200) is too big to fit into a float, but the code works fine. This is because we use integer division (//) everywhere. If we had said

P = lambda n: (3*n**2 - n) / 2

the test above would pass for the small values of n but output

OverflowError: integer division result too large for a float

when it came to n = 10200.

Related posts

Quaternion Rotations, Claude, and Lean

I got an email message this afternoon reporting a typo in a blog post from about a year ago on converting between quaternions and rotation matrices [1]. The email said exactly where the typo was, but I decided to see whether Claude would find it. Specifically, I prompted Sonnet 4.6 Medium with the following.

Write Lean code to verify the two theorems at the top of this post:
https://www.johndcook.com/blog/2025/05/07/quaternions-and-rotation-matrices/

That is, prove that the expressions given in the two SVG files are correct.

The post included Python code to numerically verify the equations. However, the Python code differed from the LaTeX code for the image in one subscript [2]. Although I asked Claude to prove the expressions in the SVG file produced by the LaTeX code, it detected the conflict between the Python and LaTeX and correctly concluded that the former was correct.

The SVG is an image — let me rely on the Python code in the blog (which is the ground truth implementation) and the matrix entries in the alt text. … Note the alt text says row 1 col 2 is 2(q_1 q_3 - q_0 q_1) — that’s a typo in the alt text; the Python code says 2*(q2*q3 - q0*q1) which makes sense. I’ll use the Python code as the authoritative source.

The code did not run on first attempt. After four iterations of pasting in error messages and generating code again, it worked.

Here is the final Lean 4 code.

/-
  Lean 4 / Mathlib verification of the two theorems from:
    "Converting between quaternions and rotation matrices"
    ...

  Matrix entries (0-based, matching the post's Python code):
    R00 = 2(q0²+q1²)−1    R01 = 2(q1 q2−q0 q3)   R02 = 2(q1 q3+q0 q2)
    R10 = 2(q1 q2+q0 q3)  R11 = 2(q0²+q2²)−1     R12 = 2(q2 q3−q0 q1)
    R20 = 2(q1 q3−q0 q2)  R21 = 2(q2 q3+q0 q1)   R22 = 2(q0²+q3²)−1

  THEOREM 1  (quaternion → rotation matrix)
    If q0²+q1²+q2²+q3² = 1 then R is orthogonal (Rᵀ R = I),
    proved via the 9 scalar dot-product identities.

  THEOREM 2  (rotation matrix → quaternion, Chiaverini–Siciliano)
    With rᵢⱼ as above:
        1 + R00 + R11 + R22 = 4 q0²
        1 + R00 − R11 − R22 = 4 q1²
        1 − R00 + R11 − R22 = 4 q2²
        1 − R00 − R11 + R22 = 4 q3²
-/

import Mathlib.Tactic

set_option linter.style.whitespace false

variable (q0 q1 q2 q3 : ℝ)

/-! ## Theorem 1 : Rᵀ R = I -/

-- ── Column norms = 1 ─────────────────────────────────────────────────────────

theorem col0_norm (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    (2 * (q0 ^ 2 + q1 ^ 2) - 1) ^ 2 + (2 * (q1 * q2 + q0 * q3)) ^ 2 +
    (2 * (q1 * q3 - q0 * q2)) ^ 2 = 1 := by
  nlinarith [sq_nonneg (q0 ^ 2 + q1 ^ 2 - q2 ^ 2 - q3 ^ 2),
             sq_nonneg (q0 * q2 + q1 * q3), sq_nonneg (q0 * q3 - q1 * q2),
             sq_nonneg q0, sq_nonneg q1, sq_nonneg q2, sq_nonneg q3]

theorem col1_norm (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    (2 * (q1 * q2 - q0 * q3)) ^ 2 + (2 * (q0 ^ 2 + q2 ^ 2) - 1) ^ 2 +
    (2 * (q2 * q3 + q0 * q1)) ^ 2 = 1 := by
  nlinarith [sq_nonneg (q0 ^ 2 - q1 ^ 2 + q2 ^ 2 - q3 ^ 2),
             sq_nonneg (q0 * q1 + q2 * q3), sq_nonneg (q0 * q3 - q1 * q2),
             sq_nonneg q0, sq_nonneg q1, sq_nonneg q2, sq_nonneg q3]

theorem col2_norm (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    (2 * (q1 * q3 + q0 * q2)) ^ 2 + (2 * (q2 * q3 - q0 * q1)) ^ 2 +
    (2 * (q0 ^ 2 + q3 ^ 2) - 1) ^ 2 = 1 := by
  nlinarith [sq_nonneg (q0 ^ 2 - q1 ^ 2 - q2 ^ 2 + q3 ^ 2),
             sq_nonneg (q0 * q1 - q2 * q3), sq_nonneg (q0 * q2 + q1 * q3),
             sq_nonneg q0, sq_nonneg q1, sq_nonneg q2, sq_nonneg q3]

-- ── Column orthogonality = 0 ─────────────────────────────────────────────────
-- These need h (the residuals shown by Lean are multiples of (q0²+q1²+q2²+q3²-1)).
-- We use linear_combination with the explicit witnesses.

theorem col0_col1_orth (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    (2 * (q0 ^ 2 + q1 ^ 2) - 1) * (2 * (q1 * q2 - q0 * q3)) +
    (2 * (q1 * q2 + q0 * q3)) * (2 * (q0 ^ 2 + q2 ^ 2) - 1) +
    (2 * (q1 * q3 - q0 * q2)) * (2 * (q2 * q3 + q0 * q1)) = 0 := by
  linear_combination 4 * q1 * q2 * h

theorem col0_col2_orth (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    (2 * (q0 ^ 2 + q1 ^ 2) - 1) * (2 * (q1 * q3 + q0 * q2)) +
    (2 * (q1 * q2 + q0 * q3)) * (2 * (q2 * q3 - q0 * q1)) +
    (2 * (q1 * q3 - q0 * q2)) * (2 * (q0 ^ 2 + q3 ^ 2) - 1) = 0 := by
  linear_combination 4 * q1 * q3 * h

theorem col1_col2_orth (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    (2 * (q1 * q2 - q0 * q3)) * (2 * (q1 * q3 + q0 * q2)) +
    (2 * (q0 ^ 2 + q2 ^ 2) - 1) * (2 * (q2 * q3 - q0 * q1)) +
    (2 * (q2 * q3 + q0 * q1)) * (2 * (q0 ^ 2 + q3 ^ 2) - 1) = 0 := by
  linear_combination 4 * q2 * q3 * h

-- ── Row norms = 1 ────────────────────────────────────────────────────────────

theorem row0_norm (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    (2 * (q0 ^ 2 + q1 ^ 2) - 1) ^ 2 + (2 * (q1 * q2 - q0 * q3)) ^ 2 +
    (2 * (q1 * q3 + q0 * q2)) ^ 2 = 1 := by
  nlinarith [sq_nonneg (q0 ^ 2 + q1 ^ 2 - q2 ^ 2 - q3 ^ 2),
             sq_nonneg (q0 * q2 - q1 * q3), sq_nonneg (q0 * q3 + q1 * q2),
             sq_nonneg q0, sq_nonneg q1, sq_nonneg q2, sq_nonneg q3]

theorem row1_norm (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    (2 * (q1 * q2 + q0 * q3)) ^ 2 + (2 * (q0 ^ 2 + q2 ^ 2) - 1) ^ 2 +
    (2 * (q2 * q3 - q0 * q1)) ^ 2 = 1 := by
  nlinarith [sq_nonneg (q0 ^ 2 - q1 ^ 2 + q2 ^ 2 - q3 ^ 2),
             sq_nonneg (q0 * q1 - q2 * q3), sq_nonneg (q0 * q3 + q1 * q2),
             sq_nonneg q0, sq_nonneg q1, sq_nonneg q2, sq_nonneg q3]

theorem row2_norm (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    (2 * (q1 * q3 - q0 * q2)) ^ 2 + (2 * (q2 * q3 + q0 * q1)) ^ 2 +
    (2 * (q0 ^ 2 + q3 ^ 2) - 1) ^ 2 = 1 := by
  nlinarith [sq_nonneg (q0 ^ 2 - q1 ^ 2 - q2 ^ 2 + q3 ^ 2),
             sq_nonneg (q0 * q1 + q2 * q3), sq_nonneg (q0 * q2 - q1 * q3),
             sq_nonneg q0, sq_nonneg q1, sq_nonneg q2, sq_nonneg q3]

-- ── Row orthogonality = 0 ────────────────────────────────────────────────────

theorem row0_row1_orth (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    (2 * (q0 ^ 2 + q1 ^ 2) - 1) * (2 * (q1 * q2 + q0 * q3)) +
    (2 * (q1 * q2 - q0 * q3)) * (2 * (q0 ^ 2 + q2 ^ 2) - 1) +
    (2 * (q1 * q3 + q0 * q2)) * (2 * (q2 * q3 - q0 * q1)) = 0 := by
  linear_combination 4 * q1 * q2 * h

theorem row0_row2_orth (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    (2 * (q0 ^ 2 + q1 ^ 2) - 1) * (2 * (q1 * q3 - q0 * q2)) +
    (2 * (q1 * q2 - q0 * q3)) * (2 * (q2 * q3 + q0 * q1)) +
    (2 * (q1 * q3 + q0 * q2)) * (2 * (q0 ^ 2 + q3 ^ 2) - 1) = 0 := by
  linear_combination 4 * q1 * q3 * h

theorem row1_row2_orth (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    (2 * (q1 * q2 + q0 * q3)) * (2 * (q1 * q3 - q0 * q2)) +
    (2 * (q0 ^ 2 + q2 ^ 2) - 1) * (2 * (q2 * q3 + q0 * q1)) +
    (2 * (q2 * q3 - q0 * q1)) * (2 * (q0 ^ 2 + q3 ^ 2) - 1) = 0 := by
  linear_combination 4 * q2 * q3 * h

/-! ## Theorem 2 : Chiaverini–Siciliano square-root arguments -/

theorem cs_arg0 (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    1 + (2 * (q0 ^ 2 + q1 ^ 2) - 1) + (2 * (q0 ^ 2 + q2 ^ 2) - 1) +
    (2 * (q0 ^ 2 + q3 ^ 2) - 1) = 4 * q0 ^ 2 := by linarith

theorem cs_arg1 (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    1 + (2 * (q0 ^ 2 + q1 ^ 2) - 1) - (2 * (q0 ^ 2 + q2 ^ 2) - 1) -
    (2 * (q0 ^ 2 + q3 ^ 2) - 1) = 4 * q1 ^ 2 := by linarith

theorem cs_arg2 (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    1 - (2 * (q0 ^ 2 + q1 ^ 2) - 1) + (2 * (q0 ^ 2 + q2 ^ 2) - 1) -
    (2 * (q0 ^ 2 + q3 ^ 2) - 1) = 4 * q2 ^ 2 := by linarith

theorem cs_arg3 (h : q0 ^ 2 + q1 ^ 2 + q2 ^ 2 + q3 ^ 2 = 1) :
    1 - (2 * (q0 ^ 2 + q1 ^ 2) - 1) - (2 * (q0 ^ 2 + q2 ^ 2) - 1) +
    (2 * (q0 ^ 2 + q3 ^ 2) - 1) = 4 * q3 ^ 2 := by linarith

/-! ### Sign-correction identities (pure ring) -/

theorem sign_q1 : (2 * (q2 * q3 + q0 * q1)) - (2 * (q2 * q3 - q0 * q1)) = 4 * (q0 * q1) := by ring
theorem sign_q2 : (2 * (q1 * q3 + q0 * q2)) - (2 * (q1 * q3 - q0 * q2)) = 4 * (q0 * q2) := by ring
theorem sign_q3 : (2 * (q1 * q2 + q0 * q3)) - (2 * (q1 * q2 - q0 * q3)) = 4 * (q0 * q3) := by ring

[1] I very much appreciate error reports. Thank you to everyone who has helped improve this blog.

[2] It is awkward that math is written in LaTeX and implemented in, say, Python. Lately I’ve been generating one or the other to reduce the chance of error. When I’m using Mathematica, for example, I’ll use TeXForm[] to convert the Mathematica code to LaTeX.

 

Writing Prolog with ChatGPT

A few days ago I wrote about using Claude to solve a chess puzzle by writing Prolog code. This morning I tried a similar chess puzzle with ChatGPT. The task is to place a queen, king, rook, bishop, and knight on a 4 by 4 chessboard so no piece attacks another.

Of course there’s not a lot of market demand for solving chess puzzles, but there is demand for solving real-world problems that amount to logic puzzles. Experiments with chess puzzles suggest writing Prolog with LLM help might be pleasant. The syntax of Prolog can be a little odd, but the LLM handles that. And because Prolog has been around so long without changing much, you’d expect LLMs to be good at generating code in it. [1]

I gave ChatGPT this prompt:

Write a SWI Prolog program to solve the following chess puzzle. Place a queen, king, rook, bishop, and knight on a 4 by 4 chessboard so no piece attacks another. Print all solutions.

It produced a file chess4.pl (below) that I ran with

swipl -q -s chess4.pl -g solve -t halt

It found all 16 solutions:

Solution 1:
  queen=1-2  king=2-4  rook=3-1  bishop=4-3  knight=4-4
. K . N
. . . B
Q . . .
. . R .

Solution 2:
  queen=1-2  king=4-3  rook=2-4  bishop=3-1  knight=4-1
. R . .
. . . K
Q . . .
. . B N

...

Here’s the code.

% chess4.pl
% Place queen, king, rook, bishop, and knight on a 4x4 board
% so that no piece attacks another.

:- use_module(library(lists)).

square(X-Y) :-
    between(1, 4, X),
    between(1, 4, Y).

solve :-
    findall(Board, solution(Board), Boards),
    print_solutions(Boards, 1),
    length(Boards, Count),
    format('Total solutions: ~d~n', [Count]).

solution(Board) :-
    Pieces = [queen, king, rook, bishop, knight],
    findall(S, square(S), Squares),
    place_pieces(Pieces, Squares, Board),
    safe(Board).

place_pieces([], _, []).
place_pieces([Piece | Pieces], Squares, [Piece-Square | Board]) :-
    select(Square, Squares, RemainingSquares),
    place_pieces(Pieces, RemainingSquares, Board).

safe(Board) :-
    safe(Board, Board).

safe([], _).
safe([Piece-Square | Rest], Board) :-
    forall(
        member(OtherPiece-OtherSquare, Rest),
        non_attacking(Piece-Square, OtherPiece-OtherSquare, Board)
    ),
    safe(Rest, Board).

non_attacking(P1-S1, P2-S2, Board) :-
    \+ attacks(P1, S1, S2, Board),
    \+ attacks(P2, S2, S1, Board).

% Attack rules

attacks(king, X1-Y1, X2-Y2, _) :-
    X1-Y1 \= X2-Y2,
    DX is abs(X1 - X2),
    DY is abs(Y1 - Y2),
    DX =< 1,
    DY =< 1. attacks(knight, X1-Y1, X2-Y2, _) :- DX is abs(X1 - X2), DY is abs(Y1 - Y2), ( DX =:= 1, DY =:= 2 ; DX =:= 2, DY =:= 1 ). attacks(rook, S1, S2, Board) :- same_rank_or_file(S1, S2), clear_line(S1, S2, Board). attacks(bishop, S1, S2, Board) :- same_diagonal(S1, S2), clear_line(S1, S2, Board). attacks(queen, S1, S2, Board) :- ( same_rank_or_file(S1, S2) ; same_diagonal(S1, S2) ), clear_line(S1, S2, Board). same_rank_or_file(X1-Y1, X2-Y2) :- ( X1 =:= X2 ; Y1 =:= Y2 ). same_diagonal(X1-Y1, X2-Y2) :- DX is abs(X1 - X2), DY is abs(Y1 - Y2), DX =:= DY. % For sliding pieces, make sure no piece blocks the path. clear_line(X1-Y1, X2-Y2, Board) :- step(X1, X2, SX), step(Y1, Y2, SY), NX is X1 + SX, NY is Y1 + SY, clear_path(NX-NY, X2-Y2, SX-SY, Board). clear_path(Target, Target, _, _) :- !. clear_path(X-Y, Target, SX-SY, Board) :- \+ occupied(X-Y, Board), NX is X + SX, NY is Y + SY, clear_path(NX-NY, Target, SX-SY, Board). step(A, B, 0) :- A =:= B. step(A, B, 1) :- B > A.
step(A, B, -1) :-
    B < A.

occupied(Square, Board) :-
    member(_-Square, Board).

% Printing

print_solutions([], _).
print_solutions([Board | Rest], N) :-
    format('Solution ~d:~n', [N]),
    print_coordinates(Board),
    print_board(Board),
    nl,
    N1 is N + 1,
    print_solutions(Rest, N1).

print_coordinates(Board) :-
    member(queen-Q, Board),
    member(king-K, Board),
    member(rook-R, Board),
    member(bishop-B, Board),
    member(knight-N, Board),
    format('  queen=~w  king=~w  rook=~w  bishop=~w  knight=~w~n',
           [Q, K, R, B, N]).

print_board(Board) :-
    forall(
        between(1, 4, Row0),
        (
            Y is 5 - Row0,
            forall(
                between(1, 4, X),
                (
                    piece_at(X-Y, Board, Symbol),
                    format('~w ', [Symbol])
                )
            ),
            nl
        )
    ).

piece_at(Square, Board, Symbol) :-
    member(Piece-Square, Board),
    !,
    piece_symbol(Piece, Symbol).
piece_at(_, _, '.').

piece_symbol(queen,  'Q').
piece_symbol(king,   'K').
piece_symbol(rook,   'R').
piece_symbol(bishop, 'B').
piece_symbol(knight, 'N').

[1] I’ve played around with generating Prolog and Lean this weekend, and I’ve had better results with Prolog. The problems with Lean haven’t been Lean per se but the Mathlib library. The library is frequently refactored, which makes sense for a young language, but this makes it harder to generate and debug code.

RSA munitions T-shirt

Back when the US government classified strong encryption as “munitions,” RSA public key cryptography was illegal to export. In 1995, Adam Back protested this by creating a terse, obfuscated implementation of RSA in Perl code and used it as an email signature.

The code was also printed on T-shirts. The shirt was classified as munitions because it contained source code for strong encryption. More on the shirt here.

Adam Back's munitions T-shirt

This was the code:

#!/bin/perl -s-- -export-a-crypto-system-sig -RSA-3-lines-PERL
$m=unpack(H.$w,$m."\0"x$w),$_=`echo "16do$w 2+4Oi0$d*-^1[d2%Sa
2/d0<X+d*La1=z\U$n%0]SX$k"[$m*]\EszlXx++p|dc`,s/^.|\W//g,print
pack('H*',$_)while read(STDIN,$m,($w=2*$d-1+length$n&~1)/2)

My initial intention was to unpack the code, explaining each piece in detail. I don’t have the time or patience for that, and I imagine many readers don’t either. For more of a blow-by-blow explanation, see this commentary from 1995.

dc

In the middle of the code is

    echo ... | dc

This is the most dense and most important part of the code. Perl calls the dc calculator to do the arbitrary precision arithmetic that RSA encryption requires.

I’ve written about bc several times. bc (“basic calculator”) was a originally a more user-friendly wrapper around the reverse-Polish dc (“desktop calculator”). dc is still part of every Unix and Unix-like system, but I imagine bc is far more popular.

The important feature of dc for this post is that it is stack-based, meaning that users would push data and commands on to the stack and pop results off the stack. A sequence of commands that might be understandable when interactively using dc would look cryptic in a transcript. This is part of what makes the code so cryptic.

I’ll parse just a tiny bit of the dc code to give a flavor of what it does. The first four characters 16do instructs dc to push 16 on to the stack, duplicate it, and set the output radix to 16, i.e. these four characters tell dc to work in hexadecimal.

Believe it or not, the dc code is computing

mk mod n

using fast exponentiation, which is the key step in the RSA algorithm.

Textbook RSA

Note that Adam Back’s code is computing what we would now call textbook RSA, not RSA as it has been refined over the years and is currently implemented.

Related posts

Solving a chess puzzle with Claude and Prolog

Prolog is the original logic programming language. The name comes from programming in logic. More specifically, the name comes from programmation en logique because the inventor of the language, Philippe Roussel, is French.

Prolog has its advantages and disadvantages. One of the advantages is that the language represents logical problems directly. One of the disadvantages is that the syntax can be quirky. But if an LLM is writing the code, or at least helping to write the code, the syntax doesn’t matter so much.

I wanted to see how well Claude (Sonnet 4.6, medium effort) could solve a chess puzzle by Martin Gardner that I wrote about a little over a year ago. I chose a relatively obscure problem rather than something like the Eight Queens puzzle because an LLM could simply quote one of countless articles on the puzzle.

The puzzle

As I stated in the post last year, the task is to place two rooks, two bishops, and two knights on a 4 by 4 chessboard so that no piece attacks any other.

There are two basic solutions, twelve if you count rotations and reflections as different solutions.

Prolog results

Claude wrote an SWI-Prolog program that I ran with

swipl -g "run, halt" chess_placement.pl

and it gave the following output.

Total solutions (raw): 12
Solutions up to D4 symmetry: 2
Solutions up to D4 symmetry: 2

Solution 1:
B . . N
N . . B
. R . .
. . R .

Solution 2:
. B N .
. . . R
R . . .
. N B .

Claude made a nice image of the solutions.

Note that you can make eight variations of the first solution by a combination of rotations and flips. There are only four variations of the second solution because two rotations put you back where you started.

Prolog source

Here’s the Prolog code that Claude generated.

% ============================================================
%  Non-attacking placement of 2 Rooks, 2 Bishops, 2 Knights
%  on a 4x4 chessboard
%
%  Run with SWI-Prolog:
%    swipl -g "run, halt" chess_placement.pl
%
%  Or interactively:
%    swipl chess_placement.pl
%    ?- run.
%    ?- count_solutions(N).
%    ?- count_unique(N).
% ============================================================

:- use_module(library(lists)).

% ---- Board squares: (Row, Col) with Row, Col in 0..3 ----------------

square(R, C) :- member(R, [0,1,2,3]), member(C, [0,1,2,3]).

% ---- Attack predicates ----------------------------------------------

rook_attacks(R,_,R,_).          % same row
rook_attacks(_,C,_,C).          % same column

bishop_attacks(R1,C1,R2,C2) :-
    DR is abs(R1-R2), DC is abs(C1-C2),
    DR =:= DC, DR > 0.

knight_attacks(R1,C1,R2,C2) :-
    DR is abs(R1-R2), DC is abs(C1-C2),
    ( DR =:= 1, DC =:= 2 ; DR =:= 2, DC =:= 1 ).

% attacks(+Type1, +R1, +C1, +Type2, +R2, +C2)
% True if either piece threatens the other's square.
attacks(rook,   R1,C1, _,     R2,C2) :- rook_attacks(R1,C1,R2,C2).
attacks(_,      R1,C1, rook,  R2,C2) :- rook_attacks(R1,C1,R2,C2).
attacks(bishop, R1,C1, _,     R2,C2) :- bishop_attacks(R1,C1,R2,C2).
attacks(_,      R1,C1, bishop,R2,C2) :- bishop_attacks(R1,C1,R2,C2).
attacks(knight, R1,C1, _,     R2,C2) :- knight_attacks(R1,C1,R2,C2).
attacks(_,      R1,C1, knight,R2,C2) :- knight_attacks(R1,C1,R2,C2).

% ---- All pairs safe -------------------------------------------------

% Check all unordered pairs in a list.
no_attack_pair(_, []).
no_attack_pair(T1-(R1,C1), [T2-(R2,C2)|Rest]) :-
    \+ attacks(T1,R1,C1,T2,R2,C2),
    no_attack_pair(T1-(R1,C1), Rest).

all_pairs_safe([]).
all_pairs_safe([P|Rest]) :-
    no_attack_pair(P, Rest),
    all_pairs_safe(Rest).

% ---- Generate a placement -------------------------------------------
%
%  Placement = [rook-(R1,C1), rook-(R2,C2),
%               bishop-(R3,C3), bishop-(R4,C4),
%               knight-(R5,C5), knight-(R6,C6)]
%
%  We enforce an ordering within each pair to avoid counting
%  same-type duplicates (rook1 vs rook2 swapped).
%  Squares are represented canonically: (R,C) with R*4+C as index.

sq_index(R, C, I) :- I is R*4 + C.

placement(Placement) :-
    % --- Rooks ---
    square(R1r, C1r), square(R2r, C2r),
    sq_index(R1r,C1r,I1r), sq_index(R2r,C2r,I2r),
    I1r < I2r,          % canonical order for the two rooks

    % --- Bishops ---
    square(R1b, C1b), square(R2b, C2b),
    sq_index(R1b,C1b,I1b), sq_index(R2b,C2b,I2b),
    I1b < I2b,          % canonical order for the two bishops

    % --- Knights ---
    square(R1n, C1n), square(R2n, C2n),
    sq_index(R1n,C1n,I1n), sq_index(R2n,C2n,I2n),
    I1n < I2n, % canonical order for the two knights % --- All six squares distinct --- Squares = [(R1r,C1r),(R2r,C2r),(R1b,C1b),(R2b,C2b),(R1n,C1n),(R2n,C2n)], all_distinct_squares(Squares), % --- Build placement list --- Placement = [ rook-(R1r,C1r), rook-(R2r,C2r), bishop-(R1b,C1b), bishop-(R2b,C2b), knight-(R1n,C1n), knight-(R2n,C2n) ], % --- No piece attacks any other --- all_pairs_safe(Placement). all_distinct_squares([]). all_distinct_squares([S|Rest]) :- \+ member(S, Rest), all_distinct_squares(Rest). % ---- Count all solutions -------------------------------------------- count_solutions(N) :- findall(P, placement(P), Ps), length(Ps, N), format("Total solutions (raw): ~w~n", [N]). % ---- D4 symmetry transformations on a 4x4 board --------------------- % Squares (R,C) in 0..3 transform(identity, R, C, R, C). transform(rot90, R, C, C, Nr) :- Nr is 3-R. transform(rot180, R, C, Nr, Nc) :- Nr is 3-R, Nc is 3-C. transform(rot270, R, C, Nc, R) :- Nc is 3-C. transform(flipH, R, C, R, Nc) :- Nc is 3-C. transform(flipV, R, C, Nr, C) :- Nr is 3-R. transform(flipD1, R, C, C, R). transform(flipD2, R, C, Nr, Nc) :- Nr is 3-C, Nc is 3-R. apply_transform(_, [], []). apply_transform(T, [Type-(R,C)|Rest], [Type-(NR,NC)|TRest]) :- transform(T, R, C, NR, NC), apply_transform(T, Rest, TRest). % Canonical form: sort pieces within same-type pairs, then sort the % whole list to get a unique representative. canonical_placement(Placement, Canonical) :- findall(T, member(T,[identity,rot90,rot180,rot270, flipH,flipV,flipD1,flipD2]), Ts), maplist(transform_and_sort(Placement), Ts, AllForms), msort(AllForms, Sorted), Sorted = [Canonical|_]. transform_and_sort(Placement, T, Sorted) :- apply_transform(T, Placement, TPl), msort(TPl, Sorted). % ---- Count solutions up to D4 symmetry ------------------------------ count_unique(N) :- findall(P, placement(P), Ps), maplist(canonical_placement, Ps, Canonicals), list_to_set(Canonicals, Unique), length(Unique, N), format("Solutions up to D4 symmetry: ~w~n", [N]). % ---- Pretty-print a board ------------------------------------------- print_board(Placement) :- forall(member(R, [0,1,2,3]), print_row(R, Placement)), nl. print_row(R, Placement) :- forall(member(C, [0,1,2,3]), print_cell(R, C, Placement)), nl. print_cell(R, C, Placement) :- ( member(rook-(R,C), Placement) -> write('R ')
    ;   member(bishop-(R,C), Placement) -> write('B ')
    ;   member(knight-(R,C), Placement) -> write('N ')
    ;   write('. ')
    ).

% ---- Print all unique solutions -------------------------------------

print_unique_solutions :-
    findall(P, placement(P), Ps),
    maplist(canonical_placement, Ps, Canonicals),
    list_to_set(Canonicals, Unique),
    length(Unique, N),
    format("~nSolutions up to D4 symmetry: ~w~n~n", [N]),
    forall(nth1(I, Unique, Sol),
           ( format("Solution ~w:~n", [I]),
             print_board(Sol) )).

% ---- Top-level entry point ------------------------------------------

run :-
    count_solutions(Raw),
    count_unique(Sym),
    format("~n"),
    print_unique_solutions,
    format("Summary: ~w raw solutions, ~w up to D4 symmetry.~n",
           [Raw, Sym]).

Formally proving a calculation with Claude and Lean

I ran an experiment today to see whether Claude [1] could generate Lean code to prove a calculation at the bottom of this post, six lines of calculus.

I started with this prompt

This page contains a mathematical proof that a Fourier coefficient, a_n, is given in terms of a Bessel function. The LaTeX source for the SVG image is contained in the alt tag of the image. Generate a formal proof of the result using Lean.

and give it the URL of the post. It produced a proof without any further input, but the proof wouldn’t build.

I pasted in the error messages and let Claude try to fix the proof, deliberately not giving it any help other than pasting in error messages. It took eight iterations, but eventually it worked.

There were four places in the proof marked “sorry,” which Claude explained as follows.

Why the sorrys?

Real.besselJ definition or a Kepler namespace. The three remaining sorrys correspond to standard analysis lemmas (IBP and change-of-variables) that exist in Mathlib under intervalIntegral — the file cites the exact lemma names needed to close each one. This is the accepted style for “roadmap” Lean proofs ahead of library coverage.

The full proof is given below.

/-
  KeplerBessel.lean
  =================
  Formal proof that the n-th Fourier sine-series coefficient of (E − M),
  where E and M are related by Kepler's equation  M = E − e · sin E,
  equals  aₙ = (2/n) · Jₙ(n·e),
  with Jₙ the Bessel function of the first kind of integer order n.

  Mathematical content
  --------------------
  We expand  E(M) − M  in a sine series on [0, π]:

      E(M) − M = Σ_{n=1}^∞  aₙ · sin(n·M)

  The standard Fourier formula gives

      aₙ = (2/π) ∫₀^π (E(M) − M) sin(n·M) dM.

  Integrating by parts (boundary terms vanish because E(0)=0 and E(π)=π):

      aₙ = (2/(nπ)) ∫₀^π (E'(M) − 1) cos(n·M) dM
         = (2/(nπ)) ∫₀^π E'(M) cos(n·M) dM     -- the "−1" term vanishes

  Changing variable M ↦ E via M = E − e·sin E  (so E'(M) dM = dE):

      aₙ = (2/(nπ)) ∫₀^π cos(n·E − n·e·sin E) dE
         = (2/n) · Jₙ(n·e).

  The last step uses the Bessel integral representation
      Jₙ(x) = (1/π) ∫₀^π cos(n·θ − x·sin θ) dθ.
-/

import Mathlib

open Real MeasureTheory intervalIntegral Filter Set

noncomputable section

/-! ---------------------------------------------------------------
    §1  Variables
    --------------------------------------------------------------- -/

variable (e : ℝ) (he : 0 ≤ e) (he1 : e < 1) /-! --------------------------------------------------------------- §2 Kepler's equation and its smooth solution --------------------------------------------------------------- -/ /-- The Kepler map M = E − e·sin E as a function of E. -/ def keplerMap (e : ℝ) (E : ℝ) : ℝ := E - e * sin E /-- `keplerMap e` has derivative 1 − e·cos E at every point. -/ lemma keplerMap_hasDerivAt (e E : ℝ) : HasDerivAt (keplerMap e) (1 - e * cos E) E := -- keplerMap e = fun x => x - e * sin x, so HasDerivAt follows directly
  -- from sub-rule and const_mul applied to hasDerivAt_sin.
  (hasDerivAt_id E).sub ((hasDerivAt_sin E).const_mul e)

/-- The derivative of `keplerMap e` is positive when e < 1. -/
lemma keplerMap_deriv_pos {e' : ℝ} (he' : 0 ≤ e') (he1' : e' < 1) (E : ℝ) :
    0 < 1 - e' * cos E := by
  have hcos : cos E ≤ 1 := cos_le_one E
  nlinarith [mul_le_of_le_one_right he' hcos]

/-- `keplerMap e` is strictly monotone when e < 1.
    Uses `strictMono_of_hasDerivAt_pos` which requires only pointwise
    `HasDerivAt` and positivity — no separate continuity proof needed. -/
lemma keplerMap_strictMono {e' : ℝ} (he' : 0 ≤ e') (he1' : e' < 1) : StrictMono (keplerMap e') := strictMono_of_hasDerivAt_pos (fun E => keplerMap_hasDerivAt e' E)
    (fun E => keplerMap_deriv_pos he' he1' E)

/-!
  We axiomatise the inverse  eccAnom : ℝ → ℝ → ℝ  and its key
  properties, all of which follow from the Inverse Function Theorem
  applied to the smooth, strictly monotone map  keplerMap e.
-/

/-- The eccentric anomaly: the smooth right-inverse of `keplerMap e`. -/
axiom eccAnom (e : ℝ) : ℝ → ℝ

/-- `eccAnom e M` satisfies Kepler's equation. -/
axiom eccAnom_kepler (e M : ℝ) :
    keplerMap e (eccAnom e M) = M

/-- `eccAnom e` is differentiable, derivative = 1/(1 − e·cos(eccAnom e M)). -/
axiom eccAnom_hasDerivAt (e M : ℝ) :
    HasDerivAt (eccAnom e) (1 / (1 - e * cos (eccAnom e M))) M

/-- Boundary value at 0. -/
axiom eccAnom_zero (e : ℝ) : eccAnom e 0 = 0

/-- Boundary value at π. -/
axiom eccAnom_pi (e : ℝ) : eccAnom e π = π

/-! ---------------------------------------------------------------
    §3  Bessel function of the first kind (integer order)

    Defined by the classical integral representation.
    --------------------------------------------------------------- -/

/-- Bessel function J_n(x) via its integral representation. -/
def besselJ (n : ℕ) (x : ℝ) : ℝ :=
  (1 / π) * ∫ θ in (0 : ℝ)..π, cos (↑n * θ - x * sin θ)

/-! ---------------------------------------------------------------
    §4  Fourier coefficient

    Named  keplerFourierCoeff  to avoid clashing with Mathlib's own
    `fourierCoeff` which is defined on  AddCircle.
    --------------------------------------------------------------- -/

/-- The n-th Fourier sine coefficient of  eccAnom e M − M  on [0,π]. -/
def keplerFourierCoeff (e : ℝ) (n : ℕ) : ℝ :=
  (2 / π) * ∫ M in (0 : ℝ)..π,
    (eccAnom e M - M) * sin (↑n * M)

/-! ---------------------------------------------------------------
    §5  Main theorem
    --------------------------------------------------------------- -/

/--
  **Main theorem.** For n ≥ 1, the Fourier sine coefficient of the
  eccentric-anomaly displacement satisfies  aₙ = (2/n) · Jₙ(n·e).
-/
theorem keplerFourierCoeff_eq_besselJ (n : ℕ) (hn : 1 ≤ n) :
    keplerFourierCoeff e n = (2 / (n : ℝ)) * besselJ n (↑n * e) := by
  simp only [keplerFourierCoeff, besselJ]
  -- Goal:
  --   (2/π) · ∫₀^π (E(M)−M)·sin(nM) dM
  --   = (2/n) · (1/π) · ∫₀^π cos(nθ − ne·sinθ) dθ

  -- ── Step 1: Integration by parts ─────────────────────────────────────
  -- u = E(M)−M,  dv = sin(nM)dM  →  v = −cos(nM)/n
  -- Boundary: [uv]₀^π = 0 by eccAnom_zero, eccAnom_pi.
  -- Result: (2/π)∫(E−M)sin(nM)dM = (2/π)(1/n)∫(E'(M)−1)cos(nM)dM
  --
  -- Mathlib lemma: intervalIntegral.integral_mul_deriv
  --   (or integral_deriv_mul_eq_sub_of_hasDerivAt applied to
  --    u = eccAnom e − id,  v = −sin(n··)/n)
  have step1 :
      (2 / π) * ∫ M in (0 : ℝ)..π, (eccAnom e M - M) * sin (↑n * M)
      = (2 / π) * (1 / ↑n) *
          ∫ M in (0 : ℝ)..π, (deriv (eccAnom e) M - 1) * cos (↑n * M) := by
    sorry

  -- ── Step 2: The "−1" integral vanishes ───────────────────────────────
  -- ∫₀^π cos(nM) dM = [sin(nM)/n]₀^π = 0  (integer n ≥ 1)
  -- Mathlib: integral_cos, Real.sin_nat_mul_pi
  have cos_int_zero :
      ∫ M in (0 : ℝ)..π, cos (↑n * M) = 0 := by
    sorry

  have step2 :
      ∫ M in (0 : ℝ)..π, (deriv (eccAnom e) M - 1) * cos (↑n * M)
      = ∫ M in (0 : ℝ)..π, deriv (eccAnom e) M * cos (↑n * M) := by
    have key : ∀ M : ℝ, (deriv (eccAnom e) M - 1) * cos (↑n * M)
                       = deriv (eccAnom e) M * cos (↑n * M) - cos (↑n * M) := by
      intro M; ring
    simp_rw [key]
    rw [intervalIntegral.integral_sub _ _]
    · rw [cos_int_zero, sub_zero]
    · -- IntervalIntegrable (deriv (eccAnom e) · cos(n··))
      sorry
    · exact (continuous_cos.comp (continuous_const.mul continuous_id')).intervalIntegrable 0 π

  -- ── Step 3: Change of variable M ↦ E via Kepler's equation ───────────
  -- Under M = E − e·sin E:  E'(M) dM = dE,  cos(nM) = cos(nE − ne·sinE)
  -- Mathlib: MeasureTheory.integral_image_eq_integral_abs_deriv_smul
  --       or intervalIntegral.integral_comp_deriv
  have step3 :
      ∫ M in (0 : ℝ)..π, deriv (eccAnom e) M * cos (↑n * M)
      = ∫ E in (0 : ℝ)..π, cos (↑n * E - ↑n * e * sin E) := by
    sorry

  -- ── Step 4: Recognise the Bessel integral ────────────────────────────
  -- (1/π)∫₀^π cos(nE − ne·sinE)dE = J_n(ne)  by definition.
  have step4 :
      (1 / π) * ∫ E in (0 : ℝ)..π, cos (↑n * E - ↑n * e * sin E)
      = besselJ n (↑n * e) := by
    simp only [besselJ]

  -- ── Combine all steps ────────────────────────────────────────────────
  have hpi : π > 0 := Real.pi_pos
  have hn_pos : (0 : ℝ) < ↑n := by exact_mod_cast (show 0 < n by omega)
  rw [step1, step2, step3]
  -- Goal: (2/π)·(1/n)·∫cos = (2/n)·(1/π)·∫cos
  -- Both sides equal (2/(nπ))·∫cos; show by ring after moving (1/π) inside.
  have factored : (2 / π) * (1 / ↑n) *
      ∫ E in (0 : ℝ)..π, cos (↑n * E - ↑n * e * sin E)
      = (2 / ↑n) * ((1 / π) *
      ∫ E in (0 : ℝ)..π, cos (↑n * E - ↑n * e * sin E)) := by
    ring
  rw [factored, step4]

end

/-! ---------------------------------------------------------------
    §6  Summary of proof obligations

    Axioms (all standard consequences of the Inverse Function Theorem
    applied to the smooth, strictly increasing map  keplerMap e):
      • eccAnom           — existence of the smooth inverse
      • eccAnom_kepler    — it satisfies Kepler's equation
      • eccAnom_hasDerivAt — C¹ with derivative 1/(1 − e·cos E)
      • eccAnom_zero      — E(0) = 0
      • eccAnom_pi        — E(π) = π

    `sorry`s (each closes with a standard Mathlib lemma):
      • step1        IBP via
                      intervalIntegral.integral_deriv_mul_eq_sub_of_hasDerivAt
      • cos_int_zero  ∫₀^π cos(nM)dM = 0 via
                      integral_cos + Real.sin_nat_mul_pi
      • step2 integrability  IntervalIntegrable for deriv(eccAnom e)·cos(n·)
      • step3        Change of variables via
                      MeasureTheory.integral_image_eq_integral_abs_deriv_smul

    §7  Finding minimal imports

    Once the file builds cleanly, add at the bottom:

        #min_imports

    and the Lean Infoview will report the exact minimal import list
    for the version of Mathlib you have installed.
    --------------------------------------------------------------- -/

[1] I used the default setting: Sonnet 4.6, low effort, “thinking” turned off.

Pulling on a thread

Often there’s a thread running through a sequence of my posts. Sometimes I make this explicit and sometimes I don’t.

The latest thread started with this post commenting on a tweet that observed that

exp(−x²) ≈ (1 + cos(sin(x) + x))/2.

Some people said online that that the approximation is simply due to the first few terms of the Taylor series on both sides matching up, so I wrote a follow up post explaining that it’s not that simple.

The series for the left hand side alternates and converges very slowly, which led to the post on naively summing an alternating series.

The series for the right hand side led to this point on partitions over permutations.

Integrating the right hand side led to this post on how the simplest numerical integration rule works shockingly well on some problems.

The exact value of the integral turns out to be given by a Bessel function, details given in this post.

Mr. Bessel’s interest in the functions now named after him started with looking closely at a solution to Kepler’s equation in orbital mechanics. Thinking about Kepler’s equation led to the posts on the Laplace limit and on series acceleration.

I may be done pulling on this thread. I don’t have anything else in mind that I want to explore for now, but you never know.