Binomial bound

I recently came across an upper bound I hadn’t seen before [1]. Given a binomial coefficient C(r, k), let

n = min(k, rk)

and

m = r −  n.

Then for any ε > 0,

C(n + m, n) ≤ (1 + ε)n + m / εn.

The proof follows quickly from applying the binomial theorem to (1 + ε)n + m.

I could imagine how non-optimal choice of ε could be convenient in some context, it’s natural to want to see how good the upper bound is for the best ε, which works out to be ε = n/m.

A little algebra shows this value of ε leads to

C(n + m, n) ≤ (n + m)n + m / nn mm.

Note that while the original bound is not symmetric in n and m, the optimal bound is.

Returning to the original notation C(r, k), let’s see how tight the optimal bound is by plotting, as a function of r, the maximum relative error as a k varies.

The maximum relative error, over the range plotted, is very roughly r/10.

Related posts

[1] Grzegorz Łysik. The ε-binomial inequality. The Mathematical Gazette. Vol. 92, No. 523 (March 2008), pp. 97–99

Separable functions in different contexts

I was skimming through the book Mathematical Reflections [1] recently. He was discussing a set of generalizations [2] of the Star of David theorem from combinatorics.

\gcd\left(\binom{n - 1}{r - 1}, \binom{n}{r+1}, \binom{n+1}{r}\right) = \gcd\left(\binom{n-1}{r}, \binom{n+1}{r+1}, \binom{n}{r-1}\right)

The theorem is so named because if you draw a Star of David by connecting points in Pascal’s triangle then each side corresponds to the vertices of a triangle.

diagram illustrating the Star of David Theorem

One such theorem was the following.

\binom{n - \ell}{r - \ell} \binom{n - k}{r} \binom{n - k - \ell}{r - k} = \binom{n-k}{r-k} \binom{n - \ell}{r} \binom{n - k - \ell}{r - \ell}

This theorem also has a geometric interpretation, connecting vertices within Pascal’s triangle.

The authors point out that the binomial coefficient is a separable function of three variables, and that their generalized Star of David theorem is true for any separable function of three variables.

The binomial coefficient C(nk) is a function of two variables, but you can think of it as a function of three variables: n, k, and nk. That is

{n \choose k} = f(n) \, g(k) \, g(n-k)

where f(n) = n! and g(k) = 1/k!.

I was surprised to see the term separable function outside of a PDE context. My graduate work was in partial differential equations, and so when I hear separable function my mind goes to separation of variables as a technique for solving PDEs.

Coincidentally, I was looking a separable coordinate systems recently. These are coordinate systems in which the Helmholtz equation can be solved by separable function, i.e. a coordinate system in which the separation of variables technique will work. The Laplacian can take on very different forms in different coordinate systems, and if possible you’d like to choose a coordinate system in which a PDE you care about is separable.

Related posts

[1] Peter Hilton, Derek Holton, and Jean Pedersen. Mathematical Reflections. Springer, 1996.

[2] Hilton et al refer to a set of theorems as generalizations of the Star of David theorem, but these theorems are not strictly generalizations in the sense that the original theorem is clearly a special case of the generalized theorems. The theorems are related, and I imagine with more effort I could see how to prove the older theorem from the newer ones, but it’s not immediately obvious.

Body roundness index

Body Roundness Index (BRI) is a proposed replacement for Body Mass Index (BMI) [1]. Some studies have found that BRI is a better measure of obesity and a more effective predictor of some of the things BMI is supposed to predict [2].

BMI is based on body mass and height, and so it cannot distinguish a body builder and an obese man if both have the same height and weight. BRI looks at body shape more than body mass.

The basic idea behind Body Roundness Index is to draw an ellipse based on a person’s body and report how close that ellipse is to being a circle. The more a person looks like a circle, higher his BRI. The formula for BRI is

BRI = 364.2 − 365.5 e

where e is the eccentricity of the ellipse.

Now what is this ellipse we’ve been talking about? It’s roughly an ellipse whose major axis runs from head to toe and whose minor axis runs across the person’s width.

There are a couple simplifications here.

  1. You don’t actually measure how wide someone is. You measure the circumference of their waist and find the diameter of a circle with that circumference.
  2. You don’t actually measure how high their waist is [3]. You assume their waist is at exactly half their height.

It’s conventional to describe an ellipse in terms of its semi-major axis a and semi-minor axis b. For a circle, a = b = radius. But in general an ellipse doesn’t have a single radius and a > b. You could think of a and b as being the maximum and minimum radii.

So to fit an ellipse to our idealized model of a person, the major axis, 2a, equals the person’s height.

a = h/2

The minor axis b is the radius of a circle of circumference c where c is the circumference of the person’s waist (or hips [3]).

b = c / 2π

As explained here, eccentricity is computed from a and b by

 

e = \sqrt{1 - \frac{b^2}{a^2}}

As an example, consider a man who is 6 foot (72 inches) tall and has a 34 inch waist. Then

a = 36
b = 17/π = 5.4112
e = √(1 − b²/a²) = 0.9886
BRI = 364.2 − 365.5 e = 2.8526

Note that the man’s weight doesn’t enter the calculation. He could be a slim guy weighing 180 pounds or a beefier guy weighing 250 pounds as long as he has a 34 inch waist. In the latter case, the extra mass is upper body muscle and not around his waist.

thin ellipse graph

Related posts

[1] Diana M. Thomas et al. Relationships Between Body Roundness with Body Fat and Visceral Adipose Tissue Emerging from a New Geometrical Model. Obesity (2013) 21, 2264–2271. doi:10.1002/oby.20408.

[2] Researchers argue over which number to reduce a person to: BMI, BRI, or some other measure. They implicitly agree that a person must be reduced to a number; they just disagree on which number.

[3] Or waist. There are two versions of BRI, one based on waist circumference and one based on hip circumference.

A couple more variations on an ancient theme

I’ve written a couple posts on the approximation

\cos x \approx \frac{\pi^2 - 4x^2}{\pi^2 + x^2}

by the Indian astronomer Aryabhata (476–550). The approximation is accurate for x in [−π/2, π/2].

The first post collected a Twitter thread about the approximation into a post. The second looked at how far the coefficients in Aryabhata’s approximation are from the optimal approximation as a ratio of quadratics.

This post will answer a couple questions. First, what value of π did Aryabhata have and how would that effect the approximation error? Second, how bad would Aryabhata’s approximation be if we used the approximation π² ≈ 10?

Using Aryabhata’s value of π

Aryabhata knew the value 3.1416 for π. We know this because he said that a circle of diameter 20,000 would have circumference  62,832. We don’t know, but it’s plausible that he knew π to more accuracy and rounded it to the implied value.

Substituting 3.1416 for π changes the sixth decimal of the approximation, but the approximation is good to only three decimal places, so 3.1416 is as good as a more accurate approximation as far as the error in approximating cosine is concerned.

Using π² ≈ 10

Substituting 10 for π² in Aryabhata’s approximation gives an approximation that’s convenient to evaluate by hand.

\cos x \approx \frac{10 - 4x^2}{10 + x^2}

It’s very accurate for small values of x but the maximum error increases from 0.00163 to 0.01091. Here’s a plot of the error.

Finding pi in the alphabet

Write the letters of the alphabet around a circle, then strike out the letters that are symmetrical about a vertical line. The remaining letters are grouped in clumps of 3, 1, 4, 1, and 6 letters.

I’ve heard that this observation is due to Martin Gardner, but I don’t have a specific reference.

In case you’re interested, here’s the Python script I wrote to make the image above.

    from numpy import *
    import matplotlib.pyplot as plt
    
    for i in range(26):
        letter = chr(ord('A') + i)
        if letter in "AHIMOTUVWXY":
            latex = r"$\equiv\!\!\!\!\!" + letter + "$"
        else:
            latex = f"${letter}$"
        theta = pi/2 - 2*pi*i/26
        pt = (0.5*cos(theta), 0.5*sin(theta))
        plt.plot(pt[0], pt[1], ' ')
        plt.annotate(latex, pt, fontsize="xx-large")
    plt.axis("off")
    plt.gca().set_aspect("equal")
    plt.savefig("alphabet_pi.png")

Optimal rational approximation

A few days ago I wrote about the approximation

\cos x \approx \frac{\pi^2 - 4x^2}{\pi^2 + x^2}

for cosine due to the Indian astronomer Aryabhata (476–550) and gave this plot of the error.

I said that Aryabhata’s approximation is “not quite optimal since the ripples in the error function are not of equal height.” This was an allusion to the equioscillation theorem.

Chebyshev proved that an optimal polynomial approximation has an error function that has equally large positive and negative oscillations. Later this theorem was generalized to rational approximations through a sequence of results by de la Vallée Poussin, Walsh, and Achieser. Here’s the formal statement of the theorem from [1] in the context of real-valued rational approximations with numerators of degree m and denominators of degree n

Theorem 24.1. Equioscillation characterization of best approximants. A real function f in C[−1, 1] has a unique best approximation r*, and a function r is equal to r* if and only if fr equioscillates between at least m + n + 2 − d extremes where d is the defect of r.

When the theorem says the error equioscillates, it means the error alternately takes on ± its maximum absolute value.

The defect is non-zero when both numerator and denominator have less than maximal degree, which doesn’t concern us here.

We want to find the optimal rational approximation for cosine over the interval [−π/2, π/2]. It doesn’t matter that the theorem is stated for continuous functions over [−1, 1] because we could just rescale cosine. We’re looking for approximations with (m, n) = (2, 2), i.e. ratios of quadratic polynomials, to see if we can improve on the approximation at the top of the post.

The equioscillation theorem says our error should oscillate at least 6 times, and so if we find an approximation whose error oscillates as required by the theorem, we know we’ve found the optimal approximation.

I first tried finding the optimal approximation using Mathematica’s MiniMaxApproximation function. But this function tries to optimize relative error and I’m trying to minimize absolute error. Minimizing relative error creates problems because cosine evaluates to zero at the ends of interval ±π/2. I tried several alternatives and eventually decided to take another approach.

Because the cosine function is even, the optimal approximation is even. Which means the optimal approximation has the form

(a + bx²) / (c + dx²)

and we can assume without loss of generality that a = 1. I then wrote some Python code to minimize the error as a function of the three remaining variables. The results were b = −4.00487004, c = 9.86024544, and d = 1.00198695, very close to Aryabhata’s approximation that corresponds to b = −4, c = π², and d = 1.

Here’s a plot of the error, the difference between cosine and the rational approximation.

The absolute error takes on its maximum value seven times, alternating between positive and negative values, and so we know the approximation is optimal. However sketchy my approach to finding the optimal approximation may have been, the plot shows that the result is correct.

Aryabhata’s approximation had maximum error 0.00163176 and the optimal approximation has maximum error 0.00097466. We were able to shave about 1/3 off the maximum error, but at a cost of using coefficients that would be harder to use by hand. This wouldn’t matter to a modern computer, but it would matter a great deal to an ancient astronomer.

Related posts

[1] Approximation Theory and Approximation Practice by Lloyd N. Trefethen

Pell is to silver as Fibonacci is to gold

As mentioned in the previous post, the ratio of consecutive Fibonacci numbers converges to the golden ratio. Is there a sequence whose ratios converge to the silver ratio the way ratios of Fibonacci numbers converge to the golden ratio?

(If you’re not familiar with the silver ratio, you can read more about it here.)

The Pell numbers Pn start out just like the Fibonacci numbers:

P0 = 0
P1 = 1.

But the recurrence relationship is slightly different:

Pn+2 = 2Pn+1 + Pn.

So the Pell numbers are 0, 1, 2, 5, 12, 29, ….

The ratios of consecutive Pell numbers converge to the silver ratio.

Metallic ratios

There are more analogs of the golden ratio, such as the bronze ratio and more that do not have names. In general the kth metallic ratio is the larger root of

x² − kx − 1 = 0.

The cases n = 1, 2, and 3 correspond to the gold, silver, and bronze ratios respectively.

The quadratic equation above is the characteristic equation of the recurrence relation

Pn+2 = kPn+1 + Pn.

which suggests how we construct a sequence of integers such that consecutive ratios converge to the nth metallic constant.

So if we use k = 3 in the recurrence relation, we should get a sequence whose ratios converge to the bronze ratio. The results are

0, 1, 3, 10, 33, 109, 360, 1189, 3927, 12970, …

The following code will print the ratios.

    def bronze(n):
        if n == 0: return 0
        if n == 1: return 1
        return 3*bronze(n-1) + bronze(n-2)

    for n in range(2, 12):
        print( bronze(n)/bronze(n-1) )

Here’s the output.

    3.0
    3.3333333333333335
    3.3
    3.303030303030303
    3.302752293577982
    3.3027777777777776
    3.302775441547519
    3.3027756557168324
    3.302775636083269
    3.3027756378831383

The results are converging to the bronze ratio

(3 + √13)/2 = 3.302775637731995.

Plastic ratio

The plastic ratio is the real root of x³ − x − 1 = 0. Following the approach above, we can construct a sequence of integers whose consecutive ratios converge to the plastic ratio with the recurrence relation

Sn+3 = Sn+1 + Sn

Let’s try this out with a little code.

    def plastic(n):
        if n < 3: return n
        return plastic(n-2) + plastic(n-3)

    for n in range(10, 20):
        print( plastic(n)/plastic(n-1) )

This prints

    1.3
    1.3076923076923077
    1.3529411764705883
    1.3043478260869565
    1.3333333333333333
    1.325
    1.320754716981132
    1.3285714285714285
    1.3225806451612903

which shows the ratios are approaching the plastic constant 1.324717957244746.

Related posts

Miles to kilometers

The number of kilometers in a mile is k = 1.609344 which is close to the golden ratio φ = 1.6180334.

The ratio of consecutive Fibonacci numbers converges to φ, and so you can approximately convert miles to kilometers by multiplying by a Fibonacci number and dividing by the previous Fibonacci number. For example, you could multiply by 8 and divide by 5, or you could multiply by 13 and divide by 8.

As you start going down the Fibonacci sequence, consecutive ratios get closer to k and closer to φ. But since the ratios converge to φ, at some point the ratios get closer to φ and further from k. That means there’s an optimal Fibonacci ratio for converting miles to kilometers.

I was curious what this optimal ratio is, and it turns out to be 21/13. There we have

|k − 21/13| = 0.0060406

and so the error in the approximation is 0.375%. The error is about a third smaller than using φ as the conversion factor.

The Lucas numbers satisfy the same recurrence relation as the Fibonacci numbers, but start with L0 = 2 and L1 = 1. The ratio of consecutive Lucas numbers also converges to φ, and so you could also use Lucas numbers to convert miles to kilometers.

There is an optimal Lucas ratio for converting miles to kilometers for the same reasons there is an optimal Fibonacci ratio. That ratio turns out to be 29/18, and

|k − 29/18| = 0.001767

which is about 4 times more accurate than the best Fibonacci ratio.

Ancient accurate approximation for sine

This post started out as a Twitter thread. The text below is the same as that of the thread after correcting an error in the first part of the thread. I also added a footnote on a theorem the thread alluded to.

***

The following approximation for sin(x) is remarkably accurate for 0 < x < π.

\sin(x) \approx \frac{16x(\pi - x)}{5\pi^2 - 4x(\pi - x)}

The approximation is so good that you can’t see the difference between the exact value and the approximation until you get outside the range of the approximation.

Here’s a plot of just the error.

This is a very old approximation, dating back to Aryabhata I, around 500 AD.

In modern terms, it is a rational approximation, quadratic in the numerator and denominator. It’s not quite optimal since the ripples in the error function are not of equal height [1], but the coefficients are nice numbers.

***

As pointed out in the comments, replacing x with π/2 − x in order to get an approximation for cosine gives a much nicer equation.

\cos x \approx \frac{\pi^2 - 4x^2}{\pi^2 + x^2}

***

[1] The equioscillation theorem says that the optimal approximation will have ripples of equal positive and negative amplitude. This post explores the equioscillation theorem further and finds how far Aryabhata’s is from optimal.

Mentally multiply by π

This post will give three ways to multiply by π taken from [1].

Simplest approach

Here’s a very simple observation about π :

π ≈ 3 + 0.14 + 0.0014.

So if you need to multiply by π, you need to multiply by 3 and by 14. Once you’ve multiplied by 14 once, you can reuse your work.

For example, to compute 4π, you’d compute 4 × 3 = 12 and 4 × 14 = 56. Then

4π ≈ 12 + 0.56 + 0.0056 = 12.5656.

The correct value is 12.56637… and so the error is .00077.

First refinement

Now of course π = 3.14159… and so the approximation above is wrong in the fourth decimal place. But you can squeeze out a little more accuracy with the observation

π ≈ 3 + 0.14 + 0.0014 + 0.00014 = 3.14154.

Now if we redo our calculation of 4π we get

4π ≈ 12 + 0.56 + 0.0056 + 0.00056 = 12.56616.

Now our error is .00021, which is 3.6 times smaller.

Second refinement

The approximation above is based on an underestimate of π. We can improve it a bit by adding half of our last term, based on

π ≈ 3 + 0.14 + 0.0014 + 0.00014 + 0.00014/2 = 3.14157

So in our running example,

4π ≈ 12 + 0.56 + 0.0056 + 0.00056 + 00028 = 12.5656 = 12.56654.

which has an error of 0.00007, which is three times smaller than above.

Related posts

[1] Trevor Lipscombe. Mental mathematics for multiples of π. The Mathematical Gazette, Vol. 97, No. 538 (March 2013), pp. 167–169