To integrate the impossible integral

In the Broadway musical Man of La Mancha, Don Quixote sings

To dream the impossible dream
To fight the unbeatable foe
To bear with unbearable sorrow
To run where the brave dare not go

Yesterday my daughter asked me to integrate the impossible integral, and this post has a few thoughts on the quixotic quest to run where the brave calculus student dare not go.

The problem apparently required computing the indefinite integral of the square root of sine:

\int \sqrt{\sin x} \,dx

I say apparently for reasons that will soon be clear.

My first thought was that some sort of clever u-substitution would work. This was coming from a homework problem, so I was in the mindset of expecting every problem to be doable. If I ran across this integral while I had my consultant hat on rather than my father/tutor hat, I would have thought “It probably doesn’t have an elementary closed form because most integrals don’t.”

It turns out I should have had my professional hat on, because the integral indeed does not have an elementary closed form. There is a well-known function that is an antiderivative of √sin, but it’s not an elementary function. (“Well-known” relative to professional mathematicians, not relative to calculus students.)

You can evaluate the integral as

\int \sqrt{\sin x} \,dx = -2 E\left(\frac{\pi - 2x}{4}\, \middle|\, 2 \right) + C

where E is Legendre’s “elliptic integral of the second kind.”
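
If you’d like to check this numerically, here’s a minimal sketch using mpmath, whose two-argument ellipe uses the same E(φ | m) parameter convention as above. It differentiates the closed form numerically and compares the result with the integrand.

    from mpmath import mp, ellipe, diff, sin, sqrt, pi

    mp.dps = 25

    def F(x):
        return -2*ellipe((pi - 2*x)/4, 2)  # proposed antiderivative

    x = 1.2
    print(diff(F, x))    # numerical derivative of the closed form
    print(sqrt(sin(x)))  # the integrand; the two should agree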

I assume it wasn’t actually necessary to compute the antiderivative, though I didn’t see the full problem. In the artificial world of the calculus classroom, everything works out nicely. And this is a shame.

For one thing, it creates a false impression. Most integrals are “impossible” in the sense of not having an elementary closed form. Students who go on to use calculus for anything more than artificial homework problems may incorrectly assume they’ve done something wrong when they encounter an integral in the wild.

For another, it’s a missed opportunity. Or maybe two or three missed opportunities. These may or may not be appropriate to bring up, depending on how receptive the audience is. (But we said at the beginning we would “run where the brave calculus student dare not go.”)

It’s a missed opportunity to emphasize what integrals really are, to separate meaning from calculation technique. Is the integral of √sin impossible? Not in the sense that it doesn’t exist. Yes, there are functions whose derivative is √sin, at least if we limit ourselves to a range where sine is not negative. No, the integral does not exist in the sense of a finite combination of functions a calculus student would have seen.

It’s also a missed opportunity to show that we can define new functions as the solution to problems we can’t solve otherwise. The elliptic integrals mentioned above are functions defined in terms of integrals that cannot be computed in more elementary terms. By giving the function a name, we can recognize it when it comes up in other contexts. For example, I wrote a few months ago about the problem of finding the length of a serpentine wall. This also pops out of a calculus problem that doesn’t have an elementary solution, and in fact it also has a solution in terms of elliptic integrals.

Finally, it’s a missed opportunity to give a glimpse of the wider mathematical landscape. If not every elementary function has an elementary antiderivative, do elementary functions at least have antiderivatives in terms of special functions such as elliptic integrals? No, but that’s a good question.

Any continuous function has an antiderivative, and that antiderivative might be an elementary function, or it might be a combination of elementary functions and familiar special functions. Or it might be an obscure special function, something not exactly common, but something that has been named and studied before. Or it might be a perfectly valid function that hasn’t been given a name yet. Maybe it’s too specialized to deserve a name, or maybe you’ve discovered something that comes up repeatedly and deserves to be cataloged.

This touches more broadly on the subject of what functions exist versus what functions have been named. Students implicitly assume these two categories are the same. Here’s an example of the same phenomenon in probability. It also touches on the gray area between what has been named and what hasn’t, and how you decide whether something deserves to be named.

Update: The next post gives a simple approximation to the integral in this post.

Extracting independent random bits from dependent sources

Sometimes you have a poor quality source of randomness and you want to refine it into a better source. You might, for example, want to generate cryptographic keys from a hardware source that is more likely to produce 1’s than 0’s. Or maybe your source of bits is dependent, more likely to produce a 1 when it has just produced a 1.

Von Neumann’s method

If the bits are biased but independent, a simple algorithm by John von Neumann will produce unbiased bits. Assume each bit of input is a 1 with probability p and a 0 otherwise. As long as 0 < p < 1 and the bits are independent, it doesn’t matter what p is [1].

Von Neumann’s algorithm works on consecutive pairs of bits. If the pair is 00 or 11, don’t output anything. When you see 10 output a 1, and when you see 01 output a 0. This will output a 0 or a 1 with equal probability.
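
Here’s a minimal Python sketch of the idea; the function name is mine.

    import random

    def von_neumann(bits):
        # Walk through consecutive, non-overlapping pairs of bits.
        it = iter(bits)
        for a, b in zip(it, it):
            if a != b:
                yield a  # 10 -> 1, 01 -> 0; 00 and 11 produce nothing

    # Biased but independent source: 1 with probability 0.9
    biased = (int(random.random() < 0.9) for _ in range(100_000))
    output = list(von_neumann(biased))
    print(sum(output) / len(output))  # should be close to 0.5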

Manuel Blum’s method

Von Neumann’s algorithm can remove bias but not dependence. Manuel Blum built on it to create an algorithm that removes both bias and dependence, provided the dependence takes the form of a Markov chain.

Blum’s algorithm assumes each state in the Markov chain can be exited two ways, labeled 0 and 1. You could focus on a single state and apply von Neumann’s algorithm to consecutive exits from that state: output a 1 when the state exits by the 1 path and then the 0 path, output a 0 when the exits come in the opposite order, and output nothing when the two exits match. But if there are many states, this method is inefficient since it produces nothing while the chain is in any of the other states.

Blum basically applies the same method to all states, not just one state. But there’s a catch! His paper shows that the naive implementation of this idea is wrong. He gives a pseudo-theorem and a pseudo-proof that the naive algorithm is correct before showing that it is not. Then he shows that the correct algorithm really is correct.

This is excellent exposition. It would have been easier to say nothing of the naive algorithm, or simply mention in passing that the most obvious approach is flawed. But instead he went to the effort of fully developing the wrong algorithm (with a warning up-front that it will be shown to be wrong).

Related posts

[1] It matters for efficiency. The probability that a pair of input bits will produce an output bit is 2p(1 – p), so the probability is maximized when p = 0.5. If p = 0.999, for example, the method will still work correctly, but you’ll have to generate 500 pairs of input on average to produce each output bit.

Convert PostScript to PDF locally

There are numerous websites that will let you upload a PostScript (.ps) file and download it as a PDF. Some of these online file format conversion sites look sketchy, and the ones that seem more reputable don’t always do a good job.

If you want to convert PostScript to PDF locally, I’d suggest trying the Ghostscript utility ps2pdf. I’ve had good luck with it.

To convert the file input.ps to the file output.pdf you simply say

    ps2pdf input.ps output.pdf
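
If you have a whole directory of PostScript files to convert, you could script the loop. Here’s a minimal Python sketch, assuming ps2pdf is on your path:

    import subprocess
    from pathlib import Path

    # Convert every .ps file in the current directory to a PDF
    for ps in Path(".").glob("*.ps"):
        subprocess.run(["ps2pdf", str(ps), str(ps.with_suffix(".pdf"))], check=True)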

If ps2pdf is not installed on your computer, you can run

    sudo apt install ghostscript

on Linux.

Windows installation

On Windows, you can install Ghostscript by first installing WSL then using the command above.

I’m half joking. That advice sounds like something the stereotypical arrogant Unix user would say: the way to run software on Windows is to first install Linux, then run the software in Linux. You can install a Windows port of Ghostscript directly on Windows.

(Dilbert comic; source: http://dilbert.com/strip/1995-06-24)

But in all seriousness, I’ve used Windows ports of Unix-first software for a long time with mixed success. Sometimes the ports work seamlessly, but often they fail or half work. I find it less frustrating to use WSL (Windows Subsystem for Linux) to run software on Windows that was first written for Unix.

WSL is not a virtual machine per se but an implementation of Linux using the Windows API. It starts up quickly, and it’s well integrated with Windows. I have no complaints with it.

Related posts

Chebyshev’s other polynomials

There are two sequences of polynomials named after Chebyshev, and the first is so much more common that when authors say “Chebyshev polynomial” with no further qualification, they mean Chebyshev polynomials of the first kind. These are denoted with Tn, so they get Chebyshev’s initial [1]. The Chebyshev polynomials of the second kind are denoted Un.

Chebyshev polynomials of the first kind are closely related to cosines. It would be nice to say that Chebyshev polynomials of the second kind are to sines what Chebyshev polynomials of the first kind are to cosines. That would be tidy, but it’s not true. There are relationships between the two kinds of Chebyshev polynomials, but they’re not that symmetric.

It is true that Chebyshev polynomials of the second kind satisfy a relation somewhat analogous to the relation

\cos n\theta = T_n(\cos\theta)

for his polynomials of the first kind, and it involves sines:

\frac{\sin n\theta}{\sin \theta} = U_{n-1}(\cos\theta)

We can prove this with the equation we’ve been discussing in several posts lately, so there is yet more juice to squeeze from this lemon.

Once again we start with the equation

\cos n\theta + i\sin n\theta = \sum_{j=0}^n {n\choose j} i^j(\cos\theta)^{n-j} (\sin\theta)^j

and take the imaginary part of both sides. The odd terms of the sum contribute to the imaginary part, so we can assume j = 2k + 1. We make the replacement

(\sin\theta)^{2k+1} = (1 - \cos^2\theta)^k \sin\theta

and so we’re left with a polynomial in cos θ, except for an extra factor of sin θ in every term.

This shows that sin nθ / sin θ is a polynomial in cos θ, and in fact a polynomial of degree n − 1. Since Un−1 can be defined as the polynomial of degree n − 1 satisfying

\frac{\sin n\theta}{\sin \theta} = U_{n-1}(\cos\theta)

the polynomial in question must be Un−1.
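
As a numerical spot check, SciPy can evaluate Un directly. This is just a sanity check of the identity, not part of the proof.

    import numpy as np
    from scipy.special import eval_chebyu

    n, theta = 7, 0.9
    print(np.sin(n*theta) / np.sin(theta))    # sin(n θ)/sin θ
    print(eval_chebyu(n - 1, np.cos(theta)))  # U_{n-1}(cos θ), same value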

More special function posts

[1] Chebyshev has been transliterated from the Russian as Chebysheff, Chebychov, Chebyshov, Tchebychev, Tchebycheff, Tschebyschev, Tschebyschef, Tschebyscheff, Chebychev, etc. It is conventional now to use “Chebyshev” as the name, at least in English, and to use “T” for the polynomials.

More juice in the lemon

There’s more juice left in the lemon we’ve been squeezing lately.

A few days ago I first brought up the equation

\cos n\theta + i\sin n\theta = \sum_{j=0}^n {n\choose j} i^j(\cos\theta)^{n-j} (\sin\theta)^j

which holds because both sides equal exp(inθ).

Then a couple days ago I concluded a blog post by noting that by taking the real part of this equation and replacing sin²θ with 1 – cos²θ one could express cos nθ as a polynomial in cos θ,

\cos n\theta = \sum {n\choose 2k} (-1)^k (\cos\theta)^{n-2k} (1 - \cos^2\theta)^k

and in fact this polynomial is the nth Chebyshev polynomial Tn since these polynomials satisfy

\cos n\theta = T_n(\cos\theta)

Now in this post I’d like to prove a relationship between Chebyshev polynomials and sines starting with the same raw material. The relationship between Chebyshev polynomials and cosines is well known, even a matter of definition depending on where you start, but the connection to sines is less well known.

Let’s go back to the equation at the top of the post, replace n with 2n + 1, and take the imaginary part of both sides. The odd terms of the sum contribute to the imaginary part, so we sum over j = 2ℓ + 1.

\begin{align*} \sin((2n+1)\theta) &= \sum_{\ell} (-1)^\ell {2n+1 \choose 2\ell + 1} (\cos\theta)^{2n-2\ell} (\sin\theta)^{2\ell + 1} \\ &= \sum_k (-1)^{n-k}{2n + 1 \choose 2k} (\sin \theta)^{2n + 1 -2k} (\cos\theta)^{2k} \\ &= (-1)^n \sum_k (-1)^k{2n + 1 \choose 2k} (\sin \theta)^{2n + 1 -2k} (1 -\sin^2\theta)^{k} \end{align*}

Here we did a change of variables k = n – ℓ.

The final sum is the same polynomial that expresses cos((2n+1)θ) in terms of cos θ, only now evaluated at sin θ instead of cos θ. That is,

 \sin((2n+1)\theta) = (-1)^n T_{2n+1}(\sin\theta)

So for all n we have

\cos n\theta = T_n(\cos\theta)

and for odd n we also have

 \sin n\theta = \pm \,\,T_n(\sin\theta)

The sign is positive when n is congruent to 1 mod 4 and negative when n is congruent to 3 mod 4.
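
Here’s a quick numerical check of the sign pattern, using SciPy to evaluate Tn:

    import numpy as np
    from scipy.special import eval_chebyt

    theta = 0.7
    for n in (1, 3, 5, 7, 9):
        sign = 1 if n % 4 == 1 else -1
        # sin(n θ) and ±T_n(sin θ) should agree for odd n
        print(n, np.sin(n*theta), sign * eval_chebyt(n, np.sin(theta)))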

More Chebyshev posts

CRM consulting gig

This morning I had someone from a pharmaceutical company call me with questions about conducting a CRM dose-finding trial and I mentioned it to my wife.

Then this afternoon she was reading a book in which there was a dialog between husband and wife including this sentence:

He launched into a technical explanation of his current consulting gig—something about a CRM implementation.

You can’t make this kind of thing up. A few hours before reading this line, my wife had exactly this conversation. However, I doubt the author and I had the same thing in mind.

In my mind, CRM stands for Continual Reassessment Method, a Bayesian method for Phase I clinical trials, especially in oncology. We ran a lot of CRM trials while I worked in biostatistics at MD Anderson Cancer Center.

For most people, presumably including the author of the book quoted above, CRM stands for Customer Relationship Management software.

Like my fictional counterpart, I know a few things about CRM implementation, but it’s a different kind of CRM.

More clinical trial posts

Manipulating sums

This post is a list of five spinoffs of my previous post. Except for the last point it doesn’t build on the previous post per se, but I’ll use a sum from that post to illustrate five things:

  1. Putting multiple things under a summation sign in LaTeX
  2. Simplifying sums by generalizing binomial coefficients
  3. A bit of notation from Iverson
  4. Changing variables in a sum
  5. Chebyshev polynomials coming up again

Let’s get started. The equation I want to focus on is the following.

\cos n\theta + i\sin n\theta = \sum_{j=0}^n {n\choose j} i^j(\cos\theta)^{n-j} (\sin\theta)^j

Putting things under a sum in LaTeX

I said in the previous post that you could equate the real parts of the left and right side to show that cos nθ can be written as sums and products of cos θ and sin θ. To write this in LaTeX we’d say

\cos n\theta = \sum_{\substack{0 \leq j \leq n \\ j \text{ even}}} {n\choose j} i^j(\cos\theta)^{n-j} (\sin\theta)^j

The command that makes it possible to put two lines of stuff under the summation is \substack. Here’s the LaTeX code that produces the summation sign and the text below it.

    \sum_{\substack{0 \leq j \leq n \\ j \text{ even}}}

Binomial coefficients

We can simplify the sum by removing the limits on j and implicitly letting j run over all integers:

\cos n\theta = \sum_{ j \text{ even}} {n\choose j} i^j(\cos\theta)^{n-j} (\sin\theta)^j

This is because the binomial coefficient term is zero when j > n or j < 0. (See this post for an explanation of more general binomial coefficients.)

Indicator functions

It’s often helpful to turn a sum over a complicated region into a more complicated sum over a simpler region. That is, it’s often easier to deal with complications in the summand than in the domain of summation.

In our case, instead of summing over even integers, we could sum over all integers, if we multiply the summand by a function that is 0 on odd numbers and 1 on even numbers. That is, we multiply by the indicator function of the even integers. The indicator function of a set is 1 on that set and 0 everywhere else.

Kenneth Iverson’s notation uses a Boolean expression in brackets to indicate the function that is 1 if the condition is true and 0 otherwise. So [j even] means the function that is 1 when j is even and 0 when j is odd. With this notation we could write our sum as follows.

\cos n\theta = \sum {n\choose j} i^j [j \text{ even}](\cos\theta)^{n-j} (\sin\theta)^j
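
In code, an Iverson bracket is just a Boolean used as a number. Here’s a sketch that evaluates the sum over a deliberately oversized range of j, relying on SciPy’s binom returning zero when j is negative or greater than n:

    import numpy as np
    from scipy.special import binom

    n, theta = 6, 0.8
    total = sum(binom(n, j) * 1j**j * np.cos(theta)**(n - j) * np.sin(theta)**j
                * (j % 2 == 0)          # Iverson bracket [j even]
                for j in range(-5, n + 6))
    print(total.real, np.cos(n*theta))  # same value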

Change of variables

We could get rid of the requirement that j is even by replacing j with 2k for a new variable k. Now our sum is

\cos n\theta = \sum {n\choose 2k} (-1)^k (\cos\theta)^{n-2k} (\sin\theta)^{2k}

Notice a couple of things. For one thing, we were able to write (−1)^k rather than i^{2k}.

More importantly, the change of variables was trivial because the sum runs over all integers. If we had explicit limits on j, we would have to change them to different explicit limits on k.

Changing limits of summation is error-prone. This happens a lot, for example, when computing power series solutions for differential equations, and there are mnemonics for reducing errors such as “limits and exponents move in opposite directions.” These complications go away when you sum over all integers.

Chebyshev strikes again

GlennF left a comment on the previous post to the effect that the sum we’ve been talking about reduces to a Chebyshev polynomial.

Since the powers of sin θ are all even, we can replace sin²θ with 1 – cos²θ and get the following.

\cos n\theta = \sum {n\choose 2k} (-1)^k (\cos\theta)^{n-2k} (1 - \cos^2\theta)^k

Now the right side is a polynomial in cos θ, call it P(cos θ). Then P = Tn, the nth Chebyshev polynomial, because as explained here, one way to define the Chebyshev polynomials is by

\cos n\theta = T_n(\cos\theta)

If you don’t like that definition, you could use another definition and the equation becomes a theorem.

Building high frequencies out of low frequencies

If you have sines and cosines of some fundamental frequency, and you’re able to take products and sums, then you can construct sines and cosines at any multiple of the fundamental frequency.

Here’s a proof.

\begin{align*} \cos n\theta + i\sin n\theta &= \exp(in\theta) \\ &= \exp(i\theta)^n \\ &= (\cos\theta + i\sin\theta)^n \\ &= \sum_{j=0}^n {n\choose j} i^j(\cos\theta)^{n-j} (\sin\theta)^j \end{align*}

Taking real parts gives us cos nθ in the first equation and the even terms of the sum in the last equation.

Taking imaginary parts gives us sin nθ in the first equation and the odd terms of the sum in the last equation.
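
For example, taking n = 3 and separating the two parts gives the triple angle formulas

\cos 3\theta = \cos^3\theta - 3\cos\theta\,\sin^2\theta, \qquad \sin 3\theta = 3\cos^2\theta\,\sin\theta - \sin^3\theta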

A radio may use this technique to create signals with frequencies higher than the frequency of its oscillator.

***

Read more on mathematics and radio.

Prime plus power of 2

A new article [1] looks at the problem of determining the proportion of odd numbers that can be written as the sum of a power of 2 and a prime. A. de Polignac conjectured in 1849 that all odd numbers have this form.

A counterexample is 127, and so apparently the conjecture was that every odd number is either prime or a power of 2 plus a prime [2]. Alternatively, you could say that a prime is a prime plus a power of 2 where the exponent is -∞.

The smallest counterexample to Polignac’s conjecture is 905. It is clearly not prime, nor are any of the values you get by subtracting powers of 2: 393, 649, 777, 841, 873, 889, 897, 901, and 903.

The proportion of numbers of the form 2^n plus a prime is known to be between 0.09368 and 0.49095. In [1] the authors give evidence that the proportion is around 0.437.

You could generalize Polignac’s conjecture by asking how many powers of 2 you need to add to a prime in order to represent any odd number. Clearly every odd number x is the sum of some number of powers of 2 since you can write numbers in binary. But is there a finite upper bound k on the number of powers of 2 needed, independent of x? I imagine someone has already asked this question.

Polignac conjectured that k = 1. The example x = 905 above shows that k = 1 won’t work. Would k = 2 work? For example, 905 = 2 + 16 + 887 and 887 is prime. Apparently k = 2 is sufficient for numbers less than 1,000,000,000.
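
Here’s a short sketch using SymPy’s isprime that recovers 905 as the smallest counterexample; the helper function name is mine.

    from sympy import isprime

    def polignac(x):
        # Is odd x prime, or a power of 2 plus a prime?
        if isprime(x):
            return True
        p = 2
        while p < x:
            if isprime(x - p):
                return True
            p *= 2
        return False

    print(next(x for x in range(3, 10**4, 2) if not polignac(x)))  # 905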

More posts on number theory

[1] Gianna M. Del Corso, Ilaria Del Corso, Roberto Dvornicich, and Francesco Romani. On computing the density of integers of the form 2^n + p. Mathematics of Computation. https://doi.org/10.1090/mcom/3537 Published electronically April 28, 2020.

Analogy between Fibonacci and Chebyshev

Quick observation: I recently noticed that Chebyshev polynomials and Fibonacci numbers have analogous formulas.

The nth Chebyshev polynomial satisfies

T_n(x) = \frac{(x + \sqrt{x^2-1})^n + (x - \sqrt{x^2-1})^n }{2}

for |x| ≥ 1, and the nth Fibonacci number is given by

F_n = \frac{(1 + \sqrt{5})^n - (1 - \sqrt{5})^n }{2^n\sqrt 5}

There’s probably a way to explain the similarity in terms of the recurrence relations that both sequences satisfy.
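
Indeed, both sequences satisfy a two-term linear recurrence, Tn+1(x) = 2x Tn(x) − Tn−1(x) and Fn+1 = Fn + Fn−1, and closed forms like these come from the roots of the characteristic equation of the recurrence. Here’s a minimal sketch checking both formulas against their recurrences:

    from math import sqrt, isclose

    def T(n, x):  # closed form above, valid for |x| >= 1
        s = sqrt(x*x - 1)
        return ((x + s)**n + (x - s)**n) / 2

    def F(n):     # Binet-style closed form above
        return ((1 + sqrt(5))**n - (1 - sqrt(5))**n) / (2**n * sqrt(5))

    x = 1.5
    print(all(isclose(T(n+1, x), 2*x*T(n, x) - T(n-1, x)) for n in range(1, 10)))
    print(all(isclose(F(n+1), F(n) + F(n-1)) for n in range(1, 10)))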

More on Chebyshev polynomials

More on Fibonacci numbers