Top 100 radio

Suppose you want to hear the top 100 songs in some music genre, so you listen to a radio station that specializes in that kind of music. How long would it take to hear each of the songs at least once?

This is a variation on the coupon collector problem. If a radio station plays the top N songs in some genre, selected at random, the expected number of songs you would need to listen to before hearing each one at least once is

N HN

where HN is the Nth harmonic number. For large N this is approximately equal to

N(log N + γ)

where γ is the Euler-Mascheroni constant. When N = 100, this equals 518.

But radio stations do not choose songs at random. Even a station that only plays the 100 most popular songs will presumably play the most popular songs more often than the less popular ones. They probably also have some rules about waiting a while before playing the same song again, but we’ll ignore that for this post.

Billboard for radio station ZIPF

Suppose we’re listening to ZIPF, a radio station that plays the top 100 songs according to Zipf’s law: the probability of playing the nth most popular song is proportional to 1/n. How long before you’ve heard all 100 songs?

We know it’ll take more than 518 songs on average. The unequal frequency of the songs will cause us to wait longer to hear the least popular song. The 10oth most popular song, for example, will only play with probability 1/518 rather than probability 1/100.

Does the number 518 look familiar? It’s exactly the same number as above:

100 H100

That’s because the probability of hearing the 100th most popular song is proportional to 1/100, and the proportionality constant is the reciprocal of

1 + 1/2 + 1/3 + … 1/100 = H100 = 5.1873…

So in general the expected number of draws from a set of N items until you’ve seen everything at least once equals the probability of selected the least likely item when the items have draw probabilities according to Zipf’s law.

I don’t know how hard it would be to work out the exact probability for our problem of hearing all 100 songs on ZIPF, but we could estimate the probability by simulation.

When I ran it, the average of 1000 simulation reps was 1900 with a standard deviation of 595.

So it would take roughly 20x longer to hear every song selected according to Zipf’s law than to listen to the list.

Related posts

Estimating satellite altitude from apparent motion

If you see a light moving in the sky, how could you tell whether it’s an airplane or a satellite? If it is a satellite, could you tell how high of an orbit it’s in?

For an object in a circular orbit,

v² = μ/r

where r is the orbit radius and μ is the standard gravitational parameter. For a satellite orbiting the earth,

μ = 5.1658 × 1012  km³/hr².

The orbit radius r is the earth’s radius R plus the altitude a above the earth’s surface.

rRa

where the mean radius of the earth is R = 6,371 km.

Angular velocity is

vr = √(μ/r³)

This velocity is relative to an observer hovering far above the earth. An observer on the surface of the earth would need to subtract the earth’s angular velocity to get the apparent velocity [1].

from numpy import pi, sqrt

def angular_v(altitude): # in radians/hour
    R = 6371 # mean earth radius in km
    r = altitude + R
    mu = 5.1658e12 # km^3/hr^2
    return sqrt(mu/r**3)

def apparent_angular_v(altitude):
    earth_angular_v = 2*pi/(23 + 56/60 + 4/3600)
    return angular_v(altitude) - earth_angular_v

Here’s a plot of apparent angular velocity for altitudes between 350 km (initial orbit of a Starlink satellite) and geostationary orbit (35,786 km).

Radians per hour is a little hard to conceptualize; degrees per minute would be easier. But fortunately, the two are nearly the same. One radian per hour is 3/π = 0.955 degrees per minute.

The apparent size of the moon is about 0.5°, so you could estimate angular velocity by seeing how long it takes a satellite to cross the moon (or a region of space the width of the moon). It would only take an object in low earth orbit (LEO) a few seconds to cross the moon.

Can you see objects in medium earth orbit (MEO) or high earth orbit (HEO)? Yes. Although most satellites are in LEO, objects in higher orbits may be larger, and if they’re reflective enough they may be visible.

Airplanes move much slower than LEO satellites. An airliner at altitude 10 km moving 1000 km/hr would have an apparent angular velocity of 0.16 radians per hour. An object appearing to move that slowly is either an airplane or an object in HEO, and very likely the former.

[1] The earth completes a full rotation (2π radians) in 23 hours 56 minutes and 4 seconds, so its angular velocity is 0.2625 radians per hour. Why not 24 hours? That’s the length of a rotation of the earth relative to the sun. Since the earth moves in its orbit relative to the sun while it rotates, it has to rotate a little longer (i.e. for about 4 more minutes) to reach the same position relative to the sun.

Linear KdV dispersion

The Korteweg–De Vries (KdV) equation

u_t - 6 u\, u_x + u_{xxx} = 0

is a nonlinear PDE used to model shallow water waves. The linear counterpart omits the nonlinear term in the middle.

u_t + u_{xxx} = 0

This variant is useful in itself, but also for understanding the nonlinear KdV equation.

Solitons

Solutions to the linear KdV equation spread out over time. The nonlinear term in the KdV equation counterbalances the dispersion term uxxx so that solutions to the nonlinear PDE behave in some ways like linear solutions.

Solutions to a nonlinear equation don’t add, and yet they sorta add. Here’s a description from [1].

At the heart of these observations is the discovery that these nonlinear waves can interact strongly and then continue thereafter almost as if there had been no interaction at all. This persistence of the wave led Zabusky and Kruskal to coin the term ‘soliton’ (after photon, proton, etc.) to emphasize the particle-like character of these waves which retain their identities in a collision.

I added the emphasis on almost because many descriptions leave out this important qualifier.

Solution to linear KdV

There is a compact expression for the solution to the linear KdP equation if u, ux , and uxx all go to 0 as |x| → ∞. If u(x, 0) = f(x), then the solution is

u(x, t) = (3t)^{-1/3} \int_{-\infty}^\infty f(y) \text{ Ai}\!\left( \frac{x-y}{(3t)^{1/3}} \right) \,dy

Here Ai(z) is the Airy function. This function has come up several times here. For example, I’ve written about the Airy function in the context of triple factorials and in connection with Bessel functions.

Aside on third order equations

Third order differential equations are uncommon. Third order linear ODEs are quite rare. Third order differential equations are usually nonlinear PDEs, like the KdV equation. The linear KdV is an example of a linear third order PDE that arises in applications.

 

[1] P. G. Drazin and R. S. Johnson. Solitons: an introduction. Cambridge University Press. 1989.

Closed-form minimal surface solutions

Differential equations, especially nonlinear differential equations, rarely have a closed-form solution, but sometimes it happens. As I wrote about a year ago

It is unusual for a nonlinear PDE to have a closed-form solution, but it is not unheard of. There are numerous examples of nonlinear PDEs, equations with important physical applications, that have closed-form solutions.

This post will present some closed-form solutions of the minimal surface equation

(1 + |ux|²) uyy − 2 ux uy uxy + (1 + |uy|²) uxx = 0

One trivial class of closed-form solutions are planes.

u(xy) = axbyc.

There are three non-trivial classes of solutions as far as I know. Jean Baptiste Marie Meusnier discovered two of these in 1776, namely the helicoild

u(x, y) = tan−1(y/x)

and the catenoid

u(x, y) = cosh−1(a (x² + y²)½) / a

Heinrich Scherk discovered another closed form solution in 1830:

u(x, y) = log( cos(ay) / cos(ax) ) / a

Here’s a plot.

The surface formed by the graph of the solution is known as Scherk’s surface. You could image that if the edges of this surface were made of wire and the wire was dipped in soapy wanter, it would form a bubble like Sherk’s surface.

Note that the closed-form solutions satisfy the minimal surface PDE itself, but do not satisfy any given boundary conditions, unless the boundary values you’d like to specify happen to be exactly the values this function has.

Fundamental solution

The “fundamental solution” to a PDE solves the equation with the right-hand side set to δ. Intuitively, you can think of the delta function as striking something with a hammer in order to see how it rings.

An aside on rigor

A novice might be OK with the explanation above.

A sophomore might rightly object that this doesn’t make sense. This delta “function” isn’t even a function. How can you set one side of a differential equation to something that isn’t even a function?

An expert would understand that calling δ a function is just a convenient figure of speech for a rigorous construction using distribution theory. You can find a high-level introduction here.

Bell curve meme.

As with many of the bell curve memes, the horizontal axis is really experience rather than intelligence. “Whatever you say” could be an intelligent response to someone talking about things they understand but you don’t. And objecting that something doesn’t make sense (as stated) is an intelligent response when you’re exposed to a metaphor that you didn’t realize was a metaphor. A mature response is to appreciate the value of rigor and the value of metaphor.

Why fundamental

The reason a fundamental solution is called “fundamental” is that once you have the fundamental solution, you can find more solutions by convolving the right-hand side with it.

So if L is a linear differential operator and F is a fundamental solution, i.e.

L F = δ

then the convolution f =  Fh is a solution to

L fh.

Poisson’s equation

The fundamental solution to Poisson’s equation

∇² f = h

depends on dimension.

For dimension d > 2 the solution is proportional to rd−2 where r is the radial distance to the origin.

For dimension d = 2 the solution is proportional to log r.

This is an example of the phenomenon alluded to in the article titled A Zeroth Power Is Often a Logarithm Yearning to Be Free by Sanjoy Mahajan. If we naively stuck d = 2 into the fundamental solution rd−2 for higher dimensions we’d get r0  = 1, which doesn’t work. But we’d read Mahajan’s article, we might guess that log r and then verify that it works.

I give a couple more examples of “logarithms yearning to be free” in this post.

 

No matter how dubious

The following quote stuck with me when I read it years ago. Looking back I appreciate it even more.

Now, when solving differential equations, or indeed solving any problem, it is permissible to use any methods at all, no matter how dubious, provided that once the solution has been found it can be proved to satisfy all the conditions of the problem.

You could make a bell curve meme out of this. A novice would say “Sure, if it works it works.” An expert would agree. But someone in between who has recently been introduced to rigorous mathematics would object. They might say, for example, “You can’t just treat dy/dx like a fraction!” even though they did a few weeks ago.

Mathematics is discovered inductively but taught deductively. This creates the false impression that math advances deductively. It does not. That is a rationalist fantasy. To use Iain McGilchrist’s metaphor from “The Master and His Emissary,” the intuitive master solves problems, then assigns his analytical emissary to check the work.

I’ve looked for the source of the quote above several times without success. I was convinced it was a footnote in Boyce and DiPrima, but could never find it there. I recently ran across the line when I was looking for something else. The quote come in fact it comes from Applied Functional Analysis by D. H. Griffel, 1985.

Superhyperbola

An ellipse has equation

\left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = 1

and a hyperbola has equation

\left(\frac{x}{a}\right)^2 - \left(\frac{y}{b}\right)^2 = 1

Similarly the superellipse has equation

\left|\frac{x}{a}\right|^p + \left|\frac{y}{b}\right|^p = 1

and the superhyperbola

\left|\frac{x}{a}\right|^p - \left|\frac{y}{b}\right|^p = 1
When p = 2, the absolute value signs are unnecessary and the superellipse and superhyperbola reduce to the ellipse and hyperbola respectively.

Increasing p makes the superellipse more like a rectangle. But unlike a rectangle with rounded corners, the change in curvature is continuous.

Increasing p makes the superhyperbola more blunt at the vertices.

Marketing

The superellipse is a fairly well known variation on an ellipse. Even if you’re not familiar the term, you’ve probably seen the shape. I give a couple examples here. The superhyperbola is the obvious analog of a superellipse, but the term is far less common. I’d never hear the term until yesterday.

It’s not clear why the superellipse would be common and the superhyperbola obscure, but here’s some speculation. First of all, the superellipse had an advocate, Piet Hein. If the superhyperbola has an advocate, he’s not a very effective advocate.

The name is also off-putting: juxtaposing super and hyper sounds silly. The etymology makes sense, even if it sounds funny. Piet Hein used the prefix super– to refer to increasing the exponent from the usual value of 2. Its unfortunate that hyperbola begins with a root that is similar to super.

Related posts

The glass disk game

The glass disk game is played on a grid. You have translucent colored glass disks you can either place on an edge or a vertex.

There are two kinds of disks that can be placed on an edge: blue or yellow. A vertex with a blue and yellow disk looks green.

There are two kinds of disks that can be placed on a vertex: red or white. A vertex with a red and white disk looks pink.

For this post we only need red disks. We may use white and pink in a future post.

The following rules tell you how you are allowed to add disks.

Rule 1

If you have this configuration

then you may add a green disk in the middle.

Mnemonic: Think of copying the blue and yellow dots on the end and placing them both in the middle.

Rule 2

If you have this configuration

you may add a blue disk between the two blue disks.

If you have this configuration

you may add a yellow disk between the two yellow disks.

Interpretation

The rules above are all the rules of the game. You do not need to know any mathematics to play the game.

But the game does have a mathematical interpretation. The grid is a commutative diagram. A red disk means the horizontal diagram is exact at that vertex, i.e. the image of the function coming in from the left is the kernel of the function going out to the right.

A blue disk means a function is surjective (onto) and a yellow disk means a function is injective (one-to-one).

The diagram need not represent sets and functions. It could represent objects in a category along with morphisms. In that case blue disks represent epimorphisms and yellow disks monomorphisms.

Rule 1 is known as the five lemma. Rule 2 is called the four lemma.

The glass disk game is a pair of theorems in algebraic topology, or more generally homological algebra.

Voyager’s slingshot maneuvers

This post started out as a thread on X. Here I’ve edited it into a blog post. The image below and the fact cited can be found in JPL Publication 89-24.

Voyager 2 velocity relative to the sun over time

In 1960 it didn’t seem that it would be possible to explore the solar system beyond Jupiter without greatly improved propulsion.

Then the gravitational assist (“slingshot”) maneuver was discovered in 1961. With this new discovery, NASA began making plans to take advantage of an alignment of the outer planets in the 1970s. This led to the Voyager missions.

(Fun fact: In a gravitational assist, the velocity of a spacecraft with respect to the planet doesn’t change, but the velocity relative to the sun changes greatly.)

Note that before encountering Jupiter, Voyager was moving well below solar system escape velocity. As a result of gravitational assists at four planets, the spacecraft is traveling at well over solar system escape velocity.

In a gravitational assist, speed (relative to the sun) increases or decreases, depending on which direction you approach the planet. At the end of the tour, Voyager 2 no longer had a reason to increase speed—it wasn’t possible to visit Pluto—but decreasing speed allowed it to visit both Neptune and its moon Triton.

It may seem that a gravitational assist violates conservation laws: where does the additional momentum come from? From the planets. When Voyager 2 passed Jupiter, Saturn, and Uranus, each of these planets lost some momentum, transferring it to the probe. When the spacecraft passed Neptune, the planet gained some momentum and the probe lost momentum. The changes in momentum were infinitesimal relative to the momentum of the gas giants, but large relative to the momentum of Voyager.

Related post: Error-correcting codes using on Voyager missions

Advantages of Reed-Solomon codes over Golay codes

When Voyager 1 and 2 left Earth, their internal computers were programmed to use Golay error correction codes. Images transmitted from Jupiter and Saturn were encoded using Golay codes. After leaving Saturn, the software was upgraded to use Reed-Solomon error correction codes.

I didn’t realize how much difference the change of encoding made until I ran across a JPL report that elaborated on the efficiency of both codes.

Encoding these data has a price, and that paid for the old Golay encoding algorithm (used at Jupiter and Saturn) was one code bit overhead for every data bit (100 percent). The new RS encoding scheme reduces this overhead to about 20 percent. In addition, it reduces the number of bit errors from 5 in 100,000 to only 1 in a million!

So switching to Reed-Solomon cut overhead by 5x and improved error correction 50x.

There’s a reason CDs use Reed-Solomon codes and not Golay codes.

Related posts