Groups vs Abelian groups: Pedantic or profound?

This article will probably only be of interest to a small number of readers. Those unfamiliar with category theory may find it bewildering, and those well versed in category theory may find it trivial. My hope is that someone in between, someone just starting to get a handle on category theory, will find it helpful. I wish I’d run across an article like this when I was in school.

My confusion

As a student I was confused by the inordinate stress on the distinction between general groups and Abelian groups. This seemed like a very simple thing that was being overemphasized. Does multiplication commute or not? If it does, you have an Abelian group; otherwise you do not. That’s all. And yet my professor seemed to think something deep was going on.

What I didn’t appreciate at the time is that there is something deep going on, not when you look at individual groups but when you look at kinds of groups collectively. That is, the category of general groups is quite different from the category of Abelian groups. This distinction was totally lost on me at the time.

Clarifying example

I ran across an exercise recently that pinpoints what I was missing. The exercise asks the reader to show that the product of two cyclic groups is a coproduct in the category of Abelian groups but it is not a coproduct in the category of groups.

Wrong perspective

Here’s how I would have thought about the problem at the time. The coproduct of two cyclic groups is their direct sum, and that’s the same group as the product. The coproduct is an Abelian group, so it’s a group, so it’s in the category of groups. So the statement in the exercise is wrong!

The exercise wasn’t wrong; the thinking above is wrong. But it’s wrong in a very subtle way.

In my mind, a category was a label that you put on things after you’ve done your calculations. This is a bear, it’s brown, so it’s a brown bear. What’s hard about that? What I was missing was the idea of a category as a working context, not just a classification label.

Right perspective

Products and coproducts are defined in the context of a category. That’s what I was missing. In my mind, the coproduct of two groups was defined in some operational way. But what I thought of as a definition was a theorem. The definition depends on the context of a category, the category of Abelian groups, and the thing defined in that context turns out to have the operational properties that I took to be the definition.

You can’t just carry out some calculation and ask what category your result lies in, because the definition of what you’re calculating depends on the context of the category.

In category theory, products and coproducts are defined by universal properties. The (co)product of two objects A and B in a category is defined by saying that something holds for every object C in the category. (More on this here.)

In the category of Abelian groups, we’re saying that something happens for every Abelian group, but not necessarily for every group. That’s why a coproduct in the category of Abelian groups may not be a coproduct in the category of all groups.
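
To make this concrete, here is the universal property written out, along with a specific example; the ℤ/2, ℤ/3 illustration is mine, not part of the exercise. The coproduct of A and B is an object A ⊔ B together with morphisms i_A : A → A ⊔ B and i_B : B → A ⊔ B such that for every object C and every pair of morphisms f : A → C and g : B → C there is a unique morphism h : A ⊔ B → C with

h \circ i_A = f \quad\text{and}\quad h \circ i_B = g

In the category of Abelian groups, the coproduct of two finite cyclic groups is their direct sum, which is the same group as their product. In the category of all groups, the coproduct is the free product, and the free product of two nontrivial groups is always infinite. So the coproduct of ℤ/2 and ℤ/3 in the category of groups is the infinite free product ℤ/2 ∗ ℤ/3, not the six-element group ℤ/2 × ℤ/3.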

Supereggs, squigonometry, and squircles

The Depths of Wikipedia Twitter account posted a screenshot about supereggs that is popular at the moment. It says

there’s no way this is real. they must be making these words up

above a screenshot from the Wikipedia article on supereggs saying

The definition can be changed to have an equality rather than an inequality; this changes the superegg to being a surface of revolution rather than a solid.

I assume the Twitter account is having fun, not seriously suggesting that the terms are made up.

The terms “superegg” and “squircle” are whimsical but have been around for decades and have precise meanings. I hadn’t heard of “squigonometry,” but there are many variations on trigonometry that replace a circle with another curve, the best-known example being hyperbolic trigonometry.

The equation for the volume of the superegg looked familiar but not quite right. It turns out the definition of superegg is not quite what I thought it was.

Brass superegg by Piet Hein

Piet Hein coined the terms superellipse and superegg. The photo above is a brass superegg made by Piet Hein [1].

A superellipse is what mathematicians more commonly call a p-norm ball in two dimensions. I assumed that a superegg was a p-norm ball in three dimensions, but it’s not quite.

A unit p-norm ball in 3 dimensions has equation

|x|^p + |y|^p + |z|^p = 1

A superegg, however, has equation

\left(\sqrt{x^2 + y^2}\right)^p + |z|^p = 1

If you slice a p-norm ball horizontally or vertically you get another p-norm ball. So in three dimensions, either a vertical or horizontal slice gives you a superellipse.

But a horizontal slice of a superegg is a circle while a vertical slice is a superellipse, which is not a circle unless p = 2. Said another way, supereggs are rotationally symmetric about the z-axis but p-norm balls are not unless p = 2.

I’ve left out one detail: superellipses and supereggs typically stretch one of the axes. So you’d replace x with x/k in the definition of a superellipse or replace z with z/k in the definition of a superegg. A squircle is a superellipse with the two axes equal, and typically p is set to 4 or a value near 4.
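
Here’s a short Python sketch, not from the original post, that plots a squircle by parametrizing |x|^p + |y|^p = 1; the choice p = 4 and the plotting details are mine.

    import numpy as np
    import matplotlib.pyplot as plt

    p = 4  # p = 4 gives a squircle; p = 2 gives a circle
    theta = np.linspace(0, 2*np.pi, 1000)

    # parametrization of |x|^p + |y|^p = 1: the p-th powers reduce to cos^2 + sin^2 = 1
    x = np.sign(np.cos(theta)) * np.abs(np.cos(theta))**(2/p)
    y = np.sign(np.sin(theta)) * np.abs(np.sin(theta))**(2/p)

    plt.plot(x, y)
    plt.gca().set_aspect("equal")
    plt.show()

Raising p pushes the curve toward a square, and replacing x with x/k stretches it into a superellipse.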


[1] Photo by Malene Thyssen, licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.

Corny AI

Clippy

Meredith Whittaker posted on Twitter that

In addition to being the best in privacy, Signal is also the best in not subjecting you to corny ‘AI’ features no one asked for or wants.

I love the phrase “corny AI.” That’s exactly what a lot of AI features are.

“Would you like help composing that tweet?”

“No thank you, I can write tiny text messages by myself.”

AI is the new Clippy.

I’m sure someone will object that these are early days, and AI applications will get better. That’s probably true, but they’re corny today. Inserting gimmicky, annoying technology now on the basis that future versions might be useful is like serving someone unripe fruit.

“This banana is hard and bitter!”

“Yes, but you should enjoy it because it would have been soft and sweet a few days from now.”

Of course not all AI is corny. For example, GPS has become reliable and unobtrusive. But there’s a rush to add AI just so a company can say their product includes AI. If the AI worked really well, the company would brag about how well the software works, not what technology it uses.


Today’s star

Exponential sum of the day 10/2/2023

The star-like image above is today’s exponential sum.

The exponential sum page on my site generates a new image each day by putting the numbers of the day’s month, day, and year into the equation

\sum_{n=0}^N \exp\left( 2\pi i \left( \frac{n}{m} + \frac{n^2}{d} + \frac{n^3}{y} \right ) \right )

and connecting the partial sums in the complex plane. Here m is the month, d is the day, and y is the last two digits of the year.
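
Here is a short Python sketch of the idea. This is my own code, not what the site runs, and I’ve taken the upper limit N to be the least common multiple of m, d, and y so the terms run through a full period; that choice is an assumption on my part.

    import numpy as np
    import matplotlib.pyplot as plt

    m, d, y = 10, 2, 23  # month, day, last two digits of the year

    N = np.lcm.reduce([m, d, y])
    n = np.arange(N + 1)
    z = np.exp(2j*np.pi*(n/m + n**2/d + n**3/y))
    s = np.cumsum(z)  # partial sums in the complex plane

    plt.plot(s.real, s.imag)
    plt.gca().set_aspect("equal")
    plt.axis("off")
    plt.show()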

Some people have asked why I use American date order: month, day, year. The flippant answer is I use American date order because I’m American. But I did experiment with other date orders, and I prefer the sequence of images produced by the order above. There’s more contrast between consecutive images by associating the day with the quadratic term rather than the linear term inside the exponential.

The exponential sum page is about six years old [1], and I still enjoy checking in on it each day. Short of making the plot, it’s not possible to imagine what an image will look like based on the date, other than the very rough rule that larger numbers tend to produce more complicated images. For example, images are much more intricate on New Year’s Eve than on New Year’s Day.

The images are often highly symmetric, as today’s image is. But occasionally they have no symmetry, as will be the case on 10/10/23.

The page lets you scroll back and forth by day, but you can put in any parameters you’d like by editing the page URL. For example, the link to today’s image is

   https://www.johndcook.com/expsum/?y=23&m=10&d=2

but you can change y, m, and d to any numbers you wish. There’s nothing that constrains m, for example, to be a number between 1 and 12. You could set it to 17 if you’d like. And although thirty days hath September, you can see what the image for September 31st would have looked like.

[1] The page was launched October 9, 2017, so its sixth anniversary is a week from today.

Consecutive coupon collector problem

Coupon collector problem

Suppose you have a bag of balls labeled 1 through 1,000. You draw balls one at a time and put them back after each draw. How many draws would you have to make before you’ve seen every ball at least once?

This is the coupon collector problem with N = 1000, and the expected number of draws is

N H_N

where

H_N = 1 + 1/2 + 1/3 + … + 1/N

is the Nth harmonic number.

As N increases, H_N approaches log(N) + γ where γ = 0.577… is the Euler-Mascheroni constant, and so the expected time for the coupon collector problem is approximately

N (log(N) + γ).
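
As a quick numerical check (my own, using the N from the example above):

    from math import fsum, log

    N = 1000
    gamma = 0.5772156649015329  # Euler-Mascheroni constant

    H_N = fsum(1/k for k in range(1, N + 1))
    print(N * H_N)               # exact expected number of draws, about 7485.5
    print(N * (log(N) + gamma))  # approximation, about 7485.0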

Consecutive draws

Now suppose that instead of drawing single items, you draw blocks of consecutive items. For example, suppose the 1,000 balls are arranged in a circle. You pick a random starting point on the circle, then scoop up 10 consecutive balls, then put them back. Now how long would it take to see everything?

By choosing consecutive balls, you make it harder for a single ball to be a holdout. Filling in the holes becomes easier.

Bucketed problem

Now suppose the 1,000 balls are placed in 100 buckets and the buckets are arranged in a circle. Now instead of choosing 10 consecutive balls, you choose a bucket of 10 balls. Now you have a new coupon collector problem with N = 100.

This is like the problem above, except you are constraining your starting point to be a multiple of n.

Upper and lower bounds

I’ll use the word “scoop” to mean a selection of n balls at a time to avoid possible confusion over drawing individual balls or groups of balls.

If you scoop n balls at a time by making n independent draws, then you just have the original coupon collector problem, with the expected time divided by n.

If you scoop up n consecutively numbered balls each time, you reduce the expected time to see everything at least once. But your scoops can still overlap. For example, maybe you selected 13 through 22 on one draw, and 19 through 28 on the next.

In the bucketed problem, you reduce the expected time even further. Now your scoops will not partially overlap. (But they may entirely overlap, so it’s not clear that this reduces the total time.)

It would seem that we have sandwiched our problem between two other problems we have the solution to. The longest expected time would be if our scoop is made of n independent draws. Then the expected number of scoops is

N H_N / n.

The shortest time is the bucketed problem in which the expected number of scoops is

(N/n) H_{N/n}.

It seems the problem of scooping n consecutive balls, with no constraint on the starting point, would have expected time somewhere between these two bounds. I say “it seems” because I haven’t proven anything here, just given plausibility arguments.

By the way, we can see how much bucketing reduces the expected time by using the log approximation above. With n independent draws each time, the expected number of scoops is roughly

(N/n) log(N)

whereas with the bucketed problem the expected number of scoops is roughly

(N/n) log(N/n).

Expected number of scoops

I searched a bit on this topic, and I found many problems with titles like “A variation on the coupon collector problem,” but none of the papers I found considered the variation I’ve written about here. If you work out the expected number of scoops, or find a paper where someone has worked this out, please let me know.

The continuous analog seems like an easier problem, and one that would provide a good approximation. Suppose you have a circle of circumference N and randomly place arcs of length n on the circle. What is the expected time until the circle is covered? I imagine this problem has been worked out many times and may even have a name.

Update: Thanks to Monte for posting links to the solution to the continuous problem in the comments below.

Simulation results

When N = 1000 and n = 10, the upper and lower bounds work out to 748 and 518.

When I simulated the consecutive coupon collector problem I got an average of 675 scoops, a little more than the average of the upper and lower bounds.
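
Here is a minimal simulation sketch, mine rather than the code behind the numbers above, of the consecutive-scoop version described in the post:

    import random

    def consecutive_scoops(N=1000, n=10):
        """Number of scoops of n consecutive balls (random starting point
        on a circle of N balls) needed to see every ball at least once."""
        seen = [False] * N
        remaining = N
        scoops = 0
        while remaining > 0:
            start = random.randrange(N)
            scoops += 1
            for k in range(n):
                i = (start + k) % N
                if not seen[i]:
                    seen[i] = True
                    remaining -= 1
        return scoops

    trials = [consecutive_scoops() for _ in range(1000)]
    print(sum(trials) / len(trials))  # the post reports an average of about 675 scoops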


Regular solids and Monte Carlo integration

Monte Carlo integration is not as simple in practice as the way it is usually introduced. A homework problem might ask you to integrate a function of two variables by selecting random points from a cube and counting how many of the points fall below the graph of the function. This would indeed give you an estimate of the volume bounded by the surface and hence the value of the integral.

But Monte Carlo integration is most often used in high dimensions, and the region of interest takes up a tiny proportion of a bounding box. In practice you’d rarely sample uniformly from a high-dimensional box. This post will look at sampling points on a (possibly high-dimensional) sphere.

The rate of convergence of Monte Carlo integration depends on the variance of the samples, and so people look for ways to reduce variance. Antipodal sampling is one such approach. The idea is that a function on a sphere is likely to take on larger values on one side of the sphere and smaller values on the other. So for every point x where the function is sampled, it is also sampled at the diametrically opposite point −x on the assumption/hope that the values of the function at the two points are negatively correlated.
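
Here is a small Python sketch of antipodal sampling for estimating the average of a function over the unit sphere. The code and the test function f are my own illustration, not from any particular application.

    import numpy as np

    rng = np.random.default_rng()

    def f(x):
        # arbitrary test function evaluated at points on the sphere
        return np.exp(x[:, 0] + 0.5*x[:, 1])

    def sphere_points(num, dim=3):
        # uniform points on the unit sphere: normalize Gaussian vectors
        v = rng.standard_normal((num, dim))
        return v / np.linalg.norm(v, axis=1, keepdims=True)

    x = sphere_points(10_000)
    plain = f(x).mean()                       # plain Monte Carlo estimate
    antipodal = 0.5*(f(x) + f(-x)).mean()     # pair each sample with its antipode
    print(plain, antipodal)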

Antipodal sampling is a first step in the direction of a hybrid of regular and random sampling, sampling by random choices of regularly spaced points, such as antipodal points. When this works well, you get a sort of synergy, an integration method that converges faster than either purely systematic or purely random sampling.

If a little is good, then more is better, right? Not necessarily, but maybe, so it’s worth exploring. If I remember correctly, Alan Genz explored this. Instead of just taking antipodal points, you could sample at the points of a regular solid, like a tetrahedron. Randomly select an initial point, create a tetrahedron on the sphere with this as one of the vertices, and sample your integrand at each of the vertices. Or you could think of having a tetrahedron fixed in space and randomly rotating the sphere so that the sphere remains in contact with the vertices of the tetrahedron.

If you’re going to sample at the vertices of a regular solid, you’d like to know what regular solids are possible. In three dimensions, there are five: tetrahedron, hexahedron (cube), octahedron, dodecahedron, and icosahedron. Only the first three of these generalize to dimensions 5 and higher, so you only have three choices in high dimensions if you want to sample at the vertices of a regular solid.

Here’s more about the cross polytope, the generalization of the octahedron.

If you want more regularly-spaced points on a sphere than regular solids will permit, you could compromise and use points whose spacing is approximately regular, such as the Fibonacci lattice. You could randomly rotate your Fibonacci lattice to create a randomized quasi-Monte Carlo (RQMC) method.
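
As a sketch of that last idea (mine, and only one of several ways to randomize), here is a Fibonacci lattice on the sphere with a random rotation applied, where the rotation comes from the QR decomposition of a Gaussian matrix:

    import numpy as np

    rng = np.random.default_rng()

    def fibonacci_sphere(num):
        # approximately evenly spaced points on the unit sphere
        i = np.arange(num)
        golden = (1 + np.sqrt(5))/2
        z = 1 - 2*(i + 0.5)/num
        r = np.sqrt(1 - z**2)
        theta = 2*np.pi*i/golden
        return np.column_stack([r*np.cos(theta), r*np.sin(theta), z])

    def random_rotation(dim=3):
        # random orthogonal matrix via QR of a Gaussian matrix
        q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
        return q * np.sign(np.diag(r))

    points = fibonacci_sphere(500) @ random_rotation().T

Averaging the integrand over several independently rotated copies of the lattice gives a randomized quasi-Monte Carlo estimate along the lines described above.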

You have a knob you can turn determining the amount of regularity and the amount of randomness. At one extreme is purely random sampling. As you turn the knob you go from antipodes to tetrahedra and up to cross polytopes. Then there’s a warning line, but you can keep going with things like the Fibonacci lattice, introducing some distortion, sorta like turning the volume of a speaker up past the point of distortion.

In my experience, I’ve gotten the best results near the random sampling end of the spectrum. Antipodal sampling sped up convergence, but other methods not so much. But your mileage may vary; the results depend on the kind of function you’re integrating.

Cross-platform way to enter Unicode characters

The previous post describes the hoops I jumped through to enter Unicode characters on a Mac. Here’s a script to run from the command line that will copy Unicode characters to the system clipboard. It runs anywhere the Python module pyperclip runs.

    #!/usr/bin/env python3

    import sys
    import pyperclip

    cp = sys.argv[1]       # code point as a hex string, e.g. 03c0
    ch = chr(int(cp, 16))  # convert the hex string to a character
    print(ch)
    pyperclip.copy(ch)

I called this script U so I could type

    U 03c0

at the command line. For example, the command above prints π to the terminal and also copies it to the clipboard.

Unlike the MacOS solution in the previous post, this works for any Unicode value, including code points above FFFF.

On my Linux box I had to install xclip before pyperclip would work.

Using Unicode on MacOS

Setting up Unicode on my MacBook took some research, so I’m leaving myself a note here if I need to do it again. Maybe it’ll help someone else too.

Update: I’ve gotten some feedback on this article that suggests people imagine that I want to use this approach to enter large quantities of text, such as typing Cyrillic text one Unicode code point at a time. This is not the case. If I wanted to type Cyrillic text, I’d use a (virtual) Cyrillic keyboard. The use case I have in mind is typing symbols or maybe a single word from a foreign language.

From the System Settings dialog, go to Keyboard and click the Edit button next to Input Sources. Click on the + sign in the lower left corner to add a keyboard. Scroll down to the bottom and click on Other to add Unicode as a keyboard.

screenshot

Now you can toggle between your previous keyboard(s) and Unicode Hex Input. When the latter is active, you can hold down the globe key and type the hex value of a Unicode character to enter that character. For example, you can hold down the globe key and type 03C0 to enter a π symbol.

screenshot

This only works for Unicode characters that can be written with four hexadecimal characters, i.e. up to FFFF, code points in the Basic Multilingual Plane.

Remapping keys

I’ve remapped my keys so that my muscle memory works across operating systems, so I have the functionality of the globe key mapped to the command key. So I hold down the key labeled “command” to enter Unicode characters. More on how I remap keys here.

Fixing Emacs

When I added the Unicode keyboard, C + space quit working in Emacs because the operating system hijacked that key combination. Removing the Unicode keyboard doesn’t put the system back the way it was.

You can use C + @ instead of C + space for set-mark-command but this is an awkward keybinding, especially for such a common function. I bound a different key sequence to set-mark-command that I find easier to type.

Command line

See the next post for a command line script that will copy any Unicode character to the clipboard, including code points higher than the solution above can handle.

Circular coordinate art

About three years ago I ran across a strange coordinate system in which familiar functions lead to interesting plots. The system is called “circular coordinates” but it is not polar coordinates.

This morning I was playing around with this again.

Here’s a plot of f(x) = x.

f(x) = x

And here’s a plot of f(x) = cos(8x).

See this post for details of circular coordinates.

Here is Python code to make the plots. You can experiment with your own plots by changing the definition of f.

# See Mathematics Magazine, v 52 no 3, p175

from numpy import cos
from numpy import linspace
import matplotlib.pyplot as plt

plt.style.use('seaborn-v0_8-muted')

# Map a point on the curve t = f(u) + c to the plane using
# the circular coordinate transformation (u, t) -> (g, h).

def g(u, c, f):
    # x coordinate of the transformed point
    t = f(u) + c
    return 2*u*t**2 / (u**2 + t**2)

def h(u, c, f):
    # y coordinate of the transformed point
    t = f(u) + c
    return 2*u*u*t / (u**2 + t**2)

t = linspace(-7, 7, 10000)
fig, ax = plt.subplots()
f = lambda x: cos(8*x)  # change f to experiment with other plots
for c in range(-10, 11):
    ax.plot(g(t, c, f), h(t, c, f))
plt.axis("off")
plt.show()

When there is only one group of a given size

Today’s date, US style, is 9/26/2023, and there is only one group, up to isomorphism, of size 9262023. You could verify this in Mathematica with the command

    FiniteGroupCount[9262023]

which returns 1.

For a given n, when is there only one group of size n?

There are two requirements. First, n has to be a product of distinct primes, i.e. no prime appears in the factorization with a power greater than 1. Second, no prime factor of n divides one less than another prime factor of n.

Now

9262023 = 3 × 41 × 257 × 293

and you can check that 3 does not divide 40, 256, or 292, nor does 41 divide 2, 256, or 292, etc.

A more compact way to state the criteria above is to say

gcd(n, φ(n)) = 1

where φ(n) is Euler’s totient function, the number of positive integers less than n and relatively prime to n.

Why are these criteria equivalent? If

n = pqr…

then

φ(n) = (p − 1)(q − 1)(r − 1)…

Any nontrivial common factor of n and φ(n) would have to be divisible by one of the prime factors of n, and by assumption none of those primes divides any of the factors (p − 1), (q − 1), (r − 1), … of φ(n). (And if n were not squarefree, say p² divided n, then p itself would divide φ(n), which is why the first requirement is needed.)
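
The criterion is easy to check numerically. Here’s a quick sketch in Python using SymPy, my choice of tool here; the post itself used Mathematica:

    from math import gcd
    from sympy import totient

    n = 9262023
    print(gcd(n, int(totient(n))) == 1)  # True: only one group of this order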

Source: Dieter Jungnickel. On the Uniqueness of the Cyclic Group of Order n. The American Mathematical Monthly, Vol. 99, No. 6. (Jun. – Jul., 1992), pp. 545–547.