Cross-platform way to enter Unicode characters

The previous post describes the hoops I jumped through to enter Unicode characters on a Mac. Here’s a script to run from the command line that will copy Unicode characters to the system clipboard. It runs anywhere the Python module pyperclip runs.

    #!/usr/bin/env python3

    import sys
    import pyperclip

    cp = sys.argv[1]
    ch = eval(f"chr(0x{cp})")
    print(ch)
    pyperclip.copy(ch)

I called this script U so I could type

    U 03c0

at the command line, for example, it would print π to the command line and also copy it to the clipboard.

Unlike the MacOS solution in the previous post, this works for any Unicode value, i.e. for code points above FFFF.

On my Linux box I had to install xclip before pyperclip would work.

Using Unicode on MacOS

Setting up Unicode on my MacBook took some research, so I’m leaving myself a note here if I need to do it again. Maybe it’ll help someone else too.

Update: I’ve gotten some feedback on this article that suggest people imagine that I want to use this approach to enter large quantities of text, such as typing Cyrillic text one Unicode code point at a time. This is not the case. If I wanted to type Cyrillic text, I’d use a (virtual) Cyrillic keyboard. The use case I have in mind is typing symbols or maybe a single word from a foreign language.

From the System Settings dialog, go to Keyboard and click the Edit button next to Input Sources. Click on the + sign in the lower left corner to add a keyboard. Scroll down to the bottom and click on Other to add Unicode as a keyboard.

screenshot

Now you can toggle between your previous keyboard(s) and Unicode Hex Input. When the latter is active, you can hold down the globe key and type the hex value of a Unicode character to enter that character. For example, you can hold down the globe key and type 03C0 to enter a π symbol.

screenshot

This only works for Unicode characters that can be written with four hexadecimal characters, i.e. up to FFFF, code points in the Basic Multilingual Plane.

Remapping keys

I’ve remapped my keys so that my muscle memory works across operating systems, so I have the functionality of the globe key mapped to the key. So I hold down the key labeled “command” to enter Unicode characters. More on how I remap keys here.

Fixing Emacs

When I added the Unicode keyboard, C + space quit working in Emacs because the operating system hijacked that key. Removing the Unicode keyboard doesn’t put the system back like it was.

You can use C + @ instead of C + space for set-mark-command but this is an awkward keybinding, especially for such a common function. I bound a different key sequence to set-mark-command that I find easier to type.

Command line

See the next post for a command line script that will copy any Unicode character to the clipboard, including code points higher than the solution above can handle.

Circular coordinate art

About three years ago I ran across a strange coordinate system in which familiar functions lead to interesting plots. The system is called “circular coordinates” but it is not polar coordinates.

This morning I was playing around with this again.

Here’s a plot of f(x) = x.

f(x) = x

And here’s a plot of f(x) = cos(8x).

See this post for details of circular coordinates.

Here is Python code to make the plots. You can experiment with your own plots by changing the definition of f.

# See Mathematics Magazine, v 52 no 3, p175

from numpy import cos
from numpy import linspace
import matplotlib.pyplot as plt

plt.style.use('seaborn-v0_8-muted')

def g(u, c, f):
    t = f(u) + c
    
    return 2*u*t**2 / (u**2 + t**2)

def h(u, c, f):
    t = f(u) + c
    return 2*u*u*t / (u**2 + t**2)

t = linspace(-7, 7, 10000)
fig, ax = plt.subplots()
f = lambda x: cos(8*x) 
for c in range(-10, 11):
    ax.plot(g(t, c, f), h(t, c, f))
    plt.axis("off")
plt.show()

When there is only one group of a given size

Today’s date, US style, is 9/26/2023, and there is only one group, up to isomorphism, of size 9262023. You could verify this in Mathematica with the command

    FiniteGroupCount[9262023]

which returns 1.

For a given n, when is there only one group of size n?

There are two requirements. First, n has to be the product of distinct primes, i.e. no prime appears in the factorization with a power greater than 1. Second, no prime divides one less than another prime.

Now

9262023 = 3 × 41 × 257 ×293

and you can check that 3 does not divide 40, 256, or 292, nor does 41 divide 2, 252, or 292, etc.

A more compact way to state the criteria above is to say

gcd(n, φ(n)) = 1

where φ(n) is Euler’s totient function, the number of positive numbers less than n and relatively prime to n.

Why are these criteria equivalent? If

n = pqr

then

φ(n) = (p − 1)(q − 1)(r − 1)…

If n and φ(n) have a nontrivial common factor, it has to be one of the prime factors of n, and none of these divide any term of φ(n).

Source: Dieter Jungnickel. On the Uniqueness of the Cyclic Group of Order n. The American Mathematical Monthly, Vol. 99, No. 6. (Jun. – Jul., 1992), pp. 545–547.

Analogy between prime numbers and simple groups

Simple groups are the building blocks of groups similar to the way prime numbers are the building blocks of integers. This post will unpack this analogy in two ways:

  1. How do simple groups compare to prime numbers?
  2. How does the composition of simple groups compare to the composition of prime numbers?

The former analogy is stronger than the latter.

Primes and simple groups

A simple group has no nontrivial subgroups, just a prime number has no nontrivial factors. Except that’s not quite right. A simple group is defined as having no nontrivial normal subgroups. The previous post compares normal and non-normal subgroups. Normal subgroups have nice properties which are necessary for decomposition and composition. You can’t define quotients for non-normal groups.

Every subgroup of an Abelian group is normal, so in the context of Abelian groups it is true that simple groups have no nontrivial subgroups, i.e. the only subgroups of a simple Abelian group G are the identity and G itself. It follows from Sylow’s theorems that the order of a finite Abelian group with no nontrivial factors must be an integer with no nontrivial factors, i.e. a prime number. Every Abelian finite simple group must be isomoprphic to the integers mod p for some prime p.

Non-Abelian finite simple groups do not have prime order, but they not decomposable in the sense described below.

Composition and decomposition

Prime numbers compose to form other numbers by products. You can also compose groups by taking products, though you need more than that. It is not the case that all finite groups are products of finite simple groups.

Let ℤn denote the cyclic group of order n and let ⊕ denote direct sum. The group ℤ4 is not isomorphic to ℤ2 ⊕ ℤ2. Even in the case of Abelian groups, not all Abelian groups are the direct sum or direct product of simple groups. [1]

Finite groups can be decomposed into smaller finite simple groups, but we can’t easily or uniquely rebuild a group from this decomposition.

The Jordan-Hölder theorem says that a finite group G has a composition series

1 = H0H1 ⊲ … ⊲ Hn = G

where each H is a maximal normal subgroup of the next, the quotients Hi+1 / Hi of consecutive are simple groups. The composition series is not unique, but all such series are equivalent in a sense that the Jordan-Hölder theorem makes precise.

It seems to me that the composition series ought to be called a decomposition series in that you can start with G and find the H‘s, but it’s a difficult problem, known as “the extension problem,” to reconstruct G from the H‘s, and in general there are multiple solutions.

The analogy to prime numbers would be if there was an essentially unique way to factor a number, but not a unique way to multiply the factors back together.

Reductionism

Some people thought that the classification of finite simple groups would be the end group theory. That has not been the case. Some also thought sequencing of the human genome would lead to cures for a huge range of diseases. That has not been the case either. Reductionism often produces disappointing results.

Related posts

[1] In the context of Abelian groups, (direct) products and coproducts (i.e. direct sums) are isomorphic.

Normal and non-normal subgroups

The word “normal” in mathematical nomenclature does not always means “usual” or “customary” as it does in colloquial English. Instead, it might that something has a convenient property. That is the case for normal subgroups.

We can do things with normal subgroups that we cannot do with other subgroups, such as take quotients, and so once normal subgroups are introduced in an algebra class, non-normal subgroups disappear from the discussion. A student might be left with the impression that non-normal subgroups don’t exist.

This post will give a very simple example of a group with a non-normal subgroup, and show how we can’t do operations with this group that we can with normal subgroups.

Definition of normal subgroup

A normal subgroup of a group G is a subgroup N such gN = Ng for any element g in G. That is, if I multiply everything in N by g on the left or I multiply everything in N by g on the right, I get the same set of elements.

This does not mean that gn = ng for every n in N. It means that for every n in N, there is some element m in N such that gn = mg. The elements n and m might be the same, but they might not.

In an Abelian (commutative) group, gn always equals ng, and so all subgroups are normal. But most groups are not Abelian.

Structure-preserving functions

Along with every algebraic structure there are functions that preserve aspects of that structure. In the category of groups, these structure-preserving functions are called homomorphisms, coming from the Greek words for “same” and “shape.”

A homomorphism between groups gives you a sort of picture of the first group inside the second group in a way that is consistent with the structure of groups. Specifically, if f is a homomorphism from a groups G to a group H, then

f(xy) = f(x) f(y).

Here we denote the group operation by writing two things next to each other. So “xy” means the group operation in G applied to x and y. This operation may or may not be multiplication. The same is true on the right-hand side: f(x) f(y) means the group operation in H applied to f(x) and f(y).

For example, if we let G be the real numbers with the group operation being addition, and we let H be the positive real numbers with the group operation being multiplication, then the exponential function is a homomorphism between these groups:

exp(x + y) = exp(x) exp(y).

Kernels

The kernel of a homomorphism between G and H is the subset of things in G that get sent to the identity element in H. In the example above, the identity element in H is 1, and the only element in G that gets mapped to 1 is the number 0. It’s not a coincidence that 0 is the identity element in G: homomorphisms always send the identify element of one group to the identity element of the other group. The kernel of a homomorphism always includes the identity, but it may also include more.

Normal subgroups are subgroups that can be kernels. A subgroup of G is normal if and only if there is some homomorphism from G to another group that has that subgroup as its kernel.

A non-normal subgroup

Let G be the group of permutations on three elements. The group operation is composition, i.e. do one permutation then do the other.

The identity element is the permutation that leaves everything alone. Denote this by 1. Another element of G is the permutation that swaps the first two elements. We’ll denote this by (12).

So if our three elements are a, b, and c, then the permutation 1 takes (abc) to (abc). And the permutation (12) takes (abc) to (b, ac).

The two elements 1 and (12) form a subgroup of G. But we will show that a homomorphism from G to a group H cannot take these two elements, and only these two elements, to the identity on H. It won’t matter what group H is, but we’ll need a name for the identity element in H. Let’s call it e.

Let f be a homomorphism from G to H. () By definition

f(xy) = f(x) f(y)

and by applying this to (xy)z we have

f(xyz) = f(x) f(y) f(z)

If we let z be the inverse of x and y is in the kernel of f, we have

f(xyx−1) = f(x) f(y) f(x−1) = f(x) e f(x)−1 = e.

This says that if y is in the kernel of f, xyx−1 must also be in the kernel of f for every x.

The permutation that swaps the second and third elements, (23), is its own inverse: swap the second and third elements twice and you’re back where you started. So if (12) is in the kernel of f then so is (23)(12)(23). You can work out that (23)(12)(23) reverses three elements. This shows that if the subgroup consisting of 1 and (12) is in the kernel of f, so is the reverse permutation, which is not part of our subgroup. So our subgroup alone cannot be a kernel of a homomorphism.

Related posts

Mersenne primes are unsafe

In the previous post I mentioned that a particular Mersenne prime would be unsuitable for cryptography. In fact, all Mersenne primes are unsuitable for cryptography.

A prime number p is called “safe” if

p = 2q + 1

where q is also a prime. Safe primes are called safe because p − 1 does not have small factors (other than 2). The factors of p − 1 correspond to subgroups of the group used for encryption, and small groups can be exploited to attack encryption.

Mersenne numbers are numbers of the form

Mn = 2n − 1.

Mersenne primes are Mersenne numbers that are also prime. A necessary condition for Mn to be prime is for n to be prime. This condition is not sufficient. For example,

211 − 1 = 23 × 89.

But is necessary, for reasons we’ll get into shortly.

If Mn = 2q + 1, then q = Mn−1. But if n is a prime, then n − 1 is not a prime, with one exception: n = 3. So the only Mersenne prime that is a safe prime is M3 = 7, which is not a particularly large prime. Public key cryptography uses numbers in the thousands of digits, not single digits.

Why does n have to be prime before Mn stands a chance of being prime?

If a > 1, then xa − 1 can be factored:

xa − 1 = (x − 1)(xa−1 + xa−2 + … + 1)

If n can be factored into ab, then set x = 2b. This shows that 2ab − 1 has a factor, namely 2b − 1.

In the previous post we said that M127 − 1 has a lot of small factors. We can find some of those factors easily:

M127 − 1 = 2 M126 = 2 (2126 − 1)

and (2126 − 1) is divisible by 2k – 1 for every k that divides 126.

The nontrivial factors of 126 are 2, 3, 6, 7, 9, 14, 18, 21, 42, 63, and so 2k – 1 is a factor of M126 for k equal to each of these numbers. This is enough to fully factor 2126 − 1 into

3³ × 7² × 19 × 43 × 73 × 127 × 337 × 5419 × 92737 × 649657 × 77158673929

given in the footnote from the previous post. You could easily come up with this factorization using pencil and paper, though it would not be easy to determine by hand that the last factor is a prime number.

Victorian public key cryptography

Electronic computers were invented before public key cryptography. Would public key cryptography have been possible before computers?

The security of RSA encryption depends on the ratio of the difficulty of factoring relative to the difficulty of multiplication. This ratio was high, maybe higher, before modern computers.

Suppose the idea of RSA encryption had occurred to Lewis Carroll (1832–1898). What key size would have been necessary for security in his day? Would it have been practical to manually encrypt data using keys of that size?

I imagine if you handed Victorians a pair of public and private RSA keys, they could have manually carried out public key encryption. But coming up with private keys, i.e. finding pairs of large prime numbers, would be harder than carrying out the encryption process.

The largest prime discovered without a computer was 2127 − 1, proved prime by Edouard Lucas in 1876. Such primes would have been large enough—I seriously doubt it was feasible to factor the product of 40-digit primes at the time—but this was a prime of a very special form, a Mersenne prime. Lucas had developed an algorithm for efficiently testing whether a Mersenne number is prime. To this day the largest known primes are Mersenne primes. More on this here.

Lucas would not have been able to produce two 40-digit primes. The largest known prime in 1851 had 12 digits:

999,999,000,001

Because of the special form of this number, it would seem that even coming up with 12-digit primes was quite an achievement. Euler (1707–1783) had found a 10-digit prime, but it was also a Mersenne prime. Large primes without special structure were unknown. But maybe two 10-digit primes would have been adequate, as long as they had no special structure, i.e. they were found by generating random numbers and testing them for primality.

Perhaps if Lewis Carroll had found a couple moderately large primes, he might have presented them to his queen to be used in public key cryptography. Their product could be published in newspapers, but the factors would be state secrets. Anyone could send Queen Victoria encrypted messages via public communication.

Diffie-Hellman public key encryption might have been more practical. It only requires one large prime, and that prime can be made public. Everyone can share it.

The prime p that Lucas discovered would do, until people realized that p − 1 has a lot of small factors [1], which could be used to break Diffie-Hellman cryptography. I don’t know that any large safe primes were known until more recently.

If someone from the future had given the Victorians a large safe prime, Diffie-Hellman cryptography would have been possible, though laborious. Someone could write a steampunk novel about a time traveler giving the pre-computerized world a big safe prime and teaching them Diffie-Hellman cryptography.

***

[1] 2126 − 1 = 3³ × 7² × 19 × 43 × 73 × 127 × 337 × 5419 × 92737 × 649657 × 77158673929

See the next post for a theorem that would allow you to find this factorization by hand.

Navigating a LaTeX file

I like generating long LaTeX documents from org-mode because, for one thing, org-mode has nice section folding. But not everyone I work with uses Emacs, so its better to work in LaTeX directly rather than have Emacs generate LaTeX.

AUCTeX has section folding for LaTeX documents, though so far I’ve only has limited success at getting it to work. However, RefTeX worked right out of the box.

If you enter reftex-mode ctrlc = then RefTeX will open a table of contents window.

RefTeX screen shot

Scrolling through the table of contents window scrolls through the body of the document. This isn’t exactly section folding, but it serves a similar purpose.

RefTeX ships with Emacs, so there’s probably no need to install it, but the mode is not enabled by default.

HTML entity data

It’s surprisingly hard to find a complete list of HTML entities in the form of a data file. There are numerous sites that give lists, often incomplete, in a page formatted to be human-readable but not machine-readable.

Here’s an XML file from the W3C.

Here’s a two-column text file I created from the W3C data.