Introduction
Computers fundamentally changed cryptography, opening up new possibilities for making and breaking codes. At first it may not have been clear which side benefited most, but now it’s clear that computers gave more power to code makers than code breakers.
We now have cryptographic primitives that cannot be attacked more efficiently than by brute force, as far as we know. The weak link is how these primitives are implemented and combined, not the primitives themselves.
Before computers there was more of a cat and mouse game between encryption and cryptanalysis. Encryption schemes that were convenient to carry out by hand could usually be broken by hand eventually. But if you only needed secrecy briefly, a simple scheme might provide that secrecy for long enough. This post will look at one such scheme, the straddling checkerboard.
Checkerboards
Perhaps the most obvious way to conveniently turn letters into numbers is to arrange the letters into a 5 × 5 grid. This has to leave out one letter, and in practice this meant combining I and J. Or if you needed digits, you could use a 6 × 6 grid and put J back in. You’d scramble the alphabet in the grid according to some key, then encrypt each letter by its coordinates.
12345 +----- 1|EBISP 2|XWLVN 3|AOYZQ 4|MDCKH 5|RTUFG
This is no better than a simple substitution cipher because someone intercepting a message encrypted this way would easily guess that pairs of digits represent letters. However, if you then permuted the digits with a transposition cipher, you’d have something more formidable. This is essentially what the ADFGV cipher did, which stumped cryptanalysts for a while.
The straddling checkerboard is a variation on the method above. Letters would be arranged in a 3 × 10 grid rather than 5 × 5. Some letters would be encrypted as a single digit and some as a pair of digits.
1234567890 +---------- | EBISPXWL 1|VNAOYZQMDC 2|KHRTUFGJ./
In the example above, E would be encrypted as 3, N would be encrypted as 12, and so on. This is an instance of a prefix code. In order to be able to decode the digits unambiguously, no letter could be encoded as 1 or 2; these digits always signaled the beginning of a pair.
Prefix codes are often used in non-secret codes, such as country codes for telephone numbers. More examples of prefix codes in this post.
Because 1 and 2 could not be used to encode single letters, there were 28 slots to fill. These could be filled with other symbols, and in practice period and slash were added [1].
Efficiency
The straddling checkerboard gives a more efficient encoding than does the checkerboard since typically fewer digits will be required. If efficiency were the only concern, we’d put the eight most frequent letters on the top row, something like the following [2].
1234567890 +---------- | ETAOINSR 1|BCDFGHJKLM 2|PQUVWXYZ./
This would be more efficient but less secure since the arrangement of the letters would be more predictable.
Security
The straddling checkerboard presents a bit of a challenge to the cryptanalyst since it’s not know a priori whether a digit is part of a pair (if the vertical coordinates are not always 1 and 2).
The straddling checkerboard didn’t offer much security even in its day. It would have been better if there had been some further processing done on the digits, such as how the ADFGV cipher permuted its coordinates.
The message, interpreted as a number N, could have been further encrypted as aN + b where a and b were randomly chosen numbers that were part of the key. As far as I know, nothing like this was ever done. This would have provided more security but would also require more effort and increase the chance of introducing errors.
Related posts
[1] David Kahn. The Codebreakers. Chapter 18.
[2] You may have expected the last letter on the first row to be H, going by the printer’s order ETAOIN SHRDLU. Peter Norvig discovered a slightly different order of letter frequencies based on the Google corpus.