Fifth edition of the N&O column / Spooks newsletter

(Date: Tue, 25 Aug 1998 23:23:48 +0200)

Morse Stations | Two Letter Stations | Cherry Ripe | The Counting Station | Simple Substitution | Jamming | Logs
Index | NS NL Home


Crypto - Simple Substitutions

by Torbjorn Andersson

Introduction

The simplest cryptosystems merely replace the cleartext letters with other letters, numbers, or, in some cases, arbitrary symbols. Usually only one cryptosymbol is allotted to each cleartext symbol, but in some more complex systems, variant cryptosymbols are allotted to the more common letters of the language in question.

The Caesar cipher

Julius Caesar is said to have used a very simple method to safeguard his communications, the so-called Caesar cipher. In the Caesar cipher each letter of the cleartext is replaced by the letter found three places further down the alphabet (at the end of the alphabet the letters "wrap around", so that A follows Z). The key for Caesar's secret cipher looks like this:

Clear text:  abcdefghijklmnopqrstuvwxyz
Cipher text: DEFGHIJKLMNOPQRSTUVWXYZABC
Example:     Julius Caesar
Becomes:     MXOLXV FDHVDU
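The substitution above can be sketched in a few lines of Python. This is only an illustration of the shift-by-three rule; the function names and the `shift` parameter are assumptions made for the example and do not come from the newsletter.

```python
import string

def caesar_encrypt(cleartext: str, shift: int = 3) -> str:
    """Replace each letter by the one `shift` places further down the alphabet."""
    out = []
    for ch in cleartext.lower():
        if ch in string.ascii_lowercase:
            # The modulo makes the alphabet wrap around, so that A follows Z.
            idx = (string.ascii_lowercase.index(ch) + shift) % 26
            out.append(string.ascii_uppercase[idx])
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(out)

def caesar_decrypt(ciphertext: str, shift: int = 3) -> str:
    """Undo the shift by moving each letter the same distance back."""
    return caesar_encrypt(ciphertext, -shift).lower()

print(caesar_encrypt("julius"))   # MXOLXV
print(caesar_decrypt("MXOLXV"))   # julius
```

Any other shift between 1 and 25 works the same way, which is also why the cipher offers so little protection: there are only 25 keys to try.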

The Checkerboard

The Greek historian Polybios (ca 200 BC) tells us about a signalling system said to have been in use in Greece. The Greek alphabet of 24 letters is written in five rows of five letters each (the last row having only four letters), thus forming a square, or checkerboard. Then, according to Polybios, torches are held up to send a message to a place within sight. First, between one and five torches are held up to indicate the row in which the sought letter stands in the square; then a second set of torches is held up, whose number tells the column in which the letter is found.

Needless to say, this signalling scheme is somewhat slow, but it can be used as a cryptosystem in the following way. We first adapt the system to the Latin alphabet. Since there are 26 letters but only 25 cells in a five by five square, one letter must be sacrificed (or we can use, for example, six rows instead of five). Usually the letters I and J are put in the same cell and treated as one, since ambiguity as to which letter is meant will seldom arise. Here is a typical checkerboard with the Latin alphabet:

  1  2  3  4  5
1 a  b  c  d  e
2 f  g  h  ij k
3 l  m  n  o  p
4 q  r  s  t  u
5 v  w  x  y  z

To encrypt a text with this system, each letter of the cleartext is replaced by a two-figure number, the first figure giving the row in which the cleartext letter stands and the second figure giving the column. In this checkerboard key, the cleartext Troy has fallen becomes 44 42 34 54 - 23 11 43 - 21 11 31 31 15 33.
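As a rough sketch, the row-and-column lookup can be written as follows in Python. The square is the five by five key shown above with I and J sharing a cell; the function names and the " - " word separator are choices made for this example only.

```python
# The checkerboard key; "i" stands for the shared i/j cell.
SQUARE = ["abcde", "fghik", "lmnop", "qrstu", "vwxyz"]

def checkerboard_encrypt(cleartext: str) -> str:
    """Replace each letter by its row figure followed by its column figure."""
    words = []
    for word in cleartext.lower().split():
        numbers = []
        for ch in word:
            ch = "i" if ch == "j" else ch
            for r, row in enumerate(SQUARE, start=1):
                c = row.find(ch)
                if c != -1:
                    numbers.append(f"{r}{c + 1}")
                    break  # characters not in the square are silently skipped
        words.append(" ".join(numbers))
    return " - ".join(words)

def checkerboard_decrypt(cryptogram: str) -> str:
    """Look each two-figure number up again; the i/j cell is returned as 'i'."""
    words = []
    for word in cryptogram.split(" - "):
        letters = (SQUARE[int(n[0]) - 1][int(n[1]) - 1] for n in word.split())
        words.append("".join(letters))
    return " ".join(words)

print(checkerboard_encrypt("Troy has fallen"))
# 44 42 34 54 - 23 11 43 - 21 11 31 31 15 33
```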

A number of variants of the key shown above exist. Letters can be used instead of figures to indicate the rows and columns, if one likes. In some cases the figures labelling the rows and columns are put in a different order, or each row and column is given two figures, like this:

     2,6  0,3  1,5  7,9  4,8
6,8   a    b    c    d    e
1,4   f    g    h    ij   k
0,9   l    m    n    o    p
2,7   q    r    s    t    u
3,5   v    w    x    y    z

For every letter the user can choose either of the two figures for the row and either of the two figures for the column. It is, of course, possible to choose different cipher numbers for the same letter when it occurs elsewhere in the message, thus hiding repetitions, like this:

Message: T  H  E  B  A  T  T  A  L  I  O  N
Cipher:  77 15 68 83 66 29 79 62 02 17 09 91

Message: I  S  M  O  V  I  N  G  S  O  U  T  H
Cipher:  19 21 00 99 32 47 05 10 75 99 24 79 11

Commonly, one would put these numbers together to form standard five-figure groups before transmission, like this:

77156 88366 29796 20217 09911 92100 99324 70510 75992 47911
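The variant with two figures per row and column might be sketched like this in Python. The random choice between the alternative figures is an assumption about how a user would vary them; the function names are illustrative, and padding of an incomplete final group is not handled.

```python
import random

SQUARE = ["abcde", "fghik", "lmnop", "qrstu", "vwxyz"]  # "i" is the i/j cell
ROW_FIGS = ["68", "14", "09", "27", "35"]  # two alternative figures per row
COL_FIGS = ["26", "03", "15", "79", "48"]  # two alternative figures per column

def encrypt(cleartext, rng=random):
    """Pick one of the two row figures and one of the two column figures."""
    digits = []
    for ch in cleartext.lower():
        ch = "i" if ch == "j" else ch
        for r, row in enumerate(SQUARE):
            c = row.find(ch)
            if c != -1:
                digits.append(rng.choice(ROW_FIGS[r]) + rng.choice(COL_FIGS[c]))
                break  # spaces and other characters are dropped
    return "".join(digits)

def decrypt(digits):
    """Each figure belongs to exactly one row (or column), so reading is unique."""
    digits = digits.replace(" ", "")
    out = []
    for i in range(0, len(digits), 2):
        r = next(k for k, figs in enumerate(ROW_FIGS) if digits[i] in figs)
        c = next(k for k, figs in enumerate(COL_FIGS) if digits[i + 1] in figs)
        out.append(SQUARE[r][c])
    return "".join(out)

def five_figure_groups(digits):
    """Split a run of figures into groups of five for transmission."""
    return " ".join(digits[i:i + 5] for i in range(0, len(digits), 5))

# The output of encrypt() differs from run to run, but always decrypts the same way.
print(five_figure_groups(encrypt("the battalion is moving south")))
print(decrypt("77156 88366 29796 20217 09911 92100 99324 70510 75992 47911"))
# thebattalionismovingsouth
```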

A nifty checkerboard variant exists in which some of the letters, usually the ones occurring most frequently in the language in question, receive single-figure cryptosymbols, while the rest get two-figure combinations just as above. Let's say the key looks like this:

  7 4 1 0 8 5 2 9 6 3
  A S I N T O E R    
6 B C D F G H J K L M
3 P Q U V W X Y Z . /

The first row containing letters is formed by the mnemonic phrase A sin to err, with the last r dropped (the phrase happens to contain the eight most frequent letters of English). The rest of the alphabet is then listed in order in two rows of ten cells, ending with a period mark and a slash (the slash may be used to separate words when ambiguity would arise if they were written together). The figures in the top row, and those at the last two positions of the first column, are used as coordinates to refer to the cell in the table containing the letter to be encrypted. The letters of the first row are encrypted as single figures, the letters of the second row get two-figure numbers commencing with the figure 6, and the letters of the last row get two-figure numbers commencing with the figure 3.

As can be seen by looking at the table, the figures 6 and 3 cannot be single-figure numbers, because no letter of the first row stands under them; they must commence, or be the second figure of, a two-figure number. Thus there is no danger involved in running the numbers of a cryptogram together as a string, or in five-figure groups. It is always possible to decrypt such a cryptogram without any ambiguity as to which figures are to be read as single figures and which are to be treated as parts of two-figure numbers. The string:

645636331016478150

can only be divided in one way, thus:

64-5-63-63-31-0-1-64-7-8-1-5-0

By referring to the table above, the cleartext communication is easily derived. As can be seen, only five out of a total of thirteen letters are encrypted as two-figure numbers, which substantially shortens the cryptogram and the transmission time needed.
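The unambiguous division of the string can be sketched as follows in Python, using the table above exactly as printed. The names `TOP`, `ROWS`, `encrypt` and `decrypt` are made up for this illustration.

```python
TOP = "7410852963"  # column figures, left to right
ROWS = {
    "": "ASINTOER  ",   # single-figure row; the cells under 6 and 3 stay blank
    "6": "BCDFGHJKLM",  # two-figure numbers commencing with 6
    "3": "PQUVWXYZ./",  # two-figure numbers commencing with 3
}

ENCODE = {
    letter: prefix + TOP[i]
    for prefix, row in ROWS.items()
    for i, letter in enumerate(row)
    if letter != " "
}
DECODE = {number: letter for letter, number in ENCODE.items()}

def encrypt(cleartext):
    """Run the one- and two-figure numbers together as a single string."""
    return "".join(ENCODE[ch] for ch in cleartext.upper() if ch in ENCODE)

def decrypt(digits):
    """A 6 or a 3 at the reading position always opens a two-figure number."""
    digits = digits.replace(" ", "")
    out, i = [], 0
    while i < len(digits):
        step = 2 if digits[i] in "63" else 1
        out.append(DECODE[digits[i:i + step]])
        i += step
    return "".join(out)

print(decrypt("645636331016478150"))  # COMMUNICATION
print(encrypt("communication"))       # 645636331016478150
```

Because the reader always knows from the first figure whether one or two figures follow, no separators are needed, which is the property the text describes.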

The following cryptogram uses the above table, but with a different order of the coordinates. Try to see whether it is breakable; the cleartext is in English, military language:

13492 09610 41763 07431 46918 65737 67721 86111 11581 71559 14176 30710

Simple Substitution using an Unordered Alphabet

In the systems described so far the normal sequence of the alphabet has been used, but one can of course use an unordered sequence of letters or numbers as the cipher alphabet. The classical method uses a keyword to achieve this. Any word or phrase will do, but all repeated letters must be deleted. If the keyword is RAMSES, the second S is dropped and the following cipher key can be constructed:

Clear text:  abcdefghijklmnopqrstuvwxyz
Cipher text: RAMSEBCDFGHIJKLNOPQTUVWXYZ
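Building such a key is mechanical, and a short Python sketch may make the rule concrete: write the keyword without repeated letters, then append the unused letters in normal order. The helper names are assumptions for this example.

```python
import string

def keyed_alphabet(keyword):
    """Keyword with repeated letters deleted, followed by the remaining letters."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    return "".join(seen)

def encrypt(cleartext, keyword):
    """Substitute each cleartext letter by the letter beneath it in the key."""
    table = str.maketrans(string.ascii_lowercase, keyed_alphabet(keyword))
    return cleartext.lower().translate(table)

print(keyed_alphabet("RAMSES"))  # RAMSEBCDFGHIJKLNOPQTUVWXYZ
```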

A major drawback of this system is that towards the end of the alphabet the cleartext letters tend to be encrypted by themselves if the keyword doesn't contain, say, an "X", "Y", or "Z". To counter this, the users can agree to start writing the keyword and the rest of the letters at a starting position other than the letter "A". The starting position can even be varied from message to message, and this information can be hidden somewhere in the cryptogram. For instance, when starting with the keyword under the letter "f", the result will be the following key:

Clear text:  abcdefghijklmnopqrstuvwxyz
Cipher text: VWXYZRAMSEBCDFGHIJKLNOPQTU

Suppose the starting position is hidden as the third letter of the resulting cryptogram. The following table then illustrates the use of the above key; the asterisk marks the place where the indicator letter F is inserted:

Message: de*sperate needof supplies
Cipher:  YZFKHZJVLZ FZZYGR KNHHCSZK

See also Newsletter 8.
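The shifted keyword alphabet and the hidden starting position can be sketched together in Python. This is a minimal illustration under the assumptions of the example above (keyword RAMSES, start under "f", indicator as the third character); the function names and the `position` parameter are hypothetical, and the position is counted in characters, so it is assumed to fall inside the first word.

```python
import string

def keyed_alphabet(keyword):
    """Keyword with repeated letters deleted, followed by the remaining letters."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    return "".join(seen)

def shifted_key(keyword, start):
    """Rotate the keyed alphabet so the keyword begins under the letter `start`."""
    base = keyed_alphabet(keyword)
    offset = string.ascii_lowercase.index(start.lower())
    return base[-offset:] + base[:-offset] if offset else base

def encrypt_with_indicator(cleartext, keyword, start, position=3):
    """Encrypt, then insert the starting letter in clear at `position`."""
    table = str.maketrans(string.ascii_lowercase, shifted_key(keyword, start))
    body = cleartext.lower().translate(table)
    i = position - 1
    return body[:i] + start.upper() + body[i:]

def decrypt_with_indicator(cryptogram, keyword, position=3):
    """Read the indicator, remove it, and invert the substitution."""
    i = position - 1
    start = cryptogram[i]
    body = cryptogram[:i] + cryptogram[i + 1:]
    table = str.maketrans(shifted_key(keyword, start), string.ascii_lowercase)
    return body.translate(table)

print(shifted_key("RAMSES", "f"))
# VWXYZRAMSEBCDFGHIJKLNOPQTU
print(encrypt_with_indicator("desperate needof supplies", "RAMSES", "f"))
# YZFKHZJVLZ FZZYGR KNHHCSZK
```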


---