Session 10: Week 20/21: Letter Frequencies




Document: Software Engineering 1: Lab Exercises



In a world where there are code makers there will, inevitably, also be code breakers. One of the simplest tools in the armoury of a code breaker is a program to count the relative frequency of each letter appearing in the cyphertext. For any given language there will be a characteristic distribution of letter frequencies in the uncoded message (the "plaintext"). The most commonly used letter in English is e, by a wide margin; t is in second place, with a and o nearly tied for third; i, n and r are also very commonly used.

If we know that a coded message uses any kind of simple substitution cypher (such as the Caesar cypher we saw previously), then a simple count of the relative frequencies will allow us to make a fairly good guess as to which letters have been substituted for the most common English letters. Often this is enough to allow the remaining substitutions to be guessed easily.

Of course, real cyphers are much more sophisticated and harder to break than this (how would you tackle a Vigenère cypher, for example?). But letter frequency counts still form an essential tool for code breaking, albeit in conjunction with many other techniques.

Try to develop the programs for this session without using safe-c. Use the standard library functions declared in stdio.h for input and output: for example, getchar() to read the input file and printf() to print the results. Use the functions declared in ctype.h to test (or possibly alter) the characteristics of character data, such as distinguishing alphabetic from non-alphabetic characters or converting between upper and lower case.









McMullin@eeng.dcu.ie
Tue Apr 30 14:15:37 GMT 1996