CS161 Labs
Introduction
In lecture you have written a couple of basic programs for encrypting text. In this lab we will
break those ciphers and crack the codes.
The message before encoding is called the plaintext, and the encrypted message
is called the ciphertext. Encryption is simply a function from plaintext messages
to ciphertext messages. In this lab you will invert an unknown encryption function to recover the
plaintext given only the ciphertext.
Dictionary-Based Attacks
The first class of attacks we will consider will be dictionary-based.
A Key Assumption
The basic hypothesis underlying such an attack is that the plaintext consists mostly of
words from a known dictionary (in our case, the file /usr/share/dict/words
on the Linux machines). Given that assumption, your task is to construct attacks that
successfully decrypt the messages in this lab.
We exclude special characters other than spaces, and spaces will not be encrypted. In
some cases special characters will remain in the ciphertext, but our encryption algorithm
ignores them. Not all of the words will be in the dictionary, but most will.
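One way to turn the dictionary assumption into something measurable is to count how many words of a candidate plaintext appear in the word list. The sketch below is only an illustration: the names normalise, mkDictionary, dictScore and loadDictionary are hypothetical, and it assumes that stripping non-letters and lowercasing is an acceptable way to compare words. It uses Data.Set from the containers package that ships with GHC.

```haskell
import Data.Char (isAlpha, toLower)
import qualified Data.Set as Set

-- Hypothetical helper: keep letters only and lowercase them, so that
-- "Hello," and "hello" compare equal.
normalise :: String -> String
normalise = map toLower . filter isAlpha

-- Build a lookup set from the lines of a word list, dropping empty entries.
mkDictionary :: [String] -> Set.Set String
mkDictionary = Set.fromList . filter (not . null) . map normalise

-- Count how many words of a candidate plaintext are in the dictionary.
dictScore :: Set.Set String -> String -> Int
dictScore dict =
  length . filter (`Set.member` dict) . filter (not . null) . map normalise . words

-- Load the word list named in the lab text.
loadDictionary :: IO (Set.Set String)
loadDictionary = mkDictionary . lines <$> readFile "/usr/share/dict/words"
```

A candidate decryption with a higher dictScore is more likely to be the plaintext, since most (but not all) plaintext words are expected to be in the dictionary.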
For each ciphertext we give you to decrypt we will let you know the set of ciphers that were
potentially used to encrypt it.
Breaking ROT-n
In class the rot-n code was introduced. In this part of the lab your goal is to break it and
decode the file c1.txt: given a ciphertext, infer what n is.
A function that will be useful for cracking the code is:
hamming :: [String] -> [String] -> Int
which is a function to measure the distance between two strings based on how many
characters they share. You will also want to make use of the word list mentioned earlier.
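Because rot-n has only 26 possible keys, one possible approach is to try every shift and keep the decryption that scores best. The sketch below assumes that rot-n rotates ASCII letters and leaves everything else untouched, which may differ in detail from the version written in lecture. The names rotChar, rotN, candidates and breakRot are illustrative, and the scoring function is left as a parameter so that any measure (a dictionary hit count, or one built on hamming) can be plugged in.

```haskell
import Data.Char (chr, isAsciiLower, isAsciiUpper, ord)
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Rotate one letter by n places; spaces and other characters are unchanged.
rotChar :: Int -> Char -> Char
rotChar n c
  | isAsciiLower c = shift 'a'
  | isAsciiUpper c = shift 'A'
  | otherwise      = c
  where
    shift base = chr (ord base + (ord c - ord base + n) `mod` 26)

-- Assumed form of the rot-n cipher: rotate every letter by n.
rotN :: Int -> String -> String
rotN n = map (rotChar n)

-- All 26 candidate decryptions, each paired with the n it undoes.
candidates :: String -> [(Int, String)]
candidates ct = [(n, rotN (negate n) ct) | n <- [0 .. 25]]

-- Pick the shift whose decryption scores highest under the given scorer.
breakRot :: (String -> Int) -> String -> (Int, String)
breakRot score = maximumBy (comparing (score . snd)) . candidates
```

For example, breakRot could be called with a scorer that counts dictionary words in its argument; the first component of the result is then the inferred n.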
Implement this section as a program
http://people.cs.uchicago.edu/~stoehr/cs161/posts/2014-10-20-lab-3.html
10/22/2014
expressed with a 5-bit binary number. Our efficient data structure will map 5-bit binary
numbers to counts.
We create a tree data structure that is constrained to have a particular height, with leaf
nodes that record counts.
data Tree = Leaf Int | Node Int Tree Tree deriving (Show, Eq)
Some example trees are here:
l0 = Leaf 10
l3 = Leaf 3
l1 = Leaf 4
l2 = Leaf 0
t0 = Node 1 l2 l0
t1 = Node 1 l3 l1
t2 = Node 2 t1 t0
These have been named suggestively to indicate how the tree works. We need to have a
constructor
initTree n
which outputs a tree for a binary representation with n bits. Include a type signature and a
definition for that function. We will also want a getter-function
treeCount t w
which, given a tree t and a number w, returns the tree's count for w. We also want a setter function which updates the tree
treeInsert t w
which will update the count that tree t has for number w. For both of the functions above,
write a type signature and a recursive definition. A hint for writing them: the base case
(where the tree is just a Leaf Int) is obvious, and you should just return the count. Next, try
to handle the case of a tree formed with one root node and two leaves, and think about how
the algorithm should recurse. Then handle the case where you have two levels and hence four
leaves, etc. The definition of each function shouldn't be longer than five lines. mod and div
are your friends here: review them if you don't know what they do.
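The following is a minimal sketch of one reading of these three functions, not the reference solution. It assumes that the Int stored in a Node is the number of bits still to be examined, that the lowest bit of w chooses the branch (first subtree for bit 1, second for bit 0, as the names of the example trees suggest), and that treeInsert increases the count by one. The handout does not spell out any of these conventions.

```haskell
-- Tree type as given in the lab text.
data Tree = Leaf Int | Node Int Tree Tree deriving (Show, Eq)

-- Build a tree for n-bit numbers with every count set to zero.
initTree :: Int -> Tree
initTree n
  | n <= 0    = Leaf 0
  | otherwise = Node n sub sub
  where
    sub = initTree (n - 1)

-- Look up the count for w: w `mod` 2 picks the branch (assumed: bit 1
-- goes to the first subtree), and w `div` 2 is passed down.
treeCount :: Tree -> Int -> Int
treeCount (Leaf c) _ = c
treeCount (Node _ one zero) w
  | w `mod` 2 == 1 = treeCount one (w `div` 2)
  | otherwise      = treeCount zero (w `div` 2)

-- Return a new tree in which the count for w is one higher (assumed
-- meaning of "update"); all other counts are unchanged.
treeInsert :: Tree -> Int -> Tree
treeInsert (Leaf c) _ = Leaf (c + 1)
treeInsert (Node h one zero) w
  | w `mod` 2 == 1 = Node h (treeInsert one (w `div` 2)) zero
  | otherwise      = Node h one (treeInsert zero (w `div` 2))
```

Under these assumptions, treeCount (treeInsert (initTree 5) 19) 19 evaluates to 1, while the count for every other 5-bit number stays at 0.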
You will want to write an interface for this tree that handles the abstraction from character to
integer. It is up to you to decide how to handle that abstraction. asciiTreeInsert, for
instance, is one way to go. You will also want to generalize to the case where you have
multiple characters, since we care about n-gram statistics. It's difficult to make the tree
structure handle any particular length of n-gram, so just define a tree for the shorter n-grams
and use those to generate counts. You will want to think about the underlying binary
representation when doing this.
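As a hedged illustration of the character-to-integer step, the sketch below maps each letter to a number from 0 to 25 (which fits in 5 bits) and packs an n-gram into a single integer using 5 bits per character. The names charCode, ngramCode and ngramBits are hypothetical, and placing the first character in the lowest bits is just one possible layout, not something the lab prescribes.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, ord, toLower)

-- Map a letter to 0..25, ignoring case; other characters have no code.
charCode :: Char -> Maybe Int
charCode c
  | isAsciiLower c || isAsciiUpper c = Just (ord (toLower c) - ord 'a')
  | otherwise                        = Nothing

-- Pack an n-gram into one integer, 5 bits per character, with the first
-- character in the lowest bits. Fails if any character is not a letter.
ngramCode :: String -> Maybe Int
ngramCode = fmap (foldr (\d acc -> d + 32 * acc) 0) . mapM charCode

-- Number of bits (tree height) needed for n-grams of length n.
ngramBits :: Int -> Int
ngramBits n = 5 * n
```

With this layout a bigram needs a tree of height 10, and the integer returned by ngramCode can be passed as the number w to treeInsert and treeCount.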
What to turn in
You should have written two programs: derot.hs and vindecoden.hs which perform
the first two tasks. For the extra-credit task you will turn in vindecode.hs. Grading will
be based on whether your files can decode the ciphertexts and whether the code clearly
demonstrates how you did it. Save your programs into a folder lab2 within your subversion
repository.
Your code should also include some code for your n-gram counting functions. Make sure that
you have the functions treeInsert and treeCount implemented with the appropriate
type signatures.
It is a good idea to try to decode the passages first without restricting yourself to automatic
algorithms. The code you hand in may take a while to decode the passages. If your code
is inefficient at performing the decrypting task, then you should submit an example
demonstrating that your code does work for the simpler example. Make a note in your
README file along with your submission to discuss practicality.