代写 Caesar’s Cipher 算法的加密和解密过程,完成对字符串的加解密。
Requirement
In this project, you will implement a function that takes as input a sequence
of words encoded via an unknown Caesar’s Cipher, and returns a function that
decodes words in that cipher back into plain text. You then use this function
to write a code-breaker that decodes an entire document back into its plain
text.
Caesar’s Cipher
The cipher translates each letter in a word independently by “shifting” the
letter up or down the alphabet by a fixed distance. We assume that we are
using the English alphabet with 26 letters in their “usual” order, all lower
case: a, b, c, . . . z. The function ctv (“character-to-value”) maps each
character to its value, and the function vtc (“value-to-character”) maps each
value to its corresponding character. For example, ctv(c) = 2 and vtc(25) = z.
The encryption function has the following form:
Encript_n(x) = vtc((ctv(x) + n) mod 26)
where “n” represents the amount by which the letters are “shifted” up the
alphabet. For this project, we assume the values of “n” to be in the range of
0 and 25, i.e., 0 ≤ n ≤ 25.
As part of the project, you will first need to implement the encoding function
encode-n which takes as input the shift value n and returns a function that
takes as input a word w, and returns its encoded version:
(define encode-n
(lambda (n)
(lambda (w)
…
)))
—|—
Project Description
For this project, words are represented as lists of lower case symbols, e.g.,
the word “class” is represented as (c l a s s)
. Paragraphs are lists of
words, e.g., ((c l a s s) (i s) (a) (l o t) (o f) (f u n))
. A document is
a list of paragraphs.
A document encoded with the Caesar cipher is easy to break since there are
only 26 possible “shift values” n. You are asked to implement two different
methods to break the cipher, (1) a brute force version that uses a spell
checker, and (2) a version that uses knowedge about the distribution of
particular letters in the English language (frequency analysis). Both methods
try to find the value of n0 such that (n + n') = 26
.
You will need to write two functions Gen-Decoder-A and Gen-Decoder-B that take
as input a paragraph of encoded words, and return as output a function that
takes as input an encoded word, and returns the plain text of the decoded
word. This function can then be used to decode the entire document which may
consist of multiple paragraphs. The function Code-Breaker takes an encoded
document and a decoding function as input, and returns the entire document in
plain text.
The two decoder functions implement different strategies to break the
encrypted code. Both have advantages and disadvantages. Your scheme program
will provide an infrastructure to assess the effectiveness of these decoding
approaches for different document types.
Brute Force with Spell Checker Gen-Decoder-A
The algorithm for the brute force code breaker is rather simple. The encode
input words are encoded for each possible value of n0 . For each value, a
spell checker determines whether the resulting words are words in the English
language. The value of n0 for which most words are spelled correctly is
assumed to be decoding value.
For this method, you will need to implement a spell checker. You can implement
your spell checker using the dictionary of words in file dictionary.ss. You
will need to implement the spell checker spell-checker that takes as input a
word, and returns the truth value #t or #f. A brute force version of the spell
checker may just see whether the word occurs in the dictionary or not.
Frequency Analysis Gen-Decoder-B
This code breaker is based on the fact, that in the English language some
letters occur more often than others. Knowing the distribution of the
different letters, this method just counts the number of occurrences of each
letter in the encoded words. If there are enough words to be statistically
relevant, the letter distribution can be used to identify the most common
letters in English, namely ‘e’, ‘t’, and ‘a’. In other words, the assumption
is that the most frequent encoded letter has to correspond to the letter ‘e’,
etc. This means that the shift between the encoded letter and ‘e’ has to be
the value n. The same n has to work for the other frequent letters as well.
From that, we can determine the decoding value n0 . Note that this method
needs a large enough set of words in order to be successful. In other words,
it may not work well for very short paragraphs.
Implementation
For the implementation of these functions, you can use standard built-in
Scheme functions such as append. You should use the reduce function at least
once in your implementation. The definition of reduce is given in file
include.ss. You must not use assignment (set!) in Scheme.
How To Get Started
Copy the 5 files decode.ss, include.ss, test-dictionary.ss, dictionary.ss and
document.ss into your own subdirectory. You will implement your code breaker
in file decode.ss. File test-dictionary.ss contains a small dictionary
consisting of only six words. Use it to debug your program. File dictionary.ss
contains a list of over 45,000 words allowing you to generate a realistic
spell checker. You must not change this file. Finally, file document.ss
contains two sample documents to test your encoder, decoder, etc. You should
use your own test cases as well.
Grading
You will submit your version of file decode.ss via Sakai.
Your programs will be graded based on programming style and functionality.
Functionality will be verified through automatic testing on a set of test
cases.
Questions
All questions regarding this project should be posted on our piazza web site.
Start early since you will run into issues that will need to be resolved on
the way. This is a feature of the project, not a bug!