C代写:CS136Encode


完成C语言基础练习,根据规则实现 [ Encode ](https://en.wikipedia.org/wiki/Binary-to-
text_encoding “Encode”) 和Decode方法。
![Encode](https://upload.wikimedia.org/wikipedia/commons/thumb/3/38/Original_source_code_bitcoin-
version-0.1.0_file_base58.h.png/400px-Original_source_code_bitcoin-
version-0.1.0_file_base58.h.png)

Requirement

  • Your coding style will NOT be graded for this question.
  • Your testing methodology will NOT be graded for this question.
  • All assignment rules and policies are in place for this question.
  • There is no public test. Marmoset only ensures that your code “runs”.
  • For this and EVERY question, you may define helper functions
    For your convenience, the same README.TXT file appears in q1a and q1b.
    When we transmit information we often want to try and minimise the size of the
    data that we have to send using data compression.
    A popular method of data compression is to identify common patterns and then
    replace (substitute) those patterns with shorter codewords.
    For this question, we compress a string that contains only lower case letters
    and whitespace (no punctuation, upper case, numbers, etc.).
    We replace a pattern of lower case letters with a single non-lower case
    character.
    For example, we could replace the pattern “the” with the char ‘T’.
    If the original message was “the cat in the hat”, the compressed (or
    “encoded”) message would be “T cat in T hat”.
    There could be multiple patterns, so we could also encode “at” as ‘@’ and then
    the message would be “T c@ in T h@”.
    A “mapping” struct map contains:
  • the single character [small] and
  • the pattern [big].
    struct map {
    char small;
    char *big;
    };
    —|—

For the above example, we would have:
struct map example[2] = { {‘@’, “at”}, {‘T’, “the”two big parantheses;
—|—
There are a few rules for a [struct map] that must be followed:

  • [small] must be in the ASCII range ‘!’ … ‘_‘ (inclusive)
  • [big] must be a valid and non-empty string that contains only lower case letters (‘a’ … ‘z’) and no other characters (e.g., no whitespace)
    When working with an array of maps, there are additional rules:
  • the [small] fields must all be unique (no duplicates)
  • the [big] fields must not only be unique, but no [big] field can be a prefix of another [big] field (see below for an example)
  • the elements of the array must be sorted so that the [big] strings are in lexicographical order (see above: “at” before “the”)
    Rule 5 exists so you can use binary search on an array of maps to find a
    pattern match, so if the longest [big] in the array is “m” and there are “k”
    entries, then it would take O(m*logk) to search the array.
    Rule 4 exists to make this assignment easier by removing any ambiguity.
    No [big] field can be the prefix of another big field, so you could not have
    both “the” and “theatre” as [big] fields because “the” is a prefix of
    “theatre” (“the” appears at the start of “theatre”).
    Even despite rule 4, there could be some ambiguity because some [big] fields
    could overlap. Consider this overlap example:
    struct map overlap_example[3] = { {‘@’, “at”}, {‘E’, “eat”}, {‘T’, “the”two big parantheses;
    —|—

This is a valid array because no [big] field is the prefix of any other.
However, for the string “theatre” it could be encoded as “T@re” or “thEre”.
To avoid this, we have our final rule:

  • the “first” match is always selected, moving from the start of the message string to end (in other words, moving left to right)
    So in our previous example, the encoding would be “T@re” because the first
    match moving left-to-right would be “the”.
    In part A, you use an array of struct map to encode (compress) a string
    In part B, you use an array of struct map to decode (decompress) a string
    For both part A and B we have provided a test client that uses the same file
    format.
    The file format should be straightforward for you to understand:
  • make sure you provide your mappings in alphabetical order by [big] field
  • a tilde (~) followed by a newline is used to separate the mappings and the message

Additional Notes

  • In both parts, you must provide a new, dynamically allocated string that has been realloc’d to the correct length.
  • If you were to encode a string and then decode it, the result should be the same as the string you had prior to encoding.
  • In Part A, the encoded string must be as short as possible following the above rules, meaning that if a substitution is possible it must be done.

文章作者: SafePoker
版权声明: 本博客所有文章除特別声明外,均采用 CC BY 4.0 许可协议。转载请注明来源 SafePoker !
  目录