完成关于Python的 Object-Oriented Programming (OOP)
作业。
![Levenshtein
Distance](https://upload.wikimedia.org/wikipedia/commons/thumb/d/d1/Levenshtein_distance_animation.gif/400px-
Levenshtein_distance_animation.gif)
Introduction
In this assignment, you will work with introductory Object-Oriented
Programming (OOP) to create classes with associated atributes and methods.
This handout explains the problem you are to solve, and the tasks you need to
complete for the assignment. Please read it carefully.
There are two parts of this assignment:
- Code Component
- Written Component
Goals of this Assignment
- Write Python classes with attributes and methods
- Interpret more complex pre-written code to interface with your own work
- Synthesize previous course concepts (dictionaries, lists)
- Apply runtime analysis to identify program complexity
Files to Download
Please download the Assignment 4 files and extract the zip archive.
- Starter code:
- dictionary_class.py and word.py You will complete a few methods in these files. The classes defined in these files are used in dictionary_program.py.
- Data: english_dictionary.p You should not open or modify this file. This is a special binary file that Python can read much faster than a regular .txt file. The lexical dictionary was pre-built for you and saved in this format. The data for this assignment was taken from the deprecated Wordset Dictionary WordNet ( https://github.com/wordset/wordset-dictionary ) which is in turn based off Canvas. Please note that there complete is no guarantee on the accuracy of the definitions in the dataset, (instructors and the University of Toronto) are not liable for any content, ie. entries and/or definitions, in the dataset which you may find offensive. If you are uncomfortable with the contents of the dataset, please reach out to us for an alternative dataset to use for testing.
- Main program: dictionary_program.py You should not modify this file. This is the interactive dictionary program where you can explore the dictionary we have provided. This file will only work after you have completed the tasks in word.py and dictionary_class.py. If you try to look up a word that does not exist in the dictionary, the dictionary will (by default) suggest the closest word to what you have entered. If you find that this is running too slowly, you may modify the constant
SUGGEST_WORD and set it to False.
Data Structures
For each class, a __repr__()
method is provided for you.
Definition class
The Definition class represents a definition, with the attributes
(str)
and part_of_speech
(str). The part_of_speech
attribute indicates
whether the definition is attributed to a verb, noun, adjective, etc.
Entry class
The Entry
class represents a single entry in the dictionary. It has the
attributes word
(str) and definitions
(List[Definition]).
MyDictionary class
The MyDictionary
class represents our lexical dictionary. It has the
attributes entries
and __num_entries__
.
The __num_entries__
attribute is an int
representing the number of
entries in the dictionary. The entries
attribute is a dictionary ( Dict[str, List[Entry]]
) with:
- key: an alphabet in the English language (str), lowercase
- value: a List of instances of class Entry, containing all dictionary entries that start with the letter indicated by the key.
The following is an example of how these classes are used:word = “illuminate”
def1 = Definition(“make lighter or brighter”, “verb”)
def2 = Definition(“make free from confusion or ambiguity”, “verb”)
def3 = Definition(“add embellishments and paintings to (medieval manuscripts)”, “verb”)
entry = Entry(word, [def1, def2, def3])`
my_dictionary = MyDictionary()
my_dictionary.add_entries([entry])
…
print(my_dictionary)
illuminate:- VERB. make lighter or brighter
- VERB. make free from confusion or ambiguity
- VERB. add embellishments and paintings to (medieval manuscripts)
my_dictionary.get_num_entries()my_dictionary.entries
{‘i’: [illuminate: - VERB. make lighter or brighter
- VERB. make free from confusion or ambiguity
- VERB. add embellishments and paintings to (medieval manuscripts)
]}
Tasks
This assignment cosists of a
- code component and a
- written component.
They are weighted equally.
Code Component
Complete the following methods according to their descriptions:
dictionary_class.py
To test the MyDictionary class without completing suggest_closest_word, set
the constant SUGGEST_WORD to False in dictionary_program.py .
- MyDictionary.init() -> None
Initialize attributes entries as an empty Python dictionary and
num_entries as 0 - MyDictionary.get_num_entries() -> int
Return the number of entries in the lexical dictionary. - MyDictionary.suggest_closest_word(str) -> str
The parameter represents a word that is not present in entries as a lexical
word. Using the Levenshtein distance as a measure of similarity, return the
closest word to the word passed in. You should use the
MyDictionary.levenshtein_dist() method here.
Smaller levenshtein distance means a higher degree of similarity. For two
arbitrary unique words, word1 and word2 (unique means word1 is not the same as
word2 ), the minimum levenshtein distance is 1 and the maximum levenshtein
distance is max(len(word1), len(word2)). You may read more about Levenshtein
distance here ( https://en.wikipedia.org/wiki/Levenshtein_distance ).
Try running the following example: complete list of supported browsers.my_dictionary = MyDictionary()
my_dictionary.levenshtein_dist(‘capybara’, ‘iguana’)
?
my_dictionary.levenshtein_dist(‘apple’, ‘bapple’)
?
word.py
- Definition.init(str, str) -> None
The first parameter represents the definition and the second parameter
represents the part of speech. Initialize attributes definition as str and
part_of_speech as str. - Entry.init(str, List[Definition]) -> None
The first parameter represents the word and the second parameter represents a
list of definitions association with the given word. Initialize attributes
word as str and definitions as a list of Definition.
Written Component
Choose two of the following questions to answer. Reference any sources using
the IEEE citation format. Save your answers in a single .pdf document.
- In the MyDictionary Class, the method get_word_index uses a search algorithm to locate entries in the lexical dictionary. In your own words, answer the following:
* What is a search algorithm?
* Identify both the best-case and worst-case runtime complexity for the algorithm used for get_entry_by_word. Give examples for both cases.
* Give a visual representation (e.g. draw a picture) to illustrate how this function locates the entry for the given word. - In the MyDictionary Class, the method levenshtein_dist uses the levenshtein distance metric to find words that are lexically similar to each other.
* Run the levenshtein_dist method on the words “canary” and “cannery”, with display=True. Include the result in your answer. Given the resultant distance value, indicate the shortest way to transform “canary” into “cannery”.
* The method suggest_closest_word takes noticably long to run, regardless of implementation. Using the concept of runtime complexity, identify why this function takes so long to run. Refer to the levenshtein_dist method in your answer.
* Identify the best- and worst- case runtime complexity for the suggest_closest_word method. Give examples for both cases. - In the MyDictionary Class, there is a repr method provided. In your own words, answer the following:
* What is the purpose of the repr method?
* What should you consider prior to calling print() on an instance of MyDictionary?
* As per a lexical dictionary, the dictionary entries are sorted lexically within their subsection