Introduction
Write a database in Python that implements the specified SQL interface and passes the test suite.
Description
NoSQL-style databases are not characterized by any particular property, except
that they have different goals than traditional SQL databases. For this
project, you will implement a Document Store, a type of database that has no
schema but can be useful nonetheless.
Files in Project6
- docstore.py: This is the file in which you must implement the project. You are welcome to import other modules (both builtin and other custom files) for use in this file. Within it, you must create two classes, “Collection” and “Database”, and write a function named “convert_dict_to_formated_str”.
- Test_Suite/: A folder containing two types of files. Files starting with “test” are Python3 scripts. If they were copied to the Project6 folder they could be run. They import the “docstore.py” module and test functionality of the classes therein. The matching “output” files are the correct output for each “test”.
- run_single_test.py: This executable will run a single test (a.k.a. a python script) if called (example: “./run_single_test.py Test_Suite/test.initial.01.py”). To see the correct output, run: “./run_single_test.py Test_Suite/test.initial.01.py --correct”. However, I encourage you to write your own tests as well.
- run_tests.py: This executable will run all the tests in the “Test_Suite”. It can be invoked with “./run_tests.py”.
- README.txt: You must create this file. In it, give feedback on the project and list any sources you used to implement it.
What You Need To Do
All of the code for this project should be in the “docstore.py” file.
convert_dict_to_formated_str:
- There needs to be a function named “convert_dict_to_formated_str”. This function takes one parameter (a python dict) and returns a python string.
- The purpose of this function is to ensure that printing out dictionaries results in deterministic output.
- The string lists the mappings within the dictionary, sorted by key.
- All dictionaries used in this project will contain keys of the same type (usually Python strings or integers) to make this sorting easier.
- Values may have different types.
- It converts a dictionary to a string with the following format:
- Input is a python dictionary (a complex case shown here):
- {'a':'b', 'b':{}, 'c':{1:2, 3:{'a':'b', 'b':'c'}, 7:99}, 'josh':'instructor'}
- Output is a python string
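The exact layout of the output string is fixed by the “output” files in Test_Suite. As a rough sketch of the sorted-key recursion only, the code below assumes a hypothetical layout (one "key: value" entry per line, nested dictionaries indented by two spaces inside braces). The extra "indent" parameter is also an assumption made for the recursion; the required function takes a single dictionary argument.

```python
def convert_dict_to_formated_str(data, indent=0):
    """Render a dict deterministically by visiting keys in sorted order.

    NOTE: the layout used here (braces, 2-space indent, "key: value" lines)
    is an assumption for illustration; match the Test_Suite output instead.
    """
    pad = " " * indent
    if not data:
        return pad + "{}"
    lines = [pad + "{"]
    for key in sorted(data):  # keys share one type, so sorting is well defined
        value = data[key]
        if isinstance(value, dict):
            # Recurse one level deeper, then attach the opening brace to the key.
            nested = convert_dict_to_formated_str(value, indent + 2)
            lines.append(f"{pad}  {key!r}: {nested.lstrip()}")
        else:
            lines.append(f"{pad}  {key!r}: {value!r}")
    lines.append(pad + "}")
    return "\n".join(lines)
```

Whatever layout the tests require, the key point is the same: iterate over sorted(data) rather than over the dictionary directly, and recurse into nested dictionaries.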
Collection.__init__(self):
- This method initializes a collection.
Collection.insert(self, document):
- This method takes a document (a python dictionary) and adds it to the collection. The collection needs to store its documents in insertion order.
Collection.__str__(self):
- This method converts the collection to a python string (you should probably use convert_dict_to_formated_str to help write this method).
- The collection prints “Collection(“, then each document stored (indented), then the closing parenthesis. See testcases for details.
Collection.find_all(self):
- This method returns a list of all the documents stored in the collection in insertion order.
Collection.delete_all(self):
- This method removes all the documents stored in the collection.
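A minimal sketch of the storage side of the class, assuming a plain Python list is used to keep insertion order. The attribute name "_documents" is hypothetical, and returning a shallow copy from find_all is a design choice that the description leaves open.

```python
class Collection:
    def __init__(self):
        # A plain list preserves insertion order.
        self._documents = []

    def insert(self, document):
        self._documents.append(document)

    def find_all(self):
        # Return a new list so callers cannot reorder the internal one.
        return list(self._documents)

    def delete_all(self):
        self._documents.clear()
```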
Collection.find_one(self, where_dict):
- This method returns the first document (in insertion order) that matches the where_dict. The where_dict is a python dictionary that contains key-value entries. If no match is found, it returns None.
- If a document is missing any key in the where_dict, it doesn’t match. If a document has all the keys but any associated value differs, it doesn’t match.
- If a where_dict is empty ({}), it matches all the documents.
- Example: document = {'age':27, 'name':'Josh', 'major':'CSE'}, where_dict = {'age':27, 'major':'CSE'}: the document matches because it has all the entries in the where_dict. If the document were {'age':27, 'name':'Tyler', 'major':'Street'}, it wouldn’t match. Neither document matches the where_dict {'name':'Hancheng', 'major':'CSE'}.
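The matching rules above can be sketched as a small helper plus a linear scan. Both "matches" and the standalone "find_one" (which takes the list of documents as an argument) are hypothetical names used for illustration; in docstore.py this logic would live inside the Collection class.

```python
def matches(document, where_dict):
    """Return True if every key/value pair in where_dict appears in document."""
    for key, expected in where_dict.items():
        if key not in document or document[key] != expected:
            return False
    return True  # an empty where_dict falls through and matches everything


def find_one(documents, where_dict):
    """Return the first matching document in list order, or None."""
    for document in documents:
        if matches(document, where_dict):
            return document
    return None
```

Checking "key not in document" before comparing values avoids a KeyError for documents that lack one of the where_dict keys.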
Collection.find(self, where_dict):
- This method is identical to find_one, except that it returns a list of all the documents (in insertion order) that match the where_dict.
- If no document matches the where_dict, return the empty list.
Collection.count(self, where_dict):
- This method returns an integer representing the number of documents in the collection that match the where_dict.
Collection.delete(self, where_dict):
- This method removes the documents that match the where_dict.
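One way find, count, and delete might share a single matching helper is sketched below. The "_documents" attribute and the "_matches" helper are assumed names, not part of the required interface.

```python
class Collection:
    def __init__(self):
        self._documents = []  # insertion-ordered storage (assumed attribute name)

    def insert(self, document):
        self._documents.append(document)

    @staticmethod
    def _matches(document, where_dict):
        # all() over an empty where_dict is True, so {} matches everything.
        return all(key in document and document[key] == value
                   for key, value in where_dict.items())

    def find(self, where_dict):
        return [d for d in self._documents if self._matches(d, where_dict)]

    def count(self, where_dict):
        return len(self.find(where_dict))

    def delete(self, where_dict):
        # Rebuild the list, keeping only non-matching documents in order.
        self._documents = [d for d in self._documents
                           if not self._matches(d, where_dict)]
```

Rebuilding the list in delete sidesteps the usual pitfall of removing items from a list while iterating over it.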
Collection.update(self, where_dict, update_dict):
- This method finds the documents that match the where_dict and then applies the changes found in the update_dict.
- All of the key-value entries in the update_dict are applied to each matching document, adding new entries or replacing existing ones.
- Example:
- document = {'name':'Josh', 'age':27, 'courses':[220, 450, 480], 'grade':3.2}
- update_dict = {'extra_credit':2, 'grade':4.0}
- updated document = {'name':'Josh', 'age':27, 'courses':[220, 450, 480], 'extra_credit':2, 'grade':4.0}
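The update example can be reproduced with the built-in dict.update method, which adds new keys and overwrites existing ones. The function name "apply_update" and its list-of-documents argument are hypothetical; in docstore.py the same loop would run over the collection's own storage.

```python
def apply_update(documents, where_dict, update_dict):
    """Apply update_dict in place to every document matching where_dict."""
    for document in documents:
        if all(k in document and document[k] == v for k, v in where_dict.items()):
            document.update(update_dict)  # adds new keys, replaces existing ones


docs = [{'name': 'Josh', 'age': 27, 'courses': [220, 450, 480], 'grade': 3.2}]
apply_update(docs, {'name': 'Josh'}, {'extra_credit': 2, 'grade': 4.0})
```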
Collection.map_reduce(self, map_function, reduce_function):
- This method takes two arguments, both of which are functions (“map_function” and “reduce_function”). It applies the map function to each document, saving each result to a list. This list is passed to the reduce function, and the result of the reduce function is returned.
- The map function will be provided by the test.
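As a sketch of the two-step flow, written as a standalone function over a list of documents (an assumption; the real method would read the collection's own storage). The example map and reduce functions are made up for illustration and are not taken from the tests.

```python
def map_reduce(documents, map_function, reduce_function):
    # Step 1: apply the map function to every document, in insertion order.
    mapped = [map_function(document) for document in documents]
    # Step 2: hand the whole list of results to the reduce function.
    return reduce_function(mapped)


people = [{'name': 'Josh', 'age': 27}, {'name': 'Tyler', 'age': 30}]
total_age = map_reduce(people, lambda doc: doc['age'], sum)
```

Note that the reduce function receives the full list in one call, unlike functools.reduce, which combines elements pairwise.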
Database.__init__(self, filename):
- The __init__ method for the Database class. It takes the name of the file in which the database will store its information.
Database.__str__(self):
- Returns a python string representation of the class. The first line is “Database(“, followed by the contents of the database, which is a mapping of Collection names to the collections themselves, then the closing parenthesis. This should be indented and probably involves calling convert_dict_to_formated_str. See test cases for examples.
Database.get_collection(self, name):
- Returns the Collection instance associated with the given name. If no such Collection exists, create an empty one and return it. Otherwise return the Collection associated with the name.
- Note: the returned Collection shouldn’t be a copy. Changes made to the Collection should be reflected in the database.
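The create-on-first-access behaviour could look roughly like the following. The Collection class here is only a stand-in, "_collections" is an assumed attribute name, and loading existing data from the file is deliberately left out of this sketch.

```python
class Collection:
    """Stand-in for the real Collection class (storage only)."""
    def __init__(self):
        self.documents = []


class Database:
    def __init__(self, filename):
        self.filename = filename
        self._collections = {}  # name -> Collection (assumed attribute name)

    def get_collection(self, name):
        # Create on first access, then always hand back the same object,
        # so changes made by the caller are visible through the database.
        if name not in self._collections:
            self._collections[name] = Collection()
        return self._collections[name]
```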
Database.get_names_of_collections(self):
- Returns a list of (sorted) names of collections in the Database.
Database.drop_collection(self, name):
- Removes the collection associated with the given name from the Database.
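If the database keeps its collections in a dictionary keyed by name (an assumption), both operations reduce to one-liners. They are written here as standalone functions over such a dictionary. How to handle dropping a name that does not exist is not specified above; this sketch silently ignores it.

```python
def get_names_of_collections(collections):
    # sorted() over a dict iterates its keys and returns a new sorted list.
    return sorted(collections)


def drop_collection(collections, name):
    # pop with a default avoids a KeyError for unknown names (assumed behaviour).
    collections.pop(name, None)


store = {'students': [], 'courses': [], 'advisors': []}
drop_collection(store, 'courses')
names = get_names_of_collections(store)
```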
Database.close(self):
- Saves the information in the Database to the file designated in the __init__ method.
- You can use whatever data format you want (JSON, XML, custom, some other library).
- Then closes the database. This method will be called after working with the database.
- After the close method is called, the Database will not be used again.
- If a new Database is created using the same filename as a previously closed Database, it should contain the same data as the original Database.
- There can exist multiple, concurrent Database instances, but they will always have different filenames.
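A possible save/load pair is sketched below using the standard pickle module. One reason to consider pickle over JSON here: the json module serializes integer dictionary keys as strings, so documents with integer keys would not round-trip unchanged. The function names and the "{name: [documents]}" shape of the saved data are assumptions for illustration.

```python
import os
import pickle
import tempfile


def save_collections(filename, collections):
    """Write a {name: [documents]} mapping to filename."""
    with open(filename, "wb") as handle:
        pickle.dump(collections, handle)


def load_collections(filename):
    """Return the stored mapping, or an empty one for a missing or empty file."""
    if not os.path.exists(filename) or os.path.getsize(filename) == 0:
        return {}
    with open(filename, "rb") as handle:
        return pickle.load(handle)


# Usage: round-trip through a temporary file, including an integer key.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.db")
original = {"students": [{"name": "Josh", 1: 2}]}
save_collections(demo_path, original)
restored = load_collections(demo_path)
```

In this arrangement, close would call the save function, and __init__ would call the load function so that a reopened Database starts with the previously saved data.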
Tips
The tests for this project import and run your code. If your code outputs
(prints) additional material, it will fail the test. I recommend a
“debug_mode” global variable that you can use to test if you want to print
additional messages for debugging purposes.
You should consider trying to solve the tests by hand before implementing the project.
Test Categories
- has_needed_files : There should be files named “docstore.py” and “README.txt”.
- test.convert.* : tests the function “convert_dict_to_formated_str”.
- test.insert_and_str.* : tests the “insert” and “__str__” methods of the “Collection” class.
- test.find_and_delete_all.* : tests the “find_all” and “delete_all” methods of the “Collection” class.
- test.find.* : tests the “find_one”, “find”, and “count” methods of the “Collection” class.
- test.delete_and_update.* : tests the “delete” and “update” methods of the “Collection” class.
- test.map_reduce.* : tests the “map_reduce” method of the “Collection” class.
- test.get_collection.* : tests the “__init__”, “__str__”, and “get_collection” methods of the “Database” class.
- test.names_and_drop.* : tests the “get_names_of_collections” and “drop_collection” methods of the “Database” class.
- test.close.* : tests the “close” method of the “Database” class.
I recommend attempting the tests in the order above, but you are welcome to
tackle the project in any way you see fit. Remember you can use
“./run_single_test.py …” to run a particular (or custom) test.