代写数据库作业,实现一个buffer manager,用来对 buffer pool
进行管理。
![Buffer Pool](https://dev.mysql.com/doc/refman/5.6/en/images/innodb-change-
buffer.png)
Requirement
In this project milestone, you are going to implement a buffer manager
discussed in the class. You should use C++11 to finish this project. If you
use more recent C++ standards (e.g., C++17), make sure to update the required
standard in CMakeLists.txt. See Tutorial 2 for a quick overview of C++11
features. If your code (as a result of using more recent C++ standards)
requires a specific version of GCC/Clang, please document it in your
submission and make corresponding changes in your CMakeLists.txt file. You are
not allowed to use external libraries other than standard C++ library. All
code must run on a Linux machine. We will grade your code on a Linux machine.
Note: It is very important that you are familiar with the provided codebase to
be able to finish the required tasks (described later). Make sure you
understand the following two sections before attempting the problems in
Section 3. Also check out Tutorial 1 for basic information on project
architecture and building and working on the project code.
Setting up the Base Code
We have prepared the basic code skeleton for you to start with. Assuming you
have set up your repo as in Project Milestone 0, issue the following commands
to get it in your code repo:
$ git fetch –all
$ git merge base/master –allow-unrelated-histories
Code Structure
Now you should see a new yase_internal.h header file and a BufferManager
directory in which there are multiple header (.h) and source (.cc) files:
- buffer_manager.{h,cc}: buffer manager implementation, including page frame management, replacement algorithm.
- table.{h,cc}: user-visible table abstraction. The test program uses structures defined and implemented in these files to manipulate data.
- file.{h,cc}: implements storage files that hold data for tables. Each table corresponds to a file.
- page.{h,cc}: data and directory page abstractions.
Building the Code
Dependencies: The code depends on the glog, g_ags, gtest libraries for
logging, handling command-line parameters and testing, and CMake for
configuring and building. You will need to have these libraries installed in
your system (or build your own). If you are using CSIL server, then all these
packages should have already been installed. Each Linux distribution has its
own package manager so you will need to find out yourself. Below are
guidelines for Arch Linux and Ubuntu.
For Arch Linux users: The following command will get you all set directly:
$sudo pacman -S cmake gtest gflags google- glog.
Proceed to the “Building the code” section.
For Ubuntu users: To install these libraries in your own Ubuntu system, issue
the following commands:
$ sudo apt-get install cmake
$ sudo apt-get install libgoogle-glog-dev libgflags-dev libgtest-dev // Note the ‘dev’ suffix
To use gtest on Ubuntu, you need to do some extra work to build gtest by
yourself as the gtest packages in Ubuntu only give source code stored under /usr/src/gtest
. To do so, follow the steps below:
$ mkdir ~/gtest && cd ~/gtest // Create a gtest folder in your own directory
$ cd gtest
$ cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=on /usr/src/gtest // Build a shared library
$ make -jN // Replace N with the number of threads you’d like to use for compiling
Then issue sudo make install
if you are using your own machine. If you are
using CSIL machine, you may skip this step, and change your top-level
CMakeLists.txt to include the library install path using -L/home/[username]/gtest
when linking with gtest (i.e., tell the compiler
where to find libgtest.so). This can be done by adding the following to the
top-level CMakeLists.txt: link_directories("/home/[username]/gtest")
.
Finally, in some environments you may need to also add -lpthread to your
CMakeLists.txt (e.g., by adding a new line link_libraries("-lpthread -pthread")
.
Building the code: Once you have all the dependencies set up, use the
following commands to build the project:
$ mkdir build && cd build // Create a separate directory to build the project
$ cmake -DCMAKE_BUILD_TYPE=[Debug/Release/RelWithDebInfo] ..
$ make -jN
YASE Architecture
We design YASE to be a row-store that is capable of storing fixed-length
records (size can be specified by the user) using slotted pages. Free and
occupied pages are tracked by directory pages. Each table is backed by a
single file that stores both directory and data pages.
Supported operations
Insert, delete, read, and update a record. The BufferManager/table.{h,cc}
files define these user- visible interfaces. The insert operation will return
the record ID (RID) of the inserted record, the other operations require an
RID as the input and returns true if the operation succeeded. RID and page ID
are defined in yase_internal.h.
Table and Space Management
Each table is backed by a file (represented by the File structure defined in
file.{h,cc}). We use directory pages to manage pages in the table. Upon
initialization, the user specifies the maximum amount of data that can be
stored by the table (see Table’s constructor in table.h). Since the maximum
size of a file is known, the number of directory pages is also fixed. In each
file, we store all directory pages in the beginning of the file, and each
directory entry includes two field: free_slots and allocated, which represents
the number of free slots in the corresponding data page and whether the data
page was allocated.
Data Page
Data pages follow the bit array design where at the end of the page is a
header area that includes (1) record count and (2) a bit array (one bit per
slot to indicate whether the slot is vacant). Data records are stored one
after another starting from the beginning of the page. To delete a record, we
simply reset the slot’s bit in the bit array to 0 and decrement record count,
then increment the free_slots field in the corresponding directory page.
Page, File, Record IDs
RIDs are 8-byte integers subdivided into three parts (file ID, page ID and
slot ID) occupying different parts of the 8-byte (64 bits). The file ID
indicates which file (table) this record belongs to; the page ID indicates
which page inside the file stores the record; the slot ID indicates the slot
in the page storing the record. File IDs are 16-bit integers, while both page
and slot IDs are 24-bit long. Detailed definitions for PageID and RID can be
found in yase_internal.h. An important note is that page IDs are local to
files, i.e., page IDs run from 0 to maximum for each different file ID.
Similarly, slot IDs are page-local, running from 0 to maximum for each
different page.
Tasks
There are three tasks in this project milestone. Storage-level functionality
(insert/delete/read/update) are already provided in the base code. You will
need to finish the missing buffer pool functionality based on the given code.
Since the code is missing in functionality, buffer_manager_test will not pass
until you correctly implement the required components.
Task 1 - Buffer Pool Structure [40 marks]
The buffer pool is defined by the BufferManager class in buffer_manager.h.
First, finish the constructor function to initialize the buffer pool with an
array of page frames (described by the Page structure). Detailed steps are
provided in the code. Also finish the destructor function to free the page
frames allocated upon class destruction.
Next, follow the approach discussed in class (slide 5 of Storage Management -
Part 2), set up a PageID - Page Frame mapping. Note that PageIDs are local to
each file. Your buffer manager should support multiple tables. That is, you
should setup another FileID - File object mapping in order to be able to
process pages belonging to different table.
Hint: For both mappings, you may use std::map or std::unorderd_map. An example
is included in the code at lines 83-87 of buffer_manager.h
Next, implement the PinPage and UnpinPage functions that pins and unpins a
page with a given page ID. PinPage is used by the Table and File classes to
load a page in the buffer pool in order to modify (e.g., insert record) it.
PinPage’s signature is Page *PinPage(PageID page_id, bool new_page = false);
. The return value ( Page *
)is a page frame that contains a page backed
by storage.
The UnpinPage function is as simple as just decrementing the pin count in the
Page structure provided.
Check the buffer_manager.{h,cc} for details. You are free to modify the buffer
manager code (including removing/replacing variable/function definitions), but
you are not allowed to change the insert/delete/read/update interfaces
exported by table.h.
Task 2 - LRU Replacement Algorithm [40 marks]
Your PinPage function should use LRU to evict a page when needed. You are free
to use different approaches to implement LRU. One way is to follow the
approach that is discussed in class, by using a queue of unpinned pages. A
page frame (i.e., Page*) is added to the tail of the queue when its pin count
is zero. The head page on the queue will be the candidate for eviction. When
eviting a page, make sure to properly _ush the page back back using
File::FlushPage.
Task 3 - Testing Your Code [20 marks]
Under the Tests directory a new test program (buffer_manager_test.cc) is given
to get you started on testing your code. You task here is to add one more test
case that covers more than one table. You may follow the similar structure and
operations of the existing test, but modify it to use two different tables.
We will test your code using more complex cases so you are encouraged to use
this code as a starting point and vary as many parameters as possible to test
your code.
Submission
You need to do two things to submit your solution:
- Add a CONTRIBUTIONS file that documents each group member’s contribution to your solution. If you are doing the project alone, you may just indicate this in the CONTRIBUTIONS file.
- Check in (commit and push!) your code and the CONTRIBUTIONS file to your project repo by the deadline.
There is no need to submit anything to CourSys (grades will be posted there,
though). Make sure you are familiar with the course policies. Especially, we
do not accept late submissions or submissions by means other than specified
here. You can get partial marks for coding style if your code does not
compile. You can (and should) also continue to work on your solutions
throughout the semester to possibly get bonus marks in the end of the
semester. Sample solutions will not be provided for course project.