Requirement
This project is intended to integrate many aspects of OS design and
implementation, from scheduling, to synchronization, to memory management, and
file systems. You are to implement this in the xv6 OS (I’ll provide a repo
link via Piazza). You will implement microkernel services on top of xv6 which
is a monolithic OS! This is a large and hard project and is scoped around
three team members. If you choose a smaller team, you’re still responsible for
the entire project, so I suggest you form your team soon. The most important
thing you can do to ensure you finish on time is to start early. Like now. To
the keyboard!
Final goal: OS Service to do File-system Checkpoint-Restore
The final goal of the project is to implement a File-System Server (FSS) –
implemented as a normal process – that communicates with the other processes
in the system (called “clients”), receiving requests for file system services
(e.g. similar to open, read, write, close, mkdir, unlink), and it will make
those calls on behalf of the requesting process. However, it will also be
smart enough to enable the “undo” of any changes made to the file system by
those processes. It will do this by saving the contents of files and
directories to a “checkpoint/” subdirectory when modifications are made. To
restore this checkpoint, a simple program can copy everything in checkpoint/
back into the normal file system. The communication with the FSS is
through shared memory regions coordinated with mutexes and condition variables
(which xv6 does not already provide).
Specification and Implementation Plan
There are three main modules to this project.
- The FSS which uses the normal kernel API to provide access to the file-system, and also to perform the operations to save files/directories when they are modified for checkpointing. This requires zero kernel programming.
- The shared memory support between the FSS, and the processes that are using the FSS’s checkpointing support. This shared memory region is used to pass both the request being made (i.e. which type of call it is: read, open, etc…), and the corresponding data that is required for that call (the path and flags for open, the buffer and its length for read, etc…). This will require kernel hacking to make a shared memory region between the client and the FSS.
- The synchronization code necessary to provide synchronization on that shared buffer. This will include both mutexes, and the ability to block waiting for an event (i.e. a new client request) – a function normally provided by condition variables. This will require kernel hacking to implement mutexes for user-level, and to add logic for condition variables. The FSS process should always execute with high priority as it is providing a service to other processes in the system.
There are varying levels of support for each of these three components, and
you should be able to develop each relatively independently, so if one person
does not meet their commitment, the others should be able to still make
progress. Level 1 is intended to be very basic functionality. Leveling up from
there strengthens your project. Only a project at level 5 will receive
full credit. Each level assumes the functionality from the previous levels
(i.e. you can’t get level 2 without achieving level 1).
A note on group work and collaboration: I designed these modules to be
independent, but that does not mean that they are of equal difficulty. Do not
plan on simply assigning one to each team member, and when one of the modules
is complete assume that team member is “done”. After a module is done, you
must help the other members on the other modules. Each team member must stay
up-to-date with every other teammate. You should use GitHub to coordinate your
code modifications. In fact, I highly suggest that once each of the
implementations gets to Level 1, you start integrating them together. Once
integrated, you can keep leveling up in each module.
Module #1: File-System Server
The FSS uses the normal kernel API and is a normal process. It uses IPC to
talk to clients. Those clients make requests to it to access the file system
on their behalf. This is not very useful on its own, but the FSS is smart
because it provides a checkpoint and restore functionality. This means that
when the FSS starts, it begins recording all files and directories that are
modified by clients. It will record these files/directories in the
/checkpoint/ directory. So for example, if /hw.txt exists, and you wrote a
program to write to that file (via IPC with the FSS), then the FSS would copy
the /hw.txt file to /checkpoint/hw.txt, and then perform the write on the
client’s behalf. If you write a program to remove a directory, /foo/, then the
FSS would add /checkpoint/foo/ before removing the directory for the client. A
separate program called restore can be executed to restore all files and
directories in the /checkpoint/ directory into their original location. This
will undo all edits made in the meantime. You do not have to track files and
directories that are created, nor remove them upon restore. This module
requires no kernel programming.
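For example, the checkpoint step for a single file can be an ordinary user-level copy performed before the modifying call is forwarded. A minimal sketch (error handling and directory creation kept to a minimum; the paths are the ones from the example above, and everything here is illustrative rather than required):

#include "types.h"
#include "stat.h"
#include "user.h"
#include "fcntl.h"

/* Copy src (e.g. "/hw.txt") to dst (e.g. "/checkpoint/hw.txt") before the
 * FSS performs a modifying operation on src.  Real code would also create
 * intermediate directories and skip files that were already saved. */
int
checkpoint_file(char *src, char *dst)
{
  char buf[512];
  int in, out, n;

  if((in = open(src, O_RDONLY)) < 0)
    return -1;                    /* source does not exist: nothing to save */
  if((out = open(dst, O_CREATE | O_WRONLY)) < 0){
    close(in);
    return -1;
  }
  while((n = read(in, buf, sizeof(buf))) > 0)
    write(out, buf, n);
  close(in);
  close(out);
  return 0;
}

int
main(void)
{
  mkdir("/checkpoint");           /* ignore failure if it already exists */
  checkpoint_file("/hw.txt", "/checkpoint/hw.txt");
  exit();
}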
You’ll first need to understand how the client and the FSS communicate via
IPC. Each of the file-system system calls needs a corresponding FSS operation,
prefixed with fss_. The client will call these operations instead of the
normal system calls. Thus, you will need to provide your implementations of
fss_read, fss_write, fss_open, fss_close, fss_mkdir, and fss_unlink. Note that
open and mkdir are used to create files and directories, respectively, and
unlink is used to remove files and directories. These functions pass, through
IPC to the FSS, which function is being performed (e.g. you could pass it as a
string) and the arguments being passed to the function. For example, you could
define a structure:
struct fss_request {
char operation[7];
int arg;
char data[1024];
};
Where the operation field contains the operation (“mkdir”, “close”, etc…), arg
contains any numerical argument passed to the function, and data contains any
actual data that is meant to be passed. Once you fill out this structure, you
can pass it over IPC to the FSS, and it can read the structure and convert the
request into its own logic for making real file-system system calls to the
kernel. It is cleaner to use an enumerated type for the operation.
A difficult part of this arrangement is that the FSS must return the return
value from the kernel back to the client, and in the case of read, we must
return the data that was read. So you’ll likely want an fss_response
structure that holds that response as well (e.g. a couple of ints, and an
array for the data). It can send this structure via IPC back to the client
that made the request.
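As one possible shape (the field names and sizes here are illustrative, not required), a response structure mirroring fss_request might look like:

struct fss_response {
  int retval;          /* return value of the underlying system call */
  int data_len;        /* number of valid bytes in data (e.g. for read) */
  char data[1024];     /* data returned to the client, if any */
};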
The different levels of completion for this module are:
- Level 0: Use a protocol to send FS commands to the FSS. This includes which system call is being made, what arguments are being passed to it, and what the return value is.
- Level 1: Provide checkpoint support only, not restore.
- Level 2: Checkpoint and restore a single file that is modified by a client with reads and writes.
- Level 3: Checkpoint and restore a single directory of files that are modified with read and write, and files that are created and removed with open and unlink.
- Level 4: Provide checkpoint and restore for a directory hierarchy.
Independent testing: To avoid having to wait for the other modules to be
completed, you can first implement your FS server to communicate through pipes
with a client process. Later, you can integrate with the shared memory and
synchronization modules and use that form of IPC instead. Note that when you
do that, you’ll likely want to have a mutex/condition variable protecting each
direction of communication (request and response).
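As a concrete starting point for that kind of testing, here is a minimal pipe-based harness in which the parent plays the FSS and a forked child plays the client. This is only a sketch: fss.h is a hypothetical header holding the fss_request/fss_response structures shown above, and only the mkdir operation is handled.

#include "types.h"
#include "stat.h"
#include "user.h"
#include "fss.h"   /* hypothetical header with the request/response structs above */

/* xv6 pipes buffer only 512 bytes and read() may return a short count, so
 * loop until the whole structure has arrived. */
int
readn(int fd, char *buf, int n)
{
  int got, r;

  for(got = 0; got < n; got += r){
    r = read(fd, buf + got, n - got);
    if(r <= 0)
      return -1;
  }
  return n;
}

int
main(void)
{
  int req[2], resp[2];
  struct fss_request rq;
  struct fss_response rs;

  pipe(req);                      /* client -> FSS */
  pipe(resp);                     /* FSS -> client */

  if(fork() == 0){
    /* Child plays the client: ask the FSS to mkdir("/demo"). */
    strcpy(rq.operation, "mkdir");
    rq.arg = 0;
    strcpy(rq.data, "/demo");
    write(req[1], (char*)&rq, sizeof(rq));
    readn(resp[0], (char*)&rs, sizeof(rs));
    printf(1, "client: mkdir returned %d\n", rs.retval);
    exit();
  }

  /* Parent plays the FSS: receive one request, perform it, reply. */
  readn(req[0], (char*)&rq, sizeof(rq));
  if(strcmp(rq.operation, "mkdir") == 0)
    rs.retval = mkdir(rq.data);
  else
    rs.retval = -1;
  rs.data_len = 0;
  write(resp[1], (char*)&rs, sizeof(rs));
  wait();
  exit();
}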
Module #2: Shared Memory
Shared memory is often used for IPC between processes. It is more efficient
than message passing (e.g. pipes) as you can often have fewer memcpy
operations (e.g. zero or one vs. two). This module will add shared memory to
your project. Xv6 does not support shared memory, so this will require
understanding the xv6 mechanisms for mapping memory into processes, and how
that interacts with calls to sbrk (extending the amount of memory accessible
by a process), and fork/exec/exit. The API of system calls that you’ll add to
xv6 includes:
- char *shm_get(char *name, int name_len) - Map a 4096-byte page into the calling process’s virtual address space. For the simplest implementation, it should likely map at the address that a sbrk would normally use next. A subsequent call to shm_get or sbrk should still work (i.e. it can’t overlap with the region a previous call returned). Return NULL if name is NULL or if the memory cannot be mapped. The name (and the int describing how long the name string is) identifies which shared memory region we want to map in. If there are no shared memory regions, and a process calls s = shm_get("ggez", 5); then the system will allocate a page (e.g. with kalloc()), and map it into the process. If another process then makes the call with the same name, then it will map in the same page, thus creating shared memory.
- int shm_rem(char *name, size_t name_len) - This removes a mapping of the shared page from the calling process’s virtual address space. If there are no more references to the shared page, then it is deallocated, and any subsequent mappings will return a new page. This returns -1 if the mapping with the given name doesn’t exist, or if name is NULL.
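As a usage sketch of these calls (this assumes they have been wired into the user-level API, e.g. user.h and usys.S; the string written is arbitrary):

#include "types.h"
#include "stat.h"
#include "user.h"

int
main(void)
{
  char *p, *c;

  p = shm_get("ggez", 5);          /* map (or create) the region named "ggez" */
  if(p == 0){
    printf(1, "shm_get failed\n");
    exit();
  }

  if(fork() == 0){
    /* The child maps the region by name itself, which works even before
     * fork inheritance (Level 2) is implemented. */
    c = shm_get("ggez", 5);
    strcpy(c, "hello from child");
    shm_rem("ggez", 5);
    exit();
  }

  wait();                          /* the child has written by now */
  printf(1, "parent sees: %s\n", p);
  shm_rem("ggez", 5);
  exit();
}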
Tracking the shared memory regions should be performed in two structures in
the code.
- You can create a global kernel structure that tracks the page that is used as the shared memory, the name associated with it (which can be looked up on shm_get), and a “reference count” which tracks how many processes have the shared page mapped into them. Once the reference count is zero, you can deallocate the page. You will support a maximum of SHM_MAXNUM shared memory pages in the system, thus your global structure will likely be an array of this tracking information. Feel free to place this in vm.c.
- Each process must track which shared memory regions (by name) it has mapped inside of itself, and at which virtual address they are mapped. This is required so that you can support shm_rem, and so that when the process forks, exits, or execs, you can know which shared memory regions are getting more, or fewer, references. A sketch of both structures follows this list.
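A sketch of what those two structures might look like (SHM_MAXNUM, the type names, and the fields are placeholders you would choose yourself; uint comes from xv6’s types.h):

#include "types.h"

#define SHM_MAXNUM 16            /* maximum number of named shared pages */
#define SHM_NAMELEN 16           /* longest region name accepted */

/* Global table, e.g. in vm.c: one entry per named shared page. */
struct shm_region {
  char name[SHM_NAMELEN];        /* name passed to shm_get() */
  char *page;                    /* page from kalloc(); 0 if this slot is free */
  int refcount;                  /* how many processes have it mapped */
};
struct shm_region shm_table[SHM_MAXNUM];

/* Per-process bookkeeping, e.g. a field added to struct proc in proc.h: */
struct shm_mapping {
  struct shm_region *region;     /* entry in shm_table; 0 if this slot is unused */
  uint va;                       /* virtual address the page is mapped at in this process */
};
/* inside struct proc:  struct shm_mapping shm[SHM_MAXNUM]; */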
You will have to read through a fair amount of the code in vm.c and proc.c to
understand how to implement all of this, and a good place to start is with
sbrk.
The different levels of completion for this module are:
- Level 0: System calls and kernel data-structures to track SHM_MAXNUM shared memory regions (tracking processes, and the virtual memory addresses the regions are mapped into).
- Level 1: Implement the system calls to share memory between two processes, including mapping in the actual shared memory.
- Level 2: Maintain the shared memory region across forks of the process.
- Level 3: Thoroughly test that your shared memory region is properly maintained, or deallocated, depending on which processes fork and exit. This will require reference counting on the shared memory region, and you only deallocate the memory for it when all processes accessing it exit or unmap it.
Independent testing: You should be able to test this implementation
independently from the other modules by simply using multiple processes to
create the shared memory region, and test that it is functional.
Module #3: Synchronization and Scheduling
Module 2 provides shared memory between processes. Now that we have shared
memory, we must synchronize access to it! We want to provide a means for
passing requests to the FSS, and for synchronizing the return value being
passed to the client. Xv6 does not provide any synchronization for memory
shared between user-level processes, so your job is to implement a set of
mutex system calls and, to enable processes to block waiting for
requests/responses, you’ll also provide condition variable system calls. The
prototypes of these follow:
- int mutex_create(char *name, int name_len); - return a muxid which is an integer identifier for a mutex (analogous to a file descriptor which is an integer that identifies a file), or return -1 if it cannot allocate the mutex.
- void mutex_delete(int muxid); - deallocate the mutex if we are the last process referencing it (i.e. all other processes that previously mutex_created it have mutex_deleted it, or have exited/execed).
- void mutex_lock(int muxid); - Lock a mutex identified by the mutex id. If the mutex is not owned by another thread, then the calling thread now owns it, and has mutual exclusion until the mutex is unlocked. If the mutex is already owned, then we must block, awaiting the other process to release the mutex.
- void mutex_unlock(int muxid); - Unlock a mutex that was previously locked by us. If other threads have since tried to take the mutex, and are blocked awaiting entry, we should wake one of them up to go into the critical section.
- void cv_wait(int muxid); - The condition variable API is vastly simplified compared to the pthread version. We assume that each mutex can have a single condition associated with it. This works well if we just want the FSS waiting on the condition that a client sends a request, and that clients will block waiting on the condition that the FSS replies to them. Because of this, we need only specify the mutex id that the condition variable is associated with. As with condition variables, the calling thread will release the mutex, and block waiting on the condition variable (unless it has already been signaled). Note that you must implement blocking for the mutex separate from blocking for the condition variable or you are likely to break mutual exclusion.
- void cv_signal(int muxid); - wake up any threads that are blocked on the condition variable associated with the mutex identified by its mutex id. This doesn’t alter the mutex at all, and only wakes up any blocked threads.
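To see how these calls fit together, here is a sketch of the FSS side of the request channel, assuming the shared page from Module 2 holds an fss_request plus a flag. The channel layout, request_ready, handle_request(), and fss.h are all illustrative, not part of the required API; the mutex/cv calls would be declared in user.h once you add the system calls.

#include "types.h"
#include "stat.h"
#include "user.h"
#include "fss.h"                 /* hypothetical header with struct fss_request */

/* Layout of the shared request page: request_ready is set by the client
 * and cleared by the FSS once the request has been consumed. */
struct channel {
  int request_ready;
  struct fss_request req;
};

void handle_request(struct fss_request *req);   /* hypothetical: performs the real system call */

void
fss_loop(struct channel *ch, int req_mux)
{
  for(;;){
    mutex_lock(req_mux);
    while(ch->request_ready == 0)
      cv_wait(req_mux);          /* releases req_mux while blocked, holds it again on return */
    handle_request(&ch->req);
    ch->request_ready = 0;
    mutex_unlock(req_mux);
    /* The response then goes back under a second mutex/condition variable
     * pair, one per direction, as suggested in Module 1's testing note. */
  }
}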
This will, again, require some data-structures in the kernel:
- A global structure which is the array of mutexes that can be created in the system. These are similar to those for shared memory, but there can be at most MUX_MAXNUM of them. I would suggest that each of these tracks the mutex name, and whatever you need to track for the threads blocked on the mutex and on the condition variable associated with the mutex.
- An array per process that is indexed by the mutex id, with each entry simply containing a reference to the global mutex. This enables different processes to have different mutex ids for the same mutex, as they are looked up in per-process data-structures. It also allows you to “clean up” the mutexes if the process exits. A sketch of these structures follows this list.
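One possible layout for this kernel bookkeeping (a sketch, e.g. in a new mutex.c; MUX_MAXNUM and every name here are placeholders):

#include "types.h"
#include "spinlock.h"

struct proc;                     /* defined in proc.h */

#define MUX_MAXNUM 16            /* maximum number of mutexes in the system */
#define MUX_NAMELEN 16

struct mux {
  struct spinlock lk;            /* kernel spinlock guarding this entry */
  char name[MUX_NAMELEN];        /* name passed to mutex_create() */
  int refcount;                  /* processes currently holding a muxid for it */
  int locked;                    /* 1 while some process owns the mutex */
  struct proc *owner;            /* current owner, for cleanup on fault/exit */
  int cond;                      /* sleep channel for cv_wait()/cv_signal() */
};

struct mux muxtable[MUX_MAXNUM];

/* Per-process view, e.g. a field added to struct proc in proc.h:
 *   struct mux *muxes[MUX_MAXNUM];   // indexed by muxid
 */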
You cannot use spinning techniques to provide synchronization; you must
instead block threads that are waiting for an occupied critical section, or
that are waiting for some condition to become true. Note that the in-kernel
implementation of locks (see the acquire and release functions) uses
spinlocks for cross-CPU synchronization and simply disables interrupts during
the critical section. You cannot simply call these for your mutexes: that
would result in user-level code executing with interrupts disabled, which is
not safe or secure. You must implement your own blocking for your mutexes and
condition variables. Make sure to understand the sleep/wakeup functions to
help you out.
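As a rough illustration of that blocking pattern, here is a sketch of kernel-side lock/unlock helpers that build on the hypothetical struct mux above (and would live in the same file). acquire, release, sleep, wakeup, and myproc are the real xv6 kernel functions; everything else is an assumption. Your sys_mutex_lock/sys_mutex_unlock handlers would translate the muxid through the per-process array and then call something like these:

/* Continues the mutex.c sketch above; needs defs.h for the prototypes. */
void
mux_lock(struct mux *m)
{
  acquire(&m->lk);
  while(m->locked)               /* owned by someone else: block, don't spin */
    sleep(&m->locked, &m->lk);   /* atomically releases m->lk and retakes it on wakeup */
  m->locked = 1;
  m->owner = myproc();
  release(&m->lk);
}

void
mux_unlock(struct mux *m)
{
  acquire(&m->lk);
  m->locked = 0;
  m->owner = 0;
  wakeup(&m->locked);            /* wakes every sleeper on this channel; each re-checks the loop */
  release(&m->lk);
}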
Once you have everything else working, you can level up your implementation to
the max level. The highest you can level up to is to implement a simple
version of Linux futexes (Google it). The idea is that when a mutex is
uncontended (no thread is in the critical section), you should be able to use
simple compare-and-swap instructions at user-level to set a bit in a word in
the shared memory denoting that the lock is now taken. Another thread
attempting the same will see the bit set, will set another bit showing that
the mutex is “contended”, and will instead make a system call to the kernel to
block on the mutex. When the mutex owner releases the mutex (unlocks it), it
will use compare-and-swap to unset the taken bit, and when it notices that the
contended bit is set, it will make a system call to wake the threads waiting
to enter the critical section. In this way, a thread that simply locks and
unlocks an uncontended lock does so using only a couple of atomic
compare-and-swap instructions and completely avoids making system calls. Only
when a thread contends a taken lock will system calls be made.
A naive implementation of this (as spelled out here) is quite good. But there
are some race conditions that you’ll need to solve to get a 100% solution. If
you implement an xv6 equivalent of futexes, you can augment the mutex API
(likely the create function) to include the address of the lock in the shared
memory.
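A sketch of what that user-level fast path could look like: the lock word lives in the shared memory page, mutex_block() and mutex_wakeup() stand in for however you augment the mutex API with the word’s address (hypothetical names), and the atomic helpers are modeled on xchg() in xv6’s x86.h. This is the naive scheme described above, not a race-free final solution.

#define UNLOCKED   0
#define LOCKED     1
#define CONTENDED  2

/* Hypothetical augmented syscalls: block while *word == CONTENDED, and
 * wake the threads blocked on word, respectively. */
void mutex_block(volatile int *word);
void mutex_wakeup(volatile int *word);

/* Atomic exchange and compare-and-swap, modeled on xv6's xchg(). */
static inline int
axchg(volatile int *addr, int newval)
{
  int result;
  asm volatile("lock; xchgl %0, %1"
               : "+m" (*addr), "=a" (result)
               : "1" (newval)
               : "cc", "memory");
  return result;
}

static inline int
acas(volatile int *addr, int oldval, int newval)
{
  int result;
  asm volatile("lock; cmpxchgl %2, %1"
               : "=a" (result), "+m" (*addr)
               : "r" (newval), "0" (oldval)
               : "cc", "memory");
  return result;                  /* the value that was in *addr */
}

void
futex_lock(volatile int *word)
{
  /* Fast path: take a free lock with one compare-and-swap, no system call. */
  if(acas(word, UNLOCKED, LOCKED) == UNLOCKED)
    return;
  /* Slow path: advertise contention, then block in the kernel.  If the
   * exchange finds the word UNLOCKED we now own the lock (it is marked
   * CONTENDED, which at worst causes one spurious wakeup at unlock). */
  while(axchg(word, CONTENDED) != UNLOCKED)
    mutex_block(word);
}

void
futex_unlock(volatile int *word)
{
  /* Release, and only enter the kernel if some waiter marked contention. */
  if(axchg(word, UNLOCKED) == CONTENDED)
    mutex_wakeup(word);
}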
The different levels of completion for this module are:
- Level 0: Implement the system calls to allocate and free mutexes, along with the kernel data-structures to track the mutexes. Schedule the FSS at the highest priority.
- Level 1: Logic in the kernel to block processes locking a mutex that is already locked, and to wake up a thread blocked on a mutex for the same reason. Mutual exclusion is required.
- Level 2: Implement condition variables that interoperate properly with the mutexes from the previous level.
- Level 3: Handle the case where a process terminates accidentally (faults). Release any mutexes it holds, and ensure that it is no longer blocked on conditions.
- Level 4: Implement your mutexes in a manner where, if a critical section is not owned by another thread when a thread attempts to lock it, the lock is acquired without making a system call, similar to the futexes spelled out above.
Independent testing: The mutex API is specially formulated to not require
shared memory. Mutexes are simply associated with a name, so any process can
get one, and lock/unlock it. This means that two processes that don’t even
share memory can be used to test the mutual exclusion of the mutex. The same
applies to the condition variables.
Overall Project
The levels for the overall project are:
- Level 0: Level 1 in one module.
- Level 1: Level 1 in at least two modules.
- Level 2: Level 1 in three modules, and level 2 in at least two. Two of the modules must be integrated together.
- Level 3: All three modules must be integrated together.
- Level 4: One below the highest level in each module.
- Level 5: Highest level in all modules.