完成操作系统中,User Level层 Threads
编程。
![Thread](https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Multithreaded_process.svg/220px-
Multithreaded_process.svg.png)
Background on Threads
Threads and processes are key abstractions for enabling concurrency in
operating systems. To gain a deeper understanding of how these abstractions
are constructed, you will build the core of a userlevel threads package.
Implementing kernel-level threads and processes is not much different, but
we’ll do this assignment at user level since (a) installing new OS kernels on
the teach.cs machines is problematic and (b) debugging user-level code is much
easier than debugging kernel-level code.
Threads provide the illusion that different parts of your program are
executing concurrently. In the de facto standard model of multithreaded
execution, threads share the code, heap, and the runtime system. Each thread,
however, has a separate stack and, naturally, a separate set of CPU registers.
This programming model also provides synchronization primitives so that
different threads can coordinate access to shared resources, but we will leave
those concerns for the next assignment.
User-Level vs. Kernel Threads
For practical reasons, this assignment is done at user level: you will
construct user threads by implementing a set of functions that your program
will call directly to provide the illusion of concurrency. In contrast, modern
operating systems provide kernel threads, and a user program invokes the
corresponding kernel thread functions via system calls. Both types of threads
use the same core techniques for providing the concurrency abstraction; you
would build kernel threads in essentially the same way you build user threads
in this assignment. Also, kernel processes are built using these techniques.
However, there are a few differences between kernel and user threads.
Multiprocessing
User-level threads provide the illusion of concurrency, but on machines with
multiple processors kernel-level threads can provide actual concurrency. With
user-level threads, the OS schedules the user process on one CPU, and the
user-level threads package multiplexes the kernel thread associated with the
process between one or more user-level threads. With kernel-level threads, the
OS is aware of the different (kernel) threads, and it can simultaneously
schedule these threads from the same process on different processors.
A key simplifying assumption for this assignment is that you will allow
programs to multiplex some number (e.g., m) of user-level threads on one
kernel thread. This means that at most one user-level thread is running at a
time and that your runtime system has complete control over the interleaving
of user-level threads with each other. More sophisticated systems implement m
on n threads packages where m user-level threads are multiplexed across n
kernel threads.
Asynchronous I/O
When a user-level thread makes a system call that blocks (e.g., reading a file
from disk), the OS scheduler moves the process to the Blocked state and will
not schedule it until the I/O has completed. Thus, even if there are other
user-level threads within that process, they also have to wait. In contrast,
when a kernel thread blocks for a system call, the OS scheduler may choose
another kernel thread from the same process to run. Thus, some kernel threads
may be running while others are waiting for I/O.
Timer interrupts
The OS scheduler uses timer interrupts to preempt a running kernel thread and
switch the CPU to a different runnable kernel thread. Similar to blocking
system calls, this stops the execution of all user-level threads in the
process until the kernel thread is scheduled to run again. However, to switch
between user-level threads that are multiplexed on a single kernel thread, we
cannot rely on timer interrupts (those are delivered to the OS, and not to our
thread library runtime). Instead, you will implement a cooperative threading
model, where the currently running user-level thread must explicitly yield the
processor to another user-level thread by calling a function provided by your
library.
In the next assignment, we will simulate timer interrupts that cause the
scheduler to switch from one thread or process to another by using POSIX
signals. In your implementation, the threads library will “turn off
interrupts” by blocking delivery of these signals using system calls. However,
there is nothing to prevent the threads, themselves, from “turning off
interrupts” the same way. Thus, even though we will implement “preemptive”
threads, a “malicious” thread could turn off interrupts and not be preempted
until it calls yield, thus hogging the CPU. Note that kernel schedulers don’t
have this problem. Only the privileged code in the kernel can turn off the
real timer interrupts.
Using Threads
With your threads library, a typical program will look something like this:
int main(int argc, char **argv)
{
// Some initialization
// Create some threads
// wait for threads to finish
// exit
}
// “main” function for thread i
thread_main_i (…)
{
// do some work
// yield
// do some more work
// return (implicit thread exit)
}
—|—
Here thread_main_i is a programmer-supplied “main” function that the ith
thread starts executing when it is first scheduled to run after its creation.
(Note that different threads may have the same “main” function.) The thread
can perform useful work by calling any other functions in the program, or
voluntarily yielding to other threads.
A thread exits either explicitly or implicitly. It exits explicitly when it
calls the thread_exit function in the thread library. It exits implicitly when
its thread_main function returns. Additionally, to add more control to the
program, a thread may call thread_kill to force other threads to exit as well.
Cooperative Threads API
A key simplifying assumption in this assignment is that the threads are
cooperative, i.e., each thread runs until it explicitly releases the CPU to
another thread by yielding or by exiting. In contrast, preemptive threading
systems allow a scheduler to interrupt a running thread at any time and switch
the CPU to running a different thread.
The thread package provides several functions to allow application programs to
perform thread management. In addition, there are a few conventions that
application programs must adhere to in order to ensure proper and safe
operation. A list of the functions that constitute the user-level threads API
can be found in the thread.h file. The functions that you will be implementing
for this assignment are summarized here:
void thread_init(void)
You can use this function to perform any initialization that is needed by your
threading system. Here, you should also hand-craft the first user thread in
the system. To do so, you should configure your thread state data structures
so that the (kernel) thread that is running when your program begins (before
any calls to thread_create) will appear as the first user thread in the system
(with tid = 0). You will not need to allocate a stack for this thread, because
it will run on the (user) stack allocated for this kernel thread by the OS.
Tid thread_id()
This function returns the thread identifier of the currently running thread.
The return value should lie between 0 and THREAD_MAX_THREADS. See Solution
Requirements below.
Tid thread_yield(Tid to_tid)
This function suspends the caller and activates the thread given by the
identifier to_tid. The caller is put on the ready queue and can be run later
in a similar fashion. A reasonable policy is to add the caller to the tail of
the ready queue to ensure fairness (so all other threads are run before this
thread is scheduled again - see the THREAD_ANY argument below). The value of
to_tid may take the identifier of any available thread. It also can take any
of the following constants:
- THREAD_ANY: tells the thread system to run any thread in the ready queue. A reasonable policy is to run the thread at the head of the ready queue, which ensures fairness. This policy is called FIFO (first-in, first-out), since the thread that first entered the ready queue (among the threads that are currently in the ready queue) is scheduled first.
- THREAD_SELF: tells the thread system to continue the execution of the caller. This function could be implemented as a no-op, but it may be useful to explicitly switch to the current thread for debugging purposes.
The thread_yield function returns the identifier of the thread that took
control as a result of the function call. Note that the caller does not get to
see the result until it gets its turn to run (later). The function may also
fail and the caller continues execution immediately. To indicate the reason
for failure, the call returns one of these constants: - THREAD_INVALID: alerts the caller that the identifier to_tid does not correspond to a valid thread.
- THREAD_NONE: alerts the caller that there are no more threads, other than the caller, that are available to run, in response to a call with to_tid set to THREAD_ANY.
Tid thread_create(void ( fn)(void ), void *arg):
This function creates a thread whose starting point is the function fn. The
second argument, arg, is a pointer that will be passed to the function fn when
the thread starts executing. The created thread is put on a ready queue but
does not start execution yet. The caller of the thread_create function
continues to execute after thread_create returns. Upon success, thread_create
returns a thread identifier of type Tid. If thread_create fails, it returns a
value that indicates the reason for failure as follows:
- THREAD_NOMORE: alerts the caller that the thread package cannot create more threads. See Solution Requirements below.
- THREAD_NOMEMORY: alerts the caller that the thread package could not allocate memory to create a stack of the desired size. See Solution Requirements below.
void thread_exit(int exit_code)
This function ensures that the current thread does not run after this call,
i.e., this function should never return. If there are other threads in the
system, one of them should be run. If there are no other threads (this is the
last thread invoking thread_exit), then the program should exit with the
supplied exit_code. A thread that is created later should be able to reuse
this thread’s identifier, but only after this thread has been destroyed. (Note
that we will be making more use of the exit_code in the next assignment.)
Tid thread_kill(Tid victim)
This function kills another thread whose identifier is victim. The victim can
be the identifier of any available thread. The killed thread should not run
any further and the calling thread continues to execute. Upon success, this
function returns the identifier of the thread that was killed. Upon failure,
it returns the following
- THREAD_INVALID: alerts the caller that the identifier victim does not correspond to a valid thread, or is the current thread.
Solution Requirements
- The first thread in the system (before the first call to thread_create) should have a thread identifier of 0.
- Your threads system should support the creation of a maximum of THREAD_MAX_THREADS concurrent threads by a program (including the initial main thread). Thus, the maximum value of the thread identifier should be THREAD_MAX_THREADS - 1 (since thread identifiers start from 0). Note that when a thread exits, its thread identifier can be reused by another thread created later.
- Your library must maintain a “thread control block” (a thread structure) for each thread that is running in the system. This is similar to the process control block that an operating system implements to support process management. In addition, your library must maintain a queue of the threads that are ready to run, so that when the current thread yields, the next thread in the ready queue can be run. Your library allows running a fixed number of threads (THREAD_MAX_THREADS threads), so if it is helpful, you could allocate these structures statically (e.g., as a global array).
- Each thread should have a stack of at least THREAD_MIN_STACK bytes. Your implementation must not statically allocate all stacks at initialization time (e.g., using a global data structure). Instead, you must dynamically allocate a stack (e.g., using malloc()) whenever a new thread is created (and delete one each time a thread is destroyed.)
- Your library must use getcontext and setcontext to save and restore thread context state (see Implementation Details below), but it may not usemakecontext or swapcontext or any other existing C library code to manipulate a thread’s context; you need to write the code to do that yourself.
- Your code must not make calls to any existing thread libraries (e.g., Linux pthreads), or borrow code from these libraries for this assignment.
- Do not use any code from other students, or any code available on the Internet. When in doubt, please ask us.
Implementation Details
Thread Context
Each thread has per-thread state that represents the working state of the
thread – the thread’s program counter, local variables, stack, etc. A thread
context is a subset of this state that must be saved/restored from the
processor when switching threads. (To avoid copying the entire stack, the
thread context includes a pointer to the stack, not the entire stack.) Your
library will store the thread context in a per-thread data structure (this
structure is sometimes called the “thread control block”).
Think carefully about what you need to include in your thread control block
structure, and how these structures will be used to create and manage your
threads. Consider how you will implement your ready queue. Remember that the
thread_yield function allows a thread to yield the CPU to a specified thread,
so you may need to remove a thread from the middle of the ready queue.
Saving and Restoring Thread Context
When a thread yields the CPU, the threads library must save the current
thread’s context, which contains the processor register values at the time the
thread yields the CPU. The library restores the saved context later when the
thread gets its turn to run on the processor again. Additionally, the library
creates a fresh context and a new stack when it creates a thread.
Fortunately, the C runtime system allows an application to retrieve its
current context and store it in a memory location, and to set its current
context to a predetermined value from a memory location. Your library will
make use of these two existing C library calls: getcontext and setcontext.
Study the manual pages ( http://linux.die.net/man/2/setcontext
) of these two calls. Notice that
getcontext saves the current context into a structure of type struct ucontext,
which is typedef’d as type ucontext_t. So, if you allocate a struct ucontext
and pass a pointer to that memory to a call to getcontext, the current
registers and other context will be stored to that memory. Later, you can call
setcontext to copy that state from that memory to the processor, restoring the
saved state. (Hint: You almost certainly want a ‘struct ucontext’ as part of
your thread control block data structure.)
The struct ucontext is defined in /usr/include/x86_64-linux-gnu/sys/ucontext.h
on the teach.cs servers. Look at the fields of this struct in detail,
especially the uc_mcontext and the uc_sigmask fields.
You will use getcontext and setcontext in two ways. First, to suspend a
currently running thread (to run another one), you will use getcontext to save
its state and later use setcontext to restore its state. Second, to create a
new thread, you will use getcontext to create a valid context, but you will
leave the current thread running; you (the current thread, actually) will then
change a few registers in this valid context to initialize it as a new thread,
and put this new thread into the ready queue; at some point, the new thread
will be chosen by the scheduler, and it will run when setcontext is called on
this new thread’s context.
Changing Thread Context
As noted above, when creating a thread, you can’t just make a copy of the
current thread’s context (using getcontext). You also need to make a few
changes to initialize the new thread:
- You need to change the saved program counter register in the context to point to a stub function, described below, which should be the first function the thread runs.
- You need to change the saved argument registers, described below, in the context to hold the arguments that are to be passed to the stub function.
- You need to allocate a new per-thread stack using malloc.
- You need to change the saved stack pointer register in the context to point to the top of the new stack. (Warning: in x86-64, stacks grow down!)
In the real world, you would take advantage of an existing library function,
makecontext, to make these changes to the copy of the current thread’s
context. The advantage of using this function is that it abstracts away the
details of how a context is saved in memory, which simplifies things and helps
portability. The disadvantage is that it abstracts away the details of how a
context is saved in memory, which might leave you unclear about exactly what’s
going on.
In the spirit of “there is no magic”, for this assignment you should not use
makecontext or swapcontext. Instead, you must manipulate the fields in the
saved ucontext_t directly. The tutorial exercise for Week 2 will help you
understand the ucontext structure.
The Stub Function
When you create a new thread, you want it to run the thread_main function that
defines the work you want the thread to do. A thread exits implicitly when it
returns from its thread_main function, much like the main program thread is
destroyed by the OS when it returns from its main function in C, even when the
main function doesn’t invoke the exit system call. To implement a similar
implicit thread exit, rather than having your thread begin by running the
thread_main function directly, you should start the thread in a “stub”
function that calls the thread_main function of the thread (much like main is
actually called from the crt0 stub function in UNIX). In other words,
thread_create should initialize a thread so that it starts in the thread_stub
function shown below. When the thread runs for the first time, it will execute
thread_stub, which will call thread_main. If the thread_main function returns,
it will return to the stub function which will call thread_exit to terminate
the thread.
/* thread starts by calling thread_stub. The arguments to thread_stub are the
* thread_main() function, and one argument to the thread_main() function.
*/
void thread_stub(void (*thread_main)(void *), void *arg)
{
thread_main(arg); // call thread_main() function with arg
thread_exit(0);
}
—|—
In the above code, the argument thread_main is a pointer to the thread_main
function that describes the real work the thread should do. Notice that in C,
a function’s name refers to the address of its code in memory. The second
argument to thread_stub (arg) is the argument to pass to the thread_main
function. We’ll have the thread_main function take an argument that is a
pointer to an arbitrary type so that you can pass it whatever you want.
Contexts and Calling Conventions
The context structure contains many data fields, but you only need to deal
with four of them when creating new threads: the stack pointer, the program
counter, and two argument registers. Other than that, you don’t need to worry
about the fields within the context variable, as long as you do not tamper
with them. Also, it is a good idea to use variables that have been initialized
through a getcontext call in order to not have bizarre behavior.
Notice that while a procedure executes, it can allocate stack space by moving
the stack pointer down (stack grows downwards). However, it can find local
variables, parameters, return addresses, and the old frame pointer (old %rbp)
by indexing relative to the frame pointer (%rbp) register because its value
does not change during the lifetime of a function call.
When a function needs to make a function call, it copies the arguments of the
“callee” function (the function to be called) to the registers shown on the
right in the x86-64 architure. For example, the %rdi register will contain the
first argument, the %rsi register will contain the second argument, etc.
Then the caller saves the current instruction pointer (%rip) into the stack
(shown as “return address” in the figure), and changes the instruction pointer
to the callee function. At this point the stack pointer (%rsp) points to the
return address (shown in the figure). Note that the stack pointer points to
the last pushed value in the stack.
The callee function then starts executing. It first pushes the the frame
pointer value of the caller function (shown as old %rbp) into the stack, and
then sets the frame pointer (%rbp) to the current stack pointer (%rbp = %rsp),
so that it points to the old frame pointer value. Then the callee function
decrements the stack pointer (shown as %rsp), and uses the space between the
frame pointer and the stack pointer for its local variables, for saving or
spilling other registers, etc. As an example, these three steps are performed
by the first three instructions (push, mov and sub) in the main function shown
below. The callee locates its variables, parameters, and the return address,
by using addresses relative to the fixed frame pointer (%rbp).
To return to the caller, a procedure simply copies the frame pointer (%rbp) to
the stack pointer (%rsp = %rbp), effectively releasing the current frame. Then
it pops the top stack item into %rbp to restore the %rbp of the caller
function, and then uses the ret instruction to pop the old instruction pointer
off the stack into the instruction register (%rip), returning control to the
caller function. These steps are performed by the last two instructions
(leaveq, retq) in the main function shown below.
This is a gdb listing for the test_basic program that we will provide you for
this assignment. Run gdb on it (or on any program that you have compiled on
the lab machines), as shown below, to see the instructions at the start and
the end of a function (e.g., main function). Make sure that you understand
what these instructions are doing, and are able to answer the questions in the
listing below.
Setup and Submission
Log in to MarkUs and go to CSC369. (The interface to MarkUs has changed this
term - instead of a URL per course, you will be able to select your course
after logging in.)
You will find the starter files for this assignment on MarkUs. Click the
button to ‘Add starter files to repository’, then clone your repository where
you want to do your work. You should find the starter code below the a1/
subdirectory of your cloned repository. Build the starter code by typing make
in the a1/ directory
You should only modify the thread.cfiles in this assignment.
You can find the files you have modified by running the git status command.
You can commit your modified files to your local repository as follows:
git add thread.c
git commit -m “Committing changes for Assignment 1”
We suggest committing your changes frequently by rerunning the commands above
(with different meaningful messages to the commit command), so that you can go
back to see the changes you have made over time, if needed.
Once you have tested your code, and committed it locally (check that by
running git status), you can git push it back to MarkUs. We will collect and
grade the last version pushed to MarkUs when the grace period expires. Grace
tokens are consumed automatically based on the timestamp of your last push.
Hints and Advice
This assignment does not require writing a large number of lines of code. It
does require you to think carefully about the code you write. Before you dive
into writing code, it will pay to spend time planning and understanding the
code you are going to write. If you think the problem through from beginning
to end, this assignment will not be too hard. If you try to hack your way out
of trouble, you will spend many frustrating nights in the lab.
As a start, here are some questions you should answer before you write code.
- What fields will you need in your thread structure? Perhaps the most important is the thread state (e.g., running, etc.). Think about all the states that you will need to support.
- getcontext “returns” twice. When it is called directly, it initializes the context structure that is passed as an argument, and then execution continues after the getcontext() call. Then, when setcontext() is called later, execution returns to the instruction following the getcontext() call, which appears to be a second “return”, since the code continues running from the instruction following getcontext(). For this assignment, you will use this behavior, once when you create a context, and again when you switch to that context. What action will you take in each case? How will you tell which case you are in?
- Most threads are created with thread_create, but the initial thread is there before your library is invoked. Nonetheless, the original thread must be able to thread_yield to let other threads run, and other threads must be able to call thread_yield and let the original thread run. How is this going to work?
- A hard bug to find would be an overflow or underflow of the stack you allocate. How might such a bug manifest itself? What defensive programming strategies can you use to detect stack overflow in a more controlled manner as the system runs?
- Note that when the initial thread in a C process returns, it calls the exit system call, which causes the OS to destroy the process, even if you have other user level threads in the process that want to run. How will you ensure that the program exits only when the last thread in your system exits?
- Be careful. It is dangerous to use memory once it has been freed. In particular, you should not free the stack of the currently running thread in thread_exit while it is still running. So how will you make sure that the thread stack is eventually deallocated? How will you make sure that another thread that is created in between does not start using this stack (and then you inadvertently deallocate it)? You should convince yourself that your program would work even if you used a debugging malloc library that overwrites a block with dummy data when that block is free()’d.
- Be careful. If you destroy a thread that is holding or waiting on a resource such as a lock (we will be implementing locks in the next assignment), problems can occur. For example, deadlock may occur because the thread holding the lock may not have a chance to release the lock. Similarly, if the thread waiting on a lock is destroyed, then the thread releasing the lock may wake up some other thread incorrectly (e.g., due to reusing thread id). For this reason, it is important to ensure that when thread_kill is invoked on a thread, the target thread should not exit immediately. Instead, the target thread should exit when it runs the next time. How will you implement this functionality? In practice, operating systems provide a signal handler mechanism that allows threads to clean up their resources (e.g., locks) before they exit.
- What are the similarities and differences between thread_yield and thread_exit? Think carefully. It will be useful to encapsulate all that is similar in a common function, which will help reduce bugs, and possibly make your code simpler.
We strongly recommend that your first milestone might be for
thread_yield(THREAD_SELF) to work for the initial thread (where your
implementation stores and then restores the caller’s state). Get this working
before you try to implement thread_create or thread_exit.
Use a debugger. As an exercise, put a breakpoint at the instruction after you
copy the current thread’s state using getcontext. You can print the current
values of the registers (in gdb, type info registers).
You can print the values stored in your thread struct and the thread context.
For example, say current is a pointer to the thread structure associated with
the currently running thread, and context is a field in this structure that
stores the thread context. Then, in gdb, you can use p/x current]context to
print the context stored by a call to getcontext.
You may find this particularly useful in making sure that the state you
“restore” when you run a newlycreated thread for the first time makes sense.