使用Sentinel节点,构造实现 Doubly-Linked List
.
![Doubly-Linked
List](https://upload.wikimedia.org/wikipedia/commons/thumb/5/5e/Doubly-linked-
list.svg/610px-Doubly-linked-list.svg.png)
Requirement
This resit coursework consists of two parts called LIST and SKETCH. The two
tasks align exactly with the two summative coursework tasks given in the
original unit. Respectively, the two parts contribute 40% and 60% towards the
resit coursework, which overall counts 50% towards the COMS10016 unit mark. If
you have a 2nd attempt resit then your mark will be kept at pass level. You
must work individually in any case. All code you submit must be your own, do
not copy any code parts from peers or other sources and do not publish parts
of your own code anywhere. The programs we use for checking against the web,
your peers and other sources are advanced. (Plagiarism may result in 0 marks
for the coursework, the entire unit, failing the year, or worse.) Use only
standard libraries as given in the skeleton code for this task, so your code
compiles and runs without linking any other libraries. Each of the two tasks
comes itself in two parts: a closed task and an open-ended task. Backup your
work regularly. Do not attempt any open-ended task before successfully and
fully finishing the closed tasks.
List Challenge
Step 1: Understand Doubly-linked Lists
Before you start on this task make sure you watched all lectures up to Lecture
15. You should have compiled, run, and understood all the code provided for
pointers, dynamic data, stacks, and lists. In particular, be sure you you have
run, compiled and understood in detail the program linkedlist.c from Lecture
15.
Your task will evolve around doubly-linked lists with a sentinel node. Thus,
let us understand and visualise this concept first. In essence, a doubly-
linked list is made up of a sequence of nodes where neighbouring nodes point
to each other. Each node is a structure that has a payload item x (e.g. just
an int) and two pointers: back and next. The back pointer always points to the
predecessor node and the next pointer always points to the successor node. Two
neighbouring nodes in a doublylinked list can therefore be pictured like this.
This emphasizes that a node structure contains three fields. However, for most
purposes you can simplify the visualisation by depicting the above two nodes
like this.
In this simplified visualisation, a pointer from the left end of a node
represents its back pointer, and a pointer from the right end is its next
pointer. A pointer to anywhere on a node’s rectangle means a pointer to the
start of the node.
The Sentinel Node. It used to be a standard solution to implement doubly-
linked lists by keeping track of the first and last node of the list (like the
3 node and 7 node in the picture above), where the first node would point
backward to NULL, and the last node would point forward to NULL. However, it
turns out that adding an extra node, called a sentinel node, simplifies list
implementations and makes the code much more readable. For a circular, doubly-
linked list our sentinel node (pictured with no payload x) is linked in
between the first and last item in the list:
Using this idea, the nodes of a new ‘empty’ list with no item nodes look like
this.
Here both the back and next pointers of the sentinel node simply point to the
sentinel node itself.
The Structure list. To represent a list in a list data structure we need two
node pointers: one fixed pointer to the sentinel node (called none) to access
both list ends in constant time, and one current pointer that points to a
current node in the list allowing for traversals.
In the above image the current position is the node that holds 7, so item 7 is
selected. If the current pointer in a list points to none then we will
interpret this as ‘no item is selected’.
If there are item nodes in our list, none->next should link to the first item,
and none->back should link to the last item. So we get simple access to both
ends of the list and we do not need to store pointers to the first or last
node anywhere else.
Picturing List Manipulations. To visualize what a function does to a list,
draw a picture of the situation before the function call, and another of the
situation after the call. For instance, consider the function after. If an
item is selected it moves the current pointer one element forward. When
applying the function to our list with the item 3 selected, it would have the
simple effect of moving the current pointer one step forward to item 7.
Applying the after function to our list when the item 7 is selected moves the
current pointer one step forward to the sentinel node, meaning ‘no item’ is
selected after the call.
Thus, whenever in doubt, draw a picture of a particular situation before and
after a function call to understand the detailed workings.
Step 2: Understand the Closed Task (50% of 1st Task)
Your closed task is to implement 13 missing procedures in the skeleton file
list.c all of which manipulate circular doublylinked lists with one sentinel
node. You must use the provided files and are not allowed to alter any of the
provided code, only add the 13 missing functions where marked:
- list.h (header file)
- list.c (skeleton)
- Makefile
The header file lists.h forms an API, which you should read carefully because
the comments describe what the functions you will have to implement in list.c
have to do. The list.c file has just two data structures node and list which
you must use, and a lot of tests. The program as given to you will not compile
initially. So, our first task is to produce a compiling skeleton by studying
the signatures of the 13 missing procedures and accordingly defining some
initial dummy functions.
Step 3: Create a Compiling Skeleton
Your first task is to turn lists.c into a full skeleton program that compiles.
You can do this without understanding all of the technicalities of the
assignment.
Two Key Data Structures. In the file lists.c a structure for the nodes which
make up the list is defined which make up the list is defined, and a structure
for the list itself. The node structure struct node is not visible to the user
of the module. Each node is used to hold an item x and pointers to the two
neighbouring nodes (next and back) which define the list ordering. The overall
list structure struct list represents a list and is essential so that your
newList function can return something to the user which is well defined. This
structure holds two pointers: one to the sentinel node of the list and one to
the currently selected node. Read the code comments about the list structure
carefully. You will have to use the two data structures exactly as described
to comply with the tests.
Define Dummy Functions. Write a minimal dummy definition of each of the 13
functions mentioned in the header file. The safest way to do that is to copy-
and-paste a function’s declaration from lists.h, then replace the semicolon by
curly brackets. If the function returns anything other than void, add a return
statement which returns the easiest temporary value of that type you can think
of (e.g. NULL for a pointer, false for a boolean). For functions returning an
item, you can return 0 for now, but beware that depends on item being int, so
it may need to be fixed later. At this point, check that the program compiles
fine via make test or directly via:
clang -Dtest_list -std=c11 -Wall -pedantic -g list.c -o list fsanitize=undefined -fsanitize=address
Pay attention to use the exact line for compilation, including that the
parameter -Dtest_list must be used to run the tests.
Step 4: Understand the Tests
There is a separate test function in lists.h for each of the 13 list functions
you need to implement, except for freeList which can’t be tested directly. The
tests specify each function using before and after pictograms compressed into
strings. Single digits represent items and the ‘|’ symbol in front of a digit
indicates that this is the current item. If the ‘|’ symbol is at the end of
the string then ‘none’ of the items is selected. The strings “|37”, “3|7”,
“37|” represent a list of two items, with the current position at the first
item, the last item, and a situation where ‘none’ of the items is selected.
The tests utilise this pictogram string notation to drive the testing. For
example, the one-line test for applying the after function when item 3 is
selected in our example list will be encoded as:
assert(LINE, check(After, -1, “|37”, “3|7”, true));
—|—
The one-line test for inserting 5 when the current item is 3 in our example
list using the insertAfter function is:
assert(LINE, check(InsertAfter, 5, “|37”, “3|57”));
—|—
There is a different check function for each function type. The check function
builds a list matching the before picture, calls the given function (in this
case insertAfter with 5 as the second argument) and compares the result to the
after picture.
Checks and Default. Most functions are designed to return a testable value.
For example, if no item is selected, a call of after does nothing and returns
false, which is easy to test. The get function returns an item in any case. To
make sure there is an item which can be returned in any case, the newList
function is passed a default item. The default item should be stored in the
sentinel node.
Function Descriptions. What does each function do? There is a detailed comment
for each function in the list.h header which gives a summary. For each
function, there is a test function with some assert calls. These show
precisely what the function does on the empty list “|” and a list with two
items in at least each of the three cases “|37” and “3|7” and “37|”. That
should be enough for you to work out what the function does in every possible
case.
Details on Support Functions. The functions build, destroy and match form the
heart of the testing and are implemented ‘brute force’. The build function is
used to build a list from the ‘before’ picture of a test, the function being
tested is applied to the list, match is used to check that the result list
matches the ‘after’ picture, and destroy frees up the list. Each of the
functions uses an array of nodes in a very direct manner, so there is no
ambiguity about what is going on. But that is not a technique you are supposed
to be using in the list functions, because all of your 13 functions must take
O(1) constant time.
The style of testing set up here is very carefully designed to allow you to
work on one list function at a time.
Step 5: Write the 13 Functions One by One
Programming with pointers is difficult. When a test fails, there is generally
a segfault or similar, which can be very difficult to track down. You will
need to use several or maybe all of:
- the warning options -Wall -pedantic
- the sanitize options to pinpoint segfaults and memory leaks
- print statements
Develop newList. The first thing to do is to comment out all the tests except
testNewList in main. After that, keep all the tests beyond the one you are
working on commented out. That’s because if a test fails, causing a segfault,
it may be unreasonably difficult to know which test function caused it.
Develop newList until it passes its test, and you don’t get any messages from
the various compiler options. In newList you will essentially have to allocate
memory on the heap for a new list structure and a new sentinal node,
initialise the sentinal node with the default item, let the current and none
pointers in the list point to the sentinel node, and link the two pointers in
the sentinel node point to the sentinal node itself.
Develop freeList. For all the functions, the compiler options test things that
the tests themselves can’t. In the case of freeList, there is no explicit
testing that can be done. Therefore the only testing is that memory leak
detection does not give any messages. Your freeList procedure should first
free all nodes of the list including the sentinel node and finally free the
list structure itself.
Develop in Small Steps. You may want to stick to the development sequence
given by the test sequence for the functions. Thus, step by step uncomment the
call to its testing function first, develop and test. Remember, the more
exceptions and different cases your code handles, the more liable it is to
have bugs in, because there are more places for bugs to hide, and it is harder
for you to see at a glance that the code is correct. You aren’t being given
much opportunity for making your own implementation decisions in this closed
part of the assignment. That simplifies checking correctness, and allows us to
help you more easily. It is very tempting to write lines of code like this,
with lots of arrows:
current->back->next->…
—|—
The trouble is, this is very error-prone. The code may be written with a
mental picture of where the nodes were at the start of the function, but one
or more of the pointers used in the expression may have been changed already
by the time this line is executed. Trouble can arise particularly when
shuffling lines of code around. A line of code that used to work may suddenly
no longer work. And it is possible to ‘lose’ a node altogether, because there
are no pointers left pointing to it, and therefore no way to reach it.
Use Robust Strategies. In this assignment, the insert and delete functions are
the most difficult ones. They involve handling three nodes, either a new
‘middle’ node being inserted between (up to) two existing ones, or one
existing node being deleted and its (up to) two neighbours being linked up
together to close the gap in the circular list. A good strategy is to set up
three local pointer variables (e.g. p, q and r or whatever you like) for these
three nodes at the beginning of a function, so that you can keep track of them
no matter what changes are made to the pointers between them. Each line of
code after that can then be written simply using only one arrow, and the order
in which the lines of code are executed doen’t matter, making the code much
more robust.
Enjoy programming and make sure your code adheres to the C Programming Style
Guide! As always use the labs and the Teams chat for help and feedback
throughout the two weeks. Once your program compiles and runs without errors
and warnings, and passes all the tests you will have gained the first 50% of
this coursework’s marks. Everybody should work hard to get to this point.
Step 6: Notes on the Design
A List of Items. The header is set up to store item values in lists. In the
header item is defined as int to provide an example case. However, item must
be used as a synonym for int everywhere in your code, so that there is only
one place in the header file where a change would need to be made to store
some other type of items. This means the module can be used with different
item types in different programs. For those who are interested, note that even
this it is not truly generic since the setup cannot be used multiple times for
different item types in the same program. There is no really satisfying way of
making fully generic modules in C. It is recommended, as a last test before
submitting, that you change the item type to double, to check that you haven’t
inadvertently assumed int anywhere. (The numbers used in the tests should
still work as doubles.)
Conceptual Design. The header doesn’t say that the list is to be doubly
linked, nor to be circular, or use a sentinel node (comments in list.c do
though). That’s because a user of the module need not know or care, and the
implementation could be changed to something completely different in a later
version of the module, without any effect on programs that use it. On the
other hand, the header does say that all the operations are constant time.
This is a strong hint that the implementation does use a doubly linked list or
something similar, because it is difficult to achieve constant time operations
otherwise. The claim of constant time doesn’t cover the vagueness in the time
taken by malloc and free calls but, conventionally, memory management costs
are considered separately from the ‘logic’ costs of operations. The function
names use the camelCase convention, where capitals make the letters go up and
down like the humps on a Bactrian camel. I should point out, for those who are
interested, that including a current position in a list structure itself is
not thread safe. A more thread-safe approach is to create a separate iterator
object each time the list is traversed. However, that approach can still
easily lead to ‘concurrent modification’ problems where the list structure is
changed by one thread while another is traversing it. It is much safer to make
sure that a list is owned by a single thread. You will also have noticed that
we have to compile our list.c program with the -Dtest_list flag to enable the
tests. As you will learn in later lectures, if we don’t use this flag then, in
our case, we compile our program as a module without a main function and the
tests. The program then seizes to be a stand-alone program, but instead can
become part of another program that just uses its functionality.
Step 7: The Open-ended Task
Only if both your closed tasks are finished and you have time left, then you
can do some extra open-ended work in your own program called visualise.c on
the following problem: Using only standard libraries, if any, write a program
that visualises the bit structure of data types in C in binary when entered in
decimal form. The program must take 1) a type, and 2) particular data of this
type in decimal notation as command line arguments. It should check for input
errors (and print “Input error.” in this case), and if there are none, it
should print the bit structure of the data in groups of a nibble, e.g.:
./visualise char 7
0000 0111
./visualise char -128
1000 0000
./visualise char 255
Input error.
./visualise char 08
Input error.
./visualise char -x0
Input error.
We recommend to keep things relatively simple at first, for instance, by
starting with investigating just char. The knowledge you already gained in
computer architecture may be handy.
Most importantly for this unit, your program should contain detailed unit
tests for all functions which are run if no command line parameters are
provided:
./visualise
All tests pass.
If you have time left, follow up with visualising the bit structure of
unsigned char, int, unsigned int, long, and double, e.g.:
./visualise unsigned char 255
1111 1111
./visualise int 10000000
0000 0000 1001 1000 1001 0110 1000 0000
./visualise double -1.25
1011 1111 1111 0100 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
You may need to do some research into interpreting the bit structure of
floating point representations if you attempt to look into double. If you
still have time left after that, extend your program so that it shows the
decimal value of data types in C when entered in binary form in groups of
nibbles, e.g.:
./visualise char 1000 0000
-128
./visualise unsigned char 0000 0111
./visualise int 0000 0000 1001 1000 1001 0110 1000 000
Input error.
./visualise double 1011 1111 1111 0100 0000 0000 0000 0000 0000 0000 0
-1.25
Note that it is always clear from the structure of your input if you are
converting to decimal or to binary since binary input has leading zeros and
comes in chunks of a nibble.
You can extend your program further (keeping all previous functionality
intact) by allowing for structured input using semicolon separated and {}
enclosed types such as:
./visualise {char;int;unsigned char} 7 10000000 255
0000 0111 0000 0000 1001 1000 1001 0110 1000 0000 1111 1111
./visualise {char;int;unsigned char} 0000 0111 0000 0000 1001 1000 1
7 10000000 255
Your source file visualise.c must compile error-free and warning-free for a
valid submission in any case. Your program MUST comply exactly with the
specified format for input and output. You are encouraged to write a summary
file readme.txt which describes what your program can do in no more than 100
words (strict limit). There are no marks for report writing, but the summary
may be necessary for us to make sense of your program. As long as your program
works, even if your program is very basic (e.g. just checks for char that some
the input char is valid), still submit it. Whenever you have reached a well
working version, take a copy of your work, so you can revert back to it at any
point. Rather submit something simple that works bug-free and is well tested
than something that contains bugs - you will not get many marks for buggy
code.
Be careful not to over spend on time, since the task is completely open-ended.
Make sure you manage your time well and stop at an appropriate point. Make
sure your programming adheres to the C Programming Style Guide. Enjoy
programming!
Sketch Challenge
Step 8: Sketch File Formats and Skeleton Code
Make sure you completed all lectures and formative courseworks of the original
unit COMS10016 Imperative Programming thread before starting this task. A
Sketch File with the extension (.sk) is a simple binary file that contains a
drawing or graphic. There is a basic, intermediate, and advanced version of
the file format, each of which is backwards-compatible to include all simpler
formats as subsets of the functionality. The formats have been designed to
store simple freehand sketches and more complex graphics, even animations in
one compact file, in such a way that these can be viewed or replayed as an
animation.
Your task will be to develop a program sketch.c which reads in and visualises
.sk files using the display module displayfull.h for graphics, a module nearly
identical to the one covered in last week’s formative exercise. You are not
allowed to change the display module, testing module, nor the function parts
and data structures already provided in the sketch skeleton and header. You
are of course allowed to implement functions and add anything you would like
to the sketch.c file.
You are given the following files and modules to start your development:
- sketch.zip (All Files in one Zip)
- sketch.h (skeleton header)
- sketch.c (skeleton)
- Makefile (basic Makefile for Unix only)
- displayfull.h (display header)
- displayfull.c (display module)
- test.c (testing framework)
- sketch00.sk, sketch01.sk, sketch02.sk, sketch03.sk, sketch04.sk, sketch05.sk, sketch06.sk, sketch07.sk, sketch08.sk, sketch09.sk
As you see, this assignment is going to involve lots of files. It is suggested
that you create a new empty folder to work in and extract all files contained
in the sketch.zip file into it before you start work. Make sure that the
folder permissions are set to owner read only if you work on university
computers to make sure nobody else can access your work. Back up your work
regularly in any case, since data loss is not a valid extenuating
circumstance.
Step 9: Understanding the Basic Sketch File Format
A Basic Sketch File contains a picture made up of white lines drawn on a black
background. For simplicity we will stick with a fixed image size of exactly
200x200 pixels for this assignment. Sketch files are encoded as a sequence of
single-byte commands, which your program will have to translate into calls to
the display module. During drawing a basic sketch file, your program will have
to keep track of a current drawing state, which is a data structure defined in
sketch.h. This contains the current pixel location (x,y) in the window (which
must be initialised as location (0,0) at the beginning of reading a sketch
file), a pixel target location (tx,ty) (which must be initialised as (0,0) at
the beginning of reading a sketch file), and the currently set tool type
(which is initialised as LINE at the beginning of reading a sketch file). The
other fields are not used by basic sketch files and should just be initialised
to 0. For basic sketch files there are three possible single-byte commands:
TOOL to switch active line drawing on or off, DX to move horizontally, and DY
to move vertically and draw a line from current to target position if the tool
is switched on. Two most significant bits of every command byte determine the
opcode (i.e. which of the three commands to do) and the remaining six bits
determine the operand (i.e. what data to do it with):
- TOOL.. If the two most significant bits (or opcode) of a command byte are 1 (most significant bit) and 0, respectively, then the command is interpreted as the TOOL command. This command selects a tool and uses the remaining 6 bits as an unsigned integer operand, which indicates the type of tool. For a Basic Sketch File the tool type can either be NONE = 0 (switching all drawing off) or LINE = 1 (switching line drawing on).
- DX….If the two most significant bits of a command byte are both 0 then the command is interpreted as the DX command, that shifts the position of the target location (tx,ty) along the x direction by dx pixels. The operand dx is specified between -32 and 31 encoded as a twocomplement integer via the remaining 6 bits of the command byte.
- DY….If the two most significant bits of a command byte are 0 (most significant bit) and 1, respectively, then the command is interpreted as the DY opcode, that is shifting the position of the target location (tx,ty) along the y direction by dy pixels. Then, and only if line drawing is switched on, a line is drawn from the current location (x,y) to the target location (tx,ty). Finally, the current location is set to the target location in any case. (If line drawing is switched off then nothing is drawn, but the current location is still changed to the target location) Note that dy is a value between -32 and 31 encoded as a two-complement integer via the remaining 6 bits of the byte.
EXAMPLE FILE
Lets have a look at a very simple sketch file sketch00.sk. Since the .sk files
are binary, you can’t view them in a text editor like atom easily. You can
view the files with the od command in Linux though or use your own HEX viewer
developed during formative work last week. To see what the bytes of a sketch
file look like use the od command, e.g.:
od -t x1 sketch00.sk
0000000 1e 5e
A number on the first column like 0000000 says what byte position in the
binary file the line of output refers to, which is useful for larger files.
The rest of the line shows you the bytes in HEX, that is two paired
hexadecimal digits for each byte. We can see that the sketch00.sk file
contains just two bytes, ‘1e’ and ‘5e’. Each of these represents a single-byte
command.
Since each hexadecimal digit represents a four bit nibble, the first byte ‘1e’
translates to binary ‘0001 1110’ and the second byte ‘5e’ translates to binary
‘0101 1110’. Looking at the two most significant bits (left two bits) tells us
which command it is, i.e. which opcode. The first byte starts with ‘00’ and
thus is the DX command, the second byte starts with ‘01’ and thus is the DY
command.
The remaining rightmost six bits represent the operands for each of these
commands encoded as two-complement representation, in both cases these are
‘011110’ which represents +30. If needed, review Bits Lecture at this point
and remember that the leftmost of the six bits is indicative of the sign. Note
that only 6 bits are used to represent the operand, not 8, so operands between
-32 and 31 can be encoded. Thus, the sketch00.sk file has the commands ‘DX+30’
and ‘DY+30’. Given that (x,y) and (tx,ty) are initialised as (0,0) this
command will first shift tx to 30, then shift ty to 30 and draw a line from
(0,0) to (30,30), and finally set (x,y) to (30,30). The pictures that should
be drawn for the five basic sketch files are shown below.
Step 10: Understanding the Skeleton Code
COMPILING, RUNNING AND TESTING YOUR CODE: In order to compile and run the
provided skeleton files with graphics use:
clang -std=c11 -Wall -pedantic -g sketch.c displayfull.c I/usr/include/SDL2 -lSDL2 -o sketch -fsanitize=undefined fsanitize=address fsanitize=address
After this you can run the program on a sketch file like:
./sketch sketch00.sk
This will at the moment just show an empty 200x200 black display window, since
no drawing functionality has been implemented. You can close the window via
the clicking on the ‘x’ or pressing ESC. Note that on some systems SDL2
produces memory leaks - we will have to accept this and as long as your
program does not leak during testing you will not loose marks for these leaks
of course.
You can, however, test your program even without having graphics fully
installed. To do this, compile and run the provided skeleton files as follows:
clang -DTESTING -std=c11 -Wall -pedantic -g sketch.c test.c I/usr/include/SDL2 -o test -fsanitize=undefined -fsanitize=address
After this you can test your program (which requires all the 10 sketch files
for testing to be in your local folder):
./test
We will use exactly the commands above to test your code (with our fresh
test.c copy), so do not add any further -D macro definitions to your
compilation commands. Currently testing fails, of course. Testing will tell
you either which line of tests in the test.c program fails or which drawing
command that should be done by a sketch file was not correctly called by your
program. (To do that, the testing module ‘plays’ the role of the graphics
module and checks all the graphics calls made against the deterministic
sequence of calls that must be made for each of the sketch files.
Intercepting, replacing or forwarding calls of a software component is known
as a ‘proxy’.