Write five sorting algorithms: minSort, Bubble Sort, Insertion Sort, Quick Sort, and Merge Sort.
Introduction
Putting items into a sorted order is one of the most common tasks in Computer
Science, and as a result there are myriad library routines that will do this
task for you. But that does not absolve you of the obligation of understanding
how it is done; in fact, it behooves you to understand the various algorithms
in order to make wise choices.
The best that can be accomplished (the lower bound) for sorting using
comparisons is Ω(n log n), where n is the number of elements to be sorted. If
the universe of elements to be sorted is limited (small), then we can do
better using a Count Sort, which is O(n), where we count the number of
occurrences of each element in an array, or a Radix Sort, which is also O(n)
with a constant proportional to the maximum number of digits in the numbers
being sorted.
What is this O and stuff? It’s how we talk about the execution time (or space)
of programs. We will discuss it in class, and you will see it again in CMPS
101.
min-Sort
Perhaps the simplest sorting method is to look for the smallest element (the
minimum) and move it to the top. We could do this by making a copy of that
element, and then shifting everything down by one, and then placing the copy
in the top slot. But that seems silly; why move all of those elements? Let's
just exchange (swap) the top element with the smallest element.
At this point what do we know? We know that the smallest element is at the
top. Since that is true and it will not change (we call this an invariant), we
can forget about it and move on. Let’s consider the second element:
Why not just do what we did the first time? If we do that, then what do we
know? We now know that the top (first) element is the smallest, and the second
element is the second smallest (or the same, if there are duplicates). We can
repeat this for each element in the array, in succession, up to but not
including the last element. By the method of induction, we can show that the
array is sorted.
Why do we not need to concern ourselves with the last element? At the final
step we consider only the last two elements: if the last were the smaller of
the two, it was exchanged with the penultimate element. Either way, after that
step the last element must necessarily be the largest (and consequently, last)
element in the array.
To get you started, here is the code for minSort. Notice that it is composed
of two functions: minIndex which finds the location of the smallest element,
and minSort which actually performs the sorting.
// minIndex : find the index of the least element.
uint32_t minIndex(uint32_t a[], uint32_t first, uint32_t last)
{
    uint32_t smallest = first; // Assume the first is the least
    for (uint32_t i = first; i < last; i += 1)
    {
        smallest = a[i] < a[smallest] ? i : smallest;
    }
    return smallest;
}

// minSort : sort by repeatedly finding the least element.
void minSort(uint32_t a[], uint32_t length)
{
    for (uint32_t i = 0; i + 1 < length; i += 1) // i + 1 avoids underflow when length == 0
    {
        uint32_t smallest = minIndex(a, i, length);
        if (smallest != i) // It's silly to swap with yourself!
        {
            SWAP(a[smallest], a[i]);
        }
    }
    return;
}
What is the time complexity of minSort? The key is to look at the for loop,
and we see that the loop is executed length times. This would seem to indicate
that the sort is O(n), but we need to look further. In the for loop we see
that a call is made to minIndex, and when we look there we find yet another
for loop. This loop is executed (last - first) times, which is based on
length. So we call minIndex approximately length times, each time requiring
approximately length operations, thus the time complexity of minSort is
O(n^2).
It is often useful to define a macro when a task is done repetitively, and
when we also may want to hide the details. One could write a function (an
inline function would be preferred), but you will recall that functions in C
always pass their arguments by value, which is inconvenient for the swap
operation. A clever person might take advantage of the macro to do some
instrumentation.
#ifdef _INSTRUMENTED
#define SWAP(x, y) { uint32_t t = (x); (x) = (y); (y) = t; moveCount += 3; }
#else
#define SWAP(x, y) { uint32_t t = (x); (x) = (y); (y) = t; }
#endif
Bubble Sort
Bubble Sort works by examining adjacent pairs of items. If the second item is
smaller than the first, exchange them. If you can pass over the entire array
and no pairs are out of order, then the array is sorted. You will have noticed
that the largest element falls in a single pass to the bottom of the array.
Since it is in fact the largest, we do not need to consider it again, and so
the next pass must only consider n - 1 pairs of items.
What then is the expected time complexity of this algorithm? The first pass
requires n - 1 pairs to be examined; the second pass, n - 2 pairs; the third
pass, n - 3 pairs, and so forth. When Carl Friedrich Gauss was a child of 7
(in 1784), he is reported to have amazed his elementary school teacher: as
punishment for being a pest, he was given the task of summing the integers
from 1 to 100. The precocious little Gauss produced the correct answer
immediately, having quickly observed that the sum was actually 50 pairs of
numbers, with each pair summing to 101, for a total of 5,050.
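Gauss's pairing trick gives a closed form for these pair counts. A quick sanity check in C (gaussSum is a name invented here for illustration):

```c
#include <stdint.h>

// Gauss's trick: the sum 1 + 2 + ... + n is n * (n + 1) / 2. Bubble Sort
// examines (n - 1) + (n - 2) + ... + 1 = n * (n - 1) / 2 pairs in the
// worst case, which is where the O(n^2) bound comes from.
uint64_t gaussSum(uint64_t n)
{
    return n * (n + 1) / 2;
}
```

For n = 100 elements, gaussSum(100) is Gauss's 5,050, and gaussSum(99) = 4,950 is the worst-case number of pair comparisons.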
procedure bubbleSort(A : list of sortable items)
    n = length(A)
    repeat
        swapped = false
        for i = 1 to n - 1 inclusive do
            if A[i - 1] > A[i] then
                swap(A[i - 1], A[i])
                swapped = true
            end if
        end for
        n = n - 1
    until not swapped
end procedure
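The pseudocode above might be rendered in C along these lines. This is a sketch, not the required implementation; swapU32 is a helper standing in for the SWAP macro discussed earlier:

```c
#include <stdbool.h>
#include <stdint.h>

// Stand-in for the SWAP macro: exchange two elements through a temporary.
static void swapU32(uint32_t *x, uint32_t *y)
{
    uint32_t t = *x;
    *x = *y;
    *y = t;
}

// Bubble Sort: exchange out-of-order adjacent pairs until a full pass
// makes no swaps. Each pass sinks the largest remaining element to the
// bottom, so the scanned range shrinks by one each time.
void bubbleSort(uint32_t a[], uint32_t n)
{
    bool swapped = true;
    while (swapped && n > 1)
    {
        swapped = false;
        for (uint32_t i = 1; i < n; i += 1)
        {
            if (a[i - 1] > a[i])
            {
                swapU32(&a[i - 1], &a[i]);
                swapped = true;
            }
        }
        n -= 1; // the largest element is now in its final place
    }
}
```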
Insertion Sort
Insertion Sort is another in-place sort, it functions by taking an item and
then inserting it into its correct position in the array. It consumes one
input element each repetition, and growing the sorted portion of the list.
Each iteration, insertion sort removes one element from the input unsorted
portion and finds its location in the sorted portion of the list.
procedure insertionSort(A : list of sortable items)
    for i = 1 to length(A) - 1 inclusive do
        tmp = A[i]
        j = i - 1
        while j >= 0 and A[j] > tmp
            A[j + 1] = A[j]
            j = j - 1
        end while
        A[j + 1] = tmp
    end for
end procedure
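A possible C rendering of this pseudocode follows. Note one wrinkle: the pseudocode's test j >= 0 does not translate directly to an unsigned index, so this sketch tests j > 0 and looks at a[j - 1] instead:

```c
#include <stdint.h>

// Insertion Sort: grow a sorted prefix one element at a time, shifting
// larger elements right to open a slot for the element being inserted.
void insertionSort(uint32_t a[], uint32_t length)
{
    for (uint32_t i = 1; i < length; i += 1)
    {
        uint32_t tmp = a[i];
        uint32_t j = i;
        while (j > 0 && a[j - 1] > tmp)
        {
            a[j] = a[j - 1]; // shift the larger element right
            j -= 1;
        }
        a[j] = tmp; // drop the element into the open slot
    }
}
```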
What is the expected time complexity of Insertion Sort? We look and observe
that there are two nested loops, each operating on approximately the length
of the array, so the expected time complexity is O(n^2).
QuickSort (recursive)
One of the most important skills that you will need to develop is the ability
to read and understand, though perhaps not program in, languages with which
you are unfamiliar. Below, you will find the code for QuickSort written in
Python. It would be a questionable choice for you to simply try to translate
this into C, but there are some interesting elements to it.
First, you see that the code partitions the array into three parts: (i) a part
that is lesser in value than the pivot element; (ii) a part that is equal in
value to the pivot element; and (iii) a part that is greater in value than the
pivot element. The choice of pivot element is arbitrary.
Second, you will see that we recursively call quickSort first on the left
partition, and then on the right partition. We then join these together, with
the middle partition between them, to form a sorted array.
Third, you may notice that aside from the arbitrarily chosen pivot element,
there is no array indexing. One could, in principle, use this algorithm to
sort a linked list.
You will want to write a helper function called partition that will divide an
array into three parts, as described earlier. You should do this in place, in
other words you do not need to create three arrays. Instead, you will swap
elements so that those less than or equal to the pivot element are on the
left, while those greater than it are on the right. Why was this acceptable in
Python? Python is a very different language that has lists (which it can treat
as arrays) as a native data type. Different languages enable different choices,
and writing this in Python leads to the (surprising to some) insight that
QuickSort can be implemented on linked lists.
def quickSort(a):
    if len(a) > 0:
        pivot = a[0]
        left = []
        mid = []
        right = []
        for x in a:
            if x == pivot:
                mid.append(x)
            elif x < pivot:
                left.append(x)
            else:
                right.append(x)
        return quickSort(left) + mid + quickSort(right)
    else:
        return []
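One possible in-place partition in C is the Lomuto scheme sketched below. Note that this is a two-way partition (elements <= pivot on the left, elements > pivot on the right), not the three-way split used by the Python version, and the names are only suggestions:

```c
#include <stdint.h>

// Exchange two elements through a temporary.
static void swapU32(uint32_t *x, uint32_t *y)
{
    uint32_t t = *x;
    *x = *y;
    *y = t;
}

// Lomuto partition: using a[hi] as the (arbitrary) pivot, move elements
// <= pivot to the left and elements > pivot to the right. Returns the
// pivot's final index.
uint32_t partition(uint32_t a[], uint32_t lo, uint32_t hi)
{
    uint32_t pivot = a[hi];
    uint32_t i = lo;
    for (uint32_t j = lo; j < hi; j += 1)
    {
        if (a[j] <= pivot)
        {
            swapU32(&a[i], &a[j]);
            i += 1;
        }
    }
    swapU32(&a[i], &a[hi]); // place the pivot between the partitions
    return i;
}

// QuickSort over the inclusive range [lo, hi].
void quickSort(uint32_t a[], uint32_t lo, uint32_t hi)
{
    if (lo < hi)
    {
        uint32_t p = partition(a, lo, hi);
        if (p > 0) // guard against unsigned underflow when p == 0
        {
            quickSort(a, lo, p - 1);
        }
        quickSort(a, p + 1, hi);
    }
}
```

Call it as quickSort(a, 0, n - 1) for n > 0; the p > 0 guard matters because the indices are unsigned.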
Merge Sort
Merge Sort is a different type of sort in that it works through the array
sequentially. It is well-suited for the case when you do not have random
access, such as when working with magnetic tapes, or linked lists.
There are two main varieties of Merge Sort: binary merge sort (which may be
recursive) and natural merge sort. The recursive binary merge sort was
originally discovered by John von Neumann, one of the great polymaths of the
previous century.
A single item (or, if you prefer to start your induction with zero, an empty
list) is by definition sorted. Now, consider the case where we have two
unsorted lists: either (i) the item at the front of list 1 is the smallest, or
(ii) the item at the front of list 2 is the smallest, and in either case it is
greater than or equal to the item at the end of list 3. Move that item to the
end of the (initially empty) list 3. Repeat this process until the heads of
both list 1 and list 2 are less than the element at the end of list 3, or the
lists are empty. What do we know at this point? We know that (i) the third
list contains a sorted subsequence, and (ii) that sequence is equal in length
to the sum of the sorted subsequences that we pulled from lists 1 and 2. This
sorted subsequence is called a run. We repeat this process until we have
exhausted either list 1 or list 2, at which point we append the remaining list
to list 3.
We now take list 3 and copy elements from it to list 1 as long as the elements
continue to increase; when the first decreasing element occurs, we switch to
copying the run to list 2. We repeat this, switching between lists 1 and 2,
until list 3 is depleted.
We return to merging elements from lists 1 and 2 into increasingly long runs
on list 3. We do this until there is a single run, at which point the list is
sorted.
It is much easier to think about this if you do it recursively. Take the list,
split it into two lists (left and right). Call mergeSort(left) and
mergeSort(right) and merge the results. For the base case, the recursion will
encounter a list of a single element, which by definition is sorted.
What is the execution time of this algorithm? Let's assume that we have runs
of length 1 on lists 1 and 2. We merge these onto list 3 and we now have (at
worst) runs of length 2. We processed n elements during this pass. We now need
to ask: how many times can we double, starting with 1, until the value we get
is greater than n? The answer is log n. The worst-case execution time of our
algorithm is therefore O(n log n).
def mergeSort(items):
    if len(items) > 1:
        middle = len(items) // 2    # Split at the mid-point
        leftList = items[0:middle]  # Left half
        rightList = items[middle:]  # Right half
        mergeSort(leftList)         # Sort the left list
        mergeSort(rightList)        # Sort the right list
        # Merge the sorted lists
        l, r = 0, 0
        for i in range(len(items)):
            # Both lists hold elements
            if l < len(leftList) and r < len(rightList):
                # The left is smaller
                if leftList[l] < rightList[r]:
                    items[i] = leftList[l]
                    l += 1
                # The right is smaller
                else:
                    items[i] = rightList[r]
                    r += 1
            # Only the left has any elements
            elif l < len(leftList):
                items[i] = leftList[l]
                l += 1
            # Only the right has any elements
            else:
                items[i] = rightList[r]
                r += 1
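In C, the recursive version might look like the following sketch, assuming the caller supplies a scratch buffer at least as large as the range being merged. The names and the half-open [lo, hi) convention are choices made here, not a required interface:

```c
#include <stdint.h>
#include <string.h>

// Merge the two sorted runs a[lo..mid) and a[mid..hi) through the
// scratch buffer b, then copy the merged run back into a.
void merge(uint32_t a[], uint32_t b[], uint32_t lo, uint32_t mid, uint32_t hi)
{
    uint32_t l = lo, r = mid, k = 0;
    while (l < mid && r < hi)
    {
        b[k++] = (a[l] <= a[r]) ? a[l++] : a[r++]; // take the smaller head
    }
    while (l < mid) { b[k++] = a[l++]; } // drain the left run
    while (r < hi)  { b[k++] = a[r++]; } // drain the right run
    memcpy(a + lo, b, k * sizeof(uint32_t));
}

// Sort a[lo..hi): split, recurse on each half, then merge the results.
// The base case, a range of zero or one elements, is already sorted.
void mergeSort(uint32_t a[], uint32_t b[], uint32_t lo, uint32_t hi)
{
    if (hi - lo > 1)
    {
        uint32_t mid = lo + (hi - lo) / 2;
        mergeSort(a, b, lo, mid);
        mergeSort(a, b, mid, hi);
        merge(a, b, lo, mid, hi);
    }
}
```

Call it as mergeSort(a, b, 0, n), where b holds at least n elements.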
Your Task
Your task is to:
- Implement a testing harness for sorting algorithms. You will do this using getopt.
- Implement five specified sorting algorithms.
- Gather statistics about their performance.
Specifics
You must use getopt to parse the command line arguments. To get you started,
here is a hint.
while ((c = getopt(argc, argv, "AmbiqMp:r:n:")) != -1)
- -A means employ all sorting algorithms.
- -m means enable minSort.
- -b means enable bubbleSort.
- -i means enable insertionSort.
- -q means enable quickSort.
- -M means enable mergeSort.
- -p n means print the first n elements of the array. The default value is 100.
- -r s means set the random seed to s. The default is 8222022.
- -n c means set the array size to c. The default value is 100.
It is important to read this carefully. None of these options is exclusive of
any other (you may specify any number of them, including zero).
- Your random numbers should be 24 bits, no larger (2^24 - 1 = 16777215).
- You must use rand() and srand(), not because they are good (they are not), but because they are what is specified by the C99 standard.
- Your program must be able to sort any number of random integers up to the memory limit of the computer. That means that you will need to dynamically allocate the array using calloc().
- Your program should have no memory leaks.
A large part of this assignment is understanding and comparing the performance
of various sorting algorithms. Consequently, you must collect some simple
statistics on each algorithm. In particular:
- The size of the array,
- The number of moves required (each time you transfer an element in the array, that counts), and
- The number of comparisons required (comparisons only count for elements, not for logic).
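One way to gather these counts is a pair of global counters bumped by small helpers (or by macros such as SWAP above). The names here are suggestions only, not a required interface:

```c
#include <stdbool.h>
#include <stdint.h>

uint64_t moveCount = 0;    // bumped once per element transferred
uint64_t compareCount = 0; // bumped once per element comparison

// Compare two elements, counting the comparison. Logic-only comparisons
// (loop bounds, index checks) are not routed through this helper.
static inline bool isLess(uint32_t x, uint32_t y)
{
    compareCount += 1;
    return x < y;
}

// Transfer an element, counting the move. A swap is three moves.
static inline void move(uint32_t *dst, uint32_t src)
{
    moveCount += 1;
    *dst = src;
}
```

Every sort then reports its statistics uniformly by reading the two counters after it finishes (and resetting them before the next run).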
Submission
You must turn in your assignment in the following manner:
- Have a file called Makefile such that when the grader types make it will compile your program. At this point you will have learned about make and can create your own Makefile.
* CFLAGS=-Wall -Wextra -Werror -pedantic must be included.
* CC=gcc must be specified.
* make clean must remove all files that are compiler generated.
* make should build your program, as should make all.
- Your program must have the source and header files:
* minsort.h specifies the interface to minSort().
* minsort.c implements minSort().
* Each sorting method will have its own pair of header file and source file.
* sorting.c contains main() and may contain the other functions necessary to complete the assignment.
- You may have other source and header files, but do not try to be overly clever.
- A plain text file called README that describes how your program works.
- The executable file produced by the compiler must be called sorting.
- These files must be in the directory assignment2.
- You must commit and push the directory and its contents using git.