

Fun with Sorts


Students: Daniel Intskirveli and Gurpreet Singh
CS220, The City College of New York
April 25th, 2014
INTRODUCTION

It is impossible to study and consequently obtain a deep understanding of the implementation and
runtime complexity of various classical algorithms without temporarily abandoning the theoretical
domain of textbooks and lectures in favor of a more practical approach. The practical approach is
familiar to any computer science student, and involves several recurring themes: choosing a
programming language, writing the code, testing the algorithms to make sure they actually work,
coming up with a way to count the effort for any algorithm, coming up with a way to measure the
execution time for any algorithm, comparing those statistics between various algorithms, etc.

To mitigate the pains brought on by these recurring themes, we chose to avoid the typical
ad hoc approach of writing an algorithm, testing it, and repeating that process for every
algorithm encountered. Instead, we created a testing framework for all the algorithms we have
encountered so far. We called the program simply Fun With Sorts, a suitable name because most
of the algorithms are sorts, and because fun was had. This paper discusses the design of the
framework we wrote to explore the performance and implementation of various algorithms,
the results we gathered with it, and the insight those results offer.

1. DESIGN

C++ was chosen as the language for the
majority of the application because it has
features that aid object-oriented
programming and because it is low-level
enough for our needs. We wanted to avoid the
overhead, as well as the unwanted
optimizations, of higher-level languages.
The framework is also designed to be modular,
which allows it to be extended to include more
algorithms. In short, the program can be
thought of as having the following components:

A. Main- the entry point for the application
where the user decides which sorts to
test.
B. Stats- modules that facilitate measuring
performance. At the time of writing, there
are two: the Counter class and the Timer
class.
C. Utilities- a module which contains
functionality common to all algorithms for
testing purposes, e.g. functions that
create random arrays, functions that
output data from stats modules to files,
functions that check if a file is sorted,
helper functions for running algorithms,
etc.
D. Algorithms- modules that group
algorithms sharing the same
characteristics, which should be tested
together. Each of these modules defines
the counters and timers it will use,
implementations of the algorithms being
tested, and other settings specific to that
group, such as the file size of the data
consumed by the algorithms and the type
of data they consume (e.g. random array,
reverse-sorted array). At the time of
writing, there are two groups: project_1
and project_2.
E. Grapher- a post-processing tool for the
data output by the rest of the
framework. As of writing, it generates
plots and bar graphs for the Stats data
and aggregates it to compute sums (total
work) and averages. Because, in its current
iteration, the Grapher code needs to be
changed often to be useful, we chose
Python as its language. Python has
the advantage of being an interpreted
language with extensive library support. It is
slower, but producing graphs is not
performance-sensitive.

1.1 WALKTHROUGH OF A RUN

Each algorithm in a group of algorithms is run
against the same data set. This data set is
taken from the samples vector[1], which is a
fundamental part of Fun with Sorts. The
samples vector is a vector that stores pairs of
arrays and their respective sizes (arrays are
primitives in C, and the size of one must be
known to do any useful computation). Each
sample, a pair of an array and its size, is a list
to be processed by an algorithm. An algorithm
group module defines the types of input arrays
to be generated, and their sizes.
Every algorithm in a group can then be run
against each sample in the samples vector. For
each algorithm being run, a copy of the
samples vector is made in order to sandbox the
side effects of the algorithm (we do not want to
run a sort on an array that has already been
sorted by the algorithm that ran before it, for
example). This copy is then discarded.
As the algorithms run, they use their
respective Stats objects (either counters or
timers) to record statistics. When the
algorithms finish, the data from these Stats
objects is exported to CSV[2] files in the
working directory of the application. Afterwards,
the Grapher tool reads the CSV files to
aggregate the data and create visualizations.

Note: For simplicity, all sorting algorithms sort
integer primitives as opposed to more complex
objects, and they sort to non-decreasing order.
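As a sketch of the sandboxing idea above (the type and function names here are illustrative, not the actual Fun With Sorts API), the samples vector and the per-algorithm deep copy might look like:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// A sample is a raw array paired with its size, as described above.
using Sample = std::pair<int*, std::size_t>;

// Deep-copy the samples so each algorithm runs against pristine input;
// the copy is discarded after the run, leaving the originals untouched.
std::vector<Sample> copySamples(const std::vector<Sample>& samples) {
    std::vector<Sample> copy;
    copy.reserve(samples.size());
    for (const Sample& s : samples) {
        int* data = new int[s.second];
        std::copy(s.first, s.first + s.second, data);
        copy.push_back(Sample(data, s.second));
    }
    return copy;
}

// Release the arrays owned by a copied samples vector.
void freeSamples(std::vector<Sample>& samples) {
    for (Sample& s : samples) delete[] s.first;
    samples.clear();
}
```

Sorting the copy leaves the original sample intact, which is exactly the side-effect isolation described above.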

[1] Throughout this document, the word vector refers to the vector class in C++, a standard library class for storing a one-dimensional list of objects.
[2] A file format that stores data as comma-separated values. This format is supported by our Grapher as well as various popular software such as Microsoft Excel.

1.2 COUNTERS

The Counter class was designed with one goal
in mind: to measure work done in a way that is
reproducible on any machine, regardless of the
age of the hardware. It is backed by a vector of
64-bit unsigned integers for compatibility with
large datasets: the vector grows dynamically,
and the 64-bit integers prevent the overflows
that would occur with 32-bit integers once
counter values exceed 2^32.

Each of the algorithms is associated with one
or more counters. Before the algorithm begins,
the user must call the next() function for all
counters. This moves the counter cursor to the
next run, and subsequent calls to
increment() will increase the value of that
run.

The Counter class comes with one important
gotcha. Because it doesn't count total work
done in the traditional sense, the user must
increment the counter correctly in order to get
a useful value. While a traditional work count
might include all machine instructions, or one
increment per line of code in the algorithm, the
Counter class counts only the work relevant
to the performance of the algorithm, such as
the number of comparisons and exchanges.
The advantage of this is that, for a particular
set of input data, the resulting counter values
will be the same on any machine. The
downside is that the values are meaningless
unless the counter is used correctly.
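A minimal sketch of the Counter idea, with a bubble sort instrumented to count only comparisons and exchanges (this is illustrative; the real class and algorithm code may differ):

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One 64-bit slot per run: next() opens a new run, increment() bumps it.
class Counter {
    std::vector<std::uint64_t> runs_;
public:
    void next() { runs_.push_back(0); }
    void increment() { ++runs_.back(); }
    std::uint64_t current() const { return runs_.back(); }
};

// Bubble sort counting only the work relevant to its performance:
// for a given input, these totals are identical on any machine.
void bubbleSort(std::vector<int>& a, Counter& cmp, Counter& swp) {
    for (std::size_t i = 0; i + 1 < a.size(); ++i)
        for (std::size_t j = 0; j + 1 < a.size() - i; ++j) {
            cmp.increment();                    // one comparison
            if (a[j] > a[j + 1]) {
                std::swap(a[j], a[j + 1]);
                swp.increment();                // one exchange
            }
        }
}
```

The caller is responsible for calling next() on every counter before the algorithm starts, matching the workflow described above.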

1.3 TIMERS

The workflow of the Timer class is similar to
that of the Counter class, but Timer values are
inherently non-reproducible. Because a timer
measures the execution time of an algorithm,
different machines will yield different results,
depending on how busy they are, the speed of
their processors, and the speed of their hard
drives. However, Timer values are useful for
comparing the runtimes of several algorithms on
large file sizes on the same machine. To use a
Timer, a user must call next() and start()
on the Timer associated with the algorithm
before the algorithm begins, and stop() when
it finishes. Unlike the Counter class, the Timer
is easy to integrate and use. However, its
results differ from machine to machine and
even from run to run, and it cannot
discriminate between comparisons and
exchanges, for instance.
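The next()/start()/stop() workflow can be sketched with the standard chrono clock (again illustrative, not the actual class):

```cpp
#include <chrono>
#include <vector>

// One millisecond reading per run; start()/stop() bracket the algorithm.
class Timer {
    using Clock = std::chrono::steady_clock;
    std::vector<double> runsMs_;
    Clock::time_point begin_;
public:
    void next() { runsMs_.push_back(0.0); }
    void start() { begin_ = Clock::now(); }
    void stop() {
        runsMs_.back() = std::chrono::duration<double, std::milli>(
                             Clock::now() - begin_).count();
    }
    double currentMs() const { return runsMs_.back(); }
};
```

A steady (monotonic) clock is the right choice here: unlike the wall clock, it cannot jump backwards mid-measurement.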

2. PROJECT 1

This is our first algorithm group, a collection
of O(n²) sorts, as well as a plain linear search
and two variations on it. The algorithms were
tested with the following file sizes:

n = 500
n = 2500
n = 12,500
n = 62,500

For each file size, the algorithm was expected
to perform work on the following types of input
data:
Random[3]
Reverse-ordered[4]
20% sorted[5]


2.1. PROJECT 1 ALGORITHMS AND
RESULT DISCUSSION

Note: All results are available in the appendix.

Bubble Sort

The pure bubble sort gave us results that are
typical for a bubble sort. To illustrate this,
included on the following page is a histogram
generated by our Grapher tool. The table is
organized by the 3 input array types (random,
reverse, 20% sorted) and the 4 file sizes,
arranged in increasing order from left to right.
As is typical of a bubble sort, our histogram
shows that the number of exchanges the
algorithm must make is much higher for the
reverse-ordered list than for a random or
partially sorted list.

[3] An array of size N containing randomly generated integers in the range 0 to a maximum defined in the algorithm group module.
[4] A reverse-ordered array. If the size is N, then the array is [N-1, N-2, ..., 0].
[5] A randomly generated array in which 20% of the keys (1/5th of the file size) were randomly chosen to be taken out, sorted, and put back into the array.

Adaptive Bubble Sort

Adaptive bubble sort differs from pure bubble
sort in that it knows it is finished when it
examines the entire array and no swaps are
needed, meaning the list is completely sorted.
It keeps track of whether a swap occurred by
using a flag. We can see from the data in the
appendix that the adaptive version of the bubble
sort performs better than the regular version.
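The flag-based early exit described above can be sketched as follows (a minimal version, not necessarily the paper's exact code):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Adaptive bubble sort: a pass that performs no swaps proves the
// list is sorted, so the flag lets us stop early.
void adaptiveBubbleSort(std::vector<int>& a) {
    bool swapped = true;
    for (std::size_t pass = 0; swapped && pass + 1 < a.size(); ++pass) {
        swapped = false;
        for (std::size_t j = 0; j + 1 < a.size() - pass; ++j)
            if (a[j] > a[j + 1]) {
                std::swap(a[j], a[j + 1]);
                swapped = true;
            }
    }
}
```

On an already-sorted input the loop exits after a single swap-free pass, which is where the appendix's comparison savings over plain bubble sort come from.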

Insertion Sort

Insertion sort is a comparison-based algorithm,
and it is one of the better n² algorithms, as it
makes a single pass of insertions over the
array. Insertion sort can be very fast and
efficient on smaller arrays, but it loses this
efficiency when dealing with large amounts of
data. Our data shows that for random and
partially sorted lists, insertion sort performs
slightly better than selection sort; when the
input file is reverse-ordered, selection sort
performs much better. In either case, insertion
sort is substantially faster than both bubble
sort variations.
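For reference, a standard shifting formulation of insertion sort (the paper's implementation may count its work slightly differently):

```cpp
#include <cstddef>
#include <vector>

// Insertion sort: each element is shifted left past larger keys
// until it reaches its sorted position.
void insertionSort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        int key = a[i];
        std::size_t j = i;
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];   // shift larger key right
            --j;
        }
        a[j] = key;
    }
}
```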

Selection Sort

In selection sort, each pass moves the unsorted
element with the smallest value to its
proper position in the array. Selection sort
is the fastest of the Project 1 sorts on
reverse-ordered files.
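A sketch of the usual formulation: note that it performs at most n-1 exchanges regardless of input order, which is consistent with the tiny exchange counts for selection sort in the appendix and helps explain its speed on reverse-ordered files.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort: each pass finds the minimum of the unsorted suffix
// and swaps it into place -- at most one exchange per pass.
void selectionSort(std::vector<int>& a) {
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        std::size_t min = i;
        for (std::size_t j = i + 1; j < a.size(); ++j)
            if (a[j] < a[min]) min = j;
        if (min != i) std::swap(a[i], a[min]);
    }
}
```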

Sequential Search

This is an unmodified linear search. It simply
goes through the array until it finds an element
that matches the key, and returns the index of
that element. Because the key to be sought is
selected randomly and the array is not modified
after a key is found, the plot we generated is
scattered and shows no improvement over
time. Most of the data points fall near
the top of the graph, where counter values are
larger than N/2, the average case for a
sequential search. This is because the
elements to be sought were chosen with a bias
toward the back of the input file, as explained
in the following section on the adaptive
version of this search.


Adaptive Sequential Search (1)

The first adaptive version of the sequential
search uses the move-to-front approach to
organizing lists in order to optimize search
time. This search is unique among the
algorithms we wrote because it takes a linked
list, not an array, as the data to search,
since moving an element to the front of a list
is more efficient in a linked list than it is
in an array. We also experimented with an
approach in which the found element was
swapped with the first element in the list; this
gave us no noticeable improvement over time.
To run this test, we randomly selected N
elements (the same size as the list) from the
list, with a bias[6] toward the end of the list
that artificially lengthens the runtime of the
search, and then searched the list for those
elements using the adaptive sequential search.
Each list was searched for those
elements N*5 times (2,500 algorithm runs for a
size of 500, for example). Our Grapher tool
then generated a graph of all the runs (on the
following page).
The improvement here is clearly visible. At first,
the search struggles with the keys we tell it to
find, because they are mostly at the end of the
list. At some point, the algorithm organizes the
list: the sought keys, which were originally
clustered around the end of the list, are
gradually moved to the front, which improves
and stabilizes (i.e. lowers the variance of) the
search performance.
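The move-to-front step is cheap on a linked list because a found node can be relinked rather than copied. A sketch using std::list (the paper's list type and function names may differ):

```cpp
#include <list>

// Move-to-front sequential search: when the key is found, its node is
// spliced to the front in O(1), so frequently sought keys get cheaper.
bool mtfSearch(std::list<int>& xs, int key) {
    for (std::list<int>::iterator it = xs.begin(); it != xs.end(); ++it)
        if (*it == key) {
            if (it != xs.begin())
                xs.splice(xs.begin(), xs, it);  // relink the node, no copying
            return true;
        }
    return false;
}
```

The splice call is the reason a linked list was chosen over an array: the same move-to-front on an array would require shifting every preceding element.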

[6] We wrote a random function with a non-uniform distribution that favors keys near the end of the list, using the following logic: if rand() is a function that returns a random floating-point number between 0 and 1, then rand()*rand() has a distribution that clusters near 0 (roughly like 1/x²), and 1 - rand()*rand() mirrors that distribution toward 1, which is what we want. The distribution of those functions is plotted below (the area on the right is more relevant to our use case):
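The footnote's biased generator can be sketched as follows (rand01() and biasedIndex() are our illustrative names, not functions from the paper's code):

```cpp
#include <cstddef>
#include <cstdlib>

// Uniform floating-point value in [0, 1).
double rand01() { return std::rand() / (RAND_MAX + 1.0); }

// 1 - rand01()*rand01() clusters near 1, so the resulting indices
// cluster near the end of an n-element list (mean index is about 0.75*n).
std::size_t biasedIndex(std::size_t n) {
    double x = 1.0 - rand01() * rand01();          // skewed toward 1
    std::size_t i = static_cast<std::size_t>(x * n);
    return i < n ? i : n - 1;                      // clamp the x == 1.0 edge case
}
```

Since E[rand01()*rand01()] = 1/4 for independent uniform draws, the mean of the biased value is 3/4, matching the rightward skew the footnote describes.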


Adaptive Sequential Search (2)

The second adaptive version of the
sequential search uses a different approach to
organizing the list for better search
performance: every time a key is found, it is
swapped with the element directly before it.
Over time, this should bubble the more
popular sought keys toward the front of the list,
increasing performance.
On large file sizes such as the ones we
tested, we saw no noticeable improvement in
search time over successive runs of the search.
However, when we tested the performance on
a smaller list of random numbers (100), we
clearly saw an improvement in search time
when running the search 10,000 times (top
right).
Note: this is a much higher number of
runs than the program performs by default, which
is 5N runs for a file size of N. We hypothesize
that the lack of improvement on larger file sizes
is because the list takes longer to organize,
and the potential improvement fell outside
the range of our tests.
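This transpose heuristic works directly on an array, since it only ever swaps adjacent elements. A sketch (illustrative, not the paper's exact code):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Transpose-heuristic search: a found key is swapped one position
// toward the front, slowly bubbling popular keys forward over many runs.
// Returns the index where the key was found, or -1 if absent.
long transposeSearch(std::vector<int>& a, int key) {
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] == key) {
            if (i > 0) std::swap(a[i], a[i - 1]);
            return static_cast<long>(i);
        }
    return -1;
}
```

Because each hit moves a key only one slot, many repeated searches are needed before a popular key reaches the front, which is consistent with the hypothesis above about larger lists organizing slowly.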

3. PROJECT 2

A collection of more complex sorts, each of
which usually has a faster runtime than the
sorts in Project 1. These sorts were tested on
the following file sizes:

n = 2500
n = 12,500
n = 62,500
n = 312,500

With input file types:
Random
20% sorted
50% sorted[7]


3.1. PROJECT 2 ALGORITHMS AND
RESULT DISCUSSION

Note: All results are available in the appendix.

Merge Sort

This is a straightforward merge-sort. Merge-
sort uses the divide and conquer paradigm to
sort a list. In our results, we observed that an
array that is partially sorted (20% sorted) would
perform better than an array which half sorted
(50% sorted). We also observed that an array
with random values takes longer to sort than a
partially sorted array.
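A sketch of the straightforward top-down merge sort described above (the paper's implementation may differ in details such as buffer handling):

```cpp
#include <algorithm>
#include <vector>

// Divide-and-conquer merge sort over a[lo..hi], inclusive bounds.
void mergeSort(std::vector<int>& a, int lo, int hi) {
    if (hi - lo < 1) return;                  // 0 or 1 elements: sorted
    int mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);                    // conquer left half
    mergeSort(a, mid + 1, hi);                // conquer right half
    std::vector<int> tmp;                     // merge the two sorted halves
    tmp.reserve(hi - lo + 1);
    int i = lo, j = mid + 1;
    while (i <= mid && j <= hi) tmp.push_back(a[i] <= a[j] ? a[i++] : a[j++]);
    while (i <= mid) tmp.push_back(a[i++]);
    while (j <= hi) tmp.push_back(a[j++]);
    std::copy(tmp.begin(), tmp.end(), a.begin() + lo);
}
```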

Merge To Insertion Sort (20)

Merge To Insertion Sort (20) is similar to the
straightforward merge sort, but sorts sub-lists
using insertion sort when the sub-list has fewer
than 20 elements. Our results show that this
variant performs faster than the straight merge
sort in all cases (random, 20% sorted, or 50%
sorted list). It performs better because insertion
sort is a fast sorting algorithm for very small
lists.

[7] A randomly generated array in which 50% of the keys (1/2 of the file size) were randomly chosen to be taken out, sorted, and put back into the array.

Merge To Insertion Sort (100)

Merge To Insertion Sort (100) is similar to the
straightforward merge sort, but switches to
insertion sort when the sub-list has fewer
than 100 elements. We observed that this
variant performs faster than the 20-element
variant in all cases (random, 20% sorted, or
50% sorted list).

Quick Sort

This is a plain quick sort that uses the last
element in the list as the pivot. Quick sort also
uses the divide-and-conquer paradigm to sort a
list. Our results show that quick sort performs
fastest on a 20% sorted list. If a list is half
sorted (50% sorted), then quick sort requires
more time to sort it than a randomized list.

Quick To Insertion Sort

This is the same as the quick sort above, but
when a sub-list becomes small enough (less
than 50 elements), it sorts it using insertion
sort. Our results show that total work done by
this sort is less than plain quick sort because
insertion sort has a better constant than quick
sort when sorting smaller lists.
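The cutoff idea described above can be sketched as follows, with the last element as the pivot and a 50-element threshold as in the text (the helper names are ours, not the paper's):

```cpp
#include <utility>
#include <vector>

const int CUTOFF = 50;   // below this size, insertion sort takes over

// Insertion sort on the inclusive range a[lo..hi].
static void insertionRange(std::vector<int>& a, int lo, int hi) {
    for (int i = lo + 1; i <= hi; ++i) {
        int key = a[i], j = i;
        while (j > lo && a[j - 1] > key) { a[j] = a[j - 1]; --j; }
        a[j] = key;
    }
}

// Lomuto partition using the last element as the pivot.
static int partitionLast(std::vector<int>& a, int lo, int hi) {
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; ++j)
        if (a[j] < pivot) std::swap(a[i++], a[j]);
    std::swap(a[i], a[hi]);
    return i;
}

// Quick sort that hands small sub-lists to insertion sort.
void quickToInsertionSort(std::vector<int>& a, int lo, int hi) {
    if (hi - lo + 1 < CUTOFF) { insertionRange(a, lo, hi); return; }
    int p = partitionLast(a, lo, hi);
    quickToInsertionSort(a, lo, p - 1);
    quickToInsertionSort(a, p + 1, hi);
}
```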

Quick Median-Of-Three Sort

A quick sort that, instead of choosing the last
element as the pivot, uses the median-of-three
strategy to select a pivot: it randomly chooses
three elements from the list, then uses the
median of those three as the pivot. This makes
the worst case for quicksort, in which the pivot
is the largest or smallest element in the list,
far less likely. We observed that this sort
performs faster than both the plain quick sort
and the quick sort that switches to insertion
sort below 50 elements.
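The pivot selection step can be sketched as follows (an assumption of the usual three-sample formulation; the paper's exact sampling may differ):

```cpp
#include <cstdlib>
#include <utility>
#include <vector>

// Pick three random positions in [lo, hi] and return the one holding
// the median value, so the pivot cannot be the smallest or largest of
// the three samples.
int medianOfThreeIndex(const std::vector<int>& a, int lo, int hi) {
    int n = hi - lo + 1;
    int i = lo + std::rand() % n;
    int j = lo + std::rand() % n;
    int k = lo + std::rand() % n;
    // Sort the three indices by the values they point at; j ends up median.
    if (a[i] > a[j]) std::swap(i, j);
    if (a[j] > a[k]) std::swap(j, k);
    if (a[i] > a[j]) std::swap(i, j);
    return j;
}
```

A quick sort would swap the element at this index to the partition boundary before proceeding exactly as in the plain version.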

Quick Median-Of-Three To Insertion Sort

A median-of-three quick sort (same as above)
that uses insertion sort on sub-lists smaller
than 50 elements. We observed that this sort
performs faster than the plain median-of-three
quick sort: it decreases the number of
exchanges needed, although the insertion sort
passes add comparisons. Overall, this sort
performs the best out of all the quick sort
variations used in this project.

Heap Sort

A sort in which all of the elements are arranged
into a heap that is continually updated while
the largest values are taken out and placed
back into the array. The total work done by
heap sort was the worst of all the big sorts.
We observed that heap sort performed faster
when the array was 50% sorted than when it
was 20% sorted; when the array was random,
more work was done in both comparisons and
exchanges.
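The paper's implementation maintains its own heap; the same idea can be sketched with the standard library's heap primitives (this is a functional equivalent, not the tested code):

```cpp
#include <algorithm>
#include <vector>

// Heap sort: build a max-heap in place, then repeatedly move the largest
// remaining element to the back and shrink the heap by one.
void heapSort(std::vector<int>& a) {
    std::make_heap(a.begin(), a.end());
    for (std::vector<int>::iterator end = a.end(); end != a.begin(); --end)
        std::pop_heap(a.begin(), end);   // max of [begin, end) goes to end-1
}
```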

Shell Sort

Shell sort generalizes insertion sort: it
starts out by sorting elements that are far from
each other, then decreases that gap over
subsequent iterations. With the increments we
provided, shell sort performed faster than all
the other big sorts. We observed that shell
sort 1 performed faster than shell sort 2
because shell sort 1's increment sequence
was better.
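A sketch using the simple halving gap sequence (the two variants in the paper used different increment sequences, which is exactly the knob that separates their results):

```cpp
#include <cstddef>
#include <vector>

// Shell sort: gapped insertion sort with a shrinking gap. The final
// gap == 1 pass is a plain insertion sort on a nearly sorted array.
void shellSort(std::vector<int>& a) {
    for (std::size_t gap = a.size() / 2; gap > 0; gap /= 2)
        for (std::size_t i = gap; i < a.size(); ++i) {
            int key = a[i];
            std::size_t j = i;
            while (j >= gap && a[j - gap] > key) {
                a[j] = a[j - gap];   // shift within the gapped sub-list
                j -= gap;
            }
            a[j] = key;
        }
}
```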

4. FUTURE CONSIDERATIONS

We plan to expand the functionality of Fun
With Sorts in the future for even easier testing
of various algorithms. For one thing, there are
several design improvements that can be
made.

More Robust Stats

At the time of writing, there are two Stats
modules: Timer, and Counter. While these are
already easy to use, they are not scalable and
need to be improved for two basic reasons.
1. They do not support grouping their results
(into bins of file size and input type, for
example), because they are backed by a
one-dimensional data structure. At the
time of writing, separation of Stats data
into so-called bins is handled by the
output function in the Utilities module.
2. They do not inherit from a common parent
class. As a result, having two Stats
classes (we started with just one, the
Counter) forces us to duplicate
functionality across them.

An Algorithm Group Interface

The two modules, Project 1 and Project 2,
must also be made to extend a base Algorithm
Group class that provides virtual function
prototypes for all the settings you would expect
in one of these groups. Each algorithm group
needs to have a function that returns the file
sizes and the input file types, for example.

Configurable Grapher

At the time of writing, the Grapher reads the
output data from the Framework and creates a
bar graph for each Stat data associated with
sorts, and a line plot for each Stat data
associated with a search. As mentioned
before, it also does some simple math to
calculate total work and average work.
Generating proper, useful graphs is currently
ad hoc: if you stray from the default behavior
of the Grapher, you must change its code. A
future iteration of the Grapher will be
configurable. This would be useful, for
example, for grouping several Counters into a
single bar graph.

A Test Suite

The program should automatically run all of the
algorithms on small sets of data to make sure
that the algorithms continue to work as the
code is modified and more algorithms are
added.

Existing Algorithms Improved

The code for some of the algorithms we
implemented and tested needs work. Most
notably, the shell sort should be modified to
use different intervals. Also, the adaptive
searches should be tested with sought
elements having different distributions[8].

Performance Improvements

Fun with Sorts should be thoroughly checked
for memory leaks before its next use in more
complex applications. The application should
also support the use of multiple threads for
running algorithms concurrently.

5. CONCLUSION

For the most part, the results we obtained
using our testing application aren't particularly
useful, because they simply confirm
well-known runtime complexities for several
classic algorithms; detailed analyses of these
complexities are readily available. However,
the application has the potential to improve
students' understanding of how various
algorithms work, and it can provide visual
representations of various statistics collected
while running the algorithms.

We found that the most insightful data came
from the counters for the sequential searches
in Project 1, which show a performance
improvement over time for the adaptive
versions of the search. In fact, as we were
testing our code and going through iterations of
the searches, we saw the plots change shape
as the search code improved.






[8] A distribution biased toward the end of the list was chosen because it shows the most dramatic improvement. In reality, a computer scientist who knows that the sought keys are predominantly at the end of the file would write an algorithm that starts the search from the end, not the front, of the file.









Appendix
File size: 500, 2500, 12,500, 62,500 (left to right); within each size, the three columns are Random, Reverse, 20% sorted.
adap. bubble sort comparisons 234,530 249,500 195,608 6,217,512 6,247,500 5,115,453 154,125,169 156,237,500 144,175,965 3,895,250,175 3,906,187,500 3,837,626,097
adap. bubble sort exchanges 62,116 124,750 59,870 1,531,373 3,123,750 1,503,941 38,908,335 78,118,750 37,847,180 974,692,553 1,953,093,750 946,728,517
adap. bubble total 296,646 374,250 255,478 7,748,885 9,371,250 6,619,394 193,033,504 234,356,250 182,023,145 4,869,942,728 5,859,281,250 4,784,354,614
bubble comparisons 249,500 249,500 249,500 6,247,500 6,247,500 6,247,500 156,237,500 156,237,500 156,237,500 3,906,187,500 3,906,187,500 3,906,187,500
bubble exchanges 62,116 124,750 59,870 1,531,373 3,123,750 1,503,941 38,908,335 78,118,750 37,847,180 974,692,553 1,953,093,750 946,728,517
bubble total 311,616 374,250 309,370 7,778,873 9,371,250 7,751,441 195,145,835 234,356,250 194,084,680 4,880,880,053 5,859,281,250 4,852,916,017
insertion sort comparisons 62,116 124,750 59,870 1,531,373 3,123,750 1,503,941 38,908,335 78,118,750 37,847,180 974,692,553 1,953,093,750 946,728,517
insertion sort exchanges 62,615 125,249 60,369 1,533,872 3,126,249 1,506,440 38,920,834 78,131,249 37,859,679 974,755,052 1,953,156,249 946,791,016
insertion total 124,731 249,999 120,239 3,065,245 6,249,999 3,010,381 77,829,169 156,249,999 75,706,859 1,949,447,605 3,906,249,999 1,893,519,533
selection sort comparisons 124,750 124,750 124,750 3,123,750 3,123,750 3,123,750 78,118,750 78,118,750 78,118,750 1,953,093,750 1,953,093,750 1,953,093,750
selection sort exchanges 494 250 494 2,491 1,250 2,491 12,492 6,250 12,494 62,491 31,250 62,487
selection total 125,244 125,000 125,244 3,126,241 3,125,000 3,126,241 78,131,242 78,125,000 78,131,244 1,953,156,241 1,953,125,000 1,953,156,237
File size: 2500, 12,500, 62,500, 312,500 (left to right); within each size, the three columns are Random, 20% sorted, 50% sorted.
adap20_merge_sort comparisons 15,666 15,560 15,730 115,521 115,382 115,460 701,994 702,371 702,283 4,140,232 4,137,217 4,139,330
adap_merge_sort comparisons 13,052 12,802 12,519 82,240 82,945 83,150 578,795 581,258 579,316 3,526,270 3,520,729 3,525,462
heap_sort comparisons 54,644 54,648 54,488 331,450 331,188 330,510 1,943,684 1,942,820 1,939,968 11,176,524 11,172,462 11,119,072
heap_sort exchanges 26,071 26,073 25,993 159,474 159,343 159,004 940,591 940,159 938,733 5,432,011 5,429,980 5,403,285
merge_sort comparisons 25,161 25,131 25,169 154,544 154,636 155,023 917,055 917,090 917,253 5,315,407 5,316,781 5,321,427
quickMedianOfThreeSort comparisons 35,868 33,144 34,914 207,852 212,343 206,001 1,260,909 1,226,043 1,178,718 7,166,973 7,062,024 6,695,841
quickMedianOfThreeSort exchanges 10,528 9,630 10,184 62,102 63,604 61,515 384,339 372,734 356,971 2,205,395 2,170,808 2,049,174
quickMedianOfThreeToInsertionSort
comparisons 40,614 37,460 40,977 235,522 236,732 234,688 1,390,249 1,348,416 1,291,592 7,757,998 7,649,529 7,203,781
quickMedianOfThreeToInsertionSort exchanges 6,895 6,088 6,611 43,796 46,111 43,927 292,555 283,806 272,175 1,743,642 1,713,348 1,609,261
quickToInsertSortTime comparisons 40,586 42,794 44,418 250,052 247,500 254,625 1,438,064 1,447,541 1,476,219 8,193,876 8,242,537 8,181,748
quickToInsertSortTime exchanges 7,468 7,225 7,214 49,629 49,524 50,425 316,545 315,562 320,052 1,910,493 1,944,170 1,928,863
quick_sort1 comparisons 38,172 37,434 38,115 229,995 227,019 230,892 1,352,892 1,349,328 1,357,179 7,772,070 7,845,096 7,768,506
quick_sort1 exchanges 11,058 10,766 10,972 68,313 67,098 68,261 409,222 407,627 408,956 2,381,677 2,404,612 2,376,814
shellSort1 comparisons 4776 4860 4922 24793 24832 23678 122611 122560 121748 610939 603164 596961
shellSort1 exchanges 19740 19824 19886 124729 124768 123614 747511 747460 746648 4048318 4040543 4034340
shellSort2 comparisons 7559 7752 8336 55437 54120 47185 283360 287922 289168 1665106 1771899 1538898
shellSort2 exchanges 22016 22209 22793 150521 149204 142269 831103 835665 836911 4969752 5076545 4843544
Measured time (ms.)
adap20_sort_time 0 0 1 2 3 2 14 14 15 84 84 83
adap_merge_sort time 0 0 1 1 2 2 11 11 11 68 68 67
heap sort time 2 2 1 9 10 9 56 56 56 321 326 320
merge sort time 1 1 1 4 3 3 23 25 21 129 124 123
quick median of three sort time 1 1 1 6 5 6 39 32 32 188 185 175
quick median of three to insertion sort time 1 0 1 6 5 5 31 30 30 174 171 161
quick to insertion sort time 1 1 1 5 6 5 34 33 34 187 189 189
vanilla quicksort time 1 1 1 6 6 6 34 34 34 192 192 192
shellSort1 time 1 0 1 2 3 2 16 17 14 78 77 77
shellSort2 time 1 0 1 4 3 3 19 19 19 113 115 109
