Abstract - For over 20 years, the data structures course has been a pillar of computer science programs at colleges and universities. This paper looks at how the data structures course has evolved over time from a course that emphasized algorithmic concepts to a course that emphasizes syntactical and design concepts. It illustrates how the evolution of programming languages and concepts can introduce “gratuitous” complexity into algorithms. Specific algorithms and abstract data types are compared in past and present data structures texts using a suite of software metrics. A comparison is performed between algorithms from data structures texts across different programming languages and across procedural and object oriented paradigms. The results are compared to provide evidence of how the course has evolved over time.

Index Terms – Abstract Data Types, Data Structures, Software Metrics, Syntactic Complexity.

INTRODUCTION

The data structures course has been a core constituent for many years in computer science programs. While spanning many languages, from FORTRAN to Pascal to C to C++ and now Java, the basic content of the course has remained unchanged. As languages have evolved, there has been a perceived increase in the syntactic complexity of data structure implementations, causing the underlying algorithms to become obscured. Many object oriented features seem to add to the syntactic complexity without contributing to the functionality of an algorithm. Students are required to learn more complex language syntax and very often the underlying algorithm is lost in the confusion.

The term “gratuitous complexity” was first described in [1] as complexity that “contributes nothing to the task in hand”. For example, in the book “Data Structures and Algorithm Analysis in C++” by Mark Allen Weiss [11], Weiss defines a node for a linked list in C++ as:

    template <class Object>
    class ListNode
    {
        ListNode( const Object & theElement = Object( ), ListNode * n = NULL )
          : element( theElement ), next( n ) { }

        Object element;
        ListNode *next;

        friend class List<Object>;
        friend class ListItr<Object>;
    };

FIGURE I
A LINKED LIST NODE IMPLEMENTED IN C++

In this small segment of C++ code, there are classes, objects, templates, constructors, default arguments that are used in calls to other data members’ constructors, and friend class declarations which are passed the underlying template type of the ListNode class. Couldn’t the functionality be expressed in a way that does not obscure the underlying process?

In an earlier book by Weiss, “Data Structures and Algorithm Analysis” [10], a node for a linked list is defined in the “C” language as follows:

    typedef struct Node *PtrToNode;
    typedef PtrToNode Position;
    struct Node
    {
        ElementType Element;
        Position Next;
    };

FIGURE II
A LINKED LIST NODE IMPLEMENTED IN C

There is more functionality in Weiss’s C++ version of a node, but how much of the added complexity is gratuitous? For example, in C a node “N” could be initialized as follows:

    N.Element.field1 = initvalue1;
    N.Element.field2 = initvalue2;
    etc.

This is straightforward and students do not have to focus on understanding a large number of abstract programming concepts in order to comprehend the underlying process. Of course, the fault could be placed on the author. Clever use of syntax very often obscures the underlying process. In his book “Data Structures and Algorithm Analysis in Java” [12], Weiss defines a node for a linked list as follows:
1 James Harris, Associate Professor of Computer Science, jkharris@georgiasouthern.edu
2 Ardian Greca, Assistant Professor of Computer Science, naidrag@IEEE.org
0-7803-8552-7/04/$20.00 © 2004 IEEE October 20 – 23, 2004, Savannah, GA
34th ASEE/IEEE Frontiers in Education Conference
S3H-9
Session S3H
    class ListNode
    {
        // Constructors
        ListNode( Object theElement )
        {
            this( theElement, null );
        }

        ListNode( Object theElement, ListNode n )
        {
            element = theElement;
            next = n;
        }

        // Friendly data; accessible by other package routines
        Object element;
        ListNode next;
    }

FIGURE III
A LINKED LIST NODE IMPLEMENTED IN JAVA

The code in Figure III is considerably more comprehensible than the corresponding C++ code, albeit without templates. One of the issues with implementing abstract data types (ADTs) with object oriented languages is the question of obscuring the algorithm by using more sophisticated language constructs needed to implement objects. As the previous example illustrates, this does not necessarily have to be the case.

As another example, consider Dale Kruse’s Pascal and C++ definitions for a binary tree ADT [2,4].

    template <class Entry>
    class Binary_tree {
    public:
        Binary_tree();
    protected:
        Binary_node<Entry> *root;
    };

    template <class Entry>
    struct Binary_node {
        Entry data;
        Binary_node<Entry> *left;
        Binary_node<Entry> *right;
        Binary_node();
    };

    template <class Entry>
    Binary_tree<Entry>::Binary_tree()
    {
        root = NULL;
    }

FIGURE V
A BINARY TREE DEFINITION IN C++

These definitions include the data definitions and the implementation of a C++ default constructor and its equivalent in Pascal. Clearly the C++ definition is longer and more complex. Part of the added complexity stems from the fact that in the C++ version, a tree is defined by a class rather than just a pointer variable as in Pascal. In fact, much of the added complexity in implementing ADTs in object oriented languages can be attributed to the use of objects rather than pointers to represent ADTs. Most books in C++ allow for dynamic declarations such as:
TABLE I
TEXTBOOKS USED IN THE STUDY
Title                                             Author(s)
Pascal Plus Data Structures                       Nell B. Dale, Neil Dale, Susan C. Lilly
C++ Plus Data Structures                          Nell B. Dale
Object Oriented Data Structures using Java        Nell Dale, Daniel T. Joyce, Chip Weems
Data Structures and Program Design                Robert L. Kruse
Data Structures and Program Design in C           Robert L. Kruse, Bruce P. Leung, Clovis L. Tondo
Data Structures and Program Design in C++         Robert L. Kruse, Alex Ryba
Data Structures and Algorithm Analysis in Java    Mark Allen Weiss
Data Structures and Algorithm Analysis in C++     Mark Allen Weiss
Data Structures and Algorithm Analysis in Java    Mark Allen Weiss
These books were chosen because they involve three authors (Dale, Kruse, and Weiss) who have published various data structures books in different languages over time. The functionality within each ADT tends to be consistent within authors, i.e. independent of the implementation language. Another reason these texts were chosen is that each of these authors was nice enough to make copies of their source code available online [13]-[15]. The four ADTs analyzed are common to most data structures textbooks. They are stacks (implemented with arrays), circular queues (also implemented with arrays), singly linked lists, and binary search trees.

Each set of source code was first stripped of comments and blank lines. The following seven metrics were measured:

• Number of characters
• Number of tokens
• Number of lines
• Number of blocks (BC)
• Number of weighted blocks (WBC)
• Number of different identifiers (D-Ident)
• Number of different reserved words (D-RW)

The number of characters, tokens, and lines generally indicates the amount of code needed for implementation. In general, the more code needed to implement an algorithm, the greater the syntactic complexity.

The number of blocks or block count (BC) is a measure of logical “blocks” of code. Blocks are logical groupings of statements used by other syntactic structures. Blocks are determined by BEGIN and END statements in Pascal (with several exceptions) and “{” and “}” in C, C++, and Java. Weighted blocks assign a weight to each block that is the nesting level of that block. For example, if a block is nested within two other blocks, it has a nesting level of three and therefore a weight of three. The weighted block count (WBC), which is the sum of the weighted blocks, is a good measure of complexity since each level of nesting applies an additional scoping and control context, adding to the complexity of an algorithm.

The number of different identifiers (D-Ident) reflects how many names the reader must remember, along with the purpose of each. A program with many identifiers is analogous to going to a meeting where you are introduced to many people. It becomes difficult to associate names with faces.

As a measure of complexity, the number of different reserved words (D-RW) differs from the number of identifiers. In order to understand an implementation with a greater number of reserved words, the reader must have a greater knowledge of language syntax. The total number of identifiers and the total number of reserved words were not included as measures because they are subsets of the number of tokens. Remembering the purpose of an identifier involves associating a particular variable or a function with its purpose, whereas remembering the purpose of a reserved word involves grammatical syntax such as flow of control, primitive data types, etc.

Several common complexity measures, such as cyclomatic numbers [5] and Halstead measures [9], were not deemed appropriate because they tend to measure algorithmic complexity rather than syntactic complexity.

Table II shows the results of the metrics applied to the sample code. ADT stands for “Abstract Data Type”, BST stands for “Binary Search Tree”, LL stands for “Linked Lists”, PAS stands for Pascal, and AVG stands for “Average”.

The data in Table II is sorted by the primary key “Author”, secondary key “ADT”, and tertiary key “Language”. The data is grouped by Author first and ADT second.
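The block and identifier metrics described above can be illustrated with a minimal C++ sketch. It is not the tool used in the study. The names BlockMetrics, measureBlocks, WordMetrics, and measureWords are hypothetical. The input is assumed to be comment-free, brace-delimited source (C, C++, or Java) with no braces or words inside string or character literals, and the caller is assumed to supply the reserved-word list for the language being measured. Treating every “{” as the start of a block is a simplifying assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical result type for the two block metrics.
struct BlockMetrics {
    int blockCount = 0;          // BC: number of "{ ... }" blocks
    int weightedBlockCount = 0;  // WBC: sum of the nesting levels of all blocks
};

// Assumption: `source` has no comments and no braces inside literals.
BlockMetrics measureBlocks(const std::string& source) {
    BlockMetrics m;
    int depth = 0;  // nesting level of the innermost open block
    for (char c : source) {
        if (c == '{') {
            ++depth;                        // an outermost block has level 1
            ++m.blockCount;
            m.weightedBlockCount += depth;  // weight of a block = its nesting level
        } else if (c == '}' && depth > 0) {
            --depth;
        }
    }
    return m;
}

// Hypothetical result type for D-Ident and D-RW.
struct WordMetrics {
    std::size_t distinctIdentifiers;    // D-Ident
    std::size_t distinctReservedWords;  // D-RW
};

// Assumption: the caller passes the reserved words of the language measured.
WordMetrics measureWords(const std::string& source,
                         const std::set<std::string>& reservedWords) {
    std::set<std::string> identifiers;
    std::set<std::string> reserved;
    std::string word;

    // Classify the word collected so far, then start a new one.
    auto flush = [&]() {
        if (word.empty()) return;
        // Words that start with a digit are numeric literals, not names.
        if (!std::isdigit(static_cast<unsigned char>(word[0]))) {
            if (reservedWords.count(word) > 0) {
                reserved.insert(word);
            } else {
                identifiers.insert(word);
            }
        }
        word.clear();
    };

    for (char c : source) {
        if (std::isalnum(static_cast<unsigned char>(c)) || c == '_') {
            word += c;
        } else {
            flush();
        }
    }
    flush();
    return {identifiers.size(), reserved.size()};
}
```

For the input "{ { { } } }" the sketch reports a block count of 3 and a weighted block count of 1 + 2 + 3 = 6, in line with the rule that a block nested within two other blocks has a weight of three.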
TABLE II
MEASURE COUNTS
Since each author has implemented different functionality for a particular data structure, the values in Table II were normalized by dividing each value by the corresponding average value for a particular author and data structure. Because the functionality of each author’s data structure does not vary across implementation languages, dividing each count by the corresponding average gives a measure that allows a comparison between languages independent of the differing functionalities provided by each author. The data was then sorted by primary key “Language” (Lang) and secondary key “ADT”. The results are shown in Table III.
TABLE III
WEIGHTED MEASURE COUNTS
Author ADT Lang Chars Tokens Lines BC WBC D-Ident D-RW Average
DALE BST C++ 0.612 0.705 0.555 0.471 0.304 0.776 1.085
KRUSE BST C++ 1.230 1.227 1.051 1.037 0.793 0.964 1.109
WEISS BST C++ 1.618 1.508 1.360 1.318 1.019 1.000 1.239
KRUSE LL C++ 1.319 1.256 1.228 0.968 0.672 1.200 0.957
WEISS LL C++ 1.293 1.320 1.070 1.113 0.989 0.895 1.227
DALE LL C++ 0.963 1.039 0.920 0.857 0.495 0.695 0.840
DALE QUEUE C++ 1.247 1.361 1.343 1.222 0.929 0.879 1.263
KRUSE QUEUE C++ 0.919 0.926 0.917 0.720 0.600 0.737 0.774
WEISS QUEUE C++ 0.970 0.917 0.865 0.730 0.491 0.789 0.878
DALE STACK C++ 0.734 0.906 0.788 0.656 0.344 0.600 0.786
KRUSE STACK C++ 0.970 0.920 1.000 0.667 0.462 0.709 0.943
WEISS STACK C++ 0.974 0.956 0.906 0.818 0.563 0.761 0.900
Avg 1.071 1.087 1.000 0.881 0.638 0.834 1.000 0.930
DALE BST JAVA 1.247 1.274 1.279 1.414 1.443 1.133 1.149
WEISS BST JAVA 0.835 0.876 0.909 1.091 1.427 1.241 1.239
DALE LL JAVA 0.790 0.756 0.878 1.041 1.144 1.074 1.200
WEISS LL JAVA 0.918 0.805 0.905 1.065 1.372 1.342 1.227
DALE QUEUE JAVA 0.687 0.747 0.731 0.889 1.000 0.776 0.947
WEISS QUEUE JAVA 0.820 0.789 0.810 1.054 1.473 1.145 1.463
DALE STACK JAVA 1.113 0.986 1.136 1.406 1.328 0.943 1.429
WEISS STACK JAVA 0.835 0.838 0.843 1.182 1.688 1.141 1.500
Avg 0.906 0.884 0.936 1.143 1.359 1.099 1.269 1.085
DALE BST PAS 1.141 1.021 1.166 1.114 1.253 1.091 0.766
KRUSE BST PAS 0.866 0.645 0.966 0.852 1.221 1.051 1.109
DALE LL PAS 1.247 1.205 1.202 1.102 1.361 1.232 0.960
KRUSE LL PAS 0.671 0.723 0.743 0.871 1.216 1.110 1.404
DALE QUEUE PAS 1.065 0.892 0.925 0.889 1.071 1.345 0.789
KRUSE QUEUE PAS 1.050 0.979 1.000 1.200 1.300 1.421 1.452
DALE STACK PAS 1.153 1.108 1.076 0.938 1.328 1.457 0.786
KRUSE STACK PAS 0.843 0.848 0.750 1.000 1.462 1.255 1.286
Avg 1.005 0.928 0.978 0.996 1.276 1.245 1.069 1.071
[Chart: normalized metric values (vertical axis 0.000 to 1.600) for Chars, Tokens, Lines, BC, WBC, D-Ident, and D-RW; legend: C, C++, Java, Pascal]
FIGURE VI
COMPARING COMPLEXITY METRICS
[7] Kruse, R., H., Tondo, C., L., Leung, B., Data Structures and Program Design in C, 2nd edition, 1997.
[8] Kruse, R., H., Data Structures and Program Design in C++, 1st edition, 1998.
[9] McCabe, T., J., A Complexity Measure, IEEE Transactions on Software Engineering, pp 308-320, December 1976.
[10] Weiss, M., A. Data Structures and Algorithm Analysis, 2nd edition, 1994.
[11] Weiss, M., A. Data Structures and Algorithm Analysis in C++, 2nd edition, 1999.
[12] Weiss, M., A. Data Structures and Algorithm Analysis in Java, 2nd edition, 2002.
[13] ftp://ftp.prenhall.com/pub/esm/computer_science.s-041/kruse
[14] http://computerscience.jbpub.com/cs_resources.cfm
[15] http://www.cs.fiu.edu/~weiss/dsaajava/code/