
data structure

In programming, the term data structure refers to a scheme for organizing related pieces of information. The basic types of data structures include files, lists, arrays, records, trees, and tables. Each of these basic structures has many variations and allows different operations to be performed on the data.

data structure

A data structure is a specialized format for organizing and storing data. General data structure types include the array, the file, the record, the table, the tree, and so on. Any data structure is designed to organize data to suit a specific purpose so that it can be accessed and worked with in appropriate ways. In computer programming, a data structure may be selected or designed to store data for the purpose of working on it with various algorithms.

RELATED GLOSSARY TERMS: BLOB (binary large object), catalog, data mart (datamart), ECMAScript (European Computer Manufacturers Association Script), Visual FoxPro, segment, block, flat file, Database: Glossary, OPAC (Online Public Access Catalog)

This book is about the creation and analysis of efficient data structures. It covers:

the primitive node structure;
asymptotic notation for mathematically discussing performance characteristics;
built-in arrays;
list structures built from either nodes or arrays;
iterators as an abstract model of enumerating the items in a sequence;
stacks and queues for computing with last-in/first-out and first-in/first-out orderings;
binary and general tree structures for searching or representing hierarchical relationships;
min and max heaps for representing ordering based on priorities;
graph structures for representing more general relationships between data elements;
hash tables for the efficient retrieval of strings and other objects;
and finally, trade-offs between the structures, and strategies for picking the most appropriate ones.

To understand the material in this book you should be comfortable enough in a programming language to be able to work with and write your own variables, arithmetic expressions, if-else conditions, loops, subroutines (also known as functions), pointers (also known as references or object handles), structures (also known as records or classes), simple input and output, and simple recursion. Because many different languages approach the construction of data structures differently, we use pseudo-code so that you can translate the code into your own language.

An Extensive Examination of Data Structures


Visual Studio .NET 2003
Scott Mitchell, 4GuysFromRolla.com
October 2003

Summary: This article kicks off a six-part series that focuses on important data structures and their use in application development. We'll examine both built-in data structures present in the .NET Framework, as well as essential data structures we'll have to build ourselves. This first installment focuses on defining what data structures are, how the efficiency of data structures is analyzed, and why this analysis is important. In this article, we'll also examine the Array and ArrayList, two of the most commonly used data structures present in the .NET Framework. (12 printed pages)

Contents
Introduction
Analyzing the Performance of Data Structures
Everyone's Favorite Linear, Direct Access, Homogeneous Data Structure: The Array
The ArrayList: a Heterogeneous, Self-Redimensioning Array
Conclusion

Introduction
Welcome to the first in a six-part series on using data structures in .NET. Throughout this article series we will be examining a variety of data structures, some of which are included in the .NET Framework Base Class Library and others that we'll build ourselves. If you're unfamiliar with the term, data structures are abstract structures, or classes, that are used to organize data and provide various operations upon that data. The most common and likely best-known data structure is the array, which contains a contiguous collection of data items that can be accessed by an ordinal index.

Before jumping into the content for this article, let's first take a quick peek at the roadmap for this six-part article series, so that you can see what lies ahead. If there are any topics you think are missing from this outline, I invite you to e-mail me at mitchell@4guysfromrolla.com and share your thoughts. Space permitting, I'll be happy to add your suggestions to the appropriate installment or, if needed, add a seventh part to the series.

In this first part of the six-part series, we'll look at why data structures are important, and their effect on the performance of an algorithm. To determine a data structure's effect on performance, we'll need to examine how the various operations performed by a data structure can be rigorously analyzed. Finally, we'll turn our attention to two data structures present in the .NET Framework: the Array and ArrayList. Chances are you've used both of these data structures in past projects. In this article, we'll examine what operations they provide and the efficiency of these operations.

In Part 2, we'll explore the ArrayList class in more detail and examine its counterparts, the Queue class and Stack class. Like the ArrayList, both the Queue and Stack classes store a contiguous collection of data and are data structures available in the .NET Framework Base Class Library.
However, unlike an ArrayList, from which you can retrieve any data item, Queues and Stacks only allow data to be accessed in a predetermined sequential order. We'll examine some applications of Queues and Stacks, and see how to implement both of these classes by extending the ArrayList class. After examining Queues and Stacks, we'll look at Hashtables, which allow for direct access like an ArrayList, but store data indexed by a string key.

While ArrayLists are ideal for directly accessing and storing contents, they are suboptimal candidates when the data needs to be searched. In Part 3, we'll examine the binary search tree data structure, which provides a much more efficient means for searching than the ArrayList. The .NET Framework does not include any built-in binary search tree data structures, so we will have to build our own. The efficiency of searching a binary search tree is sensitive to the order in which the data was inserted into the tree. If the data was inserted in sorted or near-sorted order, the binary search tree loses virtually all of its efficiency advantages over the ArrayList. To combat this issue, in Part 4 we'll examine an interesting randomized data structure: the SkipList. SkipLists provide the efficiency of searching a binary search tree, but without the sensitivity to the order in which data is entered.

In Part 5, we'll turn our attention to data structures that can be used to represent graphs. A graph is a collection of nodes, with a set of edges connecting the various nodes. For example, a map can be visualized as a graph, with cities as nodes and the highways between them as edges between the nodes. Many real-world problems can be abstractly defined in terms of graphs, thereby making graphs an often-used data structure.

Finally, in Part 6 we'll look at data structures to represent sets and disjoint sets. A set is an unordered collection of items. Disjoint sets are a collection of sets that have no elements in common with one another. Both sets and disjoint sets have many uses in everyday programs, which we'll examine in detail in this final part.
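For readers who want a concrete picture of the node-and-edge idea before Part 5, here is a minimal Java sketch of the city/highway example. The class and method names (CityGraph, addHighway, neighbors) are invented for illustration only:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CityGraph {
    // Adjacency list: each city (node) maps to the cities reachable by highway (edges).
    private final Map<String, List<String>> adj = new HashMap<>();

    public void addHighway(String a, String b) {
        adj.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        adj.computeIfAbsent(b, k -> new ArrayList<>()).add(a); // highways run both ways
    }

    public List<String> neighbors(String city) {
        return adj.getOrDefault(city, new ArrayList<>());
    }

    public static void main(String[] args) {
        CityGraph map = new CityGraph();
        map.addHighway("San Diego", "Los Angeles");
        map.addHighway("Los Angeles", "San Francisco");
        System.out.println(map.neighbors("Los Angeles"));
    }
}
```

An adjacency list like this is only one of several ways to store a graph; Part 5 of the series discusses the trade-offs in more detail.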

Analyzing the Performance of Data Structures


When thinking about a particular application or programming problem, many developers (myself included) find themselves most interested in writing the algorithm to tackle the problem at hand, or adding cool features to the application to enhance the user's experience. Rarely, if ever, will you hear someone excited about what type of data structure they are using. However, the data structures used for a particular algorithm can greatly impact its performance. A common example is finding an element in a data structure. With an array, this process takes time proportional to the number of elements in the array. With binary search trees or SkipLists, the time required is sub-linear. When searching large amounts of data, the data structure chosen can make a difference in the application's performance that can be visibly measured in seconds or even minutes.

Since the data structure used by an algorithm can greatly affect the algorithm's performance, it is important that there exists a rigorous method by which to compare the efficiency of various data structures. What we, as developers utilizing a data structure, are primarily interested in is how the data structure's performance changes as the amount of data stored increases. That is, for each new element stored by the data structure, how are the running times of the data structure's operations affected?

Consider a scenario in which you have a program that uses the System.IO.Directory.GetFiles(path) method to return the list of the files in a specified directory as a string array. Now, imagine that you wanted to search through the array to determine if an XML file existed in the list of files (namely, one whose extension was .xml). One approach would be to scan through the array and set a flag once an XML file was encountered.
The code might look like so:

    using System;
    using System.Collections;
    using System.IO;

    public class MyClass
    {
        public static void Main()
        {
            string[] fs = Directory.GetFiles(@"C:\Inetpub\wwwroot");
            bool foundXML = false;
            int i = 0;
            for (i = 0; i < fs.Length; i++)
                if (String.Compare(Path.GetExtension(fs[i]), ".xml", true) == 0)
                {
                    foundXML = true;
                    break;
                }

            if (foundXML)
                Console.WriteLine("XML file found - " + fs[i]);
            else
                Console.WriteLine("No XML files found.");
        }
    }

Here we see that in the worst case, when there is no XML file or the XML file is the last file in the list, we have to search through each element of the array exactly once. To analyze the array's efficiency at searching, we must ask ourselves the following: "Assume that I have an array with n elements. If I add another element, so the array has n + 1 elements, what is the new running time?" (The term running time, despite its name, does not measure the absolute time it takes the program to run; rather, it refers to the number of steps the program must perform to complete the given task. When working with arrays, typically the steps considered are how many array accesses one needs to perform.) To search for a value in an array, we potentially need to visit every array element, so if we have n + 1 array elements, we might have to perform n + 1 checks. That is, the time it takes to search an array is linearly proportional to the number of elements in the array.

The sort of analysis described here is called asymptotic analysis, as it examines how the efficiency of a data structure changes as the data structure's size approaches infinity. The notation commonly used in asymptotic analysis is called big-Oh notation. The big-Oh notation to describe the performance of searching an array would be denoted as O(n). The large script O is where the terminology big-Oh notation comes from, and the n indicates that the number of steps required to search an array grows linearly as the size of the array grows.

A more methodical way of computing the asymptotic running time of a block of code is to follow these simple steps:

1. Determine the steps that constitute the algorithm's running time. As mentioned above, with arrays, typically the steps considered are the read and write accesses to the array. For other data structures the steps might differ. Typically, you want to concern yourself with steps that involve the data structure itself, and not simple, atomic operations performed by the computer. That is, with the block of code above, I analyzed its running time by only counting how many times the array needs to be accessed, and did not bother worrying about the time for creating and initializing variables or the check to see if the two strings were equal.

2. Find the line(s) of code that perform the steps you are interested in counting. Put a 1 next to each of those lines.

3. For each line with a 1 next to it, see if it is in a loop. If so, change the 1 to 1 times the maximum number of repetitions the loop may perform. If you have two or more nested loops, continue the multiplication for each loop.

4. Find the largest single term you have written down. This is the running time.
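As an illustrative aside, the counting exercise can itself be mechanized. The Java sketch below (AccessCounter and countAccesses are names invented for this example, and the case-sensitive endsWith check stands in for the C# comparison) performs the same linear scan as the snippet above while tallying array reads, so the tally can be compared against n:

```java
public class AccessCounter {
    // Linear scan for a ".xml" entry, counting one array read per iteration.
    // Returns the number of array accesses performed before stopping.
    public static int countAccesses(String[] files) {
        int accesses = 0;
        for (int i = 0; i < files.length; i++) {
            accesses++;                      // one read: files[i]
            if (files[i].endsWith(".xml")) {
                break;                       // found it; stop scanning
            }
        }
        return accesses;
    }

    public static void main(String[] args) {
        String[] fs = { "a.txt", "b.txt", "c.xml", "d.txt" };
        // The worst case is files.length accesses; here the scan stops at the third.
        System.out.println(countAccesses(fs));
    }
}
```

Running this with arrays of increasing length shows the access count growing in direct proportion to n, which is exactly what the O(n) classification captures.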

Let's apply these steps to the block of code above. We've already identified that the steps we're interested in are the number of array accesses. Moving on to step 2, note that there are two lines on which the array, fs, is being accessed: as a parameter in the String.Compare() method and in the Console.WriteLine() method, so mark a 1 next to each line. Now, applying step 3, notice that the access to fs in the String.Compare() method occurs within a loop that runs at most n times (where n is the size of the array). So, scratch out the 1 in the loop and replace it with n. Finally, we see that the largest value is n, so the running time is denoted as O(n).

O(n), or linear time, represents just one of a myriad of possible asymptotic running times. Others include O(log2 n), O(n log2 n), O(n^2), O(2^n), and so on. Without getting into the gory mathematical details of big-Oh, the more slowly the term inside the parentheses grows for large values of n, the better the performance of the data structure's operation. For example, an operation that runs in O(log n) is more efficient than one that runs in O(n), since log n < n.

Note In case you need a quick mathematics refresher, log_a b = y is just another way to write a^y = b. So, log2 4 = 2, since 2^2 = 4. Similarly, log2 8 = 3, since 2^3 = 8. Clearly, log2 n grows much more slowly than n alone, because when n = 8, log2 n = 3. In Part 3 we'll examine binary search trees, whose search operation provides an O(log2 n) running time.

Throughout this article series, we'll be computing the asymptotic running time of each new data structure's operations and comparing it to the running time for similar operations on other data structures.
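To make the growth-rate comparison concrete, here is a small Java sketch (GrowthDemo and log2 are invented names for this illustration) that computes the integer base-2 logarithm by repeated halving and prints it alongside n:

```java
public class GrowthDemo {
    // Integer base-2 logarithm: the y such that 2^y <= n < 2^(y+1).
    public static int log2(int n) {
        int y = 0;
        while (n > 1) {
            n /= 2;   // each halving corresponds to one power of two
            y++;
        }
        return y;
    }

    public static void main(String[] args) {
        // log2 n grows far more slowly than n itself:
        // n = 8 gives log2 n = 3, while n = 1024 gives only log2 n = 10.
        for (int n = 8; n <= 1024; n *= 4) {
            System.out.println("n = " + n + ", log2 n = " + log2(n));
        }
    }
}
```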

Everyone's Favorite Linear, Direct Access, Homogeneous Data Structure: The Array


Arrays are one of the simplest and most widely used data structures in computer programs. Arrays in any programming language all share a few common properties:

The contents of an array are stored in contiguous memory.
All of the elements of an array must be of the same type; hence arrays are referred to as homogeneous data structures.
Array elements can be directly accessed. (This is not necessarily the case for many other data structures. For example, in Part 4 of this article series we'll examine a data structure called the SkipList. To access a particular element of a SkipList you must search through other elements until you find the element for which you're looking. With arrays, however, if you know you want to access the i-th element, you can simply use one line of code: arrayName[i].)

The common operations performed on arrays are:

Allocation
Accessing
Redimensioning

When an array is initially declared in C#, it has a null value. That is, the following line of code simply creates a variable named booleanArray that equals null:

    bool [] booleanArray;

Before we can begin to work with the array, we must allocate a specified number of elements. This is accomplished using the following syntax:

    booleanArray = new bool[10];

Or, more generically:

    arrayName = new arrayType[allocationSize];

This allocates a contiguous block of memory in the CLR-managed heap large enough to hold allocationSize number of arrayTypes. If arrayType is a value type, then allocationSize number of unboxed arrayType values are created. If arrayType is a reference type, then allocationSize number of arrayType references are created. (If you are unfamiliar with the difference between reference and value types and the managed heap versus the stack, check out Understanding .NET's Common Type System.)

To help hammer home how the .NET Framework stores the internals of an array, consider the following example:

    bool [] booleanArray;
    FileInfo [] files;

    booleanArray = new bool[10];
    files = new FileInfo[10];

Here, the booleanArray is an array of the value type System.Boolean, while the files array is an array of a reference type, System.IO.FileInfo. Figure 1 shows a depiction of the CLR-managed heap after these four lines of code have executed.

Figure 1. The contents of an array are laid out contiguously in the managed heap.
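Although the article's example is C#, the same distinction exists in Java, where primitives play the role of value types and objects are always reached through references. In this rough sketch, java.io.File stands in for FileInfo:

```java
import java.io.File;

public class ArrayLayoutDemo {
    public static void main(String[] args) {
        // A primitive array: the boolean values live directly in the array,
        // and every slot starts at the default value false.
        boolean[] booleanArray = new boolean[10];

        // A reference array: ten references, all initially null; no File
        // objects exist yet.
        File[] files = new File[10];

        System.out.println(booleanArray[0]);  // default primitive value
        System.out.println(files[0]);         // null reference
        files[3] = new File("example.txt");   // only slot 3 now points at an object
        System.out.println(files[3] != null);
    }
}
```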

The thing to keep in mind is that the ten elements in the files array are references to FileInfo instances. Figure 2 hammers home this point, showing the memory layout if we assign some of the values in the files array to FileInfo instances.

Figure 2. The contents of an array are laid out contiguously in the managed heap.

All arrays in .NET allow their elements to both be read and written to. The syntax for accessing an array element is:

    // Read an array element
    bool b = booleanArray[7];

    // Write to an array element
    booleanArray[0] = false;
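Java performs an analogous index-bounds check on every array access at runtime, throwing ArrayIndexOutOfBoundsException where the CLR throws IndexOutOfRangeException. A small illustrative sketch (safeGet is an invented helper, not a standard API):

```java
public class BoundsCheckDemo {
    // Returns the element at index i, or a fallback if i is out of bounds.
    public static boolean safeGet(boolean[] arr, int i, boolean fallback) {
        try {
            return arr[i];  // the JVM validates the index before the read
        } catch (ArrayIndexOutOfBoundsException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        boolean[] booleanArray = new boolean[10];
        booleanArray[7] = true;
        System.out.println(safeGet(booleanArray, 7, false));   // valid index
        System.out.println(safeGet(booleanArray, 99, false));  // index rejected
    }
}
```

The check runs in constant time per access, so, as the article notes for the CLR, it does not change the O(1) classification of an array read.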

The running time of an array access is denoted O(1) because it is constant. That is, regardless of how many elements are stored in the array, it takes the same amount of time to look up an element. This constant running time is possible solely because an array's elements are stored contiguously; hence, a lookup just requires knowledge of the array's starting location in memory, the size of each array element, and the element to be indexed.

Realize that in managed code, array lookups are a bit more involved than this, because with each array access the CLR checks to ensure that the index being requested is within the array's bounds. If the array index specified is out of bounds, an IndexOutOfRangeException is thrown. This check helps ensure that when stepping through an array we do not accidentally step past the last array index and into some other memory. This check, though, does not affect the running time of an array access, because the time to perform such checks does not increase as the size of the array increases.

Note This index-bounds check comes at a slight cost of performance for applications that make a large number of array accesses. With a bit of unmanaged code, though, this index out of bounds check can be bypassed. For more information, refer to Chapter 14 of Applied Microsoft .NET Framework Programming by Jeffrey Richter.

When working with an array, you might need to change the number of elements it holds. To do so, you'll need to create a new array instance of the specified size and then copy over the contents of the old array

into the new, resized array. This process is called redimensioning, and can be accomplished with the following code:

    using System;
    using System.Collections;

    public class MyClass
    {
        public static void Main()
        {
            // Create an integer array with three elements
            int [] fib = new int[3];
            fib[0] = 1;
            fib[1] = 1;
            fib[2] = 2;

            // Redimension fib to a 10 element array
            int [] temp = new int[10];

            // Copy the fib array to temp
            fib.CopyTo(temp, 0);

            // Assign temp to fib
            fib = temp;
        }
    }

After the last line of code, fib references a ten-element Int32 array. Elements 3 through 9 in the fib array will have the default Int32 value, 0.

Arrays are excellent data structures to use when storing a collection of homogeneous types that you only need to access directly. Searching an unsorted array has linear running time. While this is acceptable when working with small arrays, or when performing very few searches, if your application is storing large arrays that are searched frequently, there are a number of other data structures better suited for the job. We'll look at some such data structures in upcoming pieces of this article series. (Realize that if you are searching an array on some property and the array is sorted by that property, you can use an algorithm called binary search to search the array in O(log n) running time, which is on par with the search times for binary search trees. In fact, the Array class contains a static BinarySearch() method. For more information on this method, check out an earlier article of mine, Efficiently Searching a Sorted Array.)

Note The .NET Framework allows for multi-dimensional arrays as well. Multi-dimensional arrays, like single-dimensional arrays, offer a constant running time for accessing elements. Recall that the running time to search through an n-element single-dimensional array was denoted O(n). For an n x n two-dimensional array, the running time is denoted O(n^2) because the search must check n^2 elements. More generally, a k-dimensional array has a search running time of O(n^k).
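As a cross-language aside, Java's standard library exposes the same O(log n) technique through Arrays.binarySearch, which likewise requires the array to already be sorted:

```java
import java.util.Arrays;

public class BinarySearchDemo {
    public static void main(String[] args) {
        // binarySearch only works correctly on a sorted array.
        int[] sorted = { 2, 3, 5, 8, 13, 21, 34 };

        // Found: returns the element's index, using O(log n) comparisons.
        System.out.println(Arrays.binarySearch(sorted, 13)); // index 4

        // Not found: returns -(insertionPoint) - 1, so the caller can
        // tell where the value would belong.
        System.out.println(Arrays.binarySearch(sorted, 6));
    }
}
```

Each comparison halves the remaining search range, which is why the running time is logarithmic rather than linear.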

The ArrayList: a Heterogeneous, Self-Redimensioning Array


While arrays definitely have their time and place, they impose some limitations on design, because a single array can only store elements of one type (homogeneity), and when using arrays you must specifically allocate a certain number of elements. Oftentimes, though, developers want something more flexible: a simple collection of objects of potentially different types that can be easily managed without having to worry about allocation issues. The .NET Framework Base Class Library provides such a data structure, called the System.Collections.ArrayList.

An example of the ArrayList in action can be seen in the code snippet below. Note that with the ArrayList no allocation step has to be performed and, in fact, elements of different types can be added to the ArrayList. Furthermore, at no point do we have to concern ourselves with redimensioning the ArrayList. All of this is handled behind the scenes for us.

    ArrayList countDown = new ArrayList();
    countDown.Add(5);
    countDown.Add(4);
    countDown.Add(3);
    countDown.Add(2);
    countDown.Add(1);
    countDown.Add("blast off!");
    countDown.Add(new ArrayList());

Behind the scenes, the ArrayList uses a System.Array of type object. Since all types are derived either directly or indirectly from object, an object array can hold elements of any type. By default, an ArrayList creates a 16-element object array, although the precise size can be specified through a parameter in the constructor or the Capacity property. When adding an element through the Add() method, the number of elements in the internal array is checked against the array's capacity. If adding the new element causes the count to exceed the capacity, the capacity is automatically doubled and the array is redimensioned.

The ArrayList, like the array, can be directly indexed using the same syntax:

    // Read access
    int x = (int) countDown[0];
    string y = (string) countDown[5];

    // Write access
    countDown[1] = 5;

    // ** WILL GENERATE AN ArgumentOutOfRange EXCEPTION **
    countDown[7] = 5;

Since the ArrayList stores an array of objects, when reading a value from an ArrayList you need to explicitly cast it to the data type being stored in the specified location. Also, note that if you try to reference an ArrayList element greater than the ArrayList's size, a System.ArgumentOutOfRangeException will be thrown.
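Java's ArrayList behaves much the same way when declared with Object elements: different types can be mixed, reads must be cast back to the stored type, and out-of-range indexes throw an exception. A short illustrative sketch:

```java
import java.util.ArrayList;

public class ArrayListDemo {
    public static void main(String[] args) {
        // An ArrayList of Object can mix element types, much like the
        // C# snippet above; growth is handled internally.
        ArrayList<Object> countDown = new ArrayList<>();
        for (int i = 5; i >= 1; i--) {
            countDown.add(i);            // ints are boxed to Integer
        }
        countDown.add("blast off!");

        // Reads must be cast back to the stored type.
        int x = (Integer) countDown.get(0);
        String y = (String) countDown.get(5);
        System.out.println(x + " ... " + y);

        // Indexing at or past size() throws IndexOutOfBoundsException.
        try {
            countDown.get(7);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("out of range");
        }
    }
}
```

The boxing of the int values mirrors the boxed-value-type overhead discussed below for the .NET ArrayList.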

While the ArrayList provides added flexibility over the standard array, this flexibility comes at the cost of performance, especially when storing value types in an ArrayList. Recall that an array of a value type (such as a System.Int32, System.Double, System.Boolean, and so on) is stored contiguously in the managed heap in its unboxed form. The ArrayList's internal array, however, is an array of object references. Therefore, even if you have an ArrayList that stores nothing but value types, each ArrayList element is a reference to a boxed value type, as shown in Figure 3.

Figure 3. The ArrayList contains a contiguous block of object references.

The boxing and unboxing, along with the extra level of indirection that comes with using value types in an ArrayList, can hamper the performance of your application when using large ArrayLists with many reads and writes. As Figure 3 illustrates, the same memory layout occurs for reference types in both ArrayLists and arrays.

The ArrayList's self-redimensioning shouldn't cause any sort of performance degradation in comparison to an array. If you know the precise number of elements that need to be stored in the ArrayList, you can essentially turn off self-redimensioning by specifying the initial capacity in the ArrayList's constructor. If you don't know the precise size, even with an array you may have to redimension the array should the number of elements inserted exceed the array's size.

A classic computer science problem is determining how much new space to allocate when running out of space in some buffer. One option when redimensioning an array is to allocate just one more element in the resized array. That is, if the array is initially allocated with five elements, before the sixth element is inserted, the array is redimensioned to six elements. Clearly, this approach conserves the most memory, but can become costly, because each insert following the first redimensioning results in another redimensioning.

Another option, at the opposite end of the spectrum, is to make the redimensioned array 100 times larger than its current size. That is, if an array is initially allocated with five elements, before the sixth element is inserted, the array is redimensioned to 500 elements. Clearly, this approach greatly reduces the number of redimensionings that need to occur, but, if only a few more elements are added to the array, then hundreds of array elements are left unused, resulting in wasted space.

A tried and true compromise to this problem is to simply double the existing size of the array when free space becomes exhausted. So, for an array initially allocated with five elements, adding a sixth element would cause the array to be redimensioned to a size of 10. This is precisely the approach the ArrayList class takes, and, best of all, it is all performed for you.

The asymptotic running times of the ArrayList's operations are the same as those of the standard array. While the ArrayList does indeed have more overhead, especially when storing value types, the relationship between the number of elements in the ArrayList and the cost per operation is the same as for the standard array.
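The doubling strategy is easy to sketch by hand. The following minimal Java class is invented purely for illustration (it is not the real ArrayList source): it doubles its backing array whenever the buffer fills up, exactly mirroring the five-to-ten example above:

```java
import java.util.Arrays;

public class GrowableIntArray {
    private int[] data = new int[5];  // initial five-element allocation
    private int count = 0;

    // Append a value, doubling the backing array when it is full.
    public void add(int value) {
        if (count == data.length) {
            // Double the capacity rather than adding just one slot.
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[count++] = value;
    }

    public int size()     { return count; }
    public int capacity() { return data.length; }

    public static void main(String[] args) {
        GrowableIntArray a = new GrowableIntArray();
        for (int i = 0; i < 6; i++) {
            a.add(i);
        }
        // Adding a sixth element to a five-element buffer doubles it to 10.
        System.out.println(a.size() + " elements, capacity " + a.capacity());
    }
}
```

Because the capacity doubles each time, an element is copied only a bounded number of times on average, which is why the doubling scheme keeps the amortized cost per add constant.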

Conclusion
This article started our discussion on data structures by identifying why studying data structures is important, and by providing a means of analyzing the performance of data structures. This material is important to understand, as being able to analyze the running times of various data structure operations is a useful tool when deciding what data structure to use for a particular programming problem.

After studying how to analyze data structures, we turned to examining two of the most common data structures in the .NET Framework Base Class Library: the System.Array class and the System.Collections.ArrayList class. Arrays allow for a contiguous block of homogeneous types. Their main benefit is that they provide lightning-fast reading and writing of array elements. Their weak point lies in searching, as each and every element may potentially have to be visited (in an unsorted array). The ArrayList provides a more flexible, array-like data structure. Rather than enforcing homogeneous types, the ArrayList allows for heterogeneous types to be stored by using an array of objects. Furthermore, the ArrayList does not require explicit allocation and can gracefully grow as more elements are added.

In the next part of this article series we'll turn our attention to the Stack and Queue classes. We'll also look at associative arrays, which are arrays indexed by a string key as opposed to an integer value. Associative arrays are provided in the .NET Framework Base Class Library through the Hashtable class.

Scott Mitchell, author of five books and founder of 4GuysFromRolla.com, has been working with Microsoft Web technologies for the past five years. Scott works as an independent consultant, trainer, and writer, and recently completed his Masters degree in Computer Science at the University of California San Diego. He can be reached at mitchell@4guysfromrolla.com.

Stack (abstract data type)


In computer science, a stack is a last in, first out (LIFO) abstract data type and linear data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop, and stack top.

The push operation adds a new item to the top of the stack, or initializes the stack if it is empty. If the stack is full and does not contain enough space to accept the given item, the stack is considered to be in an overflow state. The pop operation removes an item from the top of the stack. A pop either reveals previously concealed items or results in an empty stack; if the stack is already empty, it goes into an underflow state (meaning no items are present in the stack to be removed). The stack top operation gets the data from the top-most position and returns it to the user without deleting it. The same underflow state can also occur in the stack top operation if the stack is empty.

A stack is a restricted data structure, because only a small number of operations are performed on it. The nature of the pop and push operations also means that stack elements have a natural order. Elements are removed from the stack in the reverse order to the order of their addition: therefore, the lower elements are those that have been on the stack the longest.[1]
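The three fundamental operations, including the overflow and underflow states, can be sketched with a fixed-capacity, array-backed stack. This is a minimal illustration with invented names, not a production implementation:

```java
public class ArrayStack {
    private final Object[] items;
    private int top = -1;   // index of the top element; -1 means empty

    public ArrayStack(int capacity) { items = new Object[capacity]; }

    // push: add a new item to the top, or signal overflow when full.
    public void push(Object item) {
        if (top == items.length - 1) {
            throw new IllegalStateException("overflow: stack is full");
        }
        items[++top] = item;
    }

    // pop: remove and return the top item, or signal underflow when empty.
    public Object pop() {
        if (top < 0) {
            throw new IllegalStateException("underflow: stack is empty");
        }
        return items[top--];
    }

    // stack top (peek): return the top item without removing it.
    public Object peek() {
        if (top < 0) {
            throw new IllegalStateException("underflow: stack is empty");
        }
        return items[top];
    }

    public static void main(String[] args) {
        ArrayStack s = new ArrayStack(3);
        s.push("a");
        s.push("b");
        System.out.println(s.peek()); // top item, not removed
        System.out.println(s.pop());  // LIFO: last pushed, first popped
        System.out.println(s.pop());
    }
}
```

Note how the pops come back in the reverse of the push order, which is the natural ordering the passage above describes.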

stack
class template

<stack>
LIFO stack

Stacks are a type of container adaptor, specifically designed to operate in a LIFO context (last-in, first-out), where elements are inserted and extracted only from the end of the container.

Stacks are implemented as container adaptors, which are classes that use an encapsulated object of a specific container class as its underlying container, providing a specific set of member functions to access its elements. Elements are pushed/popped from the "back" of the specific container, which is known as the top of the stack.

The underlying container may be any of the standard container class templates or some other specifically designed container class. The only requirement is that it supports the following operations:

back() push_back() pop_back()

Therefore, the standard container class templates vector, deque and list can be used. By default, if no container class is specified for a particular stack class, the standard container class template deque is used. In their implementation in the C++ Standard Template Library, stacks take two template parameters:

template < class T, class Container = deque<T> > class stack;


Where the template parameters have the following meanings:

T: Type of the elements.
Container: Type of the underlying container object used to store and access the elements.

In the reference for the stack member functions, these same names are assumed for the template parameters.

Member functions
(constructor): Construct stack (public member function)
empty: Test whether container is empty (public member function)
size: Return size (public member function)
top: Access next element (public member function)
push: Add element (public member function)
pop: Remove element (public member function)

java.util

Class Stack
java.lang.Object
  java.util.AbstractCollection
    java.util.AbstractList
      java.util.Vector
        java.util.Stack

All Implemented Interfaces: Cloneable, Collection, List, RandomAccess, Serializable

public class Stack
extends Vector

The Stack class represents a last-in-first-out (LIFO) stack of objects. It extends class Vector with five operations that allow a vector to be treated as a stack. The usual push and pop operations are provided, as well as a method to peek at the top item on the stack, a method to test for whether the stack is empty, and a method to search the stack for an item and discover how far it is from the top. When a stack is first created, it contains no items.

Since: JDK1.0
See Also: Serialized Form
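A brief usage sketch of the five stack operations described above; note that search returns a 1-based distance from the top of the stack:

```java
import java.util.Stack;

public class StackUsage {
    public static void main(String[] args) {
        Stack<String> stack = new Stack<>();
        stack.push("first");
        stack.push("second");
        stack.push("third");

        System.out.println(stack.peek());          // top item, not removed
        System.out.println(stack.pop());           // top item, removed
        System.out.println(stack.search("first")); // 1-based distance from the top
        System.out.println(stack.empty());         // two items remain
    }
}
```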

Field Summary
Fields inherited from class java.util.Vector
capacityIncrement, elementCount, elementData

Fields inherited from class java.util.AbstractList


modCount

Constructor Summary
Stack()

Creates an empty Stack.

Method Summary
boolean empty()
    Tests if this stack is empty.

Object peek()
    Looks at the object at the top of this stack without removing it from the stack.

Object pop()
    Removes the object at the top of this stack and returns that object as the value of this function.

Object push(Object item)
    Pushes an item onto the top of this stack.

int search(Object o)
    Returns the 1-based position where an object is on this stack.

Methods inherited from class java.util.Vector
add, add, addAll, addAll, addElement, capacity, clear, clone, contains, containsAll, copyInto, elementAt, elements, ensureCapacity, equals, firstElement, get, hashCode, indexOf, indexOf, insertElementAt, isEmpty, lastElement, lastIndexOf, lastIndexOf, remove, remove, removeAll, removeAllElements, removeElement, removeElementAt, removeRange, retainAll, set, setElementAt, setSize, size, subList, toArray, toArray, toString, trimToSize

Methods inherited from class java.util.AbstractList


iterator, listIterator, listIterator

Methods inherited from class java.lang.Object


finalize, getClass, notify, notifyAll, wait, wait, wait

Methods inherited from interface java.util.List


iterator, listIterator, listIterator

Constructor Detail
Stack
public Stack()

Creates an empty Stack.

Method Detail
push
public Object push(Object item)

Pushes an item onto the top of this stack. This has exactly the same effect as:
addElement(item)

Parameters: item - the item to be pushed onto this stack.
Returns: the item argument.
See Also: Vector.addElement(java.lang.Object)

pop
public Object pop()

Removes the object at the top of this stack and returns that object as the value of this function.
Returns: the object at the top of this stack (the last item of the Vector object).
Throws: EmptyStackException - if this stack is empty.
peek

public Object peek()

Looks at the object at the top of this stack without removing it from the stack.
Returns: the object at the top of this stack (the last item of the Vector object).
Throws: EmptyStackException - if this stack is empty.
empty
public boolean empty()

Tests if this stack is empty.
Returns: true if and only if this stack contains no items; false otherwise.
search
public int search(Object o)

Returns the 1-based position where an object is on this stack. If the object o occurs as an item in this stack, this method returns the distance from the top of the stack of the occurrence nearest the top of the stack; the topmost item on the stack is considered to be at distance 1. The equals method is used to compare o to the items in this stack.
Parameters: o - the desired object.
Returns: the 1-based position from the top of the stack where the object is located; the return value -1 indicates that the object is not on the stack.
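A short example exercising the methods documented above. The class name StackDemo and the sample strings are ours, not from the Javadoc, and we use the generic form of the class:

```java
import java.util.Stack;

public class StackDemo {
    public static void main(String[] args) {
        Stack<String> stack = new Stack<String>();
        stack.push("first");
        stack.push("second");
        stack.push("third");                  // "third" is now on top
        assert stack.peek().equals("third");  // peek does not remove the top item
        assert stack.search("first") == 3;    // 1-based distance from the top
        assert stack.pop().equals("third");   // pop removes and returns the top item
        assert !stack.empty();                // two items remain
        System.out.println(stack.pop() + " " + stack.pop());  // prints "second first"
    }
}
```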

stack

(1) In programming, a special type of data structure in which items are removed in the reverse order from that in which they are added, so the most recently added item is the first one removed. This is also called last-in, first-out (LIFO). Adding an item to a stack is called pushing. Removing an item from a stack is called popping. (2) In networking, short for protocol stack. (3) In Apple Computer's HyperCard software system, a stack is a collection of cards.

stack
(1) TCP/IP is frequently referred to as a "stack." This refers to the layers (TCP, IP, and sometimes others) through which all data passes at both client and server ends of a data exchange. A clear picture of layers similar to those of TCP/IP is provided in our description of OSI, the reference model of the layers involved in any network communication. The term "stack" is sometimes used to include utilities that support the layers of TCP/IP. The Netscape Handbook
says (and we quote): "To make a successful connection to the Internet, your PC needs application software such as Netscape plus a TCP/IP stack consisting of TCP/IP software, sockets software (Winsock dynamic link library), and hardware driver software

(packet drivers). Several popular TCP/IP stacks are available for Windows, including shareware stacks." (2) In programming, a stack is a data area or buffer used for storing requests that need to be handled. The IBM Dictionary of Computing says that a stack is always a push-down list, meaning that as new requests come in, they push down the old ones. Another way of looking at a pushdown list - or stack - is that the program always takes its next item to handle from the top of the stack. (This is unlike other arrangements such as "FIFO" or "first-in first-out.")

1.2 WHAT IS A STACK?

LIFO stacks, also known as "push down" stacks, are the conceptually simplest way of saving information in a temporary storage location for such common computer operations as mathematical expression evaluation and recursive subroutine calling.

1.2.1 Cafeteria tray example

As an example of how a stack works, consider a spring-loaded tray dispenser of the type often found in cafeterias. Let us say that each tray has a number engraved upon it. One tray at a time is loaded in from the top, each resting on the already loaded trays, with the spring compressing to make room for more trays as necessary. For example, in Figure 1.1, the trays numbered 42, 23, 2, and 9 are loaded onto the stack of trays with 42 loaded first and 9 loaded last.

Figure 1.1 -- An example of stack operation. The "Last In" tray is number 9. Thus, the "First Out" tray is also number 9. As customers remove trays from the top of the stack, the first tray removed is tray number 9, and the second is tray number 2. Let us say that at this point more trays were added. These trays would then have to come off the stack before the very first tray we loaded. After any sequence of pushes and pops of the stack of trays, tray 42 would still be on the bottom. The stack would be empty once again only after tray 42 had been popped from the top of the stack.
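The tray sequence above can be replayed in code. This is a sketch using java.util.ArrayDeque as the stack; the class choice and names are ours, not from the text:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TrayStack {
    public static void main(String[] args) {
        Deque<Integer> trays = new ArrayDeque<Integer>();
        // Load trays 42, 23, 2, and 9, with 42 first and 9 last (Figure 1.1).
        trays.push(42);
        trays.push(23);
        trays.push(2);
        trays.push(9);
        assert trays.pop() == 9;        // the first tray removed is the last one loaded
        assert trays.pop() == 2;        // then the next-to-last
        trays.push(77);                 // trays added later come off before earlier ones
        assert trays.pop() == 77;
        assert trays.peekLast() == 42;  // tray 42 stays on the bottom throughout
    }
}
```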

1.2.2 Example software implementations

LIFO stacks may be programmed into conventional computers in a number of ways. The most straightforward way is to allocate an array in memory, and keep a variable with the array index number of the topmost active element. Those programmers who value execution efficiency will refine this technique by allocating a block of memory locations and maintaining a pointer with the actual address of the top stack element. In either case, "pushing" a stack element refers to the act of allocating a new word on the stack and placing data into it. "Popping" the stack refers to the action of removing the top element from the stack and then returning the data value removed to the routine requesting the pop.

Stacks often are placed in the uppermost address regions of the machine. They usually grow from the highest memory location towards lower memory locations, allowing the maximum flexibility in the use of the memory between the end of program memory and the "top" of the stack. In our discussions, whether the stack grows "up" in memory or "down" in memory is largely irrelevant. The "top" element of the stack is the element that was last pushed and will be the first to be popped. The "bottom" element of the stack is the one that, when removed, will leave the stack empty.

A very important property of stacks is that, in their purest form, they only allow access to the top element in the data structure. We shall see later that this property has profound implications in the areas of program compactness, hardware simplicity and execution speed.

Stacks make excellent mechanisms for temporary storage of information within procedures. A primary reason for this is that they allow recursive invocations of procedures without risk of destroying data from previous invocations of the routine. They also support reentrant code. As an added advantage, stacks may be used to pass the parameters between these same procedures. Finally, they can conserve memory space by allowing different procedures to use the same memory space over and over again for temporary variable allocation, instead of reserving room within each procedure's memory for temporary variables.
There are other ways of creating stacks in software besides the array approach. Linked lists of elements may be used to allocate stack words, with elements of the stack not necessarily in any order with respect to actual memory addresses. Also, a software heap may be used to allocate stack space, although this begs the question, since heap management is itself a superset of stack management.
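A minimal sketch of the "array plus index of the topmost element" technique just described. All names are illustrative, and overflow/underflow checking is omitted for brevity:

```java
// Minimal array-based LIFO stack with a top-of-stack index, as described above.
// Class and method names are illustrative, not from the text.
public class IntStack {
    private final int[] data;
    private int top = -1;          // index of the topmost active element; -1 means empty

    public IntStack(int capacity) { data = new int[capacity]; }

    public boolean isEmpty() { return top == -1; }

    public void push(int value) {  // allocate a new word on the stack and place data in it
        data[++top] = value;
    }

    public int pop() {             // remove the top element and return the value removed
        return data[top--];
    }
}
```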

1.2.3 Example hardware implementations

Hardware implementation of stacks has the obvious advantage that it can be much faster than software management. In machines that refer to the stack with a large percentage of instructions, this increased efficiency is vital to maintaining high system performance. While any software method of handling stacks can be implemented in hardware, the generally practiced hardware implementation is to reserve contiguous locations of memory with a stack pointer into that memory. Usually the pointer is a dedicated hardware register that can be incremented or decremented as required to push and pop elements. Sometimes a capability is provided to add an offset to the stack pointer to nondestructively access the first few elements of the stack without requiring successive pop operations.

Often the stack is resident in the same memory devices as the program. Sometimes, in the interest of increased efficiency, the stacks reside in their own memory devices.

Another approach that may be taken to building stacks in hardware is to use large shift registers. Each shift register is a long chain of registers with one end of the chain being visible as a single bit at the top of stack. 32 such shift registers of N bits each may be placed side-by-side to form a 32-bit-wide by N-element stack. While this approach has not been practical in the past, VLSI stack machines may find this a viable alternative to the conventional register-pointing-into-memory implementation.

4.3 STACKS AND QUEUES


This section under major construction.

Stacks and queues. In this section, we introduce two closely related data types for manipulating arbitrarily large collections of objects: the stack and the queue. Each is defined by two basic operations: insert a new item, and remove an item. When we insert an item, our intent is clear. But when we remove an item, which one do we choose? The rule used for a queue is to always remove the item that has been in the collection the most amount of time. This policy is known as first-in first-out, or FIFO. The rule used for a stack is to always remove the item that has been in the collection the least amount of time. This policy is known as last-in first-out, or LIFO.

Pushdown stacks. A pushdown stack (or just a stack) is a collection that is based on the last-in first-out (LIFO) policy. When you click a hyperlink, your browser displays the new page (and inserts it onto a stack). You can keep clicking on hyperlinks to visit new pages. You can always revisit the previous page by clicking the back button (removing it from the stack). The last-in first-out policy offered by a pushdown stack provides just the behavior that you expect.

By tradition, we name the stack insert method push() and the stack remove operation pop(). We also include a method to test whether the stack is empty. The following API summarizes the operations:

The asterisk indicates that we will be considering more than one implementation of this API.

Array implementation. Representing stacks with arrays is a natural idea. The first problem that you might encounter is implementing the constructor ArrayStackOfStrings(). An instance variable a[] with an array of strings to hold the stack items is clearly needed, but how big should it be? For the moment, we will finesse this problem by having the client provide an argument for the constructor that gives the maximum stack size. We keep the items in the order of their arrival. This policy allows us to add and remove items at the end without moving any of the other items in the stack.

We could hardly hope for a simpler implementation of ArrayStackOfStrings.java: all of the methods are one-liners! The instance variables are an array a[] that holds the items in the stack and an integer N that counts the number of items in the stack. To remove an item, we decrement N and then return a[N]; to insert a new item, we set a[N] equal to the new item and then increment N. These operations preserve the following properties:

o the items in the array are in their insertion order
o the stack is empty when the value of N is 0
o the top of the stack (if it is nonempty) is at a[N-1]
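A sketch of the implementation just described, with the one-liner push and pop the text mentions; error checking for overflow and underflow is omitted:

```java
// Sketch of the fixed-capacity stack of strings described above:
// push sets a[N] and increments N; pop decrements N and returns a[N].
public class ArrayStackOfStrings {
    private final String[] a;   // holds the items
    private int N = 0;          // number of items on the stack

    public ArrayStackOfStrings(int max) { a = new String[max]; }
    public boolean isEmpty()            { return N == 0; }
    public void push(String item)       { a[N++] = item; }
    public String pop()                 { return a[--N]; }
}
```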

The primary characteristic of this implementation is that the push and pop operations take constant time. The drawback of this implementation is that it requires the client to estimate the maximum size of the stack ahead of time and always uses space proportional to that maximum, which may be unreasonable in some situations.

Linked lists. For classes such as stacks that implement collections of objects, an important objective is to ensure that the amount of space used is always proportional to the number of items in the collection. Now we consider the use of a fundamental data structure known as a linked list that can provide implementations of collections (and, in particular, stacks) that achieve this important objective. A linked list is a recursive data structure defined as follows: a linked list is either empty (null) or a reference to a node having a reference to a linked list. The node in this definition is an abstract entity that might hold any kind of data, in addition to the node reference that characterizes its role in building linked lists. With object-oriented programming, implementing linked lists is not difficult. We start with a simple example of a class for the node abstraction:

class Node {
    String item;
    Node next;
}

A Node has two instance variables: a String and a Node. The String is a placeholder in this example for any data that we might want to structure with a linked list (we can use any set of instance variables); the instance variable of type Node characterizes the linked nature of the data structure. Now, from the recursive definition, we can represent a linked list by a variable of type Node just by ensuring that its value is either null or a reference to a Node whose next field is a reference to a linked list. We create an object of type Node by invoking its (no-argument) constructor. This creates a reference to a Node object whose instance variables are both initialized to the value null.
For example, to build a linked list that contains the items "to", "be", and "or", we create a Node for each item:

Node first = new Node();
Node second = new Node();
Node third = new Node();

and set the item field in each of the nodes to the desired item value:

first.item = "to";
second.item = "be";
third.item = "or";

and set the next fields to build the linked list:

first.next = second;
second.next = third;
third.next = null;

When tracing code that uses linked lists and other linked structures, we use a visual representation of the changes:

o we draw a rectangle to represent each object
o we put the values of instance variables within the rectangle
o we depict references as arrows that point to the referenced object

This visual representation captures the essential characteristic of linked lists and allows us to focus on the links.

Insert. Suppose that you want to insert a new node into a linked list. The easiest place to do so is at the beginning of the list. For example, to insert the string "not" at the beginning of a given linked list whose first node is first, we save first in oldfirst, assign to first a new Node, and assign its item field to "not" and its next field to oldfirst.

Remove. Suppose that you want to remove the first node from a list. This operation is even easier: simply assign to first the value first.next. Normally, you would retrieve the value of the item (by assigning it to some String variable) before doing this assignment, because once you change the value of first, you may not have any access to the node to which it was referring. Typically, the node object becomes an orphan, and the memory it occupies is eventually reclaimed by the Java memory management system.
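The insert and remove steps just described, written out as runnable code. The wrapper class LinkedListOps is ours; Node is the class from the text:

```java
public class LinkedListOps {
    static class Node { String item; Node next; }

    public static void main(String[] args) {
        // Build the list "to" -> "be" as in the text.
        Node first = new Node();
        first.item = "to";
        Node second = new Node();
        second.item = "be";
        first.next = second;

        // Insert "not" at the beginning: save first, make a new node, link it in.
        Node oldfirst = first;
        first = new Node();
        first.item = "not";
        first.next = oldfirst;
        assert first.item.equals("not") && first.next.item.equals("to");

        // Remove the first node: retrieve its item, then advance first.
        String item = first.item;
        first = first.next;      // the old first node becomes an orphan
        assert item.equals("not") && first.item.equals("to");
    }
}
```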

These two operations take constant time (independent of the length of the list).

Implementing stacks with linked lists. Program LinkedStackOfStrings.java uses a linked list to implement a stack of strings. The implementation is based on a nested class Node like the one we have been using. Java allows us to define and use other classes within class implementations in this natural way. The class is private because clients do not need to know any of the details of the linked lists.

List traversal. One of the most common operations we perform on collections is to iterate through the items in the collection. For example, we might wish to implement the toString() method to facilitate debugging our stack code with traces. For ArrayStackOfStrings, this implementation is familiar:

public String toString() {
    String s = "";
    for (int i = 0; i < N; i++)
        s += a[i] + " ";
    return s;
}

As usual, this solution is intended for use only when N is small: it takes quadratic time because string concatenation takes linear time. Our focus now is just on the process of examining every item. There is a corresponding idiom for visiting the items in a linked list: we initialize a loop index variable x that references the first Node of the linked list. Then, we find the value of the item associated with x by accessing x.item, and then update x to refer to the next Node in the linked list by assigning to it the value of x.next, repeating this process until x is null (which indicates that we have reached the end of the linked list). This process is known as traversing the list, and is succinctly expressed in this implementation of toString() for LinkedStackOfStrings:

public String toString() {
    String s = "";
    for (Node x = first; x != null; x = x.next)
        s += x.item + " ";
    return s;
}

Array doubling. Next, we consider an approach to accommodating arbitrary growth and shrinkage in a data structure that is an attractive alternative to linked lists. As with linked lists, the idea is to modify the array implementation to dynamically adjust the size of the array a[] so that it is (i) sufficiently large to hold all of the items and (ii) not so large as to waste an excessive amount of space. Program DoublingStackOfStrings.java is a modification of ArrayStackOfStrings.java that achieves these objectives.

First, in push(), we check whether the array is too small. In particular, we check whether there is room for the new item in the array by checking whether the stack size N is equal to the array size a.length. If not, we just insert the new item with a[N++] = item as before; if so, we double the size of the array by creating a new array of twice the size, copying the stack items to the new array, and resetting the a[] instance variable to reference the new array. Similarly, in pop(), we begin by checking whether the array is too large, and we halve its size if that is the case.
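A sketch of the doubling strategy just described. The text does not give the exact shrink threshold, so we assume the common choice of halving the array when the stack becomes one-quarter full; error checking is omitted:

```java
// Sketch of the resizing-array stack described above: grow the array when it is
// full, and shrink it when the stack drops to one-quarter of the array size.
public class DoublingStackOfStrings {
    private String[] a = new String[1];
    private int N = 0;

    private void resize(int capacity) {           // move the stack to a new array
        String[] temp = new String[capacity];
        for (int i = 0; i < N; i++) temp[i] = a[i];
        a = temp;
    }

    public boolean isEmpty() { return N == 0; }

    public void push(String item) {
        if (N == a.length) resize(2 * a.length);  // no room: double the array size
        a[N++] = item;
    }

    public String pop() {
        String item = a[--N];
        a[N] = null;                              // let the garbage collector reclaim it
        if (N > 0 && N == a.length / 4) resize(a.length / 2);
        return item;
    }
}
```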

Parameterized data types. We have developed one stack implementation that allows us to build a stack of one particular type (String). In other applications we might need a stack of integers or a stack of oranges or a queue of customers.

Create a stack of Objects. We could develop one stack implementation StackOfObjects.java whose elements are of type Object. Using inheritance, we can insert an object of any type. However, when we pop it, we must cast it back to the appropriate type. This approach can expose us to subtle bugs in our programs that cannot be detected until runtime. For example, there is nothing to stop a programmer from putting different types of objects on the same stack, then encountering a runtime type-checking error, as in the following example:

StackOfObjects stack = new StackOfObjects();
Apple  a = new Apple();
Orange b = new Orange();
stack.push(a);
stack.push(b);
a = (Apple) (stack.pop());   // throws a ClassCastException
b = (Orange) (stack.pop());

This toy example illustrates a basic problem. When we use type casting with an implementation such as Stack for different types of items, we are assuming that clients will cast objects popped from the stack to the proper type. This implicit assumption contradicts our requirement for ADTs that operations are to be accessed only through an explicit interface. One reason that programmers use precisely defined ADTs is to protect future clients against errors that arise from such implicit assumptions. The code cannot be type-checked at compile time: there might be an incorrect cast that occurs in a complex piece of code that could escape detection until some particular runtime circumstance arises. Such an error is to be avoided at all costs because it could happen long after an implementation is delivered to a client, who would have no way to fix it.

Java generics. We use Java generics to limit the objects on a stack or queue to all be of the same type within a given application. The primary benefit is to discover type mismatch errors at compile time instead of run time. This involves a small bit of new Java syntax. We name the generic class Stack. It is identical to StackOfStrings except that we replace every occurrence of String with Item and declare the class as follows:

public class Stack<Item>

Program Stack.java implements a generic stack using this approach. The client code looks like this:

Stack<Apple> stack = new Stack<Apple>();
Apple  a = new Apple();
Orange b = new Orange();
stack.push(a);
stack.push(b);   // compile-time error

Program DoublingStack.java implements a generic stack using an array. For technical reasons, one cast is needed when allocating the array of generics.
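A sketch of a generic linked-list stack in the style of the Stack.java described above; error checking for an empty stack is omitted. (For the array-based version, the one cast mentioned above is typically written a = (Item[]) new Object[capacity];.)

```java
// Sketch of a generic linked-list stack: push and pop at the front of the list.
public class Stack<Item> {
    private Node first = null;        // top of stack (most recently added node)

    private class Node { Item item; Node next; }

    public boolean isEmpty() { return first == null; }

    public void push(Item item) {     // insert a new node at the beginning
        Node oldfirst = first;
        first = new Node();
        first.item = item;
        first.next = oldfirst;
    }

    public Item pop() {               // remove the first node (assumes nonempty)
        Item item = first.item;
        first = first.next;
        return item;
    }
}
```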

Autoboxing. We have designed our stacks so that they can store any generic object type. We now describe the Java language feature, known as auto-boxing and auto-unboxing, that enables us to reuse the same code with primitive types as well. Associated with each primitive type, e.g. int, is a full-blown object type, e.g. Integer. When we assign a primitive to the corresponding object type (or vice versa), Java automatically performs the transformation. This enables us to write code like the following:

Stack<Integer> stack = new Stack<Integer>();
stack.push(17);        // auto-boxing (converts int to Integer)
int a = stack.pop();   // auto-unboxing (converts Integer to int)

The value 17 is automatically cast to be of type Integer when we pass it to the push() method. The pop() method returns an Integer, and this value is cast to an int when we assign it to the variable a. We should be aware of what is going on behind the scenes since this can affect performance. Java supplies built-in wrapper types for all of the primitive types: Boolean, Byte, Character, Double, Float, Integer, Long, and Short. These classes consist primarily of static methods (e.g., Integer.parseInt(), Integer.reverse()), but they also include some non-static methods (compareTo(), equals(), doubleValue()).

Queue. A queue supports the insert and remove operations using a FIFO discipline. By convention, we name the queue insert operation enqueue and the remove operation dequeue. As examples, consider cars entering the Lincoln Tunnel, or a student's tasks that must be completed: put them on a queue, and do the tasks in the same order that they arrive.

public class Queue<Item> {
    public boolean isEmpty();
    public void enqueue(Item item);
    public Item dequeue();
}

Linked list implementation. Program Queue.java implements a FIFO queue of strings using a linked list. Like Stack, we maintain a reference first to the least-recently added Node on the queue. For efficiency, we also maintain a reference last to the most-recently added Node on the queue.
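A sketch of the linked-list queue along these lines: first references the least-recently added node (where we dequeue), and last references the most-recently added node (where we enqueue). Error checking for an empty queue is omitted:

```java
// Sketch of a linked-list FIFO queue: dequeue from the front, enqueue at the back.
public class Queue<Item> {
    private Node first = null;   // least-recently added node (dequeue from here)
    private Node last = null;    // most-recently added node (enqueue here)

    private class Node { Item item; Node next; }

    public boolean isEmpty() { return first == null; }

    public void enqueue(Item item) {
        Node oldlast = last;
        last = new Node();
        last.item = item;
        if (isEmpty()) first = last;       // the queue was empty: new node is both ends
        else           oldlast.next = last;
    }

    public Item dequeue() {                // assumes the queue is nonempty
        Item item = first.item;
        first = first.next;
        if (isEmpty()) last = null;        // avoid a stale reference to the removed node
        return item;
    }
}
```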

Array implementation. Similar to the array implementation of the stack, but a little trickier since the indices need to wrap around. Program DoublingQueue.java implements the queue interface. The array is dynamically resized using repeated doubling.
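A sketch of the wrap-around idea just mentioned. For brevity this version has a fixed capacity; DoublingQueue.java additionally resizes the array. All names are ours:

```java
// Sketch of a wrap-around (circular) array queue with a fixed capacity.
public class RingQueue {
    private final String[] a;
    private int first = 0;   // index of the least-recently added item
    private int N = 0;       // number of items in the queue

    public RingQueue(int capacity) { a = new String[capacity]; }

    public boolean isEmpty() { return N == 0; }

    public void enqueue(String item) {      // assumes the queue is not full
        a[(first + N) % a.length] = item;   // wrap past the end of the array
        N++;
    }

    public String dequeue() {               // assumes the queue is nonempty
        String item = a[first];
        a[first] = null;                    // avoid loitering
        first = (first + 1) % a.length;     // wrap the front index as well
        N--;
        return item;
    }
}
```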

Iteration. Sometimes the client needs to access all of the items of a collection, one at a time, without deleting them. To maintain encapsulation, we do not want to reveal the internal representation of the queue (array or linked list) to the client. "Decouple the thing that needs to traverse the list from the details of getting each element from it." We solve this design challenge by using Java's java.util.Iterator interface:

public interface Iterator<Item> {
    boolean hasNext();
    Item next();
    void remove();   // optional
}

That is, any data type that implements the Iterator interface promises to implement two methods: hasNext() and next(). The client uses these methods to access the list elements one at a time using the following idiom:

Queue<String> queue = new Queue<String>();
...
Iterator<String> i = queue.iterator();
while (i.hasNext()) {
    String s = i.next();
    StdOut.println(s);
}

Queue iterator in Java. Queue.java illustrates how to implement an Iterator when the items are stored in a linked list.

public Iterator iterator() { return new QueueIterator(); }

private class QueueIterator implements Iterator<Item> {
    Node current = first;

    public boolean hasNext() { return current != null; }

    public Item next() {
        Item item = current.item;
        current = current.next;
        return item;
    }
}

It relies on a private nested class QueueIterator that implements the Iterator interface. The method iterator() creates an instance of type QueueIterator and returns it as an Iterator. This enforces the iteration abstraction, since the client will only access the items through the hasNext() and next() methods. The client has no access to the internals of the Queue or even the QueueIterator. It is the client's responsibility to only add elements to the list when no iterator is in action.

Enhanced for loop. Iteration is such a useful abstraction that Java provides compact syntax (known as the enhanced for loop) to iterate over the elements of a collection (or array). The following two loops are equivalent:

Iterator<String> i = queue.iterator();
while (i.hasNext()) {
    String s = i.next();
    StdOut.println(s);
}

for (String s : queue)
    StdOut.println(s);

To take advantage of Java's enhanced foreach syntax, the data type must implement Java's Iterable interface:

public interface Iterable<Item> {
    Iterator<Item> iterator();
}

That is, the data type must implement a method named iterator() that returns an Iterator to the underlying collection. Since our Queue ADT now includes such a method, we simply need to declare it as implementing the Iterable interface and we are ready to use the foreach notation:

public class Queue<Item> implements Iterable<Item>

Stack and queue applications. Stacks and queues have numerous useful applications.

Queue applications: serving requests of a single shared resource (printer, disk, CPU); transferring data asynchronously (data not necessarily received at the same rate as it is sent) between two processes (I/O buffers), e.g., pipes, file I/O, sockets; buffers on MP3 players and portable CD players; the iPod playlist; the playlist for a jukebox (add songs to the end, play from the front of the list). Interrupt handling:

When programming a real-time system that can be interrupted (e.g., by a mouse click or wireless connection), it is necessary to attend to the interrupts immediately, before proceeding with the current activity. If the interrupts should be handled in the same order they arrive, then a FIFO queue is the appropriate data structure.

Arithmetic expression evaluation. Program Evaluate.java evaluates a fully parenthesized arithmetic expression. An important application of stacks is in parsing. For example, a compiler must parse arithmetic expressions written using infix notation. For example, the following infix expression evaluates to 212:

( 2 + ( ( 3 + 4 ) * ( 5 * 6 ) ) )

We break the problem of parsing infix expressions into two stages. First, we convert from infix to a different representation called postfix. Then we parse the postfix expression, which is a somewhat easier problem than directly parsing infix.

o Evaluating a postfix expression. A postfix expression is one in which each operator appears after its operands:

2 3 4 + 5 6 * * +

First, we describe how to parse and evaluate a postfix expression. We read the tokens in one at a time. If it is an integer, push it on the stack; if it is a binary operator, pop the top two elements from the stack, apply the operator to the two elements, and push the result back on the stack. Program Postfix.java reads in and evaluates postfix expressions using this algorithm.

o Converting from infix to postfix. Now, we describe how to convert from infix to postfix. We read in the tokens one at a time. If it is an operator, we push it on the stack; if it is an integer, we print it out; if it is a right parenthesis, we pop the topmost element from the stack and print it out; if it is a left parenthesis, we ignore it. Program Infix.java reads in an infix expression, and uses a stack to output an equivalent postfix expression using the algorithm described above. Relate back to the parse tree example in Section 4.3.
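The postfix-evaluation algorithm just described can be sketched as follows. This simplified version (ours, not the booksite's Postfix.java) handles only integer tokens and the + and * operators used in the example:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of postfix evaluation: push integers; for each operator, pop two
// operands, apply the operator, and push the result back on the stack.
public class Postfix {
    public static int eval(String expr) {
        Deque<Integer> stack = new ArrayDeque<Integer>();
        for (String token : expr.trim().split("\\s+")) {
            if (token.equals("+"))      stack.push(stack.pop() + stack.pop());
            else if (token.equals("*")) stack.push(stack.pop() * stack.pop());
            else                        stack.push(Integer.parseInt(token));
        }
        return stack.pop();   // the single remaining value is the answer
    }

    public static void main(String[] args) {
        System.out.println(eval("2 3 4 + 5 6 * * +"));  // the example from the text: 212
    }
}
```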

Function calls. Perhaps the most important application of stacks is to implement function calls. Most compilers implement function calls by using a stack. This also provides a technique for eliminating recursion from a program: instead of calling a function recursively, the programmer uses a stack to simulate the function calls in the same way that the compiler would have done so. Conversely, we can often use recursion instead of using an explicit stack. Programming languages have built-in support for stacks (recursion), but no analogous mechanism for dealing with queues. The PostScript and FORTH programming languages are stack based, Java bytecode is interpreted on a (virtual) stack-based processor, and the Microsoft Intermediate Language (MSIL) that .NET applications are compiled to is also stack based.

M/M/1 queue. The Markov/Markov/Single-Server model is a fundamental queueing model in operations research and probability theory. Tasks arrive according to a Poisson process at a certain rate λ. This means that, on average, λ customers arrive per hour. More specifically, the interarrival times follow an exponential distribution with mean 1/λ: the probability of k arrivals between time 0 and t is (λt)^k e^(-λt) / k!. Tasks are serviced in FIFO order according to a Poisson process with rate μ. The two M's stand for Markov: the system is memoryless; the times between arrivals are independent, and the times between departures are independent.

Analysis of the M/M/1 model. We are interested in understanding the behavior of the queueing system. If λ > μ, the queue size increases without limit. For simple models like M/M/1 we can analyze these quantities analytically using probability theory. Assuming μ > λ, the probability of exactly n customers in the system is (λ/μ)^n (1 - λ/μ).

o L = average number of customers in the system = λ / (μ - λ).
o LQ = average number of customers in the queue = λ^2 / (μ (μ - λ)).
o W = average time a customer spends in the system = 1 / (μ - λ).
o WQ = average time a customer spends in the queue = W - 1/μ.

Program MM1Queue.java simulates an M/M/1 queue; for more complex models we need to resort to simulation like this. Variants: multiple queues, multiple servers, sequential multi-stage servers, using a finite queue and measuring the number of customers that are turned away. Applications: customers in McDonald's, packets in an Internet router. Little's law asserts that the average number of customers in a (stable) queueing system equals the average arrival rate times their average time in the system. But the variance of customer waiting times satisfies Var(FIFO) < Var(SIRO) < Var(LIFO). The distribution of the number of customers in the system does not depend on the queueing discipline (so long as it is independent of their service times); the same holds for the expected waiting time.

M/D/1 queue. Program MD1Queue.java is similar, but the service occurs at a fixed rate (rather than at random).

Load balancing. Write a program LoadBalance.java that performs a load-balancing simulation.

Q + A.
Q. When do I use new with Node? A. Just as with any other class, you should only use new when you want to create a new Node object (a new element in the linked list). You should not use new to create a new reference to an existing Node object. For example, the code Node oldfirst = new Node(); oldfirst = first; creates a new Node object, then immediately loses track of the only reference to it. This code does not result in an error, but it is a bit untidy to create orphans for no reason. Q. Why declare Node as a nested class? Why private? A. By declaring the nested class Node to be private, we restrict access to methods within the enclosing class. One characteristic of a private nested class is that its instance variables can be directly accessed from within the enclosing class, but nowhere else, so there is no need to declare them public or private. Note: A nested class that is not static is known as

an inner class, so technically our Node classes are inner classes, though the ones that are not generic could be static. Q. Why does javac LinkedStackOfStrings.java create a file LinkedStackOfStrings$Node.class as well as LinkedStackOfStrings.class? A. That file is for the nested class Node. Java's naming convention is to use $ to separate the name of the outer class from that of the nested class. Q. Should a client be allowed to insert null items onto a stack or queue? A. This question arises frequently when implementing collections in Java. Our implementations (and Java's stack and queue libraries) do permit the insertion of null values. Q. Are there Java libraries for stacks and queues? A. Yes and no. Java has a built-in library called java.util.Stack, but you should avoid using it when you want a stack. It has several additional operations that are not normally associated with a stack, e.g., getting the ith element. It also allows adding an element to the bottom of the stack (instead of the top), so it can implement a queue! Although having such extra operations may appear to be a bonus, it is actually a curse. We use data types not because they provide every available operation, but rather because they allow us to precisely specify the operations we need. The prime benefit of doing so is that the system can prevent us from performing operations that we do not actually want. The java.util.Stack API is an example of a wide interface, which we generally strive to avoid. Q. I want to use an array representation for a generic stack, but code like the following will not compile. What is the problem? private Item[] a = new Item[max]; A. Good try. Unfortunately, creating arrays of generics is not allowed in Java 1.5. Experts are still vigorously debating this decision. As usual, complaining too loudly about a language feature puts you on the slippery slope towards becoming a language designer.
There is a way out, using a cast: you can write private Item[] a = (Item[]) new Object[max]; The underlying cause is that arrays in Java are covariant, but generics are not. In other words, String[] is a subtype of Object[], but Stack<String> is not a subtype of Stack<Object>. To get around this defect, you need to perform an unchecked cast as in DoublingStack.java. Many programmers consider covariant arrays to be a defect in Java's type system (and this resulted in the need for "reifiable types" and "type erasure"). However, in a world without generics, covariant arrays are useful, e.g., to implement Arrays.sort(Comparable[]) and have it be callable with an input array of type String[]. Q. Can I use the foreach construction with arrays? A. Yes (even though arrays do not implement the Iterator interface). The following prints out the command-line arguments: public static void main(String[] args) { for (String s : args) StdOut.println(s); }
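To see the cast in context, here is a minimal fixed-capacity generic stack sketch (the class name is illustrative; compare the DoublingStack.java mentioned above):

```java
// Minimal fixed-capacity generic stack illustrating the unchecked cast:
// Java forbids 'new Item[max]', so we allocate an Object[] and cast it.
public class FixedCapacityStack<Item> {
    private Item[] a;   // the items, a[0..n-1]
    private int n;      // number of items on the stack

    @SuppressWarnings("unchecked")
    public FixedCapacityStack(int max) {
        a = (Item[]) new Object[max];   // the unchecked cast discussed above
    }

    public boolean isEmpty()      { return n == 0; }
    public void push(Item item)   { a[n++] = item; }
    public Item pop()             { return a[--n]; }
}
```

The @SuppressWarnings annotation acknowledges the single unavoidable unchecked cast; the rest of the class is fully type-checked.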

Q. Is iterating over a linked list more efficient with a loop or with recursion? A. An optimizing compiler will likely translate a tail-recursive function into the equivalent loop, so there may be no observable performance overhead of using recursion. Q. How does auto-boxing handle the following code fragment? Integer a = null; int b = a; A. It results in a run-time error. Primitive types can store every value of their corresponding wrapper type except null. Q. Why does the first group of statements print true, but the second two print false? Integer a1 = 100; Integer a2 = 100; System.out.println(a1 == a2); // true Integer b1 = new Integer(100); Integer b2 = new Integer(100); System.out.println(b1 == b2); // false Integer c1 = 150; Integer c2 = 150; System.out.println(c1 == c2); // false
A. The second prints false because b1 and b2 are references to different Integer objects. The first and third code fragments rely on autoboxing. Surprisingly, the first prints true because values between -128 and 127 appear to refer to the same immutable Integer objects (presumably there is a pool of them that are reused), while Java creates new objects for each integer outside this range. Lesson: as usual, don't use == to compare whether two objects have the same value. Q. Are generics solely for auto-casting? A. No, but this will be the only thing we use them for. This is known as "pure generics" or "concrete parameterized types." Concrete parameterized types work almost like normal types, with a few exceptions (array creation, exception handling, use with instanceof, and use in a class literal). More advanced uses of generics, including "wildcards", are useful for handling subtypes and inheritance. Here is a generics tutorial. Q. Why do I get an incompatible types compile-time error with the following code? Stack stack = new Stack<String>(); stack.push("Hello"); String s = stack.pop(); A. You forgot to specify the concrete type when declaring stack. It should be Stack<String>. Q. Why do I get a uses unchecked or unsafe operations compile-time warning with the following code? Stack<String> stack = new Stack(); stack.push("Hello"); String s = stack.pop();

A. You forgot to specify the concrete type when calling the constructor. It should be new Stack<String>().

Exercises
1. Add a method isFull() to ArrayStackOfStrings.java.
2. Give the output printed by java ArrayStackOfStrings 5 for the input it was - the best - of times - - - it was - the -
3. Suppose that a client performs an intermixed sequence of (stack) push and pop operations. The push operations put the integers 0 through 9 in order onto the stack; the pop operations print out the return value. Which of the following sequence(s) could not occur?
(a) 4 3 2 1 0 9 8 7 6 5
(b) 4 6 8 7 5 3 2 9 0 1
(c) 2 5 6 7 4 8 9 3 1 0
(d) 4 3 2 1 0 5 6 7 8 9
(e) 1 2 3 4 5 6 9 8 7 0
(f) 0 4 6 5 3 8 1 7 2 9
(g) 1 4 7 9 8 6 5 3 0 2
(h) 2 1 4 3 6 5 8 7 9 0

4. Write a stack client Reverse.java that reads in strings from standard input and prints them in reverse order. 5. Assume that standard input has some unknown number N of double values. Write a method that reads all the values and returns an array of length N containing them, in the order they appear on standard input. 6. Write a stack client Parentheses.java that reads in a text stream from standard input and uses a stack to determine whether its parentheses are properly balanced. For example, your program should print true for [()]{}{[()()]()} and false for [(]). Hint: Use a stack. 7. What does the following code fragment print when N is 50? Give a high-level description of what the code fragment does when presented with a positive integer N. Stack<Integer> stack = new Stack<Integer>(); while (N > 0) { stack.push(N % 2); N = N / 2; } while (!stack.isEmpty()) StdOut.print(stack.pop()); StdOut.println(); Answer: prints the binary representation of N (110010 when N is 50). 8. What does the following code fragment do to the queue q? Stack stack = new Stack();

while (!q.isEmpty()) stack.push(q.dequeue()); while (!stack.isEmpty()) q.enqueue(stack.pop()); 9. Add a method peek() to Stack.java that returns the most recently inserted element on the stack (without popping it). 10. Give the contents and size of the array for DoublingStackOfStrings with the input it was - the best - of times - - - it was - the - 11. Add a method length() to Queue.java that returns the number of elements on the queue. Hint: Make sure that your method takes constant time by maintaining an instance variable N that you initialize to 0, increment in enqueue(), decrement in dequeue(), and return in length(). 12. Draw a memory usage diagram in the style of the diagrams in Section 4.1 for the three-node example used to introduce linked lists in this section. 13. Write a program that takes from standard input an expression without left parentheses and prints the equivalent infix expression with the parentheses inserted. For example, given the input 1 + 2 ) * 3 - 4 ) * 5 - 6 ) ) ) your program should print ( ( 1 + 2 ) * ( ( 3 - 4 ) * ( 5 - 6 ) ) ) 14. Write a filter InfixToPostfix.java that converts an arithmetic expression from infix to postfix. 15. Write a program EvaluatePostfix.java that takes a postfix expression from standard input, evaluates it, and prints the value. (Piping the output of your program from the previous exercise to this program gives equivalent behavior to Evaluate.java.) 16. Suppose that a client performs an intermixed sequence of (queue) enqueue and dequeue operations. The enqueue operations put the integers 0 through 9 in order onto the queue; the dequeue operations print out the return value. Which of the following sequence(s) could not occur?
(a) 0 1 2 3 4 5 6 7 8 9
(b) 4 6 8 7 5 3 2 9 0 1
(c) 2 5 6 7 4 8 9 3 1 0
(d) 4 3 2 1 0 5 6 7 8 9

17. Write an iterable Stack client that has a static method copy() that takes a stack of strings as argument and returns a copy of the stack. Note: This ability is a prime example of the value of having an iterator, because it allows development of such functionality without changing the basic API. 18. Develop a class DoublingQueueOfStrings.java that implements the queue abstraction with a fixed-size array, and then extend your implementation to use array doubling to remove the size restriction. 19. Write a Queue.java client that takes a command-line argument k and prints the kth-from-the-last string found on standard input.

20. (For the mathematically inclined.) Prove that the array in DoublingStackOfStrings.java is never less than one-quarter full. Then prove that, for any DoublingStackOfStrings client, the total cost of all of the stack operations divided by the number of operations is a constant. 21. Modify MD1Queue.java to make a program MM1Queue.java that simulates a queue for which both the arrival and service times are Poisson processes. Verify Little's law for this model. 22. Develop a class StackOfInts.java that uses a linked-list representation (but no generics). Write a client that compares the performance of your implementation with Stack<Integer> to determine the performance penalty from autoboxing on your system.

Linked List Exercises


23. Write a method delete() that takes an int argument k and deletes the kth element in a linked list, if it exists. Solution. // we assume that first is a reference to the first Node in the list public void delete(int k) { if (k <= 0) throw new RuntimeException("Invalid value of k"); // degenerate case - empty linked list if (first == null) return; // special case - removing the first node if (k == 1) { first = first.next; return; } // general case, make temp point to the (k-1)st node Node temp = first; for (int i = 1; i < k; i++) { temp = temp.next; if (temp == null) return; // list has < k nodes } if (temp.next == null) return; // list has < k nodes

// change temp.next to skip kth node temp.next = temp.next.next; } 24. Write a method find() that takes a linked list and a string key as arguments and returns true if some node in the list has key as its item field, false otherwise. 25. Suppose x is a linked-list node. What is the effect of the following code fragment? x.next = x.next.next; Answer: Deletes from the list the node immediately following x.

26. Suppose that x is a linked-list node. What is the effect of the following code fragment? t.next = x.next; x.next = t; Answer: Inserts node t immediately after node x. 27. Why does the following code fragment not have the same effect as in the previous question? x.next = t; t.next = x.next; Answer: When it comes time to update t.next, x.next is no longer the original node following x, but is instead t itself! 28. Write a method removeAfter() that takes a linked-list Node as argument and removes the node following the given one (and does nothing if the argument or the next field in the argument node is null). 29. Write a method insertAfter() that takes two linked-list Node arguments and inserts the second after the first on its list (and does nothing if either argument is null). 30. Write a method remove() that takes a linked list and a string key as arguments and removes all of the nodes in the list that have key as their item field. 31. Write a method max() that takes a reference to the first node in a linked list as argument and returns the value of the maximum key in the list. Assume that all keys are positive integers, and return 0 if the list is empty. 32. Develop a recursive solution to the previous exercise. 33. Write a recursive method to print the elements of a linked list in reverse order. Do not modify any of the links. Easy: Use quadratic time, constant extra space. Also easy: Use linear time, linear extra space. Not so easy: Develop a divide-and-conquer algorithm that uses linearithmic time and logarithmic extra space. 34. Write a recursive method to randomly shuffle the elements of a linked list by modifying the links. Easy: Use quadratic time, constant extra space. Not so easy: Develop a divide-and-conquer algorithm that uses linearithmic time and logarithmic extra space.
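Exercises 28 and 29 above come down to one or two link updates. A minimal sketch, assuming the Node type used throughout this section (item and next fields; the helper class name LinkOps is illustrative):

```java
// Minimal Node type, as used throughout this section.
class Node {
    String item;
    Node next;
}

class LinkOps {
    // Exercise 28: remove the node following x (no-op if x or x.next is null).
    static void removeAfter(Node x) {
        if (x == null || x.next == null) return;
        x.next = x.next.next;
    }

    // Exercise 29: insert t immediately after x (no-op if either is null).
    // The order of the two assignments matters -- see the answer to exercise 27.
    static void insertAfter(Node x, Node t) {
        if (x == null || t == null) return;
        t.next = x.next;
        x.next = t;
    }
}
```

Reversing the two assignments in insertAfter() reproduces the bug from exercise 27: x.next would already point to t by the time t.next is set.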

Creative Exercises
1. Deque. A double-ended queue or deque (pronounced "deck") is a combination of a stack and a queue. It stores a parameterized collection of items and supports the following API:

Write a data type Deque.java that implements the deque API using a singly linked list. 2. Random queue. Create an abstract data type RandomizedQueue.java that supports the following operations: isEmpty(), insert(), random(), and removeRandom(), where the deletion operation deletes and returns a random object. Hint: maintain an array of objects. To delete an object, swap a random object (indexed 0 through N-1) with the last object (index N-1). Then, delete and return the last object.

3. Listing files. A Unix directory is a list of files and directories. Program Directory.java takes the name of a directory as a command-line parameter and prints out all of the files contained in that directory (and any subdirectories) in level order. It uses a queue. 4. Josephus problem. Program Josephus.java uses a queue to solve the Josephus problem. 5. Delete ith element. Create an ADT that supports the following operations: isEmpty, insert, and remove(int i), where the deletion operation deletes and returns the ith least recently added object on the queue. Do it with an array, then do it with a linked list. See Exercise XYZ for a more efficient implementation that uses a BST. 6. Dynamic shrinking. With the array implementations of stack and queue, we doubled the size of the array when it wasn't big enough to store the next element. If we perform a number of doubling operations, and then delete a lot of elements, we might end up with an array that is much bigger than necessary. Implement the following strategy: whenever the array is 1/4 full or less, shrink it to half the size. Explain why we don't shrink it to half the size when it is 1/2 full or less.

7. Ring buffer. A ring buffer or circular queue is a FIFO data structure of a fixed size N. It is useful for transferring data between asynchronous processes or storing log files. When the buffer is empty, the consumer waits until data is deposited; when the buffer is full, the producer waits to deposit data. A ring buffer has the following methods: isEmpty(), isFull(), enqueue(), and dequeue(). Write a generic data type RingBuffer using an array (with circular wrap-around for efficiency). 8. Merging two sorted queues. Given two queues with strings in ascending order, move all of the strings to a third queue so that the third queue ends up with the strings in ascending order. 9. Mergesort. Given N strings, create N queues, each containing one of the strings. Create a queue of the N queues. Then repeatedly apply the sorted merging operation to the first two queues and reinsert the merged queue at the end. Repeat until the queue of queues contains only one queue. 10. Queue with two stacks. Show how to implement a queue using two stacks. Hint: If you push elements onto a stack and then pop them all, they appear in reverse order. If you repeat this process, they're now back in order. 11. Move-to-front. Read in a sequence of characters from standard input and maintain the characters in a linked list with no duplicates. When you read in a previously unseen character, insert it at the front of the list. When you read in a duplicate character, delete it from the list and re-insert it at the beginning. This move-to-front strategy is useful for caching and data compression (Burrows-Wheeler) algorithms where items that have been recently accessed are more likely to be reaccessed. 12. Text editor buffer. Implement an ADT for a buffer in a text editor.
It should support the following operations: insert(c): insert character c at the cursor; delete(): delete and return the character at the cursor; left(): move the cursor one position to the left; right(): move the cursor one position to the right; get(i): return the ith character in the buffer.

Hint: use two stacks. 13. Topological sort. You have to sequence the order of N jobs on a processor. Some of the jobs must complete before others can begin. Specifically, you are given a list of ordered pairs of jobs (i, j). Find a sequence of the jobs such that for each pair (i, j) job i is scheduled before job j. Use the following algorithm: for each node, maintain a list of outgoing arcs using a queue; also maintain the indegree of each node; finally, maintain a queue of all nodes whose indegree is 0. Repeatedly delete a node with zero indegree, and delete all of its outgoing arcs. Write a program TopologicalSorter.java to accomplish this. Alternate application: prerequisites for graduating in your major. Must take COS 126 and COS 217 before COS 341, etc. Can you graduate? 14. PERT / CPM. Modify the previous exercise to handle weights: (i, j, w) means job i is scheduled at least w units of time before job j. 15. Set of integers. Create a data type that represents a set of integers (no duplicates) between 0 and N-1. Support add(i), exists(i), remove(i), size(), intersect, difference, symmetricDifference, union, isSubset, isSuperSet, and isDisjointFrom.

16. Indexing a book. Write a program that reads in a text file from standard input and compiles an alphabetical index of which words appear on which lines, as in the following input. Ignore case and punctuation. Similar to FrequencyCount, but for each word maintain a list of the locations on which it appears. Reverse a linked list. Write a function that takes the first Node in a linked list, reverses the list, and returns the first Node in the resulting linked list. Solution. To accomplish this, we maintain references to three consecutive nodes in the linked list: reverse, first, and second. At each iteration we extract the node first from the original linked list and insert it at the beginning of the reversed list. We maintain the invariant that first is the first node of what's left of the original list, second is the second node of what's left of the original list, and reverse is the first node of the resulting reversed list.

public static Node reverse(Node list) { Node first = list; Node reverse = null; while (first != null) { Node second = first.next; first.next = reverse; reverse = first; first = second; } return reverse; } When writing code involving linked lists, we must always be careful to properly handle the exceptional cases (when the linked list is empty, when the list has only one or two nodes) and the boundary cases (dealing with the first or last items). This is usually the trickiest part, as opposed to handling the normal cases. Recursive solution. Assuming the linked list has N elements, we recursively reverse the last N-1 elements, then carefully append the first element to the end. public Node reverse(Node first) { if (first == null || first.next == null) return first; Node second = first.next; Node rest = reverse(second); second.next = first; first.next = null; return rest; }
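The hint in creative exercise 10 above (queue with two stacks) can also be sketched in code. The following uses java.util.ArrayDeque as the underlying stack type; the class name TwoStackQueue is illustrative:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Queue built from two stacks: enqueue pushes onto 'in'; dequeue pops from
// 'out', refilling 'out' (in reversed, i.e. queue, order) from 'in' only
// when 'out' is empty. Each element moves at most twice, so the operations
// take constant amortized time.
public class TwoStackQueue<Item> {
    private final Deque<Item> in  = new ArrayDeque<>();
    private final Deque<Item> out = new ArrayDeque<>();

    public void enqueue(Item item) { in.push(item); }

    public Item dequeue() {
        if (out.isEmpty())
            while (!in.isEmpty()) out.push(in.pop());
        return out.pop();   // throws NoSuchElementException if the queue is empty
    }

    public boolean isEmpty() { return in.isEmpty() && out.isEmpty(); }
}
```

Popping everything from one stack onto the other reverses the order once, which is exactly what turns LIFO into FIFO.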

Web Exercises
1. Quote. Develop a data type Quote.java that implements the following API:

public class Quote
------------------------------------------------------------------------------
        Quote()                           create an empty quote
   void addWord(String w)                 add w to the end of the quote
    int count()                           return number of words in quote
 String getWord(int i)                    return the ith word, starting at 1
   void insertWord(int i, String w)       add w after the ith word
 String toString()                        return the entire quote as a String
To do so, define a nested class Card that holds one word of the quote and a link to the next word: private class Card { private String word; private Card next; public Card(String word) { this.word = word; this.next = null; } } 2. Circular quote. Repeat the previous exercise, but write a program CircularQuote.java that uses a circular linked list. 3. Write a recursive function that takes as input a queue and rearranges it so that it is in reverse order. Hint: dequeue() the first element, recursively reverse the queue, and then enqueue the first element. 4. Add a method Item[] multiPop(int k) to Stack that pops k elements from the stack and returns them as an array of objects. 5. Add a method Item[] toArray() to Queue that returns all N elements on the queue as an array of length N. 6. What does the following code fragment do? IntQueue q = new IntQueue(); q.enqueue(0); q.enqueue(1); for (int i = 0; i < 10; i++) { int a = q.dequeue(); int b = q.dequeue(); q.enqueue(b); q.enqueue(a + b); System.out.println(a); } Answer: prints the Fibonacci sequence. 7. What data type would you choose to implement an "Undo" feature in a word processor?

8. Suppose you have a single array of size N and want to implement two stacks so that you won't get an overflow until the total number of elements on both stacks is N+1. How would you proceed? 9. Suppose that you implemented push in the linked-list implementation of StackList with the following code. What is the mistake? public void push(Object value) { Node second = first; Node first = new Node(); first.value = value; first.next = second; } Answer: By redeclaring first, you create a new local variable named first, which is different from the instance variable named first. 10. Copy a queue. Create a new constructor so that LinkedQueue r = new LinkedQueue(q) makes r reference a new and independent queue. Hint: delete all of the elements from q and add them to both q and this. 11. Copy a stack. Create a new constructor for the linked-list implementation of Stack.java so that Stack t = new Stack(s) makes t reference a new and independent copy of the stack s. You should be able to push and pop from s or t without influencing the other. Should it work if the argument is null? Recursive solution: create a copy constructor for a Node and use this to create the new stack. Node(Node x) { item = x.item; if (x.next != null) next = new Node(x.next); } public Stack(Stack s) { first = new Node(s.first); } Nonrecursive solution: Node(Node x, Node next) { this.item = x.item; this.next = next; } public Stack(Stack s) { if (s.first != null) { first = new Node(s.first, s.first.next); for (Node x = first; x.next != null; x = x.next) x.next = new Node(x.next, x.next.next); } } 12. Stack with one queue. Show how to implement a stack using one queue. Hint: to delete an item, get all of the elements on the queue one at a time, and put them at the end, except for the last one, which you should delete and return. 13. Listing files with a stack.
Write a program that takes the name of a directory as a command-line argument and prints out all of the files contained in this directory and any subdirectories. Also print out the file size (in bytes) of each file. Use a stack instead of a queue. Repeat using recursion and name your program DirectoryR.java. Modify DirectoryR.java so that it prints out each subdirectory and its total size. The

size of a directory is equal to the sum of the sizes of all of the files it contains or that its subdirectories contain. 14. Stack + max. Create a data structure that efficiently supports the stack operations (pop and push) and also returns the maximum element. Assume the elements are integers or reals so that you can compare them. Hint: use two stacks, one to store all of the elements and a second stack to store the maximums. 15. Tag systems. Write a program that reads in a binary string from the command line and applies the following (00, 1101) tag system: if the first bit is 0, delete the first three bits and append 00; if the first bit is 1, delete the first three bits and append 1101. Repeat as long as the string has at least 3 bits. Try to determine whether the following inputs will halt or go into an infinite loop: 10010, 100100100100100100. Use a queue. 16. Reverse. Write a method to read in an arbitrary number of strings from standard input and print them in reverse order. public static void main(String[] args) { Stack<String> stack = new Stack<String>(); while (!StdIn.isEmpty()) { String s = StdIn.readString(); stack.push(s); } while (!stack.isEmpty()) { String s = stack.pop(); StdOut.println(s); } } 17. Add a method int size() to DoublingStack.java and Stack.java that returns the number of elements on the stack. 18. Add a method reverse() to Queue that reverses the order of the elements on the queue. 19. Add a method copy() to ArrayStackOfStrings.java.
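The two-stack hint in the stack + max exercise above might be sketched as follows; this uses java.util.ArrayDeque as the stack type, and the class name MaxStack is illustrative:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Stack of ints that also reports its maximum element in constant time.
// 'maxes' shadows 'items': its top is always the max of everything below it.
public class MaxStack {
    private final Deque<Integer> items = new ArrayDeque<>();
    private final Deque<Integer> maxes = new ArrayDeque<>();

    public void push(int x) {
        items.push(x);
        maxes.push(maxes.isEmpty() ? x : Math.max(x, maxes.peek()));
    }

    public int pop() {
        maxes.pop();          // keep the two stacks in lockstep
        return items.pop();
    }

    // current maximum; calling this on an empty stack is an error
    public int max() { return maxes.peek(); }
}
```

Each push records the running maximum at that depth, so popping automatically restores the previous maximum with no searching.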

Queueing theory is the mathematical study of waiting lines, or queues. The theory enables mathematical analysis of several related processes, including arriving at the (back of the) queue, waiting in the queue (essentially a storage process), and being served at the front of the queue. The theory permits the derivation and calculation of several performance measures, including the average waiting time in the queue or the system, the expected number waiting or receiving service, and the probability of encountering the system in certain states, such as empty, full, having an available server, or having to wait a certain time to be served. Queueing theory has applications in diverse fields, including telecommunications, traffic engineering, computing, and the design of factories, shops, offices, and hospitals.

queue

(v.) To line up. In computer science, queuing refers to lining up jobs for a computer or device. For example, if you want to print a number of documents, the operating system (or a special print spooler) queues the documents by placing them in a special area called a print buffer or print queue. The printer then pulls the documents off the queue one at a time. Another term for this is print spooling. The order in which a system executes jobs on a queue depends on the priority system being used. Most commonly, jobs are executed in the same order that they were placed on the queue, but in some schemes certain jobs are given higher priority. (n.) (1) A group of jobs waiting to be executed. (2) In programming, a queue is a data structure in which elements are removed in the same order they were entered. This is often referred to as FIFO (first in, first out). In contrast, a stack is a data structure in which elements are removed in the reverse order from which they were entered. This is referred to as LIFO (last in, first out).
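The FIFO/LIFO contrast above can be demonstrated directly with java.util.ArrayDeque, which supports both disciplines (a small illustrative sketch; the class and method names are ours):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FifoVsLifo {
    // Load "a", "b", "c" into a queue (insert at the rear) and remove from the front.
    public static String queueFront() {
        Deque<String> d = new ArrayDeque<>();
        d.addLast("a"); d.addLast("b"); d.addLast("c");
        return d.removeFirst();   // "a": first in, first out
    }

    // Push the same items onto a stack and pop the top.
    public static String stackTop() {
        Deque<String> s = new ArrayDeque<>();
        s.push("a"); s.push("b"); s.push("c");
        return s.pop();           // "c": last in, first out
    }

    public static void main(String[] args) {
        System.out.println("queue removes " + queueFront()
                         + ", stack removes " + stackTop());
    }
}
```

The same three insertions yield opposite removal orders depending only on which end of the collection is used.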

Queue (abstract data type)


From Wikipedia, the free encyclopedia


Representation of a FIFO Queue

A queue (pronounced /kjuː/, like "kew") is a particular kind of collection in which the entities in the collection are kept in order and the principal (or only) operations on the collection are the addition of entities to the rear terminal position and the removal of entities from the front terminal position. This makes the queue a First-In-First-Out (FIFO) data structure. In a FIFO data structure, the first element added to the queue will be the first one to be removed. This is equivalent to the requirement that once an element is added, all elements that were added before it must be removed before the new element can be removed. A queue is an example of a linear data structure.

Queues provide services in computer science, transport, and operations research where various entities such as data, objects, persons, or events are stored and held to be processed later. In these contexts, the queue performs the function of a buffer. Queues are common in computer programs, where they are implemented as data structures coupled with access routines, as an abstract data structure or, in object-oriented languages, as classes. Common implementations are circular buffers and linked lists.
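A linked-list implementation makes the rear-insert/front-remove discipline described above concrete. Here is a minimal sketch, similar in spirit to the Queue.java mentioned earlier in these notes (the names are illustrative):

```java
// Minimal linked-list FIFO queue: enqueue links a node at the rear,
// dequeue unlinks the node at the front.
public class LinkedQueue<Item> {
    private Node first;   // front of the queue (least recently added)
    private Node last;    // rear of the queue (most recently added)

    private class Node { Item item; Node next; }

    public boolean isEmpty() { return first == null; }

    public void enqueue(Item item) {
        Node node = new Node();
        node.item = item;
        if (isEmpty()) first = node;
        else           last.next = node;
        last = node;
    }

    public Item dequeue() {
        Item item = first.item;           // assumes the queue is nonempty
        first = first.next;
        if (first == null) last = null;   // the queue just became empty
        return item;
    }
}
```

Both operations touch only the end nodes, so enqueue and dequeue each take constant time.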

queue
In general, a queue is a line of people or things waiting to be handled, usually in sequential order starting at the beginning or top of the line or sequence. In computer technology, a queue is a sequence of work objects that are waiting to be processed. The study of the possible factors, arrangements, and processes related to queues is known as queueing theory.

What is a queue?
On many mail systems, it is common for all messages to flow into an Inbox file, where they remain stored. New messages are appended at the end of the Inbox file. The mail client program used to read and write mails reads this Inbox file and presents the content to the user. A queue in OTRS is somewhat comparable to an Inbox file, since it too can store many messages. A queue also has features beyond those of an Inbox mail file. As an OTRS agent or user, one needs to remember which queue a ticket is stored in. Agents can open and edit tickets in a queue, and also move tickets from one queue to another. But why would they move tickets? To explain it more practically, remember the example of Max's company described in an example of a ticket system. Max installed OTRS in order to allow his team to better manage support for company customers buying video recorders. One queue holding all requests is enough for this situation. However, after some time Max decides to also sell DVD recorders. Now, the customers have questions not only about the video recorder, but also the new product. More and more emails get into the one queue of Max's OTRS and it's hard to have a clear picture of what's happening. Max decides to restructure his support system, and adds two new queues. So now three queues are being used. Fresh new mails arriving at the ticket system are stored into the old queue titled "raw". Of the two new queues, one titled "video recorder" is for video recorder requests, while the other one titled "dvd recorder" is for dvd recorder requests.

Max asks Sandra to watch the "raw" queue and sort (dispatch) the mails into either the "video recorder" or the "dvd recorder" queue, depending on the customer request. John only has access to the "video recorder" queue, while Joe can only answer tickets in the "dvd recorder" queue. Max is able to edit tickets in all queues. OTRS supports access management for users, groups, and roles, and it is easy to set up queues that are accessible only to some user accounts. Max could also use another way to get his requests into the different queues: filter rules. Alternatively, if two different mail addresses are used, Sandra only has to dispatch the emails that can't be dispatched automatically into the two other queues. Sorting your incoming messages into different queues helps you to keep the support system structured and tidy. Because your agents are arranged into different groups with different access rights on queues, the system can be optimized even further. Queues can be used to define workflow processes or to create the structure of a company. Max could implement, for example, another queue called "sales", which could contain the sub-queues "requests", "offers", "orders", "billing", etc. Such a queue structure could help Max to optimize his order transactions. Improved system structures, such as through the proper design of queues, can lead to significant time and cost savings. Queues can help to optimize the processes in your company.

Hello, welcome to Queue, an information management and research tool for the Web. Read below to learn more, or dive right in at any time by clicking the Login link above.

What is Queue and why would I ever want to use it?

Queue is a simple tool that lets you research, organize, and share information you find on the Web or through email without wasting time. You can run Queue without installing anything on your machine by just using the Web interface. However, Queue software makes using Queue day-to-day much easier.

If you do research using the Web, Queue can help you catalog and share your findings. If you have a bunch of scribbled notes to yourself that are piling up, Queue can help you bring some semblance of order to the mess by making it extremely quick and pain-free to store random little bits of information. If you regularly visit some site or get an email newsletter with lots of links that you want to be sure to read at some point but can't right now, Queue can be a helpful way to make sure you remember to get back to them. It's so incredibly simple, you might actually use it and get more organized.

Where can I access Queue?

Queue is a Web application, so you can get at your data anywhere you have Web access. Since Queue has built-in access control, you can keep an item visible only to yourself or share it with others; it's up to you. Queue can also be accessed over IMAP, NNTP, and via email.

Getting Started

First off, if you haven't created an account yet, you should do that here. You can get back to this page from any Queue page by clicking the "Help" or "What is This?" links at the top and bottom of each page. When you first start using Queue, everything will look pretty empty. As you add more and more data and share it with other users, you can develop useful information repositories for yourself and your friends and coworkers. There are also some public areas where you can see what other people are sharing. As the amount of information you store in the system increases, so will its utility for you and those you share with.

Queue is quite adept at storing small snippets of information that you might normally write on a scrap of paper or a Post-It note. If you put the information in Queue instead, you will have a searchable database of information that you can share with interested parties, rather than a messy pile of scribbled notes on your desk.

Sounds like work...

Yes, there is an up-front cost in time to putting information in Queue instead of scrawling it on a Post-It, and most of the work that goes into Queue development is an attempt to decrease this up-front cost as much as possible.

Through context-menu integration, hot-key shortcuts, and various other ingenious means, Queue software tries to make organizing your information as easy as, or easier than, scratching it out on a note pad or emailing it to yourself. Queue is better than a notepad for queuing up URLs that you want to read later but don't have time for now. It is also better than Post-Its for handling text that is already on the computer (for instance, web pages and emails). It is arguably worse for spoken or printed text, but if you are willing to type the information in, you will have a nice shared, searchable database instead of a pile of papers and things to try to remember in your head.

Anyway, that is the basic idea. The main precept is that one reason people are not better organized and more efficient is simple laziness. Queue tries to make organization easier than disorganization, so that people will build structured data despite their innate laziness. The remainder of this document discusses the details of actually using Queue.

Using Queue

If you have not yet logged in to Queue and just want information, you can read below to see what you can do in Queue. If you have already installed or logged in, and this is your first time using Queue, open up another browser window next to this one. If you are running Netscape/Mozilla/Firebird or Windows IE 4 or greater and have installed the Queue extensions available here, then in the new browser you will find new options in the right-click menu when you right-click on a link, a page, or selected text. In the following sections, we will review some of the advanced functionality of Queue.

It is important to remember that any browser capable of handling cookies will work with Queue. This means you can get at the data you put in Queue from almost anywhere, so you can use it to hold data that you want to reach from home, at work, or at a friend's house. Of course, only you can get at your data, unless you explicitly choose to share it with other users.

Queues and Data Stores

Queue is based around two concepts: the Queue and the Data Store. A Queue is like an "in-box" or a "to-do bin" on your work desk; you pile things on it to look at later when you have some time. If you are browsing a site and see a link you really want to read but don't want to forget about, right-click on the link and select 'Add link to queue'. Or, if you are already on the page and want to put it off for later, you can right-click on a blank area of the page and select 'Add to queue' to send the current page to your Queue. To handle the next thing on your Queue, right-click in your browser window and select 'Pop page from queue'. The next thing on your Queue will pop up.

Ok, but what is a Data Store?

Whereas a Queue is like an in-box, a Data Store is like a filing folder in your desk drawer where you put things for later reference once you have found them to be interesting or otherwise important. You can also share Data Stores with other people, and search them for information. Think of them as places to put information that you think others (including yourself later on) might be interested in. Data Stores are a good place to store important phone numbers, itineraries, notes to yourself, etc. that you may need in the future or may want to access on the road. When you add an Entry to a Data Store, you can search for it later. You can also edit it and move it to other Data Stores. Queue provides four gradations of access control, from full access down through read-only access to hidden. If a user is not explicitly allowed access to a Data Store, the Data Store will be hidden from them, so you don't have to worry about other people seeing your data unless you explicitly add them to the list.

Your Journal and Access Control

You start with a private Data Store called your "Journal". You can put anything you want in your Journal; no one else will ever be able to see it unless you move it to another Data Store. You cannot grant others access to your Journal. If you want to share data with others, make a new Data Store and put the items you want to share in there. You can move items from your Journal to shared Data Stores, or create Entries in other Data Stores to start with. If you grant others full access to your Data Stores, they can delete Entries (including yours, but just in that Data Store). Read and write access levels do not allow others to move or delete Entries from the Store; use those levels to keep others from removing data that you put into a Store.

History Tracking

Queue will store all your activity, so you can go back and see what you did on various days, and search past submissions. It's like writing in a file folder, but you can share with others and search past Entries more efficiently. Also, the Entries are available to you from any machine with a Web browser capable of using cookies.

Alright, I get it, how do I start?

Your Queue will start out empty. When you add links or pages to your Queue, you can remove them later from the View Queue page (requires login), or, if you are using Netscape/Mozilla/Firebird or Windows IE 4 or higher, by selecting 'Pop page from queue' from the right-click menu. You can manually add data to your Data Stores and edit and move existing Entries on the Add Entry page (requires login). Or, if you are using Netscape/Mozilla/Firebird or Windows IE 4 or higher, when you select a piece of text and right-click within the selection, you will see an 'Add to data store' option. Selecting it stores the text in a Data Store, which can be searched from any page, including the Main Page (requires login). Once you have entered a Queue email address, you can also email your Data Stores, as explained here.
Data Stores are places to put data, and they have controlled access. On the View Stores page you can see all your Stores, change who can see them, and create new Stores (requires login).

Change Notifications

As a member of a Data Store, you can choose how to be notified when new items are added to it. There are three available methods: 'Email', 'Digest', and 'Add to My Queue'. You can also choose not to be notified of additions, and you can set a different notification method for each Data Store. 'Email' sends you a mail whenever someone adds a new Entry or edits an existing Entry. 'Digest' sends you one email a day containing all new Entries and edits, over the past 24 hours, to all Data Stores with 'Digest' notification; this means that with 'Digest' notification for all your Data Stores you can get at most one notification email a day from Queue. To use 'Email' or 'Digest', you must enter a valid email address. With 'Add to My Queue' notification, new and edited Entries will automatically be added to your Queue as soon as they are added. You may also choose your method of notification for when other Queue users add you to a Data Store. Three options are available: 'None', 'Email', and 'Add To My Queue'. Of course, 'Email' notification will only work if you have entered a valid Queue email address. All notification settings can be set here (requires login).

Other Info

You may want to bookmark the Main Page (requires login). It offers several facilities for using, searching, and managing your Queue and your Data Stores. Be sure to check out the Downloads page to explore some of the other ways to interact with Queue.

Queue is meant to be simple to use; if you have any questions or comments, please send an email to the address at the bottom of the page.

Thank you for trying Queue.

Linked list
From Wikipedia, the free encyclopedia

In computer science, a linked list is a data structure consisting of a group of nodes which together represent a sequence. In its simplest form, each node is composed of a datum and a reference (in other words, a link) to the next node in the sequence; more complex variants add additional links. This structure allows for efficient insertion or removal of elements from any position in the sequence.

A linked list whose nodes contain two fields: an integer value and a link to the next node. The last node is linked to a terminator used to signify the end of the list.

Linked lists are among the simplest and most common data structures. They can be used to implement several other common abstract data structures, including stacks, queues, associative arrays, and symbolic expressions, though it is not uncommon to implement the other data structures directly without using a list as the basis of implementation. The principal benefit of a linked list over a conventional array is that the list elements can easily be inserted or removed without reallocation or reorganization of the entire structure, because the data items need not be stored contiguously in memory or on disk. Linked lists allow insertion and removal of nodes at any point in the list, and can do so with a constant number of operations if the link previous to the link being added or removed is maintained during list traversal. On the other hand, simple linked lists by themselves do not allow random access to the data, or any form of efficient indexing. Thus, many basic operations, such as obtaining the last node of the list (assuming that the last node is not maintained as a separate node reference in the list structure), finding a node that contains a given datum, or locating the place where a new node should be inserted, may require scanning most or all of the list elements.
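The constant-time insertion described above can be sketched in a few lines of Python (Python 3 syntax; this minimal Node class is an illustration, not taken from any particular library):

```python
class Node:
    def __init__(self, cargo, next=None):
        self.cargo = cargo
        self.next = next

def insert_after(node, cargo):
    """Insert a new node after `node` in O(1): only two links change,
    and no other element has to move, unlike insertion into an array."""
    node.next = Node(cargo, node.next)

def to_list(node):
    """Walk the links and collect the cargo values, for inspection."""
    result = []
    while node is not None:
        result.append(node.cargo)
        node = node.next
    return result

# Build the list 1 -> 3, then splice 2 in between in constant time.
head = Node(1, Node(3))
insert_after(head, 2)
```

Note that the cost is constant only because we already hold a reference to the node before the insertion point; finding that node in the first place may still require a scan.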

Linked List Basics


Stanford CS Education Library: a 26-page introduction to linked lists in C/C++. Includes examples, drawings, practice problems, and solution code. The more advanced article, Linked List Problems, has 18 sample problems with solutions.

This article introduces the basic structures and techniques for building linked lists with a mixture of explanations, drawings, sample code, and exercises. The material is useful if you want to understand linked lists or if you want to see a realistic, applied example of pointer-intensive code. Even if you never really need a linked list, they are an excellent way to learn pointers and pointer algorithms.
Download LinkedListBasics.pdf

(revised 4/2001) See also..


The silly Binky Pointer Fun video -- animated introduction to pointers
Pointers and Memory -- basic concepts of pointers and memory
Linked List Problems -- lots of linked list problems
Binary Trees -- all about binary trees
The Great Tree List Recursion Problem -- the greatest pointer/recursion problem ever (advanced)

linked list

A method of organizing stored data in a computer's memory or on a storage medium based on the logical order of the data and not the physical order. All stored data records are assigned a physical address in memory that the computer uses to locate the information. A linked list arranges the data by logic rather than by physical address. In the table below, each data record is assigned a memory address, and each record has five fields that contain data. The first field holds the physical memory address of the record and the last field holds the physical memory address of the next logical record. The data is organized numerically based on the ID field, and the list is linked because each record is linked to the next based on that last field.

Address of Record  ID    Name     Phone Number  Next Name
0000               1111  Adams    265-8943      5500
5500               3333  Johnson  465-7219      6000
6000               4444  Smith    421-6307      8200
8200               5555  Murphy   720-9437      eof (end of file)

If a new record is added to the list, with a numerical ID of "2222," it will be assigned an available physical address that may not be adjacent to the physical memory address of the record that precedes or comes after it numerically (1111 or 3333 in this case). When this record is added to the list, the list changes to reflect the new linking logic, based on the numerical ID. Note how the "Next Name" field changes for ID 1111 to accommodate the added record with ID 2222.

Address of Record  ID    Name     Phone Number  Next Name
0000               1111  Adams    265-8943      9672
9672               2222  Jones    481-9698      5500
5500               3333  Johnson  465-7219      6000
6000               4444  Smith    421-6307      8200
8200               5555  Murphy   720-9437      eof

Linked lists are used to organize data in specific desired logical orders, independent of the memory address each record is assigned. In the above example, the data is organized numerically by the ID number. In the table below, the same data is organized alphabetically by name. Notice how the linked list still connects each record to the next using the "Next Name" field.

Address of Record  ID    Name     Phone Number  Next Name
0000               1111  Adams    265-8943      5500
5500               3333  Johnson  465-7219      9672
9672               2222  Jones    481-9698      8200
8200               5555  Murphy   720-9437      6000
6000               4444  Smith    421-6307      eof

Linked list data storage works best with data collections in which one doesn't know how large the collection will need to be, or when there is a certainty of more data being added or removed at later times. A disadvantage of linked list data storage is that the data must be accessed sequentially and cannot be accessed randomly. Some common applications of linked lists include creating hash tables for collision resolution, structuring binary trees, building stacks and queues in programming, and managing relational databases.
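The ID-ordered insertion shown in the tables above can be modeled directly in Python. In this sketch (Python 3; the dictionary-of-records layout and function names are invented here purely to mirror the tables), each "memory address" maps to a record whose last field holds the address of the next logical record:

```python
# Toy model of the tables above: records keyed by their "memory address".
records = {
    "0000": {"id": 1111, "name": "Adams",   "phone": "265-8943", "next": "5500"},
    "5500": {"id": 3333, "name": "Johnson", "phone": "465-7219", "next": "6000"},
    "6000": {"id": 4444, "name": "Smith",   "phone": "421-6307", "next": "8200"},
    "8200": {"id": 5555, "name": "Murphy",  "phone": "720-9437", "next": "eof"},
}

def insert(records, address, record, after_address):
    """Link a new record into the logical order: only the predecessor's
    'next' field changes, wherever the new record physically lives."""
    record["next"] = records[after_address]["next"]
    records[after_address]["next"] = address
    records[address] = record

def ids_in_order(records, start="0000"):
    """Follow the 'next' links and report the IDs in logical order."""
    order, addr = [], start
    while addr != "eof":
        order.append(records[addr]["id"])
        addr = records[addr]["next"]
    return order

# Add Jones (ID 2222) at arbitrary address 9672, after Adams at 0000:
insert(records, "9672",
       {"id": 2222, "name": "Jones", "phone": "481-9698", "next": None},
       after_address="0000")
```

After the insertion, following the links visits the IDs 1111, 2222, 3333, 4444, 5555 in order, exactly as in the second table, even though 9672 is not physically between 0000 and 5500.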

Linked lists
17.1 Embedded references

We have seen examples of attributes that refer to other objects, which we called embedded references (see Section 12.8). A common data structure, the linked list, takes advantage of this feature. Linked lists are made up of nodes, where each node contains a reference to the next node in the list. In addition, each node contains a unit of data called the cargo. A linked list is considered a recursive data structure because it has a recursive definition. A linked list is either:

the empty list, represented by None, or
a node that contains a cargo object and a reference to a linked list.

Recursive data structures lend themselves to recursive methods.


17.2 The Node class

As usual when writing a new class, we'll start with the initialization and __str__ methods so that we can test the basic mechanism of creating and displaying the new type:
class Node:
    def __init__(self, cargo=None, next=None):
        self.cargo = cargo
        self.next = next

    def __str__(self):
        return str(self.cargo)

As usual, the parameters for the initialization method are optional. By default, both the cargo and the link, next, are set to None. The string representation of a node is just the string representation of the cargo. Since any value can be passed to the str function, we can store any value in a list. To test the implementation so far, we can create a Node and print it:
>>> node = Node("test")
>>> print node
test

To make it interesting, we need a list with more than one node:


>>> node1 = Node(1)
>>> node2 = Node(2)
>>> node3 = Node(3)

This code creates three nodes, but we don't have a list yet because the nodes are not linked. The state diagram looks like this:

To link the nodes, we have to make the first node refer to the second and the second node refer to the third:
>>> node1.next = node2
>>> node2.next = node3

The reference of the third node is None, which indicates that it is the end of the list. Now the state diagram looks like this:

Now you know how to create nodes and link them into lists. What might be less clear at this point is why.
17.3 Lists as collections

Lists are useful because they provide a way to assemble multiple objects into a single entity, sometimes called a collection. In the example, the first node of the list serves as a reference to the entire list. To pass the list as an argument, we only have to pass a reference to the first node. For example, the function printList takes a single node as an argument. Starting with the head of the list, it prints each node until it gets to the end:
def printList(node):
    while node:
        print node,
        node = node.next
    print

To invoke this function, we pass a reference to the first node:


>>> printList(node1)
1 2 3

Inside printList we have a reference to the first node of the list, but there is no variable that refers to the other nodes. We have to use the next value from each node to get to the next node. To traverse a linked list, it is common to use a loop variable like node to refer to each of the nodes in succession. This diagram shows the nodes in the list and the values that node takes on:

By convention, lists are often printed in brackets with commas between the elements, as in [1, 2, 3]. As an exercise, modify printList so that it generates output in this format.

17.4 Lists and recursion

It is natural to express many list operations using recursive methods. For example, the following is a recursive algorithm for printing a list backwards:

1. Separate the list into two pieces: the first node (called the head) and the rest (called the tail).
2. Print the tail backward.
3. Print the head.

Of course, Step 2, the recursive call, assumes that we have a way of printing a list backward. But if we assume that the recursive call works (the "leap of faith"), then we can convince ourselves that this algorithm works. All we need are a base case and a way of proving that for any list, we will eventually get to the base case. Given the recursive definition of a list, a natural base case is the empty list, represented by None:
def printBackward(list):
    if list == None: return
    head = list
    tail = list.next
    printBackward(tail)
    print head,

The first line handles the base case by doing nothing. The next two lines split the list into head and tail. The last two lines print the list. The comma at the end of the last line keeps Python from printing a newline after each node. We invoke this function as we invoked printList:
>>> printBackward(node1)
3 2 1

The result is a backward list. You might wonder why printList and printBackward are functions and not methods in the Node class. The reason is that we want to use None to represent the empty list and it is not legal to invoke a method on None. This limitation makes it awkward to write list-manipulating code in a clean object-oriented style. Can we prove that printBackward will always terminate? In other words, will it always reach the base case? In fact, the answer is no. Some lists will make this function crash.

17.5 Infinite lists

There is nothing to prevent a node from referring back to an earlier node in the list, including itself. For example, this figure shows a list with two nodes, one of which refers to itself:

If we invoke printList on this list, it will loop forever. If we invoke printBackward, it will recurse infinitely. This sort of behavior makes infinite lists difficult to work with. Nevertheless, they are occasionally useful. For example, we might represent a number as a list of digits and use an infinite list to represent a repeating fraction. Regardless, it is problematic that we cannot prove that printList and printBackward terminate. The best we can do is the hypothetical statement, "If the list contains no loops, then these functions will terminate." This sort of claim is called a precondition. It imposes a constraint on one of the arguments and describes the behavior of the function if the constraint is satisfied. You will see more examples soon.
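The "no loops" precondition can actually be checked without risking an infinite traversal. A standard technique for this (not part of this chapter; shown here as a sketch in Python 3 syntax) is Floyd's tortoise-and-hare cycle detection:

```python
class Node:
    def __init__(self, cargo=None, next=None):
        self.cargo = cargo
        self.next = next

def has_loop(head):
    """Floyd's cycle detection: advance one pointer by one node and
    another by two; the two pointers meet only if the list loops."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False

# A well-formed three-node list, and the self-referencing node
# from the figure above:
straight = Node(1, Node(2, Node(3)))
looped = Node(1)
looped.next = looped
```

A function like this could be used to verify the precondition before calling printList or printBackward on an untrusted list.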
17.6 The fundamental ambiguity theorem

One part of printBackward might have raised an eyebrow:


head = list
tail = list.next

After the first assignment, head and list have the same type and the same value. So why did we create a new variable? The reason is that the two variables play different roles. We think of head as a reference to a single node, and we think of list as a reference to the first node of a list. These "roles" are not part of the program; they are in the mind of the programmer. In general we can't tell by looking at a program what role a variable plays. This ambiguity can be useful, but it can also make programs difficult to read. We often use variable names like node and list to document how we intend to use a variable, and sometimes create additional variables to disambiguate. We could have written printBackward without head and tail, which makes it more concise but possibly less clear:
def printBackward(list):
    if list == None: return
    printBackward(list.next)
    print list,

Looking at the two function calls, we have to remember that printBackward treats its argument as a collection and print treats its argument as a single object. The fundamental ambiguity theorem describes the ambiguity that is inherent in a reference to a node: A variable that refers to a node might treat the node as a single object or as the first in a list of nodes.
17.7 Modifying lists

There are two ways to modify a linked list. Obviously, we can change the cargo of one of the nodes, but the more interesting operations are the ones that add, remove, or reorder the nodes. As an example, let's write a function that removes the second node in the list and returns a reference to the removed node:
def removeSecond(list):
    if list == None: return
    first = list
    second = list.next
    # make the first node refer to the third
    first.next = second.next
    # separate the second node from the rest of the list
    second.next = None
    return second

Again, we are using temporary variables to make the code more readable. Here is how to use this function:
>>> printList(node1)
1 2 3
>>> removed = removeSecond(node1)
>>> printList(removed)
2
>>> printList(node1)
1 3

This state diagram shows the effect of the operation:

What happens if you invoke this function and pass a list with only one element (a singleton)? What happens if you pass the empty list as an argument? Is there a precondition for this function? If so, fix the function to handle a violation of the precondition in a reasonable way.
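One reasonable answer to the exercise above (a sketch, not the book's official solution; Python 3 syntax) is to return None when there is no second node, leaving the list untouched:

```python
class Node:
    def __init__(self, cargo=None, next=None):
        self.cargo = cargo
        self.next = next

def removeSecond(list):
    # Precondition: the list has at least two nodes.  Instead of
    # crashing on a violation (empty list or singleton), return None
    # and leave the list unchanged.
    if list is None or list.next is None:
        return None
    first = list
    second = list.next
    first.next = second.next   # make the first node refer to the third
    second.next = None         # separate the removed node from the rest
    return second

# The original removeSecond raises AttributeError on a singleton,
# because list.next is None and None has no .next attribute.
node1 = Node(1, Node(2, Node(3)))
removed = removeSecond(node1)
```

With this version, callers that violate the precondition get a harmless None back instead of a crash, which is one conventional way to handle the violation "in a reasonable way".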
17.8 Wrappers and helpers

It is often useful to divide a list operation into two functions. For example, to print a list backward in the format [3 2 1] we can use the printBackward function to print 3 2 1, but we need a separate function to print the brackets. Let's call it printBackwardNicely:
def printBackwardNicely(list):
    print "[",
    printBackward(list)
    print "]",

Again, it is a good idea to check functions like this to see if they work with special cases like an empty list or a singleton. When we use this function elsewhere in the program, we invoke printBackwardNicely directly, and it invokes printBackward on our behalf. In that sense, printBackwardNicely acts as a wrapper, and it uses printBackward as a helper.
17.9 The LinkedList class

There are some subtle problems with the way we have been implementing lists. In a reversal of cause and effect, we'll propose an alternative implementation first and then explain what problems it solves.

First, we'll create a new class called LinkedList. Its attributes are an integer that contains the length of the list and a reference to the first node. LinkedList objects serve as handles for manipulating lists of Node objects:
class LinkedList:
    def __init__(self):
        self.length = 0
        self.head = None

One nice thing about the LinkedList class is that it provides a natural place to put wrapper functions like printBackwardNicely, which we can make a method of the LinkedList class:
class LinkedList:
    ...
    def printBackward(self):
        print "[",
        if self.head != None:
            self.head.printBackward()
        print "]",

class Node:
    ...
    def printBackward(self):
        if self.next != None:
            tail = self.next
            tail.printBackward()
        print self.cargo,

Just to make things confusing, we renamed printBackwardNicely. Now there are two methods named printBackward: one in the Node class (the helper) and one in the LinkedList class (the wrapper). When the wrapper invokes self.head.printBackward, it is invoking the helper, because self.head is a Node object. Another benefit of the LinkedList class is that it makes it easier to add or remove the first element of a list. For example, addFirst is a method for LinkedLists; it takes an item of cargo as an argument and puts it at the beginning of the list:
class LinkedList:
    ...
    def addFirst(self, cargo):
        node = Node(cargo)
        node.next = self.head
        self.head = node
        self.length = self.length + 1

As usual, you should check code like this to see if it handles the special cases. For example, what happens if the list is initially empty?

17.10 Invariants

Some lists are "well formed"; others are not. For example, if a list contains a loop, it will cause many of our methods to crash, so we might want to require that lists contain no loops. Another requirement is that the length value in the LinkedList object should be equal to the actual number of nodes in the list. Requirements like these are called invariants because, ideally, they should be true of every object all the time. Specifying invariants for objects is a useful programming practice because it makes it easier to prove the correctness of code, check the integrity of data structures, and detect errors. One thing that is sometimes confusing about invariants is that there are times when they are violated. For example, in the middle of addFirst, after we have added the node but before we have incremented length, the invariant is violated. This kind of violation is acceptable; in fact, it is often impossible to modify an object without violating an invariant for at least a little while. Normally, we require that every method that violates an invariant must restore the invariant. If there is any significant stretch of code in which the invariant is violated, it is important for the comments to make that clear, so that no operations are performed that depend on the invariant.
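An invariant like "length equals the actual number of nodes" can be turned into a checking method. The sketch below (Python 3 syntax; the method name check_invariant is invented here, not from the book) walks the list with a step bound, so even a loop in the list shows up as a failed check rather than an infinite walk:

```python
class Node:
    def __init__(self, cargo=None, next=None):
        self.cargo = cargo
        self.next = next

class LinkedList:
    def __init__(self):
        self.length = 0
        self.head = None

    def addFirst(self, cargo):
        node = Node(cargo)
        node.next = self.head
        self.head = node
        self.length = self.length + 1

    def check_invariant(self):
        """Return True if self.length equals the real node count.
        The counter never exceeds length + 1, so a looped list
        produces a count mismatch instead of an infinite traversal."""
        count, node = 0, self.head
        while node is not None and count <= self.length:
            count = count + 1
            node = node.next
        return count == self.length

lst = LinkedList()
for x in [3, 2, 1]:
    lst.addFirst(x)
```

Methods like this are typically called in tests or assertions after each mutating operation, which is exactly the "check the integrity of data structures" use of invariants described above.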
17.11 Glossary

embedded reference: A reference stored in an attribute of an object.
linked list: A data structure that implements a collection using a sequence of linked nodes.
node: An element of a list, usually implemented as an object that contains a reference to another object of the same type.
cargo: An item of data contained in a node.
link: An embedded reference used to link one object to another.
precondition: An assertion that must be true in order for a method to work correctly.
fundamental ambiguity theorem: A reference to a list node can be treated as a single object or as the first in a list of nodes.
singleton: A linked list with a single node.
wrapper: A method that acts as a middleman between a caller and a helper method, often making the method easier or less error-prone to invoke.
helper: A method that is not invoked directly by a caller but is used by another method to perform part of an operation.
invariant: An assertion that should be true of an object at all times (except perhaps while the object is being modified).

Warning: the HTML version of this document is generated from LaTeX and may contain translation errors. In particular, some mathematical expressions are not translated correctly.

Sorting is the process of putting a list or a group of items in a specific order. Common sorting criteria are alphabetical or numerical order, and sorting can be done in ascending order (A-Z) or descending order (Z-A).

What is sorting?
Sorting is the process of arranging data into meaningful order so that you can analyze it more effectively. For example, you might want to order sales data by calendar month so that you can produce a graph of sales performance. You can use Discoverer to sort data as follows:

sort text data into alphabetical order sort numeric data into numerical order group sort data to many levels, for example, you can sort on City within Month within Year

Sorting worksheet data also makes it easier to analyze. For example, you might want to sort sales data from most profitable sales to least profitable sales to show the relative position of your company's best selling products. Discoverer offers great flexibility when sorting data within data. You can do this to many different levels. For example, you can sort by City within Region. Note: Discoverer sorts data according to the alphabetical or numeric sequence most appropriate for the local language. For more information about choosing a language when you start Discoverer, contact the Discoverer manager.

In computer science, a sorting algorithm is an algorithm that puts elements of a list in a certain order. The most-used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the use of other algorithms (such as search and merge algorithms) that require sorted lists to work correctly; it is also often useful for canonicalizing data and for producing human-readable output. More formally, the output must satisfy two conditions:

1. The output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order);
2. The output is a permutation, or reordering, of the input.

Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite its simple, familiar statement. For example, bubble sort was analyzed as early as 1956 [1]. Although many consider it a solved problem, useful new sorting algorithms are still being invented (for example, library sort was first published in 2004). Sorting algorithms are prevalent in introductory computer science classes, where the abundance of algorithms for the problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation, divide-and-conquer algorithms, data structures, randomized algorithms, best, worst and average case analysis, time-space tradeoffs, and lower bounds.
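The two formal conditions above can be checked mechanically. A short Python 3 sketch (the function name is invented here for illustration):

```python
def is_valid_sort(output, input):
    """Check the two conditions a sorting algorithm's output must satisfy:
    1. nondecreasing order: each element is no smaller than its predecessor;
    2. the output is a permutation (reordering) of the input."""
    nondecreasing = all(output[i] <= output[i + 1]
                        for i in range(len(output) - 1))
    # Comparing the multisets of values catches dropped or invented elements.
    permutation = sorted(output) == sorted(input)
    return nondecreasing and permutation
```

For example, is_valid_sort([1, 2, 2, 3], [3, 2, 1, 2]) holds, while [1, 2, 3] is not a valid sort of [1, 2, 2] (it is ordered but not a permutation), and [2, 1, 3] is not a valid sort of [1, 2, 3] (a permutation, but not ordered).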

Introduction to Sorting Lecture 16


Steven S. Skiena

Sorting

Sorting is, without doubt, the most fundamental algorithmic problem:

1. Supposedly, 25% of all CPU cycles are spent sorting.
2. Sorting is fundamental to most other algorithmic problems, for example binary search.
3. Many different approaches lead to useful sorting algorithms, and these ideas can be used to solve many other problems.

What is sorting? It is the problem of taking an arbitrary permutation of n items and rearranging them into the total order a1 <= a2 <= ... <= an.

Knuth, Volume 3 of ``The Art of Computer Programming'', is the definitive reference on sorting.

Issues in Sorting

Increasing or Decreasing Order? - The same algorithm can be used for both: all we need do is change <= to >= in the comparison function as we desire.

What about equal keys? - Does the order matter or not? Maybe we need to sort on secondary keys, or leave equal keys in the same order as in the original permutation.

What about non-numerical data? - Alphabetizing is sorting text strings, and libraries have very complicated rules concerning punctuation, etc. Is Brown-Williams before or after Brown, America, and before or after Brown, John?

We can ignore all three of these issues by assuming a comparison function which depends on the application. Compare(a,b) should return ``<'', ``>'', or ``=''.

Applications of Sorting

One reason why sorting is so important is that once a set of items is sorted, many other problems become easy.

Searching - Binary search lets you test whether an item is in a dictionary in O(log n) time.

Speeding up searching is perhaps the most important application of sorting.

Closest pair - Given n numbers, find the pair which are closest to each other. Once the numbers are sorted, the closest pair will be next to each other in sorted order, so an O(n) linear scan completes the job.

Element uniqueness - Given a set of n items, are they all unique or are there any duplicates? Sort them and do a linear scan to check all adjacent pairs. This is a special case of the closest-pair problem above.

Frequency distribution (Mode) - Given a set of n items, which element occurs the largest number of times?

Sort them and do a linear scan to measure the length of all adjacent runs.

Median and Selection - What is the kth largest item in the set? Once the keys are placed in sorted order in an array, the kth largest can be found in constant time by simply looking in the kth position of the array.

How do you sort? There are several different ideas which lead to sorting algorithms:

Insertion - putting an element in the appropriate place in a sorted list yields a larger sorted list.
Exchange - rearranging pairs of elements which are out of order, until no such pairs remain.
Selection - extract the largest element from the list, remove it, and repeat.
Distribution - separate into piles based on the first letter, then sort each pile.
Merging - two sorted lists can be easily combined to form a sorted list.

Selection Sort

In my opinion, the most natural and easiest sorting algorithm is selection sort, where we repeatedly find the smallest remaining element and move it to the front of the unsorted region:
* 5 7 3 2 8
2 * 7 3 5 8
2 3 * 7 5 8
2 3 5 * 7 8
2 3 5 7 * 8

(The * marks the boundary between the sorted prefix and the unsorted remainder.)

If the elements are in an array, we swap the first unsorted element with the smallest element - thus only one array is necessary. If the elements are in a linked list, we must keep two lists, one sorted and one unsorted, and always add the smallest remaining element to the back of the sorted list.

Selection Sort Implementation
MODULE SimpleSort EXPORTS Main; (* 1.12.94. LB *)
(* Sorting a text array by selecting the smallest element *)

TYPE
  Array = ARRAY [1..N] OF TEXT;

VAR
  a: Array;              (* the array in which to search *)
  x: TEXT;               (* auxiliary variable *)
  last, min: INTEGER;    (* last valid index; current minimum *)

BEGIN
  ...
  FOR i := FIRST(a) TO last - 1 DO
    min := i;                                  (* index of smallest element *)
    FOR j := i + 1 TO last DO
      IF Text.Compare(a[j], a[min]) = -1 THEN  (* IF a[j] < a[min] *)
        min := j
      END;
    END; (* FOR j *)
    x := a[min];                               (* swap a[i] and a[min] *)
    a[min] := a[i];
    a[i] := x;
  END; (* FOR i *)
  ...
END SimpleSort.

The Complexity of Selection Sort

One interesting observation is that selection sort always takes the same time no matter what data we give it! Thus the best case, worst case, and average case are all the same! Intuitively, we make n iterations, each of which compares about n/2 elements on average, so we should make about n^2/2 comparisons to sort n items.

To do this more precisely, we can count the number of comparisons we make. To find the largest takes (n-1) steps, to find the second largest takes (n-2) steps, to find the third largest takes (n-3) steps, ..., and to find the last takes 0 steps. Summing,

    (n-1) + (n-2) + ... + 1 + 0 = n(n-1)/2 = O(n^2).

An advantage of the big Oh notation is the fact that the worst-case O(n^2) time is obvious - we have n loops of at most n steps each.

If instead of time we count the number of data movements, there are n-1, since there is exactly one swap per iteration.

Insertion Sort In insertion sort, we repeatedly add elements to a sorted subset of our data, inserting the next element in order:
* 5 7 3 2 8
5 * 7 3 2 8
3 5 * 7 2 8
2 3 5 * 7 8
2 3 5 7 * 8

InsertionSort(A)
    for i = 1 to n-1 do
        j = i
        while (j > 0) and (A[j] < A[j-1]) do
            swap(A[j], A[j-1])
            j = j - 1

In inserting the element into the sorted section, we might have to move many elements to make room for it. If the elements are in an array, we scan from bottom to top until we find the position j where the new element belongs, then move the elements from j+1 to the end along one place to make room.

If the elements are in a linked list, we do a sequential search until we find where the element goes, then insert the element there. No other elements need move!

Complexity of Insertion Sort

Since we do not necessarily have to scan the entire sorted section of the array, the best, worst, and average cases for insertion sort all differ!

Best case: the element always gets inserted at the end, so we don't have to move anything, and only compare against the last sorted element. We have (n-1) insertions, each with exactly one comparison and no data moves per insertion! What is this best-case permutation? It is when the array or list is already sorted! Thus insertion sort is a great algorithm when the data has previously been ordered, but slightly messed up.

Worst Case Complexity

Worst case: the element always gets inserted at the front, so all the sorted elements must be moved at each insertion. The ith insertion requires (i-1) comparisons and moves, so the total is

    sum over i from 2 to n of (i-1) = n(n-1)/2 = O(n^2).

What is the worst-case permutation? When the array is sorted in reverse order. This is the same number of comparisons as with selection sort, but uses more movements. The number of movements might become important if we were sorting large records.

Average Case Complexity

Average case: if we are given a random permutation, the chances of the ith insertion requiring 1, 2, ..., or i comparisons are all equal, and hence each is 1/i.

The expected number of comparisons for the ith insertion is therefore

    (1/i) * (1 + 2 + ... + i) = (i+1)/2.

Summing up over all n keys,

    sum over i from 1 to n of (i+1)/2 ~ n^2/4.

So we do half as many comparisons/moves on average! Can we use binary search to help us get below O(n^2) time?

Sorting is a procedure of placing data in some order - a way of organizing data in some specific manner. This ordering makes it possible, or at least easy, to search for a specific data element among the sorted elements. Sorting is very important in computer science: since efficiency is a major concern of computing, data is sorted in order to gain efficiency in retrieval and searching tasks. In any database management system sorting plays a big role, as all the data in a database is stored in a sorted and compact form. Microsoft Access is a database management system mostly used for small databases, and it offers different ways of sorting records. The two generally used options are:

1. Sort Ascending
2. Sort Descending

These two ways of sorting can be applied with respect to the date of the records or to alphabetical order. You can sort the database records either by one column or by multiple columns. To sort the records by one column, place the cursor in any cell of that column and select the required sorting option from the 'Record' menu.

searching

On the Internet, searching is just trying to find the information you need. There are three basic approaches:

The subject directory. These can be general and cover all subjects (as Yahoo does) or specialized (like the information technology sites at searchWindowsManageability and other TechTarget sites).

The search engine. These can be general and attempt to index all or most of the Web's pages (like Google or FAST), or specialized and search within a narrow range of subjects.

The so-called deep Web - that is, the Web sites that have information that can't be indexed by the search engines but can in many cases be searched directly at the individual Web site.

A search engine is a program that searches documents for specified keywords and returns a list of the documents where the keywords were found. Although "search engine" is really a general class of programs, the term is often used to specifically describe systems like Google, AltaVista and Excite that enable users to search for documents on the World Wide Web and USENET newsgroups. Typically, a search engine works by sending out a spider to fetch as many documents as possible. Another program, called an indexer, then reads these documents and creates an index based on the words contained in each document. Each search engine uses a proprietary algorithm to create its indices such that, ideally, only meaningful results are returned for each query. See How Web Search Engines Work in the Did You Know...? section of Webopedia. Also see "Web Search Engines & Directories" in the Quick Reference section of Webopedia.

What is Boolean Search? Boolean searches allow you to combine words and phrases using the words AND, OR, NOT and NEAR (otherwise known as Boolean operators) to limit, widen, or define your search. Most Internet search engines and Web directories default to these Boolean search parameters anyway, but a good Web searcher should know how to use basic Boolean operators.

Where does the term Boolean originate? George Boole, an English mathematician in the 19th century, developed "Boolean logic" in order to combine certain concepts and exclude certain concepts when searching databases.

How do I do a Boolean search? You have two choices: you can use the standard Boolean operators (AND, OR, NOT, or NEAR), or you can use their math equivalents. Which you choose depends on the method you're more comfortable with. For example:

Boolean Search Operators

The Boolean search operator AND is equal to the "+" symbol. The Boolean search operator NOT is equal to the "-" symbol. The Boolean search operator OR is the default setting of any search engine; meaning, all search engines will automatically return pages matching any of the words you type in. The Boolean search operator NEAR is equal to putting a search query in quotes, i.e., "sponge bob squarepants". You're essentially telling the search engine that you want all of these words, in this specific order, as this specific phrase.

4 Searching
Computer systems are often used to store large amounts of data from which individual records must be retrieved according to some search criterion. Thus the efficient storage of data to facilitate fast searching is an important issue. In this section, we shall investigate the performance of some searching algorithms and the data structures which they use.
4.1 Sequential Searches Let's examine how long it will take to find an item matching a key in the collections we have discussed so far. We're interested in:

a. the average time,
b. the worst-case time, and
c. the best possible time.

However, we will generally be most concerned with the worst-case time, as calculations based on worst-case times can lead to guaranteed performance predictions. Conveniently, the worst-case times are generally easier to calculate than average times.

If there are n items in our collection - whether it is stored as an array or as a linked list - then it is obvious that in the worst case, when there is no item in the collection with the desired key, n comparisons of the key with the keys of the items in the collection will have to be made.

To simplify analysis and comparison of algorithms, we look for a dominant operation and count the number of times that dominant operation has to be performed. In the case of searching, the dominant operation is the comparison. Since the search requires n comparisons in the worst case, we say this is an O(n) (pronounced "big-Oh-n" or "Oh-n") algorithm. The best case - in which the first comparison returns a match - requires a single comparison and is O(1). The average time depends on the probability that the key will be found in the collection - this is something that we would not expect to know in the majority of cases. Thus in this case, as in most others, estimation of the average time is of little utility. If the performance of the system is vital, e.g. it is part of a life-critical system, then we must use the worst case in our design calculations, as it represents the best guaranteed performance.
4.2 Binary Search However, if we place our items in an array and sort them in either ascending or descending order on the key first, then we can obtain much better performance with an algorithm called binary search.

In binary search, we first compare the key with the item in the middle position of the array. If there's a match, we can return immediately. If the key is less than the middle key, then the item sought must lie in the lower half of the array; if it's greater then the item sought must lie in the upper half of the array. So we repeat the procedure on the lower (or upper) half of the array. Our FindInCollection function can now be implemented:
static void *bin_search( collection c, int low, int high, void *key )
{
    int mid, cmp;

    /* Termination check */
    if (low > high) return NULL;

    mid = (low+high)/2;
    cmp = memcmp(ItemKey(c->items[mid]), key, c->size);

    if (cmp == 0)
        /* Match, return item found */
        return c->items[mid];
    else if (cmp > 0)
        /* key is less than mid, search lower half */
        return bin_search( c, low, mid-1, key );
    else
        /* key is greater than mid, search upper half */
        return bin_search( c, mid+1, high, key );
}

void *FindInCollection( collection c, void *key )
{
    /* Find an item in a collection
       Pre-condition: c is a collection created by ConsCollection,
         c is sorted in ascending order of the key,
         key != NULL
       Post-condition: returns an item identified by key if one exists,
         otherwise returns NULL */
    int low, high;

    low = 0; high = c->item_cnt-1;
    return bin_search( c, low, high, key );
}

(Note that memcmp is only guaranteed to return a negative, zero, or positive value - not exactly -1 or 1 - so we test the sign of the result rather than switching on -1 and 1.)

Points to note:

a. bin_search is recursive: it determines whether the search key lies in the lower or upper half of the array, then calls itself on the appropriate half.
b. There is a termination condition (two of them in fact!):
   i. if low > high then the partition to be searched has no elements in it, and
   ii. if there is a match with the element in the middle of the current partition, then we can return immediately.
c. AddToCollection will need to be modified to ensure that each item added is placed in its correct place in the array. The procedure is simple:
   i. search the array until the correct spot to insert the new item is found,
   ii. move all the following items up one position, and
   iii. insert the new item into the empty position thus created.
d. bin_search is declared static. It is a local function and is not used outside this class: if it were not declared static, it would be exported and be available to all parts of the program. The static declaration also allows other classes to use the same name internally.

static reduces the visibility of a function and should be used wherever possible to control access to functions!

Analysis

Each step of the algorithm divides the block of items being searched in half. We can divide a set of n items in half at most log2 n times. Thus the running time of a binary search is proportional to log n, and we say this is an O(log n) algorithm.

Binary search requires a more complex program than our original search, and thus for small n it may run slower than the simple linear search. However, for large n,

    log2 n << n,

so at large n an O(log n) algorithm is much faster than an O(n) one.

[Figure: plot of n and log2 n versus n.]

We will examine this behaviour more formally in a later section. First, let's see what we can do about the insertion (AddToCollection) operation.

In the worst case, insertion may require n operations to insert into a sorted list:

1. We can find the place in the list where the new item belongs using binary search in O(log n) operations.
2. However, we have to shuffle all the following items up one place to make way for the new one. In the worst case, the new item is the first in the list, requiring n move operations for the shuffle!

A similar analysis will show that deletion is also an O(n) operation. If our collection is static, i.e. it doesn't change very often - if at all - then we may not be concerned with the time required to change its contents: we may be prepared for the initial build of the collection and the occasional insertion and deletion to take some time. In return, we will be able to use a simple data structure (an array) which has little memory overhead. However, if our collection is large and dynamic, i.e. items are being added and deleted continually, then we can obtain considerably better performance using a data structure called a tree.
Key terms

Big Oh A notation formally describing the set of all functions which are bounded above by a nominated function. Binary Search A technique for searching an ordered list in which we first check the middle item and - based on that comparison - "discard" half the data. The same procedure is then applied to the remaining half until a match is found or there are no more items left.
