
Lab 5 Report

Author: Jorge Berumen


CS2302 – Data Structures

Introduction
In this laboratory we investigated the union/find data structure as a means of solving problems that
would otherwise be difficult to resolve. In particular, we implemented a class that encapsulates the
logic of creating and printing a maze based on an (n x m) matrix. In the latter section of the
laboratory, we used the union/find data structure to detect cycles in a weighted graph read from a
*.csv file and to perform Kruskal’s Minimum Spanning Tree algorithm. Finally, because the union/find
data structure was essential to this laboratory, we also analyzed the running time of the three methods
the data structure can use: the standard, compressed, and ranked/height methods.

Proposed solution design and implementation

Disjoint Set Class (DisjSet.java)


This class has a constructor that initializes the set according to the size provided, alongside the
method the user wants the data structure to use. To accomplish this, the class defines an internal yet
accessible enum, disjSetMethod, whose constants name these methods. Using an enum guarantees that only
the three available options can be chosen and makes the source code more readable. The class also
defines private union and find methods for each technique. These private methods are not exposed;
instead, generic find and union methods are provided that hide the work of choosing the correct
variant. If the user wants to change methods at runtime, they can simply re-instantiate the object with
different constructor arguments. The class uses implementations from the book as well as
implementations provided in class.
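
The design described above can be sketched roughly as follows. This is a minimal illustration, not the lab's DisjSet.java: the class name DisjSetSketch, the enum Method (a stand-in for disjSetMethod), and the negative-root array encoding are all assumptions made for the example.

```java
import java.util.Arrays;

/** Sketch of a disjoint set whose strategy is fixed at construction time. */
class DisjSetSketch {
    /** Hypothetical stand-in for the report's disjSetMethod enum. */
    enum Method { STANDARD, COMPRESSION, HEIGHT }

    // parent[i] < 0 marks a root; under HEIGHT a root stores -(height + 1).
    private final int[] parent;
    private final Method method;

    DisjSetSketch(int size, Method method) {
        this.parent = new int[size];
        Arrays.fill(parent, -1);
        this.method = method;
    }

    /** Public find: dispatches to the private variant chosen at construction. */
    int find(int x) {
        return method == Method.COMPRESSION ? findCompress(x) : findPlain(x);
    }

    /** Public union: merges the sets containing a and b. */
    void union(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra == rb) return;                 // already in the same set
        if (method == Method.HEIGHT) unionByHeight(ra, rb);
        else parent[rb] = ra;                 // arbitrary link
    }

    boolean connected(int a, int b) {
        return find(a) == find(b);
    }

    private int findPlain(int x) {
        while (parent[x] >= 0) x = parent[x];
        return x;
    }

    private int findCompress(int x) {
        int root = findPlain(x);
        while (x != root) {                   // second pass: point every node at the root
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }

    private void unionByHeight(int r1, int r2) {
        if (parent[r2] < parent[r1]) {        // r2 is deeper (more negative)
            parent[r1] = r2;
        } else {
            if (parent[r1] == parent[r2]) parent[r1]--; // equal heights: r1 grows by one
            parent[r2] = r1;
        }
    }

    public static void main(String[] args) {
        DisjSetSketch s = new DisjSetSketch(6, Method.HEIGHT);
        s.union(0, 1);
        s.union(2, 3);
        s.union(1, 3);
        System.out.println(s.connected(0, 2)); // cells 0 and 2 now share a set
    }
}
```

Callers only ever see find and union, so switching strategy means constructing a new object with a different Method value, as the paragraph above describes.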

Maze (Maze.java)
To process the labyrinth walls properly, a system was needed to represent them. In an (n x m) matrix,
each cell considered by the algorithm typically has two walls, the bottom and the right. Every row has
2m – 1 walls, with the exception of the last row, which has only m – 1 because its bottom walls are
not counted. The scheme is shown below applied to a 3 x 4 matrix:

Figure 5.1
For instance, this example has 4 columns, so each row (except the last) has 2m – 1 = 2 * 4 – 1 = 7
walls. You can see that the walls in the first row are numbered 0 through 6. Nonetheless, numbering the
walls does not by itself tell us the cell numbers. Couldn’t we just divide the wall number by 2 to get
the cell number? That works for the first row, but each row has one wall fewer than twice its number of
cells, so dividing by two gives the wrong cell in the rest of the rows. Because each additional row is
missing one more wall, I devised an algorithm to determine the cells adjacent to the wall:

IF the current row is not the last row THEN

\[ \text{Cell Number} = \left\lfloor \frac{\text{wall number} + \frac{\text{wall number}}{2m - 1}}{2} \right\rfloor \]

ELSE

\[ \text{Cell Number} = \left\lfloor \frac{\text{wall number} + \frac{\text{wall number}}{2m - 1} + (\text{wall number}) \bmod (2m - 1) + 1}{2} \right\rfloor \]

Notice that if the wall lands on the last row, we lose twice as many walls, since the bottom walls are
not considered there. To account for this loss, we add the number of walls we are missing, which
depends on the column the wall belongs to, and add one to it. We add one because 0 mod x is always 0.
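
The two-case rule can be checked with a short sketch. It assumes that the inner quotient wall number / (2m – 1) is taken with integer division (so it equals the row index) and that a wall is in the last row exactly when that quotient is n – 1; the class and method names are hypothetical.

```java
/** Sketch of the wall-number to cell-number rule for an n x m grid. */
class WallToCell {
    /**
     * Returns the cell number for a wall, following the two-case formula.
     * Assumption: the row is found as wall / (2m - 1) using integer division.
     */
    static int cellNumber(int wall, int n, int m) {
        int wallsPerRow = 2 * m - 1;
        int row = wall / wallsPerRow;            // one wall is "missing" per earlier row
        if (row != n - 1) {
            return (wall + row) / 2;
        }
        // Last row: no bottom walls, so also add the column offset plus one.
        return (wall + row + wall % wallsPerRow + 1) / 2;
    }

    public static void main(String[] args) {
        int n = 3, m = 4;
        int wallCount = 2 * n * m - n - m;       // 17 walls, numbered 0..16
        for (int w = 0; w < wallCount; w++) {
            System.out.println("wall " + w + " -> cell " + cellNumber(w, n, m));
        }
    }
}
```

For the 3 x 4 matrix this maps walls 0 through 6 to cells 0 through 3, walls 7 through 13 to cells 4 through 7, and the last-row walls 14, 15, and 16 to cells 8, 9, and 10.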

The class generates the maze by repeatedly selecting a random wall from the 2nm - n - m walls
(numbered from 0), calculating the cell numbers on either side of it, and using the union/find data
structure to find out whether those two cells are already connected. The loop continues until the first
and last cells are connected, that is, until they belong to the same set.
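
The generation loop might look roughly like the sketch below. To keep it short, it does not use the wall-number arithmetic from the report: walls are stored as explicit pairs of cells, and a plain array-based find replaces the DisjSet class. Those substitutions, the class name, and the seed parameter are assumptions for the example only.

```java
import java.util.Arrays;
import java.util.Random;

/** Sketch of the random wall-removal loop on an n x m grid. */
class MazeLoopSketch {
    static int find(int[] parent, int x) {
        while (parent[x] >= 0) x = parent[x];
        return x;
    }

    /** Returns an array where removed[w] is true for every wall knocked down. */
    static boolean[] generate(int n, int m, long seed) {
        int wallCount = 2 * n * m - n - m;
        int[][] walls = new int[wallCount][];
        int k = 0;
        for (int r = 0; r < n; r++) {
            for (int c = 0; c < m; c++) {
                int cell = r * m + c;
                if (c < m - 1) walls[k++] = new int[] {cell, cell + 1}; // right wall
                if (r < n - 1) walls[k++] = new int[] {cell, cell + m}; // bottom wall
            }
        }
        int[] parent = new int[n * m];
        Arrays.fill(parent, -1);                 // every cell starts in its own set
        boolean[] removed = new boolean[wallCount];
        Random rng = new Random(seed);
        // Stop as soon as the first and last cells belong to the same set.
        while (find(parent, 0) != find(parent, n * m - 1)) {
            int w = rng.nextInt(wallCount);
            int a = find(parent, walls[w][0]);
            int b = find(parent, walls[w][1]);
            if (a != b) {                        // only remove walls between separate sets
                parent[b] = a;
                removed[w] = true;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        boolean[] removed = generate(3, 4, 42L);
        int count = 0;
        for (boolean r : removed) if (r) count++;
        System.out.println(count + " of " + removed.length + " walls removed");
    }
}
```

Because a wall is removed only when it separates two different sets, the removed walls never form a cycle, so at most nm - 1 walls are ever removed.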

Kruskal’s MST Algorithm (Maze.java)


Kruskal’s algorithm uses the union/find structure to detect cycles in the graph read from the provided
file. The class has a constructor that accepts a string denoting the filename of the csv file that
contains the data representing the graph. For each edge, the first value is the source, the second
value is the destination, and the third value is the weight. The user must be careful when modifying
the csv file, because the program does not check for parsing errors and assumes that there are exactly
three values to read for each edge. The advantage of using a csv file is that it can be opened in
either Microsoft Excel or OpenOffice Calc and changed with ease. If the file is not a csv file or does
not exist, the class throws an exception, which is handled in the main entry point of the program. Once
the class reads the file, it parses it and automatically sorts the edges by weight. When you print the
MST, the class uses Kruskal’s algorithm to print out the MST of the graph.
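
A minimal sketch of this flow is given below. It parses "source,destination,weight" rows from a list of strings rather than a file, and it assumes integer vertex labels 0 to vertexCount - 1 and integer weights; none of these details, nor the class and method names, are confirmed by the lab code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Sketch of Kruskal's algorithm over edges given as CSV rows. */
class KruskalSketch {
    static final class Edge {
        final int src, dst, weight;
        Edge(int src, int dst, int weight) {
            this.src = src;
            this.dst = dst;
            this.weight = weight;
        }
    }

    /** Parses rows of the form "source,destination,weight" and sorts them by weight. */
    static List<Edge> parse(List<String> rows) {
        List<Edge> edges = new ArrayList<>();
        for (String row : rows) {
            String[] f = row.split(",");
            edges.add(new Edge(Integer.parseInt(f[0].trim()),
                               Integer.parseInt(f[1].trim()),
                               Integer.parseInt(f[2].trim())));
        }
        edges.sort(Comparator.comparingInt((Edge e) -> e.weight));
        return edges;
    }

    static int find(int[] parent, int x) {
        while (parent[x] >= 0) x = parent[x];
        return x;
    }

    /** Keeps an edge only if its endpoints are in different sets, so no cycle forms. */
    static List<Edge> mst(List<Edge> sortedEdges, int vertexCount) {
        int[] parent = new int[vertexCount];
        Arrays.fill(parent, -1);
        List<Edge> tree = new ArrayList<>();
        for (Edge e : sortedEdges) {
            int a = find(parent, e.src);
            int b = find(parent, e.dst);
            if (a != b) {
                parent[b] = a;
                tree.add(e);
            }
        }
        return tree;
    }

    static int totalWeight(List<Edge> edges) {
        int sum = 0;
        for (Edge e : edges) sum += e.weight;
        return sum;
    }

    public static void main(String[] args) {
        List<Edge> edges = parse(Arrays.asList("0,1,4", "1,2,1", "0,2,3", "2,3,2"));
        List<Edge> tree = mst(edges, 4);
        for (Edge e : tree) System.out.println(e.src + " - " + e.dst + " (" + e.weight + ")");
        System.out.println("total weight: " + totalWeight(tree));
    }
}
```

In the sample graph the edge 0-1 (weight 4) is rejected because vertices 0 and 1 are already in the same set when it is considered, which is exactly the cycle check that the union/find structure provides.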

The entire program is structured in multiple classes, each encapsulating one aspect of the lab.
Main.java provides an easy-to-follow menu that illustrates the usage of these classes and gives the
user the option to re-run the program.
Experimental Results
In the experimental process we measured the running times of all three implementations of the
union/find data structure. The results are shown below, in milliseconds:
n      Standard (ms)   Compression (ms)   Rank/Height (ms)
50                16                  4                  2
100              114                  5                  7
150              348                 22                 12
200              418                 18                 23
250              524                 36                 37
300              398                 36                 57
350            18886                 62                 47
400            14901                 99                 57
450            44705                224                121
500           147509                115                100
Table 5.1

Figure 5.2: Disjoint Set Running Times. Time in milliseconds (logarithmic axis, 1 to 1,000,000) against
input size (n x n, 50 to 500) for the Standard, Compression, and Rank/Height methods. The largest
Standard times are labeled 18 sec, 15 sec, 45 sec, and 2 min 27 sec.


The data structure was tested by setting both dimensions to the same value, so that the input grows in
a consistent way from one execution to the next. The results above show that both the compression and
the rank/height methods run substantially faster than the standard method: at n = 500 the standard
method took 147,509 ms, against 115 ms and 100 ms for the other two.
Conclusion
The union/find data structure, although a very simple one, is a helpful tool that allows us to solve
more abstract problems. Because it is such a useful and frequently used tool, one must choose the
proper implementation of the structure. It may be that in a certain application one implementation is
preferable to the other two. However, in this lab we have concluded that the union/find structure using
either the compression or the rank/height method is an ideal candidate for use.
