
Distance Join Processing

in a P2P World

Xiaoqi Zhang
x.zhang4@pgrad.unimelb.edu.au

Student ID: 261273

8/7/2008
Supervisor: Dr. Egemen Tanin

Abstract

P2P networks have expanded their use to the area of distributed database
systems. The P2P paradigm is well known for its advantages over the conventional
client-server paradigm: it provides excellent scalability in both computation and
bandwidth, and decentralization removes the single point of failure. Spatial data is
widely used today in P2P applications. By exploiting the features of the P2P paradigm,
efficient spatial data retrieval becomes possible. A large body of work exists
on spatial data retrieval over P2P networks, focusing on the classic query
operations of range query and nearest neighbor query. However, to the best of my
knowledge, no work has been done on spatial distance join operations in the
context of the P2P paradigm. This report gives a detailed review of the first distance join
algorithm for P2P networks along with its implementation. A comprehensive
experiment is carried out at the end to examine different aspects of the algorithm.

Keywords: P2P, client-server, spatial data, GIS

1. Introduction
Spatial data has become a critical ingredient in various applications and
databases, including location-based services [1], public transportation services,
scientific data management [2,3,4] and digital government [5]. Spatial data is not only
widely used in scientific and government organizations but also by the general
public, for example in in-car GPS systems and by real-estate agencies.
2D worlds and their representations are the most frequently used spatial data in
the spatial data processing domain. A 2D representation of a virtual or a real world in an
application contains many spatial objects which have positional values. One solution
to the bottleneck problem that the conventional client-server architecture
may bring into such applications is to distribute the spatial objects among machines in
a P2P network so that operations on the spatial data are carried out in a P2P
paradigm rather than a client-server paradigm. New P2P applications, e.g.,
job-employee seeker networks, buyer-seller networks and event/location finders for a city,
follow this solution. For example, in a buyer-seller P2P network, information about
sellers and products is distributed over the network. A potential buyer may supply
his/her location, an area in the map where sellers may be located and some
information about the product to a search system, and the system returns a list of
the sellers who are selling the related products. This type of operation can be done by
simply clicking on a 2D map to choose the location and area. Another similar type of
query yields the distance join result, which contains ordered pairs of spatial objects.
The order depends on the distance between the two spatial objects. Finding the
closest bar-restaurant pair is one example of such applications. One
straightforward approach to this type of operation is to simply forward
messages among the available nodes in the network to locate the desired data. Such an
approach is not feasible, because it makes a large number of peers that
do not have the desired data participate in the operation. In the unpublished paper [6],
Tanin et al. proposed an elegant way that exploits the features of P2P networks.
They used a data structure called the quadtree [7] to partition the underlying spatial data in
the 2D worlds on which distance join queries are carried out. The content of this report is
based on [6, 8]. It gives a detailed explanation of the proposed distance join algorithm,
and the results of a comprehensive experiment are presented at the end.
The rest of this report is organized as follows. Section 2 gives a brief review of
related work, focusing on sequential distance join algorithms and the distributed quadtree
index; section 3 discusses 2 other types of query on the distributed quadtree index;
section 4 explains the distance join algorithm and my implementation of it;
section 5 gives the details of the experiments and the results; section 6 gives the
conclusion and future work.

2. Related Work
2.1. Base Sequential Algorithm
Several works have addressed distance join algorithms. Hjaltason
and Samet examined various similarity search algorithms in metric spaces in [9], with
the main contribution being a priority queue-based ranking algorithm for
spatial data. This algorithm can find the results of a ranking query in an incremental
fashion. In [10], they proposed a distance join algorithm that works on hierarchical
spatial data structures. In that paper, the authors use a data structure called the R-tree to
store the spatial data in R-tree blocks. A priority queue-based approach is adopted
to drive the ranking algorithm. Pairs of spatial objects and R-tree
blocks are inserted into the priority queue. The distance between the items of each pair is
the criterion for ordering the queue. At each step of the algorithm, the pair at the head
of the priority queue, i.e., the pair with the smallest distance, is retrieved and processed.
If the pair is formed by two data objects, then the pair is reported as the next closest
pair. If one of the items in the dequeued pair is a node of the R-tree, then the R-tree
node in the pair is replaced by its children, i.e., objects or sub-nodes, to form
new pairs. This method works in an incremental fashion. The algorithm has a
drawback: pairs in the priority queue are processed sequentially. Thus, in a P2P
network, the algorithm works inefficiently due to the accumulated communication
delay. The algorithm examined in this report employs a similar priority queue-based
approach, but it is carefully designed so that it works efficiently in P2P networks by
utilizing the parallelism in the network.
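
The sequential scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of [10]: the `Node` class, the `point` helper, the rectangle tuples `(xmin, ymin, xmax, ymax)` and the function names are all hypothetical, and a generic tree of bounding rectangles stands in for the R-tree.

```python
import heapq
import itertools
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree entry: either a data object or a block that bounds its children."""
    rect: tuple                                   # (xmin, ymin, xmax, ymax)
    children: list = field(default_factory=list)  # empty for data objects
    is_object: bool = False


def point(x, y):
    """Hypothetical helper: a data object with a zero-area rectangle."""
    return Node((x, y, x, y), is_object=True)


def min_dist(a, b):
    """Smallest Euclidean distance between two axis-aligned rectangles."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return math.hypot(dx, dy)


def incremental_distance_join(root_a, root_b):
    """Yield (distance, object_a, object_b) in order of increasing distance."""
    tie = itertools.count()  # tie-breaker so the heap never compares Node objects
    heap = [(min_dist(root_a.rect, root_b.rect), next(tie), root_a, root_b)]
    while heap:
        dist, _, a, b = heapq.heappop(heap)  # pair with the smallest distance
        if a.is_object and b.is_object:
            yield dist, a, b                 # reported as the next closest pair
        elif not a.is_object:
            for child in a.children:         # replace block a by its children
                heapq.heappush(heap, (min_dist(child.rect, b.rect), next(tie), child, b))
        else:
            for child in b.children:         # replace block b by its children
                heapq.heappush(heap, (min_dist(a.rect, child.rect), next(tie), a, child))
```

Because every pair is popped and expanded one at a time, each expansion would cost one network round trip if the blocks lived on different peers, which is the accumulated delay the text refers to.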

2.2. Distributed Quadtree Index

2.2.1. Partition Spatial Data Using Quad-CIF Tree
The distance join algorithm examined in this report is based on the distributed
quadtree index proposed in [11]. In [11], a data structure called the quad-CIF
tree [12] is used for partitioning spatial data. A quad-CIF tree is a variation of the
quadtree [13] and was originally used for speeding up algorithms in the computer-aided
design of integrated circuits [12]. A quadtree is a tree data structure in which each node
has at most 4 sub nodes. The quadtree represents a 2D space in the
following way: at the beginning, the root node of the quadtree represents the entire 2D
space. The space is then divided into 4 identical sub regions, which corresponds to the
root node splitting itself into 4 sub nodes, each of them corresponding to
a sub region. For each of the sub regions, the same process then proceeds
recursively until a certain criterion is met. Figure 1 shows this process.

Figure 1. Quad tree demo

The quad-CIF tree extends the quadtree definition in that it specifies the criteria of
when to start the subdivision and when to stop the subdivision given the distribution of
spatial data in the 2D space. The start and stop rules are defined as follows:
For any spatial object within a certain 2D region, the region that
completely contains the spatial object splits itself into 4 identical sub regions; any
one of the 4 sub regions that completely contains the spatial object splits itself again,
until no sub region can contain the spatial object in its entirety. The spatial object is
then inserted into the node that corresponds to the smallest region that contains the
spatial object in its entirety. The process is depicted in figure 2.

Figure 2. Quad-CIF tree partitions spatial data
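
The insertion rule above can be illustrated with a short Python sketch. It is a minimal example under assumptions that are not in the original text: regions and objects are tuples `(xmin, ymin, xmax, ymax)`, the world is the unit square, the quadrant labels A (top-left), B (top-right), C (bottom-left) and D (bottom-right) follow the layout suggested by the figures with the y axis pointing up, and `max_depth` is a hypothetical cap that keeps a zero-area object from subdividing forever.

```python
def contains(region, obj):
    """True if rectangle obj lies entirely inside rectangle region."""
    return (region[0] <= obj[0] and region[1] <= obj[1]
            and obj[2] <= region[2] and obj[3] <= region[3])


def quadrants(region):
    """Split a region into its 4 identical sub regions, keyed by label."""
    xmin, ymin, xmax, ymax = region
    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
    return {
        "A": (xmin, cy, cx, ymax),   # top-left
        "B": (cx, cy, xmax, ymax),   # top-right
        "C": (xmin, ymin, cx, cy),   # bottom-left
        "D": (cx, ymin, xmax, cy),   # bottom-right
    }


def smallest_enclosing_region(obj, world=(0.0, 0.0, 1.0, 1.0), max_depth=16):
    """Return (path, region) of the smallest region that fully contains obj.

    The path is the sequence of quadrant labels from the root, e.g. "CD";
    the empty path denotes the root region itself.
    """
    path, region = "", world
    for _ in range(max_depth):
        for label, sub in quadrants(region).items():
            if contains(sub, obj):
                path, region = path + label, sub   # descend into that sub region
                break
        else:
            break   # no sub region contains the object entirely: stop here
    return path, region
```

An object that straddles the centre of the world therefore stays at the root, while a small object away from the dividing lines sinks several levels down.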
In [11], the authors introduce the concept of a “control point” for each
region and sub region, which is simply the centroid of the region. As shown in figure
2, each node in the quadtree maintains information about its corresponding control
point, denoted as $u$, which can be represented by the following formula:
$$d(u) = (\{d_1, d_2, d_3, d_4\}, P(x, y), list)$$
This comprises 3 pieces of information: first, the information about the 4 children
of the node, denoted as $d_1, d_2, d_3, d_4$, which are integers indicating how
many spatial objects the corresponding child has; second, $P(x, y)$, the 2D Cartesian
point $u$ in the 2D region; and third, $list$, which contains all the spatial objects that
are inserted into this quadtree node. This information is crucial for the search
algorithms (range query, nearest neighbor query, distance join query). It makes it
possible to decide whether to forward a query further down the quadtree. Details are
given in section 3.

2.2.2. Routing Desired Data Using Chord


The proposed P2P distance join algorithm employs the distributed quadtree
index together with the well-known DHT (distributed hash table) protocol Chord [14] as
the application-level routing protocol.
There are 2 major reasons for choosing Chord as the application-level
routing protocol:
Firstly, the hash function that Chord employs provides uniformly random
key-location mappings, so that keys are nearly uniformly distributed
among the peers in the P2P network. In other words, no peer is allocated
significantly more keys than others. This is good for load balancing, because no peer in
the network is overloaded by having more queries forwarded to it. Secondly,
Chord uses consistent hashing based on the hash function SHA-1 [15], which is well
suited to an unstable network such as a P2P network, where peers leave and join
frequently. Without consistent hashing, as peers join or leave the network, all the
existing hashed keys must be rehashed, with the result that most of the network
bandwidth is consumed by the messages used for rehashing.
As mentioned previously, every node in the quadtree stores a control point
which controls the underlying region. To distribute the quadtree among the
available machines in the P2P network, the string representation of the x and y
coordinates of the control point stored at each quadtree node is used as the key for the
SHA-1 hash function. It has the format “(x, y)”. In practice, no two control points are
hashed to the same location, because no two control points are exactly the
same. With the Chord protocol, a control point and the information about it ($d(u)$,
described previously) are hashed into the Chord virtual circle space. With the string
representation of a control point, one can easily find the desired data just by following
the Chord specification. Figure 3 shows one possible result of hashing the control point of
each quadtree node to the Chord virtual circle space. As depicted in the figure, peer1
has the control points C, O and CB along with the spatial data stored; peer345 has the
control points D and CD; etc.

Figure 3. Hashing result of quadtree in Chord circle space
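
The mapping from a control point to a peer can be sketched as follows. This is a simplified illustration, not the Chord implementation used in the project: the exact number formatting inside the “(x, y)” key, the function names and the peer identifiers are assumptions, and the lookup uses a sorted list of all peer identifiers, whereas Chord itself resolves the successor hop by hop through finger tables.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 160  # SHA-1 produces 160-bit digests, so identifiers lie in [0, 2**160)


def chord_id(key: str) -> int:
    """Map a string key onto the Chord identifier circle using SHA-1."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


def control_point_key(x: float, y: float) -> str:
    """String representation of a control point in the "(x, y)" format."""
    return f"({x}, {y})"


def responsible_peer(sorted_peer_ids, key_id):
    """Return the successor of key_id: the first peer identifier that is
    greater than or equal to key_id, wrapping around the circle."""
    i = bisect_left(sorted_peer_ids, key_id)
    return sorted_peer_ids[i % len(sorted_peer_ids)]


# Usage: three hypothetical peers and the control point of the root region
# of a unit-square world.
peers = sorted(chord_id(name) for name in ("peer1", "peer345", "peer1567"))
owner = responsible_peer(peers, chord_id(control_point_key(0.5, 0.5)))
```

Because only the key string is needed to compute the identifier, any peer can locate the data of a control point without knowing where the rest of the quadtree is stored.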
When partitioning the spatial data, smaller objects tend to be inserted at
deeper levels of the quadtree, so a query may have to be passed down
many levels of the quadtree before a spatial object is found. As a consequence,
more messages are needed to find the smaller spatial objects, which
causes longer communication delay. A variable Fmax is proposed to specify the
maximum level in the quadtree at which a spatial object can be inserted.
Fmax prevents the quadtree generated by the partitioning process from becoming too deep,
which would result in long traversals of the quadtree when processing queries.
Note that all queries start processing from the peer that holds the
information about the root quadtree node, which makes that peer a single point of
failure. A second variable, similar to Fmax, namely Fmin, is therefore defined. Fmin
specifies the minimum level in the quadtree at which a spatial object can be inserted.
When spatial objects are inserted into the quadtree, they are inserted at the
Fmin level or deeper. When no Fmin-level node can contain the spatial object in
its entirety, the spatial object is inserted into those Fmin-level nodes whose
controlled regions intersect the spatial object. In this way, every query
starts processing from the nodes at the Fmin level of the quadtree rather than from a
single root node.
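
Finding the Fmin-level nodes whose regions intersect a given rectangle is the step that both insertion and query initiation rely on. The following Python sketch shows one way to compute it; the unit-square world, the quadrant labels A to D (top-left, top-right, bottom-left, bottom-right) and the function names are assumptions made for this example.

```python
def intersects(a, b):
    """True if two axis-aligned rectangles (xmin, ymin, xmax, ymax) overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def quadrants(region):
    """Split a region into its 4 identical sub regions, keyed by label."""
    xmin, ymin, xmax, ymax = region
    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
    return {
        "A": (xmin, cy, cx, ymax),   # top-left
        "B": (cx, cy, xmax, ymax),   # top-right
        "C": (xmin, ymin, cx, cy),   # bottom-left
        "D": (cx, ymin, xmax, cy),   # bottom-right
    }


def cells_at_level(rect, level, region=(0.0, 0.0, 1.0, 1.0), path=""):
    """Return the paths of all nodes at the given level whose region intersects rect.

    Level 0 is the root (empty path); level 1 has the paths "A" to "D", and so on.
    """
    if not intersects(region, rect):
        return []
    if level == 0:
        return [path]
    found = []
    for label, sub in quadrants(region).items():
        found.extend(cells_at_level(rect, level - 1, sub, path + label))
    return found
```

With Fmin=1, a rectangle that straddles the centre of the world maps to all four level 1 nodes, so a query or an insertion for it starts at four different peers instead of at the single peer that owns the root.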

3. Algorithms for Basic Spatial Query


3.1. Range Query
3.1.1. High Level Description
Range query, nearest neighbor query and distance join query are all based on the
distributed quadtree index [11]. Figure 4 shows the pseudo code of the range query. In
figure 4, procedure D(u) returns a reference to control point u; C(u, i) returns the ith
child control point of control point u; R( ) returns the range that the specified
control point controls. A range query is initiated from one peer in the P2P network by
calling the InitiateRangeQuery procedure with a parameter Q, the 2D rectangle
within which one wants to check whether any spatial objects are located.

    InitiateRangeQuery(query Q)
    {
        control point list G = {}
        Subdivide(Q, root, G)
        for each u in G do
            Delegate(u) -> DoRangeQuery(Q, u)
    }

    DoRangeQuery(query Q, control point u)
    {
        intersect objects in D(u).list with Q
        send results
        for i = 1 to 4 do
            if (Ints(R(C(u, i)), Q) is not empty) and (D(u).di > 0) then
                Delegate(C(u, i)) -> DoRangeQuery(Q, C(u, i))
    }

Figure 4. Algorithm for range query

Firstly, procedure Subdivide is called to get the Fmin-level control points
whose controlled ranges intersect the query Q. Then, for each such control
point, the range query is forwarded to the peer that possesses the control point by
following the Chord protocol (denoted as Delegate(u) -> DoRangeQuery(Q, u) in figure
4). Upon arrival, a peer that gets the forwarded range query returns any spatial objects
that intersect the query range Q and then forwards the query Q to each child of the
queried control point that has spatial objects and whose controlled
range intersects the query. The range query process is shown in figure 5 with
Fmin=0. Peer1567 initiates the range query. The translucent rectangle (denoted as “query
Q” in the figure) is the query rectangle. In a P2P network with a distributed quadtree
index, every query starts processing from the Fmin level of the quadtree. In this case,
Fmin=0, so the query starts from the root node and is passed down the quadtree. Initially,
the result of Subdivide contains only control point O, which controls the entire region.
Peer1567 then passes the query to the peer in the network that has the data about
control point O. This step is depicted as the curve marked 1 in figure 5. With the
help of Chord, the query is routed to peer1, which has the information about control
point O. When the query arrives at peer1, peer1 first examines whether it has any
spatial data (in this simplified example, rectangles) that intersects the query
rectangle; then it checks whether there are any children of node O whose controlled
range intersects the query rectangle Q and which have spatial data. After this
examination, peer1 finds that C, a child of O, meets these requirements. Peer1 then
forwards the query to the peer that has the information about control point C. With
Chord, we find that this peer is again peer1. This step is depicted by the curve marked 2.
Peer1 repeats the same examination and finds that sub region CD intersects the query Q
and has spatial data in it. Peer1 then forwards the query to the peer that has the
information about control point CD, namely peer345. This routing step is depicted by
curve 3. When the query arrives at peer345, peer345 finds that it has the spatial object
rectangle 1 and that no sub regions have spatial objects. After peer345 sends the result
back, the range query stops. As described, the query starts at the root node and is passed
down the quadtree in the order O->C->CD.

Figure 5. Chord, quad tree and spatial data
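
The walk-through above can be reproduced with a small in-memory simulation. The sketch below is an assumption-laden illustration rather than the project code: a Python dictionary keyed by quadrant path stands in for the Chord lookup performed by Delegate, the `ControlPointData` class and all coordinates are invented for the example, and Fmin=0 so that the query starts at the root control point O.

```python
from dataclasses import dataclass, field


@dataclass
class ControlPointData:
    """The information kept for one control point: region, object list, child counts."""
    region: tuple                                       # (xmin, ymin, xmax, ymax)
    objects: dict = field(default_factory=dict)         # object name -> rectangle
    child_counts: dict = field(default_factory=dict)    # child label -> object count


def intersects(a, b):
    """True if two axis-aligned rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def do_range_query(index, query, path, results, visited):
    """Report matching objects at this control point, then descend into every
    child that has spatial objects and whose region intersects the query."""
    data = index[path]               # stands in for Delegate(u), a Chord lookup
    visited.append(path or "O")      # the empty path is the root control point O
    results.extend(name for name, rect in data.objects.items()
                   if intersects(rect, query))
    for label, count in data.child_counts.items():
        child = path + label
        if count > 0 and intersects(index[child].region, query):
            do_range_query(index, query, child, results, visited)


# Hypothetical data: only sub region CD holds an object (rectangle 1).
index = {
    "": ControlPointData((0.0, 0.0, 1.0, 1.0),
                         child_counts={"A": 0, "B": 0, "C": 1, "D": 0}),
    "C": ControlPointData((0.0, 0.0, 0.5, 0.5),
                          child_counts={"A": 0, "B": 0, "C": 0, "D": 1}),
    "CD": ControlPointData((0.25, 0.0, 0.5, 0.25),
                           objects={"rectangle 1": (0.3, 0.05, 0.4, 0.15)}),
}

results, visited = [], []
do_range_query(index, (0.28, 0.0, 0.45, 0.2), "", results, visited)
```

The child counts are what allow a peer to skip empty sub regions without sending any message to them.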

3.1.2. Implementation
For the implementation part, I use tables to show the features that I implemented;
the “Extra” column lists special cases and key points that need attention.
Table 1 shows the implementation details.

Item             Implemented               Extra
---------------  ------------------------  ------------------------------------------
Routing (Chord)  Basic data structures     This project does not deal with the issues
                 find_predecessor          that arise when nodes join or leave the
                 find_successor            Chord network; only routing is dealt with.
                                           The caching mechanism in Chord is NOT
                                           implemented.
Indexing         Basic data structures     Quadtree, control point, quadtree node,
                                           rectangle, Fmin, Fmax, etc.
Algorithm        Basic data structures     Implementation strictly follows the
                 InitiateRangeQuery()      protocol defined in the original paper [8].
                 Subdivide(Q, root, G)
                 Delegate(u)
                 DoRangeQuery(Q, u)

Table 1. Implementation details for range query

3.2. Nearest Neighbor Query


3.2.1. High Level Description
Hjaltason and Samet [9] gave a comprehensive analysis of various similarity search
algorithms in metric spaces. Their main contribution was a priority
queue-based ranking algorithm that can find the results of a ranking query in an
incremental fashion. Ranking is a more general form of NN query in which all the
spatial objects are eventually retrieved in increasing order of their distance
from a query point. In the first iteration of the algorithm, the root node of
the data structure is inserted into the priority queue. The priority is measured
by the distance between the data structure and the query point. In the next iteration
of the algorithm, all children of the root are in turn added to the priority
queue. In this fashion, at each iteration of the algorithm, the element
with the smallest distance is removed and visited, and its children are inserted into
the queue. Eventually, there will be an object at the head of the queue, which is
the object with the shortest distance to the query point. Note that their algorithm
works in an incremental fashion. Elements in the priority queue are
contacted sequentially, which is clearly not suitable for the P2P paradigm, where the
power of parallelism must be fully exploited.

    InitiateNNQuery(query q)
    {
        priority queue pqueue = GetSortedControlPoints(q, fmin)
        control point c = FindControlPoint(q, fmin)
        WCDist = MaxDist(q, c)
        SendMessagesWithin(WCDist)
    }

    DoNNQuery(control point u)
    {
        msg = CreateReplyMessage()
        msg.Put(D(u).list)
        for i = 1 to 4 do
            if (D(u).di > 0) then
                msg.Put(C(u, i))
        SendMessageBack(msg)
    }

    Synchronized ReceiveNNMessage(message msg)
    {
        for each object X in msg.list do
            pqueue.Add(X)
        for each control point u in msg do
            pqueue.Add(u)
        pqueue.Remove(SenderOf(msg))
        WCDist = UpdateWCDist()
        SendMessagesWithin(WCDist)
    }

Figure 6. Algorithm for nearest neighbor query

Tanin et al. proposed an elegant way of performing nearest neighbor queries in
[8]. Their algorithm is based on the priority queue-based approach. Figure 6
shows the pseudo code of the nearest
neighbor query. The peer that initiates the nearest neighbor query maintains the
priority queue. At the beginning, instead of inserting just the root node into the priority
queue, all the control points at level Fmin are inserted into the priority queue. There is
a new variable called WCDist, which is the worst case distance from the query point
to the controlled range of a control point. WCDist is used as the criterion to
decide which peers are contacted in parallel during one iteration of the algorithm.
This is the most remarkable difference between this algorithm and the algorithm
proposed by Hjaltason and Samet in [10]. During each iteration, WCDist is
updated as follows: let d be the distance between the first spatial object (if any) in the
priority queue and the query point, and let D be the maximum distance between the
query point and the top element (which cannot be a spatial object, because a spatial
object at the top is removed as soon as it is found). Then WCDist = Min(d, D). Then
all control points in the priority queue whose controlled ranges lie at a distance from
the query point less than or equal to WCDist are contacted in parallel.

Figure 7. Process of nearest neighbor query
Priority queue statuses shown in the figure (head first):
status 1: C, A, D, B
status 2: CD, A, D, B
status 3: CD, D, B, rect0
status 4: rect1, D, B, rect0
status 5: D, B, rect0
status 6: rect0

The entire process is depicted in figure 7 with Fmin=1. Peer345 initiates the
nearest neighbor query by calling InitiateNNQuery. GetSortedControlPoints
returns a priority queue that contains the level 1 control points, namely A, B, C and D.
The status of the priority queue is denoted as “priority queue status 1” in the figure.
The first WCDist and the range it covers are denoted by the quadrant marked as
“Wcdis1” in the figure. SendMessagesWithin therefore forwards the query in
parallel to the peers that possess control points C, A and D. As shown in
the figure, peer345, peer1 and peer m get this message. Then the DoNNQuery procedure
is called at each of them. Each peer creates a reply message, puts into it any spatial
objects it has along with any child control points that contain spatial objects,
and sends it back to the query initiating peer, in this case peer345. Assume the reply
message corresponding to control point C arrives at peer345 first (the arrival order
may vary due to message delay; however, this does not affect the correctness of the
algorithm). ReceiveNNMessage is called at peer345. All the control
points and spatial objects in the reply are inserted into the priority queue, and control
point C is deleted from the priority queue because its reply message has been handled.
The resulting status of the priority queue is denoted as “priority queue status 2” in
figure 7. Then UpdateWCDist is called to update WCDist. The updated WCDist is shown as
the smaller quadrant in figure 7 denoted as “Wcdis2”. This time the SendMessagesWithin
procedure sends the query only to the peer that has control point CD (because
control points A and D have been contacted previously), which is peer345.
Assume that, before peer345 returns its reply message, the reply message for control
point A arrives at the query initiating peer, peer345. According to the
algorithm, its spatial objects and control points are inserted into the priority queue.
“Priority queue status 3” in the figure shows the status of the priority queue after the
insertion. Note that control points D, B and CD are closer to the query point
than rectangle 0; thus, rectangle 0 is at the end of the priority queue. Now
peer345 sends the reply message for control point CD, containing the spatial object
rectangle 1, back to the query initiating peer. After this iteration, the status of the
priority queue is shown as “priority queue status 4”. Now a spatial object is at the head
of the queue, so it is the nearest spatial object with respect to query point q. The
algorithm can now stop or proceed as needed. Because neither control points
B and D nor their children possess any spatial objects, B and D are simply deleted
when the reply messages corresponding to them are returned. The nearest neighbor
query stops automatically when the priority queue is empty.
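
The WCDist rule can be made concrete with a short Python sketch. It follows the definition given above (WCDist = Min(d, D)) but is otherwise an assumption: queue entries are modelled as tuples `(kind, name, rectangle)` with kind `"cp"` for a control point and `"obj"` for a spatial object, the queue is assumed to be non-empty, objects are ordered by their minimum distance to the query point, and all names and coordinates are invented for the example.

```python
import math


def min_dist(q, rect):
    """Smallest distance from point q = (x, y) to rectangle (xmin, ymin, xmax, ymax)."""
    dx = max(rect[0] - q[0], 0.0, q[0] - rect[2])
    dy = max(rect[1] - q[1], 0.0, q[1] - rect[3])
    return math.hypot(dx, dy)


def max_dist(q, rect):
    """Largest distance from point q to any point of the rectangle (a corner)."""
    dx = max(abs(q[0] - rect[0]), abs(q[0] - rect[2]))
    dy = max(abs(q[1] - rect[1]), abs(q[1] - rect[3]))
    return math.hypot(dx, dy)


def update_wcdist(q, entries):
    """WCDist = Min(d, D) for a non-empty list of (kind, name, rect) entries."""
    ordered = sorted(entries, key=lambda e: min_dist(q, e[2]))
    head_kind, _, head_rect = ordered[0]
    # D: worst case distance to the head element when it is a control point.
    big_d = max_dist(q, head_rect) if head_kind == "cp" else math.inf
    # d: distance to the closest spatial object already in the queue, if any.
    small_d = min((min_dist(q, r) for k, _, r in ordered if k == "obj"),
                  default=math.inf)
    return min(small_d, big_d)


def control_points_within(q, entries, wcdist):
    """Names of the control points that would be contacted in parallel."""
    return [n for k, n, r in entries if k == "cp" and min_dist(q, r) <= wcdist]
```

Any control point farther away than WCDist cannot hold an object that beats what is already guaranteed, so contacting only the control points within WCDist loses no candidates while still allowing several peers to be queried at once.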

3.2.2. Implementation
Table 2 shows the implementation details of algorithm for nearest neighbor
query.

Item             Implemented                                  Extra
---------------  -------------------------------------------  ---------------------------------------
Routing (Chord)  Basic data structures                        This project does not deal with the
                 find_predecessor                             issues that arise when nodes join or
                 find_successor                               leave the Chord network; only routing
                                                              is dealt with. The caching mechanism
                                                              in Chord is NOT implemented.
Indexing         Basic data structures                        Quadtree, control point, quadtree node,
                                                              rectangle, Fmin, Fmax, etc.
Algorithm        Basic data structures                        Implementation strictly follows the
                 InitiateNNQuery(query q)                     protocol defined in the original paper
                 GetSortedControlPoints(q, fmin)              [8]. Data structures include priority
                 FindControlPoint(q, fmin)                    queue, two types of queue elements,
                 SendMessagesWithin(WCDist)                   etc.
                 DoNNQuery(control point u)
                 CreateReplyMessage
                 SendMessageBack(msg)
                 Synchronized ReceiveNNMessage(message msg)
                 UpdateWCDist()

Table 2. Implementation details for nearest neighbor query

4. Distance Join Algorithm for P2P Networks


4.1. High Level Description
The distance join algorithm works on two sets of spatial data. The goal of the
algorithm is to find the closest pair of spatial objects from the two spatial data sets.
This type of search has great potential in real life. Imagine that, at a weekend, one
wants to go out for dinner and then watch a great movie. The first thing that comes to
mind is to find a restaurant with a cinema nearby. The shorter the distance between
the two, the better (no one wants to drive a long way to watch a movie after having
dinner). Finding the closest cinema-restaurant pair is one possible
application of the distance join algorithm.
One straightforward approach is to retrieve all the spatial objects in data
set 1 and data set 2, compute the Cartesian product of the two sets, and
order the result by increasing distance. The first pair in the
ordered result is the closest pair. This is clearly not suitable for a large P2P
network with an extremely large number of spatial objects distributed among
the machines in the network. Several works have addressed
distance join algorithms [9, 10]. However, the algorithms proposed
only work in a centralized environment and proceed
sequentially. To fully exploit the advantages of P2P networks, extra
work has to be done.

    JoinInit(QuadTreeNode root1, QuadTreeNode root2)
    {
        PQueue = new PriorityQueue()
        MessageCacheList = new List()
        controlpoint1 = GetRootControlPoint(root1)
        controlpoint2 = GetRootControlPoint(root2)
        SendMessageTo(controlpoint1, id)
        SendMessageTo(controlpoint2, id)
    }

    ProcessReply(ControlPoint u, id)
    {
        msg = CreateReplyMessage(id)
        msg.Put(D(u).list)
        for i = 1 to 4 do
            if (D(u).di > 0) then msg.Put(C(u, i))
        SendMessageBack(msg)
    }

    Synchronized RecvMessage(Message msg)
    {
        if MessageCacheList.contains(msg.id) then
            doCombine(msg, MessageCacheList.get(msg.id))
            PQueue.deque(msg, MessageCacheList.get(msg.id))
        else
            MessageCacheList.add(msg)
            return
        for each new pair P generated from doCombine do
            PQueue.add(P)
        WCDist = UpdateWCDist()
        for each element pair E in PQueue do in parallel
        {
            if E.Dist <= WCDist
            {
                PQueue.GetElementPair(E)
                Item1 = E.item1
                Item2 = E.item2
                if Item1.type is ControlPoint then
                    SendMessageTo(Item1, id)
                if Item2.type is ControlPoint then
                    SendMessageTo(Item2, id)
            }
        }
    }

Figure 8. Algorithm for distance join query

Chord, the distributed quadtree index and the priority queue-based
approach together form the essence of the newly proposed distance join
algorithm for P2P networks. Similar to the proposal in [10], the query
initiating peer maintains the priority queue and acts as the query processing
front. Two pieces of information are crucial for forwarding a distance join
query at the query initiating peer. One is the information about how the quadtree
partitions the underlying spatial data. The other is the information about the 4
children of a control point. The former
is implicitly known by every peer in the P2P network, thus no communication is
required. The latter is obtained automatically after the quad-CIF tree has been distributed
among the machines in the P2P network (as mentioned in section 2.2.1, each control
point contains information in the form $d(u) = (\{d_1, d_2, d_3, d_4\}, P(x, y), list)$).
Therefore, it is easy for a query initiating peer to forward the distance join query from
the root node down the quadtree. Figure 8 is the pseudo code for the P2P distance join
algorithm. Initially, there is only one pair in the priority queue, namely the root control
points of the two quadtrees. As the algorithm proceeds, pairs of control points and spatial
objects are inserted into the priority queue. Thus, four types of queue element exist:
(spatial object, spatial object), (spatial object, control point), (control point, spatial
object) and (control point, control point). The processing of a pair at the query
initiating peer must be strictly synchronized in the sense that messages that are sent as a
pair must be processed together. As the algorithm proceeds, pairs of messages are sent,
and the reply messages corresponding to a previously sent pair of messages must be handled
together. However, due to the uncertainty in communication delay, reply messages
may arrive at the query initiating peer at arbitrary times. Therefore, extra work has to be
done to handle reply messages pairwise. My solution is to give the messages that
are sent as a pair a unique ID and to cache a reply message whose paired reply message
has not yet been received. Whenever a reply message with the same ID as a
cached one is received, the two reply messages belong to one pair and
can be handled together. This strict synchronization of pairwise
message processing guarantees that the new pairs generated by doCombine do not
contain redundant pairs. As shown in the algorithm, pairs in the priority queue are
contacted in parallel rather than sequentially. The newly defined variable WCDist is
used as the criterion to determine which pairs are contacted. The procedure
UpdateWCDist updates WCDist in the following way: let D be the maximum
distance between the items of the pair that is at the head of the priority queue and is
not an object-object pair. Let d be the maximum distance between the spatial
objects of the first object-object pair (if any) found in the priority queue (this pair
cannot be at the head, because an object-object pair at the head is retrieved
immediately as the next closest pair). Then WCDist = Min(D, d). All pairs in the priority
queue whose distance between the two items is less than or equal to WCDist are then
contacted in parallel, which distinguishes this algorithm from the traditional sequential
algorithm.

Figure 9. Process of distributed distance join algorithm
Priority queue statuses shown in the figure, as (SETX, SETY) pairs from head to tail:
Status 1: (C, B), (A, B)
Status 2: (A, B), (CD, BD), (CD, BA)
Status 3: (rectX0, BA), (rectX0, BD), (CD, BD), (CD, BA)
Status 4: (rectX0, rectY0), (rectX0, BD), (CD, BD), (CD, BA)
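
The two mechanisms described above, pairing replies by ID and updating WCDist over pairs, can be sketched in Python as follows. This is a minimal illustration under assumptions, not the code of the project: the `ReplyPairer` class, the side labels `"X"` and `"Y"`, the representation of a queue element as `((kind, rect), (kind, rect))` with kind `"cp"` or `"obj"`, and the ordering of the queue by minimum distance are all choices made for this example.

```python
import math


class ReplyPairer:
    """Caches the first reply of a pair until its partner with the same ID arrives."""

    def __init__(self):
        self._pending = {}   # message id -> {side: payload}

    def receive(self, msg_id, side, payload):
        """Return (reply_x, reply_y) once both replies are present, else None."""
        waiting = self._pending.setdefault(msg_id, {})
        waiting[side] = payload
        if len(waiting) < 2:
            return None                  # partner not here yet: keep it cached
        del self._pending[msg_id]
        return waiting["X"], waiting["Y"]


def min_dist(a, b):
    """Smallest distance between two rectangles (xmin, ymin, xmax, ymax)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return math.hypot(dx, dy)


def max_dist(a, b):
    """Largest distance between any point of a and any point of b."""
    dx = max(abs(a[2] - b[0]), abs(b[2] - a[0]))
    dy = max(abs(a[3] - b[1]), abs(b[3] - a[1]))
    return math.hypot(dx, dy)


def update_wcdist(pairs):
    """WCDist = Min(D, d) for a non-empty list of queue elements."""
    def is_object_pair(p):
        return p[0][0] == "obj" and p[1][0] == "obj"

    ordered = sorted(pairs, key=lambda p: min_dist(p[0][1], p[1][1]))
    head = ordered[0]
    # D: maximum distance of the head pair when it is not an object-object pair.
    big_d = math.inf if is_object_pair(head) else max_dist(head[0][1], head[1][1])
    # d: maximum distance of the first object-object pair in the queue, if any.
    small_d = next((max_dist(p[0][1], p[1][1]) for p in ordered if is_object_pair(p)),
                   math.inf)
    return min(big_d, small_d)
```

Tagging each reply with the data set it came from keeps the pair in (X, Y) order regardless of which reply arrives first.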
Figure 9 shows a simple case that demonstrates the distance join algorithm. There
are 2 data sets, depicted in two different colors. Rectangles X0 and X1 belong to
data set X; rectangles Y0 and Y1 belong to data set Y. At the beginning, procedure
JoinInit is called at the query initiating peer. As shown in the pseudo code, the peers
that own the root control point of each data set are contacted first; in this case,
these are the two control points O of the two data sets. Two distance join
initialization messages are sent with the same unique ID (so that the messages can be
processed as a pair). Whenever a peer receives a distance join related message,
procedure ProcessReply is called: the peer puts any spatial objects, along with any
child control points that contain spatial objects, into a reply message and sends it
back to the query initiating peer. Procedure RecvMessage is called at the query
initiating peer upon receiving a reply message. Because the reply messages that
correspond to a pair of sent messages can be delayed arbitrarily, a message cache
temporarily stores the reply message that arrives early, so that the messages can be
processed as a pair (the unique ID is used to pair messages). Assume that the reply
message from the peer that owns control point O of data set X arrives first, and that
of data set Y arrives second. The algorithm then finds the paired reply messages and
calls procedure doCombine to generate new pairs from them. After processing the
messages, it deletes the processed element from the queue. One of the possible states
of the priority queue at this point is denoted as “Status 1” in figure 9 (the order
could equally be (A,B),(C,B), because the distance between control blocks A and B is
equal to that between C and B). Then the worst case distance WCDist is calculated; the
result is denoted in the figure as WCDist1, which is the maximum distance between
control blocks C and B. Next, the pairs in the priority queue whose distance between
the two items is less than or equal to WCDist1 are contacted. Thus the peer that holds
control point C in data set X and the peer that holds control point B in data set Y
are contacted, and the same is done for pair (A,B). This completes the first iteration
of the algorithm.
Note that the same control point in one data set may appear in more than one pair in
the priority queue and could therefore be contacted multiple times, which causes
communication overhead. To overcome this problem, the results of previously contacted
control points are stored locally at the query initiating peer, which eliminates the
unnecessary communication. In the next iteration, assume that the paired reply
messages for (C,B) arrive first (the algorithm also works correctly if the paired
reply messages for (A,B) arrive first). “Status 2” in figure 9 shows the content of
the priority queue after receiving the reply messages for (C,B), and “Status 3” shows
the content after receiving the reply messages for (A,B). Note that a new iteration
may begin while the queue is in “Status 2”, in which case the previously contacted
pair (A,B) is not contacted again. Assume that the new iteration begins after
“Status 3”. The corresponding updated WCDist is denoted as “WCDist2” in the figure,
which is the maximum distance between rectangle X0 and control block BA. Again, the
pairs in the priority queue that satisfy the worst case criterion are contacted; in
this case, all 4 pairs are contacted. For clarity and simplicity, we only look at pair
(rectX0, BA). When the reply message for control point BA has been received and
procedure doCombine has been called, the content of the queue is as denoted in the
figure by “Status 4”. As shown in the figure, an object-object pair appears at the
top of the queue; this is the closest pair between the two data sets. Once such a pair
is found, it is retrieved immediately, and the algorithm should allow the user to
decide whether to proceed with or stop the distance join.
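The pairing of reply messages through a message cache can be sketched as follows.
This is a minimal illustration, not the code of this project: the Reply and
ReplyPairer classes are hypothetical, the lock stands in for the "Synchronized"
property of RecvMessage, and do_combine is assumed to form the cross product of the
items in the two replies, which the text does not spell out.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Reply:
    """Reply message content (hypothetical shape)."""
    dataset: str      # "X" or "Y"
    items: List[str]  # spatial objects and child control points


def do_combine(a: Reply, b: Reply) -> List[Tuple[str, str]]:
    """Assumed behaviour: pair every item of data set X with every item of Y."""
    first, second = sorted((a, b), key=lambda r: r.dataset)
    return [(x, y) for x in first.items for y in second.items]


class ReplyPairer:
    """Caches the early reply until its partner with the same ID arrives."""

    def __init__(self, on_pair: Callable[[Reply, Reply], None]) -> None:
        self._cache: Dict[int, Reply] = {}
        self._on_pair = on_pair
        self._lock = threading.Lock()

    def recv_message(self, msg_id: int, reply: Reply) -> bool:
        """Return True when this reply completed a pair and was processed."""
        with self._lock:
            cached = self._cache.pop(msg_id, None)
            if cached is None:
                self._cache[msg_id] = reply  # first of the pair: wait
                return False
        self._on_pair(cached, reply)         # second of the pair: combine
        return True
```

Because each ID is combined exactly once, when its second reply arrives, no pair of
items is generated twice for the same pair of messages.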
The simple example above starts the query from the root control point of each data
set. The distributed quadtree index, however, allows spatial data to be inserted from
level Fmin of the quadtree rather than from the root level (inserting from the root
level corresponds to Fmin=0). A slight modification of the algorithm is therefore
needed so that a query can start from level Fmin rather than from the root level,
which avoids the communication overhead of forwarding the query from level 0 down to
level Fmin.
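As a rough illustration of what starting from level Fmin involves, the sketch below
enumerates the quadtree blocks at a given level of a square region. The function
names and the unit square are hypothetical, and seeding the priority queue with every
pair of level-Fmin blocks is only one possible reading of the modification; the
original paper may define it differently.

```python
from itertools import product
from typing import List, Tuple

# (x, y, side): lower-left corner and side length of a quadtree block.
Block = Tuple[float, float, float]


def blocks_at_level(level: int, side: float = 1.0) -> List[Block]:
    """All 4**level quadtree blocks at the given level of a square region."""
    cells = 2 ** level          # blocks per axis
    cell = side / cells
    return [(i * cell, j * cell, cell)
            for i in range(cells) for j in range(cells)]


def initial_pairs(fmin: int) -> List[Tuple[Block, Block]]:
    """Assumed seeding: pair every level-Fmin block of X with every one of Y."""
    blocks = blocks_at_level(fmin)
    return list(product(blocks, blocks))
```

Under this reading the number of starting pairs grows as 16**Fmin, so in practice
empty blocks would have to be skipped.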

4.2. Implementation
Table 3 shows the implementation details of the P2P distance join algorithm.
Columns: Item / Implemented / Extra

Item: Routing (Chord)
  Implemented: Basic data structures; find_predecessor; find_successor
  Extra: This project does not deal with the issues that arise when nodes join or
         leave the Chord network; only routing is dealt with. The caching mechanism
         in Chord is NOT implemented.

Item: Indexing
  Implemented: Basic data structures
  Extra: Quadtree, control point, quadtree node, rectangle, Fmin, Fmax, etc.

Item: Algorithm
  Implemented: Basic data structures;
               JoinInit(QuadTreeNode root1, QuadTreeNode root2);
               MessageCacheList;
               SendMessageTo(controlpoint,id);
               ProcessReply(ControlPoint u,id);
               CreateReplyMessage(id);
               SendMessageBack(msg);
               Synchronized RecvMessage(Message msg);
               doCombine(msg1,msg2);
               PQueue.deque(msg.id);
               UpdateWCDist()
  Extra: The implementation strictly follows the protocol defined in the original
         paper [6]. Data structures include the priority queue, four types of queue
         element, queue operations, etc.
         The priority queue only allows sequential access, but the implementation
         allows contacting multiple peers in parallel.
         The implementation only allows a distance join query to start from the root
         level rather than from level Fmin.
         The implementation allows caching the results of previously contacted
         control points.
Table 3. Implementation details of P2P distance join algorithm
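The routing part listed in Table 3 can be illustrated with the sketch below, which
follows the find_successor / find_predecessor pseudocode of the Chord paper [14] on a
static ring. It is not the code of this project: the Node class, build_ring and the
6-bit identifier space are hypothetical, finger tables are computed globally instead
of through a join protocol, and, as in the implementation described above, node joins,
node departures and caching are left out.

```python
from bisect import bisect_left
from typing import Dict, Iterable, List

M = 6  # identifier bits; a hypothetical small ring with 2**6 identifiers


def in_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps around zero (whole ring if a == b)


def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b]."""
    return x == b or in_open(x, a, b)


class Node:
    def __init__(self, node_id: int) -> None:
        self.id = node_id
        self.fingers: List["Node"] = []  # fingers[k] = successor(id + 2**k)

    @property
    def successor(self) -> "Node":
        return self.fingers[0]

    def closest_preceding_finger(self, key: int) -> "Node":
        for finger in reversed(self.fingers):
            if in_open(finger.id, self.id, key):
                return finger
        return self

    def find_predecessor(self, key: int) -> "Node":
        node = self
        while not in_half_open(key, node.id, node.successor.id):
            node = node.closest_preceding_finger(key)
        return node

    def find_successor(self, key: int) -> "Node":
        return self.find_predecessor(key).successor


def build_ring(ids: Iterable[int]) -> Dict[int, Node]:
    """Build a static ring with complete finger tables (no join protocol)."""
    ordered = sorted(set(ids))
    nodes = {i: Node(i) for i in ordered}
    for node in nodes.values():
        for k in range(M):
            start = (node.id + 2 ** k) % (2 ** M)
            owner = ordered[bisect_left(ordered, start) % len(ordered)]
            node.fingers.append(nodes[owner])
    return nodes
```

For example, on a ring with nodes 1, 8, 14, 21, 32, 38, 42, 48, 51 and 56, a lookup of
key 54 that starts at node 8 moves to node 42, then to node 51, and returns node 56.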

5. Experiments
5.1. Experimental Environment
[Figure 10: diagram with the labels Transit domain1, Transit domain2, Transit
domain3, transit node, stub domain and stub node.]

Figure 10. Example of transit-stub model

In the experiments, J-Sim (www.j-sim.org) is used as the simulation environment.
Because there are no random factors that could make the results of the same test case
differ between runs, I run each test case only once.
My experiments are based on several assumptions: 1. No packets are lost during
communication; 2. Query response time is caused mainly by message propagation delay;
3. The P2P network is stable, so that during the entire course of the experiments no
node leaves or joins the network and no node crashes. With these assumptions, I create
an ideal world in which to measure the performance of the algorithm in its ideal
state.
Before conducting the experiments, the network topology and the test data sets must
be prepared. For the network topology, I create a static topology for each test case,
similar to the Transit-Stub model [16] shown in figure 10, where the intermediate
nodes can be regarded as transit nodes and the nodes on the edge can be regarded as
stub nodes. In real life, transit domains can be thought of as metropolitan area
networks, with transit nodes playing the role of Internet service providers. Stub
domains resemble the networks within different organizations, companies, campuses,
etc. Table 4 gives the physical characteristics of the underlying network used in
J-Sim. All of the test parameters are chosen to reflect real world scenarios closely.
Some of them are statistics obtained from Rogers Communications Inc [17].

Parameter                                            Value   Unit
Network delay in local area network                  10      ms
Network delay between stub nodes                     40      ms
Network delay between stub node and transit node     200     ms
Network delay between transit nodes                  200     ms
Bandwidth in local area network                      54      Mbps
Bandwidth between stub nodes                         100     Mbps
Bandwidth between stub node and transit node         100     Mbps
Bandwidth between transit nodes                      1000    Mbps
Table 4. Physical parameters for underlying network

For the test data sets, obtaining real life data can be tricky. A way must therefore
be found to generate near real life test data sets, for example the distribution of
all the restaurants in the urban region of Melbourne and of all the seven-eleven
convenience stores in the urban region of Melbourne. Merely adopting the random
functions provided by an API yields only uniformly distributed data, which cannot
reflect the genuine performance of this algorithm in the real world. According to
Zipf's law [18], many types of data studied in the physical and social sciences can be
approximated with a Zipfian distribution. My test data sets are generated roughly
following the Zipfian distribution. A 2D region is divided into 8 square rings that
share one centroid (the innermost one is a square). A fixed number of spatial objects
is distributed in the following manner: each square ring contains roughly twice as
many spatial objects as the ring immediately outside it, and within a given square
ring a random function API is used to generate the spatial data. As a result, spatial
objects are densely distributed in the central area of the 2D region and sparsely
distributed in the outer region, which simulates the real life data distribution.
Figure 11 shows one example of a distribution of 400 spatial objects that follows the
Zipfian distribution.

Figure 11. Sample test data with 400 spatial objects
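The ring-based generation described above can be sketched as follows. This is an
illustration under assumptions, not the generator used for the experiments: the rings
are assumed to have equal width, objects are reduced to points, the rounding remainder
is assigned to the innermost ring, and all function names are hypothetical.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def ring_counts(total: int, rings: int = 8) -> List[int]:
    """Split `total` objects over the rings, innermost first, halving outward."""
    weights = [2 ** (rings - 1 - i) for i in range(rings)]  # 128, 64, ..., 1
    scale = sum(weights)
    counts = [total * w // scale for w in weights]
    counts[0] += total - sum(counts)  # rounding remainder goes to the centre
    return counts


def sample_in_ring(rng: random.Random, inner: float, outer: float,
                   centre: Point) -> Point:
    """Uniform point in the square ring between half-widths inner and outer."""
    while True:  # rejection sampling from the enclosing square
        dx = rng.uniform(-outer, outer)
        dy = rng.uniform(-outer, outer)
        if max(abs(dx), abs(dy)) >= inner:
            return (centre[0] + dx, centre[1] + dy)


def generate(total: int = 400, rings: int = 8, side: float = 1.0,
             seed: int = 0) -> List[Point]:
    """Generate `total` points that are dense in the centre, sparse outside."""
    rng = random.Random(seed)
    step = side / (2 * rings)  # assumption: all rings have the same width
    centre = (side / 2, side / 2)
    points: List[Point] = []
    for i, count in enumerate(ring_counts(total, rings)):
        for _ in range(count):
            points.append(sample_in_ring(rng, i * step, (i + 1) * step, centre))
    return points
```

For 400 objects this split places 203 objects in the innermost square and 100, 50, 25,
12, 6, 3 and 1 objects in the successive outer rings.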
Generally speaking, the experiments are conducted by changing the following
parameters: Fmin; the number of peers in the P2P network; the number of queries
initiated simultaneously; and the number of spatial objects in each data set. A query
is said to be finished when the top 10 closest pairs are found.
In addition, peers are allocated almost equally to stub nodes, and the number of
queries from each stub domain is roughly the same.

5.2. Results
5.2.1. Different Fmin:
The first experiment examines how Fmin affects the algorithm. There are 200 peers in
the network, which are uniformly distributed among the stub domains. Each data set
contains 200 spatial objects. The number of simultaneous client requests is set to 10
and Fmax is set to 9. The philosophy behind the variable Fmin is to avoid a single
point of failure. With Fmin, spatial objects are forced to be inserted at level Fmin
or deeper in the quadtree. Queries are therefore no longer processed from the root
node, and multiple peers in the network are contacted as soon as a query starts. One
effect of increasing Fmin is that bigger spatial objects are split into smaller
pieces, and these pieces fall deeper down the quadtree, which increases the height of
the quadtree and in turn the complexity of the algorithm. Another effect is that more
messages have to be sent before actual spatial data is retrieved, which causes
communication overhead.

[Figure 12: chart “Changing Fmin”; x-axis: Fmin (0 to 8); y-axis: average response
time (seconds). Data labels in order of increasing Fmin: 24.033, 24.677, 24.365,
24.200, 23.392, 24.959, 26.554, 29.006, 31.060.]

Figure 12. Average query response time as Fmin increases

As can be observed in figure 12, different values of Fmin do not affect the average
response time much: as Fmin increases, the average response time curve remains roughly
steady. However, as Fmin approaches its maximum, a slight increase is observed. This
is due to the longer query message propagation delay introduced when queries are
forwarded from the root level to level Fmin in the distributed quadtree, where the
spatial objects are actually located. For the first few values of Fmin, there is no
significant difference in average response time, for three reasons: 1. Finding the
first 10 closest pairs is quite different from finding all the pairs; 2. Fmin does not
affect the test data set significantly before it reaches a certain value, because the
test data set contains many small spatial objects; 3. Even though splitting spatial
objects into smaller pieces causes communication overhead (shown in figure 13), the
parallel communication property of the algorithm compensates for this overhead with
regard to average response time.
[Figure 13: chart; x-axis: Fmin (0 to 8); y-axis: average number of messages per
request (0 to 140000). Data labels in order of increasing Fmin: 5,975; 7,020; 7,782;
9,403; 13,550; 26,361; 46,691; 71,769; 120,302.]

Figure 13. Average number of messages for finishing one query as Fmin increases

Figure 13 shows the variation in the number of messages per query (each query finds
the first 10 closest pairs) as Fmin increases. As expected, the number of messages
increases when Fmin increases. For the first few cases, Fmin does not affect the
number of messages much. However, once it reaches 5, there is a relatively steep
increase, because the underlying 2D space is divided into many tiny squares and the
height of the distributed quadtree increases accordingly.

[Figure 14: chart “Standard Deviation for Fmin”; x-axis: Fmin (0 to 8); y-axis:
standard deviation in load (0 to 25).]

Figure 14. Standard deviation of number of messages for finishing one query as Fmin
increases

For different values of Fmin, figure 14 shows the load distribution in terms of the
standard deviation. As can be observed, the standard deviation drops gradually as Fmin
increases, which means that the load among the peers in the network tends to become
more balanced.
Figure 15 shows the actual load of the peers in the network. There are 15 slots on the
x-axis, each representing a range of the number of messages that a peer has received
while 10 queries are finished; the height of a bar gives the number of peers that fall
into that range. Each slot potentially has 9 bars indicating the load for the
different values of Fmin. For example, to see the load distribution for Fmin=0, one
looks at the first bar in every slot. As shown in the figure, around 80 peers in the
network receive 10 messages or fewer, around 7 peers receive more than 10 but at most
20 messages, and so on. A general trend can be seen: as Fmin increases, more and more
peers in the network handle more messages. When Fmin=0, 81 out of 200 peers handle
fewer than 10 messages and no peer handles more than 5120 messages. When Fmin reaches
8, only 14 peers in the network handle fewer than 10 messages, and 47 out of 200 peers
handle more than 5120 messages in total. The load increases along with Fmin; however,
it is distributed roughly uniformly across the network.
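The slot counting behind figure 15 can be sketched as follows. The function name is
hypothetical, and the slot boundaries are passed in by the caller because the text
states only some of them (at most 10, more than 10 up to 20, more than 5120); the
boundaries used for the figure are not reproduced here.

```python
from bisect import bisect_left
from typing import List, Sequence


def slot_counts(messages_per_peer: Sequence[int],
                upper_bounds: Sequence[int]) -> List[int]:
    """Count peers per slot; slot i holds peers with
    upper_bounds[i-1] < messages <= upper_bounds[i], and one extra final
    slot holds the peers above the last bound."""
    counts = [0] * (len(upper_bounds) + 1)
    for received in messages_per_peer:
        counts[bisect_left(upper_bounds, received)] += 1
    return counts
```

For instance, with the illustrative bounds [10, 20, 40], peers that received 0, 10,
11, 20, 21 and 5000 messages are counted as [2, 2, 1, 1].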

[Figure 15: chart “Load Distribution for Different Fmin (finish 10 queries)”; x-axis:
slots of number of messages; y-axis: number of peers (0 to 90); one bar series per
Fmin value, fmin=0 to fmin=8.]

Figure 15. Load distribution for finishing 10 queries with different Fmins

5.2.2. Distributed VS Sequential:


The most prominent advantage of the P2P distance join algorithm over the traditional
distance join algorithm is that it contacts the relevant peers in parallel rather than
sequentially, which enables it to exploit the parallelism of the P2P network. Figure
16 compares the experimental results of the parallel algorithm and the sequential
algorithm. As shown, the parallel algorithm gives a steady curve: its average response
time is not affected significantly by increasing Fmin. The sequential algorithm, in
contrast, fluctuates severely, because the elements in the priority queue are handled
one by one. In addition, different values of Fmin change how the spatial objects are
distributed when they are partitioned by the distributed quadtree, which makes the
average response time unpredictable. Unsurprisingly, the parallel algorithm performs
much better than the sequential one in terms of response time.
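The gap between the two curves can be reasoned about with a simple latency model. The
sketch below is an assumption-laden illustration, not a reproduction of the
experiment: it takes response time to be dominated by message round trips, as assumed
in section 5.1, ignores processing time and bandwidth, and uses hypothetical function
names and delays.

```python
from typing import Sequence


def sequential_time(batches: Sequence[Sequence[int]]) -> int:
    """Every contact waits for the previous one: round trips add up."""
    return sum(sum(batch) for batch in batches)


def parallel_time(batches: Sequence[Sequence[int]]) -> int:
    """Contacts of one iteration overlap: each iteration costs its slowest
    round trip, and iterations still follow one another."""
    return sum(max(batch) for batch in batches if batch)
```

With two iterations whose round trips are (400, 400) and (400, 80, 400, 400) ms, the
sequential model gives 2080 ms while the parallel model gives 800 ms.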
The next several experiments examine how well the P2P distance join algorithm scales
with respect to an increasing number of peers, number of simultaneous queries and
number of spatial objects.
[Figure 16: chart “Parallel One VS Sequential One”; x-axis: Fmin (0 to 8); y-axis:
average response time (seconds). The series labelled “Sequential Distance Join
Algorithm” carries the data labels 492.621, 446.853, 449.236, 425.328, 404.139,
277.483, 269.224, 196.662 and 171.358 (their assignment to the Fmin values is not
recoverable from the text). The other series carries the data labels 24.033, 24.677,
24.365, 24.200, 23.392, 24.959, 26.554, 29.006 and 31.060.]

Figure 16. Average response time per query for P2P distance join algorithm in
comparison to centralized sequential algorithm

5.2.3. Different Number of Peers:
The first scalability experiment examines how the algorithm scales with an increasing
number of peers in the network. Fmin is set to 2; Fmax is set to 9; there are 200
spatial objects in the region; the number of simultaneous queries is set to 10; and a
query counts as finished once the first 10 closest pairs are found. The result is
shown in figure 17.
As shown in the figure, the average response time remains roughly steady as the number
of peers increases. There is a slight increase because, with more peers in the
network, the 200 spatial objects are spread over more peers, and therefore more hops
in the Chord network are needed for a query to finish.
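The slow growth in hops can be estimated with the figure reported in the Chord paper
[14], where the average lookup path length is about half of log2(N) for N nodes. The
sketch below simply evaluates that approximation; it is not a measurement from these
experiments, and the function name is hypothetical.

```python
import math


def expected_chord_hops(num_peers: int) -> float:
    """Approximate average Chord lookup path length, 0.5 * log2(N)."""
    if num_peers < 1:
        raise ValueError("num_peers must be at least 1")
    return 0.5 * math.log2(num_peers)


def hops_table(peer_counts=(200, 400, 600, 800, 1000)):
    """Estimated hops per lookup for the network sizes used in figure 17."""
    return {n: round(expected_chord_hops(n), 2) for n in peer_counts}
```

Under this approximation the estimate grows only from about 3.8 hops for 200 peers to
about 5.0 hops for 1000 peers, which is consistent with a small increase in response
time.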

[Figure 17: chart “Changing Number of Peers”; x-axis: number of peers in the network
(200, 400, 600, 800, 1000); y-axis: average response time (seconds). Data labels
visible in the plot: 21.007, 22.796, 25.014, 25.538, 27.498 (their assignment to the
x-axis values is not recoverable from the text).]

Figure 17. Average response time per query as number of peers increases

5.2.4. Different Number of Simultaneous Queries:
The second scalability experiment examines how well the algorithm scales as the number
of simultaneous queries increases. Again, Fmin is set to 2; Fmax is set to 9; there
are 200 spatial objects in the 2D space; the number of peers in the network is set to
200; and a query counts as finished once the first 10 closest pairs are found. The
result is shown in figure 18. In the figure, there is a drop at the beginning. One
possible reason for this drop in average response time is that most of the queries are
forwarded to the same peers that previously forwarded the same messages. The rest of
the curve remains steady.

[Figure 18: chart “Changing Number of Queries”; x-axis: number of simultaneous queries
(5, 10, 20, 40, 80); y-axis: average response time. Data labels visible in the plot:
24.365, 24.703, 24.928, 24.966, 26.153 (their assignment to the x-axis values is not
recoverable from the text).]

Figure 18. Average response time per query as number of queries increases

5.2.5. Different Number of Spatial Objects


The last experiment examines how well the algorithm performs with an increasing number
of spatial objects. With a fixed number of peers in the network, as more and more
spatial objects are inserted, the number of spatial objects allocated to each single
peer must increase. This reduces the number of hops over which a query needs to be
forwarded in the Chord network to fetch the needed spatial objects before the first 10
closest pairs are returned. In this experiment, Fmin is set to 2; Fmax is set to 9;
the number of peers in the network is set to 200; the number of simultaneous queries
is set to 10; and a query counts as finished once the first 10 closest pairs are
found.

[Figure 19: chart “Changing Number of Spatial Objects (response time)”; x-axis: number
of spatial objects (200, 400, 600, 800, 1000); y-axis: average response time
(seconds). Data labels visible in the plot: 21.404, 21.813, 22.717, 23.555, 25.497
(their assignment to the x-axis values is not recoverable from the text).]

Figure 19. Average response time per query as number of objects increases
Figure 19 shows the result. As expected, the general trend is that the average
response time decreases as the number of spatial objects increases, apart from a
sudden increase when the number of objects is set to 600, which may be caused by the
randomness in the distribution of spatial objects among the machines in the P2P
network.
Although the average response time decreases as more and more spatial objects are
inserted into the network, the number of messages generated to finish one query
increases (shown in figure 20). The reason is intuitive. As more spatial objects are
inserted, more quadtree blocks (control points) need to be inserted into the network,
including both the quadtree blocks that contain spatial objects and those whose
children contain spatial objects. Therefore, either the distributed quadtree becomes
fuller or its height increases. In either case, more messages are needed to finish one
query.
[Figure 20: chart “Changing Number of Spatial Objects (messages/request)”; x-axis:
number of spatial objects (200, 400, 600, 800, 1000); y-axis: average number of
messages per request (0 to 50,000). Data labels visible in the plot, in ascending
order: 7,480; 16,471; 27,648; 32,512; 45,652.]

Figure 20. Average messages per query as number of objects increases

6. Conclusion and Future Work


The P2P paradigm is a clear trend in today’s network development. More and more people
are starting to use applications that employ P2P technology. However, complex queries
on spatial data over P2P networks can be difficult to achieve. The P2P distance join
algorithm examined in this report fully exploits the advantages of P2P networks. In
this project, I studied the unpublished P2P distance join algorithm in depth and
produced an implementation of it, as well as of 2 other algorithms, range query and
nearest neighbour query. Finally, several experiments were conducted to examine
different aspects of the P2P distance join algorithm. The results show that the
distance join algorithm works well in a 2D environment with respect to average
response time. The variable Fmin proposed in the original paper [8] is very important
to this algorithm. Finding an appropriate Fmin, such that a single point of failure is
unlikely and at the same time the number of messages generated to finish a single
query is not overwhelming, is not a trivial task. Nevertheless, Fmin and Fmax give a
lot of flexibility to the applications built on top of it.
The P2P distance join algorithm implemented for the experiments always starts a query
from the root control points of the 2 data sets, which causes communication overhead
from passing the query down from level 0 to level Fmin in the distributed quadtree.
This problem can be solved by allowing the query to start from level Fmin rather than
level 0. In real life applications, other query criteria can also be applied, such as
giving a query range within which to find the closest pairs, or allowing users to
specify the two types of data sets that are of interest to them.

References
[1]. Front Page of Business Link. Business Link Web Site. [Online]
http://www.businesslink.gov.uk.
[2]. Wilson, Jim. Front Page of National Aeronautics and Space Administration.
NASA Official Web Site. [Online] http://www.nasa.gov.
[3]. Front Page of National Institutes of Health. Official Web Site of National
Institutes of Health. [Online] http://www.nih.gov.
[4]. Front Page of National Geospatial Intelligence Agency. Official Web Site of
National Geospatial Intelligence Agency. [Online] http://www.nga.mil.
[5]. Front Page of National Institute of Justice. Official Web Site of National
Institute of Justice. [Online] http://www.ojp.usdoj.gov/nij.
[6]. Egemen Tanin and Deepa Nayar. An Efficient Distributed Distance Join
Algorithm for Peer-to-Peer Networks.
[7]. Raphael Finkel and J.L. Bentley. Quad Trees: A Data Structure for Retrieval on
Composite Keys. Acta Informatica 4 (1): 1-9.
[8]. E. Tanin, A. Harwood, H. Samet, D. Nayar, and S. Nutanong. Building and
querying a P2P virtual world, Geoinformatica, 2006, 10(1):91-116.
[9]. G.R. Hjaltason and H. Samet. Index-Driven Similarity Search in Metric Spaces,
ACM Trans. on Database Systems, Dec 2003, Vol. 28, No. 4, pp. 517-580.
[10]. G.R. Hjaltason and H. Samet. Incremental Distance Join Algorithms for Spatial
Databases, Proc. of the ACM SIGMOD Conference, Seattle, WA, 1998, pp.
237-248.
[11]. E. Tanin, A. Harwood and H. Samet. A distributed quadtree index for
peer-to-peer settings, in Proceedings of the IEEE International Conference on
Data Engineering, Tokyo, Japan, April 2005, pp. 254-255.
[12]. Gershon Kedem. The Quad-CIF Tree: A Data Structure for Hierarchical On-Line
Algorithms, University of Rochester, Rochester, New York 14627.
[13]. Raphael Finkel and J.L. Bentley. Quad Trees: A Data Structure for Retrieval on
Composite Keys, Acta Informatica 4(1): 1-9.
[14]. Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek and Hari
Balakrishnan. A scalable peer-to-peer lookup service for Internet applications,
in Proceedings of the ACM SIGCOMM 01, San Diego, CA, August 2001, pp.
149-160.
[15]. Secure Hash Standard, FIPS PUB 180, by US government standards agency
NIST (National Institute of Standards and Technology).
[16]. Zegura EW, Calvert KL and Donahoo MJ. A quantitative comparison of
graph-based models for Internet topology. IEEE/ACM Trans. on Networking,
1997, 5(6):770-783.
[17]. Looking Glass and Network Information. Rogers Communications Inc. [Online]
https://supernoc.rogerstelecom.net/ops/.
[18]. G.K. Zipf. Human Behavior and the Principle of Least-Effort,
Addison-Wesley, MA, 1965.
