SCHEDULING IN
CLOUD-BASED CYBERINFRASTRUCTURES
NBiS 2014, Salerno, Italy
Contents
Introduction
Related Work
Hybrid Algorithm for Workflow Scheduling
CyberWater: A Cyberinfrastructure for Water Quality Management
Testing Scenarios and Experimental Results
Conclusions and Future Work
INTRODUCTION
Computational power has risen to new heights,
but unfortunately so have the associated costs
A new interest has arisen: efficient power consumption
Cyberinfrastructure - a relatively new concept
A mix of technologies
Data acquisition platforms using real-time sensor networks, Big
Data, visualisation tools, HPC, Web services
Inter-disciplinary approach
Various sciences, engineering and social disciplines
Advantages of using cloud services in cyberinfrastructures:
scalability up and down
real-time resource provisioning
simplified deployment and management of resources and applications
better cost/performance ratio
INTRODUCTION (1)
In the cloud, it is mandatory to achieve a good
balance between cost and performance
CyberWater project
Monitors natural resources and water-related events
Offers a solution for water quality with respect to
pollution phenomena
HYBRID ALGORITHM FOR
WORKFLOW SCHEDULING
The application is represented by a DAG G(V, E, C, W):
V is the set of v nodes, and each node vi ∈ V represents an
application task;
W is the set of computation costs, where wi ∈ W is the
execution time of task vi;
E is the set of communication edges. The directed edge ei,j
joins nodes vi and vj;
node vi is called the parent node and node
vj is called the child node.
This implies that vj cannot start until vi finishes and sends its data to vj;
C is the set of communication costs, and the edge ei,j has a
communication cost ci,j ∈ C.
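The DAG definition above can be illustrated with a small data structure. This is only a minimal sketch: the class name TaskGraph, its method names, and the example costs are assumptions made for illustration and do not come from the presentation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class TaskGraph:
    """Illustrative container for G(V, E, C, W) (hypothetical name)."""
    # W: w[i] is the execution time of task v_i; the keys of w form V.
    w: Dict[int, float] = field(default_factory=dict)
    # C: c[(i, j)] is the communication cost on edge e_ij; the keys form E.
    c: Dict[Tuple[int, int], float] = field(default_factory=dict)

    def add_task(self, i: int, exec_time: float) -> None:
        self.w[i] = exec_time

    def add_edge(self, i: int, j: int, comm_cost: float) -> None:
        if i not in self.w or j not in self.w:
            raise KeyError("add both tasks before connecting them")
        self.c[(i, j)] = comm_cost

    def parents(self, j: int) -> List[int]:
        return sorted(i for (i, k) in self.c if k == j)

    def children(self, i: int) -> List[int]:
        return sorted(k for (h, k) in self.c if h == i)

    def can_start(self, j: int, finished: Set[int]) -> bool:
        # v_j cannot start until every parent has finished and sent its data.
        return all(p in finished for p in self.parents(j))


# Example with made-up costs: task 0 forks into 1 and 2, which join in 3.
g = TaskGraph()
for task, exec_time in [(0, 2.0), (1, 3.0), (2, 4.0), (3, 1.0)]:
    g.add_task(task, exec_time)
for (i, j), cost in [((0, 1), 1.0), ((0, 2), 5.0), ((1, 3), 1.0), ((2, 3), 2.0)]:
    g.add_edge(i, j, cost)
```

Storing E implicitly as the keys of C keeps the two sets consistent, since every edge ei,j carries exactly one cost ci,j.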
HYBRID ALGORITHM FOR
WORKFLOW SCHEDULING (2)
We consider a homogeneous computing environment model:
a set P of p identical processors connected in a fully
connected graph
Assumptions:
any processor can execute a task and communicate with
other processors at the same time;
once a processor has started executing a task, it continues
without interruption, and
on completing the execution, it immediately sends the output
data to all child tasks in parallel.
HYBRID ALGORITHM FOR
WORKFLOW SCHEDULING (3)
Communication cost ci,j for transferring data from task vi (scheduled on pm) to task vj:
$c_{i,j} = S + R \cdot \mu_{i,j}$
S is the cost of starting communication between processors (in seconds)
µi,j is the amount of data transmitted from task vi to task vj (in bytes)
R is the cost of communication per transferred byte (in seconds/byte).
Assumptions:
the startup cost S is negligible, and
the unit cost R is the same for any two processors,
so that the communication cost between any two tasks is a function of the amount of
transferred data only
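As a worked illustration of the cost model above, the following sketch computes the communication cost from the startup cost S, the per-byte cost R, and the data volume µi,j. The function name, parameter names, and numeric values are hypothetical; setting the startup cost to zero reproduces the simplifying assumption that the cost depends on the transferred data only.

```python
def communication_cost(data_bytes: float, rate_s_per_byte: float,
                       startup_s: float = 0.0) -> float:
    """Return S + R * mu for one edge, in seconds.

    data_bytes      -- mu_ij, amount of data sent from v_i to v_j (bytes)
    rate_s_per_byte -- R, cost per transferred byte (seconds/byte)
    startup_s       -- S, cost of starting the communication (seconds);
                       0.0 models the "negligible startup cost" assumption
    """
    if data_bytes < 0 or rate_s_per_byte < 0 or startup_s < 0:
        raise ValueError("costs and data sizes must be non-negative")
    return startup_s + rate_s_per_byte * data_bytes


# Made-up numbers: 1 MB over a link that costs 10 ns per byte.
simplified = communication_cost(1_000_000, 1e-8)               # about 0.01 s
with_startup = communication_cost(1_000_000, 1e-8, startup_s=0.002)
```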
We present a near-optimal scheduling approach that takes two objectives
into consideration:
minimizing the total execution time and
minimizing the number of resources (processors) used
HYBRID ALGORITHM FOR
WORKFLOW SCHEDULING (4)
HER algorithm description:
Phase I:
Group nodes in such a way that the communication cost
is as small as possible;
Nodes are placed in lists. Every list will run on a
different core;
Second, the lists are sorted in ascending order by the
total execution time of each node.
HYBRID ALGORITHM FOR
WORKFLOW SCHEDULING (5)
Phase II:
Node 0 is assigned priority 0;
Nodes (excluding node 0) with 2 or more children are
assigned priority 2 and are placed as early as possible in
their respective lists;
The total execution time is calculated for each list.
Phase III:
If all lists are balanced then jump to Phase V;
Nodes, grouped together with their parents, are split to
different lists;
All nodes (except final node) without children are placed at
the end of their respective lists.
HYBRID ALGORITHM FOR
WORKFLOW SCHEDULING (6)
Phase IV:
the lists are split and reorganized, taking load balancing
into account.
Phase V:
all tasks are allocated to their respective processors
since the nodes in the lists are not arranged in order of
execution, we need to go over each list multiple times
until all nodes are placed on their respective processors
and all tasks are executed
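The repeated passes described in Phase V can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: the function name allocate, the data layout, and the example values are hypothetical, and the sketch additionally assumes that communication between tasks on the same processor costs nothing.

```python
from typing import Dict, List, Tuple


def allocate(lists: List[List[int]],
             w: Dict[int, float],
             c: Dict[Tuple[int, int], float]) -> Dict[int, Tuple[int, float, float]]:
    """Place every task of every list; return {task: (processor, start, finish)}.

    lists[p] holds the tasks assigned to processor p, in arbitrary order.
    Assumption (not in the slides): same-processor communication is free.
    """
    parents: Dict[int, List[int]] = {task: [] for task in w}
    for (i, j) in c:
        parents[j].append(i)

    placed: Dict[int, Tuple[int, float, float]] = {}
    proc_free = [0.0] * len(lists)          # time at which each processor is idle
    pending = [list(tasks) for tasks in lists]

    while any(pending):                      # repeat passes until nothing is left
        progressed = False
        for p, queue in enumerate(pending):
            remaining = []
            for task in queue:
                if not all(par in placed for par in parents[task]):
                    remaining.append(task)   # a parent is missing: retry next pass
                    continue
                ready = 0.0
                for par in parents[task]:
                    par_proc, _, par_finish = placed[par]
                    delay = 0.0 if par_proc == p else c[(par, task)]
                    ready = max(ready, par_finish + delay)
                start = max(ready, proc_free[p])
                finish = start + w[task]
                placed[task] = (p, start, finish)
                proc_free[p] = finish
            if len(remaining) < len(queue):
                progressed = True
            pending[p] = remaining
        if not progressed:
            raise ValueError("cyclic dependency or parent missing from all lists")
    return placed


# Made-up fork-join example; processor 0's list is deliberately out of order.
w = {0: 2.0, 1: 3.0, 2: 4.0, 3: 1.0}
c = {(0, 1): 1.0, (0, 2): 5.0, (1, 3): 1.0, (2, 3): 2.0}
schedule = allocate([[3, 2, 0], [1]], w, c)
makespan = max(finish for (_, _, finish) in schedule.values())
```

In this example three passes are needed: task 0 is placed in the first pass, task 2 in the second, and task 3 only in the third, once both of its parents have finished.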
CYBERWATER: A CYBERINFRASTRUCTURE FOR
WATER QUALITY MANAGEMENT
CyberWater - Prototype Cyberinfrastructure-based System for
Decision-Making Support in Water Resources Management
Data level
Measured data, predicted data, modeled data, subscribers' data
Storage level
prediction module
propagation module
decision support module
alerts module
Visualization level
Customized web application
CYBERWATER: A CYBERINFRASTRUCTURE FOR
WATER QUALITY MANAGEMENT (1)
TESTING SCENARIOS AND
EXPERIMENTAL RESULTS (4)
Scenario 3:
DAG that can be used in service composition
"fork-join" structure
relatively low
the computational cost of