
Parallel and Distributed Computing

Introduction

The course covers two interrelated branches: parallel computing and distributed computing. Which aspects will be covered?
- Parallel / distributed architectures
- Parallel / distributed algorithms
- Parallel / distributed programming
- Parallel / distributed operating systems

Computer Science Department Technical University of Cluj-Napoca


Fall 2011

Content
Parallel computing
- Interconnection networks: static networks (metrics, topologies) and dynamic networks (buses, crossbars, multistage networks)
- Performance and scalability: metrics, scalability definition, Amdahl's law
- Parallel algorithm design
  - Parallelization process, case study
  - Data dependency
  - Decomposition techniques (recursive, data, exploratory, speculative)
  - Mapping techniques
    - Static (based on data partitioning, task partitioning, hierarchical)
    - Dynamic (centralized, distributed)
- Dense matrix algorithms: matrix-vector multiplication (1D and 2D partitioning, comparison of 1D to 2D), matrix-matrix multiplication (2D partitioning, Cannon's algorithm)
- Sorting algorithms
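Amdahl's law, listed above under performance and scalability, bounds the speedup achievable when a fraction of the work is inherently serial. A minimal sketch in Python (the formula is standard; the sample numbers are only illustrative):

```python
def amdahl_speedup(serial_fraction, p):
    """Upper bound on speedup with p processors when serial_fraction
    of the work cannot be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

# With 10% serial work, 16 processors give at most ~6.4x speedup,
# and no processor count can push the speedup past 1 / 0.1 = 10x.
print(amdahl_speedup(0.1, 16))
```

Note how quickly the serial fraction dominates: the lab exercises on scalability revisit this bound.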

Content
Distributed computing
- Time
  - Physical clock synchronization (Cristian, Berkeley, Network Time Protocol)
  - Logical clocks (scalar time, vector time, efficient implementation of vector clocks: Singhal-Kshemkalyani)
- Distributed mutual exclusion: problem definition, token ring, Suzuki-Kasami, central coordinator, Lamport, Ricart-Agrawala
- Causal order: problem definition, Birman-Schiper-Stephenson, Schiper-Eggli-Sandoz
- Snapshot: problem definition, Chandy-Lamport, Spezialetti-Kearns, Lai-Yang
- Leader election: problem definition
  - General networks: FloodMax, OptFloodMax
  - Synchronous / asynchronous ring: LeLann, Chang-Roberts, Hirschberg-Sinclair, Franklin, Peterson
  - Anonymous ring: Itai-Rodeh
- MapReduce - Hadoop
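As a first taste of the logical-clock topics listed above, a minimal sketch of a Lamport scalar clock in Python (the class and method names are my own; the rules are the standard ones: tick on a local event, and on receive take the maximum of the local and received timestamps plus one):

```python
class LamportClock:
    """Scalar logical clock (Lamport)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # local event: advance the clock
        self.time += 1
        return self.time

    def send(self):
        # a send is a local event; the returned value is the
        # timestamp attached to the outgoing message
        return self.tick()

    def receive(self, msg_time):
        # on receive: jump past both histories
        self.time = max(self.time, msg_time) + 1
        return self.time
```

Scalar time orders causally related events; the vector-time lectures show why the converse direction needs vector clocks.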

Goal
After completing this course, students will
- gain a good understanding of designing parallel algorithms
- gain a good understanding of distributed algorithms

Administrative issues
- Courses will be held on Thursday, 18:00-20:00, in room 365
- Lab activities will be held on Monday, 08:00-16:00, in 36
- Lecture notes will be made available on request
- Bibliography will be made available on request
- Sometimes it is useful to take notes in class!

Administrative issues
Grading policy
- Lab: 30%
- Exam: 70%
- Optional assignment: 10%

Literature
Parallel Computing
- (Grama) Introduction to Parallel Computing, A. Grama, A. Gupta, G. Karypis, V. Kumar, 2003
- (Culler) Parallel Computer Architecture: A Hardware/Software Approach, D.E. Culler, J.P. Singh, A. Gupta, Morgan Kaufmann Publishers, 1999
- Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, B. Wilkinson, Prentice Hall, 2004

Lab, Assignment
- All work will be done individually, unless otherwise stated
- Parallel Virtual Machine; GRID
- Assignment submission via email only; send to: Anca.Rarau@cs.utcluj.ro
- Late submissions are not accepted

Distributed Computing
- (Kshemkalyani) Distributed Computing: Principles, Algorithms, and Systems, A. D. Kshemkalyani, M. Singhal, Cambridge University Press, 2008
- (Coulouris) Distributed Systems: Concepts and Design, G. Coulouris, J. Dollimore, T. Kindberg, Addison-Wesley, Third Edition, 2001
- Distributed Algorithms, N.A. Lynch, Morgan Kaufmann Publishers, 1996

Definition of parallel systems


Almasi and Gottlieb (1989): "a collection of processing elements that communicate and cooperate to solve large problems fast"

Introduction to Parallel Computing

Parallel application category


- Applications in engineering and design
- Scientific applications
- Commercial applications
- Applications in computer science

Parallel applications
weather forecast, 3D plasma modeling, ocean circulation, viscous fluid dynamics, superconductor modeling, vision, chemical dynamics, ...

Why parallel systems?
- solve a problem faster
- solve a bigger problem in a reasonable time
- get a more accurate result in a reasonable time than when using a single processor

Levels & types of parallelism

Parallelism comes in various ways.
- Explicit parallelism (programmer gets involved): software parallelism
  - parallel platforms: PVM, MPI, OpenMP
  - decomposition techniques
  - mapping techniques
- Implicit parallelism (programmer does not get involved)
  - compilers / interpreters
  - hardware parallelism (exploits instruction-level parallelism): pipelining execution, superscalar execution, VLIW processors

Levels & types of parallelism


Critical components of parallel computing from the programmer's perspective:
- how to express parallel tasks?
- how to specify the interaction between parallel tasks?

Levels & types of parallelism


how to express parallel tasks?
- each program can be viewed as a parallel task
- each instruction can be viewed as a parallel task

For example, each iteration of this loop is independent of the others:

for (int i = 0; i < 1000; i++)
    c[i] = a[i] + b[i];
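Because no iteration of the loop above depends on another, its index range can be split among workers. A hedged sketch in Python (chunking scheme and names are my own; the lab itself uses PVM/MPI):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_vector_add(a, b, workers=4):
    n = len(a)
    c = [0] * n

    def add_chunk(lo, hi):
        # one independent task: no iteration touches another's c[i]
        for i in range(lo, hi):
            c[i] = a[i] + b[i]

    step = (n + workers - 1) // workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for lo in range(0, n, step):
            pool.submit(add_chunk, lo, min(lo + step, n))
    # leaving the "with" block waits for all chunks to finish
    return c
```

The decomposition (split the index range) and the mapping (hand chunks to workers) are exactly the two design steps listed in the course content.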

Levels & types of parallelism


how to specify the interaction between parallel tasks?
- access a shared data space (multiprocessors)
- exchange messages

Levels & types of parallelism


how to specify the interaction between parallel tasks? access a shared data space (multiprocessors)
- processors interact by modifying data stored in the shared address space
- memory can be local or global
  - UMA (uniform memory access): all processors need the same time to access any memory module (local or global)
  - NUMA (non-uniform memory access)
- easy to program
  - read-only interactions are simple
  - read/write operations need mutual exclusion for concurrent access
- caches need a cache-coherence mechanism
- primitives: put / get
- examples: POSIX, OpenMP

Levels & types of parallelism


how to specify the interaction between parallel tasks? exchange messages
- processors interact (send data, share work, and synchronize) by message passing
- each process has its own exclusive address space
- send / receive primitives specify the target address
- a mechanism assigns a unique identifier to each process: whoami, numprocs
- examples: MPI, PVM

Levels & types of parallelism


Level of parallelism
- Program / process level parallelism
- Thread level parallelism
- Instruction level parallelism
Lab class (PVM, Grid)

Types of parallelism
- Data parallelism
- Control (functional) parallelism

A message passing architecture with p nodes can be emulated on a shared-address-space architecture:
- partition the shared address space into p disjoint parts and assign one part to each processor
- send / receive by writing to / reading from the other processor's partition

A shared address space is costly to emulate on a message passing architecture:
- accessing another node's memory requires sending and receiving messages
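The first emulation can be sketched directly: give each of the p processes a disjoint slice of one shared buffer, and implement send/receive as a write/read on the receiver's slice (partition size and names here are illustrative, not from the course):

```python
P = 4        # number of emulated processes
SLOT = 8     # words reserved per partition (illustrative size)
shared = [None] * (P * SLOT)   # the shared address space

def send(dst, offset, value):
    # "sending" to dst is just writing into dst's partition
    shared[dst * SLOT + offset] = value

def receive(me, offset):
    # "receiving" is reading from the caller's own partition
    return shared[me * SLOT + offset]

send(2, 0, "ping")     # some process sends to process 2
print(receive(2, 0))   # process 2 later reads "ping"
```

The reverse emulation has no such shortcut: every remote read becomes a round-trip message, which is why it is the costly direction.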


Parallel architectures taxonomy


Flynn:

                          Single data    Multiple data
    Single instruction    SISD           SIMD
    Multiple instruction  MISD           MIMD

Culler:
- shared memory multiprocessor
- message passing architecture
- data parallel architecture (another name for SIMD)
- dataflow architecture
- systolic architecture

Definitions of distributed systems


Coulouris et al. (2001): "one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages"

Introduction to Distributed Computing

Tanenbaum (1995): "a collection of independent computers that appear to the user of the system as a single computer"

Sloman and Kramer (1987): "one in which several autonomous processors and data stores supporting processes and/or databases interact in order to cooperate to achieve an overall goal. The processes coordinate their activities and exchange information by means of information transfer over a communication network."

Distributed applications
- banking systems
- applications for conferences

Why distributed systems?


- economy: share hardware resources and information
- potential improvement in performance and reliability
- improvements over the services provided by a single computer
- access to a wider variety of resources (e.g. specialized processors, peripherals) that become accessible over a network

Distributed system
Architectural models
- Client-server
- Peer-to-peer

Interaction models
- Synchronous
- Asynchronous

Architectural models
Client-server
- the client calls a service of the server (by sending a request message to the server)
- the server does the work and sends the result back to the client
- a server can act as a client for other servers
- issue: centralization (single point of failure, bottleneck)
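The request/reply exchange can be sketched over a socket pair in Python (the summing service is an invented example; a real deployment would use network sockets, not a local pair):

```python
import socket
import threading

def server(conn):
    # receive a request, do the work, send the result back
    request = conn.recv(1024).decode()
    result = sum(int(x) for x in request.split(","))
    conn.sendall(str(result).encode())
    conn.close()

def client(conn, numbers):
    conn.sendall(",".join(map(str, numbers)).encode())  # the request
    reply = int(conn.recv(1024).decode())               # the reply
    conn.close()
    return reply

srv_end, cli_end = socket.socketpair()
t = threading.Thread(target=server, args=(srv_end,))
t.start()
answer = client(cli_end, [1, 2, 3, 4])
t.join()
print(answer)  # 10
```

The centralization issue is visible even here: every client must reach this one server, so it is both the point of failure and the bottleneck.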

[Diagram: clients sending requests to servers]

Peer-to-peer
- all processes are equal
- every computer holds resources that are commonly shared
- no processing or communication bottleneck
- issue: high complexity (finding resources)

[Diagram: a network of interconnected peers]

Interaction models
Synchronous
- lower and upper bounds on the execution time of processes
- messages are received within a known bounded time
- drift rates between local clocks have a known bound
- global physical time (with a certain precision)
- predictable behavior in terms of timing (suitable for hard real-time applications)
- timeouts can be used to detect failures
- difficult and costly to implement

Asynchronous
- no lower and upper bounds on the execution time of processes
- messages are not received within a known bounded time
- drift rates between local clocks do not have a known bound
- no global physical time (logical time is needed)
- unpredictable in terms of timing
- timeouts cannot be used to detect failures
- widespread in practice
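The timeout point above can be sketched as a toy failure detector: in the synchronous model the message delay bound is known, so waiting any longer than that bound safely signals a crash (the names and bound values here are invented):

```python
import queue

def check_peer(inbox, bound_seconds):
    # Synchronous model: any message arrives within bound_seconds,
    # so an empty wait of that length means the sender has failed.
    try:
        return ("alive", inbox.get(timeout=bound_seconds))
    except queue.Empty:
        return ("suspected crashed", None)

inbox = queue.Queue()
inbox.put("heartbeat")
print(check_peer(inbox, 0.5))   # ('alive', 'heartbeat')
print(check_peer(inbox, 0.1))   # ('suspected crashed', None)
```

In the asynchronous model no such bound exists, so a slow process is indistinguishable from a crashed one; this is why the asynchronous algorithms later in the course avoid timeouts entirely.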

Parallel vs. Distributed vs. Concurrent

Parallel vs. distributed


Goal
- parallel: solve a problem faster, solve a bigger problem, get a more accurate result
- distributed: resource sharing, improved reliability

Concurrent vs. parallel vs. distributed


Concurrent: processes are executed simultaneously
- on a single processor
- on a set of processors which are
  - physically close to each other
  - physically far from one another

Processor type
- parallel: homogeneous
- distributed: heterogeneous

Geographical distribution
- parallel: close to each other
- distributed: far from each other

References
Based on:
- Grama: chapter 1
- Coulouris: chapters 1, 2