
CSC4305: Parallel Programming

Lecture 1: Introduction

Sana Abdullahi Mu’az & Ahmad Abba Datti

Bayero University, Kano
Reference Material(s)

• An Introduction to Parallel Programming by Peter S. Pacheco (2011)

• Introduction to Parallel Computing: From Algorithms to Programming on State-of-the-Art Platforms by Trobec, Slivnik, Bulić and Robič (2018)

• Part I: Theoretical Foundations. (Sana Abdullahi Mu’az)

• Part II: Applied Programming. (Ahmad Abba Datti)

Part 1: Foundations

• Introduction

• Parallel Hardware

• Parallel Software
Part 2: Programming
Programming Models
• Shared Memory Programming with OpenMP and PThreads
• Distributed Systems Programming with MPI
• Programming Graphics Processors with OpenCL

Real life problems:

• Parallel computation of 𝜋
• Parallel solution of 1-D Heat Equation
• Parallel Implementation of Seam Carving
Why Parallel Computing?
From 1986 to 2002 the performance of microprocessors increased, on average, by 50% per year.

Since 2002, however, single-processor performance improvement has slowed to about 10% per year.

By 2005 the trend had changed: most major microprocessor manufacturers had decided that the road to rapidly increasing performance lay in the direction of parallelism.

An intelligent solution

• Instead of designing and building faster

microprocessors, put multiple processors on a single
integrated circuit.
The world’s fastest computer? (Summit, Oak Ridge National Laboratory, 2018)
• Peak performance of 200 petaflops, i.e. 200,000 trillion calculations per second (one trillion = 1,000,000,000,000, a million million)

• Cost: $200 million (about N72,000,000,000 – Kano FAAC is N52bn)

• Compute servers: 4,608 (each with 22-core CPUs and 6 GPUs)

Now it’s up to the programmers

• Adding more processors does not help much if programmers are

not aware of them…

• … or don’t know how to use them.

• Serial programs don’t benefit from this approach (in most cases).
A point of complexity:
If a single person takes 1,000 s to complete
a job, how long will it take 10 people to
complete the same job?
Not 100 s, but 100 s + x, where x is the overhead of working together:

• Identifying one another => identity

• Understanding one another => a means of communication

• Dividing and sharing the job

• Synchronizing the job
Other problems

 The job may be atomic (indivisible).

 Or it cannot be divided equally.
 One worker may be busy, unable to
complete its portion.
 Or may complete it late.
 And so on.
• Cores usually need to coordinate their work.

• Communication – cores exchange their current partial results with one another.

• Load balancing – the work is shared evenly among the cores so that no single core is heavily loaded.

• Synchronization – because each core works at its own pace, we must make sure that no core gets too far ahead of the rest.
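Load balancing already matters when the work does not divide evenly. The helper below is a common block-partitioning sketch (the name `block_range` is ours, not from the lecture): it spreads n items over p cores so that no core receives more than one item above its fair share.

```python
# Sketch: dividing n items as evenly as possible among p cores,
# so no core gets more than one extra item (load balancing).

def block_range(rank, p, n):
    """Return the (start, stop) indices of core `rank`'s block of work."""
    base, extra = divmod(n, p)
    # The first `extra` cores each take one extra item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

# Example: 10 items over 4 cores -> block sizes 3, 3, 2, 2
sizes = [stop - start for start, stop in
         (block_range(r, 4, 10) for r in range(4))]
print(sizes)            # [3, 3, 2, 2]
```

The blocks are contiguous and cover all items exactly once, so each core knows its share without communicating with the others.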
• Parallel programming is not so simple.
• Good parallel programming is even more complex.
• The main reason to write parallel programs is improved performance.
• Performance programming is hard
– Management of resources
– How resources interact
– How to find, fix, and avoid bottlenecks
Why we need ever-increasing performance

• Computational power is increasing, but so are our computation

problems and needs.

• Problems we never dreamed of have been solved because of past

increases, such as decoding the human genome.

• More complex problems are still waiting to be solved.

Climate modeling
Protein folding
Drug discovery
Energy research
Data analysis/Data Mining
Why we’re building parallel systems

• Up to now, performance increases have been attributable to the increasing density of transistors.

• But there are physical limits to this approach.
A little physics lesson

• Smaller transistors = faster processors.

• Faster processors = increased power consumption.

• Increased power consumption = increased heat.

• Increased heat = unreliable processors.

• The solution: move away from single-core systems to multicore processors.
• “core” = central processing unit (CPU)

 Introducing parallelism
Approaches to the serial problem

• Rewrite serial programs so that they’re parallel.

• Write translation programs that automatically
convert serial programs into parallel programs.
• This is very difficult to do.
• Success has been limited.
More problems
• Some coding constructs can be recognized by an
automatic program generator, and converted to a
parallel construct.

• However, the result is likely to be a very inefficient program.

• Sometimes the best parallel solution is to step back

and devise an entirely new algorithm.
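A classic illustration of "devising a new algorithm" is summing n values. A serial loop forces n−1 dependent additions, while a tree reduction pairs values so that each round can run in parallel, finishing in ⌈log₂ n⌉ rounds. The sketch below (the name `tree_sum` is ours) simulates the rounds sequentially to show the restructuring:

```python
# Sketch: restructuring a serial sum into a tree reduction.
# A serial loop needs n-1 dependent additions; pairing values lets
# each round's additions run in parallel, in ceil(log2 n) rounds.

def tree_sum(values):
    vals = list(values)
    rounds = 0
    while len(vals) > 1:
        # Each pair could be added by a different core in the same round.
        vals = [vals[i] + vals[i + 1] if i + 1 < len(vals) else vals[i]
                for i in range(0, len(vals), 2)]
        rounds += 1
    return vals[0], rounds

total, rounds = tree_sum(range(8))   # 0+1+...+7 = 28, in 3 rounds
print(total, rounds)                 # 28 3
```

The answer is the same as the serial loop's; only the order of additions changes, and that change is what exposes the parallelism.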
Some benefits

• High speed → Save time (and money)

• Improve performance → Better solution
• High accuracy and resolution
The quest for increasingly more powerful systems
Scientific simulation will continue to push on
system requirements:
•To increase the precision of the result
•To get to an answer sooner (e.g., climate
modelling, disaster modelling)

A similar phenomenon in commodity machines

More, faster, cheaper

Limitation: one day the speed limit will be reached

Heat: the integrated circuit gets too hot and becomes unreliable

These indicate that it is impossible to keep

increasing the speed of a single processor.

The way out is parallelism: multicore systems.


Programs written for single-core systems cannot

exploit the presence of multiple cores; running multiple
instances of each program is not enough.

Translation programs => very limited success.

Entirely new algorithms are needed for parallel solutions.
Concluding Remarks
• The laws of physics have brought us to the doorstep of multicore technology.

• Serial programs typically do not benefit from multiple cores.

• Automatic parallel program generation from serial program code is not the
most efficient approach to get high performance from multicore computers.

• Learning to write parallel programs involves learning how to coordinate the
work of the cores.

• Parallel programs are usually very complex and therefore require sound
programming techniques and development practices.
Next Class

• Parallel Hardware

• Parallel Software