
CSC4305: Parallel Programming

Lecture 1: Introduction

Sana Abdullahi Mu’az & Ahmad Abba Datti


Bayero University, Kano
Reference Material(s)

• An Introduction to Parallel Programming by Peter S. Pacheco (2011)

• Introduction to Parallel Computing: From Algorithms to Programming on State-of-the-Art Platforms by Trobec, Slivnik, Bulić and Robič (2018)
Outline

• Part I: Theoretical Foundations. (Sana Abdullahi Mu’az)

• Part II: Applied Programming. (Ahmad Abba Datti)


Part I: Foundations

• Introduction

• Parallel Hardware

• Parallel Software
Part II: Programming
Programming Models
• Shared Memory Programming with OpenMP and PThreads
• Distributed Systems Programming with MPI
• Programming Graphics Processors with OpenCL

Real life problems:


• Parallel computation of 𝜋
• Parallel solution of 1-D Heat Equation
• Parallel Implementation of Seam Carving
Why Parallel Computing?
From 1986 to 2002 the performance of microprocessors increased, on
average, 50% per year.

Since 2002, however, single-processor performance improvement has slowed to about 10% per year.

By 2005 the trend had changed: most of the major microprocessor manufacturers had decided that the road to rapidly increasing performance lay in the direction of parallelism.
An intelligent solution

• Instead of designing and building faster microprocessors, put multiple processors on a single integrated circuit.
The world’s fastest computer?
Summit:
• Peak performance of 200 petaflops, i.e. 200,000 trillion calculations per second (one trillion = 1,000,000,000,000, one million million)

• Cost: $200 million (about ₦72,000,000,000 – Kano’s FAAC allocation is ₦52bn)

• Compute servers: 4,608 (each with two 22-core CPUs and 6 GPUs)


Now it’s up to the programmers

• Adding more processors does not help much if programmers are not aware of them…

• … or don’t know how to use them.

• Serial programs don’t benefit from this approach (in most cases).
A point of complexity:
If a single person takes 1000 seconds to complete
a job, how long will it take 10 people to
complete the same job?
= 1000/10
= 100 seconds.
Sure?
100 + x:

• Identifying one another => identity

• Understanding one another => means of communication

• Dividing and sharing the job

• Synchronizing the job
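The extra x is coordination overhead. A rough sketch of the arithmetic (an illustrative model, assuming the job itself divides perfectly among the p workers):

```latex
T_{\text{parallel}} = \frac{T_{\text{serial}}}{p} + T_{\text{overhead}},
\qquad
S = \frac{T_{\text{serial}}}{T_{\text{parallel}}}
  = \frac{1000}{1000/10 + x} < 10 \quad \text{whenever } x > 0.
```

For example, with x = 25 seconds spent on identification, communication and synchronization, the 10 workers finish in 125 seconds, a speedup of 8 rather than 10.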
Other problems

 If the job is atomic (indivisible).

 Or cannot be divided equally.
 What if one of them is busy and unable to
complete its portion?
 Or completes late?
 And so on.
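A busy or late worker holds up everyone, because with the usual division of labour the job is done only when the last share is done. As a sketch (with hypothetical numbers):

```latex
T_{\text{parallel}} = \max_{i = 1,\dots,p} T_i
```

So if nine workers finish their shares in 100 seconds but the tenth takes 300 seconds, the whole job takes 300 seconds.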
Coordination
• Cores usually need to coordinate their work.

• Communication – cores exchange their current partial results with one another.

• Load balancing – share the work evenly among the cores so that no core is overloaded.

• Synchronization – because each core works at its own pace, make sure no core gets too far ahead of the rest.
• Parallel programming is not so simple.
• Good parallel programming is even more complex.
• The reason to write parallel programs is improved performance.
• Performance programming is hard:
– Management of resources
– How resources interact
– How to find, fix, and avoid bottlenecks
Why we need ever-increasing performance

• Computational power is increasing, but so are our computational problems and needs.

• Problems we never dreamed of have been solved because of past increases, such as decoding the human genome.

• More complex problems are still waiting to be solved:


Climate modeling
Protein folding
Drug discovery
Energy research
Data analysis/Data Mining
Why we’re building parallel systems

• Up to now, performance increases have been attributable to the increasing density of transistors.

• But there are inherent problems.
A little physics lesson

• Smaller transistors = faster processors.

• Faster processors = increased power consumption.


• Increased power consumption = increased heat.

• Increased heat = unreliable processors.


Solution
• Move away from single-core systems to multicore
processors.
• “core” = central processing unit (CPU)

 Introducing parallelism
Approaches to the serial problem

• Rewrite serial programs so that they’re parallel.

• Write translation programs that automatically convert serial programs into parallel programs.
– This is very difficult to do.
– Success has been limited.
More problems
• Some coding constructs can be recognized by an
automatic program generator, and converted to a
parallel construct.

• However, it is likely that the result will be a very inefficient program.

• Sometimes the best parallel solution is to step back and devise an entirely new algorithm.
Some benefits

• High speed → Save time (and money)


• Improve performance → Better solution
• High accuracy and resolution
Recap
The quest for ever more powerful machines
Scientific simulation will continue to push on
system requirements:
•To increase the precision of the result
•To get to an answer sooner (e.g., climate
modelling, disaster modelling)

A similar phenomenon in commodity machines:
more, faster, cheaper.
WHY WE’RE BUILDING PARALLEL SYSTEMS

Limitation: one day the speed limit will be reached.

Heat: integrated circuits get too hot and become unreliable.

These indicate that it is impossible to keep increasing the speed of a single processor.

The way out is parallelism: multicore systems.

WHY WE NEED TO WRITE PARALLEL
PROGRAMS?

Programs written for single-core systems cannot exploit the presence of multiple cores; running multiple instances of each program is not enough.

Translation programs => very limited success.

An entirely new algorithm is often needed for a parallel program.
Concluding Remarks
• The laws of physics have brought us to the doorstep of multicore technology.

• Serial programs typically do not benefit from multiple cores.

• Automatic parallel program generation from serial program code is not the
most efficient approach to get high performance from multicore computers.

• Learning to write parallel programs involves learning how to coordinate the cores.

• Parallel programs are usually very complex and therefore require sound programming techniques and development.
Next Class

• Parallel Hardware

• Parallel Software