
Course code   Course title                           L  T  P  J  C
CSE 4001      Parallel and Distributed Computing     2  0  2  4  4

Pre-requisite:                Syllabus version:

Course Objectives:
1. To introduce the fundamentals of parallel and distributed computing, including parallel and
distributed architectures and paradigms.
2. To understand the technologies, system architectures, and communication architectures that
propelled the growth of parallel and distributed computing systems.
3. To develop and execute basic parallel and distributed applications using basic programming
models and tools.

Expected Course Outcome:

By the end of this course, the student will be able to:
1. Identify and describe the physical organization of parallel platforms and recall the
cost metrics that can be used for algorithm design.
2. Demonstrate knowledge of the core architectural aspects of Parallel Computing.
3. Identify key factors that contribute to efficient parallel algorithms and apply a suite of
techniques across a wide range of applications.
4. Distinguish the architectural and fundamental models that underpin distributed systems
and identify challenges in the construction of distributed systems.
5. Analyze the issues in synchronizing clocks in distributed systems, apply suitable
algorithms to solve them effectively, and examine how election algorithms can be
implemented in a distributed system.
6. Discuss how locking, timestamp ordering, and optimistic concurrency control may be
extended for use with distributed transactions.
7. Apply the concepts behind software architectures for large-scale Web-based systems, and
design, recognize, and evaluate software architectures.
8. Design and build application programs on distributed systems.
9. Analyze and implement ideas from current distributed computing research literature.

Student Learning Outcomes (SLO): 2, 5, 14, 17


SLO-2: Having a clear understanding of the subject-related concepts and of contemporary issues.
SLO-5: Having design thinking capability.
SLO-14: Having an ability to design and conduct experiments, as well as to analyze and interpret
data.
SLO-17: Having an ability to use techniques, skills and modern engineering tools necessary for
engineering practice.
Module:1 Parallelism Fundamentals 2 hours
Motivation – Key Concepts and Challenges – Overview of Parallel Computing – Flynn’s
Taxonomy – Multi-Core Processors – Shared vs. Distributed Memory.

Module:2 Parallel Architectures 3 hours
Introduction to OpenMP Programming – Instruction-Level Support for Parallel Programming –
SIMD – Vector Processing – GPUs.

Module:3 Parallel Algorithm Design 5 hours
Preliminaries – Decomposition Techniques – Characteristics of Tasks and Interactions – Mapping
Techniques for Load Balancing – Parallel Algorithm Models.

Module:4 Introduction to Distributed Systems 4 hours
Introduction – Characterization of Distributed Systems – Distributed Shared Memory –
Message Passing – Programming Using the Message Passing Paradigm – Group Communication –
Case Study (RPC and Java RMI).

Module:5 Coordination 6 hours
Time and Global States – Synchronizing Physical Clocks – Logical Time and Logical Clocks –
Coordination and Agreement – Distributed Mutual Exclusion – Election Algorithms – Consensus
and Related Problems.

Module:6 Distributed Transactions 6 hours
Transactions and Concurrency Control – Nested Transactions – Locks – Optimistic Concurrency
Control – Timestamp Ordering – Distributed Transactions: Flat and Nested – Atomic Commit –
Two-Phase Commit Protocol – Concurrency Control.

Module:7 Distributed System Architecture & its Variants 2 hours
Distributed File Systems: Architecture – Processes – Communication.
Distributed Web-based Systems: Architecture – Processes – Communication. Overview of
Distributed Computing Platforms.

Module:8 Recent Trends 2 hours

Total Lecture Hours: 30 hours


Text Book(s)
1. Grama, A. (2013). Introduction to Parallel Computing. Harlow, England: Addison-Wesley.
2. Coulouris, G. (2012). Distributed Systems: Concepts and Design. Boston: Addison-Wesley.
Reference Books
1. Hwang, K., & Briggs, F. (1990). Computer Architecture and Parallel Processing (1st ed.).
New York: McGraw-Hill.
2. Hager, G. (2017). Introduction to High Performance Computing for Scientists and
Engineers. CRC Press.
3. Tanenbaum, A. (2007). Distributed Systems: Principles and Paradigms (2nd ed.). Pearson.
Mode of Evaluation: CAT / Assignment / Quiz / FAT / Project / Seminar

List of Challenging Experiments (Indicative) SLO: 5, 14, 17

1. (2 hours) Assume that a program generates large quantities of floating-point data
that is stored in an array. In order to determine the distribution of the data, we
can make a histogram of the data. To make a histogram, we simply divide the range
of the data into equal-sized subintervals, or bins; determine the number of
measurements in each bin; and plot a bar graph showing the relative sizes of the
bins. Use MPI to implement the histogram; a starting-point sketch follows.
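A minimal sketch of one possible approach, assuming uniformly distributed synthetic
data in a fixed range; the constants (NUM_BINS, DATA_MIN, DATA_MAX) and the variable
names are illustrative choices, not prescribed by the exercise. Each process bins its
own share of the data, and MPI_Reduce combines the per-process counts on rank 0.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_BINS 10
#define DATA_MIN 0.0
#define DATA_MAX 10.0

int main(int argc, char *argv[]) {
    int rank;
    long n_local = 100000;             /* measurements generated per process */
    int local_counts[NUM_BINS] = {0};  /* this process's bin counts */
    int global_counts[NUM_BINS];       /* combined histogram on rank 0 */
    double bin_width = (DATA_MAX - DATA_MIN) / NUM_BINS;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    srand(rank + 1);                   /* a different stream per process */
    for (long i = 0; i < n_local; i++) {
        double x = DATA_MIN + (DATA_MAX - DATA_MIN) * rand() / (double)RAND_MAX;
        int bin = (int)((x - DATA_MIN) / bin_width);
        if (bin == NUM_BINS) bin = NUM_BINS - 1;   /* clamp x == DATA_MAX */
        local_counts[bin]++;
    }

    /* sum the per-process counts into one histogram on rank 0 */
    MPI_Reduce(local_counts, global_counts, NUM_BINS, MPI_INT, MPI_SUM,
               0, MPI_COMM_WORLD);

    if (rank == 0)
        for (int b = 0; b < NUM_BINS; b++)
            printf("bin %d: %d\n", b, global_counts[b]);

    MPI_Finalize();
    return 0;
}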
2. (2 hours) Suppose we toss darts randomly at a square dartboard, whose bullseye is
at the origin, and whose sides are 2 feet in length. Suppose also that there is a
circle inscribed in the square dartboard. The radius of the circle is 1 foot, and
its area is π square feet. If the points that are hit by the darts are uniformly
distributed (and we always hit the square), then the number of darts that hit
inside the circle should approximately satisfy the equation

number_in_circle / total_number_of_tosses = π / 4,

since the ratio of the circle's area to the square's area is π/4. We can use this
formula to estimate the value of π with a random number generator:
number_in_circle = 0;
for (toss = 0; toss < number_of_tosses; toss++) {
    x = random double between -1 and 1;
    y = random double between -1 and 1;
    distance_squared = x * x + y * y;
    if (distance_squared <= 1)
        number_in_circle++;
}
pi_estimate = 4 * number_in_circle / (double) number_of_tosses;
This is called a “Monte Carlo” method, since it uses randomness. Write a program
that uses the above Monte Carlo method to estimate π (MPI / Pthreads / OpenMP);
an MPI sketch follows.
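One possible MPI realization, assuming the tosses are split evenly among processes
and each process seeds its own random stream; the total toss count and the names
beyond the pseudocode above are illustrative.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, size;
    long total_tosses = 10000000;          /* illustrative default */
    long number_in_circle = 0, global_in_circle;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    long local_tosses = total_tosses / size;   /* even split of the work */
    srand(rank + 1);
    for (long toss = 0; toss < local_tosses; toss++) {
        double x = 2.0 * rand() / (double)RAND_MAX - 1.0;
        double y = 2.0 * rand() / (double)RAND_MAX - 1.0;
        if (x * x + y * y <= 1.0)
            number_in_circle++;
    }

    /* sum the hit counts from all processes onto rank 0 */
    MPI_Reduce(&number_in_circle, &global_in_circle, 1, MPI_LONG, MPI_SUM,
               0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi estimate = %f\n",
               4.0 * global_in_circle / (double)(local_tosses * size));

    MPI_Finalize();
    return 0;
}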
3. (3 hours) Conway’s Game of Life is played on a rectangular grid of cells that may
or may not contain an organism. The state of the cells is updated each time step
by applying the following set of rules:
1. Every organism with two or three neighbours survives.
2. Every organism with four or more neighbours dies from overpopulation.
3. Every organism with zero or one neighbours dies from isolation.
4. Every empty cell adjacent to three organisms gives birth to a new one.
Create an MPI program that evolves a board of arbitrary size (dimensions could be
specified at the command line) over several iterations. The board could be
randomly generated or read from a file. Try applying the geometric decomposition
pattern to partition the work among your processes, as in the sketch below.
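A compact sketch of a row-wise geometric decomposition; for brevity it assumes a
fixed, randomly generated board whose height divides evenly among the processes
(the exercise asks for command-line dimensions and optional file input), with
columns wrapping around and dead cells beyond the top and bottom edges. Ghost rows
are exchanged with MPI_Sendrecv before each update; N and STEPS are illustrative.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define N 64        /* board is N x N; N must divide by the process count */
#define STEPS 100

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int rows = N / size;                     /* rows owned by this process */
    int (*grid)[N] = calloc(rows + 2, sizeof *grid);   /* +2 ghost rows */
    int (*next)[N] = calloc(rows + 2, sizeof *next);

    srand(rank + 1);                         /* random initial board */
    for (int i = 1; i <= rows; i++)
        for (int j = 0; j < N; j++)
            grid[i][j] = rand() % 2;

    int up   = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
    int down = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

    for (int step = 0; step < STEPS; step++) {
        /* exchange ghost rows with the neighbouring processes */
        MPI_Sendrecv(grid[1], N, MPI_INT, up, 0,
                     grid[rows + 1], N, MPI_INT, down, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(grid[rows], N, MPI_INT, down, 1,
                     grid[0], N, MPI_INT, up, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        for (int i = 1; i <= rows; i++)
            for (int j = 0; j < N; j++) {
                int n = 0;                   /* count the eight neighbours */
                for (int di = -1; di <= 1; di++)
                    for (int dj = -1; dj <= 1; dj++)
                        if (di || dj)
                            n += grid[i + di][(j + dj + N) % N];
                /* birth on 3 neighbours; survival on 2 or 3 */
                next[i][j] = (n == 3) || (grid[i][j] && n == 2);
            }
        memcpy(grid, next, (rows + 2) * sizeof *grid);
    }

    int alive = 0, total;
    for (int i = 1; i <= rows; i++)
        for (int j = 0; j < N; j++)
            alive += grid[i][j];
    MPI_Reduce(&alive, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("organisms alive after %d steps: %d\n", STEPS, total);

    free(grid);
    free(next);
    MPI_Finalize();
    return 0;
}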
4. (2 hours) Use OpenMP to implement a producer-consumer program in which some of
the threads are producers and others are consumers. The producers read text from
a collection of files, one per producer. They insert lines of text into a single
shared queue. The consumers take the lines of text and tokenize them. Tokens are
“words” separated by white space. When a consumer finds a token, it writes it to
stdout; a structural sketch follows.
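One way this could be structured, assuming a fixed-size circular buffer protected
by a named critical section, half producer and half consumer threads, and the POSIX
helpers strdup and strtok_r; all of these are illustrative choices rather than
requirements of the exercise. The consumers busy-wait when the queue is momentarily
empty, which is acceptable in a sketch but worth improving in a real solution.

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define QSIZE 1024

static char *queue[QSIZE];       /* shared circular buffer of lines */
static int head = 0, tail = 0;   /* both protected by the critical section */
static int done = 0;             /* number of producers that have finished */

int main(int argc, char *argv[]) {
    int nfiles = argc - 1;       /* argv[1..] name the files, one per producer */
    if (nfiles < 1) { fprintf(stderr, "usage: %s file...\n", argv[0]); return 1; }

    #pragma omp parallel num_threads(2 * nfiles)
    {
        int id = omp_get_thread_num();
        if (id < nfiles) {                       /* producer thread */
            FILE *fp = fopen(argv[id + 1], "r");
            char buf[4096];
            while (fp && fgets(buf, sizeof buf, fp)) {
                char *line = strdup(buf);
                int placed = 0;
                while (!placed) {                /* wait for a free slot */
                    #pragma omp critical(queue_lock)
                    if ((tail + 1) % QSIZE != head) {
                        queue[tail] = line;
                        tail = (tail + 1) % QSIZE;
                        placed = 1;
                    }
                }
            }
            if (fp) fclose(fp);
            #pragma omp critical(queue_lock)
            done++;                              /* this producer is finished */
        } else {                                 /* consumer thread */
            while (1) {
                char *line = NULL;
                int finished = 0;
                #pragma omp critical(queue_lock)
                {
                    if (head != tail) {          /* dequeue one line */
                        line = queue[head];
                        head = (head + 1) % QSIZE;
                    } else if (done == nfiles) {
                        finished = 1;            /* drained and producers done */
                    }
                }
                if (finished) break;
                if (line) {                      /* tokenize on white space */
                    char *save, *tok = strtok_r(line, " \t\n", &save);
                    while (tok) {
                        printf("%s\n", tok);
                        tok = strtok_r(NULL, " \t\n", &save);
                    }
                    free(line);
                }
            }
        }
    }
    return 0;
}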
5. (2 hours) Write a program that sets a real variable on each of N processors equal
to the MPI rank (task ID) of the task. Then write your own routine to perform a
reduction operation over all processors to sum the values, using only MPI_Send
and MPI_Recv calls. Do this global reduction operation using the following
communication algorithms:
a. Communication in a ring.
b. Hypercube communication.
Insert timing calls using MPI_Wtime to compare the timing of your routines
against the MPI routine MPI_Allreduce doing the same computation. A sketch of
the ring variant follows.
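A sketch of the ring variant with timing, assuming even ranks send before receiving
to avoid deadlock; the hypercube variant follows the same pattern with partner
ranks chosen by flipping one bit of the rank per dimension.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double value = (double)rank;             /* each task holds its own rank */
    double sum = value, send = value, recv;
    int left = (rank - 1 + size) % size, right = (rank + 1) % size;

    double t0 = MPI_Wtime();
    /* ring reduction: each value travels the whole ring in size-1 steps */
    for (int step = 0; step < size - 1; step++) {
        if (rank % 2 == 0) {                 /* alternate order: no deadlock */
            MPI_Send(&send, 1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD);
            MPI_Recv(&recv, 1, MPI_DOUBLE, left, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(&recv, 1, MPI_DOUBLE, left, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(&send, 1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD);
        }
        sum += recv;                         /* accumulate, then forward */
        send = recv;
    }
    double t_ring = MPI_Wtime() - t0;

    double lib_sum;                          /* reference: library collective */
    t0 = MPI_Wtime();
    MPI_Allreduce(&value, &lib_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    double t_lib = MPI_Wtime() - t0;

    if (rank == 0)
        printf("ring sum = %.1f (%.6f s), MPI_Allreduce = %.1f (%.6f s)\n",
               sum, t_ring, lib_sum, t_lib);

    MPI_Finalize();
    return 0;
}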

Total Laboratory Hours: 12 hours


Mode of assessment: Lab Assessments / Midterm / Viva.
Recommended by Board of Studies DD-MM-YYYY
Approved by Academic Council No. xx Date DD-MM-YYYY
