
10 Jul 2012

UNIT 1
INTRODUCTION TO PARALLELISM


Computer development milestones

500 BC: the abacus, China.
Up to the 1940s: machines built from mechanical or electromechanical parts.
1642: Blaise Pascal, France, builds a mechanical adder/subtractor.
1827: Charles Babbage, England, designs the Difference Engine for polynomial evaluation.
1941: Konrad Zuse, Germany, builds the first binary mechanical computer.
1944: Howard Aiken's electromechanical Harvard Mark I, built by IBM.



Elements of Modern Computers

A modern computer system spans hardware, software, and programming elements, centered on the computing problems to be solved:
A. Computing problems
B. Algorithm and data structures
C. Hardware resources
D. Operating system
E. System software support
F. Compiler support


A. Computing Problems

A modern computer is an integrated system of machine hardware, an instruction set, application programs, and user interfaces. The use of computers is driven by real-life problems demanding fast and accurate solutions. Depending on the nature of the problems, the solutions may require different computing resources.


A. Computing Problems (contd.)

Numerical problems in science and technology: solutions demand complex mathematical formulations and tedious integer or floating-point computations.
Alphanumerical problems in business and government: solutions demand accurate transactions, large database management, and information retrieval operations.
Artificial intelligence problems: solutions demand logic inferences and symbolic manipulations.
These three categories correspond to numerical computing, transaction processing, and logical reasoning; some complex problems demand a combination of these processing modes.

B. Algorithm and Data Structures

Special algorithms and data structures are needed to specify the computation and communication involved in a computing problem.
Numerical algorithms are deterministic, using regularly structured data.
Symbolic processing may use heuristic or non-heuristic searches over a large knowledge base.


C. Hardware Resources

The processor, memory, and peripheral devices form the hardware core of a computer system.
Special hardware interfaces are built into I/O devices such as display terminals, workstations, optical page scanners, magnetic-ink character recognizers, modems, network adapters, voice data entry devices, printers, and plotters. These are connected to the mainframe computer either directly or through a wide area network.
Software interfaces are also needed: file transfer systems, editors, word processors, device drivers, interrupt handlers, and network communication programs. These facilitate the portability of user programs across different machine architectures.

D. Operating System

An effective OS manages the allocation and deallocation of resources during the execution of user programs.
Mapping is a bidirectional process matching the algorithmic structure with the hardware architecture, and vice versa. Efficient mapping benefits the programmer and produces better source code.
The mapping of algorithmic and data structures onto the machine architecture includes processor scheduling, memory maps, interprocessor communications, etc. These are usually architecture-dependent.

D. Operating System (contd.)

The implementation of these mappings relies on efficient compiler and operating system support.
Parallelism can be exploited at algorithm design time, at programming time, at compile time, and at run time. Techniques for exploiting parallelism at these levels form the core of parallel processing technology.


E. System Software Support

System software support is needed for the development of efficient programs in high-level languages (HLLs).
Source code written in an HLL must first be translated into object code by an optimizing compiler. The compiler assigns variables to registers or to memory words and reserves functional units for operators.
An assembler is then used to translate the object code into machine code recognizable by the machine hardware.
Finally, a loader is used to initiate the program execution through the OS kernel.

F. Compiler Support

Three compiler upgrade approaches:
Preprocessor: uses a sequential compiler and a low-level library of the target computer to implement high-level parallel constructs.
Precompiler: requires some program flow analysis, dependence checking, and limited optimizations toward parallelism detection.
Parallelizing compiler: demands a fully developed parallelizing or vectorizing compiler that can automatically detect parallelism in source code and transform sequential code into parallel constructs.
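What a parallelizing compiler does can be imitated by hand in source code. A minimal sketch (the function `work` and the thread-pool rewrite are illustrative assumptions, not from the slides): a loop whose iterations carry no dependence on one another can be transformed into an explicitly parallel construct.

```python
from concurrent.futures import ThreadPoolExecutor

def work(x):
    # independent per-element computation: no cross-iteration dependence
    return x * x + 1

data = list(range(8))

# sequential form, as the programmer wrote it
seq = [work(x) for x in data]

# parallel construct a parallelizing compiler could emit, once
# dependence checking has shown the iterations are independent
with ThreadPoolExecutor(max_workers=4) as pool:
    par = list(pool.map(work, data))

assert seq == par
```

A real parallelizing compiler performs the flow analysis and dependence checking automatically before applying such a transformation; here the independence of iterations is simply asserted by construction.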


F. Compiler Support (contd.)

The efficiency of the binding process depends on the effectiveness of the preprocessor, the precompiler, the parallelizing compiler, the loader, and the OS support.
Due to unpredictable program behavior, none of the existing computers can be considered fully automatic or fully intelligent in detecting all types of parallelism.
Compiler directives are therefore inserted into the source code to help the compiler do a better job.
Users may also interact with the compiler to restructure their programs, which has been shown to enhance the performance of parallel processing.

Evolution of Computer Architecture

In the view of an assembly language programmer, the abstract machine is organized by its instruction set, which includes opcodes, addressing modes, registers, virtual memory, etc.
In the view of the hardware implementation, the abstract machine is organized with CPUs, caches, buses, microcode, pipelines, physical memory, etc.
The study of architecture therefore covers both instruction-set architectures and machine implementation organizations.
Over the past four decades, computer architecture has gone through evolutionary rather than revolutionary changes.

Evolution from sequential scalar computers to vector processors and parallel computers


The von Neumann architecture is built as a sequential machine with a scalar architecture.
The sequential computer was improved from bit-serial to word-parallel operations, and from fixed-point to floating-point operations.
The von Neumann architecture is slow due to the sequential execution of instructions in programs.
Lookahead, parallelism, and pipelining: lookahead techniques were introduced to prefetch instructions in order to overlap I/E (instruction fetch/decode and execution) operations and to enable functional parallelism.
Two approaches to functional parallelism:
1. Use multiple functional units simultaneously.
2. Practice pipelining at various processing levels.
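The gain from overlapping instruction phases can be made concrete with simple cycle counts. A minimal sketch, under the usual textbook assumptions of k single-cycle pipeline stages and no stalls (both functions are illustrative, not from the slides):

```python
def unpipelined_cycles(n, k):
    # without overlap, each of n instructions passes through all
    # k stages before the next instruction may begin
    return n * k

def pipelined_cycles(n, k):
    # k cycles to fill the pipeline, then one instruction
    # completes every cycle thereafter
    return k + (n - 1)

n, k = 100, 4
print(unpipelined_cycles(n, k))   # 400 cycles
print(pipelined_cycles(n, k))     # 103 cycles
```

With n = 100 instructions and k = 4 stages, overlapping cuts 400 cycles to 103; as n grows, the speedup approaches the ideal factor of k.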

Pipelining has proven especially attractive for performing identical operations repeatedly over vector data strings.
Vector operations were originally carried out by software-controlled looping using scalar pipeline processors.


Flynn's Classification

Michael Flynn (1972) introduced a classification of computer architectures based on the notions of instruction streams and data streams.
SISD (single instruction stream over a single data stream): conventional sequential computers.

Abbreviations used in the diagrams: CU (control unit), PU (processing unit), MU (memory unit), IS (instruction stream), DS (data stream), PE (processing element), LM (local memory), I/O (input/output).



SIMD (single instruction stream over multiple data streams).

MIMD (multiple instruction streams over multiple data streams): parallel computers.


MISD (multiple instruction streams over a single data stream).

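The SISD/SIMD distinction can be caricatured in ordinary code. This is a toy sketch only: real SIMD hardware applies one instruction to many data elements in lockstep, which a software `map` merely imitates.

```python
data = [1, 2, 3, 4]

# SISD: one instruction stream over one data stream --
# a scalar loop touches a single element per step
sisd = []
for x in data:
    sisd.append(x + 10)

# SIMD: one instruction stream over multiple data streams --
# the same "add 10" instruction is applied across all elements
simd = list(map(lambda x: x + 10, data))

assert sisd == simd == [11, 12, 13, 14]
```

MIMD would correspond to several independent instruction streams (e.g. different functions on different data), and MISD, the rarest class, to several instruction streams fed by a single data stream.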

Parallel/Vector Computers

Parallel computers execute programs in MIMD mode. There are two major categories:
Shared-memory multiprocessors
Message-passing multicomputers
The difference between multiprocessors and multicomputers lies in memory sharing and in the mechanisms used for interprocessor communication.


In a multiprocessor system, processors communicate with each other through shared variables. In a multicomputer system, each node has a local memory, unshared with other nodes, and interprocessor communication is done through message passing among the nodes.
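The two communication styles can be sketched with threads standing in for processors and nodes (an illustrative model, not real multiprocessor or multicomputer hardware): shared variables guarded by a lock versus explicit messages through a queue.

```python
import threading
import queue

# Multiprocessor style: workers communicate through a shared
# variable, protected by a lock to serialize the updates.
total = 0
lock = threading.Lock()

def add_shared(x):
    global total
    with lock:
        total += x

workers = [threading.Thread(target=add_shared, args=(i,)) for i in range(5)]
for t in workers: t.start()
for t in workers: t.join()
assert total == 10   # 0 + 1 + 2 + 3 + 4

# Multicomputer style: "nodes" share nothing and instead send
# messages; a thread-safe queue plays the interconnection network.
mailbox = queue.Queue()

def node_send(x):
    mailbox.put(x)   # message passing instead of shared memory

nodes = [threading.Thread(target=node_send, args=(i,)) for i in range(5)]
for t in nodes: t.start()
for t in nodes: t.join()

received = sum(mailbox.get() for _ in range(5))
assert received == 10
```

Both versions compute the same sum; the difference, as in the slides, is whether the data lives in one shared memory or travels between private local memories.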


Vector processors use vector instructions. A vector processor is equipped with multiple vector pipelines that can be used concurrently under hardware or firmware control.
There are two families of vector processors:
Memory-to-memory architectures support the pipelined flow of vector operands from memory to the pipelines and then back to memory.
Register-to-register architectures use vector registers to interface between the memory and the functional pipelines.
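The two vector-processor families differ mainly in where operands live during a pipelined operation. A toy model of a vector add (the `memory` dictionary and function names are invented for illustration; real machines do this in hardware pipelines):

```python
memory = {"A": [1, 2, 3], "B": [10, 20, 30], "C": None}

def mem_to_mem_add(src1, src2, dst):
    # memory-to-memory: operands stream from memory through the
    # pipeline and results stream straight back to memory
    memory[dst] = [a + b for a, b in zip(memory[src1], memory[src2])]

def reg_to_reg_add():
    # register-to-register: vectors are first loaded into vector
    # registers, the pipeline works on the registers, and a
    # separate vector store writes the result back
    v1 = list(memory["A"])                 # vector load
    v2 = list(memory["B"])                 # vector load
    v3 = [a + b for a, b in zip(v1, v2)]   # register-to-register add
    memory["C"] = v3                       # vector store

mem_to_mem_add("A", "B", "C")
assert memory["C"] == [11, 22, 33]
reg_to_reg_add()
assert memory["C"] == [11, 22, 33]
```

The result is identical; the register-to-register style pays for explicit loads and stores but lets intermediate vectors stay in fast registers across several operations.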


Layers of Computer System Development

Based on Lionel Ni, 1990.
Hardware configurations differ from machine to machine, even among machines of the same model.
The address space of a processor varies among different architectures, since it depends on the memory organization, which is machine-dependent.
These features depend on the target application domains.

Application programs and programming environments need to be developed to be machine-independent, so that user programs can be ported to many computers with minimum conversion cost, independent of the machine architecture.
High-level languages and communication models depend on the architectural choices made.
From the programmer's viewpoint, these layers should be architecture-transparent. The Linda approach, using tuple spaces, offers such an architecture-transparent communication model.

Application programmers prefer more architectural transparency. Compiler and OS support should therefore be designed to remove as many architectural constraints as possible from the programmer.
