Accelerating parallel computations and anomaly detection

Accessing shared data 8.14
Broadcast on a workstation cluster 3.16
acceleration anomaly 12.7
Broadcast_receive 6.28
Add a list of numbers 4.8
Bubble sort 9.12
Adding numbers 4.2 , 5.16 , 5.20 , 8.32 , 8.40 , 8.44 , 8.45
Bucket sort 4.15
Adding numbers SPMD 5.18
busy waiting 8.16
All gather 6.29
Butterfly barrier 6.10
All-to-all routines 4.20

Algorithmic scalability 1.39
cache memory 8.30
anomaly 12.7
cache coherence protocols 8.30
Amdahl's law 1.36
Cannon's algorithm 10.13
Application of DFT in image processing: 11.36
Cartesian coordinates 11.25
Atomic checks of the locks 8.16
Centralized counter implementations (or linear barrier) 6.6
Asymmetric parallel compare-and-exchange 9.8
Centralized parallel Moore algorithm 7.38

Cellular automata 6.46
back-substitution 10.19
Centralized load balancing 7.9 , 7.10
Barrier 6.3 , 8.11
chromosomes 12.8
Barnes-Hut algorithm 4.32
Circuit switching 1.29
Bernstein's conditions 8.29
CLB 7.9 , 7.10
best-first 12.3
coarse grid 10.41
Best way to climb a mountain 7.21
Concrete architectures 5.19
binary semaphores 8.22
Condition variables 8.24
Bitonic mergesort 9.32
Conway's Game of Life 6.47
bitonic sequence 9.32
Combined deadlock-free blocking 6.14
Blocking 2.11 , 2.33
Combinatorial search 12.2
block matrix multiplication 10.7
Compare-and-exchange sorting algorithms 9.7
Block partitions 6.31
Computation time 3.4 , 3.11
branch and bound search 12.2 , 12.3 , 12.5
Computer classification 1.9
breadth-first 12.3
Communication time 3.4 , 3.11 , 3.18
Broadcast 2.13 , 3.13
Communication time by Ping-Pong methods 3.18
Broadcast on a hypercube network 3.13
Computing an Integral 3.35
Broadcast on a mesh network 3.14
Computing pi: 3.34
Contrast stretching 11.5
Embarrassingly parallel computations 3.20
Cost-optimal algorithms 3.12
Embedding 1.19
Creating bitonic sequences 9.34
Evaluating programs empirically 3.17
cut-off functions 12.3
Eratosthenes 5.29
Cyclic partition 6.31
Execution order 8.11
cyclic-strip partitions 10.27
Examples / Pthreads 8.33

Examples / Unix 8.43
Data-flow architecture 5.15
Examples / Java 8.45
Data parallel computations 6.15
E-node 12.4
Data partitioning 9.10

Deadlock 1.27 , 6.13
false sharing 8.30
Decentralized load balancing 7.9
Faster convergence methods 10.33
Decentralized DLB 7.12
Fast Fourier transform 11.41
Decentralized parallel Moore algorithm 7.41
FFT 11.41
Debugging strategies 3.19
Fixed energy distributed termination algorithm 7.27
deceleration anomaly 12.7
Frequency filter 5.6
Dependency analysis 8.28
Forall 6.16
depth-first 12.3
Fork-Join construct 8.4
detrimental anomaly 12.7
fork in Unix 8.5
DFT 11.36
Fourier transform 11.4 , 11.29 , 11.30
Dilation 1.20
Fourier series 11.29 , 11.30
discrete Fourier transform 11.32

Distributed termination detection 7.21
Gather 2.15
Divide-and-Conquer 4.1 , 4.8
Gather on a hypercube network 3.14
DLB 7.7
Gaussian elimination 10.19 , 10.23
DLB on a line structure 7.17
Gauss-Seidel relaxation 10.33
Dual-pass ring termination algorithm 7.25
Gauss-Seidel over-relaxation formula 10.36
Duplicated computation 9.9
Genetic algorithms 7.4 , 12.8
Dynamic load balancing 7.7
general semaphores 8.22
dynamic programming 12.2
genetic algorithms 12.2

Global_barrier 6.29
edge detection 11.4 , 11.14 , 11.20
gradient direction 11.15
gradient magnitude 11.15
Linear barrier 6.6
Graph representation 7.32
Linear equations 5.34
Gravitational N-body problem 4.27
Load balancing 7.1 , 7.2
gray-scale 10.2
Local synchronization 6.12
Group of stages 5.14
Locks 8.17
Gustafson's law 1.40
Logical stages 5.14
Hardware scalability 1.39
Mandelbrot set 3.27
Heat distribution problem 6.33
Mesh implementations 10.13
heavyweight processes 8.7
matching 11.4
Higher order difference methods 10.37
Matrices 10.2
hill climbing 12.2 , 12.24
matrix multiplication 10.5
Histogram 11.6
matrix-vector multiplication 10.4
Hough transformation 11.4
Mean 11.9
Hough transform 11.21
Median 11.10 , 11.11
Hyperquicksort 9.27
Measuring execution time 3.17

Merging two sublists 9.11
Insertion sort 5.24-5.28
Mergesort 9.17 , 9.27 , 9.30 , 9.32
Integral 3.35 , 4.21
Message-passing computing 2.1
Interleaving 8.11
Message tag 2.12
inverse Fourier transform 11.31 , 11.32
MIMD 1.9
Image processing 3.22 , 11.1 , 11.3
MPI (Message Passing Interface) 2.18 , 2.29
Image processing methods 10.4
MPI_Allgather 6.29
MPI_Alltoall 4.20
Jacobi iteration 6.23 , 10.28
MPI_Barrier 6.5
Jacobi Over-relaxation 10.35
MPI_Bsend 6.45
Java 8.44
MPI_Irecv 6.45
MPI_Isend 6.45
Language constructs for parallelism 8.27
MPI_Sendrecv 6.14 , 6.45
Laplace's equation 10.30
MPI_Sendrecv_replace 6.14
Laplace operator 11.19 , 11.20
MPMD 1.9 , 2.5
Latency hiding 3.6
Monitor 8.23
Monte-Carlo methods 3.34
Parallel FFT 11.45
Moore algorithm 7.35
Parallel genetic algorithms (PGA) 12.18
most likely lines 11.25
Parallel matrix multiplication 10.6
Multicast 2.13
Parallel mean computation 11.9
Multi-grid method 10.40
Parallel program evaluation 3.3
Multi-grid processor allocation 10.40
Parallel random number generation 3.37
Mutual exclusion 8.16
Parallel sorting algorithms 9.2
mutex variables 8.26
Parallel time 4.17 , 4.18
M-ary divide and conquer 4.14
Parallelism in space and in time 5.2

Partial barriers 6.12
N-body problem 4.27
Partial pivoting 10.21
Natural evolution 12.8
Parallelizing a common population 12.22
noise reduction 11.7
Particular interpretations of SPP 7.30
Non-Blocking 2.11 , 2.34
Partitioning 6.40 , 10.26
Nonblocking routines 7.20
Partitioning into sub-matrices 10.7
Nonblocking send routines 3.4
Partitioning strategies 4.1
Network Criteria 1.12
Performance analysis 1.33
Numerical algorithms 10.1
PGA 12.18
Numerical integration 4.21
Prewitt operator 11.17

Pi 3.34
Odd-even transposition sort 9.13
Ping-Pong methods 3.18
Odd-even mergesort 9.30
Pipeline computations 5.1
Operations on matrices 10.3
Pipeline latency 5.9
optimization techniques 12.2
Pipeline processing 5.12
Order of magnitude 3.7
Pipeline technique 5.4
Over-relaxation 10.35
Packet switching 1.24

Point processing 11.5
Parallel bubble sort 9.13
Potential speedup (sorting) 9.2
Parallel branch-and-bound 12.5
Prime numbers 5.29
Parallel data-flow architectures 5.3
Prefix sum problem: 6.17
Parallel DFT 11.39
priority queue 12.4
Parallel execution time 3.4
Process selection in DLB 7.16
Processor speed 5.3
sendrecv() 6.14
Processor-time optimal algorithms 3.12
Sender-initiated method 7.15
Program evaluation 3.3
Sequential algorithms for SPP 7.34
Pseudo-random number generation 3.36
Sequential genetic algorithms 12.10
pthread_mutex_trylock 8.20
Sharing data 8.14
pthread_mutex_lock 8.26
Sharing data in systems with caches 8.30
Pthreads 8.9 , 8.31 , 8.40 , 8.41
Shared data 8.27
Pthreads locks 8.19
Shared memory model 8.2
PVM (Parallel Virtual Machine) 2.17 , 2.19-2.28 , 7.20
Shared memory multiprocessor system 8.2
pvm_barrier 6.5
sharpening 11.7
Shearsort 9.15
Quicksort 9.19
Shifting 3.22
Shortest path problem 7.29
Random number generation 3.36
Sieve of Eratosthenes 5.29
Random polling algorithm 7.16
simulated annealing 12.2
Randomized algorithm 7.4
SISD 1.9
Rank sort 9.3 , 9.4 , 9.5
Simulated annealing 7.4
Receiver-initiated method 7.13
Single-pass ring termination algorithm 7.24
Recursive bisection 7.4
SLB 7.4
Reduce 2.16
smoothing 11.7
Red-black ordering 10.35
snakelike style 9.15
Relaxation 10.35 (and before)
Sobel operator 11.18 , 11.20
Rotation 3.22
SoC Cluster 2.40
Round robin algorithm 7.4 , 7.16
Sorting a bitonic sequence (SBS) 9.34

Sorting algorithms 9.1
SBS 9.34
Sorting numbers, Type 2 5.24
Scalability 1.39
Space-time diagrams 5.8
Scaling 3.22
Speedup factor 1.33
Scatter 2.15
sparse linear equations 10.29
Searching a graph 7.34
Spin lock 8.18
Searching and optimization 12.1
SPMD 1.11 , 2.4
Semaphores 8.21
SPP 7.29
Statement execution order 8.11
Tree implementation 6.8
Static load balancing 7.7
Two dimensional sorting 9.15
Stages vs. processes 5.14
Two-dimensional Fourier transform 11.35
strip partitions 10.26

Successive refinement 12.23
Unsafe send/receive 6.43
Superlinear speedup 1.34

Symmetric parallel compare-and-exchange 9.9
Virtual cut-through 1.25
Synchronized computations 6.15
Von Neumann architecture 5.3
synchronized methods 8.44

Synchronous computations 6.1
Wormhole routing 1.26
Synchronous iteration 6.20
Weighted masks 11.12
Synchronous message-passing 2.6

Systems of linear equations 5.34 , 6.21 , 10.18
Systolic array 10.17
Task transfer mechanisms 7.13

TD 7.21
TD using acknowledgment messages 7.22
Tembusu 2.40
Termination detection 7.1 , 7.2
Termination in centralized DLB 7.12
The O-notation 3.8
The notation 3.9
The -notation 3.9
Threads vs. (heavyweight) processes 8.7
Thread barrier 8.11
Thread-safe routines 8.13
Thresholding 11.5
Time complexity 3.7
Transformation into frequency domain 11.29
triangular system 10.19
Tree barrier 6.8
