David B. Thomas, Lee Howes, Wayne Luk
Presented by: M. Ameen Qureshi
Introduction
RNG Parallelism
High Performance Computing
Deterministic generators (pseudo-random number generation)
Beneficiaries: simulations, cryptography, genetic algorithms, climate modeling
Uniform Generation
Period P should be close to the number of possible states of S
S is a vector of w bits, so P should be close to 2^w
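The relation between period and state size can be checked on a toy generator. The sketch below is illustrative only and is not taken from the slides: it assumes a small Fibonacci linear feedback shift register, and the function name `lfsr_period` and the tap choices are hypothetical. With w = 4 bits of state, a well-chosen feedback reaches period 2^w - 1 = 15 (the all-zero state is excluded), while a poor choice falls well short.

```python
def lfsr_period(w, taps, seed=1):
    """Count steps until a w-bit Fibonacci LFSR returns to its seed state.

    taps are 1-indexed bit positions counted from the least significant bit.
    Returns None if the seed is not revisited within 2**w steps.
    """
    mask = (1 << w) - 1
    start = seed & mask
    state = start
    for steps in range(1, (1 << w) + 1):
        # The feedback bit is the XOR of all tapped bits.
        bit = 0
        for t in taps:
            bit ^= (state >> (t - 1)) & 1
        # Shift left by one and feed the new bit in at the bottom.
        state = ((state << 1) | bit) & mask
        if state == start:
            return steps
    return None


print(lfsr_period(4, (4, 3)))  # 15, i.e. 2**4 - 1: the period is close to 2^w
print(lfsr_period(4, (4, 2)))  # 6: same state size, much shorter period
```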
Combined Tausworthe
XorShift
Mersenne Twister
SFMT (SIMD-oriented Fast Mersenne Twister)
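As a concrete instance of one of the generators listed above, here is a minimal sketch of a 32-bit XorShift step. It assumes Marsaglia's shift triple (13, 17, 5); the slides do not say which variant or parameters were used, and the helper name `uniform_stream` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def xorshift32(state):
    """Advance a nonzero 32-bit xorshift state by one step (shifts 13, 17, 5)."""
    state &= MASK32
    state ^= (state << 13) & MASK32  # left shifts are masked back to 32 bits
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state


def uniform_stream(seed, n):
    """Return n floats in [0, 1) produced from a nonzero 32-bit seed."""
    if seed & MASK32 == 0:
        raise ValueError("xorshift state must be nonzero")
    x = seed
    out = []
    for _ in range(n):
        x = xorshift32(x)
        out.append(x / 2**32)  # scale the 32-bit integer into [0, 1)
    return out


print(xorshift32(1))  # 270369
print(uniform_stream(1, 3))
```

The whole update is three shifts and three XORs on a single word, which is why generators of this family map well onto hardware with little per-thread state.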
Non-Uniform Generation
Inversion
Transformation
Box-Muller Method
Rejection
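The three approaches named above can be sketched in a few lines each. This is an illustrative sketch under stated assumptions, not the presenters' code: the target distributions (exponential for inversion, standard normal for Box-Muller, a triangular density for rejection) and all function names are chosen here for demonstration, and Python's `random.Random` stands in for the uniform generator.

```python
import math
import random


def exponential_by_inversion(u, rate=1.0):
    """Inversion: apply the inverse CDF of Exp(rate) to u in [0, 1)."""
    return -math.log(1.0 - u) / rate


def box_muller(u1, u2):
    """Box-Muller: turn u1 in (0, 1] and u2 in [0, 1) into two standard normals."""
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)


def rejection_sample(pdf, lo, hi, pdf_max, rng):
    """Rejection: propose (x, y) uniformly in a bounding box, keep x if y < pdf(x)."""
    while True:
        x = lo + (hi - lo) * rng.random()
        y = pdf_max * rng.random()
        if y < pdf(x):
            return x


rng = random.Random(0)
print(exponential_by_inversion(rng.random()))
# 1 - random() lies in (0, 1], which keeps log() away from zero.
print(box_muller(1.0 - rng.random(), rng.random()))
# Triangular density f(x) = 2x on [0, 1], bounded above by 2.
print(rejection_sample(lambda x: 2.0 * x, 0.0, 1.0, 2.0, rng))
```

Inversion and Box-Muller consume a fixed number of uniforms per output, whereas rejection loops a data-dependent number of times, a difference that matters on hardware where threads run in lockstep.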
Platforms
CPU
FPGA
GPU
Thread-level parallelism
More area devoted to ALUs; cache and scheduling logic removed
Each multiprocessor executes up to 1024 threads at once
Threads run in batches of 32 (warps)
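To make the warp model concrete for random number generation, the sketch below simulates one warp of 32 lanes on the host, each lane holding a private xorshift state and all lanes advancing together. It is a plain-Python illustration, not GPU code. The seeding scheme in `lane_seeds` is hypothetical and only guarantees distinct nonzero seeds; it does not guarantee statistically independent streams.

```python
WARP_SIZE = 32        # threads per warp, as stated above
MAX_THREADS = 1024    # threads resident at once, as stated above
MASK32 = 0xFFFFFFFF


def xorshift32(x):
    """One step of a 32-bit xorshift generator (assumed shifts 13, 17, 5)."""
    x ^= (x << 13) & MASK32
    x ^= x >> 17
    x ^= (x << 5) & MASK32
    return x


def lane_seeds(base_seed, lanes=WARP_SIZE):
    """Hypothetical seeding: one distinct nonzero 32-bit seed per lane."""
    seeds = []
    k = 0
    while len(seeds) < lanes:
        s = (base_seed + k * 0x9E3779B9) & MASK32
        k += 1
        if s != 0:  # xorshift must never start from the all-zero state
            seeds.append(s)
    return seeds


def warp_step(states):
    """Advance every lane by one step, mimicking lockstep execution in a warp."""
    return [xorshift32(s) for s in states]


states = lane_seeds(12345)
states = warp_step(states)
print(MAX_THREADS // WARP_SIZE)  # 32 warps cover 1024 threads
print(len(states))               # 32 lane states, one per thread
```

Because every lane executes the same branch-free update, a generator of this shape causes no divergence within a warp.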
MPPA
Hundreds of RISC CPUs arranged in a regular grid
Small memories and 2D communication channels