Extra allocation and extra copy required:
malloc(a)
cudaMallocHost(b)
memcpy(b, a)
cudaMemcpy() to GPU, launch kernels, cudaMemcpy() from GPU
memcpy(a, b)
cudaFreeHost(b)

Just register and go!
malloc(a)
cudaHostRegister(a)
cudaMemcpy() to GPU, launch kernels, cudaMemcpy() from GPU
cudaHostUnregister(a)
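The register-and-go sequence above can be sketched as a small program. This is a minimal illustration, not the slide author's code: the `check` helper, the `scale` kernel, and the buffer size are hypothetical, and the sketch assumes a recent CUDA toolkit that provides `cudaHostRegisterDefault`. Some toolkit versions place alignment requirements on the registered range, so check the documentation for the release in use.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: exit with a message if a CUDA runtime call failed.
static void check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Hypothetical kernel: doubles every element in place.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // malloc(a): ordinary pageable host memory, e.g. owned by existing code.
    float *a = static_cast<float *>(std::malloc(bytes));
    if (!a) return EXIT_FAILURE;
    for (int i = 0; i < n; ++i) a[i] = 1.0f;

    // cudaHostRegister(a): page-lock the existing allocation in place,
    // so no second pinned buffer and no host-side memcpy are needed.
    check(cudaHostRegister(a, bytes, cudaHostRegisterDefault), "cudaHostRegister");

    float *d = nullptr;
    check(cudaMalloc(&d, bytes), "cudaMalloc");

    // cudaMemcpy() to GPU, launch kernels, cudaMemcpy() from GPU.
    check(cudaMemcpy(d, a, bytes, cudaMemcpyHostToDevice), "copy to GPU");
    scale<<<(n + 255) / 256, 256>>>(d, n);
    check(cudaGetLastError(), "kernel launch");
    check(cudaMemcpy(a, d, bytes, cudaMemcpyDeviceToHost), "copy from GPU");

    // cudaHostUnregister(a): release the page lock before freeing.
    check(cudaHostUnregister(a), "cudaHostUnregister");
    check(cudaFree(d), "cudaFree");

    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (a[i] != 2.0f) ok = false;
    }
    std::printf(ok ? "ok\n" : "mismatch\n");

    std::free(a);
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

The buffer `a` stays owned by `malloc`/`free` throughout; registration only changes how the driver treats its pages while the transfers run.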
• thrust::device_vector
• thrust::host_vector
• thrust::device_ptr
• Etc.

• thrust::sort
• thrust::reduce
• thrust::exclusive_scan
• Etc.
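A short sketch of how the containers and algorithms listed above fit together. The input values and variable names are illustrative assumptions; only standard Thrust headers and calls are used.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <thrust/device_ptr.h>
#include <thrust/device_vector.h>
#include <thrust/fill.h>
#include <thrust/host_vector.h>
#include <thrust/reduce.h>
#include <thrust/scan.h>
#include <thrust/sort.h>

int main() {
    // host_vector lives in CPU memory; fill it with sample data.
    thrust::host_vector<int> h(4);
    h[0] = 3; h[1] = 1; h[2] = 4; h[3] = 2;

    // Assigning to a device_vector copies the data to the GPU.
    thrust::device_vector<int> d = h;

    // Sort on the device: 1 2 3 4.
    thrust::sort(d.begin(), d.end());

    // Sum on the device: 1 + 2 + 3 + 4.
    int sum = thrust::reduce(d.begin(), d.end());
    assert(sum == 10);

    // Exclusive prefix sum of the sorted data: 0 1 3 6.
    thrust::device_vector<int> scanned(d.size());
    thrust::exclusive_scan(d.begin(), d.end(), scanned.begin());

    // Copy back to the host to inspect the result.
    thrust::host_vector<int> out = scanned;
    assert(out[0] == 0 && out[1] == 1 && out[2] == 3 && out[3] == 6);

    // device_ptr wraps a raw cudaMalloc pointer so Thrust algorithms accept it.
    const int n = 8;
    int *raw = nullptr;
    cudaError_t err = cudaMalloc(&raw, n * sizeof(int));
    assert(err == cudaSuccess);
    thrust::device_ptr<int> p(raw);
    thrust::fill(p, p + n, 1);
    assert(thrust::reduce(p, p + n) == n);
    cudaFree(raw);

    return 0;
}
```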
© NVIDIA Corporation 2011
Details @ http://www.nvidia.com/object/software-for-tesla-products.html
NVIDIA GPUDirect™ v2.0
[Diagram: GPU1 and GPU2, each with its own memory, attached through the chipset to the CPU and system memory]
Copy staged through system memory:
1. cudaMemcpy(GPU2, sysmem)
2. cudaMemcpy(sysmem, GPU1)
Direct copy between GPU memories:
1. cudaMemcpy(GPU2, GPU1)
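The direct GPU-to-GPU copy in the diagram can be sketched with the runtime's peer-to-peer calls. This is a hedged illustration: the `check` helper, device numbering (0 and 1), and buffer size are assumptions, and whether the copy actually bypasses system memory depends on the hardware and on `cudaDeviceCanAccessPeer` reporting support. `cudaMemcpyPeer` is still valid when peer access is unavailable.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: exit with a message if a CUDA runtime call failed.
static void check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

int main() {
    int count = 0;
    check(cudaGetDeviceCount(&count), "cudaGetDeviceCount");
    if (count < 2) {
        std::printf("this sketch needs two GPUs\n");
        return EXIT_SUCCESS;
    }

    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> src(n, 3.0f), dst(n, 0.0f);

    // One buffer on each GPU (assumed here to be devices 0 and 1).
    float *gpu1 = nullptr, *gpu2 = nullptr;
    check(cudaSetDevice(0), "cudaSetDevice(0)");
    check(cudaMalloc(&gpu1, bytes), "cudaMalloc on device 0");
    check(cudaSetDevice(1), "cudaSetDevice(1)");
    check(cudaMalloc(&gpu2, bytes), "cudaMalloc on device 1");

    // Enable peer access in both directions where the hardware allows it.
    int can01 = 0, can10 = 0;
    check(cudaDeviceCanAccessPeer(&can01, 0, 1), "cudaDeviceCanAccessPeer");
    check(cudaDeviceCanAccessPeer(&can10, 1, 0), "cudaDeviceCanAccessPeer");
    if (can01 && can10) {
        check(cudaSetDevice(0), "cudaSetDevice(0)");
        check(cudaDeviceEnablePeerAccess(1, 0), "enable 0 -> 1");
        check(cudaSetDevice(1), "cudaSetDevice(1)");
        check(cudaDeviceEnablePeerAccess(0, 0), "enable 1 -> 0");
    }

    // Fill GPU1, then copy GPU1 -> GPU2 with a single call.
    check(cudaSetDevice(0), "cudaSetDevice(0)");
    check(cudaMemcpy(gpu1, src.data(), bytes, cudaMemcpyHostToDevice), "fill GPU1");
    check(cudaMemcpyPeer(gpu2, 1, gpu1, 0, bytes), "cudaMemcpyPeer");

    // Read the result back from GPU2 and verify it.
    check(cudaSetDevice(1), "cudaSetDevice(1)");
    check(cudaMemcpy(dst.data(), gpu2, bytes, cudaMemcpyDeviceToHost), "read GPU2");
    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (dst[i] != 3.0f) ok = false;
    }
    std::printf(ok ? "ok\n" : "mismatch\n");

    check(cudaFree(gpu2), "cudaFree on device 1");
    check(cudaSetDevice(0), "cudaSetDevice(0)");
    check(cudaFree(gpu1), "cudaFree on device 0");
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```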
cudaMemcpyHostToHost
cudaMemcpyHostToDevice
cudaMemcpyDeviceToHost
cudaMemcpyDeviceToDevice

cudaMemcpyDefault (data location becomes an implementation detail)
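A minimal sketch of what replacing the four explicit copy kinds with `cudaMemcpyDefault` looks like. The `check` helper and buffer names are hypothetical. The sketch assumes a system with unified virtual addressing, which it tests through the `unifiedAddressing` device property, and it uses pinned host memory from `cudaMallocHost` so every pointer is known to the runtime.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: exit with a message if a CUDA runtime call failed.
static void check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

int main() {
    // cudaMemcpyDefault relies on unified virtual addressing.
    cudaDeviceProp prop;
    check(cudaGetDeviceProperties(&prop, 0), "cudaGetDeviceProperties");
    if (!prop.unifiedAddressing) {
        std::printf("device 0 does not report unified addressing\n");
        return EXIT_SUCCESS;
    }

    const int n = 1024;
    const size_t bytes = n * sizeof(int);

    int *host_in = nullptr, *host_out = nullptr, *dev_a = nullptr, *dev_b = nullptr;
    check(cudaMallocHost(&host_in, bytes), "cudaMallocHost");
    check(cudaMallocHost(&host_out, bytes), "cudaMallocHost");
    check(cudaMalloc(&dev_a, bytes), "cudaMalloc");
    check(cudaMalloc(&dev_b, bytes), "cudaMalloc");
    for (int i = 0; i < n; ++i) host_in[i] = i;

    // Same call and same flag for every direction: the runtime infers the
    // direction from the pointer values themselves.
    check(cudaMemcpy(dev_a, host_in, bytes, cudaMemcpyDefault), "host to device");
    check(cudaMemcpy(dev_b, dev_a, bytes, cudaMemcpyDefault), "device to device");
    check(cudaMemcpy(host_out, dev_b, bytes, cudaMemcpyDefault), "device to host");

    bool ok = true;
    for (int i = 0; i < n; ++i) {
        if (host_out[i] != i) ok = false;
    }
    std::printf(ok ? "ok\n" : "mismatch\n");

    check(cudaFree(dev_a), "cudaFree");
    check(cudaFree(dev_b), "cudaFree");
    check(cudaFreeHost(host_in), "cudaFreeHost");
    check(cudaFreeHost(host_out), "cudaFreeHost");
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Library code written this way can accept a pointer without knowing where the caller allocated it.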
artificial retinas possible, and that wasn’t predicted to happen until 2060.”
GPU Technology Conference 2011
October 11-14 | San Jose, CA
The one event you can’t afford to miss
www.gputechconf.com