History
• Ian Buck, Nvidia's Director of GPU Computing, received his
PhD from Stanford in 2004 for his research on GPGPU
programming
• Started working for Nvidia to commercialize
GPU computing
• The first release came in 2006, when Nvidia shipped
CUDA 1.0 for the G80 architecture
• In spring 2008, CUDA 2.0 was released together
with GT200
About
• With CUDA, ordinary applications can be
ported to the GPU for higher performance
• No low-level or 3D graphics programming
knowledge is required: CUDA works with C
CPU vs GPU
• A CPU core can execute 4 32-bit instructions per
clock, while a GPU of the era can execute 3,200 32-bit
instructions per clock
• A CPU is designed primarily to be an executive:
it makes decisions and directs the rest of the system
• A GPU is different: it has a large number of
ALUs (Arithmetic/Logic Units), far more than a
CPU
Structure
• In CUDA, you are required to specify the
number of blocks and threads in each
block.
• One block can contain up to 512 threads.
• Each thread in each block executes
independently
Syntax
• Key parts:
• Identifying a GPU function (__global__,
__device__)
• Calling a GPU function, specifying number
of blocks and threads per block
function<<<block_nr, thread_nr>>>(param);
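The two key parts above can be shown together in a minimal sketch; the kernel name, data, and launch configuration here are illustrative, not from the slides:

```cuda
#include <cstdio>

// __global__ marks a function that runs on the GPU
// but is launched from CPU code
__global__ void add_one(int *data)
{
    int i = threadIdx.x + blockIdx.x * blockDim.x;  // global thread ID
    data[i] += 1;
}

int main(void)
{
    const int n = 4 * 64;               // 4 blocks x 64 threads
    int *d;
    cudaMalloc(&d, n * sizeof(int));
    cudaMemset(d, 0, n * sizeof(int));

    add_one<<<4, 64>>>(d);              // block_nr = 4, thread_nr = 64
    cudaDeviceSynchronize();

    cudaFree(d);
    return 0;
}
```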
Bruteforce
• Because many candidates are processed at the
same time, parallel programming has a
big impact on bruteforcing
• The number of tries per second is drastically
higher on a GPU than on a CPU
Examples
• Let’s say we have a password to break,
and the only thing we know is that its
length is 3
• A simple bruteforce would be:
Real Life
• Let’s say we have an MD5 hash and a wordlist
of 1,000,000 words
• A simple bruteforce would be:
Real Life
• A GPU bruteforce would be:
• Called like this:
• threadIdx.x + blockIdx.x * blockDim.x is the thread
ID (ranging from 0 to 999,999)
• 2000 blocks × 500 threads = 1,000,000 threads
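A hedged sketch of the GPU version, with the same toy hash standing in for MD5 and one thread per wordlist entry; the kernel name, fixed word length, and host-side setup are assumptions, not the slides' code:

```cuda
#include <cstdio>

#define WORD_LEN 16   // assumed fixed slot size per word in the flat array

// Toy stand-in for MD5, same as the CPU sketch.
__device__ unsigned int toy_hash(const char *s)
{
    unsigned int h = 2166136261u;
    while (*s) { h ^= (unsigned char)*s++; h *= 16777619u; }
    return h;
}

// Each thread hashes one candidate word and compares it to the target.
__global__ void crack(const char *words, unsigned int target, int *hit)
{
    // Global thread ID, 0 .. blockDim.x * gridDim.x - 1
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    if (toy_hash(words + id * WORD_LEN) == target)
        *hit = id;                      // record which word matched
}

int main(void)
{
    // ... copy the 1,000,000-word list to the GPU as d_words,
    //     allocate d_hit, set the target hash ...
    // 2000 blocks x 500 threads = 1,000,000 threads, one per word:
    // crack<<<2000, 500>>>(d_words, target, d_hit);
    return 0;
}
```

All 1,000,000 comparisons are issued at once instead of one per loop iteration, which is where the GPU speedup comes from.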