NVidia CUDA for Bruteforce Attacks - DefCamp 2012

  1. History • Ian Buck, Director of GPU Computing at Nvidia, received his PhD from Stanford in 2004 for his research on GPPM • He then joined Nvidia to commercialize GPU computing • The first milestone came in 2006, when Nvidia introduced CUDA for the G80 (CUDA 1.0 followed in 2007) • In spring 2008, CUDA 2.0 was released together with GT200
  2. About • With CUDA, ordinary applications can be ported to the GPU for higher performance • No low-level or 3D programming knowledge is required; CUDA works with C
  3. CPU vs GPU • A CPU core can execute 4 32-bit instructions per clock, whilst a GPU can execute 3200 32-bit instructions per clock • A CPU is designed primarily to act as an executive and make decisions • A GPU is different: it has a large number of ALUs (Arithmetic/Logic Units), far more than a CPU
  4. Structure • In CUDA, you must specify the number of blocks and the number of threads in each block • One block can contain up to 512 threads on compute capability 1.x devices (1024 on 2.x and later) • Each thread in each block executes independently
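The block and thread limits above can be checked at runtime instead of being hard-coded. The following is a minimal sketch, not code from the slides: the helper names `max_threads_per_block` and `blocks_for` are hypothetical, and only `cudaGetDeviceProperties` and the `maxThreadsPerBlock` field come from the CUDA runtime API.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: returns the per-block thread limit of `device`,
// or -1 if the device cannot be queried (for example, no GPU present).
int max_threads_per_block(int device) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return -1;
    return prop.maxThreadsPerBlock;
}

// Hypothetical helper: smallest number of blocks such that
// blocks * threads_per_block >= n (integer ceiling division).
int blocks_for(int n, int threads_per_block) {
    return (n + threads_per_block - 1) / threads_per_block;
}
```

For 1,000,000 work items and 500 threads per block, `blocks_for(1000000, 500)` gives 2000 blocks.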
  5. Structure
  6. Syntax • Key parts: • Marking a GPU function (__global__ for a kernel launched from CPU code, __device__ for a function called from GPU code) • Calling a GPU function, specifying the number of blocks and threads per block: function<<<block_nr, thread_nr>>>(param);
  7. Syntax • CPU Code: • Calling function:
  8. Syntax • GPU Code: • Calling function:
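The code shown on the two Syntax slides is not part of this transcript. As an illustration of the same CPU-versus-GPU contrast, here is a small sketch under our own assumptions: the element-wise addition and the names `add_cpu`, `add_gpu` and `add_on_gpu` are hypothetical, while the runtime calls (`cudaMalloc`, `cudaMemcpy`, `cudaFree`) are standard CUDA.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// CPU code: a single loop walks over every element.
void add_cpu(const int *a, const int *b, int *out, int n) {
    for (int i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

// GPU code: the loop disappears; each thread handles one element.
__global__ void add_gpu(const int *a, const int *b, int *out, int n) {
    int i = threadIdx.x + blockIdx.x * blockDim.x;  // global thread index
    if (i < n) out[i] = a[i] + b[i];                // guard the last block
}

// Calling function: copies the inputs to the GPU, launches the kernel and
// copies the result back. Returns true if every CUDA call succeeded.
bool add_on_gpu(const int *a, const int *b, int *out, int n) {
    size_t bytes = static_cast<size_t>(n) * sizeof(int);
    int *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    bool ok = cudaMalloc(&d_a, bytes) == cudaSuccess
           && cudaMalloc(&d_b, bytes) == cudaSuccess
           && cudaMalloc(&d_out, bytes) == cudaSuccess
           && cudaMemcpy(d_a, a, bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(d_b, b, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 256;                          // threads per block
        int blocks = (n + threads - 1) / threads;   // enough blocks to cover n
        add_gpu<<<blocks, threads>>>(d_a, d_b, d_out, n);
        ok = cudaMemcpy(out, d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_a);    // cudaFree on a null pointer is a no-op
    cudaFree(d_b);
    cudaFree(d_out);
    return ok;
}
```

The CPU function is called directly, whereas the GPU kernel is launched through the `<<<blocks, threads>>>` syntax and needs its data copied to device memory first.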
  9. Bruteforce • Because many candidates are processed at the same time, parallel programming has a big impact on bruteforce • The number of tries per second is drastically higher on a GPU than on a CPU
  10. Examples • Let’s say we have a password to break, and the only thing we know is it has length=3 • A simple bruteforce would be:
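The slide's own code is missing from this transcript. A simple CPU bruteforce for a length-3 password might look like the sketch below. It assumes the password consists of lowercase letters a-z only (the slide states nothing beyond the length), and the function name `bruteforce_cpu` is hypothetical. It is host-only code, so it compiles as plain C++ or under nvcc.

```cuda
#include <cstring>

// Tries every 3-letter lowercase candidate against `target` on the CPU.
// On a match, copies the candidate into `found` (at least 4 bytes) and
// returns the number of tries made; otherwise sets `found` to "" and
// returns the full 26^3 = 17576 tries.
int bruteforce_cpu(const char *target, char *found) {
    int tries = 0;
    char guess[4] = {0, 0, 0, 0};
    for (char a = 'a'; a <= 'z'; ++a)
        for (char b = 'a'; b <= 'z'; ++b)
            for (char c = 'a'; c <= 'z'; ++c) {
                guess[0] = a; guess[1] = b; guess[2] = c;
                ++tries;
                if (std::strcmp(guess, target) == 0) {
                    std::strcpy(found, guess);
                    return tries;
                }
            }
    found[0] = '\0';
    return tries;
}
```

The three nested loops run one after another on a single core, which is exactly the work a GPU can spread across threads.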
  11. Examples • A GPU bruteforce: • Called like this:
  12. Examples • A more efficient GPU bruteforce: • Called like this:
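The kernels from these two Examples slides are not in the transcript, so the following is only one plausible way to write a GPU version of the length-3 search, not the author's code. It assumes lowercase letters a-z and gives every candidate its own thread: a 26 x 26 grid of 26-thread blocks, where `blockIdx.x`, `blockIdx.y` and `threadIdx.x` each select one letter. The names `bruteforce_gpu` and `crack3` are hypothetical.

```cuda
#include <cuda_runtime.h>

// One thread per candidate: 26 * 26 blocks of 26 threads cover all 26^3
// combinations. At most one thread can match, so the writes do not race.
__global__ void bruteforce_gpu(const char *target, char *found) {
    char a = static_cast<char>('a' + blockIdx.x);
    char b = static_cast<char>('a' + blockIdx.y);
    char c = static_cast<char>('a' + threadIdx.x);
    if (a == target[0] && b == target[1] && c == target[2]) {
        found[0] = a; found[1] = b; found[2] = c;
    }
}

// Hypothetical host driver. `target` must hold at least 3 characters and
// `found` at least 4 bytes. Returns true if a match was written to `found`.
bool crack3(const char *target, char *found) {
    char *d_target = nullptr, *d_found = nullptr;
    bool ok = cudaMalloc(&d_target, 3) == cudaSuccess
           && cudaMalloc(&d_found, 4) == cudaSuccess
           && cudaMemcpy(d_target, target, 3, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemset(d_found, 0, 4) == cudaSuccess;
    if (ok) {
        dim3 grid(26, 26);                        // first and second letter
        bruteforce_gpu<<<grid, 26>>>(d_target, d_found);  // third letter
        ok = cudaMemcpy(found, d_found, 4, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_target);
    cudaFree(d_found);
    return ok && found[0] != '\0';
}
```

Comparing against the plaintext is a toy setup; in a real attack each thread would hash its candidate and compare the result with the target hash.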
  13. Real Life • Let’s say we have an MD5 hash and a wordlist of 1,000,000 words • A simple bruteforce would be:
  14. Real Life • A GPU bruteforce would be: • Called like this: • threadIdx.x+blockIdx.x*blockDim.x is the thread ID (ranging from 0 to 999,999, since both indices start at 0) • 2000 blocks * 500 threads per block = 1,000,000 threads
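The wordlist kernel itself is not in the transcript. The sketch below shows how the thread ID can select one word per thread, under several assumptions of our own: words are stored in fixed 16-byte, zero-padded slots, and a short 32-bit FNV-1a hash stands in for MD5 purely to keep the example small (it is not MD5 and not suitable for real use). The names `toy_hash`, `check_words` and `find_word` are hypothetical.

```cuda
#include <cstddef>
#include <cstdint>
#include <cuda_runtime.h>

#define WORD_LEN 16  // assumed fixed slot size per word, zero-padded

// Stand-in for MD5 (NOT MD5): 32-bit FNV-1a over the word's characters.
__host__ __device__ uint32_t toy_hash(const char *word) {
    uint32_t h = 2166136261u;                       // FNV offset basis
    for (int i = 0; i < WORD_LEN && word[i] != '\0'; ++i) {
        h ^= static_cast<unsigned char>(word[i]);
        h *= 16777619u;                             // FNV prime
    }
    return h;
}

// Each thread hashes the word at its own index and records a match.
// If several words match, whichever thread writes last wins.
__global__ void check_words(const char *words, int n, uint32_t target, int *match) {
    int idx = threadIdx.x + blockIdx.x * blockDim.x;  // 0 .. total threads - 1
    if (idx < n && toy_hash(words + idx * WORD_LEN) == target)
        *match = idx;
}

// Hypothetical host driver: returns the index of a matching word,
// -1 if no word matches, or -2 if a CUDA call failed.
int find_word(const char *words, int n, uint32_t target) {
    char *d_words = nullptr;
    int *d_match = nullptr;
    int match = -1;
    size_t bytes = static_cast<size_t>(n) * WORD_LEN;
    bool ok = cudaMalloc(&d_words, bytes) == cudaSuccess
           && cudaMalloc(&d_match, sizeof(int)) == cudaSuccess
           && cudaMemcpy(d_words, words, bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(d_match, &match, sizeof(int), cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 500;                          // threads per block
        int blocks = (n + threads - 1) / threads;   // 2000 for n = 1,000,000
        check_words<<<blocks, threads>>>(d_words, n, target, d_match);
        ok = cudaMemcpy(&match, d_match, sizeof(int), cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_words);
    cudaFree(d_match);
    return ok ? match : -2;
}
```

The `idx < n` guard matters whenever the word count is not an exact multiple of the block size; with 1,000,000 words and 500 threads per block, the 2000 blocks cover the list exactly.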