SlideShare a Scribd company logo
john-devkit: specialized compiler for hash
cracking
Aleksey Cherepanov
lyosha@openwall.com
May 26, 2015
General
john-devkit
is an experiment
not yet embraced by John the Ripper developer community
is a code generator
on input: algo written in special language and a list of
optimizations to apply
on output: C file for John the Ripper
John the Ripper (JtR)
the famous hash cracker
primary purpose is to detect weak Unix passwords
supports 200+ hash formats (types)
supports several kinds of compute devices:
CPU, Xeon Phi
scalar
SIMD: SSE2+/AVX/XOP, AVX2, MIC/AVX-512, AltiVec,
NEON
GPU
OpenCL, CUDA
FPGA, Epiphany
currently for bcrypt only
Problems of JtR development
scalability of programmers is low due to 200+ formats:
sometimes it is hard to apply even 1 optimization to all
formats:
important formats get the optimization first
each additional format to optimize eats more time
support for each device needs a separate implementation
readability degrades when various cases are handled by
preprocessor
Aims of john-devkit
to separate crypto algorithms, optimizations, and output code
for various devices
to include optimizations specific for hash cracking and John
the Ripper
to provide better syntax
to retain or improve performance
to provide precise control over optimizations
to support various devices: CPU, GPU, FPGA
to give great output for great input (not for any input)
to be simple
Early results
john-devkit is not mature
7 formats were implemented separating crypto primitives,
optimizations, and device specific code
good speeds (over default implementation in JtR):
raw-sha256 +22%
raw-sha224 +20%
raw-sha512 +6%
raw-sha384 +5%
bad speeds (but expose interesting features of john-devkit):
raw-sha1 -1%
raw-md4 -11%
raw-md5 -15%
optimizations implemented: interleave, vectorization, unroll of
loops, early reject, additional batching (loop around algo)
all formats got all optimizations without effort
Optimizations
Cracking process
we are in attacker’s position
we have a lot of candidates to try
high parallelism
high level algo:
load hashes (once)
generate some candidates
compute hashes (or only parts)
reject most of wrong candidates
check probable passwords precisely (rare case)
generate next batch of candidates and repeat
formats are integrated into this process using OOP-like calls
over function pointers
Optimizations
some optimizations do not affect internals of crypto
algorithms in any way and may be added to any algorithm
additional loop around algo to process more candidates in 1 call
OpenMP support
other optimizations affect crypto algorithms
vectorization (SIMD)
precomputation
e.g. first few steps in MD*/SHA* for partially changed input
reversal of operations
e.g. last few steps in MD*/SHA*, DES final permutation
loop unrolling
interleaving
bitslicing
and others
Bitslice
splits numbers into bits and computes everything through
bitwise operations
optimization focuses on minimization of Boolean formula (or
circuit)
Roman Rusakov generated current formulas for S-boxes of
DES used in John the Ripper with custom generator
it took 3 months
Billy Bob Brumley demonstrated application of simulated
annealing algorithm to scheduling of DES S-box instructions
so code generation is not new for John the Ripper (not even
speaking about C preprocessor)
Other solutions
OpenCL
is the first thing I hear when I say about output for both CPU
and GPU
has quite heavy syntax (based on C)
knows nothing about John the Ripper
does not have automatic bitslicing
Dynamic formats in John the Ripper
were implemented by Jim Fougeron
separate crypto primitives from formats
so md5($p) and md5(md5($p)) have one code base
work at runtime
john-devkit aims to be able to do similar thing but at compile
time and with ability to optimize better
so md5(md5($p)) would get more optimizations (at price of
separate code)
C Macros
allow to do things, but are not smart
an example of loop unroll in Keccak defining all useful
variants:
[...]
#elif (Unrolling == 3)
#define rounds 
prepareTheta 
for(i=0; i<24; i+=3) { 
thetaRhoPiChiIotaPrepareTheta(i , A, E) 
thetaRhoPiChiIotaPrepareTheta(i+1, E, A) 
thetaRhoPiChiIotaPrepareTheta(i+2, A, E) 
copyStateVariables(A, E) 
} 
copyToState(state, A)
#elif (Unrolling == 2)
#define rounds 
prepareTheta 
for(i=0; i<24; i+=2) { 
thetaRhoPiChiIotaPrepareTheta(i , A, E) 
thetaRhoPiChiIotaPrepareTheta(i+1, E, A) 
} 
copyToState(state, A)
X-Macro
is a tricky way to use macros, most likely with a separate file
to be included multiple times:
the file has code with variable parts
these parts are defined before #include
so #include provides a ”template engine”
example from NetBSD’s libcrypt:
[...]
#define HASH_Init SHA1Init
#define HASH_Update SHA1Update
#define HASH_Final SHA1Final
#include "hmac.c"
john-devkit technical details
From Python to C in john-devkit
code in intermediate language (IL) is generated from
algorithm description
the code is modified by several functions chosen by user
C code is generated from the modified the code using a
template
Intermediate Language (IL)
while algorithms are written in Python with modified
environment, john-devkit uses flat representation of code using
its own instruction language called intermediate language
some instructions of this language express constructions
specific to hash cracking
for instance, state variables of hash functions are defined by
special instruction
intermediate language is very simple
intermediate language is intended to be rich to express
common constructions natively to simplify optimization
Example of specific instruction
separate instruction is used to define state variable, so
john-devkit uses a filter to replace initial state with values for
SHA-224 having code for SHA-256:
def override_state(code, state):
consts = {}
for l in code:
if l[0] == ’new_const’:
consts[l[1]] = l
if l[0] == ’new_state_var’:
consts[l[2]][2] = str(state.pop(0))
return code
Optimizations specific to password cracking
use knowledge not available to regular compiler:
code can be moved between some functions of format
the functions have different probability to be called
so main computation is always called
check of probable candidates is very rare
it almost implies a successful guess (for strong hashes),
also hashes are loaded only once while there are millions of
candidates being hashed every second
Specific optimization: early reject
hashes are long
some output values may be computed a bit quicker than
others
a 32-bit or 64-bit one value is usually enough to reject almost
all wrong candidates
so john-devkit drops instructions for computation of other
output values in main working function and places full
implementation into function for precise check of possible
password
main implementation is vectorized while full implementation is
scalar because it checks only 1 candidate
Specific optimization: steps reversal
some operations can be reversed
if r = i + C, we know r, and C is a constant, then i = r - C
John the Ripper learns ”r” when it loads hashes
john-devkit can sometimes reverse operations, replacing
”forward” computation during cracking (applied per candidate
password) with reverse computation at startup (applied per
hash)
Full Python
is available to define algorithms
the environment has some objects with overloaded
instructions to produce code in IL in a global variable instead
of running it right away
but any Python code can be used
it is evaluated fully before further steps of code generation
but to produce good output some additional markup may be
needed
Full Python, example
a part of MD4 definition adapted right from RFC 1320:
def make_round(func, code):
res = ’’
func = re.sub(’([abcdks])’, r’{1}’, func)
parts = re.compile(r’[(.)(.)(.)(.)s+(d+)s+(d+)]’
).findall(code)
for a, b, c, d, k, s in parts:
res += func.format(**vars()) + "n"
return res
exec make_round(’a = rol((a + F(b, c, d) + X[k]), s)’,
’’’ [ABCD 0 3] [DABC 1 7] [CDAB 2 11] [BCDA 3 19]
[ABCD 4 3] [DABC 5 7] [CDAB 6 11] [BCDA 7 19]
[ABCD 8 3] [DABC 9 7] [CDAB 10 11] [BCDA 11 19]
[ABCD 12 3] [DABC 13 7] [CDAB 14 11] [BCDA 15 19]
’’’)
Conclusions
john-devkit demonstrates practical application of code
generation approach
john-devkit is a real way to automate programmer’s work at
such scale
Thank you!
Thank you!
code:
https://github.com/AlekseyCherepanov/john-devkit
more technical detail will be on john-dev mailing list
my email: lyosha@openwall.com

More Related Content

What's hot

How it's made: C++ compilers (GCC)
How it's made: C++ compilers (GCC)How it's made: C++ compilers (GCC)
How it's made: C++ compilers (GCC)
Sławomir Zborowski
 
Handling inline assembly in Clang and LLVM
Handling inline assembly in Clang and LLVMHandling inline assembly in Clang and LLVM
Handling inline assembly in Clang and LLVM
Min-Yih Hsu
 
Lex tool manual
Lex tool manualLex tool manual
Lex tool manual
Sami Said
 
FTD JVM Internals
FTD JVM InternalsFTD JVM Internals
FTD JVM Internals
Felipe Mamud
 
How to write a TableGen backend
How to write a TableGen backendHow to write a TableGen backend
How to write a TableGen backend
Min-Yih Hsu
 
Yacc (yet another compiler compiler)
Yacc (yet another compiler compiler)Yacc (yet another compiler compiler)
Yacc (yet another compiler compiler)
omercomail
 
Binary art - Byte-ing the PE that fails you (extended offline version)
Binary art - Byte-ing the PE that fails you (extended offline version)Binary art - Byte-ing the PE that fails you (extended offline version)
Binary art - Byte-ing the PE that fails you (extended offline version)
Ange Albertini
 
[COSCUP 2021] A trip about how I contribute to LLVM
[COSCUP 2021] A trip about how I contribute to LLVM[COSCUP 2021] A trip about how I contribute to LLVM
[COSCUP 2021] A trip about how I contribute to LLVM
Douglas Chen
 
The LLDB Debugger in FreeBSD by Ed Maste
The LLDB Debugger in FreeBSD by Ed MasteThe LLDB Debugger in FreeBSD by Ed Maste
The LLDB Debugger in FreeBSD by Ed Maste
eurobsdcon
 
I Know Kung Fu - Juggling Java Bytecode
I Know Kung Fu - Juggling Java BytecodeI Know Kung Fu - Juggling Java Bytecode
I Know Kung Fu - Juggling Java Bytecode
Alexander Shopov
 
Eval4j @ JVMLS 2014
Eval4j @ JVMLS 2014Eval4j @ JVMLS 2014
Eval4j @ JVMLS 2014
Andrey Breslav
 
First session quiz
First session quizFirst session quiz
First session quiz
Keroles karam khalil
 
Hacking with hhvm
Hacking with hhvmHacking with hhvm
Hacking with hhvm
Elizabeth Smith
 
FORECAST: Fast Generation of Accurate Context-Aware Signatures of Control-Hij...
FORECAST: Fast Generation of Accurate Context-Aware Signatures of Control-Hij...FORECAST: Fast Generation of Accurate Context-Aware Signatures of Control-Hij...
FORECAST: Fast Generation of Accurate Context-Aware Signatures of Control-Hij...
Alexey Smirnov
 
Language Design Trade-offs
Language Design Trade-offsLanguage Design Trade-offs
Language Design Trade-offs
Andrey Breslav
 
Advance ROP Attacks
Advance ROP AttacksAdvance ROP Attacks
Php extensions
Php extensionsPhp extensions
Php extensions
Elizabeth Smith
 
C++ programming
C++ programmingC++ programming
Lexyacc
LexyaccLexyacc
Lexyacc
Hina Tahir
 
Java Bytecode Fundamentals - JUG.lv
Java Bytecode Fundamentals - JUG.lvJava Bytecode Fundamentals - JUG.lv
Java Bytecode Fundamentals - JUG.lv
Anton Arhipov
 

What's hot (20)

How it's made: C++ compilers (GCC)
How it's made: C++ compilers (GCC)How it's made: C++ compilers (GCC)
How it's made: C++ compilers (GCC)
 
Handling inline assembly in Clang and LLVM
Handling inline assembly in Clang and LLVMHandling inline assembly in Clang and LLVM
Handling inline assembly in Clang and LLVM
 
Lex tool manual
Lex tool manualLex tool manual
Lex tool manual
 
FTD JVM Internals
FTD JVM InternalsFTD JVM Internals
FTD JVM Internals
 
How to write a TableGen backend
How to write a TableGen backendHow to write a TableGen backend
How to write a TableGen backend
 
Yacc (yet another compiler compiler)
Yacc (yet another compiler compiler)Yacc (yet another compiler compiler)
Yacc (yet another compiler compiler)
 
Binary art - Byte-ing the PE that fails you (extended offline version)
Binary art - Byte-ing the PE that fails you (extended offline version)Binary art - Byte-ing the PE that fails you (extended offline version)
Binary art - Byte-ing the PE that fails you (extended offline version)
 
[COSCUP 2021] A trip about how I contribute to LLVM
[COSCUP 2021] A trip about how I contribute to LLVM[COSCUP 2021] A trip about how I contribute to LLVM
[COSCUP 2021] A trip about how I contribute to LLVM
 
The LLDB Debugger in FreeBSD by Ed Maste
The LLDB Debugger in FreeBSD by Ed MasteThe LLDB Debugger in FreeBSD by Ed Maste
The LLDB Debugger in FreeBSD by Ed Maste
 
I Know Kung Fu - Juggling Java Bytecode
I Know Kung Fu - Juggling Java BytecodeI Know Kung Fu - Juggling Java Bytecode
I Know Kung Fu - Juggling Java Bytecode
 
Eval4j @ JVMLS 2014
Eval4j @ JVMLS 2014Eval4j @ JVMLS 2014
Eval4j @ JVMLS 2014
 
First session quiz
First session quizFirst session quiz
First session quiz
 
Hacking with hhvm
Hacking with hhvmHacking with hhvm
Hacking with hhvm
 
FORECAST: Fast Generation of Accurate Context-Aware Signatures of Control-Hij...
FORECAST: Fast Generation of Accurate Context-Aware Signatures of Control-Hij...FORECAST: Fast Generation of Accurate Context-Aware Signatures of Control-Hij...
FORECAST: Fast Generation of Accurate Context-Aware Signatures of Control-Hij...
 
Language Design Trade-offs
Language Design Trade-offsLanguage Design Trade-offs
Language Design Trade-offs
 
Advance ROP Attacks
Advance ROP AttacksAdvance ROP Attacks
Advance ROP Attacks
 
Php extensions
Php extensionsPhp extensions
Php extensions
 
C++ programming
C++ programmingC++ programming
C++ programming
 
Lexyacc
LexyaccLexyacc
Lexyacc
 
Java Bytecode Fundamentals - JUG.lv
Java Bytecode Fundamentals - JUG.lvJava Bytecode Fundamentals - JUG.lv
Java Bytecode Fundamentals - JUG.lv
 

Viewers also liked

How to know you was hacked
How to know you was hackedHow to know you was hacked
How to know you was hacked
Phannarith Ou, G-CISO
 
"Hack The Interview" BSidesCT 2011
"Hack The Interview" BSidesCT 2011"Hack The Interview" BSidesCT 2011
"Hack The Interview" BSidesCT 2011
jadedsecurity
 
How I got 357 emails and 50+ leads with this BBQ hack
How I got 357 emails and 50+ leads with this BBQ hackHow I got 357 emails and 50+ leads with this BBQ hack
How I got 357 emails and 50+ leads with this BBQ hack
Elizabeth Brewer
 
Jacktheripper
JacktheripperJacktheripper
Jacktheripper
Mentalisme
 
The Yorkshire Ripper
The Yorkshire RipperThe Yorkshire Ripper
The Yorkshire Ripper
Dani Cathro
 
Open hack 2011-hardware-hacks
Open hack 2011-hardware-hacksOpen hack 2011-hardware-hacks
Open hack 2011-hardware-hacks
Sudar Muthu
 
32 Ways a Digital Marketing Consultant Can Help Grow Your Business
32 Ways a Digital Marketing Consultant Can Help Grow Your Business32 Ways a Digital Marketing Consultant Can Help Grow Your Business
32 Ways a Digital Marketing Consultant Can Help Grow Your Business
Barry Feldman
 

Viewers also liked (7)

How to know you was hacked
How to know you was hackedHow to know you was hacked
How to know you was hacked
 
"Hack The Interview" BSidesCT 2011
"Hack The Interview" BSidesCT 2011"Hack The Interview" BSidesCT 2011
"Hack The Interview" BSidesCT 2011
 
How I got 357 emails and 50+ leads with this BBQ hack
How I got 357 emails and 50+ leads with this BBQ hackHow I got 357 emails and 50+ leads with this BBQ hack
How I got 357 emails and 50+ leads with this BBQ hack
 
Jacktheripper
JacktheripperJacktheripper
Jacktheripper
 
The Yorkshire Ripper
The Yorkshire RipperThe Yorkshire Ripper
The Yorkshire Ripper
 
Open hack 2011-hardware-hacks
Open hack 2011-hardware-hacksOpen hack 2011-hardware-hacks
Open hack 2011-hardware-hacks
 
32 Ways a Digital Marketing Consultant Can Help Grow Your Business
32 Ways a Digital Marketing Consultant Can Help Grow Your Business32 Ways a Digital Marketing Consultant Can Help Grow Your Business
32 Ways a Digital Marketing Consultant Can Help Grow Your Business
 

Similar to Specialized Compiler for Hash Cracking

A Life of breakpoint
A Life of breakpointA Life of breakpoint
A Life of breakpoint
Hajime Morrita
 
john-devkit: 100 типов хешей спустя / john-devkit: 100 Hash Types Later
john-devkit: 100 типов хешей спустя / john-devkit: 100 Hash Types Laterjohn-devkit: 100 типов хешей спустя / john-devkit: 100 Hash Types Later
john-devkit: 100 типов хешей спустя / john-devkit: 100 Hash Types Later
Positive Hack Days
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Marina Kolpakova
 
Unmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/InvokeUnmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/Invoke
Dmitri Nesteruk
 
Iron Languages - NYC CodeCamp 2/19/2011
Iron Languages - NYC CodeCamp 2/19/2011Iron Languages - NYC CodeCamp 2/19/2011
Iron Languages - NYC CodeCamp 2/19/2011
Jimmy Schementi
 
Go 1.10 Release Party - PDX Go
Go 1.10 Release Party - PDX GoGo 1.10 Release Party - PDX Go
Go 1.10 Release Party - PDX Go
Rodolfo Carvalho
 
2013.02.02 지앤선 테크니컬 세미나 - Xcode를 활용한 디버깅 팁(OSXDEV)
2013.02.02 지앤선 테크니컬 세미나 - Xcode를 활용한 디버깅 팁(OSXDEV)2013.02.02 지앤선 테크니컬 세미나 - Xcode를 활용한 디버깅 팁(OSXDEV)
2013.02.02 지앤선 테크니컬 세미나 - Xcode를 활용한 디버깅 팁(OSXDEV)
JiandSon
 
Debugging Python with gdb
Debugging Python with gdbDebugging Python with gdb
Debugging Python with gdb
Roman Podoliaka
 
SE 20016 - programming languages landscape.
SE 20016 - programming languages landscape.SE 20016 - programming languages landscape.
SE 20016 - programming languages landscape.
Ruslan Shevchenko
 
07 140430-ipp-languages used in llvm during compilation
07 140430-ipp-languages used in llvm during compilation07 140430-ipp-languages used in llvm during compilation
07 140430-ipp-languages used in llvm during compilation
Adam Husár
 
7986-lect 7.pdf
7986-lect 7.pdf7986-lect 7.pdf
7986-lect 7.pdf
RiazAhmad521284
 
Os Worthington
Os WorthingtonOs Worthington
Os Worthington
oscon2007
 
Mach-O Internals
Mach-O InternalsMach-O Internals
Mach-O Internals
Anthony Shoumikhin
 
HES2011 - James Oakley and Sergey bratus-Exploiting-the-Hard-Working-DWARF
HES2011 - James Oakley and Sergey bratus-Exploiting-the-Hard-Working-DWARFHES2011 - James Oakley and Sergey bratus-Exploiting-the-Hard-Working-DWARF
HES2011 - James Oakley and Sergey bratus-Exploiting-the-Hard-Working-DWARF
Hackito Ergo Sum
 
EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5
PRADEEP
 
A Replay Approach to Software Validation
A Replay Approach to Software ValidationA Replay Approach to Software Validation
A Replay Approach to Software Validation
James Pascoe
 
Survey of Program Transformation Technologies
Survey of Program Transformation TechnologiesSurvey of Program Transformation Technologies
Survey of Program Transformation Technologies
Chunhua Liao
 
0100_Embeded_C_CompilationProcess.pdf
0100_Embeded_C_CompilationProcess.pdf0100_Embeded_C_CompilationProcess.pdf
0100_Embeded_C_CompilationProcess.pdf
KhaledIbrahim10923
 
The craft of meta programming on JVM
The craft of meta programming on JVMThe craft of meta programming on JVM
The craft of meta programming on JVM
Igor Khotin
 
The Hitchhiker's Guide to Faster Builds. Viktor Kirilov. CoreHard Spring 2019
The Hitchhiker's Guide to Faster Builds. Viktor Kirilov. CoreHard Spring 2019The Hitchhiker's Guide to Faster Builds. Viktor Kirilov. CoreHard Spring 2019
The Hitchhiker's Guide to Faster Builds. Viktor Kirilov. CoreHard Spring 2019
corehard_by
 

Similar to Specialized Compiler for Hash Cracking (20)

A Life of breakpoint
A Life of breakpointA Life of breakpoint
A Life of breakpoint
 
john-devkit: 100 типов хешей спустя / john-devkit: 100 Hash Types Later
john-devkit: 100 типов хешей спустя / john-devkit: 100 Hash Types Laterjohn-devkit: 100 типов хешей спустя / john-devkit: 100 Hash Types Later
john-devkit: 100 типов хешей спустя / john-devkit: 100 Hash Types Later
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
 
Unmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/InvokeUnmanaged Parallelization via P/Invoke
Unmanaged Parallelization via P/Invoke
 
Iron Languages - NYC CodeCamp 2/19/2011
Iron Languages - NYC CodeCamp 2/19/2011Iron Languages - NYC CodeCamp 2/19/2011
Iron Languages - NYC CodeCamp 2/19/2011
 
Go 1.10 Release Party - PDX Go
Go 1.10 Release Party - PDX GoGo 1.10 Release Party - PDX Go
Go 1.10 Release Party - PDX Go
 
2013.02.02 지앤선 테크니컬 세미나 - Xcode를 활용한 디버깅 팁(OSXDEV)
2013.02.02 지앤선 테크니컬 세미나 - Xcode를 활용한 디버깅 팁(OSXDEV)2013.02.02 지앤선 테크니컬 세미나 - Xcode를 활용한 디버깅 팁(OSXDEV)
2013.02.02 지앤선 테크니컬 세미나 - Xcode를 활용한 디버깅 팁(OSXDEV)
 
Debugging Python with gdb
Debugging Python with gdbDebugging Python with gdb
Debugging Python with gdb
 
SE 20016 - programming languages landscape.
SE 20016 - programming languages landscape.SE 20016 - programming languages landscape.
SE 20016 - programming languages landscape.
 
07 140430-ipp-languages used in llvm during compilation
07 140430-ipp-languages used in llvm during compilation07 140430-ipp-languages used in llvm during compilation
07 140430-ipp-languages used in llvm during compilation
 
7986-lect 7.pdf
7986-lect 7.pdf7986-lect 7.pdf
7986-lect 7.pdf
 
Os Worthington
Os WorthingtonOs Worthington
Os Worthington
 
Mach-O Internals
Mach-O InternalsMach-O Internals
Mach-O Internals
 
HES2011 - James Oakley and Sergey bratus-Exploiting-the-Hard-Working-DWARF
HES2011 - James Oakley and Sergey bratus-Exploiting-the-Hard-Working-DWARFHES2011 - James Oakley and Sergey bratus-Exploiting-the-Hard-Working-DWARF
HES2011 - James Oakley and Sergey bratus-Exploiting-the-Hard-Working-DWARF
 
EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5EMBEDDED SYSTEMS 4&5
EMBEDDED SYSTEMS 4&5
 
A Replay Approach to Software Validation
A Replay Approach to Software ValidationA Replay Approach to Software Validation
A Replay Approach to Software Validation
 
Survey of Program Transformation Technologies
Survey of Program Transformation TechnologiesSurvey of Program Transformation Technologies
Survey of Program Transformation Technologies
 
0100_Embeded_C_CompilationProcess.pdf
0100_Embeded_C_CompilationProcess.pdf0100_Embeded_C_CompilationProcess.pdf
0100_Embeded_C_CompilationProcess.pdf
 
The craft of meta programming on JVM
The craft of meta programming on JVMThe craft of meta programming on JVM
The craft of meta programming on JVM
 
The Hitchhiker's Guide to Faster Builds. Viktor Kirilov. CoreHard Spring 2019
The Hitchhiker's Guide to Faster Builds. Viktor Kirilov. CoreHard Spring 2019The Hitchhiker's Guide to Faster Builds. Viktor Kirilov. CoreHard Spring 2019
The Hitchhiker's Guide to Faster Builds. Viktor Kirilov. CoreHard Spring 2019
 

More from Positive Hack Days

Инструмент ChangelogBuilder для автоматической подготовки Release Notes
Инструмент ChangelogBuilder для автоматической подготовки Release NotesИнструмент ChangelogBuilder для автоматической подготовки Release Notes
Инструмент ChangelogBuilder для автоматической подготовки Release Notes
Positive Hack Days
 
Как мы собираем проекты в выделенном окружении в Windows Docker
Как мы собираем проекты в выделенном окружении в Windows DockerКак мы собираем проекты в выделенном окружении в Windows Docker
Как мы собираем проекты в выделенном окружении в Windows Docker
Positive Hack Days
 
Типовая сборка и деплой продуктов в Positive Technologies
Типовая сборка и деплой продуктов в Positive TechnologiesТиповая сборка и деплой продуктов в Positive Technologies
Типовая сборка и деплой продуктов в Positive Technologies
Positive Hack Days
 
Аналитика в проектах: TFS + Qlik
Аналитика в проектах: TFS + QlikАналитика в проектах: TFS + Qlik
Аналитика в проектах: TFS + Qlik
Positive Hack Days
 
Использование анализатора кода SonarQube
Использование анализатора кода SonarQubeИспользование анализатора кода SonarQube
Использование анализатора кода SonarQube
Positive Hack Days
 
Развитие сообщества Open DevOps Community
Развитие сообщества Open DevOps CommunityРазвитие сообщества Open DevOps Community
Развитие сообщества Open DevOps Community
Positive Hack Days
 
Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...
Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...
Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...
Positive Hack Days
 
Автоматизация построения правил для Approof
Автоматизация построения правил для ApproofАвтоматизация построения правил для Approof
Автоматизация построения правил для Approof
Positive Hack Days
 
Мастер-класс «Трущобы Application Security»
Мастер-класс «Трущобы Application Security»Мастер-класс «Трущобы Application Security»
Мастер-класс «Трущобы Application Security»
Positive Hack Days
 
Формальные методы защиты приложений
Формальные методы защиты приложенийФормальные методы защиты приложений
Формальные методы защиты приложений
Positive Hack Days
 
Эвристические методы защиты приложений
Эвристические методы защиты приложенийЭвристические методы защиты приложений
Эвристические методы защиты приложений
Positive Hack Days
 
Теоретические основы Application Security
Теоретические основы Application SecurityТеоретические основы Application Security
Теоретические основы Application Security
Positive Hack Days
 
От экспериментального программирования к промышленному: путь длиной в 10 лет
От экспериментального программирования к промышленному: путь длиной в 10 летОт экспериментального программирования к промышленному: путь длиной в 10 лет
От экспериментального программирования к промышленному: путь длиной в 10 лет
Positive Hack Days
 
Уязвимое Android-приложение: N проверенных способов наступить на грабли
Уязвимое Android-приложение: N проверенных способов наступить на граблиУязвимое Android-приложение: N проверенных способов наступить на грабли
Уязвимое Android-приложение: N проверенных способов наступить на грабли
Positive Hack Days
 
Требования по безопасности в архитектуре ПО
Требования по безопасности в архитектуре ПОТребования по безопасности в архитектуре ПО
Требования по безопасности в архитектуре ПО
Positive Hack Days
 
Формальная верификация кода на языке Си
Формальная верификация кода на языке СиФормальная верификация кода на языке Си
Формальная верификация кода на языке Си
Positive Hack Days
 
Механизмы предотвращения атак в ASP.NET Core
Механизмы предотвращения атак в ASP.NET CoreМеханизмы предотвращения атак в ASP.NET Core
Механизмы предотвращения атак в ASP.NET Core
Positive Hack Days
 
SOC для КИИ: израильский опыт
SOC для КИИ: израильский опытSOC для КИИ: израильский опыт
SOC для КИИ: израильский опыт
Positive Hack Days
 
Honeywell Industrial Cyber Security Lab & Services Center
Honeywell Industrial Cyber Security Lab & Services CenterHoneywell Industrial Cyber Security Lab & Services Center
Honeywell Industrial Cyber Security Lab & Services Center
Positive Hack Days
 
Credential stuffing и брутфорс-атаки
Credential stuffing и брутфорс-атакиCredential stuffing и брутфорс-атаки
Credential stuffing и брутфорс-атаки
Positive Hack Days
 

More from Positive Hack Days (20)

Инструмент ChangelogBuilder для автоматической подготовки Release Notes
Инструмент ChangelogBuilder для автоматической подготовки Release NotesИнструмент ChangelogBuilder для автоматической подготовки Release Notes
Инструмент ChangelogBuilder для автоматической подготовки Release Notes
 
Как мы собираем проекты в выделенном окружении в Windows Docker
Как мы собираем проекты в выделенном окружении в Windows DockerКак мы собираем проекты в выделенном окружении в Windows Docker
Как мы собираем проекты в выделенном окружении в Windows Docker
 
Типовая сборка и деплой продуктов в Positive Technologies
Типовая сборка и деплой продуктов в Positive TechnologiesТиповая сборка и деплой продуктов в Positive Technologies
Типовая сборка и деплой продуктов в Positive Technologies
 
Аналитика в проектах: TFS + Qlik
Аналитика в проектах: TFS + QlikАналитика в проектах: TFS + Qlik
Аналитика в проектах: TFS + Qlik
 
Использование анализатора кода SonarQube
Использование анализатора кода SonarQubeИспользование анализатора кода SonarQube
Использование анализатора кода SonarQube
 
Развитие сообщества Open DevOps Community
Развитие сообщества Open DevOps CommunityРазвитие сообщества Open DevOps Community
Развитие сообщества Open DevOps Community
 
Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...
Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...
Методика определения неиспользуемых ресурсов виртуальных машин и автоматизаци...
 
Автоматизация построения правил для Approof
Автоматизация построения правил для ApproofАвтоматизация построения правил для Approof
Автоматизация построения правил для Approof
 
Мастер-класс «Трущобы Application Security»
Мастер-класс «Трущобы Application Security»Мастер-класс «Трущобы Application Security»
Мастер-класс «Трущобы Application Security»
 
Формальные методы защиты приложений
Формальные методы защиты приложенийФормальные методы защиты приложений
Формальные методы защиты приложений
 
Эвристические методы защиты приложений
Эвристические методы защиты приложенийЭвристические методы защиты приложений
Эвристические методы защиты приложений
 
Теоретические основы Application Security
Теоретические основы Application SecurityТеоретические основы Application Security
Теоретические основы Application Security
 
От экспериментального программирования к промышленному: путь длиной в 10 лет
От экспериментального программирования к промышленному: путь длиной в 10 летОт экспериментального программирования к промышленному: путь длиной в 10 лет
От экспериментального программирования к промышленному: путь длиной в 10 лет
 
Уязвимое Android-приложение: N проверенных способов наступить на грабли
Уязвимое Android-приложение: N проверенных способов наступить на граблиУязвимое Android-приложение: N проверенных способов наступить на грабли
Уязвимое Android-приложение: N проверенных способов наступить на грабли
 
Требования по безопасности в архитектуре ПО
Требования по безопасности в архитектуре ПОТребования по безопасности в архитектуре ПО
Требования по безопасности в архитектуре ПО
 
Формальная верификация кода на языке Си
Формальная верификация кода на языке СиФормальная верификация кода на языке Си
Формальная верификация кода на языке Си
 
Механизмы предотвращения атак в ASP.NET Core
Механизмы предотвращения атак в ASP.NET CoreМеханизмы предотвращения атак в ASP.NET Core
Механизмы предотвращения атак в ASP.NET Core
 
SOC для КИИ: израильский опыт
SOC для КИИ: израильский опытSOC для КИИ: израильский опыт
SOC для КИИ: израильский опыт
 
Honeywell Industrial Cyber Security Lab & Services Center
Honeywell Industrial Cyber Security Lab & Services CenterHoneywell Industrial Cyber Security Lab & Services Center
Honeywell Industrial Cyber Security Lab & Services Center
 
Credential stuffing и брутфорс-атаки
Credential stuffing и брутфорс-атакиCredential stuffing и брутфорс-атаки
Credential stuffing и брутфорс-атаки
 

Recently uploaded

Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
c5vrf27qcz
 
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
Data Hops
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
Fwdays
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
DianaGray10
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
Neo4j
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
Alex Pruden
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Precisely
 

Recently uploaded (20)

Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Leveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and StandardsLeveraging the Graph for Clinical Trials and Standards
Leveraging the Graph for Clinical Trials and Standards
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
 

Specialized Compiler for Hash Cracking

  • 1. john-devkit: specialized compiler for hash cracking Aleksey Cherepanov lyosha@openwall.com May 26, 2015
  • 3. john-devkit is an experiment not yet embraced by John the Ripper developer community is a code generator on input: algo written in special language and a list of optimizations to apply on output: C file for John the Ripper
  • 4. John the Ripper (JtR) the famous hash cracker primary purpose is to detect weak Unix passwords supports 200+ hash formats (types) supports several kinds of compute devices: CPU, Xeon Phi scalar SIMD: SSE2+/AVX/XOP, AVX2, MIC/AVX-512, AltiVec, NEON GPU OpenCL, CUDA FPGA, Epiphany currently for bcrypt only
  • 5. Problems of JtR development scalability of programmers is low due to 200+ formats: sometimes it is hard to apply even 1 optimization to all formats: important formats get the optimization first each additional format to optimize eats more time support for each device needs a separate implementation readability degrades when various cases are handled by preprocessor
  • 6. Aims of john-devkit to separate crypto algorithms, optimizations, and output code for various devices to include optimizations specific for hash cracking and John the Ripper to provide better syntax to retain or improve performance to provide precise control over optimizations to support various devices: CPU, GPU, FPGA to give great output for great input (not for any input) to be simple
  • 7. Early results john-devkit is not mature 7 formats were implemented separating crypto primitives, optimizations, and device specific code good speeds (over default implementation in JtR): raw-sha256 +22% raw-sha224 +20% raw-sha512 +6% raw-sha384 +5% bad speeds (but expose interesting features of john-devkit): raw-sha1 -1% raw-md4 -11% raw-md5 -15% optimizations implemented: interleave, vectorization, unroll of loops, early reject, additional batching (loop around algo) all formats got all optimizations without effort
  • 9. Cracking process we are in attacker’s position we have a lot of candidates to try high parallelism high level algo: load hashes (once) generate some candidates compute hashes (or only parts) reject most of wrong candidates check probable passwords precisely (rare case) generate next batch of candidates and repeat formats are integrated into this process using OOP-like calls over function pointers
  • 10. Optimizations some optimizations do not affect internals of crypto algorithms in any way and may be added to any algorithm additional loop around algo to process more candidates in 1 call OpenMP support other optimizations affect crypto algorithms vectorization (SIMD) precomputation e.g. first few steps in MD*/SHA* for partially changed input reversal of operations e.g. last few steps in MD*/SHA*, DES final permutation loop unrolling interleaving bitslicing and others
  • 11. Bitslice splits numbers into bits and computes everything through bitwise operations optimization focuses on minimization of Boolean formula (or circuit) Roman Rusakov generated current formulas for S-boxes of DES used in John the Ripper with custom generator it took 3 months Billy Bob Brumley demonstrated application of simulated annealing algorithm to scheduling of DES S-box instructions so code generation is not new for John the Ripper (not even speaking about C preprocessor)
  • 13. OpenCL is the first thing I hear when I say about output for both CPU and GPU has quite heavy syntax (based on C) knows nothing about John the Ripper does not have automatic bitslicing
  • 14. Dynamic formats in John the Ripper were implemented by Jim Fougeron separate crypto primitives from formats so md5($p) and md5(md5($p)) have one code base work at runtime john-devkit aims to be able to do similar thing but at compile time and with ability to optimize better so md5(md5($p)) would get more optimizations (at price of separate code)
  • 15. C Macros allow to do things, but are not smart an example of loop unroll in Keccak defining all useful variants: [...] #elif (Unrolling == 3) #define rounds prepareTheta for(i=0; i<24; i+=3) { thetaRhoPiChiIotaPrepareTheta(i , A, E) thetaRhoPiChiIotaPrepareTheta(i+1, E, A) thetaRhoPiChiIotaPrepareTheta(i+2, A, E) copyStateVariables(A, E) } copyToState(state, A) #elif (Unrolling == 2) #define rounds prepareTheta for(i=0; i<24; i+=2) { thetaRhoPiChiIotaPrepareTheta(i , A, E) thetaRhoPiChiIotaPrepareTheta(i+1, E, A) } copyToState(state, A)
  • 16. X-Macro is a tricky way to use macros, most likely with a separate file to be included multiple times: the file has code with variable parts these parts are defined before #include so #include provides a ”template engine” example from NetBSD’s libcrypt: [...] #define HASH_Init SHA1Init #define HASH_Update SHA1Update #define HASH_Final SHA1Final #include "hmac.c"
  • 18. From Python to C in john-devkit code in intermediate language (IL) is generated from algorithm description the code is modified by several functions chosen by user C code is generated from the modified the code using a template
  • 19. Intermediate Language (IL) while algorithms are written in Python with modified environment, john-devkit uses flat representation of code using its own instruction language called intermediate language some instructions of this language express constructions specific to hash cracking for instance, state variables of hash functions are defined by special instruction intermediate language is very simple intermediate language is intended to be rich to express common constructions natively to simplify optimization
  • 20. Example of specific instruction separate instruction is used to define state variable, so john-devkit uses a filter to replace initial state with values for SHA-224 having code for SHA-256: def override_state(code, state): consts = {} for l in code: if l[0] == ’new_const’: consts[l[1]] = l if l[0] == ’new_state_var’: consts[l[2]][2] = str(state.pop(0)) return code
  • 21. Optimizations specific to password cracking use knowledge not available to regular compiler: code can be moved between some functions of format the functions have different probability to be called so main computation is always called check of probable candidates is very rare it almost implies a successful guess (for strong hashes), also hashes are loaded only once while there are millions of candidates being hashed every second
  • 22. Specific optimization: early reject hashes are long some output values may be computed a bit quicker than others a 32-bit or 64-bit one value is usually enough to reject almost all wrong candidates so john-devkit drops instructions for computation of other output values in main working function and places full implementation into function for precise check of possible password main implementation is vectorized while full implementation is scalar because it checks only 1 candidate
  • 23. Specific optimization: steps reversal some operations can be reversed if r = i + C, we know r, and C is a constant, then i = r - C John the Ripper learns ”r” when it loads hashes john-devkit can sometimes reverse operations, replacing ”forward” computation during cracking (applied per candidate password) with reverse computation at startup (applied per hash)
  • 24. Full Python is available to define algorithms the environment has some objects with overloaded instructions to produce code in IL in a global variable instead of running it right away but any Python code can be used it is evaluated fully before further steps of code generation but to produce good output some additional markup may be needed
  • 25. Full Python, example a part of MD4 definition adapted right from RFC 1320: def make_round(func, code): res = ’’ func = re.sub(’([abcdks])’, r’{1}’, func) parts = re.compile(r’[(.)(.)(.)(.)s+(d+)s+(d+)]’ ).findall(code) for a, b, c, d, k, s in parts: res += func.format(**vars()) + "n" return res exec make_round(’a = rol((a + F(b, c, d) + X[k]), s)’, ’’’ [ABCD 0 3] [DABC 1 7] [CDAB 2 11] [BCDA 3 19] [ABCD 4 3] [DABC 5 7] [CDAB 6 11] [BCDA 7 19] [ABCD 8 3] [DABC 9 7] [CDAB 10 11] [BCDA 11 19] [ABCD 12 3] [DABC 13 7] [CDAB 14 11] [BCDA 15 19] ’’’)
  • 26. Conclusions john-devkit demonstrates practical application of code generation approach john-devkit is a real way to automate programmer’s work at such scale
  • 27. Thank you! Thank you! code: https://github.com/AlekseyCherepanov/john-devkit more technical detail will be on john-dev mailing list my email: lyosha@openwall.com