Erlang, random numbers,
and the security
1
Kenji Rikitake
10-SEP-2013
for London Erlang User Group meetup
1Sunday, Septemb...
Executive summary
2
Random number
generation is very
diffcult to do
properly; so don’t try
to invent yours!
2Sunday, Septe...
Dilbert’s definition of
randomness (NOT)
3
http://dilbert.com/strips/comic/2001-10-25/
http://insanity-works.blogspot.jp/20...
What does randomness
mean? (Wikipedia)
4
•Commonly, it means lack of pattern or
predictability in events
•Non-order or non...
“Pseudo” randomness?
• The results of deterministic computations can
be predicted; there is no real randomness from
those ...
Then what is “real”
randomness?
• Mostly depending on the physical events
which is very unlikely to be predictable by
the ...
Entropy pool and the
level of randomness
• Entropy is the unit of randomness
• If 32-bit word of complete random number
ex...
Sources of “real” high-
entropy sources
• Thermal noise (of resistors)
• Avalanche noise of Zener diodes
• Optical noise (...
Avalanche diode noise
generator example
9
Source: http://www.erlang-factory.com/conference/SFBay2011/speakers/kenjirikitak...
Arduino Duemilanove
and the noise generator
10
Source: http://www.erlang-factory.com/conference/SFBay2011/speakers/kenjiri...
Can OS gather entropy from
various events in a computer?
• Possible events: hard disk access duration,
mouse/keyboard timi...
So what are good
PRNGs?
• Output distribution
• uniform distribution: equally covering the all
possible values in the same...
Gaussian/normal
distribution
13
Source: http://en.wikipedia.org/wiki/File:Standard_deviation_diagram.svg
13Sunday, Septemb...
Uniformity is hard to
achieve in the real world
14
http://nakedsecurity.sophos.com/2013/07/09/anatomy-of-a-pseudorandom-nu...
Another example: PHP5 rand(0,1)
on Windows: see the pattern?
15
http://boallen.com/random-numbers.html
15Sunday, September...
Android Bitcoin Wallets had PRNG
vulnerability to let 55 BTC stolen
16
• Java’s SecureRandom class (wasn’t secure
enough o...
Executive summary
(again)
17
Random number generation is very diffcult to
do properly; so don’t try to invent yours!
http:...
So how Erlang/OTP
does on PRNGs?
18
• random module: non-secure PRNG (Wichmann-
Hill AS183, in 1982)
• crypto module: cryp...
Erlang’s AS183 512x512 bitmap:
at least visually well-randomized
19
http://www.erlang-factory.com/
conference/SFBay2011/sp...
20
%% Code of AS183 PRNG
%% from Erlang/OTP R16B01 lib/stdlib/src/random.erl
%% This hasn’t been really changed at least s...
Issues on random
module, aka AS183
21
• The period length is short (~ 2^43)
• Very old design (in 1980s)
• Not designed fo...
Goals for solving issues
on random module
• Larger period length PRNG needed
• New or modern design PRNGs preferred
• Addr...
Required/suggested
features on parallelism
• Each PRNG has to manage its own state
• Independent = orthogonal sequences
fr...
Alternatives I implented
for the random module
• sfmt-erlang
• tinymt-erlang
• Both available at GitHub
• Details availabl...
Mersenne Twister
• BSD licensed
• By Matsumoto and Nishimura, 1996
• Very long period ((2^19937) -1)
• Uniformly distribut...
SIMD-oriented Fast
Mersenne Twister (SFMT)
• Improved MT by Saito and Matsumoto
(2006)
• SIMD-oriented, optimized for x86(...
sfmt-erlang
• A straight-forward implementation of SFMT
in both pure Erlang and with NIFs
• Suggested period length: (2^19...
Porting tips of SFMT
code from C to Erlang
• Make it reentrant: removed all static arrays for
the internal state table (no...
Speed difference
between C and Erlang
• On sfmt-erlang
• Pure Erlang code: x300 slower than C
• Erlang NIF code: still x10...
TinyMT
• MT for smaller memory footprint (and
where shorter period length is OK)
• Saito and Matsumoto, 2011
• Period: ((2...
tinymt-erlang
• A straight-forward implementation of TinyMT
in both pure Erlang and with NIFs
• Some of sfmt-erlang code a...
Observations on
tinymt-erlang code
• 32-bit integer ops are on BIGNUMs
• Small integer on 32bit Erlang: max 28 bits
• Over...
NIF scheduling issue
• NIF blocks the Erlang scheduler
• How long can a CPU spend time in a NIF?
• Rule of thumb: <1msec p...
TinyMT precomputed
polynomial parameters
• On 2011, I decided to pre-compute the
polynomial parameters at Kyoto University...
“Secure” PRNGs
• One-way function: guessing input or internal state
from the output should be kept practically impossible
...
For building more
secure PRNGs
• Reducing predictability: always incorporate
necessary external extropy
• Cryptographic st...
Tips for parallel
execution of PRNGs
• Use completely orthogonal sequences
• e.g.,TinyMT independent polynomials
• If the ...
Tips on secure PRNGs
• Don’t invent yours unless you hire a
well-experienced cryptographers (i.e., stick
to proven functio...
Executive summary
again
39
Random
number
generation
is very
diffcult to
do properly;
so don’t try
to invent
yours!
http://...
Thanks
Questions?
40
40Sunday, September 8, 13
Upcoming SlideShare
Loading in …5
×

Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

2,722 views

Published on

London Erlang User Group presentation on Erlang and the random numbers

Published in: Technology, News & Politics
0 Comments
3 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
2,722
On SlideShare
0
From Embeds
0
Number of Embeds
114
Actions
Shares
0
Downloads
23
Comments
0
Likes
3
Embeds 0
No embeds

No notes for slide

Erlang, random numbers, and the security: London Erlang User Group Talk Slide 10-SEP-2013

  1. 1. Erlang, random numbers, and the security 1 Kenji Rikitake 10-SEP-2013 for London Erlang User Group meetup 1Sunday, September 8, 13
  2. 2. Executive summary 2 Random number generation is very diffcult to do properly; so don’t try to invent yours! 2Sunday, September 8, 13
  3. 3. Dilbert’s definition of randomness (NOT) 3 http://dilbert.com/strips/comic/2001-10-25/ http://insanity-works.blogspot.jp/2008/05/debianubuntu-serious-opensslssh.html 3Sunday, September 8, 13
  4. 4. What does randomness mean? (Wikipedia) 4 •Commonly, it means lack of pattern or predictability in events •Non-order or non-coherence in a sequence, no intelligible pattern or combination •Statistically, random processes still follow a probabilistic distribution 4Sunday, September 8, 13
  5. 5. “Pseudo” randomness? • The results of deterministic computations can be predicted; there is no real randomness from those algorithms • A predictive sequence generator with a very long period (e.g., (2^19937)-1 times of computations) can be used as a list of pseudo random number generator (PRNG) • Each generation needs a seed for to determine the starting point (aka Initialization Vector) 5 5Sunday, September 8, 13
  6. 6. Then what is “real” randomness? • Mostly depending on the physical events which is very unlikely to be predictable by the consumer of the randomness itself • You can’t really avoid the interference of the exterior environment • But at least statistically whether a sequence is random or not can be measured 6 6Sunday, September 8, 13
  7. 7. Entropy pool and the level of randomness • Entropy is the unit of randomness • If 32-bit word of complete random number exists, it has 32 bits of entropy • If the 32-bit word takes only four different values and each value has a 25% chance of occurring, the word has 2 bits of entropy • The Dilbert’s case has ZERO bit of entropy 7 Source: Neils Ferguson, Bruce Schneier, and Tadayoshi Kohno: Cryptography Engineering, Chapter 9 (Generating Randomness), p. 137, John Wiley and Sons, 2010, ISBN-13: 9780470474242 7Sunday, September 8, 13
  8. 8. Sources of “real” high- entropy sources • Thermal noise (of resistors) • Avalanche noise of Zener diodes • Optical noise (e.g., LavaCan) • Radio noise (static) (random.org) • Rolling the dice (Dice-O-Matic) • Very expensive (= low entropy bit rate) • Practical for seeding purpose only • Not necessarily statistically uniform or gaussian 8 8Sunday, September 8, 13
  9. 9. Avalanche diode noise generator example 9 Source: http://www.erlang-factory.com/conference/SFBay2011/speakers/kenjirikitake 9Sunday, September 8, 13
  10. 10. Arduino Duemilanove and the noise generator 10 Source: http://www.erlang-factory.com/conference/SFBay2011/speakers/kenjirikitake 10Sunday, September 8, 13
  11. 11. Can OS gather entropy from various events in a computer? • Possible events: hard disk access duration, mouse/keyboard timing, ethernet packet arrival timing, wall-clock values, etc. • Note: those events can only happen after when the system is booted; at the boot time the entropy pool is virtually empty • Solutions: use an external entropy source to gather sufficient entropy, or wait for the enough entropy to be gathered and filled into the pool 11 Source: Nadia Heninger, Zakir Durumeric, Eric Wustrow, J.Alex Halderman: Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices, Proceedings of the 21st USENIX Security Symposium,August 2012, https://factorable.net/paper.html 11Sunday, September 8, 13
  12. 12. So what are good PRNGs? • Output distribution • uniform distribution: equally covering the all possible values in the same probability • Gaussian/normal distribution: representing the sum of many independent random values • The generation period length should be very large (especially when sharing the same algorithm between multiple processes) 12 12Sunday, September 8, 13
  13. 13. Gaussian/normal distribution 13 Source: http://en.wikipedia.org/wiki/File:Standard_deviation_diagram.svg 13Sunday, September 8, 13
  14. 14. Uniformity is hard to achieve in the real world 14 http://nakedsecurity.sophos.com/2013/07/09/anatomy-of-a-pseudorandom-number-generator-visualising-cryptocats-buggy-prng/ An example: Cryptocat's buggy PRNG written in JavaScript 14Sunday, September 8, 13
  15. 15. Another example: PHP5 rand(0,1) on Windows: see the pattern? 15 http://boallen.com/random-numbers.html 15Sunday, September 8, 13
  16. 16. Android Bitcoin Wallets had PRNG vulnerability to let 55 BTC stolen 16 • Java’s SecureRandom class (wasn’t secure enough on Android actually) • “In order to remain secure the random numbers used to generate private keys must be nondeterministic, meaning that the output of the generator cannot be predicted” • “Android phones/tablets are weak and some signatures have been observed to have colliding R values” http://thegenesisblock.com/security-vulnerability-in-all-android-bitcoin-wallets/ 16Sunday, September 8, 13
  17. 17. Executive summary (again) 17 Random number generation is very diffcult to do properly; so don’t try to invent yours! http://imgs.xkcd.com/comics/random_number.png 17Sunday, September 8, 13
  18. 18. So how Erlang/OTP does on PRNGs? 18 • random module: non-secure PRNG (Wichmann- Hill AS183, in 1982) • crypto module: cryptographically secure PRNG, OpenSSL API in NIFs • For generating passwords and keys, always use the crypto module • There was a bug in SSH module using random function instead of crypto (CVE-2011-0766, discovered by Geoff Cant, fixed on R14B03) 18Sunday, September 8, 13
  19. 19. Erlang’s AS183 512x512 bitmap: at least visually well-randomized 19 http://www.erlang-factory.com/ conference/SFBay2011/speakers/ kenjirikitake 19Sunday, September 8, 13
  20. 20. 20 %% Code of AS183 PRNG %% from Erlang/OTP R16B01 lib/stdlib/src/random.erl %% This hasn’t been really changed at least since R14B02 -define(PRIME1, 30269). -define(PRIME2, 30307). -define(PRIME3, 30323). uniform() -> {A1,A2,A3} = case get(random_seed) of undefined -> seed0(); Tuple -> Tuple end, B1 = (A1*171) rem ?PRIME1, B2 = (A2*172) rem ?PRIME2, B3 = (A3*170) rem ?PRIME3, put(random_seed, {B1,B2,B3}), R = B1/?PRIME1 + B2/?PRIME2 + B3/?PRIME3, R - trunc(R). 20Sunday, September 8, 13
  21. 21. Issues on random module, aka AS183 21 • The period length is short (~ 2^43) • Very old design (in 1980s) • Not designed for parallelism • Only three 16-bit integers as the seed • Not designed for speed (float division) • Good thing: it’s pure Erlang code 21Sunday, September 8, 13
  22. 22. Goals for solving issues on random module • Larger period length PRNG needed • New or modern design PRNGs preferred • Addressing parallelism requirements • Non-overlapping = orthogonal sequences • Larger state length for seeding • Availability on both with and without NIFs 22 22Sunday, September 8, 13
  23. 23. Required/suggested features on parallelism • Each PRNG has to manage its own state • Independent = orthogonal sequences from the same algorithm • Splitting the internal state • Using orthogonal polynomials • Seed jumping: fast calculation for advancing the internal state 23 23Sunday, September 8, 13
  24. 24. Alternatives I implented for the random module • sfmt-erlang • tinymt-erlang • Both available at GitHub • Details available as ACM Erlang Workshop papers, as well as from my own web site • These are non-secure PRNGs 24 24Sunday, September 8, 13
  25. 25. Mersenne Twister • BSD licensed • By Matsumoto and Nishimura, 1996 • Very long period ((2^19937) -1) • Uniformly distributed in 623-dimension hypercube, mathematically proven • Widely used on many open-source languages: R, Python, mruby, etc. 25 25Sunday, September 8, 13
  26. 26. SIMD-oriented Fast Mersenne Twister (SFMT) • Improved MT by Saito and Matsumoto (2006) • SIMD-oriented, optimized for x86(_64) • Various period length supported • ((2^607)-1) ~ ((2^216091)-1) • Faster recovery from 0-excess initial state 26 26Sunday, September 8, 13
  27. 27. sfmt-erlang • A straight-forward implementation of SFMT in both pure Erlang and with NIFs • Suggested period length: (2^19937-1) • with NIFS: >x3 faster than random module • A choice on PropEr Erlang test tool 27 27Sunday, September 8, 13
  28. 28. Porting tips of SFMT code from C to Erlang • Make it reentrant: removed all static arrays for the internal state table (no mutable data structure) • SFMT itself can be written as a recursion a[X] = r(a[X-N], a[X-(N-POS1)], a[X-1], a[X-2]) • An Erlang way: adding elements to the heads and do the lists:reverse/1 made the code 50% faster than using the ++ operator! 28 28Sunday, September 8, 13
  29. 29. Speed difference between C and Erlang • On sfmt-erlang • Pure Erlang code: x300 slower than C • Erlang NIF code: still x10 slower than C • Some dilemma: speed optimization is important, but ErlangVM can hardly beat the native C code 29 29Sunday, September 8, 13
  30. 30. TinyMT • MT for smaller memory footprint (and where shorter period length is OK) • Saito and Matsumoto, 2011 • Period: ((2^127)-1), still much > 2^43 • Seed: seven 32-bit unsigned integers • Parallel execution oriented features: ~2^56 independent generation polynomials 30 30Sunday, September 8, 13
  31. 31. tinymt-erlang • A straight-forward implementation of TinyMT in both pure Erlang and with NIFs • Some of sfmt-erlang code are equally applicable (e.g., seed initialization) • x86_64 HiPE optimization: x3 speed • Wall-clock speed: roughly the same as random module • In fprof: x2~x6 slower than random module 31 31Sunday, September 8, 13
  32. 32. Observations on tinymt-erlang code • 32-bit integer ops are on BIGNUMs • Small integer on 32bit Erlang: max 28 bits • Overhead of calling functions and BEAM memory allocation might have taken a significant portion of exec time • sfmt-erlang batch exec is still x3~4 faster than tinymt-erlang batch exec 32 32Sunday, September 8, 13
  33. 33. NIF scheduling issue • NIF blocks the Erlang scheduler • How long can a CPU spend time in a NIF? • Rule of thumb: <1msec per exec • Target for sfmt/tinymt-erlang: < 0.1msec • On Riak this has been a serious issue • Scott Fritchie has an MD5 example code • A preference for the pure Erlang code 33 33Sunday, September 8, 13
  34. 34. TinyMT precomputed polynomial parameters • On 2011, I decided to pre-compute the polynomial parameters at Kyoto University’s super-computer (x86_64) cluster (I could only use two 16-core nodes (32 cores)) • 2^28 (~256M) parameter sets are available for both 32- and 64-bit implementations • Took ~32 days of continuous execution • 18~19 sets/sec for each core 34 34Sunday, September 8, 13
  35. 35. “Secure” PRNGs • One-way function: guessing input or internal state from the output should be kept practically impossible • A PRNG is less secure when: • generation period is shorter • easy function exists to guess the internal state from the output • the output sequence is predictable from the past output sequence data (= a simple PRNG is unsecure) 35 35Sunday, September 8, 13
  36. 36. For building more secure PRNGs • Reducing predictability: always incorporate necessary external extropy • Cryptographic strength: use well-proven algorithms, e.g.,AES, to “encrypt” the PRNG output • Rule of thumb: use crypto:strong_rand_bytes/1 to guarantee sufficient entropy is consumed 36 36Sunday, September 8, 13
  37. 37. Tips for parallel execution of PRNGs • Use completely orthogonal sequences • e.g.,TinyMT independent polynomials • If the seeding space is large, it is unlikely that PRNG sequences from the seeds generated from another PRNGs will collide with each other (but this is not mathematically proven) • Beware of execution sequence change when predictability is important 37 37Sunday, September 8, 13
  38. 38. Tips on secure PRNGs • Don’t invent yours unless you hire a well-experienced cryptographers (i.e., stick to proven functions) • Ensure the system has enough entropy to guarantee the unpredictability • An external hardware RNG is suggested • And don’t let NSA cripple the algorithms! 38 38Sunday, September 8, 13
  39. 39. Executive summary again 39 Random number generation is very diffcult to do properly; so don’t try to invent yours! http://imgs.xkcd.com/comics/security_holes.png 39Sunday, September 8, 13
  40. 40. Thanks Questions? 40 40Sunday, September 8, 13

×