Towards a stable definition of Algorithmic Randomness

Although information content is invariant up to an additive constant, the range of possible additive constants applicable to programming languages is so large that in practice it plays a major role in the actual evaluation of K(s), the Kolmogorov complexity of a string s. We present a summary of the approach we've developed to overcome the problem by calculating its algorithmic probability and evaluating the algorithmic complexity via the coding theorem, thereby providing a stable framework for Kolmogorov complexity even for short strings. We also show that reasonable formalisms produce reasonable complexity classifications.



1. Towards a stable definition of Algorithmic Randomness
Hector Zenil
hector.zenil@lifl.fr
Laboratoire d'Informatique Fondamentale de Lille (CNRS) and Institut d'Histoire et de Philosophie des Sciences et des Techniques (Paris 1 Panthéon-Sorbonne/ENS Ulm/CNRS)
Paris Diderot Philmaths Seminar, May 17, 2011
Université Paris Diderot - Paris 7, SPHERE-REHSEIS
Hector Zenil (LIFL and IHPST)   Towards a stable definition of AR   Paris Diderot   1 / 25
2. Classical Probability
If the process generating bitstrings of length k is uniformly random, the probability of producing a particular string is exactly 1/2^k, the same as for any other string of the same length.
Example: Let s1 = '01010101010101' and s2 = '10110110001010'. Both have probability P(s1) = P(s2) = 1/2^14 of being chosen at random among the 2^14 binary strings of length k = 14.
Yet s1 looks less random than s2. How can such an intuition be quantified and characterized?
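As a minimal sketch of the point above (the variable names are ours), classical probability assigns both strings exactly the same chance and so cannot tell them apart:

```python
from fractions import Fraction

def uniform_prob(s: str) -> Fraction:
    # Probability of drawing s uniformly among all bitstrings of its length.
    return Fraction(1, 2 ** len(s))

s1 = "01010101010101"  # looks regular
s2 = "10110110001010"  # looks random
# Classical probability cannot distinguish them:
print(uniform_prob(s1) == uniform_prob(s2))  # True
```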
3. Algorithmic Complexity
Basic notion: A string's complexity (or simplicity) is the difference in length between the string and its shortest description.
The description of an object depends on a language. The theory of computation is the framework of algorithmic complexity:
Basic notion: description ⇐⇒ computer program
A string of low algorithmic complexity is highly compressible, as the information that it contains can be encoded in an algorithm much shorter in length than the string itself.
4. (Figure slide; no recoverable text.)
5. Algorithmic Complexity (cont.)
Definition (Kolmogorov, Chaitin): The algorithmic complexity K(s) of a string s is the length (in bits) of the shortest program p that produces s when run on a universal Turing machine M:
K(s) = min{|p| : M(p) = s}
6. Algorithmic Randomness
Example: The string '010101010101010101...' has low algorithmic complexity because it can be described as k times '01'; no matter how long the string, the description grows only by ~log(k) while the string itself grows linearly with k.
Example: The string '010010110110001010...' may have high algorithmic complexity because it doesn't seem to admit a description shorter than the string itself, so a shorter description may not exist.
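The logarithmic growth of the description can be made concrete in a small sketch. The `repeat` notation below is a hypothetical description language of our own invention; only the length of the description matters:

```python
def describe(k: int) -> str:
    # A description of the string '01' * k; its length grows with the
    # number of digits of k, i.e. like log10(k).
    return f"repeat('01', {k})"

for k in (10, 1000, 100000):
    # The string grows linearly, the description only logarithmically.
    print(len("01" * k), len(describe(k)))
```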
7. Infinite vs. Finite Randomness
Definition: Given a natural number c and a sequence s, s is c-incompressible if K(s) ≥ |s| − c.
Example: A string s is random if the shortest program producing s is no shorter than s itself.
Definition: An infinite sequence s is Martin-Löf random if and only if there is a constant c such that all initial segments (prefixes) of s are c-incompressible.
8. Infinite vs. Finite Randomness (cont.)
No finite string can be declared random; a string s can only look random, because it can always be part of a longer non-random sequence.
9. Convergence of Definitions
There are three mathematical approaches to randomness, each capturing different intuitive features:
- Incompressibility (program-size)
- Unpredictability (effective martingales)
- Typicalness (effective statistical tests)
Important (Chaitin, Schnorr; up to some technicalities):
Incompressibility ⇐⇒ Unpredictability ⇐⇒ Typicality
10. The Choice of M Matters
A major criticism brought forward against K is its dependence on the choice of programming language (or M, the universal Turing machine). From the definition
K(s) = min{|p| : M(p) = s}
it may turn out that KM1(s) ≠ KM2(s) when evaluated using M1 and M2 respectively.
Basic notion: This dependency is particularly troubling for short strings, e.g. strings shorter than the universal Turing machine on which their K is evaluated (hundreds of bits).
11. The Invariance Theorem
A theorem guarantees that algorithmic complexity evaluations on different machines diverge by at most a fixed constant, so in the long term they converge.
Theorem (Invariance): If M1 and M2 are two universal Turing machines, and KM1(s) and KM2(s) are the algorithmic complexities of a binary string s evaluated on M1 and M2 respectively, then there exists a constant c such that for all binary strings s:
|KM1(s) − KM2(s)| < c
(Think of a compiler between two programming languages.)
12. Evaluating K
Consider, as an example, the following program "A":
1  n := 0
2  Print n
3  n := n + 1 mod 2
4  Goto 2
which generates the output string "01010101...". The length of A (in bits) is an upper bound on K.
Predictability: Program A trivially allows a shortcut to the value of an arbitrary digit through the following function f(n): if n = 2m then f(n) = 1, otherwise f(n) = 0.
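A direct translation of program "A" into Python, together with the shortcut function f (assuming, as the slide's formula suggests, 1-indexed digit positions), might look like this:

```python
from itertools import islice

def program_a():
    # Program "A" as an infinite generator: 0, 1, 0, 1, ...
    n = 0
    while True:
        yield n
        n = (n + 1) % 2

def f(n: int) -> int:
    # Shortcut to the n-th output digit (1-indexed):
    # f(n) = 1 if n is even (n = 2m), 0 otherwise.
    return 1 if n % 2 == 0 else 0

prefix = "".join(str(b) for b in islice(program_a(), 8))
print(prefix)  # 01010101
```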
13. Miscellaneous Facts
- Most strings are random. There are exactly 2^n bit strings of length n, but only 2^0 + 2^1 + 2^2 + … + 2^(n−1) = 2^n − 1 bit strings of fewer bits. So one can't pair up all n-length strings with programs of shorter length (there simply aren't enough short strings to encode all longer strings).
- There are examples of infinite random sequences: for instance, Chaitin's Ω numbers are algorithmically random.
- Most real numbers are algorithmically random.
- There is a deep connection between algorithmic randomness and the field of computability (Turing degrees): random number → noncomputable number.
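The counting argument in the first item can be checked directly:

```python
def shorter_strings(n: int) -> int:
    # Number of bit strings strictly shorter than n:
    # 2^0 + 2^1 + ... + 2^(n-1).
    return sum(2 ** k for k in range(n))

for n in (1, 5, 16):
    # Always one fewer than the 2^n strings of length n itself,
    # so not every n-bit string can get a shorter program.
    assert shorter_strings(n) == 2 ** n - 1
print("counting identity holds")
```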
14. Noncomputability of K
Important result: No algorithm can tell whether a program generating s is the shortest possible (due to the undecidability of the halting problem for Turing machines).
Basic notion: One may not be able to prove that a program generating s is the shortest, but one can exhibit a short program generating s, (much) shorter than s itself. So even though one cannot tell whether a string is random (it may still have a short generating program), one can find a short program and thereby tell that the string is definitely not random.
Basic notion: One can find upper bounds on K by finding short programs, for example using compression algorithms.
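As a sketch of the last remark, any general-purpose compressor yields an upper bound on K, up to the fixed size of the decompressor. A minimal example using Python's zlib:

```python
import zlib

def compressed_size_bits(s: bytes) -> int:
    # Length in bits of a zlib-compressed encoding of s: a crude upper
    # bound on K(s), since the compressed data plus a fixed decompressor
    # constitutes a program producing s.
    return 8 * len(zlib.compress(s, 9))

regular = b"01" * 5000  # highly regular, hence highly compressible
print(compressed_size_bits(regular) < 8 * len(regular))  # True
```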
15. Algorithmic Probability
There is a distribution that describes the expected output when picking a program at random and running it on a universal Turing machine. According to algorithmic probability, the simpler a string, the more likely it is to be produced by a short program. The idea formalizes Occam's razor.
Definition (Levin): m(s) = Σ_{p : M(p) = s} 1/2^|p|, i.e. the sum over all programs p for which M, running p, outputs the string s and halts.
Here M is a prefix-free universal Turing machine. (The set of valid programs forms a prefix-free set, that is, no element is a prefix of any other, a property necessary to keep 0 < m(s) < 1.)
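The sum defining m(s) can be illustrated on a toy, hand-made prefix-free program set; the programs and their outputs below are entirely hypothetical:

```python
from fractions import Fraction

# Hypothetical prefix-free programs and their outputs (None = never halts).
toy_machine = {
    "0":   "0",
    "10":  "0",
    "110": "01",
    "111": None,
}

def m(s: str) -> Fraction:
    # Levin's sum restricted to the toy program set: sum of 1/2^|p|
    # over the programs p that halt with output s.
    return sum((Fraction(1, 2 ** len(p))
                for p, out in toy_machine.items() if out == s),
               Fraction(0))

print(m("0"))   # 3/4: two programs ('0' and '10') produce '0'
print(m("01"))  # 1/8
```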
16. AP Metaphors
Basic notion: It is unlikely that a Rube Goldberg machine produces a string if the string can be produced by a much simpler process.
The immediate consequence of AP is simple but powerful (and surprising):
- Monkeys on a typewriter (Borel): garbage in → garbage out
- Programmer monkeys (Chaitin, Lloyd): garbage in → interesting out
17. Algorithmic Probability (cont.)
The chances of producing π are greater when typing a program that produces the digits of π than when typing the digits of π themselves. Monkeys are just a representation of a random source.
m is related to algorithmic complexity in that m(s) is at least the maximum term in the summation over programs, so one can actually relate m(s) to K(s):
Theorem (Levin, Chaitin): − log2 m(s) = K(s) + c
Algorithmic probability defines a prior distribution on produced strings based on algorithmic complexity.
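The coding theorem turns any estimate of m(s) into an estimate of K(s) up to an additive constant, as in this small sketch (the m-values are hypothetical):

```python
import math
from fractions import Fraction

def k_estimate(m_s: Fraction) -> float:
    # Coding theorem: K(s) ≈ -log2 m(s), up to an additive constant c.
    return -math.log2(m_s)

# A more probable (hence simpler) string gets a lower complexity estimate.
print(k_estimate(Fraction(1, 4)))     # 2.0
print(k_estimate(Fraction(1, 1024)))  # 10.0
```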
18. Main Idea
Using m(s) to evaluate K(s):
Observation (Zenil, Delahaye): To approach K(s) one can calculate m(s).
Motivation: m is more stable than K(s) because it involves fewer arbitrary choices about M.
As m is defined in terms of K, m is also noncomputable and only approachable from below (hence called a semi-computable measure, or simply a semi-measure).
19. Calculating m
Definition (Zenil, Delahaye): D(n) = the function that assigns to every finite binary string s the quotient:
(# of times that a machine in (n,2) produces s) / (# of machines in (n,2))
i.e. D(n) is the probability distribution of the strings produced by all 2-symbol halting Turing machines with n states.
Examples for n = 1 and n = 2:
D(1): 0 → 0.5; 1 → 0.5
D(2): 0 → 0.328; 1 → 0.328; 00 → 0.0834; …
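Computing an empirical distribution like D from the outputs of an exhaustive run can be sketched as follows. The toy output list is invented, and we normalize by the number of halting machines, which matches the reported D(1):

```python
from collections import Counter
from fractions import Fraction

def empirical_D(outputs):
    # For each produced string s: (# halting machines producing s)
    # divided by (# halting machines). `outputs` lists the output of
    # each machine that halted; non-halters contribute nothing.
    counts = Counter(outputs)
    total = len(outputs)
    return {s: Fraction(c, total) for s, c in counts.items()}

# Toy run mirroring D(1) = 0 -> 0.5; 1 -> 0.5:
D = empirical_D(["0", "1", "0", "1"])
print(D)  # {'0': Fraction(1, 2), '1': Fraction(1, 2)}
```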
20. Calculating m (cont.)
Given that the Busy Beaver function values are known for n-state 2-symbol Turing machines for n = 2, 3, 4, we could compute D(n) for n = 2, 3, 4. Following techniques from (Wolfram), we ran all 22 039 921 152 two-way-tape Turing machines, starting with a tape filled with 0s and 1s, in order to calculate D(4). (A 9-day calculation on a single 2.26 GHz Intel Core Duo CPU.)
Theorem: D(n) is noncomputable (by reduction to Radó's Busy Beaver problem (Radó)).
21. Complexity Tables
Table: The 22 bit-strings in D(2) from the 6 088 (2,2)-Turing machines that halt. (Zenil, Delahaye)

    0 → .328        010 → .00065
    1 → .328        101 → .00065
    00 → .0834      111 → .00065
    01 → .0834      0000 → .00032
    10 → .0834      0010 → .00032
    11 → .0834      0100 → .00032
    001 → .00098    0110 → .00032
    011 → .00098    1001 → .00032
    100 → .00098    1011 → .00032
    110 → .00098    1101 → .00032
    000 → .00065    1111 → .00032

Solving degenerate cases: Does '0' have high Kolmogorov complexity? AP says it is not random; it is actually the simplest string (together with '1') according to D.
22. From a Prior to an Empirical Distribution
We see algorithmic complexity emerging:
1. The classification accords with our intuition of what complexity should be.
2. Strings are almost always classified by length, except in cases where intuition justifies that they should not be, e.g. 0101010 is ranked better (less complex) than e.g. 11001101.
Full tables are available online: www.algorithmicnature.org
From m to D: Unlike m, D is an empirical distribution and no longer a prior. D experimentally confirms Solomonoff and Levin's AP measure.
23. Miscellaneous Facts from D
- There are 5 970 768 960 machines that halt among the 22 039 921 152 in (4,2); that is a fraction of 0.27.
- A total of 1 824 strings are produced in (4,2).
- The most random-looking strings according to D are 1101010101010101, 1101010100010101, 1010101010101011 and 1010100010101011, each with probability 5.4447×10^−10.
- (4,2) produces all strings up to length 8; beyond length 8 the number of produced strings decreases.
- As in D(3), where one string group (0101010 and its reversal) was promoted, in D(4) 399 strings climbed to the top and were not sorted among their length groups. In D(4) string length was no longer a determinant of string position: for example, between positions 780 and 790 the string lengths are 11, 10, 10, 11, 9, 10, 9, 9, 9, 10 and 9 bits.
- D(4) preserves the string order of D(3), except in 17 places out of the 128 strings of D(3) ordered from highest to lowest string frequency.
24. Method Limitations
One cannot continue calculating D(n) for any given n because of the noncomputability of D (the lack of Busy Beaver values for n = 5), but one can proceed either by sampling or by a partitioning technique, that is, cutting a longer string into shorter strings whose complexity is known.
Given that the procedure is, computationally speaking, very expensive, for longer strings one can continue using compression algorithms, which work well for longer strings.
Media coverage: "Pour La Science" (Scientific American in French) featured this research in its July 2011 issue, available online at http://www.mathrix.org/zenil/PLSZenilDelahaye.pdf.
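The partitioning technique mentioned above can be sketched as follows; `known_K` stands for a hypothetical lookup table of complexity values for short strings (e.g. derived from D(4)):

```python
def partition_complexity(s: str, known_K: dict, block: int) -> int:
    # Estimate the complexity of a long string by cutting it into
    # blocks of size `block` and summing the known values of the pieces.
    pieces = [s[i:i + block] for i in range(0, len(s), block)]
    return sum(known_K[p] for p in pieces)

# Purely illustrative table of complexity values for 2-bit blocks:
toy_K = {"00": 2, "01": 2, "10": 2, "11": 2}
print(partition_complexity("01010101", toy_K, block=2))  # 8
```

Note that this additive estimate ignores regularities spanning block boundaries (the repetition in "01010101" is invisible to it), so it is only a rough upper-bound-style approximation.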
25. Further Discussion
1. How can the results of D be extended to longer strings?
2. How stable is D under other computing frameworks?
3. How stable is D(n) for growing n?
4. Is convergence of D(n), in order or in values, possible?
5. How can D be formally reconnected to m?
We answer some of these questions positively. We have shown that reasonable formalisms of computation produce reasonable (and compatible) complexity classifications. We have also shown that D(n) is strongly stable, at least for n ≤ 5.
