Statistical Properties of the Entropy Function of a Random

                                             Partition

                                          Anna Movsheva



Contents

1 Introduction
  1.1 Background
  1.2 Research Problem
  1.3 Hypothesis
        1.3.1 General properties of θ(p, x)
        1.3.2 Functions µ(p) and σ(p)
        1.3.3 The Generating Function of Moments
        1.3.4 Discussion of Conjecture 1.4
  1.4 Significance

2 Methods

3 Results
  3.1 Computation of β_r and γ_r
  3.2 Bell Trials

4 Discussion and Conclusion
Abstract

   It is well known that living organisms are open self-organizing thermodynamic systems with
low entropy. An estimate for the number of subsystems with low entropy would give a rough
guess about the number of self-organizing subsystems that exist in a closed system S. I study
the mathematical properties of a model in which a finite set X with a probability distribution
{p_x | x ∈ X} encodes the set of states of the system S. A partition X = ⋃_{i=1}^{l} Y_i of the set X
represents in this model a subsystem with the set of probabilities {p(Y_i) = Σ_{x∈Y_i} p_x}. In this
paper I study the entropy function H(p, Y) = −Σ_i p(Y_i) ln p(Y_i) of a random partition Y. In
particular I study the counting function Θ(p, x) = #{Y | H(p, Y) ≤ x}. Using computer simulations,
I give evidence that the normalized function θ(p, x) = Θ(p, x)/Θ(p, H(p, X)) can asymptotically be
approximated by the cumulative Gauss distribution (1/√(2πσ(p))) ∫_{−∞}^{x} exp(−(t − µ(p))²/2σ(p)) dt.
I state my findings in the form of falsifiable conjectures, some of which I partly prove. The
asymptotics explain a strong correlation between µ(p), the average entropy of a random partition
of X, and the entropy H(p, X). Since the quantity µ(p) is usually available in practice, I can
give an estimate for H(p, X) when it is not directly computable.


1     Introduction

1.1    Background

One of the main problems of theoretical biology and theoretical physics is to reconcile the theory of evolution with statistical mechanics and thermodynamics. Ilya Prigogine was the first to make fundamental contributions to the solution of this problem. He advocated that living organisms are open self-organizing thermodynamic systems with low entropy. These open systems are part of a large closed system S. Since I am interested in open self-organizing thermodynamic systems, it is important to know the number of subsystems within S that have low entropy. In my work I studied this question from the mathematical point of view. In my simplified approach the configuration space of S was a finite set X with a probability distribution. In my interpretation a subsystem was a partition of X. In my work I studied a function that, for a given x, counted the number of partitions of X whose entropy did not exceed x. My approach is rather general because any configuration space can be approximated by a sufficiently large but finite set.

    The controversy between classical biology and physics has a long history. It revolves around the paradox that physical processes are reversible while biological processes are not. Boltzmann, in the process of working on this dilemma, laid the foundation of statistical physics. He put forward the notion of entropy, which characterizes the degree of disorder in a statistical system. The second law of thermodynamics in the formulation of Boltzmann states that the entropy of a closed system cannot decrease, which makes time in a statistical system irreversible. The solution of the problem of the irreversibility of time did not completely eliminate the contradiction. The second law of thermodynamics seems to forbid the long-term existence of organized systems, such as living organisms. Schrödinger in his book [19] (Chapter 6) pointed out that the entropy can go down in an open system, that is, a system that can exchange mass and energy with its surroundings. Prigogine in his groundbreaking works [15, 14, 16] showed that self-organization (decrease of entropy) can be achieved dynamically. His discovery laid the foundation of non-equilibrium statistical mechanics. The most interesting self-organizing systems exist far away from equilibrium and are non-static by nature.

    There is a vast literature on self-organization (see e.g. [16, 10, 9, 12] and the references therein).




Current research is focused on the detailed study of individual examples of self-organization and is very successful (see e.g. [3]). In this work I changed the perspective. My motivating problem was rather general: to estimate the total number of self-organizing subsystems in a thermodynamically closed system. Self-organizing subsystems are the most interesting specimens of the class of subsystems with low entropy. This motivates my interest in estimating the number of subsystems with low entropy. Knowing this number, the number of self-organizing subsystems can be assessed. A problem posed in such generality looks very hard, so I made a series of simplifications that let me progress in this direction. Ashby in [1] argued that any system S can be thought of as a "machine". His idea is that the configuration space of S can be approximated by a set or an alphabet X, and the dynamics is given by a transition rule T_X : X → X. A homomorphism between machines S = (X, T_X) and Q = (Z, T_Z) is a map ψ : X → Z such that ψT_X = T_Z ψ. Homomorphisms are useful in the analysis of complicated systems (see [1] for details). A submachine, according to [1], is a subset X′ ⊂ X that is invariant with respect to T_X. I never use this definition in this paper. In my definition a submachine is a homomorphic image ψ : (X, T_X) → (Z, T_Z). For example, if a machine (X, T) consists of N non-interacting submachines (X_1, T_1), . . . , (X_N, T_N), then X = X_1 × · · · × X_N and T = T_1 × · · · × T_N. The projections ψ_i(x_1, . . . , x_N) = x_i are homomorphisms of machines. This reflects the fact that the configuration space of a union of non-interacting systems is a product (not a union) of the configuration spaces of the components.

Definition 1.1. A collection of subsets Y = {Y_z | z ∈ Z}, such that Y_z ∩ Y_{z′} = ∅ for z ≠ z′ and ⋃_{z∈Z} Y_z = X, is a partition of a finite set X, r = #X. Let k_z be the cardinality #Y_z. In this paper I shall use the notation X = ⊔_z Y_z.

    Any homomorphism ψ : (X, T_X) → (Z, T_Z) defines a partition of X with Y_z equal to {x ∈ X | ψ(x) = z}. In fact, up to relabeling the elements of Z, the homomorphism is the same as a partition. This also explains why I am interested in counting partitions. Ashby in [1] argued that a machine (X, T) is a limiting case of a more realistic Markov process, in which the deterministic transition rule x → T(x) gets replaced by a random transition rule x → T̃(x). The dynamics of the process is completely determined by the probabilities {p_{x′,x} | x, x′ ∈ X} to pass from the state x to the state x′ and by the initial probability distribution {p_x | x ∈ X}. Markov processes have been studied in information theory, developed originally in [20].



   Yet there is still another way to interpret the quantities that I would like to compute. A submachine can also be interpreted as a scientific device. This can be understood through the example of a hurricane on Jupiter [2]. You can analyze the hurricane in a multitude of ways: visually through the lenses of a telescope, by recording the fluctuations of winds with a probe, by capturing the fluctuations of the magnetic field around the hurricane. Every method of analysis (device) gives statistical data that in turn yields the respective entropy. If (X, p) is a space of states of the hurricane, then ψ : X → Z is a function whose set of values is the set of readings of the scientific device. It automatically leads to a partition of X, as was explained above. The list of known scientific methods in planetary science is enormous [13], and any new additional method contributes something to the knowledge. Yet the full understanding of the subject would only be possible if I used all possible methods (all ψ's). This, however, is not going to happen in planetary science in the near future. The reason is that the set of states X of Jupiter's atmosphere is colossal, which makes the set of all conceivable methods of its study (devices) even bigger.

   Still, imagine that all the mentioned troubles are nonexistent. It would be interesting to count the number of scientific devices that yield statistical data about the hurricane with entropy no greater than a given value. It would also be interesting to know their average entropy. This is a dream. I did just that in my oversimplified model.


1.2   Research Problem

In the following, the set X will be {1, . . . , r}. Let p be a probability distribution on X, that is, a collection of numbers p_i ≥ 0 such that Σ_{i=1}^{r} p_i = 1. The array p = (p_1, . . . , p_r) is said to be a probability vector. The probability of Y_i in the partition X = ⊔ Y_i is

      p(Y_i) = Σ_{j∈Y_i} p_j.

Definition 1.2. The entropy of a partition Y, H(p, Y), is calculated by the expression −Σ_{i=1}^{l} p(Y_i) ln p(Y_i). In this definition the function x ln x is extended to x = 0 by continuity: 0 ln 0 = 0.

   Here are some examples of entropies: H(p, Y_max) = −Σ_{i=1}^{r} p_i ln p_i for Y_max = {{1}, . . . , {r}}, and H(p, Y_min) = 0 for Y_min = {{1, . . . , r}}. One of the properties of the entropy function (see [6]) is that

      H(p, Y_min) ≤ H(p, Y) ≤ H(p, Y_max) for any Y ∈ P_r                                    (1)


   It is clear from the previous discussion that Θ(p, x) = #{Y ∈ P_r | H(p, Y) ≤ x} is identical to the function defined in the abstract.

   The Bell number B_r ([22], [17]) is the cardinality of P_r. The value Θ(p, H(p, Y_max)) = Θ(p, H(p, X)) thanks to (1) coincides with B_r. From this I conclude that

      θ(p, x) = #{Y ∈ P_r | H(p, Y) ≤ x} / B_r

is the function defined in the abstract.

   My main goal is to find a simple approximation to θ(p, x).
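For small r the function θ(p, x) can be computed exactly by brute force. The sketch below is my own Python (the paper's computations were done in Mathematica, and all helper names here are mine): it enumerates the set P_r of partitions and applies the definitions of H(p, Y) and θ(p, x) directly.

```python
from math import log

def set_partitions(elements):
    """Generate all partitions of `elements` into nonempty blocks (the set P_r)."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # place `first` into each existing block, or into a new block of its own
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        yield [[first]] + partition

def entropy(p, partition):
    """H(p, Y) = -sum_i p(Y_i) ln p(Y_i), with 0 ln 0 = 0."""
    h = 0.0
    for block in partition:
        q = sum(p[j] for j in block)
        if q > 0:
            h -= q * log(q)
    return h

def theta(p, x):
    """theta(p, x) = #{Y in P_r : H(p, Y) <= x} / B_r."""
    parts = list(set_partitions(list(range(len(p)))))
    count = sum(1 for Y in parts if entropy(p, Y) <= x)
    return count / len(parts)  # len(parts) is the Bell number B_r
```

For example, θ(p, 0) picks out only the partition Y_min, so it equals 1/B_r, and θ(p, x) = 1 once x exceeds H(p, Y_max).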


1.3     Hypothesis

In this section I will formulate the conjectures that I obtained with the computer algebra system Mathematica [11].

Remark 1.3. I equipped the set Pr with the probability distribution P such that P(Y ) for Y ∈ Pr

is equal to 1/Br . The value of the function θ(p, x) is the probability that a random partition Y has

the entropy ≤ x. This explains the adjective “random” in the title of the paper.

   In order to state the main result I will need to set notation:

      p[k] = (p_1, . . . , p_r, 0, . . . , 0)   (k zeros appended)                        (2)

where p = (p_1, . . . , p_r) is the probability vector. From the set of moments of the entropy of a random partition

      E(H^l(p, Y)) = (1/B_r) Σ_{Y∈P_r} H^l(p, Y)                                          (3)


I will use the first two to define the average µ(p) = E(H(p, Y)) and the standard deviation σ(p) = √(E(H(p, Y)²) − E(H(p, Y))²).

Conjecture 1.4. Let p be a probability distribution on {1, . . . , r}. Then

      lim_{k→∞} [ E(H^l(p[k], Y)) − (1/√(2πσ)) ∫_{−∞}^{∞} x^l e^{−(x−µ)²/2σ} dx ] = 0

with µ = µ(p[k]), σ = σ(p[k]), and for any integer l ≥ 0.

   Practically this means that the cumulative normal distribution

      Erf(x, µ, σ) = (1/√(2πσ)) ∫_{−∞}^{x} e^{−(t−µ)²/2σ} dt

with µ = µ(p[k]), σ = σ(p[k]) makes a good approximation to θ(p[k], x) for large k.

   The initial study of the function θ(p, x) was done with the help of Mathematica. The software can effectively compute the quantities associated with a set X whose cardinality does not exceed ten.


1.3.1     General properties of θ(p, x)

The plots of some typical graphs are presented in Figure 1.1. These were done with the help of Mathematica.

                                     Figure 1.1: Graphs of θ(p, x), θ(q, x).


   The continuous line on the graph corresponds to θ(p, x) with

      p = (0.082, 0.244, 0.221, 0.093, 0.052, 0.094, 0.079, 0.130)

The step function corresponds to q = (1/8, . . . , 1/8). Large steps are common for θ(q, x) when q has symmetries. A symmetry of q is a permutation τ of X such that q_{τ(x)} = q_x for all x ∈ X. Indeed, if I take a symmetry and act with it on a partition, I get another partition with the same entropy. In this way I can produce many partitions with equal entropies; hence the high steps in the graph.

   The effect of the operation p → p[1] (2) on θ(p, x) is surprising. Here are the typical graphs:

   Figure 1.2: Graphs of θ(p, x), θ(p[1], x), θ(p[2], x) for some randomly chosen p = (p_1, . . . , p_6).


   The reader can see that the graphs have the same bending patterns. Also, the graphs lie one over the other. This led me to put forth a conjecture that has passed multiple numerical tests.

Conjecture 1.5. For any p I have


     θ(p, x) ≥ θ(p[1], x)


   A procedure that plots θ(p, x) is hungry for computer memory. This is why it is worthwhile to

find a function that makes a good approximation. I have already mentioned in the introduction

that Erf(x, µ(p), σ(p)) approximates θ(p, x) well. For example, if


     p = (0.138, 0.124, 0.042, 0.106, 0.081, 0.131, 0.088, 0.138, 0.154),                                   (4)


the picture below indicates a good agreement of the graphs.




                    Figure 1.3: Erf(x, µ(p), σ(p)) (red) vs θ(p, x) (blue), with p as in (4).


     The reader will find more precise relations between Erf and θ in the following sections.


1.3.2        Functions µ(p) and σ(p)

The good agreement of the graphs Erf(x, µ(p), σ(p)) and θ(p, x) raises the question of a detailed analysis of the functions µ(p) and σ(p). It turns out that quantities more manageable than µ(p) are

      β(p) = H(p, Y_max) − µ(p),        γ(p) = H(p, Y_max)/µ(p)                          (5)

The inequality (1) implies that µ(p) ≤ H(p, Y_max), β(p) ≥ 0, and γ(p) ≥ 1. Evaluation of the denominator of γ(p) with formula (3) requires intensive computing. On my slow machine I used the Monte-Carlo approximation [8]

      µ(p) ≈ (1/k) Σ_{i=1}^{k} H(p, Y^i)


where Y^i are independent random partitions. Below are the graphs of β(p_1, p_2, p_3) and γ(p_1, p_2, p_3) plotted in Mathematica. The reader can distinctly see one maximum in the center corresponding to p = (1/3, 1/3, 1/3).
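The Monte-Carlo estimate of µ(p) can be sketched as follows. This is my own code, not the paper's: for illustration I sample uniformly from P_r by simply enumerating it (feasible only for small r; the paper's larger experiments would need a genuine uniform random-partition sampler), and I compare the estimate against the exact average from formula (3).

```python
import random
from math import log

def set_partitions(elements):
    """Enumerate all set partitions of `elements` (the set P_r)."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def entropy(p, Y):
    """H(p, Y) = -sum_i p(Y_i) ln p(Y_i), with 0 ln 0 = 0."""
    return -sum(q * log(q) for q in (sum(p[j] for j in B) for B in Y) if q > 0)

def mu_exact(p):
    """mu(p) = E(H(p, Y)) by full enumeration of P_r, as in formula (3)."""
    Ys = list(set_partitions(list(range(len(p)))))
    return sum(entropy(p, Y) for Y in Ys) / len(Ys)

def mu_monte_carlo(p, k, seed=0):
    """(1/k) sum_i H(p, Y^i) over k uniform random partitions Y^i."""
    rng = random.Random(seed)
    Ys = list(set_partitions(list(range(len(p)))))  # uniform sampling pool
    return sum(entropy(p, rng.choice(Ys)) for _ in range(k)) / k
```

With a few thousand samples the estimate typically agrees with the exact value to two decimal places.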




      Figure 1.4: The plot of β(p_1, p_2, 1 − p_1 − p_2).       Figure 1.5: The plot of γ(p_1, p_2, 1 − p_1 − p_2).


     A closer look at the plot shows that γ(p1 , p2 , p3 ) is not a concave function.




      In the following h_r stands for the probability vector (1/r, . . . , 1/r).

      I came up with a conjecture, which has been numerically tested for r ≤ 9:

Conjecture 1.6. The function γ(p_1, . . . , p_r) can be extended by continuity to all distributions p. In this bigger domain it satisfies

      1 ≤ γ(p) ≤ γ(h_r) =: γ_r.                                                          (6)

Likewise the function β satisfies

      0 ≤ β(p) ≤ β(h_r) =: β_r.                                                          (7)


      The reader should consult the sections below for alternative ways of computing β_r and γ_r.

      The following table contains an initial segment of the sequence {γ_r}.


                                             Table 1: Values of γr .
 r      2        3        4         5       6       7      8         9     ...        100   ...      1000
 γr     2    1.826    1.739     1.691   1.659 1.635 1.617 1.602            ...      1.426   ...     1.341


      I see that it is a decreasing sequence. Extensive computer tests have led me to the following

conjecture.

Conjecture 1.7. The sequence {γ_r} satisfies γ_r ≥ γ_{r+1} and lim_{r→∞} γ_r = 1.


      The limit statement is proved in Proposition 3.6.

Corollary 1.8. lim_{t→∞} γ(p[t]) = 1.

Proof. From Conjecture 1.6 I conclude that 1 ≤ γ(p[t]) ≤ γ_{r+t}. Since lim_{t→∞} γ_{r+t} = 1 by Conjecture 1.7, lim_{t→∞} γ(p[t]) = 1.




                                             Table 2: Values of βr .

                     r            6            7           8           9      ...       100       ...
                     βr    0.711731     0.756053    0.793492    0.825835      ...    1.3943       ...





Conjecture 1.9. The sequence {β_r} satisfies β_r ≤ β_{r+1} and lim_{r→∞} β_r = ∞.


    The situation with the standard deviation σ(p) is a bit more complicated. Here is a graph of

σ(p1 , p2 , p3 ).




  Figure 1.6: Three-dimensional view of the graph of standard deviation σ(p1 , p2 , p3 ) for θ(p, x).


    The reader can clearly see four local maxima. The function σ(p_1, p_2, p_3) is symmetric. The maxima correspond to the point (1/3, 1/3, 1/3) and the permutations of (1/2, 1/2, 0). This led me to think that the local maxima of σ(p_1, . . . , p_r) are permutations of q_{k,r} = h_k[r − k], k ≤ r. I tabulated the values of σ(q_{k,r}) for small k and r in the table below.


                                          Table 3: Values of σ(q_{k,r}).

                    k \ r     3        4        5        6        7        8        9
                    2      0.3396   0.3268   0.314    0.3026   0.2924   0.2832   0.275
                    3      0.35     0.3309   0.3173   0.3074   0.2992   0.292    0.286
                    4      -        0.3254   0.309    0.298    0.29     0.283    0.278
                    5      -        -        0.302    0.289    0.28     0.273    0.267
                    6      -        -        -        0.283    0.272    0.265    0.258
                    7      -        -        -        -        0.267    0.258    0.251
                    8      -        -        -        -        -        0.254    0.246
                    9      -        -        -        -        -        -        0.242



    The reader can see that the third row (k = 3) has the largest value in each column. It is not hard to see analytically that q_{k,r} is a critical point of σ. My computer experiments led me to the following conjecture:

Conjecture 1.10. The function σ(p) has a global maximum at q3,r .
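For small r the entries of Table 3, and hence the evidence behind Conjecture 1.10, can be reproduced by direct enumeration. A minimal sketch, using nothing beyond the definitions of σ(p) and q_{k,r} (all helper names are my own, not the paper's):

```python
from math import log, sqrt

def set_partitions(elements):
    """Enumerate all set partitions of `elements`."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def entropy(p, Y):
    """H(p, Y) with the convention 0 ln 0 = 0."""
    return -sum(q * log(q) for q in (sum(p[j] for j in B) for B in Y) if q > 0)

def sigma(p):
    """sigma(p) = sqrt(E(H^2) - E(H)^2) over a uniformly random partition."""
    hs = [entropy(p, Y) for Y in set_partitions(list(range(len(p))))]
    m = sum(hs) / len(hs)
    return sqrt(sum(h * h for h in hs) / len(hs) - m * m)

def q(k, r):
    """q_{k,r} = h_k[r - k]: k coordinates equal to 1/k, padded with r - k zeros."""
    return [1.0 / k] * k + [0.0] * (r - k)
```

Running `sigma(q(2, 3))` and `sigma(q(3, 4))` reproduces the first two columns of Table 3 to the printed precision.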






1.3.3      The Generating Function of Moments

In order to test Conjecture 1.4 I need an effective way of computing E(H^l(p[k], Y)) for large values of k. In this section I present my computations of E(H^l(p[k], Y)) for small r, which led me to a conjectural formula for E(H^l(p[k], Y)).

    The factorial generating function of powers of entropy can be written compactly this way:

      G(p, Y, s) = Σ_{t=0}^{∞} H(p, Y)^t s^t / t! = Σ_{t=0}^{∞} (−Σ_{i=1}^{l} p(Y_i) ln p(Y_i))^t s^t / t! = Π_{i=1}^{l} p(Y_i)^{−p(Y_i)s}        (8)

The function G(p, Y, s) can be extended from P_r to P_{r+1} in the following way. I extend the r-dimensional probability vector p to an (r + 1)-dimensional vector p′ by adding a zero coordinate. Any partition Y = {Y_1, . . . , Y_l} defines a partition Y′ = {Y_1, . . . , Y_l, {r + 1}}. Note that G(p, Y, s) = G(p′, Y′, s).

    The following generating function, after normalization, encodes all the moments of the entropy of a random partition:

      J(p, s) = Σ_{Y∈P_r} G(p, Y, s),        J(p, s)/B_r = Σ_{l≥0} E(H^l(p, Y)) s^l / l!        (9)

I want to explore the effect of the substitution p → p[k] on J(p, s).
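Identity (9), together with the fact from (8) that G(p, Y, s) = e^{sH(p,Y)}, can be verified numerically for small r. A sketch under that reading (the helper names and the series truncation are mine):

```python
from math import log, factorial

def set_partitions(elements):
    """Enumerate all set partitions of `elements` (the set P_r)."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def entropy(p, Y):
    """H(p, Y) with the convention 0 ln 0 = 0."""
    return -sum(q * log(q) for q in (sum(p[j] for j in B) for B in Y) if q > 0)

def G(p, Y, s):
    """G(p, Y, s) = prod_i p(Y_i)^{-p(Y_i) s}, formula (8)."""
    out = 1.0
    for B in Y:
        q = sum(p[j] for j in B)
        if q > 0:
            out *= q ** (-q * s)
    return out

def J_over_Br(p, s):
    """Left-hand side of (9) after normalization: J(p, s)/B_r."""
    Ys = list(set_partitions(list(range(len(p)))))
    return sum(G(p, Y, s) for Y in Ys) / len(Ys)

def moment_series(p, s, terms=40):
    """Right-hand side of (9): sum_l E(H^l(p, Y)) s^l / l!, truncated."""
    Ys = list(set_partitions(list(range(len(p)))))
    hs = [entropy(p, Y) for Y in Ys]
    return sum(sum(h ** l for h in hs) / len(hs) * s ** l / factorial(l)
               for l in range(terms))
```

The two sides agree to machine precision, since each summand in J(p, s) is exactly the exponential series of s·H(p, Y).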

    I use the following notation:

      A_t(p, s) = J(p[t], s).


Here are the results of my computer experiments. A probability vector with two non-zero coordinates extended by t zeros yields

      A_t(p_1, p_2, −s) = B_{t+1} + (B_{t+2} − B_{t+1}) p_1^{p_1 s} p_2^{p_2 s}                  (10)

The next is for three non-zero coordinates extended by t zeros:

      A_t(p_1, p_2, p_3, −s) = B_{t+1} + (B_{t+2} − B_{t+1}) ×
          × ( (p_1 + p_2)^{s(p_1+p_2)} p_3^{p_3 s} + (p_1 + p_3)^{s(p_1+p_3)} p_2^{p_2 s} + (p_2 + p_3)^{s(p_2+p_3)} p_1^{p_1 s} )
          + (B_{t+3} − 3B_{t+2} + 2B_{t+1}) p_1^{p_1 s} p_2^{p_2 s} p_3^{p_3 s}                  (11)


I found A_t(p, s) for probability vectors p with five or fewer coordinates. In order to generalize the results of my computations I have to fix some notation. With the notation deg Y = deg{Y_1, . . . , Y_l} = l, I set J^l(p, s) := Σ_{deg Y = l} G(p, Y, s). Then

      A_t(p, s) = Σ_{l=1}^{r} L(l, t) J^l(p, s)                                                  (12)

where L(l, t) are some coefficients. For example, in the last line of formula (11) the coefficient L(3, t) is B_{t+3} − 3B_{t+2} + 2B_{t+1} and the function J^3(p, −s) is p_1^{p_1 s} p_2^{p_2 s} p_3^{p_3 s}. The reader can see that the coefficients of J^l(p, s) in the formulae (10) and (11) coincide. The coefficients of the Bell numbers in the formulae for L(l, t):


       B_{t+1}

       B_{t+2} − B_{t+1}

       B_{t+3} − 3B_{t+2} + 2B_{t+1}

       B_{t+4} − 6B_{t+3} + 11B_{t+2} − 6B_{t+1}

       B_{t+5} − 10B_{t+4} + 35B_{t+3} − 50B_{t+2} + 24B_{t+1}


form a triangle. I took the constants 1; 1, −1; 1, −3, 2; 1, −6, 11, −6 and entered them into the Google search window. The search led me to the sequence A094638, the Stirling numbers of the first kind, in the Online Encyclopedia of Integer Sequences (OEIS [21]).

Definition 1.11. The unsigned Stirling numbers of the first kind are denoted by [n; k]. They count the number of permutations of n elements with k disjoint cycles [22].







                                        Table 4: Values of the function L(l, t)

                                      l \ t     1      2      3       4        5     ...
                                        1       2      5     15      52      203    ...
                                        2       3     10     37     151      674    ...
                                        3       4     17     77     372     1915    ...
                                        4       5     26    141     799     4736    ...
                                        5       6     37    235    1540    10427    ...
                                      ...     ...    ...    ...     ...      ...    ...


   The rows of this table are the sequences A000110, A138378, A005494, A045379. OEIS provided me with the factorial generating function for these sequences:

Conjecture 1.12.

      L(l, t) = [l; l] B_{t+l} − [l; l−1] B_{t+l−1} + · · · + (−1)^{l+1} [l; 1] B_{t+1}          (13)

      Σ_{t=0}^{∞} L(l, t) z^t / t! = e^{lz + e^z − 1}                                            (14)

The identity (12) holds for all values of t.
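Formula (13) can be checked directly against Table 4. The sketch below is my own code, not the paper's Mathematica: it builds Bell numbers with the Bell triangle and the unsigned Stirling numbers of the first kind from the standard recurrence [n; k] = [n−1; k−1] + (n−1)[n−1; k].

```python
def bell_numbers(n):
    """Return [B_0, ..., B_n] computed via the Bell triangle."""
    B = [1]
    row = [1]
    for _ in range(n):
        new = [row[-1]]            # new row starts with the last entry of the old one
        for x in row:
            new.append(new[-1] + x)
        row = new
        B.append(row[0])
    return B

def stirling1_row(l):
    """Unsigned Stirling numbers of the first kind [l; 0], ..., [l; l]."""
    row = [1]
    for i in range(1, l + 1):
        new = [0] * (i + 1)
        for k in range(1, i + 1):
            # recurrence [i; k] = [i-1; k-1] + (i-1) [i-1; k]
            new[k] = row[k - 1] + (i - 1) * (row[k] if k <= i - 1 else 0)
        row = new
    return row

def L(l, t):
    """Conjectured formula (13): sum_m (-1)^{l-m} [l; m] B_{t+m}."""
    B = bell_numbers(t + l)
    s = stirling1_row(l)
    return sum((-1) ** (l - m) * s[m] * B[t + m] for m in range(1, l + 1))
```

For instance, L(4, 2) = B_6 − 6B_5 + 11B_4 − 6B_3 = 203 − 312 + 165 − 30 = 26, matching the fourth row of Table 4.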


1.3.4     Discussion of Conjecture 1.4

Formula (12) simplifies the computation of E(H^l(p[k], Y)). Here is a sample computation of

      D(p, l, k) = E(H^l(p[k], Y)) − (1/√(2πσ(p[k]))) ∫_{−∞}^{∞} x^l e^{−(x − µ(p[k]))²/2σ(p[k])} dx

for p = (0.4196, 0.1647, 0.4156).



                               Table 5: Values of the function D(p, l, k)

                   l \ k       0       100       200       300       400       500
                     3    -0.0166   -0.0077   -0.0048   -0.0036   -0.0029   -0.0024
                     4    -0.0474   -0.0273   -0.0173   -0.0129   -0.0104   -0.0088
                     5    -0.0884   -0.0617   -0.0393   -0.0294   -0.0237   -0.0200
                     6    -0.1467   -0.1142   -0.0726   -0.0543   -0.0438   -0.0369



   The reader can see that the functions k → D(p, l, k) attain a minimum at some k, after which they increase toward zero.


1.4   Significance

There is a multitude of possible devices that can be used to study a remote system. While some devices will convey a lot of information, some will be inadequate. Surprisingly, the majority of the devices (see Conjectures 1.6, 1.7, and 1.9) will measure an entropy very close to the actual entropy of the system. All that is asked of the device is that it defines an onto map

      ψ : X → Z,                                                                  (15)

where Z is the set of readings of the device.

   The cumulative Gauss distribution [4] makes a good approximation to θ(p, x). The only parameters that have to be known are the average µ and the standard deviation σ. This gives an effective way of making estimates of θ(p, x). The precise meaning of the estimates can be found in Conjecture 1.4.

   My work offers a theoretical advance in the study of large complex systems through entropy analysis. The potential applications will be in sciences that deal with complex systems, like economics, genetics, biology, paleontology, and psychology. My theory explains some hidden relations between entropies of observed processes in a system. Also, my theory can give insight into the object of study from incomplete information. This is an important problem to solve and a valuable contribution to science, according to my mentor, who is an expert in this field.






2       Methods

All of the conjectures were obtained with the help of Mathematica. My main theoretical tool is the theory of generating functions [22].

Definition 2.1. Let a_k, k ≥ 0, be a sequence of numbers. The generating function corresponding to a_k is the formal power series Σ_{k≥0} a_k t^k.

   My knowledge of Stirling numbers (see Definition 1.11) also comes from [22]. I also used the Jensen inequality (Theorem 3.4) [6].



3       Results

3.1     Computation of βr and γr

The main result of this section is a pair of explicit formulae for $\beta_r$ (see formula (7)) and $\gamma_r$ (see formula (6)):

\[ \beta_r = \frac{\omega(r,1)}{r B_r}, \qquad \gamma_r = \frac{1}{1 - \dfrac{\omega(r,1)}{r B_r \ln r}} \tag{16} \]


where

\[ \omega(r,1) = r! \sum_{i=0}^{r-1} \frac{B_i \ln(r-i)}{i!\,(r-i-1)!} \tag{17} \]
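These formulae are straightforward to evaluate; the sketch below (my own check, written in Python rather than Mathematica) computes $\omega(r,1)$, $\beta_r$, and $\gamma_r$ from the Bell-number recurrence, with $\gamma_r$ normalized by $\ln r$ as in formula (37).

```python
from math import comb, factorial, log

def bell(n):
    # Bell numbers via B_m = sum_{k=0}^{m-1} C(m-1, k) B_k, with B_0 = 1
    B = [1]
    for m in range(1, n + 1):
        B.append(sum(comb(m - 1, k) * B[k] for k in range(m)))
    return B[n]

def omega1(r):
    # formula (17): omega(r,1) = r! sum_{i=0}^{r-1} B_i ln(r-i) / (i! (r-i-1)!)
    return factorial(r) * sum(
        bell(i) * log(r - i) / (factorial(i) * factorial(r - i - 1)) for i in range(r)
    )

def beta(r):
    return omega1(r) / (r * bell(r))

def gamma(r):
    return 1.0 / (1.0 - omega1(r) / (r * bell(r) * log(r)))

print(beta(3), gamma(3))
```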


    I set some notation. The probability of $Y_i$ is $\frac{k_i}{r} = \frac{\# Y_i}{r}$, and the entropy of $Y$ is $H(Y) = H(h_r, Y) = -\sum_{i=1}^{l} \frac{k_i}{r} \ln \frac{k_i}{r}$. After some simplification $H(Y)$ becomes

\[ H(Y) = \ln r - \frac{1}{r}\, \lambda(Y) \tag{18} \]

where

\[ \lambda(Y) = \lambda(k_1, \dots, k_l) = \ln\!\big( k_1^{k_1} k_2^{k_2} \cdots k_l^{k_l} \big) = \sum_{i=1}^{l} k_i \ln k_i \tag{19} \]




The average entropy is

\[ E(H(h_r, Y)) = \ln r - \frac{\sum_{Y \in P_r} \lambda(Y)}{r B_r} \tag{20} \]

I am interested in calculating the sums

\[ \omega(r,q) = \sum_{Y \in P_r} \lambda(Y)^q, \qquad q \ge 0. \tag{21} \]


The generating function of λ(Y )q with factorial is

                         ∞
                                λ(Y )k sk
     Λ(Y, s) =                            = k1 1 s · · · kl l s
                                             k            k
                                                                                                                    (22)
                                   k!
                        k=0


I will compute the generating function with factorials of the quantities

\[ \Lambda(r,s) = \sum_{Y \in P_r} \Lambda(Y,s) \tag{23} \]


Theorem 3.1.

\[ \sum_{r=0}^{\infty} \frac{\Lambda(r,s)\, t^r}{r!} = e^{F(s,t)} \tag{24} \]

where $F(s,t) = \sum_{r=1}^{\infty} \frac{r^{rs}\, t^r}{r!}$.


Proof.

\[
\begin{aligned}
e^{F(s,t)} &= \sum_{l=0}^{\infty} \frac{F(s,t)^l}{l!}
 = \sum_{l=0}^{\infty} \frac{1}{l!} \left( \sum_{k=1}^{\infty} \frac{k^{ks} t^k}{k!} \right)^{\!l} \\
&= \sum_{l=0}^{\infty} \frac{1}{l!} \sum_{k_1=1}^{\infty} \frac{k_1^{k_1 s} t^{k_1}}{k_1!} \sum_{k_2=1}^{\infty} \frac{k_2^{k_2 s} t^{k_2}}{k_2!} \cdots \sum_{k_l=1}^{\infty} \frac{k_l^{k_l s} t^{k_l}}{k_l!} \\
&= \sum_{l=0}^{\infty} \frac{1}{l!} \sum_{k_1 \ge 1, \dots, k_l \ge 1} \frac{k_1^{k_1 s} k_2^{k_2 s} \cdots k_l^{k_l s}\; t^{k_1 + k_2 + \cdots + k_l}}{k_1!\, k_2! \cdots k_l!} \\
&= \sum_{l=0}^{\infty} \frac{1}{l!} \sum_{1 \le k_1 \le k_2 \le \cdots \le k_l} \frac{l!}{c_1!\, c_2! \cdots} \,\frac{k_1^{k_1 s} k_2^{k_2 s} \cdots k_l^{k_l s}\; t^{k_1 + k_2 + \cdots + k_l}}{k_1!\, k_2! \cdots k_l!}
\end{aligned}
\tag{25}
\]





The coefficient $c_i$ is the number of the $k_j$ that are equal to $i$. After some obvious simplifications the formula above becomes:

\[ e^{F(s,t)} = \sum_{r=0}^{\infty} \frac{t^r}{r!} \sum_{\substack{k_1 \le k_2 \le \cdots \le k_l \\ k_1 + \cdots + k_l = r}} \frac{r!}{c_1!\, c_2! \cdots}\, \frac{k_1^{k_1 s} k_2^{k_2 s} \cdots k_l^{k_l s}}{k_1!\, k_2! \cdots k_l!} \tag{26} \]


    Each partition $Y$ determines a set of numbers $k_i = \# Y_i$. I will refer to $k_1, \dots, k_l$ as the portrait of $\{Y_1, \dots, Y_l\}$. Let me fix one collection of numbers $k_1, \dots, k_l$; I can always assume that the sequence is nondecreasing, $k_1 \le \cdots \le k_l$. Let me count the number of partitions with the given portrait. If the subsets were ordered, the number of partitions would equal $\frac{(k_1 + k_2 + \cdots + k_l)!}{k_1!\, k_2! \cdots k_l!}$. In my case the subsets are unordered, and the number of such unordered partitions is $\frac{(k_1 + k_2 + \cdots + k_l)!}{k_1!\, k_2! \cdots k_l!\, c_1!\, c_2! \cdots}$, where $c_i$ is the number of subsets with cardinality $i$. The function $\Lambda(Y,s)$ depends only on the portrait of $Y$. From this I conclude that

\[ \sum_{Y \in P_r} \Lambda(Y,s) = \sum_{\substack{k_1 \le k_2 \le \cdots \le k_l \\ k_1 + \cdots + k_l = r}} \frac{(k_1 + k_2 + \cdots + k_l)!\; k_1^{k_1 s} k_2^{k_2 s} \cdots k_l^{k_l s}}{k_1!\, k_2! \cdots k_l!\, c_1!\, c_2! \cdots} \tag{27} \]

which completes the proof.
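Theorem 3.1 can also be checked numerically: the standard power-series recurrence $n f_n = \sum_k k\, g_k f_{n-k}$ for $F = e^G$ expands $e^{F(s,t)}$, and the coefficients can be compared against a brute-force sum of $\Lambda(Y,s)$ over all set partitions. The sketch below is my own verification, not part of the proof.

```python
from math import factorial

def set_partitions(elems):
    """Yield every partition of the list elems as a list of blocks."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):                       # first joins an existing block
            yield part[:i] + [part[i] + [first]] + part[i + 1:]
        yield part + [[first]]                           # or starts a new block

def lam_sum(r, s):
    # Lambda(r, s) = sum over partitions Y in P_r of k_1^{k_1 s} ... k_l^{k_l s}
    total = 0.0
    for Y in set_partitions(list(range(r))):
        prod = 1.0
        for block in Y:
            k = len(block)
            prod *= k ** (k * s)
        total += prod
    return total

def exp_series_coeffs(s, N):
    # Coefficients f_n of e^{F(s,t)}, F(s,t) = sum_{k>=1} k^{ks} t^k / k!,
    # via the recurrence n f_n = sum_{k=1}^{n} k g_k f_{n-k} for F = e^G.
    g = [0.0] + [k ** (k * s) / factorial(k) for k in range(1, N + 1)]
    f = [1.0] + [0.0] * N
    for n in range(1, N + 1):
        f[n] = sum(k * g[k] * f[n - k] for k in range(1, n + 1)) / n
    return f

s, N = 1.0, 6
f = exp_series_coeffs(s, N)
for r in range(N + 1):
    assert abs(f[r] * factorial(r) - lam_sum(r, s)) < 1e-6 * max(1.0, lam_sum(r, s))
print("Theorem 3.1 verified numerically up to r =", N)
```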

    Note that upon the substitution $s = 0$, formula (24) becomes a classical generating function

\[ \sum_{k \ge 0} \frac{B_k t^k}{k!} = e^{e^t - 1} \tag{28} \]

(see [22]).
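This specialization is easy to verify: expanding $\exp(e^t - 1)$ as a formal power series in exact rational arithmetic recovers the Bell numbers. The sketch below (mine) uses the standard recurrence $n f_n = \sum_k k\, g_k f_{n-k}$ for the exponential of a power series $F = e^G$.

```python
from fractions import Fraction
from math import factorial

N = 10
# G(t) = e^t - 1: coefficients g_k = 1/k! for k >= 1, and g_0 = 0
g = [Fraction(0)] + [Fraction(1, factorial(k)) for k in range(1, N + 1)]
# F = e^G satisfies F' = G' F, which gives n f_n = sum_{k=1}^{n} k g_k f_{n-k}
f = [Fraction(1)] + [Fraction(0)] * N
for n in range(1, N + 1):
    f[n] = sum(k * g[k] * f[n - k] for k in range(1, n + 1)) / n
bell = [int(f[n] * factorial(n)) for n in range(N + 1)]
print(bell)  # [1, 1, 2, 5, 15, 52, 203, 877, 4140, 21147, 115975]
```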
                                                                                      ∞ tr
    This lets me find the generating function $\sum_{r=0}^{\infty} \frac{t^r}{r!}\, \omega(r,1)$.


Proposition 3.2.

\[ \sum_{r=0}^{\infty} \frac{t^r \omega(r,1)}{r!} = e^{e^t - 1} \sum_{k=1}^{\infty} \frac{t^k \ln k}{(k-1)!} \tag{29} \]






Proof. Using equations (21), (22), and (23) I find that

\[ \sum_{r=0}^{\infty} \frac{\partial}{\partial s} \frac{\Lambda(r,s)\, t^r}{r!} \bigg|_{s=0} = \sum_{r=0}^{\infty} \frac{t^r}{r!} \sum_{Y \in P_r} \lambda(Y) = \sum_{r=0}^{\infty} \frac{\omega(r,1)\, t^r}{r!}. \]

Alternatively, I find the partial derivative $\sum_{r=0}^{\infty} \frac{\partial}{\partial s} \frac{\Lambda(r,s)\, t^r}{r!} \big|_{s=0}$ with the chain rule applied to the right-hand side of (24): $\frac{\partial}{\partial s}\big[ e^{F(s,t)} \big]\big|_{s=0} = e^{F(s,t)}\big|_{s=0}\, \frac{\partial}{\partial s}\big[ F(s,t) \big]\big|_{s=0}$. Note that $F(s,t)\big|_{s=0} = e^t - 1$ and $\frac{\partial}{\partial s}\big[ F(s,t) \big]\big|_{s=0} = \sum_{k=1}^{\infty} \frac{t^k\, k \ln k}{k!}$. From this I infer that

\[ \frac{\partial}{\partial s}\big[ e^{F(s,t)} \big]\bigg|_{s=0} = e^{e^t - 1} \sum_{k=1}^{\infty} \frac{t^k\, k \ln k}{k!}. \]




                                                                                                          t −1       ∞ Bn tn
    I want to find am explicit formula for ω(r, 1). To my advantage I know that ee                                =   n=0 n! ,

where Bn is the Bell number or the number of unordered partitions that could be made my of a

set of n elements [22]. To find ω(r, 1) I will expand equation (29).

\[
\begin{aligned}
\sum_{r=0}^{\infty} \frac{t^r \omega(r,1)}{r!} &= \sum_{n=0}^{\infty} \frac{B_n t^n}{n!} \sum_{k=1}^{\infty} \frac{\ln k\; t^k}{(k-1)!} \\
&= \frac{B_0 \ln 2}{0!\,1!}\, t^2 + \Big( \frac{B_1 \ln 2}{1!\,1!} + \frac{B_0 \ln 3}{0!\,2!} \Big) t^3 + \Big( \frac{B_2 \ln 2}{2!\,1!} + \frac{B_1 \ln 3}{1!\,2!} + \frac{B_0 \ln 4}{0!\,3!} \Big) t^4 + \cdots
\end{aligned}
\tag{30}
\]

Since equal power series have equal Taylor coefficients, I conclude that formula (17) is valid. Formulae (16) follow from (20), (5), (6), and (7).
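As an independent sanity check (mine, not in the paper), formula (17) can be compared against the defining sum (21): $\omega(r,1) = \sum_{Y \in P_r} \lambda(Y)$, evaluated by brute-force enumeration of all set partitions.

```python
from math import comb, factorial, log

def set_partitions(elems):
    """Yield every partition of the list elems as a list of blocks."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [part[i] + [first]] + part[i + 1:]
        yield part + [[first]]

def bell(n):
    B = [1]
    for m in range(1, n + 1):
        B.append(sum(comb(m - 1, k) * B[k] for k in range(m)))
    return B[n]

def omega1_closed(r):
    # formula (17)
    return factorial(r) * sum(
        bell(i) * log(r - i) / (factorial(i) * factorial(r - i - 1)) for i in range(r)
    )

def omega1_brute(r):
    # definition (21) with q = 1: sum over Y of lambda(Y) = sum_i k_i ln k_i
    return sum(
        sum(len(block) * log(len(block)) for block in Y)
        for Y in set_partitions(list(range(r)))
    )

for r in range(2, 8):
    assert abs(omega1_closed(r) - omega1_brute(r)) < 1e-6
print("formula (17) matches the brute-force sum for r = 2, ..., 7")
```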

    Using the first and second derivatives of equation (11) at $s = 0$ I find $\sigma(q_{3,r})$:

\[
\sigma(q_{3,r}) = \Bigg[
-\frac{4 B_{t+1}^2 \ln^2 2}{B_{t+3}^2}
+\frac{8 B_{t+1} B_{t+2} \ln^2 2}{B_{t+3}^2}
-\frac{4 B_{t+2}^2 \ln^2 2}{B_{t+3}^2}
-\frac{4 B_{t+1} \ln^2 2}{3 B_{t+3}}
+\frac{4 B_{t+2} \ln^2 2}{3 B_{t+3}}
+\frac{4 B_{t+1}^2 \ln 2 \ln 3}{B_{t+3}^2}
-\frac{4 B_{t+1} B_{t+2} \ln 2 \ln 3}{B_{t+3}^2}
-\frac{B_{t+1}^2 \ln^2 3}{B_{t+3}^2}
+\frac{B_{t+1} \ln^2 3}{B_{t+3}}
\Bigg]^{1/2}
\tag{31}
\]

3.2     Bell Trials

I introduce a sequence of numbers

\[ p_i = \frac{(r-1)!\, B_i}{B_r\, i!\, (r-i-1)!}, \qquad i = 0, \dots, r-1. \tag{32} \]



The sequence $p = (p_0, \dots, p_{r-1})$ satisfies $p_i \ge 0$ and $\sum_{i=0}^{r-1} p_i = 1$; this follows from the recursion formula $\sum_{i=0}^{r-1} \frac{(r-1)!\, B_i}{i!\,(r-i-1)!} = B_r$ [22]. I refer to a random variable $\xi$ with this probability distribution as a Bell trial. Note that the average of $\ln(r - \xi)$ is equal to $\omega(r,1)/(r B_r)$.
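A quick numerical check of this distribution (my own sketch, in exact rational arithmetic): the weights of formula (32) sum to 1, and the first two moments of $r - \xi$ agree with Proposition 3.3.

```python
from fractions import Fraction
from math import comb, factorial

def bell(n):
    B = [1]
    for m in range(1, n + 1):
        B.append(sum(comb(m - 1, k) * B[k] for k in range(m)))
    return B[n]

def bell_trial_weights(r):
    # formula (32): p_i = (r-1)! B_i / (B_r i! (r-i-1)!), i = 0, ..., r-1
    return [
        Fraction(factorial(r - 1) * bell(i),
                 bell(r) * factorial(i) * factorial(r - i - 1))
        for i in range(r)
    ]

r = 6
p = bell_trial_weights(r)
assert sum(p) == 1  # the recursion formula for B_r
m1 = sum((r - i) * p[i] for i in range(r))
m2 = sum((r - i) ** 2 * p[i] for i in range(r))
print(m1, m2)
```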

Proposition 3.3.

1. $\displaystyle \sum_{i=0}^{r-1} (r-i)\, p_i = \mu_{r-1} = \frac{(r-1) B_{r-1} + B_r}{B_r}$

2. $\displaystyle \sum_{i=0}^{r-1} (r-i)^2\, p_i = \frac{(r-2)(r-1) B_{r-2} + 3(r-1) B_{r-1} + B_r}{B_r}$

Proof. I will compute the generating function of $S_r(x) = \sum_{i=0}^{r} \frac{r!\, B_i\, x^{r-i+1}}{i!\,(r-i)!}$ instead. Note that $S_r'(x)\big|_{x=1} = B_{r+1}\, \mu_r$.

\[ \sum_{r=0}^{\infty} \frac{S_r(x)\, t^r}{r!} = \sum_{r=0}^{\infty} \frac{1}{r!} \sum_{a+b=r} \frac{(a+b)!\, B_a\, t^a\, x^{b+1} t^b}{a!\, b!} = \sum_{a=0}^{\infty} \frac{B_a t^a}{a!} \sum_{b=0}^{\infty} \frac{x (xt)^b}{b!} = e^{e^t - 1}\, x e^{xt} = x e^{e^t - 1 + xt} \tag{33} \]


I factored the generating function into two series, which conveniently simplify into exponential expressions. Now that I have a closed expression for the generating function, I will differentiate it:

\[ \frac{\partial}{\partial x}\big[ x e^{e^t - 1 + xt} \big]\Big|_{x=1} = (xt + 1)\, e^{e^t - 1 + xt}\Big|_{x=1} = (t+1)\, e^{e^t - 1 + t} \tag{34} \]

Note that the function $\sum_{k \ge 1} \frac{B_k t^{k-1}}{(k-1)!}$ (compare it with formula (28)) is equal to $\big( e^{e^t - 1} \big)' = e^{e^t - 1 + t}$, which implies

\[ t e^{e^t - 1 + t} + e^{e^t - 1 + t} = \sum_{k \ge 2} \frac{(k-1) B_{k-1}\, t^{k-1}}{(k-1)!} + \sum_{k \ge 1} \frac{B_k\, t^{k-1}}{(k-1)!} \tag{35} \]

and the formula for $\mu_{r-1}$ follows by comparing the coefficients of $t^{r-1}$.
    The second moment $\sum_{i=0}^{r-1} (r-i)^2 p_i$ can be computed by the same method. Note that the second moment is equal to $\frac{1}{B_r} \big( x\, S_{r-1}'(x) \big)'\big|_{x=1}$. The generating function with factorials of the second moments is

\[
\begin{aligned}
\frac{\partial}{\partial x}\big[ x (xt+1)\, e^{e^t - 1 + xt} \big]\Big|_{x=1} &= (t^2 x^2 + 3tx + 1)\, e^{e^t - 1 + xt}\Big|_{x=1} = (t^2 + 3t + 1)\, e^{e^t - 1 + t} \\
&= \sum_{k \ge 3} \frac{(k-2)(k-1) B_{k-2}\, t^{k-1}}{(k-1)!} + \sum_{k \ge 2} \frac{3(k-1) B_{k-1}\, t^{k-1}}{(k-1)!} + \sum_{k \ge 1} \frac{B_k\, t^{k-1}}{(k-1)!}
\end{aligned}
\tag{36}
\]

Comparing the coefficients of $t^{r-1}$ proves the second formula.








Theorem 3.4 (Jensen's Inequality [6], [18]). For any concave function $f : \mathbb{R} \to \mathbb{R}$ and any weights $q_i \ge 0$ with $\sum_{i=1}^{r} q_i = 1$, the inequality

\[ \sum_{i=1}^{r} f(i)\, q_i \le f\Big( \sum_{i=1}^{r} i\, q_i \Big) \]

holds.


    I want to apply this theorem to the concave function $\ln(r - x)$:

Corollary 3.5. There is an inequality with $p_i$ as in (32):

\[ \sum_{i=0}^{r-1} \ln(r-i)\, p_i < \ln\Big( 1 + \frac{(r-1) B_{r-1}}{B_r} \Big) \]


Proof. Follows from Jensen’s Inequality and Proposition 3.3.
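The inequality of Corollary 3.5 is easy to confirm numerically; the sketch below (mine) evaluates both sides for a few values of $r$.

```python
from math import comb, log

def bell(n):
    B = [1]
    for m in range(1, n + 1):
        B.append(sum(comb(m - 1, k) * B[k] for k in range(m)))
    return B[n]

def jensen_sides(r):
    # weights of formula (32), rewritten with binomials: p_i = C(r-1, i) B_i / B_r
    p = [comb(r - 1, i) * bell(i) / bell(r) for i in range(r)]
    lhs = sum(log(r - i) * p[i] for i in range(r))
    rhs = log(1 + (r - 1) * bell(r - 1) / bell(r))
    return lhs, rhs

for r in (3, 6, 12, 24):
    lhs, rhs = jensen_sides(r)
    assert lhs < rhs
print("Corollary 3.5 holds for r = 3, 6, 12, 24")
```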

Proposition 3.6. $\lim_{r \to \infty} \gamma_r = 1$.

Proof. Corollary 3.5 implies

\[ \gamma_r = \frac{1}{1 - \dfrac{\omega(r,1)}{r B_r \ln r}} < \frac{1}{1 - \dfrac{\ln\!\big( 1 + \frac{(r-1) B_{r-1}}{B_r} \big)}{\ln r}} \tag{37} \]

It is easy to see that $B_r \ge 2 B_{r-1}$, since I always have the choice of whether or not to add $r$ to the same part as $r-1$. This implies that $\frac{B_{r-1}}{B_r} \le \frac{1}{2^{r-1}}$. From this I conclude that

\[ 1 \le \gamma_r < \frac{1}{1 - \dfrac{\ln\!\big( 1 + \frac{r-1}{2^{r-1}} \big)}{\ln r}} \]

and $\lim_{r \to \infty} \gamma_r = 1$.
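The convergence $\gamma_r \to 1$ can also be observed numerically; the sketch below (my own check) evaluates $\gamma_r$ from formulae (16) and (17), with the normalization by $\ln r$ used in (37). The approach to 1 is slow.

```python
from math import comb, factorial, log

def bell(n):
    # Bell numbers via B_m = sum_{k=0}^{m-1} C(m-1, k) B_k, with B_0 = 1
    B = [1]
    for m in range(1, n + 1):
        B.append(sum(comb(m - 1, k) * B[k] for k in range(m)))
    return B[n]

def gamma(r):
    # gamma_r = 1 / (1 - omega(r,1) / (r B_r ln r)), omega(r,1) as in formula (17)
    omega = factorial(r) * sum(
        bell(i) * log(r - i) / (factorial(i) * factorial(r - i - 1)) for i in range(r)
    )
    return 1.0 / (1.0 - omega / (r * bell(r) * log(r)))

for r in (5, 10, 20, 40, 80):
    print(r, gamma(r))  # decreases slowly toward 1
```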



4    Discussion and Conclusion

I was not able to prove all of the conjectures I made. I have made some steps (Proposition 3.6) toward the proof of Corollary 1.8 of Conjecture 1.6, that the ratio of the maximum entropy to the average entropy is close to one. Another finding is that the difference between the maximum entropy and the average entropy of partitions grows slowly as #X increases; I conjecture that the difference has magnitude ln ln #X. I have also computed the standard deviation, formula (31), which is conjecturally the greatest value of σ(p).



   My short-term goal is to prove these conjectures. The more challenging goal is to add dynamics to the system being studied and to identify self-organizing subsystems among the low-entropy subsystems.

   I am grateful for the support and training of my mentor, Dr. Rostislav Matveyev, on this

research.



References

 [1] W. R. Ashby. An Introduction To Cybernetics. John Wiley and Sons Inc, 1966.

 [2] R. Beebe. Jupiter the Giant Planet. Smithsonian Books, Washington, 2 edition, 1997.

 [3] C. Bettstetter and C. Gershenson, editors. Self-Organizing Systems, volume 6557 of Lecture

    Notes in Computer Science. Springer, 2011.

 [4] W. Bryc. The Normal Distribution: Characterizations with Applications. Springer-Verlag,

    1995.

 [5] R. Clausius. Über die Wärmeleitung gasförmiger Körper. Annalen der Physik, 125:353–400,

     1865.

 [6] T.M. Cover and J.A. Thomas. Elements of Information Theory. Wiley, 1991.

 [7] S.R. de Groot and P. Mazur. Non-Equilibrium Thermodynamics. Dover, 1984.

 [8] D.P. Kroese, T. Taimre, and Z.I. Botev. Handbook of Monte Carlo Methods. John Wiley & Sons,

     New York, 2011.

 [9] H. Haken. Synergetics: Introduction and Advanced Topics. Springer, Berlin, 2004.

[10] S.A. Kauffman. The Origins of Order. Oxford University Press, 1993.

[11] Mathematica. www.wolfram.com.

[12] H. Meinhardt. Models of biological pattern formation: from elementary steps to the organi-

    zation of embryonic axes. Curr. Top. Dev. Biol., 81:1–63, 2008.

[13] D. Morrison. Exploring Planetary Worlds. W. H. Freeman, 1994.



[14] I. Prigogine. Non-Equilibrium Statistical Mechanics. Interscience Publishers, 1962.

[15] I. Prigogine. Introduction to Thermodynamics of Irreversible Processes. John Wiley and Sons,

    1968.

[16] I. Prigogine and G. Nicolis. Self-Organization in Nonequilibrium Systems: From Dissipative

    Structures to Order through Fluctuations. John Wiley and Sons, 1977.

[17] G.C. Rota. The number of partitions of a set. American Mathematical Monthly, 71(5):498–504,

    1964.

[18] W. Rudin. Real and Complex Analysis. McGraw-Hill, 1987.

[19] E. Schrödinger. What Is Life? Cambridge University Press, 1992.

[20] C. E. Shannon. A mathematical theory of communication. The Bell System Technical Journal,

     27:379–423, 1948.

[21] N. J. A. Sloane and others. The On-Line Encyclopedia of Integer Sequences, oeis.org.

[22] R.P. Stanley. Enumerative Combinatorics, volumes 1–2. Cambridge University Press, 1997.





Bachelor's Thesis
 
Random 3-manifolds
Random 3-manifoldsRandom 3-manifolds
Random 3-manifolds
 
Problems and solutions statistical physics 1
Problems and solutions   statistical physics 1Problems and solutions   statistical physics 1
Problems and solutions statistical physics 1
 
2019 PMED Spring Course - SMARTs-Part II - Eric Laber, April 10, 2019
2019 PMED Spring Course - SMARTs-Part II - Eric Laber, April 10, 2019 2019 PMED Spring Course - SMARTs-Part II - Eric Laber, April 10, 2019
2019 PMED Spring Course - SMARTs-Part II - Eric Laber, April 10, 2019
 
Duffing oscillator and driven damped pendulum
Duffing oscillator and driven damped pendulumDuffing oscillator and driven damped pendulum
Duffing oscillator and driven damped pendulum
 
Stochastic Schrödinger equations
Stochastic Schrödinger equationsStochastic Schrödinger equations
Stochastic Schrödinger equations
 
Measuring Social Complexity and the Emergence of Cooperation from Entropic Pr...
Measuring Social Complexity and the Emergence of Cooperation from Entropic Pr...Measuring Social Complexity and the Emergence of Cooperation from Entropic Pr...
Measuring Social Complexity and the Emergence of Cooperation from Entropic Pr...
 
Be2419772016
Be2419772016Be2419772016
Be2419772016
 
thesis_jordi
thesis_jordithesis_jordi
thesis_jordi
 
A baisc ideas of statistical physics.pptx
A baisc ideas of statistical physics.pptxA baisc ideas of statistical physics.pptx
A baisc ideas of statistical physics.pptx
 
On New Root Finding Algorithms for Solving Nonlinear Transcendental Equations
On New Root Finding Algorithms for Solving Nonlinear Transcendental EquationsOn New Root Finding Algorithms for Solving Nonlinear Transcendental Equations
On New Root Finding Algorithms for Solving Nonlinear Transcendental Equations
 
Random vibrations
Random vibrationsRandom vibrations
Random vibrations
 

Partitions and entropies intel third draft

  • 1. Statistical Properties of the Entropy Function of a Random Partition Anna Movsheva Contents 1 Introduction 1 1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Research Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.3 Hypothesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.3.1 General properties of θ(p, x) . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.3.2 Functions µ(p) and σ(p) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.3.3 The Generating Function of Momenta . . . . . . . . . . . . . . . . . . . . . . 10 1.3.4 Discussion of Conjecture 1.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 1.4 Significance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2 Methods 14 3 Results 14 3.1 Computation of βr and γr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3.2 Bell Trials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 4 Discussion and Conclusion 19
the mathematical properties of a model in which a finite set X with a probability distribution {p_x | x ∈ X} encodes the set of states of the system S. In this model a partition X = \bigcup_{i=1}^{l} Y_i represents a subsystem with the set of probabilities p(Y_i) = \sum_{x ∈ Y_i} p_x. In this paper I study the entropy function H(p, Y) = −\sum_i p(Y_i) \ln p(Y_i) of a random partition Y. In particular, I study the counting function Θ(p, x) = #{Y | H(p, Y) ≤ x}. Using computer simulations, I give evidence that the normalized function θ(p, x) = Θ(p, x)/Θ(p, H(p, X)) can asymptotically be approximated by the cumulative Gauss distribution

\frac{1}{\sqrt{2π}\,σ(p)} \int_{−∞}^{x} \exp\left(−\frac{(t − µ(p))^2}{2σ(p)^2}\right) dt.

I state my findings in the form of falsifiable conjectures, some of which I partly prove. The asymptotics explain a strong correlation between µ(p), the average entropy of a random partition of X, and the entropy H(p, X). Since the quantity µ(p) is usually available in practice, I can give an estimate for H(p, X) when it is not directly computable.
1 Introduction

1.1 Background

One of the main problems of theoretical biology and theoretical physics is to reconcile the theory of evolution with statistical mechanics and thermodynamics. Ilya Prigogine was the first to make fundamental contributions to the solution of this problem. He advocated that living organisms are open self-organizing thermodynamic systems with low entropy. These open systems are part of a large closed system S. Since I am interested in open self-organizing thermodynamic systems, it is important to know the number of subsystems within S that have low entropy. In my work I studied this question from the mathematical point of view. In my simplified approach the configuration space of S is a finite set X with a probability distribution, and a subsystem is interpreted as a partition of X. I studied a function that, for a given x, counts the number of partitions of X whose entropy does not exceed x. My approach is rather general, because any configuration space can be approximated by a sufficiently large but finite set.

The controversy between classical biology and physics has a long history. It revolves around the paradox that physical processes are reversible while biological processes are not. Boltzmann, in the process of working on this dilemma, laid the foundation of statistical physics. He put forward the notion of entropy, which characterizes the degree of disorder in a statistical system. The second law of thermodynamics in Boltzmann's formulation states that the entropy of a closed system cannot decrease, which makes time in a statistical system irreversible. The solution of the problem of the irreversibility of time did not completely eliminate the contradiction: the second law of thermodynamics seems to forbid the long-term existence of organized systems, such as living organisms.
Schrödinger in his book [19] (Chapter 6) pointed out that the entropy can go down in an open system, that is, a system that can exchange mass and energy with its surroundings. Prigogine in his groundbreaking works [15, 14, 16] showed that self-organization (decrease of entropy) can be achieved dynamically. His discovery laid the foundation of non-equilibrium statistical mechanics. The most interesting self-organizing systems exist far away from equilibrium and are non-static by their nature. There is a vast literature on self-organization (see e.g. [16, 10, 9, 12] and the references therein).
Current research is focused on the detailed study of individual examples of self-organization and is very successful (see e.g. [3]). In this work I changed the perspective. My motivating problem is rather general: to estimate the total number of self-organizing subsystems in a thermodynamically closed system. Self-organizing subsystems are the most interesting specimens of the class of subsystems with low entropy. This motivates my interest in estimating the number of subsystems with low entropy; knowing this number, the number of self-organizing subsystems can be assessed. A problem posed in such generality looks very hard, so I made a series of simplifications that let me progress in this direction.

Ashby in [1] argued that any system S can be thought of as a "machine". His idea is that the configuration space of S can be approximated by a set, or alphabet, X, with the dynamics given by a transition rule T_X : X → X. A homomorphism between machines S = (X, T_X) and Q = (Z, T_Z) is a map ψ : X → Z such that ψ T_X = T_Z ψ. Homomorphisms are useful in the analysis of complicated systems (see [1] for details). A submachine, according to [1], is a subset X′ ⊂ X that is invariant with respect to T_X; I never use this definition in this paper. In my definition a submachine is a homomorphic image ψ : (X, T_X) → (Z, T_Z). For example, if a machine (X, T) consists of N non-interacting submachines (X_1, T_1), ..., (X_N, T_N), then X = X_1 × ··· × X_N and T = T_1 × ··· × T_N. The projections ψ_i(x_1, ..., x_N) = x_i are homomorphisms of machines. This reflects the fact that the configuration space of a union of non-interacting systems is a product (not a union) of the configuration spaces of the components.

Definition 1.1. A collection of subsets Y = {Y_z | z ∈ Z}, such that Y_z ∩ Y_{z′} = ∅ for z ≠ z′ and \bigcup_{z ∈ Z} Y_z = X, is a partition of the finite set X, where r = #X. Let k_z denote the cardinality #Y_z.
In this paper I use the notation X = ⊔_z Y_z. Any homomorphism ψ : (X, T_X) → (Z, T_Z) defines a partition of X with Y_z equal to {x ∈ X | ψ(x) = z}. In fact, up to relabeling the elements of Z, a homomorphism is the same thing as a partition. This also explains why I am interested in counting partitions. Ashby in [1] argued that a machine (X, T) is a limiting case of a more realistic Markov process, in which the deterministic transition rules x → T(x) are replaced by random transition rules x → T̃(x). The dynamics of the process is completely determined by the probabilities {p_{x′,x} | x, x′ ∈ X} of passing from state x to state x′ and by the initial probability distribution {p_x | x ∈ X}. Markov processes have been studied in the theory of information, developed originally in [20].
Yet there is still another way to interpret the quantities that I would like to compute. A submachine can also be interpreted as a scientific device. This can be understood through the example of a hurricane on Jupiter [2]. One can analyze the hurricane in a multitude of ways: visually through the lenses of a telescope, by recording the fluctuations of winds with a probe, or by capturing the fluctuations of the magnetic field around the hurricane. Every method of analysis (device) gives statistical data that in turn yields a respective entropy. If (X, p) is the space of states of the hurricane, then ψ : X → Z is a function whose set of values is the set of readings of the scientific device. It automatically leads to a partition of X, as explained above. The list of known scientific methods in planetary science is enormous [13], and any additional method contributes something to the knowledge. Yet a full understanding of the subject would only be possible if one used all possible methods ψ. This, however, is not going to happen in planetary science in the near future: the set of states X of the Jovian atmosphere is colossal, which makes the set of all conceivable methods of its study (devices) even bigger. Still, imagine that all the mentioned troubles were nonexistent. It would be interesting to count the number of scientific devices that yield statistical data about the hurricane with entropy no greater than a given value. It would also be interesting to know their average entropy. This is a dream; I did just that in my oversimplified model.

1.2 Research Problem

In the following, the set X will be {1, ..., r}. Let p be a probability distribution on X, that is, a collection of numbers p_i ≥ 0 such that \sum_{i=1}^{r} p_i = 1. The array p = (p_1, ..., p_r) is said to be a probability vector. The probability of Y_i in the partition X = ⊔_i Y_i is p(Y_i) = \sum_{j ∈ Y_i} p_j.

Definition 1.2.
The entropy of a partition Y = {Y_1, ..., Y_l} is calculated by the expression H(p, Y) = −\sum_{i=1}^{l} p(Y_i) \ln p(Y_i). In this definition the function x ln x is extended to x = 0 by continuity: 0 ln 0 = 0.

Here are some examples of entropies: H(p, Y_max) = −\sum_{i=1}^{r} p_i \ln p_i for Y_max = {{1}, ..., {r}}, and H(p, Y_min) = 0 for Y_min = {{1, ..., r}}. One of the properties of the entropy function (see [6]) is
that

H(p, Y_min) ≤ H(p, Y) ≤ H(p, Y_max)  for any Y ∈ P_r.   (1)

It is clear from the previous discussion that Θ(p, x) = #{Y ∈ P_r | H(p, Y) ≤ x} is identical to the function defined in the abstract. The Bell number B_r ([22], [17]) is the cardinality of P_r. The value Θ(p, H(p, Y_max)) = Θ(p, H(p, id)), thanks to (1), coincides with B_r. From this I conclude that

θ(p, x) = #{Y ∈ P_r | H(p, Y) ≤ x} / B_r

is the function defined in the abstract. My main goal is to find a simple approximation to θ(p, x).

1.3 Hypothesis

In this section I formulate the conjectures that I obtained with the computing software Mathematica [11].

Remark 1.3. I equip the set P_r with the probability distribution P such that P(Y) = 1/B_r for every Y ∈ P_r. The value of the function θ(p, x) is then the probability that a random partition Y has entropy ≤ x. This explains the adjective "random" in the title of the paper.

In order to state the main result I need some notation:

p[k] = (p_1, ..., p_r, 0, ..., 0)  (k zeros appended),   (2)

where p = (p_1, ..., p_r) is a probability vector. From the set of moments of the entropy of a random partition,

E(H^l(p, Y)) = \frac{1}{B_r} \sum_{Y ∈ P_r} H^l(p, Y),   (3)

I use the first two to define the average µ(p) = E(H(p, Y)) and the standard deviation σ(p) =
  • 7. Movsheva, Anna E(H(p, Y )2 ) − E(H(p, Y ))2 . Conjecture 1.4. Let p be a probability distribution on {1, . . . , r}. Then ∞ (x−µ)2 1 lim E(H l (p[k], Y )) − √ xl e − 2σ dx = 0 k→∞ 2πσ −∞ with µ = µ(p[k]), σ = σ(p[k])and for any integer l ≥ 0. Practically this means that the cumulative normal distribution x (x−µ)2 1 Erf(x, µ, σ) = √ e− 2σ 2πσ −∞ with µ = µ(p[k]), σ = σ(p[k]) makes a good approximation to θ(p[k], x) for large k. The initial study of the function θ(p, x) has been done with the help of Mathematica. The software can effectively compute the quantities associated with set X whose cardinality does not exceed ten. 1.3.1 General properties of θ(p, x) The plots of some typical graphs are presented in Figure 1.1. These were done with a help of Mathematica. 1.2 1.0 0.8 0.6 0.4 0.2 0.0 0.5 1.0 1.5 2.0 2.5 3.0 Figure 1.1: Graphs of θ(p, x), θ(q, x). The continuous line on the graph corresponds to θ(p, x) with p = (0.082, 0.244, 0.221, 0.093, 0.052, 0.094, 0.079, 0.130) The step function corresponds to q = ( 8 , . . . , 1 ). Large steps are common for θ(q, x) when q has 1 8 5
symmetries. A symmetry of q is a permutation τ of X such that q_{τ(x)} = q_x for all x ∈ X. Indeed, if I take a symmetry and act with it on a partition, I get another partition with the same entropy. In this way I can produce many partitions with equal entropies; hence the high steps in the graph.

The effect of the operation p → p[1] (2) on θ(p, x) is surprising. Here are the typical graphs:

[Figure 1.2: Graphs of θ(p, x), θ(p[1], x), θ(p[2], x) for some randomly chosen p = (p_1, ..., p_6).]

The reader can see that the graphs have the same bending patterns. Also, the graphs lie one over the other. These observations led me to put forth a conjecture that has passed multiple numerical tests.

Conjecture 1.5. For any p, θ(p, x) ≥ θ(p[1], x).

A procedure that plots θ(p, x) is hungry for computer memory. This is why it is worthwhile to find a function that makes a good approximation. I have already mentioned in the introduction that Erf(x, µ(p), σ(p)) approximates θ(p, x) well. For example, if

p = (0.138, 0.124, 0.042, 0.106, 0.081, 0.131, 0.088, 0.138, 0.154),   (4)

the picture below indicates a good agreement of the graphs.
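The graphs above can be reproduced by brute force for small r. A minimal Python sketch (all function names are mine; the paper's own computations were done in Mathematica [11]):

```python
import math

def partitions_of(n):
    """All set partitions of {0, ..., n-1}; the count is the Bell number B_n."""
    parts = [[]]
    for e in range(n):
        new = []
        for P in parts:
            # put e into each existing block, or open a new block
            for i in range(len(P)):
                new.append(P[:i] + [P[i] + [e]] + P[i + 1:])
            new.append(P + [[e]])
        parts = new
    return parts

def entropy(p, P):
    """H(p, Y) = -sum_i p(Y_i) ln p(Y_i), with 0 ln 0 = 0."""
    return -sum(q * math.log(q)
                for q in (sum(p[i] for i in block) for block in P) if q > 0)

def theta(p, x):
    """theta(p, x) = #{Y in P_r : H(p, Y) <= x} / B_r  (feasible for r <= 10)."""
    parts = partitions_of(len(p))
    return sum(1 for P in parts if entropy(p, P) <= x + 1e-12) / len(parts)
```

For r = 4 the enumeration produces B_4 = 15 partitions; θ(p, 0) = 1/15 picks out only Y_min, and θ(p, H(p, Y_max)) = 1, in agreement with inequality (1).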
[Figure 1.3: Erf(x, µ(p), σ(p)) (red) vs θ(p, x) (blue), with p as in (4).]

The reader will find more precise relations between Erf and θ in the following sections.

1.3.2 Functions µ(p) and σ(p)

The good agreement of the graphs of Erf(x, µ(p), σ(p)) and θ(p, x) raises the question of a detailed analysis of the functions µ(p) and σ(p). It turns out that quantities more manageable than µ(p) are

β(p) = H(p, Y_max) − µ(p),   γ(p) = H(p, Y_max)/µ(p).   (5)

The inequality (1) implies that µ(p) ≤ H(p, Y_max), hence β(p) ≥ 0 and γ(p) ≥ 1. Evaluation of the denominator of γ(p) with formula (3) requires intensive computing. On my slow machine I used the Monte-Carlo approximation [8]

µ(p) ≈ \frac{1}{k} \sum_{i=1}^{k} H(p, Y^i),

where the Y^i are independent random partitions. Below are the graphs of β(p_1, p_2, p_3) and γ(p_1, p_2, p_3) plotted in Mathematica. The reader can distinctly see one maximum in the center corresponding to p = (1/3, 1/3, 1/3).

[Figure 1.4: The plot of β(p_1, p_2, 1 − p_1 − p_2). Figure 1.5: The plot of γ(p_1, p_2, 1 − p_1 − p_2).]

A closer look at the plot shows that γ(p_1, p_2, p_3) is not a concave function.
In the following, h_r stands for the probability vector (1/r, ..., 1/r). I came up with a conjecture that has been numerically tested for r ≤ 9.

Conjecture 1.6. The function γ(p_1, ..., p_r) can be extended by continuity to all distributions p. In this bigger domain it satisfies

1 ≤ γ(p) ≤ γ(h_r) =: γ_r.   (6)

Likewise the function β satisfies

0 ≤ β(p) ≤ β(h_r) =: β_r.   (7)

The reader should consult the sections below for alternative ways of computing β_r and γ_r. The following table contains an initial segment of the sequence {γ_r}.

Table 1: Values of γ_r.

  r    2     3     4     5     6     7     8     9    ...  100   ...  1000
  γ_r  2   1.826 1.739 1.691 1.659 1.635 1.617 1.602  ... 1.426  ... 1.341

I see that it is a decreasing sequence. Extensive computer tests have led me to the following conjecture.

Conjecture 1.7. The sequence {γ_r} satisfies γ_r ≥ γ_{r+1} and lim_{r→∞} γ_r = 1.

The limit statement is proved in Proposition 3.6.

Corollary 1.8. lim_{t→∞} γ(p[t]) = 1.

Proof. From Conjecture 1.6 I conclude that 1 ≤ γ(p[t]) ≤ γ_{r+t}. Since lim_{t→∞} γ_{r+t} = 1 by Conjecture 1.7, lim_{t→∞} γ(p[t]) = 1.

Table 2: Values of β_r.

  r    6        7        8        9       ... 100    ...
  β_r  0.711731 0.756053 0.793492 0.825835 ... 1.3943 ...
Conjecture 1.9. The sequence {β_r} satisfies β_r ≤ β_{r+1} and lim_{r→∞} β_r = ∞.

The situation with the standard deviation σ(p) is a bit more complicated. Here is a graph of σ(p_1, p_2, p_3).

[Figure 1.6: Three-dimensional view of the graph of the standard deviation σ(p_1, p_2, p_3) for θ(p, x).]

The reader can clearly see four local maxima. The function σ(p_1, p_2, p_3) is symmetric. The maxima correspond to the point (1/3, 1/3, 1/3) and the permutations of (1/2, 1/2, 0). This led me to think that the local maxima of σ(p_1, ..., p_r) are permutations of q_{k,r} = h_k[r − k], k ≤ r. I tabulated the values of σ(q_{k,r}) for small k and r in the table below.

Table 3: Values of σ(q_{k,r}).

  k\r   3      4      5      6      7      8      9
  2   0.3396 0.3268 0.314  0.3026 0.2924 0.2832 0.275
  3   0.35   0.3309 0.3173 0.3074 0.2992 0.292  0.286
  4   -      0.3254 0.309  0.298  0.29   0.283  0.278
  5   -      -      0.302  0.289  0.28   0.273  0.267
  6   -      -      -      0.283  0.272  0.265  0.258
  7   -      -      -      -      0.267  0.258  0.251
  8   -      -      -      -      -      0.254  0.246
  9   -      -      -      -      -      -      0.242

The reader can see that the third row (k = 3, in bold in the original) has the largest value in each column. It is not hard to see analytically that q_{k,r} is a critical point of σ. My computer experiments led me to the following conjecture.

Conjecture 1.10. The function σ(p) has a global maximum at q_{3,r}.
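The entries of Table 3 can be reproduced by exhaustive enumeration for small r. A sketch (function names are mine) that computes σ(p) exactly and recovers σ(q_{2,3}) ≈ 0.3396 and σ(q_{3,3}) = 0.35:

```python
import math

def partitions_of(n):
    """All set partitions of {0, ..., n-1} (Bell-number many)."""
    parts = [[]]
    for e in range(n):
        new = []
        for P in parts:
            for i in range(len(P)):
                new.append(P[:i] + [P[i] + [e]] + P[i + 1:])
            new.append(P + [[e]])
        parts = new
    return parts

def entropy(p, P):
    """H(p, Y) with the convention 0 ln 0 = 0."""
    return -sum(q * math.log(q)
                for q in (sum(p[i] for i in block) for block in P) if q > 0)

def sigma(p):
    """Standard deviation of H(p, Y) over a uniformly random partition Y."""
    hs = [entropy(p, P) for P in partitions_of(len(p))]
    mu = sum(hs) / len(hs)
    return math.sqrt(sum(h * h for h in hs) / len(hs) - mu * mu)
```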
1.3.3 The Generating Function of Momenta

In order to test Conjecture 1.4 I need an effective way of computing E(H^l(p[k], Y)) for large values of k. In this section I present my computations of E(H^l(p[k], Y)) for small r, which led me to a conjectural formula for E(H^l(p[k], Y)). The factorial generating function of the powers of the entropy can be written compactly:

G(p, Y, s) = \sum_{t=0}^{∞} \frac{H(p, Y)^t s^t}{t!} = \sum_{t=0}^{∞} \frac{\left(−\sum_{i=1}^{l} p(Y_i) \ln p(Y_i)\right)^t s^t}{t!} = \prod_{i=1}^{l} p(Y_i)^{−p(Y_i) s}.   (8)

The function G(p, Y, s) can be extended from P_r to P_{r+1} in the following way. I extend the r-dimensional probability vector p to the (r+1)-dimensional vector p′ by adding a zero coordinate. Any partition Y = {Y_1, ..., Y_l} defines a partition Y′ = {Y_1, ..., Y_l, {r+1}}. Note that G(p, Y, s) = G(p′, Y′, s). The following generating function, after normalization, encodes all the moments of the random partition:

J(p, s) = \sum_{Y ∈ P_r} G(p, Y, s),   J(p, s)/B_r = \sum_{l ≥ 0} E(H^l(p, Y)) \frac{s^l}{l!}.   (9)

I want to explore the effect of the substitution p → p[k] on J(p, s). I use the notation A_t(p, s) = J(p[t], s). Here are the results of my computer experimentation. A set of two non-zero p_i extended by t zeros yields

A_t(p_1, p_2, −s) = B_{t+1} + (B_{t+2} − B_{t+1}) p_1^{p_1 s} p_2^{p_2 s}.   (10)
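Identity (10) can be tested numerically: by (8), G(p[t], Y, −s) = exp(−s H(p[t], Y)), so A_t(p_1, p_2, −s) is a sum of exponentials over all partitions of the padded set. A sketch (names mine; Bell numbers B_0..B_6 hard-coded):

```python
import math

BELL = [1, 1, 2, 5, 15, 52, 203]  # B_0 .. B_6

def partitions_of(n):
    """All set partitions of {0, ..., n-1}."""
    parts = [[]]
    for e in range(n):
        new = []
        for P in parts:
            for i in range(len(P)):
                new.append(P[:i] + [P[i] + [e]] + P[i + 1:])
            new.append(P + [[e]])
        parts = new
    return parts

def entropy(p, P):
    return -sum(q * math.log(q)
                for q in (sum(p[i] for i in block) for block in P) if q > 0)

def lhs(p1, p2, t, s):
    """A_t(p1, p2, -s) = sum over Y in P_{2+t} of exp(-s H(p[t], Y))."""
    p = [p1, p2] + [0.0] * t
    return sum(math.exp(-s * entropy(p, P)) for P in partitions_of(2 + t))

def rhs(p1, p2, t, s):
    """B_{t+1} + (B_{t+2} - B_{t+1}) p1^(p1 s) p2^(p2 s)   (formula (10))."""
    return BELL[t + 1] + (BELL[t + 2] - BELL[t + 1]) * p1 ** (p1 * s) * p2 ** (p2 * s)
```

The two sides agree exactly: a partition of the padded set either keeps the two positive atoms together (entropy 0; there are B_{t+1} such partitions) or separates them (entropy −p_1 ln p_1 − p_2 ln p_2).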
The next case is for three non-zero p_i extended by zeros:

A_t(p_1, p_2, p_3, −s) = B_{t+1}
  + (B_{t+2} − B_{t+1}) \left[ (p_1+p_2)^{s(p_1+p_2)} p_3^{p_3 s} + (p_1+p_3)^{s(p_1+p_3)} p_2^{p_2 s} + (p_2+p_3)^{s(p_2+p_3)} p_1^{p_1 s} \right]
  + (B_{t+3} − 3B_{t+2} + 2B_{t+1}) p_1^{p_1 s} p_2^{p_2 s} p_3^{p_3 s}.   (11)

I found A_t(p, s) for probability vectors p with five or fewer coordinates. In order to generalize the results of my computation I have to fix some notation. With deg Y = deg{Y_1, ..., Y_l} := l, I set J^l(p, s) = \sum_{deg Y = l} G(p, Y, s). For a probability vector p with k coordinates,

A_t(p, s) = \sum_{l=1}^{k} L(l, t) J^l(p, s),   (12)

where the L(l, t) are some coefficients. For example, in the last line of formula (11) the coefficient L(3, t) is B_{t+3} − 3B_{t+2} + 2B_{t+1} and the function J^3(p, s) is p_1^{p_1 s} p_2^{p_2 s} p_3^{p_3 s}. The reader can see that the coefficients of J^l(p, s) in formulae (10) and (11) coincide. The coefficients of the Bell numbers in the formulae for L(l, t),

B_{t+1}
B_{t+2} − B_{t+1}
B_{t+3} − 3B_{t+2} + 2B_{t+1}
B_{t+4} − 6B_{t+3} + 11B_{t+2} − 6B_{t+1}
B_{t+5} − 10B_{t+4} + 35B_{t+3} − 50B_{t+2} + 24B_{t+1},

form a triangle. I took the constants 1; 1, −1; 1, −3, 2; 1, −6, 11, −6 and entered them into the Google search window. The result of the search led me to the sequence A094638, the Stirling numbers of the first kind, in the Online Encyclopedia of Integer Sequences (OEIS [21]).

Definition 1.11. The unsigned Stirling numbers of the first kind are denoted by [n; k]. They count the number of permutations of n elements with k disjoint cycles [22].
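The triangle of coefficients above, combined with Definition 1.11, suggests L(l, t) = Σ_{m=1}^{l} (−1)^{l−m} [l; m] B_{t+m}; this can be checked numerically. A sketch (names mine) using the standard recurrences for Bell numbers and unsigned Stirling numbers of the first kind:

```python
from math import comb

N = 12

# Bell numbers via B_{n+1} = sum_k C(n, k) B_k
bell = [1]
for n in range(N):
    bell.append(sum(comb(n, k) * bell[k] for k in range(n + 1)))

# unsigned Stirling numbers of the first kind:
# c[n][k] = c[n-1][k-1] + (n-1) * c[n-1][k]
c = [[1] + [0] * N]
for n in range(1, N + 1):
    row = [0] * (N + 1)
    for k in range(1, n + 1):
        row[k] = c[n - 1][k - 1] + (n - 1) * c[n - 1][k]
    c.append(row)

def L(l, t):
    """L(l, t) = sum_{m=1}^{l} (-1)^{l-m} [l; m] B_{t+m}."""
    return sum((-1) ** (l - m) * c[l][m] * bell[t + m] for m in range(1, l + 1))
```

The values L(1, 1) = 2, L(3, 2) = 17, L(5, 5) = 10427, etc., match the table that follows.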
Table 4: Values of the function L(l, t).

  l\t   1   2    3     4     5     ...
  1     2   5    15    52    203   ...
  2     3   10   37    151   674   ...
  3     4   17   77    372   1915  ...
  4     5   26   141   799   4736  ...
  5     6   37   235   1540  10427 ...

The rows of this table are the sequences A000110, A138378, A005494, A045379. The OEIS provided me with the factorial generating function for these sequences.

Conjecture 1.12.

L(l, t) = [l; l] B_{t+l} − [l; l−1] B_{t+l−1} + ··· + (−1)^{l+1} [l; 1] B_{t+1},   (13)

\sum_{t=0}^{∞} \frac{L(l, t) z^t}{t!} = e^{lz + e^z − 1}.   (14)

The identity (12) holds for all values of t.

1.3.4 Discussion of Conjecture 1.4

Formula (12) simplifies the computation of E(H^l(p[k], Y)). Here is a sample computation of

D(p, l, k) = E(H^l(p[k], Y)) − \frac{1}{\sqrt{2π}\,σ(p[k])} \int_{−∞}^{∞} x^l e^{−(x−µ(p[k]))^2/(2σ(p[k])^2)} dx

for p = (0.4196, 0.1647, 0.4156).
Table 5: Values of the function D(p, l, k).

  l\k    0       100     200     300     400     500
  3    −0.0166 −0.0077 −0.0048 −0.0036 −0.0029 −0.0024
  4    −0.0474 −0.0273 −0.0173 −0.0129 −0.0104 −0.0088
  5    −0.0884 −0.0617 −0.0393 −0.0294 −0.0237 −0.0200
  6    −0.1467 −0.1142 −0.0726 −0.0543 −0.0438 −0.0369

The reader can see that the functions k → D(p, l, k) have a minimum at some k, after which they increase toward zero.

1.4 Significance

There is a multitude of possible devices that can be used for the study of a remote system. While some devices will convey a lot of information, some will be inadequate. Surprisingly, the majority of the devices (see Conjectures 1.6, 1.7, and 1.9) will measure an entropy very close to the actual entropy of the system. All that is asked is that the device define an onto map

ψ : X → Z,   (15)

where Z is the set of readings of the device.

The cumulative Gauss distribution [4] makes a good approximation to θ(p, x). The only parameters that have to be known are the average µ and the standard deviation σ. This gives an effective way of making estimates of θ(p, x). The precise meaning of the estimates can be found in Conjecture 1.4.

My work offers a theoretical advance in the study of large complex systems through entropy analysis. The potential applications are in sciences that deal with complex systems, such as economics, genetics, biology, paleontology, and psychology. My theory explains some hidden relations between the entropies of observed processes in a system. It can also give insight about the object of study from incomplete information. This is an important problem to solve and a valuable contribution to science according to my mentor, who is an expert in this field.
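Returning to Table 5: the l = 3, k = 0 entry can be reproduced by direct enumeration, using the fact that the third raw moment of a Gaussian N(µ, σ²) is µ³ + 3µσ². A sketch (names mine):

```python
import math

def partitions_of(n):
    """All set partitions of {0, ..., n-1}."""
    parts = [[]]
    for e in range(n):
        new = []
        for P in parts:
            for i in range(len(P)):
                new.append(P[:i] + [P[i] + [e]] + P[i + 1:])
            new.append(P + [[e]])
        parts = new
    return parts

def entropy(p, P):
    return -sum(q * math.log(q)
                for q in (sum(p[i] for i in block) for block in P) if q > 0)

def D3(p):
    """D(p, 3, 0) = E(H^3) - (mu^3 + 3 mu sigma^2) for the matching Gaussian."""
    hs = [entropy(p, P) for P in partitions_of(len(p))]
    n = len(hs)
    m1 = sum(hs) / n
    m2 = sum(h ** 2 for h in hs) / n
    m3 = sum(h ** 3 for h in hs) / n
    var = m2 - m1 * m1
    return m3 - (m1 ** 3 + 3 * m1 * var)
```

With p = (0.4196, 0.1647, 0.4156) this gives approximately −0.0166, matching the first entry of Table 5.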
2 Methods

All of the conjectures were obtained with the help of Mathematica. My main theoretical tool is the theory of generating functions [22].

Definition 2.1. Let a_k, k ≥ 0, be a sequence of numbers. The generating function corresponding to a_k is the formal power series \sum_{k ≥ 0} a_k t^k.

My knowledge of Stirling numbers (see Definition 1.11) also comes from [22]. I also used Jensen's Inequality (Theorem 3.4) [6].

3 Results

3.1 Computation of β_r and γ_r

The main result of this section is the explicit formulae for β_r (see formula (7)) and γ_r (see formula (6)):

β_r = \frac{ω(r, 1)}{r B_r},   γ_r = \frac{1}{1 − \frac{ω(r, 1)}{r B_r \ln r}},   (16)

where

ω(r, 1) = r! \sum_{i=0}^{r−1} \frac{B_i \ln(r − i)}{i! (r − i − 1)!}.   (17)

Let me set some notation. The probability of Y_i is k_i/r = #Y_i/r, and the entropy of Y is H(Y) = H(h_r, Y) = −\sum_{i=1}^{l} \frac{k_i}{r} \ln \frac{k_i}{r}. After some simplification H(Y) becomes

H(Y) = \ln r − \frac{1}{r} λ(Y),   (18)

where

λ(Y) = λ(k_1 ... k_l) = \ln(k_1^{k_1} k_2^{k_2} ··· k_l^{k_l}) = \sum_{i=1}^{l} k_i \ln k_i.   (19)
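Formulae (16) and (17) are easy to evaluate and reproduce Tables 1 and 2; ω(r, 1) can also be cross-checked against its definition as Σ_Y λ(Y). A sketch (names mine):

```python
from math import comb, factorial, log

def bell_numbers(n):
    """B_0, ..., B_n via the recurrence B_{m+1} = sum_k C(m, k) B_k."""
    b = [1]
    for m in range(n):
        b.append(sum(comb(m, k) * b[k] for k in range(m + 1)))
    return b

def omega1_formula(r):
    """omega(r, 1) = r! sum_{i=0}^{r-1} B_i ln(r-i) / (i! (r-i-1)!)   (17)."""
    b = bell_numbers(r)
    return factorial(r) * sum(b[i] * log(r - i) / (factorial(i) * factorial(r - i - 1))
                              for i in range(r))

def omega1_direct(r):
    """omega(r, 1) = sum over partitions Y of lambda(Y) = sum_i k_i ln k_i."""
    parts = [[]]
    for e in range(r):
        parts = ([P[:i] + [P[i] + [e]] + P[i + 1:] for P in parts for i in range(len(P))]
                 + [P + [[e]] for P in parts])
    return sum(sum(len(blk) * log(len(blk)) for blk in P) for P in parts)

def beta_gamma(r):
    """(beta_r, gamma_r) from formula (16)."""
    b = bell_numbers(r)
    w = omega1_formula(r)
    return w / (r * b[r]), 1.0 / (1.0 - w / (r * b[r] * log(r)))
```

This recovers γ_2 = 2, γ_3 ≈ 1.826 (Table 1) and β_6 ≈ 0.711731 (Table 2), and the computed γ_r sequence is indeed decreasing, consistent with Conjecture 1.7.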
The average entropy is

E(H(h_r, Y)) = \ln r − \frac{\sum_{Y ∈ P_r} λ(Y)}{r B_r}.   (20)

I am interested in calculating the sums

ω(r, q) = \sum_{Y ∈ P_r} λ(Y)^q,  q ≥ 0.   (21)

The factorial generating function of the λ(Y)^k is

Λ(Y, s) = \sum_{k=0}^{∞} \frac{λ(Y)^k s^k}{k!} = k_1^{k_1 s} ··· k_l^{k_l s}.   (22)

I will compute the factorial generating function of the quantities

Λ(r, s) = \sum_{Y ∈ P_r} Λ(Y, s).   (23)

Theorem 3.1.

\sum_{r=0}^{∞} \frac{Λ(r, s) t^r}{r!} = e^{F(s,t)},   (24)

where F(s, t) = \sum_{r=1}^{∞} \frac{r^{rs} t^r}{r!}.

Proof.

e^{F(s,t)} = \sum_{l=0}^{∞} \frac{F(s,t)^l}{l!} = \sum_{l=0}^{∞} \frac{1}{l!} \left( \sum_{k=1}^{∞} \frac{k^{ks} t^k}{k!} \right)^l
= \sum_{l=0}^{∞} \frac{1}{l!} \sum_{k_1=1}^{∞} \sum_{k_2=1}^{∞} ··· \sum_{k_l=1}^{∞} \frac{k_1^{k_1 s} t^{k_1}}{k_1!} \frac{k_2^{k_2 s} t^{k_2}}{k_2!} ··· \frac{k_l^{k_l s} t^{k_l}}{k_l!}
= \sum_{l=0}^{∞} \frac{1}{l!} \sum_{k_1 ≥ 1, ..., k_l ≥ 1} \frac{k_1^{k_1 s} k_2^{k_2 s} ··· k_l^{k_l s} t^{k_1 + k_2 + ··· + k_l}}{k_1! k_2! ··· k_l!}
= \sum_{l=0}^{∞} \frac{1}{l!} \sum_{1 ≤ k_1 ≤ k_2 ≤ ··· ≤ k_l} \frac{l!}{c_1! c_2! ···} \frac{k_1^{k_1 s} ··· k_l^{k_l s} t^{k_1 + ··· + k_l}}{k_1! k_2! ··· k_l!}.   (25)
Here the coefficient c_i is the number of k's that are equal to i. After some obvious simplifications the formula above becomes

e^{F(s,t)} = \sum_{r=0}^{∞} \frac{t^r}{r!} \sum_{k_1 ≤ ··· ≤ k_l,\; k_1 + ··· + k_l = r} \frac{r!\, k_1^{k_1 s} k_2^{k_2 s} ··· k_l^{k_l s}}{c_1! c_2! ···\, k_1! k_2! ··· k_l!}.   (26)

Each partition Y determines a set of numbers k_i = #Y_i. I refer to k_1, ..., k_l as the portrait of {Y_1, ..., Y_l}. Let me fix one collection of numbers k_1 ≤ ··· ≤ k_l; I can always assume that the sequence is non-decreasing. Let me count the number of partitions with the given portrait. If the subsets were ordered, the number of partitions would equal (k_1 + k_2 + ··· + k_l)!/(k_1! k_2! ··· k_l!). In my case the subsets are unordered, and the number of such unordered partitions is (k_1 + k_2 + ··· + k_l)!/(k_1! k_2! ··· k_l! c_1! c_2! ···), where c_i is the number of subsets of cardinality i. The function Λ(Y, s) depends only on the portrait of Y. From this I conclude that

\sum_{Y ∈ P_r} Λ(Y, s) = \sum_{k_1 ≤ ··· ≤ k_l} \frac{(k_1 + k_2 + ··· + k_l)!\, k_1^{k_1 s} k_2^{k_2 s} ··· k_l^{k_l s}}{k_1! k_2! ··· k_l!\, c_1! c_2! ···},   (27)

which yields the proof.

Note that upon the substitution s = 0 formula (24) becomes the classical generating function

\sum_{k ≥ 0} \frac{B_k t^k}{k!} = e^{e^t − 1}   (28)

(see [22]). This knowledge lets me find the generating function \sum_{r=0}^{∞} \frac{ω(r,1) t^r}{r!}.

Proposition 3.2.

\sum_{r=0}^{∞} \frac{ω(r, 1) t^r}{r!} = e^{e^t − 1} \sum_{k=1}^{∞} \frac{t^k \ln k}{(k − 1)!}.   (29)
Proof. Using equations (22), (23), and (21) I find that

\frac{∂}{∂s} \sum_{r=0}^{∞} \frac{Λ(r, s) t^r}{r!} \Big|_{s=0} = \sum_{r=0}^{∞} \frac{t^r}{r!} \sum_{Y ∈ P_r} λ(Y) = \sum_{r=0}^{∞} \frac{ω(r, 1) t^r}{r!}.

Alternatively, I find the partial derivative \frac{∂}{∂s} \sum_{r=0}^{∞} \frac{Λ(r,s) t^r}{r!}\big|_{s=0} with the chain rule applied to the right-hand side of (24): \frac{∂}{∂s}[e^{F(s,t)}]|_{s=0} = e^{F(s,t)} \frac{∂}{∂s}[F(s,t)]|_{s=0}. Note that F(s,t)|_{s=0} = e^t − 1 and \frac{∂}{∂s}[F(s,t)]|_{s=0} = \sum_{k=1}^{∞} \frac{t^k k \ln k}{k!}. From this I infer that

\frac{∂}{∂s}[e^{F(s,t)}]|_{s=0} = e^{e^t − 1} \sum_{k=1}^{∞} \frac{t^k k \ln k}{k!}.

To find an explicit formula for ω(r, 1), I use the fact that e^{e^t − 1} = \sum_{n=0}^{∞} \frac{B_n t^n}{n!}, where B_n is the Bell number, the number of unordered partitions of a set of n elements [22]. To find ω(r, 1) I expand equation (29):

\sum_{r=0}^{∞} \frac{ω(r, 1) t^r}{r!} = \left( \sum_{n=0}^{∞} \frac{B_n t^n}{n!} \right) \left( \sum_{k=1}^{∞} \frac{\ln k\, t^k}{(k − 1)!} \right)
= \frac{B_0 \ln 2}{0!\, 1!} t^2 + \left( \frac{B_1 \ln 2}{1!\, 1!} + \frac{B_0 \ln 3}{0!\, 2!} \right) t^3 + \left( \frac{B_2 \ln 2}{2!\, 1!} + \frac{B_1 \ln 3}{1!\, 2!} + \frac{B_0 \ln 4}{0!\, 3!} \right) t^4 + ···   (30)

Since equal power series have equal Taylor coefficients, formula (17) follows. Formulae (16) then follow from (20), (5), (6), and (7).

Using the first and second derivatives of equation (11) at s = 0 I find σ(q_{3,r}) (with t = r − 3):

σ(q_{3,r}) = \Big[ −\frac{4B_{t+1}^2 \ln^2 2}{B_{t+3}^2} + \frac{8B_{t+1}B_{t+2} \ln^2 2}{B_{t+3}^2} − \frac{4B_{t+2}^2 \ln^2 2}{B_{t+3}^2} − \frac{4B_{t+1} \ln^2 2}{3B_{t+3}} + \frac{4B_{t+2} \ln^2 2}{3B_{t+3}} + \frac{4B_{t+1}^2 \ln 2 \ln 3}{B_{t+3}^2} − \frac{4B_{t+1}B_{t+2} \ln 2 \ln 3}{B_{t+3}^2} + \frac{B_{t+1} \ln^2 3}{B_{t+3}} − \frac{B_{t+1}^2 \ln^2 3}{B_{t+3}^2} \Big]^{1/2}.   (31)

3.2 Bell Trials

I introduce a sequence of numbers

p_i = \frac{(r − 1)!\, B_i}{B_r\, i! (r − i − 1)!},  i = 0, ..., r − 1.   (32)
The sequence p = (p_0, ..., p_{r−1}) satisfies p_i ≥ 0 and \sum_{i=0}^{r−1} p_i = 1; this follows from the recursion formula \sum_{i=0}^{r−1} \frac{(r−1)! B_i}{i!(r−i−1)!} = B_r [22]. I refer to a random variable ξ with this probability distribution as Bell trials. Note that the average of ln(r − ξ) is equal to ω(r, 1)/(r B_r).

Proposition 3.3.

1. \sum_{i=0}^{r−1} (r − i) p_i = µ_{r−1} = \frac{(r−1) B_{r−1} + B_r}{B_r}.

2. \sum_{i=0}^{r−1} (r − i)^2 p_i = \frac{(r−2)(r−1) B_{r−2} + 3(r−1) B_{r−1} + B_r}{B_r}.

Proof. I will compute the generating function of S_r(x) = \sum_{i=0}^{r} \frac{r! B_i x^{r−i+1}}{i! (r−i)!} instead. Note that S_r′(x)|_{x=1} = B_{r+1} µ_r.

\sum_{r=0}^{∞} \frac{S_r(x) t^r}{r!} = \sum_{r=0}^{∞} \frac{1}{r!} \sum_{a+b=r} \frac{(a+b)! B_a t^a x^{b+1} t^b}{a!\, b!} = \left( \sum_{a=0}^{∞} \frac{B_a t^a}{a!} \right) \left( x \sum_{b=0}^{∞} \frac{(xt)^b}{b!} \right) = e^{e^t − 1}\, x e^{xt} = x e^{e^t − 1 + xt}.   (33)

I factored the generating function into two series, which very conveniently simplified into exponential expressions. Now that I have found the simplified expression for the generating function, I differentiate it:

\frac{∂}{∂x}\left[ x e^{e^t − 1 + xt} \right]\Big|_{x=1} = (xt + 1) e^{e^t − 1 + xt} \Big|_{x=1} = (t + 1) e^{e^t − 1 + t}.   (34)

Note that the function \sum_{k ≥ 1} \frac{B_k t^{k−1}}{(k−1)!} (compare with formula (28)) is equal to \frac{d}{dt} e^{e^t − 1} = e^{e^t − 1 + t}, which implies

t e^{e^t − 1 + t} + e^{e^t − 1 + t} = \sum_{k ≥ 2} \frac{(k−1) B_{k−1} t^{k−1}}{(k−1)!} + \sum_{k ≥ 1} \frac{B_k t^{k−1}}{(k−1)!},   (35)

and the formula for µ_{r−1} follows.

The second moment \sum_{i=0}^{r−1} (r − i)^2 p_i can be computed with the same method. Note that the second moment is equal to \left( x S_r′(x) \right)′|_{x=1} / B_{r+1}. The factorial generating function of the second moments is

\frac{∂}{∂x}\left[ x (xt + 1) e^{e^t − 1 + xt} \right]\Big|_{x=1} = (t^2 x^2 + 3tx + 1) e^{e^t − 1 + xt} \Big|_{x=1} = (t^2 + 3t + 1) e^{e^t − 1 + t}
= \sum_{k ≥ 3} \frac{(k−2)(k−1) B_{k−2} t^{k−1}}{(k−1)!} + \sum_{k ≥ 2} \frac{3(k−1) B_{k−1} t^{k−1}}{(k−1)!} + \sum_{k ≥ 1} \frac{B_k t^{k−1}}{(k−1)!}.   (36)
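Proposition 3.3 can be verified numerically for small r. A sketch (names mine) of the Bell-trial distribution (32) and its first two moments:

```python
from math import comb, factorial

def bell_numbers(n):
    """B_0, ..., B_n via B_{m+1} = sum_k C(m, k) B_k."""
    b = [1]
    for m in range(n):
        b.append(sum(comb(m, k) * b[k] for k in range(m + 1)))
    return b

def bell_trial(r):
    """p_i = (r-1)! B_i / (B_r i! (r-i-1)!), i = 0, ..., r-1   (formula (32))."""
    b = bell_numbers(r)
    p = [factorial(r - 1) * b[i] / (b[r] * factorial(i) * factorial(r - i - 1))
         for i in range(r)]
    return p, b

def moments_match(r):
    """Check normalization and both statements of Proposition 3.3."""
    p, b = bell_trial(r)
    m1 = sum((r - i) * p[i] for i in range(r))
    m2 = sum((r - i) ** 2 * p[i] for i in range(r))
    return (abs(sum(p) - 1) < 1e-12
            and abs(m1 - ((r - 1) * b[r - 1] + b[r]) / b[r]) < 1e-12
            and abs(m2 - ((r - 2) * (r - 1) * b[r - 2]
                          + 3 * (r - 1) * b[r - 1] + b[r]) / b[r]) < 1e-12)
```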
Theorem 3.4 (Jensen's Inequality [6], [18]). For any concave function f: \mathbb{R}\to\mathbb{R} and weights q_i \in \mathbb{R}_{>0} with \sum_{i=1}^{r} q_i = 1, the inequality

\sum_{i=1}^{r} f(i)\,q_i \le f\Bigl(\sum_{i=1}^{r} i\,q_i\Bigr)

holds. I want to apply this theorem to the concave function \ln x:

Corollary 3.5. With p_i as in (32),

\sum_{i=0}^{r-1} p_i \ln(r-i) < \ln\Bigl(1+\frac{(r-1)B_{r-1}}{B_r}\Bigr).

Proof. Follows from Jensen's Inequality and Proposition 3.3.

Proposition 3.6. \lim_{r\to\infty}\gamma_r = 1.

Proof. Corollary 3.5 implies

\gamma_r = \frac{1}{1-\dfrac{\omega(r,1)}{rB_r\ln r}} < \frac{1}{1-\dfrac{\ln\bigl(1+\frac{(r-1)B_{r-1}}{B_r}\bigr)}{\ln r}}.   (37)

It is easy to see that B_r \ge 2B_{r-1}: when forming a partition of an r-element set, I always have the choice of whether or not to put r in the same part as r-1. This implies B_{r-1}/B_r \le 1/2. From this I conclude that

1 \le \gamma_r < \frac{1}{1-\dfrac{\ln\bigl(1+\frac{r-1}{2}\bigr)}{\ln r}}

and \lim_{r\to\infty}\gamma_r = 1.

4 Discussion and Conclusion

I was not able to prove all of the conjectures I made. I have made some steps (Proposition 3.6) toward a proof of Corollary 1.8 of Conjecture 1.6, which states that the ratio of the maximum entropy to the average entropy is close to one. I also found that the difference between the maximum entropy and the average entropy of partitions grows slowly as #X increases; I conjecture that this difference has magnitude ln ln #X. I have also computed the standard deviation, formula (31), which is conjecturally the greatest value of \sigma(p).
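The behavior described by Proposition 3.6 can be illustrated numerically. The following Python sketch (helper names are mine; \omega(r,1) is evaluated from its series expansion in Section 3.1) computes \gamma_r and the Jensen upper bound from (37), keeping the Bell-number ratios as exact rationals until the final float conversion to avoid overflow:

```python
from fractions import Fraction
from math import comb, log

def bell_numbers(n_max):
    """Bell numbers B_0..B_{n_max} via the Bell triangle."""
    bells, row = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for entry in row:
            new_row.append(new_row[-1] + entry)
        row = new_row
        bells.append(row[0])
    return bells

def omega_over_rBr(r, bells):
    """omega(r,1)/(r B_r) = E[ln(r - xi)], using omega(r,1) = sum_k C(r,k) k ln(k) B_{r-k}."""
    return sum(float(Fraction(comb(r, k) * k * bells[r - k], r * bells[r])) * log(k)
               for k in range(2, r + 1))

def gamma(r, bells):
    """gamma_r = 1 / (1 - omega(r,1) / (r B_r ln r)), as in (37)."""
    return 1.0 / (1.0 - omega_over_rBr(r, bells) / log(r))

def jensen_bound(r, bells):
    """Upper bound from Corollary 3.5: 1 / (1 - ln(1 + (r-1) B_{r-1}/B_r) / ln r)."""
    ratio = float(Fraction(bells[r - 1], bells[r]))
    return 1.0 / (1.0 - log(1.0 + (r - 1) * ratio) / log(r))

if __name__ == "__main__":
    B = bell_numbers(100)
    for r in (5, 20, 100):
        print(r, round(gamma(r, B), 4), round(jensen_bound(r, B), 4))
```

For each r the computed \gamma_r lies in the interval [1, bound), and it drifts downward toward 1 as r grows, consistent with Proposition 3.6.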
My short-term goal is to prove these conjectures. The more challenging goal is to add dynamics to the system being studied and to identify self-organizing subsystems among the low-entropy subsystems.

I am grateful to my mentor, Dr. Rostislav Matveyev, for his support and training during this research.

References

[1] W. R. Ashby. An Introduction to Cybernetics. John Wiley and Sons, 1966.

[2] R. Beebe. Jupiter: The Giant Planet. Smithsonian Books, Washington, 2nd edition, 1997.

[3] C. Bettstetter and C. Gershenson, editors. Self-Organizing Systems, volume 6557 of Lecture Notes in Computer Science. Springer, 2011.

[4] W. Bryc. The Normal Distribution: Characterizations with Applications. Springer-Verlag, 1995.

[5] R. Clausius. Über die Wärmeleitung gasförmiger Körper. Annalen der Physik, 125:353–400, 1865.

[6] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, 1991.

[7] S. R. de Groot and P. Mazur. Non-Equilibrium Thermodynamics. Dover, 1984.

[8] D. P. Kroese, T. Taimre, and Z. I. Botev. Handbook of Monte Carlo Methods. John Wiley & Sons, New York, 2011.

[9] H. Haken. Synergetics: Introduction and Advanced Topics. Springer, Berlin, 2004.

[10] S. A. Kauffman. The Origins of Order. Oxford University Press, 1993.

[11] Wolfram Research. Mathematica. www.wolfram.com.

[12] H. Meinhardt. Models of biological pattern formation: from elementary steps to the organization of embryonic axes. Curr. Top. Dev. Biol., 81:1–63, 2008.

[13] D. Morrison. Exploring Planetary Worlds. W. H. Freeman, 1994.
[14] I. Prigogine. Non-Equilibrium Statistical Mechanics. Interscience Publishers, 1962.

[15] I. Prigogine. Introduction to Thermodynamics of Irreversible Processes. John Wiley and Sons, 1968.

[16] I. Prigogine and G. Nicolis. Self-Organization in Nonequilibrium Systems: From Dissipative Structures to Order through Fluctuations. John Wiley and Sons, 1977.

[17] G.-C. Rota. The number of partitions of a set. American Mathematical Monthly, 71(5):498–504, 1964.

[18] W. Rudin. Real and Complex Analysis. McGraw-Hill, 1987.

[19] E. Schrödinger. What Is Life? Cambridge University Press, 1992.

[20] C. E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27:379–423, 1948.

[21] N. J. A. Sloane and others. The On-Line Encyclopedia of Integer Sequences, oeis.org.

[22] R. P. Stanley. Enumerative Combinatorics, volumes 1–2. Cambridge University Press, 1997.