Real Time Information Reconstruction:
               The Prediction Market.
                     MS&E 211
                          Omede Firouz          05547809
                         Xuechen Jiang          05836587
                           Yixin Kou            05796061
                         Raghav Ramesh          05835723
                               December 8, 2012


                                     Abstract
   A survey of the prediction market is given and analyzed in the framework of convex
optimization. Online and offline approaches are compared, and conditions for unique-
ness of the solution are given. Computational results are shown for both simulated data
and student bets on football games, with comparisons to traditional approaches such
as regression. The methods analyzed are in general theoretically sound, with few
weaknesses.






Introduction
Prediction is a well-studied field with many statistical and machine learning techniques
available. Recently, the crowd-sourcing of prediction markets has attracted interest. In
many cases, auctions are the source of such distributed information, and methods to analyze
the large and diverse data that arise have been developed in the context of convex optimization.


1      Offline Problem
Consider the problem of bid acceptance for a centralized market maker. Given a set of bids
on various states, the market maker would like to accept bids so as to maximize his/her
own profit. We consider state price securities, each paying one unit if its state occurs. Let
π be the vector of bid prices, x the number of shares sold for each bid, and y the worst-case
loss. Then the following linear program maximizes the worst-case profit.

           Primal:  max   π^T x − y                     Dual:  min   q^T s
                    s.t.  A^T x − e y ≤ 0                      s.t.  A p + s ≥ π
                          x ≤ q                                      e^T p = 1
                          x ≥ 0,  y free                             p, s ≥ 0

We can also analyze the dual to see that s gives the shadow prices for our share limit con-
straints: the amount the objective would increase if we could sell more shares (ceteris
paribus). p gives the state prices, and A_{j,*} p is the internal cost per share of accepting
bid j (its unit cost).

      We can further analyze the system through the complementarity conditions. Under
strict complementarity it is clear that:
1. If we do not sell any shares of a bid, its internal cost is strictly higher than its bid price.
2. If we do sell shares, the bid price is at least as high as the internal cost.
3. If we sell shares but do not max out the share limit, the bid price and internal price are equal.
4. If we max out the share limit, the bid price is strictly greater than the internal price.

    The dual can be stated as the problem of minimizing the money won, q^T s, subject to
the condition that, for every possible winning state i, the beliefs/bets made by the winners
are fully utilized (the winners' bets imply their beliefs) [2].
    Solving this program on the student betting data for the first two games gives the
following state prices:

                      State:        1     2     3     4     5     6     7     8
                    Notre Dame      0     0    .27   .43   .30    0     0     0
                    California    .001   .01    0   .119   .68   .12   .06   .01

   In these games, Stanford beat Notre Dame 20-13 (state 5), and California 21-3 (state 7).
These state price distributions are quite bad: a state price of zero implies that any bet on
that state will be accepted, yet clearly these outcomes have some nonzero probability. We
attribute the discrepancy to the small number of students (on the order of 100), to under-
valuation of the extreme outcomes of the game, and to an incomplete understanding of the
market design. Indeed,



in later games, the state prices more accurately resembled what an expected probability
distribution would be.


2    Uniqueness
However, the question arises whether the state prices are necessarily unique. In fact,
using only the current formulation, the state prices are not guaranteed to be unique. Consider
the case of two teams and a single bid (1, 1) with any price π1.


    Primal:  max   π1 x1 − y                  Dual:  min   q1 s1
             s.t.  x1 − y ≤ 0                        s.t.  p1 + p2 + s1 ≥ π1
                   x1 − y ≤ 0                              p1 + p2 = 1
                   x1 ≤ q1                                 p1, p2, s1 ≥ 0
                   x1 ≥ 0,  y free

     Note that the first two columns of the dual are identical. This is an 'entanglement' of
the states 1, 2 and the dual variables p1, p2. With a linear objective, every point on
p1 + p2 = 1, p1, p2 ≥ 0 is optimal. To see why, for simplicity let π1 < 1. Then the primal
and dual objectives are 0 by strong duality, so s1 = 0. We are left with p1 + p2 ≥ π1 and
p1 + p2 = 1. The first constraint is redundant since π1 < 1, and clearly every point with
p1 + p2 = 1 is optimal.

    Theorem. If there is a set of states S such that every bet that includes one element of
S includes all of S, the dual has infinitely many optimal solutions.
Proof. The rows of the primal corresponding to S are identical, so the corresponding
columns of the dual are identical. Since the objective is linear, every choice of p_i, i ∈ S,
with Σ_{i∈S} p_i = 1 − Σ_{j∉S} p_j and p ≥ 0 is an optimal solution to the dual.

    It is natural to break the above symmetry by choosing the midpoint of the set of optimal
solutions, since a corner point is more sensitive than the midpoint to slight changes in the
parameters. The midpoint is, in this sense, the most 'robust' state price.
    We can choose the midpoint in the same way the analytic center is chosen in an interior
point method. Since we are maximizing, we add a nondecreasing and strictly concave
function of the slack variables.
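As a toy illustration (Python, hypothetical numbers): on the optimal segment p1 + p2 = 1 from the example above, a strictly concave tie-breaker such as a log barrier selects exactly the midpoint.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Every point on p1 + p2 = 1, p1, p2 >= 0 is optimal for the linear objective.
# Maximizing a strictly concave function of the prices over that segment
# (equivalently, minimizing its negative over p1 in (0, 1)) breaks the tie.
res = minimize_scalar(lambda p1: -(np.log(p1) + np.log(1.0 - p1)),
                      bounds=(1e-9, 1.0 - 1e-9), method="bounded")
print(res.x)   # approximately 0.5: the midpoint p1 = p2 = 1/2
```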


3    Unified Framework
Here we formulate bid acceptance with the added function u(s) which is nondecreasing and
strictly concave. Since the feasible region remains unchanged and convex, and the objective
is concave maximization, we still have a linearly constrained convex optimization problem
which can be solved efficiently.







               max   Σ_{j=1}^n π_j x_j − y + u(s)

               s.t.  Σ_{j=1}^n a_ij x_j + s_i = y,   ∀i = 1, 2, ..., m      Dual Var: p
                     0 ≤ x_j,   ∀j = 1, ..., n                              Dual Var: µ
                     x_j ≤ q_j,   ∀j = 1, ..., n                            Dual Var: ν
                     s_i ≥ 0,   ∀i = 1, ..., m                              Dual Var: η

Letting u(s) = Σ_i u(s_i), we write down the KKT conditions.

    KKT Conditions
               Stationarity:  ∂f/∂x_j :  π_j = Σ_{i=1}^m a_ij p_i − µ_j + ν_j,   ∀j = 1, ..., n
                              ∂f/∂y  :  −1 = −Σ_{i=1}^m p_i
                              ∂f/∂s_i:  ∂u(s)/∂s_i = p_i − η_i,   ∀i = 1, ..., m
         Primal Feasibility:  0 ≤ x_j ≤ q_j,   ∀j = 1, ..., n
                              0 ≤ s_i,   ∀i = 1, ..., m
           Dual Feasibility:  p_i free,   ∀i = 1, ..., m
                              µ_j, ν_j ≥ 0,   ∀j = 1, ..., n
                              η_i ≥ 0,   ∀i = 1, ..., m
    Complementary Slackness:  µ_j x_j = 0,   ∀j = 1, ..., n
                              ν_j (q_j − x_j) = 0,   ∀j = 1, ..., n
                              η_i s_i = 0,   ∀i = 1, ..., m
    From [3] we see that u can be interpreted as a value assigned to 'slack': the amount by
which the loss in each state falls short of the worst case. It is worth noting that a strictly
concave increasing function is stable in the sense that the maximal point is toward the
center of the optimal set; that is, it has diminishing returns, or risk aversion, 'built in'.
    From [3] we also see that the choice of u corresponds to different levels of risk aversion.
The most risk averse option is u(s) = min_i s_i, which recovers our original formulation
(since min_i s_i = 0 there, and s ≥ 0) and cannot lose money in the worst case. Letting
u(s) = θ^T s, where θ is a probability distribution encoding our beliefs, turns the problem
into maximization of the expected value.
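To make the unified framework concrete, here is a sketch in Python (synthetic bids; we take u(s) = (b/m)·Σ_i (1 − e^{−s_i}), the exponential choice used in our experiments, and a general-purpose NLP solver in place of CVX). The slack is eliminated via the equality constraint, s = y·e − A^T x.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data: n = 3 bids on m = 2 states (row j of A: states bid j pays in).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
pi = np.array([0.6, 0.55, 0.9])   # bid prices
q = np.array([5.0, 5.0, 5.0])     # share limits
n, m = A.shape
b = 1.0                           # risk parameter

def slack(z):                     # s_i = y - (A^T x)_i from the equality constraint
    x, y = z[:n], z[n]
    return y - A.T @ x

def neg_obj(z):                   # negated objective: scipy minimizes
    x, y = z[:n], z[n]
    s = slack(z)
    return -(pi @ x - y + (b / m) * np.sum(1.0 - np.exp(-s)))

cons = [{"type": "ineq", "fun": slack}]           # s >= 0
bnds = [(0.0, qj) for qj in q] + [(None, None)]   # 0 <= x <= q, y free
res = minimize(neg_obj, np.zeros(n + 1), bounds=bnds, constraints=cons,
               method="SLSQP")
p = (b / m) * np.exp(-slack(res.x))  # state prices p_i = u'(s_i)
print("x:", res.x[:n], "y:", res.x[n], "p:", p)
```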

   Furthermore, a strictly concave u generates a unique solution.

    Theorem: If u(s) is strictly concave, the state prices are unique.
Proof. Suppose two different optimal solutions z¹, z² to the above optimization problem
gave two different sets of state prices; then their slack vectors s¹, s² must differ. Since the
constraints are linear equalities and inequalities, the feasible region is convex, and
z* = αz¹ + (1 − α)z² is feasible for 0 ≤ α ≤ 1. The objective function is the sum of linear
terms and a term strictly concave in s, so with s¹ ≠ s² we have f(z*) > αf(z¹) + (1 − α)f(z²)
for 0 < α < 1, violating our optimality assumption. Therefore, the state prices must be
unique.



3.1    Parallels to Underconstrained Equations
We can draw a parallel here to the problem of underconstrained systems of equations. If
such a solution is consistent, there are infinite solutions. Then to find a unique solution, the
minimum norm solution is selected. This is also an example of convex optimization since
the objective and feasible region are convex.

   Theorem. The norm is a convex function.
Proof. By the triangle inequality, |αx + (1 − α)y| ≤ |αx| + |(1 − α)y| = α|x| + (1 − α)|y|.
Furthermore, for the Euclidean norm the triangle inequality is tight if and only if x = cy
for some nonnegative c.

   Theorem. The set of solutions to a consistent linear system Ax = b is an affine set.
Proof. The solution set is x̂ + N(A) for any particular solution x̂, where N(A) is the null
space of A; a translate of a subspace is an affine set.

    Although the norm is not strictly convex, we can still establish uniqueness.
Theorem. Minimizing the norm gives a unique solution to a consistent system of equations.
Proof. The solution set is x̂ + N(A), where Ax̂ = b and N(A) is the null space. Without
loss of generality, choose x̂ perpendicular to the null space. Then for any v ∈ N(A),
‖x̂ + v‖² = ‖x̂‖² + ‖v‖² by orthogonality, which is minimized precisely when v = 0. Since
this choice is unique, the minimum norm solution is unique. In the case b = 0, x = 0 is the
unique minimum norm solution.
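A quick numeric check of this (Python, with a hypothetical 1 × 3 system): the Moore–Penrose pseudoinverse returns exactly the minimum norm solution, the particular solution perpendicular to the null space.

```python
import numpy as np

# Underconstrained, consistent system: one equation, three unknowns.
A = np.array([[1.0, 2.0, 2.0]])
b = np.array([9.0])

x_min = np.linalg.pinv(A) @ b     # minimum norm solution: [1, 2, 2]
v = np.array([2.0, -1.0, 0.0])    # a null-space vector (A @ v = 0)
x_other = x_min + v               # another solution of A x = b

print(np.linalg.norm(x_min), np.linalg.norm(x_other))  # 3.0 < sqrt(14)
print(x_min @ v)                  # 0: x_min is perpendicular to the null space
```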


4     Online Problem
Up to this point we have assumed all the bids have arrived before our decision has to be
made. However, in practice, bettors would like to know whether their bids are accepted or
rejected in real time. Then, we can modify our formulation to form an online program.

               max   π_k x_k − y + Σ_i u(s_i)

               s.t.  Σ_{j=1}^k a_ij x_j + s_i = y,   ∀i = 1, 2, ..., m      Dual Var: p
                     0 ≤ x_k                                                Dual Var: µ
                     x_k ≤ q_k                                              Dual Var: ν
                     s_i ≥ 0,   ∀i = 1, ..., m                              Dual Var: η






Following [2], for simplicity let b^{k−1} = Σ_{i=1}^{k−1} a_i x_i, where a_i is the i-th column
of A. That is, b is the vector of outstanding shares in each state up to the current time.
Then our formulation simplifies to:

               max   π_k x_k − y + Σ_i u(y − a_ik x_k − b_i^{k−1})

               s.t.  0 ≤ x_k                                                Dual Var: µ
                     x_k ≤ q_k                                              Dual Var: ν

   There are only two variables, x_k and y. We can write down the KKT conditions:

       KKT Stationarity:  ∂f/∂x_k:  π_k − Σ_i a_ik u′(y − a_ik x_k − b_i^{k−1}) − ν_k + µ_k = 0
                          ∂f/∂y :  −1 + Σ_i u′(y − a_ik x_k − b_i^{k−1}) = 0                     (1*)
     Primal Feasibility:  0 ≤ x_k ≤ q_k
       Dual Feasibility:  µ_k, ν_k ≥ 0
Complementary Slackness:  x_k µ_k = 0
                          → x_k (π_k − Σ_i a_ik u′(y − a_ik x_k − b_i^{k−1}) − ν_k) = 0          (2*)
                          (q_k − x_k) ν_k = 0

    The problem can now be solved as a series of individual two-variable optimizations,
updating b after each bid. From [2], p_i = u′(y − a_ik x_k − b_i^{k−1}) gives the state prices
after each iteration, and from [2] each bid can be processed as follows:
1. If the internal price is strictly higher than the bid price, reject the bid immediately. If
not, go to the next step.
2. Update the state prices. If the bid price is still higher than the new internal price, accept
the bid up to q_k shares (from complementarity) and update b and p. Otherwise,
3. find the quantity at which the new internal price exactly matches the bid price. This
amounts to solving the system of two equations (1*) and (2*) from the KKT conditions.
This is explained thoroughly in [2], lecture notes 15.
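The loop above can be sketched as follows (a Python stand-in for the CVX code of Appendix B: synthetic bids, the exponential u, and a general-purpose solver for each two-variable subproblem; all parameter values here are illustrative).

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
m, n, w = 4, 20, 1.0                                 # states, bids, risk parameter
A = rng.integers(0, 2, size=(m, n)).astype(float)    # column k: states of bid k
pi = rng.uniform(0.1, 0.9, size=n)                   # bid prices
q = rng.integers(1, 5, size=n).astype(float)         # share limits
b_out = np.zeros(m)                                  # outstanding shares b^{k-1}

u = lambda s: (w / m) * (1.0 - np.exp(-s))           # exponential slack utility

for k in range(n):
    a = A[:, k]
    # Two-variable subproblem in (x_k, y), with s_i = y - a_ik x_k - b_i^{k-1}.
    neg_obj = lambda z: -(pi[k] * z[0] - z[1] + np.sum(u(z[1] - a * z[0] - b_out)))
    cons = [{"type": "ineq", "fun": lambda z: z[1] - a * z[0] - b_out}]  # s >= 0
    res = minimize(neg_obj, x0=[0.0, b_out.max()], constraints=cons,
                   bounds=[(0.0, q[k]), (None, None)], method="SLSQP")
    xk, y = res.x
    if xk < 1e-5:
        xk = 0.0                                     # treat tiny fills as rejections
    b_out += a * xk                                  # update outstanding shares

p = (w / m) * np.exp(-(y - b_out))                   # state prices p_i = u'(s_i)
print("outstanding shares:", b_out)
print("state prices:", p)
```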

    Moreover, let O(f(m, n)) be the complexity of solving the offline formulation for m states
and n bets. The online formulation then takes O(n f(m, 2) + nm) time, where nm is the cost
of updating the vector b over n iterations. This speed is comparable to the offline problem.
Moreover, the memory required is O(m), which is quite small when a huge number of bids
are made. In fact, it was already noted in [3] that online linear programs can be used to
solve extremely large linear programs; the online formulation can therefore handle problems
where the number of columns is far larger than the memory available.
    It is interesting that, depending on the choice of u, the worst case is bounded differently.
In the most risk averse case, u(s) = min_i s_i, the worst case is 0, but the online formulation
will never accept any realistic bids. By varying the choice of u, a tradeoff can be made
between expected gain and risk aversion.





5     Truthfulness
It is a desirable property of our system that bets be truthful. Let a player make a bet and
x*_k be the number of shares sold. We will charge χ(0) − χ(x*_k), where

               χ(x) = max_{y,s}  −y + u(s)
                      s.t.  s_i = y − b_i^{k−1} − a_ik x,   ∀i = 1, 2, ..., m

and b^{k−1} is the vector of shares outstanding before bid k.

We will maximize the profit of an arbitrary bidder over the reported price π_k and show
that at optimality it equals his/her real valuation π̂_k.

           π̂_k x*_k − c_k = π̂_k x*_k − (χ(0) − χ(x*_k))
                          = π̂_k x*_k + χ(x*_k) − χ(0)
                          ≡ π̂_k x*_k + χ(x*_k)       since χ(0) is constant with respect to π_k

But note that χ has the same constraints as the online problem:

           Σ_{j=1}^k a_ij x_j + s_i ≡ b_i^{k−1} + a_ik x_k + s_i = y,   ∀i = 1, 2, ..., m

At optimality, χ(x*_k) has the same x as the optimum z* of the online problem. The only
difference is in a single term of the objective function. It follows that χ(x*_k) = z* − π_k x*_k.
Then we have:

           π̂_k x*_k − c_k = π̂_k x*_k + χ(x*_k)
                          = π̂_k x*_k + z* − π_k x*_k
                          = π̂_k x*_k + Σ_{t≠k} π_t x_t − y + u(s)


    So, if the player bids truthfully and π_k = π̂_k, we can substitute above to get:

           π̂_k x*_k − c_k = π_k x*_k + Σ_{t≠k} π_t x_t − y + u(s)
                          = Σ_t π_t x_t − y + u(s)

In other words, if the player bids truthfully, his own profit maximization problem becomes
exactly aligned with the pricing and online optimization problems; to bet differently would
be suboptimal.
    This is a special case of the VCG mechanism, in which the price charged equals the
value lost by the rest of the players [5].






6     Recovery of the Grand Truth
As we have seen, the formulations given return state prices, which can be seen as the
bettors' collective belief distribution. If there is a hidden probability distribution the
bettors are aware of, a grand truth, we hope to converge to it after enough bets.

    Theorem. Convergence to the grand truth is a necessary condition for bounded loss.
Proof. Assume a method does not converge to the grand truth. Then there is always a bet
made according to the grand truth that is accepted; in other words, its price is less than its
grand-truth expected payoff, π(x) < p^T x. Each such bet loses an expected amount bounded
away from zero, so the series of losses due to bets of this type diverges if the prices do not
converge to the grand truth.

   From [1] we know the choice of u(s) puts different bounds on the worst case losses. This
worst case loss can be seen as the integral of the difference between the state prices and the
grand truth, so the choice of u(s) influences the rate of convergence. Here we will
investigate different u. For this section we again use the data from Stanford students
betting on outcomes. We use parameters w = 1 and w = 10, where b = w/m and m is the
total number of states.
   For logarithmic scoring, the worst case loss is unbounded [1]. For exponential scoring,
the worst case loss is b log N [1]. Using w = 1, w = 10 for both logarithmic and exponential
scoring functions we get the following results for the online simulation of the first two games:

                                   Log, w = 1       Exp, w = 1   Log, w = 10 Exp, w = 10
    Notre Dame, Total Shares           135              48          2012         221
     Notre Dame, Total Bids            58               23            59          59
      Berkeley, Total Shares          7070             6488         7606        6695
       Berkeley, Total Bids            123              127          122         127

    We considered a bid 'accepted' if we granted more than 0.001 shares. Although the
total number of bids accepted is quite similar (at least for Berkeley), the total shares sold
follows a clear trend. Clearly, w = 10 (larger b) is less risk averse and accepts more shares.
Similarly, logarithmic scoring appears less risk averse than exponential, which makes sense
given that exponential scoring bounds the worst case. There is thus a clear tradeoff between
risk aversion and total shares accepted.
    On the state price convergence, we see the following trend for state 1 in Stanford vs.
Notre Dame (the offline problem has solution p1 = 0).




    [Figure: state price p1 versus bid number under each scoring rule and parameter setting.]




    All the functions converge to the original offline solution of p1 = 0. However, the trend
as to which function works best is not clear in this case. Logarithm initially converges faster,
but takes longer after bid 40. We also note that smaller b appears to converge faster for this
case, although it is not clear what will happen in general.

6.1    Regression
Using a model of grand truth with gaussian peturbations (noise), we note that least squares
regression can be shown to return the grand truth [4].


Conclusion
The prediction market is an interesting emerging field of research. Although many traditional
approaches to prediction, such as regression, remain in use, market-based approaches can be
nearly self-funding while also serving as a source of information. More theoretical results are
needed, for example to prove recovery of a grand truth; by comparison, least squares
regression can provably recover the grand truth.




References

  [1] S. Agrawal, E. Delage, M. Peters, Z. Wang, and Y. Ye. A Unified Framework for
Dynamic Pari-mutuel Information Market Design. In Proceedings of the 10th ACM Conference
on Electronic Commerce, 2009.

  [2] Y. Ye. MS&E 211 Lecture Notes, Stanford University, 2012.

  [3] S. Agrawal, Z. Wang, and Y. Ye. A Dynamic Near-Optimal Algorithm for Online
Linear Programming, 2009.

  [4] A. Ng. CS 229 Lecture Notes, Stanford University, 2012.

  [5] Vickrey–Clarke–Groves auction. en.wikipedia.org/wiki/Vickrey-Clarke-Groves_auction






Appendix A: Code for Problem 1
% Read the excel data into matrices
num1 = xlsread(’NotreDame.xlsx’);
num2 = xlsread(’California.xlsx’);

A1 = num1(:,3:10);
A1 = A1’;
[m1 n1] = size(A1);
A1 = [A1 -1*ones(m1,1)];

f1 = num1(:,2);
f1 = [f1; -1];
f1 = -f1;

b1 = zeros(m1,1);

lb1 = [zeros(n1,1); -Inf];   % x >= 0, y free
ub1 = [num1(:,1); Inf];      % x <= q, no upper bound on y

[x1, fval1, exitflag1, output1, lambda1] = linprog(f1,A1,b1,[],[],lb1,ub1);

%Print the state prices
lambda1.ineqlin


A2 = num2(:,3:10);
A2 = A2’;
[m2 n2] = size(A2);
A2 = [A2 -1*ones(m2,1)];

f2 = num2(:,2);
f2 = [f2; -1];
f2 = -f2;

b2 = zeros(m2,1);

lb2 = [zeros(n2,1); -Inf];   % x >= 0, y free
ub2 = [num2(:,1); Inf];      % x <= q, no upper bound on y

[x2, fval2, exitflag2, output2, lambda2] = linprog(f2,A2,b2,[],[],lb2,ub2);

% Print the state prices
lambda2.ineqlin




Appendix B: Code for Problem 6, Online Version
clc
clear all
% Either of Type 1 or Type 2 Data Generation Modes to be chosen
% and other commented
% Type 1 - Data from Auction Games
% num = xlsread(’NotreDame.xlsx’);
% num = xlsread(’California.xlsx’);
% numStates = 8;
% b = 1;
%Type 2 - Random Data Generation
numBids = 100;
numStates = 8;
b = 1;
%num is the overall matrix
num = zeros(numBids,numStates+2);
for i = 1 : numBids
    num(i,1) = randi([2,10],1);     % share limit q
    num(i,2) = rand(1);             % bid price pi
    for j = 1 : numStates
        num(i,j+2) = randi([0 1],1);    % states covered by the bid
    end
end

A = num(:,(3:numStates+2));
A = A’;
[m n] = size(A);
f = num(:,2);
q = num(:,1);
cvx_begin
    variable xt
    variable y
    variable s(m)
    dual variable P
    maximize( f(1,1)*xt - y + (b/m)*sum(1-exp(-s)) );
    subject to
        P: A(:,1)*xt + s - y*ones(m,1) == 0;
        xt >= 0;
        xt <= q(1,1);
        s >= 0;
cvx_end
if (xt < 10^-5)
    xt = 0;
end
x_old = xt;



A_old = A(:,1);
Price = P;
for i = 2 : n
    cvx_begin
        variable xt
        variable y
        variable s(m)
        dual variable P
        maximize( f(i,1)*xt - y + (b/m)*sum(1-exp(-s)) );
        subject to
            % bid i uses column i of A; the RHS carries shares already sold
            P: A(:,i)*xt + s - y*ones(m,1) == -A_old * x_old;
            xt >= 0;
            xt <= q(i,1);
            s >= 0;
    cvx_end
    if (xt < 10^-5)
        xt = 0;
    end
    x_old = [x_old; xt];
    A_old = [A_old, A(:,i)];
    Price = [Price P];
end
Price = - Price’;


Appendix C: Code for Problem 6, Offline Version
clc
clear all
num = xlsread(’NotreDame.xlsx’);
% num = xlsread(’California.xlsx’);
A = num(:,3:10);
A = A’;
[m n] = size(A);
f = num(:,2);
q = num(:,1);
b = 0.1;
cvx_begin
    variable x(n)
    variable y
    variable s(m)
    dual variable P
    maximize( f'*x - y + (b/m)*sum(log(s)) )
    subject to
        P: A*x + s - y*ones(m,1) == 0
        x >= 0
        x <= q
        s >= 0
cvx_end





Using Microsoft Social Engagement Together with Dynamics CRMUsing Microsoft Social Engagement Together with Dynamics CRM
Using Microsoft Social Engagement Together with Dynamics CRM
 
Unit 7 mathemetical calculations for science
Unit 7 mathemetical calculations for scienceUnit 7 mathemetical calculations for science
Unit 7 mathemetical calculations for science
 

x ≥ 0, y free                                          p, s ≥ 0

From the dual we can read off that s are the shadow prices for the share-limit constraints: the amount the objective would increase if we could sell more shares (ceteris paribus). p are the state prices, and A_{i,*} p is the internal cost per share of bid i (its payout states priced at the state prices). We can further analyze the system through the complementarity conditions. It is clear that:

1. If the internal cost of a bid is strictly higher than its bid price, we sell none of its shares.
2. If we sell any shares of a bid, its bid price is at least as high as its internal cost.
3. If we sell some shares of a bid but do not reach its share limit, the bid price and the internal cost are equal.
4. If the bid price is strictly greater than the internal cost, we sell the bid's shares up to its limit.

The dual can be stated as the problem of minimizing the money won, q^T s, subject to the condition that, for every possible winning state i, the beliefs/bets made by the winners are fully utilized (the winners' bets imply their beliefs) [2].
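As an illustration (separate from the report's own MATLAB code in the appendices), the offline LP can be sketched in Python with scipy.optimize.linprog on hypothetical toy data; the state prices fall out as the duals of the per-state worst-case constraints:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy market: m = 3 states, n = 4 bids.
# Row j of A marks the states in which bid j pays out one unit per share.
A = np.array([[1, 0, 0],
              [0, 1, 1],
              [1, 1, 0],
              [0, 0, 1]], dtype=float)          # n x m
pi = np.array([0.40, 0.55, 0.70, 0.35])          # bid prices
q = np.array([10.0, 5.0, 8.0, 12.0])             # share limits
n, m = A.shape

# Variables z = [x_1..x_n, y]; linprog minimizes, so negate the profit.
c = np.concatenate([-pi, [1.0]])
# Worst-case constraint per state i: sum_j a_ij x_j - y <= 0.
A_ub = np.hstack([A.T, -np.ones((m, 1))])
b_ub = np.zeros(m)
bounds = [(0.0, qj) for qj in q] + [(None, None)]  # x in [0, q], y free

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
p = -res.ineqlin.marginals        # state prices = duals of the state rows
print("worst-case profit:", -res.fun)
print("state prices:", p, "sum:", p.sum())
```

With the HiGHS backend, `res.ineqlin.marginals` holds the duals of the inequality rows; for a minimization they are nonpositive, so the state prices are their negatives, and the free variable y forces them to sum to 1.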
Solving the first two games gives the following state prices:

State:         1      2      3      4      5      6      7      8
Notre Dame:    0      0      .27    .43    .3     0      0      0
California:    .001   .01    0      .119   .68    .12    .06    .01

In these games, Stanford beat Notre Dame 20-13 (state 5) and beat California 21-3 (state 7). These state-price distributions are quite bad, since a state price of zero implies that any bet on that state will be accepted; clearly, however, every outcome has some nonzero probability. We attribute the discrepancy to the small number of students (on the order of 100), an undervaluing of the extreme outcomes of the game, and a lack of understanding of the market design. Indeed,
in later games, the state prices more accurately resembled what an expected probability distribution would be.

2 Uniqueness

The question arises whether the state prices are necessarily unique. In fact, under the current formulation they are not guaranteed to be. Consider the case of 2 teams and 1 bid on both states, a_1 = (1, 1), with any price π_1:

max π_1 x_1 − y                      min q_1 s_1
s.t.  x_1 − y ≤ 0                    s.t.  p_1 + p_2 + s_1 ≥ π_1
      x_1 − y ≤ 0                          −p_1 − p_2     = −1
      x_1     ≤ q_1
x_1 ≥ 0, y free                      p_1, p_2, s_1 ≥ 0

The first two columns of the dual constraint matrix are identical: the states 1 and 2, and hence the dual variables p_1 and p_2, are 'entangled'. With a linear objective, every point on p_1 + p_2 = 1, p_1, p_2 ≥ 0 is optimal. To see why, suppose for simplicity that π_1 < 1. The primal and dual objectives are then 0 by strong duality, so s_1 = 0, leaving p_1 + p_2 ≥ π_1 and p_1 + p_2 = 1. The first constraint is redundant since π_1 < 1, so every point with p_1 + p_2 = 1 is optimal.

Theorem. If there is a set of states S such that every bet that includes one state of S includes all of S, the dual has infinitely many solutions.

Proof. The rows of the primal corresponding to the states in S are identical, so the dual has identical columns corresponding to S. Since the objective is linear, every combination with Σ_{i∈S} p_i = 1 − Σ_{j∉S} p_j and p ≥ 0 is an optimal solution to the dual.

It is natural to break this symmetry by choosing the midpoint of the set of optimal solutions, since a corner point is more sensitive than the midpoint to slight changes in the parameters; the midpoint is, in a sense, the most 'robust' state price. We can select it in a manner similar to choosing the analytic center in an interior-point method: since we are maximizing, we add a nondecreasing, strictly concave function of the slack variables to the objective.

3 Unified Framework

Here we formulate bid acceptance with an added function u(s), nondecreasing and strictly concave.
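The degeneracy in this two-state example can be checked directly; the following snippet (toy values, with π_1 < 1 as in the argument above) verifies that both extreme points and the midpoint attain the same dual objective:

```python
# Dual of the 2-state, 1-bid example:
#   min q1*s1  s.t.  p1 + p2 + s1 >= pi1,  p1 + p2 = 1,  p, s >= 0.
pi1, q1 = 0.7, 5.0   # hypothetical bid price (< 1) and share limit
for p1, p2 in [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]:
    s1 = max(0.0, pi1 - (p1 + p2))   # smallest feasible slack
    assert p1 + p2 == 1.0 and min(p1, p2) >= 0.0
    assert q1 * s1 == 0.0            # every point attains the optimal value 0
print("all three dual points are optimal")
```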
Since the feasible region remains unchanged (and convex) and we now maximize a concave objective, we still have a linearly constrained convex optimization problem, which can be solved efficiently.
max  Σ_{j=1}^n π_j x_j − y + u(s)
s.t. Σ_{j=1}^n a_ij x_j + s_i = y,   ∀i = 1, ..., m     (dual var p)
     0 ≤ x_j,                        ∀j = 1, ..., n     (dual var µ)
     x_j ≤ q_j,                      ∀j = 1, ..., n     (dual var ν)
     s_i ≥ 0,                        ∀i = 1, ..., m     (dual var η)

Letting u(s) = Σ_i u(s_i), we write down the KKT conditions.

Stationarity:
    ∂f/∂x_j :  π_j = Σ_{i=1}^m a_ij p_i − µ_j + ν_j,    ∀j = 1, ..., n
    ∂f/∂y   :  −1 = Σ_{i=1}^m (−p_i),  i.e.  Σ_i p_i = 1
    ∂f/∂s_i :  ∂u(s)/∂s_i = p_i − η_i,                  ∀i = 1, ..., m
Primal feasibility:
    0 ≤ x_j ≤ q_j,  ∀j = 1, ..., n;    0 ≤ s_i,  ∀i = 1, ..., m
Dual feasibility:
    p_i free, ∀i;    µ_j, ν_j ≥ 0, ∀j;    η_i ≥ 0, ∀i
Complementary slackness:
    µ_j x_j = 0, ∀j;    ν_j (q_j − x_j) = 0, ∀j;    η_i s_i = 0, ∀i

From [3] we see that u can be interpreted as a cost on 'slack': u is the value lost by not matching the worst case in each state. It is worth noting that a strictly concave increasing function is stable in the sense that the maximizer lies toward the center of the optimal set; it has diminishing returns, or risk aversion, 'built in'. From [3], the choice of u corresponds to different levels of risk aversion. The most risk-averse option is u = min(s), which recovers our original formulation (since min(s) = 0 there, with s ≥ 0) and cannot lose money in the worst case. Letting u = θ(s), where θ encodes a probability distribution over our beliefs, the problem becomes one of maximizing expected value. Furthermore, a strictly concave u generates a unique solution.

Theorem. If u(s) is strictly concave, the state prices are unique.

Proof. Suppose there were two different state-price vectors, corresponding to two different optimal solutions z^1, z^2 of the above problem. Since the constraints are linear equalities
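As a sketch of the unified framework (hypothetical toy data; the report's own implementation uses MATLAB/CVX, see the appendices), the concave program can be solved with SciPy's SLSQP and the state prices read off the slacks via p_i = u′(s_i). The sketch takes u(s) = (w/m)(1 − e^{−s}) with w = m, which for this data keeps the optimal slacks strictly positive, so the prices sum to 1 by the stationarity condition in y:

```python
import numpy as np
from scipy.optimize import minimize

# Same hypothetical toy market as before: n = 4 bids over m = 3 states.
A = np.array([[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]], dtype=float)
pi = np.array([0.40, 0.55, 0.70, 0.35])
q = np.array([10.0, 5.0, 8.0, 12.0])
n, m = A.shape
w = float(m)                           # u(s) = (w/m)(1 - exp(-s))

def neg_objective(z):
    x, y = z[:n], z[n]
    s = y - A.T @ x                    # slack in each state constraint
    return -(pi @ x - y + (w / m) * np.sum(1.0 - np.exp(-s)))

constraints = [{"type": "ineq", "fun": lambda z: z[n] - A.T @ z[:n]}]  # s >= 0
bounds = [(0.0, qj) for qj in q] + [(None, None)]                      # y free
res = minimize(neg_objective, np.zeros(n + 1), bounds=bounds,
               constraints=constraints, method="SLSQP")
x, y = res.x[:n], res.x[n]
p = (w / m) * np.exp(-(y - A.T @ x))   # state prices p_i = u'(s_i)
print("state prices:", p, "sum:", p.sum())
```

Unlike the LP, the duals here need no solver support: since the slacks are interior, p follows directly from the stationarity condition ∂u/∂s_i = p_i.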
and inequalities, the feasible region is convex, so z* = αz^1 + (1−α)z^2 is feasible for 0 ≤ α ≤ 1. The objective is linear in (x, y) and strictly concave in s, so if the two solutions had different slack vectors, then f(z*) > αf(z^1) + (1−α)f(z^2) for 0 < α < 1, violating the optimality of z^1 and z^2. Hence all optimal solutions share the same s, and the state prices, which are determined by s through the stationarity condition ∂u/∂s_i = p_i − η_i, must be unique.

3.1 Parallels to Underconstrained Equations

We can draw a parallel to underconstrained systems of linear equations. If such a system is consistent, it has infinitely many solutions; to single one out, the minimum-norm solution is selected. This is again convex optimization, since both the objective and the feasible region are convex.

Theorem. The norm is a convex function.

Proof. By the triangle inequality and homogeneity, ||αx + (1−α)y|| ≤ ||αx|| + ||(1−α)y|| = α||x|| + (1−α)||y||. (Equality in the triangle inequality holds only when x = cy for some c ≥ 0, or y = 0.)

Theorem. The set of solutions to a linear system Ax = b is an affine set.

Proof. If x̂ is any solution, the solution set is x̂ + N(A), where N(A) is the null space of A: a translate of a subspace, and hence an affine set.

Although the norm is not strictly convex, we can still establish uniqueness for the Euclidean norm.

Theorem. Minimizing the Euclidean norm over the solutions of a consistent system gives a unique solution.

Proof. Write the solution set as x̂ + N(A), and choose x̂ perpendicular to N(A) (project any solution onto the orthogonal complement of the null space). Then for any v ∈ N(A), ||x̂ + v||^2 = ||x̂||^2 + ||v||^2 by the Pythagorean theorem, which is minimized uniquely at v = 0. In the case b = 0, x = 0 is the unique minimum-norm solution.

4 Online Problem

Up to this point we have assumed that all bids arrive before our decision has to be made. In practice, however, bettors would like to know in real time whether their bids are accepted or rejected.
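Before moving on, the minimum-norm facts of Section 3.1 above are easy to check numerically (a sketch with NumPy/SciPy on a hypothetical 2x3 system):

```python
import numpy as np
from scipy.linalg import null_space

# Underdetermined, consistent system (hypothetical numbers).
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])
b = np.array([6.0, 2.0])

x_min = np.linalg.pinv(A) @ b          # minimum Euclidean-norm solution
N = null_space(A)                      # basis for N(A), here 1-dimensional

assert np.allclose(A @ x_min, b)       # it solves the system
assert np.allclose(N.T @ x_min, 0.0)   # and is perpendicular to N(A)

# Any other solution is a null-space translate and is strictly longer:
x_other = x_min + N @ np.array([1.5])
assert np.allclose(A @ x_other, b)
assert np.linalg.norm(x_other) > np.linalg.norm(x_min)
```

The pseudoinverse solution is exactly the x̂ of the proof: the unique solution lying in the row space, perpendicular to the null space.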
Then, we can modify our formulation to form an online program:

max  π_k x_k − y + Σ_i u(s_i)
s.t. Σ_{j=1}^k a_ij x_j + s_i = y,   ∀i = 1, ..., m     (dual var p)
     0 ≤ x_k                                            (dual var µ)
     x_k ≤ q_k                                          (dual var ν)
     s_i ≥ 0,   ∀i = 1, ..., m                          (dual var η)
Following [2], for simplicity let b^{k−1} = Σ_{j=1}^{k−1} a_j x_j; that is, b^{k−1} is the vector of outstanding shares in each state up to the current time. Then our formulation simplifies to:

max  π_k x_k − y + Σ_i u(y − a_ik x_k − b_i^{k−1})
s.t. 0 ≤ x_k          (dual var µ)
     x_k ≤ q_k        (dual var ν)

There are only two variables, x_k and y, and the KKT conditions become:

Stationarity:
    ∂f/∂x_k :  π_k − Σ_i a_ik u′(y − a_ik x_k − b_i^{k−1}) − ν_k + µ_k = 0
    ∂f/∂y   :  −1 + Σ_i u′(y − a_ik x_k − b_i^{k−1}) = 0                       (1*)
Primal feasibility:   0 ≤ x_k ≤ q_k
Dual feasibility:     µ_k, ν_k ≥ 0
Complementary slackness:
    x_k µ_k = 0  →  x_k (π_k − Σ_i a_ik u′(y − a_ik x_k − b_i^{k−1}) − ν_k) = 0  (2*)
    (q_k − x_k) ν_k = 0

The problem can now be solved as a series of individual two-variable optimizations, updating b after each bid. From [2], p_i = u′(y − a_ik x_k − b_i^{k−1}) gives the state prices after each iteration, and each arriving bid can be processed as follows:

1. If the internal price of the bid is strictly higher than its bid price, reject it immediately. Otherwise, go to the next step.
2. Update the state prices. If the bid price is higher than the new internal price, accept the bid up to q_k shares (from complementarity) and update b and p. Otherwise,
3. Find the quantity at which the new internal price exactly matches the bid price, by solving the system of two equations (1*) and (2*) from the KKT conditions. This is explained thoroughly in the lecture notes [2].

Moreover, let O(f(m, n)) be the complexity of solving the offline formulation for m states and n bets. The online formulation then takes O(n f(m, 2) + nm) time, where nm is the cost of updating the vector b over n bids; this is comparable to the offline problem. The memory required is O(m), which is quite small even when a huge number of bids are made. Indeed, it was already noted in [3] that online linear programs can be used as a way to solve extremely large linear programs.
The online formulation can therefore be used to solve problems whose number of columns is far larger than the available memory. It is interesting that, depending on the choice of u, the worst case is bounded differently. In the most risk-averse case, u = min(s), the worst case is 0, but the online formulation will never accept any realistic bids. By varying u, a tradeoff can be made between expected gain and risk aversion.
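The per-bid loop can be sketched in Python (hypothetical random bids; the report's own online code, in MATLAB/CVX, appears in Appendix B). Rather than the explicit three-case rule, the sketch solves the equivalent two-variable problem for each arriving bid with SLSQP and reads the updated state prices from the slacks:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
m = 4
w = float(m)            # u(s) = (w/m)(1 - exp(-s)); w = m keeps slacks interior

b = np.zeros(m)         # outstanding shares sold in each state so far

def process_bid(a, price, limit, b):
    """Solve the two-variable online problem for one arriving bid."""
    def neg(z):
        xk, y = z
        s = y - a * xk - b
        return -(price * xk - y + (w / m) * np.sum(1.0 - np.exp(-s)))
    cons = [{"type": "ineq", "fun": lambda z: z[1] - a * z[0] - b}]  # s >= 0
    res = minimize(neg, [0.0, b.max() + 1.0],
                   bounds=[(0.0, limit), (None, None)],
                   constraints=cons, method="SLSQP")
    xk, y = res.x
    p = (w / m) * np.exp(-(y - a * xk - b))   # updated state prices
    return xk, b + a * xk, p

for _ in range(20):
    a = rng.integers(0, 2, m).astype(float)   # states this bid pays out on
    price = rng.uniform(0.1, 0.9)
    limit = float(rng.integers(2, 10))
    xk, b, p = process_bid(a, price, limit, b)

print("final state prices:", p, "sum:", p.sum())
```

Only the running vector b is carried between bids, matching the O(m) memory claim above.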
5 Truthfulness

It is desirable for our system that bets be truthful. Suppose a player makes bet k and is sold x_k* shares. We charge χ(0) − χ(x_k*), where

χ(x) = max_{y,s}  −y + u(s)
       s.t.  s_i = y − b_i^{k−1} − a_ik x,   ∀i = 1, 2, ..., m

We will maximize an arbitrary bidder's profit over the reported price π̂_k and show that at optimality it equals his/her real valuation π_k. The profit is

π_k x_k* − c_k = π_k x_k* − (χ(0) − χ(x_k*))
               = π_k x_k* + χ(x_k*) − χ(0)
               ≡ π_k x_k* + χ(x_k*)      (χ(0) is constant with respect to π̂_k)

Note that χ has the same constraints as the online problem, since

Σ_{j=1}^k a_ij x_j + s_i ≡ b_i^{k−1} + a_ik x_k + s_i = y,   ∀i = 1, 2, ..., m.

At optimality, χ(x_k*) therefore attains the same (y, s) as the optimum z* of the online problem; the only difference is the single term π̂_k x_k* in the objective, so χ(x_k*) = z* − π̂_k x_k*. Then:

π_k x_k* − c_k = π_k x_k* + z* − π̂_k x_k* − χ(0).

For any report, the profit is thus π_k x_k* + χ(x_k*) − χ(0), while the mechanism chooses x_k* to maximize π̂_k x_k + χ(x_k) over 0 ≤ x_k ≤ q_k. Only the truthful report π̂_k = π_k makes the bidder's profit and the mechanism's selection criterion coincide, so if the player bids truthfully, his own profit-maximization problem becomes exactly aligned with the pricing and online optimization problems; to bet differently would be suboptimal. This is a special case of the VCG mechanism, where the price charged equals the value lost to the rest of the players [5].
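The truthfulness argument can be spot-checked numerically. The sketch below (all numbers hypothetical) charges χ(0) − χ(x*) as above, sweeps a grid of reported prices, and confirms that reporting the true valuation maximizes the bidder's profit up to solver tolerance:

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar

# Hypothetical snapshot: m = 3 states, outstanding shares b, one arriving
# bid paying on states 1 and 3, with share limit q and true valuation 0.8.
m = 3
w = float(m)                                   # u(s) = (w/m)(1 - exp(-s))
b = np.array([1.0, 0.5, 0.0])
a = np.array([1.0, 0.0, 1.0])
q = 5.0
pi_true = 0.8
u = lambda s: (w / m) * (1.0 - np.exp(-s))

def chi(x):
    """chi(x) = max_y  -y + sum_i u(y - b_i - a_i x)   (one-dimensional)."""
    lo = float((b + a * x).max())              # smallest y keeping s >= 0
    res = minimize_scalar(lambda y: y - u(y - b - a * x).sum(),
                          bounds=(lo, lo + 50.0), method="bounded")
    return -res.fun

def allocation(reported):
    """Shares granted by the online problem for a given reported price."""
    def neg(z):
        xk, y = z
        return -(reported * xk - y + u(y - b - a * xk).sum())
    cons = [{"type": "ineq", "fun": lambda z: z[1] - b - a * z[0]}]
    res = minimize(neg, [0.0, b.max() + 1.0],
                   bounds=[(0.0, q), (None, None)],
                   constraints=cons, method="SLSQP")
    return float(res.x[0])

def profit(reported):
    x = allocation(reported)
    return pi_true * x - (chi(0.0) - chi(x))   # value minus VCG-style charge

reports = np.linspace(0.1, 0.9, 17)
best = max(profit(r) for r in reports)
print("truthful profit:", profit(pi_true), "best over grid:", best)
```

Over- or under-reporting only moves the granted allocation away from the maximizer of the bidder's true objective π_k x + χ(x), so the truthful profit is (weakly) the largest on the grid.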
6 Recovery of the Grand Truth

As we have seen, the formulations return state prices, which can be read as the bettors' collective belief distribution. If there is a hidden probability distribution that the bettors are aware of, a grand truth, we hope to converge to it after enough bets.

Theorem. Convergence to the grand truth is a necessary condition for bounding the loss.

Proof. Assume a method does not converge to the grand truth. Then there is always an accepted bet whose expected payout under the grand truth exceeds its bid price, i.e. π(x) < p^T x. Consider the series of losses due to bets of this type: if the bid prices do not converge to the grand truth, this series diverges.

From [1] we know that the choice of u(s) puts different bounds on the worst-case loss, which can be seen as the integral of the difference between the state prices and the grand truth; the choice of u(s) thus influences the rate of convergence. Here we investigate different u, again using the data from Stanford students betting on game outcomes. We use the parameters w = 1 and w = 10, where b = w/m and m is the total number of states. For logarithmic scoring, the worst-case loss is unbounded [1]; for exponential scoring, it is b log N [1]. Running the online simulation of the first two games with w = 1 and w = 10 for both scoring functions gives:

                            Log, w=1   Exp, w=1   Log, w=10   Exp, w=10
Notre Dame, total shares       135        48        2012         221
Notre Dame, total bids          58        23          59          59
Berkeley, total shares        7070      6488        7606        6695
Berkeley, total bids           123       127         122         127

We considered a bid 'accepted' if we granted more than 0.001 shares. Although the numbers of accepted bids are quite similar (at least for Berkeley), the total shares sold follow a clear trend: w = 10 is less risk averse and accepts more shares.
Similarly, logarithmic scoring appears to be less risk averse than exponential scoring, which makes sense given that the exponential function bounds the worst case. There is thus a clear tradeoff between risk aversion and total shares accepted. On state-price convergence, we see the following trend for state 1 in Stanford vs. Notre Dame (the offline problem has solution p_1 = 0).
All the functions converge to the original offline solution of p_1 = 0. However, which function works best is not clear in this case: the logarithm initially converges faster, but takes longer after bid 40. We also note that smaller b appears to converge faster here, though it is not clear what happens in general.

6.1 Regression

Under a model of the grand truth with Gaussian perturbations (noise), least squares regression can be shown to recover the grand truth [4].

Conclusion

The prediction market is an interesting emerging field of research. While traditional approaches to prediction, such as regression, remain in use, market-based approaches can be nearly self-funding while also serving as a source of information. More theoretical results are needed, for example to prove recovery of a grand truth; in comparison, least squares regression can provably recover it.
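As an aside on Section 6.1, the least-squares recovery claim is easy to simulate (hypothetical data: random bundle bets priced at their grand-truth value plus Gaussian noise):

```python
import numpy as np

rng = np.random.default_rng(1)
p_true = np.array([0.2, 0.5, 0.3])     # hidden "grand truth" distribution

# Each bettor prices a random bundle of states at its grand-truth value,
# perturbed by Gaussian noise.
n = 5000
bundles = rng.integers(0, 2, (n, 3)).astype(float)
prices = bundles @ p_true + rng.normal(0.0, 0.05, n)

p_hat, *_ = np.linalg.lstsq(bundles, prices, rcond=None)
print("recovered state distribution:", p_hat)
```

With enough bets, the least-squares estimate concentrates around the grand truth at the usual 1/sqrt(n) rate.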
References

[1] S. Agrawal, E. Delage, M. Peters, Z. Wang, and Y. Ye. A Unified Framework for Dynamic Pari-mutuel Information Market Design. In Proceedings of the 10th ACM Conference on Electronic Commerce, 2009.
[2] Y. Ye. MS&E 211 Lecture Notes, Stanford University, 2012.
[3] S. Agrawal, Z. Wang, and Y. Ye. A Dynamic Near-Optimal Algorithm for Online Linear Programming, 2009.
[4] A. Ng. CS 229 Lecture Notes, Stanford University, 2012.
[5] en.wikipedia.org/wiki/Vickrey-Clarke-Groves_auction
Appendix A: Code for Problem 1

% Read the Excel data into matrices
num1 = xlsread('NotreDame.xlsx');
num2 = xlsread('California.xlsx');

% Build the constraint matrix [A' -e] over the variables [x; y]
A1 = num1(:,3:10);
A1 = A1';
[m1, n1] = size(A1);
A1 = [A1 -1*ones(m1,1)];
f1 = num1(:,2);
f1 = [f1; -1];
f1 = -f1;                      % linprog minimizes, so negate the profit
b1 = zeros(m1,1);
lb1 = [zeros(n1,1); -Inf];     % x >= 0, y free
ub1 = [num1(:,1); Inf];        % x <= q, y unbounded above
[x1, fval1, exitflag1, output1, lambda1] = linprog(f1,A1,b1,[],[],lb1,ub1);
% Print the state prices (duals of the worst-case constraints)
lambda1.ineqlin

A2 = num2(:,3:10);
A2 = A2';
[m2, n2] = size(A2);
A2 = [A2 -1*ones(m2,1)];
f2 = num2(:,2);
f2 = [f2; -1];
f2 = -f2;
b2 = zeros(m2,1);
lb2 = [zeros(n2,1); -Inf];
ub2 = [num2(:,1); Inf];
[x2, fval2, exitflag2, output2, lambda2] = linprog(f2,A2,b2,[],[],lb2,ub2);
% Print the state prices
lambda2.ineqlin
Appendix B: Code for Problem 6, Online Version

clc
clear all
% Choose either Type 1 or Type 2 data generation; comment out the other.
% Type 1 - Data from auction games
% num = xlsread('NotreDame.xlsx');
% num = xlsread('California.xlsx');
% numStates = 8;
% b = 1;
% Type 2 - Random data generation
numBids = 100;
numStates = 8;
b = 1;
% num is the overall data matrix: [share limit, bid price, payout states]
num = zeros(numBids,numStates+2);
for i = 1:numBids
    num(i,1) = randi([2,10],1);      % share limit
    num(i,2) = rand(1);              % bid price
    for j = 1:numStates
        num(i,j+2) = randi([0 1],1); % states the bid pays out on
    end
end
A = num(:,(3:numStates+2));
A = A';
[m n] = size(A);
f = num(:,2);
q = num(:,1);
% Solve for the first bid
cvx_begin
    variable xt
    variable y
    variable s(m)
    dual variable P
    maximize( f(1,1)*xt - y + (b/m)*sum(1-exp(-s)) );
    subject to
        P: A(:,1)*xt + s - y*ones(m,1) == 0;
        xt >= 0;
        xt <= q(1,1);
        s >= 0;
cvx_end
if (xt < 10^-5)
    xt = 0;
end
x_old = xt;
A_old = A(:,1);
Price = P;
% Process the remaining bids one at a time, updating the outstanding shares
for i = 2:n
    cvx_begin
        variable xt
        variable y
        variable s(m)
        dual variable P
        maximize( f(i,1)*xt - y + (b/m)*sum(1-exp(-s)) );
        subject to
            P: A(:,i)*xt + s - y*ones(m,1) == -A_old * x_old;
            xt >= 0;
            xt <= q(i,1);
            s >= 0;
    cvx_end
    if (xt < 10^-5)
        xt = 0;
    end
    x_old = [x_old; xt];
    A_old = [A_old, A(:,i)];
    Price = [Price P];
end
Price = -Price';

Appendix C: Code for Problem 6, Offline Version

clc
clear all
num = xlsread('NotreDame.xlsx');
% num = xlsread('California.xlsx');
A = num(:,3:10);
A = A';
[m n] = size(A);
f = num(:,2);
q = num(:,1);
cvx_begin
    variable x(n)
    variable y
    variable s(m)
    dual variable P
    b = 0.1;
    maximize( f'*x - y + (b/m)*sum(log(s)) )
    subject to
        P: A*x + s - y*ones(m,1) == 0
        x >= 0
        x <= q
        s >= 0
cvx_end