               Partial Identification with Missing Data

                 Laurent Davezies and Xavier d’Haultfœuille
                               CREST, Paris



Outline



       Introduction


       Formalization of the problem


       Main result


       Application


       Conclusion



Partial Identification
       The literature on missing data has traditionally focused on point
       identification, at the price of imposing often implausible
       assumptions (missing at random, exclusion restrictions, parametric
       models...).
       In the 1990s and 2000s, Manski showed that it is possible to
       weaken these conditions and still obtain informative bounds on
       parameters of interest in many missing data problems.
       The literature on partial identification is now large and applies to
       many other settings:
           limited dependent variable models: Chesher (2010), Chesher et
           al. (2011), Bontemps et al. (2012)...
           panel data models: Honoré and Tamer (2006), Chernozhukov
           et al. (2012), Rosen (2012)...
           incomplete models: Ciliberto and Tamer (2009), Galichon and
           Henry (2011), Beresteanu et al. (2012)...



Goal of this work

       In missing data problems, partial identification often involves
       infinite-dimensional optimization, which may be intractable both
       theoretically and computationally.
       For specific models and parameters, closed-form bounds on the
       identified set have been derived by "guess and verify", but these
       methods are specific to each problem.
       We show that for a large class of missing data problems (including
       models with unobserved heterogeneity) and parameters, bounds
       can be obtained by an optimization on a far smaller set than the
       initial one, often making the optimization tractable.
       This generalizes results of Chernozhukov et al. (2012) and
       D'Haultfœuille and Rathelot (2012). Also related to Balke and
       Pearl (1997), Honoré and Tamer (2006) and Freyberger and
       Horowitz (2012), but in an infinite-dimensional setting.



General framework
       We are interested in a parameter θ0 that depends on P0 , the
       probability measure of a (partly) unobserved r.v. U. Instead of U, we
       observe the r.v. O (which is related to U), whose probability
       measure is Q0 . This restricts the set of distributions of U that are
       compatible with Q0 . Moreover, one can impose additional
       restrictions (coming from theory) on the distribution of U; some of
       these restrictions may depend on the value of the parameter θ.

               q(θ0 , P0 ) = 0: definition of the parameter θ0 , plus restrictions on
               P0 that depend on the value of θ0 .
               P0 ∈ R: restrictions on P0 that do not depend on θ0 .

       Assumption 1 (Framework)
       The true parameter θ0 and distribution P0 satisfy q(θ0 , P0 ) = 0,
       where q is known, and P0 ∈ R. These restrictions exhaust the
       information on (θ0 , P0 ).



General framework
       We are interested in Θ0 , the identification region of θ0 :

                     Θ0 = cl{θ ∈ Θ : ∃P ∈ R : q(θ, P) = 0}.

       We restrict our framework by the following assumption:
       Assumption 2 (Convex restriction)
       Rθ = {P ∈ R : q(θ, P) = 0} is convex for every θ ∈ Θ.
       This holds for every problem considered in practice (to the best of
       our knowledge).
       We also provide more precise results when Assumption 2 is
       replaced by the following condition.
       Assumption 3 (Convex restriction and linear parameter)
       R is convex and closed for weak convergence. Moreover,
       q(θ, P) = θ − ∫ f(u)dP(u), with f a known (or identifiable) real
       function satisfying ∫ |f(u)|dP0(u) < ∞.



General framework


       Example 1: missing data with a known link.
       We are interested in a moment of U, θ0 = ∫ f(u)dP0(u), but we do
       not observe U, only O = s(U), where s is known and, in general,
       noninjective (loss of information).

       This case covers for instance:
               sample selection model: U = (D, Y , X ) and O = (D, DY , X )
               treatment effects/Roy models/Ecological inference:
               U = (T , Y0 , Y1 , X ) and O = (T , YT , X )
               nonresponse on X : U = (D, Y , X ) and O = (D, Y , DX )



General framework

       Example 1: missing data with a known link (continued).
       In this case:

               Q0(A) = P(O ∈ A) = P(s(U) ∈ A) = ∫ 1{s(u) ∈ A} dP0(u).

       Then q(θ, P) = θ − ∫ f(u)dP(u), and R is the following set of
       probability distributions:

               { P : Q0(A) = ∫ 1{s(u) ∈ A} dP(u) for all measurable A,
                 and ∫ |f(u)| dP(u) < ∞ }.

       Alternatively, q and R can be adapted to other definitions of θ0
       (quantile, inequality index, regression coefficient...).
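
       To make q and R concrete: when U has finite support, R is a polytope
       defined by these linear constraints, and bounds on a moment
       θ0 = ∫ f(u)dP(u) solve two linear programs. A minimal sketch on a toy
       selection instance (the support, the link s, and all numbers below are
       hypothetical, not from the paper):

```python
# A minimal sketch (toy, hypothetical instance): with finite support for U,
# R is a polytope and the bounds on a linear parameter solve two LPs.
import numpy as np
from scipy.optimize import linprog

# Support of U = (D, Y) and the known link O = s(U) = (D, D*Y).
support_U = [(0, 0), (0, 1), (1, 0), (1, 1)]
s = lambda d, y: (d, d * y)

# Observed distribution Q0 of O (hypothetical numbers).
Q0 = {(0, 0): 0.4, (1, 0): 0.3, (1, 1): 0.3}

# Constraints defining R: for each observed o, sum of P(U=u) over {u : s(u)=o}
# equals Q0(o); probabilities are nonnegative.
A_eq = np.array([[1.0 if s(*u) == o else 0.0 for u in support_U] for o in Q0])
b_eq = np.array(list(Q0.values()))

# Objective f(u) = y, i.e. theta0 = E[Y].
f = np.array([y for (_, y) in support_U])

lo = linprog(c=f, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1)).fun
hi = -linprog(c=-f, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1)).fun
print(lo, hi)  # Manski-type bounds [0.3, 0.7] on E[Y]
```

       The LP optima are attained at vertices of the polytope; Theorem 1 below
       shows that optimizing over extreme points suffices far more generally.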



General framework




       Example 2: unobserved heterogeneity (details in the
       supplementary material).

       Example 3: incomplete models and games with multiple equilibria
       (details in the supplementary material).



Extreme points of convex sets of distributions


       It is difficult to compute Θ0 = {θ ∈ Θ : Rθ ≠ ∅} directly. We look
       for simplifications of this problem.
       For a closed, convex set C, let ext(C) denote the set of extreme
       points of C, i.e., the elements of C that are not a nontrivial mixture
       of elements of C.
       Theorem 1 (Main result)
          1. Under Assumptions 1 and 2,
             Θ0 = {θ ∈ Θ : ext(Rθ) ≠ ∅}.
          2. Moreover, if Assumption 3 also holds, then:
              inf Θ0 = inf_{P ∈ ext(R)∩I(f)} ∫ f(u)dP(u),
              sup Θ0 = sup_{P ∈ ext(R)∩I(f)} ∫ f(u)dP(u),
             where I(f) denotes the set of P such that ∫ |f(u)|dP(u) < ∞.
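
       To see part 2 concretely: in the simplest selection model, with no
       restriction beyond a known support [y_lo, y_hi] for Y, the extreme
       points of R place a Dirac mass on the unobserved part Y | D = 0, so the
       inf/sup over ext(R) reduces to scanning a single imputation value. A
       minimal sketch on simulated, hypothetical data:

```python
# A minimal sketch (hypothetical data) of Theorem 1, part 2: extreme points
# put a Dirac mass on Y | D = 0, so the optimization is a scan over one
# imputation value y0 in the (assumed known) support [y_lo, y_hi] of Y.
import numpy as np

rng = np.random.default_rng(0)
y_lo, y_hi = 0.0, 1.0                      # assumed bounded support of Y
y = rng.uniform(y_lo, y_hi, 1000)          # latent outcomes (simulated)
d = rng.random(1000) < 0.6                 # selection indicator

p1 = d.mean()                              # P(D = 1)
m1 = y[d].mean()                           # E[Y | D = 1], identified

# E[Y] = P(D=1) E[Y|D=1] + P(D=0) y0, with y0 the Dirac location:
theta_lo = p1 * m1 + (1 - p1) * y_lo
theta_hi = p1 * m1 + (1 - p1) * y_hi
print(theta_lo, theta_hi)                  # worst-case bounds on E[Y]
```

       The two endpoints are the familiar Manski worst-case bounds, recovered
       here as an optimization over Dirac extreme points.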



Extreme points of convex sets of distributions



       In finite dimension, closed, bounded, convex sets are the convex
       hulls of their extreme points.
       When the distribution P0 is known to be concentrated on a finite
       number of points of Rk , Rθ is included in a finite-dimensional
       vector space, and in this case the result is straightforward.
       We extend this result to the case where P0 is concentrated on an
       arbitrary closed subset of Rk , where Rθ is infinite dimensional.



Extreme points of convex sets of distributions

       In infinite dimension, closed, bounded, convex sets are not
       characterized by their extreme points.
                [No extreme points] Let K denote the set of real-valued continuous functions f
                on [0; 1] such that supx∈[0;1] |f (x)| ≤ 1 and f (0) = 0. K is a bounded, closed
                and convex set for the supremum norm in the Banach space of continuous
                functions from [0; 1] to R. However, ext(K) is empty.
                [No closure of convex hull] Let K be the set of real-valued continuous functions
                f on [−1; 1] such that supx∈[−1;1] |f (x)| ≤ 1. K is a bounded, closed and
                convex subset of a Banach space, and
                ext(K) = {f : f (x) = 1 for all x ∈ [−1; 1] or f (x) = −1 for all x ∈ [−1; 1]},
                so cl(co(ext(K))) ≠ K.
                [No continuity of linear forms] Linear forms are not necessarily continuous in
                infinite-dimensional spaces, so even if K = cl(co(ext(K))) and l is a linear form
                on K, one can have:
                                             sup_{x∈K} l(x) ≠ sup_{x∈ext(K)} l(x).
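
        For the first counterexample, the standard argument, included here for
        completeness:

```latex
% Why ext(K) is empty in the first counterexample (standard argument):
Let $f \in K$. Since $f$ is continuous and $f(0) = 0$, there is $\delta > 0$
with $|f(x)| \le 1/2$ on $[0, \delta]$. Choose a continuous $g \not\equiv 0$
with $g(0) = 0$, $\operatorname{supp}(g) \subset [0, \delta]$ and
$\|g\|_\infty \le 1/2$. Then $f \pm g \in K$, $f + g \ne f - g$, and
\[ f = \tfrac{1}{2}\,\bigl( (f + g) + (f - g) \bigr), \]
so $f$ is not an extreme point. Hence $\operatorname{ext}(K) = \emptyset$.
```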

        Proof: see the steps in the supplementary material.



Sample selection model (SSM) without exclusion restriction

       We are interested in the distribution of (Y , X ), but we only observe
       a sample of (D, DY , X ), with D = 1 if Y is observed and D = 0
       otherwise.
       In such a case:
               MAR: D ⊥⊥ Y | X ⇒ point identification of PY ,X ,D .
               Standard exclusion restriction: Y ⊥⊥ X ; if there exists x such
               that P(D = 1|X = x) = 1 ⇒ point identification of PY ,X ,D ,
               otherwise only partial identification.
               Nonstandard restriction: D ⊥⊥ X | Y ; if Y and X are
               sufficiently dependent (rank condition or completeness
               condition) ⇒ point identification, otherwise partial
               identification.



SSM without exclusion restriction


       Instead of exclusion restrictions, we assume monotonicity conditions:
       Assumption 4 (Monotonicity in X : MX)
       x → E(D|Y , X = x) is increasing almost surely.

       Assumption 5 (Monotonicity in Y : MY)
       y → E(D|Y = y , X ) is increasing almost surely.

       θ0 is defined by a finite number of moments of (Y , X ) (e.g.,
       regression or quantile). In this case Rθ can be deduced from R.
       We first assume that Supp(X ) = {x1 , ..., xJ }, while Y may have
       any support.



SSM with monotonicity on X

       Instead of R, we can consider C, the set of possible probability
       distributions of Y | D = 0, X = xj for j = 1, ..., J.
       We show that no constraint is imposed on P_{Y|D=0,X=xJ}; its extreme
       points are thus simply Dirac measures. Then we show that

           f_{Y|D=0,X=xj}(y) = r_{xj+1,xj}(y) [ρ(y, xj)/ρ(y, xj+1)] f_{Y|D=0,X=xj+1}(y),   (1)

       with ρ(y, x) = 1/P(D = 1|Y = y, X = x) − 1 and

           r_{xi,xj}(y) = [P(D = 1|X = xj) P(D = 0|X = xi) f_{Y|D=1,X=xj}(y)] /
                          [P(D = 1|X = xi) P(D = 0|X = xj) f_{Y|D=1,X=xi}(y)].

       By MX, the ratio in (1) is greater than one. Thus,

           f_{Y|D=0,X=xj}(y) = r_{xj,xj+1}(y) f_{Y|D=0,X=xj+1}(y) + qj(y), with qj ≥ 0.    (2)



SSM with monotonicity on X

       qj may be seen as the density of a (nonprobability) measure Qj .
       Because Qj is unrestricted, its extreme points are weighted Dirac
       measures.
       Then, by induction, the extreme points of P_{Y|D=0,X=xj} have at
       most J − j + 1 support points (of which J − j are common with
       P_{Y|D=0,X=xj+1}).
       Once the support points (y1 , ..., yJ ) ∈ Y^J have been chosen,
       the corresponding weights are given by (2). For instance,

           P_{Y|D=0,X=xJ−1} = r_{xJ−1,xJ}(yJ) δ_{yJ} + (1 − r_{xJ−1,xJ}(yJ)) δ_{yJ−1}.

       Thus ext(C) is parametrized by Y^J . Moreover, some support
       points can be discarded because they lead to negative weights.
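
       A minimal numerical sketch of this parametrization for J = 2,
       following the instance displayed above for P_{Y|D=0,X=xJ−1} (all
       primitives below — the densities f_{Y|D=1,X=xj}, the selection
       probabilities, and the target E[Y] — are hypothetical): candidate
       support points (y1, y2) index the extreme points, the weight follows
       from the formula above, and invalid weights are discarded.

```python
# A minimal sketch for J = 2 under MX (hypothetical primitives): extreme
# points of C are indexed by candidate support points (y1, y2); the weight
# follows from (2), and candidates with weights outside [0, 1] are discarded.
import numpy as np
from scipy.stats import norm

# Hypothetical identified primitives.
f1 = {1: norm(0.0, 1.0).pdf, 2: norm(0.5, 1.0).pdf}   # densities of Y|D=1,X=xj
p1 = {1: 0.5, 2: 0.7}                                  # P(D=1|X=xj)
px = {1: 0.4, 2: 0.6}                                  # P(X=xj)

def r(i, j, y):  # r_{xi,xj}(y) as defined on the previous slide
    return (p1[j] * (1 - p1[i]) * f1[j](y)) / (p1[i] * (1 - p1[j]) * f1[i](y))

def theta(y1, y2):
    """E[Y] at the extreme point with support points (y1, y2), or None."""
    w = r(1, 2, y2)                     # mass of delta_{y2} in P_{Y|D=0,X=x1}
    if not (0.0 <= w <= 1.0):
        return None                     # negative weight: discard candidate
    e_y_d0 = {1: w * y2 + (1 - w) * y1, # E[Y|D=0,X=x1]
              2: y2}                    # P_{Y|D=0,X=x2} unconstrained: Dirac
    e_y_d1 = {1: 0.0, 2: 0.5}           # E[Y|D=1,X=xj] (hypothetical)
    return sum(px[j] * (p1[j] * e_y_d1[j] + (1 - p1[j]) * e_y_d0[j])
               for j in (1, 2))

grid = np.linspace(-3, 3, 121)
vals = [t for y1 in grid for y2 in grid if (t := theta(y1, y2)) is not None]
print(min(vals), max(vals))             # bounds on E[Y] over ext(C)
```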



SSM with monotonicity on Y

       In this case, there is no constraint on the distribution of X , so one
       can reason conditionally on X = x.
       Because

           dP_{Y|D=0,X=x}(y) = [P(D = 1|X = x)/P(D = 0|X = x)] ρ(y, x) dP_{Y|D=1,X=x}(y),

       it suffices to find the extreme points over ρ(·, x).
       ρ(·, x) is decreasing and satisfies the integral equation

           ∫ ρ(y, x) dP_{Y|D=1,X=x}(y) = 1/P(D = 1|X = x) − 1.

       The extreme points over ρ(·, x) are Heaviside functions satisfying
       this "area restriction".



SSM with monotonicity on Y

                  Figure: An example of extremal element under MY — a step
                  function ρ(y, x) = 1/P(D = 1|Y = y, X = x) − 1 whose integral,
                  ∫ ρ(y, x) dF_{Y|D=1,X=x}(y), equals 1/P(D = 1|X = x) − 1.


       Proposition 1
       Under MY in the sample selection model, we have

       ext(C) = { (P_{Y|D=1,X=x1,Y≤y1} , ..., P_{Y|D=1,X=xJ,Y≤yJ} ) : (y1 , ..., yJ ) ∈ Y^J }.
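
       A minimal sketch of Proposition 1 on simulated (hypothetical) data:
       each extremal distribution of Y | D = 0, X = x is the observed
       Y | D = 1, X = x truncated at a threshold y*, so bounds on E[Y | X = x]
       follow from a scan over y*.

```python
# A minimal sketch (hypothetical data) of Proposition 1: under MY, extremal
# distributions of Y | D = 0, X = x are truncations of Y | D = 1, X = x at a
# threshold y*, so bounds on E[Y | X = x] come from a scan over y*.
import numpy as np

rng = np.random.default_rng(1)
y1 = rng.normal(0.5, 1.0, 5000)          # sample of Y | D = 1, X = x (simulated)
p1 = 0.6                                 # P(D = 1 | X = x), identified

def theta(y_star):
    """E[Y | X = x] when P_{Y|D=0,X=x} = P_{Y|D=1,X=x,Y<=y*}."""
    trunc = y1[y1 <= y_star]
    if trunc.size == 0:
        return None
    return p1 * y1.mean() + (1 - p1) * trunc.mean()

vals = [t for y_star in np.quantile(y1, np.linspace(0.01, 1.0, 100))
        if (t := theta(y_star)) is not None]
print(min(vals), max(vals))              # bounds on E[Y | X = x] under MY
```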



SSM with double monotonicity


       Still reasoning on ρ(·, ·), we must find the extreme points of the set
       of functions such that
               for all x, ρ(·, x) is decreasing;
               E(ρ(Y , x)|D = 1, X = x) = 1/P(D = 1|X = x) − 1;
               for all y , ρ(y , x) ≤ ρ(y , x′) whenever x ≥ x′.
       The extreme points are similar to those above but more difficult to
       characterize.
       One can show, for instance, that if X takes J values, then each ρ(·, xj)
       takes at most J values, but (ρ(·, x1 ), ..., ρ(·, xJ )) taken together take
       no more than 2J − 1 values.



SSM with double monotonicity

       Figure: An example of extremal elements under MX, MY and with J = 2 —
       two step functions ρ(·, x0) and ρ(·, x1), each satisfying its own area
       restriction ∫ ρ(y, xi) dF_{Y|D=1,X=xi}(y) = 1/P(D = 1|X = xi) − 1.

       In the end, ext(C) is parametrized by R^{2J−1} × Y^{J(J−1)} .



Extensions




               If #Supp(X ) = +∞ and Assumption MX holds for X , it also
               holds for the discretized covariate Xn = Σ_{i=1}^{n} i 1{X ∈ [σ(i); σ(i+1)[},
               with −∞ = σ(1) < ... < σ(n + 1) = +∞; we then obtain Θ0n , an
               outer region for Θ0 .
               In this case, we give technical conditions under which
               Θ0 = ∩_{n∈N} Θ0n .
               The results can be extended to several covariates.



Conclusion




               Still work in progress; comments are welcome.
               Additional results: links with the methodologies used by
               Beresteanu et al. (random set theory) and by Galichon et al.
               (optimal transport) for problems where all the constraints can
               be written as moment conditions.
               More generally, we can also use our result when constraints are
               not given by moment conditions, as in our application.
Supplementary material



      Example 2: unobserved heterogeneity. In this case, we suppose
      that the distribution of O conditional on the unobserved
      heterogeneity U is known. Then P^{O|U}(A|U = u, θ) is known (from
      the model) and P^O is known (from the data).

          q(θ, P) = max( sup_A [ ∫ P^{O|U}(A|u, θ) dP(u) − P(O ∈ A) ], ‖ ∫ g(u, θ) dP(u) ‖ )

      This covers the semiparametric nonlinear panel model:
      O = ((Yt)t=1...T , (Xt)t=1...T ), U = ((Xt)t=1...T , α),
      Yit = 1{Xit β0 + αi + εit ≥ 0}, where the (εt)t=1...T are i.i.d.,
      independent of (X , α), with a known distribution, and β0 is a
      subvector of θ0 .
      If θ0 = β0 , then g(u, θ) = 0; if θ0 = (β0 , ∆0 ), where ∆0 is the
      average marginal effect of a binary covariate X1 , then

          g(x1 , x2 , a, β, ∆) = E(Yt |X1t = 1, X2t = x2 , α = a, β) − E(Yt |X1t = 0, X2t = x2 , α = a, β) − ∆.
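
      As an illustration of the known conditional model P^{O|U} in this panel
      example, a minimal sketch assuming probit errors (the design values
      x, α, β below are hypothetical):

```python
# A minimal sketch (assumed probit errors, hypothetical design) of the known
# conditional model P^{O|U}: with (eps_t) i.i.d. N(0,1) independent of
# (X, alpha), the probability of an outcome sequence given
# u = (x_1..x_T, alpha) is a product over periods.
import numpy as np
from scipy.stats import norm
from itertools import product

def p_seq_given_u(y_seq, x_seq, alpha, beta):
    """P(Y_1..Y_T = y_seq | X = x_seq, alpha) under the probit panel model."""
    probs = norm.cdf(np.asarray(x_seq) * beta + alpha)  # P(Y_t = 1 | x_t, alpha)
    return np.prod(np.where(np.asarray(y_seq) == 1, probs, 1 - probs))

# Example: T = 2, all outcome sequences for one hypothetical u.
x_seq, alpha, beta = (0.0, 1.0), -0.3, 0.8
for y_seq in product((0, 1), repeat=2):
    print(y_seq, p_seq_given_u(y_seq, x_seq, alpha, beta))
# Integrating these probabilities against a candidate P over u and matching
# the observed law of O gives the first component of q(theta, P) above.
```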



       This applies to many other settings (see also Chernozhukov et al.,
      2012).




      Example 3: incomplete models and games with multiple equilibria.

                                         Y2 = 1            Y2 = 0
                          Y1 = 1     (θ + ε1 , θ + ε2 )   (ε1 , 0)
                          Y1 = 0         (0, ε2 )          (0, 0)
                         Figure: Payoffs of the entry game (with θ < 0)

      Payoff shifters are known to the players, but the econometrician
      only knows that (ε1 , ε2 ) ∼ N (0, I2 ).
      When (ε1 , ε2 ) ∈ [0; −θ]^2 , there are two pure-strategy equilibria,
      (Y1 , Y2 ) ∈ {(0, 1); (1, 0)} (and one mixed-strategy equilibrium).
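
      A quick numerical check of how often the model fails to pin down the
      outcome (hypothetical θ):

```python
# A quick check (hypothetical theta) of the multiplicity region: with
# (eps1, eps2) ~ N(0, I2), the region [0, -theta]^2 with two pure-strategy
# equilibria has probability (Phi(-theta) - Phi(0))^2.
from scipy.stats import norm

theta = -1.0  # hypothetical (theta < 0)
p_multiple = (norm.cdf(-theta) - norm.cdf(0.0)) ** 2
print(p_multiple)  # ~ 0.1165: the model does not pin down P(O) on this event
```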



Steps of proof of main result.


              The vector space of signed measures (M, ‖·‖TV ) is the dual of
              the space of continuous functions with compact support (Cb , ‖·‖∞ ).
              The Banach-Alaoglu theorem ensures that Rθ (as a closed
              subset of the unit ball) is compact for the weak-* topology.
              Moreover, the weak-* topology is metrizable by the
              Lévy-Prokhorov metric.
              Applying Choquet's theorem: for all P ∈ Rθ , there exists a
              probability measure µP such that

                  ∫ g dP = ∫_{ext(Rθ)} ( ∫ g dQ ) dµP (Q)             (3)

              for every g ∈ Cb .



Steps of proof of main result.
              Considering gn → 1, one can extend the previous relation to
              g = 1:

                  1 = ∫_{ext(Rθ)} ( ∫ 1 dQ ) dµP (Q).

              Then for all Q ∈ Supp(µP ), Q ∈ ext(Rθ ) ∩ P = ext(Rθ ),
              where P denotes the set of probability measures.
              It follows that Rθ ≠ ∅ ⇒ ext(Rθ ) ≠ ∅.
              For the linear parameter: apply Choquet's theorem to R
              instead of Rθ and consider gn ∈ Cb → f to conclude that

                  ∫ f dP = ∫_{ext(R)∩I(f)} ( ∫ f dQ ) dµP (Q).

              This ensures that

                  sup_{P ∈ R∩I(f)} ∫ f dP ≤ sup_{Q ∈ ext(R)∩I(f)} ∫ f dQ.

              The reverse inequality is straightforward.
