Quantitative Methods for Lawyers - Class #21 - Regression Analysis - Part 4


  1. 1. Quantitative Methods for Lawyers, Class #21: Regression Analysis, Part 4. Professor Daniel Martin Katz (@computational), computationallegalstudies.com, danielmartinkatz.com, lexpredict.com, slideshare.net/DanielKatz
  2. 2. Building Regression Tables in R
  3. 3. “Stargazer is a new R package that creates LaTeX code for well-formatted regression tables, with multiple models side-by-side, as well as for summary statistics tables. It can also output the content of data frames directly into LaTeX.” If you want to go further in this area, you probably need to learn some LaTeX. LaTeX is the industry standard for typesetting technical documents.
  4. 4. First you need to load a TeX package: MacTeX (http://tug.org/mactex/) or MiKTeX (http://miktex.org/). Then it is useful to have an IDE: LyX (http://www.lyx.org/). If you do not like LyX, see http://en.wikipedia.org/wiki/Comparison_of_TeX_editors
  5. 5. Install the Stargazer Package:
  6. 6. Stargazer is going to give you LaTeX output, which you can paste and compile into a table. Install the Stargazer package:
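The transcript does not preserve the installation command shown on the slide. A minimal sketch, assuming the package is installed from CRAN in the usual way:

```r
# One-time download and install from CRAN (needs an internet connection)
install.packages("stargazer")

# Load the package at the start of each R session that uses it
library(stargazer)
```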
  8. 8. This is a very helpful website that you should consult regularly (and follow on FB) for all things R
  9. 9. Let's Consult the ‘Stargazer’ Tutorial: http://www.r-bloggers.com/stargazer-package-for-beautiful-latex-tables-from-r-statistical-models-output/
  10. 10. The ‘attitude’ data frame (which should be available with your default installation of R). Let's take a quick peek: http://www.r-bloggers.com/stargazer-package-for-beautiful-latex-tables-from-r-statistical-models-output/
  11. 11. Applying the basic command to the data frame gets you a set of LaTeX output, as shown to the left
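The two steps above (peeking at the data, then applying the basic command) might look roughly like this. It is a sketch, not the slide's exact code; the `requireNamespace()` guard is an addition so the script still runs on a machine without stargazer.

```r
# 'attitude' ships with base R (datasets package): 30 rows, 7 numeric columns
head(attitude)

# Guard added for this sketch: skip the table if stargazer is not installed
if (requireNamespace("stargazer", quietly = TRUE)) {
  # Given a data frame, stargazer prints LaTeX for a summary statistics table
  stargazer::stargazer(attitude)
}
```

The LaTeX is printed to the console, which is where it gets copied from in the next step.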
  12. 12. (1) Open LyX (http://www.lyx.org/) (2) File > New (3) Start a LaTeX Box (4) Cut the LaTeX Code from the R output, from the marked start to the marked end, Then Paste it in the box (5) Then Hit this Button to See Output
  13. 13. http://www.r-bloggers.com/stargazer-package-for-beautiful-latex-tables-from-r-statistical-models-output/
  14. 14. Download this as an alternative because it allows you to easily override errors and push through to get a regression table
  15. 15. Okay, Let's Run a Few Regression Models. Now Let's Generate the LaTeX Code
  16. 16. The Resulting LaTeX Code. Put this above it: \documentclass{article} \begin{document}. Put this below it: \end{document}
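The model-fitting code on these slides is not in the transcript. As a sketch, here are two models on the built-in `attitude` data, with stargazer's LaTeX wrapped between the preamble and `\end{document}` as described above. The model formulas, the table title, and the output file name are assumptions made for illustration.

```r
# Two illustrative models on 'attitude' (formulas are assumptions, not the slide's)
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data = attitude)

# Guard added for this sketch: skip the table if stargazer is not installed
if (requireNamespace("stargazer", quietly = TRUE)) {
  # capture.output() collects the LaTeX that stargazer prints to the console
  table_lines <- capture.output(
    stargazer::stargazer(linear.1, linear.2, title = "Regression Results")
  )

  # Put the preamble above the table code and \end{document} below it
  tex_file <- file.path(tempdir(), "regression_table.tex")  # hypothetical name
  writeLines(c("\\documentclass{article}",
               "\\begin{document}",
               table_lines,
               "\\end{document}"),
             tex_file)
}
```

The resulting file can then be compiled with whichever TeX distribution is installed.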
  17. 17. These Tables are Typically How Regression Output is Reported
  18. 18. A Quick Primer on Interpreting Regression Output
  19. 19. We are Working Through Selected Examples From this Fabulous Resource Created by Oscar Torres-Reyna @ Princeton: http://dss.princeton.edu/training/
  20. 20. A Quick Primer on Interpreting Regression Output. How Should We Discuss the Relationship Between Independent Variables and Dependent Variables? We Think in a Ceteris Paribus Manner (i.e. All Other Things Being Equal)
  21. 21. These are dummy variables for the respective regions
  22. 22. How Should We Discuss the Relationship Between Independent Variables and Dependent Variables? We Think in a Ceteris Paribus Manner (All Other Things Being Equal)
  23. 23. How Should We Discuss the Relationship Between Independent Variables and Dependent Variables? We Think in a Ceteris Paribus Manner (All Other Things Being Equal). This Implies We Are Interested in a Thought Experiment: If We Were To Change Some Independent Variable by 1 Unit -- What Would Be the Corresponding Effect on Y? This Should be Considered Both in the Case of a Regular Variable and a Dummy/Indicator Variable
  24. 24. This Implies We Are Interested in a Thought Experiment: If We Were To Change Some Independent Variable by 1 Unit -- What Would Be the Corresponding Effect on Y? This Should be Considered Both in the Case of a Regular Variable and a Dummy/Indicator Variable. Start with “College” Variable - 3.38 is the Beta Coefficient on College
  25. 25. Start with “College” Variable - Thinking in a Ceteris Paribus Manner. 3.38 is the Beta Coefficient on College
  26. 26. Start with “College” Variable - Thinking in a Ceteris Paribus Manner. 3.38 is the Beta Coefficient on College. Y = B0 + (B1 * X1) + (B2 * X2) + (B3 * X3) + (B4 * X4) + (B5 * X5) + (B6 * X6) + (B7 * X7) + (B8 * X8) + ε
  27. 27. Start with “College” Variable - Thinking in a Ceteris Paribus Manner. 3.38 is the Beta Coefficient on College. Y = B0 + (B1 * X1) + (B2 * X2) + (B3 * X3) + (B4 * X4) + (B5 * X5) + (B6 * X6) + (B7 * X7) + (B8 * X8) + ε. csat = 786.30 - 0.004*expense - 3.02*percent + 0.48*income + 2.30*high + 3.38*college + 76.84*region2 + 27.26*region3 + 34.35*region4 + ε (each region term equals 1 if that region is true and 0 otherwise)
  28. 28. Start with “College” Variable - Thinking in a Ceteris Paribus Manner. 3.38 is the Beta Coefficient on College. All Else Equal - For Each 1 Unit Change in “College” there is a corresponding 3.38 Unit Change in “Csat”. Y = B0 + (B1 * X1) + (B2 * X2) + (B3 * X3) + (B4 * X4) + (B5 * X5) + (B6 * X6) + (B7 * X7) + (B8 * X8) + ε. csat = 786.30 - 0.004*expense - 3.02*percent + 0.48*income + 2.30*high + 3.38*college + 76.84*region2 + 27.26*region3 + 34.35*region4 + ε (each region term equals 1 if that region is true and 0 otherwise)
  29. 29. Thinking in a Ceteris Paribus Manner. Notice that there are really 4 Separate Models Here: add 76.84 if region = 2 is True, 27.26 if region = 3 is True, 34.35 if region = 4 is True; otherwise, if region = 1 is True, we retain the Default Coefficient Estimates. Y = B0 + (B1 * X1) + (B2 * X2) + (B3 * X3) + (B4 * X4) + (B5 * X5) + (B6 * X6) + (B7 * X7) + (B8 * X8) + ε. csat = 786.30 - 0.004*expense - 3.02*percent + 0.48*income + 2.30*high + 3.38*college + 76.84*region2 + 27.26*region3 + 34.35*region4 + ε (each region term equals 1 if that region is true and 0 otherwise)
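The ceteris paribus reading can be checked with plain arithmetic. The sketch below plugs the slide's coefficient estimates into a small helper; the helper name `predict_csat` and the input values for the example state are hypothetical, chosen only to show the comparison.

```r
# Coefficient estimates as reported on the slide
coefs <- c(intercept = 786.30, expense = -0.004, percent = -3.02,
           income = 0.48, high = 2.30, college = 3.38,
           region2 = 76.84, region3 = 27.26, region4 = 34.35)

# Hypothetical helper: fitted csat for one state (error term left out)
predict_csat <- function(expense, percent, income, high, college, region = 1) {
  # Region dummies are 1 when true and 0 otherwise; region 1 is the baseline
  x <- c(1, expense, percent, income, high, college,
         region == 2, region == 3, region == 4)
  sum(coefs * x)
}

# Made-up input values for an example state in region 1
base <- predict_csat(expense = 5000, percent = 30, income = 35,
                     high = 75, college = 20)

# Raise 'college' by 1 unit and hold everything else fixed
plus_one <- predict_csat(expense = 5000, percent = 30, income = 35,
                         high = 75, college = 21)
college_effect <- plus_one - base   # equals the coefficient on college

# Same state placed in region 2: only the intercept shifts
region2_shift <- predict_csat(5000, 30, 35, 75, 20, region = 2) - base
```

The region dummies never change the slopes, which is why the slide speaks of four separate models: one intercept per region and a shared set of slope coefficients.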
  30. 30. Non-Linearities and Transformations. Okay, This is the Interpretation in the Linear Case. Sometimes Data Does Not Neatly Conform to Our Linearity Assumption. From a Model / Prediction Standpoint, Failure to Adjust to Account for Non-Linearity Might Lead to Type II Error
  31. 31. Non-Linearities and Transformations. Simple Linear Model: Y = B0 + (B1 * X1) + ε. Polynomial Regression Model: Y = B0 - (B1 * X1^2) + ε (In this Case of X^2 this is a Negative Quadratic Function). “Lin-Log” Model: Y = B0 + (B1 * ln X1) + ε (Dependent Variable is Linear, 1 or More Independent Variables is Log)
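As a sketch of how the three specifications above are written in an R formula, using simulated data (the data and every number in it are made up for illustration):

```r
set.seed(1)

# Simulated data with a negative quadratic relationship
x <- runif(100, min = 1, max = 10)
y <- 50 - 0.8 * x^2 + rnorm(100)

# Simple linear model: Y = B0 + B1 * X1
fit_linear <- lm(y ~ x)

# Polynomial (quadratic) term: I() makes ^ mean arithmetic squaring in a formula
fit_quadratic <- lm(y ~ I(x^2))

# Lin-log model: the independent variable enters as its natural log
fit_linlog <- lm(y ~ log(x))

# Compare fit across the three specifications
r2 <- sapply(list(linear = fit_linear, quadratic = fit_quadratic,
                  linlog = fit_linlog),
             function(fit) summary(fit)$r.squared)
print(round(r2, 3))
```

Because the data were generated from a quadratic, the quadratic specification should fit best here; with real data the comparison is an empirical question.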
  32. 32. How Do We Determine that a Transformation is Appropriate? These Are the Variables From Our Model
  33. 33. How Do We Determine that a Transformation is Appropriate? Take A Look at this: Mean composite SAT score; Per pupil expenditures prim&sec; % HS graduates taking SAT; Median household income, $1,000; % adults HS diploma; % adults college degree
  34. 34. How Do We Determine that a Transformation is Appropriate? Plot the Relationship Between X & Y and Observe the Relationship. Let's Look at “Csat” and “Percent”
  35. 35. How Do We Determine that a Transformation is Appropriate? Relationship looks non-Linear -- “Curvilinear” Aka Curve + Line
  36. 36. How Do We Determine that a Transformation is Appropriate? It Appears that a Polynomial (Quadratic) relationship probably exists; thus, it makes sense to add a squared version of it. [Figure: augmented component-plus-residual plot; x-axis: % HS graduates taking SAT] The command acprplot (augmented component-plus-residual plot) provides a graphical way to examine linearity. Run this command after running a regression (regress csat percent). This is a Stata Command. There is an alternative in R
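The slide does not say which R alternative it has in mind. One possibility, sketched here in base R, is a plain component-plus-residual plot built from partial residuals. This is not the augmented variant that Stata's acprplot draws, and the `states` data frame below is simulated stand-in data, not the course dataset.

```r
set.seed(21)

# Simulated stand-in for the course data (values are made up)
states <- data.frame(percent = runif(50, min = 4, max = 81))
states$csat <- 1100 - 6 * states$percent + 0.05 * states$percent^2 +
  rnorm(50, sd = 15)

# Fit the linear model first, as with 'regress csat percent' in Stata
fit <- lm(csat ~ percent, data = states)

# Partial residuals: ordinary residuals plus the fitted 'percent' component
partial_res <- residuals(fit, type = "partial")[, "percent"]

# Points should hug the dashed straight line if linearity holds;
# a bending lowess curve suggests a transformation is needed
plot(states$percent, partial_res,
     xlab = "% HS graduates taking SAT",
     ylab = "Component plus residual")
abline(lm(partial_res ~ states$percent), lty = 2)
lines(lowess(states$percent, partial_res), col = "red")
```

The `crPlots()` function in the car package produces similar plots directly from a fitted model, for readers who prefer a packaged version.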
  37. 37. How Do I Generate a New Variable? We Want to Generate a New Variable Called “Percent Squared” Here is How We Do This In R
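The slide's code is not in the transcript. A minimal sketch of creating the squared variable, using a small made-up data frame and the hypothetical column name `percent2`:

```r
# Made-up values standing in for the 'percent' column
states <- data.frame(percent = c(5, 12, 27, 48, 64, 81))

# New column holding the square of 'percent'
states$percent2 <- states$percent^2

print(states)
```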
  38. 38. Okay, Let's Feed This Back Into the Regression Model
  39. 39. Now We Have Added “Percent Squared” to the Model R^2 is not everything but we can see the impact of alternative specification of the model on R^2
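To see the effect on R^2, the model can be fitted with and without the squared term. The sketch below uses simulated stand-in data with built-in curvature, so the numbers it prints are not the course results.

```r
set.seed(21)

# Simulated stand-in data with a curved csat/percent relationship (made up)
states <- data.frame(percent = runif(50, min = 4, max = 81))
states$csat <- 1100 - 6 * states$percent + 0.05 * states$percent^2 +
  rnorm(50, sd = 15)
states$percent2 <- states$percent^2

# Original specification versus the one with "percent squared" added
fit_linear  <- lm(csat ~ percent, data = states)
fit_squared <- lm(csat ~ percent + percent2, data = states)

# R^2 under each specification
r2 <- c(linear  = summary(fit_linear)$r.squared,
        squared = summary(fit_squared)$r.squared)
print(round(r2, 3))
```

R^2 never falls when a term is added, so adjusted R^2 or a test on the new coefficient gives a fairer read on whether the squared term earns its place.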
  40. 40. Other Transformations. We Might Have A Variable Whose Relationship is Non-Linear and Follows a Natural Log. Include it in the Model and Look at the Corresponding Model Fit. NOTE YOU CAN ALSO TRANSFORM THE DEPENDENT VARIABLE: ln Y = B0 + (B1 * X1) + ε
  41. 41. How To Understand Log Transformed Regression Output. Dependent Variable is not in Log Form, Independent Variable is in Log Form (aka Linear-Log): “A 1 Percent Change in the Independent Variable is associated with a (0.01 * Beta) Unit Change in the Dependent Variable”. Dependent Variable is in Log Form, Independent Variable is Not in Log Form (aka Log-Linear): “A Change in the Independent Variable by 1 Unit is associated with a (100 * Beta) Percent Change in the Dependent Variable”. Dependent Variable is in Log Form, Independent Variable is in Log Form (aka Log-Log): “A 1 Percent Change in the Independent Variable is associated with a Beta Percent Change in the Dependent Variable”
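These three rules of thumb are approximations that work well when the change in question is small. The arithmetic below compares each rule with the exact change implied by the model, using made-up Beta values chosen only for illustration.

```r
# Linear-log: Y = B0 + Beta * ln(X); X rises by 1 percent
beta_linlog   <- 0.05                       # made-up value
exact_linlog  <- beta_linlog * log(1.01)    # exact unit change in Y
approx_linlog <- 0.01 * beta_linlog         # rule of thumb

# Log-linear: ln(Y) = B0 + Beta * X; X rises by 1 unit
beta_loglin   <- 0.05                       # made-up value
exact_loglin  <- 100 * (exp(beta_loglin) - 1)  # exact percent change in Y
approx_loglin <- 100 * beta_loglin             # rule of thumb

# Log-log: ln(Y) = B0 + Beta * ln(X); X rises by 1 percent
beta_loglog   <- 0.5                        # made-up value
exact_loglog  <- 100 * (1.01^beta_loglog - 1)  # exact percent change in Y
approx_loglog <- beta_loglog                   # rule of thumb

print(rbind(linear_log = c(exact = exact_linlog, approx = approx_linlog),
            log_linear = c(exact = exact_loglin, approx = approx_loglin),
            log_log    = c(exact = exact_loglog, approx = approx_loglog)))
```

The log-linear rule drifts furthest from the exact value as Beta grows, so for large coefficients the exact form 100 * (exp(Beta) - 1) is the safer one to report.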
  42. 42. Interaction Terms
  43. 43. Interaction Terms. Sometimes X1 Impacts Y and X2 Impacts Y, but when both X1 and X2 are Present there is an additional impact (+ or -) beyond their separate effects. Y = B0 + (B1 * X1) + (B2 * X2) + (B3 * X1 * X2) + ε. Income = B0 + B1 * Gender + B2 * Education + B3 * Gender * Education + ε. Our Beta Three Term Gives Us the Effect of Gender and Education Together. Assuming Gender is Binary in the Model - The Interaction Will Explore the Differential Effect on Income By Gender
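A sketch of the income model above in R, fitted to simulated data. The sample size, the coefficients used to generate the data, and the coding of `gender` as 0/1 are all assumptions made for illustration.

```r
set.seed(42)

# Simulated data: binary gender (0/1) and years of education (all made up)
n <- 200
gender    <- rbinom(n, size = 1, prob = 0.5)
education <- runif(n, min = 8, max = 20)
income    <- 20 + 5 * gender + 2 * education +
  1.5 * gender * education + rnorm(n, sd = 3)

# gender * education expands to gender + education + gender:education
fit <- lm(income ~ gender * education)
b <- coef(fit)

# Effect of one more year of education, by group:
slope_gender0 <- b[["education"]]                              # B2
slope_gender1 <- b[["education"]] + b[["gender:education"]]    # B2 + B3

print(round(c(gender0 = slope_gender0, gender1 = slope_gender1), 2))
```

With an interaction in the model, B2 alone is the education effect only for the group coded 0; the group coded 1 gets B2 + B3.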
  44. 44. A Visual Display of Interaction Terms. Image From - Thomas Brambor, William Roberts Clark & Matt Golder, Understanding Interaction Models: Improving Empirical Analyses, 14 Political Analysis 63 (2005)
  45. 45. For More on Interaction Terms ... Thomas Brambor, William Roberts Clark & Matt Golder, Understanding Interaction Models: Improving Empirical Analyses, 14 Political Analysis 63 (2005)
  46. 46. Daniel Martin Katz (@computational), computationallegalstudies.com, lexpredict.com, danielmartinkatz.com, Illinois Tech - Chicago-Kent College of Law
