Quantitative Methods for Lawyers - Class #27 - Regression Analysis - Part 4

Transcript

  • 1. Quantitative Methods for Lawyers: Regression Analysis, Part 4 (Class #27). Professor Daniel Martin Katz, computationallegalstudies.com, @computational
  • 2. Building Regression Tables in R
  • 3. “Stargazer is a new R package that creates LaTeX code for well-formatted regression tables, with multiple models side-by-side, as well as for summary statistics tables. It can also output the content of data frames directly into LaTeX.” If you want to go further in this area you probably need to learn some LaTeX. LaTeX is the industry standard for typesetting technical documents.
  • 4. First you need to install a TeX distribution: MacTeX http://tug.org/mactex/ or MiKTeX http://miktex.org/ Then it is useful to have an IDE: LyX http://www.lyx.org/ If you do not like LyX: http://en.wikipedia.org/wiki/Comparison_of_TeX_editors
  • 5. Install the Stargazer Package:
  • 6. Install the Stargazer Package: Stargazer is going to give you LaTeX output, which you can paste and compile into a table.
  • 8. This is a very helpful website that you should consult regularly (and follow on FB) for all things
  • 9. Let's Consult the ‘Stargazer’ Tutorial: http://www.r-bloggers.com/stargazer-package-for-beautiful-latex-tables-from-r-statistical-models-output/
  • 10. The ‘attitude’ data frame (which should be available with your default installation of R). Let's take a quick peek: http://www.r-bloggers.com/stargazer-package-for-beautiful-latex-tables-from-r-statistical-models-output/
  • 11. Applying the basic command to the data frame gets you a set of LaTeX output, as shown to the left.
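A minimal sketch of the basic command, assuming the stargazer package has been installed from CRAN. The `requireNamespace` guard is an addition (not from the slides) so the script still runs where the package is missing.

```r
# One-time install (requires an internet connection):
# install.packages("stargazer")

# 'attitude' ships with base R (datasets package): 30 rows, 7 numeric columns.
head(attitude)

if (requireNamespace("stargazer", quietly = TRUE)) {
  # Prints LaTeX code for a summary-statistics table of the data frame.
  stargazer::stargazer(attitude)

  # type = "text" prints a plain-text preview in the console instead of LaTeX.
  stargazer::stargazer(attitude, type = "text")
}
```

The text preview is handy for checking the table before pasting the LaTeX version into an editor.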
  • 12. (1) Open http://www.lyx.org/ (2) File > New (3) Start a LaTeX Box (4) Cut from R output + Then Paste the LaTeX Code in box starting here: ending here: (5) Then Hit this Button to See Output
  • 13. http://www.r-bloggers.com/stargazer-package-for-beautiful-latex-tables-from-r-statistical-models-output/
  • 14. Download this as an alternative because it allows you to easily override errors and push through to get a regression table
  • 15. Okay, let's run a few regression models. Now let's generate the LaTeX code.
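A sketch of what running a few models and generating the LaTeX code might look like on the built-in `attitude` data. The two model specifications are illustrative assumptions, since the slide's own code did not survive in the transcript.

```r
# Two linear models on the built-in 'attitude' data; the choice of
# predictors here is illustrative, not taken from the slide.
linear.1 <- lm(rating ~ complaints + privileges + learning + raises + critical,
               data = attitude)
linear.2 <- lm(rating ~ complaints + privileges + learning, data = attitude)

summary(linear.1)  # the usual console output for one model

if (requireNamespace("stargazer", quietly = TRUE)) {
  # Side-by-side regression table, emitted as LaTeX code.
  stargazer::stargazer(linear.1, linear.2, title = "Regression Results")
}
```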
  • 16. Put this above the resulting LaTeX code: \documentclass{article} \begin{document} Put this below it: \end{document}
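The wrapping step can also be done from R rather than by hand. This is a sketch only: `wrap_latex` is a hypothetical helper (not part of stargazer), and the table body is a placeholder standing in for real stargazer output.

```r
# Wrap a table's LaTeX code in a minimal document so it compiles on its own.
# 'wrap_latex' is a hypothetical helper, not part of stargazer.
wrap_latex <- function(table_code) {
  c("\\documentclass{article}",
    "\\begin{document}",
    table_code,
    "\\end{document}")
}

# Placeholder table body; in practice this would be the stargazer output,
# e.g. captured with capture.output(stargazer::stargazer(model)).
table_code <- c("\\begin{tabular}{lc}",
                "Variable & Estimate \\\\",
                "\\end{tabular}")

doc <- wrap_latex(table_code)
out_file <- file.path(tempdir(), "regression_table.tex")
writeLines(doc, out_file)  # compile with pdflatex or open in a TeX editor
```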
  • 17. These Tables are Typically How Regression Output is Reported
  • 18. A Quick Primer on Interpreting Regression Output
  • 19. http://dss.princeton.edu/training/ We are Working Through Selected Examples From this Fabulous Resource Created by Oscar Torres-Reyna @ Princeton
  • 20. A Quick Primer on Interpreting Regression Output. How Should We Discuss the Relationship Between Independent Variables and Dependent Variables? We Think in a Ceteris Paribus Manner (i.e., All Other Things Being Equal).
  • 21. These are dummy variables for the respective regions
  • 22. How Should We Discuss the Relationship Between Independent Variables and Dependent Variables? We Think in a Ceteris Paribus Manner (All Other Things Being Equal).
  • 23. How Should We Discuss the Relationship Between Independent Variables and Dependent Variables? We Think in a Ceteris Paribus Manner (All Other Things Being Equal). This Implies We Are Interested in a Thought Experiment: If We Were To Change Some Independent Variable by 1 Unit, What Would Be the Corresponding Effect on Y? This Should be Considered Both in the Case of a Regular Variable and a Dummy/Indicator Variable.
  • 24. This Implies We Are Interested in a Thought Experiment: If We Were To Change Some Independent Variable by 1 Unit, What Would Be the Corresponding Effect on Y? This Should be Considered Both in the Case of a Regular Variable and a Dummy/Indicator Variable. Start with “College” Variable - 3.38 is the Beta Coefficient on College
  • 25. Thinking in a Ceteris Paribus Manner Start with “College” Variable - 3.38 is the Beta Coefficient on College
  • 26. Thinking in a Ceteris Paribus Manner Start with “College” Variable - 3.38 is the Beta Coefficient on College Y = B0 + ( B1 * (X1) ) + ( B2 * (X2) ) + ( B3 * (X3) ) + ( B4 * (X4) ) + ( B5 * (X5) ) + ( B6 * (X6) ) + ( B7 * (X7) ) + ( B8 * (X8) ) + ε
  • 27. Thinking in a Ceteris Paribus Manner Start with “College” Variable - 3.38 is the Beta Coefficient on College Y = B0 + ( B1 * (X1) ) + ( B2 * (X2) ) + ( B3 * (X3) ) + ( B4 * (X4) ) + ( B5 * (X5) ) + ( B6 * (X6) ) + ( B7 * (X7) ) + ( B8 * (X8) ) + ε csat = 786.30 – 0.004*expense – 3.02*percent + 0.48*income + 2.30*high + 3.38*college + 76.84*region2 + 27.26*region3 + 34.35*region4 + ε, where region2, region3, and region4 each equal 1 if true and 0 otherwise
  • 28. Thinking in a Ceteris Paribus Manner Start with “College” Variable - 3.38 is the Beta Coefficient on College Y = B0 + ( B1 * (X1) ) + ( B2 * (X2) ) + ( B3 * (X3) ) + ( B4 * (X4) ) + ( B5 * (X5) ) + ( B6 * (X6) ) + ( B7 * (X7) ) + ( B8 * (X8) ) + ε csat = 786.30 – 0.004*expense – 3.02*percent + 0.48*income + 2.30*high + 3.38*college + 76.84*region2 + 27.26*region3 + 34.35*region4 + ε, where region2, region3, and region4 each equal 1 if true and 0 otherwise. All Else Equal - For Each 1 Unit Increase in “College” there is a corresponding 3.38 Unit Increase in “Csat”
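The ceteris paribus reading can be checked with simple arithmetic. The sketch below plugs the slide's coefficients into a function and changes only `college`; the input values for the example state are made up for illustration.

```r
# Fitted equation from the slide (coefficients copied from the regression table).
predict_csat <- function(expense, percent, income, high, college,
                         region2 = 0, region3 = 0, region4 = 0) {
  786.30 - 0.004 * expense - 3.02 * percent + 0.48 * income +
    2.30 * high + 3.38 * college +
    76.84 * region2 + 27.26 * region3 + 34.35 * region4
}

# Hypothetical state profile; these input values are made up for illustration.
base <- predict_csat(expense = 5000, percent = 30, income = 35,
                     high = 75, college = 20)
plus_one <- predict_csat(expense = 5000, percent = 30, income = 35,
                         high = 75, college = 21)

# Difference equals the coefficient on college (up to floating-point rounding).
plus_one - base
```

Because every other input is held fixed, the difference does not depend on which profile is chosen.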
  • 29. Thinking in a Ceteris Paribus Manner Y = B0 + ( B1 * (X1) ) + ( B2 * (X2) ) + ( B3 * (X3) ) + ( B4 * (X4) ) + ( B5 * (X5) ) + ( B6 * (X6) ) + ( B7 * (X7) ) + ( B8 * (X8) ) + ε csat = 786.30 – 0.004*expense – 3.02*percent + 0.48*income + 2.30*high + 3.38*college + 76.84*region2 + 27.26*region3 + 34.35*region4 + ε, where region2, region3, and region4 each equal 1 if true and 0 otherwise. The intercept shifts by 76.84 if region = 2 is true, by 27.26 if region = 3 is true, and by 34.35 if region = 4 is true. Otherwise, if region = 1 is true, we retain the default coefficient estimates (region 1 is the reference category). Notice that there are really 4 separate models here, one per region, differing only in their intercepts.
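One way to see the "4 separate models" is to compute the intercept each region ends up with. This sketch uses only the coefficients shown on the slide; the variable names are illustrative.

```r
# Baseline intercept and the dummy-variable coefficients from the slide.
intercept    <- 786.30
region_shift <- c(region1 = 0, region2 = 76.84, region3 = 27.26, region4 = 34.35)

# Each region gets its own intercept; all slopes stay the same.
region_intercepts <- intercept + region_shift
region_intercepts
```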
  • 30. Non-Linearities and Transformations. Okay, this is the interpretation in the linear case. Sometimes data do not neatly conform to our linearity assumption. From a model/prediction standpoint, failure to adjust for non-linearity might lead to Type II error.
  • 31. Non-Linearities and Transformations. Simple Linear Model: Y = B0 + (B1 * (X1)) + ε. Polynomial Regression Model: Y = B0 – (B1 * (X1)^2) + ε; in this case of X^2, this is a negative quadratic function. “Lin-Log” Model: Y = B0 + (B1 * (ln X1)) + ε; the dependent variable is linear and one or more independent variables are logged.
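A sketch of how these three model forms can be written as R formulas. The data are made up and noise-free, an assumption chosen so that each fit simply returns the coefficients used to build it.

```r
# Made-up, noise-free data so each fit recovers the known coefficients.
x <- 1:20

y_linear <- 3 + 2 * x          # Y = B0 + B1 * X
y_quad   <- 50 - 0.5 * x^2     # Y = B0 - B1 * X^2 (negative quadratic)
y_linlog <- 10 + 4 * log(x)    # Y = B0 + B1 * ln(X)

fit_linear <- lm(y_linear ~ x)
fit_quad   <- lm(y_quad ~ I(x^2))    # I() protects ^ inside a formula
fit_linlog <- lm(y_linlog ~ log(x))  # log() is the natural log in R

coef(fit_linear)
coef(fit_quad)
coef(fit_linlog)
```

With real data the fits would not be exact; the point here is only the formula syntax for each transformation.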
  • 32. How Do We Determine that a Transformation is Appropriate? These Are the Variables From Our Model
  • 33. How Do We Determine that a Transformation is Appropriate? csat: Mean composite SAT score; expense: Per pupil expenditures prim&sec; percent: % HS graduates taking SAT; income: Median household income, $1,000; high: % adults HS diploma; college: % adults college degree. Take a look at this.
  • 34. How Do We Determine that a Transformation is Appropriate? Plot the Relationship Between X & Y and Observe the Relationship. Let's Look at “Csat” and “Percent”
  • 35. How Do We Determine that a Transformation is Appropriate? Relationship looks non-Linear -- “Curvilinear” Aka Curve + Line
  • 36. How Do We Determine that a Transformation is Appropriate? [Figure: augmented component-plus-residual plot; y-axis: augmented component plus residual; x-axis: % HS graduates taking SAT] The command acprplot (augmented component-plus-residual plot) provides a graphical way to examine linearity. Run this command after running a regression: regress csat percent. This is a Stata command; there is an alternative in R. It appears that a polynomial (quadratic) relationship probably exists; thus, it makes sense to add a squared version of percent.
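The slide does not name the R alternative, so the following is one possible sketch: an ordinary (not augmented) component-plus-residual plot built by hand in base R. It uses the built-in `attitude` data as a stand-in, because the csat data are not bundled with R.

```r
# Stand-in model on the built-in 'attitude' data (the slide's csat data are
# not bundled with R).
fit <- lm(rating ~ complaints + learning, data = attitude)

# Component plus residual for 'complaints': residual + b * x.
x   <- attitude$complaints
cpr <- resid(fit) + coef(fit)[["complaints"]] * x

if (interactive()) {
  plot(x, cpr, xlab = "complaints", ylab = "Component plus residual")
  abline(lm(cpr ~ x), lty = 2)   # straight-line reference
  lines(lowess(x, cpr))          # smoother; curvature suggests non-linearity
}

# If the 'car' package is installed, car::crPlots(fit) draws similar plots.
```

If the smoother bends away from the dashed line, a transformation of that predictor is worth trying.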
  • 37. How Do I Generate a New Variable? We want to generate a new variable called “Percent Squared”. Here is how we do this in R:
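The slide's own code is not in the transcript, so this is a sketch of one common way to do it. The `states` data frame and its values are made up for illustration, and the column name `percent2` is an assumption.

```r
# Toy stand-in for the states data; these values are made up for illustration.
states <- data.frame(
  csat    = c(1050, 1010, 960, 920, 900, 895),
  percent = c(5, 15, 30, 50, 65, 80)
)

# Create the squared term as a new column.
states$percent2 <- states$percent^2

head(states)
```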
  • 38. Okay, let's feed this back into the regression model.
  • 39. Now we have added “Percent Squared” to the model. R^2 is not everything, but we can see the impact of alternative specifications of the model on R^2.
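A sketch of comparing the two specifications. The data below are made up to have a curved shape and are not the states data, so the resulting R^2 values say nothing about the real model.

```r
# Made-up curvilinear data standing in for the states data set.
percent <- seq(5, 80, by = 5)
csat    <- 1100 - 6 * percent + 0.04 * percent^2 + 5 * sin(percent)
states  <- data.frame(csat = csat, percent = percent)
states$percent2 <- states$percent^2

fit_lin  <- lm(csat ~ percent, data = states)
fit_quad <- lm(csat ~ percent + percent2, data = states)

r2_lin  <- summary(fit_lin)$r.squared
r2_quad <- summary(fit_quad)$r.squared
c(linear = r2_lin, quadratic = r2_quad)
```

Adding a regressor can never lower R^2, so `summary(fit_quad)$adj.r.squared` is a fairer basis for comparison.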
  • 40. Other Transformations. We might have a variable whose relationship with Y is non-linear and follows a natural log. Include the logged variable in the model and look at the corresponding model fit. NOTE: YOU CAN ALSO TRANSFORM THE DEPENDENT VARIABLE: ln Y = B0 + (B1 * (X1)) + ε
  • 41. How To Understand Log Transformed Regression Output. Dependent variable is not in log form, independent variable is in log form (aka Linear-Log): “A 1 percent change in the independent variable is associated with a (0.01 * Beta) unit change in the dependent variable.” Dependent variable is in log form, independent variable is not in log form (aka Log-Linear): “A change in the independent variable by 1 unit is associated with a (100 * Beta) percent change in the dependent variable.” Dependent variable is in log form, independent variable is in log form (aka Log-Log): “A 1 percent change in the independent variable is associated with a Beta percent change in the dependent variable.” These readings are approximations that work best for small changes.
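The three rules of thumb can be compared against the exact arithmetic. The coefficient values below are hypothetical, picked only to illustrate the calculation.

```r
# Hypothetical coefficients, chosen only to illustrate the arithmetic.
beta_linlog <- 50    # Y regressed on ln(X)
beta_loglin <- 0.05  # ln(Y) regressed on X
beta_loglog <- 0.8   # ln(Y) regressed on ln(X)

# Linear-log: change in Y (in units) when X rises by 1 percent.
linlog_exact  <- beta_linlog * log(1.01)
linlog_approx <- 0.01 * beta_linlog

# Log-linear: percent change in Y when X rises by 1 unit.
loglin_exact  <- 100 * (exp(beta_loglin) - 1)
loglin_approx <- 100 * beta_loglin

# Log-log: percent change in Y when X rises by 1 percent.
loglog_exact  <- 100 * (1.01^beta_loglog - 1)
loglog_approx <- beta_loglog

round(c(linlog_exact = linlog_exact, linlog_approx = linlog_approx,
        loglin_exact = loglin_exact, loglin_approx = loglin_approx,
        loglog_exact = loglog_exact, loglog_approx = loglog_approx), 4)
```

The gap between the exact and approximate log-linear figures grows as Beta gets larger, which is why the shortcut is best reserved for small coefficients.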
  • 42. Interaction Terms
  • 43. Interaction Terms. Sometimes X1 impacts Y and X2 impacts Y, but when both X1 and X2 are present there is an additional impact (+ or -) beyond their separate effects. Y = B0 + (B1 * (X1)) + (B2 * (X2)) + (B3 * (X1)(X2)) + ε. Income = B0 + B1 * Gender + B2 * Education + B3 * Gender * Education + ε. Our Beta Three term gives us the effect of gender and education together. Assuming gender is binary in the model, the interaction will explore the differential effect of education on income by gender.
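A sketch of the income model in R. The data are made up and noise-free (an assumption, so the fit returns the coefficients used to build them), with gender coded 0/1.

```r
# Made-up, noise-free data so the fit recovers the known coefficients.
d <- expand.grid(gender = c(0, 1), education = 8:20)
d$income <- 20 + 5 * d$gender + 2 * d$education + 1.5 * d$gender * d$education

# gender * education expands to gender + education + gender:education.
fit <- lm(income ~ gender * education, data = d)
b <- coef(fit)

# Slope of education within each gender group.
slope_gender0 <- b[["education"]]
slope_gender1 <- b[["education"]] + b[["gender:education"]]
c(gender0 = slope_gender0, gender1 = slope_gender1)
```

The interaction coefficient is the difference between the two slopes, which is the "differential effect" the slide describes.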
  • 44. A Visual Display of Interaction Terms Image From - Thomas Brambor, William Roberts Clark & Matt Golder, Understanding Interaction Models: Improving Empirical Analyses, 14 Political Analysis 63 (2005)
  • 45. For More on Interaction Terms ... Thomas Brambor, William Roberts Clark & Matt Golder, Understanding Interaction Models: Improving Empirical Analyses, 14 Political Analysis 63 (2005)