SlideShare a Scribd company logo
1 of 35
Download to read offline
Regression in Stata

                                             Ista Zahn

                                       Harvard MIT Data Center


                                         January 24 2013



                                      The Institute
                                      for Quantitative Social Science
                                      at Harvard University


Ista Zahn (Harvard MIT Data Center)        Regression in Stata   January 24 2013   1 / 35
Outline


  1   Introduction

  2   Univariate regression

  3   Multiple regression

  4   Interactions

  5   Exporting and saving results

  6   Wrap-up



Ista Zahn (Harvard MIT Data Center)   Regression in Stata   January 24 2013   2 / 35
Introduction

  Topic


  1   Introduction

  2   Univariate regression

  3   Multiple regression

  4   Interactions

  5   Exporting and saving results

  6   Wrap-up



Ista Zahn (Harvard MIT Data Center)       Regression in Stata   January 24 2013   3 / 35
Introduction

  Organization



        Please feel free to ask questions at any point if they are relevant to
        the current topic (or if you are lost!)
        There will be a Q&A after class for more specific, personalized
        questions
        Collaboration with your neighbors is encouraged
        If you are using a laptop, you will need to adjust paths accordingly
        Make comments in your Do-file rather than on hand-outs
        Save on flash drive or email to yourself




Ista Zahn (Harvard MIT Data Center)       Regression in Stata   January 24 2013   4 / 35
Introduction

  Copy the workshop materials to your home directory
        Log in to an Athena workstation using your Athena user name and
        password
        Click on the “Ubuntu” button on the upper-left and type “term” as
        shown below




        Click on the “Terminal” icon as shown above
        In the terminal, type this line exactly as shown:
   cd; wget http://tinyurl.com/stata-stats-zip; unzip stata-stats-zip

        If you see “ERROR 404: Not Found”, then you mistyped the command
        – try again, making sure to type the command exactly as shown
Ista Zahn (Harvard MIT Data Center)       Regression in Stata   January 24 2013   5 / 35
Introduction

  Launch Stata on Athena



        To start Stata type these commands in the terminal:

          add stata
          xstata

        Open up today’s Stata script
               In Stata, go to Window => New do file => Open
               Locate and open the StatStatistics.do script in the StataStatistics
               folder in your home directory
        I encourage you to add your own notes to this file!




Ista Zahn (Harvard MIT Data Center)       Regression in Stata   January 24 2013   6 / 35
Introduction

  Today’s Dataset



        We have data on a variety of variables for all 50 states
        Population, density, energy use, voting tendencies, graduation rates,
        income, etc.
        We’re going to be predicting SAT scores
        Univariate Regression: SAT scores and Education Expenditures
        Does the amount of money spent on education affect the mean SAT
        score in a state?
        Dependent variable: csat
        Independent variable: expense




Ista Zahn (Harvard MIT Data Center)       Regression in Stata   January 24 2013   7 / 35
Introduction

  Opening Files in Stata



        Look at bottom left hand corner of Stata screen – This is the directory
        Stata is currently reading from
        Files are located in the StataDatMan folder
        Start by changing directory and loading the data

          // change directory
          cd "C:/Users/dataclass/Desktop/StataStatistics"
          // use dir to see what is in the directory:
          dir
          // tell Stata to use the states data set
          use states.dta




Ista Zahn (Harvard MIT Data Center)       Regression in Stata   January 24 2013   8 / 35
Introduction

  Steps for Running Regression




     1   Examine descriptive statistics
     2   Look at relationship graphically and test correlation(s)
     3   Run and interpret regression
     4   Test regression assumptions




Ista Zahn (Harvard MIT Data Center)       Regression in Stata   January 24 2013   9 / 35
Univariate regression

  Topic


  1   Introduction

  2   Univariate regression

  3   Multiple regression

  4   Interactions

  5   Exporting and saving results

  6   Wrap-up



Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   10 / 35
Univariate regression

  Univariate Regression: Preliminaries




        We want to predict csat scores from expense
        First, let’s look at some descriptives

          // generate summary statistics for csat and expense
          sum csat expense
          // look at codebok
          codebook csat expense




Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   11 / 35
Univariate regression

  Univariate Regression



        Look at scatterplots, compute correlation matrix, and regress SAT
        scores on expenditures

          // graph expense by csat
          twoway scatter expense csat

          // correlate csat and expense
          pwcorr csat expense, star(.05)

          // run the regression
          regress csat expense




Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   12 / 35
Univariate regression

  Linear Regression Assumptions




        Assumption 1: Normal Distribution
        The errors of regression equation are normally distributed
        Assumption 2: Homoscedasticity (The variance around the regression
        line is the same for all values of the predictor variable)
        Assumption 3: Errors are independent
        Assumption 4: Relationships are linear




Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   13 / 35
Univariate regression

  Homoscedasticity




Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   14 / 35
Univariate regression

  Testing Assumptions: Normality

        A simple histogram of the residuals can be informative

          // graph the residual values of csat
          predict resid, residual
          histogram resid, normal




Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   15 / 35
Univariate regression

  Testing Assumptions: Homoscedasticity


          rvfplot




Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   16 / 35
Multiple regression

  Topic


  1   Introduction

  2   Univariate regression

  3   Multiple regression

  4   Interactions

  5   Exporting and saving results

  6   Wrap-up



Ista Zahn (Harvard MIT Data Center)              Regression in Stata   January 24 2013   17 / 35
Multiple regression

  Multiple Regression




        Just keep adding predictors
        Let’s try adding some predictors to the model of SAT scores
        income :: % students taking SATs
        percent :: % adults with HS diploma (high)




Ista Zahn (Harvard MIT Data Center)              Regression in Stata   January 24 2013   18 / 35
Multiple regression

  Multiple Regression Preliminaries



        As before, start with descriptive statistics and correlations

          // descriptive statistics
          sum income percent high

          // generate correlation matrix
          pwcorr csat expense income percent high

          // regress csat on exense, income, percent, and high
          regress csat expense income percent high




Ista Zahn (Harvard MIT Data Center)              Regression in Stata   January 24 2013   19 / 35
Multiple regression

  Exercise 1: Multiple Regression



  Open the datafile, states.dta.
     1   Select a few variables to use in a multiple regression of your own.
         Before running the regression, examine descriptive of the variables and
         generate a few scatterplots.
     2   Run your regression
     3   Examine the plausibility of the assumptions of normality and
         homogeneity




Ista Zahn (Harvard MIT Data Center)              Regression in Stata   January 24 2013   20 / 35
Interactions

  Topic


  1   Introduction

  2   Univariate regression

  3   Multiple regression

  4   Interactions

  5   Exporting and saving results

  6   Wrap-up



Ista Zahn (Harvard MIT Data Center)      Regression in Stata   January 24 2013   21 / 35
Interactions

  Interactions




        What if we wanted to test an interaction between percent & high?
        Option 1: generate product terms by hand

          // generate product of percent and high
          gen percenthigh = percent*high
          regress csat expense income percent high percenthigh




Ista Zahn (Harvard MIT Data Center)      Regression in Stata   January 24 2013   22 / 35
Interactions

  Interactions




        What if we wanted to test an interaction between percent & high?
        Option 2: Let Stata do your dirty work

          // use the # sign to represent interactions
          regress csat percent high c.percent#c.high
          // same as . regress csat c.percent##high




Ista Zahn (Harvard MIT Data Center)      Regression in Stata   January 24 2013   23 / 35
Interactions

  Categorical Predictors



        For categorical variables, we first need to dummy code
        Use region as example
               Option 1: create dummy codes before fitting regression model

          // create region dummy codes using tab
          tab region, gen(region) // could also use gen / replace

          //regress csat on region
          regress csat region1 region2 region3




Ista Zahn (Harvard MIT Data Center)      Regression in Stata    January 24 2013   24 / 35
Interactions

  Categorical Predictors




        For categorical variables, we first need to dummy code
        Use region as example
               Option 2: Let Stata do it for you

          // regress csat on region using fvvarlist syntax
          // see help fvvarlist for details
          regress csat i.region




Ista Zahn (Harvard MIT Data Center)      Regression in Stata   January 24 2013   25 / 35
Interactions

  Exercise 2: Regression, Categorical Predictors, &
  Interactions



  Open the datafile, states.dta.
     1   Add on to the regression equation that you created in exercise 1 by
         generating an interaction term and testing the interaction.
     2   Try adding a categorical variable to your regression (remember, it will
         need to be dummy coded). You could use region or high25, or
         generate a new categorical variable from one of the continuous
         variables in the dataset.




Ista Zahn (Harvard MIT Data Center)      Regression in Stata   January 24 2013   26 / 35
Exporting and saving results

  Topic


  1   Introduction

  2   Univariate regression

  3   Multiple regression

  4   Interactions

  5   Exporting and saving results

  6   Wrap-up



Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   27 / 35
Exporting and saving results

  Saving and exporting regression tables



        Usually when we’re running regression, we’ll be testing multiple
        models at a time
        Can be difficult to compare results
        Stata offers several user-friendly options for storing and viewing
        regression output from multiple models
        First, download the necessary packages:

          * install outreg2 package
          findit outreg2




Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   28 / 35
Exporting and saving results

  Saving and replaying




        You can store regression model results in Stata

          // fit two regression models and store the results
          regress csat expense income percent high
          estimates store Model1
          regress csat expense income percent high i.region
          estimates store Model2




Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   29 / 35
Exporting and saving results

  Saving and replaying




        Stored models can be recalled
          // Display Model1
          estimates replay Model1




Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   30 / 35
Exporting and saving results

  Saving and replaying




        Stored models can be compared

          // Compare Model1 and Model2 coefficients
          estimates table Model1 Model2




Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   31 / 35
Exporting and saving results

  Exporting into Excel




        Avoid human error when transferring coefficients into tables
        Excel can be used to format publication-ready tables

          outreg2 [Model1 Model2] using csatprediction.xls, replace




Ista Zahn (Harvard MIT Data Center)            Regression in Stata   January 24 2013   32 / 35
Wrap-up

  Topic


  1   Introduction

  2   Univariate regression

  3   Multiple regression

  4   Interactions

  5   Exporting and saving results

  6   Wrap-up



Ista Zahn (Harvard MIT Data Center)   Regression in Stata   January 24 2013   33 / 35
Wrap-up

  Help Us Make This Workshop Better




        Please take a moment to fill out a very short feedback form
        These workshops exist for you–tell us what you need!
        ttp://tinyurl.com/StataRegressionFeedback




Ista Zahn (Harvard MIT Data Center)   Regression in Stata   January 24 2013   34 / 35
Wrap-up

  Additional resources



        training and consulting
               IQSS workshops:
               http://projects.iq.harvard.edu/rtc/filter_by/workshops
               IQSS statistical consulting: http://rtc.iq.harvard.edu
        Stata resources
               UCLA website: http://www.ats.ucla.edu/stat/Stata/
               Great for self-study
               Links to resources
        Stata website: http://www.stata.com/help.cgi?contents
        Email list: http://www.stata.com/statalist/




Ista Zahn (Harvard MIT Data Center)   Regression in Stata        January 24 2013   35 / 35

More Related Content

What's hot

Data analysis using spss
Data analysis using spssData analysis using spss
Data analysis using spssSyed Faisal
 
"A basic guide to SPSS"
"A basic guide to SPSS""A basic guide to SPSS"
"A basic guide to SPSS"Bashir7576
 
SPSS How to use Spss software
SPSS How to use Spss softwareSPSS How to use Spss software
SPSS How to use Spss softwareDebashis Baidya
 
Data Analysis using SPSS: Part 1
Data Analysis using SPSS: Part 1Data Analysis using SPSS: Part 1
Data Analysis using SPSS: Part 1Taddesse Kassahun
 
Introduction - Using Stata
Introduction - Using StataIntroduction - Using Stata
Introduction - Using StataRyan Herzog
 
Basic guide to SPSS
Basic guide to SPSSBasic guide to SPSS
Basic guide to SPSSpaul_gorman
 
Base SAS Statistics Procedures
Base SAS Statistics ProceduresBase SAS Statistics Procedures
Base SAS Statistics Proceduresguest2160992
 
1 Introduction to SPSS.ppt
1 Introduction to SPSS.ppt1 Introduction to SPSS.ppt
1 Introduction to SPSS.pptAbebeNega
 
Spss lecture notes
Spss lecture notesSpss lecture notes
Spss lecture notesDavid mbwiga
 
Learn SAS Programming
Learn SAS ProgrammingLearn SAS Programming
Learn SAS ProgrammingSASTechies
 
Basics Of SAS Programming Language
Basics Of SAS Programming LanguageBasics Of SAS Programming Language
Basics Of SAS Programming Languageguest2160992
 
Epidata ppt user guide
Epidata ppt user guideEpidata ppt user guide
Epidata ppt user guideSadat Mohammed
 
An introduction to spss
An introduction to spssAn introduction to spss
An introduction to spsszeeshanwrch
 

What's hot (20)

Data analysis using spss
Data analysis using spssData analysis using spss
Data analysis using spss
 
INTRODUCTION TO STATA.pptx
INTRODUCTION TO STATA.pptxINTRODUCTION TO STATA.pptx
INTRODUCTION TO STATA.pptx
 
Spss tutorial 1
Spss tutorial 1Spss tutorial 1
Spss tutorial 1
 
"A basic guide to SPSS"
"A basic guide to SPSS""A basic guide to SPSS"
"A basic guide to SPSS"
 
(Manual spss)
(Manual spss)(Manual spss)
(Manual spss)
 
Introduction To SPSS
Introduction To SPSSIntroduction To SPSS
Introduction To SPSS
 
SPSS How to use Spss software
SPSS How to use Spss softwareSPSS How to use Spss software
SPSS How to use Spss software
 
Types of data
Types of data Types of data
Types of data
 
Introduction to spss
Introduction to spssIntroduction to spss
Introduction to spss
 
Data Analysis using SPSS: Part 1
Data Analysis using SPSS: Part 1Data Analysis using SPSS: Part 1
Data Analysis using SPSS: Part 1
 
Introduction - Using Stata
Introduction - Using StataIntroduction - Using Stata
Introduction - Using Stata
 
Basic guide to SPSS
Basic guide to SPSSBasic guide to SPSS
Basic guide to SPSS
 
Base SAS Statistics Procedures
Base SAS Statistics ProceduresBase SAS Statistics Procedures
Base SAS Statistics Procedures
 
1 Introduction to SPSS.ppt
1 Introduction to SPSS.ppt1 Introduction to SPSS.ppt
1 Introduction to SPSS.ppt
 
Spss lecture notes
Spss lecture notesSpss lecture notes
Spss lecture notes
 
Learn SAS Programming
Learn SAS ProgrammingLearn SAS Programming
Learn SAS Programming
 
Spss
SpssSpss
Spss
 
Basics Of SAS Programming Language
Basics Of SAS Programming LanguageBasics Of SAS Programming Language
Basics Of SAS Programming Language
 
Epidata ppt user guide
Epidata ppt user guideEpidata ppt user guide
Epidata ppt user guide
 
An introduction to spss
An introduction to spssAn introduction to spss
An introduction to spss
 

Viewers also liked

STATA - Panel Regressions
STATA - Panel RegressionsSTATA - Panel Regressions
STATA - Panel Regressionsstata_org_uk
 
Data Science - Part IV - Regression Analysis & ANOVA
Data Science - Part IV - Regression Analysis & ANOVAData Science - Part IV - Regression Analysis & ANOVA
Data Science - Part IV - Regression Analysis & ANOVADerek Kane
 
Simple linear regression (final)
Simple linear regression (final)Simple linear regression (final)
Simple linear regression (final)Harsh Upadhyay
 
Regression analysis
Regression analysisRegression analysis
Regression analysisRavi shankar
 
Simple (and Simplistic) Introduction to Econometrics and Linear Regression
Simple (and Simplistic) Introduction to Econometrics and Linear RegressionSimple (and Simplistic) Introduction to Econometrics and Linear Regression
Simple (and Simplistic) Introduction to Econometrics and Linear RegressionPhilip Tiongson
 
Writing a conceptual framework
Writing a conceptual frameworkWriting a conceptual framework
Writing a conceptual frameworkwtidwell
 
From logistic regression to linear chain CRF
From logistic regression to linear chain CRFFrom logistic regression to linear chain CRF
From logistic regression to linear chain CRFDarren Yow-Bang Wang
 
Econometrics theory and_applications_with_eviews
Econometrics theory and_applications_with_eviewsEconometrics theory and_applications_with_eviews
Econometrics theory and_applications_with_eviewsDeiven Jesus Palpa
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysisnadiazaheer
 
Presentation On Regression
Presentation On RegressionPresentation On Regression
Presentation On Regressionalok tiwari
 
How to write a review article
How to write a review articleHow to write a review article
How to write a review articleHarshit Jadav
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis pptElkana Rorio
 
Reporting Workshop 10.12.08 B
Reporting Workshop 10.12.08 BReporting Workshop 10.12.08 B
Reporting Workshop 10.12.08 BHeather Price
 
Reporting Workshop
Reporting WorkshopReporting Workshop
Reporting WorkshopTitoOlsen
 
Economic exploitation and happiness
Economic exploitation and happinessEconomic exploitation and happiness
Economic exploitation and happinesssamyakshami
 
Stata Learning From Treiman
Stata Learning From TreimanStata Learning From Treiman
Stata Learning From TreimanChengjun Wang
 

Viewers also liked (20)

Regression
RegressionRegression
Regression
 
STATA - Panel Regressions
STATA - Panel RegressionsSTATA - Panel Regressions
STATA - Panel Regressions
 
Data Science - Part IV - Regression Analysis & ANOVA
Data Science - Part IV - Regression Analysis & ANOVAData Science - Part IV - Regression Analysis & ANOVA
Data Science - Part IV - Regression Analysis & ANOVA
 
Simple linear regression (final)
Simple linear regression (final)Simple linear regression (final)
Simple linear regression (final)
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Simple (and Simplistic) Introduction to Econometrics and Linear Regression
Simple (and Simplistic) Introduction to Econometrics and Linear RegressionSimple (and Simplistic) Introduction to Econometrics and Linear Regression
Simple (and Simplistic) Introduction to Econometrics and Linear Regression
 
Lecture 4
Lecture 4Lecture 4
Lecture 4
 
Writing a conceptual framework
Writing a conceptual frameworkWriting a conceptual framework
Writing a conceptual framework
 
Ols by hiron
Ols by hironOls by hiron
Ols by hiron
 
From logistic regression to linear chain CRF
From logistic regression to linear chain CRFFrom logistic regression to linear chain CRF
From logistic regression to linear chain CRF
 
Report writing
Report writingReport writing
Report writing
 
Econometrics theory and_applications_with_eviews
Econometrics theory and_applications_with_eviewsEconometrics theory and_applications_with_eviews
Econometrics theory and_applications_with_eviews
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
Presentation On Regression
Presentation On RegressionPresentation On Regression
Presentation On Regression
 
How to write a review article
How to write a review articleHow to write a review article
How to write a review article
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis ppt
 
Reporting Workshop 10.12.08 B
Reporting Workshop 10.12.08 BReporting Workshop 10.12.08 B
Reporting Workshop 10.12.08 B
 
Reporting Workshop
Reporting WorkshopReporting Workshop
Reporting Workshop
 
Economic exploitation and happiness
Economic exploitation and happinessEconomic exploitation and happiness
Economic exploitation and happiness
 
Stata Learning From Treiman
Stata Learning From TreimanStata Learning From Treiman
Stata Learning From Treiman
 

Similar to Stata statistics

Accounting serx
Accounting serxAccounting serx
Accounting serxzeer1234
 
Accounting serx
Accounting serxAccounting serx
Accounting serxzeer1234
 
Unit III - Statistical Process Control (SPC)
Unit III - Statistical Process Control (SPC)Unit III - Statistical Process Control (SPC)
Unit III - Statistical Process Control (SPC)Dr.Raja R
 
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docxIHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docxwilcockiris
 
SPSS statistics - get help using SPSS
SPSS statistics - get help using SPSSSPSS statistics - get help using SPSS
SPSS statistics - get help using SPSScsula its training
 
SOC2002 Lecture 11
SOC2002 Lecture 11SOC2002 Lecture 11
SOC2002 Lecture 11Bonnie Green
 
OLD SEVEN TOOLS OF QUALTIY MANAGEMENT
OLD SEVEN TOOLS OF QUALTIY MANAGEMENTOLD SEVEN TOOLS OF QUALTIY MANAGEMENT
OLD SEVEN TOOLS OF QUALTIY MANAGEMENTANNA UNIVERSITY
 
UNIT1-2.pptx
UNIT1-2.pptxUNIT1-2.pptx
UNIT1-2.pptxcsecem
 
Graphing stata (2 hour course)
Graphing stata (2 hour course)Graphing stata (2 hour course)
Graphing stata (2 hour course)izahn
 
Datascience Introduction WebSci Summer School 2014
Datascience Introduction WebSci Summer School 2014Datascience Introduction WebSci Summer School 2014
Datascience Introduction WebSci Summer School 2014Claudia Wagner
 
Statistical Process Control (SPC) Tools - 7 Basic Tools
Statistical Process Control (SPC) Tools - 7 Basic ToolsStatistical Process Control (SPC) Tools - 7 Basic Tools
Statistical Process Control (SPC) Tools - 7 Basic ToolsMadeleine Lee
 
03 Design of Experiments - Factor prioritization
03 Design of Experiments - Factor prioritization03 Design of Experiments - Factor prioritization
03 Design of Experiments - Factor prioritizationStefan Moser
 
A review of statistics
A review of statisticsA review of statistics
A review of statisticsedisonre
 
Edisons Statistics
Edisons StatisticsEdisons Statistics
Edisons Statisticsteresa_soto
 

Similar to Stata statistics (20)

Accounting serx
Accounting serxAccounting serx
Accounting serx
 
Accounting serx
Accounting serxAccounting serx
Accounting serx
 
Applied statistics part 5
Applied statistics part 5Applied statistics part 5
Applied statistics part 5
 
Unit III - Statistical Process Control (SPC)
Unit III - Statistical Process Control (SPC)Unit III - Statistical Process Control (SPC)
Unit III - Statistical Process Control (SPC)
 
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docxIHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
IHP 525 Milestone Five (Final) TemplateMOST OF THIS TEMPLATE S.docx
 
SPSS statistics - get help using SPSS
SPSS statistics - get help using SPSSSPSS statistics - get help using SPSS
SPSS statistics - get help using SPSS
 
SOC2002 Lecture 11
SOC2002 Lecture 11SOC2002 Lecture 11
SOC2002 Lecture 11
 
SENIOR COMP FINAL
SENIOR COMP FINALSENIOR COMP FINAL
SENIOR COMP FINAL
 
Tqm old tools
Tqm old toolsTqm old tools
Tqm old tools
 
OLD SEVEN TOOLS OF QUALTIY MANAGEMENT
OLD SEVEN TOOLS OF QUALTIY MANAGEMENTOLD SEVEN TOOLS OF QUALTIY MANAGEMENT
OLD SEVEN TOOLS OF QUALTIY MANAGEMENT
 
Tqm old tools
Tqm old toolsTqm old tools
Tqm old tools
 
UNIT1-2.pptx
UNIT1-2.pptxUNIT1-2.pptx
UNIT1-2.pptx
 
Nursing research
Nursing researchNursing research
Nursing research
 
Data structure
Data   structureData   structure
Data structure
 
Graphing stata (2 hour course)
Graphing stata (2 hour course)Graphing stata (2 hour course)
Graphing stata (2 hour course)
 
Datascience Introduction WebSci Summer School 2014
Datascience Introduction WebSci Summer School 2014Datascience Introduction WebSci Summer School 2014
Datascience Introduction WebSci Summer School 2014
 
Statistical Process Control (SPC) Tools - 7 Basic Tools
Statistical Process Control (SPC) Tools - 7 Basic ToolsStatistical Process Control (SPC) Tools - 7 Basic Tools
Statistical Process Control (SPC) Tools - 7 Basic Tools
 
03 Design of Experiments - Factor prioritization
03 Design of Experiments - Factor prioritization03 Design of Experiments - Factor prioritization
03 Design of Experiments - Factor prioritization
 
A review of statistics
A review of statisticsA review of statistics
A review of statistics
 
Edisons Statistics
Edisons StatisticsEdisons Statistics
Edisons Statistics
 

More from izahn

Introduction to SAS
Introduction to SASIntroduction to SAS
Introduction to SASizahn
 
Stata datman
Stata datmanStata datman
Stata datmanizahn
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programmingizahn
 
Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2izahn
 
R Regression Models with Zelig
R Regression Models with ZeligR Regression Models with Zelig
R Regression Models with Zeligizahn
 
Introduction to the R Statistical Computing Environment
Introduction to the R Statistical Computing EnvironmentIntroduction to the R Statistical Computing Environment
Introduction to the R Statistical Computing Environmentizahn
 

More from izahn (6)

Introduction to SAS
Introduction to SASIntroduction to SAS
Introduction to SAS
 
Stata datman
Stata datmanStata datman
Stata datman
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programming
 
Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2
 
R Regression Models with Zelig
R Regression Models with ZeligR Regression Models with Zelig
R Regression Models with Zelig
 
Introduction to the R Statistical Computing Environment
Introduction to the R Statistical Computing EnvironmentIntroduction to the R Statistical Computing Environment
Introduction to the R Statistical Computing Environment
 

Recently uploaded

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 

Recently uploaded (20)

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 

Stata statistics

  • 1. Regression in Stata Ista Zahn Harvard MIT Data Center January 24 2013 The Institute for Quantitative Social Science at Harvard University Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 1 / 35
  • 2. Outline 1 Introduction 2 Univariate regression 3 Multiple regression 4 Interactions 5 Exporting and saving results 6 Wrap-up Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 2 / 35
  • 3. Introduction Topic 1 Introduction 2 Univariate regression 3 Multiple regression 4 Interactions 5 Exporting and saving results 6 Wrap-up Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 3 / 35
  • 4. Introduction Organization Please feel free to ask questions at any point if they are relevant to the current topic (or if you are lost!) There will be a Q&A after class for more specific, personalized questions Collaboration with your neighbors is encouraged If you are using a laptop, you will need to adjust paths accordingly Make comments in your Do-file rather than on hand-outs Save on flash drive or email to yourself Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 4 / 35
  • 5. Introduction Copy the workshop materials to your home directory Log in to an Athena workstation using your Athena user name and password Click on the “Ubuntu” button on the upper-left and type “term” as shown below Click on the “Terminal” icon as shown above In the terminal, type this line exactly as shown: cd; wget http://tinyurl.com/stata-stats-zip; unzip stata-stats-zip If you see “ERROR 404: Not Found”, then you mistyped the command – try again, making sure to type the command exactly as shown Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 5 / 35
  • 6. Introduction Launch Stata on Athena To start Stata type these commands in the terminal: add stata xstata Open up today’s Stata script In Stata, go to Window => New do file => Open Locate and open the StatStatistics.do script in the StataStatistics folder in your home directory I encourage you to add your own notes to this file! Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 6 / 35
  • 7. Introduction Today’s Dataset We have data on a variety of variables for all 50 states Population, density, energy use, voting tendencies, graduation rates, income, etc. We’re going to be predicting SAT scores Univariate Regression: SAT scores and Education Expenditures Does the amount of money spent on education affect the mean SAT score in a state? Dependent variable: csat Independent variable: expense Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 7 / 35
  • 8. Introduction Opening Files in Stata Look at bottom left hand corner of Stata screen – This is the directory Stata is currently reading from Files are located in the StataDatMan folder Start by changing directory and loading the data // change directory cd "C:/Users/dataclass/Desktop/StataStatistics" // use dir to see what is in the directory: dir // tell Stata to use the states data set use states.dta Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 8 / 35
  • 9. Introduction Steps for Running Regression 1 Examine descriptive statistics 2 Look at relationship graphically and test correlation(s) 3 Run and interpret regression 4 Test regression assumptions Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 9 / 35
  • 10. Univariate regression Topic 1 Introduction 2 Univariate regression 3 Multiple regression 4 Interactions 5 Exporting and saving results 6 Wrap-up Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 10 / 35
  • 11. Univariate regression Univariate Regression: Preliminaries We want to predict csat scores from expense First, let’s look at some descriptives // generate summary statistics for csat and expense sum csat expense // look at codebok codebook csat expense Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 11 / 35
  • 12. Univariate regression Univariate Regression Look at scatterplots, compute correlation matrix, and regress SAT scores on expenditures // graph expense by csat twoway scatter expense csat // correlate csat and expense pwcorr csat expense, star(.05) // run the regression regress csat expense Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 12 / 35
  • 13. Univariate regression Linear Regression Assumptions Assumption 1: Normal Distribution The errors of regression equation are normally distributed Assumption 2: Homoscedasticity (The variance around the regression line is the same for all values of the predictor variable) Assumption 3: Errors are independent Assumption 4: Relationships are linear Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 13 / 35
  • 14. Univariate regression Homoscedasticity Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 14 / 35
  • 15. Univariate regression Testing Assumptions: Normality A simple histogram of the residuals can be informative // graph the residual values of csat predict resid, residual histogram resid, normal Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 15 / 35
  • 16. Univariate regression Testing Assumptions: Homoscedasticity rvfplot Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 16 / 35
  • 17. Multiple regression Topic 1 Introduction 2 Univariate regression 3 Multiple regression 4 Interactions 5 Exporting and saving results 6 Wrap-up Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 17 / 35
  • 18. Multiple regression Multiple Regression Just keep adding predictors Let’s try adding some predictors to the model of SAT scores income :: % students taking SATs percent :: % adults with HS diploma (high) Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 18 / 35
  • 19. Multiple regression Multiple Regression Preliminaries As before, start with descriptive statistics and correlations // descriptive statistics sum income percent high // generate correlation matrix pwcorr csat expense income percent high // regress csat on exense, income, percent, and high regress csat expense income percent high Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 19 / 35
  • 20. Multiple regression Exercise 1: Multiple Regression Open the datafile, states.dta. 1 Select a few variables to use in a multiple regression of your own. Before running the regression, examine descriptive of the variables and generate a few scatterplots. 2 Run your regression 3 Examine the plausibility of the assumptions of normality and homogeneity Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 20 / 35
  • 21. Interactions Topic 1 Introduction 2 Univariate regression 3 Multiple regression 4 Interactions 5 Exporting and saving results 6 Wrap-up Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 21 / 35
  • 22. Interactions Interactions What if we wanted to test an interaction between percent & high? Option 1: generate product terms by hand // generate product of percent and high gen percenthigh = percent*high regress csat expense income percent high percenthigh Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 22 / 35
  • 23. Interactions Interactions What if we wanted to test an interaction between percent & high? Option 2: Let Stata do your dirty work // use the # sign to represent interactions regress csat percent high c.percent#c.high // same as . regress csat c.percent##high Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 23 / 35
  • 24. Interactions Categorical Predictors For categorical variables, we first need to dummy code Use region as example Option 1: create dummy codes before fitting regression model // create region dummy codes using tab tab region, gen(region) // could also use gen / replace //regress csat on region regress csat region1 region2 region3 Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 24 / 35
  • 25. Interactions Categorical Predictors For categorical variables, we first need to dummy code Use region as example Option 2: Let Stata do it for you // regress csat on region using fvvarlist syntax // see help fvvarlist for details regress csat i.region Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 25 / 35
  • 26. Interactions Exercise 2: Regression, Categorical Predictors, & Interactions Open the datafile, states.dta. 1 Add on to the regression equation that you created in exercise 1 by generating an interaction term and testing the interaction. 2 Try adding a categorical variable to your regression (remember, it will need to be dummy coded). You could use region or high25, or generate a new categorical variable from one of the continuous variables in the dataset. Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 26 / 35
  • 27. Exporting and saving results Topic 1 Introduction 2 Univariate regression 3 Multiple regression 4 Interactions 5 Exporting and saving results 6 Wrap-up Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 27 / 35
  • 28. Exporting and saving results Saving and exporting regression tables Usually when we’re running regression, we’ll be testing multiple models at a time Can be difficult to compare results Stata offers several user-friendly options for storing and viewing regression output from multiple models First, download the necessary packages: * install outreg2 package findit outreg2 Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 28 / 35
  • 29. Exporting and saving results Saving and replaying You can store regression model results in Stata // fit two regression models and store the results regress csat expense income percent high estimates store Model1 regress csat expense income percent high i.region estimates store Model2 Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 29 / 35
  • 30. Exporting and saving results Saving and replaying Stored models can be recalled // Display Model1 estimates replay Model1 Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 30 / 35
  • 31. Exporting and saving results Saving and replaying Stored models can be compared // Compare Model1 and Model2 coefficients estimates table Model1 Model2 Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 31 / 35
  • 32. Exporting and saving results Exporting into Excel Avoid human error when transferring coefficients into tables Excel can be used to format publication-ready tables outreg2 [Model1 Model2] using csatprediction.xls, replace Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 32 / 35
  • 33. Wrap-up Topic 1 Introduction 2 Univariate regression 3 Multiple regression 4 Interactions 5 Exporting and saving results 6 Wrap-up Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 33 / 35
  • 34. Wrap-up Help Us Make This Workshop Better Please take a moment to fill out a very short feedback form These workshops exist for you–tell us what you need! ttp://tinyurl.com/StataRegressionFeedback Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 34 / 35
  • 35. Wrap-up Additional resources training and consulting IQSS workshops: http://projects.iq.harvard.edu/rtc/filter_by/workshops IQSS statistical consulting: http://rtc.iq.harvard.edu Stata resources UCLA website: http://www.ats.ucla.edu/stat/Stata/ Great for self-study Links to resources Stata website: http://www.stata.com/help.cgi?contents Email list: http://www.stata.com/statalist/ Ista Zahn (Harvard MIT Data Center) Regression in Stata January 24 2013 35 / 35