Stata Introduction
3h
Hein Stigum & Jonathan Wörn
Presentation, data and syntax at:
https://tinyurl.com/53scv867
Why Stata
• Pro
– Aimed at epidemiology
– Many methods, growing
– Graphics
– Structured, programmable
– Coming soon to a course near you
• Con
– Data are held in memory, so available memory must exceed the file size
This Course
Date    Level       Topic                                                            Teacher
21.1.   Beginner    Introduction to Stata                                            JW
28.2.   Beginner    Graphics for data and results                                    JW
7.3.    Elementary  Linear regression                                                HS
14.3.   Elementary  Logistic regression                                              HS
21.3.   --------------- No course ---------------
28.3.   Advanced    Survival analysis                                                HS
4.4.    Advanced    Automating analysis: loops, macros, …                            HS
11.4.   Advanced    Programming: simulating data, bootstrapping, power calculations  HS
18.4.   Advanced    Individual fixed effects regression                              JW
For more details: https://tinyurl.com/53scv867
Stata introduction
• General use
– Interface and menu
– Do-files and syntax
– Data handling
• Analysis
– Descriptive
– Graphs
– Bivariate
5 Exercises
Exercises
• Course files: https://tinyurl.com/53scv867
– Birth 1 (data file): data for exercises 1-5
– 1 Stata introduction (syntax): solutions to exercises
INTERFACE & FILE TYPES
Welcome to Stata
Interface
Interface Stata 16
do-file edit/browse
Menu
Write syntax in do-file
• New do-file: toolbar icon, Ctrl-9, or the command doedit
• Run: mark the lines, then Ctrl-D (Shift-Command-D on Mac)
Syntax
• Syntax
– [prefix:] command [varlist] [if exp] [in range] [, opts]
• Examples
– mean age
– mean age if sex==1
– bysort sex: summarize age
– summarize age, detail
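The syntax pattern can be tried on a dataset that ships with Stata. This is a minimal sketch, assuming the built-in auto data (loaded with sysuse auto); its variables mpg and foreign stand in for age and sex and are not part of the course data.

```stata
* Example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* command varlist
mean mpg

* command varlist if exp
mean mpg if foreign == 1

* command varlist in range
summarize mpg in 1/20

* prefix: command varlist
bysort foreign: summarize mpg

* command varlist, options
summarize mpg, detail
```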
Data handling
Smart working
• Data (.dta)
– Master file, keep safe
– Working file for each project
• Syntax (.do)
– Work in progress file
– Manuscript file (Table 1…, Figure 1…, Supplement)
• Output (.smcl or .log)
– Save or discard
Using SPSS data
Use and save data
• Open data
– use "C:\Course\Myfile.dta", clear
– Or two lines:
• cd "C:\Course"
• use "Myfile.dta", clear
• Describe
– describe (describe all variables)
– list sex age in 1/20 (list observations 1 to 20)
• Save data
– save "C:\Course\Myfile_new.dta", replace
– Or two lines:
• cd "C:\Course"
• save "Myfile_new.dta", replace
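The open, describe and save steps can be rehearsed without the course files. The sketch below assumes the auto dataset shipped with Stata and a hypothetical file name auto_working.dta, written to the current working directory.

```stata
* Example dataset shipped with Stata (assumption; replace with the course file)
sysuse auto, clear

describe                      // describe all variables
list make price mpg in 1/10   // list observations 1 to 10

* Save a working copy in the current directory; pwd shows where that is
pwd
save "auto_working.dta", replace

* Reopen the saved copy
use "auto_working.dta", clear
```

Saving under a new name leaves the original file untouched, which matches the advice to keep a safe master file and a separate working file.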
Exercise 1
• Download the birth1-datafile (to desktop/folder):
https://tinyurl.com/53scv867
• Start Stata
• Open a new syntax file (Ctrl-9)
– Write all commands in the syntax file
• Open the dataset (use)
• Describe all variables (describe)
• List the first 10 observations of id, weight, sex and mother’s age (mage)
• Save the syntax file (to desktop/folder) for later use
Descriptive
• Continuous
– summarize weight
– summarize weight, detail (adds percentiles and more)
• Categorical
– tabulate bullied
– tabulate bullied, nolab (show coding)
Other descriptives
tabstat mage, stat(N min p50 mean max) by(parity)
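A runnable version of the same idea, assuming the auto dataset shipped with Stata, where mpg and foreign replace the course variables:

```stata
* Example dataset shipped with Stata (assumption; not the course data)
sysuse auto, clear

* One row of statistics per group of foreign, plus a total row
tabstat mpg, statistics(n min p50 mean max) by(foreign)
```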
Generate, replace
• Index (young men)
– generate index=0
– replace index=1 if sex==1 & age<30
• Young/Old
– generate old=(age>50) if age<.
Recode
• Recode 1/2 into 0/1
– recode sex (1=0) (2=1), gen(sex0)
• Alternative
– generate sex0=sex-1
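The generate, replace and recode patterns above can be sketched on the auto dataset shipped with Stata. The variable names cheapforeign, goodrep and rep2 and the cut-offs are hypothetical, chosen only for illustration.

```stata
* Example dataset shipped with Stata (assumption; not the course data)
sysuse auto, clear

* Indicator built in two steps: start at 0, then set to 1 where the condition holds
generate cheapforeign = 0
replace cheapforeign = 1 if foreign == 1 & price < 6000

* Indicator built in one step; "if rep78 < ." leaves it missing where rep78 is missing
generate goodrep = (rep78 > 3) if rep78 < .
assert missing(goodrep) if missing(rep78)

* Recode into a new variable and check the result against the original
recode rep78 (1 2 = 0) (3 4 5 = 1), generate(rep2)
tabulate rep78 rep2, missing
```

Cross-tabulating the old and new variable is a quick way to confirm that a recode did what was intended.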
Labels
• Assign variable label to variable
– label variable girl "Girl (ref. Boy)"
• Assign value label to variable values
– label define girllbl 0 "boy" 1 "girl"
– label values girl girllbl
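The two kinds of labels can be tried as follows. This is a sketch assuming the auto dataset shipped with Stata; the variable heavy, the label name heavylbl and the 3000 cut-off are hypothetical.

```stata
* Example dataset shipped with Stata (assumption; not the course data)
sysuse auto, clear

* Variable label: describes the variable itself
generate byte heavy = (weight > 3000) if weight < .
label variable heavy "Heavy car (ref. light)"

* Value label: define the mapping once, then attach it to the variable
label define heavylbl 0 "light" 1 "heavy"
label values heavy heavylbl

tabulate heavy            // shows the labels
tabulate heavy, nolabel   // shows the underlying coding
```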
Dates
• From numeric to date (3 numeric variables into date variable)
– ex: m=12, d=2, y=1987
– generate birth=mdy(m,d,y)
– format birth %td
• From string to date (1 string variable into date variable)
– ex: bstr="02.12.1987"
– generate birth=date(bstr,"DMY")
– format birth %td
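Both conversions can be checked against each other on a one-row dataset built from the example values on the slide. The names birth1 and birth2 are hypothetical, used so that both results can sit side by side.

```stata
clear
set obs 1

* Three numeric variables into one date variable
generate m = 12
generate d = 2
generate y = 1987
generate birth1 = mdy(m, d, y)

* One string variable into a date variable
generate bstr = "02.12.1987"
generate birth2 = date(bstr, "DMY")

* Display both as dates and confirm that they agree
format birth1 birth2 %td
list
assert birth1 == birth2
```

Without the format command the variables show the underlying day counts rather than readable dates.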
Exercise 2
• Summarize mother’s age
• Tabulate sex
• Recode sex into sex0 with categories 0, 1
• Generate a new gestational age variable in weeks (the old one is in days)
– Summarize the new variable
– Label the new variable (not its values)
• Generate and format a new variable birth in date format based on the three variables day, month and year
– List day, month, year and birth to check the results
Missing
• Note!
– Represented as "." (or .a, .b, …)
– Missing values are stored as numbers larger than any valid value
– The condition age>30 is therefore also true when age is missing
– Adding age<. excludes missing, e.g. if age>30 & age<.
• Change between values and missings
– replace educ = . if educ == 99
– mvdecode educ, mv(99 = . \ 9999 = .a)
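The effect of missing values on a "greater than" condition can be demonstrated directly. This sketch assumes the auto dataset shipped with Stata, in which rep78 has some missing values; treating the value 5 as a missing code is purely illustrative.

```stata
* Example dataset shipped with Stata (assumption; not the course data)
sysuse auto, clear

count if rep78 > 3              // also counts observations where rep78 is missing
local with_missing = r(N)

count if rep78 > 3 & rep78 < .  // excludes missing
local without_missing = r(N)

* The two counts differ by exactly the number of missing values
count if missing(rep78)
assert `with_missing' == `without_missing' + r(N)

* Turn a numeric code into an extended missing value (5 is a made-up missing code here)
mvdecode rep78, mv(5 = .a)
tabulate rep78, missing
```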
Describe missing
• Summarize missing
– misstable summarize weight sex gest
• Missing in tables
– tab bullied sex, missing
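A short sketch of both commands, assuming the auto dataset shipped with Stata (rep78 is the only variable listed here that has missing values):

```stata
* Example dataset shipped with Stata (assumption; not the course data)
sysuse auto, clear

* By default only variables that actually have missing values are reported
misstable summarize price mpg rep78

* The missing option adds a row for missing values to the table
tabulate rep78 foreign, missing
```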
Exercise 3
• Tabulate missing in weight, sex, and gestational age
(gest) with the misstable sum command. Interpret.
• Tabulate gest versus sex and show the number of missing values
• Summarize mage if gest is greater than 260 days
– Will this include missing in gest? Prove it!
– Summarize mage if gest is greater than 260 days, excluding missing in gest
Help
• General
– help command
– findit keyword (searches Stata and the net)
• Examples
– help table
– findit coefplot
• Web resources
– https://www.stata.com/links/resources-for-learning-stata/
Graphics
Twoway plots
• Syntax
– twoway (plot1, opts) (plot2, opts), opts
• One plot
– kdensity bw
– scatter bw gest
[Figure: kernel density estimate of birth weight (kernel = epanechnikov, bandwidth = 102.3251)]
[Figure: scatter plot of birth weight against gestational age]
twoway (scatter bw gest) (fpfitci bw gest) (lfit bw gest)
[Figure: "Weight by gestational age"; y-axis: gram; x-axis: days; legend: scatter, smooth with CI, line fit]
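Overlaid plots are drawn in the order they are listed, so a filled confidence band can hide points drawn before it. The sketch below assumes the auto dataset shipped with Stata and puts the band first; price and weight are stand-ins for the course variables.

```stata
* Example dataset shipped with Stata (assumption; not the course data)
sysuse auto, clear

* Confidence band first, then the points and the straight line on top of it
twoway (fpfitci price weight) ///
       (scatter price weight) ///
       (lfit price weight)
```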
Titles
[Figure: scatter plot showing where title, subtitle, ytitle, xtitle and note are placed]
scatter bw gest, title("title") subtitle("subtitle") ///
xtitle("xtitle") ytitle("ytitle") note("note")
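The same options with real text, as a sketch on the auto dataset shipped with Stata; the title wording is made up for the example.

```stata
* Example dataset shipped with Stata (assumption; not the course data)
sysuse auto, clear

scatter price weight, title("Price by weight") subtitle("1978 automobile data") ///
    xtitle("Weight (lbs.)") ytitle("Price (USD)") note("Source: auto.dta shipped with Stata")
```

The /// at the end of a line continues the command on the next line in a do-file.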
Exercise 4
• Make a density plot of birth weight (weight)
• Make a scatter plot of birth weight versus gestational age (gest)
– Replace the outlier in gestational age (gest) with missing
– Restrict the plot to gestational age greater than 250 days (hint: if gest>250)
– Add a linear fit line to the scatter plot to see the trend
– Add a smoothing curve with confidence interval to the plot (fpfitci) to check for a non-linear pattern. The order of plots matters!
– Add a title, ytitle and xtitle to the plot
Bivariate analysis
2 independent samples
Do boys and girls have the same mean birth weight?
twoway ( kdensity weight if sex==1, lcolor(blue) ) ///
( kdensity weight if sex==2, lcolor(red) )
[Figure: kernel density of birth weight for each sex; x-axis: Birth weight]
• Equal means?
• Equal variance?
2 independent samples test
– ttest weight, by(sex) (2-sample t-test, equal variances)
– ttest weight, by(sex) unequal (2-sample t-test, unequal variances)
– ttest w1 == w2 (paired t-test)
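The t-test variants can be run on datasets shipped with Stata. This sketch assumes auto for the two-sample tests and bpwide (blood pressure before and after, one row per patient) for a paired comparison; neither is part of the course data.

```stata
* Two independent samples (example dataset shipped with Stata; an assumption)
sysuse auto, clear
ttest mpg, by(foreign)             // assumes equal variances
ttest mpg, by(foreign) unequal     // allows unequal variances
display "two-sided p-value = " r(p)

* Paired samples: two variables measured on the same subjects
sysuse bpwide, clear
ttest bp_before == bp_after
```

Comparing the first two outputs shows how much the equal-variance assumption matters for a given dataset.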
Crosstables
equal proportions?
Are boys bullied as much as girls?
tabulate bullied sex, col chi2 nofreq
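A runnable cross-table with column percentages and a chi-square test, sketched on the auto dataset shipped with Stata. The indicator highmpg and its cut-off of 25 are hypothetical.

```stata
* Example dataset shipped with Stata (assumption; not the course data)
sysuse auto, clear

* Binary outcome to compare across the two groups of foreign
generate byte highmpg = (mpg > 25) if mpg < .

* col: column percentages, chi2: Pearson chi-square test, nofreq: hide raw counts
tabulate highmpg foreign, column chi2 nofreq
display "chi-square p-value = " r(p)
```

With the grouping variable in the columns, column percentages give the proportion with the outcome in each group, which is what the test compares.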
Exercise 5
• The variable “magegr2” contains mother’s age in two groups. Do
tab magegr2 and tab magegr2, nolab to find the groups and the
coding. An alternative to find coding is to list all labels: label list
• Make a plot of the birth weight distribution for each of the two
groups of mother’s age.
• Do a ttest of weight by magegr2. Are the means different?
• Redo the ttest for weight>2000 to get more normal distributions.
– Are the means different?
– Are the p-values different?
• Generate an indicator for high birth weight (>4500).
• Make a table of high birth weight by gestgr2 with column percentages and a chi-square test. Is high birth weight more likely with higher gestational age?
Extra (if you have time)
• Do a help tabstat and look at the statistics options
• Do a tabstat of weight showing N min p25 p50 p75 max, by
magegr2
Summing up
• Descriptive
– summarize weight
– tabulate sex
• Graphs
– twoway (plot1, opts) (plot2, opts), opts
• Bivariate
– ttest weight, by(sex)
– tabulate bullied sex, chi2
EXTRA MATERIAL
Copying output
• Copy graphs to Word or PowerPoint
– Save graphs in many formats, or
– Right-click on a graph to copy
• Copy output to Word or PowerPoint
– Mark output and right-click
– “Copy as picture”
• Copy tables to Excel
– Mark table, Ctrl-Shift-C
Save output (Log results)
• Save a portion of the output as a .smcl file
capture log close
log using "results.smcl", replace
…
log close
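A complete logging round trip, as a sketch: the commands between log using and log close are placeholders run on the auto dataset shipped with Stata, and the final translate step (an optional addition, not on the slide) writes a plain-text copy of the log.

```stata
capture log close                   // close any open log; ignore the error if none is open
log using "results.smcl", replace   // start logging

* Anything run here ends up in the log (example dataset is an assumption)
sysuse auto, clear
summarize price mpg

log close                           // stop logging

* Optional: convert the SMCL log to plain text
translate "results.smcl" "results.log", replace
```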
Keep plots during session
• Set "tabbed" graphics
– set autotabgraphs on, permanently
• Give each plot a name
– twoway …, name(scatter, replace)
Stata via kiosk
• Stata
– https://kiosk.uio.no
– Analyse > Stata (single click, wait…)
– VMware Horizon