The document provides instructions for importing, merging, labeling, and summarizing student GPA data using Stata software. It shows how to import an Excel dataset, merge it with another dataset, label the data, variables, and values. It then demonstrates generating new log-transformed variables, calculating means, recoding variables, tabulating and describing the data, plotting crosstabs and graphs, and performing a regression analysis to check for issues like multicollinearity and heteroscedasticity. The document is teaching someone how to conduct an analysis of student GPA data in Stata.
2. Import “GPA 2_1.dta”
FROM MENU
File
Import
Excel Spreadsheet
Browse and select
Import 1st row as variable name
OK
Command: use "D:\Session 3 4 Nazmul Hossain\gpa2_1.dta"
br
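The menu path above imports an Excel spreadsheet, whereas `use` loads a file that is already saved in Stata format. The command equivalent of the menu path might look like the sketch below. The workbook name `toy_gpa.xlsx` and the values in it are hypothetical; the example creates the file itself so that it can be run anywhere.

```stata
* Build a tiny made-up dataset and write it to an Excel workbook
clear
input id sat
1 1000
2 1100
3 900
end
export excel using "toy_gpa.xlsx", firstrow(variables) replace

* Command equivalent of File > Import > Excel Spreadsheet,
* with "Import 1st row as variable name" ticked (the firstrow option)
import excel using "toy_gpa.xlsx", firstrow clear

* Inspect the result (list is used instead of br so it also runs in batch mode)
describe
list
```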
3. Merge
FROM MENU
Data
Combine dataset
Merge two datasets
One to one
Key Variable: id
Browse “GPA 2_2.dta”
Command: merge 1:1 id using "C:\Users\Dept. of Economics\Desktop\BER 20_04_2019\gpa2_2.dta"
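After a one-to-one merge, Stata adds a variable named `_merge` that records whether each observation was found in both files. A minimal sketch of how the result could be checked, using two made-up toy datasets held in temporary files rather than the GPA files themselves:

```stata
* First toy dataset (values are made up)
clear
input id sat
1 1000
2 1100
3 900
end
tempfile part1
save `part1'

* Second toy dataset with the same key variable
clear
input id colgpa
1 3.1
2 2.4
3 3.8
end
tempfile part2
save `part2'

* Merge on id and inspect the match status
use `part1', clear
isid id                      // id must uniquely identify observations for 1:1
merge 1:1 id using `part2'
tab _merge                   // 3 = matched in both files
assert _merge == 3           // stops with an error if any id is unmatched
drop _merge                  // remove it before any later merge
```

The `assert` line is optional; it is a design choice for do-files where an incomplete match should stop the run instead of passing silently.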
7. Generating New Variables (log form)
▸ gen lsat=ln(sat)
▸ gen ltothrs=ln(tothrs)
▸ gen lcolgpa=ln(colgpa)
▸ label var lsat "ln of sat"
▸ label var ltothrs "ln of tothrs"
▸ label var lcolgpa "ln of colgpa"
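The summary statistics in this document report a minimum colgpa of 0, and `ln()` returns a missing value for arguments that are not positive. A short sketch with made-up values shows how the affected observations could be counted after generating a log variable:

```stata
* Toy data including a zero GPA (values are made up)
clear
input colgpa
0
2.5
4
end

gen lcolgpa = ln(colgpa)     // Stata reports how many missing values were generated
count if colgpa <= 0         // observations for which the log is undefined
count if missing(lcolgpa)    // the same observations, seen from the new variable
list
```

Observations with a missing lcolgpa are dropped automatically from any regression that uses it as the dependent variable.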
9. Recoding “colgpa”
▸ recode colgpa (min/1.99=1 "Low GPA") (2/3.49=2 "Moderate GPA") (3.5/max=3 "High GPA"), gen(ggpa)
GPA after fall semester   Label
0-1.99                    Low GPA
2-3.49                    Moderate GPA
3.5-4                     High GPA
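The recode ranges above leave small gaps: a value such as 1.995 or 3.495 matches none of the three rules, and `recode` with `gen()` copies unmatched values unchanged. One possible alternative, sketched here with made-up values, uses inequalities so that every non-missing GPA falls into exactly one group. The variable name `ggpa2`, the label name `ggpa2_lbl`, and the choice to place values between 3.49 and 3.5 in the moderate group are assumptions for illustration.

```stata
* Toy data covering the boundaries (values are made up)
clear
input colgpa
0
1.995
2
3.49
3.5
4
end

* Group by inequalities so no value can fall between two ranges
gen byte ggpa2 = 1 if colgpa < 2
replace ggpa2 = 2 if colgpa >= 2 & colgpa < 3.5
replace ggpa2 = 3 if colgpa >= 3.5 & !missing(colgpa)

* Attach the same value labels used on the slide
label define ggpa2_lbl 1 "Low GPA" 2 "Moderate GPA" 3 "High GPA"
label values ggpa2 ggpa2_lbl
list colgpa ggpa2
```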
10. Tabulate “ggpa”
▸ tab ggpa
RECODE of colgpa (GPA after fall semester)
                 Freq.   Percent     Cum.
Low GPA            592     14.31    14.31
Moderate GPA      3087     74.62    88.93
High GPA           458     11.07   100.00
11. Describe
▸ des
obs:     4,137                          GPA 2
vars:       18                          25 Apr 2019 02:06
size:  297,864
--------------------------------------------------------------------------------
                storage display    value
variable name   type    format     label        variable label
--------------------------------------------------------------------------------
id              float   %9.0g
sat             float   %9.0g                   combined SAT score
tothrs          float   %9.0g                   total hours through fall semest
colgpa          float   %9.0g                   GPA after fall semester
athlete         float   %11.0g     athlete_lbl  =1 if athlete
verbmath        float   %9.0g                   verbal/math SAT score
hsize           float   %9.0g                   size graduating class, 100s
hsrank          float   %9.0g                   rank in graduating class
hsperc          float   %9.0g                   100*(hsrank/hssize)
female          float   %9.0g      female_lbl   =1 if female
white           float   %9.0g      white_lbl    =1 if white
black           float   %9.0g      black_lbl    =1 if black
hsizesq         float   %9.0g                   hsize^2
lsat            float   %9.0g                   ln of sat
ltothrs         float   %9.0g                   ln of tothrs
lcolgpa         float   %9.0g                   ln of colgpa
meangpa         float   %9.0g
ggpa            float   %12.0g     ggpa         RECODE of colgpa (GPA after fall semester)
--------------------------------------------------------------------------------
Sorted by: id
12. Summarize
▸ sum sat tothrs colgpa hsize hsrank hsperc athlete female white black
Variable      Obs       Mean   Std.Dev.     Min     Max
sat          4137   1030.331    139.401     470    1540
tothrs       4137     52.832      35.33       6     137
colgpa       4137      2.653       .659       0       4
hsize        4137        2.8      1.737     .03     9.4
hsrank       4137      52.83     64.684       1     634
hsperc       4137     19.237     16.569    .167      92
athlete      4137       .047       .211       0       1
female       4137        .45       .498       0       1
white        4137       .926       .263       0       1
black        4137       .055       .229       0       1
TABLE: Descriptive Statistics
13. Crosstab
▸ tab athlete black, row
Tabulation of athlete black
Rows: athlete (=1 if athlete); columns: black (=1 if black)
athlete          0         1     Total
0             3758       185      3943
             95.31      4.69    100.00
1              150        44       194
             77.32     22.68    100.00
Total         3908       229      4137
             94.46      5.54    100.00
First row has frequencies and second row has row percentages
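Each row percentage in the crosstab is the cell frequency divided by its row total. As a quick check, the figures can be reproduced with `display`, using only the frequencies reported in the table:

```stata
* Row percentages = 100 * cell frequency / row total
display %6.2f 100*3758/3943    // non-athletes who are not black: 95.31
display %6.2f 100*185/3943     // non-athletes who are black:      4.69
display %6.2f 100*150/194      // athletes who are not black:     77.32
display %6.2f 100*44/194       // athletes who are black:         22.68
```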
20. Post-estimation: Checking for multicollinearity
. corr lsat ltothrs hsize athlete female black
             |     lsat  ltothrs    hsize  athlete   female    black
-------------+------------------------------------------------------
        lsat |   1.0000
     ltothrs |   0.0121   1.0000
       hsize |   0.0652  -0.0397   1.0000
     athlete |  -0.2079   0.0107   0.0493   1.0000
      female |  -0.1446   0.0314  -0.0042  -0.0970   1.0000
       black |  -0.2533  -0.0054  -0.0575   0.1663   0.0213   1.0000
There is no multicollinearity problem: the largest pairwise correlation in absolute value is 0.2533 (between lsat and black).
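Pairwise correlations only detect collinearity between two variables at a time. A common complement is the variance inflation factor, available through `estat vif` after `regress`. The sketch below assumes that the merged GPA data with the log variables is in memory and that the model regresses lcolgpa on the six variables in the correlation matrix; the regression actually estimated is not shown in this section, so the regressor list is an assumption.

```stata
* Assumes the merged GPA data is in memory.
* The regressor list is an assumption copied from the corr command above.
regress lcolgpa lsat ltothrs hsize athlete female black

* Variance inflation factors for each regressor
estat vif
```

A frequently quoted rule of thumb treats VIF values above 10 as a sign of problematic collinearity, though the threshold is a convention rather than a formal test.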
21. Post-estimation: Checking for heteroscedasticity
. estat imtest, white
White's test for Ho: homoskedasticity
         against Ha: unrestricted heteroskedasticity
         chi2(31)    = 183.61
         Prob > chi2 = 0.0000
Cameron & Trivedi's decomposition of IM-test
---------------------------------------------------
              Source |       chi2     df        p
---------------------+-----------------------------
  Heteroskedasticity |     183.61     31   0.0000
            Skewness |      63.89      7   0.0000
            Kurtosis |      10.33      1   0.0013
---------------------+-----------------------------
               Total |     257.82     39   0.0000
---------------------------------------------------
There is a heteroscedasticity problem: the null hypothesis of homoskedasticity is rejected (Prob > chi2 = 0.0000).
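When White's test rejects homoskedasticity, a usual remedy is to keep the OLS coefficients and report heteroskedasticity-robust standard errors. A minimal sketch, assuming the merged GPA data is in memory; the regressor list is again an assumption, since the estimated model is not shown in this section.

```stata
* p-value implied by the reported White statistic, chi2(31) = 183.61
display chi2tail(31, 183.61)

* Same regression with robust (Huber-White) standard errors.
* The regressor list is hypothetical; substitute the model actually estimated.
regress lcolgpa lsat ltothrs hsize athlete female black, vce(robust)
```

The coefficients are identical to those from plain `regress`; only the standard errors, t statistics, and confidence intervals change.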
22. Post-estimation: Checking for specification bias
. estat ovtest
Ramsey RESET test using powers of the fitted values of lcolgpa
       Ho: model has no omitted variables
       F(3, 4124) = 1.26
       Prob > F = 0.2854
There is no specification bias problem: the null hypothesis of no omitted variables is not rejected (Prob > F = 0.2854).
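The reported p-value is the upper-tail probability of the F distribution at the test statistic. It can be checked with Stata's `Ftail()` function using the numbers from the output above; because the F statistic is rounded to two decimals, the result should be close to, but not necessarily identical to, the reported 0.2854.

```stata
* Upper-tail probability of F(3, 4124) at the reported statistic 1.26
display %6.4f Ftail(3, 4124, 1.26)
```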
24. From STATA to MS Word
outreg2 command
outreg2 using "myfile.doc", replace ctitle(model)
Table notes: Robust standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
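`outreg2` is a user-written command, so it has to be installed once and it exports the most recently estimated model. A sketch of the full sequence, assuming the merged GPA data is in memory; the regressor list is an assumption, since the estimated model is not shown in this section.

```stata
* One-time installation from the SSC archive
ssc install outreg2

* Estimate the model first; outreg2 exports the most recent estimates.
* The regressor list is hypothetical; substitute the model actually estimated.
regress lcolgpa lsat ltothrs hsize athlete female black, vce(robust)

* Write the regression table to a Word-readable file in the working directory
outreg2 using "myfile.doc", replace ctitle(model)
```

The `replace` option overwrites an existing myfile.doc; using `append` instead adds the model as a new column next to earlier ones.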
25. From STATA to MS Word
asdoc command
Install: ssc install asdoc
Run: type asdoc in front of the Stata command