INTRODUCTION TO STATA
17TH APRIL, 2022 DHANANJAY K UMAR
What is Stata?
Stata is a general-purpose, integrated statistical software package, first released in 1985 and developed by StataCorp.
It enables users to manage, analyze, and graph data.
It provides commands for statistical tests and econometric analysis, covering cross-sectional data, time series, panel data (cross-sectional time-series, longitudinal, repeated-measures), survival-time data, cohort analysis, and more.
Stata can be used either through drop-down menus or by typing commands.
It is user friendly and has an extensive library of tools; its internet capabilities let it install updates and new features regularly.
Versions of Stata
Stata comes in three editions.
All three are available for 32-bit and 64-bit computers. They differ mainly in the number of observations and variables they can handle and in data processing speed.
1) Stata/IC (Intercooled Stata; renamed Stata/BE, Basic Edition, in Stata 17) – handles up to 2,047 variables.
2) Stata/SE (Special Edition) – handles up to 32,767 variables (and also allows longer string variables and larger matrices).
3) Stata/MP (Multicore/Multiprocessor) – handles at least as many variables as Stata/SE (up to 120,000 in Stata 15 and later) and is substantially faster on multicore computers.
STATA INTERFACE
Main window (screenshot labels): version name, Results window, Review window, Command window, Variables window, Properties window, menu-driven commands.
Toolbar (screenshot labels): open a .dta file, save and print the file, log file, do-file, Data Editor, Variable Manager.
Do-file setup
* Do-files are text files in which users store and run their commands for reuse, rather than retyping the commands into the Command window.
◦ They are used for reproducibility and for easier debugging and changing of commands.
◦ The file extension .do is used for do-files.
◦ The command doedit (doe) opens the Do-file Editor.
Syntax highlighting
The Do-file Editor colors Stata commands blue.
Comments, which are not executed, are usually preceded by * and are colored green.
Words in quotes (file names, string values) are colored red.
Stata 16 introduced an enhanced editor with tab auto-completion for Stata commands and previously typed words.
Running commands from the do-file
To run a command from the do-file, highlight part or all of it and then press Ctrl+D (Mac: Shift+Cmd+D), or click the "Execute (do)" icon, the rightmost icon on the Do-file Editor toolbar.
Multiple commands can be selected and executed together.
COMMENTS
Comments are not executed, so they provide a way to document the do-file.
A comment is either preceded by * (at the start of a line) or surrounded by /* and */.
Comments appear in green in the Do-file Editor.
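The two comment styles described above can be sketched in a short do-file. The auto dataset loaded by sysuse ships with Stata; it is used here only as an assumed example and does not come from the slides.

```stata
* A whole-line comment starts with an asterisk
sysuse auto, clear      /* a comment can also be enclosed like this */

/* A block comment
   may span several lines */
summarize price
```

Only the sysuse and summarize commands are executed; everything else is ignored by Stata.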
Long lines in do-files
Stata normally assumes that a newline signifies the end of a command.
You can extend a command over multiple lines by placing /// at the end of each line except the last.
Make sure to put a space before ///.
When executing, highlight every line of the command(s).
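As an illustration of line continuation, the sketch below splits one summarize command over three lines. The auto dataset and the chosen variables are assumptions made for the example.

```stata
sysuse auto, clear

* One command spread over three lines; note the space before each ///
summarize price mpg weight ///
    if foreign == 0, ///
    detail
```

If only the first line were highlighted and run, Stata would not see the if clause or the detail option, which is why the whole command should be selected.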
Rules for naming a variable
A) Letters – upper- or lower-case English letters (variable names, like commands, are case sensitive, so age and AGE are different variables).
B) Digits – any digit from 0 to 9 can be used, although the first character cannot be a digit.
C) Symbol – the underscore (_) is the only symbol allowed.
D) Length – a name can have up to 32 characters.
Examples – age, AGE, age_1, age3, age32, AGE_1 (valid)
1age, 1AGE, age@, age@1, 3_age, age&3 (invalid)
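One way to check these rules without creating any variables is the confirm names command combined with capture. This is only a sketch; the names tested are taken from the examples above.

```stata
* confirm names exits with an error if a name is invalid;
* capture traps the error and stores the return code in _rc
capture confirm names age_1
display "age_1: return code " _rc      /* 0 means the name is valid */

capture confirm names 1age
display "1age:  return code " _rc      /* nonzero: starts with a digit */

capture confirm names age@1
display "age@1: return code " _rc      /* nonzero: @ is not allowed */
```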
Importing data
use             load Stata dataset
save            save Stata dataset
clear           clear dataset from memory
import excel    import Excel dataset
Using the drop-down menu: File -> Import -> Excel spreadsheet
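The four commands can be tried together in a round trip like the one below. It is a sketch under assumptions: the auto dataset ships with Stata, while the file names auto_copy.dta and auto_copy.xlsx are hypothetical and are written to the current working directory.

```stata
sysuse auto, clear                 /* example dataset shipped with Stata */
save "auto_copy.dta", replace      /* save writes a .dta file */
export excel using "auto_copy.xlsx", firstrow(variables) replace

clear                              /* memory is now empty */
use "auto_copy.dta"                /* load the Stata dataset again */

/* firstrow treats the first spreadsheet row as variable names */
import excel using "auto_copy.xlsx", firstrow clear
describe
```

The clear option on import excel replaces whatever data are currently in memory, so unsaved changes should be saved first.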
Viewing data
browse    open spreadsheet of data
list      print data to Stata console
Once the data are loaded, we can view the dataset as a spreadsheet using the command browse.
The icon showing a magnifying glass over a spreadsheet also browses the dataset.
Black columns are numeric, red columns are strings, and blue columns are numeric with value labels.
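A brief sketch of both viewing commands, again assuming the auto dataset. The browse line is commented out because it opens a window and therefore needs the graphical interface.

```stata
sysuse auto, clear

list make price mpg in 1/5           /* print five rows in the Results window */
list make foreign in 1/3, nolabel    /* show numeric codes instead of value labels */

* browse make price mpg              /* opens the spreadsheet view (GUI only) */
```

In the Data Browser, foreign would appear in blue because it is a numeric variable with value labels attached.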
Operators and Expressions
These are the key arithmetic, logical, and relational operators to keep in mind:
Arithmetic                   Logical             Relational
+  add                       !  not (also ~)     ==  equal
-  subtract                  |  or               !=  not equal (also ~=)
*  multiply                  &  and              <   less than
/  divide                                        <=  less than or equal
^  raise to power                                >   greater than
+  string concatenation                          >=  greater than or equal
Use the display command to use Stata as a calculator.
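A few calculator-style lines show how the operators combine. The numbers and strings are arbitrary illustrations.

```stata
display 2 + 3 * 4            /* multiplication is evaluated before addition */
display (2 + 3) * 4          /* parentheses change the order */
display 2^10                 /* raise to power */
display "Sta" + "ta"         /* + joins strings */
display 5 > 3 & 2 != 2       /* logical result: 1 is true, 0 is false */
```

Relational and logical expressions evaluate to 1 when true and 0 when false, which is what makes them usable inside generate and if.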
Selecting observations
Many commands are run on a subset of the observations in the dataset.
in    select by observation number
in selects by observation (row) number.
Syntax: in firstobs/lastobs
30/100 – observations 30 through 100
Negative numbers count from the end; L means the last observation.
-10/L – the tenth-from-last observation through the last observation
if    select by condition
if selects observations that meet a certain condition.
gender == 1 (male)
math > 50
The if clause is usually placed after the command specification but before the comma that precedes the list of options.
The basic structure is: command if exp, options
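The in and if qualifiers can be sketched on the auto dataset (an assumed example; the variables gender and math mentioned above are not in it, so foreign, price, and mpg stand in for them).

```stata
sysuse auto, clear

list make price in 1/5                  /* first five observations */
list make price in -10/L                /* last ten observations */

summarize price if foreign == 1         /* foreign cars only */
summarize price if mpg > 25, detail     /* if comes before the comma */
count if foreign == 1 & price > 10000   /* two conditions joined by & */
```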
Exploring data
codebook     inspect variable values
summarize    summarize distribution
describe     describe the variables
tabulate     tabulate frequencies (tab, tab1, tab2); options row, column
tabstat      table of summary statistics
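A possible first look at a dataset using these commands is sketched below, assuming the auto dataset that ships with Stata.

```stata
sysuse auto, clear

describe                                 /* variable names, types, labels */
codebook rep78                           /* values and missing counts for one variable */
summarize price mpg, detail              /* means, percentiles, skewness, ... */
tabulate foreign                         /* one-way frequency table */
tabulate rep78 foreign, row column       /* two-way table with row and column percentages */
tabstat price mpg, by(foreign) statistics(mean sd min max)
```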
Data Management
generate          create variable
egen              extended variable generation
replace           replace values of variable
rename            rename variable
recode            recode variable values
label variable    give variable description
label define      define a set of value labels
label values      apply value labels to variable
keep              keep variables, drop others
drop              drop variables, keep others
keep if           keep observations, drop others
drop if           drop observations, keep others
sort              sort by variables, ascending
gsort             ascending and descending sort
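The keep, drop, rename, and sorting commands in the table can be sketched as follows. The dataset (auto) and the new name miles_per_gallon are assumptions for illustration.

```stata
sysuse auto, clear

keep make price mpg foreign rep78      /* keep these variables, drop the rest */
drop rep78                             /* drop one more variable */
rename mpg miles_per_gallon

keep if foreign == 0                   /* keep domestic cars only */
drop if missing(price)                 /* drop observations with missing price */

sort price                             /* ascending */
gsort -price make                      /* descending price, then ascending make */
list in 1/3
```

Because keep and drop change the data in memory, it is usually safer to run them on a copy or to save the result under a new file name.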
Generate (gen) command
The gen command creates a new variable using an expression that may combine constants, variables, functions, and arithmetic and logical operators.
gen id = _n /* observation number */
gen total = _N /* total number of observations */
gen ten = 10 /* constant value of 10 */
gen tensq = ten^2 /* square of ten */
gen lnten = log(ten) /* natural logarithm of ten */
The egen command creates new variables based on summary measures such as sum, mean, min, and max.
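The slide's own egen example did not survive, so the following is only a sketch of how egen is typically used, assuming the auto dataset. All new variable names are hypothetical. For sums, egen's function is total().

```stata
sysuse auto, clear

egen mean_price     = mean(price)                /* overall mean, same in every row */
egen mean_price_grp = mean(price), by(foreign)   /* mean within each group */
egen max_mpg        = max(mpg)
egen total_weight   = total(weight)              /* sum over all observations */

* egen results can feed an ordinary generate
generate price_dev = price - mean_price_grp
list make foreign price mean_price_grp price_dev in 1/5
```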
replace
The typical syntax to replace values of an existing variable is:
replace oldvar = exp [if] [in]
The expression exp follows the same rules as for the generate command above and may refer to oldvar itself. Here are two examples:
replace oldvar = oldvar * 5
replace oldvar = oldvar * -1 if oldvar < 0
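A runnable sketch of replace with and without qualifiers, assuming the auto dataset. The variables price_k and mpg_cap are hypothetical copies made so that the original columns stay untouched.

```stata
sysuse auto, clear

generate price_k = price
replace price_k = price_k / 1000          /* rescale every observation */

generate mpg_cap = mpg
replace mpg_cap = 30 if mpg_cap > 30      /* change only rows meeting the condition */
replace mpg_cap = . in 1/2                /* set the first two rows to missing */

summarize price_k mpg_cap
```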
*recode
This command is useful for dealing with missing values or special codes in existing variables and for changing the values of categorical variables.
recode varlist (rule) [(rule) ...] [, generate(newvar)]
Rule              Example        Meaning
# = #             3 = 1          3 recoded to 1
# # = #           2 . = 9        2 and . recoded to 9
#/# = #           1/5 = 4        1 through 5 recoded to 4
nonmissing = #    nonmiss = 8    all other nonmissing values recoded to 8
missing = #       miss = 9       all other missing values recoded to 9
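The rules in the table can be combined in one call. The sketch below assumes the auto dataset, whose rep78 variable takes the values 1 to 5 and has some missing values; rep3 is a hypothetical name for the recoded copy.

```stata
sysuse auto, clear

* Collapse five repair categories into three and give missing values the code 9;
* generate() stores the result in a new variable and leaves rep78 unchanged
recode rep78 (1/2 = 1) (3 = 2) (4/5 = 3) (missing = 9), generate(rep3)

tabulate rep78 rep3, missing
```

Using generate() rather than overwriting the original makes it easy to cross-tabulate old against new values and confirm that the rules did what was intended.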
Labelling Variables and Values
Raw datasets, especially large ones, often contain variable names that are not intuitive. For example, don't be surprised to find variables named a001s01 or d23s02r34. For this reason, it is important to label variables so that we understand exactly what they mean, but labels should stay concise (Stata limits a variable label to 80 characters). To attach a label to a variable, use the label variable command in the following way:
label variable name "name of the head of the household"
*The values of a categorical variable are codes that stand for categories, unlike the values of a continuous variable. To show what each code means, first define a set of value labels and then apply it to the variable, as shown below.
label define sexlbl 0 "Female" 1 "Male"
label values sexhead sexlbl
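The same three labelling steps can be tried on data that ships with Stata. In this sketch the variable heavy and the label set heavylbl are hypothetical; only the auto dataset and its weight variable are real.

```stata
sysuse auto, clear

generate byte heavy = weight > 3000          /* hypothetical 0/1 variable */

label variable heavy "Weight above 3,000 lbs."
label define heavylbl 0 "Light" 1 "Heavy"    /* define the set of value labels */
label values heavy heavylbl                  /* attach the set to the variable */

tabulate heavy                               /* shows Light/Heavy instead of 0/1 */
```

Note that the quotes must be plain straight quotes; curly quotes pasted from a slide or word processor are not accepted by Stata.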
*Creating a dummy variable using gen command
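The slide's own example is not preserved, so the following is only a sketch of the usual approach: a logical expression inside gen returns 1 when true and 0 when false. The auto dataset and the names high_mpg and low_rep are assumptions.

```stata
sysuse auto, clear

* 1 if mpg is above 25, 0 otherwise
generate byte high_mpg = mpg > 25 if !missing(mpg)

* rep78 has missing values; the if qualifier keeps them missing in the dummy
generate byte low_rep = rep78 <= 2 if !missing(rep78)

tabulate high_mpg
tabulate low_rep, missing
```

The !missing() guard matters because Stata treats a missing value as larger than any number, so a condition such as x > 25 would otherwise be true for missing x.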
*By-group processing
To execute a Stata command separately for each group of observations that share the same values of the variables in varlist, type:
by varlist: command
Most commands allow the by prefix, but the data must be sorted by varlist first (precede the command with sort varlist, or use bysort):
bysort varlist: command
Examples:
bysort id: summarize varname
bysort id: tabulate varname
bysort id: tabulate varname if varname >= 18
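A runnable counterpart to the generic examples, assuming the auto dataset with foreign as the grouping variable. The variable price_rank is hypothetical.

```stata
sysuse auto, clear

sort foreign
by foreign: summarize price              /* by requires sorted data */

bysort foreign: tabulate rep78           /* bysort sorts and groups in one step */

* Within each group, order by price and number the observations
bysort foreign (price): generate price_rank = _n
list make foreign price price_rank if price_rank <= 2
```

Under the by prefix, _n counts observations within each group rather than over the whole dataset.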
Combining Datasets
append    add more observations
merge     add more variables, joining on a matching variable
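Both commands can be sketched by splitting the auto dataset and putting it back together. The temporary file names domestic and prices are hypothetical, and the sketch assumes that make uniquely identifies each car in that dataset.

```stata
* append: stack two sets of observations
sysuse auto, clear
keep if foreign == 0
tempfile domestic
save `domestic'

sysuse auto, clear
keep if foreign == 1
append using `domestic'          /* domestic cars added below foreign cars */
count

* merge: add variables from another file, matching on make
keep make price
tempfile prices
save `prices'

sysuse auto, clear
keep make mpg
merge 1:1 make using `prices'    /* one row per make in both files */
tabulate _merge
```

Because tempfile stores the file names in local macros, the lines should be run together as one do-file rather than one at a time. The _merge variable created by merge reports which rows were matched.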
Some other statistical commands.
summarize : descriptive statistics
correlate : correlation matrices
ttest : perform 1-, 2-sample and paired t-tests
anova : 1-, 2-, n-way analysis of variance
regress : least squares regression
predict : generate fitted values, residuals, etc.
test : test linear hypotheses on parameters
logit, logistic : logit model, logistic regression
probit : binomial probit model
tobit : one- and two-limit Tobit model
cnreg : censored-normal regression (generalized Tobit)
reg3 : three-stage least squares
lincom : linear combinations of parameters
cnsreg : regression with linear constraints
testnl : test nonlinear hypothesis on parameters
margins : marginal effects (elasticities, etc.)
ivregress : instrumental variables regression
prais : regression with AR(1) errors
sureg : seemingly unrelated regressions
qreg : quantile regression
ologit, oprobit : ordered logit and probit models
mlogit : multinomial logit model
poisson : Poisson regression
heckman : selection model
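A few of these commands are commonly used in sequence: estimate, then predict, then test. The sketch below assumes the auto dataset and an arbitrary model chosen only for illustration.

```stata
sysuse auto, clear

correlate price mpg weight           /* correlation matrix */
regress price mpg weight             /* least squares regression */

predict price_hat, xb                /* fitted values */
predict resid, residuals             /* residuals */

test mpg weight                      /* joint test that both coefficients are zero */
```

predict and test always refer to the most recently estimated model, so they should follow directly after the estimation command they belong to.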
Importing a .txt file (fixed-width data)
Important datasets are very often distributed as fixed-width text files that carry information on each household, individual, or firm. In Stata, such a .txt file can be imported with the following command:
infix specifications using <filename>
In this example, the following command imports the PLFS text file provided by the NSSO:
infix id 1-3 FSU 4-8 Round 9-10 Schedule 11-13 Sample 14 Sector 15 State 16-18 Dist 19-20 Stratum 21-22 ///
    Sub 23-24 Sub_round 25 Sub_Sample 26 FOD 27-30 HG 31 Second_Stage_Str 32 Sample_HH_No 33-34 ///
    level 35-36 filler 37-41 Informant_sl_no 42-43 response_code 44 survey_code 45 subst_code 46 ///
    using "c:PLFSDataABCD.TXT"
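To see how the column specifications work without the PLFS file, the sketch below writes a tiny fixed-width file and reads it back. The file contents and the variable names id, grp, and score are invented for the example.

```stata
* Write two fixed-width records to a temporary text file
tempfile raw
tempname fh
file open `fh' using "`raw'", write text replace
file write `fh' "001A12" _n
file write `fh' "002B07" _n
file close `fh'

* Columns 1-3 hold id, column 4 a one-letter string, columns 5-6 a score
infix id 1-3 str grp 4 score 5-6 using "`raw'", clear
list
```

Each variable name is followed by the columns it occupies; str marks a variable that should be read as text rather than as a number.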

More Related Content

What's hot

Introduction To SPSS
Introduction To SPSSIntroduction To SPSS
Introduction To SPSSPhi Jack
 
STATA - Introduction
STATA - IntroductionSTATA - Introduction
STATA - Introductionstata_org_uk
 
Basic guide to SPSS
Basic guide to SPSSBasic guide to SPSS
Basic guide to SPSSpaul_gorman
 
Stata statistics
Stata statisticsStata statistics
Stata statisticsizahn
 
SPSS How to use Spss software
SPSS How to use Spss softwareSPSS How to use Spss software
SPSS How to use Spss softwareDebashis Baidya
 
"A basic guide to SPSS"
"A basic guide to SPSS""A basic guide to SPSS"
"A basic guide to SPSS"Bashir7576
 
Spss lecture notes
Spss lecture notesSpss lecture notes
Spss lecture notesDavid mbwiga
 
SPSS an intro...
SPSS an intro...SPSS an intro...
SPSS an intro...Jithin Zcs
 
Software packages for statistical analysis - SPSS
Software packages for statistical analysis - SPSSSoftware packages for statistical analysis - SPSS
Software packages for statistical analysis - SPSSANAND BALAJI
 
Statistical Package for Social Science (SPSS)
Statistical Package for Social Science (SPSS)Statistical Package for Social Science (SPSS)
Statistical Package for Social Science (SPSS)sspink
 
Data Analysis using SPSS: Part 1
Data Analysis using SPSS: Part 1Data Analysis using SPSS: Part 1
Data Analysis using SPSS: Part 1Taddesse Kassahun
 
Workshop on SPSS: Basic to Intermediate Level
Workshop on SPSS: Basic to Intermediate LevelWorkshop on SPSS: Basic to Intermediate Level
Workshop on SPSS: Basic to Intermediate LevelHiram Ting
 
Introduction to STATA(2).pdf
Introduction to STATA(2).pdfIntroduction to STATA(2).pdf
Introduction to STATA(2).pdfYomif3
 
introduction to spss
introduction to spssintroduction to spss
introduction to spssOmid Minooee
 

What's hot (20)

Introduction To SPSS
Introduction To SPSSIntroduction To SPSS
Introduction To SPSS
 
Stata tutorial
Stata tutorialStata tutorial
Stata tutorial
 
STATA - Introduction
STATA - IntroductionSTATA - Introduction
STATA - Introduction
 
Basic guide to SPSS
Basic guide to SPSSBasic guide to SPSS
Basic guide to SPSS
 
Spss tutorial 1
Spss tutorial 1Spss tutorial 1
Spss tutorial 1
 
Introduction To SPSS
Introduction To SPSSIntroduction To SPSS
Introduction To SPSS
 
Stata statistics
Stata statisticsStata statistics
Stata statistics
 
SPSS How to use Spss software
SPSS How to use Spss softwareSPSS How to use Spss software
SPSS How to use Spss software
 
"A basic guide to SPSS"
"A basic guide to SPSS""A basic guide to SPSS"
"A basic guide to SPSS"
 
Data management through spss
Data management through spssData management through spss
Data management through spss
 
Spss lecture notes
Spss lecture notesSpss lecture notes
Spss lecture notes
 
SPSS an intro...
SPSS an intro...SPSS an intro...
SPSS an intro...
 
Software packages for statistical analysis - SPSS
Software packages for statistical analysis - SPSSSoftware packages for statistical analysis - SPSS
Software packages for statistical analysis - SPSS
 
Statistical Package for Social Science (SPSS)
Statistical Package for Social Science (SPSS)Statistical Package for Social Science (SPSS)
Statistical Package for Social Science (SPSS)
 
SPSS
SPSSSPSS
SPSS
 
Data Analysis using SPSS: Part 1
Data Analysis using SPSS: Part 1Data Analysis using SPSS: Part 1
Data Analysis using SPSS: Part 1
 
Workshop on SPSS: Basic to Intermediate Level
Workshop on SPSS: Basic to Intermediate LevelWorkshop on SPSS: Basic to Intermediate Level
Workshop on SPSS: Basic to Intermediate Level
 
Introduction to STATA(2).pdf
Introduction to STATA(2).pdfIntroduction to STATA(2).pdf
Introduction to STATA(2).pdf
 
introduction to spss
introduction to spssintroduction to spss
introduction to spss
 
Spss
SpssSpss
Spss
 

Similar to INTRODUCTION TO STATA.pptx

Similar to INTRODUCTION TO STATA.pptx (20)

Stata Programming Cheat Sheet
Stata Programming Cheat SheetStata Programming Cheat Sheet
Stata Programming Cheat Sheet
 
Stata cheatsheet programming
Stata cheatsheet programmingStata cheatsheet programming
Stata cheatsheet programming
 
ADVANCE ITT BY PRASAD
ADVANCE ITT BY PRASADADVANCE ITT BY PRASAD
ADVANCE ITT BY PRASAD
 
R workshop
R workshopR workshop
R workshop
 
Matlab ppt
Matlab pptMatlab ppt
Matlab ppt
 
Stata Cheat Sheets (all)
Stata Cheat Sheets (all)Stata Cheat Sheets (all)
Stata Cheat Sheets (all)
 
Python
PythonPython
Python
 
Matlab Manual
Matlab ManualMatlab Manual
Matlab Manual
 
Cheat Sheet for Stata v15.00 PDF Complete
Cheat Sheet for Stata v15.00 PDF CompleteCheat Sheet for Stata v15.00 PDF Complete
Cheat Sheet for Stata v15.00 PDF Complete
 
Programming in R
Programming in RProgramming in R
Programming in R
 
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
Get up to Speed (Quick Guide to data.table in R and Pentaho PDI)
 
R Cheat Sheet – Data Management
R Cheat Sheet – Data ManagementR Cheat Sheet – Data Management
R Cheat Sheet – Data Management
 
e_lumley.pdf
e_lumley.pdfe_lumley.pdf
e_lumley.pdf
 
STATA_Training_for_data_science_juniors.pdf
STATA_Training_for_data_science_juniors.pdfSTATA_Training_for_data_science_juniors.pdf
STATA_Training_for_data_science_juniors.pdf
 
Stata cheat sheet: data processing
Stata cheat sheet: data processingStata cheat sheet: data processing
Stata cheat sheet: data processing
 
SAS Commands
SAS CommandsSAS Commands
SAS Commands
 
Introduction to R
Introduction to RIntroduction to R
Introduction to R
 
SAS - overview of SAS
SAS - overview of SASSAS - overview of SAS
SAS - overview of SAS
 
Introduction to r
Introduction to rIntroduction to r
Introduction to r
 
Bt0065
Bt0065Bt0065
Bt0065
 

Recently uploaded

How Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of ReportingHow Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of ReportingAggregage
 
Vip B Aizawl Call Girls #9907093804 Contact Number Escorts Service Aizawl
Vip B Aizawl Call Girls #9907093804 Contact Number Escorts Service AizawlVip B Aizawl Call Girls #9907093804 Contact Number Escorts Service Aizawl
Vip B Aizawl Call Girls #9907093804 Contact Number Escorts Service Aizawlmakika9823
 
Bladex Earnings Call Presentation 1Q2024
Bladex Earnings Call Presentation 1Q2024Bladex Earnings Call Presentation 1Q2024
Bladex Earnings Call Presentation 1Q2024Bladex
 
Call Girls In Yusuf Sarai Women Seeking Men 9654467111
Call Girls In Yusuf Sarai Women Seeking Men 9654467111Call Girls In Yusuf Sarai Women Seeking Men 9654467111
Call Girls In Yusuf Sarai Women Seeking Men 9654467111Sapana Sha
 
Andheri Call Girls In 9825968104 Mumbai Hot Models
Andheri Call Girls In 9825968104 Mumbai Hot ModelsAndheri Call Girls In 9825968104 Mumbai Hot Models
Andheri Call Girls In 9825968104 Mumbai Hot Modelshematsharma006
 
Classical Theory of Macroeconomics by Adam Smith
Classical Theory of Macroeconomics by Adam SmithClassical Theory of Macroeconomics by Adam Smith
Classical Theory of Macroeconomics by Adam SmithAdamYassin2
 
House of Commons ; CDC schemes overview document
House of Commons ; CDC schemes overview documentHouse of Commons ; CDC schemes overview document
House of Commons ; CDC schemes overview documentHenry Tapper
 
Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...
Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...
Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...makika9823
 
VIP Call Girls Service Begumpet Hyderabad Call +91-8250192130
VIP Call Girls Service Begumpet Hyderabad Call +91-8250192130VIP Call Girls Service Begumpet Hyderabad Call +91-8250192130
VIP Call Girls Service Begumpet Hyderabad Call +91-8250192130Suhani Kapoor
 
High Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
High Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsHigh Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
High Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance Company
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance CompanyInterimreport1 January–31 March2024 Elo Mutual Pension Insurance Company
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance CompanyTyöeläkeyhtiö Elo
 
call girls in Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in  Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in  Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Chapter 2.ppt of macroeconomics by mankiw 9th edition
Chapter 2.ppt of macroeconomics by mankiw 9th editionChapter 2.ppt of macroeconomics by mankiw 9th edition
Chapter 2.ppt of macroeconomics by mankiw 9th editionMuhammadHusnain82237
 
(办理学位证)加拿大萨省大学毕业证成绩单原版一比一
(办理学位证)加拿大萨省大学毕业证成绩单原版一比一(办理学位证)加拿大萨省大学毕业证成绩单原版一比一
(办理学位证)加拿大萨省大学毕业证成绩单原版一比一S SDS
 
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...First NO1 World Amil baba in Faisalabad
 
Financial institutions facilitate financing, economic transactions, issue fun...
Financial institutions facilitate financing, economic transactions, issue fun...Financial institutions facilitate financing, economic transactions, issue fun...
Financial institutions facilitate financing, economic transactions, issue fun...Avanish Goel
 
Stock Market Brief Deck for 4/24/24 .pdf
Stock Market Brief Deck for 4/24/24 .pdfStock Market Brief Deck for 4/24/24 .pdf
Stock Market Brief Deck for 4/24/24 .pdfMichael Silva
 
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service NashikHigh Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 

Recently uploaded (20)

How Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of ReportingHow Automation is Driving Efficiency Through the Last Mile of Reporting
How Automation is Driving Efficiency Through the Last Mile of Reporting
 
🔝9953056974 🔝Call Girls In Dwarka Escort Service Delhi NCR
🔝9953056974 🔝Call Girls In Dwarka Escort Service Delhi NCR🔝9953056974 🔝Call Girls In Dwarka Escort Service Delhi NCR
🔝9953056974 🔝Call Girls In Dwarka Escort Service Delhi NCR
 
Vip B Aizawl Call Girls #9907093804 Contact Number Escorts Service Aizawl
Vip B Aizawl Call Girls #9907093804 Contact Number Escorts Service AizawlVip B Aizawl Call Girls #9907093804 Contact Number Escorts Service Aizawl
Vip B Aizawl Call Girls #9907093804 Contact Number Escorts Service Aizawl
 
Bladex Earnings Call Presentation 1Q2024
Bladex Earnings Call Presentation 1Q2024Bladex Earnings Call Presentation 1Q2024
Bladex Earnings Call Presentation 1Q2024
 
Call Girls In Yusuf Sarai Women Seeking Men 9654467111
Call Girls In Yusuf Sarai Women Seeking Men 9654467111Call Girls In Yusuf Sarai Women Seeking Men 9654467111
Call Girls In Yusuf Sarai Women Seeking Men 9654467111
 
Andheri Call Girls In 9825968104 Mumbai Hot Models
Andheri Call Girls In 9825968104 Mumbai Hot ModelsAndheri Call Girls In 9825968104 Mumbai Hot Models
Andheri Call Girls In 9825968104 Mumbai Hot Models
 
Classical Theory of Macroeconomics by Adam Smith
Classical Theory of Macroeconomics by Adam SmithClassical Theory of Macroeconomics by Adam Smith
Classical Theory of Macroeconomics by Adam Smith
 
House of Commons ; CDC schemes overview document
House of Commons ; CDC schemes overview documentHouse of Commons ; CDC schemes overview document
House of Commons ; CDC schemes overview document
 
Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...
Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...
Independent Lucknow Call Girls 8923113531WhatsApp Lucknow Call Girls make you...
 
VIP Call Girls Service Begumpet Hyderabad Call +91-8250192130
VIP Call Girls Service Begumpet Hyderabad Call +91-8250192130VIP Call Girls Service Begumpet Hyderabad Call +91-8250192130
VIP Call Girls Service Begumpet Hyderabad Call +91-8250192130
 
High Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
High Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsHigh Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
High Class Call Girls Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
 
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance Company
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance CompanyInterimreport1 January–31 March2024 Elo Mutual Pension Insurance Company
Interimreport1 January–31 March2024 Elo Mutual Pension Insurance Company
 
call girls in Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in  Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in  Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Nand Nagri (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Chapter 2.ppt of macroeconomics by mankiw 9th edition
Chapter 2.ppt of macroeconomics by mankiw 9th editionChapter 2.ppt of macroeconomics by mankiw 9th edition
Chapter 2.ppt of macroeconomics by mankiw 9th edition
 
(办理学位证)加拿大萨省大学毕业证成绩单原版一比一
(办理学位证)加拿大萨省大学毕业证成绩单原版一比一(办理学位证)加拿大萨省大学毕业证成绩单原版一比一
(办理学位证)加拿大萨省大学毕业证成绩单原版一比一
 
🔝+919953056974 🔝young Delhi Escort service Pusa Road
🔝+919953056974 🔝young Delhi Escort service Pusa Road🔝+919953056974 🔝young Delhi Escort service Pusa Road
🔝+919953056974 🔝young Delhi Escort service Pusa Road
 
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
 
Financial institutions facilitate financing, economic transactions, issue fun...
Financial institutions facilitate financing, economic transactions, issue fun...Financial institutions facilitate financing, economic transactions, issue fun...
Financial institutions facilitate financing, economic transactions, issue fun...
 
Stock Market Brief Deck for 4/24/24 .pdf
Stock Market Brief Deck for 4/24/24 .pdfStock Market Brief Deck for 4/24/24 .pdf
Stock Market Brief Deck for 4/24/24 .pdf
 
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service NashikHigh Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
High Class Call Girls Nashik Maya 7001305949 Independent Escort Service Nashik
 

INTRODUCTION TO STATA.pptx

  • 1. INTRODUCTION TO STATA 17TH APRIL, 2022 DHANANJAY K UMAR
  • 2. What is Stata? Stata is a general-purpose integrated statistical software package created in 1985 by StataCorp LP. It is a powerful statistical software that enables users to analyze, manage, and produce graphical visualizations of data. It provides commands to conduct statistical tests, and econometric analysis including panel data analysis (cross-sectional time-series, longitudinal, repeated-measures), cross-sectional data, time- series, survival-time data, cohort analysis, etc Stata can be used either through dropdown menus or using commands. It is user friendly, it has an extensive library of tools and internet capabilities, which install and update new features regularly.
  • 3. Versions of STATA. There are three versions of STATA. All of the three version are available for 32-bit and 64-bit computers. The major differences among the version are of observations and variables handling capacity along with data processing speed. 1) Stata /IC (or Intercooled Stata) – It can handle up to 2,047 variables. 2) Stata/SE (Special Edition) – It can handle up to 32,766 variables (and also allows longer string variables and larger matrices). 3) Stata/MP (Multicore/Multiprocessor) – It has the same variable handling capacity as of Stata/SE. However it is substantially faster and efficient for multicore computers.
  • 4. STATA INTERFACE Version Name Result Window Review Window Command Window Variables Window Properties Window Menu Driven Commands
  • 5. Open a .dta file Save and print the file Log file .do file Data Editor Variable Manager
  • 6. Do file setup * do files - Stata do-files are text files where users can store and run their commands for reuse, rather than retyping the commands into the Command window. ◦ It is used due to Reproducibility, Easier debugging and changing commands. ◦ The file extension .do is used for do-files. ◦ doedit (doe) is used as command to open the do file. Stata 16 features an enhanced editor that features tab auto-completion for Stata commands and previously typed words *To run a command from the do-file, highlight part or all of the command, and then hit Ctrl-D (Mac: Shift+Cmd+D) or the “Execute(do)” icon, the rightmost icon on the do-file editor toolbar *Multiple commands can be selected and executed
  • 7. Syntax highlighting The do-file editor colors Stata commands “blue” Comments, which are not executed, are usually preceded by * and are colored “green” Words in quotes (file names, string values) are colored “red” Stata 16 features an enhanced editor that features tab auto-completion for Stata commands and previously typed words
  • 8. Running commands from the do-file To run a command from the do-file, highlight part or all of the command, and then hit Ctrl-D (Mac: Shift+Cmd+D) or The “Execute(do)” icon, the rightmost icon on the do-file editor toolbar Multiple commands can be selected and executed
  • 9. COMMENTS Comments are not executed, so provide a way to document the do-file. Comments are either preceded by * or surrounded by /* and */ Comments will appear in green in the do-file editor
  • 10. Stata will normally assume that a newline signifies the end of a command You can extend commands over multiple lines by placing /// at the end of each line except for the last Make sure to put a space before /// When executing, highlight each line in the command(s) long lines in do-files
  • 11. Rules to define a variable - A) English alphabet - upper or lower case (variable names, as commands are case sensitive), B) Numbers – Any number starting from 0 to 9 can be used. Although the first character cannot be a number. C) Symbol – the underscore (_) symbol D)The name can have up to 32 characters. Example – age, AGE, age_1, age3, age32, AGE_1 (Correct form) 1age, 1AGE, age@, age@1, 3_age, age&3 (Incorrect form)
  • 12. use load Stata dataset save save Stata dataset clear clear dataset from memory import import Excel dataset excel Importing data Using drop down menu: file- >import->Excel spreadsheet
  • 13. Viewing data browse open spreadsheet of data list print data to Stata console Once the data are loaded, we can view the dataset as a spreadsheet using the command browse The magnifying glass with spreadsheet icon also browses the dataset Black columns are numeric, red columns are strings, and blue columns are numeric with string labels
  • 14. Operators and Expressions These are key arithmetic, logical and relational operators you need to keep in mind: Arithmetic Logical Relational + add ! not (also ~) == equal - Subtract | or != not equal (also ~=) * multiply & and < less than / divide <= less than or equal ^ raise to power > greater than + string concatenation >= greater than or equal Use display command to use stata as calculator
  • 15. Selecting observations in select by observation number Many commands are run on a subset of the data set observations in selects by observation (row) number Syntax in firstobs/lastobs 30/100 – observations 30 through 100 Negative numbers count from the end “L” means last observation -10/L – tenth observation from the last through last observation if select by condition if selects observations that meet a certain condition gender == 1 (male) math > 50 if clause usually placed after the command specification, but before the comma that precedes the list of options The basic structure of using IF is : command if exp,
  • 16. Exploring data codebook inspect variable values - Summarize summarize distribution describe describe the variables tabulate tabulate frequencies (tab, tab1, tab2), row, column tabstat tabulation of statistics
  • 17. Data Management generate create variable egen extended variable generation replace replace values of variable rename rename variable recode recode variable values label variable give variable description label define generate value label set label value apply value labels to variable keep keep variables, drop others drop drop variables, keep others keep if keep observations, drop others drop if drop observations, keep others sort sort by variables, ascending gsort ascending and descending sort
  • 18. Generate (gen) command
    The gen command creates a new variable using an expression that may combine constants, variables, functions, and arithmetic and logical operators.
      gen id=_n /* id number of observation */
      gen total=_N /* total number of observations */
      gen ten=10 /* constant value of 10 */
      gen tensq = ten^2 /* square of ten */
      gen lnten = log(ten) /* natural log of ten */
    The egen command creates new variables based on summary measures, such as sum, mean, min, and max.
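A small sketch of egen summary functions, assuming the auto example dataset shipped with Stata. All new variable names are hypothetical.

```stata
* Example dataset shipped with Stata (assumption for illustration)
sysuse auto, clear
* Overall mean and maximum, repeated on every observation
egen mean_price = mean(price)
egen max_mpg = max(mpg)
* Group-specific mean: one value per category of foreign
bysort foreign: egen mean_price_grp = mean(price)
* Row-wise total across variables
egen total_wl = rowtotal(weight length)
* egen results can feed into ordinary gen expressions
generate price_dev = price - mean_price
```

Unlike gen, which evaluates an expression observation by observation, the egen functions here summarize across observations (or across variables, for rowtotal).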
  • 19. replace
    The typical syntax to replace values of an existing variable is:
      replace oldvar = exp [if] [in]
    The expression exp works as in the generate command and may refer to oldvar itself. Here are two examples:
      replace oldvar = oldvar * 5
      replace oldvar = oldvar * -1 if oldvar < 0
    recode
    This command is useful for dealing with missing values or special codes in existing variables and for changing the values of categorical variables.
      recode varlist (rule) [(rule) ...] [, generate(newvar)]
    Rule              Example        Meaning
    # = #             3 = 1          3 recoded to 1
    # # = #           2 . = 9        2 and . recoded to 9
    #/# = #           1/5 = 4        1 through 5 recoded to 4
    nonmissing = #    nonmiss = 8    all other nonmissing to 8
    missing = #       miss = 9       all other missings to 9
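The replace and recode syntax can be tried out as sketched here, assuming the auto example dataset shipped with Stata. The variables diff and rep3 and the chosen cut-offs are hypothetical.

```stata
* Example dataset shipped with Stata (assumption for illustration)
sysuse auto, clear
* A variable with both negative and positive values
generate diff = mpg - 25
* Flip the sign of negative values only
replace diff = diff * -1 if diff < 0
* Collapse a five-point scale into three groups and send missings to 9;
* generate() keeps the original variable untouched
recode rep78 (1/2 = 1) (3 = 2) (4/5 = 3) (missing = 9), generate(rep3)
* Cross-check the old and new codes, including missing values
tabulate rep78 rep3, missing
```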
  • 20. Labelling Variables and Values
    Raw datasets, especially large ones, often contain variable names that are not intuitive. For example, don't be surprised to find a variable named a001s01 or d23s02r34. For this reason, it is important to label variables so that we understand exactly what they mean, while keeping the labels short enough that they do not become verbose. To attach a label to a variable, use the label variable command in the following way:
      label variable name "name of the head of the household"
    Unlike the values of continuous variables, the numeric codes of categorical variables stand for categories, so they need value labels. This is done by first defining a label for the values and then applying that value label to a variable, as shown below:
      label define sexlbl 0 "Female" 1 "Male"
      label values sexhead sexlbl
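A runnable sketch of both kinds of label, assuming the auto example dataset shipped with Stata. The label name replbl and the wording of its categories are invented for illustration and are not the dataset's official coding.

```stata
* Example dataset shipped with Stata (assumption for illustration)
sysuse auto, clear
* Variable label: describes what the variable measures
label variable mpg "Mileage (miles per gallon)"
* Value label: define the mapping once, then attach it to a variable
label define replbl 1 "Poor" 2 "Fair" 3 "Average" 4 "Good" 5 "Excellent"
label values rep78 replbl
* Tables now show the text labels instead of the numeric codes
tabulate rep78
* Inspect the stored mapping
label list replbl
```

Note that Stata expects straight double quotes around label text; curly quotes pasted from slides or word processors cause a syntax error.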
  • 21. Creating a dummy variable using the gen command
    'By' group processing
    To execute a Stata command separately for groups of observations for which the values of the variables in varlist are the same, type:
      by varlist: command
    Most commands allow the by prefix, but the data must be sorted by varlist (precede the command with sort varlist, or use bysort):
      bysort varlist: command
    Examples:
      bysort id: summarize varname
      bysort id: tabulate varname
      bysort id: ta varname if varname>=18
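One possible way to create dummy variables and then process by group is sketched below. It assumes the auto example dataset shipped with Stata; the variable name expensive, the stub rep_, and the 6000 cut-off are hypothetical.

```stata
* Example dataset shipped with Stata (assumption for illustration)
sysuse auto, clear
* Dummy: 1 if price is above 6000, 0 otherwise; missing prices stay missing
generate byte expensive = (price > 6000) if !missing(price)
* One dummy per category of rep78 (rep_1, rep_2, ...)
tabulate rep78, generate(rep_)
* Sort and run a command within each group in one step
bysort foreign: summarize price
* Equivalent two-step form: sort first, then use by
sort foreign
by foreign: tabulate expensive
* The by prefix combines with an if condition
bysort foreign: tabulate expensive if mpg >= 18
```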
  • 22. Combining Datasets
    append: add more observations
    merge: add more variables, joining on a matching variable
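A self-contained sketch of append and merge using tiny made-up datasets. The variables id, score, and age, their values, and the temporary file names are all hypothetical; the merge 1:1 syntax assumes Stata 11 or later, and the snippet is meant to be run as a single do-file.

```stata
clear
input id score
1 50
2 60
end
tempfile first
save `first'

clear
input id score
3 70
4 80
end
* append stacks the observations of the two files (now ids 1 to 4)
append using `first'
tempfile scores
save `scores'

clear
input id age
1 20
2 21
3 22
4 23
end
* merge adds the score variable by matching on id
merge 1:1 id using `scores'
* _merge records whether each observation matched
tabulate _merge
```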
  • 23. Some other statistical commands
    summarize: descriptive statistics
    correlate: correlation matrices
    ttest: one-sample, two-sample, and paired t-tests
    anova: one-, two-, and n-way analysis of variance
    regress: least squares regression
    predict: generate fitted values, residuals, etc.
    test: test linear hypotheses on parameters
    logit, logistic: logit model, logistic regression
    probit: binomial probit model
    tobit: one- and two-limit Tobit model
    cnreg: censored normal regression (generalized Tobit)
    reg3: three-stage least squares
    lincom: linear combinations of parameters
    cnsreg: regression with linear constraints
    testnl: test nonlinear hypotheses on parameters
    margins: marginal effects (elasticities, etc.)
    ivregress: instrumental variables regression
    prais: regression with AR(1) errors
    sureg: seemingly unrelated regressions
    qreg: quantile regression
    ologit, oprobit: ordered logit and probit models
    mlogit: multinomial logit model
    poisson: Poisson regression
    heckman: selection model
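A short illustrative workflow with a few of these commands, assuming the auto example dataset shipped with Stata. The model specifications are arbitrary and the names price_hat and price_res are hypothetical.

```stata
* Example dataset shipped with Stata (assumption for illustration)
sysuse auto, clear
* Descriptive statistics and correlations
summarize price mpg weight
correlate price mpg weight
* Two-sample t-test of mpg across the two groups of foreign
ttest mpg, by(foreign)
* Least squares regression
regress price mpg weight foreign
* Fitted values and residuals from the last regression
predict price_hat, xb
predict price_res, residuals
* Joint test that both coefficients are zero
test mpg weight
* Linear combination of coefficients
lincom mpg + weight
* Logit model followed by average marginal effects
logit foreign mpg weight
margins, dydx(*)
```

Post-estimation commands such as predict, test, lincom, and margins always refer to the most recently fitted model.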
  • 24. Importing a .txt File (Fixed-Width Data)
    Important datasets are often distributed as text files with one fixed-width record for each household, individual, or firm. In Stata, one can import such .txt data using the following command:
      infix specifications using <filename>
    In this example, the text file provided by the NSSO, i.e. the PLFS data, is imported with the following command:
      infix id 1-3 FSU 4-8 Round 9-10 Schedule 11-13 Sample 14 Sector 15 State 16-18 Dist 19-20 Stratum 21-22 Sub 23-24 Sub_round 25 Sub_Sample 26 FOD 27-30 HG 31 Second_Stage_Str 32 Sample_HH_No 33-34 level 35-36 filler 37-41 Informant_sl_no 42-43 response_code 44 survey_code 45 subst_code 46 using "c:PLFSDataABCD.TXT"