SlideShare a Scribd company logo
1 of 3
Introduction to Computing
Stata by and egen commands
by and bysort
Many Stata commands can be executed on a group-by-group basis. Earlier we looked at how the
Stata by command can be used as a prefix for statistical commands (see help by). For example the
following Stata code will execute the summarize command for each unique value of marital (married,
widowed, etc.):
by marital: summarize sociability
The by construct can also be used to create new variables. The by prefix steps through each
unique value of a variable, and treats each set of observations as a separate dataset. The data must be
sorted by values of the by-group variable before using by. Suppose we wanted to calculate the minimum
value of age for each marital status. We could:
sort marital age
by marital: gen minage=age[1]
The first line orders all the cases by marital status and within marital status by age. The second
line looks only within each marital group and assigns the value of age to the first observation which
because of sorting is the minimum value, to the new variable minage. The _n variable which records the
current observation number resets within each by-group, i.e. each group is treated like its own little
dataset.
You can combine these two steps with the bysort command (abbreviated bys):
bysort marital (age): gen minage=age[1]
where the variables after bysort but before the parentheses are the variables you want to
perform the command by, and the variables inside the parentheses specify the sort order of each little
dataset you are performing the command on.
egen
One of Stata’s most powerful and useful commands is egen. Like generate, it is used to
create new variables, but it is much more than that. Using egen difficult and tedious variables can be
created easily. Some examples are variables whose values are the mean of another variable for each
group such as sociability for males and females. You can also use egen to create other variables that
count the number of observations that fit a certain criteria, or even simply number observations. The only
way to truly see how powerful egen can be is to show a few examples and then have you explore the
other available functions on your own. See help egen for a complete listing of all of these commands.
egen age_cat1 = cut(age), ///
at(18, 25, 35, 45, 55, 65, 75, 90)
egen age_cat2 = cut(age), group(6)
The cut function is useful for
collapsing variables. You specify the
lowest value for each new group with
the at() option. Any observations
with a value less than 18 will be
given a missing value for age_cat,
and all observations with a value
greater than 90 will be placed in the
“90” age_cat group. Or simply
specify the number of groups you
want with the group() option.
egen sexfreq2 = mean(sexfreq1), by(marital)
egen sexfreq3 = mean(sexfreq1), by(sex marital)
This example creates a variable that
is the mean of sexfreq1 for each
marital group. In addition to the
mean, other statistics can be
calculated (e.g. min, max, sd).
egen numobs1 = count(sexfreq1), by(year)
egen numobs2 = count(sexfreq1), ///
by(year marital)
The count function counts the
number of non-missing observations
of sexfreq1 within each year creating
a constant for each case. The second
example counts the non-missing
values of sexfreq1 within year and
within marital status.
egen sexmar=group(sex marital), label
egen marsex=group(marital sex), label
The group function numbers the
groups formed by crossing sex and
marital. The groups are numbered
consecutively which makes this a
good variable to use in analysis. The
label option causes Stata to use the
value labels (if any) of sex and
marital.
egen sexmarcon=concat(sex marital), punct(/)
The concat function is useful when
you have two or more variables that
you want to combine to form one
variable but adding or multiplying
them would not make sense. The
p() option allows you to put a
separator character between the
values.
1. Use the rowmiss() function to create a new variable called socmiss for the sociability component
measures (socrel, socommun, socfrend, and socbar). Use tabulate to show the distribution of
missing values for these variables
2. Use the group() function to create a new categoricalvariable for every combination of race (by)
sex. Restrict race to white and black respondents (if can be used with egen). Use graph bar
and plot the 25th
and 75th
percentiles of sociability for these categories (look at the over()
option).
3. Create a new variable avgsoc that is the average sociability within year and sex.

More Related Content

Similar to Stata compute

Oracle_Analytical_function.pdf
Oracle_Analytical_function.pdfOracle_Analytical_function.pdf
Oracle_Analytical_function.pdfKalyankumarVenkat1
 
5 structured programming
5 structured programming 5 structured programming
5 structured programming hccit
 
SPSS PRESENTATION.PPT.pptx
SPSS PRESENTATION.PPT.pptxSPSS PRESENTATION.PPT.pptx
SPSS PRESENTATION.PPT.pptxBMmugal
 
Excel tips advance
Excel tips advanceExcel tips advance
Excel tips advanceAshish Patel
 
Excel tips advance
Excel tips advanceExcel tips advance
Excel tips advanceAshish Patel
 
Excel tips advance
Excel tips advanceExcel tips advance
Excel tips advanceAshish Patel
 
Excel tips advance
Excel tips advanceExcel tips advance
Excel tips advanceAshish Patel
 
Advance Excel tips
Advance Excel tips Advance Excel tips
Advance Excel tips Ashish Patel
 
Java R20 - UNIT-3.docx
Java R20 - UNIT-3.docxJava R20 - UNIT-3.docx
Java R20 - UNIT-3.docxPamarthi Kumar
 
111249-140817070204-phpapp02.pdf
111249-140817070204-phpapp02.pdf111249-140817070204-phpapp02.pdf
111249-140817070204-phpapp02.pdfBMmugal
 
Two dimensional array
Two dimensional arrayTwo dimensional array
Two dimensional arrayRajendran
 
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docxExcel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docxSANSKAR20
 
Excel Datamining Addin Advanced
Excel Datamining Addin AdvancedExcel Datamining Addin Advanced
Excel Datamining Addin Advancedexcel content
 
SPSS How to use Spss software
SPSS How to use Spss softwareSPSS How to use Spss software
SPSS How to use Spss softwareDebashis Baidya
 
Reducing False Positives - BSA AML Transaction Monitoring Re-Tuning Approach
Reducing False Positives - BSA AML Transaction Monitoring Re-Tuning ApproachReducing False Positives - BSA AML Transaction Monitoring Re-Tuning Approach
Reducing False Positives - BSA AML Transaction Monitoring Re-Tuning ApproachErik De Monte
 
T - test independent samples
T - test  independent samplesT - test  independent samples
T - test independent samplesAmit Sharma
 
T test independent samples
T test  independent samplesT test  independent samples
T test independent samplesAmit Sharma
 

Similar to Stata compute (20)

Oracle_Analytical_function.pdf
Oracle_Analytical_function.pdfOracle_Analytical_function.pdf
Oracle_Analytical_function.pdf
 
5 structured programming
5 structured programming 5 structured programming
5 structured programming
 
INTRODUCTION TO STATA.pptx
INTRODUCTION TO STATA.pptxINTRODUCTION TO STATA.pptx
INTRODUCTION TO STATA.pptx
 
SPSS PRESENTATION.PPT.pptx
SPSS PRESENTATION.PPT.pptxSPSS PRESENTATION.PPT.pptx
SPSS PRESENTATION.PPT.pptx
 
Excel tips advance
Excel tips advanceExcel tips advance
Excel tips advance
 
Excel tips advance
Excel tips advanceExcel tips advance
Excel tips advance
 
Excel tips advance
Excel tips advanceExcel tips advance
Excel tips advance
 
Excel tips advance
Excel tips advanceExcel tips advance
Excel tips advance
 
Advance Excel tips
Advance Excel tips Advance Excel tips
Advance Excel tips
 
Chapter 18,19
Chapter 18,19Chapter 18,19
Chapter 18,19
 
Java R20 - UNIT-3.docx
Java R20 - UNIT-3.docxJava R20 - UNIT-3.docx
Java R20 - UNIT-3.docx
 
111249-140817070204-phpapp02.pdf
111249-140817070204-phpapp02.pdf111249-140817070204-phpapp02.pdf
111249-140817070204-phpapp02.pdf
 
Two dimensional array
Two dimensional arrayTwo dimensional array
Two dimensional array
 
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docxExcel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
 
Excel Datamining Addin Advanced
Excel Datamining Addin AdvancedExcel Datamining Addin Advanced
Excel Datamining Addin Advanced
 
Excel Datamining Addin Advanced
Excel Datamining Addin AdvancedExcel Datamining Addin Advanced
Excel Datamining Addin Advanced
 
SPSS How to use Spss software
SPSS How to use Spss softwareSPSS How to use Spss software
SPSS How to use Spss software
 
Reducing False Positives - BSA AML Transaction Monitoring Re-Tuning Approach
Reducing False Positives - BSA AML Transaction Monitoring Re-Tuning ApproachReducing False Positives - BSA AML Transaction Monitoring Re-Tuning Approach
Reducing False Positives - BSA AML Transaction Monitoring Re-Tuning Approach
 
T - test independent samples
T - test  independent samplesT - test  independent samples
T - test independent samples
 
T test independent samples
T test  independent samplesT test  independent samples
T test independent samples
 

Recently uploaded

B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 

Recently uploaded (20)

Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 

Stata compute

  • 1. Introduction to Computing Stata by and egen commands by and bysort Many Stata commands can be executed on a group-by-group basis. Earlier we looked at how the Stata by command can be used as a prefix for statistical commands (see help by). For example the following Stata code will execute the summarize command for each unique value of marital (married, widowed, etc.): by marital: summarize sociability The by construct can also be used to create new variables. The by prefix steps through each unique value of a variable, and treats each set of observations as a separate dataset. The data must be sorted by values of the by-group variable before using by. Suppose we wanted to calculate the minimum value of age for each marital status. We could: sort marital age by marital: gen minage=age[1] The first line orders all the cases by marital status and within marital status by age. The second line looks only within each marital group and assigns the value of age to the first observation which because of sorting is the minimum value, to the new variable minage. The _n variable which records the current observation number resets within each by-group, i.e. each group is treated like its own little dataset. You can combine these two steps with the bysort command (abbreviated bys): bysort marital (age): gen minage=age[1] where the variables after bysort but before the parentheses are the variables you want to perform the command by, and the variables inside the parentheses specify the sort order of each little dataset you are performing the command on. egen One of Stata’s most powerful and useful commands is egen. Like generate, it is used to create new variables, but it is much more than that. Using egen difficult and tedious variables can be created easily. Some examples are variables whose values are the mean of another variable for each group such as sociability for males and females. You can also use egen to create other variables that count the number of observations that fit a certain criteria, or even simply number observations. The only way to truly see how powerful egen can be is to show a few examples and then have you explore the other available functions on your own. See help egen for a complete listing of all of these commands.
  • 2. egen age_cat1 = cut(age), /// at(18, 25, 35, 45, 55, 65, 75, 90) egen age_cat2 = cut(age), group(6) The cut function is useful for collapsing variables. You specify the lowest value for each new group with the at() option. Any observations with a value less than 18 will be given a missing value for age_cat, and all observations with a value greater than 90 will be placed in the “90” age_cat group. Or simply specify the number of groups you want with the group() option. egen sexfreq2 = mean(sexfreq1), by(marital) egen sexfreq3 = mean(sexfreq1), by(sex marital) This example creates a variable that is the mean of sexfreq1 for each marital group. In addition to the mean, other statistics can be calculated (e.g. min, max, sd). egen numobs1 = count(sexfreq1), by(year) egen numobs2 = count(sexfreq1), /// by(year marital) The count function counts the number of non-missing observations of sexfreq1 within each year creating a constant for each case. The second example counts the non-missing values of sexfreq1 within year and within marital status. egen sexmar=group(sex marital), label egen marsex=group(marital sex), label The group function numbers the groups formed by crossing sex and marital. The groups are numbered consecutively which makes this a good variable to use in analysis. The label option causes Stata to use the value labels (if any) of sex and marital. egen sexmarcon=concat(sex marital), punct(/) The concat function is useful when you have two or more variables that you want to combine to form one variable but adding or multiplying them would not make sense. The p() option allows you to put a separator character between the values. 1. Use the rowmiss() function to create a new variable called socmiss for the sociability component measures (socrel, socommun, socfrend, and socbar). Use tabulate to show the distribution of missing values for these variables 2. Use the group() function to create a new categoricalvariable for every combination of race (by) sex. Restrict race to white and black respondents (if can be used with egen). Use graph bar and plot the 25th and 75th percentiles of sociability for these categories (look at the over() option).
  • 3. 3. Create a new variable avgsoc that is the average sociability within year and sex.