Advanced econometrics and Stata
L1-2
Dr. Chunxia Jiang
Business School, University of Aberdeen, UK
Beijing, 17-26 Nov 2019
 Topics and schedule
Sessions plan
17 October, Evening: L1-2 Introduction to Econometrics and Stata
18 October, Evening: L3-4 Data, simple regression
19 October, Morning: L5-6 Hypothesis testing, multiple regression, violation of assumptions
19 October, Afternoon: Exercises and practice
20 October, Morning: L7-8 Time series models
20 October, Evening: L9-10 Panel data models and endogeneity
22 October, Morning: Exercises and practice
22 October, Afternoon: L11-12 Frontier 1: SFA
24 October, Evening: L13-14 Frontier 2: DEA
25 October, Evening: L15-16 DID
26 October, Morning: Revision
26 October, Afternoon: Exam
What is Econometrics?
 Economic measurement
 Application of mathematical statistics to economic data to lend
empirical support to the models constructed by mathematical
economics and obtain numerical results.
 Quantitative analysis of actual economic phenomena based on
the concurrent development of theory and observation, related
by appropriate methods of inference.
 The empirical determination of economic laws
 A conjunction of economic theory and actual measurements,
using the theory and techniques of statistical inference as a bridge
pier.
Econometrics
 Try to explain the behaviour of a variable accounting for
factors that might affect the behaviour.
 Example: analyse the determinants of innovative activities
 Objective: What factors affect innovations?
 How do we measure innovations?
 Investment in R&D, number of patents
 What factors do we include as determinants of innovations?
 Skilled workers
 Proximity to universities/research centres
 Access to finance
 Past experience
 Institutional framework
Methodology of econometrics
 Statement of theory or hypothesis
 Model definition
 Data
 Estimation
 Hypothesis testing
 Forecasting
 Policy simulation
Methodology of econometrics:
Statement of theory or hypothesis
 Keynes: marginal propensity to consume (MPC) is greater
than zero but less than 1
 MPC is the rate of change of consumption for a unit change in
income
Methodology of econometrics:
Model definition
 A mathematical model
 Keynes postulated a positive relationship between consumption and
income, but did not specify the precise functional form of the
relationship between the two
Y = β1 + β2X, 0 < β2 < 1 Eq. (1)
Y = consumption, X = income; β1 and β2 are parameters, the intercept and slope
coefficients; β2 measures the MPC
 Eq. (1) is an exact or deterministic relationship, but relationships
between economic variables are generally inexact.
 Eq. (1) is modified to an econometric model
Y=β1+β2X+u Eq. (2)
u is disturbance (error term) that is a random (stochastic) variable
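The difference between Eq. (1) and Eq. (2) can be seen by simulation. The do-file below is a minimal sketch: the parameter values (200 and 0.7), the income grid and the standard deviation of u are all hypothetical, chosen only for illustration.

```stata
* Hypothetical parameter values, for illustration only
clear
set seed 12345
set obs 100
generate x = 1000 + 50*_n          // income: a deterministic grid
generate y_exact = 200 + 0.7*x     // Eq. (1): exact relationship
generate u = rnormal(0, 100)       // disturbance: mean 0, sd 100
generate y = y_exact + u           // Eq. (2): econometric model
regress y x                        // estimate β1 and β2 from the noisy data
```

The estimated slope should be close to, but not exactly, 0.7, because of the disturbance u.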
Methodology of econometrics:
Data
 Y=β1+β2X+u Eq. (2)
 To estimate Eq. (2) to obtain the numerical values of
β1, β2, we need data.
 Data for the US economy: personal consumption
expenditure and GDP, 1960-2005.
Methodology of econometrics:
Estimation
 Using regression analysis, we obtain
Ŷt = -299.5913 + 0.7218 Xt
 The estimate of β2 (MPC) is 0.72,
suggesting that for the
sample period 1960-2005
an increase in real income
of one dollar led, on
average, to an increase of
about 72 cents in real
consumption expenditure
Methodology of econometrics
 Hypothesis testing
 Assuming the estimated model fits well, we check whether the estimates
are in accord with the expectations of the theory, using statistical
inference (hypothesis testing)
 Forecasting
 The chosen model can be used to predict the future value of Y
on the basis of X
 Policy simulation
 The model may be used for control, or policy, purposes.
 i.e., manipulate X to achieve desired Y
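Forecasting with the estimated consumption function is simple arithmetic. A minimal sketch, using the estimated coefficients -299.5913 and 0.7218 and a hypothetical income level of 10,000 (this value is an assumption, not taken from the data):

```stata
* Fitted line: Yhat = -299.5913 + 0.7218*X
scalar b1 = -299.5913
scalar b2 = 0.7218
scalar x_new = 10000               // hypothetical income level, not from the data
scalar y_hat = b1 + b2*x_new       // predicted consumption at x_new
display "Predicted consumption: " y_hat
```

With these numbers the prediction is about 6,918.41. For policy simulation the same line is read in reverse: choose a target Y and solve for the X that delivers it.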
Types of Econometrics
 Econometrics
How are data organized?
 Time series data:
 a set of observations on the values that a variable takes at
different times.
 Cross-sectional data:
 Data on one or more variables collected at the same point in
time.
 Pooled or panel data
 Elements of both time series and cross-section data
 Panel data: same cross-sectional unit is surveyed over time
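The three data structures can be looked at in Stata with example datasets. This is a sketch only: uslifeexp and auto ship with Stata, while grunfeld is downloaded by webuse and so needs Internet access.

```stata
* Time series: variables observed once per year
sysuse uslifeexp, clear
tsset year

* Cross-section: many cars observed at one point in time
sysuse auto, clear
describe, short

* Panel: the same firms followed over several years
webuse grunfeld, clear
xtset company year
```

tsset declares the time variable of a time series; xtset declares both the cross-sectional unit and the time variable of a panel.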
Examples of economic data
 Employment: total number of workers or total
number of hours worked
 Output: total output, industry output
 Skills: number of people with a certain degree
 Capital stocks: physical capital, ICT capital, human
capital
 Import/export/FDI
Examples of financial data
 What sorts of financial variables do we usually want
to explain?
 Prices - stock prices, stock indices, exchange rates
 Returns - stock returns, index returns, interest rates
 Volatility
 Trading volumes
 Corporate finance variables
 Debt issuance, use of hedging instruments
Where do we find the data
 Data sources: primary survey data and secondary published data,
grouped data and non-grouped data
 Macro data: national statistics, World Bank, IMF
 Banking data: national central bank, BankScope
 Firm data: Datastream
 Survey to collect first-hand data
 Accuracy of data: selection biases and data quality
 Possible observational errors
 In questionnaire-type survey—non-response
 Sampling
 Aggregate
 Researchers should always keep in mind that the results of
research are only as good as the quality of the data
Elements of economic models:
The basic concepts
 Each model consists of a set of equations. The number of
equations can vary.
 Each equation has a set of variables.
 Each equation has a set of parameters (or coefficients).
 e.g. A very simple macroeconomic model:
Y = C + I + G (1)
C = c0 + c1(Y - T) (2)
T = t0 + t1Y (3)
I = i0 + i1Y + i2R (4)
G = g0 + g1Y (5)
Elements of economic models:
Assumptions
 The economy is a closed one. It does not involve any
import or export activities.
 All the equations are linear. Each dependent variable
is exactly explained by the RHS variables.
 The model is 'static' in that all parameters are
assumed to be 'fixed'.
 All the prices and wages are fixed.
 There exists a margin of unused resources.
Types of model:
Single equation model
 This is commonly used for some particular analyses:
 e.g. --- A demand equation for a specific commodity
 Qi^d = f(Income, Pi, Pk)
 Dependent variable: In any single equation, the dependent
variable is always placed on the left hand side (LHS) of the
equation. It is explained by, or dependent on, the expression on the
right hand side. In this case the dependent variable is Qi^d.
 Independent variables: In any single equation, all the
independent variables are placed on the right hand side (RHS) of
the equation. They explain or determine the dependent variable.
They are independent because they do not depend on any other
variables in that equation.
Types of model:
Multi-equation model
Y = C + I + G (1)
C = c0 + c1(Y - T) (2)
T = t0 + t1Y (3)
I = i0 + i1Y + i2R (4)
G = g0 + g1Y (5)
 Dependent variables: Y, C, T, I, G because they are all
on the LHS.
 Independent variables: R.
 Coefficients: c0, c1, t0, t1, i0, i1, i2, g0 and g1.
Simultaneous Equations Models
 A simple example
V = V0 + I (1)
I = i0 + i1R + i2V (2)
Where V is total revenue, V0 is basic revenue of a firm,
I is total investment, and R is interest rate. i0, i1 and i2
are coefficients.
 In the simultaneous equations, dependent variables
are called endogenous and independent variables
exogenous variables.
Simultaneous Equations Models
 How to obtain a reduced form from the structural
form?
 Step I: Let us copy the structural form
V = V0 + I (1)
I = i0 + i1R + i2V (2)
Simultaneous Equations Models
 Step II: Try to express V and I as the functions of exogenous
variables only
 Substitute (2) into (1)
V = V0 + i0 + i1R + i2V
V - i2V = V0 + i0 + i1R
V = i0/(1 - i2) + [1/(1 - i2)]V0 + [i1/(1 - i2)]R,
or V = π0 + π1V0 + π2R, (3)
where π0 = i0/(1 - i2), π1 = 1/(1 - i2) and π2 = i1/(1 - i2).
Simultaneous Equations Models
 Substituting (1) into (2) and rearranging, we obtain
I = i0/(1 - i2) + [i2/(1 - i2)]V0 + [i1/(1 - i2)]R,
or I = γ0 + γ1V0 + γ2R, (4)
where γ0 = i0/(1 - i2), γ1 = i2/(1 - i2) and γ2 = i1/(1 - i2).
 Structural and reduced form equations
 Equations (1) and (2) are the structural form equations in
which dependent variables (I and V) can appear on both right
and left-hand sides.
 Equations (3) and (4) are the reduced form equations in
which dependent variables appear only on the left-hand side.
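The manipulation behind equation (4) can be written out in three steps: substitute V = V0 + I from (1) into (2), collect the terms in I on the left, and divide by (1 - i2). The last step assumes i2 is not equal to 1.

```latex
\begin{align*}
I &= i_0 + i_1 R + i_2 (V_0 + I) \\
I - i_2 I &= i_0 + i_1 R + i_2 V_0 \\
I &= \frac{i_0}{1 - i_2} + \frac{i_2}{1 - i_2} V_0 + \frac{i_1}{1 - i_2} R
\end{align*}
```

Only the exogenous variables V0 and R remain on the right-hand side, which is what makes this a reduced form.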
Questions sheet for lecture 1
1. Examine the following multi-equation system:
C = c0 + c1 Y + c2 R (consumption function with interest rate R)
I = t0 + t1Y + t2R (Investment function with interest rate R)
Y = C + I (definition of GDP as sum of consumption and investment)
(a) List as many of the assumptions implied by the model as you can.
(b) List all the dependent variables, the parameters and the independent
variables. Discuss the economic meaning of each parameter in the equations.
2. Derive a reduced form equation for Y from the model in question 1 and
explain what the marginal effect of R on Y is. What sizes and signs do you
expect a priori for the following parameters: c1, c2, t1 and t2?
L2 An introduction to Stata
Using Stata for data management
and reproducible research
 Stata is a full-featured statistical programming language
for Windows, Mac OS X, Unix and Linux. It can be
considered a “stat package,” like SAS, SPSS, RATS, or
EViews.
 Stata is available in several versions: Stata/IC (the standard
version), Stata/SE (an extended version) and Stata/MP (for
multiprocessing).
 The major difference between the versions is the number
of variables allowed in memory, which is limited to 2,047 in
standard Stata/IC, but can be up to 32,767 in Stata/SE or
Stata/MP. The number of observations in any version is
limited only by your computer’s memory.
Overview of Stata environment
 All versions of Stata provide the full set of features
and commands: there are no special add-ons or
‘toolboxes’. Each copy of Stata includes a complete
set of manuals (over 11,000 pages) in PDF format,
hyperlinked to the on-line help.
 A Stata license may be used on any machine which
supports Stata (Mac OS X, Windows, Linux): there are
no machine-specific licenses
Overview of Stata environment
 Stata is portable, and its developers are committed to
cross-platform compatibility. Stata runs the same way on
Windows, Mac OS X, Unix, and Linux systems
 Perhaps unique among statistical packages, Stata’s binary
data files may be freely copied from one platform to any
other, or even accessed over the Internet from any
machine that runs Stata. You may store Stata’s binary
datafiles on a webserver (HTTP server) and open them on
any machine with access to that server.
Overview of Stata environment
 The Toolbar contains icons that allow you to Open and
Save files, Print results, control Logs, and manipulate
windows. Some very important tools allow you to open the
Do-File Editor, the Data Editor and the Data Browser.
 The Data Editor and Data Browser present you with a
spreadsheet-like view of the data, no matter how large
your dataset may be. The Do-File editor, as we will
discuss, allows you to construct a file of Stata commands,
or “do-file”, and execute it in whole or in part from the
editor.
Overview of Stata environment
 There are several panels in the default interface: the Review,
Results, Command, Variables and Properties panels. You may
alter the appearance of any panel using the Preferences > General
dialog, and make those changes on a temporary or
permanent basis.
 As you might expect, you may type commands in the
Command panel. You may enter only one command at a time in that
panel, so you should not try pasting a list of several commands.
When a command is executed, with or without error, it
appears in the Review panel, and the results of the command
(or an error message) appear in the Results panel. You may
click on any command in the Review panel and it will reappear
in the Command panel, where it may be edited and
resubmitted.
Overview of Stata environment
 Once you have loaded data into the program, the
Variables panel will be populated with information on
each variable, as you can see in the example. That
information includes the variable name, its label (if
any), its type and its format. This is a subset of
information available from the describe command.
 Let’s look at the interface after I have loaded one of
the datasets provided with Stata, uslifeexp, with the
sysuse command and given the describe and
summarize commands:
Overview of Stata environment
 Notice that the three commands are listed in the Review
panel. If any had failed, the _rc column would contain a
nonzero number, in red, indicating the error code.
 The Variables panel contains the list of variables and their
labels.
 The Results panel shows the effects of summarize: for
each variable, the number of observations, their mean,
standard deviation, minimum and maximum. If there
were any string variables in the dataset, they would be
listed as having zero observations.
Overview of Stata environment
 Try it out: type the commands
 sysuse uslifeexp
 describe
 summarize
 Take note of an important design feature of Stata. If you
do not say what to describe or summarize, Stata assumes
you want to perform those commands for every variable in
memory, as shown here. As we shall see, this design
principle holds throughout the program.
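A short sketch of this design feature, using the same uslifeexp dataset. The variable names le_male and le_female are assumed to be those in the shipped dataset; run describe first if in doubt.

```stata
sysuse uslifeexp, clear
summarize                        // no varlist: every variable in memory
summarize le le_male le_female   // explicit varlist: only these three
describe le*                     // wildcard: every variable whose name starts with "le"
```

The same pattern applies to most commands that accept a variable list: leave the list out and the command acts on all variables.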
Overview of Stata environment
 We may also write a do-file in the Do-File Editor and execute it. The
Do-File Editor icon on the Toolbar brings up a window in which we may
type those same three commands, as well as a few more:
 sysuse uslifeexp
 describe
 summarize
 notes
 // average life expectancy, 1900-1949
 summarize le if year < 1950
 // average life expectancy, 1950-1999
 summarize le if year >= 1950
 After typing those commands into the window, the rightmost icon, with
tooltip Do, may be used to execute them.
Overview of Stata environment
 In this do-file, I have included the notes command to
display the notes saved with the dataset, and
included two comment lines. There are several styles
of comments available. In this style, anything on a line
following a double slash (//) is ignored.
 You may use the other icons in the Do-File Editor
window to save your do-file, print it, or edit its
contents. You may also select a portion of the file
with the mouse and execute only those commands.
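The comment styles can be compared in one short do-file. This is a sketch meant to be run from the Do-File Editor: to my knowledge the //, /// and /* */ styles are intended for do-files and ado-files rather than for the Command panel.

```stata
* A line that starts with an asterisk is a comment
sysuse uslifeexp, clear   // text after a double slash is ignored
/* A slash-star pair can wrap a comment
   that runs over several lines */
summarize le ///
    if year < 1950        // three slashes continue the command on the next line
```

The triple slash is useful for breaking a long command over several lines without changing its meaning.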
Overview of Stata environment
 Try it out: use the Do-File Editor to save and reopen
the do-file S1.1.do, and run the file.
 Try selecting only the last four lines and running those
commands.
Overview of Stata environment
 The rightmost menu on the menu bar is labeled Help. From
that menu, you can search for help on any command or
feature. The Help Browser, which opens in a Viewer
window, provides hyperlinks, in blue, to additional help
pages. At the foot of each help screen, there are hyperlinks
to the full manuals, which are accessible in PDF format. The
links will take you directly to the appropriate page of the
manual.
 You may also search for help at the command line with
help command. But what if you don’t know the exact
command name? Then you may use the search command,
which may be followed by one or several words.
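For example, typed one line at a time in the Command panel (the keywords are arbitrary choices for illustration):

```stata
* Exact command name known: open its help file
help regress

* Name unknown: search by keyword
search heteroskedasticity test
```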
Overview of Stata environment
 Results from search are presented in a Viewer window.
Those commands will present results from a keyword
database and from the Internet: for instance, FAQs from
the Stata website, articles in the Stata Journal and Stata
Technical Bulletin, and downloadable routines from the
SSC Archive (about which more later) and user sites.
 Try it out: when you are connected to the Internet, type
the command search baum, au and then try search baum
 Note the hyperlinks that appear on URLs for the books and
journal articles, and on the individual software packages
(e.g., st0030_3, archlm).
Overview of Stata environment
 One of Stata’s great strengths is that it can be updated
over the Internet. Stata has built-in web access, so it
may contact Stata’s web server and enquire whether there
are more recent versions of either Stata’s executable (the
kernel) or the ado-files. This enables Stata’s developers to
distribute bug fixes, enhancements to existing commands,
and even entirely new commands during the lifetime of a
given major release (including ‘dot-releases’ such as Stata
14.1).
 Updates during the life of the version you own are free.
You need only have a licensed copy of Stata and access to
the Internet (which may be by proxy server) to check for
and, if desired, download the updates.
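A minimal sketch of the update workflow, assuming a licensed copy and Internet access:

```stata
* Ask the update server whether newer official files are available
update query

* Download and install whatever updates are available
update all
```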
Overview of Stata environment
 Another advantage of the command-line driven
environment involves extensibility: the continual
expansion of Stata’s capabilities. A command, to
Stata, is a verb instructing the program to perform
some action.
 Commands may be “built in” commands—those
elements so frequently used that they have been
coded into the “Stata kernel.” A relatively small
fraction of the total number of official Stata
commands are built in, but they are used very heavily.
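To check how a particular command is implemented, the which command can be used. It reports either that the command is built in or the location of its ado-file. The two commands below are only examples.

```stata
which summarize
which describe
```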
Overview of Stata environment
 If Stata’s developers tomorrow wrote a new
command named “foobar”, they would make two
files available on their web site: foobar.ado (the ado-
file code) and foobar.sthlp (the associated help file).
Both are ordinary, readable ASCII text files. These files
should be produced in a text editor, not a word
processing program.
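As a sketch of what such a file might contain, here is a minimal ado-file for the hypothetical foobar command. Everything in it is an assumption made for illustration; it only echoes the variable list it receives.

```stata
*! foobar.ado  version 0.1  (hypothetical example, not a real Stata command)
program define foobar
    version 14
    syntax [varlist]
    display as text "foobar received: " as result "`varlist'"
end
```

Saved as foobar.ado in a directory on the adopath (the current working directory is one), it could then be called like any other command, for example foobar price mpg after sysuse auto. With no variable list given, syntax [varlist] fills in all variables in memory.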
Overview of Stata environment
 The importance of this program design goes far
beyond the limits of official Stata.
 You may acquire new Stata commands from a
number of web sites. The Stata Journal (SJ), a
quarterly peer-reviewed journal, is the primary
method for distributing user contributions. Between
1991 and 2001, the Stata Technical Bulletin played this
role, and a complete set of STB issues is
available online at the Stata website.
Overview of Stata environment
 The importance of all this is that Stata is infinitely
extensible. Any ado-file on your adopath is a full-
fledged Stata command. Stata’s capabilities thus
extend far beyond the official, supported features
described in the Stata manual to a vast array of
additional tools.
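To see where Stata looks for ado-files on a given machine, two commands are enough. The directories listed will differ from one installation to another.

```stata
* List the directories searched for ado-files, in order
adopath

* Show the system directories (BASE, SITE, PLUS, PERSONAL, ...)
sysdir
```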
Overview of Stata environment
 Create our own list…
Command focus

Advanced Econometrics L1-2.pptx

  • 1.
    Advanced econometrics andStata L1-2 Dr. Chunxia Jiang Business School, University of Aberdeen, UK Beijing , 17-26 Nov 2019
  • 2.
     Topics andschedule Sessions plan 10月17日 Evening — L1-2 Introduction to Econometrics and Stata 10月18日 Evening — L3-4 Data, simple regression Morning — L5-6 Hypothesis testing, Multi-regression , Violation of assumptions Afternoon Exercises and practice Morning — L7-8 Time series models Evening — L9-10 Panel data models & Endogeneity Morning Exercises and practice Afternoon L11-12 Frontier1 SFA 10月24 Evening L13-14: Frontier2 DEA 10月25日 Evening L15-16 DID Morning Revision Afternoon Exam 10月20日 10月19日 10月22日 10月26日
  • 3.
    What is Econometrics? Economic measurement  Application of mathematical statistics to economic data to lend empirical support to the models constructed by mathematical economics and obtain numerical results.  Quantitative analysis of actual economic phenomena based on the concurrent development of theory and observation, related by appropriate methods of inference.  The empirical determination of economic laws  A conjunction of economic theory and actual measurements, using the theory and techniques of statistical inference as a bridge pier.
  • 4.
    Econometrics  Try toexplain the behaviour of a variable accounting for factors that might affect the behaviour.  Example: analyse the determinants of innovative activities  Objective: What factors affect innovations?  How do we measure innovations?  Investment in R&D, number of patents  What factors do we include as determinants of innovations?  Skilled workers  Proximity to universities/research centres  Access to finance  Past experience  Institutional framework
  • 5.
    Methodology of econometrics Statement of theory or hypothesis  Model definition  Data  Estimation  Hypothesis testing  Forecasting  Policy simulation
  • 6.
    Methodology of econometrics: Statement of theory or hypothesis  Keynes: marginal propensity to consume (MPC) is greater than zero but less than 1  MPC is the rate of change of consumption for a unit change in income
  • 7.
    Methodology of econometrics: Modeldefinition  A mathematical model  Keynes postulated a positive relationship between consumption and income, but not specify the precise form of the functional relational relationship between two Y=β1+β2X 0<β<1 Eq. (1) Y=consumption, X=income, β1, β2 are parameters - intercept and slope coefficients; β2 measures MPC  Eq. (1) is an exact or deterministic relationship, but relationships between economic variables are generally inexact.  Eq. (1) is modified to an econometric model Y=β1+β2X+u Eq. (2) u is disturbance (error term) that is a random (stochastic) variable
  • 8.
    Methodology of econometrics: Data Y=β1+β2X+u Eq. (2)  To estimate Eq. (2) to obtain the numerical values of β1, β2, we need data.  US economy data : personal consumption expenditure and GDP 1960-2005.
  • 9.
    Methodology of econometrics: Estimation Regression analysis we obtain  β2 (MPC) is 0.72, suggesting that for the sample period 1960-2005 an increase in real income of one dollar led, on average, to an increase of about 72 cents in real consumption expenditure t t X Y 7218 . 0 5913 . 299 ˆ   
  • 10.
    Methodology of econometrics Hypothesis testing  Assuming a good fit of the estimated model, if the estimates are in accord with the expectations of the theory statistical inference (hypothesis testing)  Forecasting  The chosen model can be used to predict the future value of Y on the basis of X  Policy simulation  The model may be used for control , or policy, purposes.  i.e., manipulate X to achieve desired Y
  • 11.
  • 12.
    How are dataorganized?  Time series data:  a set of observations on the values that a variable takes at different times.  Cross-sectional data:  Data on one or more variables collected at the same point in time.  Pooled or panel data  Elements of both time series and cross-section data  Panel data: same cross-sectional unit is surveyed over time
  • 13.
    Examples of economicdata  Employment: total number of workers or total number of hours worked  Output: total output, industry output  Skills: number of people with a certain degree  Capital stocks: physical capital, ICT capital, human capital  Import/export/FDI
  • 14.
    Examples of financialdata  What sorts of financial variables do we usually want to explain?  Prices - stock prices, stock indices, exchange rates  Returns - stock returns, index returns, interest rates  Volatility  Trading volumes  Corporate finance variables  Debt issuance, use of hedging instruments
  • 15.
    Where do wefind the data  Data sources: primary survey data and secondary published data, grouped data and non-grouped data  Macro data: national statistics, world bank, IMF  Banking data: national central bank, BankScope  Firm data: datastream,  Survey to collect first-hand data  Accuracy of data: selection biases and data quality  Possible observational errors  In questionnaire-type survey—non-response  Sampling  Aggregate  Researchers should always keep in mind that the results of research are only as good as the quality of the data
  • 16.
    Elements of economicmodels: The basic concepts  Each model consists of a set of equations. The number of equations can be various.  Each equation has a set of variables.  Each equation has a set of parameters (or coefficients).  e.g. A very simple macroeconomic model: Y = C + I + G (1) C = c0 + c1(Y - T) (2) T = t0 + t1Y (3) I = i0 + i1Y + i2R (4) G = g0 + g1Y (5)
  • 17.
    Elements of economicmodels: Assumptions  The economy is a close one. It does not involve any import or export activities.  All the equations are linear. Each dependent variable is exactly explained by the RHS variables.  The model is 'static' in that all parameters are assumed to be 'fixed'.  All the prices and wages are fixed.  There exists a margin of unused resources.
  • 18.
    Types of model: Singleequation model  This is commonly used for some particular analyses:  e.g. --- A demand equation for a specific commodity  Qi d = f(Income, Pi, Pk)  Dependent variable: In any single equation, the dependent variable is always placed on the left hand side (LHS) of the equation. It is explained or dependent on the expression on the right hand side. In this case the dependent variable is Qi d.  Independent variables: In any single equation, all the independent variables are placed on the right hand side (RHS) of the equation. They explain or determine the dependent variable. They are independent because they do not depend on any other variables in that equation.
  • 19.
    Types of model: Multi-equationmodel Y = C + I + G (1) C = c0 + c1(Y - T) (2) T = t0 + t1Y (3) I = i0 + i1Y + i2R (4) G = g0 + g1Y (5)  Dependent variables: Y, C, T, I, G because they are all on the LHS.  Independent variables: R.  Coefficients: c0, c1, t0, t1, i0, i1, i2, g0 and g1.
  • 20.
    Simultaneous Equations Models A simple example V = V0 + I (1) I = i0 + i1R + i2V (2) Where V is total revenue, V0 is basic revenue of a firm, I is total investment, and R is interest rate. i0, i1 and i2 are coefficients.  In the simultaneous equations, dependent variables are called endogenous and independent variables exogenous variables.
  • 21.
    Simultaneous Equations Models How to obtain a reduced form from the structural form?  Step I: Let us copy the structural form V = V0 + I (1) I = i0 + i1R + i2V (2)
  • 22.
    Simultaneous Equations Models Step II: Try to express V and I as the functions of exogenous variables only  Substitute (2) into (1) V = V0 + i0 + i1R + i2V V - i2V = V0 + i0 + i1R ) 3 ( , 1 1 1 1 2 0 1 0 2 1 0 2 2 0 R V V or R i i V i i i V            
  • 23.
    Simultaneous Equations Models Substitute (1) into (2) and manipulate, we obtain  Structural and reduced form equations  Equations (1) and (2) are the structural form equations in which dependent variables (I and V) can appear on both right and left-hand sides.  Equations (3) and (4) are the reduced form equations in which dependent variables appear only on the left-hand side. ) 4 ( , 1 1 1 2 0 1 0 2 1 0 2 2 2 0 R V I or R i i V i i i i I            
  • 24.
    Questions sheet forlecture 1 1. Examine the following multi-equation system: C = c0 + c1 Y + c2 R (consumption function with interest rate R) I = t0 + t1Y + t2R (Investment function with interest rate R) Y = C + I (definition of GDP as sum of consumption and investment) (a) List as many as possible the assumptions implied by the model. (b) List all the dependent variables, the parameters and the independent variables. Discuss the economic meaning of each parameter in the equation. 2. Derive a reduced form equation for Y from the model in question 1 and explain what is the marginal effect of R on Y. What do you expect the sizes and signs of the following parameters a priori: c1, c2, t1 and t2?
  • 25.
    L2 An introductionto Stata Using Stata for data management and reproducible research
  • 26.
     Stata isa full-featured statistical programming language for Windows, Mac OS X, Unix and Linux. It can be considered a “stat package,” like SAS, SPSS, RATS, or eViews.  Stata is available in several versions: Stata/IC (the standard version), Stata/SE (an extended version) and Stata/MP (for multiprocessing).  The major difference between the versions is the number of variables allowed in memory, which is limited to 2,047 in standard Stata/IC, but can be up to 32,767 in Stata/SE or Stata/MP. The number of observations in any version is limited only by your computer’s memory. Overview of Stata environment
  • 27.
     All versionsof Stata provide the full set of features and commands: there are no special add-ons or ‘toolboxes’. Each copy of Stata includes a complete set of manuals (over 11,000 pages) in PDF format, hyperlinked to the on-line help.  A Stata license may be used on any machine which supports Stata (Mac OS X, Windows, Linux): there are no machine-specific licenses Overview of Stata environment
  • 28.
     Stata isportable, and its developers are committed to cross-platform compatibility. Stata runs the same way on Windows, Mac OS X, Unix, and Linux systems  Perhaps unique among statistical packages, Stata’s binary data files may be freely copied from one platform to any other, or even accessed over the Internet from any machine that runs Stata. You may store Stata’s binary datafiles on a webserver (HTTP server) and open them on any machine with access to that server. Overview of Stata environment
  • 29.
     The Toolbarcontains icons that allow you to Open and Save files, Print results, control Logs, and manipulate windows. Some very important  tools allow you to open the Do-File Editor, the Data Editor and the Data Browser.  The Data Editor and Data Browser present you with a spreadsheet-like view of the data, no matter how large your dataset may be. The Do-File editor, as we will discuss, allows you to construct a file of Stata commands, or “do-file”, and execute it in whole or in part from the editor. Overview of Stata environment
  • 30.
     There areseveral panels in the default interface: the Review, Results, Command, Variables and Properties panels. You may alter the appearance of any panel using the Preferences- >General dialog, and make those changes on a temporary or permanent basis.  As you might expect, you may type commands in the Command panel. You may only enter one command in that panel, so you should not try pasting a list of several commands. When a command is executed—with or without error—it appears in the Review panel, and the results of the command (or an error message) appears in the Results panel. You may click on any command in the Review panel and it will reappear in the Command panel, where it may be edited and resubmitted. Overview of Stata environment
  • 31.
     Once youhave loaded data into the program, the Variables panel will be populated with information on each variable, as you can see in the example. That information includes the variable name, its label (if any), its type and its format. This is a subset of information available from the describe command.  Let’s look at the interface after I have loaded one of the datasets provided with Stata, uslifeexp, with the sysuse command and given the describe and summarize commands: Overview of Stata environment
  • 32.
     Notice thatthe three commands are listed in the Review panel. If any had failed, the _rc column would contain a nonzero number, in red, indicating the error code.  The Variables panel contains the list of variables and their labels.  The Results panel shows the effects of summarize: for each variable, the number of observations, their mean, standard deviation, minimum and maximum. If there were any string variables in the dataset, they would be listed as having zero observations. Overview of Stata environment
  • 33.
     Try itout: type the commands  sysuse uslifeexp  describe  summarize  Take note of an important design feature of Stata. If you do not say what to describe or summarize, Stata assumes you want to perform those commands for every variable in memory, as shown here. As we shall see, this design principle holds throughout the program. Overview of Stata environment
  • 34.
     We mayalso write a do-file in the do-file editor and execute it. The  Do-File Editor icon on the Toolbar brings up a window in which we may  type those same three commands, as well as a few more:  sysuse uslifeexp  describe  summarize  notes  // average life expectancy, 1900-1949  summarize le if year < 1950  // average life expectancy, 1950-1999  summarize le if year >= 1950  After typing those commands into the window, the rightmost icon, with  tooltip Do, may be used to execute them. Overview of Stata environment
  • 35.
     In thisdo-file, I have included the notes command to display the notes saved with the dataset, and included two comment lines. There are several styles of comments available. In this style, anything on a line following a double slash (//) is ignored.  You may use the other icons in the Do-File Editor window to save your do-file, print it, or edit its contents. You may also select a portion of the file with the mouse and execute only those commands. Overview of Stata environment
  • 36.
     Try it out: use the Do-File Editor to save and reopen the do-file S1.1.do, and run the file.
     Try selecting only the last four lines and running those commands.
     Overview of Stata environment
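     A saved do-file can also be opened and run from the command line. The sketch below assumes that S1.1.do has been saved in the current working directory; if it is elsewhere, change directory with cd first or give the full path.

```stata
* show the current working directory
pwd

* open the do-file in the Do-File Editor
doedit S1.1.do

* execute the do-file, echoing each command and its output
do S1.1.do

* execute the do-file without echoing commands or output
run S1.1.do
```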
  • 37.
     The rightmost menu on the menu bar is labeled Help. From that menu, you can search for help on any command or feature. The Help Browser, which opens in a Viewer window, provides hyperlinks, in blue, to additional help pages. At the foot of each help screen, there are hyperlinks to the full manuals, which are accessible in PDF format. The links will take you directly to the appropriate page of the manual.
     You may also search for help at the command line with help followed by the command name. But what if you don't know the exact command name? Then you may use the search command, which may be followed by one or several words.
     Overview of Stata environment
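     As a brief illustration of the two routes to help, the commands below look up a known command name and then search by keyword. The keywords are arbitrary examples, not taken from the slides.

```stata
* help requires the exact command name
help summarize
help describe

* search accepts one or more keywords when the command name is unknown
search life expectancy
search panel data
```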
  • 38.
     Results from search are presented in a Viewer window. The command presents results from a keyword database and from the Internet: for instance, FAQs from the Stata website, articles in the Stata Journal and Stata Technical Bulletin, and downloadable routines from the SSC Archive (about which more later) and user sites.
     Try it out: when you are connected to the Internet, type the command
       search baum, au
     and then try
       search baum
     Note the hyperlinks that appear on URLs for the books and journal articles, and on the individual software packages (e.g., st0030_3, archlm).
     Overview of Stata environment
  • 39.
     One of Stata's great strengths is that it can be updated over the Internet. Stata is web-aware, so it may contact Stata's web server and enquire whether there are more recent versions of either Stata's executable (the kernel) or the ado-files. This enables Stata's developers to distribute bug fixes, enhancements to existing commands, and even entirely new commands during the lifetime of a given major release (including 'dot-releases' such as Stata 14.1).
     Updates during the life of the version you own are free. You need only have a licensed copy of Stata and access to the Internet (which may be by proxy server) to check for and, if desired, download the updates.
     Overview of Stata environment
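     The update check described above can be run from the command line. This is a minimal sketch, assuming an Internet connection and permission to write to the Stata installation directory.

```stata
* ask Stata's web server whether newer files are available
update query

* download and install any available updates
update all
```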
  • 40.
     Another advantage of the command-line driven environment involves extensibility: the continual expansion of Stata's capabilities. A command, to Stata, is a verb instructing the program to perform some action.
     Commands may be "built in" commands: those elements so frequently used that they have been coded into the "Stata kernel." A relatively small fraction of the total number of official Stata commands is built in, but they are used very heavily.
     Overview of Stata environment
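     One way to check whether a given command is built in or supplied as an ado-file is the which command. The sketch below is only illustrative: the exact output depends on the Stata version and installation, so no particular result is assumed here.

```stata
* which reports whether a command is built in or, if it is an ado-file,
* where that file is located
which summarize
which sysuse
```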
  • 41.
     If Stata's developers tomorrow wrote a new command named "foobar", they would make two files available on their web site: foobar.ado (the ado-file code) and foobar.sthlp (the associated help file). Both are ordinary, readable ASCII text files. These files should be produced in a text editor, not a word processing program.
     Overview of Stata environment
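     To make the idea of an ado-file concrete, here is a hypothetical sketch of what a command such as foobar might contain. The behaviour (printing the mean of each listed variable) is invented purely for illustration; nothing here is an actual Stata command. It is written as a do-file fragment so that it can be run directly; in practice the program block alone would be saved as foobar.ado.

```stata
* hypothetical example: define a command named foobar
capture program drop foobar
program define foobar
    version 14
    * an optional variable list; if omitted, all variables are used
    syntax [varlist]
    foreach v of local varlist {
        quietly summarize `v'
        display as text "`v'" _col(20) as result %9.3f r(mean)
    }
end

* usage: print the means of year and le
sysuse uslifeexp, clear
foobar year le
```

     Because the syntax line makes the variable list optional, typing foobar alone would apply it to every variable in memory, following the same design principle as describe and summarize.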
  • 42.
     The importance of this program design goes far beyond the limits of official Stata.
     You may acquire new Stata commands from a number of web sites. The Stata Journal (SJ), a quarterly peer-reviewed journal, is the primary method for distributing user contributions. Between 1991 and 2001, the Stata Technical Bulletin (STB) played this role, and a complete set of STB issues is available online at the Stata website.
     Overview of Stata environment
  • 43.
     The importance of all this is that Stata is infinitely extensible. Any ado-file on your adopath is a full-fledged Stata command. Stata's capabilities thus extend far beyond the official, supported features described in the Stata manual to a vast array of additional tools.
     Overview of Stata environment
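     The adopath and the SSC Archive can both be explored from the command line. This is a hedged sketch: the ssc commands require an Internet connection, and estout is used only as an example of a user-written package; it is not mentioned in the slides.

```stata
* list the directories Stata searches for ado-files
adopath

* show the system directories (BASE, PLUS, PERSONAL, and so on)
sysdir

* list the ten most downloaded packages on the SSC Archive
ssc hot, n(10)

* show the description of one package before deciding to install it
ssc describe estout
```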
  • 44.
     Create our own list… Command focus

Editor's Notes

  • #7 MPC- the rate of change of consumption for a unit change in income.