Application of Statistical Tools
for Data Analysis in Research
Dr Joseph James V.
Professor of Commerce & Management, MSN
Institute of Management & Technology, Chavara.
(Formerly Associate Professor
and Head, P G & Research
Department of Commerce, Fatima
Mata National College
(Autonomous), Kollam).
Probability
 Approaches towards Probability
 Basic terminology
◦ Experiments and events
◦ Mutually exclusive events
◦ Collectively exhaustive events/ sample space
◦ Equally likely events
◦ Independent and Dependent events
◦ Simple and compound events
Theorems of Probability
 Addition Theorem
◦ Mutually exclusive cases
◦ Not mutually exclusive cases
 Multiplication Theorem (Joint Probability)
◦ Under statistical independence
◦ Under statistical dependence
 Conditional probability
 Revision of probability
◦ Bayes’ Theorem
 Mathematical Expectation
 Probability/Theoretical Distribution
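As a rough illustration of the addition theorem, the multiplication theorem and Bayes' theorem listed on this slide, the following minimal sketch works three small examples with exact fractions. The function names and the card and prior figures are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def addition_rule(p_a, p_b, p_a_and_b=0):
    """P(A or B) = P(A) + P(B) - P(A and B).
    For mutually exclusive events the joint term is 0."""
    return p_a + p_b - p_a_and_b

def multiplication_rule(p_a, p_b_given_a):
    """P(A and B) = P(A) * P(B | A).
    Under statistical independence P(B | A) is simply P(B)."""
    return p_a * p_b_given_a

def bayes(prior, p_e_given_h, p_e_given_not_h):
    """Revised (posterior) probability P(H | E) from the prior P(H)."""
    evidence = prior * p_e_given_h + (1 - prior) * p_e_given_not_h
    return prior * p_e_given_h / evidence

# Not mutually exclusive: a king or a heart from a 52-card deck
p_king_or_heart = addition_rule(Fraction(4, 52), Fraction(13, 52), Fraction(1, 52))

# Statistical dependence: two aces in a row without replacement
p_two_aces = multiplication_rule(Fraction(4, 52), Fraction(3, 51))

# Hypothetical revision: prior 1%, P(E | H) = 0.9, P(E | not H) = 0.05
posterior = bayes(Fraction(1, 100), Fraction(9, 10), Fraction(5, 100))

print(p_king_or_heart, p_two_aces, posterior)
```

Using Fraction keeps the arithmetic exact, which makes it easy to check the results by hand.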
Statistical Data
• Measurement Scales
• Nominal
• Ordinal
• Interval
• Scale/Ratio
• Data Types
• Simple, Discrete and Continuous Data
• Temporal/Time series Data
• Cross Sectional Data
• Pooled Data
• Panel Data
A Broad Classification of
Statistical Analysis
 Descriptive Analysis
 Difference Analysis
 Relationship Analysis
 Predictive Analysis
 Analysis through Classification
Descriptive Analysis
 Describe the characteristics of the data/
distribution in a summary form
 Tools.
• Measures of central tendency
 Mean, Median, Mode, Partition values, GM, HM,
Specialised Averages like Index Numbers
• Measures of dispersion
• Skewness /Asymmetry
• Kurtosis / Peakedness or flatness
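The descriptive tools above can be sketched with Python's standard statistics module. This is only a minimal sketch: the data values are hypothetical, and skewness and kurtosis are computed here with the simple moment formulas (m3 / m2^1.5 and m4 / m2^2), which is one of several conventions in use.

```python
import statistics as st

def moment_skewness_kurtosis(data):
    """Moment-based skewness and kurtosis (a normal curve has kurtosis 3)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

summary = {
    "mean": st.mean(data),
    "median": st.median(data),
    "mode": st.mode(data),
    "geometric_mean": st.geometric_mean(data),
    "harmonic_mean": st.harmonic_mean(data),
    "quartiles": st.quantiles(data, n=4),   # partition values
    "population_sd": st.pstdev(data),       # a measure of dispersion
}
skewness, kurtosis = moment_skewness_kurtosis(data)

for name, value in summary.items():
    print(name, value)
print("skewness", skewness, "kurtosis", kurtosis)
```

A positive skewness, as in this sample, indicates a longer right tail; a kurtosis below 3 indicates a flatter shape than the normal curve.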
Difference Analysis
 As to whether a statistic is significantly
different from the population parameter
◦ Crosstab and Chi-square test in the case of
categorical variables
◦ In case of Ordinal or better:
 Independent samples – Mann-Whitney U test
 Dependent samples – Wilcoxon signed-rank test
◦ Scale/Ratio
 One variable – t test, one-way ANOVA
 Two or more samples – ANOVA, MANOVA,
MANCOVA etc.
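To make the crosstab and Chi-square step concrete, here is a minimal sketch that computes the Chi-square statistic and degrees of freedom for a contingency table. The function name and the 2 x 2 counts are hypothetical; the statistic would normally be compared with a table value (3.841 at the 5% level for 1 degree of freedom).

```python
def chi_square_statistic(table):
    """Pearson Chi-square statistic for a crosstab given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# Hypothetical crosstab: group (rows) by preference (columns)
observed = [[30, 20],
            [20, 30]]
chi2, dof = chi_square_statistic(observed)
print(chi2, dof)
```

In this hypothetical table every expected count is 25, so the statistic is easy to verify by hand.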
Relationship Analysis
 Correlation Analysis
◦ Scatter Diagram
◦ Correlation graph
◦ Karl Pearson coefficient of correlation
◦ Coefficient of determination (R square)
◦ Spearman’s Rank correlation
◦ Partial Correlation
◦ Multiple correlation (Correlation Matrix)
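The Karl Pearson coefficient, the coefficient of determination and Spearman's rank correlation can be sketched as follows. This is an illustrative sketch with hypothetical data and function names; Spearman's coefficient is computed here as the Pearson coefficient of the ranks, with tied values given their average rank.

```python
import math

def pearson_r(x, y):
    """Karl Pearson coefficient of correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation as Pearson's r on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y rises with x, but not along a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r = pearson_r(x, y)
print("r", r, "r squared", r ** 2, "rho", spearman_rho(x, y))
```

Because y increases monotonically with x, the rank correlation is perfect while Pearson's r stays slightly below 1, which shows the difference between the two measures.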
Predictive Analysis
• Simple regression
–Uses of Regression Analysis
–The regression lines
–The regression equations
–Properties of regression coefficients
–Standard error of estimate
–The coefficient of determination ($r^2$)
• Multiple regression analysis
–$Y = a + b_1 X_1 + b_2 X_2 + \dots + b_j X_j + e$
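For the simple regression items above, a minimal least-squares sketch is given below. It estimates the regression equation of Y on X and reports the coefficient of determination and the standard error of estimate. The function name and data are hypothetical, and the standard error uses n - 2 degrees of freedom, which is an assumption about the convention intended.

```python
import math

def simple_regression(x, y):
    """Least-squares line y = a + b*x, with r squared and standard error of estimate."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                 # regression coefficient (slope)
    a = mean_y - b * mean_x       # constant (intercept)
    fitted = [a + b * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained variation
    sst = sum((yi - mean_y) ** 2 for yi in y)                # total variation
    r_squared = 1 - sse / sst
    std_error = math.sqrt(sse / (n - 2))
    return a, b, r_squared, std_error

# Hypothetical observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b, r_squared, std_error = simple_regression(x, y)
print(f"Y = {a:.2f} + {b:.2f} X, r squared = {r_squared:.2f}, SE = {std_error:.3f}")
```

The same quantities appear in the Model Summary and Coefficients tables of statistical packages, so a hand-checkable example like this can help when reading that output.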
Interpretation of Regression
Result
• Descriptive Statistics
• Correlations
• Variables Entered/Removed (stepwise regression)
• Model Summary (R, R square, adjusted R square and SE)
• ANOVA – p value
• Coefficients (constant, B - unstandardized, Beta - standardized, SE,
t test and p values, confidence limits)
Assumptions of Classical Linear
Regression Model (CLRM)
• Assumption of Linearity
– Correlation and scatter plot
• Assumption of Normality
– Histogram with a fitted normal curve, or a Q-Q plot
– Box plots
– Descriptive statistics using skewness and kurtosis
– Normality can be checked with a goodness-of-fit test,
e.g., the Kolmogorov-Smirnov test, the Shapiro-Wilk
test or the Jarque-Bera test (available in EViews)
• When the data are not normally distributed, a non-linear
transformation, e.g., a log transformation,
might fix this issue; however, it can introduce
effects of multicollinearity.
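As an illustration of checking normality through skewness and kurtosis, the sketch below computes the Jarque-Bera statistic, JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the moment skewness and K the moment kurtosis. The function name and data are hypothetical. The statistic is usually compared with a Chi-square value with 2 degrees of freedom (5.991 at the 5% level), and that comparison is a large-sample approximation.

```python
def jarque_bera(data):
    """Jarque-Bera statistic from moment-based skewness and kurtosis."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    # Both terms are zero for a perfectly normal shape (S = 0, K = 3)
    return n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
jb = jarque_bera(sample)
print(jb)
```

A small value, as here, gives no evidence against normality, although with so few observations the test has very little power.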
Assumptions of Classical Linear
Regression Model (CLRM)
• Assumption of Stationarity
– First differencing and second differencing
– Smoothing by performing regression on a deterministic
time scale and generating expected values
– Unit root test: Augmented Dickey-Fuller (ADF)
• Assumption of Homoscedasticity (problem of
Heteroscedasticity)
• Test that there are no outliers
• The data points are independent (no
autocorrelation within the variables) –
Durbin-Watson test
• The residuals are normally distributed with mean zero
and have constant variance – residual statistics and
histogram of the residuals
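First and second differencing, mentioned above as remedies for non-stationarity, can be sketched in a few lines. The function name and the series are hypothetical; the example uses a series with a quadratic trend, which becomes constant after second differencing.

```python
def difference(series, order=1):
    """Apply differencing `order` times: each pass replaces the series by y[t] - y[t-1]."""
    for _ in range(order):
        series = [current - previous for previous, current in zip(series, series[1:])]
    return series

# Hypothetical series with a quadratic trend
series = [1, 4, 9, 16, 25]
first_diff = difference(series, order=1)
second_diff = difference(series, order=2)
print(first_diff, second_diff)
```

Each pass shortens the series by one observation, which matters when the sample is small.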
Assumptions of Classical Linear
Regression Model (CLRM)
 Assumption of no Autocorrelation
◦ DW statistic
◦ Correlogram Q statistic – EViews output
 Autocorrelation and Partial Autocorrelation
 Problem of Multicollinearity
◦ Correlation matrix
◦ Tolerance and Variance Inflation Factor
(VIF).
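The two diagnostics on this slide can be sketched as follows. The Durbin-Watson statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals; tolerance is 1 - R^2 and VIF is its reciprocal, where R^2 comes from regressing one predictor on the others. The function names and inputs are hypothetical.

```python
def durbin_watson(residuals):
    """DW statistic: near 2 suggests no first-order autocorrelation,
    toward 0 positive autocorrelation, toward 4 negative autocorrelation."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def tolerance_and_vif(r_squared):
    """Tolerance = 1 - R^2 and VIF = 1 / tolerance for one predictor,
    where R^2 is from regressing that predictor on the other predictors."""
    tolerance = 1.0 - r_squared
    return tolerance, 1.0 / tolerance

# Hypothetical residuals that alternate in sign (negative autocorrelation)
dw = durbin_watson([1, -1, 1, -1])

# Hypothetical auxiliary R^2 of 0.75 for one predictor
tolerance, vif = tolerance_and_vif(0.75)
print(dw, tolerance, vif)
```

A common rule of thumb treats a VIF above 10 as a sign of serious multicollinearity, though the cut-off varies between textbooks.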
Test for Specification error
• Ramsey’s RESET
–A single test that gives an overall indication of
specification error arising from
inadequacy of the model specification,
measurement errors and errors with respect to
normality.
–For the model to be precise and suitable,
the coefficients of the powers of the fitted values, when
these are added as regressors along with
the independent variables, should be equal to
zero. Ramsey’s RESET tests this.
Test for Specification error
• Ramsey’s RESET
• Estimate the LRM, $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_j X_j + e$,
and save the fitted values $\hat{Y}$.
• Include powers of the fitted values
($\hat{Y}^2$, $\hat{Y}^3$, …) in the model
and regress again to test whether the coefficients
of the fitted-value terms ($\gamma$) equal 0 in the model:
$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_j X_j + \gamma_1 \hat{Y}^2 + \gamma_2 \hat{Y}^3 + e$
• The joint significance of the $\gamma$ coefficients (on the squared
fitted values, the cubed fitted values, etc.) is
tested using an F test.
• Eviews example
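The RESET steps can be sketched in plain Python as below. This is only an illustrative sketch, not the EViews procedure: the helper names, the small normal-equation solver and the data are all hypothetical. The restricted model is fitted first, the squared and cubed fitted values are then added, and the F statistic compares the two residual sums of squares.

```python
def ols(columns, y):
    """Least squares of y on the given regressor columns plus an intercept.
    Returns (coefficients, fitted values, residual sum of squares)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations: (X'X | X'y)
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return beta, fitted, sse

def reset_f(columns, y, powers=(2, 3)):
    """RESET F statistic with its numerator and denominator degrees of freedom."""
    n = len(y)
    _, fitted, sse_restricted = ols(columns, y)
    extra = [[f ** p for f in fitted] for p in powers]   # powers of fitted values
    _, _, sse_unrestricted = ols(columns + extra, y)
    df1 = len(powers)
    df2 = n - (len(columns) + 1 + df1)
    f_stat = ((sse_restricted - sse_unrestricted) / df1) / (sse_unrestricted / df2)
    return f_stat, df1, df2

# Hypothetical data: Y is truly quadratic in X, but a straight line is fitted
x = [float(i) for i in range(1, 11)]
noise = [0.1 if i % 2 else -0.1 for i in range(10)]
y = [0.1 * xi ** 2 + ei for xi, ei in zip(x, noise)]

f_stat, df1, df2 = reset_f([x], y)
print(f_stat, df1, df2)
```

A large F statistic relative to the F table value for (df1, df2) degrees of freedom points to a specification error, here the omitted squared term. The normal-equation solver is adequate for a small illustration but is not numerically robust for larger or badly scaled problems.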
Thank You
