A BRIEF INTRO TO ‘R’ – APPLIED 
STATS & TIME SERIES ANALYSIS 
- Shanmukha Sreenivas P
THE R ENVIRONMENT
• R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes:
– An effective data handling and storage facility
– A suite of operators for calculations on arrays, in particular matrices
– A large, coherent, integrated collection of intermediate tools for data analysis
– Graphical facilities for data analysis
– A well-developed, simple and effective programming language (R itself, a dialect of ‘S’) which includes conditionals, loops, user-defined recursive functions and I/O facilities.
“OPEN SOURCE”... THAT JUST 
MEANS I DON’T HAVE TO PAY FOR 
IT, RIGHT? 
• No. Much more:
– Provides full access to algorithms and their implementation
– Ability to fix bugs and extend software
– Provides a forum allowing researchers to explore and expand the methods used to analyze data
– Promotes reproducible research by providing open and accessible tools
– Most of R is written in… R! This makes it quite easy to see what functions are actually doing.
WHAT IS IT?
• R is an interpreted computer language.
– Most user-visible functions are written in R itself, calling upon a smaller set of internal primitives.
– It is possible to interface procedures written in C, C++, or Fortran for efficiency, and to write additional primitives.
– System commands can be called from within R.
• R is used for data manipulation, statistics, and graphics. It is made up of:
– operators (+ - <- * %*% …) for calculations on arrays & matrices
– a large, coherent, integrated collection of functions
– facilities for making unlimited types of publication-quality graphics
– user-written functions & sets of functions (packages); 800+ contributed packages so far & growing
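Two of the operators listed above are easy to confuse: in R, `*` multiplies matrices entry by entry, while `%*%` is the row-by-column matrix product. The sketch below mirrors that distinction in Python, used here only for illustration. The matrices `A` and `B` and both helper functions are hypothetical names, not part of R or of the slides.

```python
# Illustration only: plain Python lists stand in for R matrices.
A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]


def elementwise_product(a, b):
    """Multiply matching entries, as R's `*` does for equal-sized matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matrix_product(a, b):
    """Row-by-column product, as R's `%*%` does for conformable matrices."""
    cols_b = list(zip(*b))  # columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


elementwise = elementwise_product(A, B)  # [[5, 12], [21, 32]]
matmul = matrix_product(A, B)            # [[19, 22], [43, 50]]
```

The two results differ in every entry, which is why picking the wrong operator is one of the silent mistakes a newcomer can make.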
R
ADVANTAGES
o Fast and free.
o State of the art: Statistical researchers provide their methods as R packages. SPSS and SAS are years behind R!
o 2nd only to MATLAB for graphics.
o Mx, WinBugs, and other programs use or will use R.
o Active user community
o Excellent for simulation, programming, computer intensive analyses, etc.
o Forces you to think about your analysis.
o Interfaces with database storage software (SQL)
DISADVANTAGES
o Not user friendly at start: steep learning curve, minimal GUI.
o No commercial support; figuring out correct methods or how to use a function on your own can be frustrating.
o Easy to make mistakes and not know.
o Working with large datasets is limited by RAM
o Data prep & cleaning can be messier & more mistake prone in R vs. SPSS or SAS
TYPICAL R SESSION
• Start up R via the GUI or your favorite text editor
• Two windows:
– One or more new or existing scripts (text files): these will be saved
– Terminal: output and temporary input, usually unsaved
STATISTICAL METHODS
• Statistics: “meaningful” quantities about a sample of objects, things, persons, events, phenomena, etc.
• Simple to complex issues, e.g.
– Correlation
– ANOVA
– MANOVA
– Regression: linear, multiple, logistic
– LDA
– PCA / Factor Analysis
– Frequency domain analysis
– Econometric modelling (TSA)
• Two main categories:
– Descriptive statistics
– Inferential statistics
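As a small illustration of the first category, the sketch below computes a few descriptive quantities (central tendency, dispersion, range) and a correlation coefficient, the first item in the list above. It is written in Python purely for illustration. The two samples `hours` and `scores` are invented, and `pearson_r` is a hypothetical helper name.

```python
import math
import statistics

# Hypothetical sample, invented purely for illustration.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]


def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


summary = {
    "mean": statistics.mean(scores),          # central tendency
    "median": statistics.median(scores),      # central tendency, robust to outliers
    "sample_sd": statistics.stdev(scores),    # dispersion (n - 1 denominator)
    "range": max(scores) - min(scores),       # dispersion
    "r_hours_scores": pearson_r(hours, scores),
}
```

These numbers only summarise the sample at hand; saying anything about the wider population is the job of the second category, inferential statistics.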
DESCRIPTIVE STATISTICS
• Use sample information to explain or abstract population “phenomena”.
• Common “phenomena”:
– Association (e.g. σ1,2.3 = 0.75)
– Tendency (left-skew, right-skew)
– Causal relationship (e.g. if X, then Y)
– Trend, pattern, dispersion, range
• Used in non-parametric analysis
INFERENTIAL STATISTICS
• Using sample statistics to infer some “phenomena” of population parameters
• Hypothesis testing
• Common “phenomena”: cause-and-effect
– One-way relationship: ANOVA
– Multi-directional relationship: MANOVA
• Uses parametric analysis
COMMON MISTAKES (CONTD.) – “ABUSE OF STATISTICS”
Data analysis techniques, by issue:

Issue: Measure the “influence” of a variable on another
– Example of abuse: Using partial correlation (e.g. Spearman coeff.)
– Correct technique: Using a regression parameter

Issue: Finding the “relationship” between one variable with another
– Example of abuse: Multi-dimensional scaling, Likert scaling
– Correct technique: Simple regression coefficient

Issue: To evaluate whether a model fits data better than the other
– Example of abuse: Using R²
– Correct technique: Many, a.o.t. Box-Cox χ² test for model equivalence

Issue: To evaluate accuracy of “prediction”
– Example of abuse: Using R² and/or F-value of a model
– Correct technique: Hold-out sample’s MAPE, MAD

Issue: “Compare” whether a group is different from another
– Example of abuse: Multi-dimensional scaling, Likert scaling
– Correct technique: Many, a.o.t. two-way ANOVA, χ², Z test

Issue: To determine whether a group of factors “significantly influence” the observed phenomenon
– Example of abuse: Multi-dimensional scaling, Likert scaling
– Correct technique: Many, a.o.t. MANOVA, regression
TIME SERIES ANALYSIS
• A time series is a collection of observations made sequentially in time.
STOCHASTIC PROCESSES USEFUL IN MODELING TIME SERIES
(1) a purely random process,
(2) a random walk,
(3) a moving average (MA) process,
(4) an autoregressive (AR) process,
(5) an autoregressive moving average (ARMA) process, and
(6) an autoregressive integrated moving average (ARIMA) process.
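The six processes above are closely related, which a short simulation makes visible. The sketch below is in Python, for illustration only, and restricts itself to first-order cases. The coefficients 0.8 and 0.6, the seed, and all function names are arbitrary assumptions and are not taken from the slides.

```python
import random


def purely_random(n, sigma=1.0, seed=0):
    """(1) Purely random process: independent N(0, sigma^2) draws."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def arma_1_1(eps, phi=0.0, theta=0.0):
    """(5) ARMA(1,1): x_t = phi*x_{t-1} + e_t + theta*e_{t-1}, zero start-up values.

    theta=0 gives an AR(1) process (4); phi=0 gives an MA(1) process (3).
    """
    x, prev_x, prev_e = [], 0.0, 0.0
    for e in eps:
        prev_x = phi * prev_x + e + theta * prev_e
        prev_e = e
        x.append(prev_x)
    return x


def integrate(x):
    """Cumulative sum, i.e. the inverse of first differencing."""
    out, total = [], 0.0
    for v in x:
        total += v
        out.append(total)
    return out


def difference(x):
    """First difference x_t - x_{t-1}."""
    return [b - a for a, b in zip(x, x[1:])]


eps = purely_random(200)
walk = integrate(eps)                     # (2) random walk: summed noise
ma1 = arma_1_1(eps, theta=0.6)            # (3)
ar1 = arma_1_1(eps, phi=0.8)              # (4)
arma = arma_1_1(eps, phi=0.8, theta=0.6)  # (5)
arima_111 = integrate(arma)               # (6) ARIMA(1,1,1): its first difference is ARMA(1,1)
```

Differencing `walk` recovers the noise and differencing `arima_111` recovers `arma`, which is the sense in which the "I" in ARIMA stands for "integrated".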
ETS(M,N,N) MODEL
M -> Multiplicative error
N -> No trend
N -> No seasonality
alpha = 0.1713
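With no trend and no seasonality, the ETS(M,N,N) level update l_t = l_{t-1}(1 + alpha*e_t), where e_t = (y_t - l_{t-1}) / l_{t-1}, simplifies to l_t = alpha*y_t + (1 - alpha)*l_{t-1}. Its point forecasts therefore follow the simple exponential smoothing recursion and are flat over the horizon. The Python sketch below shows that recursion with the alpha reported above. The `history` values are invented, the function names are hypothetical, and starting the level at the first observation is an assumption (a fitted model would estimate the initial level).

```python
def ses_levels(y, alpha, initial_level=None):
    """Level recursion l_t = alpha*y_t + (1 - alpha)*l_{t-1}."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Assumption: start the level at the first observation unless told otherwise.
    level = y[0] if initial_level is None else initial_level
    levels = []
    for obs in y:
        level = alpha * obs + (1.0 - alpha) * level
        levels.append(level)
    return levels


def flat_forecast(y, alpha, horizon):
    """No trend, no seasonality: every h-step-ahead point forecast is the last level."""
    return [ses_levels(y, alpha)[-1]] * horizon


# Hypothetical history, not the series behind the slides.
history = [60.0, 58.0, 55.0, 59.0, 57.0]
forecast = flat_forecast(history, alpha=0.1713, horizon=7)
```

A small alpha such as 0.1713 means each new observation moves the level only slightly, so the forecast behaves like a slowly updated average of the history.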
VALIDATION
Date      Actual  ARIMA(1,1,2) forecast  Rel Err   ETS(M,N,N) forecast  Rel Err
13-03-12  65      60.48468               0.069466  57.33989             0.117848
12-03-12  73      55.66896               0.237412  57.33989             0.214522
11-03-12  80      58.24566               0.271929  57.33989             0.283251
10-03-12  54      56.86697               0.053092  57.33989             0.06185
09-03-12  55      57.60465               0.047357  57.33989             0.042543
08-03-12  55      57.20995               0.040181  57.33989             0.042543
07-03-12  51      57.42114               0.125905  57.33989             0.124312
MAPE                                     0.120763                       0.126696
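The Rel Err and MAPE figures in the table can be reproduced from the other columns. The Python sketch below does so, reading the unlabeled second column as the observed value. That reading is an inference, though it is consistent with every relative error shown. Function and variable names are illustrative, and MAPE is expressed as a fraction rather than a percentage, as in the table.

```python
# Values copied from the validation table (seven hold-out days).
actual = [65, 73, 80, 54, 55, 55, 51]
arima_forecast = [60.48468, 55.66896, 58.24566, 56.86697,
                  57.60465, 57.20995, 57.42114]
ets_forecast = [57.33989] * 7  # ETS(M,N,N) gives one flat forecast


def relative_errors(observed, forecast):
    """Absolute forecast error as a fraction of the observed value."""
    return [abs(a - f) / a for a, f in zip(observed, forecast)]


def mape(observed, forecast):
    """Mean absolute percentage error, kept as a fraction."""
    errs = relative_errors(observed, forecast)
    return sum(errs) / len(errs)


mape_arima = mape(actual, arima_forecast)
mape_ets = mape(actual, ets_forecast)
```

On this hold-out sample the ARIMA(1,1,2) forecasts have the slightly lower MAPE, matching the bottom row of the table.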
A brief introduction to 'R' statistical package

  • 1. A BRIEF INTRO TO ‘R’ – APPLIED STATS & TIME SERIES ANALYSIS - Shanmukha Sreenivas P
  • 2. THE R ENVIRONMENT
      – R is an integrated suite of software facilities for data manipulation, calculation and graphical display.
      – An effective data handling and storage facility
      – A suite of operators for calculations on arrays, in particular matrices
      – A large, coherent, integrated collection of intermediate tools for data analysis
      – Graphical facilities for data analysis
      – A well developed, simple and effective programming language (called ‘S’) which includes conditionals, loops, user defined recursive functions and I/O facilities.
  • 3. “OPEN SOURCE”... THAT JUST MEANS I DON’T HAVE TO PAY FOR IT, RIGHT?
      No. Much more:
      – Provides full access to algorithms and their implementation
      – Ability to fix bugs and extend software
      – Provides a forum allowing researchers to explore and expand the methods used to analyze data
      – Promotes reproducible research by providing open and accessible tools
      – Most of R is written in… R! This makes it quite easy to see what functions are actually doing.
  • 4. WHAT IS IT?
      R is an interpreted computer language.
      – Most user-visible functions are written in R itself, calling upon a smaller set of internal primitives.
      – It is possible to interface procedures written in C, C++, or Fortran for efficiency, and to write additional primitives.
      – System commands can be called from within R.
      R is used for data manipulation, statistics, and graphics. It is made up of:
      – operators (+ - <- * %*% …) for calculations on arrays & matrices
      – a large, coherent, integrated collection of functions
      – facilities for making unlimited types of publication quality graphics
      – user written functions & sets of functions (packages); 800+ contributed packages so far & growing
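As a small illustration of the operators named on this slide, here is a minimal sketch in base R. The matrix values are made up for the example and are not from the deck.

```r
# Assign with <- and build a 2 x 2 matrix (illustrative values only)
A <- matrix(c(1, 2, 3, 4), nrow = 2)   # filled column by column
B <- diag(2)                           # 2 x 2 identity matrix

elementwise <- A * A     # * multiplies element by element
product     <- A %*% B   # %*% is matrix multiplication

print(elementwise)
print(product)

# Printing a function object (name without parentheses) shows its R source,
# which is one practical benefit of most of R being written in R
print(sd)
```

Multiplying by the identity matrix leaves `A` unchanged, which makes the difference between `*` and `%*%` easy to see in the printed output.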
  • 5. R: ADVANTAGES AND DISADVANTAGES
      Advantages:
      – Fast and free.
      – State of the art: Statistical researchers provide their methods as R packages. SPSS and SAS are years behind R!
      – 2nd only to MATLAB for graphics.
      – Mx, WinBugs, and other programs use or will use R.
      – Active user community
      – Excellent for simulation, programming, computer intensive analyses, etc.
      – Forces you to think about your analysis.
      – Interfaces with database storage software (SQL)
      Disadvantages:
      – Not user friendly @ start - steep learning curve, minimal GUI.
      – No commercial support; figuring out correct methods or how to use a function on your own can be frustrating.
      – Easy to make mistakes and not know.
      – Working with large datasets is limited by RAM
      – Data prep & cleaning can be messier & more mistake prone in R vs. SPSS or SAS
  • 6. TYPICAL R SESSION
      – Start up R via the GUI or your favorite text editor
      – Two windows:
          – 1+ new or existing scripts (text files) - these will be saved
          – Terminal: output & temporary input - usually unsaved
  • 7. STATISTICAL METHODS
      – Statistics: “meaningful” quantities about a sample of objects, things, persons, events, phenomena, etc.
      – Simple to complex issues. E.g.
          – Correlation
          – ANOVA
          – MANOVA
          – Regression: linear, multiple, logistic
          – LDA
          – PCA / Factor Analysis
          – Frequency domain analysis
          – Econometric modelling (TSA)
      – Two main categories:
          – Descriptive statistics
          – Inferential statistics
  • 8. DESCRIPTIVE STATISTICS
      – Use sample information to explain/make abstraction of population “phenomena”.
      – Common “phenomena”:
          – Association (e.g. σ1,2.3 = 0.75)
          – Tendency (left-skew, right-skew)
          – Causal relationship (e.g. if X, then Y)
          – Trend, pattern, dispersion, range
      – Used in non-parametric analysis
  • 9. INFERENTIAL STATISTICS
      – Using sample statistics to infer some “phenomena” of population parameters
      – Hypothesis testing
      – Common “phenomena”: cause-and-effect
          – One-way relationship - ANOVA
          – Multi-directional relationship - MANOVA
      – Use parametric analysis
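The descriptive quantities on this slide (association, tendency, dispersion, range) can be sketched in base R. The `cars` dataset that ships with R is used only as a stand-in, since the deck does not name a dataset.

```r
# 'cars' ships with base R: speed and stopping distance for 50 cars
data(cars)

dist_summary <- summary(cars$dist)            # range, quartiles, mean
dist_sd      <- sd(cars$dist)                 # dispersion
speed_dist_r <- cor(cars$speed, cars$dist)    # association (Pearson correlation)

# Crude tendency check: a mean above the median suggests right skew
right_skewed <- mean(cars$dist) > median(cars$dist)

print(dist_summary)
print(dist_sd)
print(speed_dist_r)
print(right_skewed)
```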
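A one-way ANOVA, the first technique named on this slide, might look as follows in base R. The built-in `PlantGrowth` dataset and the 5% significance level are assumptions made for the illustration.

```r
# 'PlantGrowth' ships with base R: dried plant weight under three conditions
data(PlantGrowth)

# One-way ANOVA: does mean weight differ between the groups?
fit <- aov(weight ~ group, data = PlantGrowth)
anova_table <- summary(fit)
print(anova_table)

# Extract the p-value of the group effect for the hypothesis test
p_value <- anova_table[[1]][["Pr(>F)"]][1]
reject_null <- p_value < 0.05   # assumed 5% significance level
print(reject_null)
```

The null hypothesis here is that all group means are equal; `reject_null` only reports whether the sample gives evidence against it at the chosen level.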
  • 10. COMMON MISTAKES (CONTD.) – “ABUSE OF STATISTICS”

    | Issue | Data analysis technique: example of abuse | Data analysis technique: correct technique |
    |---|---|---|
    | Measure the “influence” of a variable on another | Using partial correlation (e.g. Spearman coeff.) | Using a regression parameter |
    | Finding the “relationship” between one variable and another | Multi-dimensional scaling, Likert scaling | Simple regression coefficient |
    | To evaluate whether a model fits data better than the other | Using R² | Many; among other things, Box-Cox χ² test for model equivalence |
    | To evaluate accuracy of “prediction” | Using R² and/or F-value of a model | Hold-out sample’s MAPE, MAD |
    | “Compare” whether a group is different from another | Multi-dimensional scaling, Likert scaling | Many; among other things, two-way ANOVA, χ², Z test |
    | To determine whether a group of factors “significantly influence” the observed phenomenon | Multi-dimensional scaling, Likert scaling | Many; among other things, MANOVA, regression |
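To make the “accuracy of prediction” row concrete, here is a minimal sketch that scores a regression on a hold-out sample with MAPE and MAD instead of relying on in-sample R². The `cars` dataset and the every-fifth-row split are assumptions for illustration only.

```r
data(cars)

# Hypothetical split: every 5th row is held out, the rest is used for fitting
holdout_idx <- seq(5, nrow(cars), by = 5)
train <- cars[-holdout_idx, ]
test  <- cars[holdout_idx, ]

fit  <- lm(dist ~ speed, data = train)
pred <- predict(fit, newdata = test)

# In-sample fit, which says nothing direct about predictive accuracy
in_sample_r2 <- summary(fit)$r.squared

# Hold-out accuracy measures
abs_err <- abs(test$dist - pred)
mape <- mean(abs_err / abs(test$dist))   # mean absolute percentage error
mad  <- mean(abs_err)                    # mean absolute deviation

print(c(R2 = in_sample_r2, MAPE = mape, MAD = mad))
```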
  • 11. TIME SERIES ANALYSIS
      – A time series is a collection of observations made sequentially in time.
  • 12. STOCHASTIC PROCESSES USEFUL IN MODELING TIME SERIES
      (1) a purely random process,
      (2) a random walk,
      (3) a moving average (MA) process,
      (4) an autoregressive (AR) process,
      (5) an autoregressive moving average (ARMA) process, and
      (6) an autoregressive integrated moving average (ARIMA) process.
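Each of the six processes listed on this slide can be simulated in base R. The sketch below uses `rnorm`, `cumsum` and `arima.sim` from the standard `stats` package. The coefficients (0.7 and 0.6), the series length and the seed are arbitrary choices, not values from the deck.

```r
set.seed(1)   # arbitrary seed, only for reproducibility
n <- 200      # arbitrary series length

white_noise <- rnorm(n)                                      # (1) purely random process
random_walk <- cumsum(rnorm(n))                              # (2) random walk
ma1    <- arima.sim(model = list(ma = 0.6), n = n)           # (3) MA(1)
ar1    <- arima.sim(model = list(ar = 0.7), n = n)           # (4) AR(1)
arma11 <- arima.sim(model = list(ar = 0.7, ma = 0.6), n = n) # (5) ARMA(1,1)
arima111 <- arima.sim(                                       # (6) ARIMA(1,1,1)
  model = list(order = c(1, 1, 1), ar = 0.7, ma = 0.6), n = n
)

# Differencing a random walk recovers its purely random increments
increments <- diff(random_walk)

# Lag-1 sample autocorrelation of the AR(1) series
ar1_lag1 <- acf(ar1, plot = FALSE)$acf[2]
print(ar1_lag1)
```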
  • 15. ETS(M,N,N) model: M -> multiplicative error, N -> no trend, N -> no seasonality; alpha = 0.1713
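With no trend and no seasonality, the point forecasts of this model reduce to simple exponential smoothing: the level is updated as l_t = l_{t-1} + alpha * (y_t - l_{t-1}), and every future forecast equals the last level. The sketch below is a hand-rolled illustration of that recursion, not the fitting routine used for the deck. The helper `ses_level`, the observations in `y` and the choice of the first observation as the starting level are all hypothetical.

```r
# Level recursion of simple exponential smoothing:
#   l_t = l_{t-1} + alpha * (y_t - l_{t-1})
# 'l0' defaults to the first observation (an assumption; fitted models
# usually estimate the initial level instead)
ses_level <- function(y, alpha, l0 = y[1]) {
  level <- l0
  for (obs in y) {
    level <- level + alpha * (obs - level)
  }
  level
}

y     <- c(52, 60, 58, 49, 63, 55, 57)  # hypothetical observations
alpha <- 0.1713                          # smoothing parameter from the slide

final_level <- ses_level(y, alpha)

# No trend, no seasonality: all h-step-ahead point forecasts are identical
forecasts <- rep(final_level, 7)
print(forecasts)
```

This flat forecast path is why a model of this type produces the same forecast value for every day of a hold-out period.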
  • 16. VALIDATION

    | Date | Actual | Forecasts using ARIMA(1,1,2) | Rel Err | Forecasts using ETS(M,N,N) | Rel Err |
    |---|---|---|---|---|---|
    | 13-03-12 | 65 | 60.48468 | 0.069466 | 57.33989 | 0.117848 |
    | 12-03-12 | 73 | 55.66896 | 0.237412 | 57.33989 | 0.214522 |
    | 11-03-12 | 80 | 58.24566 | 0.271929 | 57.33989 | 0.283251 |
    | 10-03-12 | 54 | 56.86697 | 0.053092 | 57.33989 | 0.06185 |
    | 09-03-12 | 55 | 57.60465 | 0.047357 | 57.33989 | 0.042543 |
    | 08-03-12 | 55 | 57.20995 | 0.040181 | 57.33989 | 0.042543 |
    | 07-03-12 | 51 | 57.42114 | 0.125905 | 57.33989 | 0.124312 |
    | MAPE | | | 0.120763 | | 0.126696 |
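The relative errors and MAPE values on this slide can be recomputed from the listed figures. The sketch below assumes that the first numeric column holds the observed values and that relative error means |actual - forecast| / actual. The variable names are illustrative.

```r
# Values copied from the validation slide (13-03-12 down to 07-03-12)
actual   <- c(65, 73, 80, 54, 55, 55, 51)
arima_fc <- c(60.48468, 55.66896, 58.24566, 56.86697,
              57.60465, 57.20995, 57.42114)
ets_fc   <- rep(57.33989, 7)   # flat forecast from ETS(M,N,N)

# Relative (absolute percentage) error per observation
rel_err <- function(actual, forecast) abs(actual - forecast) / abs(actual)

mape_arima <- mean(rel_err(actual, arima_fc))
mape_ets   <- mean(rel_err(actual, ets_fc))

print(round(c(ARIMA = mape_arima, ETS = mape_ets), 6))
```

Under these assumptions the result agrees with the slide to six decimals: about 0.120763 for ARIMA(1,1,2) and 0.126696 for ETS(M,N,N), so the ARIMA model has the slightly lower hold-out error.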