1
https://xkcd.com/1478/
Introducing the R software: Free
statistics at your fingertips
Kamarul Imran Musa
MD, M Community Medicine
Associate Professor (Epidemiology and Statistics)
Dept of Community Medicine, School of Medical Sciences,
Universiti Sains Malaysia, Health Campus
Email: drki.musa@gmail.com
2
Overview of presentation
• A bit on ‘Data’ and people dealing with ‘Data’
• Statistical software – choices
• Our main course: R
• Different flavors of R
• Our experiences with R at Health Campus
• Data analysis – now and future
3
Data now … data in the future?
• What is data?
– Facts and statistics collected together for reference and analysis
4
5
https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/
(Harvard Business Review)
Scientist and Data
• Are you a scientist?
• Do we work with data?
• Scientist + Data
• Are we data scientists?
• What is new for data science in health and medicine?
6
7
What does a data scientist do?
• Data scientists are inquisitive:
8
http://www-01.ibm.com/software/data/infosphere/data-scientist/
Scientists use tools to work with data
• We need tools in science
• What is the right tool for a data scientist?
• In my case, tools to deal with data
– Scientists use one or more tools to ‘manipulate’ data, producing results
that they then have to make sense of
• Which tools are available to help scientists work best with their
data?
9
Choices of statistical software – many. So don’t get
spoiled
• The usual questions for scientists dealing with data analysis
– What choices do you have?
– Which one are you familiar with?
– Popularity?
– Cost?
– Capability?
– After-sale support?
– Meet scientific rigor?
10
IBM SPSS – everyone knows it
• Popular, easy and user-friendly
• How about the cost? When does the license expire?
Usually for USM, every July.
• What does it do when it expires? NOTHING works
• http://www.ibm.com/marketplace/cloud/spss-statistics/us/en-us?step=Plan
11
STATA – fewer people know it, but it is amazing
• The license does not expire, but it is upgradeable
• Much cheaper than SPSS
• Balanced between
– Writing code
– Point-and-click use
• Powerful
• http://www.stata.com/order/new/edu/single-user-licenses/
12
Who is using what software?
• Number of scholarly articles found in the most recent complete
year (2014) for each software package.
• In order of # of articles:
1. SPSS
2. SAS
3. R
4. STATA
• http://r4stats.com/articles/popularity/
13
The number of scholarly articles found in each year by
Google Scholar.
Only the top six “classic” statistics packages are shown.
14
Not-so-new kid on the block – R
15
What does (almost) everybody know about R?
• R is :
– ‘GNU S’, a freely available language and environment for
statistical computing and graphics which provides a wide
variety of statistical and graphical techniques: linear and
nonlinear modelling, statistical tests, time series
analysis, classification, clustering, etc.
• Questions:
1. What can R do?
2. What is special about R?
3. Does R have a future?
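To give a flavor of the techniques listed above, here is a minimal sketch using only base R and the built-in `mtcars` example data. The choice of data set and variables is an assumption made for illustration and does not come from the presentation.

```r
# Built-in example data: 32 cars with fuel economy and other measurements
data(mtcars)

# Linear modelling: miles per gallon as a function of car weight
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)

# Statistical test: compare mpg between automatic (am = 0) and manual (am = 1) cars
t.test(mpg ~ am, data = mtcars)

# Graphics: scatter plot with the fitted regression line
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(fit)
```

Each step is a single function call, and the fitted model `fit` is an ordinary object that can be inspected or reused later.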
16
R and R-gui
• https://cran.r-project.org/
17
Revolution-R
• Microsoft owns Revolution-R
http://www.revolutionanalytics.com/revolution-r-enterprise
18
Revolution R Ent and Revolution R Open
19
RStudio IDE
• https://www.rstudio.com/
• Highly recommended: start
with the RStudio IDE
• It is an interface for R
• Requires users to download
and install R first from CRAN
20
RStudio IDE – Features
• Clean interface
• Organized
• Integrated with many brilliant built-in
tools
21
DEMO
22
How does R fit into data analysis now and in the
future?
23
Recognition
24
Reproducibility (DEMO)
25
Reproducibility
• Reproducibility in research
• The Associate Editor for reproducibility (AER) will handle
submissions of reproducible articles.
– Data: The analytic data from which the principal results were derived are
made available on the journal's Web site.
– Code: Any computer code, software, or other computer instructions that
were used to compute published results are provided.
– Reproducible: An article is designated as reproducible if the AER succeeds
in executing the code on the data provided and produces results matching
those that the authors claim are reproducible.
– http://biostatistics.oxfordjournals.org/content/10/3/405.full
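One small, concrete habit that supports this kind of reproducibility in R is fixing the random seed and recording the software environment. The sketch below is only an illustration: the function name `run_analysis`, the seed value and the bootstrap itself are assumptions, not part of the journal's policy or the presentation.

```r
# Hypothetical analysis: bootstrap confidence interval for a mean
run_analysis <- function(seed) {
  set.seed(seed)                       # fix the random number stream
  x <- rnorm(100)                      # simulated data stands in for real data
  boot_means <- replicate(1000, mean(sample(x, replace = TRUE)))
  quantile(boot_means, c(0.025, 0.975))
}

first_run  <- run_analysis(2015)
second_run <- run_analysis(2015)

# Same code, same data, same seed: the two results should be identical
identical(first_run, second_run)

# Record the R version and loaded packages alongside the results
sessionInfo()
```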
26
On-the-fly reports using R Markdown (DEMO)
• Produce reports on the fly
• In HTML or PDF formats
• Benefits
– Save time
– Reduce error – no more copy-and-paste
– Pretty
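As a rough sketch of what such a report looks like, the code below writes a tiny R Markdown file and renders it to HTML. The file name, title and report content are hypothetical. Rendering needs the `rmarkdown` package and pandoc, so that step is skipped when either is missing.

```r
# Contents of a minimal R Markdown file (title and text are illustrative)
rmd <- c(
  "---",
  "title: \"Example report\"",
  "output: html_document",
  "---",
  "",
  "Mean mpg in the built-in mtcars data: `r round(mean(mtcars$mpg), 1)`",
  "",
  "```{r}",
  "summary(lm(mpg ~ wt, data = mtcars))",
  "```"
)

# Save it to a temporary folder
rmd_file <- file.path(tempdir(), "report.Rmd")
writeLines(rmd, rmd_file)

# Render to HTML only if rmarkdown and pandoc are available
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  rmarkdown::render(rmd_file, output_format = "html_document", quiet = TRUE)
}
```

The numbers in the report are computed when it is rendered, which is why no copy-and-paste is needed: if the data change, re-rendering updates every figure.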
27
Integration with other software (DEMO)
• LaTeX
• Stata
• WinBUGS
• SPSS
• SAS
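One simple form of integration is exchanging data files. The sketch below uses the `foreign` package, which ships with standard R installations as a recommended package, to write a Stata file and read it back. The file names are hypothetical and the demo itself may have shown something different. Note that `read.dta` handles only older Stata file formats; the separate `haven` package is a common alternative for newer files.

```r
library(foreign)

# Round trip: export a data frame to Stata format, then read it back
dta_file <- file.path(tempdir(), "cars.dta")
write.dta(mtcars, dta_file)
cars_from_stata <- read.dta(dta_file)
head(cars_from_stata)

# SPSS files are read in a similar way (hypothetical file name):
# survey <- read.spss("survey.sav", to.data.frame = TRUE)
```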
28
Our experience with R
• No experience yet with undergraduates
• Started teaching R for DrPH candidates this academic session
• Personally introduced R, 2 years ago
• Common resistance
– Totally command-driven
– Steep learning curve
– Limited resources, especially books on R (that was 2 years ago; not a
problem now)
– You need to know your statistics
– Not for data entry
– Very difficult to view and manipulate variables
29
How’s the feedback from users?
• No formal study or assessment on their experience
• Users seem to like R because it opens up creativity
• R pushes users to explore more and challenge themselves
• R is not boring like point-and-click (menu driven) software
• They seem to like R-markdown
– On-the-fly report
30
The BIG question: stick to R? And R only?
• Yes, you may
• Hmm, maybe not
– Specialized software for data entry
– Software for data cleaning
– Software for data mining
• But yes, one software package is enough for 95% of us
31
Embrace R and abandon others?
• I love R
– Lots of data analysis
– Creating publications: HTML, PDF
– Spatial data analysis
– Bayesian
• WinBUGS
• INLA
• But I do love Stata too
– Data cleaning
– Variable manipulation
• And I use Epidata for data entry
• But yes, I have left SPSS
32
• East-coast data science user group
• My blog :
– https://designdataanalysis.wordpress.com
33
