Executive Intro to R
William M. Cohee
November 2016
Prepared using Apache OpenOffice 4.1.2
Presenter Bio
● 15+ years of Wall Street Technology
experience
● Expertise in front-office Fixed Income
Systems, Analytics, Pricing, Instrument,
& Entity Reference Data Management
● BA, Computer Science
● MS, Information Systems Engineering
● Certified Bloomberg Specialist
● Currently in the Chief Data Office
@ HSBC
● www.linkedin.com/in/billcohee
Topic
● Tool of choice for Statisticians, Data Analysts, & Data Scientists
● Popularity and use of R is on the rise
● R Community is vibrant & the talent pool is growing rapidly
● R is evolving from its statistical computing roots into a development
platform for robust, reusable software
● A lot of commercial, third-party systems are adding support
● Oracle, Microsoft becoming big players
● R can be used to manage & analyze data in Hadoop
● A growing ecosystem is accelerating industry acceptance/adoption
● R-savvy IT leaders can deliver more effective, lower-cost solutions
Agenda
● What is R
● What can R be used for
● Recap & where to learn more
R – What is it?
● A powerful computing environment for Data Analysis & Statistics
● 'R' proper is an open-source programming language
● Developed as a dialect of 'S'
● S developed by Bell Labs to 'turn ideas into software, quickly and
faithfully' c.1976
● strong desire at the time for an alternative to writing FORTRAN
subroutines for analyzing data
● Ross Ihaka and Robert Gentleman recognized as original creators
of R while professors at the University of Auckland in New Zealand
c.1995
● v1.0.0 was released in February 2000
R – What is it?
● Traditional user base consists of
● Researchers
● Statisticians
● Academia
● 'New wave' R users
● Wall Street Desk Quants
● Risk Analysts & Financial Modelers
● Data Scientists
● The advent of Big Data and the nascent field of Data Science are serving
as catalysts for the sudden rise of this 16+ year-old technology
R – What is it?
● When people speak of R, they are usually referring to the broader
ecosystem, not the language
● R for Windows, Microsoft R Open – command-line interpreters
● RStudio, R Tools for Visual Studio – IDEs (Integrated Development Environments)
● user-friendly, robust, graphical front-ends for working with R
● CRAN and MRAN
● Comprehensive R Archive Network
● Microsoft R Application Network
● repositories of open-source extensions to R known as 'Packages'
● think of a Package as a pre-built library of functions & data
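
To make the Package idea concrete, here is a minimal sketch of how a package is checked for, installed, and loaded. The choice of dplyr is only an illustration, and the install line is left commented out because it needs network access to CRAN.

```r
# Hypothetical choice of package for illustration; any CRAN package name works.
pkg <- "dplyr"

# requireNamespace() reports whether a package is installed without attaching it.
if (!requireNamespace(pkg, quietly = TRUE)) {
  message("Not installed; run install.packages(\"", pkg, "\") to fetch it from CRAN")
}

# One-time install from CRAN (commented out: needs network access).
# install.packages("dplyr")

# Packages that ship with R can be attached right away.
library(datasets)

# A package is a library of functions and data: list a few of its objects.
head(ls("package:datasets"))
```

Once a package is attached with library(), its functions and datasets can be used by name for the rest of the session.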
R – What is it?
● R was not created with 'coders' in mind
● Creators were focused on how to make Data Analysis easier on the
users of data
● Geared toward the power-user who has to work with large amounts
of data while avoiding coding as much as practically possible
● Why is it called R?
● the co-creators were Ross & Robert!
● it was trendy to give languages letter names (B, C, S, etc.)
● As R becomes more mainstream, it may have everyday applications
for people in roles requiring them to work with or 'be in the data'
R – What can it be used for?
● For presenting & solving data-oriented problems
● Exploratory Analysis
● discovering data about the data
● clustering & visualizing data
● quickly building summaries of the data being worked with
● Wrangling/Munging & re-shaping data
● working with structured & unstructured data
● sub-setting, filtering, and merging data
● making data 'tidy' – datasets that facilitate some kind of analysis
● dplyr & tidyr Packages popular
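
The tasks on this slide can be sketched in a few lines. This example is an illustration only: it uses base R and the built-in mtcars dataset so that it runs without extra packages, and the lookup table of engine labels is made up. The dplyr package offers equivalent verbs for the same steps.

```r
# mtcars is a small example dataset that ships with R.
cars <- mtcars
cars$model <- rownames(cars)

# Exploratory analysis: inspect the structure and build a quick summary.
str(cars)
summary(cars$mpg)

# Sub-setting and filtering: keep fuel-efficient cars and three columns.
efficient <- subset(cars, mpg > 25, select = c(model, mpg, cyl))

# Merging: join a small lookup table on the cylinder count.
# The labels below are made up for illustration.
engine_labels <- data.frame(
  cyl = c(4, 6, 8),
  engine = c("small", "medium", "large")
)
labelled <- merge(efficient, engine_labels, by = "cyl")

# Summarizing: average miles per gallon for each cylinder count.
avg_mpg <- aggregate(mpg ~ cyl, data = cars, FUN = mean)

print(labelled)
print(avg_mpg)
```

Each step returns an ordinary data frame, so the result of one step can be fed straight into the next.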
R – What can it be used for?
● Predictive Analytics & Machine Learning
● modeling, sampling, forecasting, trending, regression
● caret, h2o, quantmod Packages popular
● Data Visualization
● powerful, publication-quality graphing & plotting Packages
● ggplot2, leaflet, and shiny Packages popular
● shiny example: Where are the so-called 'SuperZIPs'?
● US postal codes scored on a scale of 0-100, 100 being highest
● score is a function of median household income and education level
● Top 5% are deemed the 'SuperZIPs'
● click to see the R + shiny powered Interactive data map
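
The SuperZIP scoring idea can be sketched as follows. This is not the formula behind the interactive map: the data is randomly generated, and the scoring rule (the average of the income and education percentile ranks) is an assumption chosen to match the description on this slide. The column names and the pct_rank helper are hypothetical.

```r
set.seed(42)  # make the made-up data reproducible

# Synthetic postal codes with made-up income and education figures.
n <- 200
zips <- data.frame(
  zip = sprintf("%05d", sample(10000:99999, n)),
  median_income = round(runif(n, 20000, 200000)),
  pct_college = round(runif(n, 5, 90), 1)
)

# Hypothetical helper: percentile rank on a 0-100 scale.
pct_rank <- function(x) 100 * (rank(x) - 1) / (length(x) - 1)

# Assumed scoring rule: average of the two percentile ranks.
zips$score <- (pct_rank(zips$median_income) + pct_rank(zips$pct_college)) / 2

# Flag the top 5% of scores as 'SuperZIPs'.
cutoff <- quantile(zips$score, 0.95)
zips$superzip <- zips$score >= cutoff

# Show the five highest-scoring postal codes.
head(zips[order(-zips$score), ], 5)
```

A shiny app would wrap a table like this in an interactive map; the scoring itself is plain data-frame arithmetic.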
Recap & Resources
● R is an open-source environment that can be used for complex Data
'work'
● essential part of a Data Scientist's Toolbox
● Also a functional programming language
● can be used to create programs to automate routine, repetitive data
tasks and for general software development
● Becoming a mainstream tool
● benefiting from increased commercial support
● maturing ecosystem of Packages
● Agility, flexibility, growing talent pool, & low cost of ownership all a
part of R's appeal
Recap & Resources
● Where to learn more...
● The R Homepage: https://www.r-project.org
● RStudio: https://www.rstudio.com/products/RStudio
● CRAN: https://cran.r-project.org
● Oracle and R: http://bit.ly/2dUC24a
● Microsoft and R: http://bit.ly/2e5CT5m
● The R Consortium: https://www.r-consortium.org
● Playlist of R video tutorials: http://bit.ly/1iRcgyn
● Free Courses
● https://www.coursera.org/learn/r-programming
● https://www.datacamp.com/courses/free-introduction-to-r

