Analytics with R in SQL Server 2016
Chennai SQL Server User Group
Hariharan
Lead Consultant, Your SQL Man (I) Pvt Ltd
Microsoft Certified Trainer
Microsoft Certified Solution Expert
Business Intelligence & Data Management and Analytics
SAP HANA & Business Objects
Active Speaker in DAGEOP DATA DAY
Blog
http://dataap.org/author/hariharanr/
Twitter
@imhariharanr
LinkedIn
hariharan-r-12635640
Topics
• Analytics
• Introduction to R
• Challenges
• R in SQL Server 2016
• Advanced Analytics in R
• Visualization with R
Analytics
• Descriptive: What happened?
• Diagnostic: Why did it happen?
• Predictive: What will happen?
• Prescriptive: What should I do?
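To make the four questions concrete, here is a minimal sketch that answers each one on a toy dataset. It is written in Python purely for illustration; the data, the variable names, and the spend and target figures are all invented and do not come from the talk or its demos.

```python
# Toy data invented for illustration: monthly ad spend and units sold.
ad_spend = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
units = [25.0, 29.0, 33.0, 37.0, 41.0, 45.0]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Descriptive: what happened? Summarize the past.
average_units = mean(units)

# Diagnostic: why did it happen? Measure how units move with ad spend.
slope, intercept = fit_line(ad_spend, units)

# Predictive: what will happen? Forecast units for a planned spend of 22.
forecast = slope * 22.0 + intercept

# Prescriptive: what should I do? Find the spend needed to reach 55 units.
required_spend = (55.0 - intercept) / slope

print(average_units, slope, forecast, required_spend)
```

Each step builds on the previous one: the summary describes the past, the fitted slope explains it, the fitted line extrapolates it, and inverting the line turns a target into an action.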
Introduction to R
• It is open source
• A statistical programming language
• A data visualization tool
• It has a large community of more than 2.5 million users
• Scalable to big data
• Rich application & platform integration
Introduction to R
[Chart: R Packages in CRAN, by year, 2011 to 2016; vertical axis from 0 to 4000]
Demo
Challenges in R
• Data Movement
  • Moving data from the database to R
• Operationalization
  • How do I call an R script from my application?
• Scale and Performance
  • R runs single-threaded and only accommodates datasets that fit into available memory
SQL Server with R
• Data Movement
  • Reduce or eliminate data movement with in-database analytics
• Operationalization
  • T-SQL stored procedures
• Scale and Performance
  • In-memory and columnstore indexes; RevoScaleR supports large datasets and parallel algorithms
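The scale and performance point rests on a general idea: if a statistic can be computed from per-chunk summaries, the data never has to fit in memory at once, and the chunks can be processed in parallel. The sketch below shows that idea in Python rather than R, and it does not use RevoScaleR or any SQL Server API. The helper names (`read_chunks`, `partial_stats`, `combine`) and the data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def read_chunks(values, chunk_size):
    """Yield successive slices, standing in for blocks read from disk or a table."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def partial_stats(chunk):
    """Summarize one chunk as (count, sum, sum of squares)."""
    return len(chunk), sum(chunk), sum(v * v for v in chunk)


def combine(partials):
    """Merge per-chunk summaries into an overall mean and sample variance."""
    n = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    total_sq = sum(p[2] for p in partials)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


# Hypothetical column values 1..100, processed 25 rows at a time.
data = [float(v) for v in range(1, 101)]

# Chunks are independent, so they can be summarized by separate workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(partial_stats, read_chunks(data, chunk_size=25)))

chunked_mean, chunked_variance = combine(partials)
print(chunked_mean, chunked_variance)
```

The thread pool here only demonstrates that the chunks do not depend on each other; it is not a performance claim. The sum-of-squares formula is also kept for brevity and can lose precision on large or badly scaled data, where a numerically stable merge would be preferable.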
SQL Server with R
• It is a new feature in SQL Server 2016
• New Workload
• Integrates the highly popular R language for enterprise customers
• Intelligent & Predictive Applications
R Components
DeployR
• RESTful APIs for easy integration from Java, JavaScript, .NET
• Enterprise authentication &
security
• Horizontal Scaling
ConnectR
• High-speed & direct
connectors
Available for:
• High performance XDF
• SAS, SPSS, delimited & fixed
format text data files
• Hadoop HDFS (text & XDF)
• Teradata Database & Aster
• EDWs and ADWs
• ODBC
DevelopR
• Develop R using familiar
tools
• RTVS
• R Studio
• R Client
ScaleR
• Ready-to-use high
performance big data
analytics
• Fully-parallelized analytics
• Data prep & data distillation
• Descriptive statistics &
statistical tests
• Range of predictive functions
• Wide data sets supported –
thousands of variables
DistributedR
• Distributed computing
framework
• Delivers cross-platform
portability
CRAN
• Open source R interpreter
• Freely available huge range
of R algorithms (packages)
• Huge community of users
Microsoft R Open
• Based on open source R
• High performance math
library to speed up linear
algebra functions
• Checkpoint package to easily
share R code and replicate
results using specific R
package versions
Tools
• Microsoft R Client
• Microsoft R Open
• Visual Studio R Tools
• R Studio
• R GUI
End to End Scenario
Roles:
• Developer
• DBA
• Data Engineer
• Data Scientist
Activities:
• Data exploration and predictive modeling
• Operationalizing the R code
• Managing my server
• Authoring workflows
SQL Server with R
Demo
Advanced Analytics - R & Data Optimization
SQL Server
R Client
• ScaleR Library
• Compute Context
• Parallel Processing
• Algorithm Parameters
Flow
Visualization
http://www.r-graph-gallery.com/all-graphs/
Demo
Thank You
