Building 1 million predictions per second using SQL-R

Amit Banerjee, Senior PM Manager at Microsoft
CIS Tiger
Agenda
• Data Science Process
• Bringing Analytics to Data
• Demo Using Lending Club Data
• Optimization Tips
Data Science Process – CRISP-DM
CRISP-DM model: a general data science framework
Jack Henry
A leading provider of banking solutions for credit unions across the Americas
[Diagram: 20M vehicle loans (Age, Original Balance, Interest Rate, Loan Remaining Months, Credit Score) held in SQL Server with In-memory OLTP and ColumnStore; in-database analytics at scale with R; PowerBI dashboard for the business user. Stages: Prepare for analytics, Store predictions, Visualize.]
Using SQL Server R Services
Bringing Analytics to the Data
• Data already in SQL
• Use T-SQL know-how for ETL
• Use in-memory OLTP and columnstore indexing to speed up ETL
• Use the RevoScaleR package for parallelism and scale
Making the data travel
• Data sources not in SQL
• Data sinks not in SQL
• Complex ETL needed
• Long-running R script
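As a minimal sketch of the columnstore point in the first list above, the statements below create a loan table and add a clustered columnstore index to it. The table name dbo.LoanData, the index name, and the column names and types are assumptions for illustration (the columns loosely follow the loan attributes mentioned earlier); they are not taken from the demo.

```sql
-- Hypothetical table: names and types are assumed, not from the deck.
CREATE TABLE dbo.LoanData
(
    loan_id               int   NOT NULL,
    loan_age              int   NOT NULL,
    original_balance      float NOT NULL,
    interest_rate         float NOT NULL,
    loan_remaining_months int   NOT NULL,
    credit_score          int   NOT NULL
);
GO

-- Store the whole table in columnar format, which generally favors
-- large scans and aggregations of the kind used to prepare data for analytics.
CREATE CLUSTERED COLUMNSTORE INDEX cci_LoanData ON dbo.LoanData;
GO
```

Whether a columnstore index, a memory-optimized table, or both fit best depends on the read and write pattern of the workload.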
sp_execute_external_script
EXEC sp_execute_external_script
      @language = N'R'
    , @script = N'iris_data <- iris;'
    , @input_data_1 = N''
    , @output_data_1_name = N'iris_data'
WITH RESULT SETS (("Sepal.Length" float not null,
    "Sepal.Width" float not null,
    "Petal.Length" float not null,
    "Petal.Width" float not null,
    "Species" varchar(100)));
GO
sp_execute_external_script
    @language = N'language'
    , @script = N'script'
    [ , @input_data_1 = N'input_data_1' ]
    [ , @input_data_1_name = N'input_data_1_name' ]
    [ , @output_data_1_name = N'output_data_1_name' ]
    [ , @parallel = 0 | 1 ]
    [ , @params = N'@parameter_name data_type [ OUT | OUTPUT ] [ ,...n ]' ]
    [ , @parameter1 = 'value1' [ OUT | OUTPUT ] [ ,...n ] ]
    [ WITH <execute_option> ]
[;]

<execute_option> ::=
{
    { RESULT SETS UNDEFINED }
  | { RESULT SETS NONE }
  | { RESULT SETS ( <result_sets_definition> ) }
}

<result_sets_definition> ::=
{
    (
        { column_name
          data_type
          [ COLLATE collation_name ]
          [ NULL | NOT NULL ] }
        [ ,...n ]
    )
  | AS OBJECT
        [ db_name . [ schema_name ] . | schema_name . ]
        { table_name | view_name | table_valued_function_name }
  | AS TYPE [ schema_name. ]table_type_name
}
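To make the syntax above more concrete, here is a small sketch that feeds a query to R through @input_data_1 and reads a scalar back through an OUTPUT parameter declared in @params. The table dbo.LoanData and its interest_rate column are hypothetical and assumed to be of type float; only the procedure and its parameters come from the syntax above.

```sql
-- Hypothetical input: dbo.LoanData(interest_rate float) is assumed, not from the deck.
DECLARE @avg_rate float;

EXEC sp_execute_external_script
      @language = N'R'
    , @script = N'
        # InputDataSet is the default data frame name for @input_data_1.
        # The R variable avg_rate maps to the @avg_rate parameter declared in @params.
        avg_rate <- mean(InputDataSet$interest_rate)
      '
    , @input_data_1 = N'SELECT interest_rate FROM dbo.LoanData WHERE interest_rate IS NOT NULL'
    , @params = N'@avg_rate float OUTPUT'
    , @avg_rate = @avg_rate OUTPUT;

-- The value computed in R is now available to T-SQL.
SELECT @avg_rate AS avg_interest_rate;
```

Because the script does not assign an output data frame, the call itself returns no result set; the value travels back only through the OUTPUT parameter.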
Fast Models
Correct attribute selection
Fast Ingestion
Fast Reads
Uniform resource usage
SQL Server as Scoring Engine
Deployment Using:
• Triggers
• PowerShell scripts
• SQL Server Agent jobs
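One way the scoring-engine idea could look is a stored procedure that loads a serialized model and scores new rows in-database, which a trigger, PowerShell script, or SQL Server Agent job can then call. This is a sketch under several assumptions, not the demo's actual code: the tables dbo.LoanModels, dbo.NewLoans, and dbo.LoanPredictions, their columns, and the model name are hypothetical, and the model is assumed to be a RevoScaleR rxLogit model saved with R's serialize().

```sql
-- Hypothetical objects (assumed, not from the deck):
--   dbo.LoanModels(model_name nvarchar(100), model varbinary(max))
--   dbo.NewLoans(loan_id, interest_rate, credit_score, ...)
--   dbo.LoanPredictions(loan_id, charge_off_probability)
CREATE PROCEDURE dbo.ScoreNewLoans
AS
BEGIN
    -- Load the serialized model (assumed to be an rxLogit model stored via serialize()).
    DECLARE @model varbinary(max) =
        (SELECT TOP (1) model FROM dbo.LoanModels WHERE model_name = N'charge_off');

    -- Score the new loans in R and store the predictions in a table.
    INSERT INTO dbo.LoanPredictions (loan_id, charge_off_probability)
    EXEC sp_execute_external_script
          @language = N'R'
        , @script = N'
            mod <- unserialize(as.raw(model))
            scores <- rxPredict(mod, data = InputDataSet, type = "response")
            # rxPredict returns a data frame; its first column holds the predictions.
            OutputDataSet <- data.frame(loan_id = InputDataSet$loan_id,
                                        charge_off_probability = scores[[1]])
          '
        , @input_data_1 = N'SELECT loan_id, interest_rate, credit_score FROM dbo.NewLoans'
        , @params = N'@model varbinary(max)'
        , @model = @model;
END;
GO
```

A scheduled job step would then only need to run EXEC dbo.ScoreNewLoans; the columns selected in @input_data_1 must match the variables the model was trained on.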
DEMO
• Using the public Lending Club dataset
• Using a G5 instance of the Azure Data Science VM (DSVM)
• Following the data science process using SQL Server 2016 R Services
References
Loan Classification using SQL Server 2016 R Services
A walkthrough of Loan Classification using SQL Server 2016 R Services
Using MicrosoftML in SQL-Server
GitHub SQL Server Samples
Microsoft Data Amp
WHERE DATA GETS TO WORK
Put data, analytics and artificial intelligence into
the heart of your solutions. Get the latest on big
data and machine learning innovations.
Join us online April 19, 2017 at 8AM PT
microsoft.com/data-amp

Editor's Notes

  1. Use case: Predict vehicle loan charge-off (default) based on attributes such as interest rate and credit score.
     Input: A subset of 8 million rows of vehicle loan data in SQL Server, with columns including branch location, customer profile, interest rate, and loan age.
     Expected result: A probability score for each loan being charged off (the higher the score, the higher the probability of charge-off).
     Build a PowerBI report that uses the probability score to show the health of vehicle loans across branches.
     Build a what-if scenario in the business application.