Building 1 million predictions per second using SQL-R


2,706 views


Using the power of OLTP and data transformation in SQL Server 2016 and advanced analytics in Microsoft R Server, various industries are pushing the boundary of how many transactions per second (tps) they can process for different use cases. In this talk, we walk through the use case of predicting the loan charge-off (loan default) rate, the architecture configuration that enables this use case, and a rich visual dashboard that allows customers to do what-if analysis. Find out how SQL + R allows you to build an “intelligent data warehouse”.

Published in: Technology


  1. CIS Tiger: Agenda • Data Science Process • Bringing Analytics to Data • Demo Using Lending Club Data • Optimization Tips
  2. Data Science Process – CRISP-DM • CRISP-DM Model – General data science framework
  3. Jack Henry: a leading provider of banking solutions for credit unions across the Americas • In-memory OLTP • ColumnStore • Age, Original Balance, Interest Rate, Loan Remaining Months, Credit Score • 20M Vehicle Loans • PowerBI Dashboard • In-Database Analytics at Scale • R • Business User • Prepare for analytics • Store Predictions • Visualize
  4. Using SQL Server R Services
     Bringing Analytics to the Data: • Data already in SQL • Use T-SQL know-how to do ETL • Use the power of in-memory OLTP and columnstore indexing to speed up ETL • RevoScaleR package to provide parallelism and scale
     Making the data travel: • Data sources not in SQL • Data sinks not in SQL • Complex ETL needed • Long-running R script
  5. sp_execute_external_script

     Example:

         EXEC sp_execute_external_script
               @language = N'R'
             , @script = N'iris_data <- iris;'
             , @input_data_1 = N''
             , @output_data_1_name = N'iris_data'
         WITH RESULT SETS (("Sepal.Length" float not null, "Sepal.Width" float not null,
             "Petal.Length" float not null, "Petal.Width" float not null,
             "Species" varchar(100)));
         GO

     Syntax:

         sp_execute_external_script
             @language = N'language'
             , @script = N'script'
             [ , @input_data_1 = N'input_data_1' ]
             [ , @input_data_1_name = N'input_data_1_name' ]
             [ , @output_data_1_name = N'output_data_1_name' ]
             [ , @parallel = 0 | 1 ]
             [ , @params = N'@parameter_name data_type [ OUT | OUTPUT ] [ ,...n ]' ]
             [ , @parameter1 = 'value1' [ OUT | OUTPUT ] [ ,...n ] ]
             [ WITH <execute_option> ]
         [;]

         <execute_option> ::=
         {
               { RESULT SETS UNDEFINED }
             | { RESULT SETS NONE }
             | { RESULT SETS ( <result_sets_definition> ) }
         }

         <result_sets_definition> ::=
         {
               ( { column_name data_type [ COLLATE collation_name ] [ NULL | NOT NULL ] } [ ,...n ] )
             | AS OBJECT [ db_name . [ schema_name ] . | schema_name . ] { table_name | view_name | table_valued_function_name }
             | AS TYPE [ schema_name. ] table_type_name
         }
  6-10. Fast Models • Correct attribute selection • Fast Ingestion • Fast Reads • Uniform resource usage (slides 6 through 10 carry identical text)
  11. SQL Server as Scoring Engine • Deployment using: • Triggers • PowerShell scripts • SQL Server Agent jobs
  12. DEMO • Using the public Lending Club dataset • Using a G5 instance of the Azure Data Science VM (DSVM) • Following the Data Science Process using SQL Server 2016 R Services
  13. References • Loan Classification using SQL Server 2016 R Services • A walkthrough of Loan Classification using SQL Server 2016 R Services • Using MicrosoftML in SQL-Server • GitHub SQL Server Samples
  14. Microsoft Data Amp: WHERE DATA GETS TO WORK • Put data, analytics and artificial intelligence into the heart of your solutions. Get the latest on big data and machine learning innovations. Join us online April 19, 2017 at 8AM PT • microsoft.com/data-amp
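
The iris example on slide 5 passes an empty input query. As a rough sketch of the "bringing analytics to the data" idea from slide 4, the call below feeds a T-SQL query into the R script instead. The table `dbo.Loans` and its `interest_rate` column are hypothetical names chosen for illustration, and the sketch assumes R Services is installed with `external scripts enabled` turned on. `InputDataSet` and `OutputDataSet` are the default data frame names used by `sp_execute_external_script` when no custom names are supplied.

```sql
-- Sketch only: dbo.Loans and interest_rate are assumed names, not from the deck.
EXEC sp_execute_external_script
      @language = N'R'
    , @script = N'
        # InputDataSet holds the rows returned by @input_data_1.
        # Whatever is assigned to OutputDataSet is returned to SQL Server.
        OutputDataSet <- data.frame(
            mean_rate = mean(InputDataSet$interest_rate),
            n_rows    = nrow(InputDataSet))'
    , @input_data_1 = N'SELECT interest_rate FROM dbo.Loans'
WITH RESULT SETS ((mean_rate FLOAT, n_rows INT));
```

Filtering and joining in the `@input_data_1` query keeps the ETL work in T-SQL, so only the columns the script needs are handed to the R runtime.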

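Slide 11 lists triggers, PowerShell scripts, and SQL Server Agent jobs as ways to deploy SQL Server as a scoring engine. One possible shape for the thing they would call is a stored procedure that loads a serialized R model from a table and scores new rows. This is a minimal sketch, not the presenters' implementation: the tables `dbo.LoanModels` and `dbo.NewLoans`, their columns, and the procedure name are all hypothetical, and the model is assumed to be a `glm` object previously stored with R's `serialize(model, NULL)`.

```sql
-- Sketch only: every object name below is an assumption made for illustration.
CREATE PROCEDURE dbo.ScoreLoans
AS
BEGIN
    -- Fetch the most recent serialized model (assumed to be a glm object).
    DECLARE @model VARBINARY(MAX) =
        (SELECT TOP (1) model FROM dbo.LoanModels ORDER BY created_at DESC);

    EXEC sp_execute_external_script
          @language = N'R'
        , @script = N'
            # Rebuild the R model object from the varbinary parameter.
            mod <- unserialize(as.raw(model))
            # Predicted probability of default for each input row.
            OutputDataSet <- data.frame(
                loan_id   = InputDataSet$loan_id,
                p_default = as.numeric(predict(mod, InputDataSet, type = "response")))'
        , @input_data_1 = N'SELECT loan_id, interest_rate, credit_score FROM dbo.NewLoans'
        , @params = N'@model VARBINARY(MAX)'
        , @model = @model
    WITH RESULT SETS ((loan_id INT, p_default FLOAT));
END;
```

A SQL Server Agent job step or a PowerShell script could then run `EXEC dbo.ScoreLoans;` on a schedule, which keeps the scoring logic in one place regardless of how it is triggered.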