SlideShare a Scribd company logo
1 of 18
Extending the R Language
to the Enterprise with Spotfire and TERR
Lou Bajuk-Yorgan
Sr. Dir., Product Management
TIBCO Spotfire
lbajuk@tibco.com
@loubajuk
© Copyright 2000-2014 TIBCO Software Inc.
1
Extending the Reach of R to the Enterprise
• TIBCO, S+, and embracing R in Spotfire
• Challenges of R for Enterprise applications
• TIBCO Enterprise Runtime for R (TERR)
• Benefits for organizations (and individuals) who use R
• Spotfire & TERR
• TERR integration & performance
• Learn more and try it yourself
- 2
Our Journey to TERR
• John Chambers developed the S language at Bell Labs
– Starting in the mid 70’s
• Insightful (Statsci) founded to commercial S as S+ in 1987
– The ―plus‖: statistical libraries, documentation, and support
– Later focus on commercial users, ease of use, server integration
• R: development begun by Ross Ihaka and Robert Gentleman at University of
Auckland in mid 90’s
• Insightful acquired by TIBCO in 2008
– Spotfire (for Data Discovery and Visualization) acquired in 2007
• Focus shifted to applying Predictive Analytics in Spotfire
– Step 1: Embrace R
- 3
Embracing R
• Spotfire Statistics Server
– Integration of R & S+ into Spotfire
applications
• Later added SAS® & MATLAB®
– Leverage the interactive visualizations
of Spotfire
• Contribute to the R community
• Well received—but our Enterprise
customers need more
– R provides tremendous benefits to
statisticians
– But large enterprises are often
challenged to leverage that value
- 4
© Copyright 2000-2013 TIBCO Software Inc.
• Core R engine struggles with Big Data
– Customers don’t use R, or reimplement R code in specialized libraries or other languages
– Lose agility & consistency, delay time to production, lose opportunities
• R was not built for enterprise usage and integration
– Built as an academic tool for research and teaching
– Software vendors attempting to use R in ways it was never intended
• GPL great for statisticians, but limits enterprise innovation and investment
– Viral open source licensing risks commercial IP
– Large vendors avoid tight integration due to open source concerns
– Leads to complex installations for enterprise customers
• Free to acquire, but costly to maintain
– Version incompatibilities, variable quality in packages
– Lack of enterprise-level technical support
Enterprise Challenges for R
5
© Copyright 2000-2013 TIBCO Software Inc.
• Unique, enterprise-quality implementation of the R language
– Fundamentally different
• New architecture, developed from the ground up
• Based on our long history and expertise with S+
• Faster, more robust and more memory-efficient than R
– TIBCO IP: Not open source/GPL
• Independent implementation
• Licensable for embedding and redistribution by partners
• Enables implementation of transparent big data handling
– Broad compatibility with R functions and 1400+ CRAN packages
• Ongoing effort to broaden our coverage of R
• Extends the Reach of R to the Enterprise
– Develop in R, deploy on TERR
– Rapidly iterate prototyping to production without recoding/retesting—more rapidly respond to
changing business conditions
– Easily integrate R-language analytics consistently across organization—into grids, BI
applications, event-driven analytics, etc.
TIBCO Enterprise Runtime for R (TERR)
© Copyright 2000-2013 TIBCO Software Inc.
Leveraging TERR
Embeddable
TERR Engine
Custom (tight) integration, batch, existing grids, etc.
• Faster than R, more robust, better memory management, fully
supported
• Low level APIs for tight integration
• Integrated into TIBCO products: CEP, Cloud Compute, …
TERR in
Statistics Services
Distributed analytics
• Managed pools of engines
• Load balancing, queuing, failover, parallelization, etc.
• High level APIs for loose integration, data i/o (C#, Java)
• Central management of analytics, R packages
TERR in Spotfire
Ad hoc tools and interactive applications powered by advanced analytics
• Spotfire Analytics platform: interactive visualization & data
discovery, easily build and share applications, broad data access, etc.
Easily provide targeted, relevant predictive analytics to business users to
improve decision making
• Ensure compliance & proper usage
• Share best practices and consistent workflows
• Get the answer & do ―What If?‖ analyses when needed
• Leverage investments in R, S+, SAS, MATLAB, …
Powerful Predictive Analytics tools for Spotfire analysts
• Integrated into Spotfire workflows
• Easily create, evaluate, and share Predictive Models
• Add Forecasts with a single click
Benefits of Predictive Analytics to a spectrum of users
• Increase confidence & effectiveness in decision-making
– Reduce uncertainty
– Discover meaningful patterns, important data
– Maximize ROI
• Anticipate and react to emerging trends
• Reduce/manage risk
– Scenario planning, forecasts, fraud detection
• Forecast specific behavior, preemptively act on it
– Increase upsell, decrease churn
Predictive Analytics with Spotfire
Providing Value for individuals who use R
• Not seeking to displace R from statistician’s
desktops
– Enterprise platform for the deployment and
integration of your work—without having to rewrite
it!
• Contribute to the R community
– Sponsor useR conferences, contribute to R
Foundation
– Contribute bug reports and propose fixes to R core
• Contribute packages to CRAN
– As we port from S+ or develop for TERR
• Supports ―Develop in Open Source R, Deploy
on TERR‖
• E.g., splusTimeSeries, splusTimeDate, sjdbc
• TERR Developer Edition
– Full version of TERR engine for testing code prior to
deployment
• Compatible with RStudio & ESS Emacs
– Free for non-production use
– Supported through Community site
-
Example 1: TERR vs. R Raw Performance
One specific example
• Non-optimal, non-vectorized, real-world R script
• For loop with row by row processing
for (i in seq(1,length=nrow(df))) {
…process each customer record…
}
Results
• TERR is ~35x faster for 50K rows, 150x faster for 500K rows
• No code modification required
We are looking for more real-world performance tests!
• On average 2-10x faster than R in microtests
TERR vs. R Performance: SVMs from e1071
TERR vs. R Performance: GLM Model Fitting and Scoring
Model Fitting: 5 Million Rows Model Scoring: 20 Million Rows
• Forecast Tool
– Easily add Forecasts to
Visualizations by right click menu
– Advanced users can tune settings
– Uses embedded TERR engine
• Benefits
– Extend the power of Predictive
Analytics for ad hoc analysis to all
Spotfire users
– Easy entry point to Spotfire
Predictive Analytics
Example 2: Spotfire Forecast Tool
• Event-Driven analysis in TIBCO Spotfire
Event Analytics
– Process monitoring, analysis, and
optimization
• Apply predictive models in real-time
decision making
– Best marketing offer
– Customer churn
– Predictive Maintenance
– Yield optimization
• Rapidly develop and iterate models in
production
– Respond to changing opportunities
and threats
TERR integration with TIBCO StreamBase
15
TIBCO Cloud Compute Grid
• High performance computing on the cloud
– Available on TIBCO Cloud Marketplace
– TERR, Java and .NET computations
• Robust DataSynapse GridServer architecture
– Used by Wall Street to manage 10K’s nodes
– Java, .NET, and REST APIs (JSON)
• Perfect for pure computational work
– Vastly easier to use for applications like Monte Carlo
simulations than Map-Reduce
– Run complex statistical models multiple orders of
magnitude faster than open source R on a single
computer
– Unparalleled scalability without upfront capital
investment
• Easy to get started
– Uses your Amazon EC2 account
Demos
• TERR in Spotfire
– Forecast Tool
– Sales Forecasting (A simple application)
• Data Functions: using the R language in Spotfire
• Calling TERR from Spotfire Expression Language
– Creating Expression Functions
– Fraud Detection (A more complex application)
• TERR in RStudio
• TERR Community at TIBCOmmunity.com
– Resources, FAQs, Forums
– Details of R coverage
– Product documentation & download
– More info at spotfire.tibco.com/terr
• TERR Developer Edition
– Full version of TERR engine for testing code prior to deployment
– Supported through TIBCOmmunity, download via tap.tibco.com
• TIBCO Cloud Compute Grid
– https://marketplace.cloud.tibco.com
• Presentations: http://www.slideshare.net/loubajukyorgan/presentations
• We want your feedback and input!
– Real world performance tests
– Package & R coverage prioritization
– Via TERR Community, or contact me lbajuk@tibco.com or @loubajuk
TIBCO Spotfire Day by Perkin Elmer
• PerkinElmer San Jose Tech Center
• Tuesday May 6th, 8:30 AM to 3 PM
• Flyers available

More Related Content

More from Lou Bajuk

Embracing data science for smarter analytics apps
Embracing data science for smarter analytics appsEmbracing data science for smarter analytics apps
Embracing data science for smarter analytics appsLou Bajuk
 
EARL Sept 2016 R consortium
EARL Sept 2016 R consortiumEARL Sept 2016 R consortium
EARL Sept 2016 R consortiumLou Bajuk
 
R in BI and Streaming Applications for useR 2016
R in BI and Streaming Applications for useR 2016R in BI and Streaming Applications for useR 2016
R in BI and Streaming Applications for useR 2016Lou Bajuk
 
Applying the R Language to BI and Real Time Applications
Applying the R Language to BI and Real Time ApplicationsApplying the R Language to BI and Real Time Applications
Applying the R Language to BI and Real Time ApplicationsLou Bajuk
 
Applying R in BI and Real Time applications EARL London 2015
Applying R in BI and Real Time applications EARL London 2015Applying R in BI and Real Time applications EARL London 2015
Applying R in BI and Real Time applications EARL London 2015Lou Bajuk
 
Extending the R language to BI and Real-time Applications JSM 2015
Extending the R language to BI and Real-time Applications JSM 2015Extending the R language to BI and Real-time Applications JSM 2015
Extending the R language to BI and Real-time Applications JSM 2015Lou Bajuk
 
Using the R Language in BI and Real Time Applications (useR 2015)
Using the R Language in BI and Real Time Applications (useR 2015)Using the R Language in BI and Real Time Applications (useR 2015)
Using the R Language in BI and Real Time Applications (useR 2015)Lou Bajuk
 
Real time applications using the R Language
Real time applications using the R LanguageReal time applications using the R Language
Real time applications using the R LanguageLou Bajuk
 
The Importance of an Analytics Platform
The Importance of an Analytics PlatformThe Importance of an Analytics Platform
The Importance of an Analytics PlatformLou Bajuk
 
TERR in BI and Real Time applications
TERR in BI and Real Time applicationsTERR in BI and Real Time applications
TERR in BI and Real Time applicationsLou Bajuk
 
Software Testing and the R language
Software Testing and the R languageSoftware Testing and the R language
Software Testing and the R languageLou Bajuk
 
The Compatibility Challenge:Examining R and Developing TERR
The Compatibility Challenge:Examining R and Developing TERRThe Compatibility Challenge:Examining R and Developing TERR
The Compatibility Challenge:Examining R and Developing TERRLou Bajuk
 
Deploying R in BI and Real time Applications
Deploying R in BI and Real time ApplicationsDeploying R in BI and Real time Applications
Deploying R in BI and Real time ApplicationsLou Bajuk
 
Extending the Reach of R to the Enterprise with TERR and Spotfire
Extending the Reach of R to the Enterprise with TERR and SpotfireExtending the Reach of R to the Enterprise with TERR and Spotfire
Extending the Reach of R to the Enterprise with TERR and SpotfireLou Bajuk
 
Sannella use r2013-terr-memory-management
Sannella use r2013-terr-memory-managementSannella use r2013-terr-memory-management
Sannella use r2013-terr-memory-managementLou Bajuk
 
Extend the Reach of R to the Enterprise (for useR! 2013)
Extend the Reach of R to the Enterprise (for useR! 2013)Extend the Reach of R to the Enterprise (for useR! 2013)
Extend the Reach of R to the Enterprise (for useR! 2013)Lou Bajuk
 

More from Lou Bajuk (16)

Embracing data science for smarter analytics apps
Embracing data science for smarter analytics appsEmbracing data science for smarter analytics apps
Embracing data science for smarter analytics apps
 
EARL Sept 2016 R consortium
EARL Sept 2016 R consortiumEARL Sept 2016 R consortium
EARL Sept 2016 R consortium
 
R in BI and Streaming Applications for useR 2016
R in BI and Streaming Applications for useR 2016R in BI and Streaming Applications for useR 2016
R in BI and Streaming Applications for useR 2016
 
Applying the R Language to BI and Real Time Applications
Applying the R Language to BI and Real Time ApplicationsApplying the R Language to BI and Real Time Applications
Applying the R Language to BI and Real Time Applications
 
Applying R in BI and Real Time applications EARL London 2015
Applying R in BI and Real Time applications EARL London 2015Applying R in BI and Real Time applications EARL London 2015
Applying R in BI and Real Time applications EARL London 2015
 
Extending the R language to BI and Real-time Applications JSM 2015
Extending the R language to BI and Real-time Applications JSM 2015Extending the R language to BI and Real-time Applications JSM 2015
Extending the R language to BI and Real-time Applications JSM 2015
 
Using the R Language in BI and Real Time Applications (useR 2015)
Using the R Language in BI and Real Time Applications (useR 2015)Using the R Language in BI and Real Time Applications (useR 2015)
Using the R Language in BI and Real Time Applications (useR 2015)
 
Real time applications using the R Language
Real time applications using the R LanguageReal time applications using the R Language
Real time applications using the R Language
 
The Importance of an Analytics Platform
The Importance of an Analytics PlatformThe Importance of an Analytics Platform
The Importance of an Analytics Platform
 
TERR in BI and Real Time applications
TERR in BI and Real Time applicationsTERR in BI and Real Time applications
TERR in BI and Real Time applications
 
Software Testing and the R language
Software Testing and the R languageSoftware Testing and the R language
Software Testing and the R language
 
The Compatibility Challenge:Examining R and Developing TERR
The Compatibility Challenge:Examining R and Developing TERRThe Compatibility Challenge:Examining R and Developing TERR
The Compatibility Challenge:Examining R and Developing TERR
 
Deploying R in BI and Real time Applications
Deploying R in BI and Real time ApplicationsDeploying R in BI and Real time Applications
Deploying R in BI and Real time Applications
 
Extending the Reach of R to the Enterprise with TERR and Spotfire
Extending the Reach of R to the Enterprise with TERR and SpotfireExtending the Reach of R to the Enterprise with TERR and Spotfire
Extending the Reach of R to the Enterprise with TERR and Spotfire
 
Sannella use r2013-terr-memory-management
Sannella use r2013-terr-memory-managementSannella use r2013-terr-memory-management
Sannella use r2013-terr-memory-management
 
Extend the Reach of R to the Enterprise (for useR! 2013)
Extend the Reach of R to the Enterprise (for useR! 2013)Extend the Reach of R to the Enterprise (for useR! 2013)
Extend the Reach of R to the Enterprise (for useR! 2013)
 

Recently uploaded

Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 

Recently uploaded (20)

Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

Extending the Reach of the R Language with TERR and Spotfire

  • 1. Extending the R Language to the Enterprise with Spotfire and TERR Lou Bajuk-Yorgan Sr. Dir., Product Management TIBCO Spotfire lbajuk@tibco.com @loubajuk © Copyright 2000-2014 TIBCO Software Inc. 1
  • 2. Extending the Reach of R to the Enterprise • TIBCO, S+, and embracing R in Spotfire • Challenges of R for Enterprise applications • TIBCO Enterprise Runtime for R (TERR) • Benefits for organizations (and individuals) who use R • Spotfire & TERR • TERR integration & performance • Learn more and try it yourself - 2
  • 3. Our Journey to TERR • John Chambers developed the S language at Bell Labs – Starting in the mid 70’s • Insightful (Statsci) founded to commercial S as S+ in 1987 – The ―plus‖: statistical libraries, documentation, and support – Later focus on commercial users, ease of use, server integration • R: development begun by Ross Ihaka and Robert Gentleman at University of Auckland in mid 90’s • Insightful acquired by TIBCO in 2008 – Spotfire (for Data Discovery and Visualization) acquired in 2007 • Focus shifted to applying Predictive Analytics in Spotfire – Step 1: Embrace R - 3
  • 4. Embracing R • Spotfire Statistics Server – Integration of R & S+ into Spotfire applications • Later added SAS® & MATLAB® – Leverage the interactive visualizations of Spotfire • Contribute to the R community • Well received—but our Enterprise customers need more – R provides tremendous benefits to statisticians – But large enterprises are often challenged to leverage that value - 4
  • 5. © Copyright 2000-2013 TIBCO Software Inc. • Core R engine struggles with Big Data – Customers don’t use R, or reimplement R code in specialized libraries or other languages – Lose agility & consistency, delay time to production, lose opportunities • R was not built for enterprise usage and integration – Built as an academic tool for research and teaching – Software vendors attempting to use R in ways it was never intended • GPL great for statisticians, but limits enterprise innovation and investment – Viral open source licensing risks commercial IP – Large vendors avoid tight integration due to open source concerns – Leads to complex installations for enterprise customers • Free to acquire, but costly to maintain – Version incompatibilities, variable quality in packages – Lack of enterprise-level technical support Enterprise Challenges for R 5
  • 6. © Copyright 2000-2013 TIBCO Software Inc. • Unique, enterprise-quality implementation of the R language – Fundamentally different • New architecture, developed from the ground up • Based on our long history and expertise with S+ • Faster, more robust and more memory-efficient than R – TIBCO IP: Not open source/GPL • Independent implementation • Licensable for embedding and redistribution by partners • Enables implementation of transparent big data handling – Broad compatibility with R functions and 1400+ CRAN packages • Ongoing effort to broaden our coverage of R • Extends the Reach of R to the Enterprise – Develop in R, deploy on TERR – Rapidly iterate prototyping to production without recoding/retesting—more rapidly respond to changing business conditions – Easily integrate R-language analytics consistently across organization—into grids, BI applications, event-driven analytics, etc. TIBCO Enterprise Runtime for R (TERR)
  • 7. © Copyright 2000-2013 TIBCO Software Inc. Leveraging TERR Embeddable TERR Engine Custom (tight) integration, batch, existing grids, etc. • Faster than R, more robust, better memory management, fully supported • Low level APIs for tight integration • Integrated into TIBCO products: CEP, Cloud Compute, … TERR in Statistics Services Distributed analytics • Managed pools of engines • Load balancing, queuing, failover, parallelization, etc. • High level APIs for loose integration, data i/o (C#, Java) • Central management of analytics, R packages TERR in Spotfire Ad hoc tools and interactive applications powered by advanced analytics • Spotfire Analytics platform: interactive visualization & data discovery, easily build and share applications, broad data access, etc.
  • 8. Easily provide targeted, relevant predictive analytics to business users to improve decision making • Ensure compliance & proper usage • Share best practices and consistent workflows • Get the answer & do ―What If?‖ analyses when needed • Leverage investments in R, S+, SAS, MATLAB, … Powerful Predictive Analytics tools for Spotfire analysts • Integrated into Spotfire workflows • Easily create, evaluate, and share Predictive Models • Add Forecasts with a single click Benefits of Predictive Analytics to a spectrum of users • Increase confidence & effectiveness in decision-making – Reduce uncertainty – Discover meaningful patterns, important data – Maximize ROI • Anticipate and react to emerging trends • Reduce/manage risk – Scenario planning, forecasts, fraud detection • Forecast specific behavior, preemptively act on it – Increase upsell, decrease churn Predictive Analytics with Spotfire
  • 9. Providing Value for individuals who use R • Not seeking to displace R from statistician’s desktops – Enterprise platform for the deployment and integration of your work—without having to rewrite it! • Contribute to the R community – Sponsor useR conferences, contribute to R Foundation – Contribute bug reports and propose fixes to R core • Contribute packages to CRAN – As we port from S+ or develop for TERR • Supports ―Develop in Open Source R, Deploy on TERR‖ • E.g., splusTimeSeries, splusTimeDate, sjdbc • TERR Developer Edition – Full version of TERR engine for testing code prior to deployment • Compatible with RStudio & ESS Emacs – Free for non-production use – Supported through Community site -
  • 10. Example 1: TERR vs. R Raw Performance One specific example • Non-optimal, non-vectorized, real-world R script • For loop with row by row processing for (i in seq(1,length=nrow(df))) { …process each customer record… } Results • TERR is ~35x faster for 50K rows, 150x faster for 500K rows • No code modification required We are looking for more real-world performance tests! • On average 2-10x faster than R in microtests
  • 11. TERR vs. R Performance: SVMs from e1071
  • 12. TERR vs. R Performance: GLM Model Fitting and Scoring Model Fitting: 5 Million Rows Model Scoring: 20 Million Rows
  • 13. • Forecast Tool – Easily add Forecasts to Visualizations by right click menu – Advanced users can tune settings – Uses embedded TERR engine • Benefits – Extend the power of Predictive Analytics for ad hoc analysis to all Spotfire users – Easy entry point to Spotfire Predictive Analytics Example 2: Spotfire Forecast Tool
  • 14. • Event-Driven analysis in TIBCO Spotfire Event Analytics – Process monitoring, analysis, and optimization • Apply predictive models in real-time decision making – Best marketing offer – Customer churn – Predictive Maintenance – Yield optimization • Rapidly develop and iterate models in production – Respond to changing opportunities and threats TERR integration with TIBCO StreamBase 15
  • 15. TIBCO Cloud Compute Grid • High performance computing on the cloud – Available on TIBCO Cloud Marketplace – TERR, Java and .NET computations • Robust DataSynapse GridServer architecture – Used by Wall Street to manage 10K’s nodes – Java, .NET, and REST APIs (JSON) • Perfect for pure computational work – Vastly easier to use for applications like Monte Carlo simulations than Map-Reduce – Run complex statistical models multiple orders of magnitude faster than open source R on a single computer – Unparalleled scalability without upfront capital investment • Easy to get started – Uses your Amazon EC2 account
  • 16. Demos • TERR in Spotfire – Forecast Tool – Sales Forecasting (A simple application) • Data Functions: using the R language in Spotfire • Calling TERR from Spotfire Expression Language – Creating Expression Functions – Fraud Detection (A more complex application) • TERR in RStudio
  • 17. • TERR Community at TIBCOmmunity.com – Resources, FAQs, Forums – Details of R coverage – Product documentation & download – More info at spotfire.tibco.com/terr • TERR Developer Edition – Full version of TERR engine for testing code prior to deployment – Supported through TIBCOmmunity, download via tap.tibco.com • TIBCO Cloud Compute Grid – https://marketplace.cloud.tibco.com • Presentations: http://www.slideshare.net/loubajukyorgan/presentations • We want your feedback and input! – Real world performance tests – Package & R coverage prioritization – Via TERR Community, or contact me lbajuk@tibco.com or @loubajuk
  • 18. TIBCO Spotfire Day by Perkin Elmer • PerkinElmer San Jose Tech Center • Tuesday May 6th, 8:30 AM to 3 PM • Flyers available

Editor's Notes

  1. “But we needed to do more. This wasn’t enough for Enterprise customers)
  2. Meaningful benchmarks are hardWe can handle how you typically write code.
  3. Benjamin Kreck, 3/;3:Hi all, I’ve done some more performance comparisons between R and TERR: This time for support vector machines coming from e1071. Results can be found in the attached .dxp (R/TERR results are already calculated -> Spotfire can be used offline).The take home message is again: The bigger the data set the higher the performance advantage of TERR compared to R! 
  4. NOTE: TIBCO® Cloud Compute Grid will be using TERR 1.5 at Nov. 2013 release. Date for migration to TERR 2.0 TBD. More info at http://confluence.tibco.com/display/spf/TERR%2C+Grid+Server+and+TIBCO+Cloud+Compute