Revolution R is a commercial product that adds functionality to the open source R programming language. It provides an integrated development environment, improved performance through multi-threaded math, and capabilities for handling big data through interfaces with Hadoop and Netezza. Revolution R also includes tools for interactive debugging, organizing code and data, and performing distributed analytics on large datasets using algorithms in RevoScaleR.
In-Database Analytics Deep Dive with Teradata and Revolution (Revolution Analytics)
Teradata and Revolution Analytics worked together to develop in-database analytical capabilities for Teradata Database. Teradata v14.10 provides a foundation for in-database analytics in Teradata. Revolution Analytics has ported its Revolution R Enterprise (RRE) Version 7.1 to use the in-database capabilities of version 14.10. With RRE inside Teradata, users can run fully parallelized algorithms in each node of the Teradata appliance to achieve performance and data scale heretofore unavailable. We'll get past the marketecture quickly and dive into a “how it really works” presentation, review implications for system configuration and administration, and then take questions from Teradata users who will be charged with deploying and administering Teradata systems as platforms for big data analytics inside the database engine.
Presented by Joseph Rickert at the NYC R Conference, April 25, 2015.
Good data analysis is reproducible. If someone else can’t independently replicate your results from your data, the consequences can be severe. With R, a major challenge for reproducibility is the ever-changing package ecosystem: it's all too easy to develop an R script using packages, only to find that collaborators download later versions of those packages when they attempt to reproduce your results, with unpredictable outcomes!
In this talk I'll introduce the Reproducible R Toolkit and the "checkpoint" package, included with Revolution R Open, and describe some best practices for writing reliable, reproducible R code with packages.
Big Data Analytics on Teradata with Revolution R Enterprise (Bill Jacobs)
Revolution Analytics brings big data analytics to Teradata Database. Presentation from Teradata Partners, October 2013, giving an overview of Revolution R Enterprise for Teradata by Bill Jacobs, Director of Product Marketing, Revolution Analytics.
Performance and Scale Options for R with Hadoop: A comparison of potential ar... (Revolution Analytics)
R and Hadoop go together. In fact, they go together so well that the number of options available can be confusing to IT and data science teams seeking solutions under varying performance and operational requirements.
Which configuration is faster for big files? Which is faster for sharing data and servers among groups? Which eliminates data movement? Which is easiest to manage? Which works best with iterative and multistep algorithms? What are the hardware requirements of each alternative?
This webinar is intended to help new users of R with Hadoop select the best architecture for integrating Hadoop and R, by explaining several popular configurations and their performance potential, workload handling, programming model, and administrative characteristics.
Presenters from Revolution Analytics will describe the options for using Revolution R Open and Revolution R Enterprise with Hadoop, including servers, edge nodes, rHadoop and ScaleR. We’ll then compare each configuration not only on performance but also on programming model, administration, data movement, ease of scaling, mixed-workload handling, and performance for large individual analyses versus mixed workloads.
Analysts predict that the Hadoop market will reach $50.2 billion by 2020.[1] Applications driving these large expenditures are some of the most important workloads for businesses today, including:
• Analyzing clickstream data, including site-side clicks and web media tags.
• Measuring sentiment by scanning product feedback, blog feeds, social media comments, and Twitter streams.
• Analyzing behavior and risk by capturing vehicle telematics.
• Optimizing product performance and utilization by gathering data from built-in sensors.
• Tracking and analyzing people and material movement with location-aware systems.
• Identifying system performance issues and intrusion attempts by analyzing server and network logs.
• Enabling automatic document and speech categorization.
• Extracting learning from digitized images, voice, video, and other media types.
Predictive analytics on large data sets provides organizations with a key opportunity to improve a broad variety of business outcomes, and many have embraced Apache Hadoop as the platform of choice.
In the last few years, large businesses have adopted Apache Hadoop as a next-generation data platform, one capable of managing large data assets in a way that is flexible, scalable, and relatively low cost. However, to realize predictive benefits of big data, organizations must be able to develop or hire individuals with the requisite statistics skills, then provide them with a platform for analyzing massive data assets collected in Hadoop “data lakes.”
As users adopted Hadoop, many discovered that performance limitations and complexity restricted Hadoop’s usefulness for broad predictive analytics. In response, the Hadoop community has focused on the Apache Spark platform to give Hadoop significant performance improvements. With Spark atop Hadoop, users can leverage Hadoop’s big-data management capabilities while achieving new levels of performance by running analytics in Apache Spark.
What remains is a challenge—conquering the complexity of Hadoop when developing predictive analytics applications.
In this white paper, we’ll describe how Microsoft R Server helps data scientists, actuaries, risk analysts, quantitative analysts, product planners, and other R users to capture the benefits of Apache Spark on Hadoop by providing a straightforward platform that eliminates much of the complexity of using Spark and Hadoop to conduct analyses on large data assets.
Presentation given by US Chief Scientist, Mario Inchiosa, at the June 2013 Hadoop Summit in San Jose, CA.
ABSTRACT: Hadoop is rapidly being adopted as a major platform for storing and managing massive amounts of data, and for computing descriptive and query types of analytics on that data. However, it has a reputation for not being a suitable environment for high-performance, complex iterative algorithms such as logistic regression, generalized linear models, and decision trees. At Revolution Analytics we think that reputation is unjustified, and in this talk I discuss the approach we have taken to porting our suite of High Performance Analytics algorithms to run natively and efficiently in Hadoop. Our algorithms are written in C++ and R, and are based on a platform that automatically and efficiently parallelizes a broad class of algorithms called Parallel External Memory Algorithms (PEMAs). This platform abstracts both the inter-process communication layer and the data source layer, so that the algorithms can work in almost any environment in which messages can be passed among processes and with almost any data source. MPI and RPC are two traditional ways to send messages, but messages can also be passed using files, as in Hadoop. I describe how we use the file-based communication choreographed by MapReduce and how we efficiently access data stored in HDFS.
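The PEMA idea is easy to illustrate in plain R: each process reduces its own chunk of data to small sufficient statistics, and a combiner merges them, so no process ever needs the full dataset in memory. Below is a minimal sketch for the mean (our own illustration, not Revolution's implementation):

```r
# Illustrative sketch of a Parallel External Memory Algorithm (PEMA)
# for the mean: each "node" reduces its chunk to sufficient statistics
# (sum, count), and a final step merges them. No step touches all the data.

chunk_stats <- function(chunk) {
  list(sum = sum(chunk), n = length(chunk))
}

merge_stats <- function(a, b) {
  list(sum = a$sum + b$sum, n = a$n + b$n)
}

pema_mean <- function(chunks) {
  combined <- Reduce(merge_stats, lapply(chunks, chunk_stats))
  combined$sum / combined$n
}

# Simulated data split into chunks, as HDFS blocks would be
x <- 1:1000
chunks <- split(x, ceiling(seq_along(x) / 100))
pema_mean(chunks)  # identical to mean(x)
```

In a real deployment the `lapply` step runs as mappers on separate nodes and the merge happens in a reducer; the same split/merge pattern generalizes to regression and other iterative algorithms.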
This session will demonstrate how the all-star line-up featuring R and Storm enables real-time processing on massive data sets; a real home run! The presenters will use actual baseball data and a real-world use case to compose an implementation of the use case as Storm components (spouts, bolts, etc.) and highlight how R can be an effective tool in prototyping a solution. Attendees will leave the session with information that could easily be applied for other use cases such as video game analytics, fraud detection, intrusion detection, and consumer propensity to buy calculations.
The business need for real-time analytics at large scale has focused attention on the use of Apache Storm, but an approach that is sometimes overlooked is the use of Storm and R together. This novel combination of real-time processing with Storm and the practical but powerful statistical analysis offered by R substantially extends the usefulness of Storm as a solution to a variety of business critical problems. By architecting R into the Storm application development process, Storm developers can be much more effective. The aim of this design is not necessarily to deploy faster code but rather to deploy code faster. Just a few lines of R code can be used in place of lengthy Storm code for the purpose of early exploration – you can easily evaluate alternative approaches and quickly make a working prototype.
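As a hypothetical illustration of that prototyping style, the core logic of a Storm counting bolt (tallying events per key as tuples arrive) takes only a few lines of R; all names here are our own:

```r
# Prototype of a Storm "counting bolt" in plain R: maintain running
# counts per key as tuples arrive, one at a time.

new_counter <- function() new.env(parent = emptyenv())

execute <- function(counter, key) {
  # the work a bolt's execute() would perform for each incoming tuple
  prev <- if (exists(key, envir = counter, inherits = FALSE)) {
    get(key, envir = counter)
  } else {
    0
  }
  assign(key, prev + 1, envir = counter)
}

counts <- new_counter()
for (batter in c("ruth", "gehrig", "ruth", "ruth")) execute(counts, batter)
get("ruth", envir = counts)    # 3
get("gehrig", envir = counts)  # 1
```

Once the logic is validated on sample data like this, it can be translated into a Storm bolt, or kept in R and invoked from the topology.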
27 Aug 2013 webinar, High Performance Predictive Analytics in Hadoop and R, presented by Mario E. Inchiosa, PhD, US Data Scientist, and Kathleen Rohrecker, Director of Product Marketing.
R is free software for data analysis and graphics that is similar to SAS and SPSS. Two million people are part of the R Open Source Community. Its use is growing very rapidly and Revolution Analytics distributes a commercial version of R that adds capabilities that are not available in the Open Source version. This 60-minute webinar is for people who are familiar with SAS or SPSS who want to know how R can strengthen their analytics strategy.
Big Data Predictive Analytics with Revolution R Enterprise (Gartner BI Summit...) (Revolution Analytics)
Presented by David Smith, Chief Community Officer, Revolution Analytics, at the Gartner Business Intelligence and Analytics Summit, April 2014.
In this presentation, I'll introduce the open source R language — the modern standard for Data Science — and the enhanced performance, scalability and ease-of-use capabilities of Revolution R Enterprise. Customer case studies will illustrate Revolution R Enterprise as a component of the real-time analytics deployment process, via integration with Hadoop, data warehousing systems and cloud platforms, to implement data-driven end-user applications.
There is one consistent message we hear from customers across industries and around the world: "We would like to reduce our reliance on SAS." In this webinar, we review the top reasons customers cite for moving from SAS to R; the benefits of open source analytics; the challenges of switching; and the tools you will need to build your own roadmap. We review the key differences between SAS and R from the user's perspective, and provide you with the tools to move forward.
High Performance Predictive Analytics in R and Hadoop (DataWorks Summit)
Big Data refers to large volumes of data, both structured and unstructured. Managing and analyzing data at this scale calls for technologies like Hadoop and languages like R.
http://www.techsparks.co.in/thesis-in-big-data-with-r/
An introduction to Microsoft R Services, Microsoft R Open and Microsoft R Server.
This presentation will briefly cover the following:
- Why consider MRO and R Server
- R Server
- MRO
- Microsoft R Services/R Server Platform
- DistributedR
- RevoScaleR/ScaleR
- ConnectR
- DevelopR
- DeployR
- Resources
- References
Learn Business Analytics with R at edureka! (Edureka!)
This is a 6-week course for professionals who aspire to learn the R language for analytics. A practical, hands-on approach to learning is followed to provide real-world experience and make you think like an analyst. The course covers not only the basic concepts but also advanced topics such as data visualization, data mining, model building in R, and web analytics.
Revolution R Enterprise - 100% R and More Webinar Presentation (Revolution Analytics)
R users already know why the R language is the lingua franca of statisticians today: because it's the most powerful statistical language in the world. Revolution Analytics builds on the power of open source R, and adds performance, productivity and integration features to create Revolution R Enterprise. In this presentation, author and blogger David Smith will introduce the additional capabilities of Revolution R Enterprise.
100% R and More: Plus What's New in Revolution R Enterprise 6.0 (Revolution Analytics)
R users already know why the R language is the lingua franca of statisticians today: because it's the most powerful statistical language in the world. Revolution Analytics builds on the power of open source R, and adds performance, productivity and integration features to create Revolution R Enterprise. In this webinar, author and blogger David Smith will introduce the additional capabilities of Revolution R Enterprise.
VP of Product Development Dr. Sue Ranney will also provide an overview of the features introduced in Revolution R Enterprise 6.0, including:
1. Big Data Generalized Linear Model, the new RevoScaleR function that provides a fast, scalable, distributable implementation of generalized linear models, offering impressive speed-ups relative to glm on in-memory data frames
2. Platform LSF Cluster Support, which allows you to create a distributed compute context for the Platform LSF workload manager
3. Azure Burst support added to RxHpcServer
4. Updated R engine (R 2.14.2)
5. Ability to use RevoScaleR analysis functions with non-xdf data sources such as SAS, SPSS or text
6. New methods for RxXdfData data sources including head, tail, names, dim, colnames, length, str, and formula
7. New function rxRoc for generating ROC curves
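For readers new to RevoScaleR, the in-memory open-source analogue of the new big-data GLM is stats::glm(); rxGlm fits the same class of models chunk-wise over data too large for memory. A minimal in-memory sketch of the kind of model involved, on simulated data (our own example, not from the release notes):

```r
# In-memory analogue of RevoScaleR's big-data GLM: a logistic
# regression fit with stats::glm(). rxGlm fits the same model family
# chunk-wise over out-of-memory .xdf data.

set.seed(42)
d <- data.frame(x = rnorm(200))
d$y <- rbinom(200, 1, plogis(0.5 + 2 * d$x))  # true intercept 0.5, slope 2

fit <- glm(y ~ x, data = d, family = binomial())
coef(fit)  # estimates should land near the true values
```

The point of the scalable version is that the same formula interface applies when `data` is an .xdf file of millions of rows distributed across a cluster.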
In this presentation from Revolution Analytics, Bill Jacobs presents: Are You Ready for Big Data Analytics?
"Revolution Analytics delivers advanced analytics software at half the cost of existing solutions. By building on open source R—the world's most powerful statistics software—with innovations in big data analysis, integration and user experience, Revolution Analytics meets the demands and requirements of modern data-driven businesses."
Learn more: http://www.revolutionanalytics.com
Watch the presentation video: http://wp.me/p3RLEV-12S
Turbo-Charge Your Analytics with IBM Netezza and Revolution R Enterprise: A S... (Revolution Analytics)
Everyone involved in high-stakes analytics wants power, speed and flexibility regardless of the size of the data set and complexity of the analysis. Trailblazing organizations that have deployed IBM Netezza Analytics with their IBM Netezza data warehouse appliances (TwinFin) with Revolution R Enterprise are getting all three.
Applications in R - Success and Lessons Learned from the Marketplace (Revolution Analytics)
Adoption of the R language has grown rapidly in the last few years, and is ranked as the number-one data science language in several surveys. This accelerating R adoption curve has been driven by the Big Data revolution, and the fact that so many data scientists — having learned R at university — are actively unlocking the secrets hidden in these new, vast data troves.
In this webinar David Smith, Chief Community Officer, will take a look at the growth of R and the innovative uses of R in business, government and non-profit sectors. Then Neera Talbert, Vice President, Professional Services will take you into the trenches of recent customer deployments and share best practices and pitfalls to avoid in deploying or expanding your own R applications.
Microsoft and Revolution Analytics -- what's the add-value? 20150629 (Mark Tabladillo)
Microsoft has been a leader in the enterprise analytics space for years. In 2014, Microsoft had already created R language functionality within Azure Machine Learning. On April 6, 2015, Microsoft closed a deal to acquire Revolution Analytics, a company focused on scalable processing solutions built on the well-known R language. Many data science projects and initial demos do not need high-volume solutions; however, having a high-volume answer for the R language allows for planning or working toward the largest data science solutions.
This presentation describes the add-value for the Revolution Analytics acquisition. The talk covers 1) an overview of current data science technologies from Microsoft; 2) a description of the R language; 3) a brief review of the add-value for R with Azure Machine Learning, and 4) a description of the performance architecture and demo of the language constructs developed by Revolution Analytics. Most of the presentation will be focused on sections two and four. It is anticipated that these technologies will be partially if not fully integrated into SQL Server 2016.
As the Big Data market has evolved, the focus has shifted from data operations (storage, access and processing of data) to data science (understanding, analyzing and forecasting from data). And as new models are developed, organizations need a process for deploying analytics from research into the production environment. In this talk, we'll describe the five stages of real-time analytics deployment:
Data distillation
Model development
Model validation and deployment
Model refresh
Real-time model scoring
We'll review the technologies supporting each stage, and how Revolution Analytics software works with the entire analytics stack to bring Big Data analytics to real-time production environments.
Robert Luong: Predictive Analytics in Excel (MSDEVMTL)
March 15, 2017
Excel and Power BI Group
Topic: Predictive analytics in Excel
Speaker: Robert Luong
Next, we will welcome Robert Luong, who will talk to us about predictive analytics with Azure ML and its integration with Excel. Azure ML is a cloud-based predictive analytics service that lets you quickly create and deploy predictive models as analytics solutions. Azure ML not only provides tools for modeling predictive analyses, but also a fully managed service that you can use to deploy your predictive models as web services.
We at Revolution Analytics are often asked “What is the best way to learn R?” While acknowledging that there may be as many effective learning styles as there are people, we have identified three factors that greatly facilitate learning R. For a quick start:
- Find a way of orienting yourself in the open source R world
- Have a definite application area in mind
- Set an initial goal of doing something useful and then build on it
In this webinar, we focus on data mining as the application area and show how anyone with just a basic knowledge of elementary data mining techniques can become immediately productive in R. We will:
- Provide an orientation to R’s data mining resources
- Show how to use the "point and click" open source data mining GUI, rattle, to perform the basic data mining functions of exploring and visualizing data, building classification models on training data sets, and using these models to classify new data.
- Show the simple R commands to accomplish these same tasks without the GUI
- Demonstrate how to build on these fundamental skills to gain further competence in R
- Move beyond small test data sets and show how, with the same level of skill, one can analyze some fairly large data sets with RevoScaleR
Data scientists and analysts using other statistical software as well as students who are new to data mining should come away with a plan for getting started with R.
The use of R statistical package in controlled infrastructure. The case of Cl... (Adrian Olszewski)
Facts and myths about the use of the R statistical package in controlled, validated environments, by the example of Clinical Research in the pharmaceutical industry. This is the first part, constituting the introduction; technical details will be presented in part II.
This document was presented at a conference organized by Polish National Group of the International Society for Clinical Biostatistics.
Presented to eRum (Budapest), May 2018
There are many common workloads in R that are "embarrassingly parallel": group-by analyses, simulations, and cross-validation of models are just a few examples. In this talk I'll describe the doAzureParallel package, a backend to the "foreach" package that automates the process of spawning a cluster of virtual machines in the Azure cloud to process iterations in parallel. This will include an example of optimizing hyperparameters for a predictive model using the "caret" package.
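The doAzureParallel workflow described above can be sketched in a few lines. This is a minimal sketch under assumptions, not the talk's actual demo: the credential and cluster configuration file names are placeholders, and the loop body is a stand-in for a real simulation or hyperparameter-tuning iteration.

```r
# Sketch of the foreach + doAzureParallel pattern (file names are hypothetical)
library(doAzureParallel)
library(foreach)

setCredentials("credentials.json")      # Azure Batch / storage credentials
cluster <- makeCluster("cluster.json")  # spawn a cluster of VMs from a config file
registerDoAzureParallel(cluster)        # register the cluster as the foreach backend

# Each iteration runs on a VM in the Azure cloud; results return as a list
results <- foreach(i = 1:100) %dopar% {
  mean(rnorm(1000))                     # stand-in for one simulation iteration
}

stopCluster(cluster)
```

The key design point is that only the backend registration changes: the same `foreach`/`%dopar%` loop runs locally with doParallel or in the cloud with doAzureParallel.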
By David Smith. Presented at Microsoft Build (Seattle), May 7 2018.
Your data scientists have created predictive models using open-source tools, proprietary software, or some combination of both, and now you are interested in lifting and shifting those models to the cloud. In this talk, I'll describe how data scientists can transition their existing workflows — while using mostly the same tools and processes — to train and deploy machine learning models based on open source frameworks to Azure. I'll provide guidance on keeping connections to data sources up-to-date, evaluating and monitoring models, and deploying applications that make use of those models.
Presentation delivered by David Smith to NY R Conference https://www.rstats.nyc/, April 2018:
Minecraft is an open-world creativity game, and a hit with kids. To get kids interested in learning to program with R, we created the "miner" package. This package is a collection of simple functions that allow you to connect with a Minecraft instance, manipulate the world within by creating blocks and controlling the player, and to detect events within the world and react accordingly.
The miner package is intended mainly for kids, to inspire them to learn R while playing Minecraft. But the development of the package also provides some useful insights into how to build an R package to interface with a persistent API, and how to instruct others on its use. In this talk I'll describe how to set up your own Minecraft server, and how to use and extend the package. I'll also provide a few examples of the package in action in a live Minecraft session.
While Python is a widely-used tool for AI development, in this talk I'll make the case for considering R as a platform for developing models for intelligent applications. Firstly, R provides a first-class experience for working with deep learning frameworks through its keras integration. Equally importantly, it provides the most comprehensive suite of statistical data analysis tools, which are extremely useful for many intelligent applications such as transfer learning. I'll give a few high-level examples in this talk, and we'll go into further detail in the accompanying interactive code lab.
There are many common workloads in R that are "embarrassingly parallel": group-by analyses, simulations, and cross-validation of models are just a few examples. In this talk I'll describe several techniques available in R to speed up workloads like these, by running multiple iterations simultaneously, in parallel.
Many of these techniques require the use of a cluster of machines running R, and I'll provide examples of using cloud-based services to provision clusters for parallel computations. In particular, I will describe how you can use the sparklyr package to distribute data manipulations using the dplyr syntax, on a cluster of servers provisioned in the Azure cloud.
Presented by David Smith at Data Day Texas in Austin, January 27 2018.
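The sparklyr pattern mentioned above can be sketched as follows. This is a minimal local-mode sketch, assuming the nycflights13 sample data; on Azure, the master URL would instead point at a provisioned Spark cluster.

```r
# Sketch: distributing a dplyr group-by with sparklyr
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")   # or a cluster's Spark master URL
flights_tbl <- copy_to(sc, nycflights13::flights, "flights")

# dplyr verbs are translated to Spark SQL and executed on the cluster
delays <- flights_tbl %>%
  group_by(carrier) %>%
  summarise(mean_delay = mean(dep_delay, na.rm = TRUE)) %>%
  collect()                             # bring the (small) result back into R

spark_disconnect(sc)
```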
A look at the changing perceptions of R, from the early days of the R project to today. Microsoft sponsor talk, presented by David Smith to the useR!2017 conference in Brussels, July 5 2017.
Predicting Loan Delinquency at One Million Transactions per Second (Revolution Analytics)
Real-time applications of predictive models must be able to generate predictions at the rate that transactions are generated. Previously, such applications of models trained using R needed to be converted to other languages like C++ or Java to achieve the required throughput. In this talk, I’ll describe how to use the in-database R processing capabilities of Microsoft R Server to detect fraud in a SQL Server database of loan records at a rate exceeding one million transactions per second. I will also show the process of training the underlying gradient-boosted tree model on a large training set using the out-of-memory algorithms of Microsoft R.
Presented by David Smith at The Data Science Summit, Chicago, April 20 2017.
The ability to independently reproduce results is a critical issue within the scientific community today, and is equally important for collaboration and compliance in business. In this talk, I'll introduce several features available in R that help you make reproducibility a standard part of your data science workflow. The talk will include tips on working with data and files, combining code and output, and managing R's changing package ecosystem.
Presented by David Smith, R Community Lead (Microsoft), at Monktoberfest October 2016.
The value of open source isn’t just in the software itself. The communities that form around open source software provide just as much value and sometimes even more: in ongoing development, in documentation, in support, in marketing, and as a supply of ready-trained employees. Companies who build on open source tend to focus on the software, but neglect communities at their peril.
In this talk, I share some of my experiences in building community for an open-source software company, Revolution Analytics, and perspectives since the acquisition by Microsoft in 2015.
R is more than just a language. Many of the reasons why R has become such a popular tool for data science come from the ecosystem surrounding the R project. R users benefit from the many resources and packages created by the community, while commercial companies (including Microsoft) provide tools to extend and support R, and services to help people use R.
In this talk, I will give an overview of the R Ecosystem and describe how it has been a critical component of R’s success, and include several examples of Microsoft’s contributions to the ecosystem.
(Presented to EARL London, September 2016)
(Presented by David Smith at useR!2016, June 2016. Recording: https://channel9.msdn.com/Events/useR-international-R-User-conference/useR2016/R-at-Microsoft )
Since the acquisition of Revolution Analytics in April 2015, Microsoft has embarked upon a project to build R technology into many Microsoft products, so that developers and data scientists can use the R language and R packages to analyze data in their data centers and in cloud environments.
In this talk I will give an overview (and a demo or two) of how R has been integrated into various Microsoft products. Microsoft data scientists are also big users of R, and I'll describe a couple of examples of R being used to analyze operational data at Microsoft. I'll also share some of my experiences in working with open source projects at Microsoft, and my thoughts on how Microsoft works with open source communities including the R Project.
Hadoop is famously scalable. Cloud Computing is famously scalable. R – the thriving and extensible open source Data Science software – not so much. But what if we seamlessly combined Hadoop, Cloud Computing, and R to create a scalable Data Science platform? Imagine exploring, transforming, modeling, and scoring data at any scale from the comfort of your favorite R environment. Now, imagine calling a simple R function to operationalize your predictive model as a scalable, cloud-based Web Service. Learn how to leverage the magic of Hadoop on-premises or in the cloud to run your R code, thousands of open source R extension packages, and distributed implementations of the most popular machine learning algorithms at scale.
With rising business challenges in the aftermarket service areas, it becomes imperative for manufacturers to gain actionable intelligence across the warranty management life cycle.
Join Revolution Analytics and Tech Mahindra to hear how to reduce the information visibility gap:
• Identify statistically significant business drivers
• Forecast warranty costs and claims
• Improve Customer Satisfaction
Presented to Chicago R User Group, Jan 29 2015
Good data analysis is reproducible. If someone else can’t independently replicate your results from your data, the consequences can be severe. With R, a major challenge for reproducibility is the ever-changing package ecosystem: it's all too easy to develop an R script using packages, only to find collaborators will download later versions of those packages when they attempt to reproduce your results, and outcome can be unpredictable!
In this talk I'll introduce the Reproducible R Toolkit, and the "checkpoint" package, included with Revolution R Open, and describe some best practices for writing reliable, reproducible R code with packages.
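A minimal use of the checkpoint package looks like the sketch below; the snapshot date here is illustrative (any date covered by the daily CRAN snapshot archive works).

```r
# checkpoint scans the project for library()/require() calls, then installs and
# uses the package versions as they existed on CRAN on the given date.
library(checkpoint)
checkpoint("2015-01-29")   # illustrative snapshot date to reproduce against

library(ggplot2)           # now loads the ggplot2 version from that CRAN snapshot
```

Collaborators who run the same script get the same package versions, which is exactly the reproducibility guarantee the talk describes.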
DevOps and Testing slides at DASA Connect (Kari Kakkonen)
Slides by me and Rik Marselis at the DASA Connect conference, May 30, 2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We also held a lovely workshop with the participants, exploring different ways to think about quality and testing in different parts of the DevOps infinity loop.
Elevating Tactical DDD Patterns Through Object Calisthenics (Dorra BARTAGUIZ)
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Essentials of Automations: Optimizing FME Workflows with Parameters (Safe Software)
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
JMeter webinar - integration with InfluxDB and Grafana (RTTS)
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... (Ramesh Iyer)
In today's fast-changing business world, companies that fail to adapt and embrace new ideas struggle to keep up with the competition. However, fostering a culture of innovation takes much work: it takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Connector Corner: Automate dynamic content and events by pushing a button (DianaGray10)
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, combined with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work and a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
UiPath Test Automation using UiPath Test Suite series, part 3 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
- UI automation introduction
- UI automation sample
- Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Securing your Kubernetes cluster: a step-by-step guide to success! (KatiaHIMEUR1)
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
3. February 22, 2011: Welcome! Revolution Confidential
Thanks for coming.
Slides and replay available (soon) at: http://bit.ly/z9xUG9
David Smith
VP Marketing & Community, Revolution Analytics
Editor, Revolutions blog: http://blog.revolutionanalytics.com
Twitter: @revodavid
4. In today's webcast:
About Revolution Analytics and R
What Revolution R adds to R
Resources for getting more from R
Q&A
Introducing Revolution R
5. What is R?
- Data analysis software: a programming language, a development platform designed by and for statisticians, an environment
- A huge library of algorithms for data access, data manipulation, analysis and graphics
- An open-source software project: free, open, and active
- A community: thousands of contributors, 2 million users, with resources and help in every domain
Download the "R is Hot" white paper: bit.ly/r-is-hot
6. R is exploding in popularity and functionality
Scholarly Activity (Google Scholar hits, '05-'09 CAGR): R 46%, Stata 10%, S-Plus 0%, SAS -11%, SPSS -27%
“I’ve been astonished by the rate at which R has been adopted. Four years ago, everyone in my economics department [at the University of Chicago] was using Stata; now, as far as I can tell, R is the standard tool, and students learn it first.” (Deputy Editor for New Products at Forbes)
Package Growth (number of R packages listed on CRAN, 2002-2010)
“A key benefit of R is that it provides near-instant availability of new and experimental methods created by its user base — without waiting for the development/release cycle of commercial software. SAS recognizes the value of R to our customer base…” (Product Marketing Manager, SAS Institute, Inc.)
Source: http://r4stats.com/popularity
7. “R is the most powerful & flexible statistical programming language in the world” [1]
Capabilities: sophisticated statistical analyses, predictive analytics, data visualization
Applications: real-time trading, finance, risk assessment, forecasting, bio-technology, drug development, social networks, and more
[Chart: MSFT stock price since 2009, last 29.29]
1. Norman Nie, multiple interviews
8. R User Community
From "The R Ecosystem": bit.ly/R-ecosystem
10. Revolution R Enterprise is …
11. R Productivity Environment (Windows)
- Script editor with type-ahead and code snippets
- Solutions window for organizing code and data
- Sophisticated debugging with breakpoints, variable values, etc.
- Objects loaded in the R environment
- Packages installed and loaded
- Object details
Demo: http://www.revolutionanalytics.com/demos/revolution-productivity-environment/demo.htm
12. Interactive Debugging
- One click to set a breakpoint in an R script
- Step in/out/over, inspect variables
- Eliminate the edit -> browser -> repair cycle
13. Performance: Multi-threaded Math
Computation (4-core laptop)        | Open Source R | Revolution R Enterprise | Speedup
Linear Algebra [1]
  Matrix Multiply                  | 327 sec       | 13.4 sec                | 23x
  Cholesky Factorization           | 31.3 sec      | 1.8 sec                 | 17x
  Linear Discriminant Analysis     | 216 sec       | 74.6 sec                | 2x
General R Benchmarks [2]
  R Benchmarks (Matrix Functions)  | 22 sec        | 3.5 sec                 | 5x
  R Benchmarks (Program Control)   | 5.6 sec       | 5.4 sec                 | Not appreciable
1. http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
2. http://r.research.att.com/benchmarks/
14. Three Paradigms for Big Data
The standard R engine is constrained by capacity and performance. Revolution R Enterprise offers three methods for big data with R:
- Off-line: high-performance file-based analytics
- Off-line: parallel & distributed analytics
- On-line: in-database analytics (Hadoop, Netezza)
15. Revolution R Enterprise with RevoScaleR: Big Data Statistics in R
www.revolutionanalytics.com/bigdata
Every US airline departure and arrival, 1987-2008:
- File: AirlineData87to08.xdf
- Rows: 123.5 million
- Variables: 29
- Size on disk: 13.2 GB
arrDelayLm2 <- rxLinMod(ArrDelay ~ DayOfWeek:F(CRSDepTime), data = "AirlineData87to08.xdf", cube = TRUE)
16. RevoScaleR: Big Data algorithms
- Data processing (rxDataStep)
- Descriptive statistics (rxSummary)
- Tables and cubes (rxCube, rxCrossTabs)
- Correlations/covariances (rxCovCor, rxCor, rxCov, rxSSCP)
- Linear regressions (rxLinMod)
- Logistic regressions (rxLogit)
- K-means clustering (rxKmeans)
- Predictions (scoring) (rxPredict)
- Custom distributed computing (rxExec)
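The functions listed above share a common calling convention: a formula plus a data source, as in base R's modeling functions. The sketch below is illustrative only (RevoScaleR ships with Revolution R Enterprise, so it is not runnable with open source R alone); it reuses the airline .xdf file named on the earlier slide.

```r
# Illustrative RevoScaleR calls against the airline .xdf file from slide 15
airData <- "AirlineData87to08.xdf"

rxSummary(~ ArrDelay + DayOfWeek, data = airData)      # descriptive statistics
rxCrossTabs(~ DayOfWeek, data = airData)               # tables and cubes
fit  <- rxLinMod(ArrDelay ~ DayOfWeek, data = airData) # out-of-memory regression
pred <- rxPredict(fit, data = airData)                 # scoring (predictions)
```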
17. RevoScaleR: Distributed Computing
- Portions of the data source are made available to each compute node
- RevoScaleR on the master node assigns a task to each compute node
- Each compute node independently processes its data, and returns its intermediate results back to the master node
- The master node aggregates all of the intermediate results from each compute node and produces the final result
[Diagram: a master node running RevoScaleR coordinating compute nodes, each holding its own data partition]
*Available now for Microsoft HPC Server
Video demo: http://bit.ly/ugQ9KR
18. Platform-agnostic Big Data Analytics
- Set the "compute context" to define hardware (one line of code)
- Native job scheduler handles distribution, monitoring, failover, etc.
- Same code runs on other supported architectures: just change the compute context
- Supported architectures: Microsoft HPC Server (Windows); Platform Computing LSF (Linux, coming 2012)
- 42 seconds instead of 6 minutes
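The "one line of code" compute-context switch described above can be sketched as below. This is an assumption-laden sketch for the Microsoft HPC Server backend of this product era: the cluster name and share path are hypothetical, and the exact `RxHpcServer` arguments may differ by RevoScaleR version.

```r
# Point RevoScaleR at an HPC Server cluster (names/paths are hypothetical)
rxSetComputeContext(RxHpcServer(
  headNode = "cluster-head",
  shareDir = "\\\\cluster-head\\share"
))

# The same analysis code now runs distributed across the cluster
fit <- rxLinMod(ArrDelay ~ DayOfWeek, data = "AirlineData87to08.xdf")

# Switching back to local execution is again one line
rxSetComputeContext("local")
```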
19. A common analytic platform across big data architectures
Hadoop | File Based | In-database
20. In-Database Execution with IBM Netezza
More info: http://bit.ly/R-Netezza
21. R and Hadoop
- Hadoop offers a scalable infrastructure for processing massive amounts of data: storage (HDFS, HBASE) and distributed computing (MapReduce)
- R is a statistical programming language for developing advanced analytic applications
- Currently, writing analytics for Hadoop requires a combination of Java, Pig, Python, …
- The RHadoop project makes it possible to write Big Data algorithms for Hadoop using the R language alone
22. RevoConnectR for Hadoop
Write Map-Reduce analytics using only R code with these R packages:
- rhdfs: R and HDFS
- rhbase: R and HBASE (via Thrift)
- rmr: R and MapReduce
[Diagram: a Revolution R client talking to HDFS, HBASE, and the Job Tracker, with rhdfs/rhbase/rmr running Map or Reduce tasks on each Task Node]
More information at: bit.ly/r-hadoop
23. Enterprise Readiness: Revolution R Enterprise Server
- Multi-user support
- Production applications: integrate R analytics into web-based applications
- Data analysis and visualization
- Reporting, dashboards, and interactive applications
Revolution R Enterprise Server with RevoDeployR
24. Enterprise-Wide Deployment
- Production: Revolution R Enterprise Server + Hadoop + IBM Netezza + Windows HPC Server cluster, with a Management Console and the RevoDeployR Web Services API
- Research and development: data scientists / modelers
- End-user deployment: Excel, web server, BI app; analysts / corporate users
26. The Advanced Analytics Stack
- Deployment / Consumption
- Advanced Analytics
- ETL
- Data / Infrastructure
“Open Analytics Stack” White Paper: bit.ly/lC43Kw
27.
- On-Call Technical Support
- Consulting: Migration | Analytics | Applications | Validation
- Training: R | Revolution R | Statistical Topics
- Systems Integration: BI | ERP | Databases | Cloud
29. Why R?
- Every data analysis technique at your fingertips
- Create beautiful and unique data visualizations
- Get better results faster
- Draw on the talents of data scientists worldwide
- R is hot, and growing fast
30. Revolution R Enterprise: Production-Grade Statistical Analysis for the Workplace
- High-performance R for multiprocessor systems
- Modern integrated development environment
- Statistical analysis of terabyte-class data sets
- In-database R analytics with Hadoop and Netezza
- Deploy R applications via Web Services
- Telephone and email technical support
- Training and consulting services
- 100% compatible with R packages
31. Revolution R Enterprise: Free to Academia
- Personal use, research, teaching, and package development
Free Academic Download: www.revolutionanalytics.com/downloads/free-academic.php
Discounted technical support subscriptions available
32. Thank You!
Download slides and replay: http://bit.ly/z9xUG9
Learn more about Revolution R: revolutionanalytics.com/products
Contact Revolution Analytics: http://bit.ly/hey-revo
Feb 29: "Turbo-Charge Your Analytics with IBM Netezza and Revolution R Enterprise: A Step-by-Step Approach for Acceleration and Innovation", presented by William Zanine (IBM Analytics Solutions). www.revolutionanalytics.com/news-events/free-webinars
34. The leading commercial provider of software and support for the popular open source R statistics language.
www.revolutionanalytics.com
+1 (650) 646 9545
Twitter: @RevolutionR