1. Executive Intro to R
William M. Cohee
November 2016
Prepared using Apache OpenOffice 4.1.2
2. Presenter Bio
● 15+ years of Wall Street Technology experience
● Expertise in front-office Fixed Income Systems, Analytics, Pricing, Instrument, & Entity Reference Data Management
● BA, Computer Science
● MS, Information Systems Engineering
● Certified Bloomberg Specialist
● Currently in the Chief Data Office @ HSBC
● www.linkedin.com/in/billcohee
3. Topic
● Tool of choice for Statisticians, Data Analysts, & Data Scientists
● Popularity and use of R is on the rise
● R Community is vibrant & the talent pool is growing rapidly
● R is evolving from its statistical computing roots into a development platform for robust, reusable software
● A lot of commercial, third-party systems are adding support
● Oracle, Microsoft becoming big players
● R can be used to manage & analyze data in Hadoop
● A growing ecosystem is accelerating industry acceptance/adoption
● R savvy IT leaders can deliver more effective, lower cost solutions
4. Agenda
● What is R [slides 5-8]
● What can R be used for [slides 9-10]
● Recap & where to learn more [slides 11-12]
5. R – What is it?
● A powerful computing environment for Data Analysis & Statistics
● 'R' proper, is an open-source programming language
● Developed as a dialect of 'S'
● S developed by Bell Labs to 'turn ideas into software, quickly and faithfully' c.1976
● strong desire at the time for an alternative to writing FORTRAN subroutines for analyzing data
● Ross Ihaka and Robert Gentleman recognized as original creators of R while professors at the University of Auckland in New Zealand c.1995
● v1.0.0 was released in February 2000
6. R – What is it?
● Traditional user base consists of
● Researchers
● Statisticians
● Academia
● 'New wave' R users
● Wall Street Desk Quants
● Risk Analysts & Financial Modelers
● Data Scientists
● Advent of Big Data and the nascent field of Data Science are serving as catalysts to the sudden rise of this 16+ year old technology
7. R – What is it?
● When people speak of R, they are usually referring to the broader ecosystem, not the language
● R for Windows, Microsoft R Open – command line interpreters
● RStudio, R Tools for Visual Studio – IDEs (Integrated Development Environments)
● user-friendly, robust, graphical front-ends for working with R
● CRAN and MRAN
● Comprehensive R Archive Network
● Microsoft R Application Network
● repositories of open-source extensions to R known as 'Packages'
● think of a Package as a pre-built library of functions & data
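The idea of a Package as a bundle of functions and data can be illustrated with a minimal sketch. It assumes only the 'datasets' and 'stats' packages that ship with a standard R installation; the variable names are illustrative, and the CRAN install line is commented out because it needs network access.

```r
# Packages bundle data as well as functions.
# 'datasets' and 'stats' are part of a standard R installation.
cars <- datasets::mtcars            # a data set supplied by a package
mid_mpg <- stats::median(cars$mpg)  # a function supplied by a package

# Packages from CRAN are installed once, then attached per session.
# install.packages("dplyr", repos = "https://cloud.r-project.org")
# library(dplyr)

# List which packages are currently installed on this machine.
installed <- rownames(installed.packages())
has_stats <- "stats" %in% installed
```

The `package::name` form calls into a package without attaching it, which keeps it clear where each function or data set comes from.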
8. R – What is it?
● R was not created with 'coders' in mind
● Creators were focused on how to make Data Analysis easier on the users of data
● Geared toward the power-user who has to work with large amounts of data while avoiding coding as much as practically possible
● Why is it called R ???
● the co-creators were Ross & Robert!
● it was trendy to give languages letter names (B, C, S, etc)
● As R becomes more mainstream, it may have everyday applications for people in roles requiring them to work with or 'be in the data'
9. R – What can it be used for?
● For presenting & solving data-oriented problems
● Exploratory Analysis
● discovering data about the data
● clustering & visualizing data
● quickly building summaries of the data being worked with
● Wrangling/Munging & re-shaping data
● working with structured & unstructured data
● sub-setting, filtering, and merging data
● making data 'tidy' – datasets that facilitate some kind of analysis
● dplyr & tidyr Packages popular
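The exploratory and wrangling steps listed above can be sketched in a few lines. This is a minimal illustration using base R only, so it runs without installing anything; dplyr and tidyr offer equivalent verbs. The two tables and all column names are made up for the example.

```r
# Two small, made-up tables: trades and issuer reference data.
trades <- data.frame(
  id     = c(1, 2, 3, 4),
  ticker = c("AAA", "BBB", "AAA", "CCC"),
  amount = c(100, 250, 75, 300),
  stringsAsFactors = FALSE
)
issuers <- data.frame(
  ticker = c("AAA", "BBB", "CCC"),
  sector = c("Tech", "Energy", "Tech"),
  stringsAsFactors = FALSE
)

# Exploratory analysis: inspect the structure and a quick summary.
str(trades)
summary(trades$amount)

# Sub-setting / filtering: keep trades of at least 100.
large <- subset(trades, amount >= 100)

# Merging: attach the sector to each remaining trade.
enriched <- merge(large, issuers, by = "ticker")

# Summarising: total amount per sector.
totals <- aggregate(amount ~ sector, data = enriched, FUN = sum)
print(totals)
```

Each step returns a new data frame, so intermediate results such as `large` and `enriched` can be inspected before moving on.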
10. R – What can it be used for?
● Predictive Analytics & Machine Learning
● modeling, sampling, forecasting, trending, regression
● caret, h2o, quantmod Packages popular
● Data Visualization
● powerful, publication-quality graphing & plotting Packages
● ggplot2, leaflet, and shiny Packages popular
● shiny example: Where are the so-called 'SuperZIPs'?
● US postal codes scored on a scale of 0-100, 100 being highest
● score is a function of median household income and education level
● Top 5% are deemed the 'SuperZIPs'
● click to see the R + shiny powered Interactive data map
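The 'SuperZIP' scoring described above can be sketched as follows. The slide does not give the exact formula, so the rule used here (the mean of the income and education percentile ranks) is an assumption, and the data, column names, and helper `pct_rank` are all hypothetical.

```r
set.seed(42)  # reproducible made-up data
n <- 200
zips <- data.frame(
  zip     = sprintf("%05d", seq_len(n)),     # hypothetical ZIP codes
  income  = round(runif(n, 25000, 200000)),  # hypothetical median household income
  college = round(runif(n, 5, 90), 1),       # hypothetical % of adults with a degree
  stringsAsFactors = FALSE
)

# Percentile rank on a 0-100 scale (hypothetical helper).
pct_rank <- function(x) {
  100 * (rank(x, ties.method = "average") - 1) / (length(x) - 1)
}

# Assumed scoring rule: average of the two percentile ranks.
zips$score <- (pct_rank(zips$income) + pct_rank(zips$college)) / 2

# Flag the top 5% of scores.
cutoff <- quantile(zips$score, probs = 0.95)
zips$super_zip <- zips$score >= cutoff

# Show the highest-scoring rows.
head(zips[order(-zips$score), ], 5)
```

A data frame like `zips` is the kind of input a shiny or leaflet map would consume, with `super_zip` driving which points are highlighted.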
11. Recap & Resources
● R is an open-source environment that can be used for complex Data 'work'
● essential part of a Data Scientist's Toolbox
● Also a functional programming language
● can be used to create programs to automate routine, repetitive data tasks and for general software development
● Becoming a mainstream tool
● benefiting from increased commercial support
● maturing ecosystem of Packages
● Agility, flexibility, growing talent pool, & low cost of ownership all a part of R's appeal
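The functional side of R, and its use for automating repetitive data tasks, can be shown with a short sketch. It uses base R only; the function names and figures are made up for illustration.

```r
# A function is an ordinary value: it can be stored, passed, and returned.
make_scaler <- function(factor) {
  function(x) x * factor   # the returned closure remembers 'factor'
}
to_thousands <- make_scaler(1 / 1000)

# Apply the same routine to several made-up data sets without writing a loop.
monthly <- list(jan = c(1200, 3400), feb = c(560, 980, 4100))
totals <- sapply(monthly, function(v) sum(to_thousands(v)))
print(totals)
```

Because the routine is passed to `sapply` as a value, adding another month to `monthly` requires no change to the code that does the work.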
12. Recap & Resources
● Where to learn more...
● The R Homepage: https://www.r-project.org
● RStudio: https://www.rstudio.com/products/RStudio
● CRAN: https://cran.r-project.org
● Oracle and R: http://bit.ly/2dUC24a
● Microsoft and R: http://bit.ly/2e5CT5m
● The R Consortium: https://www.r-consortium.org
● Playlist of R video tutorials: http://bit.ly/1iRcgyn
● Free Courses
● https://www.coursera.org/learn/r-programming
● https://www.datacamp.com/courses/free-introduction-to-r