An interactive e-learning platform
for Data Analysis based on R
Albert Jorissen
Martijn Theuwissen
Dieter De Mesmaeker
Jonathan Cornelissen
Jonathan@datamind.org
Dieter@datamind.org
Albert@datamind.org
Martijn@datamind.org
Who’s who
Why e-learning with and for R?
Need for scalable tools to learn
R and Data Analysis…
Because of exponentially growing R user base
More than 2 million R users growing at 40-60% yearly
Source: http://r4stats.com/articles/popularity/ and http://prezi.com/s1qrgfm9ko4i/the-r-ecosystem/
That needs to learn the basics and the specifics of R
Analysis of Google keywords
Keyword                          Competition  Global Monthly Searches
r tutorial                       0            6600
introduction to r                0            1600
online statistics course         0.98         1600
ggplot2 tutorial                 0            880
statistics course                0.85         880
an introduction to r             0.01         880
r book                           0.06         590
learning statistics              0.38         590
r tutorials                      0            590
r introduction                   0.01         480
statistics courses               0.84         480
statistics introduction          0.1          480
online statistics courses        0.99         320
r course                         0.04         260
r training                       0.17         260
free online statistics course    0.56         260
statistics training              0.62         210
online statistics class          0.98         170
statistics class online          0.98         140
data analysis tutorial           0.5          110
Compare to:
SAS tutorial: 4400
Eviews tutorial: 390
Stata tutorial: 1900
Matlab tutorial: 22200
Hadoop tutorial: 12100
Source: Analysis based on http://adwords.google.com/select/keywordtoolexternal
Analysis of r-project.org
• Number of downloads per month for:
• Introduction to R pdfs: 140,000
• Summary pdfs: 50,000
• Some of the “top” packages (reliability/stability of numbers below?):
kernlab.pdf 349,780
party.pdf 167,396
igraph.pdf 59,969
VennDiagram.pdf 30,889
mclust.pdf 19,347
KnitR.pdf 10,697
twitteR.pdf 7,507
randomForest.pdf 6,824
ggplot2 5,924
raster.pdf 5,326
Source: Analysis based on http://cran.r-project.org/report_cran.html
Source: http://r4stats.com/articles/popularity/
6,275 R packages at all major repositories, 4,315 of which were at CRAN
Across a broad spectrum of domains: Financial engineering, biostatistics, data mining, …
Because of the exponentially growing functionality
Why e-learning with and for R?
• Great books, tutorials,… on R
• But coding is learned by doing
• No online learning interface for R
• Documentation made by experts for experts, not for beginners or intermediate users
Teachers (Data Analysis Professors, Consultants, Researchers, Book authors):
• Often give the same or similar feedback to students in exercise sessions
• Manually correct assignments
Learners (Students, Professionals, Researchers, Employees):
• Static content
• Hard to get feedback
Technical overview
DataMind IT architecture
• Ruby on Rails: high-productivity web application framework
• Node.js: platform for real-time, scalable network applications
• R: open-source statistical language
• Angular.js: MVC JavaScript framework for single-page applications, maintained by Google
• Connections between the components: WebSockets, AJAX requests, R serve, RESTful API
DataMind leverages state-of-the-art open-source frameworks in the cloud
Simplified Student experience on DataMind
Try the first course “Summer of R” at www.datamind.org!
1. Read the assignment
2. Read the instructions
3. Write R code in the browser
4. Pre-exercise code is run in the background to pre-load a dataset, graphs, etc.
5. If a student gets stuck, they can look at a solution
6. The student submits an answer and the R code is executed
7. Output is shown in the R console below and the student gets feedback on the assignment
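The seven-step student flow can be sketched in a few lines of code. This is only an illustration under stated assumptions: it runs Python in-process rather than R on a server, it does no sandboxing, and every name in it (Exercise, submit, check_mean) is hypothetical and not part of DataMind.

```python
import contextlib
import io
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Exercise:
    """Hypothetical container for one interactive exercise."""
    assignment: str          # step 1: what the student reads first
    instructions: str        # step 2: concrete instructions
    pre_exercise_code: str   # step 4: run in the background before the student code
    solution: str            # step 5: shown when the student gets stuck
    check: Callable[[Dict[str, object]], Tuple[bool, str]]  # step 7: feedback rule


def submit(exercise: Exercise, student_code: str) -> Tuple[str, str]:
    """Run a submission and return (console output, feedback)."""
    env: Dict[str, object] = {}
    exec(exercise.pre_exercise_code, env)        # pre-load data for the exercise
    console = io.StringIO()
    try:
        with contextlib.redirect_stdout(console):
            exec(student_code, env)              # step 6: execute the submitted code
    except Exception as error:
        return console.getvalue(), f"Your code raised an error: {error}"
    _, feedback = exercise.check(env)            # step 7: feedback on the result
    return console.getvalue(), feedback


def check_mean(env: Dict[str, object]) -> Tuple[bool, str]:
    """Feedback rule for the example exercise below."""
    if "average" not in env:
        return False, "Define a variable called average."
    if env["average"] == 3.0:
        return True, "Correct!"
    return False, "average has the wrong value; did you divide by the number of items?"


exercise = Exercise(
    assignment="Compute the mean of a dataset.",
    instructions="Store the mean of data in a variable called average and print it.",
    pre_exercise_code="data = [1, 2, 3, 4, 5]",
    solution="average = sum(data) / len(data)\nprint(average)",
    check=check_mean,
)

console_output, feedback = submit(exercise, exercise.solution)
wrong_output, wrong_feedback = submit(exercise, "average = sum(data)")
```

Keeping the feedback rule as a separate function per exercise is one possible design: a teacher writes the rule once, so the same feedback no longer has to be given by hand to each student.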
Very positive reactions on early-stage prototype launch in mid-July
• Students
Already >3,000 happy registered students with 0 marketing budget and only a very limited course
-> confirmation of market need
More than 40,000 exercises were completed and 47% of all visitors return to the website
-> confirmation of engaging experience for students
• Content creators
-> Interest from high-profile academics
-> Interest from high-profile companies
Course creators attracted based on prototype
• Eric Zivot – University of Washington
Has Comp Finance course on Coursera with 150k students yearly
Is recruiting a PhD student to start building an interactive course on DataMind
• Ramnath Vaidyanathan – McGill University
Providing strategic and technical advice, and interactive online complement to his book
on interactive graphics/slides with R.
• Ista Zahn – R workshops at Harvard University
• Katrien Antonio – KULeuven
A life insurance course with R
• Frank DiTraglia – University of Pennsylvania
• Marc Carlson – Bioconductor core member and Data Analyst at Fred Hutchinson
Cancer Research Center
• Ajay Ohri – Author of popular “R for Business Analytics” book
• Stephen Davies – University of Mary Washington
• Dai, Zhuo Jia – Consultant interested in creating credit scoring course
• Geert Molenberghs – KULeuven - Interested, Skype call planned
Reactions
“Sounds terrific that you think there might be an opportunity to
collaborate on this. Of all the learning platforms for R out there, I
strongly believe DataMind has the best interface, and I would be happy
to help push things further by collaborating.”
Prof. Ramnath V. – McGill University
“It captures the imagination to be able to contribute to this (r)evolution.”
Prof. Katrien Antonio – UVA & KU Leuven
“At the moment, I use static html tutorials composed in R Markdown.
Something interactive would clearly be much better for helping the
students learn.”
Prof. Francis DiTraglia – University of Pennsylvania
…
Interest from both Companies & Publishers
• Publishers
• Companies
Roadmap – coming months
–> focus on user growth and business development
• User Growth through content creation
> Attract content creators
(further target Coursera professors & interesting niches)
> Aid interactive course development
• User growth through gamification
• User growth through marketing side projects
> Rfiddle
> Rdocumentation
• Business development
> Attract company courses
> Partnerships
• IT development – Formal launch in September
> Implementation of gamification
> Challenges
> Course creation interface improvement
Back-up slides
• Content Creators: efficient & scalable course delivery in the cloud
• Learners: affordable, up-to-date, interactive data analysis training
• Corporations: find, train and certify the data analysts they require
26/06/13 Vlerick Business School 24
Value Proposition to Stakeholders
Competitive analysis for our value proposition: Content Creators
• Traditional Institutions
• Digital Content Providers
• Non-Digital Content Providers
Competitive analysis for our value proposition: Learners
• Online Data Analytics
• Online Interactive
• Traditional Institutions
Competitive analysis for our value proposition: Corporations
• Recruitment Agencies
• Certification
• Data analytics training providers
Marketing
Current Documentation
Rdocumentation.org
“This is a great idea!”
David L Carlson, Associate Professor of Anthropology, Texas A&M University
“I tried it out and it is fantastic! Your Rdocumentation brought to my
attention 19 other subset functions aside from base and it presents even
the base documentation in a better way. Thank you very much.”
Rees Morrison
General Counsel Metrics, LLC
Management consulting and Data Analytics
“That is pretty neat. I think it is the nicest search and way to
interact with R documentation that I have seen.”
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles (UCLA)
Senior Analyst - Elkhart Group Ltd.
Rdocumentation.org
[Figure: projected number of R users and DataMind users from 0m to 30m, y-axis 0 to 5,000,000]
4,638,000 R users within 30 months (annual growth rate of R = 40%)
• 2,231,000 novice users
• 1,662,000 intermediate users
• 744,000 expert users
411,000 DataMind users after 30 months (predicted market share, M.S., in the expected scenario)
• 267,000 novice users (12% M.S.)
• 124,000 intermediate users (7.5% M.S.)
• 18,000 expert users (2.5% M.S.)
Distribution Quantification
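The projection on this slide can be checked with a short calculation. The sketch below assumes the 4,638,000 figure comes from compounding the 2 million user base at 40% per year over 30 months, and that the 12%, 7.5% and 2.5% market shares apply to novice, intermediate and expert users respectively. Both are inferences from the numbers; the deck does not state them.

```python
# Assumed inputs, taken from the figures quoted in the deck.
BASE_USERS = 2_000_000      # "more than 2 million R users"
ANNUAL_GROWTH = 0.40        # lower bound of the 40-60% yearly growth
HORIZON_MONTHS = 30

# Compound growth over 2.5 years.
projected_total = BASE_USERS * (1 + ANNUAL_GROWTH) ** (HORIZON_MONTHS / 12)

# Segment sizes as shown on the slide.
segment_sizes = {
    "novice": 2_231_000,
    "intermediate": 1_662_000,
    "expert": 744_000,
}

# Assumed mapping of market shares to segments (not stated in the deck).
assumed_share = {
    "novice": 0.12,
    "intermediate": 0.075,
    "expert": 0.025,
}

platform_users = {
    segment: segment_sizes[segment] * assumed_share[segment]
    for segment in segment_sizes
}
platform_total = sum(platform_users.values())

print(f"Projected R users after {HORIZON_MONTHS} months: {projected_total:,.0f}")
for segment, users in platform_users.items():
    print(f"{segment:>12}: {users:,.0f}")
print(f"Projected platform users: {platform_total:,.0f}")
```

Rounded to the nearest thousand, both totals agree with the slide (4,638,000 R users and 411,000 platform users), which suggests the slide was built this way.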
USERS
1. RECRUITMENT
✓ Targeted job-advertising to users: $2500
✓ Fee / Placement: $15k
2. PAID COURSES
✓ Price determined by course creator: $100
✓ Percentage-based fee: 24%
3. CERTIFICATES
✓ Joint certificates with educational institutions: $100
Revenue Generation
18-24 months: 137,000 total users
                       Expert   Intermediate   Novice    Revenue
Users                  13,700   55,000         68,700
Paid Courses           923      2,770          923       $110k
Certificates           140      280            47        $46k
Recruitment            4        6              0         $150k
Job Advertising        8        12             0         $50k
Revenue per user type  $166k    $214k          $27k
Total Revenue $357,000
24-30 months: 411,000 total users
                       Expert   Intermediate   Novice    Revenue
Users                  18,000   124,000        267,000
Paid Courses           2,031    6,093          2,031     $243k
Certificates           307      615            102       $102k
Recruitment            16       24             0         $100k
Job Advertising        8        12             0         $300k
Revenue per user type  $239k    $447k          $59k
Total Revenue $746,000
Financial Revenue breakdown according to user type
Revenue generated by 411k users between 24-30 months: $746k
Recruitment (54%)
• Intermediate 60%: 241k
• Experts 40%: 160k
Paid Courses (32%)
• Intermediate 60%: 146k
• Expert 20%: 48.5k
• Beginners 20%: 48.5k
Certificates (14%)
• Intermediate 60%: 61k
• Expert 30%: 31k
• Beginners 10%: 10k
Revenue breakdown according to 3 main drivers
and 3 target groups
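The 24-30 month revenue figures can be reproduced from the prices and volumes elsewhere in the deck. The sketch below assumes the platform earns the 24% fee on each $100 paid course and the full $100 on each certificate, and that the volume triples are ordered expert, intermediate, novice. Recruitment revenue is taken as stated ($100k + $300k) rather than recomputed.

```python
# Prices from the Revenue Generation slide.
COURSE_PRICE = 100           # dollars, set by the course creator
PLATFORM_FEE_PERCENT = 24    # percentage-based fee on each paid course
CERTIFICATE_PRICE = 100      # dollars per certificate

# 24-30 month volumes from the deck; the segment order is an assumption.
paid_courses = {"expert": 2031, "intermediate": 6093, "novice": 2031}
certificates = {"expert": 307, "intermediate": 615, "novice": 102}

# Recruitment and job advertising revenue, taken as stated in the deck.
recruitment_revenue = 100_000 + 300_000


def paid_course_revenue(count: int) -> float:
    """Platform revenue for a number of paid course enrolments."""
    return count * COURSE_PRICE * PLATFORM_FEE_PERCENT / 100


paid_total = sum(paid_course_revenue(n) for n in paid_courses.values())
certificate_total = sum(n * CERTIFICATE_PRICE for n in certificates.values())
total_revenue = paid_total + certificate_total + recruitment_revenue

print(f"Paid courses: ${paid_total:,.0f}")
print(f"Certificates: ${certificate_total:,.0f}")
print(f"Recruitment:  ${recruitment_revenue:,.0f}")
print(f"Total:        ${total_revenue:,.0f}")
```

Under these assumptions the components come out close to the deck's $243k, $102k and $400k, and the total rounds to the stated $746k.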
[Figure: Total Revenue and Total Cost from 6m to 30m, y-axis $0 to $800,000]
Total Cost: $316,000 at 30 months
Cost shares: 21%, 53%, 11%, 5.5%, 8.5% Others
Cost categories: Rent, Payment, Wages, Promotion
Financial Cost
• Ruby on Rails: high-productivity web application framework
• Node.js: platform for real-time, scalable network applications
• R: open-source statistical language
• Angular.js: MVC JavaScript framework for single-page applications, maintained by Google
• Connections between the components: WebSockets, AJAX requests, R serve, RESTful API
DataMind leverages state-of-the-art, open-source frameworks in the cloud
How It Works
- Based on the S language, which originated at Bell Labs
- > 2 million data analysts are using R intensively
- Growing exponentially at 40% to 60% per annum
- An important driver of this fast-paced growth is the popularity of R in universities
- The Rexer Analytics 2011 Data Miner Survey indicates R as the #1 most commonly used software for data (Figure 1)
R statistical open-source language
Source: http://r4stats.com/articles/popularity/ and http://prezi.com/s1qrgfm9ko4i/the-r-ecosystem/
Exponentially Growing User Base
> 2 million users, growing at 40-60% yearly
Exponentially Growing Functionality of R
6,275 R packages at all major repositories, 4,315 of which were at CRAN
Across a broad spectrum of domains: Financial engineering, biostatistics, data mining, …
Source: http://r4stats.com/articles/popularity/
Filled out by 286 Academics, professionals and students from around the globe.
Majority of respondents interested
in free interactive courses
Most package authors willing to create
free interactive tutorials
Full data set of the survey and discussion of results at www.datamind.org/survey
Survey on R and education to verify interest of community
Place: Analysis of statistics.com user base
Place: Rstudio User base
Extract of www.statistics.com Price List regarding R Training
EDUCATION
7 trillion dollar industry ready for disruption
• 570x online advertising market
• 7x global mobile industry
Rapid Rise E-Learning Students
“Vastly improved technology and increased student drop out rates have set the stage for disruption”
An e-learning market of $91bn in 2012
• 17.0% e-learning enrollment growth rate vs. 1.2% overall higher education enrollment
• 11 million online-only students in the US
• By 2019, 50% of all classes taught will be offered online
GROWING FASTER THAN
E-learning market is the fastest growing market in education
• 23% CAGR to 2017
84%
$1tr
E-learning reduces cost with
Cost-increase since 2000
Student loan debt US
40%
240%
Increase e-learning investments by US companies
Education’s Internet Moment is Now
The Data in the World is Growing
• 90% of the data in the world has been created in the last 2 years
• 2.5 quintillion bytes created per day by business and consumers
• 1.35 billion Google results for ‘what is big data’
• 112 million blog posts discussing big data
Business Analytics Business is Growing
• 68% of US and UK executives committed to analytics
• Market for big data technology and services grows from $3.2 billion to $16.9 billion between 2010 and 2015, a 40% CAGR
• Two billion dollars revenue increase for Fortune 1000 companies with a 10% increase in usability of data
• 2/3rds (67%) of North American businesses see big data as a concern within 5 years
But a shortage in talent and skills
• 70%: Fortune 1000 companies planning to hire data analysts in the near future
• 60%: extremely difficult to find data analysts and skills
• 91%: hiring new people from outside their organization
• 69%: retraining existing data analytical skills
Data Analytics is booming
R-community related:
Predecessor(s): Revolution Computing
Founded: 2007
Headquarters: Palo Alto, CA, United States
Key people: David Rich, CEO
Products: Revolution R
Website: www.revolutionanalytics.com
(Interactive) learning related:
Exit strategy: Take-over potential
1. Certifications
students pay money to receive certification
2. Authentic assessments
students pay money to have their learning assessed and certified
3. Recruitment
4. Screening
companies and educational institutions pay to gain access to student records
5. Human tutoring
students pay a tutor to help them achieve the desired learning outcomes
6. Corporate learning
companies pay money to get customized courses
7. Sponsorship
sponsors pay money to have their advertising appear beside course materials
8. Tuition fees
students pay tuition fees for advanced level learning
Coursera Contract:
http://chronicle.com/article/Document-Examine-the-U-of/133063/
8 monetization strategies stipulated by
Coursera

DataMind Pitch August 2013

  • 1. An interactive e-learning platform for Data Analysis based on R
  • 2. Albert Jorissen Martijn Theuwissen Dieter De Mesmaeker Jonathan Cornelissen Jonathan@datamind.org Dieter@datamind.org Albert@datamind.org Martijn@datamind.org Who’s who
  • 3. Why e-learning with and for R? Need for scalable tools to learn R and Data Analysis…
  • 4. Because of exponentially growing R user base More than 2 million R users growing at 40-60% yearly Source: http://r4stats.com/articles/popularity/ and http://prezi.com/s1qrgfm9ko4i/the-r-ecosystem/
  • 5. Keyword Competition Global2Monthly2Searches r"tutorial 0 6600 introduction"to"r 0 1600 online"statistics"course 0.98 1600 ggplot2"tutorial 0 880 statistics"course 0.85 880 an"introduction"to"r 0.01 880 r"book 0.06 590 learning"statistics 0.38 590 r"tutorials 0 590 r"introduction 0.01 480 statistics"courses 0.84 480 statistics"introduction 0.1 480 online"statistics"courses 0.99 320 r"course 0.04 260 r"training 0.17 260 free"online"statistics"course 0.56 260 statistics"training 0.62 210 online"statistics"class 0.98 170 statistics"class"online 0.98 140 data"analysis"tutorial 0.5 110 Analysis of r-project.org Analysis of Google keywords Compare to: SAS tutorial: 4400 Eviews tutorial: 390 Stata tutorial: 1900 Matlab tutorial: 22200 Hadoop tutorial: 12100 Source: Analysis based on http://cran.r-project.org/report_cran.html Source: Analysis based on http://adwords.google.com/select/keywordtoolexternal That needs to learn the basics and the specifics of R • Number of downloads per month for: • Introduction to R pdfs: 140.000 • Summary pdfs: 50.000 • Some of the “top” package: (reliability/stability of numbers below?) kernlab.pdf 349,780 party.pdf 167,396 igraph.pdf 59,969 VennDiagram.pdf 30,889 mclust.pdf 19,347 KnitR.pdf 10,697 twitteR.pdf 7,507 randomForest.pdf 6,824 Ggplot2 5,924 raster.pdf 5,326
  • 6. Source: http://r4stats.com/articles/popularity/ 6,275 R packages at all major repositories, 4,315 of which were at CRAN Across a broad spectrum of domains: Financial engineering, biostatistics, data mining, … Because of the exponentially growing functionality
  • 7. • Great books, tutorials,… on R • But coding is learned by doing • No online learning interface for R • Documentation made by experts for experts, not for beginners or intermediate users Teachers : Learners : • Often give the same or similar feedback to students in exercise sessions • Manually correct assignments • Static content • Hard to get feedback Students, Professionals, Researchers, Employees Why e-learning with and for R? Data Analysis Professors, Consultants, Researchers, Book authors
  • 9. DataMind leverages state-of-the-art open-source frameworks in the cloud: Ruby on Rails (high-productivity web application framework); Node.js (platform for real-time scalable network applications); R (open-source statistical language); Angular.js (MVC JavaScript framework for single-page applications, maintained by Google). Connections shown: WebSockets, AJAX requests, RESTful API, Rserve
  • 11. Try the first course “Summer of R” at www.datamind.org! 1. Read Assignment
  • 12. 2. Read instructions
  • 13. 3. Write R code in the browser
  • 14. 4. Pre-exercise code is run in the background to pre-load a dataset, graphs, etc.
  • 15. 5. If a student gets stuck, they can look at a solution
  • 16. 6. The student submits an answer and the R code is executed
  • 17. 7. Output is shown in the R console below, and the student gets feedback on the assignment
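The exercise flow in steps 4 to 7 can be sketched in a few lines. This is only an illustration under stated assumptions: the platform itself executes R, while Python is used here as a stand-in, and `run_exercise`, its parameters and the feedback messages are hypothetical names that do not come from the slides or from DataMind's implementation.

```python
import contextlib
import io


def run_exercise(pre_exercise_code, student_code, solution_code, check_variable):
    """Run one exercise and return (console_output, feedback).

    Hypothetical helper: all names here are illustrative assumptions.
    """
    # Step 4: pre-exercise code runs first, e.g. to pre-load a dataset.
    student_env = {}
    exec(pre_exercise_code, student_env)

    # Step 6: execute the submitted code and capture what it prints.
    console = io.StringIO()
    try:
        with contextlib.redirect_stdout(console):
            exec(student_code, student_env)
    except Exception as error:
        return console.getvalue(), f"Your code raised an error: {error}"

    # Run the reference solution in a separate, identically prepared environment.
    solution_env = {}
    exec(pre_exercise_code, solution_env)
    with contextlib.redirect_stdout(io.StringIO()):
        exec(solution_code, solution_env)

    # Step 7: compare one named result and phrase the feedback.
    if check_variable not in student_env:
        feedback = f"Define a variable called '{check_variable}'."
    elif student_env[check_variable] == solution_env[check_variable]:
        feedback = "Correct!"
    else:
        feedback = f"'{check_variable}' does not have the expected value yet."
    return console.getvalue(), feedback


output, feedback = run_exercise(
    pre_exercise_code="heights = [150, 160, 170]",
    student_code="mean_height = sum(heights) / len(heights)\nprint(mean_height)",
    solution_code="mean_height = sum(heights) / len(heights)",
    check_variable="mean_height",
)
print(output, feedback)
```

Comparing a single named variable against a reference solution is the simplest possible check. A real grader would also need sandboxing and time limits before executing submitted code.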
  • 18. Very positive reactions to the early-stage prototype launched mid-July • Students: already >3,000 happy registered students with 0 marketing budget and only a very limited course -> confirmation of market need. More than 40,000 exercises were completed and 47% of all visitors return to the website -> confirmation of engaging experience for students • Content creators -> Interest from high-profile academics -> Interest from high-profile companies
  • 19. Course creators attracted based on prototype • Eric Zivot – University of Washington: has a Comp Finance course on Coursera with 150k students yearly; is recruiting a PhD student to start building an interactive course on DataMind • Ramnath Vaidyanathan – McGill University: providing strategic and technical advice, and an interactive online complement to his book on interactive graphics/slides with R • Ista Zahn – R workshops at Harvard University • Katrien Antonio – KULeuven: a life insurance course with R • Frank DiTraglia – University of Pennsylvania • Marc Carlson – Bioconductor core member and Data Analyst at Fred Hutchinson Cancer Research Center • Ajay Ohri – Author of popular “R for Business Analytics” book • Stephen Davies – University of Mary Washington • Dai, Zhuo Jia – Consultant interested in creating credit scoring course • Geert Molenberghs – KULeuven: interested, Skype call planned
  • 20. Reactions “Sounds terrific that you think there might be an opportunity to collaborate on this. Of all the learning platforms for R out there, I strongly believe DataMind has the best interface, and I would be happy to help push things further by collaborating.” Prof. Ramnath V. – McGill University “It captures the imagination to be able to contribute to this (r)evolution.” Prof. Katrien Antonio – UVA & KU Leuven “At the moment, I use static html tutorials composed in R Markdown. Something interactive would clearly be much better for helping the students learn.” Prof. Francis DiTraglia – University of Pennsylvania …
  • 21. Interest from both Companies & Publishers • Publishers • Companies
  • 22. Roadmap – coming months –> focus on user growth and business development • User growth through content creation > Attract content creators (further target Coursera professors & interesting niches) > Aid interactive course development • User growth through gamification • User growth through marketing side projects > Rfiddle > Rdocumentation • Business development > Attract company courses > Partnerships • IT development – Formal launch in September > Implementation of gamification > Challenges > Course creation interface improvement
  • 24. Value Proposition to Stakeholders. Content Creators: efficient & scalable course delivery in the cloud. Learners: affordable, up-to-date, interactive data analysis training. Corporations: find, train and certify the data analysts they require. (26/06/13, Vlerick Business School)
  • 25. Competitive analysis for our value proposition: Content Creators. Traditional Institutions; Digital Content Providers; Non-Digital Content Providers
  • 26. Competitive analysis for our value proposition: Learners. Online Data Analytics; Online Interactive; Traditional Institutions
  • 27. Competitive analysis for our value proposition: Corporations. Recruitment Agencies; Certification; Data analytics training providers
  • 29. Current Documentation
  • 30. Rdocumentation.org
  • 31. Rdocumentation.org
  • 32. Rdocumentation.org “This is a great idea!” David L Carlson, Associate Professor of Anthropology, Texas A&M University. “I tried it out and it is fantastic! Your Rdocumentation brought to my attention 19 other subset functions aside from base and it presents even the base documentation in a better way. Thank you very much.” Rees Morrison, General Counsel Metrics, LLC, Management consulting and Data Analytics. “That is pretty neat. I think it is the nicest search and way to interact with R documentation that I have seen.” Joshua Wiley, Ph.D. Student, Health Psychology, University of California, Los Angeles (UCLA); Senior Analyst, Elkhart Group Ltd.
  • 33. Distribution Quantification (chart of R users from 0 to 5,000,000 over 0 to 30 months). 4,638,000 R users within 30 months at an annual growth rate of 40%: 2,231,000 novice users, 1,662,000 intermediate users, 744,000 expert users. Predicted expected scenario, with market shares (M.S.) of 2.5%, 7.5% and 12%: 18,000 expert users, 124,000 intermediate users, 267,000 novice users; 411,000 users after 30 months
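The projection on this slide can be reproduced with compound growth. The sketch below is an assumption-laden check, not the authors' model: it takes the 2 million user base quoted elsewhere in the deck as the starting point, and it infers which market share belongs to which segment from the arithmetic (the flattened slide text does not say so explicitly).

```python
# Assumptions: 2,000,000 starting users (the "> 2 million" quoted in the deck)
# and 40% annual growth compounded over 30 months.
BASE_USERS = 2_000_000
ANNUAL_GROWTH = 0.40
MONTHS = 30

projected_users = BASE_USERS * (1 + ANNUAL_GROWTH) ** (MONTHS / 12)

# Segment sizes and captured users as printed on the slide.
segment_size = {"novice": 2_231_000, "intermediate": 1_662_000, "expert": 744_000}
captured = {"novice": 267_000, "intermediate": 124_000, "expert": 18_000}

# Implied market share per segment (pairing inferred, not stated on the slide).
implied_share = {name: captured[name] / segment_size[name] for name in segment_size}

print(f"Projected R users after {MONTHS} months: {projected_users:,.0f}")
for name, share in implied_share.items():
    print(f"{name}: {share:.1%} market share")
print(f"Captured users (sum of segments): {sum(captured.values()):,}")
```

The compound-growth figure lands within a thousand of the slide's 4,638,000, and the implied shares come out at roughly 12%, 7.5% and 2.5% for novice, intermediate and expert users. The segment figures sum to 409,000, slightly below the slide's 411,000, presumably because the slide rounds each segment.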
  • 34. Revenue Generation: three revenue streams from users. Recruitment: targeted job advertising to users ($2500); fee per placement ($15k). Paid courses: price determined by course creator ($100); percentage-based fee (24%). Certificates: joint certificates with educational institutions ($100)
  • 35. Financial: revenue breakdown according to user type (figures listed as expert / intermediate / novice). 18-24 months, 137,000 total users (13,700 expert, 55,000 intermediate, 68,700 novice): paid courses 923 / 2770 / 923; certificates 140 / 280 / 47; recruitment 4 / 6 / 0; job advertising 8 / 12 / 0; revenue per user type $166k / $214k / $27k; revenue per stream (paid courses, certificates, recruitment, job advertising) $110k, $46k, $150k, $50k; total revenue $357,000. 24-30 months, 411,000 total users (18,000 expert, 124,000 intermediate, 267,000 novice): paid courses 2031 / 6093 / 2031; certificates 307 / 615 / 102; recruitment 16 / 24 / 0; job advertising 8 / 12 / 0; revenue per user type $239k / $447k / $59k; revenue per stream (paid courses, certificates, recruitment, job advertising) $243k, $102k, $100k, $300k; total revenue $746,000
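The paid-course and certificate totals on this slide follow from the prices on the Revenue Generation slide. The sketch below is a hedged reconstruction: it assumes that every paid course sells at $100 with a 24% platform fee and that the full $100 per certificate counts as revenue. The recruitment and job-advertising rows are left out because the flattened slide text does not make their pairing with prices unambiguous.

```python
# Prices from the Revenue Generation slide; applying them uniformly is an assumption.
COURSE_PRICE = 100          # dollars per paid course
PLATFORM_FEE_PERCENT = 24   # share of the course price kept by the platform
CERTIFICATE_PRICE = 100     # dollars per certificate

# Volumes per period as (expert, intermediate, novice), copied from the slide.
volumes = {
    "18-24 months": {"paid_courses": (923, 2770, 923), "certificates": (140, 280, 47)},
    "24-30 months": {"paid_courses": (2031, 6093, 2031), "certificates": (307, 615, 102)},
}


def stream_revenue(period):
    """Return (paid_course_fee_revenue, certificate_revenue) in dollars."""
    counts = volumes[period]
    course_fees = sum(counts["paid_courses"]) * COURSE_PRICE * PLATFORM_FEE_PERCENT // 100
    certificates = sum(counts["certificates"]) * CERTIFICATE_PRICE
    return course_fees, certificates


for period in volumes:
    course_fees, certificates = stream_revenue(period)
    # The slide appears to truncate to whole thousands.
    print(f"{period}: paid courses ${course_fees // 1000}k, certificates ${certificates // 1000}k")
```

Under these assumptions the results truncate to $110k and $46k for 18-24 months and to $243k and $102k for 24-30 months, matching the slide's stream totals.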
  • 36. Revenue breakdown according to 3 main drivers and 3 target groups: $746k revenue generated by 411k users between 24-30 months. Recruitment (54%): 60% intermediate (241k), 40% experts (160k). Paid courses (32%): 60% intermediate (146k), 20% expert (48.5k), 20% beginners (48.5k). Certificates (14%): 60% intermediate (61k), 30% expert (31k), remainder (10k)
  • 37. Financial Cost (chart of total revenue and total cost, $0 to $800,000, from 6 to 30 months). Cost shares shown: 21%, 53%, 11%, 5.5%, 8.5%; categories shown: Others, Rent, Payment, Wages, Promotion. $316,000 at 30 months
  • 38. How It Works: DataMind leverages state-of-the-art open-source frameworks in the cloud: Ruby on Rails (high-productivity web application framework); Node.js (platform for real-time scalable network applications); R (open-source statistical language); Angular.js (MVC JavaScript framework for single-page applications, maintained by Google). Connections shown: WebSockets, AJAX requests, RESTful API, Rserve
  • 39. R statistical open-source language - Based on the S language, which originated at Bell Labs - > 2 million data analysts are using R intensively - Growing exponentially at 40% to 60% per annum - An important driver of this fast-paced growth is the popularity of R in universities - The Rexer Analytics 2011 Data Miner Survey indicates R as the #1 most commonly used software for data (Figure 1).
  • 40. Exponentially Growing User Base: > 2 million users, growing 40-60% yearly. Source: http://r4stats.com/articles/popularity/ and http://prezi.com/s1qrgfm9ko4i/the-r-ecosystem/
  • 41. Exponentially Growing Functionality of R: 6,275 R packages at all major repositories, 4,315 of which were at CRAN. Across a broad spectrum of domains: financial engineering, biostatistics, data mining, … Source: http://r4stats.com/articles/popularity/
  • 42. Survey on R and education to verify interest of community: filled out by 286 academics, professionals and students from around the globe. Majority of respondents interested in free interactive courses. Most package authors willing to create free interactive tutorials. Full data set of the survey and discussion of results at www.datamind.org/survey
  • 43. Place: Analysis of statistics.com user base
  • 44. Place: Rstudio User base
  • 46. Extract of www.statistics.com Price List regarding R Training
  • 47. Education’s Internet Moment is Now. EDUCATION: 7 trillion dollar industry ready for disruption; 570x online advertising market; 7x global mobile industry. “Vastly improved technology and increased student drop out rates have set the stage for disruption.” Rapid rise of e-learning students: an e-learning market of $91bn in 2012; 17.0% e-learning enrollment growth rate vs. 1.2% overall higher education enrollment; 11 million online-only students in the US by 2019; 50% of all classes taught will be offered online. GROWING FASTER THAN: e-learning market is the fastest growing market in education, 23% CAGR to 2017. 84% $1tr E-learning reduces cost with Cost-increase since 2000 Student loan debt US 40% 240% Increase e-learning investments by US companies
  • 48. Data Analytics is booming: The Data in the World is Growing; Business Analytics Business is Growing; But a shortage in talent and skills. Figures shown: 90% of the data in the world has been created in the last 2 years; 1.35 billion Google results for ‘what is big data’; 112 million blog posts discussing big data; 68% of US and UK executives committed to analytics; 2.5 quintillion bytes created per day by business and consumers; market for big data technology and services grows from $3.2 billion to $16.9 billion between 2010 and 2015, a 40% CAGR; two billion dollars revenue increase for Fortune 1000 companies if they increase the usability of their data by 10%. 2/3rds 67% North-America businesses see big data as a concern within 5 years 70% Fortune 1000 companies planning to hire data analysts in near future 60% Extremely difficult to find data analysts and skills 91% Hiring new people from outside their organization 69% Retraining existing data analytical skills
  • 49. Exit strategy: Take-over potential. R-community related: Predecessor(s): Revolution Computing; Founded: 2007; Headquarters: Palo Alto, CA, United States; Key people: David Rich, CEO; Products: Revolution R; Website: www.revolutionanalytics.com. (Interactive) learning related:
  • 50. 8 monetization strategies stipulated by Coursera: 1. Certifications: students pay money to receive certification. 2. Authentic assessments: students pay money to have their learning assessed and certified. 3. Recruitment. 4. Screening: companies and educational institutions pay to gain access to student records. 5. Human tutoring: students pay a tutor to help them achieve the desired learning outcomes. 6. Corporate learning: companies pay money to get customized courses. 7. Sponsorship: sponsors pay money to have their advertising appear beside course materials. 8. Tuition fees: students pay tuition fees for advanced level learning. Coursera Contract: http://chronicle.com/article/Document-Examine-the-U-of/133063/