2. Who’s who
Albert Jorissen: Albert@datamind.org
Martijn Theuwissen: Martijn@datamind.org
Dieter De Mesmaeker: Dieter@datamind.org
Jonathan Cornelissen: Jonathan@datamind.org
3. Why e-learning with and for R?
Need for scalable tools to learn R and data analysis…
4. Because of the exponentially growing R user base
More than 2 million R users, growing at 40-60% yearly
Source: http://r4stats.com/articles/popularity/ and http://prezi.com/s1qrgfm9ko4i/the-r-ecosystem/
5. That needs to learn the basics and the specifics of R
Analysis of Google keywords
Keyword                        Competition  Global monthly searches
r tutorial                     0            6600
introduction to r              0            1600
online statistics course       0.98         1600
ggplot2 tutorial               0            880
statistics course              0.85         880
an introduction to r           0.01         880
r book                         0.06         590
learning statistics            0.38         590
r tutorials                    0            590
r introduction                 0.01         480
statistics courses             0.84         480
statistics introduction        0.1          480
online statistics courses      0.99         320
r course                       0.04         260
r training                     0.17         260
free online statistics course  0.56         260
statistics training            0.62         210
online statistics class        0.98         170
statistics class online        0.98         140
data analysis tutorial         0.5          110
Compare to:
SAS tutorial: 4400
Eviews tutorial: 390
Stata tutorial: 1900
Matlab tutorial: 22200
Hadoop tutorial: 12100
Source: Analysis based on http://adwords.google.com/select/keywordtoolexternal
Analysis of r-project.org
• Number of downloads per month for:
• Introduction to R pdfs: 140,000
• Summary pdfs: 50,000
• Some of the “top” packages:
(reliability/stability of numbers below?)
kernlab.pdf 349,780
party.pdf 167,396
igraph.pdf 59,969
VennDiagram.pdf 30,889
mclust.pdf 19,347
KnitR.pdf 10,697
twitteR.pdf 7,507
randomForest.pdf 6,824
ggplot2 5,924
raster.pdf 5,326
Source: Analysis based on http://cran.r-project.org/report_cran.html
6. Because of the exponentially growing functionality
6,275 R packages at all major repositories, 4,315 of which were at CRAN
Across a broad spectrum of domains: financial engineering, biostatistics, data mining, …
Source: http://r4stats.com/articles/popularity/
7. Why e-learning with and for R?
Learners: Students, Professionals, Researchers, Employees
• Great books, tutorials,… on R
• But coding is learned by doing
• No online learning interface for R
• Documentation made by experts for experts, not for beginners or intermediate users
Teachers: Data Analysis Professors, Consultants, Researchers, Book authors
• Often give the same or similar feedback to students in exercise sessions
• Manually correct assignments
• Static content
• Hard to get feedback
9. DataMind leverages state-of-the-art open-source frameworks in the cloud
Ruby on Rails: high-productivity web application framework
Node.js: platform for real-time, scalable network applications
R: open-source statistical language
Angular.js: MVC JavaScript framework for single-page applications, maintained by Google
Connections between components: WebSockets, AJAX requests, Rserve, RESTful API
13. 3. Write R code in the browser
Try the first course “Summer of R” at www.datamind.org!
14. 4. Pre-exercise code is run in the background to pre-load a dataset, graphs, etc.
15. 5. If students get stuck, they can look at a solution
16. 6. The student submits an answer and the R code is executed
17. 7. Output is shown in the R console below, and the student gets feedback on the assignment
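The exercise flow on these slides (pre-exercise code, submission, execution, console output, feedback) can be sketched as below. This is a minimal illustration written in Python rather than R, and it is not DataMind's implementation: the `run_exercise` function, the `Result` type and the rule of comparing one named variable against the solution are all assumptions made for the example.

```python
import contextlib
import io
from dataclasses import dataclass


@dataclass
class Result:
    output: str      # text that would be shown in the console
    correct: bool    # whether the submission matched the solution
    feedback: str    # message shown to the student


def run_exercise(pre_code: str, student_code: str, solution_code: str,
                 check_var: str) -> Result:
    """Run a submission and compare one variable against the solution.

    Hypothetical checking rule: the exercise counts as correct when the
    variable named `check_var` has the same value after the student code
    as after the solution code.
    """
    # Pre-exercise code runs first, in the background, to prepare each workspace.
    student_env: dict = {}
    solution_env: dict = {}
    exec(pre_code, student_env)
    exec(pre_code, solution_env)

    # Execute the submission and capture what it prints for the console.
    console = io.StringIO()
    try:
        with contextlib.redirect_stdout(console):
            exec(student_code, student_env)
    except Exception as err:
        return Result(console.getvalue(), False, f"Your code raised an error: {err}")

    # Run the reference solution silently in its own workspace.
    with contextlib.redirect_stdout(io.StringIO()):
        exec(solution_code, solution_env)

    if check_var not in student_env:
        return Result(console.getvalue(), False, f"Define a variable named {check_var}.")
    correct = student_env[check_var] == solution_env[check_var]
    feedback = "Well done!" if correct else f"{check_var} does not have the expected value."
    return Result(console.getvalue(), correct, feedback)


result = run_exercise(
    pre_code="scores = [4, 8, 6]",
    student_code="avg = sum(scores) / len(scores)\nprint(avg)",
    solution_code="avg = sum(scores) / len(scores)",
    check_var="avg",
)
print(result.output, result.feedback)
```

Plain `exec` is used only to keep the sketch short; a hosted service running untrusted submissions would need to isolate execution in a sandboxed process.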
18. Very positive reactions to the early-stage prototype (launch mid-July)
• Students
Already >3000 happy registered students
with 0 marketing budget and only a very limited course
-> confirmation of market need
More than 40,000 exercises were completed
and 47% of all visitors return to the website
-> confirmation of engaging experience for students
• Content creators
-> Interest from high-profile academics
-> Interest from high-profile companies
19. Course creators attracted based on prototype
• Eric Zivot – University of Washington
Has Comp Finance course on Coursera with 150k students yearly
Is recruiting a PhD student to start building an interactive course on DataMind
• Ramnath Vaidyanathan – McGill University
Providing strategic and technical advice, and interactive online complement to his book
on interactive graphics/slides with R.
• Ista Zahn – R workshops at Harvard University
• Katrien Antonio – KULeuven
A life insurance course with R
• Frank DiTraglia – University of Pennsylvania
• Marc Carlson – Bioconductor core member and Data Analyst at Fred Hutchinson
Cancer Research Center
• Ajay Ohri – Author of popular “R for Business Analytics” book
• Stephen Davies – University of Mary Washington
• Dai, Zhuo Jia – Consultant interested in creating credit scoring course
• Geert Molenberghs – KULeuven - Interested, Skype call planned
20. Reactions
“Sounds terrific that you think there might be an opportunity to
collaborate on this. Of all the learning platforms for R out there, I
strongly believe DataMind has the best interface, and I would be happy
to help push things further by collaborating.”
Prof. Ramnath V. – McGill University
“It captures the imagination to be able to contribute to this
(r)evolution.”
Prof. Katrien Antonio – UVA & KU Leuven
“At the moment, I use static html tutorials composed in R Markdown.
Something interactive would clearly be much better for helping the
students learn.”
Prof. Francis DiTraglia – University of Pennsylvania
…
22. Roadmap – coming months
–> focus on user growth and business development
• User Growth through content creation
> Attract content creators
(further target Coursera professors & interesting niches)
> Aid interactive course development
• User growth through gamification
• User growth through marketing side projects
> Rfiddle
> Rdocumentation
• Business development
> Attract company courses
> Partnerships
• IT development – Formal launch in September
> Implementation of gamification
> Challenges
> Course creation interface improvement
24. Value Proposition to Stakeholders
Content Creators: efficient & scalable course delivery in the cloud
Learners: affordable, up-to-date, interactive data analysis training
Corporations: find, train and certify the data analysts they require
26/06/13 Vlerick Business School 24
32. Rdocumentation.org
“This is a great idea!”
David L Carlson, Associate Professor of Anthropology, Texas A&M University
“I tried it out and it is fantastic! Your Rdocumentation brought to my
attention 19 other subset functions aside from base and it presents even
the base documentation in a better way. Thank you very much.”
Rees Morrison
General Counsel Metrics, LLC
Management consulting and Data Analytics
“That is pretty neat. I think it is the nicest search and way to
interact with R documentation that I have seen.”
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles (UCLA)
Senior Analyst - Elkhart Group Ltd.
34. Revenue Generation
USERS
• RECRUITMENT
✓ Targeted job advertising to users: $2500
✓ Fee per placement: $15k
• PAID COURSES
✓ Price determined by course creator: $100
✓ Percentage-based fee: 24%
• CERTIFICATES
✓ Joint certificates with educational institutions: $100
35. Financial Revenue breakdown according to user type
18-24 months: 137,000 total users, Total Revenue $357,000
                       Expert   Intermediate  Novice   Revenue
Users                  13,700   55,000        68,700
Paid Courses           923      2770          923      $110k
Certificates           140      280           47       $46k
Recruitment            4        6             0        $150k
Job Advertising        8        12            0        $50k
Revenue per user type  $166k    $214k         $27k
24-30 months: 411,000 total users, Total Revenue $746,000
                       Expert   Intermediate  Novice   Revenue
Users                  18,000   124,000       267,000
Paid Courses           2031     6093          2031     $243k
Certificates           307      615           102      $102k
Job Advertising        16       24            0        $100k
Recruitment            8        12            0        $300k
Revenue per user type  $239k    $447k         $59k
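The 18-24 month total can be cross-checked from the unit prices on the Revenue Generation slide and the unit counts on this slide. The sketch below is an illustration under stated assumptions, not the authors' model: it assumes the platform earns the 24% fee on a $100 course price, that the counts 4/6/0 are placements and 8/12/0 are job advertisements, and that counts are summed over the three user types. All variable names are made up for the example.

```python
# Unit revenue per sale in USD, taken from the Revenue Generation slide.
COURSE_PRICE = 100          # price set by the course creator
PLATFORM_FEE_PERCENT = 24   # percentage-based fee kept by the platform (assumed)
CERTIFICATE_PRICE = 100
PLACEMENT_FEE = 15_000
JOB_AD_PRICE = 2_500

# Units sold in months 18-24 per user type: (expert, intermediate, novice).
# The mapping of the last two rows to streams is an assumption.
units = {
    "paid_courses": (923, 2770, 923),
    "certificates": (140, 280, 47),
    "recruitment": (4, 6, 0),
    "job_advertising": (8, 12, 0),
}

unit_revenue = {
    "paid_courses": COURSE_PRICE * PLATFORM_FEE_PERCENT // 100,
    "certificates": CERTIFICATE_PRICE,
    "recruitment": PLACEMENT_FEE,
    "job_advertising": JOB_AD_PRICE,
}

# Revenue per stream is total units across user types times the unit revenue.
revenue = {stream: sum(counts) * unit_revenue[stream]
           for stream, counts in units.items()}
total = sum(revenue.values())

for stream, amount in revenue.items():
    print(f"{stream:16s} ${amount:>10,}")
print(f"{'total':16s} ${total:>10,}")
```

Under these assumptions the four streams come to roughly $110k, $46k, $150k and $50k, in line with the slide's rounded total of $357,000.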
36. Revenue breakdown according to 3 main drivers and 3 target groups
Revenue generated by 411k users between 24-30 months: $746k
Recruitment (54%): 60% Intermediate (241k), 40% Experts (160k)
Paid Courses (32%): 60% Intermediate (146k), 20% Expert (48.5k), 20% Beginners (48.5k)
Certificates (14%): 60% Intermediate (61k), 30% Expert (31k), Beginners (10k)
39. R statistical open-source language
- Based on the S language, which originated at Bell Labs
- More than 2 million data analysts use R intensively
- Growing exponentially at 40% to 60% per annum
- An important driver of this fast-paced growth is the popularity of R in universities
- The Rexer Analytics 2011 Data Miner Survey indicates R as the #1 most commonly used software for data mining (Figure 1).
42. Survey on R and education to verify interest of community
Filled out by 286 academics, professionals and students from around the globe.
Majority of respondents interested in free interactive courses
Most package authors willing to create free interactive tutorials
Full data set of the survey and discussion of results at www.datamind.org/survey
47. Education’s Internet Moment is Now
EDUCATION: 7 trillion dollar industry ready for disruption
• 570x online advertising market
• 7x global mobile industry
“Vastly improved technology and increased student drop out rates have set the stage for disruption”
Rapid Rise E-Learning Students
• 17.0% e-learning enrollment growth rate vs. 1.2% overall higher education enrollment
• 11 million online-only students in the US
• By 2019, 50% of all classes taught will be offered online
E-learning market is the fastest growing market in education
• An e-learning market of $91bn in 2012
• 23% CAGR to 2017
GROWING FASTER THAN
84%
$1tr
E-learning reduces cost with
Cost-increase since 2000
Student loan debt US
40%
240%
Increase e-learning investments by US companies
48. Data Analytics is booming
The Data in the World is Growing
• 90% of the data in the world has been created in the last 2 years
• 2.5 quintillion bytes created per day by business and consumers
• 1.35 billion Google results for ‘what is big data’
• 112 million blog posts discussing big data
Business Analytics Business is Growing
• 68% of US and UK executives committed to analytics
• Market for big data technology and services grows from $3.2 billion to $16.9 billion between 2010 and 2015, a 40% CAGR
• Two billion dollars revenue increase for Fortune 1000 companies if usability of data increases by 10%
• 2/3rds (67%) of North American businesses see big data as a concern within 5 years
• 70% of Fortune 1000 companies planning to hire data analysts in the near future
But a shortage in talent and skills
• 60%: extremely difficult to find data analysts and skills
• 91%: hiring new people from outside their organization
• 69%: retraining existing data analytical skills
49. Exit strategy: Take-over potential
R-community related:
Predecessor(s): Revolution Computing
Founded: 2007
Headquarters: Palo Alto, CA, United States
Key people: David Rich, CEO
Products: Revolution R
Website: www.revolutionanalytics.com
(Interactive) learning related:
50. 8 monetization strategies stipulated by Coursera
1. Certifications
students pay money to receive certification
2. Authentic assessments
students pay money to have their learning assessed and certified
3. Recruitment
4. Screening
companies and educational institutions pay to gain access to student records
5. Human tutoring
students pay a tutor to help them achieve the desired learning outcomes
6. Corporate learning
companies pay money to get customized courses
7. Sponsorship
sponsors pay money to have their advertising appear beside course materials
8. Tuition fees
students pay tuition fees for advanced level learning
Coursera Contract:
http://chronicle.com/article/Document-Examine-the-U-of/133063/