MK99 – Big Data 1 
Big data & cross-platform analytics 
MOOC lectures Pr. Clement Levallois
MK99 – Big Data 2 
How to create value from data
MK99 – Big Data 3 
How to read these slides 
• 
Not a closed list, not a recipe! 
• 
Rather, these are essential building ...
MK99 – Big Data 4 
predict 
1. 
Domain-specific historical data 
2. 
Data mashups 
3. 
Feedback data on accuracy of predic...
MK99 – Big Data 5 
suggest (rec sys) 
1. 
A rich static dataset and / or historical data on past transactions 
2. 
A feedb...
MK99 – Big Data 6 
curate 
1. 
Dirty (inaccuracies, missing values, bad formatting) 
2. 
Cheap but hard to collect 
3. 
Pr...
MK99 – Big Data 7 
enrich 
1. 
Clean 
2. 
Diverse 
1. 
Knowing which cocktail of data is valued by the market 
2. 
Limit r...
MK99 – Big Data 8 
rank / match / compare 
1. 
Varied data 
1. 
Many attributes 
2. 
Large range of values 
1. 
Finding em...
MK99 – Big Data 9 
segment 
1. 
Varied data 
1. 
Many attributes 
2. 
Large range of values 
1. 
Choosing the relevant ass...
MK99 – Big Data 10 
classify 
1. 
Any dataset, the richer the better 
2. 
A feedback mechanism providing data to improve o...
MK99 – Big Data 11 
Combos! 
Curate 
Enrich 
Segment 
Rank 
(within segments) 
Enrich 
Rank 
Enrich 
Suggest 
(how to find...
MK99 – Big Data 12 
This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com) 
Conta...
Upcoming SlideShare
Loading in …5
×

7 building blocks to create value from data

2,942 views

Published on

Slides of the course on big data by C. Levallois from EMLYON Business School.
For business students. Check the online video connected with these slides.

-> It is often the case that companies realize that there is value in their data (the new oil!), but thay they don't know where to start digging to create value proposals based on their data.
This slide presentation offer 7 building blocks to start thinking about value creation with data.

Published in: Business
  • Be the first to comment

7 building blocks to create value from data

  1. 1. MK99 – Big Data 1 Big data & cross-platform analytics MOOC lectures Pr. Clement Levallois
  2. 2. MK99 – Big Data 2 How to create value from data
  3. 3. MK99 – Big Data 3 How to read these slides • Not a closed list, not a recipe! • Rather, these are essential building blocks for a strategy of value creation based on data.
  4. 4. MK99 – Big Data 4 predict 1. Domain-specific historical data 2. Data mashups 3. Feedback data on accuracy of prediction 1. Hard to start (no data yet…) 2. Risk missing the long tail 3. Over reliance 1. Data scientists for predictive algo 2. Econometricians for modelling 3. Domain specialists for heuristics and quality control 1. Predicting credit score 2. Predicting crime 3. Predicting deals The data you need The people you need The ones doing it The hard part
  5. 5. MK99 – Big Data 5 suggest (rec sys) 1. A rich static dataset and / or historical data on past transactions 2. A feedback mechanism providing data to improve on the suggestion procedure 1. Getting a proper feedback mechanism 2. Managing the serendipity problem 3. Finding the value proposition which goes beyond the simple “you purchased this, you’ll like that” 1. Data scientists 2. Domain specialists for insights into the suggestion logic Amazon’s product recommendation system Google’s “Related searches…” Retailer’s personalized recommendations The data you need The people you need The ones doing it The hard part
  6. 6. MK99 – Big Data 6 curate 1. Dirty (inaccuracies, missing values, bad formatting) 2. Cheap but hard to collect 3. Preferably unstructured 1. Slow progress 2. Must maintain continuity 3. Scaling up / right incentives for the workforce 4. Quality control 1. Cheap but not unskilled labor to perform curation (hint: curators can be the users of a free service you provide to them) 2. Data scientists for quality control 3. Domain specialists for quality control 1. Thomson Reuters curating and selling scientific data 2. Nielsen and IRI curating and selling retail data 3. ImDB curating and selling movie data The data you need The people you need The ones doing it The hard part
  7. 7. MK99 – Big Data 7 enrich 1. Clean 2. Diverse 1. Knowing which cocktail of data is valued by the market 2. Limit replicability 3. Establish legitimacy 1. Specialists of APIs / data mashups / data integration 2. Possibly: specialists of linked data 3. Domain specialists for quality control 1. Selling samples from the enriched dataset 2. Selling aggregated indicators 3. Selling dashboards The data you need The people you need The ones doing it The hard part
  8. 8. MK99 – Big Data 8 rank / match / compare 1. Varied data 1. Many attributes 2. Large range of values 1. Finding emergent, implicit attributes 2. Insuring consistency of the ranking 3. Avoid gaming of the system by the users 1. Data scientists 2. Hacker mentality to imagine how unstructured data can contribute to ranking 3. Domain specialists for quality control 1. Search engines ranking results 2. Yelp, Travelocity, etc… ranking destinations 3. Any system that needs to filter out best quality entities among a crowd of candidates The data you need The people you need The ones doing it already The hard part
  9. 9. MK99 – Big Data 9 segment 1. Varied data 1. Many attributes 2. Large range of values 1. Choosing the relevant association measure 2. Judging the quality of the segmentation 3. Dealing with boundary cases 4. Choosing between supervised vs unsupervised methods 5. Deciding whether overlapping segments (instead of disjoint segments) are allowed. 1. Data scientists (including network analysts) 2. Domain specialists for quality control 1. All industries, when doing: 1. Marketing research (segment markets) 2. Product development (segment portfolio of products) 3. Advertising (target ads to groups) The data you need The people you need The ones doing it The hard part
  10. 10. MK99 – Big Data 10 classify 1. Any dataset, the richer the better 2. A feedback mechanism providing data to improve on the classification procedure 1. Evaluating the quality of the comparison 2. Dealing with boundary cases 3. Choosing between a supervised and unsupervised approach (how many categories?) 1. Data scientists 2. Domain specialists for insights into the classification logic Detectors of spam (this email: spam or not?) Algorithmic trading (this piece of news: buy or sell?) Medical apps: computer-aided diagnosis based on a set of measurements The data you need The people you need The ones doing it The hard part
  11. 11. MK99 – Big Data 11 Combos! Curate Enrich Segment Rank (within segments) Enrich Rank Enrich Suggest (how to find most valuable customers in a CRM)
  12. 12. MK99 – Big Data 12 This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com) Contact Clement Levallois (levallois [at] em-lyon.com) for more information.

×