Open Data for Agriculture
Intro to Big Data
29/11/2013
Athens, Greece
Joint offering by

Supported by EU projects
Designing Data Products

Dr. Vassilis Protonotarios
Agro-Know Technologies, Greece
Intro
• This presentation provides introductory information
about
– the (big) data products
– the design of (big) data products,
– the Drivetrain approach for the design of objective-based
(big) data products.

• The Drivetrain approach will be applied to
agricultural case studies in the next session
• The majority of the slides were based on the report
“Big Data Now: 2012 Edition. O’Reilly Media, Inc.”
Slide 3 of 66
Objectives
This presentation aims to:
• Provide an introduction to data products
• Define the “objective-based data products” concept
– Describe the Drivetrain approach in the design of (big) Data
products
– Analyze the design of data products
– Provide applications / case studies

In order to provide the methodology for the development
of data products
Slide 4 of 66
Structure of the presentation
1. Intro to designing (great) data products
2. Objective-based data products
– The Drivetrain approach

3. Case studies (x4): Application of the
Drivetrain approach
4. The future of data products

Slide 5 of 66
INTRO TO DESIGNING GREAT DATA
PRODUCTS
What is a (big) data product?
• What happens when (big) data becomes a
product
– specifically, a consumer product

• Does it produce (big) data based on inputs?
= data producers

• Does it deliver results based on (big) data?
= data processors

• Does it use big data to provide useful outcomes?
Slide 7 of 66
Facts about (big) data products
• Enable their users to do whatever they want
– which most often has little to do with (big) data

• Replace physical products

Slide 8 of 66
The past: Predictive modeling
• Development of data products based on data
predictive modeling
– weather forecasting
– recommendation engines
– email spam filters
– services that predict airline flight times
• sometimes more accurately than the airlines
themselves.

Slide 9 of 66
The issue of predictive modeling
• Prediction technology: interesting, useful and
mathematically elegant
– BUT we need to take the next step because…

• These products just make predictions
– instead of asking what action they want someone
to take as a result of a prediction

Slide 10 of 66
The role of predictive modeling
• Great predictive modeling is still an important
part of the solution
– but it no longer stands on its own
– as products become more sophisticated, it
becomes less useful

Slide 11 of 66
A new, alternative approach
• the Drivetrain Approach
– a four-step approach, already applied in the
industry
– inspired by the emerging field of self-driving
vehicles
– Objective-based approach
http://www.popularmechanics.com/cars/how-to/repair-questions/1302716

Slide 12 of 66
A fine objective-based model

Slide 13 of 66
Case study
• A user of Google’s self-driving car
– is completely unaware of the hundreds (if not
thousands) of models and the petabytes of data
that make it work

BUT
• It is an increasingly sophisticated product built
by data scientists
– they need a systematic design approach
Slide 14 of 66
OBJECTIVE-BASED DATA PRODUCTS
The Drivetrain approach

http://cdn.oreilly.com/radar/images/posts/0312-1-drivetrain-approach-lg.png
Slide 16 of 66
Source: http://en.wikipedia.org/wiki/File:Altavista-1999.png

Case study: Search engines

Why ?

1997
2013

Slide 17 of 66
The 4 steps in the Drivetrain approach
• The four steps in this transition:
– Identify the main objective
• For Google: show the most relevant search results

– Specify the system’s manageable inputs [levers]
• For Google: ranking the results

– Consider the data needed for managing the inputs
• Information about users’ activities in other web sites

– Building the predictive models
• For Google: PageRank algorithm
Slide 18 of 66
Drivetrain approach goal

NOT
use data just to generate more data
– especially in the form of predictions

BUT
use data to produce actionable outcomes

Slide 19 of 66
[CASE STUDY 1] THE MODEL ASSEMBLY
LINE: A CASE STUDY OF INSURANCE
COMPANIES
The issue of insurance companies
• Case study: Insurance companies
– Their objective: maximizing the profit from each
policy price
– An optimal pricing model is to them what the
assembly line is to automobile manufacturing
– Despite their long experience in prediction, they
often fail to make optimal business decisions
about what price to charge each new customer

Slide 21 of 66
Transition to Drivetrain approach
• Identifying solutions to this issue
– Optimal Decisions Group (ODG) approached this
problem with an early use of the Drivetrain
Approach
– Resulted in a practical take on step 4 that can be
applied to a wide range of problems.

Model Assembly Line

Slide 22 of 66
The Drivetrain approach
Set a price for policies, maximizing profit
• price to charge each customer;
• types of accidents to cover;
• how much to spend on marketing and customer service;
• how to react to their competitors’ pricing decisions

• Data collected from real experiments on customers
– Randomly changing the prices of hundreds of thousands of policies
over many months

• Develop a probability model for optimizing the insurer’s profit
Slide 23 of 66
Slide 24 of 66

http://cdn.oreilly.com/radar/images/posts/0312-2-drivetrain-step4-lg.png

Developing the modeler:
Model Assembly Line
The role of the Modeler
• Modeler Component 1
• model of price elasticity: the probability that a customer will
accept a given price (for new policies and renewals)

• Modeler Component 2
• relates price to the insurance company’s profit, conditional
on the customer accepting this price.

• Multiplying these two curves creates a final curve
• Shows price versus expected profit
• The final curve has a clearly identifiable local maximum that
represents the best price to charge a customer for the first
year.
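
The multiplication of the two Modeler curves can be sketched in a few lines of Python. This is only an illustration, not ODG's actual model: the logistic acceptance curve, the linear profit curve and every number in it (midpoint, steepness, claims cost, price grid) are assumptions made up for the example.

```python
import math

def acceptance_probability(price, midpoint=500.0, steepness=0.02):
    """Hypothetical price-elasticity curve: chance the customer accepts `price`."""
    return 1.0 / (1.0 + math.exp(steepness * (price - midpoint)))

def profit_if_accepted(price, expected_claims_cost=300.0):
    """Hypothetical profit curve, conditional on the customer accepting."""
    return price - expected_claims_cost

def expected_profit(price):
    # Multiply the two curves: P(accept) * profit given acceptance.
    return acceptance_probability(price) * profit_if_accepted(price)

def best_price(candidate_prices):
    """Return the candidate price with the highest expected profit."""
    return max(candidate_prices, key=expected_profit)

prices = range(300, 801, 10)  # assumed grid of candidate first-year prices
optimal = best_price(prices)
print(optimal, round(expected_profit(optimal), 2))
```

A low price is almost always accepted but earns little, a high price earns a lot but is rarely accepted, so the product of the two curves peaks somewhere in between.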
Slide 25 of 66
The final curve

http://strata.oreilly.com/2012/03/drivetrain-approach-data-products.html

Slide 26 of 66
The role of the Simulator
• Lets ODG ask the “what if ” questions
• to see how the levers affect the distribution of the final
outcome

• Runs the models over a wide range of inputs
• The operator can adjust the input levers to answer specific
questions
• “What will happen if our company offers the customer a low
teaser price in year one but then raises the premiums in year two?”

• Explores the distribution of profit as affected by the
inputs outside of the insurer’s control
• E.g. “What if the economy crashes and the customer loses
his job?”
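
A "what if" run of this kind could look like the following Monte Carlo sketch. The renewal behaviour, the effect of a crash and all figures are hypothetical assumptions chosen for illustration; a real Simulator would run the fitted Modeler components instead.

```python
import random
import statistics

def simulate_two_year_profit(year1_price, year2_price, crash_probability,
                             runs=10_000, seed=0):
    """Return simulated two-year profits for one customer (hypothetical model)."""
    rng = random.Random(seed)
    claims_cost = 300.0  # assumed expected cost per policy year
    profits = []
    for _ in range(runs):
        # Input outside the insurer's control: does the economy crash?
        crashed = rng.random() < crash_probability
        # Assumed renewal behaviour: a price rise and a crash both reduce renewal.
        renew_probability = max(0.0, 0.9 - 0.001 * (year2_price - year1_price))
        if crashed:
            renew_probability *= 0.5
        profit = year1_price - claims_cost
        if rng.random() < renew_probability:
            profit += year2_price - claims_cost
        profits.append(profit)
    return profits

# "What if we offer a teaser price in year one and raise it in year two?"
for crash_probability in (0.0, 0.3):
    profits = simulate_two_year_profit(350, 550, crash_probability)
    print(crash_probability, round(statistics.mean(profits), 1))
```

The operator changes the levers (the two prices) or the uncontrollable input (the crash probability) and reads off how the whole distribution of profit shifts, not just a single prediction.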
Slide 27 of 66
The role of the Optimizer
• Takes the surface of possible outcomes and
identifies the highest point
• Finds the best outcomes
• Identifies catastrophic outcomes
– and shows how to avoid them
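
As a rough sketch, an Optimizer over two levers can be a plain grid search across the outcome surface. The surface function below, including its linear acceptance and renewal rules, is a made-up stand-in for the output of a Simulator; only the search itself is the point.

```python
from itertools import product

def expected_two_year_profit(year1_price, year2_price, claims_cost=300.0):
    """Hypothetical outcome surface over two levers (year 1 and year 2 price)."""
    # Assumed acceptance and renewal behaviour, both falling linearly with price.
    accept = min(1.0, max(0.0, 1.0 - (year1_price - 300) / 400))
    renew = min(1.0, max(0.0, 0.9 - (year2_price - year1_price) / 500))
    return accept * ((year1_price - claims_cost) + renew * (year2_price - claims_cost))

def optimize(year1_prices, year2_prices):
    """Grid search: return (best lever setting, list of loss-making settings)."""
    surface = {
        (p1, p2): expected_two_year_profit(p1, p2)
        for p1, p2 in product(year1_prices, year2_prices)
    }
    best = max(surface, key=surface.get)
    catastrophic = [levers for levers, value in surface.items() if value < 0]
    return best, catastrophic

best, catastrophic = optimize(range(250, 651, 50), range(250, 651, 50))
print("best levers:", best)
print("loss-making settings to avoid:", len(catastrophic))
```

A grid search is the simplest possible choice here; with more levers a proper numerical optimizer would be the natural replacement.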

Slide 28 of 66
Take-home message
Using a Drivetrain Approach combined with a
Model Assembly Line bridges the gap
between predictive models and actionable
outcomes.

Slide 29 of 66
[CASE STUDY 2] MARKETING INDUSTRY
RECOMMENDER SYSTEMS
Recommendation engines
• Recommendation engines
– data product based on well-built predictive
models that do not achieve an optimal objective.
– The current algorithms predict what products a
customer will like,
• based on purchase history and the histories of similar
customers.

Slide 31 of 66
The case of Amazon
• Amazon represents every purchase that has
ever been made as a giant sparse matrix
– with customers as the rows and products as the
columns.

• Once they have the data in this format, data
scientists apply some form of collaborative
filtering to “fill in the matrix.”
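
To make "filling in the matrix" concrete, here is a tiny user-based collaborative filtering sketch. It is not Amazon's algorithm; the customers, books and the choice of cosine similarity are assumptions for illustration, and the sparse matrix is stored as a dictionary holding only the non-empty cells.

```python
from collections import defaultdict
from math import sqrt

# Sparse purchase matrix: customers are rows, products are columns.
# Only the purchases that actually happened are stored (all names are made up).
purchases = {
    "alice": {"book_a", "book_b"},
    "bob": {"book_a", "book_b", "book_c"},
    "carol": {"book_b", "book_c"},
    "dave": {"book_d"},
}

def cosine_similarity(items_x, items_y):
    """Cosine similarity between two binary purchase rows."""
    if not items_x or not items_y:
        return 0.0
    return len(items_x & items_y) / sqrt(len(items_x) * len(items_y))

def predict_scores(customer):
    """Score products the customer has not bought, using similar customers."""
    scores = defaultdict(float)
    for other, other_items in purchases.items():
        if other == customer:
            continue
        similarity = cosine_similarity(purchases[customer], other_items)
        for item in other_items - purchases[customer]:
            scores[item] += similarity
    return dict(scores)

print(predict_scores("alice"))
```

The scores estimate the empty cells of one row: products bought by customers with similar histories score higher than products bought only by dissimilar customers.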

Slide 32 of 66
The case of Amazon
• Such models are good at predicting whether a
customer will like a given product
– but they often suggest products that the customer
already knows about or has already decided not
to buy.

http://strata.oreilly.com/2012/03/drivetrain-approach-data-products.html

Slide 33 of 66
Mixed-up recommendations

Slide 34 of 66
The Drivetrain approach
Drive additional sales
by surprising and delighting the customer with books they would not have
considered without the recommendation

• Ranking of the recommendations

Data derived from many randomized experiments on a
wide range of recommendations for a wide range of
customers,
in order to generate recommendations that will cause new sales

• Develop an algorithm providing recommendations which escape a
recommendation filter bubble
Slide 35 of 66
Slide 36 of 66

http://cdn.oreilly.com/radar/images/posts/0312-2-drivetrain-step4-lg.png

Developing the modeler
The role of the Modeler
• Modeler Component 1
• purchase probabilities, conditional on seeing a recommendation

• Modeler Component 2
• purchase probabilities, conditional on not seeing a
recommendation

• The difference between these two probabilities is a
utility function for a given recommendation to a
customer
• Low in cases where the algorithm recommends a familiar book
that the customer has already rejected (both components are
small) or a book that he/she would have bought even without
the recommendation
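
The utility function described above is simply the difference of the two modelled probabilities. In the sketch below the three books and their probabilities are invented to mirror the cases on the slide; in practice they would come from the two Modeler components.

```python
def recommendation_utility(p_buy_with_rec, p_buy_without_rec):
    """Utility of a recommendation: the extra purchase probability it causes."""
    return p_buy_with_rec - p_buy_without_rec

# Hypothetical Modeler outputs for one customer:
# book -> (P(buy | recommended), P(buy | not recommended)).
modelled = {
    "already_rejected": (0.02, 0.01),   # both components are small
    "would_buy_anyway": (0.90, 0.88),   # bought even without the recommendation
    "pleasant_surprise": (0.35, 0.05),  # the recommendation changes the outcome
}

ranked = sorted(
    modelled,
    key=lambda book: recommendation_utility(*modelled[book]),
    reverse=True,
)
print(ranked)
```

Ranking by this difference, rather than by the raw purchase probability, is what pushes the "would buy anyway" book down the list even though it is the most likely purchase.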
Slide 37 of 66
The final curve

http://strata.oreilly.com/2012/03/drivetrain-approach-data-products.html

Slide 38 of 66
The role of the Simulator
• Test the utility of each of the many possible
books in stock
• Alternatively, test just over all the outputs of a
collaborative filtering model of similar
customer purchases

Slide 39 of 66
The role of the Optimizer
• Rank and display the recommended books
based on their simulated utility
• Less emphasis on the “function” and more on
the “objective.”
– What is the objective of the person using our data
product?
– What choice are we actually helping him or her
make?
Slide 40 of 66
[CASE STUDY 3] OPTIMIZING
LIFETIME CUSTOMER VALUE
Customer value
• Includes all interactions between a retailer and
his customers outside the actual buy-sell
transaction
– making a product recommendation
– encouraging the customer to check out a new feature
of the online store
– sending sales promotions

• Making the wrong choices comes at a cost to the
retailer
– reduced margins (discounts that do not drive extra
sales)
– opportunity costs

Slide 42 of 66
The Drivetrain approach
optimize the lifetime value from each
customer
• Product recommendations

• Offer tailored discounts / special offers on products
• Make customer-care calls just to see how the user is doing
• Invite them to use the site and ask for their feedback

Zafu approach:
do not send customers directly to clothes; instead, ask a series of simple questions
about the customers’ body type, how well their other jeans fit and their fashion
preferences

• Develop an algorithm leading the customer to browse a recommended selection
of Zafu’s inventory
Slide 43 of 66
Slide 44 of 66
Slide 45 of 66

http://cdn.oreilly.com/radar/images/posts/0312-2-drivetrain-step4-lg.png

Developing the modeler
The role of the Modeler
• Modeler Component 1
• purchase probabilities, conditional on seeing a
recommendation

• Modeler Component 2
• purchase probabilities, conditional on not seeing a
recommendation

• Modeler Component 3
• price elasticity model to test how offering a discount might
change the probability that the customer will buy the item

• Modeler Component 4
• patience model for the customers’ tolerance for poorly
targeted communications
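
One possible way to combine the four components is to score each candidate action by its expected gain now minus the expected loss of future value if the customer runs out of patience. The formula, the action list and every number below are assumptions for illustration, not something taken from the source report.

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    purchase_probability: float   # from the recommendation / elasticity models
    margin_if_bought: float       # margin after any discount
    annoyance_probability: float  # from the patience model: chance the customer leaves

def expected_value(action, future_customer_value):
    """Expected gain now, minus the expected loss of future customer value."""
    gain = action.purchase_probability * action.margin_if_bought
    loss = action.annoyance_probability * future_customer_value
    return gain - loss

# Hypothetical model outputs for one customer.
actions = [
    Action("no contact", 0.05, 20.0, 0.00),
    Action("recommendation", 0.12, 20.0, 0.01),
    Action("recommendation + 25% discount", 0.30, 10.0, 0.01),
]

best = max(actions, key=lambda a: expected_value(a, future_customer_value=100.0))
print(best.name)
```

With these made-up numbers the discounted recommendation wins, but raising `future_customer_value` to 300 makes "no contact" the best action: the more valuable the customer, the more a poorly targeted message costs.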
Slide 46 of 66
The final curve

http://strata.oreilly.com/2012/03/drivetrain-approach-data-products.html

Slide 47 of 66
The role of the Simulator
• Test the utility of each of the many possible
clothes available
• Provide successful matches between
questions & recommendations

Slide 48 of 66
The role of the Optimizer
• Rank and display the recommended clothes
based on their simulated utility
– driving sales and improving the customer experience

• Less emphasis on the “function” and more on the
“objective”
– What is the objective of the person using our data
product?
– What choice are we actually helping him or her make?
Slide 49 of 66
[CASE STUDY 4] REAL LIFE
APPLICATION: SELF-DRIVING CAR
Building a car that drives itself (1/2)
• Alternative approach: Instead of being data
driven, we can now let the data drive us!
• Models required:
– model of distance / speed-limit to predict arrival
time; a ruler and a road map needed
– model for traffic congestion
– model to forecast weather conditions and their
effect on the safest maximum speed

Slide 51 of 66
Building a car that drives itself (2/2)
Plenty of cool challenges in building these
models but by themselves, they do not take us
to our destination

• Simulator: to predict the drive times along
various routes
• Optimizer: pick the shortest route subject to
constraints like avoiding tolls or maximizing
gas mileage
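
The Optimizer step for route selection can be sketched with Dijkstra's shortest-path algorithm plus a constraint filter. The road network, the drive times (standing in for the Simulator's predictions) and the toll flags are all hypothetical.

```python
import heapq

# Hypothetical road network: node -> list of (neighbour, minutes, is_toll_road).
roads = {
    "home": [("highway", 10, True), ("old_town", 15, False)],
    "highway": [("office", 10, True)],
    "old_town": [("bridge", 10, False)],
    "bridge": [("office", 10, False)],
    "office": [],
}

def fastest_route(start, goal, avoid_tolls=False):
    """Dijkstra's algorithm over predicted drive times, with an optional constraint."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        minutes, node, path = heapq.heappop(queue)
        if node == goal:
            return minutes, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, drive_time, is_toll in roads[node]:
            if avoid_tolls and is_toll:
                continue  # constraint: skip toll roads
            heapq.heappush(queue, (minutes + drive_time, neighbour, path + [neighbour]))
    return None  # no route satisfies the constraint

print(fastest_route("home", "office"))
print(fastest_route("home", "office", avoid_tolls=True))
```

Without the constraint the toll highway is fastest; with `avoid_tolls=True` the optimizer falls back to the slower toll-free route.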
Slide 52 of 66
It is already implemented
According to Google, about a dozen self-driving cars
are on the road at any given time. They've already
logged more than 500,000 miles in beta tests.

Slide 53 of 66
The Drivetrain approach
Building a car that drives itself
Vehicle controls
• Steering wheel, Accelerator, Brakes

Data from sensors etc.
• sensors that gather data about the road
• cameras that detect road signs, red or green lights & unexpected
obstacles

• Physics models to predict the effects of steering, braking &
acceleration
• Pattern recognition algorithms to interpret data from the road signs
Slide 54 of 66
Developing the Modeler

http://cdn.oreilly.com/radar/images/posts/0312-2-drivetrain-step4-lg.png

Slide 55 of 66
The role of the Modeler
• Modeler Component 1
• Route selection, conditional on following a
recommendation

• Modeler Component 2
• Route selection, conditional on not following a
recommendation

Slide 56 of 66
The role of the Simulator
• examine the results of the possible actions the
self-driving car could take
– If it turns left now, will it hit that pedestrian?
– If it makes a right turn at 55 km/h in these
weather conditions, will it skid off the road?

• Merely predicting what will happen isn’t good
enough.
Slide 57 of 66
The role of the Optimizer
• optimize the results of the simulation
– to pick the best combination of acceleration and
braking, steering and signaling

Prediction only tells us that there is going to be
an accident.
An optimizer tells us how to avoid accidents.
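
The difference between predicting and optimizing can be shown with a deliberately simple physics model. The sketch assumes a flat road, where a car holds a turn only while v²/r does not exceed μ·g; the turn radius, the friction coefficients and the candidate speeds are assumed values, and a real vehicle model would be far richer.

```python
G = 9.81  # gravitational acceleration in m/s^2

def will_skid(speed_kmh, turn_radius_m, friction_coefficient):
    """Simulator: flat-road cornering check, v^2 / r must not exceed mu * g."""
    speed_ms = speed_kmh / 3.6
    return speed_ms ** 2 / turn_radius_m > friction_coefficient * G

def fastest_safe_speed(candidate_speeds_kmh, turn_radius_m, friction_coefficient):
    """Optimizer: the highest candidate speed the simulator considers safe."""
    safe = [speed for speed in candidate_speeds_kmh
            if not will_skid(speed, turn_radius_m, friction_coefficient)]
    return max(safe) if safe else None

# Assumed values: a 30 m turn radius and a wet-road friction coefficient of 0.4.
print(will_skid(55, 30, 0.4))                         # prediction only
print(fastest_safe_speed(range(10, 61, 5), 30, 0.4))  # action to take
```

The first call only reports that the turn at 55 km/h ends in a skid; the second call turns the same model into a decision about which speed to choose.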
Slide 58 of 66
THE FUTURE FOR DATA PRODUCTS
The present
• Drivetrain Approach:
– a framework for designing the next generation of
great data products
– heavily relies on optimization

• A need for the data science community to
educate others
– on how to derive value from their predictive
models
– Based on product design process
Slide 60 of 66
Current status of data products
• Data are continuously provided by big data
providers
– Facebook, Twitter etc.

• Data are transformed -> they do not look like
data in the end
– Telematics, booking systems etc.

• Example: Music now lives on the cloud
– Amazon, Apple, Google, or Spotify
Slide 61 of 66
The future
• Optimization taught in business schools &
statistics departments.
• Data scientists ship products designed to
produce desirable business outcomes
Risk: Models using data to create more
data, rather than using data to create
actions, disrupt industries, and
transform lives.
Slide 62 of 66
To keep in mind for future big data
products
• when building a data product, it is critical to
integrate designers into the engineering team
from the beginning.
• Data products frequently have special
challenges around inputting or displaying
data.

Slide 63 of 66
What to expect in the future?
Google needs to move beyond the current
search format of you entering a query and
getting 10 results.
The ideal would be us knowing what you want
before you search for it…
Eric Schmidt
Executive Chairman of Google

Slide 64 of 66
The future is near!
25/11/2013

Slide 65 of 66
References
• Big Data Now: 2012 Edition. O’Reilly Media, Inc.
• O’Reilly Strata: Making Data Work
(http://strata.oreilly.com/tag/big-data)

• Jeremy Howard - The Drivetrain Approach: A four-step process
for building data products
(http://strata.oreilly.com/2012/03/drivetrain-approach-data-products.html)

• Mike Loukides - The evolution of data products
(http://strata.oreilly.com/2011/09/evolution-of-data-products.html)

• Wikipedia: Big data (http://en.wikipedia.org/wiki/Big_data)

Slide 66 of 66
Thank you!

Vassilis Protonotarios
Agro-Know Technologies
vprot@agroknow.gr

More Related Content

What's hot

SharePoint Information Architecture & Usability - SharePoint Saturday The Con...
SharePoint Information Architecture & Usability - SharePoint Saturday The Con...SharePoint Information Architecture & Usability - SharePoint Saturday The Con...
SharePoint Information Architecture & Usability - SharePoint Saturday The Con...Richard Harbridge
 
Data Visualization & Data Storytelling
Data Visualization & Data StorytellingData Visualization & Data Storytelling
Data Visualization & Data Storytelling彭其捷 Jack
 
BI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and StrategyBI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and StrategyShivam Dhawan
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...DataScienceConferenc1
 
Enterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesEnterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesLars E Martinsson
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data ScienceSpotle.ai
 
Building Data Science Teams
Building Data Science TeamsBuilding Data Science Teams
Building Data Science TeamsEMC
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks FundamentalsDalibor Wijas
 
Data Product Management by Tinder Group PM
Data Product Management by Tinder Group PMData Product Management by Tinder Group PM
Data Product Management by Tinder Group PMProduct School
 
Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020Harald Erb
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouseJames Serra
 
Delivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeDelivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeKent Graziano
 
Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...
Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...
Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...HostedbyConfluent
 
Business Intelligence & Data Analytics– An Architected Approach
Business Intelligence & Data Analytics– An Architected ApproachBusiness Intelligence & Data Analytics– An Architected Approach
Business Intelligence & Data Analytics– An Architected ApproachDATAVERSITY
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at ScaleDATAVERSITY
 
Data Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital TransformationData Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital TransformationDATAVERSITY
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshJeffrey T. Pollock
 

What's hot (20)

Operational Data Vault
Operational Data VaultOperational Data Vault
Operational Data Vault
 
SharePoint Information Architecture & Usability - SharePoint Saturday The Con...
SharePoint Information Architecture & Usability - SharePoint Saturday The Con...SharePoint Information Architecture & Usability - SharePoint Saturday The Con...
SharePoint Information Architecture & Usability - SharePoint Saturday The Con...
 
Data Visualization & Data Storytelling
Data Visualization & Data StorytellingData Visualization & Data Storytelling
Data Visualization & Data Storytelling
 
BI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and StrategyBI Consultancy - Data, Analytics and Strategy
BI Consultancy - Data, Analytics and Strategy
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
 
Enterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesEnterprise Data Architecture Deliverables
Enterprise Data Architecture Deliverables
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data Science
 
Building Data Science Teams
Building Data Science TeamsBuilding Data Science Teams
Building Data Science Teams
 
Data modeling for the business
Data modeling for the businessData modeling for the business
Data modeling for the business
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks Fundamentals
 
Data Product Management by Tinder Group PM
Data Product Management by Tinder Group PMData Product Management by Tinder Group PM
Data Product Management by Tinder Group PM
 
Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 
Delivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeDelivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with Snowflake
 
Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...
Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...
Apache Kafka With Spark Structured Streaming With Emma Liu, Nitin Saksena, Ra...
 
Business Intelligence & Data Analytics– An Architected Approach
Business Intelligence & Data Analytics– An Architected ApproachBusiness Intelligence & Data Analytics– An Architected Approach
Business Intelligence & Data Analytics– An Architected Approach
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Data Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital TransformationData Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital Transformation
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to Mesh
 
Data Mesh
Data MeshData Mesh
Data Mesh
 

Viewers also liked

Building Data Products
Building Data ProductsBuilding Data Products
Building Data ProductsCloudera, Inc.
 
Using language services to enrich the LOs' descriptions
Using language services to enrich the LOs' descriptionsUsing language services to enrich the LOs' descriptions
Using language services to enrich the LOs' descriptionsVassilis Protonotarios
 
Data Science & Data Products at Neue Zürcher Zeitung
Data Science & Data Products at Neue Zürcher ZeitungData Science & Data Products at Neue Zürcher Zeitung
Data Science & Data Products at Neue Zürcher ZeitungRené Pfitzner
 
LinkedIn Data Products
LinkedIn Data ProductsLinkedIn Data Products
LinkedIn Data ProductsVitaly Gordon
 
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudData Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudCloudera, Inc.
 
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Data Science London
 
Privacy is an Illusion and you’re all losers! - Cryptocow - Infosecurity 2013
Privacy is an Illusion and you’re all losers! - Cryptocow - Infosecurity 2013Privacy is an Illusion and you’re all losers! - Cryptocow - Infosecurity 2013
Privacy is an Illusion and you’re all losers! - Cryptocow - Infosecurity 2013Cain Ransbottyn
 

Viewers also liked (9)

Building Data Products
Building Data ProductsBuilding Data Products
Building Data Products
 
The agINFRA Germplasm Working Group
The agINFRA Germplasm Working GroupThe agINFRA Germplasm Working Group
The agINFRA Germplasm Working Group
 
Using language services to enrich the LOs' descriptions
Using language services to enrich the LOs' descriptionsUsing language services to enrich the LOs' descriptions
Using language services to enrich the LOs' descriptions
 
agINFRA Germplasm metadata analysis
agINFRA Germplasm metadata analysisagINFRA Germplasm metadata analysis
agINFRA Germplasm metadata analysis
 
Data Science & Data Products at Neue Zürcher Zeitung
Data Science & Data Products at Neue Zürcher ZeitungData Science & Data Products at Neue Zürcher Zeitung
Data Science & Data Products at Neue Zürcher Zeitung
 
LinkedIn Data Products
LinkedIn Data ProductsLinkedIn Data Products
LinkedIn Data Products
 
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudData Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
 
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
 
Privacy is an Illusion and you’re all losers! - Cryptocow - Infosecurity 2013
Privacy is an Illusion and you’re all losers! - Cryptocow - Infosecurity 2013Privacy is an Illusion and you’re all losers! - Cryptocow - Infosecurity 2013
Privacy is an Illusion and you’re all losers! - Cryptocow - Infosecurity 2013
 

Similar to Designing Data Products

Chapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdf
Chapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdfChapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdf
Chapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdfAndresBelloAvila
 
Using PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy ExplorationUsing PySpark to Scale Markov Decision Problems for Policy Exploration
Using PySpark to Scale Markov Decision Problems for Policy ExplorationDatabricks
 
Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Roger Barga
 
Smart solutions for productivity gain IQA conference 2017
Smart solutions for productivity gain   IQA conference 2017Smart solutions for productivity gain   IQA conference 2017
Smart solutions for productivity gain IQA conference 2017Steve Franklin
 
CS-422 THESIS (1).pptx
CS-422 THESIS (1).pptxCS-422 THESIS (1).pptx
CS-422 THESIS (1).pptxpoojagupta010
 
Concept testing, product architecture and design of modular system
Concept testing, product architecture and design of modular systemConcept testing, product architecture and design of modular system
Concept testing, product architecture and design of modular systemShafeequr Rehman
 
Big Data Meetup by Chad Richeson
Big Data Meetup by Chad RichesonBig Data Meetup by Chad Richeson
Big Data Meetup by Chad RichesonSocietyConsulting
 
AI-900 - Fundamental Principles of ML.pptx
AI-900 - Fundamental Principles of ML.pptxAI-900 - Fundamental Principles of ML.pptx
AI-900 - Fundamental Principles of ML.pptxkprasad8
 
Multicriteria and cost benefit analysis for smart grid projects
Multicriteria and cost benefit analysis for smart grid projectsMulticriteria and cost benefit analysis for smart grid projects
Multicriteria and cost benefit analysis for smart grid projectsLeonardo ENERGY
 
Concept Generation in Product Design
Concept Generation in Product DesignConcept Generation in Product Design
Concept Generation in Product DesignSurendher Emrose
 
churn_detection.pptx
churn_detection.pptxchurn_detection.pptx
churn_detection.pptxDhanuDhanu49
 
Advanced Optimization for the Enterprise Webinar
Advanced Optimization for the Enterprise WebinarAdvanced Optimization for the Enterprise Webinar
Advanced Optimization for the Enterprise WebinarSigOpt
 
KB Seminars: Working with Technology - Product Management; 10/13
KB Seminars: Working with Technology - Product Management; 10/13KB Seminars: Working with Technology - Product Management; 10/13
KB Seminars: Working with Technology - Product Management; 10/13MDIF
 

Similar to Designing Data Products (20)

Chapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdf
Chapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdfChapter_6_Prescriptive_Analytics_Optimization_and_Simulation.pptx.pdf
Designing Data Products

  • 1. Open Data for Agriculture Intro to Big Data 29/11/2013 Athens, Greece Joint offering by Supported by EU projects
  • 2. Designing Data Products Dr. Vassilis Protonotarios Agro-Know Technologies, Greece
  • 3. Intro • This presentation provides introductory information about – the (big) data products – the design of (big) data products, – the Drivetrain approach for the design of objective-based (big) data products. • The Drivetrain approach will be applied to agricultural case studies in the next session • The majority of the slides were based on the report “Big Data Now: 2012 Edition. O’Reilly Media, Inc.” Slide 3 of 66
  • 4. Objectives This presentation aims to: • Provide an introduction to data products • Define the “objective-based data products” concept – Describe the Drivetrain approach in the design of (big) Data products – Analyze the design of data products – Provide applications / case studies In order to provide the methodology for the development of data products Slide 4 of 66
  • 5. Structure of the presentation 1. Intro to designing (great) data products 2. Objective-based data products – The Drivetrain approach 3. Case studies (x4): Application of the Drivetrain approach 4. The future of data products Slide 5 of 66
  • 6. INTRO TO DESIGNING GREAT DATA PRODUCTS
  • 7. What is a (big) data product? • What happens when (big) data becomes a product – specifically, a consumer product • Produce (big) data based on inputs ? = data producers • Deliver results based on (big) data ? = data processors • Uses big data for providing useful outcomes? Slide 7 of 66
  • 8. Facts about (big) data products • Enable their users to do whatever they want – which most often has little to do with (big) data • Replace physical products Slide 8 of 66
  • 9. The past: Predictive modeling • Development of data products based on data predictive modeling – weather forecasting – recommendation engines – email spam filters – services that predict airline flight times • sometimes more accurately than the airlines themselves. Slide 9 of 66
  • 10. The issue of predictive modeling • Prediction technology: interesting, useful and mathematically elegant – BUT we need to take the next step because…. • These products just make predictions – instead of asking what action they want someone to take as a result of a prediction Slide 10 of 66
  • 11. The role of predictive modeling • Great predictive modeling is still an important part of the solution – but it no longer stands on its own – as products become more sophisticated, it becomes less useful Slide 11 of 66
  • 12. A new, alternative approach • the Drivetrain Approach – a four-step approach, already applied in the industry – inspired by the emerging field of self-driving vehicles – Objective-based approach B A http://www.popularmechanics.com/cars/how-to/repair-questions/1302716 Slide 12 of 66
  • 13. A fine objective-based model Slide 13 of 66
  • 14. Case study • A user of Google’s self-driving car – is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work BUT • It is an increasingly sophisticated product built by data scientists – they need a systematic design approach Slide 14 of 66
  • 17. Source: http://en.wikipedia.org/wiki/File:Altavista-1999.png Case study: Search engines Why? 1997 2013 Slide 17 of 66
  • 18. The 4 steps in the Drivetrain approach • The four steps in this transition: – Identify the main objective • For Google: show the most relevant search results – Specify the system’s manageable inputs [levers] • For Google: ranking the results – Consider the data needed for managing the inputs • Information about users’ activities in other web sites – Building the predictive models • For Google: PageRank algorithm Slide 18 of 66
  • 19. Drivetrain approach goal Do NOT use data just to generate more data – especially in the form of predictions BUT use data to produce actionable outcomes Slide 19 of 66
  • 20. [CASE STUDY 1] THE MODEL ASSEMBLY LINE: A CASE STUDY OF INSURANCE COMPANIES
  • 21. The issue of insurance companies • Case study: Insurance companies – Their objective: maximizing the profit from each policy price – An optimal pricing model is to them what the assembly line is to automobile manufacturing – Despite their long experience in prediction, they often fail to make optimal business decisions about what price to charge each new customer Slide 21 of 66
  • 22. Transition to Drivetrain approach • Identifying solutions to this issue – Optimal Decisions Group (ODG) approached this problem with an early use of the Drivetrain Approach – Resulted in a practical take on step 4 that can be applied to a wide range of problems. Model Assembly Line Slide 22 of 66
  • 23. The Drivetrain approach Set a price for policies, maximizing profit • price to charge each customer; • types of accidents to cover; • how much to spend on marketing and customer service; • how to react to their competitors’ pricing decisions • Data collected from real experiments on customers Randomly changing the prices of hundreds of thousands of policies over many months • • Develop a probability model for optimizing the insurer’s profit Slide 23 of 66
  • 24. Slide 24 of 66 http://cdn.oreilly.com/radar/images/posts/0312-2-drivetrain-step4-lg.png Developing the modeler: Model Assembly Line
  • 25. The role of the Modeler • Modeler Component 1 • model of price elasticity: the probability that a customer will accept a given price (for new policies and renewals) • Modeler Component 2 • relates price to the insurance company’s profit, conditional on the customer accepting this price. • Multiplying these two curves creates a final curve • Shows price versus expected profit • The final curve has a clearly identifiable local maximum that represents the best price to charge a customer for the first year. Slide 25 of 66
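The multiplication of the two Modeler curves can be sketched as follows. This is a minimal illustration, not ODG's actual model: the logistic acceptance curve, the `reference_price`, `sensitivity` and `expected_claims_cost` values, and all function names are assumptions made up for the example.

```python
import math

def acceptance_probability(price, reference_price=500.0, sensitivity=0.01):
    """Component 1 (hypothetical): probability that a customer accepts a price.

    A logistic curve is assumed here; acceptance falls as the price rises.
    """
    return 1.0 / (1.0 + math.exp(sensitivity * (price - reference_price)))

def profit_if_accepted(price, expected_claims_cost=300.0):
    """Component 2 (hypothetical): profit conditional on the customer accepting."""
    return price - expected_claims_cost

def expected_profit(price):
    """Final curve: acceptance probability multiplied by conditional profit."""
    return acceptance_probability(price) * profit_if_accepted(price)

def best_price(candidate_prices):
    """Return the candidate price with the highest expected profit."""
    return max(candidate_prices, key=expected_profit)

candidate_prices = range(300, 1001, 10)
optimal = best_price(candidate_prices)
print(f"best first-year price: {optimal}, expected profit: {expected_profit(optimal):.2f}")
```

A very low price is almost always accepted but earns little, while a very high price earns a lot but is rarely accepted, so the product of the two curves peaks somewhere in between.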
  • 27. The role of the Simulator • Lets ODG ask the “what if” questions • to see how the levers affect the distribution of the final outcome • Runs the models over a wide range of inputs • The operator can adjust the input levers to answer specific questions • “What will happen if our company offers the customer a low teaser price in year one but then raises the premiums in year two?” • Explores the distribution of profit as affected by the inputs outside of the insurer’s control • E.g. “What if the economy crashes and the customer loses his job?” Slide 27 of 66
  • 28. The role of the Optimizer • takes the surface of possible outcomes and identifies the highest point • finds the best outcomes • identify catastrophic outcomes – and show how to avoid them Slide 28 of 66
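The Simulator and Optimizer steps can be sketched as a small Monte Carlo experiment. Everything numeric below is an illustrative assumption (the linear acceptance rule, the 10% crash probability, the exponential claims cost with mean 300, the three named strategies); only the idea of running the models over many inputs and then picking the best lever setting comes from the slides.

```python
import random
import statistics

def acceptance(price):
    """Hypothetical elasticity: acceptance falls linearly from 1.0 at 300 to 0.0 at 800."""
    return max(0.0, min(1.0, 1.0 - (price - 300.0) / 500.0))

def simulate_customer(year1_price, year2_price, rng, crash_probability=0.1):
    """Simulate the two-year profit from one customer under random conditions."""
    if rng.random() >= acceptance(year1_price):
        return 0.0  # customer declined the policy
    profit = year1_price - rng.expovariate(1 / 300.0)  # premium minus random claims cost
    renewal = acceptance(year2_price)
    if rng.random() < crash_probability:
        renewal *= 0.3  # input outside the insurer's control: the economy crashes
    if rng.random() < renewal:
        profit += year2_price - rng.expovariate(1 / 300.0)
    return profit

def simulate_strategy(year1_price, year2_price, runs=5000, seed=42):
    """Simulator: summarize the distribution of outcomes for one lever setting."""
    rng = random.Random(seed)
    outcomes = sorted(simulate_customer(year1_price, year2_price, rng) for _ in range(runs))
    return {"mean": statistics.mean(outcomes), "p05": outcomes[int(0.05 * runs)]}

# Levers: (year-one price, year-two price) for each hypothetical strategy.
strategies = {"flat": (500.0, 500.0), "teaser": (350.0, 600.0), "premium": (700.0, 700.0)}
results = {name: simulate_strategy(*prices) for name, prices in strategies.items()}

# Optimizer: pick the strategy with the highest mean profit; p05 flags bad tails.
best_strategy = max(results, key=lambda name: results[name]["mean"])
for name, summary in results.items():
    print(name, round(summary["mean"], 1), round(summary["p05"], 1))
print("best:", best_strategy)
```

Reporting a low percentile next to the mean is one simple way to surface the catastrophic outcomes the Optimizer slide mentions; a real system would search a much larger space of lever settings.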
  • 29. Take-home message Using a Drivetrain Approach combined with a Model Assembly Line bridges the gap between predictive models and actionable outcomes. Slide 29 of 66
  • 30. [CASE STUDY 2] MARKETING INDUSTRY RECOMMENDER SYSTEMS
  • 31. Recommendation engines • Recommendation engines – data product based on well-built predictive models that do not achieve an optimal objective. – The current algorithms predict what products a customer will like, • based on purchase history and the histories of similar customers. Slide 31 of 66
  • 32. The case of Amazon • Amazon represents every purchase that has ever been made as a giant sparse matrix – with customers as the rows and products as the columns. • Once they have the data in this format, data scientists apply some form of collaborative filtering to “fill in the matrix.” Slide 32 of 66
  • 33. The case of Amazon • Such models are good at predicting whether a customer will like a given product – but they often suggest products that the customer already knows about or has already decided not to buy. http://strata.oreilly.com/2012/03/drivetrain-approach-data-products.html Slide 33 of 66
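The "giant sparse matrix" and the idea of filling it in can be sketched with a toy item-to-item collaborative filter. This is not Amazon's algorithm; the purchase data, the function names and the choice of cosine similarity on binary purchases are assumptions for illustration only.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical purchase log: (customer, product) pairs.
purchases = [
    ("ana", "book_a"), ("ana", "book_b"),
    ("ben", "book_a"), ("ben", "book_b"), ("ben", "book_c"),
    ("cleo", "book_b"), ("cleo", "book_c"),
    ("dan", "book_a"),
]

def build_sparse_matrix(pairs):
    """Store only the non-empty cells: customer (row) -> set of products (columns)."""
    rows = defaultdict(set)
    for customer, product in pairs:
        rows[customer].add(product)
    return rows

def item_similarity(rows):
    """Cosine similarity between product columns of the binary purchase matrix."""
    buyers = defaultdict(set)
    for customer, products in rows.items():
        for product in products:
            buyers[product].add(customer)
    similarity = {}
    for a in buyers:
        for b in buyers:
            if a != b:
                overlap = len(buyers[a] & buyers[b])
                similarity[(a, b)] = overlap / sqrt(len(buyers[a]) * len(buyers[b]))
    return similarity

def fill_in_row(customer, rows, similarity):
    """Score the products a customer has not bought, highest score first."""
    owned = rows[customer]
    scores = defaultdict(float)
    for (a, b), value in similarity.items():
        if a in owned and b not in owned:
            scores[b] += value
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

rows = build_sparse_matrix(purchases)
similarity = item_similarity(rows)
print(fill_in_row("dan", rows, similarity))
```

The scores predict what a customer is likely to buy, which is exactly the limitation described above: a high score says nothing about whether the recommendation itself changes the customer's behaviour.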
  • 35. The Drivetrain approach Drive additional sales by surprising and delighting the customer with books not initially considered without the recommendation • Ranking of the recommendations Data to derive from many randomized experiments about a wide range of recommendations for a wide range of customers: To generate recommendations that will cause new sales • Develop an algorithm providing recommendations which escape a recommendation filter bubble Slide 35 of 66
  • 36. Slide 36 of 66 http://cdn.oreilly.com/radar/images/posts/0312-2-drivetrain-step4-lg.png Developing the modeler
  • 37. The role of the Modeler • Modeler Component 1 • purchase probabilities, conditional on seeing a recommendation • Modeler Component 2 • purchase probabilities, conditional on not seeing a recommendation • The difference between these two probabilities is a utility function for a given recommendation to a customer • Low in cases where the algorithm recommends a familiar book that the customer has already rejected (both components are small) or a book that he/she would have bought even without the recommendation Slide 37 of 66
  • 39. The role of the Simulator • Test the utility of each of the many possible books in stock • Alternatively, test only the outputs of a collaborative filtering model of similar customer purchases Slide 39 of 66
  • 40. The role of the Optimizer • Rank and display the recommended books based on their simulated utility • Less emphasis on the “function” and more on the “objective.” – What is the objective of the person using our data product? – What choice are we actually helping him or her make? Slide 40 of 66
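The utility function from the Modeler slide and the ranking step from the Optimizer slide can be sketched together. The probabilities below are invented for the example; in practice they would come from the two models trained on randomized experiments.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    title: str
    p_buy_with_rec: float     # Modeler component 1: P(purchase | recommendation shown)
    p_buy_without_rec: float  # Modeler component 2: P(purchase | no recommendation)

    @property
    def utility(self) -> float:
        """Extra purchase probability caused by showing the recommendation."""
        return self.p_buy_with_rec - self.p_buy_without_rec

# Hypothetical candidates for one customer.
candidates = [
    Candidate("bestseller the customer would buy anyway", 0.60, 0.58),
    Candidate("familiar book already rejected", 0.03, 0.02),
    Candidate("surprising book outside the usual genre", 0.25, 0.05),
]

# Optimizer: rank by utility, not by raw purchase probability.
ranked = sorted(candidates, key=lambda c: c.utility, reverse=True)
for candidate in ranked:
    print(f"{candidate.utility:+.2f}  {candidate.title}")
```

Ranking by raw purchase probability would put the bestseller first; ranking by the difference puts the surprising book first, because that is where the recommendation actually causes a new sale.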
  • 41. [CASE STUDY 3] OPTIMIZING LIFETIME CUSTOMER VALUE
  • 42. Customer value • Includes all interactions between a retailer and his customers outside the actual buy-sell transaction – making a product recommendation – encouraging the customer to check out a new feature of the online store – sending sales promotions • Making the wrong choices comes at a cost to the retailer – reduced margins (discounts that do not drive extra sales) – opportunity costs Slide 42 of 66
  • 43. The Drivetrain approach optimize the lifetime value from each customer • Product recommendations • Offer tailored discounts / special offers on products • Make customer-care calls just to see how the user is • Invite them to use the site and ask for their feedback Zafu approach: not send customers directly to clothes but ask a series of simple questions about the customers’ body type, how well their other jeans fit and their fashion preferences • Develop an algorithm leading customer to browse a recommended selection of Zafu’s inventory Slide 43 of 66
  • 45. Slide 45 of 66 http://cdn.oreilly.com/radar/images/posts/0312-2-drivetrain-step4-lg.png Developing the modeler
  • 46. The role of the Modeler • Modeler Component 1 • purchase probabilities, conditional on seeing a recommendation • Modeler Component 2 • purchase probabilities, conditional on not seeing a recommendation • Modeler Component 3 • price elasticity model to test how offering a discount might change the probability that the customer will buy the item • Modeler Component 4 • patience model for the customers’ tolerance for poorly targeted communications Slide 46 of 66
  • 48. The role of the Simulator • Test the utility of each of the many possible clothes available • Provide successful matches between questions & recommendations Slide 48 of 66
  • 49. The role of the Optimizer • Rank and display the recommended clothes based on their simulated utility – driving sales and improving the customer experience • Less emphasis on the “function” and more on the “objective” – What is the objective of the person using our data product? – What choice are we actually helping him or her make? Slide 49 of 66
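One way the four Modeler components could feed an Optimizer is sketched below. The action names, probabilities, prices and the way the patience model is folded in as an expected loss of future value are all assumptions made for this example, not part of the Zafu case.

```python
# Hypothetical model outputs for one customer and one item.
# p_buy:     purchase probability under the action (components 1 to 3)
# discount:  fraction taken off the item price
# annoyance: probability the contact costs us the customer (component 4, patience)
ACTIONS = {
    "no_contact":     {"p_buy": 0.05, "discount": 0.00, "annoyance": 0.000},
    "recommendation": {"p_buy": 0.12, "discount": 0.00, "annoyance": 0.005},
    "discount_10":    {"p_buy": 0.20, "discount": 0.10, "annoyance": 0.005},
    "discount_30":    {"p_buy": 0.30, "discount": 0.30, "annoyance": 0.005},
}

def expected_value(action, item_price=80.0, unit_cost=50.0, future_value=200.0):
    """Expected margin from the sale minus the expected loss of future customer value."""
    outputs = ACTIONS[action]
    margin = item_price * (1 - outputs["discount"]) - unit_cost
    return outputs["p_buy"] * margin - outputs["annoyance"] * future_value

best_action = max(ACTIONS, key=expected_value)
for name in ACTIONS:
    print(f"{name:15s} {expected_value(name):6.2f}")
print("best:", best_action)
```

With these made-up numbers the deep discount sells more often but earns less than doing nothing, which is the "reduced margins" cost named on the customer value slide.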
  • 50. [CASE STUDY 4] REAL LIFE APPLICATION: SELF-DRIVING CAR
  • 51. Building a car that drives itself (1/2) • Alternative approach: Instead of being data driven, we can now let the data drive us! • Models required: – model of distance / speed-limit to predict arrival time; a ruler and a road map needed – model for traffic congestion – model to forecast weather conditions and their effect on the safest maximum speed Slide 51 of 66
  • 52. Building a car that drives itself (2/2) Plenty of cool challenges in building these models but by themselves, they do not take us to our destination • Simulator: to predict the drive times along various routes • Optimizer: pick the shortest route subject to constraints like avoiding tolls or maximizing gas mileage Slide 52 of 66
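The Optimizer step for route choice can be sketched with Dijkstra's shortest-path algorithm. The road network and drive times are hypothetical; in the Drivetrain picture the minutes on each edge would be produced by the Simulator from the traffic and weather models.

```python
import heapq

# Hypothetical road network: node -> list of (neighbor, predicted minutes, is_toll).
ROADS = {
    "home": [("highway", 10, True), ("main_street", 15, False)],
    "highway": [("office", 10, False)],
    "main_street": [("office", 20, False)],
    "office": [],
}

def fastest_route(start, goal, avoid_tolls=False):
    """Dijkstra's algorithm over predicted drive times, optionally skipping toll roads."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        minutes, node, path = heapq.heappop(queue)
        if node == goal:
            return minutes, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, drive_minutes, is_toll in ROADS[node]:
            if avoid_tolls and is_toll:
                continue  # constraint: this road is not allowed
            if neighbor not in visited:
                heapq.heappush(queue, (minutes + drive_minutes, neighbor, path + [neighbor]))
    return None  # no route satisfies the constraints

print(fastest_route("home", "office"))
print(fastest_route("home", "office", avoid_tolls=True))
```

The constraint is handled by removing forbidden edges from the search; an objective such as gas mileage would instead change the edge weights.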
  • 53. It is already implemented According to Google, about a dozen self-driving cars are on the road at any given time. They've already logged more than 500,000 miles in beta tests. Slide 53 of 66
  • 54. The Drivetrain approach Building a car that drives itself Vehicle controls • Steering wheel, Accelerator, Brakes Data from sensors etc. • sensors that gather data about the road • cameras that detect road signs, red or green lights & unexpected obstacles • Physics models to predict the effects of steering, braking & acceleration • Pattern recognition algorithms to interpret data from the road signs Slide 54 of 66
  • 56. The role of the Modeler • Modeler Component 1 • Route selection, conditional on following a recommendation • Modeler Component 2 • Route selection, conditional on not following a recommendation Slide 56 of 66
  • 57. The role of the Simulator • examine the results of the possible actions the self-driving car could take – If it turns left now, will it hit that pedestrian? – If it makes a right turn at 55 km/h in these weather conditions, will it skid off the road? • Merely predicting what will happen isn’t good enough. Slide 57 of 66
  • 58. The role of the Optimizer • optimize the results of the simulation – to pick the best combination of acceleration and braking, steering and signaling Prediction only tells us that there is going to be an accident. An optimizer tells us how to avoid accidents. Slide 58 of 66
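The skid question from slide 57 and the Optimizer's answer from slide 58 can be illustrated with a deliberately simplified physics model. The sketch below assumes a flat, unbanked curve and a point-mass car, so the car skids when the required lateral acceleration v²/r exceeds the available grip μ·g. The friction coefficients, the turn radius and the candidate speeds are illustrative assumptions, not values from the presentation.

```python
G = 9.81  # gravitational acceleration, m/s^2

# Illustrative tyre-road friction coefficients (assumed values).
FRICTION = {"dry": 0.7, "wet": 0.4, "ice": 0.1}


def will_skid(speed_kmh, turn_radius_m, weather):
    """Simulator: does the required lateral acceleration exceed the available grip?"""
    speed_ms = speed_kmh / 3.6
    required = speed_ms ** 2 / turn_radius_m
    available = FRICTION[weather] * G
    return required > available


def best_turn_speed(candidate_speeds_kmh, turn_radius_m, weather):
    """Optimizer: fastest candidate speed that the simulator reports as safe."""
    safe = [v for v in candidate_speeds_kmh
            if not will_skid(v, turn_radius_m, weather)]
    return max(safe) if safe else None


candidates = [15, 25, 35, 45, 55]
print(will_skid(55, 30.0, "wet"))
print(best_turn_speed(candidates, 30.0, "wet"))
```

The simulator alone only reports that a 55 km/h turn would end badly; the optimizer turns that prediction into an action by choosing a speed that avoids the skid.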
  • 59. THE FUTURE FOR DATA PRODUCTS
  • 60. The present • Drivetrain Approach: – a framework for designing the next generation of great data products – relies heavily on optimization • A need for the data science community to educate others – on how to derive value from their predictive models – based on the product design process Slide 60 of 66
  • 61. Current status of data products • Data are continuously supplied by big data providers – Facebook, Twitter etc. • Data are transformed -> they do not look like data in the end – Telematics, booking systems etc. • Example: Music now lives in the cloud – Amazon, Apple, Google, or Spotify Slide 61 of 66
  • 62. The future • Optimization taught in business schools & statistics departments. • Data scientists ship products designed to produce desirable business outcomes Risk: Models using data to create more data, rather than using data to create actions, disrupt industries, and transform lives. Slide 62 of 66
  • 63. To keep in mind for future big data products • When building a data product, it is critical to integrate designers into the engineering team from the beginning. • Data products frequently have special challenges around inputting or displaying data. Slide 63 of 66
  • 64. What to expect in the future? Google needs to move beyond the current search format of you entering a query and getting 10 results. The ideal would be us knowing what you want before you search for it… Eric Schmidt Executive Chairman of Google Slide 64 of 66
  • 65. The future is near! 25/11/2013 Slide 65 of 66
  • 66. References • Big Data Now: 2012 Edition. O’Reilly Media, Inc. • O’Reilly Strata: Making Data Work (http://strata.oreilly.com/tag/big-data) • Jeremy Howard - The Drivetrain Approach: A four-step process for building data products (http://strata.oreilly.com/2012/03/drivetrain-approach-dataproducts.html) • Mike Loukides - The evolution of data products (http://strata.oreilly.com/2011/09/evolution-of-data-products.html) • Wikipedia: Big data (http://en.wikipedia.org/wiki/Big_data) Slide 66 of 66
  • 67. Thank you! Vassilis Protonotarios Agro-Know Technologies vprot@agroknow.gr

Editor's Notes

  1. Image source: http://thedataqualitychronicle.org/data-quality/
  2. Predictive modelling is the process by which a model is created or chosen to try to best predict the probability of an outcome. In many cases the model is chosen on the basis of detection theory to try to guess the probability of an outcome given a set amount of input data; for example, given an email, determining how likely it is to be spam.
  3. A motor vehicle's driveline or drivetrain consists of the parts of the powertrain excluding the engine and transmission. It is the portion of a vehicle, after the transmission, that changes depending on whether a vehicle is front-wheel, rear-wheel, or four-wheel drive. Starting point of engineers: a clear objective. They want a car to drive safely from point A to point B without human intervention.
  4. Back in 1997, AltaVista was king of the algorithmic search world. While their models were good at finding relevant websites, the answer the user was most interested in was often buried on page 100 (no ranking!). Google realized that the objective was to show the most relevant search results first for each unique user.
  5. Link to PageRank: http://en.wikipedia.org/wiki/PageRank
  6. Optimal Decisions Group: www.lexisnexis.com/risk/solutions/optimal-decisions-toolkit.aspx
  7. recommendation filter bubble = the tendency of personalized news feeds to only display articles that are blandly popular or further confirm the readers’ existing biases.
  8. The figure refers to Step 4 of the Drivetrain approach
  9. Component 1: The price elasticity model is a curve of price versus the probability of the customer accepting the policy conditional on that price. This curve moves from almost certain acceptance at very low prices to almost never at high prices. Component 2: The profit for a very low price will be in the red by the value of expected claims in the first year, plus any overhead for acquiring and servicing the new customer.
  10. Amazon’s recommendation engine is probably the best one out there but it’s easy to get it to show its warts.
  17. http://strata.oreilly.com/2011/09/evolution-of-data-products.html
  18. http://www.thedrum.co.uk/news/2011/06/25/22817-quotes-of-the-week-huffington-post-bbc-salford-google-and-more/
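
The two components described in the price elasticity note above can be combined into a small worked example. This is a sketch under stated assumptions: the logistic shape of the curve, its `midpoint` and `steepness` parameters, and the claims and overhead figures are all hypothetical, chosen only so that acceptance is near certain at very low prices and near zero at high prices.

```python
import math


def p_accept(price, midpoint=500.0, steepness=0.02):
    """Component 1: price elasticity curve (assumed logistic shape)."""
    return 1.0 / (1.0 + math.exp(steepness * (price - midpoint)))


def expected_profit(price, expected_claims=300.0, overhead=50.0):
    """Component 2: profit if accepted, weighted by the acceptance probability."""
    return p_accept(price) * (price - expected_claims - overhead)


def optimize_price(candidates):
    """Optimizer: candidate price with the highest expected profit."""
    return max(candidates, key=expected_profit)


prices = range(100, 1001, 50)
best_price = optimize_price(prices)
print(best_price, round(expected_profit(best_price), 2))
```

With these invented numbers, very low prices are accepted almost always but lose money, very high prices are profitable per policy but rarely accepted, and the optimizer settles in between.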