SlideShare a Scribd company logo
INTRODUCING THE CONCEPTS OF DATA SCIENCE
AND BIG DATA IN A TRADITIONAL DATA WAREHOUSE
COMPANY
Joris Bos
Manager IT, DDW
Dairy Data Warehouse (DDW)
• Start up founded in 2013
• Based in Assen, Netherlands
• DDW offers data services to dairy farm
consultants and corporate stakeholders in
the dairy industry across the globe
DDW BIG Dairy Data Platform
1. DATA INPUT
Herd
management
Data capture
devices
Milk recording
data
2. DATA TRANSFORMATION 3. DEEP LEARNING
DDW ETL unique
technology transforms
all data into a standard
common format
Dairy Data Warehouse
Single consistent database
Production
Reproduction
Disease
4. DDW PREDICTIONS
‘Advance to go’
• Daily uploads (batching)
• Slow database XML exports
• Calculations freezing data scientist’ pc
• Valuable data locked away in private networks,
widely spread
• This new ‘thing’!
Data warehouses are dead! Or?
• Trendy, all will be superseded
• New paradigm
• On premises or Cloud?
• Clearly differentiate use-cases, strengths
• HDFS != DMS
• Different target audience
• Aim to retain value from previous investments
DDW Big data infrastructure
• Set-up from zero, best practices opportunity
• Security, data access and data governance at the core
• New roles and competences within the organization
• Data lake at the center
• Seperation storage / compute
• Support for persisted, historical data now (batch)
• Streaming (real-time) near future
• Audit, audit, audit, audit
Data Lake & Data Warehouse
Big Data pipeline
Data Engineer Data Scientist
Value
Chosen setup (Azure)
• Storage : Azure Data Lake (Active Directory Security / Office365)
• Integration runtimes (Data Factory)
• ELT
• Data warehouse as ‘just’ another data source
• Data Catalog
• Compute : On-demand clusters with Hortonworks distribution platform (HDP)
• ML Deep Learning : Model calculation with GPU powered VMs
• ETL to data warehouse for hot layer going forward
Chosen setup
Takeaways
• Establish clear language around Big Data and tooling
• Clearly position Big Data in your company (vision, budget, etc.)
• Define new roles, workflows and conventions
• Stick to best practices
• Start small, Determine Minimum Viable Product
• Cloud is great for ad-hoc, scalability demands
• Iterate and experiment through POCs
Thank you
Feel free to contact me:
Joris Bos
J.bos@dairydatawarehouse.com

More Related Content

What's hot

Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesData Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Denodo
 
Denodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo DataFest 2016: Big Data Virtualization in the CloudDenodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo
 
Case Study: Big Data Analytics
Case Study: Big Data AnalyticsCase Study: Big Data Analytics
Case Study: Big Data Analytics
Abhinav Das
 
Data Warehouse in Cloud
Data Warehouse in CloudData Warehouse in Cloud
Data Warehouse in Cloud
Pawan Bhargava
 
Architecting a Modern Data Warehouse: Enterprise Must-Haves
Architecting a Modern Data Warehouse: Enterprise Must-HavesArchitecting a Modern Data Warehouse: Enterprise Must-Haves
Architecting a Modern Data Warehouse: Enterprise Must-Haves
Yellowbrick Data
 
Webinar - The Agility Challenge - Powering Cloud Apps with Multi-Model & Mixe...
Webinar - The Agility Challenge - Powering Cloud Apps with Multi-Model & Mixe...Webinar - The Agility Challenge - Powering Cloud Apps with Multi-Model & Mixe...
Webinar - The Agility Challenge - Powering Cloud Apps with Multi-Model & Mixe...
DataStax
 
DataStax on Azure: Deploying an industry-leading data platform for cloud apps...
DataStax on Azure: Deploying an industry-leading data platform for cloud apps...DataStax on Azure: Deploying an industry-leading data platform for cloud apps...
DataStax on Azure: Deploying an industry-leading data platform for cloud apps...
DataStax
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big Data
Cloudera, Inc.
 
Hadoop: Extending your Data Warehouse
Hadoop: Extending your Data WarehouseHadoop: Extending your Data Warehouse
Hadoop: Extending your Data Warehouse
Cloudera, Inc.
 
SQL In/On/Around Hadoop
SQL In/On/Around Hadoop SQL In/On/Around Hadoop
SQL In/On/Around Hadoop
DataWorks Summit
 
Data stax webinar cassandra and titandb insights into datastax graph strategy...
Data stax webinar cassandra and titandb insights into datastax graph strategy...Data stax webinar cassandra and titandb insights into datastax graph strategy...
Data stax webinar cassandra and titandb insights into datastax graph strategy...
DataStax
 
Terracotta Hadoop & In-Memory Webcast
Terracotta Hadoop & In-Memory WebcastTerracotta Hadoop & In-Memory Webcast
Terracotta Hadoop & In-Memory Webcast
Terracotta, a product line at Software AG
 
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data HubCloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera, Inc.
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
 
Stora Enso&Wipro - Stora Enso Rethinks Supply Chain - ProcessForum Nordic, No...
Stora Enso&Wipro - Stora Enso Rethinks Supply Chain - ProcessForum Nordic, No...Stora Enso&Wipro - Stora Enso Rethinks Supply Chain - ProcessForum Nordic, No...
Stora Enso&Wipro - Stora Enso Rethinks Supply Chain - ProcessForum Nordic, No...
Software AG
 
Optimize the performance, cost, and value of databases.pptx
Optimize the performance, cost, and value of databases.pptxOptimize the performance, cost, and value of databases.pptx
Optimize the performance, cost, and value of databases.pptx
IDERA Software
 
DW 101
DW 101DW 101
DW 101
jeffd00
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking Applications
DataStax
 
The Yellowbrick Impact for MicroStrategy
The Yellowbrick Impact for MicroStrategyThe Yellowbrick Impact for MicroStrategy
The Yellowbrick Impact for MicroStrategy
Yellowbrick Data
 
"Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wo...
"Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wo..."Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wo...
"Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wo...
Comsysto Reply GmbH
 

What's hot (20)

Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesData Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
 
Denodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo DataFest 2016: Big Data Virtualization in the CloudDenodo DataFest 2016: Big Data Virtualization in the Cloud
Denodo DataFest 2016: Big Data Virtualization in the Cloud
 
Case Study: Big Data Analytics
Case Study: Big Data AnalyticsCase Study: Big Data Analytics
Case Study: Big Data Analytics
 
Data Warehouse in Cloud
Data Warehouse in CloudData Warehouse in Cloud
Data Warehouse in Cloud
 
Architecting a Modern Data Warehouse: Enterprise Must-Haves
Architecting a Modern Data Warehouse: Enterprise Must-HavesArchitecting a Modern Data Warehouse: Enterprise Must-Haves
Architecting a Modern Data Warehouse: Enterprise Must-Haves
 
Webinar - The Agility Challenge - Powering Cloud Apps with Multi-Model & Mixe...
Webinar - The Agility Challenge - Powering Cloud Apps with Multi-Model & Mixe...Webinar - The Agility Challenge - Powering Cloud Apps with Multi-Model & Mixe...
Webinar - The Agility Challenge - Powering Cloud Apps with Multi-Model & Mixe...
 
DataStax on Azure: Deploying an industry-leading data platform for cloud apps...
DataStax on Azure: Deploying an industry-leading data platform for cloud apps...DataStax on Azure: Deploying an industry-leading data platform for cloud apps...
DataStax on Azure: Deploying an industry-leading data platform for cloud apps...
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big Data
 
Hadoop: Extending your Data Warehouse
Hadoop: Extending your Data WarehouseHadoop: Extending your Data Warehouse
Hadoop: Extending your Data Warehouse
 
SQL In/On/Around Hadoop
SQL In/On/Around Hadoop SQL In/On/Around Hadoop
SQL In/On/Around Hadoop
 
Data stax webinar cassandra and titandb insights into datastax graph strategy...
Data stax webinar cassandra and titandb insights into datastax graph strategy...Data stax webinar cassandra and titandb insights into datastax graph strategy...
Data stax webinar cassandra and titandb insights into datastax graph strategy...
 
Terracotta Hadoop & In-Memory Webcast
Terracotta Hadoop & In-Memory WebcastTerracotta Hadoop & In-Memory Webcast
Terracotta Hadoop & In-Memory Webcast
 
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data HubCloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
 
Stora Enso&Wipro - Stora Enso Rethinks Supply Chain - ProcessForum Nordic, No...
Stora Enso&Wipro - Stora Enso Rethinks Supply Chain - ProcessForum Nordic, No...Stora Enso&Wipro - Stora Enso Rethinks Supply Chain - ProcessForum Nordic, No...
Stora Enso&Wipro - Stora Enso Rethinks Supply Chain - ProcessForum Nordic, No...
 
Optimize the performance, cost, and value of databases.pptx
Optimize the performance, cost, and value of databases.pptxOptimize the performance, cost, and value of databases.pptx
Optimize the performance, cost, and value of databases.pptx
 
DW 101
DW 101DW 101
DW 101
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking Applications
 
The Yellowbrick Impact for MicroStrategy
The Yellowbrick Impact for MicroStrategyThe Yellowbrick Impact for MicroStrategy
The Yellowbrick Impact for MicroStrategy
 
"Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wo...
"Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wo..."Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wo...
"Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wo...
 

Similar to Dairy data warehouse - Introducing the concept of Data Science and Big Data in a traditional data warehouse company

Hadoop and Your Data Warehouse
Hadoop and Your Data WarehouseHadoop and Your Data Warehouse
Hadoop and Your Data Warehouse
Caserta
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer
Caserta
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
DATAVERSITY
 
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seeling Cheung
 
Designing modern dw and data lake
Designing modern dw and data lakeDesigning modern dw and data lake
Designing modern dw and data lake
punedevscom
 
Data Vault Introduction
Data Vault IntroductionData Vault Introduction
Data Vault Introduction
Patrick Van Renterghem
 
ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013
Mark Rittman
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
James Serra
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic Architecture
Caserta
 
Ask bigger questions
Ask bigger questionsAsk bigger questions
Ask bigger questions
South West Data Meetup
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse OptimizationCloudera, Inc.
 
Designing a modern data warehouse in azure
Designing a modern data warehouse in azure   Designing a modern data warehouse in azure
Designing a modern data warehouse in azure
Antonios Chatzipavlis
 
Designing a modern data warehouse in azure
Designing a modern data warehouse in azure   Designing a modern data warehouse in azure
Designing a modern data warehouse in azure
Antonios Chatzipavlis
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
James Serra
 
Hadoop as a Data Hub
Hadoop as a Data HubHadoop as a Data Hub
Hadoop as a Data HubDianna Doan
 
Big Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data LakeBig Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data Lake
Caserta
 
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)Moacyr Passador
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
DATAVERSITY
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKRajesh Jayarman
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the Organization
Seeling Cheung
 

Similar to Dairy data warehouse - Introducing the concept of Data Science and Big Data in a traditional data warehouse company (20)

Hadoop and Your Data Warehouse
Hadoop and Your Data WarehouseHadoop and Your Data Warehouse
Hadoop and Your Data Warehouse
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
 
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
 
Designing modern dw and data lake
Designing modern dw and data lakeDesigning modern dw and data lake
Designing modern dw and data lake
 
Data Vault Introduction
Data Vault IntroductionData Vault Introduction
Data Vault Introduction
 
ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic Architecture
 
Ask bigger questions
Ask bigger questionsAsk bigger questions
Ask bigger questions
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse Optimization
 
Designing a modern data warehouse in azure
Designing a modern data warehouse in azure   Designing a modern data warehouse in azure
Designing a modern data warehouse in azure
 
Designing a modern data warehouse in azure
Designing a modern data warehouse in azure   Designing a modern data warehouse in azure
Designing a modern data warehouse in azure
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Hadoop as a Data Hub
Hadoop as a Data HubHadoop as a Data Hub
Hadoop as a Data Hub
 
Big Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data LakeBig Data: Setting Up the Big Data Lake
Big Data: Setting Up the Big Data Lake
 
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RK
 
Hadoop and SQL: Delivery Analytics Across the Organization
Hadoop and SQL:  Delivery Analytics Across the OrganizationHadoop and SQL:  Delivery Analytics Across the Organization
Hadoop and SQL: Delivery Analytics Across the Organization
 

More from BigDataExpo

Centric - Jaap huisprijzen, GTST, The Bold, IKEA en IENS. Zomaar wat toepassi...
Centric - Jaap huisprijzen, GTST, The Bold, IKEA en IENS. Zomaar wat toepassi...Centric - Jaap huisprijzen, GTST, The Bold, IKEA en IENS. Zomaar wat toepassi...
Centric - Jaap huisprijzen, GTST, The Bold, IKEA en IENS. Zomaar wat toepassi...
BigDataExpo
 
Google Cloud - Google's vision on AI
Google Cloud - Google's vision on AIGoogle Cloud - Google's vision on AI
Google Cloud - Google's vision on AI
BigDataExpo
 
Pacmed - Machine Learning in health care: opportunities and challanges in pra...
Pacmed - Machine Learning in health care: opportunities and challanges in pra...Pacmed - Machine Learning in health care: opportunities and challanges in pra...
Pacmed - Machine Learning in health care: opportunities and challanges in pra...
BigDataExpo
 
PGGM - The Future Explore
PGGM - The Future ExplorePGGM - The Future Explore
PGGM - The Future Explore
BigDataExpo
 
Universiteit Utrecht & gghdc - Wat zijn de gezondheidseffecten van omgeving e...
Universiteit Utrecht & gghdc - Wat zijn de gezondheidseffecten van omgeving e...Universiteit Utrecht & gghdc - Wat zijn de gezondheidseffecten van omgeving e...
Universiteit Utrecht & gghdc - Wat zijn de gezondheidseffecten van omgeving e...
BigDataExpo
 
Rob van Kranenburg - Kunnen we ons een sociaal krediet systeem zoals in het o...
Rob van Kranenburg - Kunnen we ons een sociaal krediet systeem zoals in het o...Rob van Kranenburg - Kunnen we ons een sociaal krediet systeem zoals in het o...
Rob van Kranenburg - Kunnen we ons een sociaal krediet systeem zoals in het o...
BigDataExpo
 
OrangeNXT - High accuracy mapping from videos for efficient fiber optic cable...
OrangeNXT - High accuracy mapping from videos for efficient fiber optic cable...OrangeNXT - High accuracy mapping from videos for efficient fiber optic cable...
OrangeNXT - High accuracy mapping from videos for efficient fiber optic cable...
BigDataExpo
 
Dynniq & GoDataDriven - Shaping the future of traffic with IoT and AI
Dynniq & GoDataDriven - Shaping the future of traffic with IoT and AIDynniq & GoDataDriven - Shaping the future of traffic with IoT and AI
Dynniq & GoDataDriven - Shaping the future of traffic with IoT and AI
BigDataExpo
 
Teleperformance - Smart personalized service door het gebruik van Data Science
Teleperformance - Smart personalized service door het gebruik van Data Science Teleperformance - Smart personalized service door het gebruik van Data Science
Teleperformance - Smart personalized service door het gebruik van Data Science
BigDataExpo
 
FunXtion - Interactive Digital Fitness with Data Analytics
FunXtion - Interactive Digital Fitness with Data AnalyticsFunXtion - Interactive Digital Fitness with Data Analytics
FunXtion - Interactive Digital Fitness with Data Analytics
BigDataExpo
 
fashionTrade - Vroeger noemde we dat Big Data
fashionTrade - Vroeger noemde we dat Big DatafashionTrade - Vroeger noemde we dat Big Data
fashionTrade - Vroeger noemde we dat Big Data
BigDataExpo
 
BigData Republic - Industrializing data science: a view from the trenches
BigData Republic - Industrializing data science: a view from the trenchesBigData Republic - Industrializing data science: a view from the trenches
BigData Republic - Industrializing data science: a view from the trenches
BigDataExpo
 
Bicos - Hear how a top sportswear company produced cutting-edge data infrastr...
Bicos - Hear how a top sportswear company produced cutting-edge data infrastr...Bicos - Hear how a top sportswear company produced cutting-edge data infrastr...
Bicos - Hear how a top sportswear company produced cutting-edge data infrastr...
BigDataExpo
 
Endrse - Next level online samenwerkingen tussen personalities en merken met ...
Endrse - Next level online samenwerkingen tussen personalities en merken met ...Endrse - Next level online samenwerkingen tussen personalities en merken met ...
Endrse - Next level online samenwerkingen tussen personalities en merken met ...
BigDataExpo
 
Bovag - Refine-IT - Proces optimalisatie in de automotive sector
Bovag - Refine-IT - Proces optimalisatie in de automotive sectorBovag - Refine-IT - Proces optimalisatie in de automotive sector
Bovag - Refine-IT - Proces optimalisatie in de automotive sector
BigDataExpo
 
Schiphol - Optimale doorstroom van passagiers op Schiphol dankzij slimme data...
Schiphol - Optimale doorstroom van passagiers op Schiphol dankzij slimme data...Schiphol - Optimale doorstroom van passagiers op Schiphol dankzij slimme data...
Schiphol - Optimale doorstroom van passagiers op Schiphol dankzij slimme data...
BigDataExpo
 
Veco - Big Data in de Supply Chain: Hoe Process Mining kan helpen kosten te r...
Veco - Big Data in de Supply Chain: Hoe Process Mining kan helpen kosten te r...Veco - Big Data in de Supply Chain: Hoe Process Mining kan helpen kosten te r...
Veco - Big Data in de Supply Chain: Hoe Process Mining kan helpen kosten te r...
BigDataExpo
 
Rabobank - There is something about Data
Rabobank - There is something about DataRabobank - There is something about Data
Rabobank - There is something about Data
BigDataExpo
 
VU Amsterdam - Big data en datagedreven waardecreatie: valt er nog iets te ki...
VU Amsterdam - Big data en datagedreven waardecreatie: valt er nog iets te ki...VU Amsterdam - Big data en datagedreven waardecreatie: valt er nog iets te ki...
VU Amsterdam - Big data en datagedreven waardecreatie: valt er nog iets te ki...
BigDataExpo
 
Booking.com - Data science and experimentation at Booking.com: a data-driven ...
Booking.com - Data science and experimentation at Booking.com: a data-driven ...Booking.com - Data science and experimentation at Booking.com: a data-driven ...
Booking.com - Data science and experimentation at Booking.com: a data-driven ...
BigDataExpo
 

More from BigDataExpo (20)

Centric - Jaap huisprijzen, GTST, The Bold, IKEA en IENS. Zomaar wat toepassi...
Centric - Jaap huisprijzen, GTST, The Bold, IKEA en IENS. Zomaar wat toepassi...Centric - Jaap huisprijzen, GTST, The Bold, IKEA en IENS. Zomaar wat toepassi...
Centric - Jaap huisprijzen, GTST, The Bold, IKEA en IENS. Zomaar wat toepassi...
 
Google Cloud - Google's vision on AI
Google Cloud - Google's vision on AIGoogle Cloud - Google's vision on AI
Google Cloud - Google's vision on AI
 
Pacmed - Machine Learning in health care: opportunities and challanges in pra...
Pacmed - Machine Learning in health care: opportunities and challanges in pra...Pacmed - Machine Learning in health care: opportunities and challanges in pra...
Pacmed - Machine Learning in health care: opportunities and challanges in pra...
 
PGGM - The Future Explore
PGGM - The Future ExplorePGGM - The Future Explore
PGGM - The Future Explore
 
Universiteit Utrecht & gghdc - Wat zijn de gezondheidseffecten van omgeving e...
Universiteit Utrecht & gghdc - Wat zijn de gezondheidseffecten van omgeving e...Universiteit Utrecht & gghdc - Wat zijn de gezondheidseffecten van omgeving e...
Universiteit Utrecht & gghdc - Wat zijn de gezondheidseffecten van omgeving e...
 
Rob van Kranenburg - Kunnen we ons een sociaal krediet systeem zoals in het o...
Rob van Kranenburg - Kunnen we ons een sociaal krediet systeem zoals in het o...Rob van Kranenburg - Kunnen we ons een sociaal krediet systeem zoals in het o...
Rob van Kranenburg - Kunnen we ons een sociaal krediet systeem zoals in het o...
 
OrangeNXT - High accuracy mapping from videos for efficient fiber optic cable...
OrangeNXT - High accuracy mapping from videos for efficient fiber optic cable...OrangeNXT - High accuracy mapping from videos for efficient fiber optic cable...
OrangeNXT - High accuracy mapping from videos for efficient fiber optic cable...
 
Dynniq & GoDataDriven - Shaping the future of traffic with IoT and AI
Dynniq & GoDataDriven - Shaping the future of traffic with IoT and AIDynniq & GoDataDriven - Shaping the future of traffic with IoT and AI
Dynniq & GoDataDriven - Shaping the future of traffic with IoT and AI
 
Teleperformance - Smart personalized service door het gebruik van Data Science
Teleperformance - Smart personalized service door het gebruik van Data Science Teleperformance - Smart personalized service door het gebruik van Data Science
Teleperformance - Smart personalized service door het gebruik van Data Science
 
FunXtion - Interactive Digital Fitness with Data Analytics
FunXtion - Interactive Digital Fitness with Data AnalyticsFunXtion - Interactive Digital Fitness with Data Analytics
FunXtion - Interactive Digital Fitness with Data Analytics
 
fashionTrade - Vroeger noemde we dat Big Data
fashionTrade - Vroeger noemde we dat Big DatafashionTrade - Vroeger noemde we dat Big Data
fashionTrade - Vroeger noemde we dat Big Data
 
BigData Republic - Industrializing data science: a view from the trenches
BigData Republic - Industrializing data science: a view from the trenchesBigData Republic - Industrializing data science: a view from the trenches
BigData Republic - Industrializing data science: a view from the trenches
 
Bicos - Hear how a top sportswear company produced cutting-edge data infrastr...
Bicos - Hear how a top sportswear company produced cutting-edge data infrastr...Bicos - Hear how a top sportswear company produced cutting-edge data infrastr...
Bicos - Hear how a top sportswear company produced cutting-edge data infrastr...
 
Endrse - Next level online samenwerkingen tussen personalities en merken met ...
Endrse - Next level online samenwerkingen tussen personalities en merken met ...Endrse - Next level online samenwerkingen tussen personalities en merken met ...
Endrse - Next level online samenwerkingen tussen personalities en merken met ...
 
Bovag - Refine-IT - Proces optimalisatie in de automotive sector
Bovag - Refine-IT - Proces optimalisatie in de automotive sectorBovag - Refine-IT - Proces optimalisatie in de automotive sector
Bovag - Refine-IT - Proces optimalisatie in de automotive sector
 
Schiphol - Optimale doorstroom van passagiers op Schiphol dankzij slimme data...
Schiphol - Optimale doorstroom van passagiers op Schiphol dankzij slimme data...Schiphol - Optimale doorstroom van passagiers op Schiphol dankzij slimme data...
Schiphol - Optimale doorstroom van passagiers op Schiphol dankzij slimme data...
 
Veco - Big Data in de Supply Chain: Hoe Process Mining kan helpen kosten te r...
Veco - Big Data in de Supply Chain: Hoe Process Mining kan helpen kosten te r...Veco - Big Data in de Supply Chain: Hoe Process Mining kan helpen kosten te r...
Veco - Big Data in de Supply Chain: Hoe Process Mining kan helpen kosten te r...
 
Rabobank - There is something about Data
Rabobank - There is something about DataRabobank - There is something about Data
Rabobank - There is something about Data
 
VU Amsterdam - Big data en datagedreven waardecreatie: valt er nog iets te ki...
VU Amsterdam - Big data en datagedreven waardecreatie: valt er nog iets te ki...VU Amsterdam - Big data en datagedreven waardecreatie: valt er nog iets te ki...
VU Amsterdam - Big data en datagedreven waardecreatie: valt er nog iets te ki...
 
Booking.com - Data science and experimentation at Booking.com: a data-driven ...
Booking.com - Data science and experimentation at Booking.com: a data-driven ...Booking.com - Data science and experimentation at Booking.com: a data-driven ...
Booking.com - Data science and experimentation at Booking.com: a data-driven ...
 

Recently uploaded

一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
nscud
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 

Recently uploaded (20)

一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 

Dairy data warehouse - Introducing the concept of Data Science and Big Data in a traditional data warehouse company

  • 1. INTRODUCING THE CONCEPTS OF DATA SCIENCE AND BIG DATA IN A TRADITIONAL DATA WAREHOUSE COMPANY Joris Bos Manager IT, DDW
  • 2. Dairy Data Warehouse (DDW) • Start up founded in 2013 • Based in Assen, Netherlands • DDW offers data services to dairy farm consultants and corporate stakeholders in the dairy industry across the globe
  • 3. DDW BIG Dairy Data Platform 1. DATA INPUT Herd management Data capture devices Milk recording data 2. DATA TRANSFORMATION 3. DEEP LEARNING DDW ETL unique technology transforms all data into a standard common format Dairy Data Warehouse Single consistent database Production Reproduction Disease 4. DDW PREDICTIONS
  • 4. ‘Advance to go’ • Daily uploads (batching) • Slow database XML exports • Calculations freezing data scientist’ pc • Valuable data locked away in private networks, widely spread • This new ‘thing’!
  • 5. Data warehouses are dead! Or? • Trendy, all will be superseded • New paradigm • On premises or Cloud? • Clearly differentiate use-cases, strengths • HDFS != DMS • Different target audience • Aim to retain value from previous investments
  • 6. DDW Big data infrastructure • Set-up from zero, best practices opportunity • Security, data access and data governance at the core • New roles and competences within the organization • Data lake at the center • Seperation storage / compute • Support for persisted, historical data now (batch) • Streaming (real-time) near future • Audit, audit, audit, audit
  • 7. Data Lake & Data Warehouse
  • 8. Big Data pipeline Data Engineer Data Scientist Value
  • 9. Chosen setup (Azure) • Storage : Azure Data Lake (Active Directory Security / Office365) • Integration runtimes (Data Factory) • ELT • Data warehouse as ‘just’ another data source • Data Catalog • Compute : On-demand clusters with Hortonworks distribution platform (HDP) • ML Deep Learning : Model calculation with GPU powered VMs • ETL to data warehouse for hot layer going forward
  • 11. Takeaways • Establish clear language around Big Data and tooling • Clearly position Big Data in your company (vision, budget, etc.) • Define new roles, workflows and conventions • Stick to best practices • Start small, Determine Minimum Viable Product • Cloud is great for ad-hoc, scalability demands • Iterate and experiment through POCs
  • 12. Thank you Feel free to contact me: Joris Bos J.bos@dairydatawarehouse.com