SlideShare a Scribd company logo
1 of 9
Download to read offline
Data Discoverability
Maggie Hays
Senior Product Manager -- Data Services
DataHub Town Hall -- September 25, 2020
2
Agenda
● Overview of Teams & Data Stack
● Current State of Data Discoverability
● Data Catalog Evaluation
● DataHub POC - Hypotheses, Progress, and Next Steps
● Brief Demo!
3
SpotHero’s Data-Focused Teams
Data Engineering
3 Engineers
SpotHero IQ
2 Engineers
1-3 Data Scientists
(We’re hiring!!)
Analytics
5 Business Analysts
4
Looker
Airflow
SpotHero’s Data Stack
SH Application
Data
Workflow Tools
Marketing Tools
Microservices
Clickstream
Analytics
Redshift
S3/Parquet
Fivetran
Segment
Kafka
SQL
Python
Spark
Sources Ingestion Storage ETL
5
1
2
3
Current State of Data Discoverability
Data Lineage is difficult to discover and navigate,
regardless of role or tenure
● Impact analysis is arduous; Engineers avoid breaking changes at all costs
● Prolonged debugging/troubleshooting data issues
Difficult to discover what data exists and/or
what it represents
● Reliance on tribal knowledge
● Large burden on the Analytics team to answer any/all questions
Confidence in Data Accuracy is neutral, but room for
improvement
● Once folks track down the data, they are relatively confident in its
accuracy
May 2020 Internal Survey - Engineering, Product, Analytics, Data Science teams; 47% response rate
6
Data Catalog Evaluation
DataHub
Amundsen
/ Marquez
Apache
Atlas Alation
Ease of Integration
Lineage Support
Configurable
Metadata
Affordability
7
1
2
3
DataHub POC - Hypotheses
Increase visibility into what data exists, what it
represents, and how it’s used across the
company
Decrease the effort required by SpotHero
teammates to use and interpret data
Increase SpotHero teammates’ confidence in
the accuracy and/or relevance of data
8
Looker
Airflow
DataHub POC - Progress & Next Steps
SH Application
Data
Workflow Tools
Marketing Tools
Microservices
Clickstream
Analytics
Redshift
S3/Parquet
Fivetran
Segment
Kafka
SQL
Python
Spark
Sources Ingestion Storage ETL
Complete
Q4 2020
9
Quick Demo!

More Related Content

What's hot

Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021Tristan Baker
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks FundamentalsDalibor Wijas
 
Change Data Feed in Delta
Change Data Feed in DeltaChange Data Feed in Delta
Change Data Feed in DeltaDatabricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Logical Data Fabric: Architectural Components
Logical Data Fabric: Architectural ComponentsLogical Data Fabric: Architectural Components
Logical Data Fabric: Architectural ComponentsDenodo
 
State of the Trino Project
State of the Trino ProjectState of the Trino Project
State of the Trino ProjectMartin Traverso
 
Building End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCPBuilding End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCPDatabricks
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?DATAVERSITY
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptxAlex Ivy
 
Apache Beam and Google Cloud Dataflow - IDG - final
Apache Beam and Google Cloud Dataflow - IDG - finalApache Beam and Google Cloud Dataflow - IDG - final
Apache Beam and Google Cloud Dataflow - IDG - finalSub Szabolcs Feczak
 
Building a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache ArrowBuilding a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache ArrowDremio Corporation
 
Building Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta LakeBuilding Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta LakeDatabricks
 
OpenTelemetry: From front- to backend (2022)
OpenTelemetry: From front- to backend (2022)OpenTelemetry: From front- to backend (2022)
OpenTelemetry: From front- to backend (2022)Sebastian Poxhofer
 
Large Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured StreamingLarge Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured StreamingDatabricks
 
Enterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshEnterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshSion Smith
 
Delta from a Data Engineer's Perspective
Delta from a Data Engineer's PerspectiveDelta from a Data Engineer's Perspective
Delta from a Data Engineer's PerspectiveDatabricks
 
Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...
Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...
Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...DataWorks Summit
 
Oracle database 12c data masking and subsetting guide
Oracle database 12c data masking and subsetting guideOracle database 12c data masking and subsetting guide
Oracle database 12c data masking and subsetting guidebupbechanhgmail
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 

What's hot (20)

Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks Fundamentals
 
Change Data Feed in Delta
Change Data Feed in DeltaChange Data Feed in Delta
Change Data Feed in Delta
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Logical Data Fabric: Architectural Components
Logical Data Fabric: Architectural ComponentsLogical Data Fabric: Architectural Components
Logical Data Fabric: Architectural Components
 
State of the Trino Project
State of the Trino ProjectState of the Trino Project
State of the Trino Project
 
Building End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCPBuilding End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCP
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
 
Apache Beam and Google Cloud Dataflow - IDG - final
Apache Beam and Google Cloud Dataflow - IDG - finalApache Beam and Google Cloud Dataflow - IDG - final
Apache Beam and Google Cloud Dataflow - IDG - final
 
Building a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache ArrowBuilding a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache Arrow
 
Building Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta LakeBuilding Data Quality pipelines with Apache Spark and Delta Lake
Building Data Quality pipelines with Apache Spark and Delta Lake
 
OpenTelemetry: From front- to backend (2022)
OpenTelemetry: From front- to backend (2022)OpenTelemetry: From front- to backend (2022)
OpenTelemetry: From front- to backend (2022)
 
Large Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured StreamingLarge Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured Streaming
 
Enterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshEnterprise guide to building a Data Mesh
Enterprise guide to building a Data Mesh
 
Delta from a Data Engineer's Perspective
Delta from a Data Engineer's PerspectiveDelta from a Data Engineer's Perspective
Delta from a Data Engineer's Perspective
 
Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...
Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...
Interactive real-time dashboards on data streams using Kafka, Druid, and Supe...
 
Oracle database 12c data masking and subsetting guide
Oracle database 12c data masking and subsetting guideOracle database 12c data masking and subsetting guide
Oracle database 12c data masking and subsetting guide
 
Data Mesh
Data MeshData Mesh
Data Mesh
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 

Similar to Data Discoverability at SpotHero

Data Discoverability with DataHub
Data Discoverability with DataHubData Discoverability with DataHub
Data Discoverability with DataHubGleb Mezhanskiy
 
From Foundation to Mastery – Building a Mature Analytics Roadmap - Manav Misra
From Foundation to Mastery – Building a Mature Analytics Roadmap - Manav MisraFrom Foundation to Mastery – Building a Mature Analytics Roadmap - Manav Misra
From Foundation to Mastery – Building a Mature Analytics Roadmap - Manav MisraMolly Alexander
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarHortonworks
 
Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Agile Leadership: Guiding DataOps Teams Through Rapid Change and UncertaintyAgile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Agile Leadership: Guiding DataOps Teams Through Rapid Change and UncertaintyTamrMarketing
 
Predictive Human Capital Analytics (1).pptx
Predictive Human Capital Analytics (1).pptxPredictive Human Capital Analytics (1).pptx
Predictive Human Capital Analytics (1).pptxSaminaNawaz14
 
rough-work.pptx
rough-work.pptxrough-work.pptx
rough-work.pptxsharpan
 
Rabobank - There is something about Data
Rabobank - There is something about DataRabobank - There is something about Data
Rabobank - There is something about DataBigDataExpo
 
1.-DE-LECTURE-1-INTRO-TO-DATA-ENGG.pptx
1.-DE-LECTURE-1-INTRO-TO-DATA-ENGG.pptx1.-DE-LECTURE-1-INTRO-TO-DATA-ENGG.pptx
1.-DE-LECTURE-1-INTRO-TO-DATA-ENGG.pptxarpit206900
 
Predictive Analytics - Big Data Warehousing Meetup
Predictive Analytics - Big Data Warehousing MeetupPredictive Analytics - Big Data Warehousing Meetup
Predictive Analytics - Big Data Warehousing MeetupCaserta
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationDenodo
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...Denodo
 
Simplified Machine Learning, Text, and Graph Analytics with Pivotal Greenplum
Simplified Machine Learning, Text, and Graph Analytics with Pivotal GreenplumSimplified Machine Learning, Text, and Graph Analytics with Pivotal Greenplum
Simplified Machine Learning, Text, and Graph Analytics with Pivotal GreenplumVMware Tanzu
 
Max Cottica slides from Future of Business Intelligence
Max Cottica slides from Future of Business Intelligence Max Cottica slides from Future of Business Intelligence
Max Cottica slides from Future of Business Intelligence Lauren Campbell Assoc CIPD
 
Data-Ed Webinar: Data Modeling Fundamentals
Data-Ed Webinar: Data Modeling FundamentalsData-Ed Webinar: Data Modeling Fundamentals
Data-Ed Webinar: Data Modeling FundamentalsDATAVERSITY
 
Introduction to Data Science (Data Summit, 2017)
Introduction to Data Science (Data Summit, 2017)Introduction to Data Science (Data Summit, 2017)
Introduction to Data Science (Data Summit, 2017)Caserta
 
Data Discovery and Metadata
Data Discovery and MetadataData Discovery and Metadata
Data Discovery and Metadatamarkgrover
 
Advanced Analytics and Machine Learning with Data Virtualization (India)
Advanced Analytics and Machine Learning with Data Virtualization (India)Advanced Analytics and Machine Learning with Data Virtualization (India)
Advanced Analytics and Machine Learning with Data Virtualization (India)Denodo
 
Data Wrangling and the Art of Big Data Discovery
Data Wrangling and the Art of Big Data DiscoveryData Wrangling and the Art of Big Data Discovery
Data Wrangling and the Art of Big Data DiscoveryInside Analysis
 
P6 analytics product roadmap and overview - Oracle Primavera P6 Collaborate 14
P6 analytics product roadmap and overview - Oracle Primavera P6 Collaborate 14P6 analytics product roadmap and overview - Oracle Primavera P6 Collaborate 14
P6 analytics product roadmap and overview - Oracle Primavera P6 Collaborate 14p6academy
 

Similar to Data Discoverability at SpotHero (20)

Data Discoverability with DataHub
Data Discoverability with DataHubData Discoverability with DataHub
Data Discoverability with DataHub
 
From Foundation to Mastery – Building a Mature Analytics Roadmap - Manav Misra
From Foundation to Mastery – Building a Mature Analytics Roadmap - Manav MisraFrom Foundation to Mastery – Building a Mature Analytics Roadmap - Manav Misra
From Foundation to Mastery – Building a Mature Analytics Roadmap - Manav Misra
 
Joe C
Joe CJoe C
Joe C
 
Cloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinarCloudian 451-hortonworks - webinar
Cloudian 451-hortonworks - webinar
 
Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Agile Leadership: Guiding DataOps Teams Through Rapid Change and UncertaintyAgile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
Agile Leadership: Guiding DataOps Teams Through Rapid Change and Uncertainty
 
Predictive Human Capital Analytics (1).pptx
Predictive Human Capital Analytics (1).pptxPredictive Human Capital Analytics (1).pptx
Predictive Human Capital Analytics (1).pptx
 
rough-work.pptx
rough-work.pptxrough-work.pptx
rough-work.pptx
 
Rabobank - There is something about Data
Rabobank - There is something about DataRabobank - There is something about Data
Rabobank - There is something about Data
 
1.-DE-LECTURE-1-INTRO-TO-DATA-ENGG.pptx
1.-DE-LECTURE-1-INTRO-TO-DATA-ENGG.pptx1.-DE-LECTURE-1-INTRO-TO-DATA-ENGG.pptx
1.-DE-LECTURE-1-INTRO-TO-DATA-ENGG.pptx
 
Predictive Analytics - Big Data Warehousing Meetup
Predictive Analytics - Big Data Warehousing MeetupPredictive Analytics - Big Data Warehousing Meetup
Predictive Analytics - Big Data Warehousing Meetup
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data Virtualization
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
 
Simplified Machine Learning, Text, and Graph Analytics with Pivotal Greenplum
Simplified Machine Learning, Text, and Graph Analytics with Pivotal GreenplumSimplified Machine Learning, Text, and Graph Analytics with Pivotal Greenplum
Simplified Machine Learning, Text, and Graph Analytics with Pivotal Greenplum
 
Max Cottica slides from Future of Business Intelligence
Max Cottica slides from Future of Business Intelligence Max Cottica slides from Future of Business Intelligence
Max Cottica slides from Future of Business Intelligence
 
Data-Ed Webinar: Data Modeling Fundamentals
Data-Ed Webinar: Data Modeling FundamentalsData-Ed Webinar: Data Modeling Fundamentals
Data-Ed Webinar: Data Modeling Fundamentals
 
Introduction to Data Science (Data Summit, 2017)
Introduction to Data Science (Data Summit, 2017)Introduction to Data Science (Data Summit, 2017)
Introduction to Data Science (Data Summit, 2017)
 
Data Discovery and Metadata
Data Discovery and MetadataData Discovery and Metadata
Data Discovery and Metadata
 
Advanced Analytics and Machine Learning with Data Virtualization (India)
Advanced Analytics and Machine Learning with Data Virtualization (India)Advanced Analytics and Machine Learning with Data Virtualization (India)
Advanced Analytics and Machine Learning with Data Virtualization (India)
 
Data Wrangling and the Art of Big Data Discovery
Data Wrangling and the Art of Big Data DiscoveryData Wrangling and the Art of Big Data Discovery
Data Wrangling and the Art of Big Data Discovery
 
P6 analytics product roadmap and overview - Oracle Primavera P6 Collaborate 14
P6 analytics product roadmap and overview - Oracle Primavera P6 Collaborate 14P6 analytics product roadmap and overview - Oracle Primavera P6 Collaborate 14
P6 analytics product roadmap and overview - Oracle Primavera P6 Collaborate 14
 

Recently uploaded

Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...aditisharan08
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 

Recently uploaded (20)

Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 

Data Discoverability at SpotHero

  • 1. Data Discoverability Maggie Hays Senior Product Manager -- Data Services DataHub Town Hall -- September 25, 2020
  • 2. 2 Agenda ● Overview of Teams & Data Stack ● Current State of Data Discoverability ● Data Catalog Evaluation ● DataHub POC - Hypotheses, Progress, and Next Steps ● Brief Demo!
  • 3. 3 SpotHero’s Data-Focused Teams Data Engineering 3 Engineers SpotHero IQ 2 Engineers 1-3 Data Scientists (We’re hiring!!) Analytics 5 Business Analysts
  • 4. 4 Looker Airflow SpotHero’s Data Stack SH Application Data Workflow Tools Marketing Tools Microservices Clickstream Analytics Redshift S3/Parquet Fivetran Segment Kafka SQL Python Spark Sources Ingestion Storage ETL
  • 5. 5 1 2 3 Current State of Data Discoverability Data Lineage is difficult to discover and navigate, regardless of role or tenure ● Impact analysis is arduous; Engineers avoid breaking changes at all costs ● Prolonged debugging/troubleshooting data issues Difficult to discover what data exists and/or what it represents ● Reliance on tribal knowledge ● Large burden on the Analytics team to answer any/all questions Confidence in Data Accuracy is neutral, but room for improvement ● Once folks track down the data, they are relatively confident in its accuracy May 2020 Internal Survey - Engineering, Product, Analytics, Data Science teams; 47% response rate
  • 6. 6 Data Catalog Evaluation DataHub Amundsen / Marquez Apache Atlas Alation Ease of Integration Lineage Support Configurable Metadata Affordability
  • 7. 7 1 2 3 DataHub POC - Hypotheses Increase visibility into what data exists, what it represents, and how it’s used across the company Decrease the effort required by SpotHero teammates to use and interpret data Increase SpotHero teammates’ confidence in the accuracy and/or relevance of data
  • 8. 8 Looker Airflow DataHub POC - Progress & Next Steps SH Application Data Workflow Tools Marketing Tools Microservices Clickstream Analytics Redshift S3/Parquet Fivetran Segment Kafka SQL Python Spark Sources Ingestion Storage ETL Complete Q4 2020