SlideShare a Scribd company logo
Rapid Data Analytics @ Netflix
Jason Flittner
Senior BI Engineer
Chris Stephens
Senior Data Engineer
Monisha Kanoth
Senior Data Architect
What We Do
633643 DEA @ Netflix
Content Analytics
Global
Expansion &
Content Spend
Freedom & Responsibility
Highly Aligned, Loosely
Coupled
Context, not Control
Culture + Technology
Courage
Judgement
Honesty
Communication
Curiosity
Passion
Innovation
Impact
Selflessness
Parquet FF
Storage Compute Tools BI
AWS
S3
(Hadoop
clusters)
Deploy Fast, Fix Faster
● Improve & Iterate vs Perfect
● Have a Rollback Plan Ready
Develop Business
Logic not ETL
● Think in Patterns
The Path of Least Resistance is the
Right Path
● Make Smart Engineering
Tradeoffs
The Clock starts Ticking when you
Deploy
● Every Data Pipeline comes with
an Expiration Date
● Deprecate and Prune
No Man’s Land
is Expensive
● Ownership
Be a Noob
● User Groups
What You Could Do
in your Data Warehouse
Let everyone drop tables in production
Cost / Benefit
Conscientious people make mistakes,
but not very often
Data warehouse is not an operational system
What happens if a table is accidentally dropped?
● Do you have backups?
● How quickly can you restore a table?
Is the benefit of worth the tax on every data /
analytical product your team produces?
We have some protection
In Hive, all tables are external tables pointing to S3 locations.
ETL writes a new “batch” of data then updates the metastore.
s3://[bucket]/hive/schema.db/table/batchid=1459364911
ALTER TABLE table SET LOCATION [path to new batch ID];
DROP TABLE does not delete any data.
In our MPP databases, we have a procedure for upgrading and
downgrading our privileges.
CALL admin.UpgradePrivileges('me')
Lasts for several hours. Usage is logged.
Accidents? Restore from backups. Or reload from Hive.
When other teams are ready to move to production ...
We’re done. And moving on to the next thing.
You can trust your people to work the same way.
Don’t have an “on call”
(Use a “first responder” instead)
Everyone on the team takes a shift: both BI and data engineers
(even managers every once in a while!)
First Responder = the first one to respond
● handles most common failures (restarting jobs)
● reaches out directly to ETL owner if escalation is required
● handles communication surrounding ETL delays
Goal is to protect the team’s time and focus
How we do this
● visually define what needs attention and what doesn’t
○ “above the line” vs “below the line”
● email alerts for “above the line” jobs that take longer than normal
● playbook for fixing common stuff
○ the more complete your entries are, the less you get called!
Have a very clear sense of what is urgent, and what isn’t
Treating every failure like it’s urgent bleeds your team of the time they
need to do work
Build your processes so they can be ignored for 3 days
● don’t load data if it’s incomplete
● reprocess fact data for several days instead of picking up the latest
Gives you the freedom to judge whether a failure is worth an
interruption
Everybody owns ETL
(when they need to)
BI engineer needs data structured a certain way for a report
Many environments:
● Ask a data engineer to build them a table
Our environment:
● Let them schedule a Hive script and adjust as necessary
We focus on centers of excellence, not role boundaries
More Examples:
● our BI engineers use Python to automate tasks
● our data engineers have Tableau licenses, and use them for
quick visualizations and report deployments
For small tasks, this helps us avoid the overhead of interruption and
knowledge transfer
What You Could Do
on the Front-end
Parquet FF
(Hadoop
clusters)
Storage Compute Data Interface Data Access, Analytics and Visualization
AWS
S3
Do Not Limit Yourself to Conventional Tools
○ Tableau - Data Visualization and Dashboards
○ MicroStrategy - Dynamic SQL and Metadata
○ Python or Custom Reporting - Emails
Give your BI Engineers
Superpowers (like this guy)
○ Provide a data platform
○ BI + Data Engineering
○ Context not Requirements
○ Be early adopters
Simple is
Often Best
Dismantle your Data Warehouse Team
○ Integrate with the business
○ Data Engineering and Data Science teams
○ Open and honest communication
Fast is better than perfect
○ Build, iterate… repeat
○ How to handle adhocs
○ Freedom - make the right call
○ Responsibility - Ownership
Encourage
Hacking
Questions?
Want to chill with us!?
jobs.netflix.com

More Related Content

What's hot

Kettle – Etl Tool
Kettle – Etl ToolKettle – Etl Tool
Kettle – Etl Tool
Dr Anjan Krishnamurthy
 
BIG DATA-Seminar Report
BIG DATA-Seminar ReportBIG DATA-Seminar Report
BIG DATA-Seminar Report
josnapv
 
DATA MART APPROCHES TO ARCHITECTURE
DATA MART APPROCHES TO ARCHITECTUREDATA MART APPROCHES TO ARCHITECTURE
DATA MART APPROCHES TO ARCHITECTURE
Sachin Batham
 
Challenges in Building a Data Pipeline
Challenges in Building a Data PipelineChallenges in Building a Data Pipeline
Challenges in Building a Data Pipeline
Manish Kumar
 
Understanding MLOps
Understanding MLOpsUnderstanding MLOps
Understanding MLOps
Chun Myung Kyu
 
Data Visualization using different python libraries.pptx
Data Visualization using different python libraries.pptxData Visualization using different python libraries.pptx
Data Visualization using different python libraries.pptx
HamzaAli998966
 
Big Data Management: What's New, What's Different, and What You Need To Know
Big Data Management: What's New, What's Different, and What You Need To KnowBig Data Management: What's New, What's Different, and What You Need To Know
Big Data Management: What's New, What's Different, and What You Need To Know
SnapLogic
 
Active Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with AlationActive Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with Alation
Databricks
 
Commercializing Alternative Data
Commercializing Alternative DataCommercializing Alternative Data
Commercializing Alternative Data
Databricks
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
Guido Schmutz
 
Best Practices for Killer Data Visualization
Best Practices for Killer Data VisualizationBest Practices for Killer Data Visualization
Best Practices for Killer Data Visualization
Qualtrics
 
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Apache Kafka for Real-time Supply Chainin the Food and Retail IndustryApache Kafka for Real-time Supply Chainin the Food and Retail Industry
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Kai Wähner
 
Big Data Fundamentals
Big Data FundamentalsBig Data Fundamentals
Big Data Fundamentals
Cloudera, Inc.
 
[D2]java 성능에 대한 오해와 편견
[D2]java 성능에 대한 오해와 편견[D2]java 성능에 대한 오해와 편견
[D2]java 성능에 대한 오해와 편견
NAVER D2
 
Using Databricks as an Analysis Platform
Using Databricks as an Analysis PlatformUsing Databricks as an Analysis Platform
Using Databricks as an Analysis Platform
Databricks
 
Spotify in the Cloud - An evolution of data infrastructure - Strata NYC
Spotify in the Cloud - An evolution of data infrastructure - Strata NYCSpotify in the Cloud - An evolution of data infrastructure - Strata NYC
Spotify in the Cloud - An evolution of data infrastructure - Strata NYC
Josh Baer
 
Database migration
Database migrationDatabase migration
Database migration
Sankar Patnaik
 
Snowflake Overview
Snowflake OverviewSnowflake Overview
Snowflake Overview
Snowflake Computing
 
Data Science At Zillow
Data Science At ZillowData Science At Zillow
Data Science At Zillow
Nicholas McClure
 

What's hot (20)

Kettle – Etl Tool
Kettle – Etl ToolKettle – Etl Tool
Kettle – Etl Tool
 
BIG DATA-Seminar Report
BIG DATA-Seminar ReportBIG DATA-Seminar Report
BIG DATA-Seminar Report
 
DATA MART APPROCHES TO ARCHITECTURE
DATA MART APPROCHES TO ARCHITECTUREDATA MART APPROCHES TO ARCHITECTURE
DATA MART APPROCHES TO ARCHITECTURE
 
Challenges in Building a Data Pipeline
Challenges in Building a Data PipelineChallenges in Building a Data Pipeline
Challenges in Building a Data Pipeline
 
Understanding MLOps
Understanding MLOpsUnderstanding MLOps
Understanding MLOps
 
Data Visualization using different python libraries.pptx
Data Visualization using different python libraries.pptxData Visualization using different python libraries.pptx
Data Visualization using different python libraries.pptx
 
Big Data Management: What's New, What's Different, and What You Need To Know
Big Data Management: What's New, What's Different, and What You Need To KnowBig Data Management: What's New, What's Different, and What You Need To Know
Big Data Management: What's New, What's Different, and What You Need To Know
 
Active Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with AlationActive Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with Alation
 
Commercializing Alternative Data
Commercializing Alternative DataCommercializing Alternative Data
Commercializing Alternative Data
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Best Practices for Killer Data Visualization
Best Practices for Killer Data VisualizationBest Practices for Killer Data Visualization
Best Practices for Killer Data Visualization
 
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
Apache Kafka for Real-time Supply Chainin the Food and Retail IndustryApache Kafka for Real-time Supply Chainin the Food and Retail Industry
Apache Kafka for Real-time Supply Chain in the Food and Retail Industry
 
Big Data Fundamentals
Big Data FundamentalsBig Data Fundamentals
Big Data Fundamentals
 
Big Data: Issues and Challenges
Big Data: Issues and ChallengesBig Data: Issues and Challenges
Big Data: Issues and Challenges
 
[D2]java 성능에 대한 오해와 편견
[D2]java 성능에 대한 오해와 편견[D2]java 성능에 대한 오해와 편견
[D2]java 성능에 대한 오해와 편견
 
Using Databricks as an Analysis Platform
Using Databricks as an Analysis PlatformUsing Databricks as an Analysis Platform
Using Databricks as an Analysis Platform
 
Spotify in the Cloud - An evolution of data infrastructure - Strata NYC
Spotify in the Cloud - An evolution of data infrastructure - Strata NYCSpotify in the Cloud - An evolution of data infrastructure - Strata NYC
Spotify in the Cloud - An evolution of data infrastructure - Strata NYC
 
Database migration
Database migrationDatabase migration
Database migration
 
Snowflake Overview
Snowflake OverviewSnowflake Overview
Snowflake Overview
 
Data Science At Zillow
Data Science At ZillowData Science At Zillow
Data Science At Zillow
 

Viewers also liked

Use of Analytics by Netflix - Case Study
Use of Analytics by Netflix - Case StudyUse of Analytics by Netflix - Case Study
Use of Analytics by Netflix - Case Study
Saket Toshniwal
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering Meetup
Blake Irvine
 
Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...
Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...
Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...
Data Con LA
 
Netflix-Using analytics to predict hits
Netflix-Using analytics to predict hitsNetflix-Using analytics to predict hits
Netflix-Using analytics to predict hits
Gaurav Dutta
 
Data Warehousing Patterns for Hadoop
Data Warehousing Patterns for HadoopData Warehousing Patterns for Hadoop
Data Warehousing Patterns for Hadoop
Michelle Ufford
 
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Data Con LA
 
Big Data Day LA 2016/ Data Science Track - Intuit's Payments Risk Platform, D...
Big Data Day LA 2016/ Data Science Track - Intuit's Payments Risk Platform, D...Big Data Day LA 2016/ Data Science Track - Intuit's Payments Risk Platform, D...
Big Data Day LA 2016/ Data Science Track - Intuit's Payments Risk Platform, D...
Data Con LA
 
Big Data Day LA 2016/ Use Case Driven track - How to Use Design Thinking to J...
Big Data Day LA 2016/ Use Case Driven track - How to Use Design Thinking to J...Big Data Day LA 2016/ Use Case Driven track - How to Use Design Thinking to J...
Big Data Day LA 2016/ Use Case Driven track - How to Use Design Thinking to J...
Data Con LA
 
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Data Con LA
 
Big Data Day LA 2016/ NoSQL track - Spark And Couchbase: Augmenting The Opera...
Big Data Day LA 2016/ NoSQL track - Spark And Couchbase: Augmenting The Opera...Big Data Day LA 2016/ NoSQL track - Spark And Couchbase: Augmenting The Opera...
Big Data Day LA 2016/ NoSQL track - Spark And Couchbase: Augmenting The Opera...
Data Con LA
 
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan SharmaSparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Spark Summit
 
(BDT207) Real-Time Analytics In Service Of Self-Healing Ecosystems
(BDT207) Real-Time Analytics In Service Of Self-Healing Ecosystems(BDT207) Real-Time Analytics In Service Of Self-Healing Ecosystems
(BDT207) Real-Time Analytics In Service Of Self-Healing Ecosystems
Amazon Web Services
 
eMatrics Summit Milano: Come gli Open Big Data possono migliorare la vostra v...
eMatrics Summit Milano: Come gli Open Big Data possono migliorare la vostra v...eMatrics Summit Milano: Come gli Open Big Data possono migliorare la vostra v...
eMatrics Summit Milano: Come gli Open Big Data possono migliorare la vostra v...
Monia Spinelli
 
C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...
C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...
C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...
DataStax
 
101129 tokyopref bochibochi
101129 tokyopref bochibochi101129 tokyopref bochibochi
101129 tokyopref bochibochiredgang
 
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of GruterBig Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Data Con LA
 
Big Data Day LA 2015 - The Big Data Journey: How Big Data Practices Evolve at...
Big Data Day LA 2015 - The Big Data Journey: How Big Data Practices Evolve at...Big Data Day LA 2015 - The Big Data Journey: How Big Data Practices Evolve at...
Big Data Day LA 2015 - The Big Data Journey: How Big Data Practices Evolve at...
Data Con LA
 
Big Data Day LA 2015 - Using data visualization to find patterns in multidime...
Big Data Day LA 2015 - Using data visualization to find patterns in multidime...Big Data Day LA 2015 - Using data visualization to find patterns in multidime...
Big Data Day LA 2015 - Using data visualization to find patterns in multidime...
Data Con LA
 
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Data Con LA
 

Viewers also liked (20)

Use of Analytics by Netflix - Case Study
Use of Analytics by Netflix - Case StudyUse of Analytics by Netflix - Case Study
Use of Analytics by Netflix - Case Study
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering Meetup
 
Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...
Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...
Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...
 
Netflix-Using analytics to predict hits
Netflix-Using analytics to predict hitsNetflix-Using analytics to predict hits
Netflix-Using analytics to predict hits
 
Data Warehousing Patterns for Hadoop
Data Warehousing Patterns for HadoopData Warehousing Patterns for Hadoop
Data Warehousing Patterns for Hadoop
 
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
 
Big Data Day LA 2016/ Data Science Track - Intuit's Payments Risk Platform, D...
Big Data Day LA 2016/ Data Science Track - Intuit's Payments Risk Platform, D...Big Data Day LA 2016/ Data Science Track - Intuit's Payments Risk Platform, D...
Big Data Day LA 2016/ Data Science Track - Intuit's Payments Risk Platform, D...
 
Big Data Day LA 2016/ Use Case Driven track - How to Use Design Thinking to J...
Big Data Day LA 2016/ Use Case Driven track - How to Use Design Thinking to J...Big Data Day LA 2016/ Use Case Driven track - How to Use Design Thinking to J...
Big Data Day LA 2016/ Use Case Driven track - How to Use Design Thinking to J...
 
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
Big Data Day LA 2016/ Big Data Track - Portable Stream and Batch Processing w...
 
Big Data Day LA 2016/ NoSQL track - Spark And Couchbase: Augmenting The Opera...
Big Data Day LA 2016/ NoSQL track - Spark And Couchbase: Augmenting The Opera...Big Data Day LA 2016/ NoSQL track - Spark And Couchbase: Augmenting The Opera...
Big Data Day LA 2016/ NoSQL track - Spark And Couchbase: Augmenting The Opera...
 
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan SharmaSparking up Data Engineering: Spark Summit East talk by Rohan Sharma
Sparking up Data Engineering: Spark Summit East talk by Rohan Sharma
 
(BDT207) Real-Time Analytics In Service Of Self-Healing Ecosystems
(BDT207) Real-Time Analytics In Service Of Self-Healing Ecosystems(BDT207) Real-Time Analytics In Service Of Self-Healing Ecosystems
(BDT207) Real-Time Analytics In Service Of Self-Healing Ecosystems
 
eMatrics Summit Milano: Come gli Open Big Data possono migliorare la vostra v...
eMatrics Summit Milano: Come gli Open Big Data possono migliorare la vostra v...eMatrics Summit Milano: Come gli Open Big Data possono migliorare la vostra v...
eMatrics Summit Milano: Come gli Open Big Data possono migliorare la vostra v...
 
C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...
C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...
C* Capacity Forecasting (Ajay Upadhyay, Jyoti Shandil, Arun Agrawal, Netflix)...
 
Dot pab forum september 2011
Dot pab forum september 2011Dot pab forum september 2011
Dot pab forum september 2011
 
101129 tokyopref bochibochi
101129 tokyopref bochibochi101129 tokyopref bochibochi
101129 tokyopref bochibochi
 
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of GruterBig Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
 
Big Data Day LA 2015 - The Big Data Journey: How Big Data Practices Evolve at...
Big Data Day LA 2015 - The Big Data Journey: How Big Data Practices Evolve at...Big Data Day LA 2015 - The Big Data Journey: How Big Data Practices Evolve at...
Big Data Day LA 2015 - The Big Data Journey: How Big Data Practices Evolve at...
 
Big Data Day LA 2015 - Using data visualization to find patterns in multidime...
Big Data Day LA 2015 - Using data visualization to find patterns in multidime...Big Data Day LA 2015 - Using data visualization to find patterns in multidime...
Big Data Day LA 2015 - Using data visualization to find patterns in multidime...
 
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
Big Data Day LA 2015 - Transforming into a data driven enterprise using exist...
 

Similar to Rapid Data Analytics @ Netflix

Big Data at a Gaming Company: Spil Games
Big Data at a Gaming Company: Spil GamesBig Data at a Gaming Company: Spil Games
Big Data at a Gaming Company: Spil Games
Rob Winters
 
ETL Practices for Better or Worse
ETL Practices for Better or WorseETL Practices for Better or Worse
ETL Practices for Better or WorseEric Sun
 
Python for Data Logistics
Python for Data LogisticsPython for Data Logistics
Python for Data Logistics
Ken Farmer
 
Architecting for analytics
Architecting for analyticsArchitecting for analytics
Architecting for analytics
Rob Winters
 
Pitchero - Increasing agility through DevOps - Leeds DevOps November 2016
Pitchero - Increasing agility through DevOps - Leeds DevOps November 2016Pitchero - Increasing agility through DevOps - Leeds DevOps November 2016
Pitchero - Increasing agility through DevOps - Leeds DevOps November 2016
Jon Milsom
 
Simply Business' Data Platform
Simply Business' Data PlatformSimply Business' Data Platform
Simply Business' Data Platform
Dani Solà Lagares
 
Business in the Driver’s Seat – An Improved Model for Integration
Business in the Driver’s Seat – An Improved Model for IntegrationBusiness in the Driver’s Seat – An Improved Model for Integration
Business in the Driver’s Seat – An Improved Model for Integration
Inside Analysis
 
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Daniel Zivkovic
 
Agile Business Intelligence
Agile Business IntelligenceAgile Business Intelligence
Agile Business Intelligence
David Portnoy
 
2011 06 15 velocity conf from visible ops to dev ops final
2011 06 15 velocity conf   from visible ops to dev ops final2011 06 15 velocity conf   from visible ops to dev ops final
2011 06 15 velocity conf from visible ops to dev ops final
Gene Kim
 
Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)
Kent Graziano
 
Building successful data science teams
Building successful data science teamsBuilding successful data science teams
Building successful data science teams
Venkatesh Umaashankar
 
Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data Stack
Anant Corporation
 
Inextricably linked: reproducibility and productivity in data science and AI
Inextricably linked: reproducibility and productivity in data science and AIInextricably linked: reproducibility and productivity in data science and AI
Inextricably linked: reproducibility and productivity in data science and AI
Luke Marsden
 
A field guide to the Financial Times, Rhys Evans, Financial Times
A field guide to the Financial Times, Rhys Evans, Financial TimesA field guide to the Financial Times, Rhys Evans, Financial Times
A field guide to the Financial Times, Rhys Evans, Financial Times
Neo4j
 
Agile methods and dw mha
Agile methods and dw mhaAgile methods and dw mha
Agile methods and dw mha
AgileDenver
 
The Right Data Warehouse: Automation Now, Business Value Thereafter
The Right Data Warehouse: Automation Now, Business Value ThereafterThe Right Data Warehouse: Automation Now, Business Value Thereafter
The Right Data Warehouse: Automation Now, Business Value Thereafter
Inside Analysis
 
SharePoint Operations Framework - Planning and Guidance
SharePoint Operations Framework - Planning and GuidanceSharePoint Operations Framework - Planning and Guidance
SharePoint Operations Framework - Planning and Guidance
Chandima Kulathilake
 
How to build data accessibility for everyone
How to build data accessibility for everyoneHow to build data accessibility for everyone
How to build data accessibility for everyone
Karen Hsieh
 
Enabling Your Data Science Team with Modern Data Engineering
Enabling Your Data Science Team with Modern Data EngineeringEnabling Your Data Science Team with Modern Data Engineering
Enabling Your Data Science Team with Modern Data Engineering
James Densmore
 

Similar to Rapid Data Analytics @ Netflix (20)

Big Data at a Gaming Company: Spil Games
Big Data at a Gaming Company: Spil GamesBig Data at a Gaming Company: Spil Games
Big Data at a Gaming Company: Spil Games
 
ETL Practices for Better or Worse
ETL Practices for Better or WorseETL Practices for Better or Worse
ETL Practices for Better or Worse
 
Python for Data Logistics
Python for Data LogisticsPython for Data Logistics
Python for Data Logistics
 
Architecting for analytics
Architecting for analyticsArchitecting for analytics
Architecting for analytics
 
Pitchero - Increasing agility through DevOps - Leeds DevOps November 2016
Pitchero - Increasing agility through DevOps - Leeds DevOps November 2016Pitchero - Increasing agility through DevOps - Leeds DevOps November 2016
Pitchero - Increasing agility through DevOps - Leeds DevOps November 2016
 
Simply Business' Data Platform
Simply Business' Data PlatformSimply Business' Data Platform
Simply Business' Data Platform
 
Business in the Driver’s Seat – An Improved Model for Integration
Business in the Driver’s Seat – An Improved Model for IntegrationBusiness in the Driver’s Seat – An Improved Model for Integration
Business in the Driver’s Seat – An Improved Model for Integration
 
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
 
Agile Business Intelligence
Agile Business IntelligenceAgile Business Intelligence
Agile Business Intelligence
 
2011 06 15 velocity conf from visible ops to dev ops final
2011 06 15 velocity conf   from visible ops to dev ops final2011 06 15 velocity conf   from visible ops to dev ops final
2011 06 15 velocity conf from visible ops to dev ops final
 
Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)
 
Building successful data science teams
Building successful data science teamsBuilding successful data science teams
Building successful data science teams
 
Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data Stack
 
Inextricably linked: reproducibility and productivity in data science and AI
Inextricably linked: reproducibility and productivity in data science and AIInextricably linked: reproducibility and productivity in data science and AI
Inextricably linked: reproducibility and productivity in data science and AI
 
A field guide to the Financial Times, Rhys Evans, Financial Times
A field guide to the Financial Times, Rhys Evans, Financial TimesA field guide to the Financial Times, Rhys Evans, Financial Times
A field guide to the Financial Times, Rhys Evans, Financial Times
 
Agile methods and dw mha
Agile methods and dw mhaAgile methods and dw mha
Agile methods and dw mha
 
The Right Data Warehouse: Automation Now, Business Value Thereafter
The Right Data Warehouse: Automation Now, Business Value ThereafterThe Right Data Warehouse: Automation Now, Business Value Thereafter
The Right Data Warehouse: Automation Now, Business Value Thereafter
 
SharePoint Operations Framework - Planning and Guidance
SharePoint Operations Framework - Planning and GuidanceSharePoint Operations Framework - Planning and Guidance
SharePoint Operations Framework - Planning and Guidance
 
How to build data accessibility for everyone
How to build data accessibility for everyoneHow to build data accessibility for everyone
How to build data accessibility for everyone
 
Enabling Your Data Science Team with Modern Data Engineering
Enabling Your Data Science Team with Modern Data EngineeringEnabling Your Data Science Team with Modern Data Engineering
Enabling Your Data Science Team with Modern Data Engineering
 

More from Data Con LA

Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
Data Con LA
 
Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
Data Con LA
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
Data Con LA
 
Data Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup Showcase
Data Con LA
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
Data Con LA
 
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI Ethics
Data Con LA
 
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA
 
Data Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentation
Data Con LA
 
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA
 
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA
 
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA
 
Data Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data Science
Data Con LA
 
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA
 
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA
 
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA
 

More from Data Con LA (20)

Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup Showcase
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendations
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI Ethics
 
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learning
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
 
Data Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentation
 
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
 
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWS
 
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
 
Data Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data Science
 
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
 
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
 
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with Kafka
 

Recently uploaded

FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 

Rapid Data Analytics @ Netflix

  • 1. Rapid Data Analytics @ Netflix Jason Flittner Senior BI Engineer Chris Stephens Senior Data Engineer Monisha Kanoth Senior Data Architect
  • 3. 633643 DEA @ Netflix Content Analytics
  • 5. Freedom & Responsibility Highly Aligned, Loosely Coupled Context, not Control Culture + Technology Courage Judgement Honesty Communication Curiosity Passion Innovation Impact Selflessness
  • 6. Parquet FF Storage Compute Tools BI AWS S3 (Hadoop clusters)
  • 7. Deploy Fast, Fix Faster ● Improve & Iterate vs Perfect ● Have a Rollback Plan Ready
  • 8. Develop Business Logic not ETL ● Think in Patterns
  • 9. The Path of Least Resistance is the Right Path ● Make Smart Engineering Tradeoffs
  • 10. The Clock starts Ticking when you Deploy ● Every Data Pipeline comes with an Expiration Date ● Deprecate and Prune
  • 11. No Man’s Land is Expensive ● Ownership
  • 12. Be a Noob ● User Groups
  • 13.
  • 14. What You Could Do in your Data Warehouse
  • 15. Let everyone drop tables in production
  • 16. Cost / Benefit Conscientious people make mistakes, but not very often Data warehouse is not an operational system What happens if a table is accidentally dropped? ● Do you have backups? ● How quickly can you restore a table? Is the benefit of worth the tax on every data / analytical product your team produces?
  • 17. We have some protection
  • 18. In Hive, all tables are external tables pointing to S3 locations. ETL writes a new “batch” of data then updates the metastore. s3://[bucket]/hive/schema.db/table/batchid=1459364911 ALTER TABLE table SET LOCATION [path to new batch ID]; DROP TABLE does not delete any data.
  • 19. In our MPP databases, we have a procedure for upgrading and downgrading our privileges. CALL admin.UpgradePrivileges('me') Lasts for several hours. Usage is logged. Accidents? Restore from backups. Or reload from Hive.
  • 20. When other teams are ready to move to production ... We’re done. And moving on to the next thing. You can trust your people to work the same way.
  • 21. Don’t have an “on call” (Use a “first responder” instead)
  • 22. Everyone on the team takes a shift: both BI and data engineers (even managers every once in a while!) First Responder = the first one to respond ● handles most common failures (restarting jobs) ● reaches out directly to ETL owner if escalation is required ● handles communication surrounding ETL delays
  • 23. Goal is to protect the team’s time and focus
  • 24. How we do this ● visually define what needs attention and what doesn’t ○ “above the line” vs “below the line” ● email alerts for “above the line” jobs that take longer than normal ● playbook for fixing common stuff ○ the more complete your entries are, the less you get called!
  • 25. Have a very clear sense of what is urgent, and what isn’t
  • 26. Treating every failure like it’s urgent bleeds your team of the time they need to do work Build your processes so they can be ignored for 3 days ● don’t load data if it’s incomplete ● reprocess fact data for several days instead of picking up the latest Gives you the freedom to judge whether a failure is worth an interruption
  • 27. Everybody owns ETL (when they need to)
  • 28. BI engineer needs data structured a certain way for a report Many environments: ● Ask a data engineer to build them a table Our environment: ● Let them schedule a Hive script and adjust as necessary
  • 29. We focus on centers of excellence, not role boundaries
  • 30. More Examples: ● our BI engineers use Python to automate tasks ● our data engineers have Tableau licenses, and use them for quick visualizations and report deployments For small tasks, this helps us avoid the overhead of interruption and knowledge transfer
  • 31. What You Could Do on the Front-end
  • 32. Parquet FF (Hadoop clusters) Storage Compute Data Interface Data Access, Analytics and Visualization AWS S3
  • 33. Do Not Limit Yourself to Conventional Tools ○ Tableau - Data Visualization and Dashboards ○ MicroStrategy - Dynamic SQL and Metadata ○ Python or Custom Reporting - Emails
  • 34. Give your BI Engineers Superpowers (like this guy) ○ Provide a data platform ○ BI + Data Engineering ○ Context not Requirements ○ Be early adopters
  • 36. Dismantle your Data Warehouse Team ○ Integrate with the business ○ Data Engineering and Data Science teams ○ Open and honest communication
  • 37. Fast is better than perfect ○ Build, iterate… repeat ○ How to handle adhocs ○ Freedom - make the right call ○ Responsibility - Ownership
  • 39. Questions? Want to chill with us!? jobs.netflix.com