Rapid Data Analytics @ Netflix

Jason Flittner
Senior BI Engineer
Chris Stephens
Senior Data Engineer
Monisha Kanoth
Senior Data Architect

DEA @ Netflix
Content Analytics

Global Expansion & Content Spend

Freedom & Responsibility
Highly Aligned, Loosely Coupled
Context, not Control
Culture + Technology
Courage
Judgement
Honesty
Communication
Curiosity
Passion
Innovation
Impact
Selflessness

[Diagram: data platform layers. Storage (AWS S3, Parquet FF); Compute (Hadoop clusters); Tools; BI]

Deploy Fast, Fix Faster
● Improve & Iterate vs Perfect
● Have a Rollback Plan Ready

Develop Business Logic, not ETL
● Think in Patterns

The Path of Least Resistance is the Right Path
● Make Smart Engineering Tradeoffs

The Clock Starts Ticking when you Deploy
● Every Data Pipeline comes with an Expiration Date
● Deprecate and Prune

No Man’s Land is Expensive
● Ownership

What You Could Do
in your Data Warehouse

Let everyone drop tables in production

Cost / Benefit
Conscientious people make mistakes, but not very often
Data warehouse is not an operational system
What happens if a table is accidentally dropped?
● Do you have backups?
● How quickly can you restore a table?
Is the benefit worth the tax on every data / analytical product your team produces?

In Hive, all tables are external tables pointing to S3 locations.
ETL writes a new “batch” of data then updates the metastore.
s3://[bucket]/hive/schema.db/table/batchid=1459364911
ALTER TABLE table SET LOCATION [path to new batch ID];
DROP TABLE does not delete any data.
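
The batch-swap pattern above can be sketched in Python. This is a minimal illustration, not Netflix's actual tooling: the function names and the `my-bucket` / `dw` / `plays` values are hypothetical, and using an epoch timestamp as the batch ID is an assumption based on what the example value 1459364911 resembles.

```python
import time


def batch_location(bucket, schema, table, batch_id):
    """Build the S3 prefix for one batch, following the path layout shown above."""
    return f"s3://{bucket}/hive/{schema}.db/{table}/batchid={batch_id}"


def swap_statement(schema, table, location):
    """Return the HiveQL that repoints the external table at a batch location."""
    return f"ALTER TABLE {schema}.{table} SET LOCATION '{location}';"


def publish_batch(bucket, schema, table, batch_id=None):
    """Pick a batch ID, then return (location, statement) for the metastore update."""
    # Assumption: batch IDs are epoch seconds taken at write time.
    if batch_id is None:
        batch_id = int(time.time())
    location = batch_location(bucket, schema, table, batch_id)
    return location, swap_statement(schema, table, location)


if __name__ == "__main__":
    # Hypothetical bucket, schema, and table names.
    location, statement = publish_batch("my-bucket", "dw", "plays", batch_id=1459364911)
    print(location)
    print(statement)
```

Because each batch lands under its own prefix, pointing the table back at an earlier batch ID with the same statement would act as a rollback, provided the older batch has not been pruned.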

In our MPP databases, we have a procedure for upgrading and downgrading our privileges.
CALL admin.UpgradePrivileges('me')
Lasts for several hours. Usage is logged.
Accidents? Restore from backups. Or reload from Hive.
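
One way to picture time-limited, logged privilege upgrades is the sketch below. It models only the bookkeeping in plain Python. `PrivilegeLedger`, the 4-hour default, and the log format are hypothetical, and nothing here describes how `admin.UpgradePrivileges` is actually implemented.

```python
from datetime import datetime, timedelta


class PrivilegeLedger:
    """Tracks temporary privilege upgrades and keeps a usage log."""

    def __init__(self, ttl_hours=4):
        # Assumption: "several hours" is modeled as a fixed time-to-live.
        self.ttl = timedelta(hours=ttl_hours)
        self.expires = {}  # user -> time the upgrade lapses
        self.log = []      # (timestamp, user, action) tuples

    def upgrade(self, user, now):
        """Grant elevated privileges until now + ttl and record the request."""
        self.expires[user] = now + self.ttl
        self.log.append((now, user, "upgrade"))

    def is_elevated(self, user, now):
        """True only while the user's most recent upgrade has not lapsed."""
        expiry = self.expires.get(user)
        return expiry is not None and now < expiry


if __name__ == "__main__":
    ledger = PrivilegeLedger()
    start = datetime(2016, 3, 30, 9, 0)
    ledger.upgrade("me", start)
    print(ledger.is_elevated("me", start + timedelta(hours=1)))
    print(ledger.is_elevated("me", start + timedelta(hours=5)))
```

The upgrade lapses on its own, so nobody has to remember to downgrade, and the log keeps usage visible.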

When other teams are ready to move to production ...
We’re done. And moving on to the next thing.
You can trust your people to work the same way.

Don’t have an “on call”
(Use a “first responder” instead)

Everyone on the team takes a shift: both BI and data engineers (even managers every once in a while!)
First Responder = the first one to respond
● handles most common failures (restarting jobs)
● reaches out directly to ETL owner if escalation is required
● handles communication surrounding ETL delays

Goal is to protect the team’s time and focus

How we do this
● visually define what needs attention and what doesn’t
○ “above the line” vs “below the line”
● email alerts for “above the line” jobs that take longer than normal
● playbook for fixing common stuff
○ the more complete your entries are, the less you get called!
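
The "above the line" alert rule could look roughly like the sketch below. It is an assumption-laden illustration: the job dictionaries, the `above_the_line` flag, and the definition of "normal" as the median of past runtimes times a tolerance are all made up for this example.

```python
from statistics import median


def jobs_needing_alert(running, history, tolerance=1.5):
    """Return names of above-the-line jobs running longer than normal.

    running: list of dicts with "name", "above_the_line", and "minutes" so far.
    history: dict mapping job name to a list of past runtimes in minutes.
    """
    alerts = []
    for job in running:
        # Below-the-line jobs never interrupt anyone.
        if not job["above_the_line"]:
            continue
        past = history.get(job["name"], [])
        # Without history there is no "normal" to compare against.
        if not past:
            continue
        if job["minutes"] > tolerance * median(past):
            alerts.append(job["name"])
    return alerts


if __name__ == "__main__":
    # Hypothetical job names and runtimes.
    running = [
        {"name": "daily_plays", "above_the_line": True, "minutes": 100},
        {"name": "adhoc_export", "above_the_line": False, "minutes": 500},
    ]
    history = {"daily_plays": [50, 60, 55], "adhoc_export": [30, 35]}
    print(jobs_needing_alert(running, history))
```

The names returned here would feed the email alert; everything else waits for the first responder to look at it on their own schedule.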

Have a very clear sense of what is urgent, and what isn’t

Treating every failure like it’s urgent bleeds your team of the time they need to do work
Build your processes so they can be ignored for 3 days
● don’t load data if it’s incomplete
● reprocess fact data for several days instead of picking up the latest
Gives you the freedom to judge whether a failure is worth an interruption
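
A minimal sketch of the two rules above, assuming daily partitions and a 3-day lookback. The function names and the idea of comparing expected against landed partitions are illustrative assumptions, not the team's actual checks.

```python
from datetime import date, timedelta


def dates_to_reprocess(run_date, lookback_days=3):
    """List the trailing days to rebuild on every run, oldest first.

    Rebuilding a window instead of only the latest day means a job that
    failed or was ignored for a few days heals itself on the next run.
    """
    return [run_date - timedelta(days=n) for n in range(lookback_days, 0, -1)]


def should_load(expected_partitions, landed_partitions):
    """Load only when every expected upstream partition has landed."""
    return set(expected_partitions) <= set(landed_partitions)


if __name__ == "__main__":
    window = dates_to_reprocess(date(2016, 3, 30))
    print(window)
    # Hypothetical state: only the first two days of upstream data have arrived.
    print(should_load(window, window[:2]))
```

With a window like this, skipping a failed run loses nothing, as long as the outage is shorter than the lookback.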

Everybody owns ETL
(when they need to)

BI engineer needs data structured a certain way for a report
Many environments:
● Ask a data engineer to build them a table
Our environment:
● Let them schedule a Hive script and adjust as necessary

We focus on centers of excellence, not role boundaries

More Examples:
● our BI engineers use Python to automate tasks
● our data engineers have Tableau licenses, and use them for quick visualizations and report deployments
For small tasks, this helps us avoid the overhead of interruption and knowledge transfer

What You Could Do
on the Front-end

[Diagram: data platform layers. Storage (AWS S3, Parquet FF); Compute (Hadoop clusters); Data Interface; Data Access, Analytics and Visualization]

Do Not Limit Yourself to Conventional Tools
○ Tableau - Data Visualization and Dashboards
○ MicroStrategy - Dynamic SQL and Metadata
○ Python or Custom Reporting - Emails

Give your BI Engineers Superpowers (like this guy)
○ Provide a data platform
○ BI + Data Engineering
○ Context not Requirements
○ Be early adopters

Dismantle your Data Warehouse Team
○ Integrate with the business
○ Data Engineering and Data Science teams
○ Open and honest communication

Fast is better than perfect
○ Build, iterate… repeat
○ How to handle adhocs
○ Freedom - make the right call
○ Responsibility - Ownership

Questions?
Want to chill with us!?
jobs.netflix.com
