Embracing the Monolith

PyData Chicago 2016 Talk.


  1. Doubling down on Python to move fast without breaking things. Embracing the Monolith in Small Teams. Leon Sasson (@leonsasson), PyData Chicago 2016
  2. Rise Science
  3. Rise Science
  4. Product Goals • Sleep Improvement • User Enjoyment
  5. Iterate Fast. Young company, timeline of weeks and days.
  6. Data is core to the product. No data = 😩
  7. Development Cycle: Hypothesis, Exploration, Experiment, Productizing, Evaluate & Analyze
  8. Easy, right?
  9. 😓
  10. Obstacles
  11. • Data Silos
  12. • Data Silos • Different Tooling
  13. • Data Silos • Different Tooling • People
  14. • Data Silos • Moving from phase to phase requires different tools • People • "It works on staging" • Testing data products is hard • Garbage in → Garbage out • Capacity problems
  15. • Data Silos • Moving from phase to phase requires different tools • People • "It works on staging" Extended Product Cycles
  16. How do we start? Descriptive, visuals, basic summaries
  17. Step Back
  18. What the organization needs. Understand the problem before getting into solutions.
  19. Solution First: focus is on tech trade-offs
  20. Solution First: focus is on tech trade-offs vs. Problem First: focus is on making progress for the org
  21. Solution First: focus is on tech trade-offs vs. Problem First: focus is on making progress for the org
  22. Solution First: focus is on tech trade-offs vs. Problem First: focus is on making progress for the org
  23. Business Optimality, Technical Optimality
  24. What's the least I can do to solve the problem?
  25. What's the least I can do to solve the problem? You need an architecture compatible with this mindset.
  26. Monolithic Architecture
  27. © Martin Fowler: http://martinfowler.com/articles/microservices.html A monolithic application puts all its functionality into a single process... and scales by replicating the monolith on multiple servers
  28. © Martin Fowler: http://martinfowler.com/articles/microservices.html A microservices architecture puts each element of functionality into a separate service... and scales by distributing these services across servers, replicating as needed
  29. © Martin Fowler: http://martinfowler.com/articles/microservices.html
  30. Django. The Good Things
  31. Reuse Libraries
  32. IPython Notebooks
  33. Reuse your ORM when accessing data.
  34. Pandas, django-pandas
  35. Instrumentation
  36. People
  37. The Problem of Toil: "... manual, repetitive, automatable, tactical, devoid of enduring value, and scales linearly as a service grows..."
  38. Toil-induced negative data culture
  39. Self-Serve Analytics
  40. Breaking Data Silos
  41. Why do Data Silos Happen?
      Row layout:
        person id | date       | duration
        1         | 2016-08-01 | 450
        2         | 2016-08-01 | 426
        1         | 2016-08-02 | 438
      Columnar layout:
        person id: 1, 2, 1
        date:      2016-08-01, 2016-08-01, 2016-08-02
        duration:  450, 426, 438
  42. Centralizing Data: Segment.com, Backend DB, ETL, Redshift
  43. Redshift is fast for aggregations
  44. Out-of-the-box compatible with Postgres (mostly...)
  45. Bring data to the people
  46. Positive Feedback Loop on Data Culture: non-technical people can access data whenever they need it; the data team can focus on bigger problems and act as enablers
  47. Be scrappy.
  48. Thanks!
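
The row versus columnar contrast on slide 41 can be sketched in plain Python. This is only an illustration built from the three sample records on that slide, not code from the talk; the variable names are assumptions, and real column stores add compression and on-disk layout that a dict of lists does not model.

```python
# The three sample records from slide 41: (person_id, date, duration).
rows = [
    (1, "2016-08-01", 450),
    (2, "2016-08-01", 426),
    (1, "2016-08-02", 438),
]

# Row layout: each record is stored together, so an aggregate over one
# field still has to walk every full record.
total_from_rows = sum(duration for _person_id, _date, duration in rows)

# Columnar layout: each field is stored contiguously, so the same
# aggregate only reads the one column it needs.
columns = {
    "person_id": [r[0] for r in rows],
    "date": [r[1] for r in rows],
    "duration": [r[2] for r in rows],
}
total_from_columns = sum(columns["duration"])

# Both layouts hold the same data; only the access pattern differs.
mean_duration = total_from_columns / len(columns["duration"])
```

Reading a single column for an aggregate is the access pattern behind the "fast for aggregations" claim on slide 43.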

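
The kind of aggregation a self-serve analyst might run against the centralized store (slides 42 to 46) could look roughly like the query below. This is a hedged sketch: Python's built-in `sqlite3` stands in for the warehouse so the example runs anywhere, the table name `sleep_sessions` is hypothetical, and the data is the sample from slide 41. The talk does not show its actual schema or queries.

```python
import sqlite3

# In-memory SQLite is only a stand-in; the talk's warehouse is Redshift.
conn = sqlite3.connect(":memory:")

# Hypothetical table holding the sample records from slide 41.
conn.execute(
    "CREATE TABLE sleep_sessions (person_id INTEGER, date TEXT, duration INTEGER)"
)
conn.executemany(
    "INSERT INTO sleep_sessions VALUES (?, ?, ?)",
    [(1, "2016-08-01", 450), (2, "2016-08-01", 426), (1, "2016-08-02", 438)],
)

# A plain GROUP BY aggregate: average duration per person.
avg_by_person = conn.execute(
    """
    SELECT person_id, AVG(duration) AS avg_duration
    FROM sleep_sessions
    GROUP BY person_id
    ORDER BY person_id
    """
).fetchall()
conn.close()
```

The query uses only standard SQL, which is the sort of statement that tends to carry over between Postgres-style databases; anything engine-specific would need checking against the target warehouse.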