From Batch to Real Time: Overstock’s Journey Towards Unifying Analytics Across Data Engineering and Data Science with Chris Robison

With great data comes great responsibility. At Overstock.com, lack of data has never been an issue. We know everything from the color you search most to which room you’ll redesign next. We can see individuals transition from furnishing their first flat to building their dream home, but processing this data requires some serious firepower. It has fueled our focus on delivering real-time personalization through the unification of data and AI. Databricks is at the crux of this vision – empowering us to leverage cloud scale with a platform that simplifies data engineering and increases the productivity of our data science team.

Tune in as Chris Robison takes you through marketecture innovations in building a successful marketing technology infrastructure for instantaneous individualized marketing experiences.

  1. from batch to real time: overstock’s journey towards unifying analytics across data engineering and data science
  2. retail is nearing the end of its digital transformation. competition is more intense and personalization is key. if you don't know your customer and speak directly to her, you will lose. © Overstock
  3. overstock.com: pushing boundaries since 1999
  4. we needed a new marketing architecture to unleash our data and personalize across silos
  5. Before [diagram labels: Data Warehouse, Hadoop, Structured Data, Semi-Structured Data, ETL, Data Eng, Analytics Use Cases, Data Science Use Cases]
  6. [diagram labels: N+1 Data Sources, Batch, Streaming]
  7. comprehensive, cross-channel marketing powered by AI built on Databricks
  8. [no text on slide]
  9. Why Apache Spark on Databricks? (Databricks Unified Analytics Platform)
     • support for Python and R much more robust in Databricks
     • full suite of libraries is interchangeable and easy to install
     • collaborative notebooks mean less code reproduction for common tasks
     • full suite of technology from ETL to deep learning
     • democratize the power of Spark across the organization
  10. the right tool for the right person: data scientist, data analyst, engineer, or business owner
  11. from silos [diagram labels: Business Owner, Analyst, Data Scientist, Engineer]
  12. to unification [diagram labels: Business Owner, Analyst, Data Scientist, Engineer]
  13. delivering actionable insights when the business needs them
     • a more collaborative approach to data science and analytics => personalized marketing
     • audience/customer data in real time
     • streaming infrastructure gives real-time diagnostics and analysis
     • business owners operate strategically driving business value
  14. parallel innovation = shared success at unprecedented speed
  15. results: closed the gap between POC and production
     • decreased cost of moving models to production by nearly half
     • stand up new models in one fifth the time previously required
     • can make inter-day improvements on existing models without new deploys
     • quickly spin up/down clusters through self-service, cluster management driving business value
  16. right people unified platform personalized marketing in real time
  17. Come see our talk! Data Science and Enterprise Engineering, 11:00am – Enterprise Track

Editor's Notes

  • make sure to push personalization through to the end

    The key to achieving personalization is Data and AI

  • Collecting a ton of data to provide a more personalized experience than anyone else on the planet.
  • Add another slide after this: how can we build, in 6-9 months, a marketecture that meets the needs of data scientists and business stakeholders alike?
  • Before: everything ad-hoc and batch

  • Databricks and Spark Structured Streaming powering all the intelligence

    Snowflake connectors allow us to easily pull data

    Streaming (Kinesis) allows for real time data

    Pull out how ML + AI built in Databricks is powering all of the intelligence in the system above
  • Each of these was brought up with the same search terms using personalization powered by AI!

    We want to be a best friend for our users when we can combine all of the inspiration, advice and ideas into one single place tailored to you.
    This is our future. We want to take you there with us!
  • Python (especially Python 3) and R are much more robust in Databricks

    The full suite of libraries for both environments can be used and notebooks can switch between Python 2 and 3

    Install Python/R libraries as we need them, enabling us to use and prototype new libraries without requesting external support

    Ability to push internal code base to the notebook clusters allows for customization and less code reproduction for common tasks

    Engineers have the ability to use the latest and greatest technology with little cost for prototyping

    Analysts can use the tools they are most familiar with and deliver insights at rapid speed across large scale data sources
  • MAKE SURE TO SLOW DOWN AND EXPLAIN THE CHALLENGES OF EACH GROUP

    from batch parallel
    picture of parallel lines with stick figures inside
    to batch parallel
  • MAKE SURE TO SLOW DOWN AND EXPLAIN THE ADVANTAGES

    from batch parallel
    picture of parallel lines with stick figures inside
    to batch parallel
  • Get close to 30 pt, at least 24.
    6-word bullet points
    focus on high business impact

    Delivering actionable insights when the business needs them
    Analysts can diagnose performance from the last minute not yesterday

    hachimal example

  • Overstock.com is setting the bar for marketing architecture. We can see trends as they unfold. We are able to show her just what she wants, when she wants to see it. Stay tuned – you WON'T believe what we are going to do next.
  • Add Slide with time and track for second talk