AWS S3 + Alluxio + Presto = ❤️ The Ryte Use Case

Alluxio Online Meetup
Oct 9, 2019
Speaker: Danny Linden, Ryte

Find more Alluxio events: https://www.alluxio.io/events/

Published in: Software

  1. AWS S3 + Alluxio + Presto = ❤ The Ryte Use Case
  2. AWS S3 + Alluxio + Presto = ❤ ● What is Ryte: Platform to optimize your online marketing ● Requirements for Ryte Search Success ● The way to Presto on AWS EMR with S3 ● Problems that came up ● How we solve them with Alluxio
  3. Welcome to the age of strategic platforms: Customer Relationship Management, Productivity Application, Website Quality Management
  4. Welcome the Ryte Suite with three essential tools! Website Success: Make sure the technical elements of your website are flawless. Content Success: Create engaging content that users will love. Search Success: Score top search rankings with the help of real Google data.
  5. Making sure the digital evolution won’t hurt your business. Monitor: Monitor vital elements of your website’s success. Analyze: Get the best possible data to make informed decisions. Optimize: Hands-on advice to make your website the best it can be.
  6. Let’s have a look at Website Success: Make sure the technical elements of your website are flawless.
  7. Let’s have a look at Content Success: Create engaging content that users will love.
  8. Let’s have a look at Search Success: Score top search rankings with the help of real Google data.
  9. Our key asset: real Google data ● Ryte is 100% Google compliant ● Ryte does not use scraped data, only Google Search Console data ● So you monitor your important keywords based on 100% real Google data and therefore real search queries
  10. Requirements for Ryte Search Success: from product objectives to the technical implementation ● Daily import of multiple GB of JSON data ● Mainly analytics-based queries ● Our product features require queries on raw data ● Product objectives to choose our technical solution: prefer usage of AWS high-level services, development team experience
  11. HTTP JSON API, Daily Data Import, Ryte Data-Backend, Ryte Web-Frontend
  12. HTTP JSON API, Daily Data Import, Ryte Data-Backend, Ryte Web-Frontend
  13. Ryte Data-Backend, first edition: AWS Elasticsearch Service, Import Application on AWS EC2, REST API on AWS EC2
  14. Ryte Data-Backend, first edition: AWS Elasticsearch Service, Import Application on AWS EC2, REST API on AWS EC2 ● Simple data transformation: JSON to JSON ● Knowledge exists in the teams ● Performance ● Analytics queries
  15. Elasticsearch as Data-Storage: Downsides ● Costs scale with data, not with usage ● High-performance setups are difficult
  16. Ryte Data-Backend, second edition: Parquet on AWS S3, Import Application on AWS ECS, REST API on AWS ECS, AWS EMR (Hadoop) with Presto as distributed SQL engine ● Fully decoupled read / write engine ● Highly reliable, low-cost storage with AWS S3 (99.999999999% durability) ● Cost-intensive scaling is usage-based
  17. Ryte Backend Flow: AWS ECS Container, AWS ECS Container, AWS S3 (unlimited persistent & reliable object store), AWS EMR Task Node, REST API, AWS EMR Master Instance, Presto Task Node, Presto Master Node
  18. AWS EMR/Presto with S3 as Data-Backend: Downsides. RANDOMLY HIGH PEAKS ON S3 REQUESTS RESULT IN TIMEOUTS!!! ● 1 slow S3 request in 100 kills the whole query ● S3 latency has a direct impact on the user ● AWS tried their best to find a solution, but the issue stayed stuck for days & weeks
  19. API responses took up to 20s instead of 3s
  20. Decoupling of our Storage Layer, or: How Alluxio solves all Problems
  21. Ryte Data-Backend, third edition: Parquet stored on AWS S3, Import on AWS ECS, REST API on AWS ECS, Presto on AWS EMR Cluster, Alluxio cache on AWS EMR Cluster ● Currently no extra hardware costs ● Alluxio cache can be “warmed up” ● Cache costs scale with usage ● Fits perfectly between Presto and S3
  22. Ryte Backend Flow: AWS ECS Container, AWS ECS Container, AWS S3 (unlimited persistent & reliable object store), AWS EMR Task Node, REST API, AWS EMR Master Instance, Alluxio Worker, Alluxio Master, RAM Cache, Presto Task Node, Alluxio Client, Presto Master Node
  23. Performance Push 😱😍 Query time reduced by 72% on average!
  24. Summary ● Alluxio helps us perfectly to decouple S3 latency spikes from user requests ● No need for additional hardware until today ● Easy integration between Presto on Hadoop & S3 ● Hardware requirements still scale with the business 👍
  25. Questions? Danny Linden @ Ryte, Chapter Lead Engineering. E-Mail: d.linden@ryte.com, linkedin.com/in/danny-linden/, Twitter: @CodingDanny. WE ARE HIRING IN MUNICH: jobs.ryte.com
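
Slide 18 says that one slow S3 request in 100 can kill a whole query. The sketch below illustrates why that matters so much once a query fans out into many requests. It rests on assumptions that are ours and not stated in the talk: requests are independent, a query has to wait for every request it issues, and a cache hit is never slow. The function names, the request counts, and the cache hit rate are hypothetical illustration values, not Ryte measurements.

```python
def p_at_least_one_slow(p_slow: float, n_requests: int) -> float:
    """Probability that at least one of n independent requests is slow."""
    if not 0.0 <= p_slow <= 1.0:
        raise ValueError("p_slow must be between 0 and 1")
    if n_requests < 0:
        raise ValueError("n_requests must be non-negative")
    # Complement of "every single request is fast".
    return 1.0 - (1.0 - p_slow) ** n_requests


def p_at_least_one_slow_with_cache(
    p_slow: float, n_requests: int, hit_rate: float
) -> float:
    """Same probability when a cache in front of S3 absorbs part of the reads.

    Assumption: each request is served from cache with probability
    hit_rate, and cached reads are never slow.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Only cache misses can reach S3 and therefore be slow.
    return p_at_least_one_slow(p_slow * (1.0 - hit_rate), n_requests)


# "1 of 100" from slide 18; everything else below is a made-up example.
P_SLOW = 0.01

for n in (1, 10, 100):
    no_cache = p_at_least_one_slow(P_SLOW, n)
    cached = p_at_least_one_slow_with_cache(P_SLOW, n, hit_rate=0.9)
    print(f"{n:>3} requests: no cache {no_cache:.3f}, 90% cache hits {cached:.3f}")
```

Under these assumptions, a query that issues 100 S3 requests has roughly a 63% chance of hitting at least one slow request. That is one way to read the talk's point that a cache between Presto and S3 decouples user requests from S3 latency spikes.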