Using Spark at Vungle

Vungle enables developers to put video ads in their apps. In this talk, we will discuss Spark's critical role in Vungle's infrastructure. Spark's flexibility and power allowed us to easily scale a variety of systems within our stack horizontally to handle over 1 billion events per day. From completely replacing existing data processing systems to reinforcing stressed components, Spark's versatility, intuitive development model, and ease of deployment have made it a very attractive choice for a number of data processing problems we face at Vungle.

Published in: Mobile

  1. Introduction | Old Architecture | New Architecture | Decoupling | Streaming | Conclusion
  2. ● Introduction
     ● Old Architecture
     ● New Architecture
     ● Decoupling
     ● Streaming
     ● Conclusion
  3. ● Legacy Java Process
       ○ “Crunches” data
       ○ Sends data downstream to our own datastores and to 3rd party analytics
       ○ Runs every hour
     ● Growth
       ○ Process can run over an hour
       ○ 12 GB -> 24 GB heap in less than 1 year
       ○ Cron is a horrible job management system
       ○ A failure requires rerunning a job from the beginning
     ● 2.0
       ○ Horizontally scalable
       ○ Real-time ETL
       ○ Reusable
  4. ETL @ Vungle
     ● ~1 Billion Events / Day
     ● Deduplication
     ● Calculating $$$
     ● Outputting data to various destinations
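
The deduplication and revenue steps named on slide 4 can be pictured with a small sketch. This is plain Python rather than Spark, and it is not Vungle's implementation: the field names `event_id`, `type`, `app`, and `payout` are hypothetical, chosen only to make the two steps concrete.

```python
from collections import defaultdict


def dedupe(events):
    """Keep the first event seen for each event_id (hypothetical field)."""
    seen = set()
    unique = []
    for event in events:
        if event["event_id"] not in seen:
            seen.add(event["event_id"])
            unique.append(event)
    return unique


def revenue_by_app(events):
    """Sum the hypothetical `payout` field per app, counting install events only."""
    totals = defaultdict(float)
    for event in events:
        if event["type"] == "install":
            totals[event["app"]] += event["payout"]
    return dict(totals)


events = [
    {"event_id": "a1", "type": "install", "app": "game", "payout": 1.5},
    {"event_id": "a1", "type": "install", "app": "game", "payout": 1.5},  # duplicate delivery
    {"event_id": "b2", "type": "view", "app": "game", "payout": 0.0},
    {"event_id": "c3", "type": "install", "app": "puzzle", "payout": 2.0},
]

unique_events = dedupe(events)
totals = revenue_by_app(unique_events)
```

Deduplicating before aggregating matters here: without it, the repeated `a1` install would be paid out twice.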
  5. Old Architecture
  14. New Architecture
  22. Decoupling
  31. ● Set up connection and Spark streams
      ● Map each log line to a Mongo object and insert it into Mongo
  32. Set up connection and Spark streams
  33. Mapping to Mongo objects and insertions
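
The code shown on slides 32 and 33 did not survive in this transcript. As a rough illustration of the second step only (mapping each log line to a Mongo-style document and inserting it), here is a sketch in plain Python. It assumes JSON log lines and uses a hypothetical in-memory `InMemoryCollection` in place of a real MongoDB collection and a real Spark stream; the field names are also assumptions.

```python
import json


class InMemoryCollection:
    """Hypothetical stand-in for a MongoDB collection; keeps documents in a list."""

    def __init__(self):
        self.documents = []

    def insert_many(self, docs):
        self.documents.extend(docs)


def line_to_document(line):
    """Parse one JSON log line into a dict shaped like a Mongo document."""
    record = json.loads(line)
    return {"_id": record["event_id"], "type": record["type"], "raw": record}


def process_batch(lines, collection):
    """Map one batch of log lines to documents, skip malformed lines, insert the rest."""
    docs = []
    for line in lines:
        try:
            docs.append(line_to_document(line))
        except (ValueError, KeyError):
            # Not valid JSON, or a required field is missing: drop the line.
            continue
    if docs:
        collection.insert_many(docs)
    return len(docs)


collection = InMemoryCollection()
batch = [
    '{"event_id": "a1", "type": "view"}',
    "not json",
    '{"event_id": "b2", "type": "install"}',
]
inserted = process_batch(batch, collection)
```

Inserting once per batch rather than once per line keeps the number of round trips to the datastore small, which is one plausible reason to structure the mapping step this way.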
  34. Questions
  35. Streaming
  39. Ingestion
  40. Ingestion Table Schema: Event ID | Request | View | Install | ... | Request Added | View Added | Install Added | Value
  41. Fact Table Schema: ... | Date Time | Deliveries | Views | Installs | Processed Deliveries | Processed Views | Processed Installs
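
The transcript does not explain how the two schemas on slides 40 and 41 relate. One possible reading, offered only as an assumption, is that the "Added" columns in the ingestion table record whether an event has already been counted in the fact table, so that a rerun does not double count. The sketch below illustrates that reading in plain Python; the lowercase key names and the pairing of requests with the Deliveries counter are guesses, not something the slides confirm.

```python
def roll_up(ingestion_rows, fact):
    """Count each not-yet-added event into the fact counters, then mark it as added."""
    # (ingestion column, fact counter) pairs; the pairing itself is an assumption.
    pairs = [("request", "deliveries"), ("view", "views"), ("install", "installs")]
    for row in ingestion_rows:
        for event, counter in pairs:
            added_flag = event + "_added"
            if row[event] and not row[added_flag]:
                fact[counter] += 1
                row[added_flag] = True
    return fact


rows = [
    {"event_id": "a1", "request": True, "view": True, "install": False,
     "request_added": False, "view_added": False, "install_added": False},
    {"event_id": "b2", "request": True, "view": False, "install": False,
     "request_added": True, "view_added": False, "install_added": False},
]
fact = {"deliveries": 0, "views": 0, "installs": 0}

roll_up(rows, fact)
first_pass = dict(fact)

# Running the roll-up again changes nothing, because every counted event is flagged.
roll_up(rows, fact)
second_pass = dict(fact)
```

Under this reading, the flags make the roll-up idempotent, which would address the earlier complaint that a failure requires rerunning a job from the beginning.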
  42. Ingestion
  49. Process
  55. Next Steps
      ● Switching from JSON to ProtoBuf
      ● Using YARN to run multiple jobs on one cluster
      ● Data Science
      ● Who knows?
  56. Questions
  57. Thank you!
