Wait! Exclusive 60 day trial to the world's largest digital library.
The SlideShare family just got bigger. You now have unlimited* access to books, audiobooks, magazines, and more from Scribd.Cancel anytime.
Processing large amounts of data for analytical or business cases is a daily occurrence for Apache Spark users. Cost, Latency and Accuracy are 3 sides of a triangle a product owner has to trade off. When dealing with TBs of data a day and PBs of data overall, even small efficiencies have a major impact on the bottom line.