This session discusses how to achieve low latency in MapReduce data analysis, drawing on industrial and academic case studies. These illustrate improvements to MapReduce that squeeze latency out of the whole data processing stack, covering batch-mode MapReduce systems as well as stream processing systems. The session also introduces our BoltMR project's work on this topic and presents some interesting benchmark results.
After this session you will be able to:
Objective 1: Understand why low latency matters for many MapReduce-based big data analytics scenarios.
Objective 2: Learn the root causes of MapReduce latency, the obstacles to lowering it, and the various solutions, both mature and immature.
Objective 3: Understand how low MapReduce latency needs to be for your own applications and which optimization techniques are potentially applicable.