Many companies have built successful MapReduce workflows that process terabytes of historical data every day. But who wants to wait 24 hours for updated analytics? This talk introduces the lambda architecture, which is designed to take advantage of both batch and stream processing. We will combine fast access to historical data with real-time streaming data using Spark (Core, SQL, Streaming), Twitter, Apache Parquet, etc.
Clear code and an intuitive demo are also included: https://github.com/tmatyashovsky/lambda-architecture-jeeconf-kyiv
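To make the idea above concrete, here is a minimal sketch of how a lambda architecture might answer a query: counts precomputed by the batch layer are merged with counts from events that arrived since the last batch run. This is plain Python rather than Spark, and it is not taken from the linked repository; the function name `merge_views` and the hashtag counts are hypothetical and chosen only for illustration.

```python
from collections import Counter


def merge_views(batch_view, realtime_view):
    """Combine a batch view with a real-time view by summing counts per key.

    batch_view: counts precomputed over historical data (assumed stale by up to one batch cycle).
    realtime_view: counts accumulated from the stream since the last batch run.
    """
    merged = Counter(batch_view)
    merged.update(realtime_view)  # Counter.update adds counts instead of replacing them
    return dict(merged)


# Hypothetical hashtag counts, for illustration only.
batch_view = {"#spark": 120, "#java": 80}
realtime_view = {"#spark": 3, "#jeeconf": 5}

result = merge_views(batch_view, realtime_view)
print(result)
```

In a real deployment the batch view could be loaded from Parquet files and the real-time view maintained by a streaming job, but the merge step at query time would follow the same pattern.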
Presented at JEEConf 2016 in Kyiv on 20/05/2016.
Design by Yarko Filevych: http://www.filevych.com/