Apache Impala (incubating) is an open-source distributed SQL engine designed to run at scale on top of Hadoop while delivering short response times for interactive queries over petabytes of data.
In this one-hour technical session, Tanel will guide you through a sample of Impala's key features, such as parallel processing, columnar data format access, block skipping (storage indexes), distributed aggregations and joins, Bloom filters, and standard monitoring tools. Drawing on his long background in the Oracle Database world, he will also compare Impala's built-in features with their typical Oracle Database counterparts.
This session will show you the power of Apache Impala and help you understand which workloads it is designed to handle best.