Introduction: This workshop will provide a hands-on introduction to Apache Spark using the HDP Sandbox on students’ personal machines.
Format: A short introductory lecture about Apache Spark components used in the lab followed by a demo, lab exercises and a Q&A session. The lecture will be followed by lab time to work through the lab exercises and ask questions.
Objective: To provide a quick and short hands-on introduction to Apache Spark. This lab will use the following Spark and Apache Hadoop components: Spark, Spark SQL, Apache Hadoop HDFS, Apache Hadoop YARN, Apache ORC, and Apache Ambari User Views. You will learn how to move data into HDFS using Spark APIs, create Apache Hive tables, explore the data with Spark and Spark SQL, transform the data and then issue some SQL queries.
Pre-requisites: Registrants must bring a laptop that can run the Hortonworks Data Cloud.
Speaker:
Robert Hryniewicz, Developer Advocate, Hortonworks