Assignment Overview

Apache Spark is a distributed data processing and analytics engine that offers new capabilities to data scientists, business analysts, and application developers. It runs on Hadoop, Mesos, Kubernetes, in standalone mode, or in the cloud. It can access diverse data sources, including the Hadoop Distributed File System (HDFS), Cassandra File System (CFS), Hadoop Database (HBase), and Simple Storage Service (S3). Apache Spark is also used as a method for implementing data grids. Analytics for Apache Spark provides fast in-memory processing of large data sets. IBM Bluemix has recently added Apache Spark as a platform-as-a-service (PaaS) offering.

For this assignment, you will write a literature review on Apache Spark in the cloud. The assignment should include the following:

1. Report (80 marks)
   a. Abstract
   b. Introduction
   c. Architecture of Apache Spark in the Cloud
   d. Applications of Apache Spark
   e. Apache Spark Security
   f. Conclusion
   g. References
2. Presentation (20 marks)
   a. PowerPoint slides (8-12 slides) ...