Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

HDFS on Kubernetes—Lessons Learned with Kimoon Kim

There is growing interest in running Apache Spark natively on Kubernetes (see https://github.com/apache-spark-on-k8s/spark). Spark applications often access data in HDFS, and Spark supports HDFS locality by scheduling tasks on nodes that have the task input data on their local disks. When running Spark on Kubernetes, if the HDFS daemons run outside Kubernetes, applications will slow down while accessing the data remotely.

This session will demonstrate how to run HDFS inside Kubernetes to speed up Spark. In particular, it will show how Spark scheduler can still provide HDFS data locality on Kubernetes by discovering the mapping of Kubernetes containers to physical nodes to HDFS datanode daemons. You’ll also learn how you can provide Spark with the high availability of the critical HDFS namenode service when running HDFS in Kubernetes.

  • Be the first to comment

HDFS on Kubernetes—Lessons Learned with Kimoon Kim

  1. 1. Kimoon Kim (kimoon@pepperdata.com) HDFS on Kubernetes -- Lessons Learned
  2. 2. Outline 1. Kubernetes intro 2. Big Data on Kubernetes 3. Demo 4. Problems we fixed -- HDFS data locality
  3. 3. Kubernetes New open-source cluster manager. Runs programs in Linux containers. 1000+ contributors and 40,000+ commits.
  4. 4. “My app was running fine until someone installed their software”
  5. 5. More isolation is good Kubernetes provides each program with: • a lightweight virtual file system – an independent set of S/W packages • a virtual network interface – a unique virtual IP address – an entire range of ports • etc
  6. 6. Kubernetes architecture node A node B Pod 1 Pod 2 Pod 3 10.0.0.2 196.0.0.5 196.0.0.6 10.0.0.3 10.0.1.2 Pod, a unit of scheduling and isolation. • runs a user program in a primary container • holds isolation layers like an virtual IP in an infra container
  7. 7. Big Data on Kubernetes github.com/apache-spark-on-k8s • Google, Haiwen, Hyperpilot, Intel, Palantir, Pepperdata, Red Hat, and growing. • patching up Spark Driver and Executor code to work on Kubernetes. Yesterday’s talk: spark-summit.org/2017/events/apache-spark-on-kuberne tes/
  8. 8. Spark on Kubernetes node A node B Driver Pod Executor Pod 1 Executor Pod 2 10.0.0.2 196.0.0.5 196.0.0.6 10.0.0.3 10.0.1.2 Client Client Driver Pod Executor Pod 1 Executor Pod 2 10.0.0.4 10.0.0.5 10.0.1.3 Job 1 Job 2
  9. 9. What about storage? Spark often stores data on HDFS. How can Spark on Kubernetes access HDFS data?
  10. 10. Hadoop Distributed File System Namenode • runs on a central cluster node. • maintains file system metadata. Datanodes • on every cluster node. • read and write file data on local disks.
  11. 11. HDFS on Kubernetes node A node B Driver Pod Executor Pod 1 Executor Pod 2 10.0.0.2 196.0.0.5 196.0.0.6 10.0.0.3 10.0.1.2 Client Client Driver Pod Executor Pod 1 Executor Pod 2 10.0.0.4 10.0.0.5 10.0.1.3 Job 1 Job 2 Namenode Pod Datanode Pod 1 Datanode Pod 2 HDFS hadoop.fs.defaultFS hdfs://hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020
  12. 12. Demo 1. Label cluster nodes 2. Stand up HDFS 3. Launch a Spark job 4. Check Spark job output
  13. 13. What about data locality? • Read data from local disks when possible • Remote reads can be slow if network is slow • Implemented in HDFS daemons, integrated with app frameworks like Spark Client 2 node B Client 1 node A Datanode 1 Datanode 2 SLOWFAST
  14. 14. HDFS data locality in YARN Spark driver sends tasks to right executors with tasks’ HDFS data. Driver Executor 1 Executor 2 /fileA /fileB Read /fileA Read /fileB (/fileA → Datanode 1 → 196.0.0.5) == (Executor 1 →196.0.0.5) Datanode 1 Datanode 2 node A 196.0.0.5 node B 196.0.0.6 Job 1 HDFS YARN example
  15. 15. Hmm, how do I find right executors in Kubernetes... (/fileA → Datanode 1 → 196.0.0.5) != (Executor 1 → 10.0.0.3) Executor Pod 2 10.0.1.2 Driver Pod Executor Pod 1 10.0.0.2 10.0.0.3 Read /fileB Read /fileA /fileA /fileB Datanode Pod 1 node A 196.0.0.5 node B 196.0.0.6 Kubernetes example Datanode Pod 2
  16. 16. Fix Spark Driver code 1. Ask Kubernetes master to find the cluster node where the executor pod is running. 2. Get the node IP. 3. Compare with the datanode IPs.
  17. 17. Rescued data locality Executor Pod 2 10.0.1.2 Driver Executor Pod 1 10.0.0.2 10.0.0.3 (/fileA → Datanode 1 → 196.0.0.5) == (Executor 1 →10.0.0.3 → 196.0.0.5) Read /fileA Read /fileB /fileA /fileB node A 196.0.0.5 node B 196.0.0.6 Datanode Pod 1 Datanode Pod 2
  18. 18. Rescued data locality! with data locality fix - duration: 10 minutes without data locality fix - duration: 25 minutes
  19. 19. Recap Got HDFS up and running. Basic data locality works. Open problems: • Remaining data locality issues -- rack locality, node preference, etc • Kerberos support • Namenode High Availability
  20. 20. Join us! ● github.com/apache-spark-on-k8s ● pepperdata.com/careers/ More questions? ● Come to Pepperdata booth #101 ● Mail kimoon@pepperdata.com

×