Supporting Hadoop in containers takes much more than the rudimentary support Docker provides through its storage plugin interface. A production-scale Hadoop deployment inside containers needs to honor affinity/anti-affinity, fault-domain, and data-locality policies. Kubernetes alone, with primitives such as StatefulSets and PersistentVolumeClaims, is not sufficient to support a complex, data-heavy application such as Hadoop; the problem has to be approached holistically across the container, networking, and storage stacks. Moreover, the deployment, scaling, and upgrade constructs in traditional orchestration platforms are designed for applications that have adopted a microservices philosophy, which does not fit most Big Data applications across the ingest, store, process, serve, and visualize stages of the pipeline. Come to this technical session to learn how to run containerized Hadoop and other applications in the data analytics pipeline, and manage their lifecycle, efficiently and effectively, far beyond simple container orchestration.

#BigData, #NoSQL, #Hortonworks, #Cloudera, #Kafka, #Tensorflow, #Cassandra, #MongoDB, #Kudu, #Hive, #HBase. PARTHA SEETALA, CTO, Robin Systems.
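For readers unfamiliar with the Kubernetes primitives the abstract names, the following is a minimal sketch, written with the official `kubernetes` Python client, of a StatefulSet that combines a required pod anti-affinity rule with a PersistentVolumeClaim template. All names, images, label values, and storage sizes below are hypothetical placeholders, not from the talk. The sketch illustrates the abstract's point: these primitives can express replica spread and per-replica storage, but nothing in them encodes fault-domain placement of the volumes or data-locality between compute pods and HDFS blocks.

```python
# Minimal sketch of the Kubernetes primitives the abstract mentions,
# using the official `kubernetes` Python client. Every name, image,
# and size here is a hypothetical placeholder.
from kubernetes import client

labels = {"app": "hdfs-datanode"}  # hypothetical label set

statefulset = client.V1StatefulSet(
    metadata=client.V1ObjectMeta(name="hdfs-datanode"),
    spec=client.V1StatefulSetSpec(
        service_name="hdfs-datanode",
        replicas=3,
        selector=client.V1LabelSelector(match_labels=labels),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels=labels),
            spec=client.V1PodSpec(
                # Anti-affinity: never co-locate two DataNode pods
                # on the same node.
                affinity=client.V1Affinity(
                    pod_anti_affinity=client.V1PodAntiAffinity(
                        required_during_scheduling_ignored_during_execution=[
                            client.V1PodAffinityTerm(
                                label_selector=client.V1LabelSelector(
                                    match_labels=labels
                                ),
                                topology_key="kubernetes.io/hostname",
                            )
                        ]
                    )
                ),
                containers=[
                    client.V1Container(
                        name="datanode",
                        image="example/hdfs-datanode:3.3",  # hypothetical
                        volume_mounts=[
                            client.V1VolumeMount(
                                name="data",
                                mount_path="/hadoop/dfs/data",
                            )
                        ],
                    )
                ],
            ),
        ),
        # PVC template: each replica gets its own volume claim.
        volume_claim_templates=[
            client.V1PersistentVolumeClaim(
                metadata=client.V1ObjectMeta(name="data"),
                spec=client.V1PersistentVolumeClaimSpec(
                    access_modes=["ReadWriteOnce"],
                    resources=client.V1ResourceRequirements(
                        requests={"storage": "500Gi"}
                    ),
                ),
            )
        ],
    ),
)

# Note what is NOT expressible above: which fault domain each volume
# lands in, or that a processing pod should run next to its HDFS blocks.
```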