There is increased interest in using Kubernetes, the open-source container orchestration system for modern, stateful Big Data analytics workloads. The promised land is a unified platform that can handle cloud native stateless and stateful Big Data applications. However, stateful, multi-service Big Data cluster orchestration brings unique challenges. This session will delve into the technical gaps and considerations for Big Data on Kubernetes.
Containers offer significant value to businesses; including increased developer agility, and the ability to move applications between on-premises servers, cloud instances, and across data centers. Organizations have embarked on this journey to containerization with an emphasis on stateless workloads. Stateless applications are usually microservices or containerized applications that don’t “store” data. Web services (such as front end UIs and simple, content-centric experiences) are often great candidates as stateless applications since HTTP is stateless by nature. There is no dependency on the local container storage for the stateless workload.
Stateful applications, on the other hand, are services that require backing storage and keeping state is critical to running the service. Hadoop, Spark and to lesser extent, noSQL platforms such as Cassandra, MongoDB, Postgres, and mySQL are great examples. They require some form of persistent storage that will survive service restarts...
Anant Chintamaneni, VP Products, BlueData
Nanda Vijaydev, Director Solutions, BlueData