Thoughts about how to use CloudStack to develop a Big Data Solution in your data center.
Cloud as a virtual machine infrastructure and Big Data are converging as two key technical evolution in the data center. Virtualization enables multi-tenancy, heterogenous Operating systems and added security via isolation. Clouds like AWS EC2, Rackspace, or google GCE are good examples. Big Data tackles the challenge of the increase scale (amount) and complexity (type) of data faced in the enterprise. While Compute cloud show a departure from traditional hardware provisioning and configuration management via virtualization, big data is a departure from traditional relational databases and file systems. These two technical evolutions have been triggered by the new workloads of the internet (search, streaming) and the scale needed to server millions of users and millions/billions of objects to store or serve.
In this talk we show how CloudStack and its support for bare-metal provisioning is compatible with a public cloud. CloudStack being a data center orchestrator that can tackle both traditional enterprise workloads and internet scale/type workloads. Multiple zones can be created for compute cloud or big data. Big data can used as backend store to the compute cloud or as zone type to enabled big data workload on the bare metal hardware.
This hybrid mode of operation is seen as the next evolution of clouds and positions a data center orchestrator has more than a VM management system and a solution to big data management as well.