Agenda:
Introduction and agenda
Ops benefits
Tech benefits
Architecture
Use cases
Demo video
Hybrid data model
Current directions
Q&A
Supplementals
Adobe is a Big Data company. Adopting a virtualization approach for Hadoop has both business and technical justifications, and it enables competitive differentiation. Analytics is a core competency of the DMBU.
Rapid provisioning: much of the cluster deployment process can be automated using existing tools.
High availability: the virtualization platform can protect the single points of failure in the Hadoop system.
Elasticity: Hadoop capacity can be scaled up and down on demand in a virtual environment.
Multi-tenancy: different tenants running Hadoop can be isolated in separate VMs, providing stronger VM-grade resource and security isolation.
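As a concrete illustration of the elasticity point, scaling a cluster down gracefully amounts to excluding a worker and letting Hadoop re-replicate its data. The sketch below assumes a Hadoop 1.x cluster (JobTracker era, as on the architecture slide) with an excludes file wired up via `dfs.hosts.exclude`; the hostname and file location are examples only.

```shell
# Sketch of graceful scale-down, assuming Hadoop 1.x and an excludes
# file referenced by dfs.hosts.exclude in hdfs-site.xml.
# The hostname below is an example placeholder.
echo "worker-42.example.com" >> dfs.exclude

# On the NameNode, re-read the excludes file; HDFS re-replicates the
# node's blocks before marking it decommissioned:
#   hadoop dfsadmin -refreshNodes
# Likewise for the JobTracker (mapred.hosts.exclude):
#   hadoop mradmin -refreshNodes
# Monitor decommission progress:
#   hadoop dfsadmin -report
```

Scaling up is the reverse: provision a new worker VM from the template, then refresh nodes so it joins the cluster.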
Expecting a lot of questions on this one, and we're about halfway through, so this is a good time for intermediate Q&A if Chris wants to discuss some of the physical design. We can defer questions on use cases and workflows, since those come immediately afterward.
Prod and dev review
Video walkthrough of vCAC deployment and auto-discovery via Cloudera Manager
Hybrid storage model to get the best of both worlds, and for flexibility.
Master nodes (NameNode, JobTracker) on shared storage: leverage vSphere vMotion, HA, and FT.
Slave nodes (TaskTracker, DataNode) on local storage: lower cost, scalable bandwidth.
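The split above can be sketched as two HDFS config fragments. This is a hedged illustration assuming Hadoop 1.x property names (`dfs.name.dir`, `dfs.data.dir`); the mount points are example placeholders for a shared datastore versus local VMDKs.

```shell
# Master VM: NameNode metadata on a shared datastore, so vSphere
# vMotion/HA/FT can protect and move the VM. Paths are examples.
cat > hdfs-site.master.fragment.xml <<'EOF'
<property>
  <name>dfs.name.dir</name>
  <value>/mnt/shared-datastore/namenode</value>
</property>
EOF

# Worker VM: DataNode blocks striped across local disks for lower
# cost and aggregate I/O bandwidth. Paths are examples.
cat > hdfs-site.worker.fragment.xml <<'EOF'
<property>
  <name>dfs.data.dir</name>
  <value>/mnt/local0/dfs,/mnt/local1/dfs</value>
</property>
EOF
```

The design choice: master metadata is small but must survive host failure, so it goes on shared storage; worker block data is large and already replicated by HDFS, so cheap local disks suffice.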
Identify acronyms first: DMBU (Digital Marketing Business Unit) and vCAC (VMware vCloud Automation Center).
Integration with Adobe DMBU Private Cloud: IaaS environment leveraging the VMware stack (vCAC + vCOps + vCenter).
HDFS storage integration: the storage team currently manages >10 PB of data on Isilon, presenting this layer via HDFS to multiple product teams from a single view.
Service blueprints in vCAC: offering multiple blueprints for various cluster types and sizes, published to the Service Catalog and our internal self-provisioning portal.
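To make the self-provisioning flow concrete: a catalog request boils down to naming a blueprint and its sizing parameters. The sketch below is purely illustrative — the blueprint name, field names, and portal URL are hypothetical placeholders, not the actual vCAC catalog schema or API.

```shell
# Hypothetical request payload; all field names and values are
# illustrative placeholders, not the real vCAC schema.
cat > request.json <<'EOF'
{
  "blueprint": "hadoop-cluster-small",
  "workers": 8,
  "requestedFor": "dmbu-analytics"
}
EOF

# In practice this would be submitted through the vCAC Service
# Catalog or the internal portal, e.g. (placeholder URL):
#   curl -s -X POST https://portal.example.com/api/requests \
#        -H 'Content-Type: application/json' -d @request.json
```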