Engineering Big Data
Clouds
Debo Dutta – Principal Engineer, Cisco

Engineering Big Data Infra with Openstack

This deck describes, at a high level, how to optimize Big Data analytics applications on OpenStack.

Published in: Technology
  • Customers: 2 very large SPs for early field trials (taking to biggest customers first who need our help). Formal engagements kicked off in late January. Cisco IT: internal cloud, Webex.
  • Engineering Big Data Infra with Openstack

    1. Engineering Big Data Clouds. Debo Dutta, Principal Engineer, Cisco. © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential.
    2. Forward-looking Statements: This presentation contains projections and other forward-looking statements regarding future events or the future financial performance of Cisco, including future operating results. These projections and statements are only predictions. Actual events or results may differ materially from those in the projections or other forward-looking statements. Please see Cisco’s filings with the SEC, including its most recent filings on Form 10-K and 10-Q, for a discussion of important risk factors that could cause actual events or results to differ materially from those in the projections or other forward-looking statements.
    3. Data Deluge Everywhere: Enterprises need Insights in a cost-effective manner
       • Cloud (XaaS) & App stores: all data in the cloud
       • Social Media: consumer behavior, targeted advertisement, social network platforms
       • Mobile Data: location, presence, device, access, customer
       • Smart Converged Networks: B/W optimization, content placement, offload, SDN
       • Commerce: mobile payment platforms and local offers
       • Video Growth: 65% of mobile and 90% of fixed traffic will be video by 2015 (Cisco VNI)
       • M2M: 225 million connections by 2014 (ABI Research), from vending machines and ATMs to connected automobiles
       • Volume, Variety, Velocity, Veracity
       Adapted from PRIME deck.
    4. Big Data to Big Insights in the Cloud
       IDC: A new generation of technologies and architectures designed to economically extract value from very large volumes of a wide variety of data by enabling high-velocity capture, discovery, and/or analysis.
       • Shift from technology for finding information to discovering insights
       • Increased interest in real-time analytics of machine-generated data
       • “Software defined” and converged technologies
       • Open source software/platforms will play a pivotal role in big data infrastructure and gain greater commercial adoption
       • 2013 will be the year of “Big Data in the Cloud”
    5. Rise of semi and unstructured data
       Forces Driving The Growth of Big Unstructured Data:
       • Web Products, Commerce, Services: click streams, email, AVI files, user data / search data
       • Cloud Computing: network log files, event processing, impact analysis
       • Defense, Intelligence and Security: call data, online activity, travel data, GPS data, satellite feeds
       • Financial Services: fraud detection / risk analysis, transaction data warehousing, PCI compliance, surveillance
       • Other: meteorology, disease / epidemiology, genomics, RFID data, sensor data
       Difficult to capture, store, search, share, and analyze with traditional tools.
       1) Source: Cisco Visual Networking Index, June 2011. Thanks to Corp Dev.
    6. BDaaS on public clouds
       • Focus on Platforms
       • Focus on Integration
       • Mostly ETL
       • Leverage Public Clouds
       • Very little focus on Insights
       • Insights are obtained by in-house Data Scientists
       • For Viz, UX is not there yet. Exceptions: Tableau @Netflix
    7. Big Data on Openstack
       [Architecture diagram; recoverable labels listed below]
       • Tenants: /tenant/industrial, /tenant/finance, /tenant/healthcare
       • Per-tenant stack (shown for the Healthcare Big Data Application): virtual VPN, virtual firewall, virtual WAAS, virtual load balancer, single-instance services, Devops server, sensor, dashboard, and database VMs (App on OS)
       • (Private) Cloud Provider, Big-data Services, each exposed through an API: Batch, Real time, and noSQL, including Hadoop, Hive, Pig, Oozie, MapR, Drill, Lucene, Cassandra, Mongo, Truviso, Yahoo S4, Storm, and App Topology; Devops
       • OpenStack Cloud Platform: bridges the virtual and physical layers. Compute Service (servers), Storage Service (disks), Network Service (networks), User and System Admin, Intelligent Scheduler
       • Resource Virtualization/hypervisor Layer: creates and manages virtualized compute, storage and networking resources. Hypervisor: KVM, Xen, ESX; Nexus 1000v + Open vSwitch. Network Virtualization: VLAN, OpenFlow, LISP, VXLAN
       • Physical Resource Layer: networking, storage and compute resources
    8. Scheduling Heterogeneous Resources is key
       • Developer API
       • Compute Service (VMs, Memory, Local Disk): servers
       • Storage Service (Block, Massive Key-value store): disks
       • Network Service (Virtual Networks, Services): networks
       • Network (topology) and storage aware scheduling across VMs and bare metal
       • Map heavy workloads onto bare metal with more resources, and light workloads onto virtualized resources
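The placement rule on the scheduling slide (heavy workloads on bare metal, light workloads on VMs, with topology and storage awareness) can be illustrated with a minimal Python sketch. This is not the deck's scheduler or any OpenStack API. The `Workload` and `Host` types, the "heavy" thresholds, and the rack-based locality rule are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical thresholds for calling a workload "heavy"; not from the deck.
HEAVY_CPU_CORES = 16
HEAVY_MEMORY_GB = 64


@dataclass
class Workload:
    name: str
    cpu_cores: int
    memory_gb: int
    data_rack: str  # rack holding the workload's input data (assumed known)


@dataclass
class Host:
    name: str
    kind: str  # "metal" for bare metal, "vm" for a virtualized pool
    rack: str
    free_cores: int


def is_heavy(w: Workload) -> bool:
    """Classify a workload as heavy if it exceeds either threshold."""
    return w.cpu_cores >= HEAVY_CPU_CORES or w.memory_gb >= HEAVY_MEMORY_GB


def schedule(w: Workload, hosts: List[Host]) -> Optional[Host]:
    """Pick a host: heavy -> bare metal, light -> VM, preferring data locality."""
    wanted = "metal" if is_heavy(w) else "vm"
    candidates = [
        h for h in hosts if h.kind == wanted and h.free_cores >= w.cpu_cores
    ]
    if not candidates:
        return None  # nothing of the right kind has enough capacity
    # Prefer hosts in the same rack as the data, then the most free cores.
    best = min(candidates, key=lambda h: (h.rack != w.data_rack, -h.free_cores))
    best.free_cores -= w.cpu_cores  # reserve the capacity
    return best


hosts = [
    Host("metal-1", "metal", "r1", 32),
    Host("metal-2", "metal", "r2", 32),
    Host("vm-1", "vm", "r1", 8),
    Host("vm-2", "vm", "r2", 4),
]
heavy_job = Workload("hadoop-sort", cpu_cores=24, memory_gb=128, data_rack="r2")
light_job = Workload("dashboard", cpu_cores=2, memory_gb=4, data_rack="r1")

heavy_host = schedule(heavy_job, hosts)
light_host = schedule(light_job, hosts)
print(heavy_host.name, light_host.name)
```

In this sketch the heavy job lands on the bare-metal host in the rack that holds its data, and the light job lands on a VM pool. A real scheduler would also weigh memory, disk, and network bandwidth rather than cores alone.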
