
Cloud Computing and Big Data


A presentation I gave at the Fromm Institute at the University of San Francisco

Published in: Technology

  1. 1. Cloud Computing and Big Data
      Fromm Institute – University of San Francisco
      Robert Keahey
      4/16/2012
  2. 2. Agenda
      • A little about me
      • Review the handouts
      • Cloud Computing
        – What is a cloud?
        – History
        – Types of clouds
        – Implications
        – What does the future hold?
      • Break
      • Big Data
        – How big is big?
        – The importance of Big Data
        – How it’s different
        – Dealing with Big Data
        – What does the future hold?
      • Wrap-up and Q&A
  3. 3. Robert Keahey
      • Computer Scientist
      • EDS
      • Cordys
      • SummaLogic
      • Avid blogger
        – SAP on the Cloud
      • Interested in several areas of technology
        – Cloud Computing
        – Software-Defined Networking
        – Big Data
        – Social Media/Networking
        – Location-based Services
        – Augmented Reality
        – Content/Context-aware computing
  4. 4. Handouts and Key Takeaways
      • Accenture: A new era of innovation - Cloud and the future of business
        – It’s not just about economics
        – Innovation is a key byproduct
        – Alignment of business and IT
        – Corporations will reshape themselves
      • The Economist: The Power of Many
        – The cloud is now delivering personalized content
        – Consumer electronics allow tailoring of the experience
        – “SoLoMo” will drive new social norms and “social tribes”
        – User demand will drive broadband deployment
      • Health Care IT News: Amazon Cloud to Ease 1000 Genomes Project
        – Cloud computing enables “virtual file cabinets”
        – Quantum leaps in genome analysis, from years to weeks
        – Low cost techniques will directly impact clinical outcomes
      • Health Care IT News: Big Data, Personalized Medicine to Trend in Health Care in 2012
        – Tremendously valuable data hidden in unstructured data
        – Comprehensive medical views of patients will be possible
        – Future EHRs will be predictive
  5. 5. The $64,000 question… What is a cloud?
  6. 6. It’s Complicated…
  7. 7. Quick History
      Timeline axis: 40’s, 50’s, 60’s, 70’s, 80’s, 90’s, 00’s, 10’s
      Milestones and trends labeled on the chart:
      • IBM Watson
      • Stream computing
      • Virtualization part 2
      • Utility computing, multi-tenancy
      • Cloud Computing, mobility, “App Stores”
      • Client Server & service oriented architectures
      • WWW, global telecommunications
      • Internet & distributed computing
      • Mainframe virtualization
      • DEC PDP, distributed “mini” computing
      • Mainframe mainstream
      • IBM 360
      • Timesharing realized
      • Outsourcing
      • Big, general purpose computers
      • Timesharing envisioned
      • Moore’s Law
      • Big, single purpose computers
      • Metcalfe’s Law
  8. 8. Critical Convergence
      Converging on The “Cloud”:
      • Low cost infrastructure
      • Empowered users
      • Virtualization
      • Outsourcing
      • Information Technology economics
      • Service oriented architectures
      • Global connectivity (mobility)
  9. 9. Cloud Computing Defined
      According to the National Institute of Standards and Technology, cloud computing is…
      …a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models.
      Version 15, published 10/7/09
      Whoa… say what?
  10. 10. Let’s Try Again… Cloud Computing is…
      • A new service model for creating and consuming stuff in the digital age
      • Clouds use the same old stuff we’ve always used
        – It’s just faster, better and cheaper now
      • Clouds are more flexible, adaptable and scalable
      • Not sure if they’re really more economical
      • But hype drives markets and markets create opportunity
  11. 11. Cloud Computing in Simple Terms
      • Consolidate for economies of scale
      • Make it “elastic”
      • Make it globally available
      • Hide the complexity
      • Tailor the experience
      • Make it easily consumable
  12. 12. A Quick Diversion
  13. 13. Inside a Google Data Center
  14. 14. Data Center in a Container
  15. 15. Lefdal Mine - Måløy,
  16. 16. Goals of Cloud Computing
      • Be more responsive
        – Ability to create new services (time to market)
        – Expand and contract
      • Improve performance
        – Do things faster
        – Be more resilient
        – Recover faster
      • Hide complexity from the end user
      • Be more economical
        – Reduce both CAPEX and OPEX
  17. 17. Types of Clouds (adjacent types overlap)
      • Enterprise Clouds
        – Purpose: Typically replace existing compute, network and storage systems
        – Model: Public, Private, Hybrid
        – Typical user: Large enterprises, mid-size companies
        – Examples: Amazon AWS, Rackspace, GoGrid, Savvis, Terremark
      • Business Service Clouds
        – Purpose: Provide on-demand business services
        – Model: Public
        – Typical user: Large enterprises, SMBs
        – Examples: Salesforce.com, Zoho, Google Apps, NetSuite, Dropbox
      • Consumer Clouds
        – Purpose: Provide lifestyle convenience
        – Model: Public
        – Typical user: You and me
        – Examples: Flickr, Picasa, Pandora, Spotify, iTunes, iCloud, OpenTable, Netflix, Flixster, YouTube, Vimeo
  18. 18. The Other Side of Cloud Computing
      • Outages
        – Amazon AWS, April 2011: 2,000 customers affected
      • Security
        – Anonymous
        – LulzSec
      • Privacy
        – Location tracking
        – Personal information uploading
        – Facebook, Google
      • Piracy
        – Napster, The Pirate Bay
        – Legislation
          • SOPA, PIPA
          • MPAA, RIAA
      • Environmental impact
  19. 19. The Future of Cloud Computing
      • Function-specific clouds
        – Research
        – Analysis
        – Quality of life
      • Embedded “cloudlets”
      • Customizable services
      • Snap-together services
      • Convergence of the user experience
        – Location-based services
        – Augmented reality
        – Content/Context-based services
  20. 20. Big Data
  21. 21. How Big is “Big”?
      Charts: Monthly Mobile Traffic; Annual Internet Protocol Traffic (chart labels: 80.5 EB, 40.2 EB)
      Source: Cisco 2011 Visual Networking Index Report
  22. 22. Putting It in Perspective
      Chart: Visual Networking Index IP Traffic
      • The number of mobile-connected devices will exceed the world’s population in 2012
      • By 2016 that number will grow to 10 billion…
      Source: Cisco 2011 Visual Networking Index Report
  23. 23. Why is This Important?
      There’s gold in them there hills…
      • Scientific and medical research
      • Financial modeling
      • Weather forecasting
      • Risk analysis & management
      • Rogue trading
      • Terrorist tracking
      • Visual analytics
      • Disease tracking
      • Crop analysis
      • Consumption patterns
      • Sentiment analysis
      • Personalization
      • Targeted marketing
  24. 24. How is Big Data Different?
      Traditional Data
      • Large scale
      • Highly centralized
      • Structured
        – Files
        – Records
        – Databases
      • Sequential
      • Indexed
      • Processing transactions
      Big Data
      • Massive scale
      • Highly distributed
      • Unstructured
        – Emails
        – Audio/Video
        – Documents
        – Spreadsheets
        – Log files
        – Sensor data
        – Geo-spatial data
        – Books
        – Journals
        – Blogs
        – Text messages
        – Chat sessions
        – Search data
      • Random
      • Looking for patterns and relationships
  25. 25. Dealing with Structured Data
      • Very large databases stored in a central location (data warehouse)
      • Attached to very large, very powerful computers
      • Accessed by structured queries
      • Continually updated
      • Used for “real time” transactional processing
      • Reports created by a “batch” process
  26. 26. Dealing with Big Data
      • Unstructured data is retrieved from a variety of sources
      • Data is Extracted, Transformed and Loaded (ETL) into the Big Data system
      • Small parts of the data are distributed by master nodes to hundreds (thousands) of small networked nodes
      • Each node processes a part of the data and returns an answer
        – MapReduce
      • Process is repeated until all data is analyzed
      • Results are then used for further analysis
      Diagram: Unstructured Data (emails, audio/video, documents, spreadsheets, log files, sensor, geo-spatial, books, journals, blogs, text messages, chat sessions, search data) feeding into ETL
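The split, process and combine steps on this slide can be illustrated with a small word-count sketch. This is only a toy illustration of the MapReduce idea under stated assumptions, not how any particular Big Data system is implemented: the "nodes" are simulated with local threads, and the function names (`map_chunk`, `reduce_counts`, `word_count`) are hypothetical names chosen for this example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk):
    """Map step: one worker counts the words in its own chunk of text."""
    counts = defaultdict(int)
    for word in chunk.lower().split():
        counts[word] += 1
    return dict(counts)


def reduce_counts(partials):
    """Reduce step: merge the partial answers returned by every worker."""
    total = defaultdict(int)
    for partial in partials:
        for word, n in partial.items():
            total[word] += n
    return dict(total)


def word_count(chunks, workers=4):
    """Play the master node: hand out chunks, then combine the answers.

    Threads stand in for networked nodes here (an assumption made to keep
    the example runnable on a single machine).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce_counts(partials)


if __name__ == "__main__":
    chunks = ["big data is big", "data about data"]
    print(word_count(chunks))
```

Each chunk is processed independently, which is why the work can be spread across hundreds or thousands of nodes; only the small partial results have to travel back to be merged.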
  27. 27. The Future: Cloud Computing + Big Data
      • Stream computing
      • Dramatically improved forecasting and predictive analysis across all scientific disciplines
      • The rise of the Social Graph
        – Battle lines are drawn
      • Individually tailored and personalized solutions, services and experiences
        – Medical diagnosis and treatment
        – Lifestyle management
        – Targeted marketing and advertising
      Good afternoon, John Anderton…
  28. 28. Glossary
      • Cloud-bursting: Acquiring additional cloud resources to handle unexpected or seasonal processing demands.
      • Cloud-washing: Claiming that your business, product or service is “cloud computing” in order to generate market hype.
      • Flash-crowd: Unusually large number of users created by an unexpected event such as a global political crisis or natural disaster.
      • Hybrid Cloud: A cloud ecosystem (infrastructure and services) that combines elements of public and private clouds to provide compute, networking and storage services for a corporation or enterprise.
      • IaaS: “Infrastructure as a Service”, a cloud computing service that provides only computers, networking and storage.
      • MapReduce: A framework for processing highly distributable problems across huge datasets using a large number of computers (nodes), collectively referred to as a cluster (if all nodes use the same hardware) or a grid (if the nodes use different hardware).
      • PaaS: “Platform as a Service”, a cloud computing service that provides additional software (middleware) to enable business applications.
      • Private Cloud: A cloud ecosystem implemented by a corporation or enterprise solely for its own use.
      • Public Cloud: A cloud ecosystem that is provided by a service provider and is shared among many customers.
      • SaaS: “Software as a Service”, a cloud computing service that enables users (companies) to access a business application on a subscription basis.
      • Social Graph: A term coined by scientists working in the social areas of graph theory. It has been described as “the global mapping of everybody and how they’re related”.
  29. 29. Suggested Reading
      • Where Wizards Stay Up Late: The Origins of the Internet – Katie Hafner & Matthew Lyon
      • The Laws of Disruption: Harnessing the New Forces that Govern Life and Business in the Digital Age – Larry Downes
      • The Facebook Effect: The Inside Story of the Company that is Connecting the World – David Kirkpatrick
  30. 30. Questions?