
To_Infinity_and_Beyond_Internet_Scale_Workloads_Data_Center_Design_v6

To Infinity and Beyond - 2012 Internet Scale Workloads and Internet-scale Data Center Design: This presentation, given at the IBM Europe Symposium in Budapest in October 2012, takes an advanced look at today's massive internet-scale workloads and data centers, and dissects what lessons we can, should, and must apply to our own IT shops. We'll examine how Internet scale is very different from a collection of co-located servers, and how these data centers respond to real-time, dynamic, fluid, competitive-advantage-leapfrog Internet business environments. These Internet-scale data centers' servers, storage, and software use new approaches to work as an end-to-end efficient, flexible, adaptable workflow. Using Google's definitive work "The Data Center as a Computer - Intro to the Warehouse-scale Machine" as a foundation (superb open-license material published by Google in 2009), come discover the design, deployment, and lessons that we all must learn from these giants of the Internet. Why and how do they do what they do? Where are they being built? How are they powered and cooled? What are the deployment form factors, design philosophies, power/cooling/packaging principles and trends, including modular portable container data center architecture? You'll come away knowing what you should apply to your own IT data center infrastructure in 2012 and beyond. My only request when using or referencing this material is that you give full credit to me and IBM as the original authors of this research. That said, please spread the good word with good business judgement - we all benefit in today's modern global world.



  1. 1. © 2012 IBM Corporation sGE06 To Infinity and Beyond: 2012 Internet Scale Workloads and Data Center Design John Sing, Executive Consultant, IBM Systems and Technology Group IBM Systems Technical Universities– Budapest, Hungary – October 15-19
  2. 2. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 2 IBMTECHU.COM  IBM STG Technical Universities & Conferences web portal  Direct link: ibmtechu.com/hu  KEY FEATURES... – Create a personal agenda using the agenda planner – View the agenda and agenda changes – Use the agenda search to find sessions – Download presentations – Submit Session and Conference Evaluations. Win prizes by submitting evaluations online. The more evaluations submitted, the greater the chance of winning.
  3. 3. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 3 Evaluations are Online!Evaluations are Online! IBMTECHU.COMIBMTECHU.COM sGE06
  4. 4. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 John Sing  31 years of experience with IBM in high end servers, storage, and software – 2009 - Present: IBM Executive Strategy Consultant: IT Strategy and Planning, Enterprise Large Scale Storage, Internet Scale Workloads and Data Center Design, Big Data Analytics, HA/DR/BC – 2002-2008: IBM IT Data Center Strategy, Large Scale Systems, Business Continuity, HA/DR/BC, IBM Storage – 1998-2001: IBM Storage Subsystems Group - Enterprise Storage Server Marketing Manager, Planner for ESS Copy Services (FlashCopy, PPRC, XRC, Metro Mirror, Global Mirror) – 1994-1998: IBM Hong Kong, IBM China Marketing Specialist for High-End Storage – 1989-1994: IBM USA Systems Center Specialist for High-End S/390 processors – 1982-1989: IBM USA Marketing Specialist for S/370, S/390 customers (including VSE and VSE/ESA)  singj@us.ibm.com  IBM colleagues may access my intranet webpage: – http://snjgsa.ibm.com/~singj/  You may follow my daily IT research blog – http://www.delicious.com/atsf_arizona  You may follow me on Slideshare.net: – http://www.slideshare.net/johnsing1  My LinkedIn: – http://www.linkedin.com/in/johnsing
  5. 5. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Agenda  Today’s Internet Scale Data Center Landscape – Where are they? How big? How fast growing? – What are they being used for? Cloud impact? – Why understand them?  What is internet data center / warehouse-scale computing? – How is it different? Workloads? – Hardware and software? – How the same?  How best to meld with it / use it / exploit it? – Lessons we can apply from Internet scale computing to traditional IT – Resources to help you on this journey. This session is the author’s research compilation. Great thanks to Google for the seminal work on which this lecture is based. Download a copy at: http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006 Published in 2009
  6. 6. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Today’s Internet Scale Data Center Landscape Paraphrased: “The world has changed. And some things that should not be forgotten, may be lost”. The Lord of the Rings, Galadriel
  7. 7. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Today: two different types of IT Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/ Internet scale wkloadsTransactional IT
  8. 8. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Today’s two major IT workload types Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/ Transactional IT Internet scale wkloads
  9. 9. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 How to build these two different clouds Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/ Transactional IT Internet scale wkloads
  10. 10. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 What You (Consumer) Get with These Clouds: Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
  11. 11. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Policy-based Clouds and Design-for-fail Clouds are purpose optimized Infrastructure Management solutions  Policy-based Clouds • Purpose optimized for longer-lived virtual machines managed by Server Administrator • Centralizes enterprise server virtualization administration tasks • High degree of flexibility designed to accommodate virtualization all workloads • Significant focus on managing availability and QoS for long-lived workloads with level of isolation • Characteristics derived from exploiting enterprise class hardware • Legacy applications  Design-for-fail Clouds • Purpose optimized for shorter-term virtual machines managed via end-user or automated process • Decentralized control, embraces eventual consistency, focus on making “good enough” decisions • High degree of standardization • Significant focus on ensuring availability of control plane • Characteristics driven by software • New applications
  12. 12. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 What’s happening?  Continually rising worldwide internet bandwidth • Cisco global IP traffic study and forecast • http://www.akamai.com/stateoftheinternet/  Has given rise to pervasive and hyper-growing web services delivery model – (i.e. “The Cloud”)  The Cloud is provided by data centers with massive amounts of well-connected processors, storage, network – Amortized across internet scale user population – Across multiple workloads http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/VNI_Hyperconnectivity_WP.html
  13. 13. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Bandwidth and the Cloud…..  This new class of large-scale internet and cloud data centers  Has data volume: – 10s / 100s petabytes  Servers: – 100,000s  Workload can’t fit – In single server / rack of servers  Workload: – Requires server clusters of 100s, 1000s, 10,000s, or more…….
  14. 14. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Growth of The Cloud by 2014  Means very big shift in resources  And in the way that IT is managed for the enterprise http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns1175/Cloud_Index_White_Paper.html Source:
  15. 15. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 How Big is the World? - 1 http://wikibon.org/blog/how-big-is-the-world-of-cloud-computing-infographic/ This is significant: the infographic cites cheaper-by factors of 7.1x, 5.7x, and 7.3x (network, storage, admins).
  16. 16. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 How Big is the World? - 2 http://wikibon.org/blog/how-big-is-the-world-of-cloud-computing-infographic/ We’re going to talk about this 
  17. 17. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Warehouse Scale Computers  The name for this emerging class of data centers is: – Warehouse-Scale Computer (WSC)  Large portions of hardware and software resources must work together  Only achieved by a holistic approach to their design and development  Treat the datacenter itself as one massive computer  Enclosure for this computer looks like a building or warehouse This session is the author’s research compilation. Great thanks to Google for the seminal work on which this lecture is based. Download a copy at: http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006 Here, in plain English, are the fundamentals of the next-generation IT age
  18. 18. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Tell me more. How does this compare to my existing data centers? What are different workloads that best fit into the two different types? How best to meet / meld / jointly profit  ? OK. Hmmmmmmm……….
  19. 19. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Big Positioning picture (chart: storage req’d in GB/TB/PB vs. $/server; Traditional IT and Data Warehouse = current IT architectures; Mobile/Cloud and Big Data / Internet scale = growth areas)
  20. 20. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Build new, different skill sets (chart: storage req’d in GB/TB/PB vs. $/server, storage; Traditional IT and Data Warehouse = current IT architectures running traditional IT workloads; Big Data / Internet scale = highly parallelized internet scale architecture, integrated E2E, software centric)
  21. 21. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Key strategies $/server,storage Traditional IT Data Warehouse Big Data Internet scale Current IT architectures Traditional IT architectures Internet scale architectures  Continue to modernize current traditional IT … Architect new-gen connectors, skills Architect future expandability  Connect with – New generation mobile-enabled workloads View new gen as a powerful partner Enable them to view traditional IT as a powerful enabler
  22. 22. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Why Warehouse Scale Computers (WSC) might matter to you  While WSCs / Cloud Data Centers might today be considered a niche area – Their sheer size / cost / architecture is no longer uncommon – Among large internet companies and cloud co-lo’s  Problems solved by today’s huge Internet-scale IT design-for-fail architectures – Have already become meaningful to a much larger constituency  Many organizations are already: – Exploiting similarly architected computers, at a much lower cost - Hadoop is an example – Soon, we may have 2000+ cores in a single server  The experience learned building today’s Internet Scale Data centers – Is useful in preparing your team / company to meld, interact, plan, grow, expand, exploit the future in your own best interest – As these potentially ubiquitous next-generation machines and data centers take hold
  23. 23. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 What is: Internet Scale Data Center? Warehouse Scale Computer?
  24. 24. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Internet scale computing - what is different? Traditional Data Center  Co-located machines share security, environmentals  Applications = a few binaries, running on a relatively small number of machines – 100s of inter-process relationships requiring 100 nanosecond response  Heterogeneous hardware, software  Partitioned resources, managed and scheduled appropriately  Facility and computing equipment designed separately Warehouse-scale computer (WSC)  Computing system designed to run massive internet services  Highly parallelized applications = 10s of binaries running on 1000s of machines – 100,000s of independent tasks only requiring 100 microsecond response time (1,000x slower)  Homogeneous hardware, system software  Common pool of resources managed centrally  Integrated design of facility and computing machinery This is a different thing, for a different workload type Main difference Credit for all these ideas is to Google 2011 June talk by Luiz Andre Barroso at 2011 Federated Computing Research Conference San Jose, Calif
  25. 25. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Another way to tell them apart Traditional Data Center  If your storage system has a few petabytes of data Warehouse-scale computer  If your storage subsystem pages you in the middle of the night  Because you only have a few petabytes of free space left Credit for all these ideas is to Google 2011 June talk by Luiz Andre Barroso at 2011 Federated Computing Research Conference San Jose, Calif
  26. 26. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Let’s see some of largest Internet Scale Data Centers  Many are co-location Cloud data centers  Many are true Warehouse Scale Computers  All of them have a very specific Internet web services application profile
  27. 27. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 http://wikibon.org/blog/wp-content/uploads/2011/10/5-top-data-centers.html
  28. 28. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 http://wikibon.org/blog/wp-content/uploads/2011/10/5-top-data-centers.html
  29. 29. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Large Data Centers in past 2 years 10. SUPERNAP, LAS VEGAS, 407,000 SF 9A and 9B. MICROSOFT QUINCY AND SAN ANTONIO DATA CENTERS, 470,000 S
  30. 30. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Container Data Center Architecture 7. PHOENIX ONE, PHOENIX, ARIZ. 538,000 SF 5. MICROSOFT CHICAGO DATA CENTER, Chicago 700,000 SF 2. QTS METRO DATA CENTER, ATLANTA, 990,000 SF Microsoft’s Chicago Container Data Center
  31. 31. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 More data centers…. 4. NEXT GENERATION DATA EU 3. NAP OF THE AMERICAS, MIAMI, 750,000 SF 1. 350 EAST CERMAK, CHICAGO, Consumes 100 megawatts of power, 2nd-largest power customer for Commonwealth Edison, trailing only Chicago’s O’Hare Airport.
  32. 32. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 2012: Other large world data centers  For Tulip Telecom, India, in Bangalore  Currently the largest in Asia Pacific and 3rd largest in the world (for now)  Nearly 1 M sq feet  Co-built with IBM. China: plans to build a 6.2 M sq feet data center by 2016 Amadeus, Erding, Germany: 1+ billion transactions / day, 0.3 second response time, access to 95% of the world’s airline seats, 5000+ servers, powers over 260 websites in 110 countries for over 100 airlines, 10 PB of storage Utah Data Center, US Govt, 1M sq feet
  33. 33. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Now….. what about the web giants?  i.e. Apple, Facebook, Google, Amazon, etc? That’s Big! Great Technology Wars of 2012 – Future of the Innovation Economy - Fast Company.com
  34. 34. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Apple Here’s what powers iCloud, see Jobs at WWDC 2011 iCloud announcement (YouTube) Rendering of Apple’s new North Carolina Data Center. Credit: Apple Other Apple data centers: Cork, Ireland Munich, Germany Newark, California Cupertino, Calif Apple Data Center FAQ Maiden, North Carolina 500K sq ft USD $1B This is phase 1 only Apple Data Center Newark, California Purposes for all these data centers: •iCloud •Support Apple’s WW install base of devices •Futures: Move Content Delivery Network in-house? •Futures: Streaming video? Under construction: Prineville, Oregon
  35. 35. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Facebook Facebook’s North Carolina Data Center Goes Live Lulea, Sweden - 290K sq ft (27K sq meters) Facebook – Prineville, Oregon Has spent $1B on its data centers Open Compute Project http://www.wired.com/wiredenterprise/2011/12/facebook-data-center/all/1
  36. 36. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Amazon http://www.searchenginejournal.com/fathoming-amazon-a-visualization-of-their-success-infographic/36768/
  37. 37. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Amazon Web Services As of 1Q2012, AWS stores 905 billion objects and serves 650K requests/sec Amazon Web Services 1Q12: 450,000 servers Amazon Perdix Modular Datacenter Amazon EC2 Cloud, with a 17K-core, 240-teraflop cluster, is the 42nd fastest supercomputer in the world 450,000 servers 905 billion objects 650K req/sec http://aws.typepad.com/aws/2012/04/amazon-s3-905-billion-objects-and-650000-requestssecond.html http://gigaom.com/cloud/how-big-is-amazon-web-services-bigger-than-a-billion/
  38. 38. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 To understand the modern Internet scale data center…… Let’s study the creators of this new paradigm Google originated and continues to drive much of this style of technology
  39. 39. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 What is Google? Google is not a search engine Google is a real-time “Data Factory” ecosystem – De facto organizer of all human internet data – Provides worldwide Patterns of Life data • Search, analytics, etc. as processing • Interactive maps as visualization – Android as ingest / output devices • Motorola Mobility acquisition: $12B – Supporting businesses and ecosystem roles: • Google+, Play, Shop, Books, Gmail, Docs • Voice recognition software The history of search engines http://www.wordstream.com/articles/internet-search-engines-history
  40. 40. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06  Apple – Apple bought 12 PB for iTunes, iCloud – iPod = successful because of iTunes ecosystem – iPhone = successful because of App Store ecosystem  Facebook ecosystem – Patterns of life data on over 900 million users worldwide – Storage size of their Hadoop cluster: 30 PB  Amazon Web Services ecosystem – Building 4 new modular datacenters: Oregon + Ireland – http://www.datacenterknowledge.com/archives/2011/03/28/amazons-cloud-goes-modular-in-oregon/ – http://www.slideshare.net/AmazonWebServices/best-practices-in-architecting-for-the-cloud-webinar-jinesh-varia  eBay ecosystem – 2009: Analyzes 50PB of data a day, over 8 billion URL requests per day  Bottom line: ecosystem is no longer optional, hasn’t been for some time Internet scale data centers… are “Data Factories with ecosystem”
  41. 41. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Google has already gone through three major end-to-end transformations  Google has 3 ages in terms of managing data:  Batch: Indexes calculated every month (2003) – Crawled the web 1x a month. Built a search index, answered queries. Largely read-only, pretty easy to scale. This is still the mental model most people have of how Google works  Warehouse: the datacenter as one huge computer (2005) – Things move faster. The Internet has happened - pervasive, high speed, interactive. – Building their own datacenters, more sophisticated at every level – Iconic systems like BigTable in production – At this time Google realized they were building something qualitatively different from anything before, something we now think of as cloud computing. Amazon’s EC2 launched in 2006  Instant: make it all real-time (2010) – Google’s Colossus makes search real-time by dumping MapReduce – 3 billion write and 20 billion read transactions daily. Reflects the shift to mobile devices and computing http://www.google.com/insidesearch/features/instant/about.html http://highscalability.com/blog/2011/8/29/the-three-ages-of-google-batch-warehouse-instant.html The history of search engines http://www.wordstream.com/articles/internet-search-engines-history http://highscalability.com/blog/2010/9/11/googles-colossus-makes-search-real-time-by-dumping-mapreduce.html
  42. 42. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Why Google Instant? It was part of the smartphone explosion of value across Internet….  In 2011, every 5 minutes = 250 hours of YouTube video uploads
  43. 43. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 You’ve noticed Google Instant:
  44. 44. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Architectural Guiding Principles For Internet Scale Servers in Big Data companies
  45. 45. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Let’s examine the infrastructure Looking for lessons Hint: what is an Internet workload?
  46. 46. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Internet Scale Workload Characteristics - 1  Embarrassingly parallel Internet workload – Immense data sets, but relatively independent records being processed • Example: billions of web pages, billions of log / cookie / click entries – Web requests from different users essentially independent of each other • Creating natural units of data partitioning and concurrency • Lends itself well to cluster-level scheduling / load-balancing – Independence = peak server performance not important – What’s important is aggregate throughput of 100,000s of servers i.e. Very low inter-process communication  Workload Churn – Well-defined, stable high-level APIs (i.e. simple URLs) – Software release cycles on the order of every couple of weeks • Means Google’s entire core of search services rewritten in 2 years – Great for rapid innovation • Expect significant software re-writes to fix problems on an ongoing basis – New products hyper-frequently emerge • Often with workload-altering characteristics, example = YouTube
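To make the embarrassingly parallel point above concrete, here is a minimal sketch in plain Python (an illustration, not any production system): independent log records are split into shards and processed with no cross-shard communication, so aggregate throughput across many workers matters more than the peak speed of any one server. The record format, shard count, and worker count are invented for illustration.

    from multiprocessing import Pool

    def count_clicks(shard):
        # Process one shard of independent log entries; no cross-shard communication needed.
        return sum(1 for entry in shard if entry.get("event") == "click")

    def partition(records, num_shards):
        # Natural data partitioning: records are independent, so any split works.
        return [records[i::num_shards] for i in range(num_shards)]

    if __name__ == "__main__":
        logs = [{"user": i, "event": "click" if i % 3 == 0 else "view"} for i in range(100_000)]
        shards = partition(logs, num_shards=8)      # in a WSC this would be 1000s of servers
        with Pool(processes=8) as pool:
            per_shard = pool.map(count_clicks, shards)
        print("total clicks:", sum(per_shard))      # aggregate result, order-independent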
  47. 47. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Internet Scale Workload Characteristics - 2  Platform Homogeneity – Single company owns, has technical capability, runs entire platform end-to-end including an ecosystem – Most Web applications more homogeneous than traditional IT – With an immense number of independent worldwide users, 1% - 2% of all Internet requests fail* Users can’t tell the difference between the Internet being down and your system being down Hence 99% is good enough *The Data Center as a Computer: Introduction to Warehouse Scale Computing, p.81 Barroso, Holzle http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006  Fault-free operation via application middleware – Some type of failure every few hours, including software bugs – All hidden from users by fault-tolerant middleware – Means hardware and software don’t have to be perfect  Immense scale: – Workload can’t be held within 1 server, or within a maximum-size tightly-clustered memory-shared SMP – Requires clusters of 1000s, 10000s of servers with corresponding PBs of storage, network, power, cooling, software – Scale of compute power also makes possible apps such as Google Maps, Google Translate, Amazon Web Services EC2, Facebook, etc.
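A hedged illustration of the fault-tolerance point above (a toy sketch, not any vendor's middleware): the application layer retries against replicas and falls back to a degraded answer, so a component that is always failing somewhere never surfaces to the end user. Function names, failure rates, and timeouts are invented.

    import random, time

    class ReplicaDown(Exception):
        pass

    def query_replica(replica_id, request):
        # Hypothetical backend call; fails randomly to simulate constant component churn.
        if random.random() < 0.2:
            raise ReplicaDown(replica_id)
        return f"result for {request!r} from replica {replica_id}"

    def fault_tolerant_query(request, replicas, retries=3, backoff_s=0.05):
        # Try several replicas; return a degraded ("good enough") answer if all attempts fail.
        for _ in range(retries):
            try:
                return query_replica(random.choice(replicas), request)
            except ReplicaDown:
                time.sleep(backoff_s)               # back off, then pick another replica
        return f"partial/cached result for {request!r}"

    print(fault_tolerant_query("search: warehouse scale", replicas=list(range(5))))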
  48. 48. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Server, storage architecture at internet scale  Internet scale server, storage architecture fundamental assumptions: – Distributed aggregation of data – Storage functionality is in software on the server – Time to Market is everything • Breakage = “OK” if I can insulate it from the user – Affordability is everything – Use open source software wherever possible – Expect that something somewhere in the infrastructure will always be broken – Infrastructure is designed top-to-bottom to address this  All other criteria are driven off of these Storage criteria: Cost Extreme: - Scale - Parallelism - Performance - Real time - Time to Market
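Because the slide says storage functionality lives in software on the server, here is a minimal, hedged sketch of that idea (illustrative only; not GFS, HDFS, or any real system): durability comes from application-level replication across several commodity servers, so a broken node can simply be ignored or repaired later. Class and method names are invented.

    import random

    class ReplicatedStore:
        def __init__(self, num_servers=6, replicas=3):
            self.servers = [dict() for _ in range(num_servers)]   # each dict stands in for one server's local disk
            self.replicas = replicas

        def put(self, key, value):
            # Write the value to `replicas` distinct servers chosen at random.
            for server in random.sample(self.servers, self.replicas):
                server[key] = value

        def get(self, key):
            # Read from any replica that still has the data; tolerate broken servers.
            for server in self.servers:
                if key in server:
                    return server[key]
            raise KeyError(key)

    store = ReplicatedStore()
    store.put("chunk-0001", b"web crawl data ...")
    store.servers[0].clear()                        # simulate a failed / wiped server
    print(store.get("chunk-0001"))                  # still readable from another replica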
  49. 49. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 1. Google File System Architecture – GFS II 2. Google Database - Bigtable 3. Google Computation - MapReduce 4. Google Scheduling - GWQ To meet this workload, typical internet-scale software stack 2003 - 2008 The OS or HW doesn’t do any of the above Reliability, redundancy all in the “application stack”
  50. 50. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Distributed Execution Overview in a typical internet-scale workflow (diagram: the user program forks a master and workers; the master assigns map tasks over input data splits 0-2 and assigns reduce tasks; map workers read their splits and write intermediate results locally; reduce workers do remote reads and sorts, then write Output File 0 and Output File 1; this runs across 10s of thousands of servers) Technologies such as Hadoop and MapReduce
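Below is a toy, single-process walk-through of the map, shuffle/sort, and reduce phases sketched in the diagram, offered as an illustration rather than real framework code: actual MapReduce or Hadoop runs these phases across thousands of workers, with a master handling scheduling, data locality, and failures. The word-count job and the split contents are invented.

    from collections import defaultdict

    def map_phase(split):
        # Map: emit (key, value) pairs for one input split (here: word counts).
        for line in split:
            for word in line.split():
                yield word.lower(), 1

    def shuffle(mapped_pairs):
        # Shuffle/sort: group intermediate values by key ("remote read, sort" in the diagram).
        groups = defaultdict(list)
        for key, value in mapped_pairs:
            groups[key].append(value)
        return groups

    def reduce_phase(groups):
        # Reduce: aggregate each key's values; each partition becomes one output file.
        return {key: sum(values) for key, values in groups.items()}

    splits = [["the data center as a computer"],
              ["the warehouse scale computer"]]
    mapped = [pair for split in splits for pair in map_phase(split)]
    print(reduce_phase(shuffle(mapped)))            # e.g. {'the': 2, 'computer': 2, ...}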
  51. 51. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Internet-scale IT infrastructure InputfromtheInternet Yourcustomers End Result: Each red block is an inexpensive server = plenty of power for its portion of workflow
  52. 52. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Warehouse Scale Computer programmer productivity framework example  Hadoop – Overall name of software stack  HDFS – Hadoop Distributed File System  MapReduce – Software compute framework • Map = queries • Reduce = aggregates answers  Hive – Hadoop-based data warehouse  Pig – Hadoop-based language  HBase – Non-relational database for fast lookups  Flume – Populate Hadoop with data  Oozie – Workflow processing system  Whirr – Libraries to spin up Hadoop on Amazon EC2, Rackspace, etc.  Avro – Data serialization  Mahout – Data mining  Sqoop – Connectivity to non-Hadoop data stores  BigTop – Packaging / interop of all Hadoop components http://wikibon.org/wiki/v/Big_Data:_Hadoop%2C_Business_Analytics_and_Beyond
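As one concrete way these pieces fit together, here is a hedged Hadoop Streaming-style sketch: two small stdin/stdout Python functions of the kind that HDFS plus the MapReduce framework can fan out across a cluster. The jar path and HDFS paths in the comment are placeholders, not fixed values, and in practice the mapper and reducer would be two separate scripts.

    #!/usr/bin/env python3
    # Illustrative submission (paths are placeholders, exact jar location varies by distribution):
    #   hadoop jar hadoop-streaming.jar -input /logs/in -output /logs/out \
    #       -mapper mapper.py -reducer reducer.py -files mapper.py,reducer.py
    import sys

    def run_mapper():
        # Emit "key<TAB>1" per token; the framework sorts by key before the reduce phase.
        for line in sys.stdin:
            for word in line.split():
                print(f"{word.lower()}\t1")

    def run_reducer():
        # Keys arrive sorted, so counts for each key can be summed in a single pass.
        current, total = None, 0
        for line in sys.stdin:
            key, value = line.rstrip("\n").split("\t", 1)
            if key != current and current is not None:
                print(f"{current}\t{total}")
                total = 0
            current = key
            total += int(value)
        if current is not None:
            print(f"{current}\t{total}")

    if __name__ == "__main__":
        role = sys.argv[1] if len(sys.argv) > 1 else "map"
        run_mapper() if role == "map" else run_reducer()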
  53. 53. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 So the real question is:  If we run these immense scale Internet-style workloads  And: – The Internet-sized workload is too large for even maximum size tightly-clustered memory-shared SMP – Therefore, workload runs on clusters of 1000s, 10000s of servers • With their corresponding PBs storage, network, power, cooling, software  Given this workload, what is most cost-effective hardware?  We compare many high-end servers vs. thousands of commodity servers This is the REAL question For Internet Scale Data Centers
  54. 54. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 TPC-C Benchmark: High-end SMP vs. low-end PC-class server  Low-end server TPC-C is 4x less expensive  If we exclude storage costs, low-end server advantage jumps to 12x cheaper. This is meaningful. 4x less 12x less “The Data Center as a Computer: Introduction to Warehouse Scale Computing”, table 3-1, p. 32 Barroso, Holzle http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006 (from late 2007)
  55. 55. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Then, compare execution time of parallel tasks at 3 levels of communication intensity “The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 3-1, p.34 Barroso, Holzle http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006 Traditional IT high inter-communication workload = high end SMP has high inter-process overhead So what would happen if we increased number of nodes 130x? Internet light intercommunication workload = small performance degradation Past 8 nodes, little additional penalty for increasing cluster size
  56. 56. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Performance difference: Internet workload high-end vs. low-end server “The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 3-2, p.35 Barroso, Holzle http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006 Note how quickly the performance advantage of the high-end SMP diminishes as cluster size increases At > 2000 cores, 512 low-end servers are within 5% of 16 high-end servers: 12x cost savings at a 5% difference Bottom line: whenever an Internet workload is involved (which is too large for any single high-end server cluster) we do need to think differently about it That’s why commodity-class servers are used for light-communication Internet-scale workloads
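A quick back-of-envelope check of the claim above, using only the numbers quoted on these slides (the roughly 12x hardware cost advantage and the roughly 5% performance gap at more than 2,000 cores); treat it as arithmetic on the slide's figures, not as new data.

    # Figures as quoted: low-end cluster ~12x cheaper (storage excluded),
    # and within ~5% of the high-end cluster's performance at large scale.
    cost_advantage_low_end = 12.0
    perf_low_vs_high = 0.95

    # Performance per unit cost, normalized to the high-end SMP cluster = 1.0
    perf_per_cost_low_end = perf_low_vs_high * cost_advantage_low_end
    print(f"low-end cluster: ~{perf_per_cost_low_end:.1f}x the performance per dollar")
    # ~11.4x better price/performance for light-communication Internet workloads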
  57. 57. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Bottom line for Internet Scale Workloads: It makes sense to use consumer grade servers, storage For Internet-style workloads at Internet scale It makes sense to use high performance tightly coupled high-end servers If your workload has high inter-process communication (typical of traditional IT applications)
  58. 58. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Therefore, internet workload purpose-built server…. with onboard UPS  Huh?  Why an onboard UPS?  We’ll examine that next. Energy storage in the form of a UPS on each server is deployed as part of a strategy to address the biggest data center construction costs. It is about much more than riding out a power outage: the goal is to support temporary > 100% power provisioning in the data center and to ride through renewable energy lulls (lack of wind, lack of solar) Credit for these ideas: Google 2011 June talk by Luiz Andre Barroso at 2011 Federated Computing Research Conference San Jose, Calif
  59. 59. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Let’s now examine the warehouse-level data center design itself Ask yourself: What’s biggest cost-savings element?
  60. 60. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Internet Scale data center power components… Image courtesy of DLB Associates: D. Dyer, “Current trends/challenges in datacenter thermal management—a facilities perspective,”presentation at ITHERM, San Diego, CA, June 1, 2006. “The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 4-1, p.40 Barroso, Holzle http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
  61. 61. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Breakdown of data center energy overheads Image courtesy of ASHRAE “The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 5-2, p.49 Barroso, Holzle http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006 Chiller alone is 33% of the cost; UPS alone is 18% of construction cost. Physical cooling and UPS dominate the electrical power cost
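A trivial sum of the two overhead items quoted on this slide, just to make the "dominates" claim explicit; the figures are the slide's own, and everything else (generators, switchgear, the building shell) is lumped into the remainder.

    # Quick sum of the two items called out above (figures as quoted on the slide).
    overheads = {"chiller / cooling plant": 0.33, "UPS": 0.18}
    print(f"chiller + UPS alone: {sum(overheads.values()):.0%}")
    # ~51% -- which is why power and cooling dominate the cost discussion.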
  62. 62. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Biggest construction cost of an Internet Scale Data Center is Power / Cooling Facebook’s North Carolina Data Center Goes Live Facebook: Lulea, Sweden - 290K sq ft Facebook – Prineville, Oregon Has spent $1B on its data centers Open Compute Project ? Reducing power profile reduces construction cost
  63. 63. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Wow. Given that fact….. Whose data centers are most power efficient?  Reducing power profile = lowers initial CAPEX SIGNIFICANTLY  Therefore, the fundamental Internet Scale Data Center goal is:  Decrease Power Usage Effectiveness (PUE)  PUE = Total building power consumed / IT power consumed http://gigaom.com/cloud/whose-data-centers-are-more-efficient-facebooks-or-googles/
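A worked example of the PUE ratio just defined; the kilowatt figures below are invented for illustration, chosen so the results line up with the 1.14 and roughly 1.8 values cited on the next slide.

    def pue(total_building_kw, it_equipment_kw):
        # PUE = total facility power consumed / power delivered to IT equipment.
        return total_building_kw / it_equipment_kw

    print(pue(1140, 1000))   # 1.14 -> only 14% overhead for cooling, UPS, power distribution
    print(pue(1800, 1000))   # 1.80 -> 0.8 W of overhead for every 1 W of IT power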
  64. 64. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Google claims its data centers use 50% less energy than competitors  Power Usage Effectiveness – PUE=1.14 means power overhead is only 14% – Industry average is around 1.8 http://venturebeat.com/2012/03/26/google-data-centers-use-less-energy/ Industry average PUE is about 1.8 http://www.datacenterknowledge.com/archives/2011/05/10/uptime-institute-the-average-pue-is-1-8/
  65. 65. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Container modular data center: solving the Power Density issue Image courtesy of DLB Associates: D. Dyer, “Current trends/challenges in datacenter thermal management—a facilities perspective,”presentation at ITHERM, San Diego, CA, June 1, 2006. “The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 4-2, p.42 Barroso, Holzle http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006 Without specialized plenums or containerized enclosures, maximum power density of 150-200W / square foot Due to limits to how much air regular fans can push Data center can only operate a few minutes without cooling air
  66. 66. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Modular Data Center Value isn’t just time to delivery / flexibility It’s also Higher Power density = lower construction cost http://www.youtube.com/watch?v=zRwPSFpLX8I
  67. 67. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 That’s why you see such a big modern push on Container Data Centers: 7. PHOENIX ONE, PHOENIX, ARIZ. 538,000 SF 5. MICROSOFT CHICAGO DATA CENTER, Chicago 700,000 SF 2. QTS METRO DATA CENTER, ATLANTA, 990,000 SF Microsoft’s Chicago Container Data Center
  68. 68. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 State of the Modular Data Center CyrusOne 1 million sq ft “Massively Modular” data center under construction in Phoenix, Arizona I/O Modular Data Center Assembly line http://www.datacenterknowledge.com/archives/2012/05/17/cyrusone-going-massively-modular-in-phoenix/ http://www.datacenterknowledge.com/archives/2012/02/06/the-state-of-the-modular-data-center/ http://www.datacenterknowledge.com/archives/2012/01/30/inside-ios-modular-data-center-assembly-line/ The mismatch between rapid workload churn and the 10+ year data center lifespan makes modular data center characteristics a strategic possibility for new-build data centers
  69. 69. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 So given all of this How do I put it all together In a Warehouse Scale Computer?
  70. 70. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Google’s Machinery as result of all these factors: 70
  71. 71. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Architectural view of Google server and storage hierarchy 71
  72. 72. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Clusters through the years “Google” Circa 1997 (google.stanford.edu) Google (circa 1999) 72
  73. 73. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Google Data Center (Circa 2000) Clusters through the years Google (new data center 2001) 3 days later 73
  74. 74. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Recent Google Design • In-house rack design • PC-class motherboards • Low-end storage and networking hardware • Linux • + in-house software 74
  75. 75. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Container Datacenter 75 Run container hotter than normal human comfort temperature = big cost savings
  76. 76. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Google Container Datacenter 76
  77. 77. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Google: The Dalles, Oregon internet scale data center 77 Google Data Center – The Dalles, Oregon
  78. 78. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Google Data Centers in 2008:
  79. 79. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Google Data Center CAPEX worldwide  Capital expenditures on datacenters: – 1Q12: USD$ 607M – 2011: USD$ 3.4B – 2010: USD$ 4.0B – 2009: USD$ 809M Each data center between $200M and $600M The Dalles, Oregon
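Rough arithmetic on the figures quoted above (annual CAPEX divided by the stated $200M to $600M per-site range) gives an order-of-magnitude feel for how many data-center-sized builds that spending represents; illustrative only, since the reported spending also covers servers, networking, and other infrastructure.

    capex_2011 = 3.4e9                      # 2011 figure quoted on the slide
    per_site_low, per_site_high = 200e6, 600e6
    print(f"2011 spend ~ {capex_2011 / per_site_high:.0f} to {capex_2011 / per_site_low:.0f} data-center-sized builds")
    # roughly 6 to 17 -- a rough feel for scale, not an actual count of sites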
  80. 80. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 And that…. is what today’s Internet Scale Data Center looks like
  81. 81. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 What will a European version of Internet Scale Cloud look like?  Data protection situation still evolving  Europe is Europe – Languages, culture, currencies  Cloud adoption will be very different country to country  Regardless, interest is Hot, Hot, Hot – 2012: 73% of companies moving to some sort of cloud – 2012: 55% moving to a private cloud  I believe Europe will adopt best of what’s already been done elsewhere – In a uniquely European flavor http://gigaom.com/cloud/will-there-be-an-amazon-of-europe/ http://gigaom.com/cloud/ec-cloud-plan-addresses-data-protection-problem-sort-of/ http://gigaom.com/cloud/5-things-you-need-to-know-about-cloud-in-europe/ http://gigaom.com/europe/dont-worry-europe-youre-about-to-get-a-new-beginning/
  82. 82. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Apply lessons from today to Traditional IT as best possible Source: Egan Ford, IBM Distinguished Engineer, OpenStack presentation: http://xmission.com/~egan/cloud/ Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
  83. 83. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Read all about it: Google published this information into the public domain in 2009  By Google: – Luiz Andre Barroso – Urs Holzle  Available to all, free of charge Download a copy at: http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006 Video of Luiz giving one of these lectures: http://inst-tech.engin.umich.edu/leccap/view/cse-dls-08/4903 http://www.barroso.org/
  84. 84. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Let’s review our plans To meld / meet / build readiness
  85. 85. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 To successfully co-exist / thrive with new generation workloads  Understand the Internet Scale Data Center / workload environment / lessons – End to end discovery, monitoring, operational automation – Differentiate between traditional IT and internet-scale workloads • For these two categories, architect IT systems accordingly – Essential role of power efficiency on CAPEX for new data center costs $/server Traditional IT Data Warehouse Big Data Internet scale Views new generation as a powerful partner Traditional IT workloads Internet scale warehouse computer Views traditional IT as a powerful enabler  Understand and innovate using these principles within your environment: – Be viewed as a powerful partner and enabler of these future directions – Architect now how you wish your platform, people, and infrastructure to grow along these lines – Begin evolving and building skills now  Review attached learning points
  86. 86. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Today, our users all have many non-traditional IT alternatives Traditional IT:  Traditional IT platforms  Traditional IT vendors  Non-traditional alternatives: – The Cloud, the Developing World What will the effect be on your IT infrastructure, skill sets, and business models?
  87. 87. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Other observations  Think larger than technology  Watch the business models, learn and apply  My additional presentation: “ Disruptive Innovation in Modern IT World” – http://www.slideshare.net/johnsing1/a-india-csii2012disruptiveinnovationinthemodernitworldv3plenarypresentation  Keeping up with it all: – Necessary today: first thing every day, 1 hour of industry study, to keep up – Then share via your own digital footprint • A job skill necessity for today’s world – Social network, personal exploitation of modern smart devices and tools – See appendix for resources  Endless possibilities!!  I believe you would know better than I where to apply yourself $/server Traditional IT Data Warehouse Big Data Internet scale Current IT processor req’d linear with workload Internet scale, warehouse scale computer New gen workloads Exascale datacenter  Massive parallelism  Flexible system optimization Workload Optimized Systems
  88. 88. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Summary  Computation is inevitably moving into Warehouse Scale Computing supporting The Cloud – IT Architects, now and in the near future, must be aware of and capable of exploiting Internet Data Center Design and Workload experiences to best design the systems of the future – When the workload is true internet scale, it will require the physical and economic mechanisms at play in a Warehouse Scale Computer This session is the author’s research compilation. Great thanks to Google for the seminal work on which this lecture is based. Download a copy at: http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006  Final comments – While WSCs at one level are simple: a few tens of thousands of servers and a LAN… – In reality, building a cost-effective, massive warehouse scale computing platform with the necessary reliability, programming productivity, and energy cost effectiveness is as difficult, and as exciting and stimulating an opportunity, as IT has ever seen. – The authors of “Intro to Warehouse Scale Computing” hope that this information will stimulate IT staff and scientists to understand this new area – In the years to come, our collective understanding and efforts will solve many fascinating problems and expand the benefits to humanity arising from warehouse scale computer systems.
  89. 89. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Evaluations and chart downloads are online  http://www.ibmtechu.com/hu sGE06
  90. 90. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Together, let’s build a Smarter Planet
  91. 91. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Learning Points
  92. 92. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Learning Points - 1  Rising bandwidth worldwide enables the web services delivery model (“The Cloud”)  The Cloud runs in massive data centers with well-connected commodity processors, storage – With homogeneous applications amortized across an internet scale user population  These data centers are a different class of large-scale computing machines called: – Warehouse scale computers (WSC) – With huge PB data volumes – Running the easily parallelized, high workload churn, homogeneous platform, fault-tolerant clustered software stack  Understanding this class of machines is important because multi-core processor advances will, within just a few years, make even modest-sized computing systems approach the behavior of today’s Warehouse Scale Computer
  93. 93. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Learning points - 2  Building block of choice for WSC is: – Commodity server-class processors, consumer/enterprise grade disk drives, Ethernet-based networking – Because the internet workload characteristics include easy parallelization  Fault-tolerant software stack mitigates continuous failure rate – Of 10,000s / 100,000s of hardware and software components in WSC – Programmer software stack provides the tools to cost-effectively, time-effectively program this highly clustered environment – Redundancy in application-level software eliminates the need for redundancy in OS or storage  Software development differs from traditional IT: – Ample parallelism: • Internet users have a high degree of independence from each other – High workload churn: • Release cycles measured in days and weeks – Platform homogeneity: • Single organization owns / has technical capability / runs WSC end to end – Fault-tolerant software: • Makes feasible continuous recovery mode operation of servers / components w/o user application impact
  94. 94. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Learning Points – 3 – Economics / Cost  80% of construction cost of data center due to amount of power and cooling required  Maximizing Power Usage Efficiency is therefore paramount – To reducing capital expenditure as well as operating expenditure – Target PUE = 1.2  Modular Container Data Center architecture is popular: – Mainly because it increases the Power Profile / Power Density – Which in turn significantly reduces the data center construction cost – In addition provides flexibility, much faster time to delivery – Finally, is important tool to help address mismatch between Internet-scale hyper- workload churn and 10+ year data center lifespan – Modular Container Data Center architecture has considerable merit for any organization with scale requirements
  95. 95. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Learning Points – 4 – Key Challenges / Opportunities Ahead  Rapidly changing workload – New applications / innovations gain popularity at very fast pace – Often exhibit disruptive workload characteristics (YouTube example) – WSC architecture, container data centers are best practices to cost-effectively best position / adapt the organization – To disruptive business innovations over the 10+ year lifespan of physical WSC structure  Building balanced systems from imbalanced components – Multi-core processors continue to get faster, become more energy efficient – Memory, disk storage, networking gear not evolving at same pace in either performance or energy efficiency – Research / innovation must shift to these subsystems else further increases in processor power will not be able to provide further WSC improvement
  96. 96. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Learning Points – 5  Curbing energy cost – We must continue to find ways to ensure performance improvements are accompanied by corresponding energy efficiency improvements – Otherwise scarce future energy budget will increasingly curb growth in computing capabilities  Internet-style workloads – Future performance gains will be delivered by more multi-cores, not clock speed – Future large scale systems will continue to increasingly exhibit characteristics of today’s Internet Scale Data Centers and Workloads
  97. 97. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Supplemental Resources
  98. 98. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Energy Proportionality
  99. 99. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Activity profile 5,000 Google servers over period of 6 months “The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 5-4, p.55 Barroso, Holzle http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006 Majority of time, server utilization is in 10-50% range Obviously some opportunity to increase processor util. % The real question: how much power / cooling did I have to pay for in this data center to run these idling servers?
  100. 100. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 SPECpower_ssj2008: traditional IT servers consume nearly 70% of peak power even when idling! “The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 5-3, p.53 Barroso, Holzle http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006 The server consumes 68% of its peak power requirement when idling; i.e. at 30% utilization we’re using about 75% of max power. That’s a lot: even though the server spends most of its time below 50% utilization, I’m paying roughly 70% of the energy cost
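A hedged way to see why this hurts: the simple linear power model below (an approximation, not the measured SPECpower curve) takes idle power as 68% of peak, per the figure cited above, and shows how little the draw falls at typical utilizations.

    def server_power_fraction(utilization, idle_fraction=0.68):
        # Fraction of peak power drawn at a given utilization, assuming a linear ramp.
        return idle_fraction + (1.0 - idle_fraction) * utilization

    for u in (0.0, 0.3, 0.5, 1.0):
        print(f"{u:.0%} utilization -> {server_power_fraction(u):.0%} of peak power")
    # 30% utilization -> ~78% of peak under this linear model, close to the ~75%
    # read off the measured curve on the slide.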
  101. 101. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Energy Proportionality: 2011 servers have gotten a lot better. This gap = excessive data center construction cost; closing this gap makes the construction costs of an internet scale data center feasible and enables the warehouse scale computer to be affordable. Credit for all these ideas is to Google 2011 June talk by Luiz Andre Barroso at 2011 Federated Computing Research Conference San Jose, Calif
  102. 102. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 But to fully exploit Energy Proportionality….. Rest of IT infrastructure needs to catch up  Servers today: 3x – Have improved greatly since 2008 But: – Currently little/no focus on energy proportionality in:  Networking equip: 1.2x  Storage: 1.3x – Hard to do because we’re spinning the disk drive constantly – Spinning drives -> flash? Dynamic Range Bigger is better Means uses nearly same power whether it’s idle or fully utilized Affects data center construction costs
  103. 103. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Is Internet Scale = High Performance Computing (HPC)? No  HPC clusters: 1. Recovery model: OK to pause entire job, restart computation from checkpoint 2. Requires significant CPU supporting – large numbers of synchronized tasks – which intensely communicate 3. Typically single binary, single job on 1000s of nodes 4. CPUs may run at 100% for days/weeks 5. Building block of choice: high perf, high avail high-end SMPs with high shared memory interconnect bandwidth for intense inter-process communication Warehouse scale computers: 1. Recovery model: gracefully tolerate large #s component faults – operating near-continuous recovery state 2. Requires significant CPU but individual tasks less synchronized – Little or no inter-communications – Internet workload = ample parallelism 3. Diverse set of applications – Hyper-pace workload churn / release cycles 4. CPU utilization varies, rarely 90% due to need to reserve capacity for Internet spikes or to cover failed cluster components 5. Building blocks of choice: commodity server-class machines with direct attached disk drives, Ethernet-based interconnect
  104. 104. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Resources for your Internet Scale Workload and Data Center journey
  105. 105. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 How to get ahead and thrive in this new world?  2012: devote 1st hour of day to keeping current –No longer optional  Establish power-knowledge digital footprint, intelligently sharing what you find –Don’t email what you find (too much email already) –Use social networking, social bookmarking, blogs, etc  Become a power user of your smartphone’s ecosystem
  106. 106. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Keeping Current Using John Sing’s bookmarks   You may use me as one source of ‘who/what to follow’ – Connect with me: http://www.linkedin.com/in/johnsing  External: my daily list of social bookmarks is: – http://delicious.com/atsf_arizona  IBM colleagues may see my IBM Intranet webpage: – http://snjgsa.ibm.com/~singj/  IBM Colleagues can see my IBM SONAS intranet web page – http://snjgsa.ibm.com/~singj/public/sonas_index.html  singj@us.ibm.com
  107. 107. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 See video reviewing the Space-Time-Travel example by IBM Distinguished Engineer (SWG) on Big Data – superb insight into Big Data  http://gigaom.com/2011/03/23/jeff-jonas-ibm/  Jeff Jonas/Las Vegas/IBM
  108. 108. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Keeping current – More places to connect Find out what your colleagues are doing https://www.facebook.com/pages/IBM-NAS/156301741086498 https://www.facebook.com/IBMRedbooks https://www.facebook.com/peopleforasmarterplanet http://storagecommunity.org/ https://www.ibm.com/developerworks/mydeveloperworks/blogs/InsideSystemStorage/?lang=en
  109. 109. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Learn what’s out there: McKinsey Global Report on Big Data  A seminal work on this fast-evolving, critically important technology.  While 153 pages long - if you understand the content of this presentation and realize that Big Data is insanely important to future IT viability and skills - this paper gives superb, concrete, well-substantiated ideas on what Big Data is being used for today, as we speak, to create the business models of the future  You may download a copy here: http://www.mckinsey.com/mgi/publications/big_data/index.asp
  110. 110. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Forbes Sept 2011: Impact of Social Media on Corporate  http://www.forbes.com/sites/techonomy/2011/09/07/social-power-and-the-coming-corporate-revolution/
  111. 111. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 http://www.theregister.co.uk/hardware/storage/
  112. 112. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 http://gigaom.com/
  113. 113. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 http://www.datacenterknowledge.com/special-report-the-worlds-largest-data-centers/ Develop your list of daily reading and updating…
  114. 114. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Here’s one view of the world’s largest data center:  Questions:  Do you know where the largest data centers are?  Are we tracking what they do, and why?  We could, we should! http://www.datacenterknowledge.com/special-report-the-worlds-largest-data-centers/worlds-largest-data-center-350-e-cermak/
  115. 115. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 http://www.datacenterknowledge.com/archives/2009/05/14/whos-got-the-most-we
  116. 116. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 All about the Hadoop Distributed File System (open source)  http://hadoop.apache.org/hdfs/
  117. 117. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 http://www.highscalability.com Of particular interest is the “Real Life Architectures” tab
  118. 118. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Hugely important, keep inspiring ourselves – one of my favorites:  http://www.ted.com/ - superb world class non-profit dedicated to Ideas Worth Spreading in technology, entertainment, design
  119. 119. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 http://www.slideshare.net  Search on the topic that you’re researching – Competitors in particular  Find a fantastic number of downloadable presentations – Some are better than others, but you’ll quickly learn to sift and find great quality for yourself
  120. 120. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Recommend you download and read this very informative IBM book  “Understanding Big Data” – Published April 2012 – Free download – Well worth reading to understand the components of Big Data, and how to exploit them  Part 1: The Big Deal about Big Data – Chapter 1 – What is Big Data? Hint: You’re a Part of it Every Day – Chapter 2 – Why Big Data is Important – Chapter 3 – Why IBM for Big Data  Part II: Big Data: From the Technology Perspective – Chapter 4 - All About Hadoop: The Big Data Lingo Chapter – Chapter 5 – IBM InfoSphere BigInsights – Analytics for “At Rest” Big Data – Chapter 6 – IBM InfoSphere Streams – Analytics for “In Motion” Big Data http://public.dhe.ibm.com/common/ssi/ecm/en/iml14297usen/IML14297USEN.PDF Download your free copy here
  121. 121. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Interested in reading more about Competitive Advantage Analytics-based applications? Easy-to-read pages in this IBM book:  Download it (3.8 MB Acrobat Reader file) at:  ftp://ftp.software.ibm.com/common/ssi/pm/bk/n/imm14055usen/IMM14055USEN.PDF User Interface Layer Dashboards, Mashups, Search, Ad hoc reporting, Spreadsheets Analytic Process Layer Real-time computing and analysis, stream computing, entity analytics, data mining, content management, text analytics, etc. Infrastructure Layer Virtualization, central end-to-end management, control, data proximity, deployment on a global virtual file server with geographically dispersed storage Security authorization Location of customer competitive advantage applications This book defines everything you need to know about modern Competitive Advantage analytics applications. Interesting reading. If you need a quick overview of modern Analytics IT capability, start on page 14 and read through page 48.
  122. 122. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 IBM Smarter Planet Big Data website  http://www-03.ibm.com/systems/data/flash/smartercomputing/bigdata.html
  123. 123. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 IBM Software Group Big Data web site  http://www-01.ibm.com/software/data/bigdata/
  124. 124. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Learning Points: Ten Big Data Realities  Here are the first ten points that I want you to think about when you’re grokking big data:  Oracle is not big data  Big data is not a traditional Relational Database Management System (RDBMS)  Big data is not an Exadata  Big data is not an EMC VMAX  Big data is not highly structured  Big data is not centralized  IT people are not driving big data initiatives  Big data is not a pipe dream – big data initiatives are adding consumer and business value today. Right now. Every second of every minute of every hour of every day.  Big data has meaning to the enterprise  Data is the next source of competitive advantage in the technology business. Source: David Vellante, 1Q2011 Source: Wikibon.org, March 1, 2011 public broadcast on “Big Data”, http://wikibon.org/blog/ten-%E2%80%9Cbig-data%E2%80%9D-realities-and-what-they-mean-to-you/
  125. 125. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Learning points: What does Big Data mean to IT infrastructure professionals?  Big data means the amount of data you’re working with today will look trivial within five years.  Huge amounts of data will be kept longer and have way more value than today’s archived data.  Business people will covet a new breed of alpha geeks. You will need new skills around data science, new types of programming, more math and statistics skills and data hackers…lots of data hackers.  You are going to have to develop new techniques to access, secure, move, analyze, process, visualize and enhance data; in near real time.  You will be minimizing data movement wherever possible by moving function to the data instead of data to function. You will be leveraging or inventing specialized capabilities to do certain types of processing – e.g. early recognition of images or content types – so you can do some processing close to the head.  The cloud will become the compute and storage platform for big data, which will be populated by mobile devices and social networks.  Metadata management will become increasingly important.  You will have opportunities to separate data from applications and create new data products.  You will need orders-of-magnitude cheaper infrastructure that emphasizes bandwidth, not IOPS – and data movement with efficient metadata management.  You will realize sooner or later that data and your ability to exploit it is going to change your business, social and personal life; permanently. Source: David Vellante, 1Q2011 Source: Wikibon.org, March 1, 2011 public broadcast on “Big Data”, http://wikibon.org/blog/ten-%E2%80%9Cbig-data%E2%80%9D-realities-and-what-they-mean-to-you/
  126. 126. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 More information for my IBM colleagues - read transcript of Big Data Overview  http://snjgsa.ibm.com/~singj/public/2011_Big_Data_Modern_Analytics_Tutorial/
  127. 127. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 More information for my IBM colleagues  Below is the 2012 IBM Research Global Technology Outlook: – https://w3-connections.ibm.com/wikis/home?lang=en_US#/wiki/Wd99c91e6c090_42d6_bbef_  Below is the IBM Research Global Technology Outlook 2011 Overview, which includes our first discussions of Big Data: – http://snjgsa.ibm.com/~singj/public/2011_Prague_IBM_Systems_Conference/STG%20Tech%20Conference%20GTO%202011%20from  See all the IBM Research Global Technology Outlook 2011 charts at: – http://w3.ibm.com/articles/workingknowledge/2010/12/res_gto_2011.html
  128. 128. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Internet Scale Data Centers  A different scale and set of server, storage and facility economics  Implies where our own strategies, skill sets, and architectures can expand: – With additional styles of thinking and architecture – If you think 2012 is growing fast, it’s going to take off even more in 2013 – Many resources in the appendix to these charts  We are all at an inflection point Growth areas Traditional areas
  129. 129. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Thank You
  130. 130. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Appendix  The following charts are from Budapest session xCL01, by Egan Ford, IBM Distinguished Engineer, System x / STG Cloud Strategy  It is his research on this same topic of Internet Scale Workloads and Data Center Design
  131. 131. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Today: two different types of IT Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/ Internet scale workloads Transactional IT
  132. 132. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Today’s two major IT workload types Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/ Transactional IT Internet scale workloads
  133. 133. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 How to build these two different clouds Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/ Transactional IT Internet scale workloads
  134. 134. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 What You (Consumer) Get with These Clouds: Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
  135. 135. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Policy-based Clouds and Design-for-fail Clouds are purpose-optimized Infrastructure Management solutions  Policy-based Clouds • Purpose-optimized for longer-lived virtual machines managed by a Server Administrator • Centralizes enterprise server virtualization administration tasks • High degree of flexibility, designed to accommodate virtualization of all workloads • Significant focus on managing availability and QoS for long-lived workloads with a level of isolation • Characteristics derived from exploiting enterprise-class hardware • Legacy applications  Design-for-fail Clouds • Purpose-optimized for shorter-term virtual machines managed via end-user or automated process • Decentralized control, embraces eventual consistency, focus on making “good enough” decisions • High degree of standardization • Significant focus on ensuring availability of the control plane • Characteristics driven by software • New applications
  136. 136. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Some OpenStack Public Use Cases • Internap – http://www.internap.com/press-release/internap-announces-world%E2%80%99s-first- commercially-available-openstack-cloud-compute-service/ • Rackspace Cloud Servers, Powered by OpenStack – http://www.rackspace.com/blog/rackspace-cloud-servers-powered-by-openstack-beta/ • Deutsche Telekom – http://www.telekom.com/media/media-kits/104982 • AT&T – http://arstechnica.com/business/news/2012/01/att-joins-openstack-as-it-launches-cloud- for-developers.ars • MercadoLibre – http://openstack.org/user-stories/mercadolibre-inc/mercadolibre-s-bid-for-cloud- automation/ • NeCTAR – http://nectar.org.au/ • San Diego Supercomputing Center – http://openstack.org/user-stories/sdsc/
  137. 137. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 OpenStack design tenets focus on delivering essential infrastructure on an available, scalable, elastic control plane • Sources: http://www.openstack.org/downloads/openstack-compute-datasheet.pdf http://wiki.openstack.org/BasicDesignTenets Basic Design Tenets 1) Scalability and elasticity are our main goals 2) Any feature that limits our main goals must be optional 3) Everything should be asynchronous. If you can't do something asynchronously, see #2 4) All required components must be horizontally scalable 5) Always use shared nothing architecture (SN) or sharding. If you can't Share nothing/shard, see #2 6) Distribute everything. Especially logic. Move logic to where state naturally exists. 7) Accept eventual consistency and use it where it is appropriate. 8) Test everything. We require tests with submitted code. (We will help you if you need it) OpenStack Leadership's vision statement “essential Infrastructure, support platform”
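As a loose illustration of tenets 3 through 6 (asynchronous work, shared-nothing/sharding, logic moved to where the state lives), here is a small, self-contained Python sketch. It is not OpenStack code; the shard count, queue, and record names are invented for the example.

```python
# Minimal sketch of shared-nothing sharding plus asynchronous, queue-based
# dispatch. Not OpenStack code; names and sizes are invented.
import hashlib
import queue
import threading

SHARDS = [dict() for _ in range(4)]           # independent stores, nothing shared

def shard_for(key):
    """Route a key to exactly one shard -- no cross-shard coordination."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return SHARDS[h % len(SHARDS)]

work = queue.Queue()                          # stands in for the message queue

def worker():
    while True:
        key, value = work.get()
        shard_for(key)[key] = value           # logic runs where the state lives
        work.task_done()

for _ in range(4):
    threading.Thread(target=worker, daemon=True).start()

for i in range(100):
    work.put(("instance-%d" % i, {"state": "ACTIVE"}))   # asynchronous submit
work.join()
print("records per shard:", [len(s) for s in SHARDS])
```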
  138. 138. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 OpenStack Source: http://ken.pepple.info/openstack/2012/09/25/openstack-folsom-architecture/
  139. 139. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 OpenStack comprises seven core projects that form a complete IaaS solution Compute (Nova) Storage (Cinder) Network (Quantum) Provision and manage virtual resources  Dashboard (Horizon) Self-service portal Image (Glance) Catalog and manage server images Identity (Keystone) Unified authentication, integrates with existing systems Object Storage (Swift) Petabytes of secure, reliable object storage IaaS Source: http://ken.pepple.info/openstack/2012/09/25/openstack-folsom-architecture/ IaaS
  140. 140. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Compute delivers a fully featured, redundant, and scalable cloud computing platform  Architecture • Sources: http://ken.pepple.info/openstack/2012/09/25/openstack-folsom-architecture/ http://openstack.org/projects/compute/ Key Capabilities: •Manage virtualized server resources • CPU/Memory/Disk/Network Interfaces •API with rate limiting and authentication •Distributed and asynchronous architecture • Massively scalable and highly available system •Live guest migration • Move running guests between physical hosts •Live VM management (Instance) • Run, reboot, suspend, resize, terminate instances •Security Groups •Role Based Access Control (RBAC) • Ensure security by user, role and project •Projects & Quotas •VNC Proxy through web browser
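A minimal sketch of driving these instance lifecycle operations (run, reboot, terminate) from Python, assuming the Essex/Folsom-era python-novaclient library; the credentials, Keystone endpoint, and image/flavor names below are placeholders, not values from the presentation.

```python
# Hedged sketch using the python-novaclient v1_1 API of that era.
# Username, password, tenant, auth URL, and image name are placeholders.
from novaclient.v1_1 import client

nova = client.Client("demo", "secret", "demo-project",
                     "http://keystone.example.com:5000/v2.0/")

image = nova.images.find(name="ubuntu-12.04")      # image registered in Glance
flavor = nova.flavors.find(name="m1.small")

# Run, inspect, reboot, and terminate an instance -- the lifecycle calls above.
server = nova.servers.create("demo-instance", image, flavor)
print(server.id, server.status)
server.reboot()
server.delete()
```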
  141. 141. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Compute management stack control plane is built on a queue and a database  Key Capabilities: • Responsible for providing the communications hub and managing data persistence • RabbitMQ is the default queue, MySQL the default database – Documented HA methods – ZeroMQ implementation available to decentralize the queue • A single “cell” (1 queue, 1 database) typically scales from 500 – 1000 physical machines – Cells can be rolled up to support larger deployments • Communications route through the queue – API requests are validated and placed on the queue – Workers listen to queues based on role or role + hostname – Responses are dispatched back through the queue
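The control-plane pattern above (services talking through a broker rather than directly to each other) can be sketched with the kombu messaging library that nova's RPC layer builds on; the broker URL and queue name here are assumptions for the example, not the presenter's configuration.

```python
# Hedged sketch of asynchronous, queue-based messaging using kombu.
# Broker URL and queue name are placeholders; requires a running RabbitMQ.
from kombu import Connection

with Connection("amqp://guest:guest@localhost//") as conn:
    q = conn.SimpleQueue("compute.node01")     # a per-service / per-host queue

    # An "API" process validates a request and drops it on the queue...
    q.put({"method": "run_instance",
           "args": {"name": "demo", "flavor": "m1.small"}})

    # ...and a worker listening on that queue picks it up asynchronously.
    msg = q.get(block=True, timeout=5)
    print("worker received:", msg.payload)
    msg.ack()                                  # acknowledge so it is not redelivered
    q.close()
```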
  142. 142. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 nova-compute manages individual hypervisors and compute nodes  Key Capabilities: • Responsible for managing all interactions with the individual endpoints providing compute resource, e.g. attach an iSCSI volume to a physical host and map it to a guest as an additional HDD • Implementations talk directly to native hypervisor APIs – Avoids abstraction layers that bring least-common-denominator support – Enables easier exploitation of hypervisor differentiators • A service instance runs on every physical compute node, which helps to minimize the failure domain • Support for security groups that define firewall rules • Support for – KVM – LXC – VMware ESX/ESXi (4.1 update 1) – Xen (XenServer 5.5, Xen Cloud Platform) – Hyper-V
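The "talk to the native API" approach can be illustrated with the libvirt Python bindings nova-compute uses for KVM. This is a hedged, stand-alone sketch (not nova code) and assumes a local qemu:///system hypervisor with the libvirt-python package installed.

```python
# Hedged sketch: query a local KVM/libvirt hypervisor the way a compute
# driver would, using the libvirt Python bindings.
import libvirt

conn = libvirt.open("qemu:///system")          # connect to the local hypervisor
for dom_id in conn.listDomainsID():            # IDs of running guests
    dom = conn.lookupByID(dom_id)
    state, max_mem, mem, vcpus, cpu_time = dom.info()
    print(dom.name(), "state=%d vcpus=%d mem=%dKiB" % (state, vcpus, mem))
conn.close()
```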
  143. 143. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 nova-scheduler allocates virtual resources to physical hardware  Key Capabilities: • Determines which physical hardware to allocate to a virtual resource • The default scheduler uses a series of filters to reduce the set of applicable hosts, then uses costing functions to provide weights • Not a focus point for OpenStack – Default implementation finds first fit – The shorter the workload lifespan, the less critical the placement decision • If the default does not work, deployers often have specific requirements and develop a custom scheduler
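The filter-then-weigh idea is easy to show in plain Python; the host list, resource numbers, and the "most free RAM wins" weigher below are invented for illustration and are not the actual nova-scheduler code.

```python
# Illustrative filter-and-weigh placement, in the style described above.
# Host data and the weighing rule are made up for the example.
hosts = [
    {"name": "node01", "free_ram_mb": 2048, "free_disk_gb": 40},
    {"name": "node02", "free_ram_mb": 8192, "free_disk_gb": 10},
    {"name": "node03", "free_ram_mb": 512,  "free_disk_gb": 200},
]
request = {"ram_mb": 1024, "disk_gb": 20}

def ram_filter(host, req):
    return host["free_ram_mb"] >= req["ram_mb"]

def disk_filter(host, req):
    return host["free_disk_gb"] >= req["disk_gb"]

# 1) Filters reduce the candidate set...
candidates = [h for h in hosts if ram_filter(h, request) and disk_filter(h, request)]

# 2) ...then a costing/weighing function ranks what is left.
best = max(candidates, key=lambda h: h["free_ram_mb"])
print("placing instance on", best["name"])
```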
  144. 144. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 nova-api supports multiple API implementations and is the entry point into the cloud  Key Capabilities: • APIs supported – OpenStack Compute API (REST-based) – Similar to RackSpace APIs – EC2 API (subset) – Can be excluded – Admin API (nova-manage) • Robust extensions mechanism to add new capabilities
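A hedged sketch of one call against the REST-based OpenStack Compute API that nova-api exposes; the endpoint URL, tenant ID, token, and image/flavor references are placeholders you would take from your own Keystone service catalog.

```python
# Hedged sketch: boot a server through the OpenStack Compute (v2) REST API.
# Endpoint, tenant ID, token, and image/flavor references are placeholders.
import json
import requests

endpoint = "http://nova.example.com:8774/v2/TENANT_ID"
headers = {"X-Auth-Token": "TOKEN_FROM_KEYSTONE",
           "Content-Type": "application/json"}

body = {"server": {"name": "api-demo",
                   "imageRef": "IMAGE_UUID",
                   "flavorRef": "1"}}          # flavor id, e.g. m1.tiny

resp = requests.post(endpoint + "/servers", headers=headers,
                     data=json.dumps(body))
print(resp.status_code, resp.json()["server"]["id"])
```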
  145. 145. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Network automates management of networks and attachments (network connectivity as a service) Key Capabilities: •Responsible for managing networks, ports, and attachments on infrastructure for virtual resources •Create/delete tenant-specific L2 networks •L3 support (Floating IPs, DHCP, routing) •Moving to L4 and above in Grizzly •Attach / detach host to network •Similar to dynamic VLAN support •Support for – Open vSwitch – OpenFlow (NEC & Floodlight controllers) – Cisco Nexus – Nicira  » Architecture
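Creating a tenant network and giving it a subnet through the Quantum v2.0 REST API can be sketched as below; the endpoint, token, and CIDR are placeholder values.

```python
# Hedged sketch against the Quantum (later Neutron) v2.0 REST API.
# Endpoint, token, and addressing are placeholders.
import json
import requests

endpoint = "http://quantum.example.com:9696/v2.0"
headers = {"X-Auth-Token": "TOKEN_FROM_KEYSTONE",
           "Content-Type": "application/json"}

# Create a tenant-specific L2 network...
net_body = {"network": {"name": "demo-net", "admin_state_up": True}}
net = requests.post(endpoint + "/networks", headers=headers,
                    data=json.dumps(net_body)).json()["network"]

# ...then give it an L3 subnet with DHCP-managed addressing.
subnet_body = {"subnet": {"network_id": net["id"],
                          "ip_version": 4,
                          "cidr": "192.168.100.0/24"}}
subnet = requests.post(endpoint + "/subnets", headers=headers,
                       data=json.dumps(subnet_body)).json()["subnet"]
print("network", net["id"], "subnet", subnet["id"])
```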
  146. 146. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Cinder manages block-based storage, enables persistent storage  Key Capabilities: • Responsible for managing the lifecycle of volumes and exposing them for attachment • Structure is a copy of Compute (Nova), sharing the same characteristics and structure in the API server, scheduler, etc. • Enables additional attached persistent block storage for virtual machines • Support for booting virtual machines from nova-volume backed storage • Allows multiple volumes to be attached per virtual machine • Supports the following – iSCSI – RADOS block devices (e.g. Ceph distributed file system) – Sheepdog – Zadara Architecture
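A hedged sketch of the volume lifecycle described above, using the Cinder v1 REST API to create a volume and the Compute API to attach it to a running instance; all endpoints, IDs, and the /dev/vdb device name are placeholders.

```python
# Hedged sketch: create a Cinder volume, then attach it to an instance.
# Endpoints, tenant/server IDs, and the device name are placeholders.
import json
import requests

headers = {"X-Auth-Token": "TOKEN_FROM_KEYSTONE",
           "Content-Type": "application/json"}

# 1) Ask Cinder (v1 API) for a 1 GB persistent volume.
cinder = "http://cinder.example.com:8776/v1/TENANT_ID"
vol_body = {"volume": {"size": 1, "display_name": "demo-vol"}}
vol = requests.post(cinder + "/volumes", headers=headers,
                    data=json.dumps(vol_body)).json()["volume"]

# 2) Once it reports 'available', attach it via the Compute API.
nova = "http://nova.example.com:8774/v2/TENANT_ID"
attach_body = {"volumeAttachment": {"volumeId": vol["id"], "device": "/dev/vdb"}}
requests.post(nova + "/servers/SERVER_ID/os-volume_attachments",
              headers=headers, data=json.dumps(attach_body))
print("attached volume", vol["id"], "as /dev/vdb")
```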
  147. 147. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Identity service offers a unified, project-wide identity, token, service catalog, and policy service designed to integrate with existing systems  Key Capabilities: • Identity service provides auth credential validation and data about Users, Tenants and Roles • Token service validates and manages tokens used to authenticate requests after initial credential verification • Catalog service provides an endpoint registry used for endpoint discovery • Policy service provides a rule-based authorization engine and the associated rule management interface • Each service can be configured to serve data from a pluggable backend – Key-Value, SQL, PAM, LDAP, Templates • REST-based APIs
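The initial credential verification and token issue can be sketched against the Keystone v2.0 REST API of that release; the endpoint, tenant, and credentials below are placeholders.

```python
# Hedged sketch of Keystone v2.0 authentication: exchange credentials for a
# token and read the service catalog. Endpoint and credentials are placeholders.
import json
import requests

keystone = "http://keystone.example.com:5000/v2.0"
auth_body = {"auth": {"tenantName": "demo",
                      "passwordCredentials": {"username": "demo",
                                              "password": "secret"}}}

resp = requests.post(keystone + "/tokens",
                     headers={"Content-Type": "application/json"},
                     data=json.dumps(auth_body)).json()

token = resp["access"]["token"]["id"]          # pass as X-Auth-Token to other services
catalog = resp["access"]["serviceCatalog"]     # endpoint registry for discovery
print("token:", token[:8] + "...", "services:", [s["type"] for s in catalog])
```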
  148. 148. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Image service provides basic discovery, registration, and delivery services for virtual disk images  Key Capabilities: • Think Image Registry, not Image Repository • REST-based APIs • Query for information on public and private disk images • Register new disk images • Disk images can be stored in and delivered from a variety of stores (e.g. SoNFS, Swift) • Supported formats – Raw – Machine (a.k.a. AMI) – VHD (Hyper-V) – VDI (VirtualBox) – qcow2 (Qemu/KVM) – VMDK (VMWare) – OVF (VMWare, others) References http://openstack.org/projects/image-service/
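Querying the registry for image metadata (discovery, not delivery of the image bits) can be sketched with the Glance v1 REST API; the endpoint and token are placeholders.

```python
# Hedged sketch: list image metadata from the Glance v1 registry API.
# Endpoint and token are placeholders.
import requests

glance = "http://glance.example.com:9292/v1"
headers = {"X-Auth-Token": "TOKEN_FROM_KEYSTONE"}

images = requests.get(glance + "/images/detail", headers=headers).json()["images"]
for img in images:
    # Each entry carries registry metadata, not the disk image itself.
    print(img["name"], img["disk_format"], img["container_format"], img["status"])
```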
  149. 149. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 Dashboard enables administrators and users to access and provision cloud-based resources through a self-service portal  Key Capabilities: • Thin wrapper over the APIs, no local state • Registration pattern for applications to hook into • Ships with three central dashboards: a “User Dashboard”, a “System Dashboard”, and a “Settings” dashboard • Out-of-the-box support for all core OpenStack projects – Nova, Glance, Swift, Quantum • Anyone can add a new component as a “first-class citizen” – Follow the design and style guide • Visual and interaction paradigms are maintained throughout • Console Access References http://horizon.openstack.org/intro.html
  150. 150. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 OpenStack Resources • Forums – http://forums.openstack.org/ • Wiki – http://wiki.openstack.org/ • Documentation – http://docs.openstack.org/ • Mailing Lists – http://wiki.openstack.org/MailingLists • OpenStack Project Management – https://launchpad.net/openstack • Blogs – http://planet.openstack.org • Real-time chat room – #openstack and #openstack-dev on irc://freenode.net (443 users currently logged in) • Rackspace Reference Architectures – http://www.referencearchitecture.org/ • Easy Install – http://www.hastexo.com/resources/docs/installing-openstack-essex-20121-ubuntu-1204-precise-pangolin
  151. 151. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 IBM Resources/Solutions for OpenStack Available Today • developerWorks – https://www.ibm.com/developerworks/mydeveloperworks/wikis/home?lang=en#/wiki/OpenStack – Google: openstack IBM developerworks • xCAT (FOSS) for 0-day deployment – xCAT OpenStack Paper (CATStack) – Automated qcow2 image creation for Glance – HW control – Bare-metal discovery and bring-up • Firmware, Base OS, etc… • IBM Intelligent Cluster Solutions (see Matt Ziegler's PPT) – Preconfigured switches – Racked and stacked and ready to go – Lab Services for 0-day
  152. 152. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 IBM Resources/Solutions for OpenStack Available Today • All IBM System Software and Tools can coexist with OpenStack. – Director, ASU, lflash, etc… • SoNAS for shared file (NFS, SMB) • XIV for block storage (Nova Volume) • iDPX for scale-out Nova Compute and Swift • BNT switches for OpenFlow and Quantum • GPFS for iSCSI/block (Nova Volume) or file.
  153. 153. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 OpenStack Demo Setup (flattened network diagram): nodes os-essex0 through os-essexX, with public addresses 10.0.9.10–10.0.9.13 … 10.0.9.X and private addresses 172.20.249.10–172.20.249.13 … 172.20.249.X. Private networks – eth0: 172.20.249/24, VM: 172.20.250/24; public networks – eth1: 10.0.9.0/25, VM: 10.0.9.128/25. Every node runs the compute and network services; the control nodes additionally run the api, scheduler, volume, console, and glance services in an HA active/passive pair, while the compute nodes scale out to host the VMs. The whole setup sits behind a firewall.
  154. 154. © 2012 IBM Corporation IBM Systems Technical Universities– Budapest, Hungary – October 15-19 sGE06 PPT’s and Videos: http://xmission.com/~egan/cloud/
