Boston OpenStack Meetup Deployment Case Study

Published in: Technology, Business

  1. Boston OpenStack Meet-Up
     Global Electronics Manufacturer OpenStack Project
     Beth Cohen, Sr. Architect, Cloud Technology Partners
     617.721.7256 │ Beth.Cohen@cloudTP.com
     May 16th, 2012
     TRANSFORM INNOVATE OPTIMIZE
  2. Theme: Building Cloud Right
     “First you build your cloud, then you need to operate it. That is less than straightforward in a typical enterprise IT environment.”
     Gordon Haff, Cloud Computing Evangelist, Red Hat
     www.cloudTP.com, May 16, 2012, Boston OpenStack Conference
  3. Boston OpenStack Cloud Meet-Up Agenda
     • Overview of client cloud build-out project
     • Project challenges
     • Three-pronged solution
     • Organizational recommendations
     • Network architecture recommendations
     • Tools recommendations
     • Lessons from the trenches
  4. Overview of Cloud Project
     • Client is a $3 billion IT infrastructure outsourcing company, separate from Global Electronics Co.
     • Executive mandate was to build a cloud in support of consumer division activities
     • Requirements:
       – Create an IT organization to support the cloud infrastructure
       – Internally develop applications to support millions of external customers
       – Provide a platform and tools for building future applications
  5. Objectives
     • Review the client cloud project and architecture, and identify technical and operational risks
     • Recommend ways to address those risks
     • Recommend an appropriate Swift architecture based on the client’s planned uses
     • Modify the Swift staging environment
     • Build Nova and Swift deployment automation based on the recommended architecture
     • Document the deployment plan and process
     • Recommend an appropriately scaled Nova network architecture based on the client’s planned uses
  6. Challenges
     • Highly competitive consumer electronics sector
     • Very traditional IT organization with poor reputation within the company
     • IT organization had to continue to support its other internal and external customers
     • Lack of experience with incorporating new technology into IT environments
     • Little in-house cloud expertise
     • Had chosen immature cloud technology
     • Weak middle management support for project
     • Slow business decision on planned system use
  7. Risk
     • Operations and business
     • Nova network limitations
     • Security
     • Scalability
     • High availability
     [Figure: Client Cloud Concept]
  8. Risk Analysis: Basic Approach and Principles
     • Tactical approach to match client culture
     • Limit duplication of effort across the organization
     • Create realistic target metrics
     • Learn to live with continuous change
     • Emphasize operational automation
     • Continuous testing and test-driven development
     • Develop a culture supportive of cross-functional teamwork
     • Identify small changes that make big impacts
  9. Operations and Business Risks
     • Vendor lock-in
     • Complicated to configure and maintain
     • Large number of protocols and configurations
     • Many hardware components
       – Use software load balancers, firewalls, etc. to reduce costs and increase flexibility
     • Difficult administrative access
       – Use modern data center best practices
  10. Security Risks
      • Remote administrative access
      • Single firewall in cloud
        – Follow industry best practices
        – Use software firewalls instead
      • Virtual Machines could be
      • Potential access to network through compromised switches
        – Isolate networks to limit exposure
        – Configure switches for maximum isolation of protocols
      • Server nodes share network with VMs
        – Isolate networks to limit exposure
  11. Scalability Risks
      • HP TippingPoint IDS system supports a maximum of 10 Gbps bandwidth
        – Change to software firewalls
      • Limited North/South bandwidth doesn’t lend itself to horizontal scaling
        – Use Layer 3 distributed core
      • Network connectivity to the Internet is currently two 10 Gbps up-links
        – Add up-links as needed
      • Number of racks in cluster limited to 15
        – Use Layer 3 distributed core or Spine and Leaf configuration
      • Block storage network allocates to specific racks rather than spreading the network traffic evenly across the compute nodes
        – Use Layer 3 with traffic shaping to distribute traffic
      • VM state changes stored in core network fabric
        – Use virtual networking to segregate VM traffic
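The up-link concern above can be made concrete with a little arithmetic. The sketch below takes the two 10 Gbps up-links and the 15-rack limit from the slide; the servers-per-rack count, the per-server NIC speed, and the 4:1 target ratio are hypothetical values chosen only for illustration, and the function names are not from the deck.

```python
import math


def oversubscription_ratio(racks, servers_per_rack, nic_gbps, uplinks, uplink_gbps):
    """Ratio of total server-facing bandwidth to total north/south up-link bandwidth."""
    demand = racks * servers_per_rack * nic_gbps
    supply = uplinks * uplink_gbps
    return demand / supply


def uplinks_needed(racks, servers_per_rack, nic_gbps, uplink_gbps, target_ratio):
    """Smallest number of up-links that keeps oversubscription at or below target_ratio."""
    demand = racks * servers_per_rack * nic_gbps
    return math.ceil(demand / (target_ratio * uplink_gbps))


# Values taken from the slide.
RACKS = 15
UPLINKS = 2
UPLINK_GBPS = 10

# Assumptions for illustration only (not stated in the deck).
SERVERS_PER_RACK = 20
NIC_GBPS = 1

current = oversubscription_ratio(RACKS, SERVERS_PER_RACK, NIC_GBPS, UPLINKS, UPLINK_GBPS)
needed = uplinks_needed(RACKS, SERVERS_PER_RACK, NIC_GBPS, UPLINK_GBPS, target_ratio=4)
print(f"current oversubscription: {current}:1, up-links for 4:1: {needed}")
```

Under these assumed inputs the current ratio works out to 15:1, and eight up-links would be needed to reach 4:1. Real planning would use measured traffic rather than NIC line rate.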
  12. High Availability Risks
      • Several Single Points of Failure (SPOF) in environment
        – Identify and change architecture to remove
      • MLAG doesn’t scale
        – Use Layer 3 network to eliminate bottleneck
      • Costly hardware redundancy in systems
        – Re-architect to take advantage of software redundancy instead
      • Nova Network Issues
        – Use virtual networking
        – Nova Network node is a Single Point of Failure (SPOF)
        – Immature networking software
        – Quantum is still under development
        – Nova DHCP server is also a SPOF
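One way to identify single points of failure is to model the network as a graph and test whether removing each node disconnects the rest. The sketch below is a brute-force illustration of that idea, not the method used on the project; the topology and node names are hypothetical.

```python
from collections import deque


def is_connected(adjacency, removed):
    """Return True if the graph stays connected after deleting the node `removed`."""
    nodes = [n for n in adjacency if n != removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour != removed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(nodes)


def single_points_of_failure(links):
    """List nodes whose removal splits the network built from (a, b) link pairs."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return sorted(n for n in adjacency if not is_connected(adjacency, n))


# Hypothetical topology: one firewall in front of a redundant pair of core switches.
LINKS = [
    ("uplink", "firewall"),
    ("firewall", "core-a"),
    ("firewall", "core-b"),
    ("core-a", "tor-1"),
    ("core-b", "tor-1"),
    ("core-a", "tor-2"),
    ("core-b", "tor-2"),
]

print(single_points_of_failure(LINKS))
```

In this toy topology only the lone firewall is reported, because every other device has a redundant path around it. For large networks a linear-time articulation-point algorithm would be a better fit than this per-node search.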
  13. Organizational Recommendations – Operation Principles
      • Create an operational mindset across the team
        – Develop a better understanding of Open Source, cloud architectures, Agile methodologies, continuous dev, test and integration, and dev/ops concepts in general
      • Coordinate the OpenStack development efforts across the project
        – Build clear communication channels between functional groups
      • Leverage the OpenStack community efforts more effectively
        – Client team encouraged to contribute key features back to the OpenStack project
      • Create more and better test metrics and test harnesses to support continuous and integrated dev/test processes and automation
      • Leverage existing Global Electronics Co. organizational expertise on how to run highly efficient operations organizations
  14. Network Architecture
  15. Network Considerations
      • Need for vendor independence
        – Don’t rely on specific features of router/switch vendors
        – Example: MLAG is vendor specific
      • Need to massively scale ecosystem
        – Hierarchical addressing modeled on Internet is only real option
      • Need to design for cost efficient operations
        – Minimize hardware redundancy, etc.
      • No new hardware for staging environment
      • No single point of failure in the network
      • Tolerant of rack level failure
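Hierarchical addressing pays off because contiguous per-rack blocks can be advertised upstream as a single summary route. The sketch below illustrates that with Python's standard `ipaddress` module; the `10.20.0.0/16` zone block, the /24-per-rack size, and the rack count are hypothetical, not values from the project.

```python
import ipaddress


def rack_subnets(zone_cidr, rack_count, rack_prefix=24):
    """Carve the first `rack_count` equal-sized rack subnets out of a zone block."""
    zone = ipaddress.ip_network(zone_cidr)
    subnets = list(zone.subnets(new_prefix=rack_prefix))
    if rack_count > len(subnets):
        raise ValueError("zone block too small for the requested number of racks")
    return subnets[:rack_count]


# Hypothetical zone block and rack count, for illustration only.
racks = rack_subnets("10.20.0.0/16", rack_count=4)

# Contiguous rack blocks collapse into one prefix that the zone can advertise upstream.
summary = list(ipaddress.collapse_addresses(racks))

for rack in racks:
    print("rack subnet:", rack)
print("summary route:", summary)
```

Four contiguous /24 rack blocks collapse into one /22, so the core carries one route for the group instead of four.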
  16. Network Requirements
      • Independent network requirements for physical server nodes and virtual machines (VMs)
      • Need to isolate VM networking information from the core network for scaling
      • Many components interacting at different levels of the system stack add complexity
      • Need to isolate networks and separate functions for security
      • Separate networks by function for traffic shaping
      • Complex data paths
        – Data between VMs (East/West) and in and out of the system (North/South)
      • OpenStack has a weak high availability architecture
  17. Network Decisions
      • Choose virtual networking or flat networking
        – Recommend virtual networking
      • eBGP or static with Suwon backbone
        – Recommend eBGP
      • Software or hardware load balancing
        – Recommend software LB
      • Spine and leaf or distributed core topology
        – Recommend distributed core
      • Network shared with storage or dedicated storage network
        – Recommend shared network
      • VIP or MPIO (multipath) for iSCSI redundancy
        – Recommend MPIO
  18. Option 1: Shared Storage and VM Network
  19. Option 2: Separate Storage and VM Network
  20. Option 1: Layer 3 with Virtual Networking
      [Diagram; recoverable labels, in slide order:]
      • Suwon Network Edge: 182.196.0.0/22, 0.0.0.0/0, 182.196.0.1, 182.196.0.255, two eBGP /30 links
      • Cloud Network Edge: private AS (e.g. AS64512); Eth1/182.196.0.100 and Eth1/182.196.0.101; Eth0 192.168.8.5 and Eth0 192.168.9.10; two virtual switches
      • Cloud Backbone Network: 192.168.1.5, 192.168.1.7, 192.168.2.5, 192.168.2.7; Nova Compute Node 1, Nova Compute Node 2, two iSCSI SANs
      • Node 1 VM view: VMs on Tapx/10.10.2.x; Tap0/182.196.2.7; Eth 192.168.1.5; Virtual Switch
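The addresses on this diagram can be sanity-checked mechanically. The sketch below uses the standard `ipaddress` module to confirm that the 182.196.x.x addresses shown fall inside 182.196.0.0/22, that the 192.168.x.x addresses are private, and that AS64512 lies in the 16-bit private-use ASN range (64512 to 65534, per RFC 6996). The grouping of addresses into the two lists and the helper names are assumptions made for this illustration.

```python
import ipaddress

# Block and addresses as printed on the slide.
EDGE_BLOCK = ipaddress.ip_network("182.196.0.0/22")
EDGE_ADDRESSES = [
    "182.196.0.1",
    "182.196.0.255",
    "182.196.0.100",
    "182.196.0.101",
    "182.196.2.7",
]
BACKBONE_ADDRESSES = [
    "192.168.8.5",
    "192.168.9.10",
    "192.168.1.5",
    "192.168.1.7",
    "192.168.2.5",
    "192.168.2.7",
]

# 16-bit private-use AS numbers (RFC 6996): 64512 through 65534 inclusive.
PRIVATE_AS_RANGE = range(64512, 65535)
CLOUD_EDGE_AS = 64512


def outside_block(block, addresses):
    """Return the addresses that do not belong to `block`."""
    return [a for a in addresses if ipaddress.ip_address(a) not in block]


def non_private(addresses):
    """Return the addresses that are not in a private range."""
    return [a for a in addresses if not ipaddress.ip_address(a).is_private]


print("outside edge block:", outside_block(EDGE_BLOCK, EDGE_ADDRESSES))
print("non-private backbone addresses:", non_private(BACKBONE_ADDRESSES))
print("edge AS is private-use:", CLOUD_EDGE_AS in PRIVATE_AS_RANGE)
```

Both lists come back empty and the AS check is true, so the plan on the slide is internally consistent. Note that a /22 spans 182.196.0.0 through 182.196.3.255, which is why 182.196.0.255 is an ordinary host address here rather than a broadcast address.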
  21. Network Recommendations
      • Change from a Layer 2 to a Layer 3 configuration to build a dense multipath network core and support multi-directional scaling and flexibility
      • Isolate virtual networks using traffic shaping for performance
      • Isolate virtual networks using L2 over L3 encapsulation
      • Use eBGP to connect to the Internet up-link
      • Use iBGP for internal traffic on the mesh
      • Determine best configuration for block storage network
  22. Spine and Leaf Cloud Network Diagram
      • Expand the network by adding either aggregators or ToR switches. Each is independent, which allows maximum flexibility.
      Notes:
      • Links from ToR switches up to the core aggregation layer are either 10 Gbps or 40 Gbps
      • All links in the network are /30
      • Network runs iBGP
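Giving every point-to-point link its own /30 is easy to automate. The sketch below assigns one /30 per aggregator-to-ToR link and hands each end one of the two usable addresses. The link pool `192.168.100.0/24`, the switch names, and the function name are hypothetical and chosen only for illustration.

```python
import ipaddress
from itertools import product


def allocate_link_subnets(pool_cidr, spines, leaves):
    """Assign one /30 to each spine-leaf link; return {(spine, leaf): (net, spine_ip, leaf_ip)}."""
    subnets = ipaddress.ip_network(pool_cidr).subnets(new_prefix=30)
    plan = {}
    for spine, leaf in product(spines, leaves):
        try:
            net = next(subnets)
        except StopIteration:
            raise ValueError("link pool too small for this topology") from None
        # A /30 has exactly two usable host addresses, one per end of the link.
        spine_ip, leaf_ip = net.hosts()
        plan[(spine, leaf)] = (net, spine_ip, leaf_ip)
    return plan


# Hypothetical topology: every ToR switch connects to every aggregator.
SPINES = ["spine-1", "spine-2"]
LEAVES = ["tor-1", "tor-2", "tor-3", "tor-4"]

plan = allocate_link_subnets("192.168.100.0/24", SPINES, LEAVES)
for (spine, leaf), (net, spine_ip, leaf_ip) in plan.items():
    print(f"{spine} <-> {leaf}: {net}  {spine_ip} / {leaf_ip}")
```

Adding a ToR switch or an aggregator only consumes new /30 blocks from the pool, which matches the slide's point that each device can be added independently.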
  23. Scaling of Spine and Leaf Network
  24. Distributed Core Cloud Network Diagram
      Notes:
      • Links from ToR switches up to the core aggregation layer are either 10 Gbps or 40 Gbps
      • All links in the network are /30
      • Network runs iBGP
      • Expand the network by adding either aggregators or ToR switches
  25. Scaling of Distributed Core Network
  26. Option 2: Scaling Using Availability Zones
  27. Scaling of Distributed Core Network with AZ
  28. Tools Recommendations – Deployment Using Crowbar
      • Support for HP hardware added
      • 2 Crowbar servers installed
        – One on the management network (ss15)
        – One on the Swift Proxy rack (106)
      • Documentation of Crowbar deployment
        – Completed
      • Stabilize Crowbar system for production
        – Waiting for April 1.3 release
      • Crowbar development work remaining
        – Barclamp for DHCP relay for subnets near completion
      • Infra team Crowbar training to be done by client
  29. Final Deployment Automation Network
  30. Advice From the Trenches
      • Think holistically
      • Top management needs to actively support cross organizational change
      • Focus on building in-house expertise in cloud:
        – Architecture
        – Networking
        – Applications
        – Data center operations
      • Use the rack as the base unit for scaling
      • Scale the cloud horizontally, not vertically
      • Automate, automate, automate!
  31. Questions?
      Boston OpenStack Meet-Up, May 16th, 2012
      Beth Cohen
      Chief Architect & Technology Officer
      617.721.7256 │ Beth.Cohen@cloudTP.com
      Cloud Technology Partners, cloudTP.com
      P: 617.674.0874, Info@cloudtp.com
      308 Congress St, 5th Floor, Boston MA, 02210
      TRANSFORM INNOVATE OPTIMIZE
