Dave + Manasi
1 minute
My name is Manasi Prabhavalkar. Fresh out of college after completing my Master's degree at NC State, I was fortunate enough to land the most interesting job. For the past 2 years I have been a Systems Architect for OpenStack in the Engineering Shared Infrastructure Services organization at NetApp.
Today we are here to share our experience as an Engineering org moving into the rapidly evolving world of OpenStack.
Dave
1 minute
For each hexagon, a few bullet points should guide the conversation:
Overall – this presentation represents a 1.5-year journey
Intro – introduce our mission statement, what we do, how we started with OpenStack
Automation – absolutely needed, how we were able to leverage Puppet for deployments
Upgrades – how we were able to take said automation and do upgrades to Kilo locally
Globalizing – Let's adapt to deploying OpenStack globally, since we're a global team and company
Global Upgrades – best of both worlds; let's apply what we've learned locally across our entire team and seamlessly upgrade to OpenStack Liberty
Next steps – where we’re going next, what keeps us excited throughout the day
Today we are going to share the NetApp Engineering success story and our learnings.
In this session we are going to take you through our journey of implementing OpenStack in NetApp Engineering over the past year and a half, and some of the milestones we achieved along the way.
It is a story of our engineering org adopting OpenStack as a small part of our internal private cloud, and making it a huge success.
We are going to talk about who we are and what we do as an organization, and how we decided to embrace OpenStack back in Sept 2014.
And how our journey just got interesting after that.
How we made our way through automating deployments and automating upgrades with Puppet, and went on to globalize OpenStack at our 3 major sites. The high point of our journey was implementing 2 live upgrades in a production environment in a span of just 5 months.
So stay tuned
Dave
3 minutes
6GB Mem
IaaS
Puppet comfort
Today our Global Engineering Cloud, or GEC as we call it, is a self-service cloud portal with 3 different hypervisors under its belt: VMware, Hyper-V, and KVM on OpenStack.
Why OpenStack?
NetApp made a strategic decision to embrace OpenStack, we are Customer Zero
NetApp has been involved with OpenStack since 2011, both from a development perspective (Folsom release) and from an internal deployment perspective
Needed to reduce Hypervisor licensing costs
Increase breadth of NetApp QA testing
Match customer expectations and deployments
Scalable Multi-Region Design
15 compute nodes in each region (1000 VM per region)
Ceilometer in each region for performance
Secure Multi-Tenancy (71 SVMs, GEC service-based tenancy model, build-environment-as-a-service)
Modular Scale as you Grow Architecture
So now let's talk about how we got there.
Manasi
3 minutes
Explain a region
Region arch
Scale model
HA features
Stats
We talked about the highly available Keystone service and Horizon, and we also had a highly available GaleraDB cluster hosting the shared databases, which mainly included Keystone, Cinder, and Glance. That left the single controller node, and to address that concern we decided to go with a region architecture. So we stamped out a region with one controller, its own native DB and MongoDB, and 15 compute nodes.
This allowed us to scale horizontally by adding new regions, all sharing the same Keystone and Horizon services in what we called Region Zero.
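To make the shared-Keystone design concrete, here is a minimal client-side sketch using openstacksdk; the auth URL, credentials, and region names are hypothetical placeholders, not our actual endpoints. Every region authenticates against the one Keystone in Region Zero, and a client simply picks its region:

```python
import openstack

# Hypothetical credentials and endpoint; the single Keystone in Region Zero
# authenticates requests for every region in the deployment.
conn = openstack.connect(
    auth_url="https://keystone.gec.example.com:5000/v3",  # shared Keystone
    project_name="demo-project",
    username="demo-user",
    password="...",  # elided
    user_domain_name="Default",
    project_domain_name="Default",
    region_name="region-1",  # switch regions by changing only this value
)

# The same token then works against that region's Nova/Neutron endpoints.
for server in conn.compute.servers():
    print(server.name)
```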
Each region serves as its own OpenStack deployment with its own Nova and Neutron services. The image store and the Cinder share are backed by a NetApp NFS backend and are shared across all the regions. So each region hosts all of the OpenStack services; however, Glance and Cinder in each region talk to the same shared DB in Region Zero.
The motivation behind this architecture was scalability.
Starting off, each region had a /22 CIDR and so a VM capacity of roughly 1000 (a quick sketch of that arithmetic follows below).
This gave us a scaling model, helping us grow by 1000 VMs every time we add a new region.
We also expected our Region Zero to handle up to 10 regions, after which we would consider adding a new node to the shared region.
Each region has its own Neutron and Nova DB and is backed by a NetApp NFS store for instances.
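As a quick sanity check on the 1000-VM figure, here is a minimal sketch of the /22 arithmetic, assuming a hypothetical region CIDR (not our actual addressing):

```python
import ipaddress

# Hypothetical region CIDR: every region is allocated a /22.
region = ipaddress.ip_network("10.10.0.0/22")

# A /22 holds 1024 addresses; subtracting network, broadcast, and a handful
# of infrastructure reservations leaves roughly 1000 usable for VMs.
usable = region.num_addresses - 2
print(region.num_addresses, usable)  # 1024 1022
```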
This region strategy helped us keep the OpenStack architecture as close to our VMware and Hyper-V architectures as possible.
One OpenStack region was analogous to a VMware/Hyper-V cluster of 15 compute nodes. We wanted to keep it as familiar as possible, so we did not change too many things at the same time. This helped the operations team be more comfortable adopting KVM on OpenStack as a new addition to GEC.
Now even if a region fails, OpenStack requests coming into GEC can be routed to any of the other regions, keeping the service highly available for our customers.
The architecture phase was the most important milestone of our journey. We came up with a highly available, modular, and easily scalable architecture that also set the stage for defining our live-upgrade strategy.
Today we have OpenStack globally at 4 sites, with 10 regions and 160 compute nodes, giving us a VM capacity of ~7500, which is about 10% of the total GEC capacity.
4 sites
10 regions
160 Compute
~7500 6GB VM capacity
GEC Total Capacity: 70K
Manasi
2 minutes
Puppet + FlexPod
Puppet automation takes over
Puppet roles
Big picture
Deployed Juno in 90 minutes
Once the storage is ready and the node is prepped, we feed it to our Puppet master. The Puppet master holds all the code necessary to spin up a new production-ready OpenStack environment.
The Puppet master assigns the node a suitable role in our architecture and then configures it for us.
Our architecture allows for 8 different roles: web (Horizon), database (regional DB), MongoDB (for Ceilometer), Keystone, GaleraDB (for the shared DB), compute, controller, and LB. (A hypothetical sketch of role assignment follows.)
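As an illustration of how nodes can be mapped to these roles, here is a minimal sketch of a Puppet external node classifier (ENC) written in Python. The hostname tokens and role class names are hypothetical stand-ins, not our actual manifests:

```python
#!/usr/bin/env python3
"""Hypothetical Puppet ENC: maps a node's hostname to one of the 8 roles."""
import sys
import yaml  # PyYAML; Puppet expects the ENC to emit YAML on stdout

# Illustrative hostname tokens -> role classes (not our real naming scheme).
# "mongo" and "galera" are checked before "db" so they match first.
ROLE_PATTERNS = {
    "web": "role::web",            # Horizon
    "mongo": "role::mongodb",      # Ceilometer backend
    "galera": "role::galeradb",    # shared DB cluster
    "db": "role::database",        # regional DB
    "keystone": "role::keystone",
    "compute": "role::compute",
    "ctrl": "role::controller",
    "lb": "role::loadbalancer",
}

def classify(hostname: str) -> dict:
    for token, role_class in ROLE_PATTERNS.items():
        if token in hostname:
            return {"classes": [role_class]}
    return {"classes": []}  # unknown nodes get no role

if __name__ == "__main__":
    # Puppet invokes the ENC with the node name as the first argument.
    print(yaml.safe_dump(classify(sys.argv[1])))
```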
By this time we were on the Juno release of OpenStack in production, with a Region Zero and 3 deployment regions. It took Puppet just 90 minutes to spin up the entire environment, ready to deliver a VM capacity of 3000 instances.
Basically, our automation strategy is: configure the storage and let Puppet handle the rest.
Who thought automating OpenStack deployments could be this easy?
Manasi
4 minutes
Make sure hardware upgraded
Juno to Kilo upgrade
Define strategy
Segmented upgrades
Explain each segment
Time to upgrade
No end user disruptions
After successfully automating OpenStack deployment with Puppet, we decided to automate live upgrades too.
Now it was time to upgrade our environment from Juno to Kilo.
We wanted an upgrade strategy that was repeatable and automated. We also wanted it to be non-disruptive, so existing production VMs would keep working during the upgrade. Our modular architecture made this easier to accomplish.
Now let me take you guys through our live-upgrade strategy.
We started off by upgrading the Keystone nodes first, as Keystone is shared across all of the regions in our environment.
We did the upgrade serially to maintain service continuity. All the regions continued to work with the upgraded Keystone because of backwards compatibility.
Then we moved on to the web nodes and upgraded them serially too, to maintain service continuity. Users in our environment use the GEC portal as the user interface, so the web upgrade was non-disruptive.
Next came the controllers. We upgraded each controller serially across regions. Existing VMs continued to work during the upgrade; however, a region could not service new requests during the 5-minute Puppet run. Our strategy was to toggle a region off in the GEC portal to stop any new deployments to it, upgrade the region controller, and once the upgrade succeeded, toggle the region back on in the portal.
After the controllers we moved on to the compute nodes. We took the first compute node in each region, live-migrated all of its VMs to other nodes in that region, and then upgraded the now-empty node. We upgraded all compute nodes serially within a region but in parallel across regions (see the sketch below).
Our Puppet process took approximately 5 minutes to upgrade each node.
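To make the serial-within-a-region, parallel-across-regions pattern concrete, here is a minimal orchestration sketch. The inventory and the evacuate/puppet_upgrade helpers are hypothetical placeholders for the live-migration step and the ~5 minute Puppet run:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical inventory: region name -> ordered list of compute nodes.
REGIONS = {
    "region-1": ["r1-compute-01", "r1-compute-02", "r1-compute-03"],
    "region-2": ["r2-compute-01", "r2-compute-02", "r2-compute-03"],
}

def evacuate(node: str) -> None:
    # Placeholder: live-migrate every VM off `node` to peers in its region.
    print(f"evacuating {node}")

def puppet_upgrade(node: str) -> None:
    # Placeholder: the ~5 minute Puppet run that upgrades the empty node.
    print(f"upgrading {node}")

def upgrade_region(region: str, nodes: list[str]) -> None:
    # Serial within a region: only one node is empty and upgrading at a time.
    for node in nodes:
        evacuate(node)
        puppet_upgrade(node)
    print(f"{region} upgraded")

# Parallel across regions: each region walks its own node list concurrently.
with ThreadPoolExecutor(max_workers=len(REGIONS)) as pool:
    futures = [pool.submit(upgrade_region, r, n) for r, n in REGIONS.items()]
    for f in futures:
        f.result()  # propagate any failures
```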
Manasi
2 minutes
Global upgrade
Time for Kilo to Liberty
Prev experience/lessons learnt
4 env to upgrade, firmware first
Smallest -> largest stats
Span of a week
Upgrade roadmap
More capacity planned
When we rolled out OpenStack globally we were on the Kilo release. At the start of 2016, it was time for the next live upgrade, to Liberty.
This time we had a successfully tested upgrade strategy, but 4 different sites to upgrade compared to the single site we had before.
This is where the previous upgrade experience, the Puppet automation, and the training sessions for operations paid off. This upgrade was much smoother thanks to the lessons learned, as well as the maturity of OpenStack.
This time the local operations teams ran the show, with us acting as mere advisors. We were more confident and prepared, making this the most successful global upgrade to date.
In the span of a week, we had upgraded all of our OpenStack environments globally to Liberty.
The largest upgrade was at RTP, with 900 active VMs and 86 nodes; it took well under 4 hours to accomplish, thanks to our operations team.
Upgrade roadmap –
As soon as a new release candidate is launched, we bring it into our dev environment and update our Puppet automation for the next upgrade.
When the release goes GA, we start testing live upgrades in dev and also test it against our GEC portal for stability.
After a week of rigorous testing in dev following GA, we move it to the staging environment for 2 weeks.
When we are satisfied with the results, we schedule a global upgrade in production, about 6 weeks after the GA release.
Site 1: 5 active VMs
Site 2: 30 active VMs
Site 3: 50 active VMs
Site 4: 900 active VMs
Dave (Lessons Learned)
Manasi (Advice for you)
4 minutes
LM Ours = 5
Manasi
Other projects and plans
Trove = Oracle and MongoDB primarily
1 minute
Dave
Manila
Rest
1 minute