This document discusses SuperChoice's hybrid cloud orchestration approach. It summarizes:
- SuperChoice migrated over 200 applications to hybrid clouds using a Cloud Management Platform for orchestration.
- They took a "fix by rebuild" approach to automate deploying entire environments from source scripts.
- Significant challenges included addressing technical debt, cultural change for staff, and ensuring portability across clouds.
- Lessons learned were that automation is critical, people issues are the biggest barrier, and a methodical approach worked best for the transition.
2. 2
Agenda
Why automation and orchestration are critical
How we automated 200+ apps
The role of the Cloud Management Platform (CMP)
Lessons Learned
3. 3
SuperChoice at a Glance
SuperChoice is a pension / superannuation e-commerce platform provider
with a leading Australian market position, an emerging UK business and
aspirations to expand into Asia and Europe
SuperChoice – the company
– Established in 1996
– Sydney based with approx. 90 professional staff
– Privately owned by CPS Group and management
– 20 years of continuous, double-digit growth
Largest player in the Australian market
– Approx. 40% of the total market by transaction numbers
– Approx. 20% of the active Self Managed Superannuation Funds
5. 5
The SuperChoice journey so far
[Figure: SuperChoice Technology Headcount (FTEs) over time; vertical axis "Head Count" marked 25, 50, 75, 100. Milestones annotated on the chart:]
– Trained all staff on Agile
– Commenced SuperStream project
– Hired full-time Agile Coach
– Commenced 1-Click-Build™
– Commenced automated regression testing
– Selectively outsourced development (Aust, NZ, Malaysia)
– Adopted New Technology Strategy
– Initiated actions to increase staff engagement
– Adopted Atlassian tool stack
– Adopted new Code Quality practices
– Initiated capability development programme
– Started redeveloping core platform
– Moved from 2 Monthly to Monthly to Weekly deployments
– Started investigating migrating database technology
– Commenced piloting cloud migration
6. 6
Most organisations can no longer do infrastructure and
security cost effectively
Lack economies of scale and scope and the ability to develop deep expertise
Complex integration of people, processes and products
Interconnectedness
Defence in depth / layers of counter measures
Impossible to win an Arms Race:
– Hackers compromised 2.2 Billion records in just the first 10 months of 2016
– Security budgets climbed from 22% of technology budgets in 2014 to 28% in 2016!
Source: Forrester, Develop Your Information Security Management System, January 19, 2017
7. 7
Few Mature Corporates are Migrating Fully to Public cloud
[Figure: bar chart of survey responses by answer band]
– <20%: 20%
– 20% to 39%: 25%
– 40% to 59%: 24%
– 60% to 79%: 15%
– 80% to 99%: 5%
– 100%: 2%
– Don't know: 9%
How much of the server-side code that you write is deployed to cloud environments today? (2015; Index: N = 55)
Source: Forrester Business Technographics Infrastructure Survey, 2015
What percentage of your infrastructure is in the Public cloud? (2015; Index: N = 49)
– 14.5%
8. 8
SuperChoice Requirements
Core Objectives:
– Reduce costs and timeframes by automating provisioning infrastructure and deploying
environments
– Reduce the operational costs of maintaining and supporting infrastructure and environments
Clouds:
– Initially: AWS and VMware Private, Azure added in the last 6 months, investigating Google
Applications:
– Rapidly growing list of apps (>200 currently) from adopting micro services architecture
VMs/Instances:
– Averaging several hundred concurrent instances, varies considerably over time
Environments:
– Multiple Development, Test and Client Acceptance Test environments
Frequency of updates:
– Daily to Weekly
9. 9
SuperChoice’s Overarching Strategy
Automatically deploying entire, complex software environments
Adopting a “Fix by Rebuild” approach
Supporting multiple cloud service providers independently
10. 10
Key Benefits Targeted
Automate deployments ‘at scale’ of large sets of applications and
infrastructure to a managed baseline in a timely manner
Deploy and test environments in approx. 60 minutes to enable moving from
a Break-Troubleshoot-Fix model to a Fix-by-Rebuild model
Have repeatable and consistent deployments
Support high agility for environment provisioning for dev and test teams
Track environments and charge directly to the project team
Optimize costs by stopping instances not in use (e.g. scheduled to run only
within business hours)
Choose the right Cloud for the Environment’s purpose
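The cost-optimisation benefit above (stopping instances outside business hours) can be sketched as a simple scheduling rule. This is only an illustration: the 08:00 to 18:00 weekday window, the `always_on` flag and the function names are assumptions, not details from the presentation, and a real deployment would call the cloud provider or CMP to actually stop the instances.

```python
from datetime import datetime, time

# Assumed business-hours window; the slides do not state the actual hours.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


def should_be_running(now: datetime, always_on: bool = False) -> bool:
    """Return True if an instance should be running at `now`."""
    if always_on:
        return True
    is_weekday = now.weekday() < 5  # Monday=0 ... Friday=4
    in_hours = BUSINESS_START <= now.time() < BUSINESS_END
    return is_weekday and in_hours


def instances_to_stop(instances: dict, now: datetime) -> list:
    """`instances` maps instance name -> always_on flag (hypothetical shape).

    Returns the sorted names of instances that should be stopped at `now`.
    """
    return sorted(
        name
        for name, always_on in instances.items()
        if not should_be_running(now, always_on)
    )
```

For example, on a Saturday morning a development instance would be listed for shutdown while an instance flagged `always_on` would be left alone.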
11. 11
Cloud technologies have the ability to transform the way functionality is delivered while reducing costs
[Figure: Benefit plotted against Automation Capability, in three steps:]
– Adopt Cloud IaaS / PaaS / SaaS capabilities: straight replacement of physical infrastructure
– Automate Infrastructure Provisioning: uplift platform-level provisioning and management (for discrete components)
– Automate Environment Provisioning: fully automate provisioning of integrated suites of applications and databases
12. 12
Automation drives rethinking how we deliver capability
[Figure: delivery lifecycle in four stages: Control / Master, Manufacture, Use, Dispose. Labels recoverable from the diagram:]
– SCM, Developer, Tester, Continuous Integration, Platform Management, Reference Data Management, Release Management
– Source, Scripts, Libraries, Reference Data, Test Data, Environment Config
– Cloud Management Platform, Automation Tools, Configuration, Deployment and Compliance Testing tools
– Dev / Test Environments, Development Environments, Production Environment
Changes are applied to Environments by updating Automation
14. 14
A Standardised Model
A standardised model for DataCentre components with standardised, segmented Network layout
Each ‘Cell’ is deployed using Automation, managed as a Unit
Lifecycle and management approach for each Cell tailored to the nature of the services
Repeat with as many Networks across as many Clouds as needed
[Figure: cells inside a Public or Private Cloud, within a Cloud Account and a Network (VPC/Vnet) Boundary, joined by Networks and connected to the Internet. Labels recoverable from the diagram:]
– Factory: Nexus Master, Confluence, BitBucket, Bamboo, Bamboo Agents, Other …
– Management: DNS1, DNS2, SMTP, Logging, Consul 1, Consul 2, Automation, Backup, Jumphost
– Boundary Network Device (Cloud or appliance, e.g. Palo Alto): Inbound Proxy, Outbound Internet Proxy
– Environment (repeated): DMZ (webservers), Business Application Servers (App), Integration DMZ (SFTP, MQIPT, Axway, MQ)
– Data Tier (Master): Shared File Storage, Application Databases
Annotations on the diagram:
– Ephemeral deployments managed fully by Fix-By-Rebuild. Deployments are “Cookie-Cutter” with no Configuration Variance, resulting in Simplicity and Repeatability
– Long-lived Data, retained through Environment rebuilds
– Long-lived Management Services
– Boundary Security
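The idea that each Cell has a lifecycle tailored to its services can be sketched as a small data model. This is a minimal illustration, not SuperChoice's implementation: the class names, the `rebuild_plan` helper and the lifecycle assigned to each cell are assumptions based on the diagram annotations (ephemeral environments, long-lived data and management services).

```python
from dataclasses import dataclass
from enum import Enum


class Lifecycle(Enum):
    EPHEMERAL = "ephemeral"    # torn down and redeployed (fix by rebuild)
    LONG_LIVED = "long-lived"  # retained through environment rebuilds


@dataclass(frozen=True)
class Cell:
    name: str
    lifecycle: Lifecycle


# Cell names follow the diagram labels; the lifecycle values are an
# interpretation of the annotations, not a statement from the slides.
NETWORK_CELLS = [
    Cell("Management", Lifecycle.LONG_LIVED),
    Cell("Data Tier", Lifecycle.LONG_LIVED),
    Cell("Environment", Lifecycle.EPHEMERAL),
]


def rebuild_plan(cells):
    """Return the names of cells to tear down and redeploy in a rebuild."""
    return [c.name for c in cells if c.lifecycle is Lifecycle.EPHEMERAL]
```

Keeping the lifecycle as data on the cell, rather than in the deployment scripts, is one way to let the same automation be repeated across networks and clouds.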
15. 15
Key Concepts / Approach
Information Model - model all of the information associated with an
enterprise's environments. Baseline and keep separate from the execution
Software management and control disciplines applied to Infrastructure;
“Infrastructure as code”
Take a manufacturing approach to building complex environments.
Always go back to source and rebuild from the ground up
Fix-by-rebuild model: we don’t spend time fixing environments
Alter mindset around asset value
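The fix-by-rebuild concept above can be sketched as a decision rule: compare what is deployed with the baseline held in source control, and rebuild rather than patch whenever they differ or the environment is unhealthy. This is a hedged sketch; the function names, the dictionary shape of the baseline and the use of a SHA-256 fingerprint are illustrative assumptions, not the mechanism described in the presentation.

```python
import hashlib
import json
from typing import Optional


def fingerprint(definition: dict) -> str:
    """Stable hash of an environment definition (the baseline in source control)."""
    canonical = json.dumps(definition, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def next_action(baseline: dict, deployed_fingerprint: Optional[str], healthy: bool) -> str:
    """Decide what to do with an environment.

    There is deliberately no "patch in place" outcome: any drift or failure
    leads to a rebuild from source.
    """
    if deployed_fingerprint is None:
        return "build"      # nothing deployed yet
    if not healthy or deployed_fingerprint != fingerprint(baseline):
        return "rebuild"    # destroy and redeploy from the baseline
    return "keep"
```

The point of the sketch is the missing branch: an environment is either kept as built or rebuilt, which is what makes deployments repeatable.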
16. 16
Leveraging RightScale CMP
RightScale as the Orchestration Engine or manufacturing engine
(Multi-Cloud + Governance)
Establish RightScale Cloud Application Templates (CAT) for automated multi-
cloud deployment:
– Specific functional areas (e.g. SDN, DNS, FW, Management / App tier etc.)
– Generate CATs for specific purposes, e.g. deployment models
– Launch an application / environment
18. 18
Lessons Learned
Key Lessons:
CI/CD tools & processes are critical
Change Management must be a focus with respect to human resources
Cost to Value needs to be tracked. Favour a user pays approach and cost
allocation model at the environment level
Concept of Brittleness in Infrastructure deployment (Embrace it)
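The user-pays cost allocation lesson above can be illustrated with a small roll-up of usage by project and environment. This is a sketch under assumptions: the record keys (`project`, `environment`, `hours`, `hourly_rate`) and the function name are hypothetical, and real billing data from a cloud provider or CMP would have a different shape.

```python
from collections import defaultdict


def allocate_costs(usage_records):
    """Sum cost per (project, environment).

    Each record is a dict with the hypothetical keys `project`,
    `environment`, `hours` and `hourly_rate`.
    """
    totals = defaultdict(float)
    for record in usage_records:
        key = (record["project"], record["environment"])
        totals[key] += record["hours"] * record["hourly_rate"]
    return dict(totals)
```

Charging at the environment level means a project team sees the cost of every environment it asks for, which ties cost back to the value the environment delivers.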
19. 19
Misconfigurations have a major impact on costs
80% of unplanned outages are due to ill-planned changes made by
“operations staff”
- IT Process Institute’s Visible Ops Handbook
60% of availability and performance errors are the result of
misconfigurations
- Enterprise Management Associates
80% of outages impacting mission-critical services will be caused by people
and process issues, and more than 50% of those outages will be caused by
change / configuration / release integration and hand-off issues
- Gartner
Source: Downtime, Outages and Failures – Understanding their true costs, November 2015
20. 20
What we have achieved so far with migrating to the Cloud
Addressed a lot of historical tech debt
Automated / migrated nearly 100% of our code including:
– Core legacy application
– 100% of new distributed micro services (IS’s)
De-commissioned 4 physical environments
Implemented software firewalls and cloud neutral backup capability
Spun up 5 (more on the way) dedicated Client test environments
Transitioned some staff
Upgraded security capability (and there’s more to come!)
– Piloting Voice authentication
Upgraded the Build process
21. 21
What’s been more difficult than we thought it would be
Order of operations – Which applications to migrate first?
Some lack of clarity on requirements – Trying to replicate “existing
environments” when the requirements were implicit or poorly specified
Finding portable cloud technologies – The cloud is fairly AWS-centric, and it was often
difficult to find cloud solutions that were portable to other cloud providers
Running hybrid environments – cloud environments that have
interconnectivity into our existing data centres
Getting some basics right:
– Error reporting is slower than previously; takes an hour
– Infrastructure changes were impacting code branches
– Need to re-launch an environment when an application fails
22. 22
What we found that surprised us
Approx. 40% of the effort has been in addressing tech debt
Benefits have been higher than expected and costs at or below expectations (even after requiring greater
external assistance)
Cloud service provider agnostic backups are not easy to do
Cloud Management Platforms (CMPs) are still a maturing technology
Demand for dedicated test environments much higher than expected
Cloud service providers are not that interested in what we are doing:
– Focus on sole sourcing
– “Adopt AWS, resistance is futile” attitude
Had to slow the pace of change to accommodate the team’s ability to cope:
– Adopted small, regular milestones
Cloud environment support takes up a lot more time during transition
23. 23
Biggest challenge is addressing the People issues
Building the understanding of how it all should work
Developing new skills and upgrading existing ones, e.g. analytical and conceptual capabilities
Getting infrastructure / operations staff to start thinking like developers
Getting the Developers on board / caring about the Infrastructure:
– Understanding how to use and taking on greater responsibility
– Addressing Tech Debt issues
24. 24
Observations from our experience to date
Today’s Key Messages:
Automation is critical to staying sane
Accessing the right mindset and skills is important
Cultural change is the biggest barrier to success and benefits capture
Don’t under-estimate the amount of Tech debt that you’ll need to address
along the way
– No different to the early days of virtualisation
Moving to the cloud is not a straight lift and shift (when uplifting capability)
While it’s all new, taking a deliberate methodical approach works