2. Engineering Cloud Computing Solutions
The Enterprise Consumer Perspective
Dr. Anna Liu
Research Group Leader
Software Systems
National ICT Australia
5. About NICTA
National ICT Australia
• Federal and state funded research
company established in 2002
• Largest ICT research resource in
Australia
• National impact is an important
success metric
• ~700 staff/students working in 5 labs
across major capital cities
• 7 university partners
• Providing R&D services, knowledge transfer to Australian (and global) ICT industry
• NICTA technology is in over 1 billion mobile phones
6. NICTA’s mission: to be an enduring world-class ICT
research institute that generates national benefit.
Australia’s National Centre of
Excellence in ICT Research
Research focused on areas of
importance to Australia
Publicly funded, not for profit
Best of breed research teams (400 staff + 300 students)
Industry engagement
Engagement models include…
• Contract R&D
• Consulting services
• Strategic Partnerships
• Licensing
Industry outcomes
• Enduring solutions
• ‘Spinout’ companies
[Logos: MEDICAL, AutoMap, RedLizards.com]
7. Research Areas at NICTA
• Networks: Aruna Seneviratne
• Machine Learning: Bob Williamson
• Software Systems: Anna Liu, Gernot Heiser
• Computer Vision: Nick Barnes, Richard Hartley, Peter Corke
• Control & Signal Processing: Rob Evans
• Optimisation: Mark Wallace, Sylvie Thiebaux, Toby Walsh
8. Our team’s mission: help enterprises take full
advantage as software extends into cloud!
Cost optimised
High availability
Onsite/offsite Hybrid cloud
Real-time monitoring
Disaster recovery
Actionable analytics
Business continuity
Intelligent management
Systems resilience
Dynamic Elastic
Real time
High performance
Our Research Capability
spans cloud computing, web, SOA, distributed systems, data management, analytics, performance monitoring, DR, automated reasoning, ontologies, AI…
9. Agenda
• The Enterprise Perspective
• Evaluating cloud computing
• Business opportunities
• Challenges
• Proof of Concept Experience
• Workload appropriate for cloud
• Technical architecture
• Migration issues
• Business and commercial considerations
• Future of Software Engineering FOR and IN Cloud
• What’s so different about cloud?
• NICTA current research in cloud
• What’s to come
11. Enterprise Cloud Computing
The Business Values
• High Elasticity/Scalability leads to agility
– Virtually infinite amount of resources is available on
demand
• Reduce cost and complexity
– Pay per usage, economies of scale
• Generally speaking, systems that do not need to run 24x7x365 but have high resource usage while running bring the largest cost savings
– No in-house IT maintenance
– No up-front cost, geographically distributed disaster recovery
• Innovation Possibilities
– Ease of Use, speed to market with minimum capex
– Processing Big Data
• Cost of 1 machine for 100 hours = Cost of 100
machines for 1 hour
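The equivalence in the last bullet follows from pay-per-use pricing being linear in machine-hours. A minimal sketch, assuming a flat hourly rate (the `HOURLY_RATE_USD` value is invented for illustration and is not taken from any provider's price list) and a job that parallelises across machines:

```python
import math

# Hypothetical flat on-demand price per machine-hour (illustrative only).
HOURLY_RATE_USD = 0.10


def rental_cost(machines: int, hours: int, rate: float = HOURLY_RATE_USD) -> float:
    """Cost under linear pay-per-use pricing: machines x hours x rate."""
    return machines * hours * rate


serial = rental_cost(machines=1, hours=100)    # one machine for 100 hours
parallel = rental_cost(machines=100, hours=1)  # 100 machines for one hour

print(f"serial:   ${serial:.2f}")
print(f"parallel: ${parallel:.2f}")
print("same cost:", math.isclose(serial, parallel))
```

The cost is the same, but the parallel run finishes in one hour instead of 100, which is where the speed-to-market argument comes from.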
12. Enterprise Cloud Computing - The Challenges
• Top risks/adoption issues:
– Security & privacy
– Ownership of data
– Lock-in / interoperability
– Availability / reliability
– Monitoring & control
– Operational challenges
– Compliance and regulation
– Software licensing in cloud
– Contracts and commercials
– New roles and responsibilities
– Payment model, metering/charge backs
– Migration challenges
– Service levels
– Performance
– Cost and ROI
– Governance
– Competencies
• Risks vary with service model and
13. Australian Cloud Adoption Snapshot
• Software as a service
• Enterprise and SME
• Productivity suites, CRM
• Telco and SaaS vendor partnership
• emerging tier 2 System integrator
• Platform and Infrastructure as a Service
• SME, startups well on their way
• Enterprise doing evaluation
• Government Cloud, Community Cloud
• Data centre consolidation
• SOA, shared services
• Financial industry leadership
14. Some Australian Enterprise Proof of
Concepts
• Internet scale web applications
• User base from around the world
• Integration with existing web APIs
• Transient campaigns
• Many Mobile devices connecting to cloud
• Good adoption in utilities industries
• Development/Test environment
• Dynamic provisioning of dev/test resources
• Pay for usage
• Bursty workload
• Web apps
• Large scale data analysis
• eScience, Financial risk calculations, Government statistical data
16. Proof of Concept Overview
• Objective
• reduce IT cost
• evaluate cloud opportunity and risks
• Test and Dev environment, as opposed to
production
• Maximise re-applicability of learning experience across other
apps
• Evaluation dimensions
• Performance, security, feasibility
• cost and license, flexibility and elasticity
• integration with existing environment, migration effort
• disaster recovery and backup, new roles and responsibilities
• …
17. Solution Design Rationale
• POC Solution Design Rationale
• Standard 3 tier web application, with backend and authentication
server integration
• Location of data tier
• Keep as much of the dev/test configuration in common as possible
• PaaS or IaaS
• Selection of cloud platform for POC
• Project Management
• Governance: CIO/Director level sponsorship
• Project participants: enterprise architect, solution developer, security
specialist, commercial specialist
• NICTA: cloud computing experience and evaluation framework
• 2 wks POC selection; 6 wks POC; 2 wks consolidate findings
18. Architecture of a Hybrid Dev Environment
[Diagram]
• NICTA Corporate Network: Remote-desktop to XX.XX.0.* (no direct access to Amazon VPC)
• On-Premise Servers: Enterprise Data store, Authentication server
• IPSec VPN over the Internet between the on-premise servers and Amazon, approx 230ms RTT
• Amazon Cloud (US-East Datacenter): Virtual Machines hosting the Business Web application in a Private Cloud (Isolated Network)
• The Isolated Network in Amazon is only accessible from NICTA
19. Security
• ‘Secure integration to cloud’ solutions are emerging
– Amazon VPC, Google Secure Data Connector, Azure App Fabric, etc.
• Standard IPSec VPN brings peace of mind to enterprise users
– One of the key enablers for enterprise use
– Fits within an existing security policy
• Data masking could increase the cost/effort
– An automated method is necessary for further cost/effort
reduction
• Secure Software Development Lifecycle
– Process change required
20. Performance
• The performance of each component (network, VMs, …)
in cloud is comparable to or better than current on-
premise components
– This holds for dev/test environments; is it also suitable for production systems?
• Do not underestimate the latency in hybrid environments
– Many traditional applications and protocols are not optimized for a high-latency/WAN environment
• E.g., when a protocol is too “chatty”, we observed that network usage never exceeds 0.1% in some cases
– There are performance improvement opportunities
• Alternative solution design, Configuration and tuning
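A rough way to see why a chatty protocol leaves a WAN link almost idle: if each message waits for the previous reply, throughput is bounded by payload size divided by round-trip time. The sketch below uses the approx 230ms RTT from the hybrid architecture slide; the payload size and link capacity are assumptions chosen for illustration, not measurements from the POC.

```python
def chatty_throughput_bps(payload_bytes: int, rtt_seconds: float) -> float:
    """Upper bound in bits/s when every message waits for the previous reply."""
    return payload_bytes * 8 / rtt_seconds


RTT_S = 0.230          # approx RTT reported for the IPSec VPN link
PAYLOAD_BYTES = 1024   # assumed payload per round trip (hypothetical)
LINK_BPS = 100e6       # assumed link capacity of 100 Mbit/s (hypothetical)

throughput = chatty_throughput_bps(PAYLOAD_BYTES, RTT_S)
utilisation = throughput / LINK_BPS

print(f"throughput bound: {throughput / 1e3:.1f} kbit/s")
print(f"link utilisation: {utilisation:.4%}")
```

Under these assumptions the bound sits well below 0.1% of the link, so batching requests or moving the chatty components to the same side of the WAN matters more than adding bandwidth.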
21. Cost
• Many companies use ‘private cloud’; however, current offerings are seen as more expensive and less flexible
– Pay-as-you-go options are increasingly available
– Unit price is typically higher for storage
– SLA & management services are usually included
– Cost of keeping data/VMs is larger
• Current cost would vary depending on the SLA tiers of service
[Chart: Annual Operating Cost in USD (axis 0.00 to 2500.00) for the Min and Max cases, broken down into Monitoring, Storage, Data Transfer, VPN and VM/License]
22. Customers’ Responsibility in IaaS Cloud
Customers’ Responsibility
• Application level: OS/Application Security (e.g., Active Directory), Application Patching, App Data Backup, Application Monitoring, Application Installation/Configuration, Billing (Cost Center Charging)
• OS level: Antivirus, OS Patching, OS Backup, OS Monitoring, OS/Middleware Installation/Configuration
• Access Control to IaaS, Infrastructure Configuration (VPN, VMs, Disk, …)
Amazon EC2 (IaaS providers)
• Infrastructure Monitoring (CPU, Disk, Net, …), Usage Report and Basic Billing
23. Commercial Implications
• Software Licensing in the cloud?
• Reuse enterprise license
• Pay for usage software license model
• Payment model?
• enterprise governance model
• Metering and chargeback
• Service level agreement?
• Monitoring and management
• Contracts
• Backup, disaster recovery
• New roles and responsibility?
• Existing IT outsourcing arrangements
24. POC Experience Summary
• Cloud Computing has the potential to reduce existing
enterprise IT cost
• There are technical solutions for managing performance and security risks
• Fresh approaches are needed to manage:
• Enterprise architecture and governance
• Commercial implications such as SLA, new roles and responsibility
26. What’s so Different About the Cloud?
• Key Architectural Differences
• Data structure (key value store, NOSQL vs relational)
• Transactional guarantee (BASE vs ACID)
• Elastic compute capability
• Unpredictable Unavailability
• Geographic distribution (latency across WAN)
• Tight integration between development and deployment
...
• These differences directly impact Software Engineering and Software Architecture best practice!
• New data architecture, abstractions, programming models
• New architecture trade-off concerns, architecture patterns
• Replicate-everything architecture, new disaster recovery mechanisms
• Emergence of ‘DevOps’ influences future software engineering process
27. Elastic Compute Capability
• Elasticity is the defining characteristic of cloud computing
• The aim is to allocate enough resources to do the job, but not so much that resources are wasted
• There are broadly 2 architectures that achieve elastic compute capability
– Push architecture
– Pull architecture
28. Elastic Compute Capability Reference
Architecture –Push Architecture
• The Push architecture is typically used for web
applications
– Web browser (client) sends a request to the web application side
– Load balancer receives the request and “pushes” it to one of the web servers running on a compute node
• Requests are forwarded immediately (or at a certain rate)
• Load balancer is aware of the intensity of the workload
[Diagram components]
• Clients (e.g., web browser, DB client): send request / connect to server
• Load Balancer / Queue (e.g., Amazon Elastic LB, GAE Task Queue): forwards to nodes
• Monitor (e.g., Amazon CloudWatch, Azure Diagnostic API): monitors the Load Balancer / Queue
• Controller (e.g., Amazon Auto Scaling): uses Rules, invokes provision/deprovision on the Resource Pool
• Resource Pool: Computing Nodes (e.g., VMs, processes, …)
Fig 1. Push Architecture Pattern
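The push pattern can be sketched in a few lines. This is an in-memory toy, not any provider's API: the class names, the round-robin policy and the `max_per_node` scaling rule are all assumptions made for illustration.

```python
class Node:
    """A compute node (stands in for a VM or process)."""

    def __init__(self, name):
        self.name = name
        self.handled = 0

    def handle(self, request):
        self.handled += 1
        return f"{self.name} handled {request}"


class PushCluster:
    """Load balancer plus rule-based controller over a pool of nodes."""

    def __init__(self, max_per_node=3):
        self.max_per_node = max_per_node   # assumed scaling rule threshold
        self.nodes = [Node("node-1")]
        self._created = 1
        self._cursor = 0

    def scale(self, requests_in_window):
        """Controller: provision or deprovision nodes to match observed load."""
        needed = max(1, -(-requests_in_window // self.max_per_node))  # ceiling
        while len(self.nodes) < needed:
            self._created += 1
            self.nodes.append(Node(f"node-{self._created}"))
        del self.nodes[needed:]            # deprovision surplus nodes

    def push(self, request):
        """Load balancer: forward the request immediately, round-robin."""
        node = self.nodes[self._cursor % len(self.nodes)]
        self._cursor += 1
        return node.handle(request)


cluster = PushCluster(max_per_node=3)
cluster.scale(requests_in_window=10)       # burst observed by the monitor
replies = [cluster.push(f"req-{i}") for i in range(10)]
burst_size = len(cluster.nodes)
cluster.scale(requests_in_window=2)        # quiet period
quiet_size = len(cluster.nodes)
print(burst_size, quiet_size)
```

Because the load balancer sees every request, the controller can react to workload intensity directly, which is the property the slide highlights.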
30. Elastic Compute Capability Reference
Architecture –Pull Architecture
• The Pull architecture is often seen as an application-level
architecture
– Also known as the Producer-Consumer design pattern
– Requests are sent to a queue
• In contrast to the Push architecture, it does not forward the request
(hence less suitable for web applications)
– Compute nodes poll the queue periodically for jobs
• Requests are processed one at a time
• Polling frequently can induce overhead
– Easier to implement fail-safe mechanism
• Compute nodes need NOT inform the queue in case of failure
• Typical fail-safe mechanism involves a queue (e.g., AWS SQS or Azure Queue) that employs a lock attached to a timer. A message is locked when polled by a node. If the node fails, the message lock expires and the message returns to the queue.
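The lock-with-timer mechanism can be illustrated with an in-memory queue. This is a sketch of the idea only, not the SQS or Azure Queue API: `TimedLockQueue` and its methods are hypothetical names, messages are assumed to be unique strings, and the clock is injectable so the example runs without sleeping.

```python
import time
from collections import deque


class TimedLockQueue:
    """In-memory queue whose messages are locked, not removed, when polled."""

    def __init__(self, lock_seconds, clock=time.monotonic):
        self.lock_seconds = lock_seconds
        self.clock = clock
        self._ready = deque()    # messages visible to pollers
        self._locked = {}        # message -> lock expiry time

    def send(self, message):
        self._ready.append(message)

    def _release_expired(self):
        now = self.clock()
        for message, expiry in list(self._locked.items()):
            if expiry <= now:
                del self._locked[message]
                self._ready.append(message)   # lock expired: back on the queue

    def poll(self):
        """Return one message and lock it, or None if nothing is visible."""
        self._release_expired()
        if not self._ready:
            return None
        message = self._ready.popleft()
        self._locked[message] = self.clock() + self.lock_seconds
        return message

    def delete(self, message):
        """Called by a node only after it has finished the job."""
        self._locked.pop(message, None)


now = [0.0]                                   # fake clock for the demo
queue = TimedLockQueue(lock_seconds=30, clock=lambda: now[0])
queue.send("job-1")
first = queue.poll()     # node A takes the job, then crashes without deleting it
hidden = queue.poll()    # still locked, so no other node sees it
now[0] = 31.0            # the lock timer runs out
retry = queue.poll()     # node B receives the same job
queue.delete(retry)      # node B finished, so the job is gone for good
print(first, hidden, retry)
```

The failed node never has to report anything; recovery falls out of the timer, at the price of possibly processing a job twice if a slow node outlives its lock.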
31. Using Cloud for Business Continuity
• Two main usages of cloud for Business Continuity:
– Provides highly available systems for day-to-day business
– Serves as a technology platform to implement disaster recovery
• Some definitions:
– Business Continuity: “Activity performed by an organisation to
ensure that critical business functions will be available to
customers, suppliers, regulators and other entities…”
– Disaster Recovery: “A small subset of business continuity. The
process, policies and procedures related to preparing for recovery
or continuation of technology infrastructure critical to an
organisation after a natural or human-induced disaster”
– Fault Tolerance: “The property that enables a system to continue
operating properly, possibly at a reduced quality level…”
32. Building Highly Reliable Systems with Cloud
• Must address potential failures at two levels:
– Hardware/Infrastructure
• Prevent a Single Point of Failure (SPOF) by adding redundancy in all hardware components (i.e., redundant disks, redundant network devices, redundant power supply, etc.)
• NOT all cloud providers provide enterprise grade availability. Check your SLA!!
– Application
• Prepare a fail-over system to take over in case of a failure
• Replicate databases to minimise downtime and loss of data
• Replicate to a geographically different location (e.g., to avoid natural disasters such as floods)
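The effect of redundancy on availability can be estimated with a standard back-of-envelope formula: if replicas fail independently and any one surviving replica is enough, combined availability is 1 - (1 - a)^n. The 0.99 per-replica figure below is an assumption for illustration, not a number from any SLA.

```python
def parallel_availability(single: float, replicas: int) -> float:
    """Availability of N replicas when any one surviving replica is enough."""
    return 1 - (1 - single) ** replicas


def downtime_hours_per_year(availability: float) -> float:
    """Expected unavailable hours in a 365-day year."""
    return (1 - availability) * 365 * 24


for n in (1, 2, 3):
    a = parallel_availability(0.99, n)   # 0.99 is an assumed per-replica figure
    print(n, f"{a:.6f}", f"{downtime_hours_per_year(a):.2f} h/year")
```

The independence assumption is the weak point: replicas in the same data centre share power, network and flood risk, which is why the slide asks for geographically different locations.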
33. Case Study: Building Reliable System using
EC2
• The highly replicated architecture of clouds makes them great foundations for business continuity solutions
• Their globally distributed nature further enhances the disaster recovery capability of cloud
• Availability limitations mean we need to be realistic about Hot vs Warm vs Cold standby options
[Diagram 1: Elastic IP address xxx.xxx.xxx.xxx allocated to an EC2 Instance; the instance is created by an Auto Scaling Rule (Minimum Size = 1, Availability Zones = A, B, C) across Availability Zones A, B and C]
[Diagram 2: Requests from Clients reach an Elastic Load Balancer (Availability Zones = A, B, C), which forwards them to EC2 Instances kept running by an Auto Scaling Rule (Minimum Size = 2, Availability Zones = A, B, C)]
34. The Reality of Eventual Consistency in
Amazon SimpleDB
• The probability of reading updated data in SimpleDB in US West
– An application reads data X (ms) after it has written data
• SimpleDB has two read operations
– Eventual Consistent Read
– Consistent Read
• This pattern is consistent regardless of the time of day
[Chart: Consistent Read vs Eventual Consistent Read]
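The shape of such an experiment (write, wait X ms, read, count how often the new value comes back) can be sketched against a simulated store. Everything here is an assumption for illustration: `SimulatedStore`, the uniform replication lag and the 500 ms bound are invented, and the numbers it prints are not SimpleDB measurements.

```python
import random


class SimulatedStore:
    """Toy store: a write becomes visible to eventual reads after a random lag."""

    def __init__(self, max_lag_ms, rng):
        self.max_lag_ms = max_lag_ms
        self.rng = rng
        self.value = None
        self.old_value = None
        self.visible_at_ms = 0.0

    def write(self, value, now_ms):
        self.old_value, self.value = self.value, value
        self.visible_at_ms = now_ms + self.rng.uniform(0, self.max_lag_ms)

    def read(self, now_ms, consistent):
        if consistent or now_ms >= self.visible_at_ms:
            return self.value
        return self.old_value                # stale read


def fresh_read_probability(delay_ms, consistent, trials=2000):
    """Fraction of reads issued delay_ms after a write that see the new value."""
    store = SimulatedStore(max_lag_ms=500, rng=random.Random(42))
    fresh = 0
    for i in range(trials):
        store.write(i, now_ms=0.0)
        if store.read(now_ms=delay_ms, consistent=consistent) == i:
            fresh += 1
    return fresh / trials


for delay in (0, 100, 250, 500):
    eventual = fresh_read_probability(delay, consistent=False)
    strong = fresh_read_probability(delay, consistent=True)
    print(f"{delay:>3} ms  eventual={eventual:.2f}  consistent={strong:.2f}")
```

Against a real service the store would be replaced by client calls and the delay by an actual wait; the counting logic stays the same.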
35. Other Commercial NoSQL Databases
• Google App Engine
– Offers eventual consistent read and consistent read
– Behavior of eventual consistent read is completely different from Amazon’s
– In GAE, both types of reads behave exactly the same unless data centers have a failure(s)
• Windows Azure
– Offers no options for read
– Always consistent
Reference: H Wada, A Fekete, L Zhao, K Lee, A Liu, “Data Consistency Properties and the Trade-offs in Commercial Cloud Storage: The Consumers’ Perspective”, CIDR 2011. http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper15.pdf
37. Research Agenda
• Enterprise Architecture Framework
• Evaluation, acquisition, effort estimation, project and risk
management
• Software Development Lifecycle
• Requirement solicitation for cloud, design for interoperable services,
MDA/MDD/DSL, testing at massively parallel scale, cloud design
patterns
• Interoperability and Integration
• Hybrid cloud, integration challenges across clouds
• Performance Engineering
• Monitoring and measurement, performance modelling, prediction and
analysis, quality of service, SLA and assurance
• Many more…
38. Cost Effort Estimation for Cloud Migration
Cost implication/estimation for cloud migration is especially
challenging because:
– Applications and migration projects vary in terms of: size/complexity,
functionality, quality requirements, target deployment platforms...
– Cloud computing is new and different from traditional software
engineering paradigm: different development and deployment models,
non-functional characteristics, pricing models...
– Migration effort/cost estimation is not trivial
– Little Empirical Data in cloud
• V Tran, K Lee, A Fekete, A Liu, J Keung, “Size Estimation of Cloud Migration
Projects with Cloud Migration Point (CMP)”, 5th Intl Symposium on Empirical
Software Engineering and Measurement
• V Tran, J Keung, A Liu, A Fekete, “Application Migration to Cloud: A
Taxonomy of Critical Factors”, ICSE Software Engineering For Cloud
Computing Workshop 2011.
39. Adaptive Cloud Middleware Research
• Evaluating Cloud Performance – Measuring Elasticity
• Achieving Cloudburst – Integrated monitoring and
management
• Cloud Data Management – Elastic Data Store
– S Sakr, L Zhao, H Wada, A Liu, “CloudDB AutoAdmin: Towards a Truly
Elastic Cloud-Based Data Store”, 9th IEEE Intl Conf on Web Service ICWS
2011.
– S Islam, J Keung, K Lee, A Liu, “An Empirical Study into Adaptive Resource
Provisioning in the Cloud”, IEEE Intl Conf on Utility and Cloud Computing
UCC2010.
– L Zhao, A Liu, J Keung, “Evaluating Cloud Platform Architecture with the
CARE Framework”, APSEC 2010.
– P Brebner, A Liu, “Modeling Cloud Cost and Performance”, Cloud Computing
and Virtualisation (CCV 2010)
40. What Is Cloudburst?
• Dynamic reconfiguration of applications to use a public cloud when a private cloud cannot provide enough computing resources
[Diagram]
• Before: the Private Cloud runs Application A, Application B and Application C
• Spikes in demand for App. C but your private cloud has no resources!
• Cloudburst reconfiguration: rent computing resources in public cloud(s) and replicate App. C to meet the (short-time) demand
• If App. C has a huge amount of data or has sensitive data to transfer, the diagram shows an alternative placement of Applications A, B and C
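One way to picture the cloudburst decision is as a capacity split. The sketch below is an assumption-laden toy, not NICTA's adaptation engine: `plan_cloudburst`, the unit-based demand model and the `pinned` set (apps whose data is too large or sensitive to leave the private cloud) are all hypothetical.

```python
def plan_cloudburst(demand, private_capacity, pinned=frozenset()):
    """Split each app's demand (in resource units) into private and public shares."""
    free = private_capacity
    plan = {}
    # Apps pinned in-house claim private capacity first; the rest keep their order.
    for name in sorted(demand, key=lambda n: n not in pinned):
        in_house = min(demand[name], free)
        overflow = demand[name] - in_house
        if overflow and name in pinned:
            raise ValueError(f"{name} is pinned but does not fit in the private cloud")
        plan[name] = {"private": in_house, "public": overflow}
        free -= in_house
    return plan


demand = {"A": 2, "B": 2, "C": 6}   # App. C is spiking (illustrative numbers)

burst = plan_cloudburst(demand, private_capacity=8)
print(burst["C"])                   # part of C overflows to the public cloud

pinned = plan_cloudburst(demand, private_capacity=8, pinned=frozenset({"C"}))
print(pinned["B"])                  # C stays in-house, another app bursts instead
```

The second call reflects the case where App. C cannot move: capacity is freed by bursting a less constrained application.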
41. Conclusion
• Cloud Computing adoption is happening rapidly in the long tail
• Challenges remain for enterprises adopting cloud computing
• The cloud computing model embodies many architectural differences that require different software engineering approaches
• There are many tough Software Engineering research challenges to be solved in the new cloud context
42. Standing on the shoulders of giants
• The team
Hiroshi Wada, Kevin Lee,
Adnene Guabtni, Sherif
Sakr, Alan Fekete,
Quanqing Xu, Sean
Xiong, Bruce McCabe,
Jacky Keung, Paul
Bannerman, Liang
Zhao, Sadeka Islam,
Van Tran, Xiaomin
Wu…
43. Getting Involved
• Linkage with National ICT Australia
• Research Collaboration
• Researcher exchanges
• Expert Advisory Services, Architecture Reviews
• Public and In-house Training Courses
• Market Surveys, Case Studies
• Professional in Research Residence
Anna.Liu@nicta.com.au, @annaliu
http://blogs.unsw.edu.au/annaliu/
45. Alternative Architecture of a Hybrid Dev
Environment (Non-VPN based)
[Diagram]
• NICTA Corporate Network: Remote-desktop to XX.XX.0.* (possible direct access to Amazon VPC)
• On-Premise Servers: Enterprise Data store, Authentication server
• Secure connection (e.g., SSL) over the Internet between the on-premise servers and Amazon
• Amazon Cloud (US-East Datacenter): Virtual Machines hosting the Business Web application in a Private Cloud (Isolated Network)
• The Isolated Network in Amazon is only accessible from NICTA
46. Alternative Architecture of a Hybrid Dev
Environment (contd)
• Characteristics of a non-VPN based architecture:
– Simpler to set up and more lightweight
• No special hardware required
• Preserves isolated network in Amazon (i.e., cloud hosts with private IPs)
– VPC host can directly access the internet
• Assign elastic IP (i.e., public IP) to VPC host if internet access is required
• Arguably less secure (because there are two firewalls to take care of)
• Yields better throughput to internet hosts (because there is no rerouting through the in-house network)
– Suitable for applications with fewer connection points between in-house and cloud
47. 2. Hybrid Cloud Control Centre
Monitor Everything You Have
• Integrated monitoring across local and remote public clouds
• Works with existing enterprise monitoring and mgmt tools
Understand at a Glance
• Extensible architectures supporting various plug-ins
Diagnose and Plan Your Future
• Diagnose and suggest optimal system configurations
Automate Adaptations
• Auto generation of reconfiguration workflows
[Diagram: Monitoring Engine and Decision Making Support over a Hybrid Cloud Environment (Public Cloud, In-House Data Center)]
48. 3. Cloud Computing Cost Estimator
Inputs
• Application Profile: resource consumption per business transaction
• Live Usage Pattern from System Monitoring (ACT Monitor), or “What-If” Scenarios from the IT Administrator: daily, weekly, monthly, yearly usage patterns
• Cloud Computing Providers: possible deployment locations - US, EU, Asia or Australia
Cloud Cost Estimator
• Calculate operating cost of applications
• Knowledge base on cost model, SLA, …
Estimated Operating Cost
• Total operating cost on each vendor
• Monthly cost and break-down
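The estimator's core calculation can be sketched as profile times usage times price. This is not the actual tool: the data classes, the two cost items and every price below are hypothetical values chosen so the arithmetic is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    """Resource consumption per business transaction."""
    cpu_seconds_per_txn: float
    gb_out_per_txn: float


@dataclass
class PriceList:
    """Hypothetical vendor prices, not real quotes."""
    usd_per_cpu_hour: float
    usd_per_gb_out: float


def monthly_cost(profile, txns_per_month, prices):
    """Return a per-item break-down plus the total for one vendor."""
    compute = txns_per_month * profile.cpu_seconds_per_txn / 3600 * prices.usd_per_cpu_hour
    transfer = txns_per_month * profile.gb_out_per_txn * prices.usd_per_gb_out
    return {"compute": compute, "data_transfer": transfer, "total": compute + transfer}


profile = AppProfile(cpu_seconds_per_txn=3.6, gb_out_per_txn=0.001)
vendors = {
    "vendor-x": PriceList(usd_per_cpu_hour=0.10, usd_per_gb_out=0.12),
    "vendor-y": PriceList(usd_per_cpu_hour=0.08, usd_per_gb_out=0.20),
}

# "What-if" scenario: one million transactions per month on each vendor.
estimates = {name: monthly_cost(profile, 1_000_000, p) for name, p in vendors.items()}
for name, cost in estimates.items():
    print(name, {item: round(usd, 2) for item, usd in cost.items()})
```

Swapping the single monthly figure for a daily or weekly usage pattern only changes how `txns_per_month` is derived; the break-down per vendor stays the same.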
Need to cut out more words on this slide – just tell the story!!
Still need to do good EA, planning, monitoring, governance and management
Risk management approach to security, privacy
Plan for integration with existing assets
Come pick our brains at UNSW/NICTA
NICTA will focus on six research groups of significant scale and focus in which we have genuine opportunity to be ranked in the top five in an area in the world. Research groups have been selected on the basis of current NICTA strengths in research and research leadership.
Software Systems - aims to develop game-changing techniques, frameworks and methodologies for the design of integrated, secure, reliable, performant and adaptive software architectures. Software systems has pervasive application in real-world applications ranging from enterprise ecosystems to embedded systems.
Networks - The networks research group will develop new theories, models and methods to support future networked applications and services. Networked systems will address issues such as radio spectrum scarcity, wired bandwidth abundance, context and content, improvements to computing, energy constraints, and data privacy.
Machine Learning - is the science of interpreting and understanding data. The core problems are jointly statistical and computational. NICTA research will aim to develop machine learning as an engineering discipline, drawing on a spectrum of work from conceptual theory through algorithmics. Machine learning applications will aim to identify commonalities between problems, developing implementation frameworks that genuinely encourage reuse across different domains.
Computer Vision - aims to understand the world through images and video. NICTA will focus on areas including geometry, detection and recognition, optimisation, segmentation, scene understanding, shape/illumination and reflectance, biologically inspired approaches and the interfaces between them, drawing from approaches including statistical methods, learning and optimisation. Computer vision is a key enabling research discipline for many applications, including visual surveillance, bionic eye and mapping of the environment.
Control and Signal Processing - comprises a substantial group of sub-disciplines dealing with optimisation, estimation, detection, identification, behaviour modification, feedback control and stability of a very large class of dynamical systems. It is likely that NICTA will focus on problems of control and signal processing in large-scale decentralised systems which are core to many new ICT systems. Techniques from information theory, Bayesian networks, large scale optimization etc. are employed to address this important class of problem.
Optimisation - the "science of better". Research will focus on the interface between constraint programming, operations research, satisfiability, search, automated reasoning, machine learning, simulation and game theory, exploring methods that combine algorithms from these different areas. Optimisation applications will address multi-faceted questions such as how best to schedule in a network, whether there is a better folding for a protein, or how best to operate a supply chain.
Reduce cost, reduce complexity
Adaptation engine patent pending
Seeking collaboration with industry to source ‘use inspiration’ and trial partnership