WICSA 2011 Cloud Tutorial
Speaker notes
  • Reduce cost, reduce complexity
  • Need to cut out more words on this slide – just tell the story!! Still need to do good EA, planning, monitoring, governance and management. Risk management approach to security, privacy. Plan for integration with existing assets. Come pick our brains at UNSW/NICTA.
  • NICTA will focus on six research groups of significant scale and focus in which we have a genuine opportunity to be ranked in the top five in an area in the world. Research groups have been selected on the basis of current NICTA strengths in research and research leadership.
    Software Systems – aims to develop game-changing techniques, frameworks and methodologies for the design of integrated, secure, reliable, performant and adaptive software architectures. Software systems have pervasive application in real-world settings ranging from enterprise ecosystems to embedded systems.
    Networks – the networks research group will develop new theories, models and methods to support future networked applications and services. Networked systems will address issues such as radio spectrum scarcity, wired bandwidth abundance, context and content, improvements to computing, energy constraints, and data privacy.
    Machine Learning – the science of interpreting and understanding data. The core problems are jointly statistical and computational. NICTA research will aim to develop machine learning as an engineering discipline, drawing on a spectrum of work from conceptual theory through algorithmics. Machine learning applications will aim to find commonalities between problems, developing implementation frameworks that genuinely encourage reuse across different domains.
    Computer Vision – aims to understand the world through images and video. NICTA will focus on areas including geometry, detection and recognition, optimisation, segmentation, scene understanding, shape/illumination and reflectance, biologically inspired approaches and the interfaces between them, drawing from approaches including statistical methods, learning and optimisation. Computer vision is a key enabling research discipline for many applications, including visual surveillance, the bionic eye, and mapping of the environment.
    Control and Signal Processing – comprises a substantial group of sub-disciplines dealing with optimisation, estimation, detection, identification, behaviour modification, feedback control and stability of a very large class of dynamical systems. It is likely that NICTA will focus on problems of control and signal processing in large-scale decentralised systems, which are core to many new ICT systems. Techniques from information theory, Bayesian networks, large-scale optimisation etc. are employed to address this important class of problem.
    Optimisation – the "science of better". Research will focus on the interface between constraint programming, operations research, satisfiability, search, automated reasoning, machine learning, simulation and game theory, exploring methods that combine algorithms from these different areas. Optimisation applications will address multi-faceted questions such as how best to schedule in a network, whether there is a better folding for a protein, or how best to operate a supply chain.
  • Still need to do good EA, planning, monitoring, governance and management. Risk management approach to security, privacy. Plan for integration with existing assets.
  • Also comment on Public vs Private, and the need to prepare for Hybrid.
    Rapid Elasticity: Elasticity is defined as the ability to scale resources both up and down as needed. To the consumer, the cloud appears to be infinite, and the consumer can purchase as much or as little computing power as they need. This is one of the essential characteristics of cloud computing in the NIST definition.
    Measured Service: In a measured service, aspects of the cloud service are controlled and monitored by the cloud provider. This is crucial for billing, access control, resource optimization, capacity planning and other tasks.
    On-Demand Self-Service: The on-demand and self-service aspects of cloud computing mean that a consumer can use cloud services as needed without any human interaction with the cloud provider.
    Ubiquitous Network Access: Ubiquitous network access means that the cloud provider’s capabilities are available over the network and can be accessed through standard mechanisms by both thick and thin clients.
    Resource Pooling: Resource pooling allows a cloud provider to serve its consumers via a multi-tenant model. Physical and virtual resources are assigned and reassigned according to consumer demand. There is a sense of location independence in that the customer generally has no control or knowledge over the exact location of the provided resources but may be able to specify location at a higher level of abstraction (e.g., country, state, or datacenter).
  • Reduce cost, reduce complexity
  • Still need to do good EA, planning, monitoring, governance and management. Risk management approach to security, privacy. Plan for integration with existing assets.
  • Cloud Computing Interoperability Forum (CCIF). Amazon/Google/MS missing from the CCIF sponsor list.
    THE CLOUD COMPUTING INTEROPERABILITY FORUM: The Cloud Computing Interoperability Forum (CCIF) was formed in order to enable a global cloud computing ecosystem whereby organizations are able to seamlessly work together for the purposes of wider industry adoption of cloud computing technology and related services. A key focus will be placed on the creation of a common agreed-upon framework/ontology that enables two or more cloud platforms to exchange information in a unified manner.
    Mission: CCIF is an open, vendor-neutral, not-for-profit community of technology advocates and consumers dedicated to driving the rapid adoption of global cloud computing services. CCIF shall accomplish this by working through the use of open forums (physical and virtual) focused on building community consensus, exploring emerging trends, and advocating best practices/reference architectures for the purposes of standardized cloud computing.
  • Service Bus The Microsoft .NET Service Bus makes it easy to connect applications together over the Internet. Services that register on the Bus can easily be discovered and accessed, across any network topology. The Service Bus provides the familiar Enterprise Service Bus application pattern, while helping to solve some of the hard issues that arise when implementing this pattern across network, security, and organizational boundaries, at Internet-scale.
  • Reduce cost, reduce complexity
  • Reduce cost, reduce complexity
  • Reduce cost, reduce complexity
  • Quotas are resource constraints configured by the vendors. You can probably contact the vendors for more resources beyond the quotas, but communication takes time, and it will bring about opportunity cost. Limitations are mostly functional restrictions; you probably can’t go beyond them by making a phone call.
    Amazon Web Services: Manually set up all applications – large maintenance and operation cost, including upgrading systems, installing applications and configuration. Maximum 5 GB per file in S3 – e.g. TB-magnitude files cannot be put into S3 directly. Extra effort is needed: a file has to be divided into small chunks (5GB each) before storing. The same effort is also required during retrieval; all retrieved chunks have to be merged manually. Maximum 5 seconds query execution time in SimpleDB – no long-running queries in SimpleDB. If thousands of items are queried in SimpleDB, the query could fail due to timeout. Developers need to estimate the query time beforehand, separate a large query into small queries, and combine/merge the query results on the client side. 20 On-Demand or Reserved Instances and 100 Spot Instances by default – you can have more instances by contacting Amazon, but that definitely increases your opportunity cost if you need to scale out immediately. 1GB free outgoing bandwidth per month in SimpleDB, S3 and EC2 – yes, you need to pay for extra usage.
    Microsoft Windows Azure: 2 deployments per service (production and staging) – the two deployments are used for deploying the production version and the staging version separately, targeting end-users and test users correspondingly. But it is not efficient enough to run multiple test versions at the same time. .NET, PHP or Java programming language – limited languages for .NET, PHP and Java developers. Up to 50 GB for SQL Azure – the maximum size of a single SQL Azure database is 50 GB. If your data is more than 50 GB, then you probably have to consider data partitioning to scale out your database to multiple databases. 20 concurrent small compute instances or equivalent per month – 1 clock hour of an extra large instance equates to 8 small instance hours. 10 TB of total data transfers per month – you can probably get more if you send a request to Microsoft. Up to 750 GB of SQL Azure databases per month – for SQL Azure, it originally states 150 Web Edition databases (not sure whether it is or/and, see http://www.microsoft.com/windowsazure/offers/popup/popup.aspx?lang=en&locale=en-us&offer=MS-AZR-0013P) or 15 Business Edition databases; since the maximum size for each Web Edition is 5GB and for each Business Edition is 50GB, the simple math (150*5 or 15*50) gives 750 GB.
    Google App Engine: Java or Python programming language – PHP developers can do nothing on Google App Engine. Maximum 30 seconds for each request – each request has to be responded to within 30 seconds, otherwise exceptions are returned instead of results. In this case, highly computational tasks are not applicable in GAE. The alternative is still splitting the task; GAE has made an early experimental release of MapReduce to support this, but only the Mapper is implemented at this stage. 1 MB for each Datastore entity – only 1MB for each data item; you will probably find it hard to store a photo in GAE. Also, due to the 30-second limitation, your query should be processed within 30 seconds. Maximum 2 GB per file in Blobstore – the same reason as AWS. Plus: the maximum size of Blobstore data that can be read by the app with one API call is only 1 MB, so even if you stored 2GB in Blobstore, it is still difficult to manipulate these data in GAE. 10 web applications per user – since the case of the bush fires in 2009. I think all the following parameters can be adjusted by Google: 43,200,000 requests per day; 1 GB (1,046 GB maximum if billing enabled) incoming/outgoing bandwidth per day; 6.5 CPU-hours (1,729 CPU-hours maximum if billing enabled) per day.
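The S3 and Blobstore per-file limits above force manual chunking. A minimal sketch of the chunk-size arithmetic involved (the 5 GB limit is the figure quoted in the note; the function name is illustrative):

```python
# Illustrative helper for the per-object size limit discussed above.
# The 5 GB figure is the S3 limit quoted in the note; names are made up.

S3_MAX_OBJECT_BYTES = 5 * 1024**3  # 5 GB per-object limit (circa 2011)

def plan_chunks(total_bytes, chunk_bytes=S3_MAX_OBJECT_BYTES):
    """Return (offset, length) pairs covering a file of total_bytes,
    each piece no larger than chunk_bytes."""
    chunks = []
    offset = 0
    while offset < total_bytes:
        length = min(chunk_bytes, total_bytes - offset)
        chunks.append((offset, length))
        offset += length
    return chunks

# A 12 GB file needs three pieces: 5 GB + 5 GB + 2 GB,
# and the retrieved pieces must be re-merged in offset order.
pieces = plan_chunks(12 * 1024**3)
```

The same split-then-merge pattern applies to the SimpleDB 5-second query limit: divide a large query into small ones and combine results client-side.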
  • Reduce cost, reduce complexity
  • Reference – Saarland paper at VLDB
  • Reduce cost, reduce complexity
  • Reduce cost, reduce complexity
  • Reduce cost, reduce complexity
  • Reduce cost, reduce complexity
  • Reduce cost, reduce complexity
  • Figure 1 shows a typical setup of the Amazon VPC. This VPC setup allows a company’s infrastructure to be connected with the Amazon EC2 infrastructure via a VPN connection. It requires setting up two VPN gateways (one on each of the local and remote sides). A secure VPN connection is established between the two gateways via the IPsec protocol. EC2 instances on the remote side (Amazon side) are operated within subnets behind the remote VPN gateway. That is, these EC2 instances are isolated from the rest of the EC2 network and only these instances can access the hosts on the local side. Similarly, hosts can be added on the local side behind the customer gateway (local VPN gateway) and only these hosts have access to the remote EC2 instances. A typical VPC connection meets the following security requirements: utilise the AES 128-bit encryption function; utilise the SHA-1 hashing function.
  • An example business report query took 16 min 30 sec; it takes less than 1 min in the existing on-premise dev environment. Data transfer over SSIS takes 14 min (only 42KB/sec of throughput). No bottleneck observed on CPU (3-10%), memory (6GB free), disk (low activity) or network (0.03% usage of 1Gbps). SSIS protocol? ----------------- Done. It works! I did the following: 1. Start an EC2 micro instance outside the VPC and attach an EBS volume to it. 2. Copy the file from S3 to the EBS volume attached to the micro instance. 3. Detach the EBS volume from the micro instance. 4. Attach the EBS volume to an instance inside the VPC. Note that we did NOT route through NICTA here at all. The file I used for this experiment is ~700MB in size. Step 2 took 130s (i.e. 5.39MB/s).
  • Reduce cost, reduce complexity
  • Reduce cost, reduce complexity
  • References:
    http://aws.amazon.com/ec2/
    http://code.google.com/appengine/whyappengine.html#scale
    http://www.microsoft.com/windowsazure/appliance/
  • An article (with a link to his paper) by Huan Liu discussing limitations of load balancers and autoscaling:
    http://huanliu.wordpress.com/tag/auto-scaling/
    http://codecrafter.wordpress.com/2008/10/03/google-app-engine-scalability-that-doesnt-just-work/
    An example on scaling in Azure:
    http://code.msdn.microsoft.com/azurescale/Release/ProjectReleases.aspx?ReleaseId=4167
  • Reduce cost, reduce complexity
  • The Australian Prudential Regulation Authority (APRA) is the prudential regulator of the Australian financial services industry. It oversees banks, credit unions, building societies, general insurance and reinsurance companies, life insurance, friendly societies, and most members of the superannuation industry. APRA is funded largely by the industries that it supervises. It was established on 1 July 1998. APRA currently supervises institutions holding approximately $3.6 trillion in assets for 22 million Australian depositors, policyholders and superannuation fund members.
    Australia: In Australia, the federal Privacy Act 1988 sets out principles in relation to the collection, use, disclosure, security of and access to personal information. The Act applies to Australian Government and Australian Capital Territory agencies and private sector organisations (except some small businesses). The Office of the Privacy Commissioner is the complaints handler for alleged breaches of the Act. Some Australian States have enacted privacy laws. The Australian Law Reform Commission [1] completed an inquiry into the state of Australia’s privacy laws in 2008. The report, entitled For Your Information: Australian Privacy Law and Practice [2], recommended significant changes be made to the Privacy Act, as well as the introduction of a statutory cause of action for breach of privacy [3]. The Australian Government committed in October 2009 to implementing a large number of the recommendations that the Australian Law Reform Commission had made in its report [4].
  • Still need to do good EA, planning, monitoring, governance and managementRisk management approach to security, privacyPlan for Integration with existing assets
  • - P.36 of DIAC mentioned that they have large amount of data (e.g. 100TB of documentation) and also structured data. Large documents can be stored in Azure Blob and structured data (depending on what type) can be stored in Azure Table.- Parallelised frameworks (such as MapReduce) can be used to perform usage analytics (P.27 of DIAC slides) such as abandonment rate. Azure Table is indexed by time (also partition and row) keys which makes it suitable for time-based queries.
  • Adaptation engine patent pending. Seeking collaboration with industry to source ‘use inspiration’ and trial partnerships.
  • Need to cut out more words on this slide – just tell the story!! Still need to do good EA, planning, monitoring, governance and management. Risk management approach to security, privacy. Plan for integration with existing assets. Come pick our brains at UNSW/NICTA.
  • Transcript

    • 1. From imagination to impact
    • 2. Architecting Cloud Applications
      Dr. Anna Liu
      Research Group Leader
      Software Systems
      National ICT Australia
    • 3. The Land Down Under
    • 4. Sydney
    • 5. About NICTA
      National ICT Australia
      • Federal and state funded research company established in 2002
      • 6. Largest ICT research resource in Australia
      • 7. National impact is an important success metric
      • 8. ~700 staff/students working in 5 labs across major capital cities
      • 9. 7 university partners
      • 10. Providing R&D services, knowledge transfer to Australian (and global) ICT industry
      NICTA technology is in over 1 billion mobile phones
    • 11. Research Areas at NICTA
      Networks
      Machine Learning
      Software Systems
      Aruna Seneviratne
      Bob Williamson
      Anna Liu
      Gernot Heiser
      Computer Vision
      Optimisation
      Control & Signal Processing
      Nick Barnes,
      Richard Hartley
      Peter Corke
      Mark Wallace,
      Sylvie Thiebaux,
      Toby Walsh
      Rob Evans
    • 12. NICTA’s mission: to be an enduring world-class ICT research institute that generates national benefit.
      Australia’s National Centre of Excellence in ICT Research
      Research focused on areas of importance to Australia
      Publicly funded, not for profit
      Best of breed research teams (400 staff + 300 students)
      Industry engagement
      Industry outcomes
      Enduring solutions
      ‘Spinout’ companies
      Engagement models include…
      • Contract R&D
      • 13. Consulting services
      • 14. Strategic Partnerships
      • 15. Licensing
    • 16. Our team’s mission: help enterprises take full advantage as software extends into cloud!
      Cost optimised
      High availability
      Hybrid cloud
      Onsite/offsite
      Real-time monitoring
      Disaster recovery
      Actionable analytics
      Business continuity
      Intelligent management
      Systems resilience
      Elastic
      Dynamic
      Real time
      Our applied R&D capability
      spans cloud computing, web, SOA, distributed systems, data management, analytics, performance monitoring, DR, automated reasoning, ontologies, AI…
      High performance
    • 17. Agenda
      Introduction to Cloud Computing
      Characteristics, Deployment and Delivery Models
      Enterprise Architecture and Migration Framework
      Usage Scenarios
      Evaluating Cloud Computing
      Enterprise context, Business opportunities, risks
      Technical qualities of platforms
      Platform Architectural Insights
      Proof of Concept Experiences
      Advanced Architecture Issues
      Future Directions
      Industry happenings
      Research Agenda
    • 18. What is Cloud Computing?
      Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
      This cloud model is composed of five essential characteristics, three service models, and four deployment models.
      - US National Institute of Standards and Technology
    • 19. Characterising Cloud Computing
    • 20. Five Characteristics – NIST Definition
      On-demand Self-Service
      A consumer can provision computing capabilities without human interaction
      Broad network access
      Computing capabilities are available over the network and accessed through standard mechanisms
      Resource pooling
      Provider’s computing resources are pooled to serve multiple consumers with different resources dynamically assigned according to consumers’ demands
      Rapid elasticity
      Computing capabilities can be rapidly and elastically provisioned to quickly scale out and rapidly released to scale in
      Measured service
      Resource usage can be monitored, controlled, and reported. Providing transparency for both the provider and consumer
    • 21. Leading Provider: Amazon EC2
      Let’s see how Amazon EC2, a leading commercial cloud, looks
      I want my cloud!
    • 22. 1. Grab your credit card and create an account (10 min). Then access the console
      3. Hit this button
      2. Select where you want to create your virtual machines
      (US East, US West, Ireland or Singapore)
    • 23. 4. Select a machine image
      • Many pre-configured images are available
      • 24. You can register your machine images as well
    • 5. Determine the amount of resources to allocate
      • <1.0GHz CPU + 600MB RAM → 0.01 USD/hour
      • 25. 1.0GHz CPU + 1.7GB RAM → 0.04 USD/hour
      • 26. 3.0GHz x 8 CPUs + 68GB RAM → 1.1 USD/hour
      • 27. You can pay Win/SQL Server license fees in pay-per-hour
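The pay-per-hour pricing above is simple arithmetic. A sketch using the example rates on this slide (the instance names and helper are illustrative, not an AWS API):

```python
# Pay-per-hour cost arithmetic using the example prices on the slide.
# The tier names and function are illustrative, not actual AWS API calls.
PRICES_USD_PER_HOUR = {
    "micro": 0.01,   # <1.0GHz CPU + 600MB RAM
    "small": 0.04,   # 1.0GHz CPU + 1.7GB RAM
    "xlarge": 1.10,  # 3.0GHz x 8 CPUs + 68GB RAM
}

def cost(instance_type, hours, count=1):
    """Total cost of running `count` instances for `hours` hours."""
    return PRICES_USD_PER_HOUR[instance_type] * hours * count

# e.g. ten small instances for a 24-hour experiment:
total = cost("small", 24, count=10)  # 0.04 * 24 * 10 = 9.6 USD
```

There is no up-front cost: the bill is just hours consumed times the hourly rate.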
    • 6. Define a set of access control rules
    • 28. 7. Done! (< 5 minutes in total)
      • You have your virtual machine at ec2-184-74-14-28.us-west-1.compute.amazonaws.com
      I got my virtual machine!
    • 29. 8. Connect to my virtual machine
      • Just SSH to the address
      • 30. You have root access!!
      You’re in an Amazon Datacenter in CA
      This is my desktop in Sydney
    • 31. If you like Windows, just launch a Windows virtual machine and remote-desktop to it
      Connected through
      a VPN connection
      You’re in an Amazon Datacenter in NV
      This is my desktop in Sydney
    • 32. 9. Terminate or hibernate virtual machines when they are not in use
      • In some systems, we use a script to hibernate virtual machines at 8:00PM
      • 33. Restart instances in the morning if necessary. It takes just a couple of minutes
    • 10. Check your bill in real time
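The nightly hibernation script from step 9 boils down to a time-window check. A sketch of the decision logic (the 8:00AM restart hour is an assumption; 8:00PM is the shutdown time quoted on the slide, and the names are illustrative):

```python
# Decide whether a dev/test instance should be running at a given hour,
# mirroring the "hibernate at 8:00PM, restart in the morning" script
# described on the slide. The 8:00AM start is an assumed value.

WORK_START_HOUR = 8    # restart instances in the morning (assumption)
WORK_END_HOUR = 20     # hibernate at 8:00PM (from the slide)

def should_be_running(hour):
    """True if the instance should be up at this local hour (0-23)."""
    return WORK_START_HOUR <= hour < WORK_END_HOUR
```

A scheduled job would call the cloud provider's stop/start API based on this check, so non-24x7 systems only pay for the hours they actually use.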

    • 38. Three Service Models – NIST definition
      Technology exposed to customers
      Providers
      Software
      as a Service
      Platform
      as a Service
      Infrastructure
      as a Service
      Datacenter
      Infrastructure
    • 39. Three Delivery Models
      Infrastructure as a Service (IaaS)
      The consumer has control over operating systems, storage and deployed applications
      Platform as a Service (PaaS)
      Consumers can deploy applications created using programming languages and tools supported by the provider (e.g., Java Servlet)
      The provider shields the complexity of its infrastructure
      Scale up/down, load balancing, replication, disaster recovery, database management, …
      Software as a Service (SaaS)
      Consumers use the provider’s applications
      The consumer does not manage the underlying cloud infrastructure
    • 40. Leading Provider: Google App Engine
      PaaS is the hottest area because many players (e.g., VMware) are currently moving into PaaS.
      Let’s see how Google App Engine, a leading commercial PaaS, looks
      I want my PaaS!
    • 41. 1. Create an account. (5 min) GAE offers a large amount of quota for free
      2. Write an application using GAE’s framework
    • 42. 3. Deploy your application on GAE!
      Scale up/down, load balancing, replication, disaster recovery, database management, … many functions are implemented by GAE’s framework
    • 43. 4. Check your resource usage (CPU, storage, # of API calls, …)
      Pay only when usage exceeds the free quota
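The "write an application using GAE's framework" step above ultimately produces a Python web handler. As a stdlib-only sketch of the kind of WSGI application the GAE Python runtime hosts (this is not GAE's actual webapp API; GAE supplies the server, scaling and load balancing around handlers like this):

```python
# A minimal WSGI application of the kind GAE's Python runtime hosts.
# Stdlib only; GAE's own webapp framework wraps handlers like this.
from wsgiref.util import setup_testing_defaults

def application(environ, start_response):
    body = b"Hello from the cloud!"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

# Exercise the app locally without running a server:
environ = {}
setup_testing_defaults(environ)
status_holder = []
def start_response(status, headers):
    status_holder.append(status)
response = b"".join(application(environ, start_response))
```

On GAE the deploy step uploads this code and the platform handles provisioning; there is nothing else to configure.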
    • 44. Four Deployment Models – NIST Definition
      Hybrid Cloud
      Private Cloud
      Public Cloud
      Community Cloud
      Exclusively own,
      maintain and use
       Share and just use when needed
      Share and maintain
      by a group
      Organization
      General public
      A group of organizations
    • 45. Four Deployment Models
      Public cloud
      Generally very secure but risk of discontinuity and less control
      Community cloud
      Enjoy cost savings and relatively easy to retain governance
      Private cloud
      More control but less cost savings compared to public cloud
      Hybrid cloud
       Hot area in the next couple of years: Enjoy both high security and cost savings from private and public cloud, respectively
    • 46. Why Cloud Computing?
      High Elasticity/Scalability
      Virtually infinite amount of resources is available on demand
      Reduce cost and complexity
      Pay per usage, economies of scale
      Generally speaking, non-7x24x365 systems with higher resource usage bring large cost savings
      No in-house IT maintenance
      No up-front cost for geographically distributed disaster recovery
      Innovation Possibilities
      Ease of Use
      You can implement your idea with minimum overhead and cost
      Processing Big Data
      Cost of 1 machine for 100 hours = Cost of 100 machines for 1 hour
    • 47. Issues and What NICTA is doing? #1
      Benefits and risks trade-off analysis
      Helps with decision making: use cloud or not? Suitable architecture for hybrid cloud?
      A model to show benefits (e.g., operational cost savings and elasticity) and risks (e.g., security, performance degradation, migration cost)
      Cost estimation
      Everybody’s question: Ok, then what is the actual cost for me?
      A model to estimate the actual initial and operational cost from application’s profile
      We’re collaborating with various Australian organizations to answer these questions through building and migrating systems in and to cloud
    • 48. Issues and What NICTA is doing? #2
      Automatic reconfiguration in hybrid cloud
      “outsource” your workload to public clouds only when needed
      Move some components/VMs/data to or from a public cloud to achieve certain performance
      Monitoring and management
      Monitor whether Service Level Agreements are guaranteed
      Secure the transparency of SLA monitoring
      Developing the new yardstick for cloud platforms, measuring elasticity for SPEC RG
      Exploring the possibility of new applications
       What can we do using huge computing resources?
      Collaborating with Microsoft Research on Azure Cloud Platform
    • 49. Agenda
      Introduction to Cloud Computing
      Characteristics, Deployment and Delivery Models
      Enterprise Architecture and Migration Framework
      Usage Scenarios
      Evaluating Cloud Computing
      Enterprise context, Business opportunities, risks
      Technical qualities of platforms
      Platform Architectural Insights
      Proof of Concept Experiences
      Advanced Architecture Issues
      Future Directions
      Industry happenings
      Research Agenda
    • 50. Cloud, Cloud, Cloud,...
      Cloud Computing is the No. 1 in the top 10 strategic technologies for 2011
      Cloud is everywhere? No
      Middle to large enterprises see huge opportunity in public cloud but also anxiety/pain due to…
      The lack of governance, i.e., visibility and control
      The lack of “architectures for (hybrid) cloud”
      The lack of migration methodology
      The lack of common cost structure
      The lack of automation across cloud and in-house

    • 51. Cloud Computing - The Enterprise Context
      STATE OF PLAY
      Clear benefits in cloud adoption
      Reduced IT cost, agility, efficiency, innovation opportunities
      Top risks/adoption issues:
      Security & privacy - Migration challenges
      Ownership of data – Service levels
      Lock-in / interoperability – Performance
      Availability / reliability – Cost and ROI
      Monitoring & control – Governance
      Compliance and regulation – Competencies
      Software licensing in the cloud - Operational challenges
      Contracts and commercials - new roles and responsibilities
      Payment model, metering/charge backs
      Risks vary with service model and provider
      Many progressive organisations evaluating cloud
      Proof of concepts, pilots, cloud computing strategy papers
      Some good adoptions in certain verticals, SME, Software as a service…
      CIOs need greater visibility and control over their assets running in local servers and in the cloud before reaping the benefits of cloud computing.
    • 52. Integration Challenges
      Integration Challenges
      • UI Integration
      • 53. Data Integration
      • 54. Process Integration
      Identity Challenges:
      Access Control
      AuthN, SSO, AuthZ
      Identity Lifecycle
      Identity Portability
      Interoperability
      Management Challenges
      • SLA Monitoring
      • 55. Halting, Pausing, Throttling…
      • 56. Programmatic access to health model
    • Standards and Interoperability
      Cloud Computing Interoperability Forum (CCIF), OMG effort, The Open Group, Open Cloud Manifesto...
      Is Standards THE solution?
      Competing standards? Timing? Design by committee?
      In fact, does it make sense when cloud platform architecture varies significantly?
      Individual services already surfaced on the internet
      Still want to orchestrate services within a long running workflow, across/from different clouds
    • 57. Internet Service Bus
      REST on .NET Service Bus
      Simple to implement for interop across different languages
      Less overhead packages
      SOAP on .NET Service Bus
       Only available for .NET Framework communications at the moment
       Other languages are not fully supported (Java can only pass Access Control on the .NET Service Bus)
       More packaging overhead when communicating between C# and Java than between C# and C#
    • 58. Overview of Cloud Computing Offerings
    • 59. Overview of Three Leading Cloud Computing Platforms
    • 60. Cloud Computing Environment from AWS
      On-demand instances operate on a virtual environment
       EC2 is an IaaS offering
       Scalable computing environment
       Datacenters located in different regions, including US (East and West), EU and APAC.
      Types of instances:
      Standard
      Small (1 ECU, 1 Core, 1.7GB memory)
      Large (4 ECUs, 2 Cores, 7.5 GB memory)
      Extra Large (8 ECUs, 4 Cores, 15GB memory)
      High-Memory
      Extra Large (6.5 ECUs, 2 Cores, 17.1GB memory)
      Double Extra Large (13 ECUs, 4 Cores, 34.2GB memory)
      Quadruple Extra Large (26 ECUs, 8 Cores, 68.4GB memory)
      High-CPU
      Medium (5 ECUs, 2 Cores, 1.7GB memory)
      Extra Large (20 ECUs, 8 Cores, 7GB memory)
    • 61. Cloud Computing Environment from AWS (contd)
      Database Support
      S3
      Bucket storage
      Relational Database Service (RDS)
      Scalable SQL database
      Elastic Block Store (EBS)
      Disk partition (< 1TB)
      Supported environment
      Operating System
      Linux (e.g. Fedora, Ubuntu & Debian)
      Windows (e.g. Windows 7 & Windows Server 2008)
      Other licensed environment
      IBM WebSphere
      Application Server
      sMash
      Portal Server
      Oracle Database
      Oracle Enterprise Linux
    • 62. Cloud Computing Environment from GAE
      Cloud hosting environment for web applications
       GAE is a PaaS offering
      Automatic Scaling and load balancing
      Hardware specification is unknown
      No notion of geographical regions
      Database Support
      BigTable
      Other Support
      Google Documents
      Google Calendar
      Upcoming Products
      AppEngine for Business
      SLA and SQL support
      Data store (bucket storage)
      Used in conjunction with Prediction and BigQuery API
      Prediction and BigQuery API
      Analytics support
    • 63. Cloud Computing Environment from Azure
      Windows Azure has 3 main components: Compute, Storage and Fabric
      Compute is based on Web Roles and Worker Roles
      Storage provides scalable storage services (see below)
      Azure is a PaaS offering
      Instance sizes
      Small (1 CPU, 1.75GB memory)
      Medium (2 CPUs, 3.5GB memory)
      Large (4CPUs, 7 GB memory)
      Extra Large (8CPUs, 14GB memory)
      Storage support
      Types of storage
      Blob
      Queue
      Table
      Drive
    • 64. Details of Storage Offerings
    • 65. Storage Offerings from AWS
      AWS S3
      Stores blobs (up to 5GB per blob)
      Access via REST/SOAP; a sneakernet option (i.e., shipping physical disks) is offered
      AWS EBS
      Network attached disk storage
      Used as external HDDs for EC2 instances (up to 1TB per volume)
      No direct access from the outside
      Allow for creating point-in-time snapshots of volumes in S3
      High performance
      It’s reported that sequential access is faster than 70MB/sec (≈0.56Gbps)
      Allow disk striping by attaching multiple volumes to an EC2 instance
    • 66. RDB Offerings from AWS
      Amazon RDS
      An EC2 instance with pre-installed MySQL 5.1 (Up to 1TB storage)
      Automatic patching
      Automated transaction log backups (up to the last eight days) and user-initiated DB snapshots
      Replication between multiple Availability Zones
      Other relational databases offered on AWS (via EC2 AMIs)
      IBM DB2 9.5, Informix Dynamic Server
      Oracle 11g, 10g
      SQL Server Express, 2005
      Sybase SQL Anywhere 11
      Postgres Plus
      Vertica Analytic Database
    • 67. Storage Offerings from Azure
      Windows Azure Blob
      Stores blobs (up to 1TB per blob)
      Read/write a blob in 4MB pieces
      Access via REST/SOAP/ADO.NET
      Windows Azure Drive
      NTFS volume on Azure Blob accessed from Azure instances
      Azure SQL
      Supports a subset of Transact-SQL (the full language is supported by SQL Server); up to 50GB per database
      Automatic patching
      Automatic high availability (no details are available)
      SQL Azure Data Sync is offered to sync on-premise DB and Azure SQL
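The "4MB piece by piece" constraint above can be sketched as follows. This is a minimal illustration, not Azure's real client API: `put_block` is a placeholder for the real upload call, and only the chunking logic is shown.

```python
# Minimal sketch (not Azure's real API) of writing a blob in 4MB pieces:
# the client splits the payload into fixed-size blocks and uploads them
# one at a time. put_block is a placeholder for the real upload call.

BLOCK_SIZE = 4 * 1024 * 1024  # Azure Blob accepts writes in 4MB pieces

def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Split a byte payload into consecutive blocks of at most block_size."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def upload_blob(data, put_block):
    """Upload each piece in order via the caller-supplied put_block(index, chunk)."""
    for index, chunk in enumerate(split_into_blocks(data)):
        put_block(index, chunk)
```

A 9MB payload would be sent as two 4MB blocks followed by a 1MB block.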
    • 68. Storage Offerings from GAE
      GAE Blobstore
      Stores blobs (up to 2GB per blob)
      Read/write a blob in 1MB pieces
      Access via HTTP, with no access control
      GAE Datastore
      Supports a SQL-like query language and JDO (no storage size limit?)
      Services on the way
      Google Storage for Developers
      Extended version of GAE Blobstore
      Store blobs (100GB per blob), REST interface, fine access control
      BigQuery
      Analyze massive data in Google Storage using SQL-like language
      A query against 60TB of data takes less than 1 min
    • 69. Comparison of Storages
    • 70. Comparison of Storages (con’t)
    • 71. Cost Example
      Scenario: 400GB of existing data. Transfer 7GB/day of log data into the cloud and add 0.5GB/day to storage. Read 0.1GB/day from storage. 1M requests/day on storage. What is the cost for one year?
      AWS S3
      (400*12 + Σ(i=1..365) 0.5*i/30)*0.15 + (0.1*7 + 0.15*0.1)*365 + 0.1*365 ≈ $1,185
      Amazon RDS
      (400*12 + Σ(i=1..365) 0.5*i/30)*0.1 + (0.1*7 + 0.15*0.1)*365 + 0.1*365 ≈ $889 + CPU fees (min $1,000/year)
      Azure Blob
      (400*12 + Σ(i=1..365) 0.5*i/30)*0.15 + (0.1*7 + 0.15*0.1)*365 + 1*365 ≈ $1,513
      Azure SQL
      (400*12 + Σ(i=1..365) 0.5*i/30)*10 + (0.1*7 + 0.15*0.1)*365 ≈ $59,427
      Azure SQL is quite expensive as a data store, but cheap as a small-to-mid-scale, high-performance and reliable SQL server
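The AWS S3 line of this cost example can be reproduced programmatically. A sketch, using the circa-2011 rates implied by the slide's formula ($0.15/GB-month storage, $0.10/GB transfer in, $0.15/GB transfer out, $0.10 per million requests); the function name is illustrative.

```python
# Reproduces the AWS S3 line of the cost example above, using the rates
# implied by the slide's formula (circa 2011): $0.15/GB-month storage,
# $0.10/GB transfer in, $0.15/GB transfer out, $0.10 per million requests.

def s3_yearly_cost(initial_gb, growth_gb_per_day, in_gb_per_day,
                   out_gb_per_day, million_requests_per_day):
    # GB-months stored: the initial data for 12 months, plus the growing
    # part (on day i the added data has accumulated to growth*i GB, and
    # each day contributes 1/30 of a month of storage)
    gb_months = initial_gb * 12 + sum(growth_gb_per_day * i / 30
                                      for i in range(1, 366))
    storage = gb_months * 0.15
    transfer = (in_gb_per_day * 0.10 + out_gb_per_day * 0.15) * 365
    requests = million_requests_per_day * 0.10 * 365
    return storage + transfer + requests
```

With the scenario's numbers (400GB initial, 0.5GB/day growth, 7GB/day in, 0.1GB/day out, 1M requests/day) this evaluates to roughly the $1,185 shown above.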
    • 72. Cost Example 2
      Scenario: 5GB of existing data. Transfer 0.01GB/day of log data into the cloud and add 0.01GB/day to storage. Read 0.1GB/day from storage. 0.1M requests/day on storage. What is the cost for one year?
      AWS S3
      (5*12 + Σ(i=1..365) 0.01*i/30)*0.15 + (0.01*7 + 0.15*0.1)*365 + 0.1*0.1*365 ≈ $235
      Amazon RDS
      (5*12 + Σ(i=1..365) 0.01*i/30)*0.1 + (0.01*7 + 0.15*0.1)*365 + 0.1*0.1*365 ≈ $167 + CPU fees (min $1,000/year)
      Azure Blob
      (5*12 + Σ(i=1..365) 0.01*i/30)*0.15 + (0.01*7 + 0.15*0.1)*365 + 0.1*1*365 ≈ $268
      Azure SQL
      (5*12 + Σ(i=1..365) 0.01*i/30)*10 + (0.01*7 + 0.15*0.1)*365 ≈ $853
    • 73. Security Support
      AWS
      Firewall support to control network access to and from instances
      Amazon Virtual Private Cloud
      Isolates instances by IP range
      Connects to existing private infrastructure via encrypted IPsec VPN
      Charged based on number of VPN connections and duration, as well as data transfer through VPN connection
      S3
      Bucket policies
      Access Control List (ACL)
      Query string authentication
      GAE
      Google Secure Data Connector
      Encrypted Connection from Google Apps to internal applications behind firewall
      Filters traffic by users and applications
      OAuth
      Denial of Service (DoS) protection
      Blacklist IP addresses or subnets
      Impose limits
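The S3 "query string authentication" option above can be sketched with a pre-signed URL under AWS Signature Version 2, the scheme current when these slides were written. A sketch only: the bucket, key and credentials are placeholders, and real requests would need a valid key pair.

```python
import base64
import hashlib
import hmac
import urllib.parse

# Sketch of S3 query string authentication (pre-signed GET URLs under
# AWS Signature Version 2). Bucket, key and credentials are placeholders.

def presign_get_url(bucket, key, access_key, secret_key, expires_epoch):
    # StringToSign for a simple GET: verb, empty Content-MD5/Content-Type,
    # expiry time, and the canonicalized resource
    string_to_sign = "GET\n\n\n{}\n/{}/{}".format(expires_epoch, bucket, key)
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    signature = urllib.parse.quote(base64.b64encode(digest).decode(), safe="")
    return ("https://{}.s3.amazonaws.com/{}?AWSAccessKeyId={}&Expires={}"
            "&Signature={}").format(bucket, key, access_key,
                                    expires_epoch, signature)
```

Anyone holding the resulting URL can GET the object until the expiry time, without needing AWS credentials of their own.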
    • 74. Security Support (contd)
      Azure
      AppFabric Service Bus
      Connects Azure applications and databases to internal infrastructure
      AppFabric Access Control
      Provides federated authorisation to applications and servers
    • 75. Elastic Compute Capability
      Elasticity is the defining characteristic of cloud computing
      The aim is to allocate sufficient resources to do the job, but not so much that resources are wasted
      There are broadly two architectures that achieve elastic compute capability
      Push architecture
      Pull architecture
    • 76. Elastic Compute Capability Reference Architecture –Push Architecture
      The Push architecture is typically used for web applications
      The web browser (client) sends a request to the web application
      The load balancer receives the request and “pushes” it to one of the web servers running on a compute node
      Requests are forwarded immediately (or at a certain rate)
      Load balancer is aware of the intensity of the workload
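The push pattern above can be sketched as follows. This is a minimal illustration: round-robin stands in for whatever workload-aware policy a real load balancer uses, and the server names are made up.

```python
import itertools

# Minimal sketch of the push pattern: the load balancer receives each
# request and immediately "pushes" it to one of the web servers. Round-
# robin stands in for a workload-aware policy; server names are invented.

class PushLoadBalancer:
    def __init__(self, servers):
        self._next_server = itertools.cycle(servers)

    def handle(self, request):
        server = next(self._next_server)  # choose a compute node
        return server, request            # forward the request immediately
```

Because the balancer sees every request, it naturally tracks workload intensity and can trigger scaling decisions.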
    • 77. Elastic Compute Capability Reference Architecture –Push Architecture
      Fig 1. Push Architecture Pattern
    • 78. Elastic Compute Capability Reference Architecture –Pull Architecture
      The Pull architecture is often seen as an application-level architecture
      Also known as the Producer-Consumer design pattern
      Requests are sent to a queue
      In contrast to the Push architecture, requests are not forwarded immediately (hence it is less suitable for web applications)
      Compute nodes poll the queue periodically for jobs
      Requests are processed one at a time
      Polling frequently can induce overhead
      Easier to implement fail-safe mechanism
      Compute nodes need NOT inform the queue in case of failure
      A typical fail-safe mechanism involves a queue (e.g., AWS SQS or Azure Queue) that attaches a lock with a timer to each message. A message is locked when polled by a node; in case of a node failure, the lock expires and the message returns to the queue.
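The lock-and-timer fail-safe can be sketched as follows. Names are illustrative, not SQS's or Azure Queue's real API: polling hides ("locks") a message for a timeout, and if the worker never deletes it the message becomes visible again.

```python
import threading
import time

# Sketch of the pull architecture's fail-safe queue (illustrative, not a
# real SQS/Azure Queue API): poll() locks a message for lock_timeout
# seconds; if the worker fails to delete it in time, it reappears.

class LockingQueue:
    def __init__(self, lock_timeout=30.0):
        self._lock_timeout = lock_timeout
        self._mutex = threading.Lock()
        self._messages = []  # [message, visible_from_timestamp] pairs

    def put(self, message):
        with self._mutex:
            self._messages.append([message, 0.0])

    def poll(self):
        """Return a visible message and hide (lock) it for lock_timeout seconds."""
        now = time.time()
        with self._mutex:
            for entry in self._messages:
                if entry[1] <= now:
                    entry[1] = now + self._lock_timeout
                    return entry[0]
        return None  # nothing visible right now

    def delete(self, message):
        """Called by the worker only after processing succeeds."""
        with self._mutex:
            self._messages = [e for e in self._messages if e[0] != message]
```

A worker that crashes between `poll()` and `delete()` therefore does not lose the job; it simply reappears for another worker once the lock expires.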
    • 79. Elastic Compute Capability Reference Architecture
      Fig 2. Pull Architecture Pattern
    • 80. Using Cloud for Business Continuity
      Two main usages of cloud for Business Continuity:
      Provides highly available systems for day-to-day business
      Serves as a technology platform to implement disaster recovery
      Some definitions:
      Business Continuity: “Activity performed by an organisation to ensure that critical business functions will be available to customers, suppliers, regulators and other entities…”
      Disaster Recovery: “A small subset of business continuity. The process, policies and procedures related to preparing for recovery or continuation of technology infrastructure critical to an organisation after a natural or human-induced disaster”
      Fault Tolerance: “The property that enables a system to continue operating properly, possibly at a reduced quality level…”
    • 81. Building Highly Reliable Systems with Cloud
      Must address potential failures at two levels:
      Hardware/Infrastructure
      Prevent single points of failure (SPOF) by adding redundancy in all hardware components (i.e., redundant disks, network devices, power supplies, etc.)
      NOT all cloud providers provide enterprise grade availability. Check your SLA!!
      Application
      Prepare a fail-over system to take over in case of a failure
      Database replication to minimise downtime and data loss
      Replicate to geographically different location (e.g., to avoid natural disasters such as floods)
    • 82. Case Study: Building Reliable System using EC2
      The highly replicated architecture of clouds makes them a strong foundation for business continuity solutions
      Their globally distributed nature further enhances the disaster recovery capability of clouds
      Availability limitations mean you need to be realistic about Hot vs Warm vs Cold standby options
    • 83. Case Study: Building Reliable System using EC2 (Contd)
      Data backup in AWS
      Amazon S3 is best for off-site data backup
      Stores large binary files
      Designed to provide 99.999999999% durability
      Objects are redundantly stored in multiple facilities in a Region
      Back up using EBS
      Uses a regular file system
      Takes image (or snapshot) of the partition
      VM Import
      Allows for easy replication from on-premise to cloud
      Replicating configuration such as network settings and disk drives is not trivial
    • 84. 10 Things You Didn’t Know About Cloud Platforms: Azure, GAE and AWS
      Dr. Anna Liu, Dr. Hiroshi Wada, Kevin Lee
      National ICT Australia
    • 85. The 10 Things are...
      How long does it take for data in cloud to become consistent
      Limitation and quotas
      How unpredictable/variable is the cloud?
      Distributed transaction support in Cloud
      Pricing variations over time and space
      Sticky session support
      The new matrix of roles and responsibilities for cloud providers, consumers and system integrators
      Secure connections to the cloud
      Time to getting a new instance
      Auto-scaling is not all magic
    • 86. The 10 Things are...
      How long does it take for data in cloud to become consistent
      Limitation and quotas
      How unpredictable/variable is the cloud?
      Distributed transaction support in Cloud
      Pricing variations over time and space
      Sticky session support
      The new matrix of roles and responsibilities for cloud providers, consumers and system integrators
      Secure connections to the cloud
      Time to getting a new instance
      Auto-scaling is not all magic
    • 87. The Reality of Eventual Consistency in Amazon SimpleDB
      The probability of reading updated data in SimpleDB in US West
      An application reads data X ms after it has written the data
      SimpleDB has two read operations: eventually consistent read and consistent read
      This pattern is consistent regardless of the time of day
    • Consistent vs. Eventually Consistent Read
      SimpleDB’s consistent read guarantees reading updated data
      What is the cost you need to pay for consistency?
      RTT is the same as that of an eventually consistent read
      Monetary cost (usage fee) is exactly the same as an eventually consistent read
      → The trade-off is not clear! We suspect consistent reads are less scalable and slower under datacenter failures; however, we have not observed any differences
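The measurement behind these results follows a simple loop: write a value, read it back X ms later, and record whether the read returned the update. The real experiments ran against SimpleDB; in this sketch a toy store whose staleness decays linearly over time stands in, so the code is runnable on its own.

```python
import random

# Sketch of the consistency experiment: write, wait elapsed_ms, read, and
# record whether the latest write was observed. A toy store with linearly
# decaying staleness stands in for SimpleDB so the sketch is self-contained.

class ToyEventuallyConsistentStore:
    def __init__(self, propagation_ms=500):
        self.propagation_ms = propagation_ms
        self.old = None
        self.new = None

    def write(self, value):
        self.old, self.new = self.new, value

    def read(self, elapsed_ms):
        # Probability of seeing the update grows until fully propagated
        p_fresh = min(1.0, elapsed_ms / self.propagation_ms)
        return self.new if random.random() < p_fresh else self.old

def fresh_read_ratio(store, elapsed_ms, trials=2000):
    """Fraction of reads that observe the latest write after elapsed_ms."""
    hits = 0
    for i in range(trials):
        store.write(i)
        if store.read(elapsed_ms) == i:
            hits += 1
    return hits / trials
```

Plotting `fresh_read_ratio` over a range of delays yields a curve of the shape measured against SimpleDB: low probability for short delays, approaching 100% once propagation completes.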
    • 91. Other Commercial NoSQL Databases
      Google App Engine
      Offers both eventually consistent and consistent reads
      The behaviour of its eventually consistent read is completely different from Amazon’s
      In GAE, both types of reads behave exactly the same unless a data center has a failure
      Windows Azure
      Offers no options for read
      Always consistent
      Reference: H Wada, A Fekete, L Zhao, K Lee, A Liu, “Data Consistency Properties and the Trade-offs in Commercial Cloud Storage: The Consumers’ Perspective”, CIDR 2011. http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper15.pdf
    • 92. The 10 Things are...
      How long does it take for data in cloud to become consistent
      Limitation and quotas
      How unpredictable/variable is the cloud?
      Distributed transaction support in Cloud
      Pricing variations over time and space
      Sticky session support
      The new matrix of roles and responsibilities for cloud providers, consumers and system integrators
      Secure connections to the cloud
      Time to getting a new instance
      Auto-scaling is not all magic
    • 93. Limitations and Quotas
    • 94. The 10 Things are...
      How long does it take for data in cloud to become consistent
      Limitation and quotas
      How unpredictable/variable is the cloud?
      Distributed transaction support in Cloud
      Pricing variations over time and space
      Sticky session support
      The new matrix of roles and responsibilities for cloud providers, consumers and system integrators
      Secure connections to the cloud
      Time to getting a new instance
      Auto-scaling is not all magic
    • 95. Performance Unpredictability in Cloud
      Performance unpredictability is one of the major obstacles to cloud adoption
      Performance variance of a MapReduce job for a 50-node EC2 cluster and a 50-node local cluster
      Examples (time as performance metric)
      Repeatability of results for researchers
      Time critical tasks for enterprises
    • 96. Benchmark Details
    • 97. Benchmark Results in EC2
      The COV (coefficient of variation) of the large instance is higher than that of the small; however, both are at least an order of magnitude less stable than on a physical cluster.
      The COV of S3 access may be influenced by other traffic on the network; this experiment is shown just for completeness.
      Reference: Schad, Jörg, Jens Dittrich, and Jorge-Arnulfo Quiané-Ruiz. 2010. “Runtime Measurements in the Cloud: Observing, Analyzing, and Reducing Variance.” In Proceedings of the 36th International Conference on Very Large Data Bases (VLDB). Singapore: VLDB Endowment.
    • 98. The 10 Things are...
      How long does it take for data in cloud to become consistent
      Limitation and quotas
      How unpredictable/variable is the cloud?
      Distributed transaction support in Cloud
      Pricing variations over time and space
      Sticky session support
      The new matrix of roles and responsibilities for cloud providers, consumers and system integrators
      Secure connections to the cloud
      Time to getting a new instance
      Auto-scaling is not all magic
    • 99. Distributed Transactions in Cloud
      There is now a range of Cloud Database types
      NOSQL (Azure Table, GAE Datastore, Amazon SimpleDB...)
      Much more ‘shardable’ architecture; no joins, no full ACID support
      SQL (Azure SQL, Amazon RDS, Oracle on EC2...)
      Variable distributed transaction support compared to their traditional RDBMS counterparts
      Experience with porting PetShop
      Challenges with porting the data access layer
      Some JDO interfaces are not supported by App Engine, e.g. join queries
      No distributed transaction support in Azure SQL at the moment
    • 100. The 10 Things are...
      How long does it take for data in cloud to become consistent
      Limitation and quotas
      How unpredictable/variable is the cloud?
      Distributed transaction support in Cloud
      Pricing variations over time and space
      Sticky session support
      The new matrix of roles and responsibilities for cloud providers, consumers and system integrators
      Secure connections to the cloud
      Time to getting a new instance
      Auto-scaling is not all magic
    • 101. Pricing fluctuates over space and time
      On demand pricing (hourly, per GB, per ‘000 requests)
      Reserved instances (1 or 3 year term + unit cost)
      Spot pricing (typically cheaper in US-East!)
      Similar pricing schemes observed for GAE and Azure
    • 102. The 10 Things are...
      How long does it take for data in cloud to become consistent
      Limitation and quotas
      How unpredictable/variable is the cloud?
      Distributed transaction support in Cloud
      Pricing variations over time and space
      Sticky session support
      The new matrix of roles and responsibilities for cloud providers, consumers and system integrators
      Secure connections to the cloud
      Time to getting a new instance
      Auto-scaling is not all magic
    • 103. Sticky Session Support
      Autoscaling alone does not guarantee that clients of the same session will always contact the same instance
      Clients cannot perform a series of connected operations
      Amazon ELB supports Session Affinity
      Session affinity allows a session-to-instance mapping to be created at the ELB
      Limitations
      Session affinity cannot handle HTTPS
      Autoscaling down an instance with a live session
      MS Azure advocates stateless sessions
      If you must keep state, store session state externally, e.g. in Table storage
      Design issue: should the server remember the conversation context, or should the client remind it every time? How long should it “stick”? Too long compromises the server’s ability to distribute load
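The sticky-session behaviour and its scale-down limitation can be sketched as follows. A minimal illustration, not any real balancer's API: the balancer remembers which instance served each session and keeps routing the session there, and removing that instance strands its live sessions.

```python
# Sketch of session affinity (illustrative, not a real ELB API): the
# balancer remembers which instance served each session; scaling that
# instance down strands the session's server-side state.

class AffinityBalancer:
    def __init__(self, instances):
        self.instances = list(instances)
        self.affinity = {}  # session id -> instance

    def route(self, session_id):
        instance = self.affinity.get(session_id)
        if instance not in self.instances:  # new session, or its instance is gone
            instance = self.instances[hash(session_id) % len(self.instances)]
            self.affinity[session_id] = instance
        return instance

    def scale_down(self, instance):
        # Any live session stuck to this instance loses its server-side state
        self.instances.remove(instance)
```

This also shows why long stickiness hurts load distribution: every mapped session keeps returning to its original instance regardless of how the rest of the fleet is loaded.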
    • 104. The 10 Things are...
      How long does it take for data in cloud to become consistent
      Limitation and quotas
      How unpredictable/variable is the cloud?
      Distributed transaction support in Cloud
      Pricing variations over time and space
      Sticky session support
      The new matrix of roles and responsibilities for cloud providers, consumers and system integrators
      Secure connections to the cloud
      Time to getting a new instance
      Auto-scaling is not all magic
    • 105. Customers’ Responsibility in IaaS Cloud
      Customers’ responsibility:
      Application patching
      App data backup
      Application monitoring
      Application installation/configuration
      OS/application security (e.g., Active Directory)
      Billing (cost center charging)
      Antivirus
      OS backup
      OS monitoring
      OS patching
      OS/middleware installation/configuration
      IaaS provider’s (Amazon EC2) responsibility:
      Infrastructure configuration (VPN, VMs, disk, …)
      Access control to IaaS
      Infrastructure monitoring (CPU, disk, net, …)
      Usage report and basic billing
    • 106. The 10 Things are...
      How long does it take for data in cloud to become consistent
      Limitation and quotas
      How unpredictable/variable is the cloud?
      Distributed transaction support in Cloud
      Pricing variations over time and space
      Sticky session support
      The new matrix of roles and responsibilities for cloud providers, consumers and system integrators
      Secure connections to the cloud
      Time to getting a new instance
      Auto-scaling is not all magic
    • 107. Secure Connection to the Cloud
    • 108. Performance Implications
      Low security option: max throughput 5.6MB/sec
      High security option: connection throughput 4MB/sec
      The performance hit is due to encryption, decryption and the firewall
      Other interesting observations:
      VPC is only available in US East-1 and EU West-1
      and in a single availability zone only
      S3 does not work well with VPC yet (very slow); EBS is a workaround
      MS Azure: VPN support is expected next year
      Google Secure Connector
    • 109. The 10 Things are...
      How long does it take for data in cloud to become consistent
      Limitation and quotas
      How unpredictable/variable is the cloud?
      Distributed transaction support in Cloud
      Pricing variations over time and space
      Sticky session support
      The new matrix of roles and responsibilities for cloud providers, consumers and system integrators
      Secure connections to the cloud
      Time to getting a new instance
      Auto-scaling is not all magic
    • 110. Time to Getting a New Instance
      It typically takes minutes to create an instance from its image on EC2
      A trick to “create” instances more quickly:
      Create a pool of instances in advance, and stop (hibernate) them all
      Pay no instance cost, but pay for storage (for the stopped instances)
      Revive stopped instances when new instances are needed
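The stopped-instance pool trick can be sketched as follows. The `Instance` and `WarmPool` classes are stand-ins for real EC2 API calls: stopped instances incur only storage cost, and "starting" one is much faster than booting a fresh image.

```python
# Sketch of a pool of pre-created, hibernated instances (stand-in for
# real EC2 calls): acquiring revives a stopped instance rather than
# launching a new one from an image.

class Instance:
    def __init__(self, instance_id):
        self.instance_id = instance_id
        self.state = "stopped"  # stopped instances incur storage cost only

    def start(self):
        self.state = "running"  # reviving beats a cold boot from an image

class WarmPool:
    def __init__(self, size):
        # Pre-create the whole pool up front, hibernated
        self.stopped = [Instance("i-%03d" % n) for n in range(size)]

    def acquire(self):
        """Revive a stopped instance instead of launching from an image."""
        if not self.stopped:
            raise RuntimeError("pool exhausted; fall back to a cold launch")
        instance = self.stopped.pop()
        instance.start()
        return instance
```

The trade-off is explicit in the sketch: the pool converts minutes of launch latency into an ongoing storage fee, and an exhausted pool still forces a slow cold launch.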
    • 111. The 10 Things are...
      How long does it take for data in cloud to become consistent
      Limitation and quotas
      How unpredictable/variable is the cloud?
      Distributed transaction support in Cloud
      Pricing variations over time and space
      Sticky session support
      The new matrix of roles and responsibilities for cloud providers, consumers and system integrators
      Secure connections to the cloud
      Time to getting a new instance
      Auto-scaling is not all magic
    • 112. Autoscaling is Not All Magic
      Amazon EC2
      “… your application can automatically scale itself up and down depending on its needs.”
      Windows Azure
      “Optimized for scale-out applications: designed so that developers can easily build scale-out applications…”
      Google App Engine
      “No matter how many users you have or how much data your application stores, App Engine can scale to meet your needs”
    • 113. Autoscaling is Not All Magic (contd)
    • 114. The 10 Things are...
      How long does it take for data in cloud to become consistent
      Limitation and quotas
      How unpredictable/variable is the cloud?
      Distributed transaction support in Cloud
      Pricing variations over time and space
      Sticky session support
      The new matrix of roles and responsibilities for cloud providers, consumers and system integrators
      Secure connections to the cloud
      Time to getting a new instance
      Auto-scaling is not all magic
    • 115. Additional Slides
    • 116. Virtual Machine ‘Stolen Time’
      Using traditional system resource monitoring tools in cloud
      Measuring system performance within a virtual instance (using tools such as vmstat and top) can give misleading information
      Example: an EC2 instance (e.g. m1.small with 1 EC2 Compute Unit) does not go above around 40% CPU load as observed from vmstat
      A certain percentage (around 50-60%) appears in vmstat as ‘st’
      “st – Time stolen from a virtual machine” (from vmstat manpage)
      Does it mean I am not getting what I paid for? No, not really
      Amazon instances are measured by EC2 compute units
      “One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2GHz 2007 Opteron or 2007 Xeon processor”
      Monitoring system performance in cloud
      Use Cloud monitoring tools such as CloudWatch and RightScale
    • 117. Limitation of Virtual Private Cloud (VPC)
      VPC hosts are logically detached from (but physically attached to) the Amazon network
      No direct connection to and from S3 via the Amazon local network
      Connection via internet only
      What happens if we need to transfer data from S3 to a VPC host?
      E.g. if we ship removable media to Amazon, the data is uploaded to S3. How do we then transfer it to a VPC host?
      Option 1: Direct transfer from S3 to VPC host
      Traffic routes through the remote side and comes back (High latency)
      Option 2: Transfer to EBS and mount EBS to VPC host
      Traffic routes through local network (Low latency)
    • 118. How Long Do You Need to Wait for Updated Data with Eventually Consistent Reads?
      Result of the “5 minutes run” for one week
      t1: the first time updated data is read
      t2: the first time 100% of reads return updated data
      t3: the last time stale data is read
      → Mostly updated after 600ms, but there is no guarantee
    • 121. Let’s Switch Gear… what’s happening in industry?
    • 122. Australian Cloud Adoption
      Software as a service
      Enterprise and SME
      Productivity suites, CRM
      Telco and SaaS vendor partnerships
      Emerging tier-2 system integrators
      Platform and Infrastructure as a Service
      SME, startups well on their way
      Enterprise doing evaluation
      Government Cloud, Community Cloud
      Data centre consolidation
      SOA, shared services
      Financial industry leadership
    • 123. Some Australian Enterprise Proof of Concepts
      Internet scale web applications
      User base from around the world
      Integration with existing web APIs
      Transient campaigns
      Many Mobile devices connecting to cloud
      Good adoption in utilities industries
      Development/Test environment
      Dynamic provisioning of dev/test resources
      Pay for usage
      Bursty workload
      Web apps
      Large scale data analysis
      eScience, Financial risk calculations, Government statistical data
    • 124. An Example POC Experience: Detailed Findings
    • 125. Proof of Concept Overview
      Objective
      reduce IT cost
      evaluate cloud opportunity and risks
      Test and Dev environment, as opposed to production
      Maximise re-applicability of learning experience across other apps
      Evaluation dimensions
      Performance, security, feasibility
      cost and license, flexibility and elasticity
      integration with existing environment, migration effort
      disaster recovery and backup, new roles and responsibilities

    • 126. Solution Design Rationale
      POC Solution Design Rationale
      Standard 3 tier web application, with backend and authentication server integration
      Location of data tier
      Keep the dev/test configuration as common as possible
      PaaS or IaaS
      Selection of cloud platform for POC
      Project Management
      Governance: CIO/Director level sponsorship
      Project participants: enterprise architect, solution developer, security specialist, commercial specialist
      NICTA: cloud computing experience and evaluation framework
      2 wks POC selection; 6 wks POC; 2 wks consolidate findings
    • 127. Architecture of a Hybrid Dev Environment
      Figure: the NICTA corporate network (enterprise data store, authentication server, business web application on on-premise servers) connects over the Internet via IPsec VPN (approx. 230ms RTT) to an isolated network in the Amazon cloud (US-East datacenter), where virtual machines run in a private cloud accessible only from NICTA. Developers remote-desktop to XX.XX.0.* (no direct access to the Amazon VPC).
    • 128. Security
      ‘Secure integration to the cloud’ solutions are emerging
      Amazon VPC, Google Secure Data Connector, Azure AppFabric, etc.
      A standard IPsec VPN brings peace of mind to enterprise users
      One of the key enablers for enterprise use
      Fits into an existing security policy
      Data masking could increase the cost/effort
      An automated method is necessary for further cost/effort reduction
      Secure Software Development Lifecycle
      Process change required
    • 129. Performance
      The performance of each component (network, VMs, …) in cloud is comparable to or better than current on-premise components
      Suitable for dev/test environments; but is it suitable for production systems?
      Do not underestimate the latency in hybrid environments
      Many traditional applications and protocols are not optimised for a high-latency/WAN environment
      E.g., a protocol may be too “chatty”; we observed that network usage never exceeded 0.1% in some cases
      There are performance improvement opportunities
      Alternative solution design, Configuration and tuning
    • 130. Cost
      Many companies use ‘private cloud’; however, current offerings are seen as more expensive and less flexible
      Increasingly, pay-as-you-go options are available
      Unit price is typically ~100 times higher for storage
      SLA & management services are usually included
      The cost of keeping data/VMs is larger
      Current cost varies depending on the SLA tier of service
    • 131. Commercial Implications
      Software Licensing in the cloud?
      Reuse enterprise license
      Pay for usage software license model
      Payment model?
      enterprise governance model
      Metering and chargeback
      Service level agreement?
      Monitoring and management
      Contracts
      Backup, disaster recovery
      New roles and responsibilities?
      Existing IT outsourcing arrangements
    • 132. POC Experience Summary
      Cloud Computing has the potential to reduce existing enterprise IT cost
      There are technical solutions for managing performance and security risks
      A fresh approach is needed to manage:
      Enterprise architecture and governance
      Commercial implications such as SLA, new roles and responsibility
    • 133. Other Global Challenges
      Policy and Procedure
      Procurement strategy?
      Pricing strategy?
      Governance and Control
      Financial control vs shared model
      Taxation and legal
      Federal and state based taxation, sales and payroll tax
      Compliance and assessment
    • 134. Other Challenges Australians Face
      The Tyranny of Distance
      Latency: ~250ms to Singapore, ~220ms to the US west coast, ~500-600ms to the US east coast and Europe
      No business case for an Australian Data centre
      22 mil population, 12 mil internet users
      National Broadband Network
      The rise of oz cloud innovations
      Strong Privacy Laws
      Federal Privacy Act
      APRA – Australian Prudential Regulation Authority
      EU Safe Harbour ≠ Australian Safe Harbour
    • 135. Agenda
      Introduction to Cloud Computing
      Characteristics, Deployment and Delivery Models
      Enterprise Architecture and Migration Framework
      Usage Scenarios
      Evaluating Cloud Computing
      Enterprise context, Business opportunities, risks
      Technical qualities of platforms
      Platform Architectural Insights
      Proof of Concept Experiences
      Advanced Architecture Issues
      Future Directions
      Industry happenings
      Research Agenda
    • 136. Other Industry Happenings
      Specialist cloud
      New types of System integrators
      Innovative Scenarios
    • 137. Research Agenda
      Enterprise Architecture Framework
      Evaluation, acquisition, effort estimation, project and risk management
      Software Development Lifecycle
      Requirement solicitation for cloud, design for interoperable services, MDA/MDD/DSL, testing at massively parallel scale, cloud design patterns
      Interoperability and Integration
      Hybrid cloud, integration challenges across clouds
      Performance Engineering
      Monitoring and measurement, performance modelling, prediction and analysis, quality of service, SLA and assurance
      Many more…
    • 138. Cost Effort Estimation for Cloud Migration
      Cost implication/estimation for cloud migration is especially challenging because:
      Applications and migration projects vary in size/complexity, functionality, quality requirements, target deployment platforms...
      Cloud computing is new and differs from the traditional software engineering paradigm: different development and deployment models, non-functional characteristics, pricing models...
      Migration effort/cost estimation is not trivial
      Little empirical data on cloud migration exists
      V Tran, K Lee, A Fekete, A Liu, J Keung, “Size Estimation of Cloud Migration Projects with Cloud Migration Point (CMP)”, 5th Intl Symposium on Empirical Software Engineering and Measurement
      V Tran, J Keung, A Liu, A Fekete, “Application Migration to Cloud: A Taxonomy of Critical Factors”, ICSE Software Engineering For Cloud Computing Workshop 2011.
    • 139. Adaptive Cloud Middleware Research
      Evaluating Cloud Performance – Measuring Elasticity
      Achieving Cloudburst – Integrated monitoring and management
      Cloud Data Management – Elastic Data Store
      S Sakr, L Zhao, H Wada, A Liu, “CloudDB AutoAdmin: Towards a Truly Elastic Cloud-Based Data Store”, 9th IEEE Intl Conf on Web Service ICWS 2011.
      S Islam, J Keung, K Lee, A Liu, “An Empirical Study into Adaptive Resource Provisioning in the Cloud”, IEEE Intl Conf on Utility and Cloud Computing UCC2010.
      L Zhao, A Liu, J Keung, “Evaluating Cloud Platform Architecture with the CARE Framework”, APSEC 2010.
      P Brebner, A Liu, “Modeling Cloud Cost and Performance”, Cloud Computing and Virtualisation (CCV 2010)
      H Wada, A Fekete, L Zhao, K Lee, A Liu, “Data Consistency Properties And the Trade-offs in Commercial Cloud Storage: The Consumers’ Perspective”, CiDR 2011.
    • 140. Elasticity Measure
      Elasticity is the defining characteristic of cloud
      Challenge: No existing metrics to measure elasticity
      Not the same as ‘scalability’ or ‘throughput’ measures
      Users care about running cost, agility
      Understanding elasticity
      “the ability of software to meet changing capacity demands, deploying and releasing relevant necessary resources on-demand”
      Varying elasticity behaviour across platforms
      SPEC Standardisation effort
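The idea of an elasticity measure can be made concrete with a minimal sketch. This is not the SPEC metric, and the weights are invented for illustration: penalise under-provisioning heavily (it hurts users) and over-provisioning lightly (it wastes money) across paired demand/allocation time series.

```python
def elasticity_penalty(demand, allocated, under_weight=1.0, over_weight=0.1):
    """Score how well resource allocation tracked demand over time.

    Under-provisioning (demand > allocated) is weighted heavily;
    over-provisioning (allocated > demand) is weighted lightly.
    Lower is better; a perfectly elastic platform scores 0.
    """
    if len(demand) != len(allocated):
        raise ValueError("series must have the same length")
    under = sum(max(d - a, 0) for d, a in zip(demand, allocated))
    over = sum(max(a - d, 0) for d, a in zip(demand, allocated))
    return under_weight * under + over_weight * over

# A platform that lags a demand spike by one measurement interval:
demand = [2, 2, 8, 8, 2, 2]       # instances needed
allocated = [2, 2, 2, 8, 8, 2]    # instances actually provisioned
lag_penalty = elasticity_penalty(demand, allocated)
```

The lag in the example is exactly what an elasticity measure should expose and what plain throughput or scalability numbers do not: the platform eventually scaled, but too late and then too long.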
    • 141. Data Consistency in Cloud
      Inconsistent views of data are common in the cloud
      A consequence of distribution and the support for massive scalability
      Understanding data inconsistency is a new and significant challenge for the software industry
      What are the exact characteristics? When (not) to use them? How to use them?
      • Conducted scientific measurements and theoretical analysis
      • 142. Working on a decision making algorithm involving large number of parameters
      • 143. CiDR 2011 paper for more details
      • 144. H Wada, A Fekete, L Zhao, K Lee, A Liu, “Data Consistency Properties And the Trade-offs in Commercial Cloud Storage: The Consumers’ Perspective”, CiDR 2011.
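To illustrate the kind of measurement involved, here is a minimal sketch (the CiDR 2011 study's actual methodology is more elaborate): after a single write, poll reads and record whether each one saw the new value; the inconsistency window is the time to the last stale read.

```python
def inconsistency_window(write_time, reads):
    """Estimate the eventual-consistency window after a single write.

    `reads` is a list of (timestamp, saw_new_value) pairs observed after
    the write; the window is the delay from the write to the last read
    that still returned the old value.
    """
    stale = [t for t, fresh in reads if not fresh and t >= write_time]
    return (max(stale) - write_time) if stale else 0.0

# Polled reads after a write at t=0: the store converged between 0.3s and 0.5s
reads = [(0.1, False), (0.3, False), (0.5, True), (0.7, True)]
window = inconsistency_window(0.0, reads)
```

Repeating this over many writes yields a distribution of staleness, which is the sort of input a decision-making algorithm over consistency options would consume.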
    • 145. Cloud Data Management
      One of the main goals of the next wave of cloud computing is to facilitate implementing every application as a distributed, scalable and widely-accessible service on the Web.
      Recently, a new generation of low-cost, high-performance database software, known as NoSQL (Not Only SQL), has emerged to challenge the dominance of the RDBMS.
      Examples are BigTable, Dynamo, Cassandra, HBase, HyperTable,…
      The main features of these systems include the ability to scale horizontally, support for weaker consistency models, flexible schemas and data models, and simple low-level query interfaces.
    • 146. Cloud Data Management
    • 147. Cloud Data Management: NoSQL Limitations
      In practice, many obstacles still need to be overcome before these systems can appeal to mainstream enterprises, such as:
      Simplistic programming model: even a simple query requires significant programming expertise.
      Transaction support: limited support (if any) for the transaction notion in NoSQL database systems
      Maturity: NoSQL alternatives are in pre-production versions, with many key features either not stable enough or yet to be implemented.
      Support: small start-ups without the global reach, support resources, or credibility of an Oracle, Microsoft, or IBM.
    • 148. Database-as-a-service (DaaS)
      DaaS is a new paradigm for data management in which a third party service provider hosts a database as a service.
      The service provides data management for its customers and thus alleviates the need for the service user to purchase expensive hardware and software, deal with software upgrades and hire professionals for administrative and maintenance tasks.
      Examples: Amazon RDS, Windows SQL Azure
    • 149. Database-as-a-service (DaaS)
      In general, the service level agreements (SLAs) of cloud database services focus mainly on providing customers with high availability (99.99%) for the hosted databases.
      On the other hand, they provide no guarantee or support for performance and scalability.
      Consumer applications of cloud-based database services must take on additional responsibilities and challenges in order to achieve their performance, scalability and elasticity goals
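The availability figure is the one part of such an SLA that is easy to reason about. A quick sketch of the downtime a 99.99% SLA actually permits (while saying nothing about latency or throughput):

```python
def allowed_downtime_minutes(availability, period_hours):
    """Minutes of downtime an availability SLA permits over a period."""
    return (1.0 - availability) * period_hours * 60

# 99.99% availability over a 30-day month: about 4.3 minutes of downtime
monthly = allowed_downtime_minutes(0.9999, 30 * 24)
# ...and about 52.6 minutes over a whole year
yearly = allowed_downtime_minutes(0.9999, 365 * 24)
```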
    • 150. DaaS: Challenges
      Handling the Performance and Cost Aspects of Application-Defined SLAs
      Data Spike
      Distributed Transactions
      Geo-Distributed User and Geo-Replicated Databases
    • 151. DaaS: Challenges
    • 152. DaaS: Challenges
    • 153. Our Solution: CloudDB AutoAdmin
    • 154. CloudDB AutoAdmin: Goals
      Declarative Specification of Replication Management Strategies
      Declarative Specification of Data Partitioning and Re-distribution
      Declarative Specification of Consistency Management
      Logging and Monitoring
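The slides do not show CloudDB AutoAdmin's specification language itself, so the following is a purely hypothetical illustration of what a declarative policy plus a validator for it might look like; every field name here is invented, not the project's actual syntax.

```python
# Hypothetical declarative policy in the spirit of the goals listed above;
# field names are illustrative only.
POLICY = {
    "replication": {"replicas": 3, "placement": ["us-east", "eu-west"]},
    "partitioning": {"key": "customer_id", "strategy": "hash"},
    "consistency": {"reads": "session", "writes": "quorum"},
    "monitoring": {"log_slow_queries_ms": 200},
}

VALID_READ_LEVELS = {"eventual", "session", "strong"}

def validate_policy(policy):
    """Reject obviously inconsistent declarative specifications."""
    errors = []
    if policy["replication"]["replicas"] < 1:
        errors.append("need at least one replica")
    if policy["consistency"]["reads"] not in VALID_READ_LEVELS:
        errors.append("unknown read consistency level")
    return errors
```

The point of a declarative style is exactly this separation: the consumer states replication, partitioning and consistency intent once, and the middleware checks and enacts it.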
    • 155. Our Solution: CloudDB AutoAdmin Architecture
    • 156. Measuring Elasticity - Cloud Benchmark
      Performance Evaluation and Analysis
      L Zhao, A Liu, J Keung, “Evaluating Cloud Platform Architecture with the CARE Framework”, APSEC 2010.
      Modelling Cost and Performance
      P Brebner, A Liu, “Modeling Cloud Cost and Performance”, Cloud Computing and Virtualisation (CCV 2010)
      Measuring Elasticity, research contribution to SPEC
      Submission to SOCC 2010
    • 157. Storing and Processing Large Datasets
      Scalable Cloud Storage
      Stores billions of records (e.g. user/application profiles and status)
      Partitions automatically to preserve scalability
      Supports structured data such as RDF and OWL (W3C recommendations)
      Retains rich semantic information
      Processing large datasets with parallelised frameworks
      Supports real-time reasoning
      Checks consistency against rules
      Infers implicit knowledge from dataset
      Enables efficient data analytics
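Automatic partitioning of the kind described is commonly done by hashing record keys across shards; a minimal sketch (illustrative, not the actual store's scheme):

```python
import hashlib

def shard_for(key, num_shards):
    """Map a record key to a shard deterministically via hashing."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def partition(records, num_shards):
    """Group records (dicts with an 'id' field) into shards."""
    shards = {i: [] for i in range(num_shards)}
    for rec in records:
        shards[shard_for(rec["id"], num_shards)].append(rec)
    return shards
```

Because the mapping depends only on the key, any node can locate a record without a central directory, which is what preserves scalability as the record count grows into the billions.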
    • 158. Disaster Recovery: Hot, Warm and Cold Standby
      “Always-on” resources carry a running cost in the cloud, so a very hot standby is often not feasible
      Hot Standby (highest cost; downtime: seconds to minutes; automatic failover, minimal data loss)
      Run transactions on multiple sites but use only one
      Mirror data via a dedicated high-speed network (e.g., SANs)
      Warm Standby (downtime: minutes to hours; manual failover, little data loss)
      Regularly back up app/data to a backup site
      Launch systems upon a disaster
      Cold Standby (lowest cost; downtime: hours to days; large data loss)
      Ship backups to an offsite location
      Hardware is not already set up
      Recover systems after the disaster
      In the cloud, the cost of cold and warm standby is comparable (Traditional DR vs Cloud DR)
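Given a recovery-time objective (RTO), choosing among the standby tiers reduces to a lookup over their downtime ranges; a sketch with indicative numbers only (the real trade-off also weighs data loss and operational effort):

```python
# (name, worst-case downtime in seconds, relative running cost);
# the downtime bounds loosely encode the ranges above and are indicative.
TIERS = [
    ("cold", 3 * 24 * 3600, 1),   # hours to days
    ("warm", 4 * 3600, 2),        # minutes to hours
    ("hot", 60, 10),              # seconds to minutes
]

def cheapest_tier(rto_seconds):
    """Cheapest standby tier whose worst-case downtime meets the RTO."""
    candidates = [t for t in TIERS if t[1] <= rto_seconds]
    if not candidates:
        raise ValueError("no tier meets this RTO")
    return min(candidates, key=lambda t: t[2])[0]
```

Since cold and warm standby cost about the same in the cloud, the interesting decision in practice is usually warm versus hot, i.e. paying the "always-on" premium only when the RTO truly demands it.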
    • 165. Automated Business Continuity
      For standard application stacks, automatically builds a backup site in the cloud and keeps it in sync
      Given an application’s architecture/implementation, suggests the best DR solutions
      (Diagram: the in-house site fails over to the cloud backup; stacks are built or picked and configured; application tiers are configured and launched, propagating only changes; data tiers are configured and replicated continuously)
    • 166. 2. Hybrid Cloud Control Centre
      Extensible architectures supporting various plug-ins
      Diagnose and suggest optimal system configurations
      Auto generation of reconfiguration workflows
      • Integrated monitoring across local and remote public clouds
      • 167. Works with existing enterprise monitoring and mgmt tools
      6/24/2011
    • 168. What Is Cloudburst?
      Dynamic reconfiguration of applications to use a public cloud when a private cloud cannot provide enough computing resources
      Scenario: Applications A, B and C run in the private cloud; demand for Application C spikes, but the private cloud has no spare resources
      Cloudburst reconfiguration: rent computing resources in public cloud(s) and replicate Application C there to meet the (short-term) demand
      Caveat: difficult if Application C has a huge amount of data, or sensitive data, to transfer
    • 169. 1: Monitoring Cloud Applications
      Cloud management tools should monitor the performance of cloud(s) and support writing rules that trigger a cloudburst
      Problem: many limitations in existing tools
      Difficult to come up with appropriate rules manually
      Rules do not automatically adapt to changes over time
      No way to ensure the quality of these rules
      Our solutions:
      Generate rules automatically from historical data
      Reconfigure rules automatically over time
      Provide guarantees on the quality of generated rules
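As a small illustration of generating rules from historical data (a far simpler scheme than the guaranteed-quality approach researched here), a scale-out threshold can be derived from a percentile of past utilisation instead of being hand-tuned:

```python
def threshold_rule(history, percentile=0.95, margin=1.1):
    """Derive a scale-out trigger from historical utilisation samples.

    The rule fires when current load exceeds `margin` times the chosen
    percentile of past load, rather than a hand-picked magic number.
    """
    ordered = sorted(history)
    idx = min(int(len(ordered) * percentile), len(ordered) - 1)
    threshold = ordered[idx] * margin
    return lambda current: current > threshold

# CPU% samples from a week of monitoring (made-up numbers)
rule = threshold_rule([30, 35, 40, 42, 45, 50, 55, 60, 65, 70])
```

Re-running the derivation periodically over a sliding window is the simplest form of the second goal above, letting the rule adapt to changing workloads over time.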
    • 170. 2: Determining a Reconfiguration
      Many possible ways to reconfigure applications
      Scale up/down? Scale out/back? Which application components should migrate to a public cloud?
      Problem: difficult to find the best reconfiguration(s) due to conflicting objectives
      The performance and cost of the application after reconfiguration
      The time and cost to reconfigure the application
      Our solutions:
      Analyse trade-offs of possible reconfigurations with respect to performance, cost and time requirements
      Determine series of steps for automatic reconfiguration
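The trade-off analysis across conflicting objectives can be framed as Pareto filtering; a minimal sketch, with option names and numbers invented for illustration:

```python
def pareto_front(options):
    """Keep reconfigurations not dominated on (gain, cost, time).

    `options` maps a name to (performance_gain, cost, time); option A
    dominates option B if A is at least as good on every objective
    (higher gain, lower cost, lower time) and strictly better on one.
    """
    def dominates(a, b):
        ga, ca, ta = options[a]
        gb, cb, tb = options[b]
        no_worse = ga >= gb and ca <= cb and ta <= tb
        better = ga > gb or ca < cb or ta < tb
        return no_worse and better

    return {o for o in options
            if not any(dominates(other, o) for other in options if other != o)}

options = {
    "scale-up": (1.5, 10, 5),            # gain, $/h, minutes to apply
    "scale-out": (2.0, 12, 15),
    "migrate-to-public": (2.0, 15, 60),  # same gain as scale-out, worse elsewhere
}
```

The non-dominated set is what a planner would present; picking one member of it, and the series of steps to reach it, is the remaining (and harder) part of the problem.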
    • 171. 3: Selecting Cloud Technologies and Architectures
      Many cloud technologies and architectures with different characteristics. E.g., for data storage:
      RDB: strong consistency, low scalability
      Distributed RDB + cache: high scalability, high maintenance
      Key-value stores: weak consistency, high scalability
      Problem: difficult to select appropriate technologies and architectures satisfying an application’s requirements
      Consistency level, data portability, scalability, throughput, …
      Our solutions:
      Determine the best mixture of cloud technologies (e.g., data storage) and architectures depending on requirements
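One way to operationalise the selection is a small constraint filter over coarse per-technology characteristics; the scores below (1 = low, 3 = high) are illustrative only, not benchmark results:

```python
# Coarse characteristics per storage option, echoing the slide (indicative).
STORAGE = {
    "rdb":       {"consistency": 3, "scalability": 1, "maintenance": 1},
    "rdb+cache": {"consistency": 2, "scalability": 3, "maintenance": 3},
    "key-value": {"consistency": 1, "scalability": 3, "maintenance": 1},
}

def select_storage(requirements):
    """Options meeting every minimum requirement, lowest maintenance first.

    All keys are minimums except 'maintenance', which is a tolerable maximum.
    """
    ok = [name for name, c in STORAGE.items()
          if all(c[k] >= v for k, v in requirements.items() if k != "maintenance")
          and c["maintenance"] <= requirements.get("maintenance", 3)]
    return sorted(ok, key=lambda n: STORAGE[n]["maintenance"])
```

A real selector would also mix technologies per data set (e.g. a key-value store for session state next to an RDB for orders), which is the "best mixture" goal stated above.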
    • 172. Adaptive Cloud Technologies
      Extensible monitoring engine across local and cloud
      Diagnose and suggest optimal system configurations
      Auto generation of reconfiguration workflows
      • Integrated with existing enterprise monitoring and management tools
      • 173. SPEC standardisation lead for cloud computing
      • 174. Adaptation engine patent pending
    • 3. Cloud Computing Cost Estimator
      Application Profile
      • Resource consumption per business transaction
      • Daily, weekly, monthly, yearly usage patterns
      • Possible deployment locations: US, EU, Asia or Australia
      Inputs: a live usage pattern or “what-if” scenarios from the IT administrator, plus system monitoring data (ACT Monitor) and cloud computing providers’ pricing
      Cloud Cost Estimator
      • Calculates the operating cost of applications, backed by a knowledge base of cost models, SLAs, …
      Estimated Operating Cost
      • Total operating cost on each vendor
      • Monthly cost and break-down
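A sketch of the core calculation, showing how an application profile and a usage pattern combine into a per-vendor estimate; the prices and vendor names are invented for illustration, not real tariffs:

```python
def monthly_cost(profile, prices, requests_per_month):
    """Estimate one vendor's monthly operating cost for an application."""
    cpu_hours = profile["cpu_seconds_per_tx"] * requests_per_month / 3600.0
    compute = cpu_hours * prices["compute_per_hour"]
    storage = profile["gb_stored"] * prices["per_gb_month"]
    transfer = profile["gb_out_per_tx"] * requests_per_month * prices["per_gb_out"]
    return compute + storage + transfer

# Resource consumption per business transaction, as in the application profile
profile = {"cpu_seconds_per_tx": 0.2, "gb_stored": 50, "gb_out_per_tx": 0.001}

vendors = {  # hypothetical tariffs
    "vendor_a": {"compute_per_hour": 0.10, "per_gb_month": 0.10, "per_gb_out": 0.12},
    "vendor_b": {"compute_per_hour": 0.08, "per_gb_month": 0.15, "per_gb_out": 0.15},
}

costs = {v: monthly_cost(profile, p, 1_000_000) for v, p in vendors.items()}
cheapest = min(costs, key=costs.get)
```

The same function answers “what-if” questions by varying `requests_per_month` or the profile, which is how live usage patterns and scenarios feed the break-down per vendor.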
    • 178. Standing on the shoulders of giants
      The team
      Hiroshi Wada, Kevin Lee, Adnene Guabtni, Sherif Sakr, Alan Fekete, Quanqing Xu, Sean Xiong, Bruce McCabe, Jacky Keung, Paul Bannerman, Liang Zhao, Sadeka Islam, Van Tran, Xiaomin Wu…
    • 179. Getting Involved
      Linkage with National ICT Australia
      Research Collaboration
      Researcher exchanges
      Expert Advisory Services, Architecture Reviews
      Public and In-house Training Courses
      Market Surveys, Case Studies
      Professional in Research Residence
      Anna.Liu@nicta.com.au, @annaliu
      http://blogs.unsw.edu.au/annaliu/
    • 180.
    • 181. Alternative Architecture of a Hybrid Dev Environment (Non-VPN based)
      (Architecture diagram) The NICTA Corporate Network hosts the on-premise servers: an enterprise data store, an authentication server and a business web application. It connects across the Internet, over a secure connection (e.g., SSL), to the Amazon Cloud (US-East datacenter).
      Developers remote-desktop to XX.XX.0.* (with possible direct access to Amazon VPC)
      The virtual machines run in a private cloud: an isolated network in Amazon, only accessible from NICTA
    • 182. Alternative Architecture of a Hybrid Dev Environment (contd)
      Characteristics of a non-VPN based architecture:
      Simpler to set up and more lightweight
      No special hardware required
      Preserves the isolated network in Amazon (i.e., cloud hosts with private IPs)
      A VPC host can directly access the internet
      Assign an elastic IP (i.e., a public IP) to a VPC host if internet access is required
      Arguably less secure (two firewalls to take care of instead of one)
      Yields better throughput to internet hosts (no rerouting through the in-house network)
      Suitable for applications with fewer connection points between in-house and cloud
    • 183. Bondi Beach