CloudConnect 2011 - Building Highly Scalable Java Applications on Windows Azure
Presentation delivered at CloudConnect 2011, Santa Clara

  • Microsoft's Windows Azure platform is a virtualized and abstracted application platform that can be used to build highly scalable and reliable applications in Java. The environment consists of a set of services such as NoSQL table storage, blob storage, queues, relational database service, internet service bus, access control, and more. Java applications can be built on these services through Web services APIs, running on your own Java Virtual Machine, without worrying about the underlying server OS and infrastructure. Highlights of this session: an overview of the Windows Azure environment; how to develop and deploy Java applications in Windows Azure; how to architect horizontally scalable applications in Windows Azure.
  • To build for big scale – use more of the same pieces, not bigger pieces; though a different approach may be needed
  • Source: http://danga.com/words/2007_06_usenix/usenix.pdf
  • Source: http://highscalability.com/blog/2007/11/13/flickr-architecture.html
  • Source: http://www.slideshare.net/jboutelle/scalable-web-architectures-w-ruby-and-amazon-s3
  • Sources: http://www.slideshare.net/netik/billions-of-hits-scaling-twitter and http://highscalability.com/blog/2009/6/27/scaling-twitter-making-twitter-10000-percent-faster.html
  • Source: http://highscalability.com/blog/2009/10/12/high-performance-at-massive-scale-lessons-learned-at-faceboo-1.html
  • Picture source: http://pdp.protopak.net/Belltheous90/DeathStarII.gif
  • Transcript

    • 1. Building Highly Scalable Java Applications on Windows Azure
      David Chou
      david.chou@microsoft.com
      blogs.msdn.com/dachou
    • 2. >Introduction
      Agenda
      Overview of Windows Azure
      Java How-to
      Architecting for Scale
    • 3. >Azure Overview
      What is Windows Azure?
      A cloud computing platform (as-a-service)
      on-demand application platform capabilities
      geo-distributed Microsoft data centers
      automated, model-driven services provisioning and management
      You manage code, data, content, policies, service models, etc.
      not servers (unless you want to)
      We manage the platform
      application containers and services, distributed storage systems
      service lifecycle, data replication and synchronization
      server operating system, patching, monitoring, management
      physical infrastructure, virtualization, networking
      security
      “fabric controller” (automated, distributed service management system)
    • 4. > Azure Overview >Anatomy of a Windows Azure instance
      Compute – instance types: Web Role & Worker Role. Windows Azure applications are built with web role instances, worker role instances, or a combination of both.
      Storage – distributed storage systems that are highly consistent, reliable, and scalable.
      Anatomy of a Windows Azure instance
      HTTP/HTTPS
      Each instance runs on its own VM (virtual machine) with local transient storage; replicated as needed
      Guest VM
      Guest VM
      Guest VM
      Host VM
      Maintenance OS,
      Hardware-optimized hypervisor
      The Fabric Controller communicates with every server within the Fabric. It manages Windows Azure, monitors every application, decides where new applications should run – optimizing hardware utilization.
    • 5. > Azure Overview > Application Platform Services
      Application Platform Services
      Marketplace
      Application
      Marketplace
      Information Marketplace
      Frameworks
      Workflow Hosting
      Distributed Cache
      Services Hosting
      Security
      Claims-Based Identity
      Federated Identities
      Secure Token Service
      Declarative Policies
      Integration
      Messaging
      Registry
      Service Bus
      Data
      Transact-SQL
      Data Synchronization
      Relational Database
      ADO.NET, ODBC, PHP
      Compute
      C / C++
      Win32
      VHD
      Storage
      Dynamic Tabular Data
      Blobs
      Message Queues
      Distributed File System
      Content Distribution
      On-Premises Bridging
      Networking
    • 6. > Azure Overview >Application Platform Services
      Application Platform Services
      Applications
      DataMarket
      Marketplace
      Composite App
      Caching
      Frameworks
      Access Control
      Security
      Integration
      Connect
      (BizTalk)
      Service Bus
      Integration
      Relational Database
      Reporting
      DataSync
      Data
      VM Role
      Web Role
      Worker Role
      Compute
      Storage
      Table Storage
      Blob Storage
      Queue
      Drive
      Content Delivery Network
      Connect
      Networking
    • 7. >Azure Overview
      How this may be interesting to you
      Not managing and interacting with server OS
      less work for you
      don’t have to care it is “Windows Server” (you can if you want to)
      but have to live with some limits and constraints
      Some level of control
      process isolation (runs inside your own VM/guest OS)
      service and data geo-location
      allocated capacity, scale on-demand
      full spectrum of application architectures and programming models
      You can run Java!
      plus PHP, Python, Ruby, MySQL, memcached, etc.
      and eventually anything that runs on Windows
    • 8. >Java How-To
      Java and Windows Azure
      Provide your JVM
      any version or flavor that runs on Windows
      Provide your code
      no programming constraints (e.g., whitelisting libraries, execution time limit, multi-threading, etc.)
      use existing frameworks
      use your preferred tools (Eclipse, emacs, etc.)
      Windows Azure “Worker Role” sandbox
      standard user (non-admin privileges; “full trust” environment)
      native code execution (via launching sub-processes)
      service end points (behind VIPs and load balancers)
    • 9. > Java How-To >Deployment
      Deployment Options
      Worker Role (using scripts)
      script-based installation and execution
      automated, need scripts
      Worker Role (using C# boot-strapping)
      fabric sandbox native deployment
      automated, need additional code
      VM Role
      host your own pre-configured Windows Server 2008 R2 Enterprise x64 VM image
      automated, full control
      available shortly (in beta)
      Manual
      remote-desktop
      loses automated provisioning, service lifecycle management, fault-tolerance, etc.
    • 10. > Java How-To > Tomcat (SDK 1.2-based)
      Running Tomcat in Windows Azure
      Service Instance
      listen port(x)
      Service Instance
      Worker Role
      Sub-Process
      Tomcat
      server.xml
      Catalina
      index.jsp
      new Process()
      RoleEntry Point
      bind port(x)
      get
      runtime
      info
      SQL Database
      JVM
      http://instance:x
      http://instance:y
      Service
      Bus
      Access Control
      http://app:80
      Fabric Controller
      Load Balancer
      Table
      Storage
      Blob
      Storage
      Queue
    • 11. > Java How-To > Jetty (SDK 1.2-based)
      Running Jetty in Windows Azure
      Boot-strapping code in WorkerRole.run()
      string response = "";
      try
      {
          System.IO.StreamReader sr;
          string port = RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["HttpIn"].IPEndpoint.Port.ToString();
          string roleRoot = Environment.GetEnvironmentVariable("RoleRoot");
          string jettyHome = roleRoot + @"\approot\app\jetty7\";
          string jreHome = roleRoot + @"\approot\app\jre6\";
          Process proc = new Process();
          proc.StartInfo.UseShellExecute = false;
          proc.StartInfo.RedirectStandardOutput = true;
          proc.StartInfo.FileName = String.Format("\"{0}bin\\java.exe\"", jreHome);
          proc.StartInfo.Arguments = String.Format("-Djetty.port={0} -Djetty.home=\"{1}\" -jar \"{1}start.jar\"", port, jettyHome);
          proc.EnableRaisingEvents = false;
          proc.Start();
          sr = proc.StandardOutput;
          response = sr.ReadToEnd();
      }
      catch (Exception ex)
      {
          response = ex.Message;
          Trace.TraceError(response);
      }
      Service end point(s) in ServiceDefinition.csdef
      <Endpoints>
        <InputEndpoint name="HttpIn" port="80" protocol="tcp" />
      </Endpoints>
    • 12. > Java How-To >Jetty (SDK 1.3-based)
      Running Jetty with admin access + fixed ports
      Execute startup script in ServiceDefinition.csdef
      <Startup>
        <Task commandLine="runme.cmd" executionContext="elevated" taskType="background"></Task>
      </Startup>
      Service end point(s) in ServiceDefinition.csdef
      <Endpoints>
        <InputEndpoint name="HttpIn" protocol="http" port="80" localPort="80" />
      </Endpoints>
    • 13. > Java How-To >GlassFish (using script; SDK 1.3-based)
      Running GlassFish
      Execute startup script in ServiceDefinition.csdef
      <Startup>
        <Task commandLine="Run.cmd" executionContext="limited" taskType="background"></Task>
      </Startup>
      Service end point(s) in ServiceDefinition.csdef
      <Endpoints>
        <InputEndpoint name="Http_Listener_1" protocol="tcp" port="80" localPort="8080" />
        <InputEndpoint name="Http_Listener_2" protocol="tcp" port="8181" localPort="8181" />
        <InputEndpoint name="Admin_Listener" protocol="tcp" port="4848" localPort="4848" />
        <InputEndpoint name="JMX_Connector_Port" protocol="tcp" port="8686" localPort="8686" />
        <InputEndpoint name="Remote_Debug_Port" protocol="tcp" port="9009" localPort="9009" />
      </Endpoints>
    • 14. > Java How-To >GlassFish (SDK 1.3-based)
      Running GlassFish in Windows Azure
      Service Instance
      Service Instance
      Worker Role
      Sub-Process
      GlassFish
      script
      Startup
      Command
      SQL Database
      JVM
      http://instance:8080
      http://instance:8080
      Service
      Bus
      Access Control
      http://app:80
      Fabric Controller
      Load Balancer
      Table
      Storage
      Blob
      Storage
      Queue
    • 15. > Java How-To > Limitations
      Current constraints
      Platform
      Dynamic networking
      <your app>.cloudapp.net
      no naked domain
      CNAME re-direct from custom domain
      can declare up to 5 specific ports be opened, or port ranges; cannot open arbitrary ports
      tcp socket connections terminated if idle for >60 seconds
      Non-persistent local file system
      allocate local storage directory
      read-only: Windows directory, machine configuration files, service configuration files
      Stateless application model
      round-robin traffic distribution by the dynamic load balancer; no sticky sessions
      Java
      REST-based APIs to services
      Table Storage – schema-less (noSQL)
      Blob Storage – large files (<200GB block blobs; <1TB page blobs)
      Queues
      Service Bus
      Access Control
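Because the load balancer distributes requests round-robin with no sticky sessions, consecutive requests from one user may land on different instances. The following is a minimal Java sketch of keeping session state outside the instance. `SessionStore`, `InMemorySessionStore`, and `StatelessHandler` are hypothetical names introduced for illustration, and the in-memory map only stands in for an external store (for example Table Storage) that all instances would share.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction over an external, shared store.
interface SessionStore {
    void put(String sessionId, String key, String value);
    Optional<String> get(String sessionId, String key);
}

// In-memory stand-in so the sketch runs locally (assumption: a real
// deployment would back this with a store reachable from every instance).
class InMemorySessionStore implements SessionStore {
    private final Map<String, Map<String, String>> data = new ConcurrentHashMap<>();

    @Override
    public void put(String sessionId, String key, String value) {
        data.computeIfAbsent(sessionId, id -> new ConcurrentHashMap<>()).put(key, value);
    }

    @Override
    public Optional<String> get(String sessionId, String key) {
        Map<String, String> session = data.get(sessionId);
        return session == null ? Optional.empty() : Optional.ofNullable(session.get(key));
    }
}

// A request handler that keeps no per-user state of its own.
class StatelessHandler {
    private final SessionStore store;

    StatelessHandler(SessionStore store) {
        this.store = store;
    }

    // Appends an item to the cart stored under the session and returns the new cart.
    String addToCart(String sessionId, String item) {
        String cart = store.get(sessionId, "cart").map(c -> c + "," + item).orElse(item);
        store.put(sessionId, "cart", cart);
        return cart;
    }
}

class StatelessDemo {
    public static void main(String[] args) {
        SessionStore shared = new InMemorySessionStore();
        StatelessHandler instanceA = new StatelessHandler(shared);
        StatelessHandler instanceB = new StatelessHandler(shared);
        instanceA.addToCart("s1", "book");
        System.out.println(instanceB.addToCart("s1", "pen")); // prints "book,pen"
    }
}
```

The read-then-write in `addToCart` is not atomic; a real store would need a conditional update to be safe under concurrent requests for the same session.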
    • 16. > Java How-To >Support for Java
      Current tools for Java
      Windows Azure
      Windows Azure Tools for Eclipse/Java
      Multiple Java app servers
      Any Windows-based JRE
      http://www.windowsazure4e.org/
      Windows Azure SDK for Java
      Java classes for Windows Azure Blobs, Tables & Queues (for CRUD operations)
      Helper Classes for HTTP transport, AuthN/AuthZ, REST & Error Management
      Manageability, Instrumentation & Logging support
      Support for storing Java sessions in Azure Table Storage
      http://www.windowsazure4j.org/
      Windows Azure Starter Kit for Java
      Ant-based package & deployment tool
      http://wastarterkit4java.codeplex.com/
      SQL Azure
      Microsoft JDBC Driver 3.0
      Type 4 JDBC driver
      Supports TDS & OData
      Interoperability using REST
      Wrap SQL Azure with WCF Data Services
      Restlet extension for OData
      Windows Azure AppFabric
      App Fabric SDK for Java
      http://www.jdotnetservices.com/
      Solution Accelerators
      Tomcat
      Jetty
      GlassFish
      etc.
    • 17. > Cloud Scenarios
      Additional Cloud Integration/Interop Options
      Cloud
      On-premises
      Data Synchronization
      SQL Azure Data Sync
      Application-layer
      Connectivity & Messaging
      AppFabric Service Bus
      Security & Federated Identity
      AppFabric Access Control
      Secure Network Connectivity
      Virtual Network Connect
    • 18. > Architecting for Scale
      Size matters
      Facebook (2009)
      +200B pageviews /month
      >3.9T feed actions /day
      +300M active users
      >1B chat messages /day
      100M search queries /day
      >6B minutes spent /day (ranked #2 on Internet)
      +20B photos, +2B/month growth
      600,000 photos served /sec
      25TB log data /day processed thru Scribe
      120M queries /sec on memcache
      Twitter (2009)
      600 requests /sec
      avg 200-300 connections /sec; peak at 800
      MySQL handles 2,400 requests /sec
      30+ processes for handling odd jobs
      process a request in 200 milliseconds in Rails
      average time spent in the database is 50-100 milliseconds
      +16 GB of memcached
      Google (2007)
      +20 petabytes of data processed /day by +100K MapReduce jobs
      1 petabyte sort took ~6 hours on ~4K servers replicated onto ~48K disks
      +200 GFS clusters, each at 1-5K nodes, handling +5 petabytes of storage
      ~40 GB /sec aggregate read/write throughput across the cluster
      +500 servers for each search query < 500ms
      >1B views / day on YouTube (2009)
      MySpace (2007)
      115B pageviews /month
      5M concurrent users @ peak
      +3B images, mp3, videos
      +10M new images/day
      160 Gbit/sec peak bandwidth
      Flickr (2007)
      +4B queries /day
      +2B photos served
      ~35M photos in squid cache
      ~2M photos in squid’s RAM
      38k req/sec to memcached (12M objects)
      2 PB raw storage
      +400K photos added /day
      Source: multiple articles, High Scalability
      http://highscalability.com/
    • 19. > Architecting for Scale > Vertical Scaling
      Traditional scale-up architecture
      Common characteristics
      synchronous processes
      sequential units of work
      tight coupling
      stateful
      pessimistic concurrency
      clustering for HA
      vertical scaling
      units of work
      app server
      web
      data store
      app server
      web
      data store
    • 20. > Architecting for Scale >Vertical Scaling
      Traditional scale-up architecture
      To scale, get bigger servers
      expensive
      has scaling limits
      inefficient use of resources
      app server
      web
      data store
      app server
      web
    • 21. > Architecting for Scale >Vertical Scaling
      Traditional scale-up architecture
      When problems occur
      bigger failure impact
      data store
      app server
      web
      app server
      web
    • 22. > Architecting for Scale >Vertical Scaling
      Traditional scale-up architecture
      When problems occur
      bigger failure impact
      more complex recovery
      app server
      web
      data store
      web
    • 23. > Architecting for Scale > Horizontal scaling
      Use more pieces, not bigger pieces
      LEGO 7778 Midi-scale Millennium Falcon
      • 9.3 x 6.7 x 3.2 inches (L/W/H)
      • 356 pieces
      LEGO 10179 Ultimate Collector's Millennium Falcon
      • 33 x 22 x 8.3 inches (L/W/H)
      • 5,195 pieces
    • > Architecting for Scale > Horizontal scaling
      Scale-out architecture
      Common characteristics
      small logical units of work
      loosely-coupled processes
      stateless
      event-driven design
      optimistic concurrency
      partitioned data
      fault-tolerance through redundancy
      re-try-based recoverability
      app server
      web
      data store
      app server
      web
      data store
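Optimistic concurrency, listed above, means writers do not lock a record; each write succeeds only if the record is unchanged since it was read, and a conflicting writer re-reads and tries again. The sketch below illustrates the idea in plain Java with an `AtomicReference`; `Versioned` and `OptimisticAccount` are hypothetical names, and the in-process reference is an assumption standing in for a stored entity with a version tag.

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable snapshot of a record together with the version it carries.
final class Versioned {
    final long version;
    final int balance;

    Versioned(long version, int balance) {
        this.version = version;
        this.balance = balance;
    }
}

class OptimisticAccount {
    private final AtomicReference<Versioned> current =
            new AtomicReference<>(new Versioned(0, 0));

    Versioned read() {
        return current.get();
    }

    // Succeeds only if nobody replaced the record since 'expected' was read.
    boolean tryWrite(Versioned expected, int newBalance) {
        return current.compareAndSet(expected, new Versioned(expected.version + 1, newBalance));
    }

    // Read, compute, attempt the write; on conflict re-read and try again.
    int deposit(int amount) {
        while (true) {
            Versioned seen = read();
            int updated = seen.balance + amount;
            if (tryWrite(seen, updated)) {
                return updated;
            }
        }
    }
}
```

A writer holding a stale snapshot gets `false` from `tryWrite` instead of silently overwriting a newer value, which is what makes lock-free scale-out writes safe.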
    • 26. > Architecting for Scale > Horizontal scaling
      Scale-out architecture
      To scale, add more servers
      not bigger servers
      app server
      web
      data store
      app server
      web
      data store
      app server
      web
      data store
      app server
      web
      data store
      app server
      web
      data store
      app server
      web
      data store
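Adding servers only helps if requests and data can be spread across them. One simple way to do that, sketched here under the assumption of hash-based placement, is to map each partition key to a shard number. `ShardRouter` is a hypothetical helper, not part of any Azure SDK.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ShardRouter {
    // Maps a partition key to a shard index in the range [0, shardCount).
    static int shardFor(String partitionKey, int shardCount) {
        if (shardCount < 1) {
            throw new IllegalArgumentException("shardCount must be >= 1");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(partitionKey.hashCode(), shardCount);
    }

    // Groups keys by the shard they would be stored on.
    static Map<Integer, List<String>> assign(List<String> keys, int shardCount) {
        Map<Integer, List<String>> byShard = new HashMap<>();
        for (String key : keys) {
            byShard.computeIfAbsent(shardFor(key, shardCount), s -> new ArrayList<>()).add(key);
        }
        return byShard;
    }
}
```

Plain modulo placement moves most keys when the shard count changes; consistent hashing or a lookup directory are common alternatives when shards are added often.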
    • 27. > Architecting for Scale > Horizontal scaling
      Scale-out architecture
      When problems occur
      smaller failure impact
      higher perceived availability
      app server
      web
      data store
      app server
      web
      data store
      app server
      web
      data store
      app server
      web
      data store
      app server
      web
      data store
      app server
      web
      data store
    • 28. > Architecting for Scale > Horizontal scaling
      Scale-out architecture
      When problems occur
      smaller failure impact
      higher perceived availability
      simpler recovery
      app server
      web
      data store
      app server
      web
      data store
      web
      app server
      data store
      web
      data store
      app server
      web
      data store
      app server
      web
      data store
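Simpler recovery in a scale-out design often comes down to retrying a failed call, possibly against another instance. The following is a minimal sketch of retry with exponential backoff, assuming failures surface as unchecked exceptions; `Retry.withBackoff` is a hypothetical helper written for this illustration.

```java
import java.util.function.Supplier;

class Retry {
    // Runs the operation up to maxAttempts times, doubling the wait after each failure.
    static <T> T withBackoff(Supplier<T> operation, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleep(delay);
                    delay *= 2;
                }
            }
        }
        // Every attempt failed: surface the most recent failure.
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", ie);
        }
    }
}
```

Retrying is only safe when the retried operation can be repeated without side effects piling up, which is why idempotent operations appear among the distributed computing practices later in the deck.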
    • 29. > Architecting for Scale > Horizontal scaling
      Scale-out architecture + distributed computing
      parallel tasks
      Scalable performance at extreme scale
      asynchronous processes
      parallelization
      smaller footprint
      optimized resource usage
      reduced response time
      improved throughput
      app server
      web
      data store
      app server
      web
      data store
      web
      app server
      data store
      app server
      web
      data store
      perceived response time
      app server
      web
      data store
      app server
      web
      data store
      async tasks
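The parallel tasks on this slide can be illustrated with a small fan-out: independent lookups are started together and the results are gathered once all complete, so total latency approaches that of the slowest call rather than the sum. This is a sketch using the JDK's `CompletableFuture`; `ParallelFanOut` and `slowLookup` are hypothetical, and the 50 ms sleep merely simulates a remote call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class ParallelFanOut {
    // Stand-in for a slow call to a backing service.
    static int slowLookup(int id) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return id * id;
    }

    // Starts all lookups on the pool, then waits for each result in input order.
    static List<Integer> lookupAll(List<Integer> ids, ExecutorService pool) {
        List<CompletableFuture<Integer>> futures = ids.stream()
                .map(id -> CompletableFuture.supplyAsync(() -> slowLookup(id), pool))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(lookupAll(Arrays.asList(1, 2, 3, 4), pool)); // prints [1, 4, 9, 16]
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting the futures into a list before joining matters: joining inside the first stream would run the lookups one after another.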
    • 30. > Architecting for Scale > Horizontal scaling
      Scale-out architecture + distributed computing
      When problems occur
      smaller units of work
      decoupling shields impact
      app server
      web
      data store
      app server
      web
      data store
      web
      app server
      data store
      app server
      web
      data store
      app server
      web
      data store
      app server
      web
      data store
    • 31. > Architecting for Scale > Horizontal scaling
      Scale-out architecture + distributed computing
      When problems occur
      smaller units of work
      decoupling shields impact
      even simpler recovery
      app server
      web
      data store
      app server
      web
      data store
      web
      app server
      data store
      app server
      web
      data store
      app server
      web
      data store
      web
      data store
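Decoupling tiers with a queue is what shields one tier from failures in another: the front end only enqueues work, and a failed unit of work stays available for a later attempt. The sketch below uses an in-process `BlockingQueue` as an assumed stand-in for a durable queue service; `QueueWorker` is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

class QueueWorker {
    private final BlockingQueue<String> queue;
    private final Consumer<String> handler;

    QueueWorker(BlockingQueue<String> queue, Consumer<String> handler) {
        this.queue = queue;
        this.handler = handler;
    }

    // Processes everything currently queued and returns how many messages succeeded.
    // Messages whose handler throws are put back for a later pass.
    int drainOnce() {
        int handled = 0;
        List<String> failed = new ArrayList<>();
        String message;
        while ((message = queue.poll()) != null) {
            try {
                handler.accept(message);
                handled++;
            } catch (RuntimeException e) {
                failed.add(message);
            }
        }
        queue.addAll(failed);
        return handled;
    }
}
```

One bad message does not stop the others from being processed, and the producer never sees the failure. A production worker would also cap how many times a message is retried.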
    • 32. > Architecting for Scale >Cloud Architecture Patterns
      LiveJournal (from Brad Fitzpatrick, then Founder at LiveJournal, 2007)
      Web Frontend
      Apps & Services
      Partitioned Data
      Distributed
      Cache
      Distributed Storage
    • 33. > Architecting for Scale >Cloud Architecture Patterns
      Flickr (from Cal Henderson, then Director of Engineering at Yahoo, 2007)
      Web Frontend
      Apps & Services
      Distributed Storage
      Distributed
      Cache
      Partitioned Data
    • 34. > Architecting for Scale >Cloud Architecture Patterns
      SlideShare (from John Boutelle, CTO at SlideShare, 2008)
      Web
      Frontend
      Apps &
      Services
      Distributed Cache
      Partitioned Data
      Distributed Storage
    • 35. > Architecting for Scale >Cloud Architecture Patterns
      Twitter (from John Adams, Ops Engineer at Twitter, 2010)
      Web
      Frontend
      Apps &
      Services
      Partitioned
      Data
      Queues
      Async
      Processes
      Distributed
      Cache
      Distributed
      Storage
    • 36. > Architecting for Scale >Cloud Architecture Patterns
      Distributed
      Storage
      Facebook
      (from Jeff Rothschild, VP Technology at Facebook, 2009)
      2010 stats (Source: http://www.facebook.com/press/info.php?statistics)
      People
      +500M active users
      50% of active users log on in any given day
      people spend +700B minutes /month
      Activity on Facebook
      +900M objects that people interact with
      +30B pieces of content shared /month
      Global Reach
      +70 translations available on the site
      ~70% of users outside the US
      +300K users helped translate the site through the translations application
      Platform
      +1M developers from +180 countries
      +70% of users engage with applications /month
      +550K active applications
      +1M websites have integrated with Facebook Platform
      +150M people engage with Facebook on external websites /month
      Web
      Frontend
      Apps &
      Services
      Distributed
      Cache
      Parallel
      Processes
      Partitioned
      Data
      Async
      Processes
    • 37. > Architecting for Scale > Cloud Architecture Patterns
      Windows Azure platform components
      Apps & Services
      Web Frontend
      Distributed
      Cache
      Partitioned Data
      Distributed Storage
      Queues
      Content Delivery Network
      Load Balancer
      IIS
      Web Server
      VM Role
      Worker Role
      Web Role
      Caching
      Queues
      Access Control
      Composite App
      Blobs
      Relational Database
      Tables
      Drives
      Service Bus
      Reporting
      DataSync
      Virtual Network
      Connect
      Services
    • 38. >Architecting for Scale
      Fundamental concepts
      Vertical scaling still works
    • 39. >Architecting for Scale
      Fundamental concepts
      Horizontal scaling for cloud computing
      Small pieces, loosely coupled
      Distributed computing best practices
      asynchronous processes (event-driven design)
      parallelization
      idempotent operations (handle duplicates)
      de-normalized, partitioned data (sharding)
      shared nothing architecture
      optimistic concurrency
      fault-tolerance by redundancy and replication
      etc.
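Idempotent operations matter because queues and retries can deliver the same message more than once. A minimal sketch of one way to handle duplicates, assuming every message carries a unique ID: remember which IDs were applied and ignore repeats. `IdempotentProcessor` is a hypothetical class, and the in-memory set only deduplicates within one instance; a shared store would be needed across instances.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class IdempotentProcessor {
    // IDs of messages that have already been applied.
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final AtomicInteger total = new AtomicInteger();

    // Applies the amount once per messageId; returns false for a duplicate delivery.
    boolean apply(String messageId, int amount) {
        if (!processedIds.add(messageId)) {
            return false;
        }
        total.addAndGet(amount);
        return true;
    }

    int total() {
        return total.get();
    }
}
```

Delivering message `m1` twice changes the total only once, so a retry after an ambiguous failure cannot double-count.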
    • 40. > Architecting for Scale >Fundamental Concepts
      CAP (Consistency, Availability, Partition tolerance) Theorem
      At most two of these properties for any shared-data system
      Consistency + Availability
      • High data integrity
      • Single site, cluster database, LDAP, xFS file system, etc.
      • 2-phase commit, data replication, etc.
      Consistency + Partition
      • Distributed database, distributed locking, etc.
      • Pessimistic locking, minority partition unavailable, etc.
      Availability + Partition
      • High scalability
      • Distributed cache, DNS, etc.
      • Optimistic locking, expiration/leases, etc.
      Source: “Towards Robust Distributed Systems”, Dr. Eric A. Brewer, UC Berkeley
    • 46. > Architecting for Scale >Fundamental Concepts
      Hybrid architectures
      Scale-out (horizontal)
      BASE: Basically Available, Soft state, Eventually consistent
      availability first; best effort
      aggressive (optimistic)
      shared nothing
      favor extreme size
      e.g., user requests, data collection & processing, etc.
      Scale-up (vertical)
      ACID: Atomicity, Consistency, Isolation, Durability
      focus on “commit”
      conservative (pessimistic)
      transactional
      favor accuracy/consistency
      e.g., BI & analytics, financial processing, etc.
      Most distributed systems employ both approaches
    • 47. > Wrap-Up
      Lastly…
      Windows Azure is an open & interoperable cloud platform
      Microsoft is committed to Java, and we are on a journey – please give us your feedback & participate in open source projects
      Diverse Choice of Development Tools for Java Developers
      Eclipse Tools for Windows Azure – Write Modern Cloud Applications
      Tomcat Solutions Accelerator
      Admin Access & VM Role
      Windows Azure Platform SDKs for Java Developers
      Windows Azure SDK (Storage, Diagnostics & Service Management)
      App Fabric SDK (Service Bus & Access Control Services)
      Restlet extension for OData (Java)
      For more information:
      http://windowsazure.com/interop
      http://www.interoperabilitybridges.com
    • 48. Thank you!
      David Chou
      david.chou@microsoft.com
      blogs.msdn.com/dachou
      © 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
      The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
