Who I am. What Amido does. Ask audience their experience
This session is all about choosing the right tools to provision your open source stack onto Azure.
Why – to reduce the TCO
Let's start by looking at how Azure has evolved since 2008. We started with the web and worker role model, now termed cloud services. This included the management of your service by the fabric, which remains a unique feature of Azure. In the early days Azure was targeted at green field .NET based projects that would fit within this model. Additional features were added to this model, including full trust and the Azure Drive in 2009. These enabled the model to interoperate with code that had previously not been allowed, such as C++ or COM components. It was now possible to run Java on Azure cloud services. Later VM roles appeared, which confusingly are not VMs in the IaaS sense of the word: VM roles follow the stateless model for running a cloud service. Finally, we recently got persistent VMs and Azure websites. Persistent VMs enable IaaS scenarios.
The evolution of Azure has led to three main options for deploying open source projects. It's important to think of these options as complementary, as they can be applied independently or together if this fits your needs. The three options are persistent VMs, cloud services and Azure websites, each of which has its own strengths depending on your scenario.
But a key concept to understand is that all three of these options are built on top of the same Azure foundations; all options ultimately map to the Azure cloud service model, which describes to the Azure fabric what and how to run your services.
Let's start with Azure Roles, which may be familiar to you if you've worked with Azure for a long time. The cloud service is really two things: a network boundary with a single public facing VIP, and a service model, which describes the composition of your cloud service to the fabric. Stateless VMs are defined as roles in the service model, which makes them easy to scale out. Deployment involves creating a package with all the assets that will be provisioned on a VM. Open source solutions often depend on large frameworks or tools, so these packages can get large. In these cases, it is common practice to bootstrap the deployment by storing the frameworks as ZIP files in blob storage and installing them onto the roles on startup.
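The bootstrap step described above can be sketched in a few lines. This is a minimal illustration only, written in Python for brevity: the function name, the temporary file name and the idea of passing a single readable blob URL are assumptions, not part of any Azure tooling, and a real role would run something like this from a start-up task.

```python
import urllib.request
import zipfile
from pathlib import Path
from typing import List


def bootstrap_from_blob(zip_url: str, install_dir: str) -> List[str]:
    """Download a ZIP archive and unpack it into install_dir.

    zip_url is assumed to be a readable blob URL (public, or carrying a
    shared access signature). Returns the names of the archive members.
    """
    target = Path(install_dir)
    target.mkdir(parents=True, exist_ok=True)
    archive_path = target / "framework.zip"  # illustrative temporary name

    # Stream the archive to local disk in 1 MB chunks rather than holding
    # it in memory, since framework packages can be large.
    with urllib.request.urlopen(zip_url) as response, open(archive_path, "wb") as out:
        while chunk := response.read(1024 * 1024):
            out.write(chunk)

    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        names = archive.namelist()

    archive_path.unlink()  # the archive is not needed once unpacked
    return names
```

Keeping the frameworks out of the deployment package means the package itself stays small, at the cost of a longer role start-up while the download runs.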
For persistent VMs, deployment is defined entirely by the VHD, which can be prepared on premises and then uploaded, or worked on within the cloud.
Azure websites is a pre-built tenant that uses the Azure Roles architecture. Several worker roles interact to provide a PaaS solution that is further abstracted from the infrastructure than the Azure Roles model. The architecture uses a worker role running a special IIS web server build, where configuration is managed within SQL Azure and storage is provided by a mounted Azure Drive. Websites differ from Azure Roles and persistent VMs in that they offer a shared resources option, where multiple websites share an instance. The websites model also defines a deployment worker role, which is an FTP service enabling rapid deployment of web assets, with support for Git and TFS.
Windows Azure Timeline
• Run .NET websites
• C++ / COM support
• Custom Roles
• Persistent VMs
• Run .NET processes
• Durable NTFS volume
• Connect to on premise
• VM Gallery
• Resilient Storage
• Java on PaaS
• Bootstrap deployment
• PHP / Ruby support
• Cloud service model
• Website deployment model
Azure Roles
• Service Model drives management via fabric
• Stateless VMs make scale out easy
• Inter-role communication via API
• Worker roles can host Open Source web servers (Jetty / Tomcat)
• Storage externalised
• Azure Drive enables an NTFS volume over Blob Storage
• Full trust supported
Persistent Virtual Machines
• Disks are mounted on Blob storage
• OS Disks generated from VM images
• Multiple OS support
• Additional data disks mounted as NTFS volumes
• Availability Sets define fault tolerance
• Many networking options
Websites
• Azure Websites are a prebuilt cloud service tenant
• Architecture is a series of worker roles
• Storage is managed using Azure Drive
• Managed database provided using SQL Azure or ClearDb MySQL
• Multi-tenanted shared resources
Side by side comparison
                  Persistent VMs        Cloud Services            Websites
Deployment        VM / VHD based        Service Definition /      FTP with Git and TFS
                                        Start-up Tasks support
Instancing        XS, S, M, L, XL       XS, S, M, L, XL           Shared or S, M, L
                                                                  reserved instances
Guest OS          Windows or Linux      Windows                   Windows (IIS)
Compute Storage   OS Disks              Local Storage             Azure Drive
                  Data Disks            Azure Drive
Networking        Load Balanced VIP     Load Balanced VIP         Load Balanced VIP
                  Port Forwarding       Inter-role comms
                  Virtual Network       Virtual Network
External Services
• Blob, Table and Queue Storage
• Azure SQL Database or ClearDb MySQL
• Service Bus, Access Control and WaaD
• Azure Add Ons (e.g. MongoLabs MongoDb)
Azure Roles
• Best for…
  – Solutions with multiple server roles
  – Cloud first solutions
  – Low support costs
• Worst for…
  – Scale up, Stateful solutions
  – Many legacy applications will not fit model
  – Lock in
Persistent Virtual Machines
• Best for…
  – Application lift and shift
  – Scale up, stateful technologies
  – Total control
  – No data centre lock in
  – Not running Windows OS
• Worst for…
  – Management costs
  – Complexity (disk mgmt, deployments..)
Websites
• Best for…
  – Building green field web sites
  – Simple open source web solutions
  – Using interpreted languages
  – Speed to market
  – Rapid deployment
  – Low cost shared resources
• Worst for…
  – Limited capabilities
Questions to ask yourself…
• Are you designing for, or running in Azure?
• Does the operating system really matter?
• What are your storage capacity limits?
• Do you need to run a non IIS web server?
• Does your Open Source stack require multiple server roles?
• How much do you care about lock in?
• Are you doing any custom development?
• How much money do you have?
A brief introduction to Apache Solr
• Solr is a Lucene based open source search server from Apache
• Solr is a Java project, requiring a typical Java web server such as Jetty
• Solr stores indexes (cores) locally on an NTFS volume
• Solr scale out is managed using a master / slave configuration
• Solr is called via an HTTP service
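To make "called via an HTTP service" concrete, here is a small sketch that builds a search request against Solr's /select handler. The host name and core name are hypothetical placeholders; only the q, rows and wt query parameters are standard Solr parameters.

```python
from urllib.parse import urlencode


def solr_select_url(base_url: str, core: str, query: str, rows: int = 10) -> str:
    """Build the URL for a Solr /select search request.

    base_url and core are placeholders for wherever Solr is hosted.
    """
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


# Hypothetical host and core name, for illustration only.
url = solr_select_url("http://solr.example.com:8983/solr", "products", "title:azure")
print(url)
```

Any HTTP client can then issue a GET against the resulting URL, which is why Solr integrates easily with a custom web role regardless of the language it is written in.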
Questions to ask yourself…
• Are you designing for, or running in Azure?
  – We are designing a big scale solution for Azure, where Solr is a part
• Does the operating system really matter?
  – Not much, although Linux would be a better fit for Solr
• What are your storage capacity limits?
  – Terabytes of search index data
• Do you need to run a non IIS web server?
  – Yes, we need Jetty or Tomcat for Java
• Does your Open Source stack require multiple server roles?
  – Yes, Solr needs Master and Slave server instances
• How much do you care about lock in?
  – Not much, we are building for Azure
• Are you doing any custom development?
  – Yes, Solr is part of a development
• How much money do you have?
  – Enough for a big scale search index
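The answers above can be run through a simple rule set. The sketch below is one possible reading of the "Best for / Worst for" slides, not a definitive decision procedure: the field names and the ordering of the rules are assumptions, and the storage, custom development and budget questions are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    designing_for_azure: bool    # cloud first, rather than lift and shift
    needs_linux: bool            # the stack must run on a non-Windows OS
    needs_non_iis_server: bool   # e.g. Jetty or Tomcat
    multiple_server_roles: bool  # e.g. Solr master and slaves
    lock_in_is_a_concern: bool


def suggest_option(req: Requirements) -> str:
    """Return 'Persistent VMs', 'Azure Roles' or 'Websites'."""
    # Persistent VMs: lift and shift, non-Windows OS, no data centre lock in.
    if req.needs_linux or req.lock_in_is_a_concern or not req.designing_for_azure:
        return "Persistent VMs"
    # Azure Roles: cloud first solutions with multiple server roles, or a
    # non IIS web server hosted in a worker role.
    if req.multiple_server_roles or req.needs_non_iis_server:
        return "Azure Roles"
    # Websites: simple green field web solutions.
    return "Websites"


# The Solr answers from the slide, encoded as flags.
solr = Requirements(
    designing_for_azure=True,
    needs_linux=False,           # "Not much, although Linux would be a better fit"
    needs_non_iis_server=True,
    multiple_server_roles=True,
    lock_in_is_a_concern=False,
)
print(suggest_option(solr))
```

Flipping needs_linux to True moves the same workload to persistent VMs, which matches the trade-off between the two options for Solr.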
The solution – Azure Roles
• Worker roles running Solr and Jetty
• Azure Drive acts as a durable NTFS volume up to 1TB
• Use the service model to set up internal endpoints for index replication
• Install server components from Blob storage using http://bootstrap.codeplex.com/
• Be aware of Azure Drive leases when mounting the drive volume
• Enable RDP to access Solr Admin tools
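Index replication over an internal endpoint comes down to the slave calling Solr's replication handler with the master's address. The sketch below builds such a request using the handler's fetchindex command and its masterUrl parameter; the addresses, port and core name are hypothetical, and in a real role the master address would be read from the internal endpoint defined in the service model.

```python
from urllib.parse import urlencode


def fetch_index_url(slave_base: str, master_base: str, core: str) -> str:
    """Build the URL that asks a Solr slave to pull the index from a master.

    Both base addresses are placeholders; master_base stands in for the
    master role's internal endpoint.
    """
    master_url = f"{master_base.rstrip('/')}/{core}/replication"
    params = urlencode({"command": "fetchindex", "masterUrl": master_url})
    return f"{slave_base.rstrip('/')}/{core}/replication?{params}"


# Hypothetical addresses: the slave calls itself on localhost and names
# the master by its internal endpoint address.
url = fetch_index_url("http://localhost:8983/solr", "http://10.0.0.4:8983/solr", "products")
print(url)
```

Passing masterUrl on each request, rather than fixing it in the slave's configuration, suits roles whose internal addresses are only known at run time.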
Benefits
• Scale out of Solr Slaves easily
• Triple replicated Solr Cores
• Fault tolerance using Fabric
• Easy integration with custom web role
Alternative – Persistent VMs
• All the same benefits of fault tolerance and triple replicated storage
• No bootstrap code, pre built VMs
• Run Solr on Linux
• Must handle Solr Slave disk management on scale out
• Solr Slaves are a load balanced set
• Use VNET to communicate from custom service