MongoDB on Windows Azure: WIP Report
MongoBoston, September 2010
Brief Survey
- Business decision makers?
- IaaS (e.g., AWS or Rackspace) users?
Management of Resources: What You Manage
Four models shown side by side: Private (On-Premise), Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS).
Each model has the same stack, top to bottom:
- Applications
- Development & Runtime Kernels
- Databases
- Security, Management, Load Balancing & Integration
- Logical Servers, Storage
- Virtualization
- OS
- Server Hardware
- Networking, Utilities, Physical
Diagram labels: portions of each stack are marked "You Manage", "You Configure", or "Managed by Vendor".
The Windows Azure Platform
Windows Azure is an internet-scale cloud services platform hosted in Microsoft data centers around the world, providing a simple, reliable and powerful platform for the creation of web applications and services.
Compute Services in Windows Azure
GOAL: Massive Scalability
- Two role types: Web Role and Worker Role
- Windows Azure applications are built with web roles, worker roles, or a combination of both, deployed to a number of instances.
- Scale out, not up, by replicating worker instances as needed.
- Allow applications to scale user and compute processing independently.
- Each instance runs on its own VM (virtual machine), replicated as needed.
Blobs, Tables, Queues and Drives
- BLOBS: Provide a simple interface for storing named files along with metadata for the file.
- QUEUES: Provide reliable storage and delivery of messages for an application.
- TABLES: Provide structured storage. A table is a set of entities, each of which contains a set of properties.
- DRIVES: A durable NTFS file system volume, sharable across instances.
Running even one Mongo instance
- Azure runs Windows Server 2008 VMs
  - .NET 4 included. All else must be bundled.
  - XCOPY = GOOD. Registry and MSI = BAD
  - MongoDB = XCOPY = GOOD!!!
- Mongo needs disk storage
  - Choose: non-durable or durable
  - Local disk: non-durable. Blob: durable!
- Mongo needs a connection port
  - Multiple instances can't talk to each other
  - Client can't choose server instance
  - Can't scale: multiple instances are independent!
  - Azure uses random ports, and Mongo needs to use the assigned port
  - Mongod won't allow mapping of the HTTP server port!
Single-Instance Solution
- Single worker role instance
- Local (non-durable) or blob (durable) storage
- Single port mapped to mongod.exe
  - Server only; no web server access
- Mongod instance as spawned process
  - Not as a service
  - Must specify mapped port, data path, no HTTP
  - If server exits, recycle instance
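The launch step on this slide can be sketched in Python (the worker role itself would be .NET code; Python is used here only for illustration). This is a minimal sketch, assuming the Azure-mapped port and the data path are already known. `build_mongod_args` and `run_until_exit` are hypothetical helper names, and `--nohttpinterface` is the flag older mongod builds used to turn off the HTTP status server, so check it against the mongod version being bundled.

```python
import subprocess


def build_mongod_args(exe_path, port, data_path):
    """Build the mongod command line: mapped port, data path, no HTTP server.

    All three inputs are assumptions supplied by the caller; in a real role
    they would come from the Azure endpoint and storage configuration.
    """
    return [
        exe_path,
        "--port", str(port),
        "--dbpath", data_path,
        "--nohttpinterface",  # assumption: flag name from older mongod builds
    ]


def run_until_exit(args):
    """Spawn mongod as a child process (not a service) and block until it exits.

    Returns the process exit code. The caller treats any return from this
    function as the signal to recycle the role instance.
    """
    process = subprocess.Popen(args)
    return process.wait()
```

Keeping the argument construction separate from the process launch makes the command line easy to inspect without actually starting a server.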
Managing Instances
- mongod.exe provides an HTTP status page (port 28017) and access via the mongo shell
- Azure Load Balancer hides instances!
  - Single IP/port inbound from client apps
  - Round-robin across multiple ports internally
- Azure-hosted apps can access instances!
  - Must treat all instances equally
  - Assuming specific instance access = bad
Client works against a single IP/port
- No way for a client app to access an individual server
- No shared storage
Multiple servers accessible via an Azure server
- Mainly for management purposes
Replica Set Challenges
- No shell access
  - How to configure?
- Single IP
  - How to access?
  - How to monitor?
- DB access
  - How to access the master in the set?
- What about storage?
Replica Set Solution
- ReplicaSet worker role
  - Runs mongod with --replSet
- Management role (either a worker role or part of the web role)
  - Enumerates all ReplicaSet role instances
  - Builds the configuration document
  - Connects to one node and uploads the config, which initializes the replica set
- Client application
  - Connects to the replica set via a compatible driver
- Storage
  - Either local storage or blob storage. If blob storage, each replica set node has its own blob
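The management role's configuration step can be sketched as follows. This is a minimal illustration in Python, assuming the role has already enumerated the instance endpoints as (host, port) pairs; the function names, the set name `rs0`, and the addresses are hypothetical. The document shape (a set `_id` plus a `members` list of `_id` and `host` entries) is the one the `replSetInitiate` command expects.

```python
def build_replica_set_config(set_name, endpoints):
    """Build a replica set configuration document from enumerated endpoints.

    endpoints is a list of (host, port) pairs, one per ReplicaSet role instance.
    """
    return {
        "_id": set_name,
        "members": [
            {"_id": index, "host": f"{host}:{port}"}
            for index, (host, port) in enumerate(endpoints)
        ],
    }


def build_connection_uri(set_name, endpoints):
    """Build a connection string listing every member as a seed.

    Assumes the client can reach the listed addresses directly, for example
    because it is itself hosted in Azure.
    """
    hosts = ",".join(f"{host}:{port}" for host, port in endpoints)
    return f"mongodb://{hosts}/?replicaSet={set_name}"


# Hypothetical endpoints standing in for the enumerated role instances.
endpoints = [("10.0.0.4", 20001), ("10.0.0.5", 20002), ("10.0.0.6", 20003)]
config = build_replica_set_config("rs0", endpoints)
uri = build_connection_uri("rs0", endpoints)
```

With a driver such as pymongo, the document could then be sent to a single node with `client.admin.command("replSetInitiate", config)`; that call is left out so the sketch stays driver-independent.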
Areas to be Explored
- Replica set per deployment
- Self-configuring replica sets and shards
- Configuration data held in WA Storage
- Instance and replica set information pushed
- Mongo HTTP port configuration
Coming Soon to a Cloud Near You
- RDP access to instances
- Mongo sharding
  - Will be simpler to implement than replication due to its cloud-friendly architecture