- The company was founded in 2003 and provides sophisticated visualization and interpretation of genetic data through targeted analysis workflows and actionable results.
- The document discusses the architecture for a genome browser application, including a revised architecture with a VPC across two availability zones with private subnets for security.
- It describes the web stack behind a load balancer, with session info stored in ElastiCache and autoscaling from 2 to 6 machines depending on load. The database is a statically scaled MongoDB cluster that cannot use RDS due to local deployment requirements.
Biomatters and Amazon Web Services
1. • Founded in 2003
• Sophisticated, intuitive visualisation and interpretation of
genetic data
• Targeted Analysis Workflows
• Actionable Results
• We’re Hiring!
2.
3. Genome Browser - Requirements
• Smooth, intuitive experience in the browser
– JavaScript/HTML5
– Mobile friendly
• Tile Rendering
– Like Google Maps
– Requires fast database lookups
• Secure
– Data must be encrypted at rest and in transit
• Local-deployable
– Some customers not ready for cloud
4. Architecture
• Initial Architecture
– On EC2
– One autoscaling group (and ELB)
– One Availability Zone
• Revised Architecture
– VPC across two Availability Zones
– Private subnets for security
6. Web Stack
• Tomcat behind Apache
• Session info stored in ElastiCache
• Monitoring
– Healthcheck Ping URL for the load balancer
– CloudWatch CPU alarms for autoscaling
• Autoscaling
– Scales from 2 to 6 machines depending on load
– For > 6 machines, the database becomes the bottleneck
• Deployment
– Automatic deployment with no downtime
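The CPU-alarm-driven scaling on this slide can be sketched as a clamp on desired capacity: CloudWatch fires high- or low-CPU alarms and the group steps up or down within the 2-6 machine bounds. The 70%/30% thresholds below are illustrative assumptions, not the deck's actual alarm settings.

```python
# Sketch of alarm-driven autoscaling within the 2-6 node bounds.
# The 70%/30% CPU thresholds are assumptions for illustration.

MIN_NODES, MAX_NODES = 2, 6

def next_capacity(current, avg_cpu, high=70.0, low=30.0):
    """Step capacity up on a high-CPU alarm, down on a low-CPU alarm,
    always clamped to the 2-6 machine range from the slide."""
    if avg_cpu > high:
        current += 1
    elif avg_cpu < low:
        current -= 1
    return max(MIN_NODES, min(MAX_NODES, current))
```

The clamp matters at both ends: scaling never drops below two nodes (availability across zones) and never exceeds six, where the database becomes the bottleneck.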
7. Automatic Deployment
1. Deploy latest code to the master web node (through Tomcat
manager)
2. Shut down the master Tomcat
3. Take an AMI snapshot
4. Restart the master web node, and wait for the ping URL to respond
5. Tear down the existing autoscaling config
6. Set up the new autoscaling config
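The six steps above can be sketched as a small orchestration function. The `client` object stands in for the AWS API (the notes mention custom Ant tasks over the Java API); all of its method names are illustrative assumptions, not a real SDK.

```python
# Sketch of the zero-downtime deploy: snapshot the master node,
# then roll autoscaling onto the new AMI. The client's method names
# are hypothetical stand-ins for the underlying AWS calls.
import time

def deploy(client, master_instance_id, ping_url, timeout=300):
    client.deploy_code(master_instance_id)          # 1. push latest code via Tomcat manager
    client.stop_tomcat(master_instance_id)          # 2. shut down the master Tomcat
    ami_id = client.create_ami(master_instance_id)  # 3. take an AMI snapshot
    client.start_tomcat(master_instance_id)         # 4. restart the master web node...
    deadline = time.time() + timeout
    while not client.ping(ping_url):                # ...and wait for the ping URL
        if time.time() > deadline:
            raise TimeoutError("master web node never became healthy")
        time.sleep(5)
    client.delete_autoscaling_config()              # 5. tear down old autoscaling config
    client.create_autoscaling_config(ami_id,        # 6. recreate it on the new AMI,
                                     min_size=2, max_size=6)  # keeping the 2-6 bounds
    return ami_id
```

The ordering is the point: the AMI is taken while Tomcat is stopped (a clean snapshot), and the old autoscaling config is only torn down once the master is serving again, so there is always a healthy node behind the ELB.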
8. Database
• Local Deployment Requirement
– Can’t use RDS or Dynamo
• MongoDB
– Highly scalable NoSQL
– Supports advanced features (e.g. automatic sharding)
9. Database
• Base unit – pair of 50GB volumes in RAID0
• 100GB Logical Volume (LVM)
• Encryption Layer
• XFS File System
– Can grow without unmounting
• Scaling
– Storage scaling is manual
– Performance scaling could be automatic
• Need to scale preemptively
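A minimal sketch of the storage stack on this slide: a helper that assembles the provisioning and grow commands for the RAID0 pair, LVM layer, encryption, and XFS. Device names, volume-group/LV names, and the mount point are illustrative assumptions.

```python
# Assemble the shell commands implied by the slide: two EBS volumes in
# RAID0, a logical volume on top, dm-crypt encryption, then XFS.
# All names here (devices, "datavg"/"datalv", /data) are assumptions.

def storage_setup_commands(dev_a="/dev/xvdf", dev_b="/dev/xvdg",
                           vg="datavg", lv="datalv"):
    """Provisioning commands for the RAID0 + LVM + encryption + XFS stack."""
    return [
        # Stripe the pair of 50GB volumes into one 100GB RAID0 array
        f"mdadm --create /dev/md0 --level=0 --raid-devices=2 {dev_a} {dev_b}",
        # LVM on top, so capacity can be grown later
        "pvcreate /dev/md0",
        f"vgcreate {vg} /dev/md0",
        f"lvcreate -l 100%FREE -n {lv} {vg}",
        # Encryption layer at the logical-volume level (data encrypted at rest)
        f"cryptsetup luksFormat /dev/{vg}/{lv}",
        f"cryptsetup open /dev/{vg}/{lv} {lv}_crypt",
        # XFS can grow online, so the volume never needs unmounting
        f"mkfs.xfs /dev/mapper/{lv}_crypt",
    ]

def storage_grow_commands(vg="datavg", lv="datalv", mount="/data"):
    """Manual scale-up, assuming another RAID0 base unit was added to the VG:
    extend the LV, resize the crypt mapping, grow XFS while mounted."""
    return [
        f"lvextend -l +100%FREE /dev/{vg}/{lv}",
        f"cryptsetup resize {lv}_crypt",
        f"xfs_growfs {mount}",  # online resize; no unmount required
    ]
```

The grow path is why storage scaling is manual but non-disruptive: each layer (LVM, dm-crypt, XFS) can be resized in place while the database keeps running.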
11. Overview
• Multi-Availability Zone VPC with public and private subnets
• ELB in front of Auto-Scaling web nodes
• Statically scaled MongoDB Cluster
• Encrypted volumes
• Simple Queue Service for job processing
• We’re Hiring!
Biomatters has been around since 2003, specialising in visualisation and interpretation of digital biological data. The volumes of digitised biological data have exploded in recent years – this has created a unique opportunity, as genetic analysis has become cost-effective to use in the clinic for the first time. Our software brings targeted genetic analyses to the cloud, coupled with intuitive visualisations of complex data. We combine the results of analysis with data from other relevant sources (e.g. patient data, knowledge databases) to provide actionable reports for clinicians.
This is an example of one of our visualisations – it allows a clinician to compare a patient's DNA to a 'reference' human genome and look at the differences. It works like Google Maps, but for the human genome. We overlay data from a number of external data sources to put information at their fingertips. Let's say that the clinician notices the patient has a particular variation in their DNA. We can quickly bring up information about the variation – we can see that it has been associated with baldness. We can see publications related to the variation. We can even look at the structure of proteins produced by the genes around the variation.
Mobile friendly means that only a small amount of data can be stored on the client device at any one time – we need to be able to rapidly retrieve more information from the server as required. Tile rendering – we can cache tiles on either side of the viewport, but we need really fast lookups on the database to make the app smooth. Secure – if we're dealing with medical data, it has to be absolutely secure. Local deployment – some organisations (particularly medical ones) don't have clear guidelines about how to deal with the cloud; regulations and policy prevent data from leaving their site.
One private and one public subnet for each Availability Zone. ELB (and bastion host) in the public subnets. Web nodes and database in private subnets – they connect to the internet through NAT. Multi-AZ (the database stack is covered later). Setup: we used the VPC wizard, then customised. It is very important to configure this correctly – get the routing tables and security groups right.
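The subnet layout described in these notes (one public and one private subnet per Availability Zone, across two AZs) can be sketched with the standard-library ipaddress module. The VPC CIDR and AZ names here are assumptions for illustration.

```python
# Carve one public and one private /24 out of a VPC CIDR for each AZ.
# The 10.0.0.0/16 block and the AZ names are illustrative assumptions.
import ipaddress

def plan_subnets(vpc_cidr="10.0.0.0/16", azs=("us-east-1a", "us-east-1b")):
    """Return a subnet plan: public (ELB + bastion) and private
    (web nodes + database) tiers in every Availability Zone."""
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=24)
    plan = []
    for az in azs:
        for tier in ("public", "private"):
            plan.append({"az": az, "tier": tier, "cidr": str(next(blocks))})
    return plan
```

Planning the CIDRs up front is part of "configure correctly": each subnet's routing table (internet gateway for public, NAT for private) and security groups hang off this layout.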
For Amazon configuration we had some trouble using the command-line tools, so we used the Java API and wrote custom Ant tasks.
The database is a MongoDB cluster. We can't use Dynamo or RDS because of the local deployment requirement. Mongo is highly scalable – a NoSQL-type database that supports advanced features like automatic sharding. Our base unit of storage is a pair of 50GB EBS volumes in RAID0. Database nodes are spread across three AZs, so that MongoDB still has at least two nodes running in the event of an outage. A logical volume (using LVM) allows us to scale up size as required, but scaling is a manual process. The file system is encrypted at the logical-volume level. The XFS file system allows us to perform online resizing of the volume.
A job is inserted into the incoming job queue (status in db = NEW). The job is picked up by a Melanoma job-processor node (status in db updated to PROCESSING). Output is written to S3. The completed job is inserted into the completed-job queue (status in db updated to COMPLETE). The job is picked up by the emailNotifier service, which sends an email and updates the job status to USER_NOTIFIED.
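The job lifecycle in these notes can be simulated locally: in-memory queues stand in for the SQS queues, one dict stands in for the status database, and another for S3. The status names follow the notes; everything else is an illustrative stand-in.

```python
# Local simulation of the SQS-based job pipeline. queue.Queue replaces
# SQS, dicts replace the database and S3; only the statuses are real.
import queue

incoming, completed = queue.Queue(), queue.Queue()
db = {}  # job_id -> status

def submit(job_id):
    db[job_id] = "NEW"
    incoming.put(job_id)                  # job enters the incoming job queue

def process_job(storage):
    job_id = incoming.get()               # picked up by a job-processor node
    db[job_id] = "PROCESSING"
    storage[job_id] = b"analysis output"  # output written to S3 (dict stand-in)
    db[job_id] = "COMPLETE"
    completed.put(job_id)                 # handed to the completed-job queue

def notify_user(sent):
    job_id = completed.get()              # picked up by the emailNotifier service
    sent.append(job_id)                   # an email would be sent here
    db[job_id] = "USER_NOTIFIED"
```

Keeping the status in the database while the job ID travels through the queues means any stage can crash and the job's last known state is still queryable.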