KeyBio pipeline for bioinformatics and data science
1. ↳ Each column is a sample.
↳ Each row is a step in the pipeline.
↳ Color is the status of the cloud job.
2. Bringing bioinformatics and data science together in the cloud
Input: DNA sequencing
AWS Batch
AWS EFS
Output: AI-ready datasets
Scalable: Parallel workflows with AWS Batch, orchestrated by Airflow
User-Friendly: Python + Parse VCF’s delimited fields + Mount EFS for fluid I/O
Extensible: Airflow integrations with cloud services (Lambda, CloudWatch)
Secure: No vendor access + No multi-tenancy + Data never leaves VPC
Cost-Effective: No vendor platform fees or compute markup + Spot Instances
Boto3
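A minimal sketch of how one pipeline step could fan out across samples as parallel AWS Batch jobs. The request fields (jobName, jobQueue, jobDefinition, containerOverrides) are the ones accepted by the Boto3 Batch client's submit_job call. Everything else is an assumption made for illustration, not taken from the KeyBio pipeline: the helper names, the SAMPLE_ID and PIPELINE_STEP environment variables, the queue and job definition names, and the RecordingClient stand-in that lets the sketch run without AWS access.

```python
from typing import Any, Dict, List


def build_job_request(sample_id: str, step: str,
                      job_queue: str, job_definition: str) -> Dict[str, Any]:
    """Build keyword arguments for one Batch submit_job call (one sample, one step)."""
    return {
        "jobName": f"{step}-{sample_id}",
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        # Hypothetical convention: the container reads these to know what to process.
        "containerOverrides": {
            "environment": [
                {"name": "SAMPLE_ID", "value": sample_id},
                {"name": "PIPELINE_STEP", "value": step},
            ]
        },
    }


def submit_step(batch_client: Any, samples: List[str], step: str,
                job_queue: str, job_definition: str) -> Dict[str, str]:
    """Submit one job per sample and return a mapping of sample_id to jobId."""
    job_ids = {}
    for sample_id in samples:
        request = build_job_request(sample_id, step, job_queue, job_definition)
        response = batch_client.submit_job(**request)
        job_ids[sample_id] = response["jobId"]
    return job_ids


class RecordingClient:
    """Hypothetical stand-in for boto3.client("batch"); records requests only."""

    def __init__(self) -> None:
        self.requests: List[Dict[str, Any]] = []

    def submit_job(self, **kwargs: Any) -> Dict[str, str]:
        self.requests.append(kwargs)
        return {"jobName": kwargs["jobName"], "jobId": f"job-{len(self.requests)}"}


# Swap in boto3.client("batch") to submit real jobs.
client = RecordingClient()
job_ids = submit_step(client, ["sampleA", "sampleB"], "annotate",
                      "example-queue", "example-job-definition")
print(job_ids)
```

In an Airflow deployment, a call like submit_step would typically sit inside a task, so that the scheduler decides when each step runs for each sample.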
3. From this (raw VCF with delimited fields)
To this (columnar data ready for exploration)
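The transformation can be sketched in plain Python. This is an illustration under stated assumptions, not the pipeline's actual parser: the function names, the INFO_ column prefix, and the two example records (including the rs0001 ID) are made up for the example. It relies only on the standard VCF layout, in which the eighth tab-separated column (INFO) holds semicolon-delimited key=value entries and bare flags.

```python
from typing import Dict, List, Optional

# The seven fixed VCF columns that precede INFO.
FIXED_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER"]


def parse_info(info: str) -> Dict[str, str]:
    """Split an INFO string such as 'DP=14;AF=0.5;DB' into key/value pairs.

    Flags without '=' (like DB) are stored with the value 'true'.
    """
    fields: Dict[str, str] = {}
    if info == ".":  # '.' marks a missing INFO field
        return fields
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else "true"
    return fields


def vcf_to_rows(lines: List[str]) -> List[Dict[str, str]]:
    """Turn VCF data lines into one flat dict per variant."""
    rows = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta lines, and the header line
        parts = line.rstrip("\n").split("\t")
        row = dict(zip(FIXED_COLUMNS, parts[:7]))
        for key, value in parse_info(parts[7]).items():
            row[f"INFO_{key}"] = value
        rows.append(row)
    return rows


def rows_to_columns(rows: List[Dict[str, str]]) -> Dict[str, List[Optional[str]]]:
    """Pivot rows into columns, filling None where a variant lacks a field."""
    names: List[str] = []
    for row in rows:
        for name in row:
            if name not in names:
                names.append(name)
    return {name: [row.get(name) for row in rows] for name in names}


# Made-up example records, for illustration only.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t12345\trs0001\tA\tG\t50\tPASS\tDP=14;AF=0.5;DB",
    "chr2\t67890\t.\tC\tT\t99\tPASS\tDP=30;AF=1.0",
]
columns = rows_to_columns(vcf_to_rows(example_vcf))
print(columns["INFO_DP"])
```

All values stay as strings here. A real pipeline would likely cast them using the types declared in the VCF header and write the result to a columnar file format.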