Clouds, Grids and Data

Clouds, Grids and Data Guy Coates Wellcome Trust Sanger Institute [email_address]

[object Object],[object Object],[object Object]

Based in Hinxton Genome Campus, Cambridge, UK. ,[object Object],[object Object]

We have active cancer, malaria, pathogen and genomic variation / human health studies. ,[object Object],[object Object]

Past Collaborations Data Sequencing Centre + DCC Sequencing centre Sequencing centre Sequencing centre Sequencing centre

Future Collaborations Collaborations are short term: 18 months-3 years. Sequencing Centre 3 Sequencing Centre 1 Sequencing Centre 2A Sequencing Centre 2B Federated access

Genomics Data Unstructured data (flat files) Data size per Genome Structured data (databases) DAS, bioMART etc ? Intensities / raw data (2TB) Alignments (200 GB) Sequence + quality data (500 GB) Variation data (1GB) Individual features (3MB)

Sharing Unstructured data ,[object Object]

Federated access. ,[object Object]

Single institute will have data distributed for DR / worldwide access. ,[object Object],[object Object],[object Object]

Some will have patient identifiable data.

iRODS ICAT Catalogue database Rule Engine Implements policies Irods Server Data on disk User interface WebDAV, icommands,fuse Irods Server Data in database Irods Server Data in S3

Useful Features ,[object Object]

Fast parallel data transfers across local and wide area network links. ,[object Object],[object Object],[object Object],[object Object]

Allows user at institute A to seamlessly access data at institute B in a controlled manner.

What are we doing with it? ,[object Object]

Move files between different storage pools. ,[object Object],[object Object],[object Object],[object Object]

Encrypt files and place on private FTP dropboxes.

Cumbersome to manage and insecure. ,[object Object],[object Object]

Software knows about S3 storage layers.

Identity management ,[object Object]

Lots of solutions: ,[object Object],[object Object],[object Object]

Delegated authentication? ,[object Object]

Dark Archives ,[object Object]

Is data in an inaccessible archive really useful?

Clouds, Grids and Data

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Clouds, Grids and Data

Similar to Clouds, Grids and Data (20)

Recently uploaded

Recently uploaded (20)

Clouds, Grids and Data