Globus Online for Managing Tomography Data at APS
Rachana Ananthakrishnan, Francesco De Carlo
Argonne National Lab
2013 06-21-computing-for-light-sources
Presented at the Computing for Light and Neutron Sources Technical Forum. Discusses Globus Online transfer, sharing and metadata management in the context of collaboration with Advanced Photon Source.

Published in: Technology, Business
  • This image shows a 3D rendering of a Shewanella biofilm grown on a flat plastic substrate in a Constant Depth bioFilm Fermenter (CDFF).  The image was generated using x-ray microtomography at the Advanced Photon Source, Argonne National Laboratory.   
  • http://datasets.globus.org/carl-catalog/query/propertyA=value1
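The query URL in the note above matches the /query/ API and the <subject, name, value> tag model listed on slide 12. A minimal sketch of how such a URL could be assembled, and of what "retrieve subjects" means for a set of tag triples, might look like the following. Only the host, the catalog name, and the single `name=value` predicate form come from the note; the function names and the sample tags are hypothetical, and the sketch makes no network calls and does not describe the service's actual behaviour.

```python
from urllib.parse import quote

# Host and catalog name are taken from the slide note; everything else here
# (function names, sample tags) is hypothetical and for illustration only.
BASE_URL = "http://datasets.globus.org"


def query_url(catalog, name, value, base_url=BASE_URL):
    """Build a /query/ URL with a single name=value predicate."""
    parts = (
        quote(catalog, safe=""),
        quote(name, safe=""),
        quote(str(value), safe=""),
    )
    return "{}/{}/query/{}={}".format(base_url, *parts)


def subjects_matching(tags, name, value):
    """Return the subjects that carry the tag (name, value), sorted."""
    return sorted({s for (s, n, v) in tags if n == name and v == value})


# Hypothetical <subject, name, value> triples standing in for a catalog.
SAMPLE_TAGS = [
    ("scan-001", "beamline", "2-BM"),
    ("scan-001", "stage", "reconstructed"),
    ("scan-002", "beamline", "32-ID-C"),
    ("scan-003", "beamline", "2-BM"),
]

if __name__ == "__main__":
    print(query_url("carl-catalog", "propertyA", "value1"))
    print(subjects_matching(SAMPLE_TAGS, "beamline", "2-BM"))
```

Keeping tags as flat triples means a subject can gain or lose a tag without any change to a fixed record layout, which fits the "group data based on use, not location" idea on slide 10.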

    1. Globus Online for Managing Tomography Data at APS
       Rachana Ananthakrishnan, Francesco De Carlo
       Argonne National Lab
    2. We started with reliable, secure, high-performance file transfer …
       Data Source, Data Destination
       (1) User initiates transfer request
       (2) Globus Online moves and syncs files
       (3) Globus Online notifies user
    3. … and then made it simple to share big data off existing storage systems
       Data Source
       (1) User A selects file(s) to share, selects user or group, and sets permissions
       (2) Globus Online tracks shared files; no need to move files to cloud storage!
       (3) User B logs in to Globus Online and accesses shared file
    4. Transforming data acquisition
       Current
       • Experimental parameters optimized manually
       • Collected data combined with visual inspection to confirm optimal condition
       • Data reconstructed and sent to users via external drive
       • User team starts data reduction at home institution
    5. Transforming data acquisition
       Envisaged
       • Experimental parameters optimized automatically
       • Collected data available to optimization programs
       • Data are automatically reconstructed, reduced, and shared with local and remote participants
       • User team leaves the APS with reduced data
       Current
       • Experimental parameters optimized manually
       • Collected data combined with visual inspection to confirm optimal condition
       • Data reconstructed and sent to users via external drive
       • User team starts data reduction at home institution
    6. Globus Online as enabler
       • Facility data acquisition
       • Globus Online transfer service
       • Reduced data
       • Analysis/Sharing
       • Globus Online sharing service
       • Globus Online dataset service*
       * In development
    7. Erin Miller (PNNL) collects data at Advanced Photon Source, renders at PNNL, and views at ANL
       Credit: Kerstin Kleese-van Dam
    8. Looking at how researchers use data
       • A single research question often requires the integration of many data elements that are:
         – In different locations
         – In different formats (Excel, text, CDF, HDF, …)
         – Described in different ways
       • Best grouping can vary during investigation
         – Longitudinal, vertical, cross-cutting
       • But always needs to be operated on as a unit
         – Share, annotate, process, copy, archive, …
    9. How do we manage data today?
       • Often, a curious mix of ad hoc methods
         – Organize in directories using file and directory naming conventions
         – Capture status in README files, spreadsheets, notebooks
         – Even PowerPoint!
       • Time-consuming, complex, error-prone
       Why can’t we manage our data like we manage our pictures and music?
    10. Introducing the dataset
        • Group data based on use, not location
          – Logical grouping to organize, reorganize, search, and describe usage
        • Tag with characteristics that reflect content …
          – Capture as much existing information as we can
        • … or to reflect current status in investigation
          – Stage of processing, provenance, validation, …
        • Share datasets for collaboration
          – Control access to data and metadata
        • Operate on datasets as units
          – Copy, export, analyze, tag, archive, …
    11. Expanding Globus Online services
        • Ingest and publication
          – Imagine a Dropbox that not only replicates, but also extracts metadata, catalogs, converts
        • Cataloging
          – Virtual views of data based on user-defined and/or automatically extracted metadata
        • Integration with computation
          – Associate computational procedures, orchestrate application, catalog results, record provenance
    12. Builds on catalog as a service
        Approach
        • Hosted user-defined catalogs
        • Based on tag model <subject, name, value>
        • Optional schema constraints
        • Integrated with other Globus services
        Three REST APIs
        • /query/: retrieve subjects
        • /tags/: create, delete, retrieve tags
        • /tagdef/: create, delete, retrieve tag definitions
        Builds on USC Tagfiler project (C. Kesselman et al.)
    13. Exemplar: APS Beamlines 32-ID & 2-BM
        • X-ray imaging, tomography, ~few µm to 30 nm resolution
        • Currently can generate up to 100 TB per day
        • < 1 GB/s data rate; ~3-5 GB/s in 5-10 years
    14. Multi-scale 3D imaging data fusion at APS
        • Beamline 2-BM: ~1.5 µm resolution
        • Beamline 32-ID-C: 20-50 nm resolution
        • Detector labels, in the order shown on the slide:
          – Up to 100 fps; 2K x 2K, 16 bits; 11 GB raw data
          – 1,500 fps; 2K x 2K, 16 bits; 1 min readout; 11 GB raw data
        • Processing stages, first pipeline: Storage; Image processing (normalization, etc.); Tomographic reconstruction; Visual inspection; Selection
        • Processing stages, second pipeline: Image processing (alignment, etc.); Tomographic reconstruction; Visual inspection; Selection
        • Combined: Selection; Multi-scale image fusion; Visual inspection
    15. Argonne Collaborations
        • Groups: APS Imaging Group; APS Software Service Group; Mathematics & Computer Science/Computation Institute
        • Projects: Multi-scale image fusion; Infrastructure LDRD; Tao of Fusion LDRD
        • Areas: System integration; Instrument & Data Collection; Data Management Services; Mathematics & Computer Science
        • Results: Google Earth style zoom-in data navigation
    16. Timelines
        • July:
          – Alpha service available
        • August:
          – Pilot with two groups at APS
        • Fall of this year:
          – Pilot with a few other groups at APS
          – Early beta
    17. Thank You
        • Interested in working with us on dataset service:
          – Email: ranantha@mcs.anl.gov
        • Contact: support@globusonline.org
        • Website: www.globusonline.org
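The figures quoted on slides 13 and 14 (2K x 2K frames at 16 bits, up to 100 fps, 11 GB of raw data, under 1 GB/s, up to 100 TB per day) can be cross-checked with a little arithmetic. The sketch below is a rough consistency check, not a description of the beamline software. It assumes that "2K" means 2048 pixels and that GB and TB are decimal units; neither is stated on the slides, and all function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def frame_bytes(width=2048, height=2048, bits_per_pixel=16):
    """Size of one uncompressed frame in bytes (assumes 2K = 2048)."""
    return width * height * bits_per_pixel // 8


def sustained_rate(fps, bytes_per_frame):
    """Bytes per second produced at a given frame rate."""
    return fps * bytes_per_frame


def terabytes_per_day(bytes_per_second):
    """Decimal terabytes accumulated over 24 hours at a constant rate."""
    return bytes_per_second * SECONDS_PER_DAY / 1e12


def frames_in(total_bytes, bytes_per_frame):
    """Number of whole frames that fit in a given raw data volume."""
    return total_bytes // bytes_per_frame


if __name__ == "__main__":
    fb = frame_bytes()
    print("bytes per frame:", fb)
    print("rate at 100 fps (GB/s):", sustained_rate(100, fb) / 1e9)
    print("frames in 11 GB:", frames_in(11 * 10**9, fb))
    print("TB/day at 1 GB/s:", terabytes_per_day(1e9))
```

Under these assumptions one frame is 8,388,608 bytes, so 100 fps is roughly 0.84 GB/s, which agrees with the "< 1 GB/s" figure, and a sustained 1 GB/s for a full day comes to 86.4 TB, the same order as "up to 100 TB per day".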
