Data 2.0: a new way of integrating data? Neil Chue Hong, SC07, Reno
Data 2.0

Presentation given at Supercomputing 2007 on the progress of data sharing models, specifically highlighting the collision of data grid / data service and Web 2.0 worlds.

Published in: Technology

    1. Data 2.0: a new way of integrating data? Neil Chue Hong, SC07, Reno
    2. Summary
        • From Data Grids
        • To Data Services
        • The Rise of Web 2.0
        • Towards Data 2.0
    3. Grid versus Users
        • Grid is about:
            • sharing resources
            • interoperable middleware
            • allowing bigger problems
            • integrating communities
            • improving security
            • bringing together data
        • Users want to:
            • access more resources
            • ignore middleware
            • solve bigger problems
            • form communities
            • have simple security
            • bring together data
        • Grid and Users want very similar things
            • and yet there is still a “want-got-gap” between them
            • how can this be bridged?
    4. Data Grids
        • The first generation of Grids concentrated on Compute Grids
            • harnessing capacity to improve capability
        • Then came the first Data Grids
            • mechanisms for dealing with the large amounts of data generated by sensors and simulations
    5. Data Challenges
        • Diversity: of data resource types, vendors, middleware, schema, metadata
        • Scale: of collections, formats, geographical, political and social distance
        • Ownership: on individual, group, and organisation levels; intersecting yet independent
        • Security: for client, service and data owner; at many levels, with many tradeoffs
    6. Move towards data services
        • Defined interface to stored collection of data
            • e.g. Google and Amazon
        • But the data could be:
            • replicated
            • shared
            • federated
            • virtual
            • incomplete
        • Improve the ability to discover, reference, annotate, search, and provide provenance
        • Make access transparent
        • Make integration easy
        • Make management simple
    7. Grid Data Services
        • Data middleware provides a way of publishing data in a uniform way
            • accessible
            • discoverable
            • searchable
        • Provide tools such as
            • registries
            • replica catalogs
            • mediators
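The registry and replica catalog roles on this slide can be sketched roughly as follows. This is a minimal illustration, not the API of any real grid middleware: the `DataResource` and `Registry` names, the keyword-based lookup, and the `example.org` endpoints are all assumptions made for the example, and mediators are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataResource:
    """A published data resource (hypothetical shape)."""
    name: str
    keywords: frozenset   # descriptive metadata used for discovery
    replicas: tuple       # endpoints holding copies of the data


class Registry:
    """Toy registry that doubles as a replica catalog."""

    def __init__(self):
        self._resources = {}

    def publish(self, resource):
        # Publishing makes the resource discoverable under its name.
        self._resources[resource.name] = resource

    def discover(self, keyword):
        # Return the names of resources tagged with the keyword.
        wanted = keyword.lower()
        return sorted(
            r.name for r in self._resources.values() if wanted in r.keywords
        )

    def locate(self, name):
        # Replica catalog role: map a logical name to physical copies.
        return list(self._resources[name].replicas)


# Hypothetical resources and endpoints, for illustration only.
registry = Registry()
registry.publish(DataResource(
    "census-2001",
    frozenset({"census", "population"}),
    ("https://a.example.org/census", "https://b.example.org/census"),
))
registry.publish(DataResource(
    "borders",
    frozenset({"boundaries", "polygons"}),
    ("https://a.example.org/borders",),
))

matches = registry.discover("Census")
copies = registry.locate("census-2001")
```

Keeping the logical name separate from the replica endpoints is what lets a client ask for a dataset without knowing where its copies live.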
    8. Grid versus User: Round 2
        • Grids provide:
            • data
            • discovery services
            • distributed queries
            • basic provenance
            • workflows to represent analysis process
        • Users want:
            • information
            • to find the right data
            • cross-database searches
            • sophisticated annotation
            • to explore the information space
        • Data 2.0 must go beyond simple data access
            • domain-specific vs generic data services
            • composability, interoperability and ease of use
    9. The Rise of Web 2.0
        • New sites allow non-technical users to share information and interact in programmable environments
            • Social Networking: MySpace, Bebo, Facebook
            • GIS: Google Maps, Google Earth
            • Preference Matching: Amazon
            • Meta-clustering: digg, del.icio.us
            • Information Publishing: Flickr
    10. The Rise of Web 2.0
        • An army of curators, a world of information
    11. The Four Levels of e-Science Enlightenment
        • 1) Resources: Providing access to a larger and wider diversity of resources
        • 2) Automation: Increasing the automation and repeatability of experimentation
        • 3) Collaboration: Allowing intra- and cross-disciplinary collaboration through enabling networks
        • 4) Participation: Increasing access to a wider set of users and increasing knowledge in a domain by bringing new people to the subject
    12. From DSs to VREs
        • Virtual Research Environments
            • bridge gap between middleware and users
            • integrate functionality and facilities
        • Harness interest in communities and make it easy to contribute and easy to benefit
            • infrastructure
            • annotation tools
            • graphical environment
    13. SEE-GEO: Geolinking
        • Diagram components: Census DB, Borders DB, WFS, GDAS, OGSA-DAI, Feature Portrayal, GLS, Portal, Map Server, FPS
        • Operations: getData, getFeature, geoLink
        • Diagram labels: Receive ticket for results; Retrieve annotated image; Store image on server; Send parameterised query; Call out to existing FP service; Cache attributes; Stream polygons; Request attributes; Request features; Run algorithm; Stream relevant annotated polygons
        • Key points:
            • Concentrate on algorithm
            • Access domain-specific data sets
            • Utilise existing services
            • Efficient delivery methods
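The geoLink step can be illustrated with a small sketch, assuming that geolinking here means joining attribute rows (such as census values) to boundary polygons through a shared area identifier. The `Feature` class, the `geolink` function and the `area_id` field are hypothetical names for this example and are not taken from the SEE-GEO services themselves.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Feature:
    """A border polygon, standing in for a getFeature result."""
    area_id: str
    polygon: list[tuple[float, float]]
    attributes: dict[str, float] = field(default_factory=dict)


def geolink(features, attribute_rows):
    """Yield features annotated with the attribute row sharing their area_id.

    Features with no matching row are skipped, so only the relevant
    annotated polygons are streamed to the caller.
    """
    # Cache the attribute rows by identifier before streaming polygons.
    rows_by_area = {row["area_id"]: row for row in attribute_rows}
    for feature in features:
        row = rows_by_area.get(feature.area_id)
        if row is None:
            continue
        values = {k: v for k, v in row.items() if k != "area_id"}
        yield Feature(feature.area_id, feature.polygon, values)


# Hypothetical stand-ins for the Borders DB and Census DB responses.
borders = [
    Feature("E01", [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]),
    Feature("E02", [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0)]),
]
census = [{"area_id": "E01", "population": 1200.0}]

linked = list(geolink(borders, census))
```

Writing the join as a generator mirrors the "stream polygons" label in the diagram: annotated features can be passed on one at a time rather than collected first.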
    14. Virtual Workspace for the Study of Ancient Documents
        • An interface allowing browsing and searching of multiple image collections, including tools to compare and annotate the researcher’s personal collection
    15. Data 2.0: From Silos to Sharing
        • Choose data based on stored metadata
            • bring together for each user
        • Build a community by providing tools to contribute back
        • Diagram labels: Manc Data, Soton Data, Edin Data, OD (three times), VRE Portal, Choose Dataset, Dataset Annotation, Add Annotation, Amy Annot., Bob Annot., Central Annot.
    16. Data 2.0: a new way of integrating data?
        • Many diverse data sources
            • independently owned and curated
        • Many diverse users
            • each sharing and utilising multiple datasets
        • A personalised, virtual data warehouse
            • bring together many sources to appear as one
        • Allow shared, distributed, centralised, replicated annotation to build a community
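One way to picture the "personalised, virtual data warehouse" is a thin layer that queries several independently owned sources as if they were one and keeps user annotations alongside them. The sketch below is only an illustration under that assumption: the `VirtualWarehouse` class and the record contents are invented for the example, while the source names and annotators (Manc, Soton, Edin, Amy, Bob) are borrowed from the diagram on the previous slide.

```python
class VirtualWarehouse:
    """Toy federation layer: many sources appear as one, plus annotations."""

    def __init__(self):
        self._sources = {}      # source name -> {record id -> record}
        self._annotations = {}  # (source name, record id) -> [(author, note)]

    def add_source(self, name, records):
        # Sources stay independently owned; only a reference is kept.
        self._sources[name] = records

    def annotate(self, source, record_id, author, note):
        # Annotations are stored beside the data, never written into it.
        if record_id not in self._sources[source]:
            raise KeyError(f"{record_id!r} not found in {source!r}")
        key = (source, record_id)
        self._annotations.setdefault(key, []).append((author, note))

    def query(self, predicate):
        # Run one query across every source and attach any annotations.
        for name, records in self._sources.items():
            for record_id, record in records.items():
                if predicate(record):
                    yield {
                        "source": name,
                        "id": record_id,
                        "record": record,
                        "annotations": list(
                            self._annotations.get((name, record_id), [])
                        ),
                    }


# Hypothetical records, for illustration only.
warehouse = VirtualWarehouse()
warehouse.add_source("Manc Data", {"m1": {"topic": "rainfall", "year": 2004}})
warehouse.add_source("Soton Data", {"s1": {"topic": "rainfall", "year": 2006}})
warehouse.add_source("Edin Data", {"e1": {"topic": "traffic", "year": 2006}})
warehouse.annotate("Soton Data", "s1", "Amy", "gauge recalibrated")
warehouse.annotate("Soton Data", "s1", "Bob", "matches my readings")

rainfall = list(warehouse.query(lambda r: r["topic"] == "rainfall"))
```

Because annotations live in their own store, the same layer could hold them centrally, per user, or replicated, without touching the curated sources.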
    17. What is the future of data?
        • Data must be available to all to be useful
        • Individuals must be able to harness the data to make it important to them
        • The work you have seen today will help this happen
        • Data 2.0 is not as far away as you think!
