Organizing Scientific Competitions on the Semantic Web

[Abstract]
Semantic web techniques for Linked Open Data (LOD) are expected to enhance the use of scientific data, and several data repositories for LOD have been launched. Modifiable “Forkable Open-source programs” on code sharing platforms make applications (Apps) utilizing data ready for reuse. In order to organize a web-based scientific competition, platforms for both semantic data resources and application programs need to be integrated so as to yield a creative cycle between data publication and application development. We developed the LinkData.org platform to integrate both data and application publishing platforms by recording dependency graphs, the utility of which we tested by organizing a scientific competition for synthetic biology on the platform. It was found that participants in the competition generated many dependency graphs by forking pre-existing applications or reusing the schema of pre-existing datasets. These creative activities could not be observed explicitly unless they were recorded, for example as dependency graphs among the datasets and applications on the platform. Hence we suggest that a worldwide system needs to be established to record and harvest such dependency graphs from distributed data platforms and application-development platforms around the world, so that our intellectual and creative activities using open datasets for application development may be recorded properly.

http://link.springer.com/chapter/10.1007%2F978-3-642-40285-2_27

  1. Organizing Scientific Competitions on the Semantic Web
     Sayoko Shimoyama, Robert Sidney Cox III, David Gifford and Tetsuro Toyoda
     Integrated Database Unit, Advanced Center of Computing and Communication (ACCC), RIKEN, Japan
     DEXA 2013, August 27, 2013
  2. » Data repositories and directories for open data help users register their data resources and locate related data; one example is CKAN (Comprehensive Knowledge Archive Network), a web-based system for storing and distributing data.
     » However, separating data from applications on the web makes the collaboration between data and applications invisible, so contributions to opening and maintaining data are not evaluated as appropriately as contributions to Apps.
     » This situation does not motivate people to contribute by donating their own datasets.
  3. » To overcome this situation, we developed LinkData as a data publishing platform and LinkDataApp as an application publishing platform.
     » We combined them by automatically recording dependency graphs that relate Data to the Apps using that Data. The resulting cycle (create an App for Data, create Data to use with an App) enhances a wide range of synergistic collaborations.
  4. (1) Support functions for creating table data to upload:
     » Users can create a template by entering metadata in LinkData’s GUI and then download it.
     » The schema of any published Data can be reused as a template for publishing new datasets.
     » Users enter their data into this template to create their own table data for uploading.
  5. (2) Conversion to RDF and publishing:
     » Template data tables can be uploaded, converted to RDF, and published online at LinkData.org.
     Example table (name | phone number | number of books):
       Central library | 045-111-1111 | 154,265
       West library | 045-222-2222 | 65,489
       South library | 045-333-3333 | 98,548
     (Figure: each row becomes RDF statements, e.g. Central library → phone number → 045-111-1111 and Central library → number of books → 154,265.)
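
The table-to-RDF step on this slide can be illustrated with a minimal sketch. It is not LinkData's converter: the base URI, the property URI layout, and the helper names `slug`, `row_to_triples` and `to_turtle` are assumptions made for illustration only, and only the library rows come from the slide.

```python
# Hypothetical base URI; the slides do not show LinkData's real URI scheme.
BASE = "http://example.org/library/"

# Rows copied from the slide's example table.
rows = [
    {"name": "Central library", "phone number": "045-111-1111", "number of books": 154265},
    {"name": "West library", "phone number": "045-222-2222", "number of books": 65489},
    {"name": "South library", "phone number": "045-333-3333", "number of books": 98548},
]


def slug(text):
    """Turn a label into a URI-friendly token (assumed convention)."""
    return text.strip().replace(" ", "_")


def row_to_triples(row, key="name"):
    """Map one table row to (subject, predicate, object) strings in Turtle syntax."""
    subject = f"<{BASE}{slug(row[key])}>"
    triples = []
    for column, value in row.items():
        if column == key:
            continue  # the key column names the subject, it is not a property
        predicate = f"<{BASE}property/{slug(column)}>"
        if isinstance(value, int):
            obj = str(value)  # bare integers are valid Turtle literals
        else:
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        triples.append((subject, predicate, obj))
    return triples


def to_turtle(table):
    """Serialize all rows as one Turtle statement per line."""
    return "\n".join(
        f"{s} {p} {o} ." for row in table for s, p, o in row_to_triples(row)
    )


print(to_turtle(rows))
```

Each row yields one statement per non-key column, which mirrors the slide's picture of a library node linked to its phone number and book count.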
  6. (3) Application development support function:
     » Application developers can access Data content using APIs provided by the LinkData platform.
     » 8 formats are provided:
       • TSV
       • RDF/Turtle
       • RDF/JSON
       • RDF/XML
       • RSS
       • KML
       • R (for statistical analysis)
       • Simple Data Format
     Due to these functions, LinkData supports not only publishing data but also using data.
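
As a rough illustration of consuming one of these formats, the sketch below parses a TSV payload with Python's standard `csv` module. The payload is a hand-written stand-in built from the library example in the slides; the actual LinkData API endpoints and file layout are not shown in the deck, so nothing here should be read as the platform's real interface.

```python
import csv
import io

# Hypothetical TSV content, standing in for a file an App would load from a dataset.
SAMPLE_TSV = "\n".join(
    [
        "name\tphone number\tnumber of books",
        "Central library\t045-111-1111\t154265",
        "West library\t045-222-2222\t65489",
        "South library\t045-333-3333\t98548",
    ]
)


def parse_tsv(text):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(record) for record in reader]


records = parse_tsv(SAMPLE_TSV)
total_books = sum(int(record["number of books"]) for record in records)
print(len(records), "libraries,", total_books, "books in total")
```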
  7. (1) Create a new App by editing a sample program:
     » Users choose Data as an input and edit sample JavaScript programs in a web browser (JavaScript editor) to develop their own original App.
     (2) Fork an App to publish as a new one:
     » Published Apps on LinkDataApp can be forked.
     » Users can fork and modify the program to publish it as a new App.
     (3) Change input Data to create a new App:
     » Even a non-programmer can add new functionality to an App by changing the input Data.
  8. Entities (Entity: Definition):
       Data: a single dataset that has been published by a User in LinkData
       Application (App): a single application that has been published by a User in LinkData
       User: a user who has registered for a LinkData account
     Dependency graph terms (Graph | Term | Label | Definition):
       Data(new) → Data(old) | Reuse | Ldd | Create new Data by reusing existing Data
       Data → User | Contributed | Ldu | The relationship between existing Data and the user who created the Data
       App(new) → App(old) | Fork | Laa | Create a new App by reusing an existing App’s program code
       App → Data | Load | Lad | Create an App by specifying files from particular Data as input
       App → User | Contributed | Lau | The relationship between an existing App and the user who created the App
       User(A) → User(B) | Follow | Luu | User A follows user B to receive updates and information on Data and Apps evaluated by user B
       User → Data | Vote | Lud | A user rates the Data in question as Useful or Un-useful
       User → App | Vote | Lua | A user rates the App in question as Useful or Un-useful
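
One way to picture these graph terms is as typed edges between typed nodes. The sketch below is an assumed data model, not LinkData's implementation: the class names `Relation`, `Node` and `Edge` and the node identifiers are hypothetical, while the labels (Ldd, Ldu, ...) and the allowed source and target kinds follow the table on this slide.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    # value = (label, source kind, target kind), taken from the slide's table
    REUSE = ("Ldd", "Data", "Data")
    DATA_CONTRIBUTED = ("Ldu", "Data", "User")
    FORK = ("Laa", "App", "App")
    LOAD = ("Lad", "App", "Data")
    APP_CONTRIBUTED = ("Lau", "App", "User")
    FOLLOW = ("Luu", "User", "User")
    VOTE_DATA = ("Lud", "User", "Data")
    VOTE_APP = ("Lua", "User", "App")


@dataclass(frozen=True)
class Node:
    kind: str   # "Data", "App" or "User"
    ident: str  # hypothetical identifier


@dataclass(frozen=True)
class Edge:
    source: Node
    target: Node
    relation: Relation

    def __post_init__(self):
        # Reject edges whose endpoints do not match the relation's definition.
        label, src_kind, dst_kind = self.relation.value
        if (self.source.kind, self.target.kind) != (src_kind, dst_kind):
            raise ValueError(
                f"{label} must go from {src_kind} to {dst_kind}, "
                f"got {self.source.kind} to {self.target.kind}"
            )


# Hypothetical nodes: a new App forked from an old one and loading one dataset.
old_app = Node("App", "app-old")
new_app = Node("App", "app-new")
data = Node("Data", "data-1")

edges = [
    Edge(new_app, old_app, Relation.FORK),
    Edge(new_app, data, Relation.LOAD),
]
for edge in edges:
    print(edge.relation.value[0], edge.source.ident, "->", edge.target.ident)
```

Validating the endpoint kinds at construction time keeps malformed edges, such as a Load edge pointing from Data to an App, out of the graph.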
  9. » Count of hosted Data and Apps in LinkData (as of August 2013); figure values: 655, 316
     » Count of relationships among Data, Apps and Users in LinkData (Kind of relationship | Count):
       Load (App to Data) | 1508
       Fork (App to App) | 153
       Reuse (Data to Data) | 41
       Follow (User to User) | 54
       Vote (User to Data) | 279
       Vote (User to App) | 108
     There is a stronger synergy cycle between data resources and applications than “in data” (between data and data) or “in app” (between app and app).
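
The comparison behind this claim can be checked with simple arithmetic on the three dependency counts from the slide. The sketch below only restates those numbers as shares and ratios; the variable names are illustrative.

```python
# Dependency counts copied from the slide (as of August 2013).
counts = {
    "Load (App to Data)": 1508,
    "Fork (App to App)": 153,
    "Reuse (Data to Data)": 41,
}

total = sum(counts.values())
shares = {kind: count / total for kind, count in counts.items()}

for kind, share in shares.items():
    print(f"{kind}: {counts[kind]} edges, {share:.1%} of dependency edges")

# How many times more often an App loads Data than an App forks an App
# or a dataset reuses a dataset.
load_per_fork = counts["Load (App to Data)"] / counts["Fork (App to App)"]
load_per_reuse = counts["Load (App to Data)"] / counts["Reuse (Data to Data)"]
print(f"Load/Fork = {load_per_fork:.1f}, Load/Reuse = {load_per_reuse:.1f}")
```

Load edges make up roughly nine tenths of these three dependency types, which is the quantitative basis for the "between data and applications" statement.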
  10. Example dependency graph among Data, Apps and Users.
      » Dark Green edges indicate Data to Data reuse
      » Red edges indicate Data to App loading
      » Blue edges indicate App to App forking
      » Bright Green edges indicate User ownership “contribution”
      » Grey edges indicate votes to rate applications by users, and following of other users
      The dependency graph allows users to dynamically contribute to and benefit from an automated rating of both data and applications.
      Interactive Gene Association Matrix application created on LinkDataApp: http://app.linkdata.org/app/app1s64i
  11. Organizing Scientific Competitions on the LinkData platform
      » For the synthetic biology competition GenoCon2 (http://genocon.org), we challenged participants to design novel regulatory DNA for controlling gene expression in the thale cress plant Arabidopsis thaliana.
      » In addition to DNA sequences, we offered programs for DNA design.
  12. PromoterCAD: Data Driven Design of Plant Regulatory DNA (http://app.linkdata.org/app/app1s335i)
      » To give non-experts an opportunity to design DNA, we built a computer-aided design tool on the LinkData platform, called PromoterCAD.
      » Using PromoterCAD function modules, genes with the desired properties can be found and mined for regulatory motifs. These motifs are introduced into the synthetic promoter at a regulatory position chosen by the user. Repeating this process can create complex regulation at the promoter.
      » Finally, the DNA design is exported for error and safety checking, DNA synthesis, and experimental characterization.
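
The "introduce a motif at a chosen position, then repeat" step can be sketched as a small string operation. This is not PromoterCAD's code: the function `insert_motif`, the choice to overwrite bases so that the promoter length stays fixed, the 0-based position convention, the 20-base backbone, and the motif strings are all assumptions made for illustration.

```python
VALID_BASES = set("ACGT")


def insert_motif(promoter, motif, position):
    """Overwrite bases of `promoter` with `motif` starting at 0-based `position`.

    The promoter length is kept constant (an assumed design choice).
    """
    promoter, motif = promoter.upper(), motif.upper()
    if set(promoter + motif) - VALID_BASES:
        raise ValueError("sequences may only contain A, C, G and T")
    if position < 0 or position + len(motif) > len(promoter):
        raise ValueError("motif does not fit at the requested position")
    return promoter[:position] + motif + promoter[position + len(motif):]


# Hypothetical starting sequence and motif placements chosen by a user.
design = "A" * 20
placements = [("CACGTG", 2), ("TATAAA", 12)]

# Repeating the step layers several regulatory motifs onto one promoter.
for motif, position in placements:
    design = insert_motif(design, motif, position)

print(design)
```

Keeping each placement as a (motif, position) pair makes the sequence of design decisions easy to replay or hand on to another user.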
  13. PromoterCAD LinkData system architecture for DNA design incorporates database information with user knowledge
      » PromoterCAD uses several data sources for Tissue / Time specific promoter design.
      » Users can add their own data suited to promoter design.
      » Users can also create a new App or fork a pre-existing App for design.
  14. Here we show the cycle enhancing synergy of collaboration in this web-based scientific competition for synthetic biology promoter design. This graph shows interaction between Data (Green box), Apps (Blue box), and Users (Grey box).
      » Dark Green edges indicate Data to Data reuse
      » Red edges indicate Data to App loading
      » Blue edges indicate App to App forking
      » Bright Green edges indicate User ownership “contribution”
      » Grey edges indicate votes to rate applications by users, and following of other users
      The App “GenoCon PromoterCAD” at http://app.linkdata.org/app/app1s94i is shown in the graph. The Dataset “Speedup Lists of Developmental Coexpression” at http://linkdata.org/work/rdf1s339i is a source for this graph.
  15. • For example, the highly voted application ID:137 “A Promoter Design to Maintain the Fertility of Transgenic Plant by new Plugin MotifRanking” is a fork of ID:94 PromoterCAD.
      • This example graph shows that ID:94 was forked by 6 apps and voted for by 1 user.
      • It shows that ID:137 was forked 0 times and voted for by 5 users, for a score of 5.
      • In this fashion, each app can in turn be compared for total activity and usefulness.
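
The per-app comparison described here amounts to counting incoming fork and vote edges. The sketch below reproduces the slide's numbers for ID:94 and ID:137 under stated assumptions: apart from 137, the forking app IDs and all user names are hypothetical placeholders, and the scoring rule LinkData actually uses is not given in the slides.

```python
# (child app, parent app) pairs. Only 137 -> 94 is named in the slides;
# the other five child IDs are hypothetical placeholders.
forks = [(child, 94) for child in (137, 201, 202, 203, 204, 205)]

# (user, app) pairs for "Useful" votes; the user names are hypothetical.
votes = [("user_a", 94)] + [(f"user_{c}", 137) for c in "bcdef"]


def activity(app_id):
    """Count how often an app was forked and how many users voted for it."""
    fork_count = sum(1 for _, parent in forks if parent == app_id)
    vote_count = sum(1 for _, target in votes if target == app_id)
    return {"forks": fork_count, "votes": vote_count}


for app_id in (94, 137):
    print(f"ID:{app_id}", activity(app_id))
```

Forks measure how much an app was built upon, while votes measure how useful users found it, so the two counts are reported separately rather than merged into one number.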
  16. LinkData Application app1s137i, showing the usability ranking and user voting buttons at the top right: http://app.linkdata.org/app/app1s137i
      GenoCon2 Contest Activity:
      » There are over 40 international submissions, including from the USA, Egypt and Japan.
      » Users cooperated to create original designs that were modified and possibly improved by other users.
      » Team collaboration was aided by the open nature of the design platform; 13 promoter designs are being considered for final construction in transgenic plants.
      The semantic dependency-graph-based system with evaluation by experiment will foster a rapid biological knowledge cycle where programmers, researchers, and amateurs can all contribute.
  17. » A scientific competition was successfully organized on the LinkData platform, which records dependency graphs among datasets and applications.
      » It was found that participants in the competition generated many dependency graphs by forking pre-existing applications or reusing the schema of pre-existing datasets.
      » These creative activities could not be observed explicitly unless they were recorded, for example as dependency graphs among datasets and applications on the platform.
      » Hence, we suggest that a worldwide system needs to be established to record and harvest such dependency graphs from distributed data platforms and application-development platforms around the world, so that our intellectual and creative activities using open datasets for application development may be recorded properly.
  18. ACKNOWLEDGEMENTS
      » Dr. Takaho Endo for creating a biological visualization tool on LinkDataApp.
      » Ms. Yuko Yoshida for development of the converter and valuable discussion.
      » Dr. Shuji Kawaguchi for giving advice on the score calculation.
      » Dr. Koro Nishikata for testing LinkData functions.
      » Dr. Masahiro Mochizuki for testing and adding the MotifRanking tool.
      » Mr. Chanaka Perera, Mr. Uditha Punchihewa, Mr. Gayan Hewathanthri, Mr. Hiroaki Osada, Mr. Kazuro Fukuhara and Mr. Kiyoshi Mizumoto (Axiohelix Co., Ltd.) for web application and LinkData development.
      » The committee of Linked Open Data Challenge Japan for continuing interest and encouragement.
      This work was supported by the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency (JST).
      REFERENCES
      » F. Manola, et al.: RDF Primer. W3C Rec. (2004)
      » E. Prud’hommeaux, et al.: SPARQL Query Language for RDF. W3C Candidate Rec. (2006)
      » T. Toyoda, et al.: “Methods for Open Innovation on a Genome – Design Platform Associating Scientific, Commercial, and Educational Communities in Synthetic Biology,” Methods in Enzymology, Vol. 498, 189-203 (2011)
      » R. S. Cox III, K. Nishikata, S. Shimoyama, T. Toyoda, et al.: “PromoterCAD: data-driven design of plant regulatory DNA,” Nucl. Acids Res. 41 (W1): W569-W574 (July 2013)
