Connecting Publications & Data:  Raising visibility of local data collections through linking with international publication databases
 


The 11th International Conference on Scientific Digitalization of Cultural and Scientific Heritage, University Repositories and Distance Learning organized by the Faculty of Philology of Belgrade University

Abstract: Connecting locally hosted data repositories to internationally hosted related articles has never been easier. With APIs and other web services becoming standardized at the same time that new linking standards, such as DataCite DOIs, are being adopted, new ways to distribute and mash up content are now possible. This presentation will explore emerging trends in linking scholarly literature to data. Both entity linking and data linking will be discussed. Examples will be presented demonstrating how these technologies are being employed by publishers and A&I vendors in cooperation with local data repositories.


Usage Rights

CC Attribution-NonCommercial-NoDerivs License

  • Before I get started, I would like to take a minute to set some expectations for this talk. The examples used will come primarily from the hard sciences; my challenge to you is to figure out how to apply these technologies and methods to the digital humanities.
  • This is a theoretical framework for looking at the different ways that publications can be connected to data. This is also the agenda for the talk. I will first speak about the top left quadrant and then work my way to the bottom right. This means starting from the easiest to apply to the humanities and working through to the hardest.
  • This quadrant is primarily about publications to supplemental data.
  • Supplemental data submitted as a file with an article is the traditional way. It has its place, but that is not what I am talking about today.
  • Instead, new tools now enable display and direct manipulation of data in new and interesting ways. This example is an application that displays KML files on a Google Map: http://www.applications.sciverse.com/action/appDetail/298231?zone=main&pageOrigin=appGallery&activity=display
  • Next on the agenda is automating the connection between publications and whole supplementary or related datasets.
  • One example of this is the PANGAEA app, which searches the PANGAEA APIs by article DOI, retrieves the coordinates of where the supplementary data was collected, and charts them on a Google Map displayed directly on the ScienceDirect article page.
  • This also works on Scopus record pages (so for lots of publishers and journals). Once the decision was made to put it on Scopus as well, it took the PANGAEA developer less than 24 hours to implement. This was enabled by the SciVerse Applications platform.
  • Users can link through to the main record for the dataset on PANGAEA. One thing I would like to mention here is that there is also a DOI for the dataset. This was done through DataCite.
  • So what is DataCite and why is it important? It is also very important for creating links to data in repositories.
  • Takeaway points: The International DOI Foundation enables CrossRef to give out DOIs. DataCite is roughly equivalent to CrossRef, but for datasets. Learn more at the DataCite website. A central institution in Serbia might want to become a Member Institute.
  • So those were examples of linking to whole datasets and displaying them in new and interesting ways. Next to discuss is linking to entities.
  • Traditional linking involves an author marking up an entity such as a protein so that it can be easily linked to additional information about that entity in a different database. While this is useful, it is not what I wish to share with you today. Why make a user follow a link when…
  • You can now embed a 3D interactive model of the protein directly in context in the article. In this example the PDB Protein Viewer is embedded directly in the article.
  • In this example an author adds key structures to the article and they are then embedded using Reaxys information and software.
  • The last examples still required an author to manually mark up entities. Through text analysis and mining, this is no longer always necessary.
  • In this example, our partner NextBio automatically recognizes entities in the text of the article. This approach is easily extendable to new or other entities and works retrospectively on older content, but it does create recall and precision errors.
  • Not only can it display them in the sidebar, but the application framework enables adding links to the entities in the text on the fly.
  • A reader can then click those links for additional information from multiple databases.
  • The application colours and tags genes, proteins and molecule names. Clicking a term shows a summary of its features (e.g. sequence or 2D structure). The user can then click links in the pop-up that lead out to more information.
  • To summarize, we started with very traditional linking of datasets, where an author submits the dataset with the article. One example of how this can be improved was the interactive map viewer that displays supplementary KML files rather than simply attaching the files to the article. Next we discussed automated linking to datasets. This included the example of searching PANGAEA APIs for related datasets and then displaying the locations where the data was collected. This will be driven by new standards such as DataCite. Third, authors manually mark up entities that can be linked to in other databases. Now it is possible to embed content from other databases using APIs. Last is totally automated entity recognition using text analysis and mining. Again, information from third-party databases can be embedded directly in the article itself. While I haven’t spoken too much about the technologies enabling these new ways of linking articles to data, one example is the SciVerse Application Framework, which now enables all of the examples discussed today. http://www.applications.sciverse.com/action/userhome
  • I would like to close with the same questions I opened with. Thank you.
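
For readers who want to experiment with the automated entity linking described in the notes above, here is a minimal sketch in Python. It is not the NextBio or SciVerse implementation, which the talk does not describe in detail. It assumes the simplest possible technique, an exact dictionary lookup, and the entity names, the example.org URLs and the function name link_entities are all hypothetical placeholders.

```python
import html
import re

# Hypothetical dictionary: entity name -> URL of a record in an external
# database. The terms and the example.org URLs are placeholders only.
ENTITY_LINKS = {
    "BRCA1": "https://example.org/gene/BRCA1",
    "p53": "https://example.org/protein/p53",
    "caffeine": "https://example.org/compound/caffeine",
}


def link_entities(text, entity_links=ENTITY_LINKS):
    """Return HTML in which every known entity in `text` is wrapped in a link."""
    if not entity_links:
        return html.escape(text)

    # Try longer names first so a long name wins over a shorter one it contains.
    terms = sorted(entity_links, key=len, reverse=True)
    # The lookarounds stop a term from matching inside a longer word,
    # e.g. "p53" inside "p53x". Matching is exact and case-sensitive.
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in terms) + r")(?!\w)"
    )

    parts = []
    last = 0
    for match in pattern.finditer(text):
        term = match.group(1)
        url = html.escape(entity_links[term], quote=True)
        # Escape the untouched text between matches, then add the link.
        parts.append(html.escape(text[last:match.start()]))
        parts.append('<a class="entity" href="%s">%s</a>' % (url, html.escape(term)))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


if __name__ == "__main__":
    print(link_entities("Loss of BRCA1 and p53 was observed."))
```

Because the lookup is an exact string match, it shows both kinds of error mentioned in the notes: a synonym or spelling variant that is missing from the dictionary is not linked (a recall error), and an ordinary word that happens to equal an entity name is linked anyway (a precision error). Extending it to new entity types only requires adding dictionary entries, and it can be run over older articles without any author markup.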
