Citation Analysis for the Free, Online Literature

Published in: Travel
  • Hello and thank you to the organisers for inviting me to speak here today.

    1. 1. Citation Analysis for the Free, Online Literature
        • Tim Brody
        • Intelligence, Agents, Multimedia Group, University of Southampton
    2. 2. Content
        • Current services for Open Access Literature
        • Institutional Archives Registry
        • Metadata Harvesting through Celestial
        • Citebase Search
            • Citation Linking
            • Search and Navigation Service
        • Web Impact as a predictor of Citation Impact
    3. 3. Institutional Archives Registry
    4. 5. Sites in the IAR
        • Things we want to know:
            • GNU EPrints sites
            • Other research collections (Other Archives, Open Journals)
            • BOAI-1 (self-archiving) vs. BOAI-2 (open-access journals)
        • A submission form consisting of:
            • URL, Name, OAI URL, Country, ‘type’, full-text, software
        • Can’t (yet) track full-texts
        • (Create a master-list so archives only register once?)
    5. 6. Celestial
        • Designed to:
            • Be an abstraction over OAI-PMH versions
            • Cache OAI metadata records
        • Technological questions:
            • How far can OAI-PMH scale? (OK for 5 million records so far)
            • How reliable are OAI-PMH implementations?
        • Feeds Citebase, IAR, some external users
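The harvesting step that Celestial performs can be illustrated with a minimal Python sketch. It is not Celestial's code: it assumes OAI-PMH 2.0 only (no abstraction over older protocol versions), the base URL `http://example.org/oai` and the sample response are hypothetical, and the helper names `list_records_url` and `parse_list_records` are invented for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH 2.0 repository."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty or missing token means the list is complete.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Hypothetical, hand-written response; a real harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

identifiers, next_token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("http://example.org/oai", resumption_token=next_token)
```

A cache in this style would store each parsed record and keep requesting `next_url` until no resumption token is returned.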
    6. 9. Services for Open Access Literature
        • Self-Archived Full-texts (Pre/Post-prints)
        • Open Access Publishing
        • Citation Analysis/Linking Services (Citebase / Citeseer / OpenURL / DOI)
        • Version Linking Services
        • Search Engines
        • Navigation Tools
        • Analysis & Assessment
        • Services named in the diagram: Citebase, Citeseer, Google, BMC, arXiv.org, OAIster, Scirus
        • OAI-PMH Transport
        • n.b. Scirus/OAIster aren’t citation-analysis aware yet; Google indexes Citeseer. Not an exhaustive list …
    7. 10. Citation Analysis & Linking
        • A citation is a reference from one work to another [as a hyperlink: a citation link]
        • Citation analysis uses citation relationships to analyse patterns in research
        • As a graph, a work (paper, book etc.) is a vertex and a citation is an edge
        • ‘Bibliometrics’
            • (study of patterns in literature)
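The graph view described on this slide can be sketched in a few lines of Python. The work labels `A` to `E` and the citation pairs are hypothetical, and `build_graph` is an illustrative helper, not part of Citebase.

```python
from collections import defaultdict

# Hypothetical citation pairs: (citing work, cited work).
citations = [("A", "B"), ("A", "C"), ("D", "B"), ("E", "B"), ("E", "C")]


def build_graph(edges):
    """Index a list of citation edges in both directions."""
    references = defaultdict(set)  # work -> works it cites (outgoing edges)
    cited_by = defaultdict(set)    # work -> works that cite it (incoming edges)
    for citing, cited in edges:
        references[citing].add(cited)
        cited_by[cited].add(citing)
    return references, cited_by


references, cited_by = build_graph(citations)

# A simple citation count per work is the in-degree of its vertex.
citation_counts = {work: len(citers) for work, citers in cited_by.items()}
```

Storing both directions makes it cheap to ask either "what does this work cite?" or "who cites this work?".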
    8. 11. Digitometric/Infometric Analysis
        • Bibliometrics for the online age
        • Couple citation analysis with Web analysis
            • (how many times has x been accessed?)
        • Similar to readership studies, but easier to survey and more comprehensive
            • (though subject to the same problems of copies being re-distributed, multiple accesses etc.)
    9. 12. Citebase Search
    10. 13. Citation Linking
        • Retrieve and cache full-texts
            • LaTeX, PDF, XML
        • Extract reference list
        • Extract individual references
        • Parse references into components
            • Author, year, title, journal, volume, pagination
        • Store in structured database
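The "extract individual references" and "parse references into components" steps can be sketched as follows. This is an illustration, not Citebase's parser: it assumes one `[n] ...` entry per line and a single made-up reference style (`Authors (Year). Title. Journal, Volume, Pages.`), whereas real reference lists vary widely and need many more patterns. The sample entries and function names are hypothetical.

```python
import re

# One hypothetical reference style: Authors (Year). Title. Journal, Volume, Pages.
REFERENCE_PATTERN = re.compile(
    r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\.\s+"
    r"(?P<title>.+?)\.\s+"
    r"(?P<journal>.+?),\s+(?P<volume>\d+),\s+"
    r"(?P<pages>\d+-\d+)\.?$"
)


def split_reference_list(text):
    """Split a reference list with one '[n] ...' entry per line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            # Drop the leading '[n]' marker if present.
            entries.append(re.sub(r"^\[\d+\]\s*", "", line))
    return entries


def parse_reference(reference):
    """Return a dict of components, or None if the style is not recognised."""
    match = REFERENCE_PATTERN.match(reference)
    return match.groupdict() if match else None


# Made-up reference list for illustration only.
reference_list = """
[1] Doe, J. (2001). An example title. Journal of Examples, 12, 34-56.
[2] An entry in some other style that this pattern does not cover
"""

parsed = [parse_reference(ref) for ref in split_reference_list(reference_list)]
```

Entries that come back as `None` would need another pattern or manual handling before being stored in the database.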
    11. 14. Citebase Search
    12. 16. Citebase Search: Navigation by Citation Links
        • Diagram labels: Current Article, Co-cited Article with reference list, Reference link, Future, Past, Related
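Navigation by citation links can be sketched from a list of citation edges. The mapping used here is an assumption read off the diagram labels: "past" is taken to mean the works the current article cites, "future" the works that cite it, and "related" the works co-cited with it. The edge data and the `navigate` helper are hypothetical.

```python
from collections import Counter


def navigate(edges, current):
    """Return (past, future, related) for the current article.

    past:    works the current article cites
    future:  works that cite the current article
    related: (work, count) pairs for works cited alongside the current
             article, ranked by the number of co-citing works
    """
    unique_edges = set(edges)  # ignore duplicate citation pairs
    past = sorted({cited for citing, cited in unique_edges if citing == current})
    future = sorted({citing for citing, cited in unique_edges if cited == current})
    co_cited = Counter()
    for citing, cited in unique_edges:
        if citing in future and cited != current:
            co_cited[cited] += 1
    return past, future, co_cited.most_common()


# Hypothetical edges: (citing work, cited work).
edges = [
    ("P1", "X"), ("P1", "Y"),
    ("P2", "X"), ("P2", "Y"), ("P2", "Z"),
    ("X", "W"),
]

past, future, related = navigate(edges, "X")
```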
    13. 18. Predicting Citation Impact
        • The Web gives us access to new metrics
            • Download/access frequency
        • Can early-day ‘download’ frequency give an indication of longer-term citation frequency?
        • (Web logs from the UK arXiv.org mirror, Citation data from Citebase Search)
        • Pearson correlation after 6 months of web logs = 0.42 for the High Energy Physics sub-arXiv
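The Pearson correlation quoted on this slide is the standard coefficient r = cov(x, y) / (σx · σy). A minimal Python sketch of the calculation follows; the download and citation figures are invented for illustration and are not the arXiv.org or Citebase data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-paper figures: early downloads and later citation counts.
downloads = [120, 45, 300, 10, 80]
citations = [14, 3, 25, 1, 9]

r = pearson(downloads, citations)
```

A value of r near 1 means papers downloaded more often early on also tend to be cited more often later; a value near 0 means downloads carry little information about citations.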
    14. 24. Assessing Research(ers)
        • Citation Impact
            • By-Paper, Author, [Journal, Institution]
        • Web Impact
            • Predictor of citation-impact, combine with citation-impact
        • Search Engines
        • More detailed research assessment
    15. 25. Comparing Online/Offline Impact
        • Using ISI CD-ROM data
        • Use Web crawlers to find ‘online’ articles
        • Compare citation impact of online and offline articles
            • By discipline, by journal, by author?
        • Initial results for Physics show 2-3x increase
            • arXiv.org
        • Southampton, U. Quebec, Oldenburg (de)
    16. 26. Relevant Web Pages
        • EPrints: http://www.eprints.org/
            • IAR: http://archives.eprints.org/
        • Citebase Search
            • http://citebase.eprints.org/
        • Celestial
            • http://celestial.eprints.org/
        • Correlation Generator
            • http://citebase.eprints.org/analysis/correlation.php
        • Tim Brody <tdb01r@ecs.soton.ac.uk>
