Metadata and Metrics to Support Open Access

This presentation, invited for a workshop on Open Access and Scholarly Books (sponsored by the Berkman Center and Knowledge Unlatched), provides a very brief overview of metadata design principles, approaches to evaluation metrics, and some relevant standards and exemplars in scholarly publishing. It is intended to provoke discussion on approaches to evaluation of the use, characteristics, and value of OA publications.

  • This work by Micah Altman (http://micahaltman.com) is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
  • Metadata can be defined variously as "data about data," digital breadcrumbs, magic pixie dust, or "something that everyone now knows the NSA wants a lot of." It's all of the above. Metadata is used to support decisions and workflows, to add value to objects (by enhancing discovery, use, reuse, and integration), and to support evaluation and analysis. It's not the whole story for any of these things, but it can be a big part.

    1. Metadata and Metrics to Support Open Access Monographs
       Prepared for Open Access and Scholarly Books, Berkman Center/Knowledge Unlatched, June 2013
       Dr. Micah Altman <escience@mit.edu>, Director of Research, MIT Libraries
    2. DISCLAIMER
       These opinions are my own; they are not the opinions of MIT, Brookings, any of the project funders, nor (with the exception of co-authored previously published work) my collaborators.
       Secondary disclaimer: "It's tough to make predictions, especially about the future!" -- Attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston Churchill, Confucius, Disraeli, Freeman Dyson, Cecil B. DeMille, Albert Einstein, Enrico Fermi, Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan Lamport, Groucho Marx, Dan Quayle, George Bernard Shaw, Casey Stengel, Will Rogers, M. Taub, Mark Twain, Kerr L. White, etc.
    3. Related Work
       • Altman (2012), "Mitigating Threats to Data Quality Throughout the Curation Lifecycle," 1-119, in Curating for Quality.
       • CODATA-ICSTI Task Group on Data Citation Standards and Practices (forthcoming 2013), Citation of Data: The Current State of Practice, Policy, and Technology, CODATA.
       • National Digital Stewardship Alliance (forthcoming 2013), National Agenda for Digital Stewardship.
       • Uhlir (ed.) (2012), Developing Data Attribution and Citation Practices and Standards: Report from an International Workshop, National Academies Press.
       Most reprints available from: informatics.mit.edu
    4. The Next 10 Minutes
       • Level setting
       • Start discussion questions
    5. Preview: Some Discussion Questions
       • Successful examples/exemplars:
         – Existing metadata, and effective uses of it, with books?
         – Graceful degradation, increasing returns, etc.?
       • Emerging requirements:
         – Explicit metadata (or identifier, integration, etc.) requirements from stakeholders?
         – In what ways do these explicitly support use, evaluation, and integration?
         – Clear implicit requirements? Licensing (CC-BY, CC0)? Identifier schemes (ISBN, DOI)? Indexing integration requirements?
         – What evidence could you envision showing your stakeholders to demonstrate success?
       • Opportunities:
         – "Easy pickings" – metadata already produced in production, dissemination, and use, but not retained?
         – "Looks-easy pickings" – opportunities for automated extraction and crowd-sourced entry and refinement?
         – Leverage points – e.g., where can effort applied to prime the pump, coordinate practice, or build infrastructure yield network effects, lower barriers to entry, create norms/nudges, or coordinate equilibria that generate incentives to continue production?
    6. What is metadata anyway?
       (a) "Data about data"
       (b) Something the NSA wants a lot of
       (c) Magic pixie dust
       (d) Digital breadcrumbs
       (e) All of the above
       Source: http://www.guardian.co.uk/technology/interactive/2013/jun/12/what-is-metadata-nsa-surveillance#meta=0000000
    7. What good is it?
       • Support decisions and workflow for production
       • Add value to the product:
         – Support discovery – descriptive information
         – Support use – re-presentation, navigation
         – Support reuse/integration – descriptive, structural, and provenance information
       • Grow the evidence base regarding OA books:
         – Characteristics of production, products, and use
         – E.g., costs, content features, authors, quality
       • Support evaluation
    8. Selected Characteristics
       • Purpose:
         – Descriptive
         – Structural
         – Administrative (identification, rights, provenance, fixity, preservation)
         – Linkages/relationships
         – Annotation
       • Granularity
       • Association model:
         – Embedded
         – Associated
         – Third party
       • Schema:
         – Mandatory elements
         – Structure
       • Ontology:
         – Semantics
         – Relationships among elements and concepts
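The "schema with mandatory elements" idea above can be made concrete. A minimal sketch in Python, describing a hypothetical OA monograph with simple Dublin Core element names (the record values and the particular choice of mandatory elements are illustrative assumptions, not from any specific application profile):

```python
# Hypothetical Dublin Core-style record for an OA monograph.
record = {
    "dc:title": "An Example Open Access Monograph",
    "dc:creator": "Doe, Jane",
    "dc:publisher": "Example University Press",
    "dc:date": "2013",
    "dc:identifier": "doi:10.9999/example.123",  # hypothetical DOI
    "dc:rights": "http://creativecommons.org/licenses/by-sa/3.0/us/",
    "dc:type": "Text",
    "dc:format": "application/pdf",
}

# Illustrative set of schema elements we treat as mandatory.
MANDATORY = {"dc:title", "dc:creator", "dc:identifier", "dc:rights"}

def missing_elements(rec, mandatory=MANDATORY):
    """Return the mandatory elements absent from a record."""
    return mandatory - rec.keys()
```

A complete record returns an empty set; a sparse one reports what a producer would still need to supply.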
    9. Design Heuristics
       • Dublin Core design principles [Duval, et al. 2002]:
         – Modularity
         – Extensibility
         – Capacity for refinement
         – Multilingualism
       • Early capture
       • Automated extraction
       • Approaching richness:
         – Progressive enhancement
         – Graceful degradation
         – Increasing returns to investment
         – Requirement -> barrier
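One practical reading of "graceful degradation" above: consumers should still get something useful from a sparse record rather than failing when rich metadata is absent. A minimal sketch, with hypothetical field names (a richer MODS-style title, then simple Dublin Core, then a bare filename):

```python
def best_title(record):
    """Graceful degradation: prefer the richest available title
    field, falling back to progressively simpler metadata."""
    for field in ("mods:titleInfo", "dc:title", "filename"):
        if record.get(field):
            return record[field]
    return "(untitled)"
```

The same fallback pattern works in the other direction as progressive enhancement: start with the minimal field and upgrade the display when richer elements appear.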
    10. Evaluation
        • Measurement characteristics:
          – Scope: local vs. ego-centric vs. global measures
          – Duration: point in time vs. period vs. trend
          – Measurement scale: absolute vs. proportion vs. rank vs. pairwise comparisons vs. purely descriptive (e.g., usage stories)
        • Inputs:
          – Content
          – Associated meta-information
          – External behaviors, actions (awards), reputation
        • Use characteristics:
          – Understandability (cognitive burden) of metrics
          – Dissemination and adoption strategy
          – Incentives to be strategic to affect measures
        • Some emerging approaches:
          – Proxies for interest (citation counts)
          – Proxies for use (downloads, reading patterns, annotation patterns, data citations)
          – Proxies for (predictive) value (journal impact metrics, h(g,i)-indices, PageRank, Google rank, models of network evolution)
        [See Borner, et al. 2004; Kurtz & Bollen 2010; Bollen et al. 2009; Uhlir 2012; CODATA-ICSTI Task Group on Data Citation Standards and Practices 2013]
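Of the proxy measures listed above, the h-index is simple enough to sketch directly: the largest h such that h of an author's works have at least h citations each.

```python
def h_index(citations):
    """Return the h-index of a list of per-work citation counts:
    the largest h such that h works have >= h citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank  # this work still clears the threshold
        else:
            break
    return h

# For example, counts [10, 8, 5, 4, 3] give an h-index of 4.
```

The same shape of computation (sort, threshold) underlies the g- and i-index variants mentioned on the slide, with different threshold rules.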
    11. Ecosystem Integration
        • Usage:
          – SUSHI/COUNTER (NISO Standardized Usage Statistics Harvesting Initiative), http://www.niso.org/workrooms/sushi/
          – A protocol for transmission of usage statistics, plus practices and a schema for formatting and collecting them
        • Digital work identifiers/locators:
          – Exemplars: DOIs, OpenURL
          – Use of identifiers internal to a monograph adds value for later use and evaluation
          – Use of an identifier or standard locator to refer to the work provides a potential leverage point for collecting usage metrics
        • Other identifiers:
          – FundRef – funding identifiers
          – ORCID/ISNI – contributor identifiers
          – Data citations – citations to data and other non-traditional scholarly publications
          – Embedding these in a monograph adds value to the evidence base
          – Useful for evaluations, especially those likely to align incentives among funders and contributors
        • De facto discovery, use, and evaluation:
          – E.g., Google, Amazon
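Part of what makes identifier schemes like ISBN useful leverage points is that they carry built-in integrity checks, so embedded identifiers can be validated mechanically. A small sketch of the standard ISBN-13 check-digit calculation (weights alternate 1 and 3 across the first twelve digits):

```python
def isbn13_check_digit(first12):
    """Compute the ISBN-13 check digit from the first 12 digits,
    given as a string, using alternating weights 1 and 3."""
    total = sum((1 if i % 2 == 0 else 3) * int(d)
                for i, d in enumerate(first12))
    return (10 - total % 10) % 10

# e.g. "978030640615" yields check digit 7, matching
# the published ISBN 978-0-306-40615-7.
```

A pipeline harvesting monograph metadata could use this to flag mistyped ISBNs before they pollute the evidence base.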
    12. Examples: Current State of the Practice
        • Institutional repository metrics:
          – Harvard DASH user stories: https://osc.hul.harvard.edu/dash/stories
          – MIT Global Impact: http://dspace.mit.edu/handle/1721.1/49433
          – SSRN author and paper metrics: http://hq.ssrn.com/rankings/Ranking_display.cfm?TRN_gID=10&requesttimeout=900
        • Aggregators:
          – Project Muse: http://muse.jhu.edu/about/stats.html
          – HighWire: http://sushi.highwire.org/
          – HathiTrust Research Center: http://www.hathitrust.org/htrc
    13. Some Discussion Questions
        (This slide repeats the discussion questions previewed on slide 5.)
    14. Questions?
        E-mail: escience@mit.edu
        Web: micahaltman.com
        Twitter: @drmaltman
