A Hint of Mint Peter Sefton [email_address] Duncan Dickinson [email_address]
Funding by ANDS
The Mint is an authority control / vocabulary server designed to supply authority services to repositories. It is designed to be a practical tool for working towards Linked Data repositories, making it easy to build high-quality metadata collection and discovery systems.

Published in: Education
  • It's not just parties that have multiple IDs. Take this example of the range of ways different IRs in Australian universities fill out the resource type in OAI-PMH.
  • Parties typically have multiple IDs issued by multiple parties. We have to accept this and work with it, and try to provide matching services where we can (within the limits of privacy legislation).
  • Not really. But 
  • TBL's linked data was practical advice about how to start building the semantic web. If you stop and think for too long you can end up paralysed by complexity, such as whether a given URI is for or about the dog. The bottom line is that if people can't copy and paste from their browser, or have the system assign URIs for them in the background, then we will not build the semantic web.
  • I wanted to note that while we built data cleansing features into the Mint from the beginning, these were the hardest for our partner university to make happen. Library staff need to fight for IT resources, and getting the library resources lined up to do data cleaning takes time and commitment.
  • The team built a name matching system which can import data from the IR and match it against authoritative people-records from the research office system. It uses simple text searches in Solr/Lucene to find publications that might be by a particular researcher, but a human has to inspect every record. There is a similar system developed in Australia called NicNames, but it does not have the import and export features or the APIs needed for The Mint.
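The name-matching note above can be illustrated with a short Python sketch. It is not the Mint's implementation: it swaps the Solr/Lucene text search for an in-memory token comparison, and the `Researcher` record, its fields, the example.org URIs and the `candidate_matches` function are all hypothetical names chosen for illustration. As in the note, it only proposes candidates; a person still has to inspect each one.

```python
import re
from dataclasses import dataclass


@dataclass
class Researcher:
    """Hypothetical authority record from a research office system."""
    uri: str
    family_name: str
    given_name: str


def tokens(text: str) -> set[str]:
    """Lowercase a name string and split it into alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def candidate_matches(author_string: str, researchers: list[Researcher]) -> list[Researcher]:
    """Return researchers who might be the author named in author_string.

    A researcher is a candidate when every token of the family name occurs
    in the string and the remaining tokens are empty, match a given name,
    or match a given-name initial. The result is for human review only.
    """
    found = tokens(author_string)
    candidates = []
    for person in researchers:
        family = tokens(person.family_name)
        if not family <= found:
            continue
        given = tokens(person.given_name)
        others = found - family
        compatible = any(
            t in given or (len(t) == 1 and any(g.startswith(t) for g in given))
            for t in others
        )
        # With no given-name evidence at all, keep the record as a candidate.
        if not others or compatible:
            candidates.append(person)
    return candidates


# Placeholder records; the URIs are invented for this example.
RESEARCHERS = [
    Researcher("http://example.org/people/1", "Smith", "John"),
    Researcher("http://example.org/people/2", "Smith", "Mary"),
    Researcher("http://example.org/people/3", "Jones", "John"),
]

for person in candidate_matches("Smith, J.", RESEARCHERS):
    print(person.uri, person.family_name, person.given_name)
```

A real deployment would replace the token comparison with queries against the search index and would record the reviewer's decision against each candidate.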

    1. A Hint of Mint. Peter Sefton [email_address], Duncan Dickinson [email_address]
    2. Funding by ANDS
    3. Background: The ReDBox application
         • A registry of metadata about research data (Research Data Box)
         • Driven by both compliance and openness
         • Funded by The Australian National Data Service (ANDS)
         • Developed in collaboration with The University of Newcastle
         • Based on The Fascinator (Fedora + Solr) platform
         • Both ReDBox and Mint are Java/Python based and under an open-source license
    4. Motivation
         • The cognoscenti talk about 'linked data' and have RDFa license plates on their cars, but...
    5. Repository metadata is all strings. Photo by http://www.flickr.com/photos/easement/
    6. "Aggregated Assault": some real IR metadata, aggregated using dc:type
         • Journal Article (184); PeerReviewed (105); Article (75); Thesis (66); Book chapter (65); NonPeerReviewed (62); Conference Paper (35); Journal Articles (Refereed Article) (30); c1 (28); techreport (27); Full-text link or file (26); Conference or Workshop Item (DEST Category E) (21); PhD Doctorate (20); Article (DEST Category C) (19); journal article (18); Book Chapter (17); text (14); Book Section (10); Report (9); Conference Publications (Full Written Paper - Refereed) (8); Conference or Workshop Item (8); e1 (8); Book Chapters (7); b1 (7); Book Chapter (DEST Category B) (5)
    7. I'm tagged but what's my identity?
    8. One dog, many identities
         • RSPCA name: Wayne
         • At the vet: Bootsy Sefton
         • Local council: Bootsy Sefton ID 555-888-888
         • RSPCA ID: 555-555-555 (owner <name-withheld>)
         • RDFID tag: 555-777-777
         • At the park: Bootsy
    9. The Mint's mission: URIs for (lost) dogs?
    10. The Mint is a practical tool
         • bdarcus: @ptsefton the point is give the dog a URI and be done; no need for philosophical contemplation ;-)
    11. The Mint's mission
         • Bring linked data to your university library systems (in addition to the new coffee shop - sorry, 'learning commons') via much-needed authority services
    12. Mint features
         • Designed for developers first, with usable APIs.
             • Simple import of vocabularies or names
                 • Spreadsheets
                 • SKOS
                 • Anything you can script
             • Lookup services return useful JSON (not raw RDF)
         • So they can help users with:
             • Usable interfaces to authority records
             • Translucent metadata: 'URIs inside'
             • Useful discovery
    13. What can you put in it?
    14. Behind the scenes - lookup
         • http://novadev2.newcastle.edu.au:8086/mint/master/opensearch/lookup?searchTerms=smith to look up names matching "smith".
           {"OpenSearchResponse": { "title": "General Search", ... }, "namespaces": { "dc": "http://purl.org/dc/terms", ... }, "results": [ { "result-metadata": { "relevance": 1.482809 }, "rdf:about": "http://novadev2.newcastle.edu.au:8086/mint/default/master/detail/a8692f6d28486eddebd317d1bc0939a4", ... }
    15. Mint features: The data day spa*
         • Fire up a Mint instance
         • Import data
         • Clean it
         • Export it
         • Kill the software
    16. Data loves going to the day spa
    17. Day spa example: Name matching
    18. The compulsory architecture diagram
    19. Links
         • ReDBox-Mint:
             • Public site: https://sites.google.com/site/redboxmint/
             • Development site: http://code.google.com/p/redbox-mint/
         • The Fascinator:
             • Public site: http://sites.google.com/site/fascinatorhome/
             • Development site: http://code.google.com/p/the-fascinator/
         • Author sites:
             • Peter Sefton: http://ptsefton.com
             • Duncan Dickinson: http://duncan.dickinson.name
    20. Acknowledgements
         • The work described here is a collaboration between:
             • The University of Southern Queensland
             • The University of Newcastle
             • Swinburne University of Technology
         • The authors wish to acknowledge and thank the development team of:
             • Oliver Lucido
             • Linda Octalina
             • Greg Pendlebury
             • Ron Ward
    21. Thanks. Questions?
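The lookup call on slide 14 can be exercised from a client along the following lines. This is a rough sketch rather than a documented client: the `lookup_url` and `extract_matches` helpers are hypothetical, the sample response is modelled only on the abbreviated JSON shown on the slide (elided fields are left out, not guessed), and no network request is made because the slide's development server may no longer exist.

```python
import json
from urllib.parse import urlencode


def lookup_url(base: str, search_terms: str) -> str:
    """Build an OpenSearch-style lookup URL in the shape shown on slide 14."""
    return f"{base}/opensearch/lookup?{urlencode({'searchTerms': search_terms})}"


def extract_matches(response_text: str) -> list[tuple[str, float]]:
    """Return (URI, relevance) pairs from a lookup response, best match first.

    Assumes each entry in "results" carries an "rdf:about" URI and a
    "result-metadata" object with a "relevance" score, as on the slide.
    """
    data = json.loads(response_text)
    matches = []
    for result in data.get("results", []):
        uri = result.get("rdf:about")
        relevance = result.get("result-metadata", {}).get("relevance", 0.0)
        if uri:
            matches.append((uri, relevance))
    return sorted(matches, key=lambda match: match[1], reverse=True)


# Sample response assembled from the fragments visible on the slide.
SAMPLE_RESPONSE = json.dumps({
    "OpenSearchResponse": {"title": "General Search"},
    "namespaces": {"dc": "http://purl.org/dc/terms"},
    "results": [
        {
            "result-metadata": {"relevance": 1.482809},
            "rdf:about": "http://novadev2.newcastle.edu.au:8086/mint/default/"
                         "master/detail/a8692f6d28486eddebd317d1bc0939a4",
        }
    ],
})

print(lookup_url("http://novadev2.newcastle.edu.au:8086/mint/master", "smith"))
for uri, relevance in extract_matches(SAMPLE_RESPONSE):
    print(relevance, uri)
```

The point of returning the URI alongside the display data is the 'URIs inside' idea from slide 12: the user picks a name, and the repository stores the identifier behind it.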
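The "Aggregated Assault" counts on slide 6 show why free-text dc:type strings are hard to use. As a small illustration of what an authority service makes possible, the sketch below folds a subset of those strings onto concept URIs and re-aggregates the counts. The mapping table, the example.org URIs, and the decision to treat a bare "Article" as a journal article are assumptions made for this example; they do not come from the talk or from any Mint vocabulary.

```python
import re
from collections import Counter

# A subset of the dc:type counts from slide 6.
RAW_TYPE_COUNTS = {
    "Journal Article": 184,
    "Article": 75,
    "journal article": 18,
    "Book chapter": 65,
    "Book Chapter": 17,
    "Book Chapters": 7,
    "Book Chapter (DEST Category B)": 5,
    "c1": 28,
}

# Hypothetical concept URIs standing in for a controlled vocabulary.
JOURNAL_ARTICLE = "http://example.org/vocab/type/journal-article"
BOOK_CHAPTER = "http://example.org/vocab/type/book-chapter"
UNMAPPED = "unmapped"

# Hypothetical mapping from normalised labels to concepts.
CONCEPTS = {
    "journal article": JOURNAL_ARTICLE,
    "article": JOURNAL_ARTICLE,  # assumption: a bare "Article" means a journal article
    "book chapter": BOOK_CHAPTER,
    "book chapters": BOOK_CHAPTER,
}


def normalise(label: str) -> str:
    """Drop parenthetical qualifiers, lowercase, and collapse whitespace."""
    without_qualifiers = re.sub(r"\s*\([^)]*\)", "", label)
    return " ".join(without_qualifiers.lower().split())


def aggregate(raw_counts: dict[str, int]) -> Counter:
    """Sum the raw counts per concept URI; unknown labels go to UNMAPPED."""
    totals = Counter()
    for label, count in raw_counts.items():
        totals[CONCEPTS.get(normalise(label), UNMAPPED)] += count
    return totals


for concept, total in aggregate(RAW_TYPE_COUNTS).most_common():
    print(total, concept)
```

Labels such as "c1" stay unmapped here on purpose: deciding what they mean is a data-cleaning task for a person, not something string normalisation can settle.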