Spalding Metadata The Three Legged Dance

This presentation was part of the 11th Annual NISO-BISG Forum at ALA, "Ensuring the Integrated Information Experience," and was given by Tim Spalding of LibraryThing on June 23, 2017.

Published in: Education

  1. 1. Metadata: The Three-Legged Dance Tim Spalding ALA NISO-BISG Forum | June 23, 2017 tim@librarything.com @LibraryThingTim
  2. 2. Who am I? Book-lover, ex-scholar, programmer LibraryThing (2005) LibraryThing for Libraries (2007) TinyCat (2016) Syndetics Unbound (2016)
  3. 3. At the Intersection Of… Readers Collectors Libraries Academic, Public, School, "Tiny" Online booksellers Bookstores Publishers Authors Also: archives, scholars, famous dead people with books, music and movie lovers
  4. 4. Data is Good Everyone their data Every data its glorious purpose Every data its data that makes it better
  5. 5. My Approach to Data Is… Loving Respectful Flexible Statistical Optimistic as to what librarians can do…
  6. 6. The Three-Legged Stool Professional data User data Content data (a very, very simplified framework)
  7. 7. Professional Data Library cataloging (MARC, BIBFRAME) Publisher/bookseller (ONIX, Amazon, Bowker) Classification (DDC, LCC, BIC, BISAC, LCSH) Professional reviews Bibliographies and guides (LibGuides, bibliographic monographs) Reading levels (Lexile, AR, F&P)
  8. 8. User Data Intentional User reviews Ratings Tags Annotations Lists Discussions User book recommendations Implicit Purchase patterns Ownership patterns Checkout patterns Reading patterns Popularity
  9. 9. Content Data Text of book Samples, quotes, etc. Tables of contents Indexes Word and phrase statistics In-text references and footnotes
  10. 10. One-Legged Stools: "Recommendations," "Similar Books," etc. Recommendations by bibliographic data alone Recommendations by subject alone Recommendations by statistics alone Recommendations by content alone
  11. 11. One-Legged Stools: Recommendations Recommendations by bibliographic data alone Recommendations by subject alone Recommendations by statistics alone Recommendations by content alone
  12. 12. One-Legged Stools: Recommendations Recommendations by bibliographic data alone Recommendations by subject alone Recommendations by statistics alone Recommendations by content alone
  13. 13. Recommendations too much by statistics? Boring Repetitive Keep people in their bubble No serendipity, surprise No taste!
  14. 14. One-Legged Stools: Recommendations Recommendations by bibliographic data alone Recommendations by subject alone Recommendations by statistics alone Recommendations by content alone
  15. 15. Solution: Add a leg or two… Let users act like professionals Use statistics on classification
  16. 16. "Everyone a Librarian" Improved Author disambiguation (1,741,282) Edition/work control (5,544,233) Canonical book titles Series Author name variants Created Work relationships (contained in, commentary on, parody of, etc.) Awards Places, characters, events Author picture Author information (education, family, occupation, nationality, etc.)
  17. 17. The Dewmoji! 174.3 = 💭 🚎 🙈 ⚖ 1 💭 Philosophy and Psychology 7 🚎 Ethics 4 🙈 Professional Ethics .3 ⚖ Lawyers
  18. 18. "Everyone's a librarian?" Ha. Add ANOTHER leg. Librarians at LibraryThing vet USER DATA: Tag approval (LibraryThing has 135m tags; 75% belong to 30,000 unique tags) Series approval Award approval Picture approval Review approval
  19. 19. Solution: Add a leg or two… Let users act like professionals. Use user statistics on professional data Does that classification map to user/usage data?
  20. 20. DDC against "people who have X have Y" Clusters well — high "salience" 618.4 — Birthing books 668.1 — Soapmaking 638.1 — Beekeeping Clusters terribly — low "salience" All literature in DDC 796.1 — Miscellaneous games 225.6 — New Testament > Hermeneutics, Exegesis
  21. 21. How we do Recommendations Basic Factors "People who have X have Y" statistics Three different statistical approaches Shared tags Reorder and Drop Ratings Reviews User recommendations User up and down votes LT Popularity curves Library popularity curves Tag "salience" Tag approval tag-to-author Classification systems Classification salience Series Series order Series-order importance Author clustering In-house algorithmic genre system Crosswalks from genre to tag, etc. Final factor: TASTE! Mix of authors, popularities, genres, etc.
  22. 22. Steal Someone's Leg Users do stuff to Professional data Users add and improve bibliographic information Professionals do stuff to user data Professional curation of tags, reviews Professionals pretend to be users Publishers suggest similar books
  23. 23. Random Hortatory Slogans Use all the data you can Free your data Use data by others, even distant others Be flexible Use statistics Don't be afraid of users But don't let them run rampant either… Cede ground … … Take ground Add professional value to non-professional data
  24. 24. Thank you! tim@librarything.com @LibraryThingTim
  25. 25. Idea: What's the best shelf-order system? Lay out an entire "typical" library in one long line by classification Take data on non-library clustering (e.g., people who have X have Y) Calculate the average distance you'd have to travel
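
The shelf-order idea on the last slide can be sketched roughly as follows. This is only an illustration, not anything the talk describes as implemented: the function name `average_shelf_distance`, the book identifiers, and the tiny data set are all hypothetical, and "distance" is assumed here to mean the number of shelf positions between two books that readers tend to own together.

```python
def average_shelf_distance(shelf_order, co_owned_pairs):
    """Mean number of shelf positions between books that are owned together.

    shelf_order: book IDs in the order a classification would shelve them
                 (the library laid out in one long line).
    co_owned_pairs: (book, book) pairs taken from "people who have X have Y"
                    data. Pairs are unweighted here, which is an assumption.
    """
    position = {book: i for i, book in enumerate(shelf_order)}
    distances = [
        abs(position[a] - position[b])
        for a, b in co_owned_pairs
        if a in position and b in position  # skip books the library lacks
    ]
    return sum(distances) / len(distances) if distances else 0.0


# Hypothetical data: two beekeeping books and two soapmaking books.
co_owned = [("bees_1", "bees_2"), ("soap_1", "soap_2")]

# Ordering A keeps each topic together; ordering B interleaves the topics.
order_a = ["bees_1", "bees_2", "soap_1", "soap_2"]
order_b = ["bees_1", "soap_1", "bees_2", "soap_2"]

score_a = average_shelf_distance(order_a, co_owned)
score_b = average_shelf_distance(order_b, co_owned)
print(score_a, score_b)
```

Under these assumptions, a lower score means a reader has less distance to travel between books that go together, so ordering A would rank as the better shelf order. A fuller version might weight each pair by how often the two books are co-owned.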
