
Bringing Insight Into Data: Info Pros’ Role in Text and Data Mining 



I offer insights into how info pros can elevate their role in text and data mining (TDM) projects within their organizations by better appreciating which aspects of TDM benefit most from an info pro point of view and how they can best leverage their professional expertise in this new field. Presented at SLA 2018 Annual Conference, June 10, 2018.

Published in: Business


  1. Bringing Insight to Data: Info Pros’ Role in Text and Data Mining. Mary Ellen Bates, 10 June 2018
  2. librarians/landing/textanddata
  3. What we’ll cover
     • Text and data mining in the info pro context
     • TDM use cases in the info pro world
     • Info pros’ role in TDM projects
  4. TDM in the info pro context
  5. Evolution of information
     • Print indexes, catalogs
     • Bibliographic databases & Boolean search
     • Full-text databases
     • Text and data mining of “semantic triples” (a.k.a. info bits)
  6. TDM = extraction of info bits
     • Structured info forms a semantic triple: Subject — predicate — object
     • Semantic triple for “the sky is blue” could be: sky — has_the_color — blue
     • Mapped to identifiers: sky → genid:ARP40722, blue → hex:#0000FF
  7. TDM = extraction of info bits: semantic triples from Wikipedia article
     • shingles — is_also_called — herpes_zoster
     • shingles — is_caused_by — varicella_zoster_virus
     • varicella_zoster_virus — is_treated_with — acyclovir
     • immunosuppression — is_risk_factor_for — shingles
  8. TDM = extraction of info bits: semantic triples from a bib cite
     • Article_X — has_author — Doe,_John
     • Article_X — published_in — Heredity
     • Doe,_John — has_affiliation — Drexel_University
     • Article_X — funded_by — grant_123
     • Article_X — has_subject — Alzheimer’s_Disease
  9. TDM shows new insights
     • Google Books Ngram Viewer
     • TDM applied to full text of books
     • Parses each word and sentence
     • Shows unexpected trends
  10. *_NOUN nurse
     • Show me the 10 most common nouns preceding the word “nurse” over time
  13. Linked open data
     • Linked data enables meaningful connections across content
     • Normalized data = enhanced discovery
     • Full-text content + linked OPEN data + APIs = WOW!
  15. Linked open data examples
     • Library of Congress Linked Data Service: datasets of subject headings, names, etc.
     • Not all govt datasets are LOD
     • Structured data extracted from Wikipedia
  16. Linked open data examples
     • Springer Nature SciGraph
     • View visual patterns in large datasets
     • See relationships across disciplines, formats
  18. Article: Adherence to a Mediterranean diet and Alzheimer’s disease risk in an Australian population
  19. Linked open data examples
     • PubChem open chemistry database
     • Use API to expand search for all names for a substance
     • See links from PubChem to Springer Nature articles
  22. “TDM enables us to do more complex searches using a large number of synonyms through ontologies, when full-text searches reach their limits.” – pharma info scientist
  23. TDM use cases
  24. TDM use cases
     • Springer Nature Journal Suggester
     • You provide manuscript title & abstract; it recommends where to submit MS
  27. TDM use cases
     • Help searchers gauge article significance: API that takes article DOI and calculates # of citations to that article
     • Build an internal open-access image repository: monitor open access journals for specific type of image
  28. TDM use cases
     • Enhance discoverability of internal knowledge: integrate backlinks from internal content to DBpedia entries
     • ID high-quality vs. predatory conferences: chart # of institutions represented by speakers, citation & reference metrics of speakers
  29. TDM use cases
     • Ensure more comprehensive searches: API that looks up search terms for MeSH equivalent, appends all terms w/in that concept
     • “opioid dependence” → Opioid-Related Disorders, Heroin Dependence, Morphine Dependence
  30. TDM use cases
     • Conduct pharmacovigilance: improved ability to identify adverse events, pharmacological substances
     • Monitor competitor patent filings: cut through patent obfuscation
  31. TDM use cases: create dashboard for business intelligence
     • Monitor key publications
     • What institutions are publishing research?
     • Who are the most cited researchers at an institution?
     • What institutions are receiving grants?
  32. Additional info
     • More info on Springer Nature’s TDM tools at
     • More info on Springer Nature’s APIs for scholarly content at
  33. Info pros’ role in TDM projects
  34. TDM is like a really smart info pro
     • Monitors professional journals
     • Tracks conference presentations & speakers
     • Analyzes trends in grants and funding sources
     • Knows who’s working on what internally
     • Makes connections among all those info-bits
  35. Info pros’ role in TDM: we’re info pros
     • We can think creatively about information
     • We understand uses of structured data
     • We know what metadata is needed
     • We care about quality of taxonomies
     • We know what resources we have
     • We understand our clients’ info needs, search behavior
  36. Info pros’ role in TDM: we can evaluate quality, coverage of datasets
     • Develop quality, cost checklist for your clients
     • Monitor govt agencies, open-data initiatives
     • Monitor – repository of research data
  37. Info pros’ role in TDM: we conduct strategic reference interviews
     • Clients only ask for what they think we can get
     • We understand the client’s use case
     • We ask “What’s essential? What’s nice to have?”
     • We think creatively about finding answers
  38. Info pros’ role in TDM
     • We understand copyright issues
     • Greater discoverability = greater demand for content
     • We leverage information for greater ROI
     • And we know how to promote info resources internally
     • Higher ROI for online content subscriptions
  39. Info pros’ role in TDM
     • We collaborate with other groups: TDM projects involve teams from collections development, cataloging/metadata, IT, outreach
     • We see the bigger picture: we think beyond the tool or resource to delivering insight
  40. Want more info?
     • Request the white paper through landing/textanddata or email Caitlin Cricco at
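
The semantic triples on slides 6 to 8 can be illustrated with a small Python sketch. It is not taken from the deck or from any Springer Nature tool. The `Triple` type and the `match` and `affiliations_for_article` helpers are hypothetical names introduced here. The data is only the example triples shown on the slides, held in a plain in-memory list rather than a real triple store.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One subject / predicate / object statement (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


# Example triples copied from slides 7 and 8; no other data is assumed.
TRIPLES: List[Triple] = [
    Triple("shingles", "is_also_called", "herpes_zoster"),
    Triple("shingles", "is_caused_by", "varicella_zoster_virus"),
    Triple("varicella_zoster_virus", "is_treated_with", "acyclovir"),
    Triple("immunosuppression", "is_risk_factor_for", "shingles"),
    Triple("Article_X", "has_author", "Doe,_John"),
    Triple("Article_X", "published_in", "Heredity"),
    Triple("Doe,_John", "has_affiliation", "Drexel_University"),
    Triple("Article_X", "funded_by", "grant_123"),
    Triple("Article_X", "has_subject", "Alzheimer’s_Disease"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def affiliations_for_article(triples: List[Triple], article: str) -> List[str]:
    """Two-hop lookup: article -> has_author -> author -> has_affiliation."""
    authors = [t.obj for t in match(triples, subject=article, predicate="has_author")]
    return [
        t.obj
        for author in authors
        for t in match(triples, subject=author, predicate="has_affiliation")
    ]


if __name__ == "__main__":
    # Which statements point at shingles as their object?
    print(match(TRIPLES, obj="shingles"))
    # Which institutions are connected to Article_X through its authors?
    print(affiliations_for_article(TRIPLES, "Article_X"))
```

The two-hop lookup is the kind of connection the deck describes: no single triple links Article_X to Drexel_University, but chaining two triples through a shared author does.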