Solving the Challenge of Connecting People and Author Networks

Presented by Dr. Jay Ven Eman, CEO of Access Innovations, Inc. on September 14, 2011. Part three of the Special Libraries Association's Leveraging Your Taxonomy series.

Published in: Education, Technology
  • Post processing: “Labels” content item, but also classifies author
  • Preprocessing beats post processing
  • Preprocessing – helps a bunch

    1. 1. Welcome to the SLA Taxonomy Division’s “Webinar Wednesdays”
        September 14, 2011, 1:00 PM EDT
        Today’s Topic: Solving the Challenge of Connecting People and Author Networks
        Access Innovations, Inc.
        Changing search to found
        Jay Ven Eman, Ph.D., CEO
        j_ven_eman@accessinn.com
        www.accessinn.com
        www.dataharmony.com
        +1.505.998.0800
        Albuquerque, NM
    2. 2. Next up for the SLA Taxonomy Division’s “Webinar Wednesdays”
        “Why Information Architecture on SharePoint”
        Presented by Joe Shepley, Doculabs
        October 12, 2011, 1:00 PM EDT
        “Drilling Down to the Challenge of a SharePoint Taxonomy Implementation”
        Presented by Marjorie M.K. Hlava, Access Innovations, Inc.
        November 9, 2011, 1:00 PM EST
        “Taxonomies for Publishing: Enhancing the User Experience”
        Presented by Jay Ven Eman, Access Innovations, Inc.
        December 14, 2011, 1:00 PM EST
    3. 3. Goals and Agenda
        Why is it important right now that we develop better resources about people?
        What roles can taxonomies play in this effort?
        What opportunities are being created for knowledge discovery and collaboration?
        What broad initiatives and technologies should we be aware of?
    4. 4. Magnitude of the Problem:
        Facebook - 700 Million Users Projected for 2011 (Open-First)
        700 Million Names
        How will your boss, peers, anyone ever find you?
    5. 5. What’s in a name?
        Juliet: "What's in a name? That which we call a rose
        By any other name would smell as sweet."
        Romeo and Juliet (II, ii, 1-2)
    6. 6. What’s in a name?
        My name
        Jay Ven Eman
        Ven Eman, Jay
        <First_Name>Jay</First_Name>
        <Last_Name>Ven Eman</Last_Name>
          • Name variants
          • Aliases
        Jay Von Eman
        William Henry McCarty
        Henry Antrim
        William H. Bonney
        Billy the Kid
        Jay Van Eman
        Jay van Eman
        Jay ven Eman
        Jay Veneman
          • National & Cultural Conventions
        Jay Venema
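The name variants on the slide above can be illustrated with a small sketch. This is only one possible approach, not the presenter's method: the helper names `name_key` and `similarity` are hypothetical, and the choice of Python's standard `difflib.SequenceMatcher` for near matches is an assumption made for illustration.

```python
import unicodedata
from difflib import SequenceMatcher


def name_key(name: str) -> str:
    """Build a crude match key (hypothetical helper, not from the talk).

    Reorders "Last, First" to "First Last", then drops case, accents,
    spaces, and punctuation.
    """
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed.lower() if ch.isalnum())


def similarity(a: str, b: str) -> float:
    """Similarity of two names' keys, from 0.0 (unrelated) to 1.0 (identical)."""
    return SequenceMatcher(None, name_key(a), name_key(b)).ratio()


if __name__ == "__main__":
    for variant in ["Ven Eman, Jay", "Jay ven Eman", "Jay Veneman", "Jay Von Eman"]:
        print(variant, "->", name_key(variant), round(similarity("Jay Ven Eman", variant), 2))
```

String similarity of this kind only helps with spelling and ordering variants. True aliases such as "William H. Bonney" and "Billy the Kid" share almost no characters, so they would still need an explicit authority record linking them.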
    8. 8. One Person: many representations and affiliations
        VIAF: Virtual International Authority File
        http://viaf.org/viaf/95216565/
    9. 9. Why should you care?
        Promotion (tenure)
        Networking
        Social media
        Peer review
        Attribution
        Citation accuracy
        “Discoverability”
        Collaboration
          • Revenue
          • Member recruiting
          • Member retention
          • User satisfaction
          • Better research
          • Better tech transfer
          • Sharing
          • Semantic Web
    16. 16. What do we do about it?
        Semantics
        Taxonomies
        Thesauri
        Classification/indexing
        Markup
        Unstructured to… Structured
        Name disambiguation
    17. 17. The Semantic Roadmap: Knowledge Organization Systems (KOS)
          • Linked Entities
          • Contextual Specificity
          • Complex
          • High value
        Semantic network
        Ontology
        Thesaurus
        Taxonomy
        Controlled vocabulary
        Synonym set/ring
        Name authority file
        Uncontrolled list
          • Unrelated Entities
          • Ambiguity
          • Simple
          • Low Value
    29. 29. Integration / workflow
        APIs, Client/Server, Web Services, HTTP-TCP/IP
        Author Submission System
        Books
        Content Repository “A” Or Intermediate Processes
        Conference Proceedings
        Content Repository “B”, etc.
        Thesaurus Master
        M.A.I.
        ETC.
        Data Harmony MAIstro Server
        Web Sites
        Classification System
    30. 30. Select the document collection
        CMS
        Please select the database and the document directory to load
    31. 31. CMS
        Load the documents
    32. 32. Sample unstructured document
    33. 33. Run the documents through a metadata extraction process to create well-formed, rich XML
          • Automatic (per doc template)
          • E.g. Dublin Core Metadata
          • Bibliographic citation
    35. 35. Automatically add the taxonomy terms
        Entity extraction: People, Places, Things
        Conceptual indexing: using the taxonomy
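The "conceptual indexing: using the taxonomy" step above can be sketched roughly as a synonym lookup. This is a deliberately naive illustration and makes no claim about how Data Harmony's M.A.I. works: the mini-taxonomy `TAXONOMY`, its synonym lists, and the function `suggest_terms` are all hypothetical.

```python
import re

# Hypothetical mini-taxonomy: preferred term -> phrases that should trigger it.
TAXONOMY = {
    "Indexing": ["indexing", "index terms"],
    "Thesauri": ["thesaurus", "thesauri"],
    "Classification": ["classification", "classifying"],
}


def suggest_terms(text: str) -> list[str]:
    """Return the preferred terms whose trigger phrases occur in the text."""
    lowered = text.lower()
    hits = []
    for preferred, triggers in TAXONOMY.items():
        # Whole-word matching so "index" inside another word does not fire.
        if any(re.search(rf"\b{re.escape(t)}\b", lowered) for t in triggers):
            hits.append(preferred)
    return hits


if __name__ == "__main__":
    print(suggest_terms("The process of indexing a content object begins with a thesaurus."))
```

Mapping several trigger phrases to one preferred term is what keeps the assigned subjects consistent across documents, which is the point of indexing against a controlled vocabulary rather than free keywords.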
    36. 36. Classification Process or Assigned Indexing
        <Anchor>
        <Date>09-14-11</Date>
        <TI>“Solving the Challenge”</TI>
        <BLH>By</BLH>
        <Author>
        <AU_FN>Jay</AU_FN>
        <AU_MI></AU_MI>
        <AU_LN>Ven Eman</AU_LN>
        </Author>
        <Body>The process of indexing a content object begins with…</Body>
        <Subject>Indexing</Subject>
        <Subject>Thesauri</Subject>
        <Subject>Standards</Subject>
        <Subject>Classification</Subject>
        </Anchor>
        09-14-11
        “Solving the Challenge”
        By Jay Ven Eman
        The process of indexing a content object begins with…
        Thesaurus Master
        M.A.I.
        Unstructured
        Data Harmony MAIstro Server
        Structured
        Content Repository e.g. Database
        Classification System
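Once a document is in the structured form shown on the slide above, author and subject fields can be read out directly. The sketch below parses that sample record with Python's standard `xml.etree.ElementTree`; the function name `read_record` and the shape of the returned dictionary are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# The sample record from the slide, reproduced as a string.
RECORD = """<Anchor>
  <Date>09-14-11</Date>
  <TI>“Solving the Challenge”</TI>
  <BLH>By</BLH>
  <Author>
    <AU_FN>Jay</AU_FN>
    <AU_MI></AU_MI>
    <AU_LN>Ven Eman</AU_LN>
  </Author>
  <Body>The process of indexing a content object begins with…</Body>
  <Subject>Indexing</Subject>
  <Subject>Thesauri</Subject>
  <Subject>Standards</Subject>
  <Subject>Classification</Subject>
</Anchor>"""


def read_record(xml_text: str) -> dict:
    """Pull the author name and assigned subjects out of one record."""
    root = ET.fromstring(xml_text)
    author = root.find("Author")
    # Empty elements such as <AU_MI></AU_MI> come back as "" and are skipped.
    parts = [author.findtext(tag, default="") for tag in ("AU_FN", "AU_MI", "AU_LN")]
    return {
        "date": root.findtext("Date"),
        "author": " ".join(p for p in parts if p),
        "subjects": [s.text for s in root.findall("Subject")],
    }


if __name__ == "__main__":
    print(read_record(RECORD))
```

Because first, middle, and last name are separate elements, the same record can be rendered as "Jay Ven Eman" or "Ven Eman, Jay" without guessing where the surname starts.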
    37. 37. Integrating Identity into Publisher Systems
    38. 38. Author Submission Systems
    39. 39. Classification/indexing
        Suggested terms
    40. 40. Button to auto-extract taxonomy attributes
        Organizational (or people) profiles
        User pastes or uploads CV
        User reviews tagging for accuracy
    41. 41. Copyright © 2011 Access Innovations, Inc.
        Topics, associations, occurrences
        doc-type
        MLB
        article
        Sports
        http://www.newindexer.com/mlb.htm/
        use-for
        member-of
        descriptor-for
        Professional baseball
        author-of
        related-to
        Professional athletes
        member-of
        Baseball
        member-of
        Amateur baseball
        Smith
        member-of
        Little league
        http://www.swaa.org
    42. 42. Creating an Author Authority Database
          • Tag all articles in the repository with standard subjects
          • Export author names, subjects, institutions, locations, etc.
          • Disambiguate authors with the same or similar names
    44. 44. Goals and Agenda
        Why is it important right now that we develop better resources about people?
        What roles can taxonomies play in this effort?
        What opportunities are being created for knowledge discovery and collaboration?
        What broad initiatives and technologies should we be aware of?
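The three steps listed under "Creating an Author Authority Database" end with disambiguation, which can be sketched as follows. This is a toy rule, not the method used in the talk: it assumes two records with the same name belong to one person when they share an institution or at least one subject. The `AuthorRecord` fields, the rule in `same_person`, and the sample records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorRecord:
    """One exported (name, institution, subjects) row; fields are illustrative."""
    name: str
    institution: str
    subjects: frozenset


def same_person(a: AuthorRecord, b: AuthorRecord) -> bool:
    """Assumed rule: same name plus a shared institution or subject."""
    if a.name.lower() != b.name.lower():
        return False
    return a.institution == b.institution or bool(a.subjects & b.subjects)


def cluster(records: list) -> list:
    """Greedy single pass: add each record to the first cluster it matches."""
    clusters = []
    for rec in records:
        for group in clusters:
            if any(same_person(rec, other) for other in group):
                group.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    rows = [
        AuthorRecord("Smith, J.", "University A", frozenset({"Genetics"})),
        AuthorRecord("Smith, J.", "University B", frozenset({"Genetics", "Genomics"})),
        AuthorRecord("Smith, J.", "University C", frozenset({"Baseball"})),
    ]
    for group in cluster(rows):
        print([r.institution for r in group])
```

The subject overlap is where the first step pays off: without consistent subject tags, the two genetics records would look as unrelated as the baseball one. A single greedy pass will not merge two existing clusters that a later record bridges, so a real system would need a more careful merge step.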
    45. 45. Author Data: View by Connections…
    46. 46. … or by location…
    47. 47. …Or in the Document itself:
        http://dx.doi.org/10.1371/journal.pntd.0000228.x001
    48. 48. Visualizations: co-author networks
    49. 49. Many Repositories for Names
    50. 50. VIAF: Virtual International Authority File
        http://viaf.org
    52. 52. Project VIVO
        Designed around linked data standards: Resource Description Framework (RDF)
        VIVO’s ontology integrates data from human resource systems, grants databases, faculty annual reporting systems, and publication databases
        Free open-source software download: http://vivo.sourceforge.net
    53. 53. Detailed Profiles of Medical/BioMedical Researchers
        Contact a researcher with the desired expertise and research activity
        Focus the results
        Explore a research area
        Locate the PI for a grant
    54. 54. Information stored as Resource Description Framework (RDF)
          • Data is structured in the form of “triples” as subject-predicate-object.
          • Concepts and their relationships use a shared ontology to facilitate the harvesting of data from multiple sources.
        Dept. of Genetics
        College of Medicine
        is member of
        Jane Smith
        Genetics Institute
        has affiliations with
        Journal article
        author of
        Book
        Book chapter
        Subject
        Predicate
        Object
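The subject-predicate-object structure described above can be shown with plain Python tuples. This sketch does not use an RDF library or real URIs, and the way the diagram labels are paired into triples is one plausible reading of the flattened slide, not something the transcript states. The names `Triple`, `TRIPLES`, and `match` are hypothetical.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Assumed reading of the diagram: "Jane Smith" is the subject of every edge.
TRIPLES = [
    Triple("Jane Smith", "is member of", "Dept. of Genetics"),
    Triple("Jane Smith", "is member of", "College of Medicine"),
    Triple("Jane Smith", "has affiliations with", "Genetics Institute"),
    Triple("Jane Smith", "author of", "Journal article"),
    Triple("Jane Smith", "author of", "Book"),
    Triple("Jane Smith", "author of", "Book chapter"),
]


def match(triples: list, subject: Optional[str] = None,
          predicate: Optional[str] = None, obj: Optional[str] = None) -> list:
    """Return triples matching every field that is not None (None = wildcard)."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


if __name__ == "__main__":
    for t in match(TRIPLES, subject="Jane Smith", predicate="author of"):
        print(t.obj)
```

Every fact has the same three-part shape, so data harvested from different sources can be pooled into one list and queried the same way, provided the sources agree on the predicate vocabulary.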
    56. 56. Detailed Data Relationships
        Connections among scientists illustrated
        David Nelson
        has position in
        organization with position for
        Biomedical Informatics
        Inverse relationships are created
        is research area of
        has research area
        has position in
        featured in
        Clinical Translational Science Institute (CTSI)
        organization with position for
        Mike Conlon
        features
        Ed Tech Magazine
        Gene Anderson
        has author
        author of
        author of
        has author
        Development of an Observational Instrument to Measure Mother-Infant Separation Post Birth
        Current and accurate data revealed
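The note that "inverse relationships are created" can be sketched as a lookup table of predicate pairs. The four pairs below are the ones named on the slide. The function `with_inverses` is hypothetical, and the sample triples use placeholder names because the flattened slide does not show which person connects to which node.

```python
# Predicate pairs taken from the slide labels.
INVERSE_OF = {
    "has position in": "organization with position for",
    "has research area": "is research area of",
    "author of": "has author",
    "featured in": "features",
}
# Make the table symmetric so either direction can be looked up.
INVERSE_OF.update({inverse: forward for forward, inverse in list(INVERSE_OF.items())})


def with_inverses(triples: list) -> set:
    """Return the triples plus the inverse of each one that has a known pair."""
    result = set(triples)
    for subject, predicate, obj in triples:
        inverse = INVERSE_OF.get(predicate)
        if inverse is not None:
            result.add((obj, inverse, subject))
    return result


if __name__ == "__main__":
    # Placeholder entities, not the pairings from the original diagram.
    stated = [
        ("Researcher A", "has position in", "Institute X"),
        ("Researcher A", "author of", "Article Y"),
    ]
    for triple in sorted(with_inverses(stated)):
        print(triple)
```

Storing both directions means a profile page for an institute or an article can list its people without a separate reverse query.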
    57. 57. ORCID
        125 Participant Organizations
    58. 58. Testing Possible Matching Algorithms:
          • VIAF matching technology from OCLC
          • Author Resolver from ProQuest
          • Matching capability from OKKAM
        Access Innovations Author Authority
        ORCID Profile Exchange
        ORCID
        F67572010
    61. 61. Thank you! Questions?
        Solving the Challenge of Connecting People and Author Networks
        Access Innovations, Inc.
        Changing search to found
        Jay Ven Eman, Ph.D., CEO
        j_ven_eman@accessinn.com
        www.accessinn.com
        www.dataharmony.com
        +1.505.998.0800
        Albuquerque, NM
