Generating Researcher Networks with Identified Persons on a Semantic Service Platform

Hanmin Jung (KISTI)

Published in: Technology, Education

  1. Generating Researcher Networks with Identified Persons on a Semantic Service Platform. 15 Sep. 2009, Hanmin Jung, KISTI. BlogTalk2009. Copyright © 2004-2009, KISTI
  2. Agenda. Researcher networks would be useful for finding: collaborators; speakers (key persons of a researcher group). Issues: getting sources; resolving identities; finding experts; generating networks
  3. Getting sources …
  4. Sources: Identified Entities. Papers: 453,124 Elsevier international journal papers with full texts and metadata; Persons: 1,352,220; Topics: 339,947; Institutions: 91,514; Locations: 409,575 (with GPS coordinates); RDF triples: 283,087,518 (2008.11)
  5. Resolving identities … How to resolve identities? How to merge different identifiers into one?
  6. OntoFrame [Figure: OntoFrame 2008 architecture. Labels include Service, Search Engine, Ontologies, XML Schemata, OntoReasoner®, OntoURI®, WS API, SPARQL, SQL, Legacy DB, RDF Triple Store]
  7. Ontology: Reference and Academic Knowledge Ontologies
  8. OntoFrame: Syntactic-to-Semantic Process
     Modeling-Time Process: Design Ontology Model; Edit URI Generation Rules; Select Database & Ontology; Edit Identity Resolution Rules; Edit Mapping Rules; Test Mapping Process
     Indexing-Time Process: Normalize Field Values; Crawl Database; Apply Identity Resolution Rules; Refer Authority Data; Resolve Identities; Extract Topics; Assign URIs; Apply Mapping Rules; Apply URI Generation Rules; Generate RDF Triples
     Run-Time Process: Apply sameAs Relations
  9. Identity Resolution [Figure: case 1 and case 2, with name labels Barry, G.T. Barry, Lowden, Christian, Becker]
  10. Identity Resolution: Rules for Resolving Personal Identities. Columns: Class, Resource, Kind, Match, Relation, Source, Weight. Rows as extracted: Person Order 1; Person Name Pivot Exact Single OntoURI; Person hasInstitution Feature Exact Single OntoURI 2; Person Email Feature Number Single 4; Person hasCoauthor Feature Number Multiple OntoReasoner 1; Person hasTopic threshold 0.8
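The rule table above suggests a pivot match on the person name followed by weighted feature matches. The following is only a minimal sketch of that idea. The way the weights are combined (matched weight divided by total weight), the use of 0.8 as a cut-off on that ratio, and all names (`PersonRecord`, `same_person_score`, `is_same_as`) are assumptions made for illustration, not the OntoFrame implementation.

```python
from dataclasses import dataclass

# Hypothetical weights and threshold, loosely modeled on the slide's table.
WEIGHTS = {"institution": 2, "email": 4, "coauthor": 1}
THRESHOLD = 0.8  # assumption: applied to the normalized score below


@dataclass(frozen=True)
class PersonRecord:
    name: str
    institution: str = ""
    email: str = ""
    coauthors: frozenset = frozenset()


def same_person_score(a: PersonRecord, b: PersonRecord) -> float:
    """Return a score in [0, 1]; 0.0 when the pivot (name) differs."""
    if a.name != b.name:
        return 0.0
    matched = 0
    if a.institution and a.institution == b.institution:
        matched += WEIGHTS["institution"]
    if a.email and a.email == b.email:
        matched += WEIGHTS["email"]
    if a.coauthors & b.coauthors:  # at least one shared co-author
        matched += WEIGHTS["coauthor"]
    return matched / sum(WEIGHTS.values())


def is_same_as(a: PersonRecord, b: PersonRecord) -> bool:
    """Propose a sameAs candidate when the score reaches the threshold."""
    return same_person_score(a, b) >= THRESHOLD
```

Under these assumed weights, two records with the same name, institution, and email score 6/7 and would be proposed as sameAs candidates, while a name-and-institution match alone scores 2/7 and would not.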
  11. Identity Resolution: Authority Data
      Normalized Form | Variant Form | Kind | Class
      IBM | International Business Machines Corporation | Abbreviation | Institution
      Microsoft | MS | Abbreviation | Institution
      Microsoft | 마이크로소프트 | Korean | Institution
      London | 런던 | Korean | Location
      Academic Inc. | Academic Press Inc, LTD | Alternative | Publication
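One way to read the authority data is as a lookup from a variant form to its normalized form, scoped by class. The sketch below uses the five rows from the slide. The dictionary layout and the `normalize` function are hypothetical, and unknown values are simply passed through unchanged, which is an assumption.

```python
# (class, variant form) -> normalized form, taken from the slide's rows.
AUTHORITY = {
    ("Institution", "International Business Machines Corporation"): "IBM",
    ("Institution", "MS"): "Microsoft",
    ("Institution", "마이크로소프트"): "Microsoft",
    ("Location", "런던"): "London",
    ("Publication", "Academic Press Inc, LTD"): "Academic Inc.",
}


def normalize(entity_class: str, value: str) -> str:
    """Map a variant form to its normalized form; pass unknown values through."""
    cleaned = value.strip()
    return AUTHORITY.get((entity_class, cleaned), cleaned)
```

Keying on the class as well as the string keeps a variant such as "MS" from being rewritten when it appears in a field of another class.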
  12. Identity Resolution: sameAs Authorization
  13. Identity Resolution: sameAs Candidates
  14. ReSIST (2006 ~ 2008)
  15. ReSIST (2006 ~ 2008): Resilience Knowledge Base. "Deliverable D31: Final Workshop report" by ReSIST
  16. LOD Project: Linking Open Data Community Project. Available in RDF and SVG (Scalable Vector Graphics) versions. KISTI. http://richard.cyganiak.de/2007/10/lod/
  17. Finding experts … How to extract topics? How to determine topics of a researcher?
  18. Topic Extraction: System Architecture
  19. Topic Propagation: Propagating Topics of Entities (Article, Person)
  20. Experts Finding Process
      Knowledge expansion: making direct relations for a shorter access path
      Experts retrieval: querying with SPARQL for a given topic; converting SPARQL to SQL; using a backward chaining path
      Post-processing: grouping and counting retrieved authors; ranking by names or the number of achievements; making an XML document as the result
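The post-processing step above (group, count, rank, emit XML) can be sketched in a few lines. This is an illustration only: the input shape (one author name per retrieved achievement), the function names, and the XML element and attribute names are all assumptions, since the slides do not show the actual result schema.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def rank_experts(author_rows, by="count"):
    """Group and count authors, then rank by achievement count or by name.

    author_rows: iterable with one author name per retrieved achievement.
    """
    counts = Counter(author_rows)
    if by == "name":
        return sorted(counts.items())
    # Highest count first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def to_xml(topic, ranked):
    """Serialize the ranking as an XML string (hypothetical schema)."""
    root = ET.Element("experts", topic=topic)
    for name, count in ranked:
        ET.SubElement(root, "expert", name=name, achievements=str(count))
    return ET.tostring(root, encoding="unicode")
```

For example, `to_xml("ontology", rank_experts(["Kim", "Lee", "Kim"]))` lists Kim before Lee because Kim has two retrieved achievements.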
  21. Knowledge Expansion: Inference Rule
      @prefix isrl: <http://www.kisti.re.kr/isrl/ResearchRefOntology#>
      (?x isrl:hasCreatorInfo ?y) (?y isrl:hasCreator ?z) -> (?x isrl:createdByPerson ?z)
      [Figure: Article, hasCreatorInfo, CreatorInfo, hasCreator, Person, with createdByPerson as the direct relation from Article to Person ……]
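The inference rule above adds a direct `createdByPerson` relation wherever an article reaches a person through a `CreatorInfo` node. A minimal sketch of applying that single rule over an in-memory set of triples follows. Representing triples as plain string tuples and the function name `expand` are assumptions for illustration; the slides use a rule engine (OntoReasoner) rather than hand-written code.

```python
HAS_CREATOR_INFO = "isrl:hasCreatorInfo"
HAS_CREATOR = "isrl:hasCreator"
CREATED_BY_PERSON = "isrl:createdByPerson"


def expand(triples):
    """Apply (?x hasCreatorInfo ?y) (?y hasCreator ?z) -> (?x createdByPerson ?z)."""
    # Index CreatorInfo node -> persons for a single pass over the join.
    creators = {}
    for subject, predicate, obj in triples:
        if predicate == HAS_CREATOR:
            creators.setdefault(subject, set()).add(obj)

    inferred = {
        (article, CREATED_BY_PERSON, person)
        for article, predicate, info in triples
        if predicate == HAS_CREATOR_INFO
        for person in creators.get(info, ())
    }
    return set(triples) | inferred
```

After expansion, a query for the authors of an article needs one triple pattern instead of two, which is the shorter access path the slides refer to.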
  22. Experts Retrieval: Backward Chaining Path
  23. Generating networks … How to find a researcher group? How about similar researchers?
  24. OntoFrame 2008
  25. Researcher Networks (T, P)
  26. Researcher Networks (T, P): Process
      Getting co-author pairs for a target topic (T):
        SELECT DISTINCT ?person1 ?person2
        WHERE {
          ?article aca:yearOfAccomplishment ?year .
          FILTER(?year >= startYear && ?year <= endYear) .
          ?article aca:hasTopicOfArticle <topURI> .
          ?article aca:createdByPerson ?person1 .
          ?article aca:createdByPerson ?person2 .
          FILTER(?person1 < ?person2) .
        }
      Selecting a target researcher (P) in the pairs
      Tracing group members connected with him (seed)
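Tracing the group members connected with the seed researcher can be read as finding the connected component that contains him in the co-author graph. The sketch below does this with a breadth-first search over the pairs returned by the query. Treating the trace as a full connected-component search, and the function name `trace_group`, are assumptions; the slides do not say whether the trace is depth-limited.

```python
from collections import defaultdict, deque


def trace_group(pairs, seed):
    """Return every researcher reachable from seed through co-author pairs."""
    graph = defaultdict(set)
    for first, second in pairs:
        graph[first].add(second)
        graph[second].add(first)

    if seed not in graph:
        return set()

    seen = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current] - seen:
            seen.add(neighbor)
            queue.append(neighbor)
    return seen
```

With pairs `[("A", "B"), ("B", "C"), ("D", "E")]` and seed `"A"`, the group is A, B, and C; D and E belong to a separate group for the same topic.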
  27. Researcher Networks (P)
  28. Researcher Networks (P): Process
      Getting co-author pairs including a target researcher (P):
        SELECT ?per1 ?per2
        WHERE {
          ?article aca:yearOfAccomplishment ?year .
          FILTER(?year >= startYear && ?year <= endYear) .
          ?article aca:createdByPerson ?per1 .
          ?article aca:createdByPerson ?per2 .
          FILTER(?per1 < ?per2) .
          FILTER(?per1 = <perURI> || ?per2 = <perURI>) .
        }
      Ranking them with the frequency of co-authorship
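Because this query omits DISTINCT, each co-authored article contributes one row, so the frequency of co-authorship can be obtained by counting rows. A minimal sketch of that ranking step follows; the function name `rank_coauthors` and the plain-string identifiers are assumptions for illustration.

```python
from collections import Counter


def rank_coauthors(pairs, target):
    """Count how often each researcher appears in a pair with target."""
    counts = Counter()
    for first, second in pairs:
        if first == target:
            counts[second] += 1
        elif second == target:
            counts[first] += 1
    # most_common() sorts by descending frequency.
    return counts.most_common()
```

The first entry of the returned list is the most frequent co-author of the target researcher.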
  29. Similar Researchers
  30. Similar Researchers (P): Process (1/2)
      Getting topics of a target researcher (P):
        SELECT ?per1 ?topic
        WHERE {
          ?article aca:createdByPerson ?per1 .
          ?article aca:hasTopicArea ?topicArea .
          ?topicArea aca:hasTopicOfTopicArea ?topic .
          FILTER(?per1 = <perURI>) .
        }
      Ranking and selecting top n topics for him
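Selecting the top n topics can be sketched as counting how often each topic appears in the query rows and keeping the most frequent ones. Ranking by row frequency is an assumption, since the slides do not state the ranking criterion, and `top_topics` is a hypothetical name.

```python
from collections import Counter


def top_topics(rows, n=5):
    """rows: (person, topic) tuples as returned by the query above."""
    counts = Counter(topic for _person, topic in rows)
    return [topic for topic, _count in counts.most_common(n)]
```

The default of five matches the five topic slots, `topic[0]` to `topic[4]`, used in the second step of the process.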
  31. Similar Researchers: Process (2/2)
      Getting researchers who largely share topics with him (at least four of his top five topics):
        SELECT DISTINCT ?per2
        WHERE {
          ?per2 aca:hasTopicOfPerson ?topic1 .
          ?per2 aca:hasTopicOfPerson ?topic2 .
          ?per2 aca:hasTopicOfPerson ?topic3 .
          ?per2 aca:hasTopicOfPerson ?topic4 .
          FILTER(?per2 != <perURI>) .
          FILTER(?topic1 < ?topic2 && ?topic2 < ?topic3 && ?topic3 < ?topic4) .
          FILTER(?topic1 = <topic[0]> || ?topic1 = <topic[1]> || ?topic1 = <topic[2]> || ?topic1 = <topic[3]> || ?topic1 = <topic[4]>) .
          FILTER(?topic2 = <topic[0]> || ?topic2 = <topic[1]> || ?topic2 = <topic[2]> || ?topic2 = <topic[3]> || ?topic2 = <topic[4]>) .
          FILTER(?topic3 = <topic[0]> || ?topic3 = <topic[1]> || ?topic3 = <topic[2]> || ?topic3 = <topic[3]> || ?topic3 = <topic[4]>) .
          FILTER(?topic4 = <topic[0]> || ?topic4 = <topic[1]> || ?topic4 = <topic[2]> || ?topic4 = <topic[3]> || ?topic4 = <topic[4]>) .
        }
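The query above selects researchers who have four distinct topics drawn from the target's five top topics. The same condition can be written as a set intersection, which may be easier to read. This sketch assumes the person-to-topics relation has already been loaded into a dictionary; `similar_researchers` and its parameters are hypothetical names.

```python
def similar_researchers(person_topics, target, top_topics, min_shared=4):
    """Return researchers sharing at least min_shared of the target's top topics.

    person_topics: dict mapping a person identifier to an iterable of topics.
    """
    wanted = set(top_topics)
    return sorted(
        person
        for person, topics in person_topics.items()
        if person != target and len(wanted & set(topics)) >= min_shared
    )
```

Expressing the threshold as `min_shared` makes it easy to loosen or tighten the notion of similarity without rewriting the chain of filters.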
  32. Conclusions
      Processes to generate researcher networks: Getting sources (papers); Resolving identities (rules, authority data, sameAs); Finding experts (topics, reasoning); Generating networks (topic- and person-constrained)
      Next research topic: service mashup to get researcher networks directly
  33. “A lot of times, people don’t know what they want until you show it to them.” (Steve Jobs) Thank you. jhm@kisti.re.kr
