Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

BiOnIC: A Catalog of User Interactions with Biomedical Ontologies

153 views

Published on

BiOnIC is a catalog of aggregated statistics of user clicks, queries, and reuse counts for access to over 200 biomedical ontologies. BiOnIC also provides anonymized sequences of classes accessed by users over a period of four years. To generate the statistics, we processed the access logs of BioPortal, a large open biomedical ontology repository. We publish the BiOnIC data using DCAT and SKOS metadata standards. The BiOnIC catalog has a wide range of applicability, which we demonstrate through its use in three different types of applications. To our knowledge, this type of interaction data stemming from a real-world, large-scale application has not been published before. We expect that the catalog will become an important resource for researchers and developers in the Semantic Web community by providing novel insights into how ontologies are explored, queried and reused. The BiOnIC catalog may ultimately assist in the more informed development of intelligent user interfaces for semantic resources through interface customization, prediction of user browsing and querying behavior, and ontology summarization. The BiOnIC catalog is available at: http://onto-apps.stanford.edu/bionic.

Published in: Science
  • Shared it, also...GROW YOU DOWNLINE OVERNIGHT - Works with any mlm. Have dozens joining whatever mlm your doing today! Go to: www.mlmrc.com
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • Be the first to like this

BiOnIC: A Catalog of User Interactions with Biomedical Ontologies

  1. 1. BiOnIC: A Catalog of User Interactions with Biomedical Ontologies 16th Interna+onal Seman+c Web Conference (ISWC) Vienna, 21st - 25th October 2017 M A U L I K KA M D A R , S I M O N WA L K , TA N I A T U D O R A C H E , MA R K MU S E N Stanford Center for Biomedical Informa:cs Research maulikrk@stanford.edu
  2. 2. Benefits of analyzing user interac+ons Ø  Ontology Engineers: v  Iden+fy explora+on and querying paVerns v  Understand ontology usage and reuse v  Prune unwanted classes and rela+ons Ø  Ontology Repository Maintainers: v  Categorize user behaviors v  Develop intelligent interfaces v  Provide targeted recommenda+ons Ø  Biomedical Researchers: v  Iden+fy temporal research trends v  Iden+fy frequently accessed classes
  3. 3. BiOnIC: A Catalog of User Interac+ons with Biomedical Ontologies hVp://onto-apps.stanford.edu/bionic/datasets
  4. 4. hVp://bioportal.bioontology.org/
  5. 5. hVp://bioportal.bioontology.org/
  6. 6. hVp://bioportal.bioontology.org/
  7. 7. hVp://bioportal.bioontology.org/
  8. 8. hVp://bioportal.bioontology.org/
  9. 9. NCBO API Usage APIRequestsperMonth 2013−O ct 2014−Jan 2014−Apr 2014−Jul 2014−O ct 2015−Jan 2015−Apr 2015−Jul 2015−O ct 2016−Jan 2016−Apr 2016−Jul 2016−O ct 2M8M32M Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data NCBO Website Traffic OccurrencesperMonth 2009−Jan 2010−Jan 2011−Jan 2012−Jan 2013−Jan 2014−Jan 2015−Jan 2016−Jan 0100K200K Page Requests Unique IP Addresses BiOnIC datasets crea+on •  Removing robot/invalid requests •  Normalizing ontology iden+fiers and class IRIs
  10. 10. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on •  January 2015 version. •  Ontologies should have classes that are reused by others OR reuse classes from other ontologies. •  Ontologies should have minimum of 10 unique users via WebUI and API
  11. 11. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on Class Sta:s:cs Datasets For each class in each ontology: •  Access AGributes: o  Total IP Requests (WebUI/API) o  Unique IP Requests (WebUI/API) •  Reuse AGributes: o  Number of ontologies reusing a class •  Structural AGributes: o  Number of parent/child/sibling classes o  Depth from ontology root
  12. 12. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on User Interac:on Sequences Datasets Ontology 1 Ontology 2 Class Depth -> 2a 1 3a 4a 4b 4c 2b 3b 3c 1’ 2a’ 2b’ 2c’ 3a’ 3b’ 3c’
  13. 13. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on User Interac:on Sequences Datasets Ontology 1 Ontology 2 Class Depth -> 2a 1 3a 4a 4b 4c 2b 3b 3c 1’ 2a’ 2b’ 2c’ 3a’ 3b’ 3c’
  14. 14. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on User Interac:on Sequences Datasets Ontology 1 Ontology 2 Class Depth -> 2a 1 3a 4a 4b 4c 2b 3b 3c 1’ 2a’ 2b’ 2c’ 3a’ 3b’ 3c’
  15. 15. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on User Interac:on Sequences Datasets Ontology 1 Ontology 2 Class Depth -> 2a 1 3a 4a 4b 4c 2b 3b 3c 1’ 2a’ 2b’ 2c’ 3a’ 3b’ 3c’
  16. 16. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on User Interac:on Sequences Datasets Ontology 1 Ontology 2 Class Depth -> 2a 1 3a 4a 4b 4c 2b 3b 3c 1’ 2a’ 2b’ 2c’ 3a’ 3b’ 3c’
  17. 17. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on User Interac:on Sequences Datasets Ontology 1 Ontology 2 Class Depth -> 2a 1 3a 4a 4b 4c 2b 3b 3c 1’ 2a’ 2b’ 2c’ 3a’ 3b’ 3c’
  18. 18. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on User Interac:on Sequences Datasets Ontology 1 Ontology 2 Class Depth -> 2a 1 3a 4a 4b 4c 2b 3b 3c 1’ 2a’ 2b’ 2c’ 3a’ 3b’ 3c’
  19. 19. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on User Interac:on Sequences Datasets Ontology 1 Ontology 2 Class Depth -> 2a 1 3a 4a 4b 4c 2b 3b 3c 1’ 2a’ 2b’ 2c’ 3a’ 3b’ 3c’
  20. 20. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on User Interac:on Sequences Datasets Ontology 1 Ontology 2 Class Depth -> 2a 1 3a 4a 4b 4c 2b 3b 3c 1’ 2a’ 2b’ 2c’ 3a’ 3b’ 3c’
  21. 21. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on User Interac:on Sequences Datasets Ontology 1 Ontology 2 Class Depth -> 2a 1 3a 4a 4b 4c 2b 3b 3c 1’ 2a’ 2b’ 2c’ 3a’ 3b’ 3c’
  22. 22. 2a’ 1’ 2b’ 3b’ 2a 3a 4a 3a 1 Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on User Interac:on Sequences Datasets Ontology 1 Ontology 2
  23. 23. Filtering Access Logs Filtering Ontologies Compu+ng Class Counts Compu+ng Sequences Anonymizing Data BiOnIC datasets crea+on Anonymiza:on Steps •  IP addresses anonymized using unique SHA-224 hash-encoded user iden+fiers generated from “user_<Random String>_<Random_Integer>”. •  e.g. 39fd4e6d569a034973g61bb392a694d4eabe1ef98c43ee68ca2fc86 •  Absolute Time-stamps converted to rela+ve +me-stamps, with respect to first interac+on with BioPortal repository. •  e.g. 0, 2757, 2786, 3586, 3618, 3803, 3959, 4047, 5111 (s), …
  24. 24. BiOnIC schema to model sta+s+cs and sequences data countStat bionic:CountStat bionic:ReuseCount -  reuseType -  reusingOntologies bionic:RequestCount -  accessType -  year -  totalUsers -  uniqueUsers prov:Agent bionic:Sequence -  accessType -  totalTime -  uniqueClasses bionic:SeqEn:ty -  rela6veTimestamp bionic:Ontology -  skos:prefLabel -  totalClasses -  maxDepth owl:Class - skos:prefLabel skos:Collec:on skos:Concept begin end nextEn6ty class class requests skos:member bionic:SeqDataset -  accessType bionic:StatDataset dcat:Dataset sequence classInfo ontology ontology bionic:ClassInfo -  siblings -  directParents -  directChildren -  classDepth class subClassOf ontology SKOS, PROV and DCAT standards are reused in the BiOnIC schema.
  25. 25. hVp://onto-apps.stanford.edu/bionic/datasets BiOnIC datasets
  26. 26. hVp://onto-apps.stanford.edu/bionic/datasets BiOnIC datasets hVp://www.rdjdt.org/
  27. 27. hVp://onto-apps.stanford.edu/bionic/datasets BiOnIC datasets hVp://www.rdjdt.org/ SPARQL Triplestore / Triple PaGern Fragment Server
  28. 28. hVp://onto-apps.stanford.edu/bionic/datasets BiOnIC datasets hVp://www.rdjdt.org/ SPARQL Triplestore / Triple PaGern Fragment Server BioPortal SPARQL Endpoint
  29. 29. Characteris+cs of the BiOnIC Catalog •  WebUI Access: 5.4M class requests, 1M unique agents •  API Access: 67.2M class requests, 205K unique agents •  255 biomedical ontologies
  30. 30. VisIOn (Visualizing Ontology Interac+ons) Web Applica+on hVp://onto-apps.stanford.edu/vision
  31. 31. VisIOn (Visualizing Ontology Interac+ons) Web Applica+on hVp://onto-apps.stanford.edu/vision
  32. 32. VisIOn (Visualizing Ontology Interac+ons) Web Applica+on hVp://onto-apps.stanford.edu/vision
  33. 33. Applica+ons of BiOnIC and VisIOn
  34. 34. Temporal influences in browsing and querying Fisher’s exact test with FDR: Certain classes (e.g. Ebolavirus) or sets of classes are browsed or queried significantly more, when compared between different +me periods. 2016 2015
  35. 35. Interface influences in browsing and querying Number of Unique API Users (Log Scale) Number of Unique WebUI Users (Log Scale) 1000 10 100 10 100 1000 1 1 Certain classes browsed or queried significantly more.
  36. 36. Interface influences in browsing and querying Number of Unique API Users (Log Scale) Number of Unique WebUI Users (Log Scale) 1000 10 100 10 100 1000 Female Reproduc:ve System 1 1 Certain classes browsed or queried significantly more. Dermis
  37. 37. Interface influences in browsing and querying Dysmorphic Syndrome Night blindness Number of Unique API Users (Log Scale) Number of Unique WebUI Users (Log Scale) 1000 10 100 10 100 1000 Female Reproduc:ve System 1 1 Certain classes browsed or queried significantly more. Dermis
  38. 38. Explora+on and Querying behavioral paVerns •  Certain classes in the lower levels of the ontological hierarchy are rarely browsed and queried – this may be an ar+fact of the indented tree visualiza+on. •  More triangular polygons (1 parent -> 2 children classes, or 2 parents -> 1 child class) observed in WebUI Access polygon due to indented tree visualiza+on.
  39. 39. Modeling user behaviors through Markov Chains Walk, et al. How Users Explore Ontologies on the Web: A Study of NCBO's BioPortal Usage Logs. WWW 17
  40. 40. Novel research direc+ons may be enabled through the BiOnIC and VisIOn resources •  Categorize user browsing behaviors by incorpora+ng the structural features of the ontology classes. •  Develop personalized user interfaces for ontology naviga+on, which take into account the user type and the predic+ons of the next class that a user is likely to access. •  Develop advanced methods for ontology summariza+on and modulariza+on, using BiOnIC datasets as features. …
  41. 41. Acknowledgments Musen Lab, Stanford BMI PhD Program, Stanford US NIH Grants U54-HG004028 GM086587 maulikrk@stanford.edu hVp://onto-apps.stanford.edu/bionic hVp://onto-apps.stanford.edu/vision

×