Integrating Knowledge with Semantic MediaWiki and MWSync

In this presentation I discussed a new framework for creating live mirrors of MediaWiki sites such as Wikipedia and the integration of multiple mirrored sites into Semantic MediaWiki. This framework, MWSync, was created as part of the effort to integrate gene knowledge on Wikipedia with the human mutation database at SNPedia.com.
Presentation copyright 2012 Erik Clarke, The Scripps Research Institute, La Jolla, CA.

  • Let’s start with an example. Visualization of the two coming together. Better description of the transformations.
  • The watchlist can be defined programmatically through the API. You need an account on the source wiki if you want a custom watchlist (Recent Changes could be used instead), and an account on the target wiki to write changes.
  • Using ontologies, if available, allows reasoning over categories! Lots of ontologies are available, and they work well with Semantic MediaWiki. Ontologies can be imported programmatically using a script and the API.
  • We built software to do this exact thing...
  • A review article for every human gene.
  • Contest put on by 23andMe; award for best description of an individual from their SNPs. Point out Mike Cariaso, etc.
  • Bring this to the front (make it clear where you’re going from the beginning). genewikiplus.org
  • SNPedia is already based on Semantic MediaWiki, so some properties are included in the text. Often, we find a property on a SNP and write the correlated property to the gene article.
  • Human proteins associated with anemias; the genes with SNPs associated with cancer; the SNPs associated with diabetes. This information is also not really easily available.
  • Integrates into the broader Semantic Web; content available through standard formats. Label: linked data cloud (get graphic with DBpedia in the middle).
  • Aggregate sources to provide standardized outputs and queries. Use SMW as a backend: a rich, flexible, semantic data store.
  • Another researcher loaded our RDF dump into a SPARQL endpoint here: <>. Talk more about GSoC. Limit the point entry.

Transcript

  • 1. Integrating Knowledge with Semantic MediaWiki. Erik Clarke, The Scripps Research Institute
  • 2-6. A Conundrum: I love popular recipe sites like the Healthy Recipes Wiki and Japanese Recipes Wiki. Salmon was on sale today and I’ve impulsively bought a ton of it. I would like to find recipes that use salmon. Alas, I have a severe nut allergy; it would be great if I could select only recipes without nuts. How do I find recipes that meet those conditions?
  • 7. Healthy Recipes Wiki: More than 700 delicious recipes! Lots of user-defined categories. Some basic plaintext search.
  • 8. Japanese Recipes Wiki: More delicious recipes! Similar to Healthy Recipes format; same limitations. Lots of common elements, subject matter.
  • 9. A smarter search: Imagine if somehow I could ask these sites for “all recipes with salmon but without nuts” or maybe just a “list of common ingredients in the Japanese recipes.” These searches are effectively impossible on standard MediaWiki...
  • 10-15. We can use ....... Of course, I can’t just install SMW on healthyrecipes.wikia.com. I could dump all the information onto my own SMW instance... But I miss out on any new updates or recipes and everything else from the community. And the information will still not be “semantically aware” just because it lives on SMW now.
  • 16-21. A way to update live: 1) Do an initial first copy. 2) Add desired pages to watchlist. 3) Check watchlist every X minutes for changes. 4) Copy and transform changed page content. 5) Write new content to our wiki.
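
The five steps above can be sketched as a small copy-transform-write routine. This is only an illustration in Python, not MWSync's actual code (MWSync is a Java library): the two wikis are stood in for by in-memory dictionaries, and `sync_once`, `identity`, `source`, and `target` are hypothetical names. A real poller would obtain the list of changed titles from the source wiki's watchlist or recent changes every X minutes.

```python
from typing import Callable, Dict, Iterable


def sync_once(
    changed_titles: Iterable[str],
    read_source: Callable[[str], str],
    transform: Callable[[str], str],
    write_target: Callable[[str, str], None],
) -> int:
    """Copy each changed page from source to target, transforming it on the way."""
    count = 0
    for title in changed_titles:
        text = read_source(title)              # step 4: copy changed content
        write_target(title, transform(text))   # steps 4-5: transform, then write
        count += 1
    return count


def identity(text: str) -> str:
    """Placeholder transformation that leaves the page text unchanged."""
    return text


# In-memory stand-ins for two wikis (hypothetical data, no network access).
source: Dict[str, str] = {
    "Miso salmon": "1 fillet [[salmon]]",
    "Tofu bowl": "1 block [[tofu]]",
}
target: Dict[str, str] = {}

# Step 1: initial copy of every page we care about.
sync_once(list(source), source.__getitem__, identity, target.__setitem__)

# Steps 2-3: the "changed" list is supplied by hand here instead of being
# polled from a watchlist.
source["Miso salmon"] = "2 fillets [[salmon]]"
updated = sync_once(["Miso salmon"], source.__getitem__, identity, target.__setitem__)
```

Passing the read, transform, and write steps in as callables keeps the loop independent of any particular wiki client, so the same routine can serve both the initial copy and later incremental updates.
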
  • 22-23. Example Transformations: “1 cup [[onions]]” and “3/4 cup [[carrots]]” become “1 cup [[has_ingredient::onions]]” and “3/4 cup [[has_ingredient::carrots]]”.
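
A transformation like this one can be approximated with a regular expression. The sketch below is an assumption about how such a rule might look, not the rule MWSync ships; `annotate_links` and `PLAIN_LINK` are hypothetical names. It deliberately skips links that contain a colon, so category links and already-annotated properties are left alone.

```python
import re

# Matches plain wikilinks such as [[onions]] or [[onions|chopped onions]].
# Links containing ":" (namespaces, existing "::" properties) do not match.
PLAIN_LINK = re.compile(r"\[\[([^\[\]|:]+)(\|[^\[\]]*)?\]\]")


def annotate_links(wikitext: str, prop: str = "has_ingredient") -> str:
    """Rewrite [[target]] as [[prop::target]], keeping any display label."""
    def replace(match: "re.Match[str]") -> str:
        target = match.group(1).strip()
        label = match.group(2) or ""
        return f"[[{prop}::{target}{label}]]"

    return PLAIN_LINK.sub(replace, wikitext)


recipe = "1 cup [[onions]]\n3/4 cup [[carrots]]"
annotated = annotate_links(recipe)
```

Because annotated links contain `::`, running the function twice on the same text leaves it unchanged, which matters when a page is re-synced after every edit.
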
  • 24-29. Ontologies: Many ontologies map pretty well to MW Categories. A hypothetical Food Ontology: [Category:Nuts] contains Peanuts, Almonds, Cashews. If I’m allergic to nuts: [[has_ingredient::!<q>[[Category:Nuts]]</q>]]
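
To make the category reasoning concrete, here is a minimal in-memory sketch of what such a query is meant to compute. It does not call Semantic MediaWiki; the recipe data, `CATEGORIES`, `RECIPES`, and `recipes_with_but_without` are all hypothetical and only mirror the food ontology example above.

```python
from typing import Dict, List, Set

# Hypothetical ontology fragment: category -> member pages.
CATEGORIES: Dict[str, Set[str]] = {"Nuts": {"Peanuts", "Almonds", "Cashews"}}

# Hypothetical recipes with the values of their has_ingredient property.
RECIPES: Dict[str, Set[str]] = {
    "Teriyaki salmon": {"Salmon", "Soy sauce"},
    "Almond-crusted salmon": {"Salmon", "Almonds"},
    "Peanut noodles": {"Noodles", "Peanuts"},
}


def recipes_with_but_without(wanted: str, excluded_category: str) -> List[str]:
    """Recipes that use `wanted` and no ingredient from `excluded_category`."""
    banned = CATEGORIES.get(excluded_category, set())
    return sorted(
        name
        for name, ingredients in RECIPES.items()
        if wanted in ingredients and not (ingredients & banned)
    )


safe_salmon = recipes_with_but_without("Salmon", "Nuts")
```

The point of the ontology is visible in `banned`: the query names only the category, and the individual nuts are looked up rather than listed by hand.
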
  • 30-31. [Diagram] Food Ontology; AwesomeRecipes! Search: By Ingredient, By Cuisine, By Dietary
  • 32-36. Java library: Cross-platform. Extremely flexible. Open source. http://bitbucket.org/sulab/mwsync
  • 37. The Gene Wiki: More than 10,500 genes. 37,578 PubMed citations. ~422 views/article/month. 92% are top results on Google. 6,830 distinct editors. (Good, et al. 2011. 10.1093/nar/gkr925)
  • 38. SNPedia. SNPs: single nucleotide polymorphisms, point mutations in genes. “Fingerprint” of a person’s genome. Often responsible for individual traits or likelihoods of traits: baldness, eye color, sickle-cell anemia. http://SNPedia.com: a database of SNPs, based on Semantic MediaWiki and NIH data. Promethease, based on SNPedia, won contest for best description of an individual using only their SNPs. By Mike Cariaso and Greg Lennon. (Cariaso, M., Lennon, G. 2011. 10.1093/nar/gkr798)
  • 39-42. Gene Wiki + SNPedia: Genes and SNPs are intrinsically related (SNPs occur on and around genes). Data about SNPs often also applies to their gene. What if I want to find all the SNPs on a gene? Or all the genes that have SNPs linked to baldness?
  • 43-51. [Diagram: Gene Wiki, SNPedia, Disease Ontology, GeneWiki+] Pass page text through Disease Ontology Annotator (NCBO). Add any diseases mentioned as [[is_associated_with_disease::]] property. Link genes <-> SNPs with [[has_SNP::]] or [[in_gene::]]. [[wikilink]] -> [[is_associated_with::link]]
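
The disease-annotation step can be pictured as follows. This sketch does not call the NCBO Annotator; a simple whole-word lookup against a hand-written vocabulary stands in for it, and `find_diseases`, `annotate_page`, and the sample vocabulary are assumptions made for illustration.

```python
import re
from typing import Iterable, List


def find_diseases(text: str, vocabulary: Iterable[str]) -> List[str]:
    """Return the vocabulary terms mentioned in text (case-insensitive, whole words)."""
    found = []
    for term in vocabulary:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


def annotate_page(text: str, vocabulary: Iterable[str]) -> str:
    """Append one [[is_associated_with_disease::...]] property per disease found."""
    props = [
        f"[[is_associated_with_disease::{disease}]]"
        for disease in find_diseases(text, vocabulary)
    ]
    if not props:
        return text
    return text + "\n" + "\n".join(props)


# Hypothetical vocabulary and page text.
vocab = ["Sickle-cell anemia", "Diabetes mellitus"]
page = "Mutations in HBB cause sickle-cell anemia."
annotated_page = annotate_page(page, vocab)
```

A real annotator would also resolve synonyms and return ontology identifiers; the sketch only shows where the resulting properties end up in the page text.
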
  • 52-56. GeneWiki+ (SNPs, Genes, Diseases). #ask: [[Category:Human_proteins]] [[is_associated_with::<q>[[Category:Anemia]]</q>]]. #ask: [[has_SNP::[[is_associated_with_disease::cancer]]]]. #ask: [[in_gene::Insulin]] [[is_associated_with_disease::Diabetes_mellitus]]
  • 57-60. GeneWiki+ Ask API: genewikiplus.org/api.php?action=ask&query=[[in_gene::CDK2]] returns:
      <items>
        <list-item title="Rs2069408" uri="http://genewikiplus.org/index.php?title=Rs2069408">
          <properties type="Category:Is a snp" />
        </list-item>
        <list-item title="Rs2069414" uri="http://genewikiplus.org/index.php?title=Rs2069414">
          <properties type="Category:Is a snp" />
        </list-item>
      </items>
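
A client for this kind of endpoint needs two pieces: a URL-encoded query and a parser for the XML response. The sketch below builds the URL shown on the slide and parses a copy of the sample response; it makes no network request, and it assumes the response root is `<items>` exactly as shown. `ask_url`, `result_titles`, and `SAMPLE` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlencode


def ask_url(query: str, base: str = "http://genewikiplus.org/api.php") -> str:
    """Build an Ask API URL, percent-encoding the query's brackets and colons."""
    return base + "?" + urlencode({"action": "ask", "query": query})


def result_titles(xml_text: str) -> List[str]:
    """Extract the page title of every <list-item> in a response."""
    root = ET.fromstring(xml_text)
    return [item.get("title") for item in root.iter("list-item")]


# Copy of the sample response from the slide (assumed to be the full document).
SAMPLE = """<items>
  <list-item title="Rs2069408" uri="http://genewikiplus.org/index.php?title=Rs2069408">
    <properties type="Category:Is a snp" />
  </list-item>
  <list-item title="Rs2069414" uri="http://genewikiplus.org/index.php?title=Rs2069414">
    <properties type="Category:Is a snp" />
  </list-item>
</items>"""

url = ask_url("[[in_gene::CDK2]]")
snps = result_titles(SAMPLE)
```

Encoding the query matters because `[`, `]`, and `:` are not safe to send raw in a query string on every client.
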
  • 61-64. GeneWiki+ (GW+): RDF + SPARQL. [Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/]
  • 65-70. [Diagram] Various Bio Wikis; GeneWiki+; Ask API (api.php); SPARQL; RDF
  • 71-75. GeneWiki+: Google Summer of Code project to build a better interface/visualization using our API and data. Explore gene-disease-SNP links interactively. Allow researchers to easily ask for our data. Our RDF is now in the Semantic Web!
  • 76-78. From the Semantic Web perspective... We can bring in sites whose owners may not care to, or may not have the time to, create a semantically-rich representation of their data (with appropriate licenses, of course). Integration with DBpedia and other Semantic Web initiatives.
  • 79. From the research perspective... Wikis are popular bio data stores; this allows better access and aggregation of disparate data. For biocuration and bioinformatics, possibilities are huge. Directed acyclic graph-type ontologies can map easily into the MW category system.
  • 80. From the developer’s perspective... Using SMW as a data store gives you a rich, prebuilt relational database built on external data (Wikipedia, etc.). It abstracts away the difficulties/idiosyncrasies of MediaWiki’s interface and API.
  • 81. Wrap-up: Tons of data (bio and otherwise) begging to be integrated and organized. MWSync allows live transformation and integration of MediaWiki sites. SMW enables RDF/SPARQL. SMW API enables alternative interfaces.
  • 82. Acknowledgements: Thanks to Ben Good, Sal Loguercio, and the SuLab @ Scripps; Mike Cariaso @ SNPedia.com. http://genewikiplus.org http://bitbucket.org/sulab/mwsync