SMWCon 2012 Linked Data Visualizations


Visualizing Linked Data - View Layer, Charting, Embedded SPARQL, & HTML5

Published in: Technology, Education

  1. Visualizing Linked Data (in Semantic MediaWiki)
     Vulcan Inc. in conjunction with Allen Institute for Brain Science
  2. Today we are Discussing…
     • How can you use Linked Data?
     • Adding a "real" view layer to SMW.
     • Creating instance pages without knowing exactly what will be displayed on them.
     • Expanding existing instance pages with Semantic Results Formatters, a PHP view layer, and creating dynamic charts.
     • Examples!
  3. Linked Data is…
  4. What is the Allen Institute?
     • Launched in 2003 with seed funding from founder and philanthropist Paul G. Allen.
     • Serving the scientific community is at the center of our mission to accelerate progress toward understanding the brain and neurological systems.
     • The Allen Institute's multidisciplinary staff includes neuroscientists, molecular biologists, informaticists, and engineers.
     “The Allen Institute for Brain Science is an independent 501(c)(3) nonprofit medical research organization dedicated to accelerating the understanding of how the human brain works.”
  5. Human Brain Map
     • Open, public online access
     • A detailed, interactive three-dimensional anatomic atlas of the "normal" human brain
     • Data from multiple human brains
     • Genomic analysis of every brain structure, providing a quantitative inventory of which genes are turned on where
     • High-resolution atlases of key brain structures, pinpointing where selected genes are expressed down to the cellular level
     • Navigation and analysis tools for accessing and mining the data
  6. What is Neurowiki?
     • A joint project between Vulcan Inc. and the Allen Institute to build a Semantic Wiki mapping genetic instances.
     • A finished prototype testing the import pipelines and display components for combining 7 major RDF datasets from 6 different sources.
     • Current planning includes mapping complete datasets, curating a better ontology, creating multiple ontology management for a user class, and importing scientific papers.
  7. Biological Linked Data Map
     • Open, public online access
     • Data from multiple RDF data stores
     • Complete import pipeline using LDIF framework
     • Outlines of each imported instance embedding inline wiki properties and providing views of imported properties from original RDF datasets
     • Charting tools that "pivot" SPARQL queries, providing several views of each query
     • Navigation and composition tools for accessing and mining the data
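
The "pivot" idea on the Biological Linked Data Map slide can be illustrated with a minimal sketch. It assumes the query result arrives in the standard SPARQL JSON results layout; the variable names (`drug`, `source`, `count`) and the rows are invented for illustration and are not taken from Neurowiki.

```python
from collections import defaultdict

# Hypothetical SELECT result in the SPARQL JSON results layout.
# Variable names and rows are made up for this sketch.
sample_result = {
    "head": {"vars": ["drug", "source", "count"]},
    "results": {"bindings": [
        {"drug": {"type": "literal", "value": "DrugA"},
         "source": {"type": "literal", "value": "DrugBank"},
         "count": {"type": "literal", "value": "3"}},
        {"drug": {"type": "literal", "value": "DrugA"},
         "source": {"type": "literal", "value": "SIDER"},
         "count": {"type": "literal", "value": "2"}},
        {"drug": {"type": "literal", "value": "DrugB"},
         "source": {"type": "literal", "value": "DrugBank"},
         "count": {"type": "literal", "value": "4"}},
    ]},
}


def pivot(result, group_var, value_var):
    """Sum value_var for each distinct value of group_var."""
    totals = defaultdict(float)
    for row in result["results"]["bindings"]:
        totals[row[group_var]["value"]] += float(row[value_var]["value"])
    return dict(totals)


# One query result, two chart-ready views of it.
by_drug = pivot(sample_result, "drug", "count")
by_source = pivot(sample_result, "source", "count")
```

Each call yields one series from the same result set, so a single query can feed several charts.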
  8. Where did we get the data?
     • KEGG : Kyoto Encyclopedia of Genes and Genomes
       • “KEGG GENES is a collection of gene catalogs for all complete genomes generated from publicly available resources, mostly NCBI RefSeq.”
     • Diseasome
       • “The Diseasome website is a disease/disorder relationships explorer and a sample of an innovative map-oriented scientific work. Built by a team of researchers and engineers, it uses the Human Disease Network dataset.”
     • DrugBank
       • “The DrugBank database is a unique bioinformatics and cheminformatics resource that combines detailed drug data with comprehensive drug target information.”
     • SIDER
       • “SIDER contains information on marketed medicines and their recorded adverse drug reactions. The information is extracted from public documents and package inserts.”
     • Neurolex
       • “Neurolex is a dynamic lexicon of 18,425 neuroscience terms supported by The Neuroscience Information Framework and the INCF.”
  9. Wiki Ontology Map
     • Genes
       • DrugBank : 4,553
       • Diseasome : 3,919
       • KEGG : 9,841
     • Diseases
       • Diseasome : 4,213
       • KEGG : 459
     • Drugs
       • DrugBank : 4,772
       • KEGG : 2,482
       • SIDER : 924
     • Effects
       • SIDER : 1,737
     • Pathways
       • KEGG : 28,442
     61,342 Instances Available for Import
     We chose to intentionally simplify the ontology due to disagreements between researchers about entity relationships and subclasses.
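
As a quick arithmetic check, the per-source counts on the Wiki Ontology Map slide add up to the stated 61,342 instances. The dictionary layout below is only a convenient way to hold the slide's numbers.

```python
# Instance counts copied from the Wiki Ontology Map slide.
instance_counts = {
    "Genes": {"DrugBank": 4553, "Diseasome": 3919, "KEGG": 9841},
    "Diseases": {"Diseasome": 4213, "KEGG": 459},
    "Drugs": {"DrugBank": 4772, "KEGG": 2482, "SIDER": 924},
    "Effects": {"SIDER": 1737},
    "Pathways": {"KEGG": 28442},
}

# Total per category, then across all categories.
category_totals = {
    category: sum(sources.values())
    for category, sources in instance_counts.items()
}
grand_total = sum(category_totals.values())

print(category_totals)
print(grand_total)
```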
  10. Adding a True View Layer to Semantic MediaWiki
     1. Standard Semantic Results Formatter.
     2. Classes made extending the stock formatter and providing a library of strategy classes for rendering different graph types.
     3. Assembled template variables are given to the Smarty PHP templating engine with a path to the page template file.
     4. Template is populated and inserted / injected into the complete MediaWiki page.
     [Diagram: OntoBroker Triplestore; 1.0 Semantic Results Formatter; 2.0 Strategy Pattern for Template Variables; 3.0 Smarty Processor; 4.0 Completed Template Added to Wiki]
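
The view-layer steps on this slide (strategy classes build template variables, a template engine populates the page fragment) can be sketched roughly as follows. This is an illustration only: the original used PHP and Smarty, whereas this sketch uses Python's `string.Template` as a stand-in, and every class, function, and template name here is hypothetical.

```python
from string import Template


class PieStrategy:
    """Hypothetical strategy: turn (label, value) rows into pie variables."""

    def variables(self, rows):
        total = sum(value for _, value in rows)
        summary = ", ".join(f"{label}: {value / total:.0%}" for label, value in rows)
        return {"chart_type": "pie", "summary": summary}


class BarStrategy:
    """Hypothetical strategy: report the tallest bar."""

    def variables(self, rows):
        label, value = max(rows, key=lambda row: row[1])
        return {"chart_type": "bar", "summary": f"largest bar: {label} ({value})"}


# One strategy object per graph type, selected by name.
STRATEGIES = {"pie": PieStrategy(), "bar": BarStrategy()}

# Stand-in for a page template file handed to the template engine.
PAGE_TEMPLATE = Template('<div class="chart $chart_type">$summary</div>')


def render(rows, graph_type):
    """Build template variables with the chosen strategy, then fill the template."""
    variables = STRATEGIES[graph_type].variables(rows)
    return PAGE_TEMPLATE.substitute(variables)


rows = [("KEGG", 3), ("DrugBank", 1)]
pie_html = render(rows, "pie")
bar_html = render(rows, "bar")
```

Keeping the per-graph logic in strategy objects means a new graph type only needs a new class and a registry entry; the formatter and the template call stay unchanged.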
  11. Four Initial Templates for Each Instance by Category
     1. Custom infobox within outline template
        • Visible inline properties
     2. Outline template providing instance information
     3. Widget template displaying dynamic charts or third party services
        • Donut charts and disease Twitter feed
     4. Broad table SPARQL queries showing instance relationships
     5. Hidden inline properties for other extensions
  12. Creating Instance Wiki Pages
     The triplestore now contained tens of thousands of recognized category instances. Creating the pages would require a bot.
     1. Fetch the RDF dumps from an active D2R server
     2. Use regex to fetch the instance rdfs:label property that was mapped by R2R as an instance name
     3. Open category specific text file of wiki markup (page of template includes)
     4. Contact Neurowiki and request a new page from the list of names with the category content
     [Diagram: RDF Data Download; Sanitize Script; Create List of Page Names; Create CSV; Category Page Names; Text of Wiki markup for page; Create MediaWiki Page; MediaWiki Gateway rb Framework; Neurowiki REST interface; Instance Page]
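
The page-creation pipeline on this slide can be sketched offline as below. The original bot used the Ruby MediaWiki Gateway against the Neurowiki REST interface; this sketch stops short of any network call and only prepares the data. The N-Triples excerpt, URIs, labels, and the shape of the request dictionaries are all assumptions made for illustration.

```python
import csv
import io
import re

# Hypothetical N-Triples excerpt; URIs and labels are invented.
rdf_dump = '''
<http://example.org/gene/1> <http://www.w3.org/2000/01/rdf-schema#label> "Example Gene A" .
<http://example.org/gene/2> <http://www.w3.org/2000/01/rdf-schema#label> "Example  [Gene] B" .
<http://example.org/gene/1> <http://example.org/prop/locus> "17q21" .
'''

# Capture the literal that follows an rdfs:label predicate.
LABEL_RE = re.compile(r'<http://www\.w3\.org/2000/01/rdf-schema#label>\s+"([^"]*)"')


def sanitize(name):
    """Drop characters MediaWiki forbids in titles and collapse whitespace."""
    cleaned = re.sub(r"[#<>\[\]|{}]", "", name)
    return re.sub(r"\s+", " ", cleaned).strip()


def page_names(dump):
    """Extract and sanitize every label found in the dump."""
    return [sanitize(label) for label in LABEL_RE.findall(dump)]


def to_csv(names, category):
    """Write one 'category,page name' row per instance."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for name in names:
        writer.writerow([category, name])
    return buffer.getvalue()


def page_requests(names, category_markup):
    """Pair each page name with the category's wiki markup (hypothetical payload)."""
    return [{"title": name, "text": category_markup} for name in names]


names = page_names(rdf_dump)
names_csv = to_csv(names, "Gene")
requests_to_send = page_requests(names, "{{Gene outline}}")
```

A real bot would pass each prepared request to the wiki's page-creation interface; that step is left out because its details are not given on the slide.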
  13. Highcharts
     • JavaScript based, rendering charts through SVG
     • Specializes in 2 formats
       • Line, Bar, and Pie Charting
       • Candlestick Charting
     • Extension of old Open Flash Chart
     • Requires JSON knowledge and object oriented JavaScript
     • Extensive library of prebuilt demos: very good at plug and play JSON objects
     • Free to download and use
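
The "plug and play JSON objects" point can be illustrated by building a chart configuration as plain data. This sketch assumes the commonly documented Highcharts option names (`chart.type`, `title.text`, `series[].data`, and `innerSize` for a donut); the helper function is hypothetical, and the counts are the Drugs figures from the Wiki Ontology Map slide.

```python
import json


def donut_options(title, counts):
    """Build a Highcharts-style options object for a donut chart (a sketch)."""
    return {
        "chart": {"type": "pie"},
        "title": {"text": title},
        "series": [{
            "name": "Instances",
            "innerSize": "50%",  # a hollow centre turns the pie into a donut
            "data": [[label, value] for label, value in counts.items()],
        }],
    }


options = donut_options(
    "Drug instances by source",
    {"DrugBank": 4772, "KEGG": 2482, "SIDER": 924},
)

# Serialized form that a page template could embed for the charting script.
options_json = json.dumps(options)
```

Because the configuration is ordinary JSON, the server side only has to fill in the data array from a query result.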
  14. Processing.js
     • JavaScript implementation of the Processing language
     • Created by John Resig, the creator of jQuery
     • Uses HTML5 canvas and CSS3
     • Often uses other visualization libraries for extending to more advanced views.
     • Most demos are closed projects and require knowledge of programming to implement
     • Free to download and use
  15. GraphUp!
     • JavaScript that focuses on standard list and table HTML
     • Extends jQuery 1.4+ and is easily added to frameworks
     • Compliant with existing HTML4 / CSS2
     • Used to create heat maps and bar graphs with existing structures
     • Detects numerical values in cells and lists, creating a heat map with averages, max, and min
     • Customizable colors and sizes for visualizations
     • $15 and you don't have to write 300 lines of JavaScript
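
The heat-map behaviour described on this slide (detect numeric cells, then colour them relative to the min and max) can be sketched as follows. This is not GraphUp's API or algorithm; the function names and the blue-to-red colour scale are assumptions, and the sample cells reuse the Diseases counts from the Wiki Ontology Map slide.

```python
def parse_number(cell):
    """Return the cell as a float, or None if it is not numeric."""
    try:
        return float(cell.replace(",", ""))
    except ValueError:
        return None


def heat_map(cells):
    """Colour numeric cells from blue (min) to red (max); report basic stats."""
    numbers = [n for n in map(parse_number, cells) if n is not None]
    low, high = min(numbers), max(numbers)
    span = (high - low) or 1.0  # avoid dividing by zero when all values match
    colours = []
    for cell in cells:
        number = parse_number(cell)
        if number is None:
            colours.append(None)  # non-numeric cells are left unstyled
        else:
            level = round(255 * (number - low) / span)
            colours.append(f"#{level:02x}00{255 - level:02x}")
    stats = {"min": low, "max": high, "average": sum(numbers) / len(numbers)}
    return colours, stats


cells = ["KEGG", "459", "4,213", "n/a"]
colours, stats = heat_map(cells)
```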
  16. InfoVis
     • JavaScript based, rendering through the HTML5 canvas element
     • Greater control of graphs
     • Large library of completed graphs for plugging in existing query structures
     • Due to the fine graph control, an extensive knowledge of JavaScript and visual application programming is required
     • Complete and thorough API of the library for creating new graph patterns
  17. D3
     • Successor to the Protovis library
     • Complete and thorough API of the library for creating new graph patterns
     • Hundreds of demo graphs, either available with code or locked into specific projects
     • Breaks the graphing paradigm and exists only to create new representations of data
     • Expert level JavaScript and visual programming necessary to create unique patterns
  18. Demo Links