Bio it 2005_rdf_workshop05

BioIT 2005 Reported on Project with Siderean Software. http://bit.ly/gsi68E (J Web Semantics Paper,

  • Slides 78 and 87 show the graph of the heterogeneous data sets we brought together for the first ever demo of a life science mashup. This slide would later be used for about 2 years by Sir Tim Berners-Lee to communicate linked data. It was before links were URIs and so we had to link on mapping of database IDs (typically Database + ID within that database). See BioPAX (a few years later) for examples.

Transcript of "Bio it 2005_rdf_workshop05"

  1. An RDF Data Model for the Semantic Web. 5th Oracle Life Sciences User Group meeting, May 16-17, 2005
  2. Agenda: Introduction (5 min, Susie Stephens); Semantic Web for Life Sciences (25 min, Susie Stephens); Oracle support of RDF in RDBMS (25 min, Souripriya Das); Demo of Siderean’s Seamark Navigation Server (25 min, Mike DiLascio, David LaVigna & Joanne Luciano); Discussion (10 min, Susie Stephens)
  3. Semantic Web for Life Sciences. Susie Stephens
  4. What is the Semantic Web? A machine-readable format that is Web compatible. The Semantic Web adds definition tags to information in Web pages, which enables computers to discover data more effectively and allows new associations to form between pieces of information.
  5. Resource Description Framework: W3C standard for the common data format. Based on triples (subject, predicate, object). Everything has a URI. Ontologies are used to label the RDF-tagged elements. Image Source: W3C
  6. Image Source: W3C
  7. Enterprise Integration Hub. Image Source: W3C
  8. Semantic Web Stack. Image Source: W3C
  9. Pharma Productivity. Source: PhRMA & FDA 2003
  10. Critical Path Initiative. Source: Innovation or Stagnation, FDA Report, March 2004
  11. Ontology Frameworks for Integration. [Diagram of an ontology graph; layout lost in extraction. Edge labels: <hasProduct>, <participatesIn>, <transcribes>, <translatesTo>, <located>, <influences>, <affectedTissue>, <probeFor>, <partOf>, <targets>, <profiledBy>, <MOA>, <drugInteraction>, <affecting>, <efficacyMarkerFor>. Node labels, in extraction order: Protein, Gene, mRNA, Cascade, Localization, pathway, Disease, Intervention, Bio-process, Drug, point, Microarray, experiment, Target, model, Treatment.]
  12. Biological Pathways. Image Source: Cytoscape
  13. Beyond the “Dead” Graphical Model. Image Source: KEGG
  14. Assigning Trust Values to Data. Image Source: SWANS
  15. Inferencing. If Gene G is implicated in Disease D, and its Protein Product P is a functional component of only Pathway P2, then Disease D directly perturbs Pathway P2. <rdf:Description> <log:is rdf:parseType='Quote'> <rdf:Description rdf:about='variable#Gene_G'> <hasProduct rdf:resource='variable#Protein_P'/> <isImplicatedIn rdf:resource='variable#Disease_D'/> </rdf:Description> <rdf:Description rdf:about='variable#Protein_P'> <inPathway rdf:resource='variable#Pathway_P2'/> </rdf:Description> </log:is> <log:implies rdf:parseType='Quote'> <rdf:Description rdf:about='variable#Disease_D'> <D_perturbs rdf:resource='variable#Pathway_P2'/> </rdf:Description> </log:implies> </rdf:Description>
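
The rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the notation or engine used in the talk: triples are plain tuples, the predicate name `perturbs` stands in for the slide's `D_perturbs`, and the function name `infer_perturbs` is hypothetical.

```python
# Facts from the slide, written as (subject, predicate, object) tuples.
triples = {
    ("Gene_G", "hasProduct", "Protein_P"),
    ("Gene_G", "isImplicatedIn", "Disease_D"),
    ("Protein_P", "inPathway", "Pathway_P2"),
}


def infer_perturbs(triples):
    """Derive (disease, 'perturbs', pathway) triples using the slide's rule.

    Hypothetical helper: the rule fires only when the gene's protein
    product takes part in exactly one pathway ("only Pathway P2").
    """
    inferred = set()
    for gene, predicate, protein in triples:
        if predicate != "hasProduct":
            continue
        pathways = {o for s, p, o in triples if s == protein and p == "inPathway"}
        if len(pathways) != 1:
            continue  # protein is in zero or several pathways: rule does not apply
        (pathway,) = pathways
        for s, p, disease in triples:
            if s == gene and p == "isImplicatedIn":
                inferred.add((disease, "perturbs", pathway))
    return inferred


result = infer_perturbs(triples)
# Adding a second pathway for the protein blocks the inference.
blocked = infer_perturbs(triples | {("Protein_P", "inPathway", "Pathway_P3")})
```

With the three facts above, `result` holds the single derived triple linking `Disease_D` to `Pathway_P2`, while `blocked` is empty because the "only" condition no longer holds.
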
  16. Why Semantic Web for Life Sciences? Heterogeneous data integration using explicit semantics. Expressing well-defined and rich models of biological systems. Annotating findings and interpretations formally and sharing them with other scientists. Embedding models and semantics within papers. Applying logic to infer additional insights and to propose and/or capture new hypotheses.
  17. Questions and Answers
  18. RDF Support in Oracle RDBMS. Souripriya Das, Ph.D., Consultant Member of Technical Staff, Oracle New England Development Center
  19. Overview. Three types of database objects: Model (RDF graph consisting of a set of triples), Rulebase (set of user-defined rules), Rule Index (entailed RDF graph). We discuss the following aspects for each type of object: DDL, DML, Views, Security. RDF Query (with Inference).
  20. RDF Models
  21. Model: Overview. Each RDF Model (graph) consists of a set of triples. A triple (statement) consists of three components: Subject (URI or blank node), Predicate (URI), Object (URI, literal, or blank node). A statement itself can be a resource (allowing nested graphs).
  22. Model: Example. Family: (:John :brotherOf :Mary) (:John :age "16"^^xsd:integer) (:Mary :parentOf :Matt) (:John :name "John") (:Mary :name "Mary"). Reification: (:John :thinks _:S1) (_:S1 rdf:subject :Sue) (_:S1 rdf:predicate :livesIn) (_:S1 rdf:object "NYC"). [Graph diagram: nodes :John, :Mary, :Matt, :Sue, 16, NYC; edges age, brotherOf, parentOf, thinks, livesIn]
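
As a rough illustration of the model on this slide, the family graph and the reified statement can be held in an ordinary Python set. The `Triple` type and the `reify` helper are assumptions made for this sketch and are not part of the Oracle API.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement; the field names are chosen for this sketch."""
    subject: str
    predicate: str
    obj: str


# The family triples listed on the slide.
family = {
    Triple(":John", ":brotherOf", ":Mary"),
    Triple(":John", ":age", '"16"^^xsd:integer'),
    Triple(":Mary", ":parentOf", ":Matt"),
    Triple(":John", ":name", '"John"'),
    Triple(":Mary", ":name", '"Mary"'),
}


def reify(statement_id, statement):
    """Describe a statement as a resource via rdf:subject/predicate/object."""
    return {
        Triple(statement_id, "rdf:subject", statement.subject),
        Triple(statement_id, "rdf:predicate", statement.predicate),
        Triple(statement_id, "rdf:object", statement.obj),
    }


# "John thinks that Sue lives in NYC": the inner statement becomes blank node _:S1.
family |= {Triple(":John", ":thinks", "_:S1")}
family |= reify("_:S1", Triple(":Sue", ":livesIn", '"NYC"'))
```

Note that the inner statement (:Sue :livesIn "NYC") is never asserted on its own; only its description through `_:S1` is stored, which is what lets a graph record what John thinks without claiming it is true.
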
  23. RDF Query
  24. SDO_RDF_MATCH Table Function. Arguments: Graph pattern (a sequence of triple patterns; triple patterns typically use variables); RDF data set (a set of models); Filter; Aliases. Usage: … FROM TABLE(SDO_RDF_MATCH('(?x :brotherOf ?y) (?y :parentOf ?z)', SDO_RDF_Models('family'), … )) t …
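
To make the idea of a graph pattern concrete, here is a minimal Python sketch of matching a sequence of triple patterns against a set of triples. It is a conceptual stand-in, not how SDO_RDF_MATCH is implemented; the `match` function and the tuple representation are assumptions of this example.

```python
def match(patterns, triples, binding=None):
    """Return every variable binding that satisfies all triple patterns.

    Terms starting with '?' are variables; a variable must keep the same
    value across all patterns in the sequence.
    """
    binding = binding or {}
    if not patterns:
        return [binding]
    results = []
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(patterns[0], triple):
            if term.startswith("?"):
                if candidate.setdefault(term, value) != value:
                    break  # variable already bound to a different value
            elif term != value:
                break  # constant term does not match
        else:
            results.extend(match(patterns[1:], triples, candidate))
    return results


family = [
    (":John", ":brotherOf", ":Mary"),
    (":Mary", ":parentOf", ":Matt"),
    (":John", ":name", "John"),
]

# Same graph pattern as on the slide: '(?x :brotherOf ?y) (?y :parentOf ?z)'
bindings = match([("?x", ":brotherOf", "?y"), ("?y", ":parentOf", "?z")], family)
```

Each dictionary in `bindings` plays the role of one row returned by the table function, with one column per variable.
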
  25. SDO_RDF_MATCH: return. Columns (of type VARCHAR2) in each returned row. For each variable ?x in the graph pattern: x; x$rdfVTYP (URI, Literal, or Blank node); x$rdfLTYP (specific literal type, e.g., xsd:integer); x$rdfCLOB (contains the actual value, if ?x matches a CLOB value); x$rdfLANG (language tag, if any, e.g., "en-us"). If there is no variable in the graph pattern: a dummy column.
  26. SDO_RDF_MATCH: matching. Matching multiple representations: the same point in value space may have multiple representations, e.g., "10"^^xsd:integer, "10"^^xsd:positiveInteger, "010"^^xsd:integer, "000010"^^xsd:integer. SDO_RDF_MATCH automatically resolves these.
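
One way to picture "resolving" multiple representations is to map each literal to a canonical value before comparing. The sketch below is an assumption about the general technique, not a description of Oracle's internals; `canonical` and `INTEGER_TYPES` are hypothetical names, and the datatypes use the standard lowercase XSD spellings.

```python
# Datatypes treated as members of the same integer value space (illustrative subset).
INTEGER_TYPES = {"xsd:integer", "xsd:positiveInteger"}


def canonical(lexical, datatype):
    """Map a typed literal to a canonical (type, value) pair for comparison."""
    if datatype in INTEGER_TYPES:
        return ("xsd:integer", int(lexical))  # int() ignores leading zeros
    return (datatype, lexical)


# The four representations listed on the slide.
literals = [
    ("10", "xsd:integer"),
    ("10", "xsd:positiveInteger"),
    ("010", "xsd:integer"),
    ("000010", "xsd:integer"),
]

distinct_values = {canonical(lexical, datatype) for lexical, datatype in literals}
```

All four literals collapse to one canonical value, so a pattern mentioning any of them would match triples stored with any of the others.
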
  27. RDF Query: Example. Find salary and hiredate of all the uncles: SELECT emp.name, emp.salary, emp.hiredate FROM emp, TABLE(SDO_RDF_MATCH('(?x :brotherOf ?y) (?y :parentOf ?z) (?x :name ?name)', SDO_RDF_Models('family'), …)) t WHERE emp.name=t.name; Use of SDO_RDF_MATCH allows embedding a graph query in a SQL query.
  28. RDF Query: Example 2. Find pairs of persons residing at the same address where the first person rents a truck and the second person buys a fertilizer: SELECT t3.x name1, t3.y name2 FROM AddrTable t1, AddrTable t2, TABLE(SDO_RDF_MATCH('(?x :rents ?a) (?a rdf:type :Truck) (?y :buys ?b) (?b rdf:type :Fertilizer)', SDO_RDF_Models('Activities'), …)) t3 WHERE t1.name=t3.x and t2.name=t3.y and t1.addr=t2.addr;
  29. RDF Rulebases
  30. Rulebase: Overview. Each RDF rulebase consists of a set of rules. Each rule consists of an antecedent (graph pattern), an optional filter condition, and a consequent (graph pattern). One or more rulebases may be used with relevant RDF models (graphs) to obtain entailed graphs.
  31. Rulebase: Example. Rules in a rulebase family_rb: (1) Antecedent: '(?x :brotherOf ?y) (?y :parentOf ?z)'; Filter: NULL; Consequent: '(?x :uncleOf ?z)'. (2) Antecedent: '(?x :age ?a)'; Filter: 'a >= 65'; Consequent: '(?x :ageGroup "Senior")'. (3) Antecedent: '(?x :parentOf ?y) (?y :parentOf ?z)'; Filter: NULL; Consequent: '(?x :grandParentOf ?z)'.
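
The three family_rb rules can be tried out with a small forward-chaining loop in Python. This is a sketch of the general antecedent/filter/consequent idea only; the `match`, `substitute`, and `entail` functions, the dictionary layout of a rule, and the extra people `:Ann` and `:Gran` are all assumptions made for the example.

```python
def match(patterns, triples, binding=None):
    """Return every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        return [binding]
    results = []
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(patterns[0], triple):
            if term.startswith("?"):
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.extend(match(patterns[1:], triples, candidate))
    return results


def substitute(pattern, binding):
    """Replace variables in a consequent pattern with their bound values."""
    return tuple(binding.get(term, term) for term in pattern)


# The three rules from the slide; filters are Python callables in this sketch.
FAMILY_RB = [
    {"antecedent": [("?x", ":brotherOf", "?y"), ("?y", ":parentOf", "?z")],
     "filter": None,
     "consequent": [("?x", ":uncleOf", "?z")]},
    {"antecedent": [("?x", ":age", "?a")],
     "filter": lambda b: b["?a"] >= 65,
     "consequent": [("?x", ":ageGroup", "Senior")]},
    {"antecedent": [("?x", ":parentOf", "?y"), ("?y", ":parentOf", "?z")],
     "filter": None,
     "consequent": [("?x", ":grandParentOf", "?z")]},
]


def entail(triples, rules):
    """Apply rules until nothing new is derived; return only inferred triples."""
    known = set(triples)
    while True:
        new = set()
        for rule in rules:
            for binding in match(rule["antecedent"], known):
                if rule["filter"] is None or rule["filter"](binding):
                    new |= {substitute(p, binding) for p in rule["consequent"]}
        if new <= known:
            return known - set(triples)
        known |= new


family = {
    (":John", ":brotherOf", ":Mary"),
    (":Mary", ":parentOf", ":Matt"),
    (":Matt", ":parentOf", ":Ann"),   # hypothetical extra fact
    (":John", ":age", 16),
    (":Gran", ":age", 70),            # hypothetical extra fact
}

inferred = entail(family, FAMILY_RB)
```

The set returned by `entail` corresponds loosely to what the following slides call an entailed graph: the triples that follow from the model plus the rulebase but were never inserted directly.
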
  32. RDF Rule Indexes
  33. Rule Index: Overview. A rule index represents an entailed graph. A rule index is created on an RDF dataset (consisting of a set of RDF models and a set of RDF rulebases).
  34. Rule Index: Example. A rule index may be created on a dataset consisting of family RDF data and the family_rb rulebase (shown earlier). The rule index will contain inferred triples showing uncleOf and ageGroup information.
  35. RDF Query with Inference
  36. SDO_RDF_MATCH with Rulebases. Arguments: Graph pattern (a sequence of triples, with variables); RDF data set (a set of models and a set of rulebases); Filter; Aliases. Usage: … FROM TABLE(SDO_RDF_MATCH('(?x :uncleOf ?y)', SDO_RDF_Models('family'), SDO_RDF_Rulebases('rdfs', 'family_rb') … )) t …
  37. RDF Query w/ Inference: Example. Find salary and hiredate of all the uncles: SELECT emp.name, emp.salary, emp.hiredate FROM emp, TABLE(SDO_RDF_MATCH('(?x :uncleOf ?y) (?x :name ?name)', SDO_RDF_Models('family'), SDO_RDF_Rulebases('rdfs', 'family_rb'), …)) t WHERE emp.name=t.name;
  38. RDF Query w/ Inference: Example 2. Find pairs of persons residing at the same address where the first person rents a truck and the second person buys a fertilizer: SELECT t3.x name1, t3.y name2 FROM AddrTable t1, AddrTable t2, TABLE(SDO_RDF_MATCH('(?x :rents ?a) (?a rdf:type :Truck) (?y :buys ?b) (?b rdf:type :Fertilizer)', SDO_RDF_Models('Activities'), SDO_RDF_Rulebases('rdfs'), …)) t3 WHERE t1.name=t3.x and t2.name=t3.y and t1.addr=t2.addr;
  39. RDF Models
  40. Model: DDL. Procedures provided as part of the API may be used to create a model and to drop a model. When a user creates a model, a database view gets created automatically (e.g., rdfm_family). A model corresponds to a column of type SDO_RDF_TRIPLE_S in a base table. Each model has exactly one base table associated with it.
  41. Model: DDL. Creating a Model. Create an application table: CREATE TABLE family_table (id NUMBER, family_triple SDO_RDF_TRIPLE_S); Create a model: EXEC SDO_RDF.CREATE_RDF_MODEL('family', 'family_table', 'family_triple'); This automatically creates the following database view: rdfm_family (…)
  42. Loading RDF Data into Oracle. Java API provided to load NTriple into NDM. Sample XSLs provided to convert RDF to NTriple and to convert RDF to INSERT statements.
  43. Model: DML. SQL DML commands may be used on a base table to effect DML (i.e., triple insert, delete, and update) on the corresponding model. Insert triples: INSERT INTO family_table VALUES (1, SDO_RDF_TRIPLE_S('family', '<http://example.org/family/John>', '<http://example.org/family/brotherOf>', '<http://example.org/family/Mary>'));
  44. Model: Security. The creator of the base table corresponding to a model can grant privileges to other users. To perform DML on a model, a user must have DML privileges on the corresponding base table. The creator of a model can grant QUERY privileges on the corresponding database view to other users. A user can query only those models for which they have QUERY privileges on the corresponding database views. Only the creator of a model can drop the model.
  45. Model: Views. Database views corresponding to the models.
  46. RDF Rulebases
  47. Rulebase: DDL. Procedures provided as part of the API may be used to create a rulebase, create_rulebase('family_rb'); and to drop a rulebase, drop_rulebase('family_rb'); When a user creates a rulebase, a database view gets created automatically: rdfr_family_rb (rule_name, antecedent, filter, consequent, aliases)
  48. Rulebase: DML. SQL DML commands may be used on the database view corresponding to a target rulebase to insert, delete, and update rules: insert into mdsys.rdfr_family_rb values('uncle_rule', '(?x :brotherOf ?y) (?y :parentOf ?z)', NULL, '(?x :uncleOf ?z)', SDO_RDF_Aliases(…));
  49. Rulebase: Security. The creator of a rulebase can grant privileges on the corresponding database view to other users. Performing DML operations requires the invoker to have appropriate privileges on the database view. Only the creator of a rulebase can drop the rulebase.
  50. Rulebase: Views. RDF_RULEBASE_INFO contains the list of rulebases and, for each rulebase, additional information (such as creator, view name, etc.). The content of each rulebase is available from the corresponding database view.
  51. RDF Rule Indexes
  52. Rule Index: DDL. Procedures provided as part of the API may be used to create a rule index, create_rules_index('family_rb_rix_family', SDO_RDF_Models('family'), SDO_RDF_Rulebases('rdfs', 'family_rb')); and to drop a rule index, drop_rules_index('family_rb_rix_family'); When a user creates a rule index, a database view gets created automatically: rdfi_family_rb_rix_family (…)
  53. Rule Index: Security. To create a rule index on an RDF dataset (models and rulebases), a user needs to have QUERY privileges on those models and rulebases. The creator of a rule index holds the QUERY privilege on the rule index and may grant this privilege to other users. Only the creator of a rule index can drop it.
  54. Rule Index: Views. RDF_RULEINDEX_INFO contains the list of rule indexes and, for each rule index, additional information (such as creator, status, etc.). RDF_RULEINDEX_DATASETS stores, for every rule index, the names of its models and rulebases.
  55. Rule Index: Dependencies. The content of a rule index depends upon the content of each element of its dataset. Any modification to the models or rulebases in its dataset invalidates the rule index. Dropping a model or rulebase will drop dependent rule indexes automatically.
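
The dependency behaviour described on this slide (modification invalidates, dropping cascades) can be mimicked with a toy catalog in Python. Everything here is hypothetical scaffolding for illustration, including the `Catalog` class, its method names, and the `VALID`/`INVALID` status strings; it makes no claim about how Oracle tracks these dependencies, and it covers models only (rulebases would be handled the same way).

```python
class Catalog:
    """Toy registry of models and the rule indexes that depend on them."""

    def __init__(self):
        self.models = {}        # model name -> set of triples
        self.rule_indexes = {}  # index name -> {"models": set, "status": str}

    def create_model(self, name):
        self.models[name] = set()

    def create_rule_index(self, name, models):
        self.rule_indexes[name] = {"models": set(models), "status": "VALID"}

    def insert_triple(self, model, triple):
        """Any modification to a model invalidates the indexes built on it."""
        self.models[model].add(triple)
        for index in self.rule_indexes.values():
            if model in index["models"]:
                index["status"] = "INVALID"

    def drop_model(self, model):
        """Dropping a model also drops every rule index that depends on it."""
        del self.models[model]
        self.rule_indexes = {
            name: index
            for name, index in self.rule_indexes.items()
            if model not in index["models"]
        }


catalog = Catalog()
catalog.create_model("family")
catalog.create_rule_index("family_rb_rix_family", ["family"])
status_before = catalog.rule_indexes["family_rb_rix_family"]["status"]

catalog.insert_triple("family", (":John", ":brotherOf", ":Mary"))
status_after = catalog.rule_indexes["family_rb_rix_family"]["status"]

catalog.drop_model("family")
remaining_indexes = list(catalog.rule_indexes)
```
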
  56. Summary. RDF Data Model: Models (Graphs); RDF Query using the SDO_RDF_MATCH Table Function. RDF Data Model with (user-defined) Rules: Models (Graphs); Rulebases; Rule Indexes; RDF Query on entailed RDF graphs. Management (DDL, DML, Security, …): Models, Rulebases, and Rule Indexes.
  57. RDF Data Model Demo
  58. Demo: Family Schema
  59. Demo: Family Schema 2
  60. Demo: Family Model Data
  61. Demo: Family Model Data (Alt)
  62. Demo: Query without Inference. select m from TABLE(SDO_RDF_MATCH('(?m rdf:type :Male)', SDO_RDF_Models('family'), null, SDO_RDF_Aliases(SDO_RDF_Alias('', 'http://www.example.org/family/')), null)); Result (column M): http://www.example.org/family/Jack, http://www.example.org/family/Tom
  63. Demo: Query w/ RDFS Inference. select m from TABLE(SDO_RDF_MATCH('(?m rdf:type :Male)', SDO_RDF_Models('family'), SDO_RDF_Rulebases('RDFS'), SDO_RDF_Aliases(SDO_RDF_Alias('', 'http://www.example.org/family/')), null)); Result (column M): http://www.example.org/family/Jack, http://www.example.org/family/Tom, http://www.example.org/family/John, http://www.example.org/family/Matt, http://www.example.org/family/Sammy
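
The two demo queries differ only in whether RDFS entailment is applied. The family schema slides did not survive in this transcript, so the sketch below guesses at a schema (a `:fatherOf` property whose `rdfs:domain` is `:Male`) purely to show why extra rows can appear; the data, the `rdfs_closure` function, and the choice of the two standard RDFS rules rdfs2 (domain) and rdfs9 (subClassOf) are assumptions of this example.

```python
def rdfs_closure(triples):
    """Apply two RDFS entailment rules until no new triples appear."""
    known = set(triples)
    while True:
        new = set()
        for s, p, o in known:
            if p == "rdfs:domain":
                # rdfs2: (prop rdfs:domain C), (x prop y) => (x rdf:type C)
                new |= {(x, "rdf:type", o) for x, q, _ in known if q == s}
            elif p == "rdfs:subClassOf":
                # rdfs9: (C rdfs:subClassOf D), (x rdf:type C) => (x rdf:type D)
                new |= {(x, "rdf:type", o)
                        for x, q, c in known if q == "rdf:type" and c == s}
        if new <= known:
            return known
        known |= new


# Hypothetical data: only Jack is explicitly typed as :Male.
family = {
    (":Jack", "rdf:type", ":Male"),
    (":fatherOf", "rdfs:domain", ":Male"),
    (":John", ":fatherOf", ":Suzie"),
    (":Male", "rdfs:subClassOf", ":Person"),
}


def males(triples):
    """Subjects matching the pattern (?m rdf:type :Male)."""
    return {s for s, p, o in triples if p == "rdf:type" and o == ":Male"}


males_without_inference = males(family)
males_with_inference = males(rdfs_closure(family))
```

Without inference only the explicitly typed subject is returned; with the closure applied, `:John` is also returned because being the subject of `:fatherOf` implies membership in `:Male`.
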
  64. Demo: Family Rulebase. Antecedent: '(?x :parentOf ?y) (?y :parentOf ?z)'; Filter: NULL; Consequent: '(?x :grandParentOf ?z)'
  65. Demo: Query w/ Family and RDFS Inference. select x, y from TABLE(SDO_RDF_MATCH('(?x :grandParentOf ?y) (?x rdf:type :Male)', SDO_RDF_Models('family'), SDO_RDF_Rulebases('RDFS', 'family_rb'), SDO_RDF_Aliases(SDO_RDF_Alias('', 'http://www.example.org/family/')), null)); Result (columns X, Y): (http://www.example.org/family/John, http://www.example.org/family/Cindy), (http://www.example.org/family/John, http://www.example.org/family/Tom), (http://www.example.org/family/John, http://www.example.org/family/Jack), (http://www.example.org/family/John, http://www.example.org/family/Cathy)
  66. Questions and Answers
  67. Demo of Siderean’s Seamark Navigation Server. Mike DiLascio & Joanne Luciano
  68. Agenda. About Siderean Software & Predictive Medicine, Inc. Introducing Seamark Navigation Server v.3.6. Seamark & Oracle 10g RDF Data Model. Demonstration of Seamark / Oracle 10g integration. Lessons Learned / Q&A.
  69. About Siderean Software. Aggregate, organize and navigate information the way users think, to improve analysis and decision making. Founded in 2001 and based in El Segundo, CA. Venture backed in 2004. Delivering RDF-centric navigation and analysis capabilities for end users (a.k.a. “the last mile”). Active W3C member leveraging Semantic Web standards. Demonstrating integrated Seamark navigation layer over Oracle 10g RDF Data Model in collaboration with Predictive Medicine, Inc.
  70. Current solutions. “50,000 results!!! Now what?” “I give up! Hello? Get me an apple!” “Why do I get oranges when I’m looking for apples?” IT: “As soon as I fix his, hers stops working.” Content producer: “I just produced three apples last week!” Enterprise search: a brute force approach. Knowledge management: breathtakingly expensive.
  71. Introducing Seamark Navigation Server. “I can see the big picture!” “No more staring at a blank text box.” “I can drill down quickly to what I want.” IT: “I can take my coffee break now.” Content producer: “I knew we had an apple in here somewhere.” Seamark: layering organization to deliver pinpoint navigation.
  72. How it works: process. Metadata about data and content is aggregated… Organized into a unified information architecture… Analyzed to generate on-demand views… Providing pinpoint navigation across the data and content. [Diagram labels: Term, View, Person, Text, Place, Event]
  73. How it works: architecture. [Diagram labels: Unstructured Content and Data Feeds; Structured Content Sources; Metadata Aggregator; Navigation Metadata; Navigation Web Services; User Navigation and User Tagging; Web Browsers & Portals; Search Engines; User Alerts; Feed Aggregators]
  74. Seamark/Oracle integration architecture: Phase 1. [Diagram labels: Oracle 10g RDF Data Model for scalable persistence of metadata; Batch RDFMatch Query issued from Seamark at index time; Cached Navigation Metadata; Navigation Web Services; User Navigation and User Tagging; Web Browsers & Portals; User Alerts; Feed Aggregators]
  75. Seamark/Oracle integration architecture: Phase 2. [Diagram labels: Oracle 10g RDF Data Model for scalable persistence of metadata; Federated RDFMatch Queries issued from Seamark at query time; Dynamic Navigation Metadata; Navigation Web Services; User Navigation and User Tagging; Web Browsers & Portals; User Alerts; Feed Aggregators]
  76. Seamark Demo: Background & Concepts. Life Sciences demonstration premise: RDF offers high value during early stage research. Leveraging strengths of Oracle 10g & Seamark v3.6: Oracle for large datasets and scalability; Seamark for useful subsets, flexible navigation and insights. Project elapsed time: about one week. Locating and identifying data sources represented the greatest time element. Data sources in RDF required minimal integration time. Non-RDF data sources required transformation and linking values (non-trivial but straightforward).
  77. Seamark Demonstration: Identification of new drug candidates. Steps: 1. Differentiate different forms of disease. 2. Identify patient subgroups. 3. Identify top biomarkers. 4. Identify function. 5. Identify biological and chemical properties and disease associations of biomarker. 6. Identify documents. 7. Identify role in metabolic pathways. 8. Identify compounds that interact. 9. Identify and compare function in other organisms. 10. Identify any prior art. [Diagram of linked data sets. Files: GO2Keyword.rdf, Keywords.rdf, ProbeSet.rdf, GO2UniProt.rdf, GO2OMIM.rdf, OMIM.rdf, IntAct.rdf, GO.rdf, GO2Enzyme.rdf, UniProt.rdf, Taxonomy.rdf, PubMed.xml, Enzymes.rdf, KEGG.rdf. Node labels: Keyword, Probe, Protein, Gene, MIM Id, Enzyme, Organism, Citation, Compound, Pathway]
  78. Live Seamark Life Sciences Demonstration: Sample Screenshots
  79. Seamark application start page shows integration of OMIM, GO, KEGG, UniProt and NCBI
  80. Select: Probe Set ID: “M18255_cds2_s_at”
  81. Results: 9 Matches on “M18255_cds2_s_at” to the Gene Ontology. Cytoplasm, 1st of 9 Matches. Cellular Location Via Gene Ontology
  82. Cytoplasm, 1st of 9 Matches. Page Scroll
  83. Cytoplasm, 1st of 9 Matches. Page Scroll. Plasma Membrane, …, 2nd of 9 Matches. Cellular Location Via Gene Ontology. Page Scroll for more results, etc.
  84. Start Page: Optionally search across entire collection based upon keywords from the integrated data sources
  85. Seamark Lessons Learned. RDF offers multiple unconstrained views of data/relationships: provides maximum flexibility during early stage research; later stages can leverage OWL to constrain known relationships. Data providers: timing is right to publish in RDF format; cut your customer’s integration costs; speed discovery time. Even with one week of effort: Proof of Concept demonstrates value of broad & deep integration; additional value in extending POC in customer pilot initiatives.
  86. Siderean Seamark Conclusion. Getting the precise information we need from today’s data glut is profoundly difficult. Solving this problem requires a solution that works the way you think. Siderean is the world’s first turnkey navigation server for the enterprise and people at large.
  87. Thank You! To arrange a demonstration of Seamark or for more information, please contact: Mike DiLascio, Office: +1 781 652 0339, Mobile: +1 781 354 7663, mdilascio@siderean.com. Siderean Software, Inc., 390 North Sepulveda Blvd., Suite 2070, El Segundo, CA 90245-4475 USA. http://www.siderean.com