Knowledge Driven User Interfaces for Complex Biological Queries
KIERAN O’NEILL, ALEXANDER GARCIA-CASTRO†
and DANIEL JACOBSON
National Bioinformatics Network, Central Node
and †The International Center for Tropical Agriculture (CIAT)
With the explosion of biological data in the postgenomic era, there has been a growing need for semantic data integration, supported
by ontologies. Semantic integration techniques enable biologists to construct complex biological queries. However, the construction
of these queries and analysis of their results can place a high cognitive load on biologists. This paper presents a proposed information
visualisation tool, Digr, to aid biologists in these processes within the context of DigraBase, a graph database for semantic data
integration. A working example of a query is presented, to illustrate the complexity of the information spaces under consideration.
Visualisation techniques that have been applied to similar problems are discussed in the context of their applicability to the problem
of aiding the construction of complex queries over DigraBase, and the interpretation of their results.
General Terms: Data Visualisation, Comparative Genomics
Additional Key Words and Phrases: Information Visualisation, Complex Biological Queries, Information Integration
Introduction
In the biological domain, there has been a shift from hypothesis-driven research, wherein data is collected purely
to answer a scientific question, to data-driven research, wherein large data sets are collected and made publicly
available for analysis and interpretation [Searls 2005]. This has resulted in an explosion in the amount of
molecular biological data that is publicly available. This data is stored in at least 858 databases [Galperin 2006],
using differing formats, schemata and query software [Wong 2002]. To enable biologists to fully leverage this
data, and the information it contains, the integration of data from disparate sources is essential.
The syntactic, or ‘low level’ [Searls 2005] integration of data is a problem that has been addressed by systems
such as Sequence Retrieval Service (SRS)[Etzold and Argos 1992], Entrez [Schuler et al. 1996] and others which
overcome heterogeneity in the structure of data [Garcia-Castro et al. 2005]. However, it has become clear that
there is a further need for the integration of the meaning contained within biological data, in other words semantic
[Garcia-Castro et al. 2005] or ‘higher level’ [Searls 2005] data integration.
As an example of the importance of overcoming semantic differences, the definition of the word ‘gene’ can be
considered: In three different databases, the term carries three different meanings, each dependent on the context
[Garcia-Castro et al. 2005]. Resolution of such semantic disagreements is an important aspect of semantic data
integration [Garcia-Castro et al. 2005].
Bio-ontologies provide a means to facilitate semantic integration. An ontology can be regarded as ‘a type of
knowledge base in which concepts and relations are stored’ [Garcia-Castro et al. 2005]. Gene Ontology (GO)
[Ashburner et al. 2000], has emerged as a de facto standard molecular biological ontology [Garcia-Castro et al.
2005]. GO captures the function of gene products in terms of their involvement in biological processes, the cellular
component they function in, and their molecular function. GO has been used to annotate genes across multiple
organisms, and thus can aid in cross-organism semantic integration. In addition to GO, specific genome projects,
such as the mouse (Mus musculus)[Blake et al. 2006], fly (family Drosophilidae)[Drysdale et al. 2005] and worm
(Caenorhabditis elegans) [Schwarz et al. 2006], have created, or are creating, their own domain ontologies for
capturing knowledge within their organism, such as anatomy ontologies, and phenotype ontologies (capturing
the effects of gene knockout)[Smith et al. 2005].
TAMBIS [Goble et al. 2001] illustrates another use for ontologies: that of facilitating biological query construction.
TAMBIS provides users with a conceptual view of the information sources their query will be performed
over, while shielding them from the underlying schema of those sources. This allows TAMBIS to present a uniform
interface to multiple data sources. The TAMBIS ontology also differs from GO and the domain ontologies
in that it represents higher level relations between concepts beyond the subsumption relations which the other
ontologies limit themselves to.
Once queries have been constructed and results returned, biologists still need to make sense of the results.
Author Addresses: K O’Neill, NBN Central, Cape Town, South Africa, kieran@nbn.ac.za
A Garcia-Castro, Centro Internacional de Agricultura Tropical, Cali, Colombia
D Jacobson, NBN Central, as above
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided
that the copies are not made or distributed for profit or commercial advantage, that the copies bear this notice and the full citation
on the first page. © 2007
Proceedings of SABioinf 2007, Pages 111–115
Often, the research process involves many iterations of querying, analysis and optimisation before the desired
result is achieved [Garcia-Castro et al. 2005]. However, the amount of information returned from such queries
is usually more than a human being can process mentally at once. It is necessary to provide cognitive support,
wherein some of the cognitive load is taken on by the tool, thus making it easier for the user to see previously
hidden associations as well as feasible operations [Walenstein 2002]. Information visualisation (IV) techniques
can provide this cognitive support, enabling users to more easily make sense of the results of queries, and to
construct new, refined ones [Tao et al. 2004].
Information visualisation techniques have been defined as ‘the use of computer-supported, interactive, visual
representations of abstract data to amplify cognition’ [Card et al. 1999]. They include such techniques as
providing overviews of data sets to users, enabling users to control how much and what information is displayed,
keeping a history to enable users to retrace their steps and explore, enabling users to view relationships between
the data displayed and other data, and allowing users to extract their data into a form they can use with other
software, or pass on to their peers [Shneiderman 1996]. In this paper, these tasks will be examined in the context
of building complex biological queries.
Architecture
DigraBase (DGB) is a graph database under development by the central node of the National Bioinformatics
Network (of South Africa) [Otgaar et al. 2006]. This system allows the execution of boolean declarative queries
over data loaded into it, which can be loaded from multiple sources. This gives biologists the ability to execute
complex biological queries, wherein subsets of biological objects, such as genes, are found, based on common
properties, such as function. Since these queries can be formulated across multiple data sets, this allows biologists
to find relationships between objects that were not obvious simply by looking at one database.
DGB uses a custom query language, DGBQL, and has a command line interface. However, the query language
is complex, and learning new languages, as well as using a command line interface is difficult for most biologists
[Letondal 2001a; 2001b]. A simple, graphical interface to enable users to easily construct queries and interpret
the results, while shielding them from the complexities of the underlying software, is needed.
Digr is a proposed frontend to DGB, with the purpose of filling that need. As such, it is being constructed
according to the Model View Controller (MVC) architectural design pattern. The model component is responsible
for interaction with DGB via DGBQL and presenting results in a format usable by the view component. The
view component takes these results and presents them visually to the user. The controller component accepts
commands from the user, made via the user interface, and sends them to the model. Thus, the model can easily
be altered if the query language changes, and different versions implemented to handle different methods by
which DGBQL will be transmitted, such as CORBA and XML-RPC. The visual component of Digr can also be
altered or replaced, without affecting the rest of the system. Finally, the model/controller parts of the system
can be provided as a library for other software to use.
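The separation of concerns described above can be illustrated with a minimal sketch. It is not Digr's actual code: the class names, the `Transport` type and the `fake_transport` function are all hypothetical, and the query text is passed through opaquely because no DGBQL syntax is assumed here.

```python
from typing import Callable, List

# Hypothetical transport: takes a query string, returns raw result rows.
# A real implementation might wrap a CORBA or XML-RPC call instead.
Transport = Callable[[str], List[dict]]


class QueryModel:
    """Model: owns all communication with the database backend."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def run(self, query_text: str) -> List[dict]:
        # The query text is forwarded unchanged; no query syntax is assumed.
        return self._transport(query_text)


class TextView:
    """View: renders result rows; could be swapped for a graphical view."""

    def render(self, rows: List[dict]) -> str:
        return "\n".join(f"{row['id']}: {row['label']}" for row in rows)


class Controller:
    """Controller: forwards user commands to the model, results to the view."""

    def __init__(self, model: QueryModel, view: TextView) -> None:
        self._model = model
        self._view = view

    def submit(self, query_text: str) -> str:
        return self._view.render(self._model.run(query_text))


def fake_transport(query_text: str) -> List[dict]:
    # Stand-in for a remote call; returns one canned row.
    return [{"id": "gene1", "label": f"matched {query_text!r}"}]


controller = Controller(QueryModel(fake_transport), TextView())
```

Because the model only depends on a callable, replacing the transport or the view leaves the other components untouched, which is the property the architecture is chosen for.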
A Motivating Scenario
An example of a complex query is shown below, and illustrated in Figure 1. Phrases which can be represented
by ontology terms, as well as their contexts, are highlighted in bold. The desired data to be retrieved (human
genes) is italicised:
‘Retrieve all human genes that are normally expressed in the brain and are associated with poor memory
in mice and have a role in fatty acid metabolism.’
This query contains 3 ontology terms from different ontologies. ‘Poor memory in mice’ is represented by the
Mammalian Phenotype Ontology (MPO) term ‘abnormal learning/memory’ [Smith et al. 2005]. ‘Fatty acid
metabolism’ is represented by the GO term ‘fatty acid metabolism’, part of the GO biological process ontology
[Ashburner et al. 2000]. The query can be satisfied by finding genes directly annotated with one of these terms,
or one of their child terms (specialisations). (For instance, ‘linoleic acid metabolism’ is a type of fatty acid
metabolism, so genes annotated with ‘linoleic acid metabolism’ are also transitively annotated with ‘fatty acid
metabolism’.) ‘Normally expressed in the brain’ can be fulfilled by finding genes present in expressed sequence
tag (EST) libraries made from tissue extracted from the brain (represented by annotation of the library with
an anatomy ontology term ‘brain’). This constraint is more complex, requiring transition over an intermediary
layer (EST libraries) between the ontology and genes.
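The transitive annotation rule described above can be sketched as a traversal of the ontology graph. The fragment below is a toy illustration only: the `CHILDREN` and `DIRECT` tables and the gene names are invented, and the single parent shown for 'fatty acid metabolism' is an assumption made for the example rather than a statement about the structure of GO.

```python
from typing import Dict, Set

# Hypothetical ontology fragment: term -> its direct child terms.
CHILDREN: Dict[str, Set[str]] = {
    "lipid metabolism": {"fatty acid metabolism"},
    "fatty acid metabolism": {"linoleic acid metabolism"},
    "linoleic acid metabolism": set(),
}

# Hypothetical direct annotations: term -> genes annotated with exactly that term.
DIRECT: Dict[str, Set[str]] = {
    "fatty acid metabolism": {"geneA"},
    "linoleic acid metabolism": {"geneB"},
}


def descendants(term: str) -> Set[str]:
    """Return the term itself plus every term reachable through child links."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # a DAG node may be reached by more than one path
        seen.add(current)
        stack.extend(CHILDREN.get(current, ()))
    return seen


def genes_for(term: str) -> Set[str]:
    """Genes annotated with the term directly or via any of its specialisations."""
    genes: Set[str] = set()
    for t in descendants(term):
        genes |= DIRECT.get(t, set())
    return genes
```

Here `genes_for("fatty acid metabolism")` gathers both the directly annotated gene and the one annotated with the more specific child term.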
For associations between genes and these ontology terms, several data sources are available. The Institute for
Genomic Research (TIGR) Gene Indices database contains expression data for human genes [Lee et al. 2005].
Figure 1. An illustration of the example query within the context of its information space. The three ontologies from which the
terms are taken are shown as three-dimensional boxes. Within these, the terms chosen, as well as their immediate parents, and a
few of their immediate child terms, are shown. Large arrows show the connection between the chosen terms embodied by the query.
The EMBL/DDBJ/GenBank nucleotide database [Kanz et al. 2005] is cross-referenced with the GO Annotations
database [Camon et al. 2004]. The Mouse Genome database (MGDB) is annotated with both GO and MPO
[Blake et al. 2006]. Additionally, MGDB has orthology mappings between mouse and human genes, which can
be used to find genes by the terms their orthologs are annotated with.
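Taken together, the example query amounts to intersecting three gene sets, one of which is reached through the EST library layer and another through orthology mappings. The following sketch shows that logic under stated assumptions: every identifier and table in it is invented for illustration, and it does not reflect how DigraBase stores or evaluates queries.

```python
from typing import Dict, Set

# All data below is hypothetical toy data.
# EST libraries annotated with the anatomy term 'brain'.
BRAIN_LIBRARIES: Set[str] = {"lib1", "lib2"}
# EST library -> human genes found in that library.
LIBRARY_GENES: Dict[str, Set[str]] = {
    "lib1": {"HUMAN_G1", "HUMAN_G2"},
    "lib2": {"HUMAN_G3"},
    "lib3": {"HUMAN_G4"},
}
# Mouse genes annotated with the phenotype term 'abnormal learning/memory'.
MOUSE_MEMORY_GENES: Set[str] = {"MOUSE_G1", "MOUSE_G3"}
# Orthology mapping from mouse genes to human genes.
MOUSE_TO_HUMAN: Dict[str, str] = {
    "MOUSE_G1": "HUMAN_G1",
    "MOUSE_G3": "HUMAN_G3",
    "MOUSE_G4": "HUMAN_G4",
}
# Human genes annotated (directly or transitively) with 'fatty acid metabolism'.
FATTY_ACID_GENES: Set[str] = {"HUMAN_G1", "HUMAN_G2", "HUMAN_G4"}


def genes_in_libraries(libraries: Set[str]) -> Set[str]:
    """Cross the intermediary layer: anatomy term -> EST libraries -> genes."""
    genes: Set[str] = set()
    for library in libraries:
        genes |= LIBRARY_GENES.get(library, set())
    return genes


def human_orthologs(mouse_genes: Set[str]) -> Set[str]:
    """Map mouse genes to human genes, skipping genes with no known ortholog."""
    return {MOUSE_TO_HUMAN[g] for g in mouse_genes if g in MOUSE_TO_HUMAN}


def run_example_query() -> Set[str]:
    """Boolean AND of the three constraints in the motivating query."""
    return (
        genes_in_libraries(BRAIN_LIBRARIES)
        & human_orthologs(MOUSE_MEMORY_GENES)
        & FATTY_ACID_GENES
    )
```

With this toy data only one gene satisfies all three constraints, which shows how each added constraint narrows the result set.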
Visualisation Challenges
When building complex queries, selecting ontology terms and relationships is one of the major bottlenecks.
Visualising large ontologies is not easy - the process grows in complexity when formulating queries that involve
more than one ontology. Ontologies are complex directed acyclic graph (DAG) structures, and enabling users to
find terms within them corresponding to the idea of the query in their mind, is challenging.
One approach to finding ontology terms is ‘top-down browsing’ through the relationships within the ontology.
Ontology browsers, such as AmiGO [Ashburner et al. 2000], and ontology editors, such as DAGedit, accomplish
this using a collapsible tree, as used in some file browsers. Disadvantages of this approach are that the number of
children shown cannot be controlled, that the overall complexity of the representation can be overwhelming,
and that it consumes excessive screen real estate. Another system, Flamenco [Hearst et al. 2002], has been built specifically
for complex, multi-ontology queries, and enables top-down browsing of multiple ontologies simultaneously. Each
ontology is displayed in a box, with a ‘breadcrumb trail’ leading to the current term (enabling users to jump
back up the hierarchy), and a two-column list of children terms, with controls to expand or contract each box
dynamically. In this way, a large space of information is made accessible to the user without overwhelming
them or consuming screen space. Digr will use a similar approach, with adaptations for the display of DAGs,
as Flamenco was designed for tree-structured ontologies. In this way, users can decide if more specific, or more
general terms than the ones they have chosen, best capture their query.
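One consequence of moving from trees to DAGs is that a term may have several parents, so a single breadcrumb trail no longer suffices. The sketch below enumerates every path from a root to a chosen term; it is one possible adaptation, not a description of how Digr handles this, and the `PARENTS` table uses invented term names.

```python
from typing import Dict, List

# Hypothetical DAG: term -> its direct parents. "C" has two parents,
# which is what distinguishes a DAG from a tree.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A", "B"],
}


def breadcrumb_trails(term: str) -> List[List[str]]:
    """Return every root-to-term path, each as a list of term names."""
    parents = PARENTS.get(term, [])
    if not parents:
        return [[term]]  # a root term is its own trail
    trails: List[List[str]] = []
    for parent in parents:
        for trail in breadcrumb_trails(parent):
            trails.append(trail + [term])
    return trails
```

For the term "C" this yields two trails, one through "A" and one through "B"; an interface could show all of them or let the user pick which branch to follow upwards.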
Another approach to finding ontology terms is text searching. An example of this is Ontology Lookup Service
(OLS) [Côté et al. 2006], which uses a support vector machine (SVM) approach to enable extremely fuzzy text
searching with ranked results. OLS, however, only provides the names of matches for a user to choose from.
A richer, more informative view would better assist users in finding terms matching the concept they had in
mind. Flamenco also uses text searching, displaying results grouped by ontology terms, with options for choosing
different ontologies to group by. Digr will attempt to use the fuzzy search capabilities of OLS, but integrated
with the top-down components of the interface to provide a richer view of results.
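The idea of ranked fuzzy matching over term names can be illustrated with Python's standard `difflib` module. This is a deliberately simple stand-in based on string similarity; it is not the approach attributed to OLS above, and the `TERM_NAMES` list and `search` function are hypothetical.

```python
import difflib
from typing import List

# Hypothetical list of term names drawn from the ontologies being searched.
TERM_NAMES: List[str] = [
    "fatty acid metabolism",
    "linoleic acid metabolism",
    "abnormal learning/memory",
    "brain",
]


def search(query: str, limit: int = 5, cutoff: float = 0.5) -> List[str]:
    """Return term names similar to the query, best match first."""
    return difflib.get_close_matches(
        query.lower(), TERM_NAMES, n=limit, cutoff=cutoff
    )
```

A misspelt query such as `search("fatty acid metabolsm")` still ranks the intended term first. A richer view could then place each match in its ontology context using the top-down browsing components.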
Another useful technique in complex query construction is the provision of dynamic query previews, as carried
out in Flamenco [Hearst et al. 2002]. As the query is constructed, only the terms which can be combined with
those already selected are offered as choices to refine the query with. This reduces the complexity of finding terms, and can help
to show implicit relationships between them. In addition, the size of the result set to be returned is displayed
next to each combinable term, thus providing users with additional information to aid query construction. Digr
will attempt to provide both of these facilities.
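A dynamic query preview of this kind can be sketched as follows. The example assumes that each term maps to a set of annotated genes and that a query is the intersection of the selected terms' sets; the `ANNOTATIONS` table and the `preview` function are hypothetical and are not taken from Flamenco or Digr.

```python
from typing import Dict, List, Set

# Hypothetical annotations: term -> genes annotated with it.
ANNOTATIONS: Dict[str, Set[str]] = {
    "term_x": {"g1", "g2", "g3"},
    "term_y": {"g2", "g3"},
    "term_z": {"g4"},
}


def preview(selected: List[str]) -> Dict[str, int]:
    """For each unselected term, report how many results adding it would give.

    Terms that would produce an empty result set are not offered at all.
    """
    if selected:
        current = set.intersection(*(ANNOTATIONS[t] for t in selected))
    else:
        # With nothing selected, every annotated gene is still a candidate.
        current = set().union(*ANNOTATIONS.values())
    options: Dict[str, int] = {}
    for term, genes in ANNOTATIONS.items():
        if term in selected:
            continue
        count = len(current & genes)
        if count:
            options[term] = count
    return options
```

After selecting `term_x`, only `term_y` remains on offer, with a preview count of 2; `term_z` is hidden because combining it would return nothing.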
As an interface for complex query building, Digr aims to integrate text searching and top-down browsing in a
single interface, with dynamic query previewing. In addition, results will be visually clustered according to their
annotations to different ontologies, and all user actions will be undoable via a history-keeping mechanism. In
these ways, Digr hopes to visually aid biologists in constructing complex biological queries.
REFERENCES
Ashburner, M., Ball, C., Blake, J., Botstein, D., Butler, H., Cherry, J., Davis, A., Dolinski, K., Dwight, S., Eppig, J.,
Harris, M., Hill, D., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J., Richardson, J., Ringwald, M., Rubin, G.,
and Sherlock, G. 2000. Gene ontology: tool for the unification of biology. Nat Genet. 25(1), 25–29.
Blake, J., Eppig, J., Bult, C., Kadin, J., Richardson, J., et al. 2006. The mouse genome database (mgd): updates and
enhancements. Nucleic Acids Research 34, Database Issue.
Camon, E. et al. 2004. The gene ontology annotation (goa) database: sharing knowledge in uniprot with gene ontology. Nucleic
Acids Research 32, 90001, 262–266.
Card, S., Mackinlay, J., and Shneiderman, B. 1999. Readings in Information Visualization: Using Vision to Think. Morgan
Kaufmann.
Côté, R., Jones, P., Apweiler, R., and Hermjakob, H. 2006. The ontology lookup service, a lightweight cross-platform tool for
controlled vocabulary queries. BMC Bioinformatics 2006, 7, 97.
Drysdale, R., Crosby, M., Gelbart, W., Campbell, K., Emmert, D., et al. 2005. Flybase: genes and gene models. Nucleic
Acids Res 33, 390–395.
Etzold, T. and Argos, P. 1992. Srs–an indexing and retrieval tool for flat file data libraries. Bioinformatics 9, 49–57.
Galperin, M. Y. 2006. The molecular biology database collection: 2006 update. Nucl. Acids. Res. 34, D3–D5.
Garcia-Castro, A., Chen, Y., and Ragan, M. 2005. Information integration in molecular bioscience. Appl Bioinformatics 4, 3,
157–173.
Garcia-Castro, A., Thoraval, S., Garcia, L., and Ragan, M. 2005. Workflows in bioinformatics: meta-analysis and prototype
implementation of a workflow generator. BMC Bioinformatics 6, 87.
Goble, C., Stevens, R., Ng, G., Bechhofer, S., Paton, N., Baker, P., Peim, M., and Brass, A. 2001. Transparent access to
multiple bioinformatics information sources. IBM Systems Journal 40, 2, 532–551.
Hearst, M., Elliott, A., English, J., Sinha, R., Swearingen, K., and Yee, K. 2002. Finding the flow in web site search.
Communications of the ACM 45, 9, 42–49.
Kanz, C., Aldebert, P., Althorpe, N., Baker, W., Baldwin, A., Bates, K., Browne, P., van den Broek, A., Castro, M.,
Cochrane, G., et al. 2005. The embl nucleotide sequence database. Nucleic Acids Res 33, 167–172.
Lee, Y., Tsai, J., Sunkara, S., Karamycheva, S., Pertea, G., Sultana, R., Antonescu, V., Chan, A., Cheung, F., and
Quackenbush, J. 2005. The tigr gene indices: clustering and assembling est and known genes and integration with eukaryotic
genomes. Nucleic Acids Res 33, 71–74.
Letondal, C. 2001a. Interaction et programmation - conception d’applications programmables avec des non-informaticiens. Ph.D.
thesis, Université de Paris-Sud.
Letondal, C. 2001b. A web interface generator for molecular biology programs in unix. Bioinformatics 17, 1, 73–82.
Otgaar, D., Dominy, D., Maclear, A., Gamieldien, J., Martinez, F., and Jacobson, D. 2006. Digrabase: A graph-theoretic
framework for semantic integration of biological data. Poster, Joint BioLINK and 9th Bio-Ontologies Meeting.
Schuler, G., Epstein, J., Ohkawa, H., and Kans, J. 1996. Entrez: molecular biology database and retrieval system. Methods
Enzymol 266, 141–62.
Schwarz, E., Antoshechkin, I., Bastiani, C., Bieri, T., Blasiar, D., Canaran, P., Chan, J., Chen, N., Chen, W., Davis, P.,
et al. 2006. Wormbase: better software, richer content. Nucleic Acids Research.
Searls, D. 2005. Data integration: challenges for drug discovery. Nat Rev Drug Discov 4, 1, 45–58.
Shneiderman, B. 1996. The eyes have it: a task by data type taxonomy for information visualizations. Visual Languages, 1996.
Proceedings., IEEE Symposium on, 336–343.
Smith, C., Goldsmith, C., and Eppig, J. 2005. The mammalian phenotype ontology as a tool for annotating, analyzing and
comparing phenotypic information. Genome Biol 6, 1, R7.
Tao, Y., Liu, Y., Friedman, C., and Lussier, Y. 2004. Information visualization techniques in bioinformatics during the
postgenomic era. Drug Discovery Today BIOSILICO 2, 237–245.
Walenstein, A. 2002. Cognitive support in software engineering tools: A distributed cognition framework. Ph.D. thesis, SIMON
FRASER UNIVERSITY.
Wong, L. 2002. Technologies for integrating biological data. Briefings in Bioinformatics 3, 4, 389–404.

More Related Content

What's hot

EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...ijistjournal
 
ANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MINING
ANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MININGANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MINING
ANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MININGijbbjournal
 
Bs31267274
Bs31267274Bs31267274
Bs31267274IJMER
 
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...ijistjournal
 
Review of Multimodal Biometrics: Applications, Challenges and Research Areas
Review of Multimodal Biometrics: Applications, Challenges and Research AreasReview of Multimodal Biometrics: Applications, Challenges and Research Areas
Review of Multimodal Biometrics: Applications, Challenges and Research AreasCSCJournals
 
Gene Selection Based on Rough Set Applications of Rough Set on Computational ...
Gene Selection Based on Rough Set Applications of Rough Set on Computational ...Gene Selection Based on Rough Set Applications of Rough Set on Computational ...
Gene Selection Based on Rough Set Applications of Rough Set on Computational ...ijcoa
 
Predicting students' performance using id3 and c4.5 classification algorithms
Predicting students' performance using id3 and c4.5 classification algorithmsPredicting students' performance using id3 and c4.5 classification algorithms
Predicting students' performance using id3 and c4.5 classification algorithmsIJDKP
 
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
A Semantic Retrieval System for Extracting Relationships from Biological CorpusA Semantic Retrieval System for Extracting Relationships from Biological Corpus
A Semantic Retrieval System for Extracting Relationships from Biological Corpusijcsit
 
IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...
IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...
IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...IRJET Journal
 
Bioinformatics.Assignment
Bioinformatics.AssignmentBioinformatics.Assignment
Bioinformatics.AssignmentNaima Tahsin
 
A semantic framework for biomedical image discovery
A semantic framework for biomedical image discoveryA semantic framework for biomedical image discovery
A semantic framework for biomedical image discoverySyed Ahmad Chan Bukhari, PhD
 
ROLE OF CERTAINTY FACTOR IN GENERATING ROUGH-FUZZY RULE
ROLE OF CERTAINTY FACTOR IN GENERATING ROUGH-FUZZY RULEROLE OF CERTAINTY FACTOR IN GENERATING ROUGH-FUZZY RULE
ROLE OF CERTAINTY FACTOR IN GENERATING ROUGH-FUZZY RULEIJCSEA Journal
 
NetBioSIG2014-Talk by Hyunghoon Cho
NetBioSIG2014-Talk by Hyunghoon ChoNetBioSIG2014-Talk by Hyunghoon Cho
NetBioSIG2014-Talk by Hyunghoon ChoAlexander Pico
 
Trust Enhanced Role Based Access Control Using Genetic Algorithm
Trust Enhanced Role Based Access Control Using Genetic Algorithm Trust Enhanced Role Based Access Control Using Genetic Algorithm
Trust Enhanced Role Based Access Control Using Genetic Algorithm IJECEIAES
 
DBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEM
DBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEMDBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEM
DBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEMIJwest
 

What's hot (18)

50120130405011
5012013040501150120130405011
50120130405011
 
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
 
ANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MINING
ANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MININGANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MINING
ANALYSIS OF PROTEIN MICROARRAY DATA USING DATA MINING
 
Bs31267274
Bs31267274Bs31267274
Bs31267274
 
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
 
Review of Multimodal Biometrics: Applications, Challenges and Research Areas
Review of Multimodal Biometrics: Applications, Challenges and Research AreasReview of Multimodal Biometrics: Applications, Challenges and Research Areas
Review of Multimodal Biometrics: Applications, Challenges and Research Areas
 
Gene Selection Based on Rough Set Applications of Rough Set on Computational ...
Gene Selection Based on Rough Set Applications of Rough Set on Computational ...Gene Selection Based on Rough Set Applications of Rough Set on Computational ...
Gene Selection Based on Rough Set Applications of Rough Set on Computational ...
 
Predicting students' performance using id3 and c4.5 classification algorithms
Predicting students' performance using id3 and c4.5 classification algorithmsPredicting students' performance using id3 and c4.5 classification algorithms
Predicting students' performance using id3 and c4.5 classification algorithms
 
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
A Semantic Retrieval System for Extracting Relationships from Biological CorpusA Semantic Retrieval System for Extracting Relationships from Biological Corpus
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
 
IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...
IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...
IRJET- Gene Mutation Data using Multiplicative Adaptive Algorithm and Gene On...
 
Bioinformatics.Assignment
Bioinformatics.AssignmentBioinformatics.Assignment
Bioinformatics.Assignment
 
A semantic framework for biomedical image discovery
A semantic framework for biomedical image discoveryA semantic framework for biomedical image discovery
A semantic framework for biomedical image discovery
 
IBSB tutorial
IBSB tutorialIBSB tutorial
IBSB tutorial
 
ROLE OF CERTAINTY FACTOR IN GENERATING ROUGH-FUZZY RULE
ROLE OF CERTAINTY FACTOR IN GENERATING ROUGH-FUZZY RULEROLE OF CERTAINTY FACTOR IN GENERATING ROUGH-FUZZY RULE
ROLE OF CERTAINTY FACTOR IN GENERATING ROUGH-FUZZY RULE
 
NetBioSIG2014-Talk by Hyunghoon Cho
NetBioSIG2014-Talk by Hyunghoon ChoNetBioSIG2014-Talk by Hyunghoon Cho
NetBioSIG2014-Talk by Hyunghoon Cho
 
Trust Enhanced Role Based Access Control Using Genetic Algorithm
Trust Enhanced Role Based Access Control Using Genetic Algorithm Trust Enhanced Role Based Access Control Using Genetic Algorithm
Trust Enhanced Role Based Access Control Using Genetic Algorithm
 
DBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEM
DBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEMDBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEM
DBPEDIA BASED FACTOID QUESTION ANSWERING SYSTEM
 
F0433439
F0433439F0433439
F0433439
 

Similar to Knowledge Driven User Interfaces for Complex Biological Queries

A consistent and efficient graphical User Interface Design and Querying Organ...
A consistent and efficient graphical User Interface Design and Querying Organ...A consistent and efficient graphical User Interface Design and Querying Organ...
A consistent and efficient graphical User Interface Design and Querying Organ...CSCJournals
 
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : A C...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : A C...ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : A C...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : A C...IJNSA Journal
 
PERFORMANCE EVALUATION OF STRUCTURED AND SEMI-STRUCTURED BIOINFORMATICS TOOLS...
PERFORMANCE EVALUATION OF STRUCTURED AND SEMI-STRUCTURED BIOINFORMATICS TOOLS...PERFORMANCE EVALUATION OF STRUCTURED AND SEMI-STRUCTURED BIOINFORMATICS TOOLS...
PERFORMANCE EVALUATION OF STRUCTURED AND SEMI-STRUCTURED BIOINFORMATICS TOOLS...ijseajournal
 
PERFORMANCE EVALUATION OF STRUCTURED AND SEMI-STRUCTURED BIOINFORMATICS TOOLS...
PERFORMANCE EVALUATION OF STRUCTURED AND SEMI-STRUCTURED BIOINFORMATICS TOOLS...PERFORMANCE EVALUATION OF STRUCTURED AND SEMI-STRUCTURED BIOINFORMATICS TOOLS...
PERFORMANCE EVALUATION OF STRUCTURED AND SEMI-STRUCTURED BIOINFORMATICS TOOLS...ijseajournal
 
Towards a Query Rewriting Algorithm Over Proteomics XML Resources
Towards a Query Rewriting Algorithm Over Proteomics XML ResourcesTowards a Query Rewriting Algorithm Over Proteomics XML Resources
Towards a Query Rewriting Algorithm Over Proteomics XML ResourcesCSCJournals
 
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational DatabasesSemantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databasesijsrd.com
 
TWO LEVEL SELF-SUPERVISED RELATION EXTRACTION FROM MEDLINE USING UMLS
TWO LEVEL SELF-SUPERVISED RELATION EXTRACTION FROM MEDLINE USING UMLSTWO LEVEL SELF-SUPERVISED RELATION EXTRACTION FROM MEDLINE USING UMLS
TWO LEVEL SELF-SUPERVISED RELATION EXTRACTION FROM MEDLINE USING UMLSIJDKP
 
Indexing based Genetic Programming Approach to Record Deduplication
Indexing based Genetic Programming Approach to Record DeduplicationIndexing based Genetic Programming Approach to Record Deduplication
Indexing based Genetic Programming Approach to Record Deduplicationidescitation
 
Next-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information RetrievalNext-Generation Search Engines for Information Retrieval
Next-Generation Search Engines for Information RetrievalWaqas Tariq
 
Simplifying Database Normalization within a Visual Interactive Simulation Model
Simplifying Database Normalization within a Visual Interactive Simulation ModelSimplifying Database Normalization within a Visual Interactive Simulation Model
Simplifying Database Normalization within a Visual Interactive Simulation Modelijdms
 
Investigating plant systems using data integration and network analysis
Investigating plant systems using data integration and network analysisInvestigating plant systems using data integration and network analysis
Investigating plant systems using data integration and network analysisCatherine Canevet
 
Web based servers and softwares for genome analysis
Web based servers and softwares for genome analysisWeb based servers and softwares for genome analysis
Web based servers and softwares for genome analysisDr. Naveen Gaurav srivastava
 
A Cell-Cycle Knowledge Integration Framework
A Cell-Cycle Knowledge Integration FrameworkA Cell-Cycle Knowledge Integration Framework
A Cell-Cycle Knowledge Integration FrameworkLisa Muthukumar
 
COMPUTATIONAL METHODS FOR FUNCTIONAL ANALYSIS OF GENE EXPRESSION
COMPUTATIONAL METHODS FOR FUNCTIONAL ANALYSIS OF GENE EXPRESSIONCOMPUTATIONAL METHODS FOR FUNCTIONAL ANALYSIS OF GENE EXPRESSION
COMPUTATIONAL METHODS FOR FUNCTIONAL ANALYSIS OF GENE EXPRESSIONcsandit
 
Survey on evolutionary computation tech techniques and its application in dif...
Survey on evolutionary computation tech techniques and its application in dif...Survey on evolutionary computation tech techniques and its application in dif...
Survey on evolutionary computation tech techniques and its application in dif...ijitjournal
 
USING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEY
USING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEYUSING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEY
USING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEYcseij
 
76201910
7620191076201910
76201910IJRAT
 
Blended intelligence of FCA with FLC for knowledge representation from cluste...
Blended intelligence of FCA with FLC for knowledge representation from cluste...Blended intelligence of FCA with FLC for knowledge representation from cluste...
Blended intelligence of FCA with FLC for knowledge representation from cluste...IJECEIAES
 

Knowledge Driven User Interfaces for Complex Biological Queries

KIERAN O’NEILL, ALEXANDER GARCIA-CASTRO† and DANIEL JACOBSON
National Bioinformatics Network, Central Node
† and The International Center for Tropical Agriculture (CIAT)

With the explosion of biological data in the postgenomic era, there has been a growing need for semantic data integration, supported by ontologies. Semantic integration techniques enable biologists to construct complex biological queries. However, the construction of these queries and analysis of their results can place a high cognitive load on biologists. This paper presents a proposed information visualisation tool, Digr, to aid biologists in these processes within the context of DigraBase, a graph database for semantic data integration. A working example of a query is presented, to illustrate the complexity of the information spaces under consideration. Visualisation techniques that have been applied to similar problems are discussed in the context of their applicability to the problem of aiding the construction of complex queries over DigraBase, and the interpretation of their results.

General Terms: Data Visualisation, Comparative Genomics
Additional Key Words and Phrases: Information Visualisation, Complex Biological Queries, Information Integration

Introduction

In the biological domain, there has been a shift from hypothesis-driven research, wherein data is collected purely to answer a scientific question, to data-driven research, wherein large data sets are collected and made publicly available for analysis and interpretation [Searls 2005]. This has resulted in an explosion in the amount of molecular biological data that is publicly available. This data is stored in at least 858 databases [Galperin 2006], using differing formats, schemata and query software [Wong 2002]. To enable biologists to fully leverage this data, and the information it contains, the integration of data from disparate sources is essential.
The syntactic, or ‘low level’ [Searls 2005], integration of data is a problem that has been addressed by systems such as the Sequence Retrieval System (SRS) [Etzold and Argos 1992], Entrez [Schuler et al. 1996] and others, which overcome heterogeneity in the structure of data [Garcia-Castro et al. 2005]. However, it has become clear that there is a further need for the integration of the meaning contained within biological data, in other words semantic [Garcia-Castro et al. 2005] or ‘higher level’ [Searls 2005] data integration. As an example of the importance of overcoming semantic differences, consider the definition of the word ‘gene’: in three different databases, the term carries three different meanings, each dependent on the context [Garcia-Castro et al. 2005]. Resolution of such semantic disagreements is an important aspect of semantic data integration [Garcia-Castro et al. 2005].

Bio-ontologies provide a means to facilitate semantic integration. An ontology can be regarded as ‘a type of knowledge base in which concepts and relations are stored’ [Garcia-Castro et al. 2005]. Gene Ontology (GO) [Ashburner et al. 2000] has emerged as a de facto standard molecular biological ontology [Garcia-Castro et al. 2005]. GO captures the function of gene products in terms of their involvement in biological processes, the cellular component they function in, and their molecular function. GO has been used to annotate genes across multiple organisms, and thus can aid in cross-organism semantic integration. In addition to GO, specific genome projects, such as the mouse (Mus musculus) [Blake et al. 2006], fly (family Drosophilidae) [Drysdale et al. 2005] and worm (Caenorhabditis elegans) [Schwarz et al. 2006], have created, or are creating, their own domain ontologies for capturing knowledge within their organism, such as anatomy ontologies and phenotype ontologies (capturing the effects of gene knockout) [Smith et al. 2005].

TAMBIS [Goble et al. 2001] illustrates another use for ontologies: that of facilitating biological query construction. TAMBIS provides users with a conceptual view of the information sources their query will be performed over, while shielding them from the underlying schema of those sources. This allows TAMBIS to present a uniform interface to multiple data sources. The TAMBIS ontology also differs from GO and the domain ontologies in that it represents higher level relations between concepts, beyond the subsumption relations to which the other ontologies limit themselves.

Author Addresses: K O’Neill, NBN Central, Cape Town, South Africa, kieran@nbn.ac.za; A Garcia-Castro, Centro Internacional de Agricultura Tropical, Cali, Colombia; D Jacobson, NBN Central, as above.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, that the copies bear this notice and the full citation on the first page.
© 2007 Proceedings of SABioinf 2007, Pages 111–115

Once queries have been constructed and results returned, biologists still need to make sense of the results.
Often, the research process involves many iterations of querying, analysis and optimisation before the desired result is achieved [Garcia-Castro et al. 2005]. However, the amount of information returned from such queries is usually more than a human being can process mentally at once. It is necessary to provide cognitive support, wherein some of the cognitive load is taken on by the tool, thus making it easier for the user to see previously hidden associations as well as feasible operations [Walenstein 2002]. Information visualisation (IV) techniques can provide this cognitive support, enabling users to more easily make sense of the results of queries, and to construct new, refined ones [Tao et al. 2004].

Information visualisation techniques have been defined as ‘the use of computer-supported, interactive, visual representations of abstract data to amplify cognition’ [Card et al. 1999]. They include such techniques as providing overviews of data sets to users, enabling users to control how much and what information is displayed, keeping a history to enable users to retrace their steps and explore, enabling users to view relationships between the data displayed and other data, and allowing users to extract their data into a form they can use with other software, or pass on to their peers [Shneiderman 1996]. In this paper, these tasks will be examined in the context of building complex biological queries.

Architecture

DigraBase (DGB) is a graph database under development by the central node of the National Bioinformatics Network (of South Africa) [Otgaar et al. 2006]. This system allows the execution of boolean declarative queries over data loaded into it from multiple sources. This gives biologists the ability to execute complex biological queries, wherein subsets of biological objects, such as genes, are found, based on common properties, such as function.
Since these queries can be formulated across multiple data sets, biologists can find relationships between objects that would not be obvious from looking at a single database. DGB uses a custom query language, DGBQL, and has a command line interface. However, the query language is complex, and learning new languages, as well as using a command line interface, is difficult for most biologists [Letondal 2001a; 2001b]. A simple, graphical interface is needed to enable users to easily construct queries and interpret the results, while shielding them from the complexities of the underlying software.

Digr is a proposed frontend to DGB, intended to fill that need. It is being constructed according to the Model View Controller (MVC) architectural design pattern. The model component is responsible for interaction with DGB via DGBQL and for presenting results in a format usable by the view component. The view component takes these results and presents them visually to the user. The controller component accepts commands from the user, made via the user interface, and sends them to the model. Thus, the model can easily be altered if the query language changes, and different versions can be implemented to handle the different methods by which DGBQL will be transmitted, such as CORBA and XML-RPC. The visual component of Digr can also be altered or replaced without affecting the rest of the system. Finally, the model/controller parts of the system can be provided as a library for other software to use.

A Motivating Scenario

An example of a complex query is shown below, and illustrated in Figure 1. Phrases which can be represented by ontology terms, as well as their contexts, are highlighted in bold.
The desired data to be retrieved (human genes) is italicised:

‘Retrieve all human genes that are normally expressed in the brain and are associated with poor memory in mice and have a role in fatty acid metabolism.’

This query contains three ontology terms from different ontologies. ‘Poor memory in mice’ is represented by the Mammalian Phenotype Ontology (MPO) term ‘abnormal learning/memory’ [Smith et al. 2005]. ‘Fatty acid metabolism’ is represented by the GO term ‘fatty acid metabolism’, part of the GO biological process ontology [Ashburner et al. 2000]. The query can be satisfied by finding genes directly annotated with one of these terms, or with one of their child terms (specialisations). For instance, ‘linoleic acid metabolism’ is a type of fatty acid metabolism, so genes annotated with ‘linoleic acid metabolism’ are also transitively annotated with ‘fatty acid metabolism’. ‘Normally expressed in the brain’ can be fulfilled by finding genes expressed in expressed sequence tag (EST) libraries made from tissue extracted from the brain (represented by annotation of the library with the anatomy ontology term ‘brain’). This constraint is more complex, requiring a transition over an intermediary layer (EST libraries) between the ontology and genes.

For associations between genes and these ontology terms, several data sources are available. The Institute for Genomic Research (TIGR) Gene Indices database contains expression data for human genes [Lee et al. 2005].
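The way the example query resolves can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not DGBQL and not the DigraBase implementation: the gene identifiers, library names, the child terms other than ‘linoleic acid metabolism’, and the helper names `descendants`, `annotated` and `run_query` are all hypothetical. It shows the two ideas the scenario relies on: annotations propagate transitively from child terms to parent terms, and the ‘expressed in the brain’ constraint passes through an intermediary layer of EST libraries before reaching genes.

```python
from collections import deque


def descendants(children, term):
    """Return the term itself plus every term reachable through child links."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in seen:  # a DAG node may be reachable via several parents
                seen.add(child)
                queue.append(child)
    return seen


def annotated(children, annotations, term):
    """Objects annotated with the term or, transitively, any of its specialisations."""
    found = set()
    for t in descendants(children, term):
        found |= annotations.get(t, set())
    return found


# Toy data, invented for illustration only (parent term -> child terms).
GO_CHILDREN = {"fatty acid metabolism": ["linoleic acid metabolism"]}
MPO_CHILDREN = {"abnormal learning/memory": ["abnormal spatial learning"]}
ANATOMY_CHILDREN = {"brain": ["hippocampus"]}

# Hypothetical annotations (term -> annotated objects).
GO_HUMAN_GENES = {
    "fatty acid metabolism": {"HUMAN_B", "HUMAN_C"},
    "linoleic acid metabolism": {"HUMAN_A"},
}
MPO_MOUSE_GENES = {
    "abnormal learning/memory": {"mouse_c"},
    "abnormal spatial learning": {"mouse_a"},
}
ANATOMY_EST_LIBRARIES = {"brain": {"lib2"}, "hippocampus": {"lib1"}, "liver": {"lib3"}}

# Hypothetical EST library contents and mouse-to-human orthology mappings.
LIBRARY_GENES = {"lib1": {"HUMAN_A"}, "lib2": {"HUMAN_B"}, "lib3": {"HUMAN_C"}}
ORTHOLOGS = {"mouse_a": "HUMAN_A", "mouse_c": "HUMAN_C"}


def run_query():
    """Human genes satisfying all three constraints of the example query."""
    fatty_acid = annotated(GO_CHILDREN, GO_HUMAN_GENES, "fatty acid metabolism")
    mouse_memory = annotated(MPO_CHILDREN, MPO_MOUSE_GENES, "abnormal learning/memory")
    # Map mouse genes to human genes through the orthology table.
    memory = {ORTHOLOGS[g] for g in mouse_memory if g in ORTHOLOGS}
    # Intermediary layer: anatomy term -> EST libraries -> genes.
    brain_libraries = annotated(ANATOMY_CHILDREN, ANATOMY_EST_LIBRARIES, "brain")
    brain = set().union(*(LIBRARY_GENES.get(lib, set()) for lib in brain_libraries))
    # The query is a conjunction, so the answer is the intersection.
    return fatty_acid & memory & brain


print(run_query())
```

With this toy data, only one gene satisfies all three constraints, and it does so entirely through child terms: it is annotated with ‘linoleic acid metabolism’ rather than ‘fatty acid metabolism’, its mouse ortholog carries a specialisation of ‘abnormal learning/memory’, and it is expressed in a library annotated with a child of ‘brain’.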
Figure 1. An illustration of the example query within the context of its information space. The three ontologies from which the terms are taken are shown as three-dimensional boxes. Within these, the terms chosen, as well as their immediate parents, and a few of their immediate child terms, are shown. Large arrows show the connection between the chosen terms embodied by the query.

The EMBL/DDBJ/GenBank nucleotide database [Kanz et al. 2005] is cross-referenced with the GO Annotations database [Camon et al. 2004]. The Mouse Genome database (MGDB) is annotated with both GO and MPO [Blake et al. 2006]. Additionally, MGDB has orthology mappings between mouse and human genes, which can be used to find genes by the terms their orthologs are annotated with.

Visualisation Challenges

When building complex queries, selecting ontology terms and relationships is one of the major bottlenecks. Visualising large ontologies is not easy, and the process grows in complexity when formulating queries that involve more than one ontology. Ontologies are complex directed acyclic graph (DAG) structures, and enabling users to find the terms within them that correspond to the query they have in mind is challenging.

One approach to finding ontology terms is ‘top-down browsing’ through the relationships within the ontology. Ontology browsers, such as AmiGO [Ashburner et al. 2000], and ontology editors, such as DAGedit, accomplish this using a collapsible tree, as used in some file browsers. Disadvantages of this approach are that the number of children shown cannot be controlled, that the overall complexity of the representation can be overwhelming, and that it consumes excessive screen real estate. Another system, Flamenco [Hearst et al. 2002], has been built specifically for complex, multi-ontology queries, and enables top-down browsing of multiple ontologies simultaneously.
Each ontology is displayed in a box, with a ‘breadcrumb trail’ leading to the current term (enabling users to jump back up the hierarchy), and a two-column list of child terms, with controls to expand or contract each box dynamically. In this way, a large space of information is made accessible to the user without overwhelming them or consuming screen space. Digr will use a similar approach, with adaptations for the display of DAGs, as Flamenco was designed for tree-structured ontologies. In this way, users can decide whether more specific or more general terms than the ones they have chosen best capture their query.

Another approach to finding ontology terms is text searching. An example of this is the Ontology Lookup Service (OLS) [Côté et al. 2006], which uses a support vector machine (SVM) approach to enable extremely fuzzy text searching with ranked results. OLS, however, only provides the names of matches for a user to choose from. A richer, more informative view would better assist users in finding terms matching the concept they had in mind. Flamenco also uses text searching, displaying results grouped by ontology terms, with options for choosing different ontologies to group by. Digr will attempt to use the fuzzy search capabilities of OLS, but integrated
with the top-down components of the interface to provide a richer view of results.

Another useful technique in complex query construction is the provision of dynamic query previews, as carried out in Flamenco [Hearst et al. 2002]. As the query is constructed, only the terms which can be combined with those already chosen are offered as choices to refine the query with. This reduces the complexity of finding terms, and can help to show implicit relationships between them. In addition, the size of the result set to be returned is displayed next to each combinable term, thus providing users with additional information to aid query construction. Digr will attempt to provide both of these facilities.

As an interface for complex query building, Digr aims to integrate text searching and top-down browsing in a single interface, with dynamic query previewing. In addition, results will be visually clustered according to their annotations to different ontologies, and all user actions will be undoable via a history-keeping mechanism. In these ways, Digr hopes to visually aid biologists in constructing complex biological queries.

REFERENCES

Ashburner, M., Ball, C., Blake, J., Botstein, D., Butler, H., Cherry, J., Davis, A., Dolinski, K., Dwight, S., Eppig, J., Harris, M., Hill, D., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J., Richardson, J., Ringwald, M., Rubin, G., and Sherlock, G. 2000. Gene Ontology: tool for the unification of biology. Nat Genet. 25(1), 25–29.
Blake, J., Eppig, J., Bult, C., Kadin, J., Richardson, J., et al. 2006. The Mouse Genome Database (MGD): updates and enhancements. Nucleic Acids Research 34, Database Issue.
Camon, E. et al. 2004. The Gene Ontology Annotation (GOA) database: sharing knowledge in UniProt with Gene Ontology. Nucleic Acids Research 32, 90001, 262–266.
Card, S., Mackinlay, J., and Shneiderman, B. 1999. Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann.
Côté, R., Jones, P., Apweiler, R., and Hermjakob, H. 2006. The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries. BMC Bioinformatics 2006, 7, 97.
Drysdale, R., Crosby, M., Gelbart, W., Campbell, K., Emmert, D., et al. 2005. FlyBase: genes and gene models. Nucleic Acids Res 33, 390–395.
Etzold, T. and Argos, P. 1992. SRS - an indexing and retrieval tool for flat file data libraries. Bioinformatics 9, 49–57.
Galperin, M. Y. 2006. The molecular biology database collection: 2006 update. Nucl. Acids. Res. 34, D3–D5.
Garcia-Castro, A., Chen, Y., and Ragan, M. 2005. Information integration in molecular bioscience. Appl Bioinformatics 4, 3, 157–173.
Garcia-Castro, A., Thoraval, S., Garcia, L., and Ragan, M. 2005. Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator. BMC Bioinformatics 6, 87.
Goble, C., Stevens, R., Ng, G., Bechhofer, S., Paton, N., Baker, P., Peim, M., and Brass, A. 2001. Transparent access to multiple bioinformatics information sources. IBM Systems Journal 40, 2, 532–551.
Hearst, M., Elliott, A., English, J., Sinha, R., Swearingen, K., and Yee, K. 2002. Finding the flow in web site search. Communications of the ACM 45, 9, 42–49.
Kanz, C., Aldebert, P., Althorpe, N., Baker, W., Baldwin, A., Bates, K., Browne, P., van den Broek, A., Castro, M., Cochrane, G., et al. 2005. The EMBL nucleotide sequence database. Nucleic Acids Res 33, 167–172.
Lee, Y., Tsai, J., Sunkara, S., Karamycheva, S., Pertea, G., Sultana, R., Antonescu, V., Chan, A., Cheung, F., and Quackenbush, J. 2005. The TIGR gene indices: clustering and assembling EST and known genes and integration with eukaryotic genomes. Nucleic Acids Res 33, 71–74.
Letondal, C. 2001a. Interaction et programmation - conception d’applications programmables avec des non-informaticiens. Ph.D. thesis, Université de Paris-Sud.
Letondal, C. 2001b. A web interface generator for molecular biology programs in Unix. Bioinformatics 17, 1, 73–82.
Otgaar, D., Dominy, D., Maclear, A., Gamieldien, J., Martinez, F., and Jacobson, D. 2006. DigraBase: A graph-theoretic framework for semantic integration of biological data. Poster, Joint BioLINK and 9th Bio-Ontologies Meeting.
Schuler, G., Epstein, J., Ohkawa, H., and Kans, J. 1996. Entrez: molecular biology database and retrieval system. Methods Enzymol 266, 141–62.
Schwarz, E., Antoshechkin, I., Bastiani, C., Bieri, T., Blasiar, D., Canaran, P., Chan, J., Chen, N., Chen, W., Davis, P., et al. 2006. WormBase: better software, richer content. Nucleic Acids Research.
Searls, D. 2005. Data integration: challenges for drug discovery. Nat Rev Drug Discov 4, 1, 45–58.
Shneiderman, B. 1996. The eyes have it: a task by data type taxonomy for information visualizations. Visual Languages, 1996. Proceedings., IEEE Symposium on, 336–343.
Smith, C., Goldsmith, C., and Eppig, J. 2005. The Mammalian Phenotype Ontology as a tool for annotating, analyzing and comparing phenotypic information. Genome Biol 6, 1, R7.
Tao, Y., Liu, Y., Friedman, C., and Lussier, Y. 2004. Information visualization techniques in bioinformatics during the postgenomic era. Drug Discovery Today BIOSILICO 2, 237–245.
Walenstein, A. 2002. Cognitive support in software engineering tools: A distributed cognition framework. Ph.D. thesis, Simon Fraser University.
Wong, L. 2002. Technologies for integrating biological data. Briefings in Bioinformatics 3, 4, 389–404.