Knowledge Management in a Knowledge Based Discipline
Robert Stevens
BioHealth Informatics Group
University of Manchester
Robert.Stevens@manchester.ac.uk
Introduction
• How do we do (molecular) biology?
• Managing stamp albums
• A knowledge based discipline
• Representing knowledge computationally
• Ontologies that define what entities are in the
domain
• Describing biological knowledge ontologically
• Using ontologies and is it enough?
Ernest Rutherford
“All science is either physics or stamp collecting”
Image: http://en.wikipedia.org/wiki/File:Ernest_Rutherford2.jpg
Mathematical Sciences
Laws in Biology
Charles Darwin
Image: http://en.wikipedia.org/wiki/File:Charles_Darwin_01.jpg
On The Origin of Species - 1859
Classic and Modern Biology
Genotype Phenotype
Modern biology
Classic biology
Central Dogma
Image: http://cellbio.utmb.edu/CELLBIO/DNA-RNA.jpg
Speed of sequencing
• First human genome
– 10+ years to produce
– Cost $500 million
– Huge international effort
• Now done in 10 weeks
– (for $399)
– http://tinyurl.com/genomecost
– http://www.23andme.com
1000+ databases
• according to Nucleic Acids Research
PubMed: 2 papers per minute
• ~700,000 individual papers
• Grows at 2 papers per minute (see http://blogs.bbsrc.ac.uk for details)
UniProt: A protein database?
ID PRIO_HUMAN STANDARD; PRT; 253 AA.
AC P04156;
DT 01-NOV-1986 (Rel. 03, Created)
DT 01-NOV-1986 (Rel. 03, Last sequence update)
DT 20-AUG-2001 (Rel. 40, Last annotation update)
DE Major prion protein precursor (PrP) (PrP27-30) (PrP33-35C) (ASCR).
GN PRNP.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=86300093; PubMed=3755672;
RA Kretzschmar H.A., Stowring L.E., Westaway D., Stubblebine W.H.,
RA Prusiner S.B., Dearmond S.J.;
RT "Molecular cloning of a human prion protein cDNA.";
RL DNA 5:315-324(1986).
RN [2]
RP SEQUENCE OF 8-253 FROM N.A.
RX MEDLINE=86261778; PubMed=3014653;
RA Liao Y.-C.J., Lebo R.V., Clawson G.A., Smuckler E.A.;
RT "Human prion protein cDNA: molecular cloning, chromosomal mapping,
RT and biological implications.";
RL Science 233:364-367(1986).
RN [3]
RP SEQUENCE OF 58-85 AND 111-150 (VARIANT AMYLOID GSS).
RX MEDLINE=91160504; PubMed=1672107;
RA Tagliavini F., Prelli F., Ghiso J., Bugiani O., Serban D.,
RA Prusiner S.B., Farlow M.R., Ghetti B., Frangione B.;
RT "Amyloid protein of Gerstmann-Straussler-Scheinker disease (Indiana
RT kindred) is an 11 kd fragment of prion protein with an N-terminal
RT glycine at codon 58.";
RL EMBO J. 10:513-519(1991).
RN [4]
RP STRUCTURE BY NMR OF 118-221.
RX MEDLINE=20359708; PubMed=10900000;
RA Calzolai L., Lysek D.A., Guntert P., von Schroetter C., Riek R.,
RA Zahn R., Wuethrich K.;
RT "NMR structures of three single-residue variants of the human prion
RT protein.";
RL Proc. Natl. Acad. Sci. U.S.A. 97:8340-8345(2000).
CC -!- FUNCTION: THE FUNCTION OF PRP IS NOT KNOWN. PRP IS ENCODED IN THE
CC HOST GENOME AND IS EXPRESSED BOTH IN NORMAL AND INFECTED CELLS.
CC -!- SUBUNIT: PRP HAS A TENDENCY TO AGGREGATE YIELDING POLYMERS CALLED
CC "RODS".
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE MEMBRANE BY A GPI-ANCHOR.
CC -!- POLYMORPHISM: THE FIVE TANDEM OCTAPEPTIDE REPEATS REGION IS HIGHLY
CC UNSTABLE. INSERTIONS OR DELETIONS OF OCTAPEPTIDE REPEAT UNITS ARE
CC ASSOCIATED TO PRION DISEASE.
FT SIGNAL 1 22
FT CHAIN 23 230 MAJOR PRION PROTEIN.
FT PROPEP 231 253 REMOVED IN MATURE FORM (BY SIMILARITY).
FT LIPID 230 230 GPI-ANCHOR (BY SIMILARITY).
FT CARBOHYD 181 181 N-LINKED (GLCNAC...) (PROBABLE).
FT DISULFID 179 214 BY SIMILARITY.
FT DOMAIN 51 91 5 X 8 AA TANDEM REPEATS OF P-H-G-G-G-W-G-
FT Q.
FT REPEAT 51 59 1.
FT REPEAT 60 67 2.
FT REPEAT 68 75 3.
FT REPEAT 76 83 4.
FT REPEAT 84 91 5.
FT IN PATIENTS WHO HAVE A PRP MUTATION AT
FT CODON 178: PATIENTS WITH MET DEVELOP FFI,
FT THOSE WITH VAL DEVELOP CJD).
FT /FTId=VAR_006467.
FT VARIANT 171 171 N -> S (IN SCHIZOAFFECTIVE DISORDER).
FT /FTId=VAR_006468.
FT VARIANT 178 178 D -> N (IN FFI AND CJD).
FT /FTId=VAR_006469.
FT VARIANT 180 180 V -> I (IN CJD).
FT /FTId=VAR_006470.
FT VARIANT 183 183 T -> A (IN FAMILIAL SPONGIFORM
FT ENCEPHALOPATHY).
FT /FTId=VAR_006471.
FT VARIANT 187 187 H -> R (IN GSS).
FT /FTId=VAR_008746.
FT VARIANT 188 188 T -> K (IN EOAD; DEMENTIA ASSOCIATED TO
FT PRION DISEASES).
FT /FTId=VAR_008748.
FT VARIANT 188 188 T -> R.
FT /FTId=VAR_008747.
FT VARIANT 196 196 E -> K (IN CJD).
FT /FTId=VAR_008749.
FT /FTId=VAR_006472.
SQ SEQUENCE 253 AA; 27661 MW; 43DB596BAAA66484 CRC64;
MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP PQGGGGWGQP
HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN KPSKPKTNMK HMAGAAAAGA
VVGGLGGYML GSAMSRPIIH FGSDYEDRYY RENMHRYPNQ VYYRPMDEYS NQNNFVHDCV
NITIKQHTVT TTTKGENFTE TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV
ILLISFLIFL IVG
//
CC -!- DISEASE: PRP IS FOUND IN HIGH QUANTITY IN THE
CC BRAIN OF HUMANS AND ANIMALS INFECTED
CC WITH NEURODEGENERATIVE DISEASES KNOWN AS
CC TRANSMISSIBLE SPONGIFORM ENCEPHALOPATHIES OR PRION
CC DISEASES, LIKE: CREUTZFELDT-JAKOB DISEASE (CJD),
CC GERSTMANN-STRAUSSLER SYNDROME (GSS), FATAL
CC FAMILIAL INSOMNIA (FFI) AND KURU IN HUMANS;
CC SCRAPIE IN SHEEP AND GOAT; BOVINE SPONGIFORM
CC ENCEPHALOPATHY (BSE) IN CATTLE; TRANSMISSIBLE
CC MINK ENCEPHALOPATHY (TME); CHRONIC WASTING
CC DISEASE (CWD) OF MULE DEER AND ELK; FELINE
CC SPONGIFORM ENCEPHALOPATHY (FSE) IN CATS AND
CC EXOTIC UNGULATE ENCEPHALOPATHY (EUE) IN
CC NYALA AND GREATER KUDU. THE PRION DISEASES
CC ILLUSTRATE THREE MANIFESTATIONS OF CNS
CC DEGENERATION: (1) INFECTIOUS (2)
CC SPORADIC AND (3) DOMINANTLY INHERITED FORMS.
CC TME, CWD, BSE, FSE, EUE ARE ALL THOUGHT TO
CC OCCUR AFTER CONSUMPTION OF PRION-INFECTED
CC FOODSTUFFS.
DR EMBL; M13667; AAA19664.1; -.
DR EMBL; M13899; AAA60182.1; -.
DR EMBL; D00015; BAA00011.1; -.
DR PIR; A05017; A05017.
DR PIR; A24173; A24173.
DR PIR; S14078; S14078.
DR PDB; 1E1G; 20-JUL-00.
DR PDB; 1E1J; 20-JUL-00.
DR PDB; 1E1P; 20-JUL-00.
DR PDB; 1E1S; 21-JUL-00.
DR PDB; 1E1U; 20-JUL-00.
DR PDB; 1E1W; 20-JUL-00.
DR MIM; 176640; -.
DR MIM; 123400; -.
DR MIM; 137440; -.
DR MIM; 245300; -.
DR MIM; 600072; -.
DR MIM; 604920; -.
DR InterPro; IPR000817; Prion.
DR Pfam; PF00377; prion; 1.
DR PRINTS; PR00341; PRION.
DR SMART; SM00157; PRP; 1.
DR PROSITE; PS00291; PRION_1; 1.
DR PROSITE; PS00706; PRION_2; 1.
KW Prion; Brain; Glycoprotein; GPI-anchor; Repeat; Signal;
KW 3D-structure; Polymorphism; Disease mutation.
What is Knowledge?
• Knowledge – all information
and an understanding to
carry out tasks and to infer
new information
• Information -- data
equipped with meaning
• Data -- un-interpreted
signals that reach our
senses
Michael Ashburner
Professor
University of Cambridge
UK
ISMB
Name
Job
Institution
Country
Conf
man
academic, senior
ancient university, 5 rated
European
important figure in biology
BIOLOGY
A Knowledge Based Discipline
• Rather than laws captured in mathematics….
• We have lots of facts: the discipline’s knowledge
• Rather than “calculating” what a protein does, we
investigate and write it down
• Equivalent to writing down the trajectories of all
thrown objects and not doing ballistics!
• To do biology one needs “the knowledge”
Heterogeneity
• 28 ways to format the representations of a biological
sequence
• Though one way to represent the bases or amino
acids…
• Different words same concept
• Different concepts same words
• Different and implicit data schema
Categories and Category Labels
GO:0000368
U2-type nuclear mRNA 5' splice site recognition
spliceosomal E complex formation
spliceosomal E complex biosynthesis
spliceosomal CC complex formation
U2-type nuclear mRNA 5'-splice site recognition
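One way to picture the split between a category and its labels is a lookup from many labels to a single identifier. The sketch below assumes that all the labels on this slide name the one category GO:0000368; the helper names `normalise`, `build_label_index` and `identify` are hypothetical.

```python
def normalise(label):
    """Lower-case a label and treat hyphens as spaces so near-identical spellings match."""
    return " ".join(label.lower().replace("-", " ").split())

# Labels taken from the slide; the grouping under one identifier is an assumption.
CATEGORY_LABELS = {
    "GO:0000368": [
        "U2-type nuclear mRNA 5' splice site recognition",
        "spliceosomal E complex formation",
        "spliceosomal E complex biosynthesis",
        "spliceosomal CC complex formation",
        "U2-type nuclear mRNA 5'-splice site recognition",
    ],
}

def build_label_index(category_labels):
    """Map every normalised label to the identifier of its category."""
    index = {}
    for identifier, labels in category_labels.items():
        for label in labels:
            index[normalise(label)] = identifier
    return index

INDEX = build_label_index(CATEGORY_LABELS)

def identify(label):
    """Return the category identifier for a label, or None if the label is unknown."""
    return INDEX.get(normalise(label))

print(identify("Spliceosomal E complex formation"))
```

The identifier stays stable while labels come and go, so two resources that use different words can still be seen to mean the same category.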
An Identity Crisis
• Database entries have identifiers unique within their
database
• The type of entity described in an entry doesn’t have
an identifier
• Different entries about the same type talk about it
differently
• How do we know when an entry in one DB talks
about the same thing as another entry in another
DB?
• That’s the skill of a bioinformatician
Why: Society of Biologists
• Doing particle physics necessarily requires central
organisation
• One central place to generate data
• A communitarian attitude
• It is still possible to do biology in the “garden shed”
• Historically less need to organise
• Hence…
Navigating the Web of
Knowledge in Bioinformatics
Biology is Special
• Large quantities of data: No it doesn’t
• Complex data: Yes it does
• Volatile data: Types of data and what is recorded
changes rapidly
• Nothing that special about biology
• …except that it has all the problems, often to a
large degree
Lots of catalogues
Genome
Proteome
Transcriptome
Interactome
Metabolome
PHENOME
Biology now has lots of facts
Creating Woods, not Trees
Genes
Proteins
Pathways
Interactions
Literature
Complex
Machines
Virtual
Organism
…. from biological facts, we make a system that is some model of a real organism
Networks of Chemicals
Image: http://genome-www.stanford.edu/rap_sir/images/Web_FigF_RAP1_glycolysis.gif
Systems within Systems
Image: http://www.ehponline.org/members/2007/10373/fig1.jpg
A Biologist’s Skills
• By the time a biologist has finished a Ph.D. he/she is
about ready for action
• They have a comprehensive knowledge of the facts
of a (narrow) domain
• He/she also knows how to do experimentation in that
domain
• There are so many facts, it is difficult to move outside
one’s sub-discipline
• Yet in a systems view such movement is mandatory
The Role of Knowledge
• A lot of facts
• Perhaps organised into a system
• No equivalent of “laws of mechanics” – we
can’t do this biology with mathematics
• Or at least not without knowing what the
numbers mean...
• This is why we’ve been using ontologies!
What is an Ontology?
• A description of that which exists (in our data)
• What it means to be a member of a category
• What categories of things exist and how do I
recognise that a particular object is a member of a
given category
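Recognising membership can be sketched as checking an object against the conditions that define each category. This is a toy illustration only: the category names, the property names and the `DEFINITIONS` table are all hypothetical, and real ontology languages express such definitions far more richly.

```python
# Hypothetical definitions: each category lists the property values an object must have.
DEFINITIONS = {
    "protein": {"polymer_of": "amino acid"},
    "membrane-anchored protein": {
        "polymer_of": "amino acid",
        "attached_to": "membrane",
    },
    "transcript": {"polymer_of": "ribonucleotide"},
}

def categories_of(obj, definitions=DEFINITIONS):
    """Return, in alphabetical order, every category whose conditions the object satisfies."""
    return sorted(
        name
        for name, conditions in definitions.items()
        if all(obj.get(prop) == value for prop, value in conditions.items())
    )

# A description loosely based on the prion protein entry (253 residues, membrane-attached).
prp = {"polymer_of": "amino acid", "attached_to": "membrane", "length": 253}

print(categories_of(prp))
```

Because the conditions are written down, anyone can see why an object was placed in a category, and changing a definition changes the classification in one place.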
Uses of Ontology in Bioinformatics
Why develop an ontology?
• To make domain assumptions explicit
– Easier to change domain assumptions
– Easier to understand and update legacy data
• To separate domain knowledge from operational knowledge
– Re-use domain and operational knowledge
separately
• A community reference for applications
• To share a consistent understanding of what information means.
History of Bio-ontologies
[Timeline figure. Years shown: 1992, 1996, 1998, 2002, 2005, 2006. Events shown: TAMBIS; MGED; 1st Bio-ontologies meeting; Gene Ontology starts.]
Controlled Vocabulary
• An Ontology isn’t a controlled vocabulary, but can be
used to deliver one
• By agreeing upon the categories in a domain and
agreeing upon their labels we are controlling
vocabulary
• Addresses one major problem in biology
• Also forces examination of definitions
• Makes domain assumptions explicit
Transferring Characteristics
Uncharacterised protein
Tra1 La2 La3
High similarity transfer characteristics
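The idea of transferring characteristics can be sketched as: measure similarity to characterised proteins, and copy their annotations when the similarity is high. This is a rough illustration under strong assumptions: sequences are taken to be already aligned to equal length, the 0.9 threshold is arbitrary, and the function names and example annotations are hypothetical. Real pipelines use alignment tools and much more careful evidence rules.

```python
def identity(seq_a, seq_b):
    """Fraction of matching positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y)
    return matches / len(seq_a)

def transfer_annotations(query, characterised, threshold=0.9):
    """Collect annotations from every characterised sequence at or above the threshold."""
    inherited = set()
    for sequence, terms in characterised:
        if identity(query, sequence) >= threshold:
            inherited |= set(terms)
    return inherited

# Invented example data: short fragments with made-up annotation terms.
characterised = [
    ("MANLGCWMLI", {"membrane"}),
    ("AAAAAAAAAA", {"nucleus"}),
]

print(transfer_annotations("MANLGCWMLV", characterised))
```

The transferred terms are only useful if both proteins are described with the same vocabulary, which is where the controlled vocabulary comes in.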
Post-Genomic Biology
• Fly, mouse, yeast, worm all have their own
terminologies
• I want to compare genomes
• How?
• The genomic sequence is easily dealt with
computationally and comparisons are easy
• This is not true of the annotations or knowledge of
those sequences
• Need a common understanding
Annotation of Data
• Big effort to create controlled vocabularies using
ontologies
• A huge annotation effort – describe the entities in DB
with terms from ontologies
• The Gene Ontology (http://www.geneontology.org)
• The Open Biomedical Ontologies Consortium
Genotype Phenotype
Sequence
Proteins
Gene products Transcript
Pathways
Cell type
BRENDA tissue /
enzyme source
Development
Anatomy
Phenotype
Plasmodium
life cycle
-Sequence types
and features
-Genetic Context
- Molecule role
- Molecular Function
- Biological process
- Cellular component
-Protein covalent bond
-Protein domain
-UniProt taxonomy
-Pathway ontology
-Event (INOH pathway
ontology)
-Systems Biology
-Protein-protein
interaction
-Arabidopsis development
-Cereal plant development
-Plant growth and developmental stage
-C. elegans development
-Drosophila development
-Human developmental anatomy, abstract version
-Human developmental anatomy, timed version
-Mosquito gross anatomy
-Mouse adult gross anatomy
-Mouse gross anatomy and development
-C. elegans gross anatomy
-Arabidopsis gross anatomy
-Cereal plant gross anatomy
-Drosophila gross anatomy
-Dictyostelium discoideum anatomy
-Fungal gross anatomy FAO
-Plant structure
-Maize gross anatomy
-Medaka fish anatomy and development
-Zebrafish anatomy and development
-NCI Thesaurus
-Mouse pathology
-Human disease
-Cereal plant trait
-PATO (attribute and value)
-Mammalian phenotype
-Habronattus courtship
-Loggerhead nesting
-Animal natural history and life history
eVOC (Expressed
Sequence Annotation
for Humans)
The Sequence
Ontology
(http://obo.sf.net)
GO in Analysis
• Microarray analysis was one of the original visions for GO
• Modulated genes cluster around functional
attributes of their proteins
• GO also used in, for example, semantic similarity;
text analysis; etc.
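A very simple form of annotation-based similarity is the overlap between the term sets attached to two genes. The sketch below uses the Jaccard index, which is one of many possible measures and not necessarily the one used in the work the slide refers to; the gene names and term labels are invented, and a fuller treatment would also take the ontology's hierarchy into account.

```python
from collections import Counter

# Hypothetical annotations: gene names and term labels are invented for illustration.
ANNOTATIONS = {
    "gene_a": {"membrane", "protein binding", "signal transduction"},
    "gene_b": {"membrane", "protein binding"},
    "gene_c": {"nucleus", "DNA binding"},
}

def jaccard(terms_a, terms_b):
    """Size of the intersection divided by size of the union of two term sets."""
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)

def shared_terms(genes, annotations=ANNOTATIONS):
    """Count how many genes in a cluster carry each term, most frequent first."""
    counts = Counter(term for gene in genes for term in annotations[gene])
    return counts.most_common()

print(jaccard(ANNOTATIONS["gene_a"], ANNOTATIONS["gene_b"]))
print(shared_terms(["gene_a", "gene_b"]))
```

Terms that recur across a cluster of co-modulated genes suggest what the cluster has in common functionally.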
Fact Management
• When “stamp collecting” we’re collecting facts
• Biology is a fact management activity
• Knowing what these facts mean is very important
• Science is performed on data and the semantics of
data enable us to do science
• Semantic e-Science
Summary
• The nature of modern biology gives it interesting
knowledge (fact) management issues
• It is a knowledge based discipline
• Not unique, but often extreme
• Ontologies seen as one component in management
(but not a panacea)
Acknowledgements
• All these people provided slides and input:
• Duncan Hull
• Simon Jupp
• Phil Lord
• Carole Goble
Genotype to Pathway
Created by Paul Fisher
Pathway to Phenotype
Created by Paul Fisher
Ontology Space
(Axiomatic) Richness
Usage
Representation
Metadata toilet
• Everyone wants to use good metadata but few people want to
spend time curating and cleaning metadata
– Like a clean toilet
Biologists Wake up to Standards

Manassas R - Parkside Middle School 🌎🏫
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaDashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
TOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxTOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptx
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 

Knowledge Management in a Knowledge Based Discipline

  • 1. Knowledge Management in a Knowledge Based Discipline Robert Stevens BioHealth Informatics Group University of Manchester Robert.Stevens@manchester.ac.uk
  • 2. Introduction • How do we do (molecular) biology • Managing stamp albums • A knowledge based discipline • Representing knowledge computationally • Ontologies that define what entities are in the domain • Describing biological knowledge ontologically • Using ontologies and is it enough?
  • 3. Ernest Rutherford “All science is either physics or stamp collecting” Image: http://en.wikipedia.org/wiki/File:Ernest_Rutherford2.jpg
  • 5. Laws in Biology Charles Darwin Image: http://en.wikipedia.org/wiki/File:Charles_Darwin_01.jpg On The Origin of Species - 1859
  • 6. Classic and Modern Biology Genotype Phenotype Modern biology Classic biology
  • 8. Speed of sequencing • First human genome – 10+ years to produce – Cost $500 million – Huge international effort • Now done in 10 weeks – (for $399) – http://tinyurl.com/genomecost – http://www.23andme.com
  • 9. 1000+ databases • according to Nucleic Acids Research
  • 10. PubMed: 2 papers per minute • ~700,000 individual papers • Grows at 2 papers per minute (see http://blogs.bbsrc.ac.uk for details)
  • 11. Uniprot:- A protein database?
      ID PRIO_HUMAN STANDARD; PRT; 253 AA.
      AC P04156;
      DT 01-NOV-1986 (Rel. 03, Created)
      DT 01-NOV-1986 (Rel. 03, Last sequence update)
      DT 20-AUG-2001 (Rel. 40, Last annotation update)
      DE Major prion protein precursor (PrP) (PrP27-30) (PrP33-35C) (ASCR).
      GN PRNP.
      OS Homo sapiens (Human).
      OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
      OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
      OX NCBI_TaxID=9606;
      RN [1]
      RP SEQUENCE FROM N.A.
      RX MEDLINE=86300093; PubMed=3755672;
      RA Kretzschmar H.A., Stowring L.E., Westaway D., Stubblebine W.H.,
      RA Prusiner S.B., Dearmond S.J.;
      RT "Molecular cloning of a human prion protein cDNA.";
      RL DNA 5:315-324(1986).
      RN [2]
      RP SEQUENCE OF 8-253 FROM N.A.
      RX MEDLINE=86261778; PubMed=3014653;
      RA Liao Y.-C.J., Lebo R.V., Clawson G.A., Smuckler E.A.;
      RT "Human prion protein cDNA: molecular cloning, chromosomal mapping,
      RT and biological implications.";
      RL Science 233:364-367(1986).
      RN [3]
      RP SEQUENCE OF 58-85 AND 111-150 (VARIANT AMYLOID GSS).
      RX MEDLINE=91160504; PubMed=1672107;
      RA Tagliavini F., Prelli F., Ghiso J., Bugiani O., Serban D.,
      RA Prusiner S.B., Farlow M.R., Ghetti B., Frangione B.;
      RT "Amyloid protein of Gerstmann-Straussler-Scheinker disease (Indiana
      RT kindred) is an 11 kd fragment of prion protein with an N-terminal
      RT glycine at codon 58.";
      RL EMBO J. 10:513-519(1991).
      RN [4]
      RP STRUCTURE BY NMR OF 118-221.
      RX MEDLINE=20359708; PubMed=10900000;
      RA Calzolai L., Lysek D.A., Guntert P., von Schroetter C., Riek R.,
      RA Zahn R., Wuethrich K.;
      RT "NMR structures of three single-residue variants of the human prion
      RT protein.";
      RL Proc. Natl. Acad. Sci. U.S.A. 97:8340-8345(2000).
      CC -!- FUNCTION: THE FUNCTION OF PRP IS NOT KNOWN. PRP IS ENCODED IN THE
      CC HOST GENOME AND IS EXPRESSED BOTH IN NORMAL AND INFECTED CELLS.
      CC -!- SUBUNIT: PRP HAS A TENDENCY TO AGGREGATE YIELDING POLYMERS CALLED
      CC "RODS".
      CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE MEMBRANE BY A GPI-ANCHOR.
      CC -!- POLYMORPHISM: THE FIVE TANDEM OCTAPEPTIDE REPEATS REGION IS HIGHLY
      CC UNSTABLE. INSERTIONS OR DELETIONS OF OCTAPEPTIDE REPEAT UNITS ARE
      CC ASSOCIATED TO PRION DISEASE.
      FT SIGNAL 1 22
      FT CHAIN 23 230 MAJOR PRION PROTEIN.
      FT PROPEP 231 253 REMOVED IN MATURE FORM (BY SIMILARITY).
      FT LIPID 230 230 GPI-ANCHOR (BY SIMILARITY).
      FT CARBOHYD 181 181 N-LINKED (GLCNAC...) (PROBABLE).
      FT DISULFID 179 214 BY SIMILARITY.
      FT DOMAIN 51 91 5 X 8 AA TANDEM REPEATS OF P-H-G-G-G-W-G-
      FT Q.
      FT REPEAT 51 59 1.
      FT REPEAT 60 67 2.
      FT REPEAT 68 75 3.
      FT REPEAT 76 83 4.
      FT REPEAT 84 91 5.
      FT IN PATIENTS WHO HAVE A PRP MUTATION AT
      FT CODON 178: PATIENTS WITH MET DEVELOP FFI,
      FT THOSE WITH VAL DEVELOP CJD).
      FT /FTId=VAR_006467.
      FT VARIANT 171 171 N -> S (IN SCHIZOAFFECTIVE DISORDER).
      FT /FTId=VAR_006468.
      FT VARIANT 178 178 D -> N (IN FFI AND CJD).
      FT /FTId=VAR_006469.
      FT VARIANT 180 180 V -> I (IN CJD).
      FT /FTId=VAR_006470.
      FT VARIANT 183 183 T -> A (IN FAMILIAL SPONGIFORM
      FT ENCEPHALOPATHY).
      FT /FTId=VAR_006471.
      FT VARIANT 187 187 H -> R (IN GSS).
      FT /FTId=VAR_008746.
      FT VARIANT 188 188 T -> K (IN EOAD; DEMENTIA ASSOCIATED TO
      FT PRION DISEASES).
      FT /FTId=VAR_008748.
      FT VARIANT 188 188 T -> R.
      FT /FTId=VAR_008747.
      FT VARIANT 196 196 E -> K (IN CJD).
      FT /FTId=VAR_008749.
      FT /FTId=VAR_006472.
      SQ SEQUENCE 253 AA; 27661 MW; 43DB596BAAA66484 CRC64;
         MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP PQGGGGWGQP
         HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN KPSKPKTNMK HMAGAAAAGA
         VVGGLGGYML GSAMSRPIIH FGSDYEDRYY RENMHRYPNQ VYYRPMDEYS NQNNFVHDCV
         NITIKQHTVT TTTKGENFTE TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV
         ILLISFLIFL IVG
      //
      CC -!- DISEASE: PRP IS FOUND IN HIGH QUANTITY IN THE
      CC BRAIN OF HUMANS AND ANIMALS INFECTED
      CC WITH NEURODEGENERATIVE DISEASES KNOWN AS
      CC TRANSMISSIBLE SPONGIFORM ENCEPHALOPATHIES OR PRION
      CC DISEASES, LIKE: CREUTZFELDT-JAKOB DISEASE (CJD),
      CC GERSTMANN-STRAUSSLER SYNDROME (GSS), FATAL
      CC FAMILIAL INSOMNIA (FFI) AND KURU IN HUMANS;
      CC SCRAPIE IN SHEEP AND GOAT; BOVINE SPONGIFORM
      CC ENCEPHALOPATHY (BSE) IN CATTLE; TRANSMISSIBLE
      CC MINK ENCEPHALOPATHY (TME); CHRONIC WASTING
      CC DISEASE (CWD) OF MULE DEER AND ELK; FELINE
      CC SPONGIFORM ENCEPHALOPATHY (FSE) IN CATS AND
      CC EXOTIC UNGULATE ENCEPHALOPATHY (EUE) IN
      CC NYALA AND GREATER KUDU. THE PRION DISEASES
      CC ILLUSTRATE THREE MANIFESTATIONS OF CNS
      CC DEGENERATION: (1) INFECTIOUS (2)
      CC SPORADIC AND (3) DOMINANTLY INHERITED FORMS.
      CC TME, CWD, BSE, FSE, EUE ARE ALL THOUGHT TO
      CC OCCUR AFTER CONSUMPTION OF PRION-INFECTED
      CC FOODSTUFFS.
      DR EMBL; M13667; AAA19664.1; -.
      DR EMBL; M13899; AAA60182.1; -.
      DR EMBL; D00015; BAA00011.1; -.
      DR PIR; A05017; A05017.
      DR PIR; A24173; A24173.
      DR PIR; S14078; S14078.
      DR PDB; 1E1G; 20-JUL-00.
      DR PDB; 1E1J; 20-JUL-00.
      DR PDB; 1E1P; 20-JUL-00.
      DR PDB; 1E1S; 21-JUL-00.
      DR PDB; 1E1U; 20-JUL-00.
      DR PDB; 1E1W; 20-JUL-00.
      DR MIM; 176640; -.
      DR MIM; 123400; -.
      DR MIM; 137440; -.
      DR MIM; 245300; -.
      DR MIM; 600072; -.
      DR MIM; 604920; -.
      DR InterPro; IPR000817; Prion.
      DR Pfam; PF00377; prion; 1.
      DR PRINTS; PR00341; PRION.
      DR SMART; SM00157; PRP; 1.
      DR PROSITE; PS00291; PRION_1; 1.
      DR PROSITE; PS00706; PRION_2; 1.
      KW Prion; Brain; Glycoprotein; GPI-anchor; Repeat; Signal;
      KW 3D-structure; Polymorphism; Disease mutation.
  • 12. What is Knowledge? • Knowledge – all information and an understanding to carry out tasks and to infer new information • Information -- data equipped with meaning • Data -- un-interpreted signals that reach our senses • Figure: the values "Michael Ashburner", "Professor", "University of Cambridge", "UK"; the labels Name, Job, Institution, Country; and the readings "man", "academic, senior", "ancient university, 5 rated", "European", "important figure in biology" (side labels: ISMB, Conf, BIOLOGY)
  • 13. A Knowledge Based Discipline • Rather than laws captured in mathematics…. • We have lots of facts: the discipline’s knowledge • Rather than “calculating” what a protein does, we investigate and write it down • Equivalent to writing down the trajectories of all thrown objects and not doing ballistics! • To do biology one needs “the knowledge”
  • 14. Heterogeneity • 28 ways to format the representations of a biological sequence • Though one way to represent the bases or amino acids… • Different words same concept • Different concepts same words • Different and implicit data schema
  • 15. Categories and Category Labels GO:0000368 U2-type nuclear mRNA 5' splice site recognition spliceosomal E complex formation spliceosomal E complex biosynthesis spliceosomal CC complex formation U2-type nuclear mRNA 5'-splice site recognition
  • 16. An Identity Crisis • Database entries have identifiers unique within their database • The type of entity described in an entry doesn’t have an identifier • Different entries about the same type talk about it differently • How do we know when an entry in one DB talks about the same thing as another entry in another DB? • That’s the skill of a bioinformatician
  • 17. Why: Society of Biologists • Doing particle physics necessarily requires central organisation • One central place to generate data • A communitarian attitude • It is still possible to do biology in the “garden shed” • Historically less need to organise • Hence…
  • 18. Navigating the Web of Knowledge in Bioinformatics
  • 19. Biology is Special • Large quantities of data: No it doesn’t • Complex data: Yes it does • Volatile data: Types of data and what is recorded change rapidly • Nothing that special about biology • …except that it has all the problems and often to a large degree
  • 21. Biology now has lots of facts
  • 22. Creating Woods, not Trees Genes Proteins Pathways Interactions Literature Complex Machines Virtual Organism …. from biological facts, we make a system that is some model of a real organism
  • 23. Networks of Chemicals Image: http://genome-www.stanford.edu/rap_sir/images/Web_FigF_RAP1_glycolysis.gif
  • 24. Systems within Systems Image: http://www.ehponline.org/members/2007/10373/fig1.jpg
  • 25. A Biologist’s Skills • By the time a biologist has finished a Ph.D. he/she is about ready for action • They have a comprehensive knowledge of the facts of a (narrow) domain • He/she also knows how to do experimentation in that domain • There are so many facts, it is difficult to move outside one’s sub-discipline • Yet in a systems view such movement is mandatory
  • 26. The Role of Knowledge • A lot of facts • Perhaps organised into a system • No equivalent of “laws of mechanics” – we can’t do this biology with mathematics • Or at least not without knowing what the numbers mean... • This is why we’ve been using ontologies!
  • 27. What is an Ontology? • A description of that which exists (in our data) • What it means to be a member of a category • What categories of things exist and how do I recognise that a particular object is a member of a given category
  • 28. Uses of Ontology in Bioinformatics
  • 29. Why develop an ontology? • To make domain assumptions explicit – Easier to change domain assumptions – Easier to understand and update legacy data • To separate domain knowledge from operational knowledge – Re-use domain and operational knowledge separately • A community reference for applications • To share a consistent understanding of what information means.
  • 30. History of Bio-ontologies 1992 1996 1998 TAMBIS 2002 MGED 2006 1st Bio-ontologies meeting Gene Ontology starts 2005
  • 31. Controlled Vocabulary • An Ontology isn’t a controlled vocabulary, but can be used to deliver one • By agreeing upon the categories in a domain and agreeing upon their labels we are controlling vocabulary • Addresses one major problem in biology • Also forces examination of definitions • Makes domain assumptions explicit
  • 32. Transferring Characteristics Uncharacterised protein Tra1 La2 La3 High similarity transfer characteristics
  • 33. Post-Genomic Biology • Fly, mouse, yeast, worm all have their own terminologies • I want to compare genomes • How? • The genomic sequence is easily dealt with computationally and comparisons are easy • This is not true of the annotations or knowledge of those sequences • Need a common understanding
  • 34. Annotation of Data • Big effort to create controlled vocabularies using ontologies • A huge annotation effort – describe the entities in DB with terms from ontologies • The Gene Ontology (http://www.geneontology.org) • The Open Biomedical Ontologies Consortium
  • 35. Genotype Phenotype Sequence Proteins Gene products Transcript Pathways Cell type BRENDA tissue / enzyme source Development Anatomy Phenotype Plasmodium life cycle -Sequence types and features -Genetic Context - Molecule role - Molecular Function - Biological process - Cellular component -Protein covalent bond -Protein domain -UniProt taxonomy -Pathway ontology -Event (INOH pathway ontology) -Systems Biology -Protein-protein interaction -Arabidopsis development -Cereal plant development -Plant growth and developmental stage -C. elegans development -Drosophila development FBdv fly development.obo OBO yes yes -Human developmental anatomy, abstract version -Human developmental anatomy, timed version -Mosquito gross anatomy -Mouse adult gross anatomy -Mouse gross anatomy and development -C. elegans gross anatomy -Arabidopsis gross anatomy -Cereal plant gross anatomy -Drosophila gross anatomy -Dictyostelium discoideum anatomy -Fungal gross anatomy FAO -Plant structure -Maize gross anatomy -Medaka fish anatomy and development -Zebrafish anatomy and development -NCI Thesaurus -Mouse pathology -Human disease -Cereal plant trait -PATO PATO attribute and value.obo -Mammalian phenotype -Habronattus courtship -Loggerhead nesting -Animal natural history and life history eVOC (Expressed Sequence Annotation for Humans)
  • 38. GO in Analysis • Microarray analysis was one of the original visions for GO • Modulated genes cluster around functional attributes of their proteins • GO is also used in, for example, semantic similarity; text analysis; etc.
  • 39. Fact Management • When “stamp collecting” we’re collecting facts • Biology is a fact management activity • Knowing what these facts mean is very important • Science is performed on data and the semantics of data enable us to do science • Semantic e-Science
  • 40. Summary • The nature of modern biology gives it interesting knowledge (fact) management issues • It is a knowledge based discipline • Not unique, but often extreme • Ontologies seen as one component in management (but not a panacea)
  • 41. Acknowledgements • All these people provided slides and input: • Duncan Hull • Simon Jupp • Phil Lord • Carole Goble
  • 42. Genotype to Pathway Created by Paul Fisher
  • 45. Metadata toilet • Everyone wants to use good metadata but few people want to spend time curating and cleaning metadata – Like a clean toilet
  • 46. Biologists Wake up to Standards
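
Slides 15 and 31 describe one category carrying several labels and an ontology delivering a controlled vocabulary by agreeing on categories and their labels. A minimal sketch of that idea follows, assuming a small in-memory table; the labels and the identifier come from slide 15, while the `normalise` and `resolve` helpers and the normalisation rule are illustrative assumptions, not part of the Gene Ontology tooling.

```python
import re

# Labels taken from slide 15; they all name the same category.
GO_LABELS = {
    "GO:0000368": [
        "U2-type nuclear mRNA 5' splice site recognition",
        "spliceosomal E complex formation",
        "spliceosomal E complex biosynthesis",
        "spliceosomal CC complex formation",
        "U2-type nuclear mRNA 5'-splice site recognition",
    ],
}


def normalise(label: str) -> str:
    """Lower-case, treat hyphens as spaces, collapse runs of whitespace."""
    return re.sub(r"\s+", " ", label.lower().replace("-", " ")).strip()


# Hypothetical in-memory index: normalised label -> category identifier.
LABEL_INDEX = {
    normalise(label): go_id
    for go_id, labels in GO_LABELS.items()
    for label in labels
}


def resolve(label: str):
    """Return the identifier for a label, or None if the label is unknown."""
    return LABEL_INDEX.get(normalise(label))


if __name__ == "__main__":
    print(resolve("Spliceosomal E complex formation"))
    print(resolve("an unknown label"))
```

Different words for the same concept resolve to one identifier, and an unrecognised label returns `None` rather than a guess, which is roughly the behaviour a controlled vocabulary is meant to enforce.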

Editor's Notes

  3. Slide Title: G 2 P Slide contains two semicircles labelled Genotype and Phenotype Text says: Classic Biology; Modern Biology
  5. “Data are the uninterpreted signals that reach our senses every minute in time by the zillions…Information is data equipped with meaning…Knowledge is the whole body of data and information that people bring to bear to practical use in action, in order to carry out tasks and to create new information.”(Schreiber et al. 1998)
  6. Slide Title: Catalogues Stack of books listing: Genome Transcriptome Proteome Interactome Metabolome Phenome
  7. Slide Title: Literature Lots of books in a library
  8. Slide Title Slide contains: Book on the left with a plus sign Black and white image, man sat at an old valve-style computer (i.e. manchester baby) Text saying: genes, proteins, interactions, pathways Mouse on the right Text below images says: (left) Literature (middle) complex machines (right) Organism (bottom) “…. from biological facts, we make a system that is some model of a real thing” - Robert Stevens – 2008
  11. Slide Title: Genotype to Pathway. QTL to Pathway workflow. This workflow: identifies all the genes, and their Ensembl ids, in a QTL region using BioMart; cross-references the gene ids to Entrez and Uniprot ids; Entrez and Uniprot ids then map onto KEGG gene ids; the KEGG gene ids are then used to identify KEGG pathways, including a description and an ID; these lists of descriptions and IDs are then returned to the user.
  12. Slide Title: Pathway to Phenotype. Pathways to PubMed abstracts workflow. This workflow: takes in a list of KEGG pathway descriptions; appends a search string to the end of each description; searches through PubMed using the NCBI eUtils Web Services; for each article found in PubMed, as a PubMed id, an abstract is returned along with the date of publication; these abstracts are then returned to the user as a single file. Those abstracts, coupled with abstracts from the phenotype, provide evidence linking those pathways to the phenotype.
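
The chain of identifier mappings in the QTL to Pathway workflow (note 11) can be sketched as a series of table lookups. This is only an illustration under stated assumptions: every identifier and table below is an invented placeholder, the real workflow calls BioMart and KEGG services rather than in-memory dictionaries, and the UniProt cross-reference step is left out for brevity.

```python
# All identifiers and tables below are invented placeholders, not real
# Ensembl, Entrez, UniProt, or KEGG records.
GENES_IN_QTL = {"qtl-1": ["ens:G1", "ens:G2", "ens:G3"]}
ENSEMBL_TO_ENTREZ = {"ens:G1": "entrez:101", "ens:G2": "entrez:102"}
ENTREZ_TO_KEGG_GENE = {"entrez:101": "kegg:g1", "entrez:102": "kegg:g2"}
KEGG_GENE_TO_PATHWAYS = {
    "kegg:g1": [("path:P1", "toy pathway one")],
    "kegg:g2": [("path:P1", "toy pathway one"), ("path:P2", "toy pathway two")],
}


def qtl_to_pathways(qtl_id):
    """Follow the chain QTL -> genes -> cross-references -> pathways.

    Returns a dict of pathway id -> description, plus the list of genes
    that could not be mapped all the way to a KEGG gene id.
    """
    pathways = {}
    unmapped = []
    for gene in GENES_IN_QTL.get(qtl_id, []):
        entrez = ENSEMBL_TO_ENTREZ.get(gene)
        kegg_gene = ENTREZ_TO_KEGG_GENE.get(entrez)
        if kegg_gene is None:
            # A missing cross-reference at any step breaks the chain.
            unmapped.append(gene)
            continue
        for pathway_id, description in KEGG_GENE_TO_PATHWAYS.get(kegg_gene, []):
            pathways[pathway_id] = description
    return pathways, unmapped


if __name__ == "__main__":
    found, missing = qtl_to_pathways("qtl-1")
    print(found)
    print(missing)
```

Returning the unmapped genes alongside the pathways makes the gaps visible: each hop depends on someone having recorded that an entry in one database describes the same thing as an entry in another.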