Standards and Ontologies for
                             Functional Genomics
                                   Conference
                              October 23-26, 2004
                        University of Pennsylvania School
                                    of Medicine




EXPERIENCES IN BUILDING AN ONTOLOGY-DRIVEN
IMAGE DATABASE FOR BIOLOGISTS


Chris Catton
Image BioInformatics Laboratory
Department of Zoology
University of Oxford, UK

e-mail: chris.catton@zoo.ox.ac.uk
Outline

•   Why are images important?
•   What is the BioImage database?
•   Why use a semantic web architecture?
•   Lessons and research questions
Why are biological images important in the
                post-genomic age?
•   Images are semantic instruments for capturing aspects of the real
    world, and form a vital part of the scientific record, for which words are
    no substitute
•   In the post-genomic world, attention is now focused on the organization
    and integration of information within cells, for functional analyses of
    gene products
•   In a month a single active cell biology lab may generate between 10
    and 100 Gbytes of multidimensional image data
Images are complex …

•   An image database must be
    able to store original images in
    any digital format currently
    available or yet to be invented,
    including multi-channel 3D
    images, multi-channel videos,
    etc.
The need for image databases
•   The value of digital image information depends upon how easily it
    can be located, searched for relevance, and retrieved
•   Detailed descriptive metadata about the images are essential
•   Without them, digital image repositories become little more than
    meaningless and costly data graveyards
•   Despite the growth of on-line journals that permit the inclusion of
    media objects, few of these resources are freely available, and those
    that are free are difficult to locate and are not cross-searchable
•   There is thus a need for a free publicly available image database with
    rich well-structured searchable metadata
•   The BioImage Database seeks to fulfil that need
This view has a growing acceptance …
What metadata?

•   Image acquisition (who took the original micrograph, where,
    when, under what conditions, for what purpose, etc.)
•   The media object itself (source and derivation, image type,
    dynamic range, resolution, format, codec, etc.)
•   Denotation of the referent (e.g. the name, age and condition
    of the subject)
•   Connotation of the referent (the image’s interpretation, meaning,
    purpose or significance, its relevance to its creator and others,
    and its semantic relationship to other images)
•   Field aspects of the real world that cannot conveniently be
    attached to any particular object (e.g. variations of illumination
    intensity or chemo-attractant concentration across the field of
    view of a light microscope image)
•   Sequences of change where there is a need to preserve the
    concept of object identity in the face of radical spatio-temporal
    changes in appearance
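The metadata categories listed on this slide could be sketched as a record structure along the following lines. This is only an illustration in Python: the class names, field names and sample values are assumptions made for the sketch, not the actual BioImage schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Acquisition:
    """Who took the original image, where, when, how and why."""
    creator: str
    place: Optional[str] = None
    date: Optional[str] = None
    conditions: Optional[str] = None
    purpose: Optional[str] = None


@dataclass
class MediaObject:
    """Properties of the stored media object itself."""
    image_type: str
    file_format: str
    resolution: Optional[str] = None
    codec: Optional[str] = None
    derived_from: Optional[str] = None  # id of the source image, if any


@dataclass
class Referent:
    """A subject shown in the image: denotation plus connotation."""
    identity: str                          # stable id, kept across changes in appearance
    name: str                              # denotation
    age: Optional[str] = None
    condition: Optional[str] = None
    interpretation: Optional[str] = None   # connotation


@dataclass
class ImageRecord:
    image_id: str
    acquisition: Acquisition
    media: MediaObject
    referents: List[Referent] = field(default_factory=list)
    # Field aspects describe the whole field of view, not one object.
    field_aspects: Dict[str, str] = field(default_factory=dict)
    related_images: List[str] = field(default_factory=list)


def same_objects(a: ImageRecord, b: ImageRecord) -> Set[str]:
    """Identities present in both records, e.g. two frames of a time series."""
    return {r.identity for r in a.referents} & {r.identity for r in b.referents}


# Two hypothetical frames showing the same cell at different stages.
frame1 = ImageRecord(
    "img-001",
    Acquisition(creator="A. Researcher"),
    MediaObject(image_type="light micrograph", file_format="TIFF"),
    referents=[Referent("cell-7", "dividing cell", condition="metaphase")],
    field_aspects={"illumination": "brighter towards the left edge"},
)
frame2 = ImageRecord(
    "img-002",
    Acquisition(creator="A. Researcher"),
    MediaObject(image_type="light micrograph", file_format="TIFF",
                derived_from="img-001"),
    referents=[Referent("cell-7", "dividing cell", condition="anaphase")],
)
print(same_objects(frame1, frame2))
```

Keeping a stable `identity` separate from the descriptive fields is one way to preserve object identity when the appearance of the subject changes radically between images.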
Why use a semantic web architecture?

•   Traditional relational databases don’t meet our needs
     • Image data is complex, layered, and difficult to model
     • Images are searched primarily through their metadata
     • Metadata is time-consuming and difficult to obtain
•   Ontologies offer the promise of better retrieval accuracy through
    linking to instances in an ontology, rather than attempting to
    process free text.
•   Ontologies offer the promise of easy inter-operability with other
    systems
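The retrieval argument above can be illustrated with a toy comparison between free-text matching and lookup through ontology instances. The term identifiers, captions and annotations below are invented for the sketch, and the plain Python dictionaries stand in for a real ontology and triple store.

```python
from typing import Dict, List, Optional, Set

# Hypothetical ontology terms: term id -> names that refer to the term.
TERM_NAMES: Dict[str, Set[str]] = {
    "term:0001": {"mitochondrion", "mitochondria"},
    "term:0002": {"nucleus", "nuclei"},
}

# Free-text captions as a depositor might write them.
CAPTIONS: Dict[str, str] = {
    "img-1": "TEM of mitochondria in rat liver",
    "img-2": "Isolated nuclei, phase contrast",
    "img-3": "Mitochondrion with visible cristae",
}

# The same images annotated by linking to ontology terms.
ANNOTATIONS: Dict[str, Set[str]] = {
    "img-1": {"term:0001"},
    "img-2": {"term:0002"},
    "img-3": {"term:0001"},
}


def free_text_search(query: str) -> List[str]:
    """Return images whose caption contains the query as a substring."""
    q = query.lower()
    return sorted(i for i, caption in CAPTIONS.items() if q in caption.lower())


def resolve_term(query: str) -> Optional[str]:
    """Map a user's word to an ontology term id, if one matches."""
    q = query.lower()
    for term_id, names in TERM_NAMES.items():
        if q in names:
            return term_id
    return None


def ontology_search(query: str) -> List[str]:
    """Return images annotated with the term the query resolves to."""
    term_id = resolve_term(query)
    if term_id is None:
        return []
    return sorted(i for i, terms in ANNOTATIONS.items() if term_id in terms)


print(free_text_search("mitochondrion"))  # ['img-3']
print(ontology_search("mitochondrion"))   # ['img-1', 'img-3']
```

In this sketch the free-text search misses the caption that uses the plural form, while the ontology lookup returns both images because each is linked to the same term.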
The BioImage Ontology
Lessons learned:
                 Performance, scalability …

•   Database retrieval is slower than a traditional database would
    be
•   Scalability remains to be tested (true for all semantic web
    software)
•   Query languages such as RDQL are immature compared with SQL
•   Parsing RDF is hard and slow (RDF-ABBREV output of the
    Jena parser is unreliable and the unstriped format requires
    multiple passes to create XML that can easily be transformed to
    HTML)
A problem with ontologies?

•   The volume of data generated in the Life Sciences is now
    estimated to be doubling every month

•   Already people look less and less at the raw scientific data
    (unless they are their own results)

•   As this volume of data accumulates, few if any of us will have
    the time or the mental capacity to assimilate new data, structure
    them in a meaningful way and extract information, without first
    processing the data through an ontology or some other similar
    machine-based organisational aid

•   THE ONTOLOGY WILL BE WRONG! (or we should all pack up
    and go home)
Paradigm shifts

•   Our human understanding of an area of science is never static,
    but is constantly being revised by new research
•   Such revisions in understanding are either evolutionary
    (incremental), following the progressive discovery of more and
    more detail, interpreted according to the prevailing paradigm, or
    revolutionary, when the prevailing paradigm is overthrown by
    another
•   How do paradigm revolutions succeed?
       "A new scientific truth does not triumph by convincing its
       opponents and making them see the light, but rather
       because its opponents eventually die, and a new generation
       grows up that is familiar with it"
                                          (Max Planck, 1949)
Factors preventing evolution

•   Ontology builders are ‘monks’ (and nuns) - led by an ‘abbot’, a
    relatively senior domain expert likely to be committed to encapsulating
    the dominant paradigm
•   Substantial problems confront any newcomers wishing to contribute,
    since ontology building is time-consuming and expensive
•   Since an ontology expresses the community consensus, there will be
    massive social pressures against change
•   If large volumes of data have already been encoded using an existing
    ontology, this will make it difficult to introduce change
•   The first ontology in a domain may assume a monopolistic position that
    becomes unassailable, even if it has universally acknowledged
    weaknesses
•   Ontologies are unlikely to evolve in response to the same market
    forces that drive the development of applications software
Encapsulating the dominant paradigm

•   Imagine a section of an ontology describing the development of adult
    mammalian bone marrow and brain, constructed according to the pre-1980
    dominant paradigm that bone marrow develops from mesoderm, while
    brain develops from ectoderm
An example of paradigm evolution

•   Subsequently, adult mouse brain was found to contain haemopoietic stem cells
•   Bartlett (1982) hypothesised that these cells developed from foetal haemopoietic cells that
    entered the brain tissue before the barrier was established




•   This challenge to the dominant paradigm that brain tissues are derived exclusively from
    ectoderm can be accommodated by extending the graph
An example of paradigm revolution

•   More recently, Brazelton et al. (2000) claimed that haemopoietic stem cells from adult
    bone marrow can develop into neural cells in adult mouse brain
•   If true, this result overthrows the paradigm that neuronal cells can only develop from
    embryonic ectoderm, requiring a new ontology incompatible with the old




•   This new ontology is no longer an extension of the previous one, since neural cells no
    longer develop only from foetal neuroepithelium
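The difference between the two cases can be made concrete with a small sketch. It assumes an ontology can be reduced to a set of "develops from" edges plus a set of exclusivity claims; the node names are illustrative stand-ins for the graphs shown on the original slides, and `is_extension` is a hypothetical helper, not part of any ontology toolkit.

```python
from typing import Dict, FrozenSet, NamedTuple, Tuple

Edge = Tuple[str, str]  # (cell or tissue, what it develops from)


class Ontology(NamedTuple):
    develops_from: FrozenSet[Edge]
    only_from: Dict[str, str]  # exclusivity claims: child -> its sole source


def consistent(edges: FrozenSet[Edge], only_from: Dict[str, str]) -> bool:
    """True if no edge contradicts an exclusivity claim."""
    return all(source == only_from[child]
               for child, source in edges if child in only_from)


def is_extension(old: Ontology, new: Ontology) -> bool:
    """True if `new` keeps every old edge and respects every old claim."""
    return (old.develops_from <= new.develops_from
            and consistent(new.develops_from, old.only_from))


# Pre-1980 paradigm (simplified).
pre_1980 = Ontology(
    frozenset({
        ("bone marrow", "mesoderm"),
        ("brain", "ectoderm"),
        ("neural cell", "foetal neuroepithelium"),
        ("foetal neuroepithelium", "ectoderm"),
    }),
    {"neural cell": "foetal neuroepithelium"},
)

# Bartlett (1982): haemopoietic stem cells in brain come from foetal
# haemopoietic cells. New nodes are added; nothing old is contradicted.
bartlett_1982 = Ontology(
    pre_1980.develops_from
    | {("brain haemopoietic stem cell", "foetal haemopoietic cell")},
    pre_1980.only_from,
)

# Brazelton et al. (2000): neural cells may also arise from adult bone marrow
# stem cells, so the exclusivity claim has to be dropped.
brazelton_2000 = Ontology(
    bartlett_1982.develops_from
    | {("neural cell", "adult bone marrow haemopoietic stem cell")},
    {},
)

print(is_extension(pre_1980, bartlett_1982))        # True
print(is_extension(bartlett_1982, brazelton_2000))  # False
```

Under these assumptions the 1982 finding passes the extension check, whereas the 2000 claim fails it because a new edge contradicts the statement that neural cells develop only from foetal neuroepithelium.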
A way forward – using Named Graphs in
               RDF (and OWL?)
•   In response to considerable frustration and confusion within the RDF
    community about the best method of reifying RDF statements, Jeremy
    Carroll et al. proposed an extension to RDF
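As a rough illustration of the idea, a named graph pairs a name with a set of triples, so that the name can itself appear in further statements. The sketch below models this with plain Python tuples rather than an RDF library; the `ex:` names, predicates and provenance values are invented for the example.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]
Quad = Tuple[str, str, str, str]  # (graph name, subject, predicate, object)

QUADS: List[Quad] = [
    # Two competing claims, each held in its own named graph.
    ("ex:paradigm-pre1980", "ex:NeuralCell", "ex:developsFrom",
     "ex:FoetalNeuroepithelium"),
    ("ex:brazelton-2000", "ex:NeuralCell", "ex:developsFrom",
     "ex:AdultBoneMarrowStemCell"),
    # Statements about the graphs themselves: graph names used as subjects.
    ("ex:provenance", "ex:paradigm-pre1980", "ex:status",
     "dominant paradigm before 1980"),
    ("ex:provenance", "ex:brazelton-2000", "ex:assertedBy",
     "Brazelton et al. (2000)"),
]


def triples_in(graph: str) -> Set[Triple]:
    """All triples that belong to the named graph."""
    return {(s, p, o) for g, s, p, o in QUADS if g == graph}


def graphs_stating(subject: str, predicate: str) -> List[str]:
    """Names of the graphs that say anything about subject via predicate."""
    return sorted({g for g, s, p, _ in QUADS if s == subject and p == predicate})


def about_graph(graph: str) -> Set[Tuple[str, str]]:
    """Provenance recorded for a graph, as (predicate, value) pairs."""
    return {(p, o) for s, p, o in triples_in("ex:provenance") if s == graph}


print(graphs_stating("ex:NeuralCell", "ex:developsFrom"))
print(about_graph("ex:brazelton-2000"))
```

Because each claim lives in its own graph, conflicting statements can be stored side by side, and a query can select which graphs to trust instead of requiring one ontology to replace the other.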
Thanks and acknowledgements

•   David Shotton and Simon Sparks
    for BioImage developments
    (http://www.bioimage.org)
•   John Pybus, our computer systems
    manager, for keeping us running in
    spite of the problems
•   Liz Mellings for unbounded
    patience inputting data and testing
•   The European Commission for
    funding the BioImage Project (EC
    IST 5th Framework Contract 2001-
    32688: ORIEL – Online Research
    Information Environment for the Life
    Sciences; http://www.oriel.org)
End

Experiences in building an ontology driven image database for ...

  • 1. Standards and Ontologies for Functional Genomics Conference October 23-26, 2004 University of Pennsylvania School of Medicine EXPERIENCES IN BUILDING AN ONTOLOGY- DRIVEN IMAGE DATABASE FOR BIOLOGISTS Chris Catton Image BioInformatics Laboratory Department of Zoology University of Oxford, UK e-mail: chris.catton@zoo.ox.ac.uk
  • 2. Outline • Why are images important? • What is the BioImage database? • Why use a semantic web architecture? • Lessons and research questions
  • 3. Why are biological images important in the post-genomic age? • Images are semantic instruments for capturing aspects of the real world, and form a vital part of the scientific record, for which words are no substitute • In the post-genomic world, attention is now focused on the organization and integration of information within cells, for functional analyses of gene products • In a month a single active cell biology lab may generate between 10 and 100 Gbytes of multidimensional image data
  • 4. Images are complex … • An image database must be able to store original images in any digital format currently available or yet to be invented, including multi-channel 3D images, multi-channel videos, etc.
  • 5. The need for image databases • The value of digital image information depends upon how easily it can be located, searched for relevance, and retrieved • Detailed descriptive metadata about the images are essential • Without them, digital image repositories become little more than meaningless and costly data graveyards • Despite the growth of on-line journals that permit the inclusion of media objects, few of these resources are freely available, and those that are are difficult to locate and are not cross-searchable • There is thus a need for a free publicly available image database with rich well-structured searchable metadata • The BioImage Database seeks to fulfil that need
  • 6. This view has a growing acceptance …
  • 7. What metadata? • Image acquisition (who took the original micrograph, where, when, under what conditions, for what purpose, etc.) • The media object itself (source and derivation, image type, dynamic range, resolution, format, codec, etc.), • The denotation of the referent (e.g. the name, age and condition of the subject), • Connotation of the referent (the image’s interpretation, meaning, purpose or significance, its relevance to its creator and others, and its semantic relationship to other images). • Field aspects of the real world that cannot conveniently be attached to any particular object (e.g. variations of illumination intensity or chemo-attractant concentration across the field of view of a light microscope image). • Sequences of change where there is a need to preserve the concept of object identity in the face of radical spatio-temporal changes in appearance.
  • 8. Why use a semantic web architecture? • Traditional relational databases don’t meet our needs • Image data is complex, layered, and difficult to model • Images are searched primarily through their metadata • Metadata is time consuming and difficult to obtain • Ontologies offer the promise of better retrieval accuracy through linking to instances in an ontology, rather than attempting to process free text. • Ontologies offer the promise of easy inter-operability with other systems
  • 10. Lessons learned: Performance, scalability … • Retrieval is slower than it would be from a traditional relational database • Scalability remains to be tested (true for all semantic web software) • Query languages (e.g. RDQL) are immature compared with SQL • Parsing RDF is hard and slow (the RDF-ABBREV output of the Jena parser is unreliable, and the unstriped format requires multiple passes to create XML that can easily be transformed to HTML)
  • 11. A problem with ontologies? • The volume of data generated in the Life Sciences is now estimated to be doubling every month • Already people look less and less at the raw scientific data (unless the data are their own results) • As this volume of data accumulates, few if any of us will have the time or the mental capacity to assimilate new data, structure them in a meaningful way and extract information, without first processing the data through an ontology or some other similar machine-based organisational aid • THE ONTOLOGY WILL BE WRONG! (or we should all pack up and go home)
  • 12. Paradigm shifts • Our human understanding of an area of science is never static, but is constantly being revised by new research • Such revisions in understanding are either evolutionary (incremental), following the progressive discovery of more and more detail, interpreted according to the prevailing paradigm, or revolutionary, when the prevailing paradigm is overthrown by another • How do paradigm revolutions succeed? "A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it" (Max Planck, 1949)
  • 13. Factors preventing evolution • Ontology builders are ‘monks’ (and nuns) - led by an ‘abbot’, a relatively senior domain expert likely to be committed to encapsulating the dominant paradigm • Substantial problems confront any newcomers wishing to contribute, since ontology building is time-consuming and expensive • Since an ontology expresses the community consensus, there will be massive social pressures against change • If large volumes of data have already been encoded using an existing ontology, this will make it difficult to introduce change • The first ontology in a domain may assume a monopolistic position that becomes unassailable, even if it has universally acknowledged weaknesses • Ontologies are unlikely to evolve in response to the same market forces that drive the development of applications software
  • 14. Encapsulating the dominant paradigm • Imagine a section of an ontology describing the development of adult mammalian bone marrow and brain, constructed according to the pre-1980 dominant paradigm that bone marrow develops from mesoderm, while brain develops from ectoderm
  • 15. An example of paradigm evolution • Subsequently, adult mouse brain was found to contain haemopoietic stem cells • Bartlett (1982) hypothesised that these cells developed from foetal haemopoietic cells that entered the brain tissue before the blood-brain barrier was established • This challenge to the dominant paradigm that brain tissues are derived exclusively from ectoderm can be accommodated by extending the graph
  • 16. An example of paradigm revolution • More recently, Brazelton et al. (2000) claimed that haemopoietic stem cells from adult bone marrow can develop into neural cells in adult mouse brain • If true, this result overthrows the paradigm that neuronal cells can only develop from embryonic ectoderm, requiring a new ontology incompatible with the old • This new ontology is no longer an extension of the previous one, since neural cells no longer develop only from foetal neuroepithelium
  • 17. A way forward – using Named Graphs in RDF (and OWL?) • In response to considerable frustration and confusion within the RDF community about the best method of reifying RDF statements, Jeremy Carroll et al. proposed an extension to RDF
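The idea behind Named Graphs is that each set of RDF triples is given its own name, so that competing sets of statements can coexist and statements can be made about a graph as a whole. The sketch below imitates this in plain Python with (graph, subject, predicate, object) quads. It is an assumption-laden illustration, not the syntax or API proposed by Carroll et al.: the QuadStore class, the ex: names, and the predicates are all hypothetical, and the two graphs loosely encode the pre-1980 paradigm and the Brazelton et al. (2000) claim from the preceding slides.

```python
from collections import defaultdict


class QuadStore:
    """Toy store of triples grouped under graph names (hypothetical class)."""

    def __init__(self):
        self._graphs = defaultdict(set)  # graph name -> set of (s, p, o)

    def add(self, graph, s, p, o):
        self._graphs[graph].add((s, p, o))

    def triples(self, graph):
        """Return a copy of the triples held in one named graph."""
        return set(self._graphs[graph])

    def graphs_asserting(self, s, p, o):
        """Return the names of all graphs that contain the given triple."""
        return sorted(g for g, ts in self._graphs.items() if (s, p, o) in ts)


store = QuadStore()

# Graph 1: the pre-1980 dominant paradigm.
store.add("ex:pre1980", "ex:BoneMarrow", "ex:developsFrom", "ex:Mesoderm")
store.add("ex:pre1980", "ex:Brain", "ex:developsFrom", "ex:Ectoderm")
store.add("ex:pre1980", "ex:NeuralCell", "ex:developsOnlyFrom",
          "ex:FoetalNeuroepithelium")

# Graph 2: the revised view, in which the "only from" statement is dropped.
store.add("ex:brazelton2000", "ex:BoneMarrow", "ex:developsFrom", "ex:Mesoderm")
store.add("ex:brazelton2000", "ex:Brain", "ex:developsFrom", "ex:Ectoderm")
store.add("ex:brazelton2000", "ex:NeuralCell", "ex:developsFrom",
          "ex:FoetalNeuroepithelium")
store.add("ex:brazelton2000", "ex:NeuralCell", "ex:developsFrom",
          "ex:AdultHaemopoieticStemCell")

# A statement about a graph, kept in its own metadata graph.
store.add("ex:meta", "ex:brazelton2000", "ex:basedOn", "Brazelton et al. (2000)")

old = store.triples("ex:pre1980")
new = store.triples("ex:brazelton2000")

# An evolutionary change would keep every old triple; a revolution does not.
is_extension = old <= new
shared_by = store.graphs_asserting("ex:Brain", "ex:developsFrom", "ex:Ectoderm")
```

Because the old graph is left intact under its own name, data encoded against it remain interpretable, while the subset test shows that the newer graph is a replacement rather than an extension.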
  • 18. Thanks and acknowledgements • David Shotton and Simon Sparks for BioImage developments (http://www.bioimage.org) • John Pybus, our computer systems manager, for keeping us running in spite of the problems • Liz Mellings for unbounded patience inputting data and testing • The European Commission for funding the BioImage Project (EC IST 5th Framework Contract 2001-32688: ORIEL – Online Research Information Environment for the Life Sciences; http://www.oriel.org)
  • 19. End