This talk summarizes the Gene Wiki, which crowdsources human gene annotation by harnessing the Long Tail of scientists. Seeded with roughly 10,000 gene stubs in Wikipedia, it has reached a critical mass of readers (about 4.3 million views per month) and editors (about 1,000 edits and 10,000 words added per month, 1.42 million words in total). The free text has been mined for structured Gene Ontology and Disease Ontology associations, with an estimated specificity of 48-64% for Gene Ontology and 90-93% for disease associations, based on expert curation. Common sources of error in the mined GO associations are incorrect concept recognition and incorrect interpretation of sentence context.
ISB2012: The Gene Wiki: Crowdsourcing human gene annotation
1. The Gene Wiki: Crowdsourcing human gene annotation
Andrew Su, Ph.D.
Department of Molecular and Experimental Medicine
The Scripps Research Institute
Biocuration 2012
April 2, 2012
2. The Long Tail is a prolific source of content
[Figure: content produced vs. contributors (sorted), with a Short Head and a Long Tail]
Domain: Short Head vs. Long Tail
News: Newspapers vs. Blogs
Video: TV/Hollywood vs. YouTube
Product reviews: Consumer Reports vs. Amazon reviews
Food reviews: Food critics vs. Yelp
Talent judging: Olympics vs. American Idol
Gene annotation: Manual curation vs. Gene Wiki
3. We can harness the Long Tail of scientists to directly participate in the gene annotation process.
5. Wikipedia has breadth and depth
[Figure: articles and words (millions), Wikipedia vs. Britannica Online]
http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons, July 2008
7. Wiki success depends on a positive feedback loop
[Figure: feedback loop linking Gene Wiki page utility, number of contributors, and number of users; example values: contributors 1 → 2, users 100 → 200]
8. 10,000 gene “stubs” within Wikipedia
[Figure: Utility / Users / Contributors feedback loop]
Stub page elements:
Protein structure
Gene summary
Symbols and identifiers
Gene Ontology annotations
Protein interactions
Tissue expression pattern
Linked references
Links to structured databases
Huss, PLoS Biol, 2008
9. Gene Wiki has a critical mass of readers
[Figure: Utility / Users / Contributors feedback loop]
Total: ~4.3 million views / month
Huss, PLoS Biol, 2008; Good, NAR, 2011
10. Gene Wiki has a critical mass of editors
[Figure: Utility / Users / Contributors feedback loop; chart of cumulative edits, split into productive edits and vandalism]
Readers: 4.3 million views / month
Editors: 1000 edits / month; ~10,000 words added / month
Total: 1.42 million words ≈ 230 full-length articles
Good, NAR, 2011
11. A review article for every gene is powerful
Reelin: 68 editors, 543 edits since July 2002
Heparin: 175 editors, 320 edits since June 2003
AMPK: 44 editors, 84 edits since March 2004
RNAi: 232 editors, 708 edits since October 2002
References to the literature
Hyperlinks to related concepts
12. Making the Gene Wiki more computable
Free text → Structured annotations
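13. Filling the gaps in gene annotation
[Figure: a gene (NCBI Entrez Gene: 3362) is tied to its article by the Gene Wiki mapping; a wikilink in that article matches a GO exact synonym for GO:0004993, giving a candidate assertion between the gene and the GO term]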
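14. Filling the gaps in gene annotation
[Figure: a gene (NCBI Entrez Gene: 334) is tied to its article by the Gene Wiki mapping; a wikilink in that article is a GO exact match for GO:0006897, giving a candidate assertion between the gene and the GO term]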
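The wikilink-to-GO step on these two slides can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy label and synonym lists, and the example wikilinks are all hypothetical, and only the two GO identifiers and the Entrez Gene ID come from the slides.

```python
# Toy sketch: turn wikilinks in a gene article into candidate GO assertions.
# The label/synonym lists and the wikilinks are illustrative assumptions,
# not data taken from the Gene Wiki or from a GO release.

def normalize(text):
    """Lowercase and collapse underscores/whitespace so titles and labels compare equal."""
    return " ".join(text.replace("_", " ").lower().split())

def build_index(go_terms):
    """Map each normalized label or exact synonym to its GO identifier."""
    index = {}
    for go_id, names in go_terms.items():
        for name in names:
            index[normalize(name)] = go_id
    return index

def candidate_assertions(gene_id, wikilinks, index):
    """Return (gene_id, go_id, wikilink) triples for links that match a GO name exactly."""
    hits = []
    for link in wikilinks:
        go_id = index.get(normalize(link))
        if go_id is not None:
            hits.append((gene_id, go_id, link))
    return hits

# Hypothetical label and exact-synonym lists keyed by GO identifier.
GO_TERMS = {
    "GO:0006897": ["endocytosis"],
    "GO:0004993": ["serotonin receptor activity", "serotonin receptor"],
}
INDEX = build_index(GO_TERMS)

# Hypothetical wikilinks found in the article mapped to Entrez Gene 334.
links_334 = ["Endocytosis", "Amyloid", "Cell_membrane"]
assertions = candidate_assertions("334", links_334, INDEX)
print(assertions)
```

Exact matching on labels and exact synonyms keeps the sketch simple; anything fuzzier would need its own evaluation.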
15. Disease associations mined from the Gene Wiki
Good, BMC Genomics 2011, 12:603
[Flowchart: Gene Wiki articles (10,271) → filter out seeded text → NCBO Annotator → matched Disease Ontology terms (2983) → compare to DO database → 2147 candidate annotations]
Comparison with the DO database: 23% exact match; 5% match parent; 2% match child; 70% have no match
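The comparison step (exact match, match parent, match child, no match) can be illustrated with a small sketch. It assumes that "match parent" means the mined term is an ancestor of an annotation already in the database and "match child" means it is a descendant; the slide does not spell out the direction. The ontology fragment and term identifiers below are made up.

```python
def ancestors(term, parents):
    """All terms reachable from `term` by following is-a links upward."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def classify(candidate, known, parents):
    """Compare one mined term with the terms already annotated to the same gene."""
    if candidate in known:
        return "exact match"
    # Assumption: the candidate is more general than a known annotation.
    if any(candidate in ancestors(k, parents) for k in known):
        return "match parent"
    # Assumption: the candidate is more specific than a known annotation.
    if ancestors(candidate, parents) & set(known):
        return "match child"
    return "no match"

# Hypothetical is-a fragment: child -> list of parents.
PARENTS = {
    "T:cancer": ["T:disease"],
    "T:lung_cancer": ["T:cancer"],
    "T:asthma": ["T:disease"],
}
known_for_gene = {"T:cancer"}

labels = {t: classify(t, known_for_gene, PARENTS)
          for t in ["T:cancer", "T:disease", "T:lung_cancer", "T:asthma"]}
print(labels)
```

Terms that land in "no match" are the ones that would go on to expert curation as potentially novel annotations.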
16. Disease associations mined from the Gene Wiki
Good, BMC Genomics 2011, 12:603
Expert curation
Correct: 86%
Incorrect: 10%
Maybe: 4%
Overall specificity: 90-93%
17. GO associations mined from the Gene Wiki
Good, BMC Genomics 2011, 12:603
[Flowchart: Gene Wiki articles (10,271) → filter out seeded text → NCBO Annotator → matched Gene Ontology terms (11,022) → compare to GO database → 6319 candidate annotations]
Comparison with the GO database: 17% exact match; 26% match parent; 2% match child; 55% have no match
18. GO associations mined from the Gene Wiki
Good, BMC Genomics 2011, 12:603
Expert curation
Correct: 14%
Maybe: 26%
Incorrect: 60%
Overall specificity: 48-64%
19. Common sources of error in GO associations
Good, BMC Genomics 2011, 12:603
1) Incorrect concept recognition
OR2F1: “Olfactory receptors … are responsible for the recognition and G protein-mediated transduction of odorant signals.”
Signal transduction (GO:0007165): The cellular process in which a signal is conveyed to trigger a change in the activity or state of a cell. Signal transduction begins with reception of a signal, e.g. a ligand binding to a receptor or receptor activation by a stimulus such as light, and ends with regulation of a downstream cellular process…
Transduction (GO:0009293): The transfer of genetic information to a bacterium from a bacteriophage or between bacterial or yeast cells mediated by a phage vector.
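A toy dictionary matcher shows how this kind of error can arise. It is a sketch of generic longest-match lookup, not the NCBO Annotator's actual algorithm; the two-entry dictionary and the `recognize` function are assumptions made for illustration. Because the OR2F1 sentence never contains the phrase "signal transduction", the only dictionary entry that fires is the phage-related "transduction".

```python
import re

# Hypothetical mini-dictionary: term name -> GO identifier (IDs as shown on the slide).
DICTIONARY = {
    "signal transduction": "GO:0007165",
    "transduction": "GO:0009293",
}

def recognize(text, dictionary):
    """Longest-match-first dictionary lookup over lowercased text."""
    lowered = text.lower()
    taken = [False] * len(lowered)  # characters already covered by a longer match
    hits = []
    for name in sorted(dictionary, key=len, reverse=True):
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", lowered):
            if not any(taken[m.start():m.end()]):
                hits.append((name, dictionary[name]))
                for i in range(m.start(), m.end()):
                    taken[i] = True
    return hits

sentence = ("Olfactory receptors … are responsible for the recognition and "
            "G protein-mediated transduction of odorant signals.")
wrong = recognize(sentence, DICTIONARY)
right = recognize("Odorant binding triggers signal transduction.", DICTIONARY)
print(wrong)
print(right)
```

Preferring the longest match helps only when the longer phrase is literally present; here the intended sense has to be inferred from context, which plain string matching cannot do.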
20. Common sources of error in GO associations
Good, BMC Genomics 2011, 12:603
2) Incorrect sentence context
MEF2C: “Several post translational modifications have been identified including phosphorylation on serine-59 …”
[Figure: MEF2C linked to candidate GO terms: Dephosphorylation, Excretion, Phosphorylation, Gene expression, Glycosylation, Localization, Neurogenesis, Methylation, Proteolysis, Secretion, Transport, Myelination, Transcription, Translation]
21. Novel GO annotations – so what?
11,022 annotations mined from Gene Wiki
4703 (43%) match known annotations
6319 “novel” annotations @ 48-64% specificity
~100,000 annotations from GO consortium
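The counts on this slide are internally consistent, which a few lines of arithmetic confirm. The last two values are an interpretation rather than something the slide states: they assume the 48-64% specificity range applies to the novel set.

```python
mined = 11_022          # annotations mined from the Gene Wiki
matched_known = 4_703   # annotations that match known GO annotations

novel = mined - matched_known
share_matched = round(100 * matched_known / mined)

# Assumption: 48-64% specificity is read as the fraction of novel annotations that are correct.
expected_correct_low = round(novel * 0.48)
expected_correct_high = round(novel * 0.64)

print(novel, share_matched, expected_correct_low, expected_correct_high)
```

Under that reading, the novel set would contribute a few thousand correct annotations next to the roughly 100,000 from the GO consortium.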
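22. Gene Wiki content improves enrichment analysis
[Workflow: GO term axon guidance (GO:0007411) → gene list (264 genes) → PubMed abstracts (811 articles) → concept recognition → linked genes → enrichment analysis]
Contingency table (genes):
Linked through PubMed = Yes: 13 in GO:0007411, 2 not in GO:0007411
Linked through PubMed = No: 251 in GO:0007411, 12033 not in GO:0007411
P = 1.55 E-20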
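The slide does not name the statistical test behind the p-value. One common choice for a 2×2 table like this is a one-sided Fisher's exact test, which is the upper tail of a hypergeometric distribution; the sketch below assumes that choice and uses only the four counts from the slide. The function name and parameter names are made up for illustration.

```python
from math import comb

def hypergeom_upper_tail(overlap, linked, in_term, total):
    """P(X >= overlap) when `linked` genes are drawn from `total`, of which `in_term` carry the GO term."""
    top = min(linked, in_term)
    numerator = sum(comb(in_term, k) * comb(total - in_term, linked - k)
                    for k in range(overlap, top + 1))
    return numerator / comb(total, linked)

# Counts from the slide's contingency table.
a, b, c, d = 13, 2, 251, 12033   # linked & in term, linked & not, unlinked & in term, unlinked & not
p = hypergeom_upper_tail(overlap=a, linked=a + b, in_term=a + c, total=a + b + c + d)
print(f"{p:.3g}")
```

The slide reports P = 1.55 E-20 for comparison. Integer arithmetic via `math.comb` avoids the underflow that a naive floating-point product of factorials would hit at this scale.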
23. Gene Wiki content improves enrichment analysis
[Workflow: GO term muscle contraction (GO:0006936) → gene list (87 genes) → PubMed abstracts (251 articles) + Gene Wiki (87 articles) → concept recognition → linked genes → enrichment analysis]
GO:0006936 vs. genes linked through PubMed: P = 1.0
GO:0006936 vs. genes linked through PubMed + Gene Wiki: P = 1.22 E-09
24. Gene Wiki content improves enrichment analysis
[Figure: scatter plot of p-value (PubMed + GW) against p-value (PubMed only), with regions labeled "more significant PubMed only" and "more significant PubMed + GW"; the muscle contraction point is highlighted]
25. Challenges and future directions
• How to complement and integrate with traditional biocuration workflows?
• How to disseminate and utilize crowdsourced annotations?
26. The Long Tail of scientists is a valuable source of information on gene function
27. Collaborators
Doug Howe, ZFIN
John Hogenesch, U Penn
Jon Huss, GNF
Luca de Alfaro, UCSC
Angel Pizzaro, U Penn
Faramarz Valafar, SDSU
Pierre Lindenbaum, Fondation Jean Dausset
Michael Martone, Rush
Konrad Koehler, Karo Bio
Warren Kibbe, Simon Lim, Northwestern
Many Wikipedia editors
WP:MCB Project
Group members
Erik Clarke
Ian Macleod
Ben Good (*)
Chunlei Wu
Salvatore Loguercio
See poster # 30 for more on the Gene Wiki and crowdsourcing in biology!
Contact
http://sulab.org
asu@scripps.edu
@andrewsu
+Andrew Su
Funding and Support
(BioGPS: GM83924, Gene Wiki: GM089820)
28. Making the Gene Wiki more reliable
Novartis is a multinational pharmaceutical company based in Basel, Switzerland that manufactures drugs such as clozapine (Clozaril), diclofenac (Voltaren), …
The company name is derived from old Greek, and means "destroyer of birds".
29. Making the Gene Wiki more reliable
High-trust author (36211 total edits): Novartis is a multinational pharmaceutical company based in Basel, Switzerland that manufactures drugs such as clozapine (Clozaril), diclofenac (Voltaren), …
Low-trust author (36 total edits): The company name is derived from old Greek, and means "destroyer of birds".
http://www.wikitrust.net/
Editor's Notes
Relying on the entire community of scientists to digest the biomedical literature: identification, filtering, extraction, summarization
Transduction accounts for 70% of the concept recognition problems
Tried on 773 GO categories, significant in 356 cases (46%)
We extended this analysis to all 773 GO terms used in human gene annotations and found a consistent improvement in the enrichment scores
We started working with Doug Howe because he helped us learn a lot about biocuration, but clearly we’d need to expand partners. In particular, since GO curation seems to be largely drawn by organisms
Also want to convince you that the Long Tail of bioinformatics developers is valuable too, but first have to convince you that there is a bottleneck in tool development.