The document describes the Gene Wiki, a crowdsourced online portal for annotating human genes. It argues that the "long tail" of scientists can participate directly in gene annotation. The Gene Wiki has grown to 1.42 million words and receives about 4.3 million views per month. Content from the Gene Wiki improves gene enrichment analysis and allows novel Gene Ontology annotations to be mined. Future work aims to integrate the Gene Wiki with other databases to enable dynamic queries across genes, diseases, and SNPs. Crowdsourcing from scientists is positioned as a valuable source of information on gene function.
ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation
1. The Gene Wiki: Crowdsourcing human gene annotation
Andrew Su, Ph.D.
The Scripps Research Institute
ISMB Special Session: Harnessing community intelligence for bioinformatics
#ISMB #SS7
July 17, 2012
2. The Long Tail is a prolific source of content
[Figure: content produced versus contributors (sorted), with the Short Head and Long Tail regions marked]
Domain: Short Head vs. Long Tail
News: Newspapers vs. Blogs
Video: TV/Hollywood vs. YouTube
Product reviews: Consumer reports vs. Amazon reviews
Food reviews: Food critics vs. Yelp
Talent judging: Olympics vs. American Idol
Gene annotation: Manual curation vs. Gene Wiki
3. We can harness the Long Tail of scientists to directly participate in the gene annotation process.
5. Wikipedia has breadth and depth
[Figure: Wikipedia versus Britannica Online, compared by Articles and by Words (millions)]
http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons, July 2008
7. Wiki success depends on a positive feedback
[Figure: cycle linking Gene Wiki page utility, number of contributors, and number of users]
Number of contributors: 1, 2
Number of users: 100, 200
8. 10,000 gene “stubs” within Wikipedia
[Diagram labels: Utility, Users, Contributors]
Figure callouts:
Protein structure
Gene summary
Symbols and identifiers
Gene Ontology annotations
Protein interactions
Tissue expression pattern
Linked references
Links to structured databases
Huss, PLoS Biol, 2008
9. Gene Wiki has a critical mass of readers
[Diagram labels: Utility, Users, Contributors]
Total: ~4.3 million views / month
Huss, PLoS Biol, 2008; Good, NAR, 2011
10. Gene Wiki has a critical mass of editors
[Diagram labels: Utility, Users, Contributors]
~10,000 words added / month
Total 1.42 million words ≈ 230 full-length articles
4.3 million views / month
1000 edits / month
[Figure: cumulative edits, split into productive edits and vandalism]
Good, NAR, 2011
11. A review article for every gene is powerful
Reelin: 98 editors, 703 edits since July 2002
Hyperlinks to related concepts
Heparin: 358 editors, 654 edits since June 2003
AMPK: 109 editors, 203 edits since March 2004
RNAi: 394 editors, 994 edits since October 2002
References to the literature
12. Making the Gene Wiki more computable
Free text → Structured annotations
13. Filling the gaps in gene annotation
Good, BMC Genomics 2011, 12:603
[Figure labels: NCBI Entrez Gene: 3362; Gene Wiki mapping; Wikilink; GO exact synonym; GO:0004993; Annotator; Candidate assertion]
14. Filling the gaps in gene annotation
Good, BMC Genomics 2011, 12:603
[Figure labels: NCBI Entrez Gene: 334; Gene Wiki mapping; Wikilink; GO exact match; GO:0006897; Annotator; Candidate assertion]
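The candidate-assertion idea on the two slides above can be sketched roughly as follows. This is a minimal illustration and not the pipeline from Good et al.: the small `GO_TERMS` table, its label strings, the sample sentence, and the function name `candidate_assertions` are all assumptions made for this example, and plain case-insensitive phrase matching stands in for the Annotator step.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical lookup: GO identifier -> strings treated as exact matches.
# The label text is illustrative and has not been checked against a GO release.
GO_TERMS: Dict[str, List[str]] = {
    "GO:0006897": ["endocytosis"],
    "GO:0004993": ["serotonin receptor activity"],
}


def candidate_assertions(entrez_gene_id: int, article_text: str) -> List[Tuple[int, str]]:
    """Pair the gene with every GO term whose label occurs as a whole phrase in the text."""
    found: List[Tuple[int, str]] = []
    for go_id, labels in GO_TERMS.items():
        for label in labels:
            pattern = r"\b" + re.escape(label) + r"\b"
            if re.search(pattern, article_text, flags=re.IGNORECASE):
                found.append((entrez_gene_id, go_id))
                break  # one hit per GO term is enough
    return found


# Made-up sentence standing in for the text of a Gene Wiki article.
sample_text = "The encoded protein may play a role in endocytosis."
assertions = candidate_assertions(334, sample_text)
print(assertions)
```

Each returned pair is only a candidate: it still has to be compared with existing annotations and reviewed before it counts as a new annotation.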
15. Novel GO annotations – so what?
Good, BMC Genomics 2011, 12:603
11,022 annotations mined from Gene Wiki
4703 (43%) match known annotations
6319 “novel” annotations @ 48-64% specificity
~100,000 annotations from GO consortium
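The counts on this slide are internally consistent: 11,022 mined annotations split into 4703 that match known annotations and 6319 "novel" ones. A quick check, with variable names chosen only for this example:

```python
mined_total = 11_022   # annotations mined from the Gene Wiki
match_known = 4_703    # mined annotations that match known annotations

novel = mined_total - match_known
percent_known = round(100 * match_known / mined_total)

print(novel, percent_known)  # prints: 6319 43
```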
16. Gene Wiki content improves enrichment analysis
[Figure: enrichment analysis workflow for the GO term axon guidance (GO:0007411): gene list (264 genes), PubMed abstracts (811 articles), concept recognition, enrichment analysis]
Contingency table (rows: linked genes through PubMed; columns: GO:0007411)
                 GO:0007411 Yes    GO:0007411 No
Linked Yes             13                2
Linked No             251            12033
P = 1.55 E-20
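The slide does not name the statistical test, but the reported P value is consistent with a one-sided hypergeometric (Fisher-type) test on the 2×2 table. A minimal sketch under that assumption, using only the four counts from the slide; the function name `hypergeom_upper_tail` and the variable names are hypothetical:

```python
from math import comb


def hypergeom_upper_tail(overlap: int, list_size: int, annotated: int, population: int) -> float:
    """P(X >= overlap) when drawing list_size genes from population, annotated of which carry the term."""
    upper = min(list_size, annotated)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(population, list_size)


# Counts from the slide's contingency table.
linked_yes_go_yes = 13
linked_yes_go_no = 2
linked_no_go_yes = 251
linked_no_go_no = 12033

linked_genes = linked_yes_go_yes + linked_yes_go_no       # 15 genes linked through PubMed
go_genes = linked_yes_go_yes + linked_no_go_yes           # 264 genes annotated with GO:0007411
all_genes = linked_genes + linked_no_go_yes + linked_no_go_no

p_value = hypergeom_upper_tail(linked_yes_go_yes, linked_genes, go_genes, all_genes)
print(f"{p_value:.2e}")
```

Exact integer arithmetic via `math.comb` avoids underflow for a tail this small. Where SciPy is available, `scipy.stats.fisher_exact` with `alternative="greater"` on the same table should give the same tail probability.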
17. Gene Wiki content improves enrichment analysis
[Figure: enrichment analysis workflow for the GO term muscle contraction (GO:0006936): gene list (87 genes), PubMed abstracts (251 articles) + Gene Wiki (87 articles), concept recognition, enrichment analysis]
GO:0006936, linked genes through PubMed: P = 1.0
GO:0006936, linked genes through PubMed + Gene Wiki: P = 1.22 E-09
18. Gene Wiki content improves enrichment analysis
[Figure: plot of p-value (PubMed + GW) against p-value (PubMed only), with regions marked "More significant with PubMed only" and "More significant with PubMed + GW"; the muscle contraction term is labeled]
19. Gene Wiki+ for integrative queries
mwsync
http://genewikiplus.org
27. Collaborators
Doug Howe, ZFIN
John Hogenesch, U Penn
Jon Huss, GNF
Luca de Alfaro, UCSC
Angel Pizzaro, U Penn
Faramarz Valafar, SDSU
Pierre Lindenbaum, Fondation Jean Dausset
Michael Martone, Rush
Konrad Koehler, Karo Bio
Warren Kibbe, Simon Lim, Northwestern
Many Wikipedia editors
WP:MCB Project
Group members
Erik Clarke
Ian Macleod
Ben Good
Max Nanis
Salvatore Loguercio
Chunlei Wu
ISMB travel support
Contact
http://sulab.org
asu@scripps.edu
@andrewsu
+Andrew Su
Funding and Support
(BioGPS: GM83924, Gene Wiki: GM089820)
Editor's Notes
Relying on the entire community of scientists to digest the biomedical literature: identification, filtering, extraction, summarization
Tried on 773 GO categories, significant in 356 cases (46%)
We extended this analysis to all 773 GO terms used in human gene annotations and found a consistent improvement in the enrichment scores
Also want to convince you that the Long Tail of bioinformatics developers is valuable too, but first have to convince you that there is a bottleneck in tool development.