Welch Wordifier Bosc2009


    Welch Wordifier Bosc2009 - Presentation Transcript

    1. "Junk" DNA Proves to be Highly Valuable¹
      What was once thought of as DNA with zero value in plants--dubbed "junk" DNA--may turn out to be key in helping scientists improve the control of gene expression in transgenic crops.²
      Cooper and collaborators investigated "junk" DNA in the model plant Arabidopsis thaliana, using a computer program to find short segments of DNA that appeared as molecular patterns…These linked patterns are called pyknons…
      This discovery in plants illustrates that the link between coding DNA and junk DNA crosses higher orders of biology and suggests a universal genetic mechanism at play that is not yet fully understood.
      1-Alfredo Flores, June 2, 2009; http://www.ars.usda.gov/is/pr/2009/090602.htm.
      2-Bret Cooper, Soybean Genomics and Improvement Laboratory, Agricultural Research Service, USDA.
    2. “Perhaps it is time to bid farewell to the term ‘junk’ DNA – we knew not your true nature.”
      (Regulatory RNAs and the demise of ‘junk’ DNA. Genome Biology 2006, 7:328)
      The genome
      genes
      Functional elements?
      Functional Elements: 90%?? Junk: 10%??
      "...a certain amount of hubris was required
      for anyone to call any part of the genome 'junk,'
      given our level of ignorance."(Francis Collins, 2006)
    3. Fig. 1. Pyknons in the 3' UTRs of the apoptosis inhibitor birc4 (shown above the horizontal line) and nine other genes
      Rigoutsos, Isidore et al. (2006) Proc. Natl. Acad. Sci. USA 103, 6605-6610
      Copyright ©2006 by the National Academy of Sciences
    4. WordSeeker: A Software Suite for Discovery and Characterization of Genomic Words and Genome-Wide Patterns
    5. www.word-seeker.org
    6. word discovery methods
      sequence-driven (alignment-based)
      pattern-driven (enumerative)
      exhaustive
      optimized
      probabilistic optimization
      deterministic optimization
      YMF
      preprocess
      combine short patterns
      AlignAce
      MEME
      WINNOWER
      heuristic
      exact
      Teiresias, WordSeeker
      suffix tree, Weeder
      GuhaThakurta D., Computational identification of transcriptional regulatory elements in DNA sequence. Nucleic Acids Res. 2006 Jul 19;34(12):3585-98. Print 2006. Review.
      Sandve GK, Drabløs F., A survey of motif discovery methods in an integrated framework. Biol Direct. 2006 Apr 6;1:11.
    7. The WORDIFIER Pattern
      for Functional and Regulatory Genomics
      sequence(s)
      words
      WORDIFIER
      scientist
      scientist
    8. OWEF: An Open Source Word Enumeration Framework for Bioinformatics
      Kyle Kurz, Lonnie R. Welch,
      Frank Drews, Lee Nau,
      Jens Lichtenberg
      Ohio University School of EECS
      Bioinformatics Laboratory
    9. Motivation
      Create a robust Motif Discovery framework using abstracted core algorithms
      Use a modular design, allowing new methods and algorithms to be implemented quickly and easily
      Abstract C++ classes
      Easily extensible
      Support the Scientific Discovery process
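
The modular design described on this slide can be illustrated with a minimal sketch. It is not taken from the OWEF source: the class names `WordEnumerator` and `SlidingWindowEnumerator` and the helper `countDistinctWords` are hypothetical, chosen only to show how an abstract C++ class might let enumeration algorithms be swapped without touching client code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical interface; the real OWEF class hierarchy may look different.
class WordEnumerator {
public:
    virtual ~WordEnumerator() = default;
    // Count every (overlapping) word of length k in `sequence`.
    virtual std::map<std::string, std::size_t>
    enumerate(const std::string& sequence, std::size_t k) const = 0;
};

// One interchangeable implementation: a plain sliding window.
class SlidingWindowEnumerator : public WordEnumerator {
public:
    std::map<std::string, std::size_t>
    enumerate(const std::string& sequence, std::size_t k) const override {
        assert(k > 0 && "word length must be positive");
        std::map<std::string, std::size_t> counts;
        for (std::size_t i = 0; i + k <= sequence.size(); ++i) {
            ++counts[sequence.substr(i, k)];
        }
        return counts;
    }
};

// Client code depends only on the abstract interface, so another enumerator
// (for example one backed by a suffix tree) could be passed in unchanged.
std::size_t countDistinctWords(const WordEnumerator& enumerator,
                               const std::string& sequence, std::size_t k) {
    return enumerator.enumerate(sequence, k).size();
}
```

A caller would construct a `SlidingWindowEnumerator` and pass it to `countDistinctWords` together with a sequence and a word length. Adding a new method then means adding a new subclass rather than editing existing callers.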
    10. Approach
    11. Project Information
      Project:
      http://bio-s1.cs.ohiou.edu/~wordseek/download/
      Open Source License:
      GNU General Public License (GPL v3)
      Language:
      C++
      Applications:
      Currently in final testing phase
      Future Work:
      Will provide backend for WordSeeker tool at Ohio University and Ohio Supercomputer Center
      Will be used to fully analyze the Arabidopsis thaliana genome
    12. Open Source Implementation of Batch Extraction for Coding and Non-Coding Sequences
      Jens Lichtenberg, Lonnie R. Welch
      Bioinformatics Laboratory
      School of EECS
      Ohio University
    13. Motivation
      Regulatory Genomics tools return and operate on lists of Gene Symbols (e.g. STAT5A, Cd59a, Slc35f4)
      To our knowledge, there is no currently supported, open source tool that allows extraction of specific non-coding sequences for any organism
      Ensembl API provides limited functionality
    14. Approach
      connect to Ensembl database
      Input
      Output
      Set up repository
      Retrieve Gene Adaptor
      create gene object
      Gene Symbol
      Retrieve 5’UTR
      Retrieve 3’UTR
      Retrieve Exons
      Retrieve Upstream Adaptor
      Retrieve Introns
      Retrieve Promoter
      Promoter length
      Output Files
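
The slide's tool is written in Perl against the Ensembl API; the sketch below does not call that API. It only illustrates, in C++, the coordinate arithmetic that the "Retrieve Introns" and "Retrieve Promoter" steps imply once gene and exon positions are known. The `Interval` struct, the function names, and the 1-based inclusive coordinate convention are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed convention: 1-based, inclusive genomic coordinates.
struct Interval {
    long start;
    long end;
};

// Introns are taken to be the gaps between consecutive exons of one transcript.
std::vector<Interval> intronsFromExons(std::vector<Interval> exons) {
    std::sort(exons.begin(), exons.end(),
              [](const Interval& a, const Interval& b) { return a.start < b.start; });
    std::vector<Interval> introns;
    for (std::size_t i = 1; i < exons.size(); ++i) {
        // Only emit an intron when there is at least one base between exons.
        if (exons[i].start > exons[i - 1].end + 1) {
            introns.push_back({exons[i - 1].end + 1, exons[i].start - 1});
        }
    }
    return introns;
}

// Promoter: `length` bases immediately upstream of the gene.
// "Upstream" is before the start on the forward strand and after the end on
// the reverse strand. The forward case is clipped at position 1; the reverse
// case is not clipped, because the chromosome length is unknown here.
Interval promoterRegion(const Interval& gene, bool forwardStrand, long length) {
    assert(length > 0 && "promoter length must be positive");
    if (forwardStrand) {
        return {std::max(1L, gene.start - length), gene.start - 1};
    }
    return {gene.end + 1, gene.end + length};
}
```

The promoter length is a parameter here because the slide lists it as an input. For a gene that starts at position 1 on the forward strand, the returned interval is empty (its end is smaller than its start).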
    15. Project Information
      Project:
      http://opensource.msseeker.org
      GNU General Public License (GPL)
      Language:
      Perl
      Integrated in WordSeeker motif discovery tool of Ohio University Bioinformatics Lab
      Future Work:
      Connection to Genbank repository information
      Release into BioPerl or CPAN
    16. Acknowledgements
      Thomas Bitterman, OSC
      Laura Elnitski, NHGRI
      Susan Evans, OU
      Matt Geisler, SIU
      Erich Grotewold, OSU
      Edwin Jacox, NHGRI
      Stephen S. Lee, U. Idaho
      Pooja M. Majmudar, OU
      Paul Morris, BGSU
      Chase Nelson, Oberlin
      Eric Stockinger, OSU
      Sarah Wyatt, OU
      Alper Yilmaz, OSU
      Jeffrey Parvin, OSU
      Kun Huang, OSU
      Thomas Mitchell, OSU
      Kengo Morohashi, OSU
      Rebecca Lamb, OSU
      John Finer, OSU
      • Lonnie Welch
      • Jens Lichtenberg
      • Rami Alouran
      • Frank Drews
      • Kyle Kurz
      • Xiaoyu Liang
      • Lee Nau
      • Matt Wiley
      • Razvan Bunescu
      • Joshua D. Welch
      • Klaus Ecker
      • Mohit Alam
      • Nathaniel George
      • Dazhang Gu
      • Eric Petri
      • Josiah Seaman
      • Kaiyu Shen
      Collaborators
      WordSeeker Team
      Former Members of the team
    17. a pattern “describes a problem which occurs
      over and over again in our environment, and
      then describes the core of the solution to that
      problem, in such a way that you can use the
      solution a million times over, without ever doing
      it the same way twice [1].”
      C. Alexander, S. Ishikawa, and M. Silverstein, A Pattern Language: Towns,
      Buildings, Construction. Oxford University Press, 1977.
    18. Alexander Pattern Format
      Picture – a representative example
      Introductory paragraph - sets the context
      
      Headline - the essence of the problem in one or two sentences.
      Body –
      • empirical background of the pattern
      • evidence for its validity
      • range of different ways the pattern can be manifested
      Solution
      • relationships which are required to solve the stated problem in the stated context.
      • stated in the form of an instruction—so that you know exactly what you need to do, to build the pattern
      Diagram - shows the solution, with labels to indicate its main components
      
      A paragraph which ties the pattern to all those smaller patterns in the language, which are needed to complete this pattern, to embellish it, to fill it out…
    19. Picture, Introduction, Headline
      With the availability of the genomic sequences of
      numerous organisms, life scientists are working in
      conjunction with bioinformaticians to decipher the
      meanings of the genomes. Projects such as the Encyclopedia of
      DNA Elements (ENCODE) [2] and Pyknons [3] seek to
      identify and characterize the functional elements in genomes.
      The functional elements are often referred to as words.
      Given a genomic sequence (or a set of sequences), an important problem
      is the enumeration of all subsequences (words) contained in the sequence
      (or the set of sequences).
      The WORDIFIER Pattern for Functional and Regulatory Genomics
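
The enumeration problem stated on this slide can be sketched directly. This is a brute-force illustration and not the WordSeeker algorithm: the `WordStats` struct, the `enumerateWords` function, and the length bounds `minLen` and `maxLen` are names introduced here. For each word it records the total number of overlapping occurrences and the number of input sequences that contain it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

struct WordStats {
    std::size_t occurrences = 0;  // total overlapping occurrences across all sequences
    std::size_t sequences = 0;    // number of input sequences containing the word
};

// Enumerate every word (contiguous subsequence) whose length lies in
// [minLen, maxLen], over a set of sequences.
std::map<std::string, WordStats>
enumerateWords(const std::vector<std::string>& sequences,
               std::size_t minLen, std::size_t maxLen) {
    assert(minLen > 0 && minLen <= maxLen);
    std::map<std::string, WordStats> stats;
    for (const std::string& seq : sequences) {
        std::set<std::string> seenHere;  // words already credited to this sequence
        for (std::size_t len = minLen; len <= maxLen && len <= seq.size(); ++len) {
            for (std::size_t i = 0; i + len <= seq.size(); ++i) {
                const std::string word = seq.substr(i, len);
                ++stats[word].occurrences;
                if (seenHere.insert(word).second) {
                    ++stats[word].sequences;
                }
            }
        }
    }
    return stats;
}
```

This direct approach stores every distinct word explicitly, so its memory use grows quickly with sequence length and `maxLen`. The suffix tree listed on the word discovery methods slide is one of the data structures used to avoid that cost on genome-scale input.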

    + boscbosc, 4 months ago


    © All Rights Reserved
