This document provides a summary of Jennifer Shelton's background and experience in bioinformatics. It outlines her education in biology and post-baccalaureate studies. Her research focuses on de novo genome and transcriptome assembly using next-generation sequencing and BioNano Genomics data. She has extensive experience developing bioinformatics workflows and teaching coding skills through workshops. Currently she is the Bioinformatics Core Outreach Coordinator at Kansas State University where she continues her research and outreach efforts.
Meren's pirate presentation at the STAMPS course, covering the basic concepts most binning algorithms use to bin contigs into genome bins: sequence composition and differential coverage.
PhD Thesis: Mining abstractions in scientific workflows (dgarijo)
Slides of the presentation for my PhD dissertation. I strongly recommend downloading the slides, as they have animations that are easier to see in PowerPoint. The abstract of the thesis is as follows: "Scientific workflows have been adopted in the last decade to represent the computational methods used in in silico scientific experiments and their associated research products. Scientific workflows have demonstrated to be useful for sharing and reproducing scientific experiments, allowing scientists to visualize, debug and save time when re-executing previous work. However, scientific workflows may be difficult to understand and reuse. The large amount of available workflows in repositories, together with their heterogeneity and lack of documentation and usage examples may become an obstacle for a scientist aiming to reuse the work from other scientists. Furthermore, given that it is often possible to implement a method using different algorithms or techniques, seemingly disparate workflows may be related at a higher level of abstraction, based on their common functionality. In this thesis we address the issue of reusability and abstraction by exploring how workflows relate to one another in a workflow repository, mining abstractions that may be helpful for workflow reuse. In order to do so, we propose a simple model for representing and relating workflows and their executions, we analyze the typical common abstractions that can be found in workflow repositories, we explore the current practices of users regarding workflow reuse and we describe a method for discovering useful abstractions for workflows based on existing graph mining techniques. Our results expose the common abstractions and practices of users in terms of workflow reuse, and show how our proposed abstractions have potential to become useful for users designing new workflows".
This is a keynote that I gave at the polyweb workshop on the state of the art of data science reproducibility. In the first part, I review tools that have been developed over the last few years. In the second part, I focus on proposals that I have been involved in to facilitate workflow reproducibility and preservation.
Web Apollo Tutorial for the i5K copepod research community (Monica Munoz-Torres)
Introduction to Web Apollo for the i5K copepod research community. Web Apollo is a genome annotation editor; it provides a web-based environment that allows multiple distributed users to review, edit, and share manual annotations. This presentation includes information specific to the projects of the Global Initiative to sequence the genomes of 5,000 species of arthropods, i5K. Let's get started!
Data analysis & integration challenges in genomics (mikaelhuss)
Presentation given at the Genomics Today and Tomorrow event in Uppsala, Sweden, 19 March 2015. (http://connectuppsala.se/events/genomics-today-and-tomorrow/) Topics include APIs, "querying by data set", machine learning.
Precise elucidation of the many different biological features encoded in any genome requires careful examination and review by researchers, who gather and evaluate the available evidence to corroborate and modify gene predictions and other biological elements. This curation process allows them to resolve discrepancies and validate automated gene model hypotheses and alignments. This approach is the well-established practice for well-known genomes such as human, mouse, zebrafish, Drosophila, et cetera. Desktop Apollo was originally developed to meet these needs.
The cost of sequencing a genome has fallen by several orders of magnitude in the last decade, and as a natural consequence more and more researchers are sequencing more and more new genomes, both within populations and across species. Because individual researchers can now readily sequence many genomes of interest, the need for a universally accessible genomic curation tool logically follows. Each new exome or genome sequenced requires visualization and curation to obtain biologically accurate genomic feature sets, even for a limited set of genes, because computational genome analysis remains an imperfect art. Additionally, unlike earlier genome projects, which had the advantage of more highly polished genomes, recent projects usually have lower coverage. Therefore, researchers now face additional work correcting for more frequent assembly errors and annotating genes split across multiple contigs.
Genome annotation is an inherently collaborative task; researchers only very rarely work in isolation, turning to colleagues for second opinions and insights from those with expertise in particular domains and gene families. The new JavaScript-based Apollo allows researchers real-time interactivity, breaking down large amounts of data into manageable portions to mobilize groups of researchers with shared interests. We are also focused on training the next generation of researchers by reaching out to educators to make these tools available as part of curricula via workshops and webinars, and through widely applied systems such as iPlant and DNA Subway. Here we offer details of our progress.
Presentation at Genome Informatics, Session (3) on Databases, Data Mining, Visualization, Ontologies and Curation.
Authors: Monica C Munoz-Torres, Suzanna E. Lewis, Ian Holmes, Colin Diesh, Deepak Unni, Christine Elsik.
Cross-Kingdom Standards in Genomics, Epigenomics and Metagenomics (Christopher Mason)
Challenges and biases in preparing, characterizing, and sequencing DNA and RNA can have significant impacts on research in genomics across all kingdoms of life, including experiments in single cells, RNA profiling, and metagenomics. Technical artifacts and contaminations can arise at each point of sample manipulation, extraction, sequencing, and analysis. Thus, the measurement and benchmarking of these potential sources of error are of paramount importance as next-generation sequencing (NGS) projects become more global and ubiquitous.
Fortunately, a variety of methods, standards, and technologies have recently emerged that improve measurements in genomics and sequencing, from the initial input material to the computational pipelines that process and annotate the data.
This webinar will review work to develop standards and their applications in genomics, including the ABRF-NGS Phase II NGS Study on DNA Sequencing; the FDA’s Sequencing Quality Control Consortium (SEQC2); metagenomics standards efforts (ABRF, ATCC, Zymo, Metaquins); and the Epigenomics QC group of the SEQC2. The webinar will also review the computational methods for detection, validation, and implementation of these genomic measures.
Towards Reproducible Science: a few building blocks from my personal experience (Oscar Corcho)
Invited keynote given at the Second International Workshop on Semantics for BioDiversity (http://fusion.cs.uni-jena.de/s4biodiv2017/), held in conjunction with ISWC2017 (https://iswc2017.semanticweb.org/)
This talk explores how principles derived from experimental design practice, data and computational models can greatly enhance data quality, data generation, data reporting, data publication and data review.
Enabling Large Scale Sequencing Studies through Science as a Service (Justin Johnson)
Now
“Now” generation sequencing has drastically changed the traditional costs and infrastructure within the sequencing community. There are several technologies, platforms and algorithms that show promise, but it is not always intuitive where to start. This uncertainty is compounded by the fact that commonly used analysis tools are difficult to build, maintain, and run effectively. Sample acquisition and preparation is quickly becoming a bottleneck as projects move from small sample sizes to hundreds or even thousands of samples. We will present case studies highlighting information, methods, challenges and opportunities in leveraging large scale high throughput sequencing and bioinformatics. Specifically we will highlight a recent genome-wide study of methylation patterns in 1575 individuals with Schizophrenia. We will also discuss several cancer transcriptome and exome sequencing projects as well as a human pathogen transcriptome characterization project consisting of multiple organisms and almost a billion reads.
The Future
The Ion Torrent PGM machine is a very promising, rapid throughput, ultra scalable sequencer that could play an integral part in future human health studies. Applications such as microbial whole genome sequencing, metagenomic characterization of environmental and microbiome samples, and targeted resequencing projects stand to benefit from this technology over time. To date we have completed more than 25 runs on a single PGM and will comment on the setup as well as sequence data and analysis.
Excited to share our vision for bioinformatics education available for students and researchers that want to apply advanced multi-omics integration and machine learning to large biomedical datasets. Practice and learn from real-life projects.
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes (Monica Munoz-Torres)
Precise elucidation of the many different biological features encoded in a genome requires a careful curation process that involves reviewing all available evidence to allow researchers to resolve discrepancies and validate automated gene models, protein alignments, and other biological elements. Genome annotation is an inherently collaborative task; researchers only rarely work in isolation, turning to colleagues for second opinions and insights from those with expertise in particular domains and gene families.
The i5K initiative seeks to sequence the genomes of 5,000 insect and related arthropod species. The selected species include those important to worldwide agriculture, food safety, medicine, and energy production; those used as models in biology; those most abundant in world ecosystems; and representatives of every branch of the insect phylogeny, in an effort to better understand arthropod evolution and phylogeny. Because computational genome analysis remains an imperfect art, each of these newly sequenced genomes will require visualization and curation.
Apollo is an instantaneous, collaborative, genome annotation editor, and the new JavaScript based version allows researchers real-time interactivity, breaking down large amounts of data into manageable portions to mobilize groups of researchers with shared interests. The i5K is a broad and inclusive effort that seeks to involve scientists from around the world in their genome curation process and Apollo is serving as the platform to empower this community. Here we offer details about this collaboration.
Building a flexible infrastructure with Bioclipse, open source, and federated... (Ola Spjuth)
Presentation held at Bio-IT World Expo Europe 2009.
Presenters:
* Ola Spjuth, Dept. Pharmaceutical Biosciences, Uppsala University, Sweden
* Lars Carlsson, Global Safety Assessment, AstraZeneca R&D, Sweden
This presentation is a thorough guide to the use of Web Apollo, with details on User Navigation, Functionality, and the thought process behind manual annotation.
During this workshop, participants:
- Learn to identify homologs of known genes of interest in a newly sequenced genome.
- Become familiar with the environment and functionality of the Web Apollo genome annotation editing tool.
- Learn how to corroborate or modify automatically annotated gene models using available evidence in Web Apollo.
- Understand the process of curation in the context of genome annotation.
Rare Variant Analysis Workflows: Analyzing NGS Data in Large Cohorts (Golden Helix Inc)
Analysis of rare variants for population-level data is becoming a more common component of genomic research. Whether using exome chips, whole-exome sequencing, or even whole-genome sequencing, rare variation analysis requires a unique analytic perspective.
In this presentation, we will review some of the tools available in SVS for large sequenced cohorts including summarization, visualization, and statistical analysis of rare variants using KBAC, CMC, and other methods.
Special attention will be given to useful functions available for download from the SVS scripts repository.
Production Bioinformatics, emphasis on Production (Chris Dwan)
Production bioinformatics at Sema4 can be thought of as data ops - a peer to the lab ops organization. We operate 24/7 to deliver correct and timely results on NGS and other data for thousands of samples per week. This deck introduces the Prod BI organization and systems architecture with a focus on what it takes to run bioinformatics in production rather than for R&D or pure research.
Making Use of NGS Data: From Reads to Trees and Annotations
2015_CV_J_SHELTON_linked
JENNIFER M SHELTON Email: sheltonj@ksu.edu
Kansas State University
Division of Biology
116 Ackert Hall
Manhattan, KS 66506
Research products
Persistent digital research identifier
Open source code
Posters
STATEMENT OF INTEREST
I have chosen to focus on bioinformatics because it is clear that analysis of
large datasets is increasingly important to the biological sciences. I find this
quickly evolving field rich with opportunities to learn new skills and to further
develop those skills by training and collaborating with other scientists.
Working with data from existing and emerging technologies in an open and
collaborative environment is preferred. In my code, I focus on reproducibility by
sharing as much of my code and source data as is appropriate for a project.
I am interested in both optimizing heuristics for informative but speedy initial
data analysis and more open ended exploration of new kinds of data to identify
unanticipated sources of information.
I have worked primarily on de novo assembly and draft assembly
improvement projects for NGS and genome map data. However, I am also very
interested in working on projects involving cancer genomics.
EDUCATION
2012 MS, Biology Kansas State University
2010 post-baccalaureate, science Brooklyn College
2009 post-baccalaureate, science Hunter College
2008 post-baccalaureate, Ecology,
Evolution & Environmental Biology Columbia University
2002 BFA, Cum Laude Maryland Institute, College of Art
POSITIONS/ WORK EXPERIENCE
2012-present Bioinformatics Core Outreach Coordinator, K-INBRE, KSU
2010-2012 Graduate Teaching Assistant, KSU
2009 Intern, New York Botanical Gardens
2008-2009 Intern, American Museum of Natural History
AWARDS
2011 James E. Ackert Award for Outstanding Presentation by a
Graduate Student, KSU
2001 Maryland Institute Achievement Award
1998-2002 Dean’s List, Maryland Institute, College of Art
SUPPORT
2010-2012 Graduate Teaching Assistantship, KSU $25,000/annum
2001 Maryland Institute Achievement Award $2,000
1998-2002 Trustee Award Scholarship $20,000
SKILLS
Software Carpentry trained instructor: SWC is a non-profit organization of
scientists who train other scientists in basic coding skills.
http://software-carpentry.org/pages/team.html
Bioinformatics tools: miraEST, Tophat2, CuffDiff2, Bowtie2, Cufflinks, Velvet,
Oases, Trinity, Prinseq, ABySS, BLAST, BioNano RefAligner, BioNano
Assembler, IrysView, MaSuRCA, Trimmomatic, BWA, SAMtools.
Languages: BASH, Perl, R, SAS, Python, LaTeX
ALGORITHMS AND WORKFLOWS
2015
CleanIllumina: Readme Processes Illumina reads using Trimmomatic's custom
palindrome adapter cleaning and their max information adaptive quality trimmer.
Pipeline includes sample data and a lab tutorial. (Under development)
Sewing Machine: Readme Comprehensive, customizable and robust workflow
that starts with assembled BioNano genome maps, a draft genome FASTA and
in silico labeled maps created from the draft genome FASTA. The pipeline
iteratively runs alignments between the in silico maps and the genome maps
and runs Stitch with the resulting alignment. The pipeline tests both a default
and relaxed set of alignment parameters. Sewing Machine iterates Stitch until
no new super scaffolds are created and then summarizes the results of the
alignments and super scaffolding. Pipeline includes sample data and a lab
tutorial. doi
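The "iterate Stitch until no new super scaffolds are created" step is a fixed-point loop. The Python sketch below illustrates only that control flow and is not the Sewing Machine code: `iterate_until_stable`, `make_toy_stitch`, the tuple-of-contig-IDs representation and the `links` set are all hypothetical stand-ins for the real alignment and scaffolding steps.

```python
from typing import Callable, Set, Tuple

# Hypothetical representation: a super scaffold is an ordered tuple of contig IDs.
Scaffold = Tuple[str, ...]


def iterate_until_stable(
    scaffolds: Set[Scaffold],
    stitch_once: Callable[[Set[Scaffold]], Set[Scaffold]],
    max_rounds: int = 50,
) -> Tuple[Set[Scaffold], int]:
    """Re-run one scaffolding pass until it stops changing the scaffold set.

    Returns the final scaffolds and the number of passes that made a change.
    """
    rounds = 0
    while rounds < max_rounds:
        updated = stitch_once(scaffolds)
        if updated == scaffolds:  # no new super scaffolds: fixed point reached
            break
        scaffolds = updated
        rounds += 1
    return scaffolds, rounds


def make_toy_stitch(links: Set[Tuple[str, str]]):
    """Build a toy stand-in for one Stitch pass (joins at most one pair per call).

    `links` holds (last contig of left scaffold, first contig of right scaffold)
    pairs that an alignment is assumed to support.
    """
    def stitch_once(scaffolds: Set[Scaffold]) -> Set[Scaffold]:
        for left in sorted(scaffolds):
            for right in sorted(scaffolds):
                if left != right and (left[-1], right[0]) in links:
                    return (scaffolds - {left, right}) | {left + right}
        return scaffolds
    return stitch_once


start = {("c1",), ("c2",), ("c3",), ("c4",)}
final, rounds = iterate_until_stable(start, make_toy_stitch({("c1", "c2"), ("c2", "c3")}))
```

In this toy run, two passes each join one pair, and the third pass finds nothing new, which ends the loop. The `max_rounds` cap is a safety assumption added for the sketch.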
AssembleIrysXeonPhi de novo: Readme Comprehensive, customizable and
robust workflow that takes raw BioNano data (i.e. a Datasets directory with BNX
files from IrysView) and writes assembly scripts to test a range of parameters,
summarizes assembly results and finally organizes output and creates summary
reports and graphs. Pipeline includes sample data and a lab tutorial. doi
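Writing scripts "to test a range of parameters" amounts to enumerating a parameter grid and producing one job per combination. The Python sketch below shows that idea only. The parameter names (`min_length_kb`, `stringency`) and the helpers `parameter_grid` and `job_name` are hypothetical, and are not options or functions of the BioNano Assembler or of AssembleIrysXeonPhi.

```python
from itertools import product
from typing import Dict, List, Sequence


def parameter_grid(options: Dict[str, Sequence]) -> List[Dict[str, object]]:
    """Return every combination of the given parameter values as a dict."""
    names = sorted(options)  # fixed order so job names are reproducible
    combos = product(*(options[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


def job_name(prefix: str, params: Dict[str, object]) -> str:
    """Build a label that identifies one parameter combination."""
    parts = [f"{key}-{params[key]}" for key in sorted(params)]
    return prefix + "_" + "_".join(parts)


# Hypothetical parameters, chosen only to illustrate the sweep.
grid = parameter_grid({
    "min_length_kb": [100, 150],
    "stringency": ["default", "relaxed", "strict"],
})
names = [job_name("asm", params) for params in grid]
```

Each entry of `grid` could then be rendered into one assembly script, and the job label reused to collect that run's results for the summary step.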
AssembleIrysXeonPhi: Readme Comprehensive, customizable and robust
workflow that starts with raw BioNano data (i.e. a Datasets directory with BNX
files from IrysView) and a draft genome FASTA to prepare in silico labeled
maps, rescale BioNano molecule maps, write assembly scripts to test a range
of parameters, summarize assembly results, align best BioNano genome maps
to in silico maps and super scaffold the draft genome FASTA (with Sewing
Machine) and finally organize output and create summary reports and graphs.
Pipeline includes sample data and a lab tutorial. doi
2014
tBlastx: Code Annotates de novo transcriptomes with the NCBI nt database.
Pipeline output was customized for a sequencing facility's request. Specialty
scripts for PI Dr. Benjamin Hause (KSU Vet Med). doi
AssembleG: Readme Processes Illumina DNA reads from cleaning and multi-k
assemblies to produce de novo assembled genomes. Pipeline includes sample
data and a lab tutorial. doi
AssembleT: Readme Processes Illumina RNA reads from cleaning and multi-k
assemblies to clustering for de novo transcriptomes. Pipeline includes sample
data and a lab tutorial. doi
RNA-SeqAlign2Ref: Readme Processes Illumina RNA reads from cleaning and
aligning to differential expression profiling for the transcriptomes of annotated
genomes. Pipeline includes sample data and a lab tutorial. doi
RNA-SeqAlign: Readme Processes Illumina RNA reads from cleaning and
aligning to count summarizing for de novo transcriptomes. Pipeline includes
sample data and a lab tutorial. doi
AssembleIrysCluster: Readme Preps raw molecule maps for the BioNano Irys
System and customizes assembly scripts to test a range of assembly
parameters. doi
2013
Stitch: Readme Super scaffolds genomic sequence samples using alignments to
BioNano assembled genome maps. doi
Blastx: Readme Annotates de novo transcriptomes with the NCBI nr database.
Pipeline can be run recursively to recover annotations that timed out on a
cluster. Pipeline includes sample data and a lab tutorial. doi
Count_reads_denovo: Readme Summarizes read counts for projects with non-
model organisms where no reference genome or annotation files are available.
Also considers both mates of paired-end reads, unlike HTSeq. doi
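One plausible way to summarize counts against a de novo assembly while considering both mates is to count each read pair once per contig if either mate aligns there. The Python sketch below illustrates that idea under stated assumptions; it is not the Count_reads_denovo implementation, whose exact rules are not described here. It reads only the first three SAM columns (QNAME, FLAG, RNAME), and `count_fragments` is a hypothetical name.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set

UNMAPPED = 0x4  # SAM FLAG bit: this read is unmapped


def count_fragments(sam_lines: Iterable[str]) -> Dict[str, int]:
    """Count distinct read names (fragments) aligned to each contig.

    Assumption for this sketch: both mates of a pair share one QNAME, so a
    pair is counted once per contig whether one or both mates align to it.
    """
    seen: Dict[str, Set[str]] = defaultdict(set)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        if flag & UNMAPPED or rname == "*":
            continue  # this mate did not align
        seen[rname].add(qname)
    return {contig: len(names) for contig, names in seen.items()}


example = [
    "@SQ\tSN:t1\tLN:100",
    "r1\t99\tt1",   # pair r1: both mates on t1
    "r1\t147\tt1",
    "r2\t73\tt1",   # pair r2: first mate on t1, second mate unmapped
    "r2\t133\t*",
    "r3\t99\tt2",   # pair r3: both mates on t2
    "r3\t147\tt2",
]
counts = count_fragments(example)
```

Here pair r2 still contributes to t1 even though only one of its mates aligned, which is the behavior the "considers both pairs" description suggests.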
TEACHING EXPERIENCE
As the K-INBRE Bioinformatics Core Outreach Coordinator:
2015
Software Carpentry Workshop at Memorial Sloan Kettering (graduate
students and postdocs). Lead Data Carpentry instructor at workshop. Aug 24-25.
Workshop website
Software Carpentry Workshop at Stanford University (graduate students and
postdocs). Lead SWC instructor at workshop July 23-24. Workshop website
Software Carpentry Workshop at The Jackson Laboratory (graduate students
and postdocs). SWC instructor at workshop. Workshop website
Software Carpentry Workshop at University of Connecticut Storrs (graduate
students and postdocs). SWC instructor at workshop. Workshop website
Software Carpentry Workshop at University of Campinas (graduate students
and postdocs). SWC instructor at workshop. Workshop website
Software Carpentry Workshop at Weill Cornell Medical College (graduate
students and postdocs). SWC instructor at workshop. Workshop website
2014
Beocat with UNIX and Perl (graduate students, postdocs and faculty). I
developed this month long workshop, NGS Analysis on Beocat, in response to
student interest in coding for NGS data analysis on Beocat, our HPC cluster. It
quickly evolved into a collaboration between the Division of Biology and the KSU
Center for Scientific Supercomputing. I developed most of the exercises and
taught the UNIX/Cluster submission half of the workshop. Dr. Greg Wilson from
Software Carpentry (SWC) video conferenced with me at an early stage of
workshop development and I created the NGS Analysis on Beocat material to fit
the SWC training model. There were 60+ applicants for 30 seats.
Introduction to Bioinformatics (graduate students and postdocs). T.A. As my
analysis scripts developed into workflows I created tutorials to expedite student
training in the lab portion of an introductory Bioinformatics course.
2013
Informal Perl course (graduate students and postdocs). T.A., Dr. Brad Olson
led an informal course to help develop instructional material to teach Perl to
Biologists.
2012
Transcriptomics and Phylogenetics video lessons (undergraduate students).
Created videos to teach undergraduate students about two bioinformatics tools
with accompanying exercises.
As a GTA in the Division of Biology, KSU:
2010-2012
Plant Physiology Lab (undergraduate students). G.T.A. Created “A. thaliana
heat shock mutant vs. WT” and “Plant growth: using imageJ to estimate leaf
area over a time course” labs as well as lectures. Organized and guided labs.
Principles of Biology: (undergraduate students). Created lab lectures and
assisted students with a studio model introductory biology course.
Organismic Biology Lab: (undergraduate students). Created lab lectures and
exam questions. Organized and guided labs.
RESEARCH AND OUTREACH COLLABORATIONS
2012-current
Applied Bioinformatics Journal Club I organized a journal club where methods
articles, selected by students who are using them, are discussed weekly by an
interdepartmental group of biologists and computer scientists from KSU, KUMC,
and KU-L Bioinformatics Core members via video conference link. I maintain an
archive of articles chosen for discussion on our website. The Bioinformatics Core
Applied Bioinformatics Journal Club and training blogs have been visited over
92,000 times to date.
2011
Grass Journal Club As a graduate student, I organized a graduate student/
postdoc only journal club to encourage student participation and leverage the full
range of ecological and agronomic research into grass physiology, phylogeny,
and functional genetics conducted at Kansas State’s Division of Biology and
Plant Pathology. website
RESEARCH EXPERIENCE
2014-present
Assemble de novo BioNano genome maps. Kansas State University, Division
of Biology.
Various Principal investigators. Assembled de novo assemblies of BioNano
molecule maps for bacteria and eukaryotes (including but not limited to Zea
mays, Corvus corone, Monachus schauinslandi).
2013-present
Assemble BioNano genome maps with a reference. Kansas State University,
Division of Biology.
Various Principal investigators. Assembled de novo assemblies of BioNano
molecule maps for bacteria and eukaryotes (including but not limited to
Arabidopsis thaliana, Oryza sativa ssp. japonica, Pan troglodytes, Homo sapiens,
Microcebus murinus, Amaranthus hypochondriacus, Trypanosoma cruzi,
Medicago truncatula, Gossypium raimondii, Gallus gallus, Xanthomonas
axonopodis pv glycines, Tribolium castaneum, Gonium pectorale, Manduca
sexta, Danaus plexippus, Triticum aestivum, Nicrophorus vespilloides, Tribolium
madens, Drosophila miranda, Drosophila pseudoobscura, Escherichia coli,
Corvus corone, Zea mays, Electrophorus electricus and Acyrthosiphon pisum).
2012-2014
Assemble de novo plant transcriptomes. Kansas State University, Division of
Biology.
Principal investigators, Dr. Timothy Durrett and Dr. Loretta Johnson. Assembled
multi-k-mer de novo assemblies and count summaries for eight plant taxa.
2010-2012
Cuticular wax synthesis and deposition. Kansas State University, Division of
Biology.
Principal investigator, Dr. Loretta Johnson. Investigating the varied cuticular
waxes of locally adapted Andropogon gerardii. Using traditional and next
generation sequencing and quantification technology to study the Fatty Acid
Elongation (FAE) system and Fatty Acid Synthase (FAS) system while classifying
wax crystalloid structure, chemical composition, and barrier properties.
2008-2009
Function and evolution of AGL6 genes. New York Botanical Garden.
Supervised by Dr. Amy Litt. Carried out studies on the evolution of the
AGAMOUS-LIKE6 (AGL6) gene to identify changes in number and sequence
that might have influenced flower evolution. Determined the sequence of the
AGL6 gene in non-model species of flowering plants. Generated construct and
over-expressed AGL6 via a CaMV-35S promoter, in Arabidopsis.
Flower Development Genes. New York Botanical Garden.
Supervised by Dr. Abeer Mohamed. Investigated function of MADS-box
transcription factors that promote flower and fruit development. Examined
change in expression/phenotype when the APETALA1 (AP1) gene is
expressed by the FRUITFULL (FUL) promoter and vice versa, in mutant
Arabidopsis.
Population Study, Asian Cycads. Sackler Institute of Comparative Genomics,
American Museum of Natural History.
Supervised by Dr. Angelica Cibrian-Jaramillo. Distribution of Cycas
micronesica and sympatric cycad species in Southeast Asia.
Microsatellite Library, C. trifolia. Sackler Institute for Comparative Genomics,
American Museum of Natural History.
Assistant to Dr. Aswini Pai on project assessing (1) whether Coptis trifolia only
reproduces clonally, and (2) if sexual reproduction occurs, how much diversity
is in the gene pool.
Isolated microsatellites from tissue gathered in northern New York. Analyzed
results, designed primers from flanking regions. Advised Dr. Pai, a visiting
professor, of the lab layout and protocols.
Genetic Variability in Arctic Mammals. Sackler Institute for Comparative
Genomics, American Museum of Natural History.
Supervised by Dr. Diana Weber on the project “Linkage Disequilibrium of MHC
Loci in Polar Mammals: Potential Maladaptive Consequences for a Warming
Globe.” Investigated genetic relationships in the major histocompatibility
complex (MHC) Class II region for DQB and DRB from arctic mammals, with
emphasis on the muskox and polar bear.
PUBLICATIONS
Shelton J., Coleman M. C., Herndon N., Lu N., Lam E.T., Anantharaman T. and
Brown S.J. Tools and pipelines for BioNano data: molecule assembly pipeline
and FASTA super scaffolding tool. 2015 bioRxiv doi: http://dx.doi.org/10.1101/020966.
(Paper revisions are currently under review by BMC Genomics).
Weber D.S., Van Coeverden De Groot P.J., Peacock E., Schrenzel M.D., Perez
D.A., Thomas S., Shelton J., Else C.K., Darby L.L., Acosta L., Harris C.,
Youngblood J., Boag P. and Desalle R. 2013. Low MHC variation in the polar
bear: implications in the face of Arctic warming. Animal Conservation, 16: 671–683.
PRESENTATIONS
2015
*Cunningham C. B., Ji L., Shelton J., Schmitz R. J., Brown S. J., Moore A. J.
Methylation occurs in beetles: the genome of the subsocial beetle Nicrophorus
vespilloides (Coleoptera: Silphidae). Ninth Annual Arthropod Genomics
Symposium.
Rogers J., Larsen P.A., Raveendran M., Liu Y., English A., Han Y., Vee V.,
Campbell C.R., Shelton J., Brown S.J., Muzny D.M., Gibbs R.A., Yoder A.D.,
Worley K. Whole genome assembly of the gray mouse lemur (Microcebus
murinus) genome: integrating diverse platforms and data types. Biology of
Genomes. Cold Spring Harbor.
2014
Brown S.J. and Shelton J. BioNano Genomics Webinar: Using BioNano Maps to
Improve an Insect Genome Assembly. http://www.bionanogenomics.com/bionano-community/webinars/.
*Shelton J., Herndon N., Gray M., Liang H., Durrett T., Johnson L., Akhunova A.,
Brown S.J. Multi-K-Mer de novo Transcriptome Assembly, Validation, and
Count. Plant and Animal Genomes XXII.
*Herndon N., Shelton J., Andrews W., Wang W., Brown S.J. Improving the
Tribolium draft Assembly with Physical Maps Based on Imaging Ultra-Long
Single DNA Molecules. Plant and Animal Genomes XXII.
*Gray M., Shelton J., Chellapilla S., Bello N., Akhunova A., Liang H., Garrett K.,
Akhunov E., Morgan T., Johnson L. Transcriptional Differences of Mesic and
Xeric Ecotypes of an Ecologically-Dominant Prairie Grass Andropogon gerardii
to Abiotic Stress. Plant and Animal Genomes XXII.
Johnson L., Brown S.J., Gray M., Shelton J., Baer S.G., Maricle B., Bello N.
Genetic Differentiation, Transcriptome Variation, and Local Adaptation of an
Ecologically Dominant Prairie and Bioenergy Grass Andropogon gerardii (big
bluestem) Occurring Along the Precipitation Gradient in Midwest U.S.
Grasslands. Plant and Animal Genomes XXII.
*Shelton J., Gray M., Brown S.J., Chellapilla S., Akhunova A., Akhunov E., Liang
X., Johnson L. De novo Transcriptome Profiling of Two Edaphically and
Phenotypically Divergent Grasses: Dominant Forage Grass Big Bluestem
Andropogon gerardii and Drought-Tolerant Sand Bluestem Andropogon
gerardii ssp. hallii. Plant and Animal Genomes XXI.
Shelton J. Micromorphology, Chemistry, and Physiology of epicuticular waxes in
sand bluestem, an edaphic ecotype of big bluestem. Ecological Genomics
Research Forum, Manhattan, KS.
*Shelton J., Johnson L., Ravenek J., Jeannotte R., Song Z., Welti R., Bello N.,
Nippert J. Divergent epicuticular physiology and morphology in the ecological
dominant prairie grass big bluestem Andropogon gerardii, and its edaphic
variety sand bluestem A. gerardii var. hallii. American Society of Plant
Biologists Midwestern Sectional Meeting Lincoln, NE.
*Shelton J., Johnson L., Ravenek J., Jeannotte R., Song Z., Welti R.
Divergence in leaf morphology and physiology in Andropogon gerardii var.
hallii (sand bluestem) a locally adapted variety of Andropogon gerardii (big
bluestem). ASPB annual meeting.
Shelton J., Johnson L., Ravenek J., Jeannotte R., Song Z., Welti R. Phenotypic
divergence of cuticle chemistry and surface morphology in big and sand
bluestem. Biology Graduate Students Research Forum, Manhattan KS.
*Shelton J., Johnson L., Ravenek J., Jeannotte R., Song Z., Welti R. Local
adaptation of wax biosynthesis in western Andropogon gerardii and
Andropogon gerardii (subsp. hallii). Ecological Genomics Symposium, Kansas
City, KS.
* indicates poster presentation
REFERENCES
* Public Web Version of CV: Contact Jennifer M Shelton directly for reference
contact details.
Susan J Brown
Director K-INBRE Bioinformatics Core
University Distinguished Professor
239A Chalmers Hall
Manhattan KS, 66506
Todd Dickinson
Former VP, Global Commercial Operations BioNano Genomics
CEO Dovetail Genomics, CEO IdentifyGenomics, CEO BigDataBio
Greg Wilson
Software Carpentry, Executive Director
http://www.software-carpentry.org/
Arvind Bharti PhD, PMP
Syngenta Crop Protection, LLC
3054 E Cornwallis Rd
Research Triangle Park, NC 27709
Jeremy Chien
Member of the Applied Bioinformatics Journal Club and Researcher in the
K-INBRE network
Assistant Professor, Cancer Biology
Assistant Director, Translational Genomics, University of Kansas Cancer Center
2020B Wahl Hall East, Mailstop 1027
3901 Rainbow Boulevard
Kansas City, KS 66160
Peter A. Larsen
I am collaborating on the gray mouse lemur genome assembly project that
Peter Larsen is working on
Postdoctoral Associate, Yoder lab
Duke University
Box 90338
BioSci 130 Science Drive
Durham, NC 27708
PERSONAL SCHOLARLY DEVELOPMENT
As a graduate student I studied genomics applications, chromosome and
genome analysis, and bioinformatics. This prepared me for assembling my own
transcriptomes for two wild grasses in my master's research project.
As the Bioinformatics Core Outreach Coordinator I learned to code in Perl and
BASH as well as work on an SGE Cluster by collaborating with bioinformatics
specialists and cluster administrators. As I accrued these skills I made completed
scripts available for use through GitHub as “kstatebioinfo”. As these scripts
developed into workflows I created tutorials that I used to train students in the lab
portion of an introductory Bioinformatics course. Students in this course became
interested in learning how to code, prompting the creation of a month-long coding
workshop. The workshop, NGS Analysis on Beocat, turned into a collaboration
between the Division of Biology and KSU Center for Scientific Supercomputing. I
developed most of the exercises and taught the first two weeks of the workshop.
Greg Wilson from Software Carpentry (SWC) met with us at an early stage of
workshop development and I adjusted the NGS Analysis on Beocat material to fit
the SWC training model.
In addition to coding, code sharing, and training, I created the Applied
Bioinformatics Journal Club (ABJC) as a forum to discuss workflows and projects
within and between Bioinformatics Cores. At the ABJC, methods articles are
discussed weekly by an interdepartmental group of biologists and computer
scientists from KSU, KUMC, and KU-L’s Bioinformatics Cores via video
conference. Usage of the ABJC’s archived topics has already exceeded 90,000
views.
I am also collaborating with scientists at BioNano Genomics to develop
workflows for BioNano molecule maps. These can be used to assemble physical
maps or order/orient sequence assemblies. Working at a beta test site for this
new technology and working with my Bioinformatics Core to start our certified
BioNano Pro service center brought together the dynamic and collaborative
aspects that draw me to bioinformatic analysis. This was perhaps the most
engaging collaborative experience I have had so far.