This document describes the data visualization tools developed by the Immunological Genome Project (ImmGen) to explore gene expression data from the mouse immune system. They include the Skyline, which displays the expression profile of a gene across cell types; the Population Comparison tool, which finds genes that distinguish cell populations; and the Constellation viewer, which shows networks of correlated genes. It also covers the Gene Expression Map, the Module browser, MyGeneSet, and the mobile apps for exploring ImmGen gene expression data.
[Figure: table of eQTLs and genotype/expression plot (alleles A and G) in neutrophils (GN) and CD4 T cells (T4). Table columns: SNP, SNP BP position, distance to TSS, fold-change and log10 p-value for T4 and for GN, SNPs in LD.]
Example row as extracted: rs29385375 68667674 -818938 5.47 0.37 4.82 0.32
SNPs in LD: rs29385200, rs29473969, rs29399226, rs26935693, rs26935688, rs29385273, rs29400948, rs6156679, rs29385375, rs29433934, rs29402084, rs26905626, rs29384675, rs29402266, rs29466835, rs26905601, rs6295876, rs45904989, ...
NCBI Gene: Senp3 dbSNP: rs2938 ImmGen Skyline: Senp UCSC Genome Browser: Senp3
Immunological eQTLs
The JavaScript-based interface (D3) presents the eQTLs associated with a chosen gene in CD4+ T cells and neutrophils, based on data from 40 inbred mouse strains (Mostafavi et al, 2014).
Entering a gene of interest displays a list of Single Nucleotide Polymorphisms (SNPs) that significantly affect its expression. A table of eQTLs is returned, along with an animated genotype/expression plot that displays the values for each strain.
Expression Quantitative Trait Loci (eQTLs)
that affect a gene
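As a rough illustration of what sits behind one row of such an eQTL table, the sketch below scores a single SNP against the expression of one gene across inbred strains. It is not the ImmGen pipeline: the strain names, genotypes and expression values are hypothetical, and an exact permutation test on the difference in group means stands in for whatever statistic the real analysis uses.

```python
from itertools import combinations
from statistics import mean

# Hypothetical data: allele at one SNP and expression of one gene per strain.
GENOTYPES = {"strain_1": "A", "strain_2": "A", "strain_3": "A", "strain_4": "A",
             "strain_5": "G", "strain_6": "G", "strain_7": "G", "strain_8": "G"}
EXPRESSION = {"strain_1": 100, "strain_2": 110, "strain_3": 95, "strain_4": 105,
              "strain_5": 20, "strain_6": 25, "strain_7": 22, "strain_8": 18}


def group_by_allele(genotypes, expression):
    """Collect expression values per allele (the genotype/expression plot data)."""
    groups = {}
    for strain, allele in genotypes.items():
        groups.setdefault(allele, []).append(expression[strain])
    return groups


def fold_change(group_a, group_b):
    """Ratio of mean expression between the two allele groups."""
    return mean(group_a) / mean(group_b)


def permutation_pvalue(group_a, group_b):
    """Exact two-sided permutation p-value for the difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    hits = total = 0
    # Enumerate every way of relabeling the strains into two groups.
    for idx in combinations(range(len(pooled)), len(group_a)):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [v for i, v in enumerate(pooled) if i not in chosen]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            hits += 1
    return hits / total


groups = group_by_allele(GENOTYPES, EXPRESSION)
print("fold change A/G:", round(fold_change(groups["A"], groups["G"]), 2))
print("permutation p-value:", round(permutation_pvalue(groups["A"], groups["G"]), 4))
```

Exhaustive enumeration is only practical for a handful of strains. With 40 strains, one would sample random relabelings or use a parametric test instead.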
This tool shows all the splice junction sequences that have been detected
for a chosen gene, color-coded by frequency, on the UCSC genome
browser.
Splice junctions
Splice junctions detected for a chosen gene across RNA-seq data.
The Immunological Genome Project (ImmGen) is a consortium of immunologists and computational biologists who aim, using shared and rigorously controlled data generation pipelines, to exhaustively chart gene expression profiles and their underlying regulatory networks in the mouse immune system. The project encompasses the innate and adaptive immune systems, surveying all cell types of the myeloid and lymphoid lineages with a focus on primary cells directly ex vivo. These are analyzed through different states of differentiation and maturation, activating responses, effector stages, tissue localization, age and genetic variation. These data support the computational reconstruction of the genetic regulatory network underlying cell differentiation and activation in the immune system.
ImmGen is a public resource, and its data displays are actively used by the immunology community. The ImmGen team has developed novel modes of graphic representation for both desktop and mobile platforms. Overall, the ImmGen data browsers are custom interactive tools framed around specific questions that a wet-lab biologist might have, rather than providing simple data access. These tools have been developed over time and use a variety of technologies. Some have been clear successes, some perhaps less so, but all are in continuous evolution.
Immunological Genome Project: Data Visualization Tools
Catherine Laplace, Richard Cruse, Scott Davis, Jeff Ericson, Gordon Hyatt, Radu Jianu, Rachel Melamed, Henry Paik, Richard Park, Tal Shay, Liang Yang.
The Immunological Genome Consortium.
Benoist-Mathis Laboratory, Division of Immunology, Harvard Medical School, Boston, MA
Terminology conventions:
A "Gene" is one element of the microarray. A true gene in the molecular biology sense may be represented by several "genes" on the array.
A "Population" represents a cell type as defined by the usual surface markers and expression reporters, in a particular organismal location and state (resting or stimulated, genetically perturbed, etc).
A "Dataset" is a vector of expression values for a population. A "DataGroup" is a collection of datasets, generated similarly and normalized together so as to be comparable.
www.immgen.org
Expression levels
Population comparison
Find the genes that most distinguish two (or more) populations.
Distinguishing two cell-types
The "Population Comparison" browser compares individual populations or population groups and brings out the genes that distinguish them. The comparison is computed in real time (R on the HMS Orchestra cluster) and returns a table of differential metrics (FoldChange, p-value, FDR).
The browser can perform simple pairwise comparisons between individual populations, or more complex comparisons involving groups of populations (e.g. “All macrophages vs All B cells”), as chosen by the user with a drag-and-drop graphic interface.
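The kind of table this comparison returns can be sketched as follows. This is a hedged illustration, not the R code that runs on the cluster: gene and population names and all values are hypothetical, the p-value comes from a Welch t statistic with a normal approximation (a crude stand-in for a proper t distribution, especially with few replicates), and the FDR is the standard Benjamini-Hochberg adjustment.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Hypothetical expression values: gene -> population -> replicate values.
EXPR = {
    "gene_1": {"MF_a": [200, 210], "MF_b": [190, 205], "B_a": [20, 25], "B_b": [22, 18]},
    "gene_2": {"MF_a": [50, 55], "MF_b": [48, 52], "B_a": [51, 49], "B_b": [53, 50]},
    "gene_3": {"MF_a": [10, 12], "MF_b": [11, 9], "B_a": [80, 90], "B_b": [85, 95]},
}


def welch_pvalue(a, b):
    """Two-sided p-value from a Welch t statistic (normal approximation)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(t)))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def compare(expr, group_a, group_b):
    """Compare two groups of populations; one row per gene, sorted by p-value."""
    rows = []
    for gene, by_pop in expr.items():
        a = [v for pop in group_a for v in by_pop[pop]]
        b = [v for pop in group_b for v in by_pop[pop]]
        rows.append((gene, mean(a) / mean(b), welch_pvalue(a, b)))
    fdr = benjamini_hochberg([p for _, _, p in rows])
    table = [{"gene": g, "fold_change": fc, "p_value": p, "fdr": q}
             for (g, fc, p), q in zip(rows, fdr)]
    return sorted(table, key=lambda row: row["p_value"])


# Group comparison in the spirit of "All macrophages vs All B cells".
table = compare(EXPR, ["MF_a", "MF_b"], ["B_a", "B_b"])
for row in table:
    print(row)
```

Pooling replicates from every population in a group is the simplest possible choice here; a real implementation might weight populations equally or model them separately.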
Mobile version
The ImmGen iPhone app features a similar "Population Comparison" function that lets users compare two selected ImmGen cell types or groups and find the most differentially expressed genes.
Relationships between genes
[Figure: Gene Constellation network. Radial scale: correlation coefficient (ticks at 0.8 and 0.9), extending to the outer boundary. Gene labels shown: Ctsc, Ceacam1, 1110003E01Rik, Daf1, Tpst1, Myo1e, Blnk, Ly6d, Arhgap8, Ell2, Pkig, Gga2, Stk23, 2010309G21Rik, Lat2, Cd22, Rufy1, Snx9, Mef2c, Lyl1, Irf5, Tcf4, Ebf1, Casp9, Napsa, Gm1419, LOC56304, Igk-v21, AW112037, Igk-v8, Daf2, Ceacam2, IgB, Cybb, Scd1, Prkcd, Blk, Lyn, Btk, Syk, Ryr1, Plcg2.]
Network of gene correlations.
The Constellation view presents the genes most closely correlated to a chosen gene, overall or within a lineage. Spatial coordinates depict attributes of these correlated genes: the distance from the center encodes the tightness of the correlation (closely linked genes are shown close to the center, more distant ones at the periphery), while the angular position on the circle can be chosen to represent chromosomal position within the genome, GeneOntology-based clusters, or secondary correlations. This correlation network, originally inspired by the Visual Thesaurus, can be explored sequentially by clicking on any gene to bring up its own set of correlations.
Gene Constellation
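The Constellation layout can be sketched in a few lines, under stated assumptions. The sketch uses Pearson correlation, a cutoff of 0.8 (borrowed from the scale in the figure) and an angle proportional to a gene's fractional position along the genome. The gene names, expression profiles and genome fractions are hypothetical, and the actual viewer may compute its coordinates differently.

```python
from math import cos, pi, sin, sqrt

# Hypothetical expression profiles across five populations.
PROFILES = {
    "center_gene": [1, 2, 3, 4, 5],
    "gene_a": [2, 4, 6, 8, 10],
    "gene_b": [1, 2, 3, 5, 4],
    "gene_c": [5, 4, 3, 2, 1],
}
# Hypothetical position of each gene as a fraction (0..1) of the genome length.
GENOME_FRACTION = {"gene_a": 0.0, "gene_b": 0.25, "gene_c": 0.5}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def constellation(center, profiles, genome_fraction, min_r=0.8):
    """Return {gene: (r, x, y)} for genes correlated with `center` at r >= min_r."""
    layout = {}
    for gene, profile in profiles.items():
        if gene == center:
            continue
        r = pearson(profiles[center], profile)
        if r < min_r:
            continue  # outside the outer boundary, not drawn
        radius = (1 - r) / (1 - min_r)  # 0 at the center, 1 at the outer boundary
        angle = 2 * pi * genome_fraction[gene]
        layout[gene] = (r, radius * cos(angle), radius * sin(angle))
    return layout


layout = constellation("center_gene", PROFILES, GENOME_FRACTION)
for gene, (r, x, y) in layout.items():
    print(gene, round(r, 3), round(x, 3), round(y, 3))
```

Clicking through the network, as described above, would amount to calling `constellation` again with the clicked gene as the new center.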
Regulators and Modulators
Modules of co-regulated genes were defined by clustering co-expressed genes with Super Paramagnetic Clustering. A novel network analysis algorithm (Ontogenet), specifically tailored to exploit the particular configuration of the ImmGen datagroup, was then applied to predict which transcriptional control elements might regulate these modules (Jojic et al, 2013). The online browser allows exploration and display of the modules’ composition, expression patterns, sequence motif enrichment, etc.
Interactive display of the modules of co-regulated genes defined from ImmGen data,
and the transcription factors predicted to control them.
MyGeneSet
While other data browsers are queried one gene at a time, the MyGeneSet browser allows the user to interrogate the expression of a group of genes across ImmGen. This makes it possible to quickly appreciate the different elements of a complex signature, or to quickly identify the cell(s) of origin for a given variation. This JavaScript-based online browser allows users to visualize the expression of their own set of genes across some or all ImmGen populations. Gene lists can be typed or pasted in, or dropped as a text file of GeneSymbols.
Several visualization options are returned, including a scatter plot (“W plot”) of normalized expression across the selected populations and an interactive heatmap representation, developed using D3.js, which allows the user to rearrange the map based on expression values of a selected gene or population.
Expression of a specific set of genes across ImmGen populations
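The data handling behind such a heatmap can be sketched as below: parsing a pasted gene list, normalizing each gene across populations, and reordering the columns by a selected gene. The browser itself is JavaScript/D3; this Python sketch uses hypothetical gene and population names and assumes min-max scaling per gene, which may not be the normalization ImmGen applies.

```python
import re

# Hypothetical expression matrix: gene -> values in the order of POPULATIONS.
POPULATIONS = ["pop_1", "pop_2", "pop_3"]
MATRIX = {"gene_x": [10, 30, 20], "gene_y": [5, 5, 15]}


def parse_gene_list(text):
    """Split pasted or file text into unique gene symbols, keeping input order."""
    seen, genes = set(), []
    for token in re.split(r"[\s,;]+", text.strip()):
        if token and token not in seen:
            seen.add(token)
            genes.append(token)
    return genes


def normalize_rows(matrix):
    """Scale each gene to the 0..1 range across populations (assumed scheme)."""
    scaled = {}
    for gene, row in matrix.items():
        low, span = min(row), max(row) - min(row)
        scaled[gene] = [(v - low) / span if span else 0.0 for v in row]
    return scaled


def reorder_by_gene(matrix, populations, gene):
    """Sort the columns by descending expression of one selected gene."""
    order = sorted(range(len(populations)),
                   key=lambda j: matrix[gene][j], reverse=True)
    new_populations = [populations[j] for j in order]
    new_matrix = {g: [row[j] for j in order] for g, row in matrix.items()}
    return new_populations, new_matrix


genes = parse_gene_list("gene_x, gene_y\ngene_x")
scaled = normalize_rows({g: MATRIX[g] for g in genes})
pops, heatmap = reorder_by_gene(scaled, POPULATIONS, "gene_x")
print(pops)
print(heatmap)
```

Reordering by a population rather than a gene would be the same operation applied to the rows instead of the columns.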
Gene expression profiles generated from different immunological cell types by RNA-sequencing are visualized on the UCSC Genome Browser. Expression levels are displayed as individual bar graphs at the genes’ respective chromosomal locations, and can be related to all other information tracked on the UCSC browser.
Gene Expression map (GEM)
This online browser compares microarray expression profiles across populations, with genes organized according to chromosomal position. Users can search for particular genes, and the display zooms from a global perspective of the map to a gene-level representation via the Google Maps API. Variations in expression among populations are highlighted by a white halo (perhaps not the most effective feature).
GoogleMaps representation of the genome, gene expression values as pseudo-color barcodes.
Skyline
Displays expression values for a selected gene across immune cell types as a bar chart expression profile. Data generated on the Affymetrix MoGene 1.0 ST microarray or by the TruSeq RNA-seq platform are normalized and presented across various ImmGen datagroups (e.g. B cells, NK cells, etc).
Basic annotation information on the gene and links to external databases are provided. The user can search for the genes to display based on gene names, symbols, or other common identifiers; when more than one gene is returned, the user can scroll between the different genes. This Flash-based interface to a PostgreSQL database was the original ImmGen browser, and has been very popular.
The ImmGen app version of the Skyline offers a similar histogram, but also an innovative 2D barcode to display expression data across a large number of populations. The app also explores genes most similar to the gene of interest by displaying its “Friends” (most correlated genes across the ImmGen dataset), “Family” (genes with the most similar GeneOntology identifiers) or “Neighbors” (closest on the chromosome).
[Figure: bar graph panels for the Stem Cell, B Cell, Macrophage and Monocyte groups.]
Bar graph expression profiles of a selected gene in a group of cell types.
Mobile Skyline
Quick-reference representation of gene expression on smartphones
RNA-seq expression profiles
RNASeq gene expression read density along chromosomal location.
Usage
[Figure: daily independent visitors over the last year (x-axis: Apr 2014, Jul, Oct, Jan 2015; y-axis ticks at 200 and 400).]
344221
[Figure: 2014 independent visitors by country.]
Google Analytics data