This document describes a portal that integrates publicly available connectomic and gene expression data with RNA-seq data to help identify genes of interest in neurodegenerative diseases like Parkinson's. Specifically, it uses retrograde tracing analysis (Retro-TRAP) on dopamine neurons and linear regression of RNA-seq data comparing disease vs normal states. Key genes identified are validated using the Allen Brain Atlas gene expression data stored locally. This combined approach helps pinpoint genes selectively expressed in neural circuits involved in diseases like Parkinson's that show selective neurodegeneration. The portal was created using Python/Django and allows users to upload and analyze multi-omic data interactively to advance understanding of neurodegenerative disease etiology.
1. Molecular profiling of a neural circuit during neurodegeneration
Siddhartha Mitra1, Yupu Liang1, Alexander R. Nectow1,2
1. Research Bioinformatics CCTS, Rockefeller University, New York City, NY
2. Princeton Neuroscience Institute, Princeton University, Princeton, NJ
Acknowledgments / Funding
CTSA award UL1TR001866
Integration of different sources of data
We created a portal that integrates publicly available connectomic and gene
expression data sets with RNA-seq data from experiments. Combining
multiple sources of data helps to identify the genes of interest.
[Diagram: data from Retro-TRAP analysis results]
Background
The molecular mechanisms responsible for the selective vulnerability of
neurons in neurodegenerative diseases such as Parkinson’s Disease (PD) are
incompletely understood. The aim of this project is to generate an annotated,
high-throughput methodology to validate genes identified using molecular
profiling technologies. This will help to understand the etiology and
pathogenesis of neurodegenerative disease.
Selective neurodegeneration in PD
Retro-TRAP for profiling cell-specific gene expression
Use Case: Identify genes of interest using RNA-seq analysis and Allen data
[Diagram labels: Allen Brain gene expression data; identify genes of interest]
Fig 1. In Parkinson’s Disease, there is selective degeneration of the dopamine neurons within the nigrostriatal pathway.
Fig 2. Retro-TRAP is a tool for molecular connectomic profiling of discrete subsets of neurons within a specific neural circuit using fluorescent proteins to indirectly tag ribosomes.
Fig 3. This approach enables validation of master genes differentially expressed within the mesolimbic and nigrostriatal dopaminergic circuits.
Software architecture
Linear regression analysis is performed on the RNA-seq data to identify the
differentially expressed genes. This analysis takes as input the output of the
RNA-seq pipeline and user-supplied information about the samples used in the
RNA-seq experiment.
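The regression step can be illustrated with a minimal Python sketch. The poster states that the portal's workers run R scripts for this analysis, so the code below is only a stand-in that shows the idea: each gene's expression is regressed on a 0/1 disease indicator, and genes are ranked by the absolute t-statistic of the slope. The gene names, expression values, and the function names `fit_gene` and `rank_genes` are hypothetical, and ranking by |t| is an assumption about how the ranked list is ordered.

```python
from math import sqrt
from statistics import mean


def fit_gene(expression, is_disease):
    """Ordinary least squares of expression on a 0/1 disease indicator.

    Returns (slope, t_statistic). With a binary predictor the slope equals
    the difference between the disease and normal group means.
    """
    n = len(expression)
    if n != len(is_disease) or n < 3:
        raise ValueError("need at least 3 samples and matching lengths")
    x_bar = mean(is_disease)
    y_bar = mean(expression)
    sxx = sum((x - x_bar) ** 2 for x in is_disease)
    if sxx == 0:
        raise ValueError("both disease and normal samples are required")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(is_disease, expression))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and standard error of the slope.
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(is_disease, expression))
    std_err = sqrt(rss / (n - 2) / sxx)
    if std_err == 0:
        t_stat = 0.0 if slope == 0 else float("inf") * (1 if slope > 0 else -1)
    else:
        t_stat = slope / std_err
    return slope, t_stat


def rank_genes(expression_table, is_disease):
    """Return gene names ordered from strongest to weakest disease effect."""
    scores = {gene: abs(fit_gene(values, is_disease)[1])
              for gene, values in expression_table.items()}
    return sorted(scores, key=lambda gene: -scores[gene])


# Hypothetical data: three normal samples followed by three disease samples.
is_disease = [0, 0, 0, 1, 1, 1]
expression_table = {
    "geneA": [10.0, 11.0, 9.0, 4.0, 5.0, 3.0],  # drops in disease
    "geneB": [7.0, 8.0, 9.0, 8.0, 7.0, 9.0],    # unchanged on average
}
ranking = rank_genes(expression_table, is_disease)
print(ranking)
```

The sample annotation (`is_disease`) plays the role of the user input about the samples, and `expression_table` plays the role of the RNA-seq pipeline output.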
Workflow for identifying genes of interest
Linear regression analysis of RNA-seq data comparing the disease
state vs. the normal state of the substantia nigra pars compacta (SNc)
in Parkinson’s disease yields several genes of interest. Allen data
identifies the genes in this list that are themselves highly expressed
in the SNc.
Slc6a3 (dopamine transporter) is ranked #314 in the linear regression
analysis but #16 for expression in the SNc (Allen), so the Allen gene
expression data helps us identify Slc6a3 as a gene of importance.
The gene expression sunburst diagram and the section data set
images show that Slc6a3 is selectively enriched in the SNc.
Tcf25 is ranked #20 in the regression analysis but #1883 for
expression in the SNc (Allen); this gene is constitutively expressed.
Genes that are selectively expressed in the relevant brain region, relative to
the background, are identified from Allen gene expression data, which is stored
locally in a MongoDB database. This data is used to identify true positives in
the linear regression analysis: genes ranked highly in both sources for the brain
region are selected. Data from the Allen portal, including image data, is also used.
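The selection of genes ranked highly in both sources can be sketched as a simple rank filter. The ranks below are the ones reported on this poster for Slc6a3 and Tcf25. The two cutoffs (500 for the regression rank and 100 for the Allen rank) and the function name `select_candidates` are assumptions made for illustration, since the poster does not state thresholds. In the portal the Allen ranks would come from the local MongoDB database; plain dictionaries stand in for it here.

```python
def select_candidates(regression_rank, allen_rank,
                      max_regression_rank, max_allen_rank):
    """Keep genes that rank within both cutoffs, best Allen rank first.

    Both inputs map gene symbol -> rank, where rank 1 is the top gene.
    """
    selected = [
        gene for gene in regression_rank
        if gene in allen_rank
        and regression_rank[gene] <= max_regression_rank
        and allen_rank[gene] <= max_allen_rank
    ]
    return sorted(selected, key=lambda gene: allen_rank[gene])


# Ranks reported on the poster for the SNc.
regression_rank = {"Slc6a3": 314, "Tcf25": 20}
allen_rank = {"Slc6a3": 16, "Tcf25": 1883}

# Hypothetical cutoffs, chosen only for this illustration.
candidates = select_candidates(regression_rank, allen_rank,
                               max_regression_rank=500, max_allen_rank=100)
print(candidates)
```

With these assumed cutoffs, Slc6a3 is kept because it ranks within both limits, while Tcf25 is dropped because its Allen expression rank in the SNc is low despite its high regression rank.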
A Python/Django-based web portal was created for this project. The portal lets
the user interactively upload and analyze RNA-seq results alongside Allen gene
expression data. The portal accesses multiple data sources and hands analysis
jobs to Celery workers running in the background; each worker invokes R scripts
for the regression analysis.
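The hand-off from a background worker to an R script might look roughly like the sketch below. It is a plain Python function that a Celery task could wrap, not the portal's actual code. The script name `regression.R`, the file names, and the argument order are all hypothetical; only the general pattern of calling `Rscript` through `subprocess.run` is shown. The `runner` parameter exists so the function can be exercised without R installed.

```python
import subprocess
from pathlib import Path


def build_rscript_command(script, counts_file, samples_file, output_file):
    """Build the argument list for: Rscript <script> <counts> <samples> <out>."""
    return ["Rscript", str(script), str(counts_file),
            str(samples_file), str(output_file)]


def run_regression(script, counts_file, samples_file, output_file,
                   runner=subprocess.run):
    """Run the (hypothetical) R regression script and return the output path.

    check=True makes a non-zero exit status raise CalledProcessError, so a
    calling worker can mark the job as failed.
    """
    command = build_rscript_command(script, counts_file,
                                    samples_file, output_file)
    runner(command, check=True, capture_output=True, text=True)
    return Path(output_file)


# Dry run with a stand-in runner that records the command instead of
# launching R, so the sketch works on a machine without Rscript.
recorded = []


def fake_runner(command, **kwargs):
    recorded.append((command, kwargs))


result_path = run_regression("regression.R", "counts.tsv",
                             "samples.tsv", "ranked_genes.tsv",
                             runner=fake_runner)
print(recorded[0][0], result_path)
```

Keeping the command construction separate from its execution makes the worker logic easy to test and keeps the R script as the single place where the regression itself is defined.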
[Figure panel labels: selective expression in SNc; constitutively expressed gene]