Characterization of genes and proteins of cross-species biological pathways
                   Jennifer Ivy Dong, Douglas ...
Presented at the 2010 UMUC Biotechnology Symposium, May 21, 2010, Rockville, MD.

Published in: Technology

Jennifer Ivy Dong, Douglas James Joubert–NIH Library, Raina Kumar, & Robert Stephen–ABCC/NCI

Introduction

The new era of genomics and proteomics, with the advent of high-throughput technologies such as microarrays and next-generation sequencing, has opened up great opportunities for the life science research community to better understand biological processes. The gene lists obtained from these experiments are generally analyzed further in the context of biological pathways, as well as with available biological knowledge sets such as specifically described gene ontologies, gene sets and gene enrichments. Efforts are underway to develop new methods to derive biologically meaningful information from the gene lists obtained from such technologies.

Although considerable effort has been expended on building, maintaining and distributing these gene sets, a system allowing visualization of their conservation across mammalian species has not been developed. We have developed a process to retrieve information from two pathway databases, KEGG and BioCarta, and combine it with information from other biological databases such as Homologene and UniProt to characterize cross-species conservation of genes and proteins and gain insights into new biological knowledge.

Specifically, we are trying to understand which genes and proteins are common in given pathways across mammals such as human (Homo sapiens), mouse (Mus musculus), rat (Rattus norvegicus), dog (Canis lupus familiaris), cow (Bos taurus), and chimpanzee (Pan troglodytes). We also explore the problem of finding the variations or mutations in these genes and proteins that are well tolerated across these species.

Objectives

This project focused on developing methods for deriving the cross-species annotations for genes and protein groups identified in candidate pathways. The project had three primary goals:

  1. Produce a matrix containing genes in a particular biological pathway …

Materials and methods

The process has four major modules:

  1. Identify homologous proteins using the Homologene database
  2. Identify homologous proteins using similarity search
  3. Find variations using multiple sequence alignments
  4. Find all known variations from the UniProt database

Every run starts with a BioCarta/KEGG pathway name; the gene list, with gene sequence IDs, is then retrieved from CGAP. Perl scripts drive several of the steps below.

Identify homologous proteins in the Homologene database:
  • Retrieve the homolog group ID for each gene from the Homologene database at NCBI
  • Report value 1 for each mammalian species that has a homolog for the gene
  • Populate matrices (heat map), where genes are at X-axis and species at Y-axis

Identify homologous proteins by similarity search:
  • Map sequence ID to protein ID using BioDbnet
  • Perform BlastP for the proteins
  • Populate matrices with best hits using the taxonomy report

Find variations from MSA:
  • Fetch sequences, using the protein sequence ID, for all the homologous genes of each pathway gene in mammals
  • Perform MSA by ClustalW
  • Use *.dnd files to make a cladogram
  • Search for variations in *.aln files
  • Report variations in tab-delimited files

Find known variations:
  • Identify protein IDs of all the proteins for the same species in the NCBI database using Sequence Id or …
  • Map sequence ID to UniProt ID using BioDbnet
  • For each protein, search for the UniProt entry in files derived from UniProt
  • Read known variations from the flat file and return the annotation in a tab-delimited file

Results

Six pathways, three each from BioCarta and KEGG, were analyzed using this process, and the results for these pathways are presented below. The matrices for one of the pathways are also shown for illustration.

BioCarta Pathways

Interferon Gamma (IGP): The IGP pathway has a significant role in the body's immune response. It has 6 genes, all well conserved among mammals except for JAK1 and STAT1 in Pan troglodytes.

Nerve Growth Factor (NGF): NGF is important for the survival of neurons during embryonic development and has an effect on the growth of sensory and sympathetic ganglia. It has 20 genes and most are well conserved. Across species, the exceptions include DPM2, ELK1, and KLK2. Within species, only Canis lupus familiaris had NGF genes that were less conserved.

Protein Kinase C through G-protein coupled receptor (PKC): GPCRs are involved in signal transduction and play a role in various cellular functions. There are 9 genes in this pathway, and all of them are extremely well conserved.

KEGG Pathways

Hedgehog Pathway: The hedgehog signaling pathway is believed to govern the growth of embryonic stem cells as well as metamorphosis in general. It has 44 genes, of which 23 are conserved among all represented mammals. Three genes, SAP18, DYRK1A, and BTRC, are common in all mammals.

Basal Transcription Factors (BTF): BTF is a major control point for gene expression in eukaryotes and it contains 34 genes. Most genes in this pathway are well conserved except GTF2A1L and STON1.

Dorsal-ventral axis formation (DVF): The DVF pathway is controlled by GRK and EGFR and is important in limb development. It has 29 genes and most of them are well conserved, the exception being FMN2.

Matrices obtained through the Homologene and similarity search methods are shown below:

[Figure: two presence/absence (1/0) matrices with gene symbols as rows and species as columns. Rows visible in both panels: BRAF, CPEB1, EGFR, ERBB2, ERBB4, ETS1, ETS2, ETV6, ETV7, FMN2, GRB2. Homologene panel columns (6): Human, Chimp, Mouse, Rat, Dog, Cow. Similarity Search panel columns (22): Human, Chimp, Bonobo, Gorilla, Sumatran Orangutan, Rhesus Macaque, Cynomolgus monkey, Western baboon, Mouse, mice, Rat, Syrian Hamster, European Rabbit, Dog, Cat, White Bear, Horse, Cow, Domestic Sheep, Wild boar, Opossum, Platypus. The column order and the cell values did not survive extraction.]

Conclusions

We developed a process for characterizing cross-species conservation of genes and proteins for mammals, and finding …

Future work

Future work includes:

  1. Fully automate the process
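The Homologene matrix step described under Materials and methods (report value 1 for each species that has a homolog of a pathway gene) can be sketched as follows. This is a minimal illustration in Python, not the authors' Perl implementation. The function names, the input shape (a mapping from gene symbol to the set of species in its homolog group), and the toy gene names GENE_A and GENE_B are all assumptions made for the example; no real lookup against Homologene is performed. Rows are genes and columns are species, following the matrices shown on the poster.

```python
from typing import Dict, List, Set

# The six mammals named in the Introduction, used here as matrix columns.
SPECIES: List[str] = [
    "Homo sapiens",
    "Pan troglodytes",
    "Mus musculus",
    "Rattus norvegicus",
    "Canis lupus familiaris",
    "Bos taurus",
]


def build_presence_matrix(
    homologs: Dict[str, Set[str]], species: List[str]
) -> Dict[str, List[int]]:
    """Return one row per gene: 1 if the species has a homolog, else 0."""
    return {
        gene: [1 if sp in found else 0 for sp in species]
        for gene, found in homologs.items()
    }


def conserved_in_all(matrix: Dict[str, List[int]]) -> List[str]:
    """List the genes whose row is 1 in every species column."""
    return [gene for gene, row in matrix.items() if all(row)]


def to_tsv(matrix: Dict[str, List[int]], species: List[str]) -> str:
    """Render the matrix as tab-delimited text with a header row."""
    lines = ["\t".join(["Gene Symbol"] + species)]
    for gene, row in matrix.items():
        lines.append("\t".join([gene] + [str(v) for v in row]))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical homolog groups; real input would come from Homologene.
    toy_homologs = {
        "GENE_A": set(SPECIES),
        "GENE_B": {"Homo sapiens", "Mus musculus", "Rattus norvegicus"},
    }
    matrix = build_presence_matrix(toy_homologs, SPECIES)
    print(to_tsv(matrix, SPECIES))
    print("Conserved in all species:", conserved_in_all(matrix))
```

Keeping the matrix as plain rows of 0 and 1 makes it easy to write out as tab-delimited text and to hand to any heat-map plotting tool afterwards.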
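The "search for variations in *.aln files" step can be illustrated in the same spirit. The sketch below assumes the ClustalW alignment has already been parsed into a mapping from species name to gapped sequence of equal length; parsing the *.aln file itself is left out. The function names, the choice of one species as the reference, and the toy sequences are hypothetical and do not come from the poster. Positions are 1-based alignment columns, not residue numbers in any one protein.

```python
from typing import Dict, List, Tuple

Variation = Tuple[int, str, str, str]  # (column, reference residue, species, residue)


def find_variations(aligned: Dict[str, str], reference: str) -> List[Variation]:
    """Report every alignment column where a species differs from the reference."""
    ref_seq = aligned[reference]
    if any(len(seq) != len(ref_seq) for seq in aligned.values()):
        raise ValueError("aligned sequences must all have the same length")

    variations: List[Variation] = []
    for col, ref_res in enumerate(ref_seq, start=1):
        for name, seq in aligned.items():
            res = seq[col - 1]
            if name != reference and res != ref_res:
                variations.append((col, ref_res, name, res))
    return variations


def variations_to_tsv(variations: List[Variation]) -> str:
    """Render variations as tab-delimited text with a header row."""
    lines = ["column\treference\tspecies\tresidue"]
    for col, ref_res, name, res in variations:
        lines.append(f"{col}\t{ref_res}\t{name}\t{res}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy gapped sequences; '-' marks an alignment gap.
    toy_alignment = {
        "human": "MKT-LV",
        "mouse": "MKTALV",
        "dog": "MRT-LV",
    }
    print(variations_to_tsv(find_variations(toy_alignment, "human")))
```

A gap in the reference is treated like any other character here, so an insertion in another species shows up as a difference against '-'. Whether that is the desired behavior depends on how insertions should be reported.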