Genomics and proteomics are closely related fields of research. An understanding of one is generally required for the other, yet in many ways the methods used to study the two could hardly be more different. With the emergence of massively parallel sequencing, vast quantities of genomics and transcriptomics data are being generated. At the same time, improvements in mass spectrometry technologies are enabling proteins to be identified with greater specificity and sensitivity. Together, these advances provide new opportunities to integrate genomics and proteomics data and to understand how the two can complement each other to advance biological knowledge. Using HeLa cells as a model system, we have comprehensively examined the gene models derived from genomics and transcriptomics data and integrated them with proteomics and phosphoproteomics datasets. Reanalysis of proteomics data using HeLa-specific gene models significantly increases the number of peptides and proteins identified, providing new insights into both the genome and the proteome of HeLa cells. Technical challenges and methods required for integrating genomics and proteomics data will also be discussed. In summary, given that massively parallel sequencing data are now available for many popular cell lines in public data repositories, our study provides further support for the need for, and benefit of, integrative analysis of genome and proteome data.
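To illustrate why a cell-line-specific gene model can raise the number of identifiable peptides, the following is a minimal sketch rather than the pipeline used in this study. It assumes a simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages) and a single amino-acid substitution. The protein sequence, the substitution, and all function names are invented for illustration.

```python
def tryptic_peptides(protein, min_len=6):
    """In silico digest: cleave after K/R unless followed by P.

    Simplified rule with no missed cleavages; peptides shorter than
    min_len are dropped, as very short peptides are rarely informative.
    """
    peptides, start = set(), 0
    last = len(protein) - 1
    for i, aa in enumerate(protein):
        followed_by_proline = i < last and protein[i + 1] == "P"
        cleave = aa in "KR" and not followed_by_proline
        if cleave or i == last:
            peptide = protein[start:i + 1]
            if len(peptide) >= min_len:
                peptides.add(peptide)
            start = i + 1
    return peptides


def apply_substitution(protein, position, ref, alt):
    """Apply a single amino-acid substitution (position is 1-based)."""
    if protein[position - 1] != ref:
        raise ValueError("reference residue does not match the sequence")
    return protein[:position - 1] + alt + protein[position:]


def variant_specific_peptides(reference, variant):
    """Peptides present only in the variant digest.

    A search against the reference database alone cannot match these.
    """
    return tryptic_peptides(variant) - tryptic_peptides(reference)


# Hypothetical toy protein and substitution (V10M), not real HeLa data.
reference = "MASLTEKGDVLLPEAQRSTFNWDGK"
variant = apply_substitution(reference, 10, "V", "M")
print(sorted(variant_specific_peptides(reference, variant)))
# prints ['GDMLLPEAQR']
```

In this toy case the substituted peptide exists only in the sample-specific database, so spectra arising from it would go unassigned in a search restricted to the reference sequence. A real workflow would also need to handle insertions, deletions, novel splice junctions, and false discovery rate control for the enlarged search space.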