Traditional microbial genome sequencing relies on clonal cultures, but genomics now faces a new challenge: metagenomic analysis. Within the next few years metagenomics will probably be used in clinical diagnostic settings, where it has the potential to revolutionize pathogen detection in public health laboratories by allowing the simultaneous detection of all microorganisms in a clinical sample. For viruses, unbiased high-throughput sequencing is useful for directly detecting pathogenic viruses without prior genetic information. The use of metagenomics for virus discovery in clinical samples has opened new opportunities for understanding the aetiology of unexplained illness. For bacteria, it is worth remembering that only a small fraction of the phylogenetic diversity of Bacteria and Archaea is represented by cultivated organisms. Hence, metagenomics will probably serve to identify new pathogens and new infections caused by microbial consortia. In chronic infections, metagenomics will provide information about the relevance of biofilms and other forms of bacterial organization that may be important in such infections. As an example, metagenomic studies of Mycobacterium infections have revealed multiple, previously undetected strains in the same patient. Microbiome analysis has been one of the most important applications of metagenomics.
Two major strategies have been applied in recent years for bacterial metagenomics: 16S and shotgun metagenomics. 16S metagenomics reports microbial diversity and the relative abundance of species and other taxa. Shotgun metagenomics is a far more data-intensive approach that can describe the functional profile of the genes present in the sample and can even yield assembled genomes if the sample is not very complex.
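To make "diversity and relative abundance" concrete, the following minimal sketch computes both from a table of read counts per taxon. The taxon names and counts are invented for illustration, and the Shannon index is used here as one common diversity measure; the text above does not say which measure a given 16S pipeline reports.

```python
import math


def relative_abundance(counts):
    """Return each taxon's share of the total assigned reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any taxon")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with p > 0."""
    proportions = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in proportions if p > 0)


# Hypothetical read counts per genus (illustrative values only).
counts = {"Bacteroides": 500, "Prevotella": 300, "Escherichia": 200}

abundances = relative_abundance(counts)
diversity = shannon_index(counts)
```

A sample dominated by a single taxon gives an index close to 0, while an even spread of reads over many taxa gives a higher value.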
Metagenomics has brought new challenges to bioinformatics. Cloud computing can address the problem of massive data analysis by providing scalable, real-time, on-demand computing for metagenomics data. However, cloud computing infrastructure is not easy to manage, and publicly available software is needed to extend the use of the cloud to the analysis of huge metagenomics data sets.
MG7 is a new system for analyzing metagenomic reads. It uses cloud computing to parallelize the BLAST similarity searches on which the inference of function and the assignment of taxonomic origin are based. A distinctive feature of MG7 is its use of a non-relational database model: a graph database stores the results of the analysis and facilitates querying and access to the data, which are organized in the hierarchical structure of the taxonomy tree. MG7 is an open source project licensed under the AGPLv3.
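The benefit of organizing results along the taxonomy tree can be illustrated with a small sketch. This is not MG7's code or database schema: the toy taxonomy, the names `PARENT`, `lineage` and `cumulative_counts`, and the read counts are all assumptions made for illustration, with a plain in-memory dictionary standing in for the graph database. The idea shown is that reads assigned directly to a taxon also count toward each of its ancestors, so a query at any rank can return an accumulated total.

```python
from collections import defaultdict

# Hypothetical toy taxonomy, stored as child -> parent edges.
# The root has no parent.
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Salmonella": "Proteobacteria",
    "Firmicutes": "Bacteria",
}


def lineage(taxon, parent=PARENT):
    """Return the path from a taxon up to the root, taxon first."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = parent[taxon]
    return path


def cumulative_counts(direct, parent=PARENT):
    """Add each taxon's direct read count to itself and all its ancestors."""
    totals = defaultdict(int)
    for taxon, n in direct.items():
        for node in lineage(taxon, parent):
            totals[node] += n
    return dict(totals)


# Illustrative numbers of reads assigned directly to each taxon.
direct = {"Escherichia": 40, "Salmonella": 10, "Firmicutes": 25, "Proteobacteria": 5}
totals = cumulative_counts(direct)
```

In a graph database the same parent edges would be stored as relationships between taxon nodes, so the upward walk becomes a traversal query rather than a loop over a dictionary.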