SRI KRISHNA ARTS AND SCIENCE
COLLEGE
ASMITHA – 20BBT010
Subject Code : 20BSU04
INTRODUCTION TO BIOINFORMATICS
DATABASES
TO UNDERSTAND COMPUTATIONAL BIOLOGY
CONTENTS
 Definition
 History of Bioinformatics
 Components of Bioinformatics
 Applications of Bioinformatics
DATABASES
 Definition
 Types of databases
 Primary database
 Secondary database
 Composite database
 Applications of database
DEFINITION OF BIOINFORMATICS
 Bioinformatics includes biological studies that use computer
programming as part of their methodology, as well as specific
analysis “pipelines” that are repeatedly used, particularly in the
field of genomics.
 Bioinformatics tools aid in comparing, analyzing and interpreting
genetic and genomic data and more generally in the
understanding of evolutionary aspects of molecular biology.
HISTORY OF BIOINFORMATICS
 Paulien Hogeweg and Ben Hesper coined the term "bioinformatics" in 1970 to refer to the
study of information processes in biotic systems.
 This definition placed bioinformatics as a field parallel to biochemistry
(the study of chemical processes in biological systems).
 Another early contributor to bioinformatics was Elvin A. Kabat, who
pioneered biological sequence analysis in 1970 and later released
comprehensive volumes of antibody sequences with Tai Te
Wu between 1980 and 1991.
COMPONENTS OF BIOINFORMATICS
1. Creation of databases:
This involves the organization, storage and management of biological data sets.
2. Development of algorithms and statistics:
This involves the development of tools and resources to determine the relationships among the members of large data
sets.
3. Analysis of data and interpretation:
This covers DNA, RNA and protein sequences, protein structures, gene expression profiles and biochemical pathways.
APPLICATIONS OF BIOINFORMATICS
 Sequence mapping of biomolecules (DNA, RNA, proteins).
 Identification of nucleotide sequences of functional genes.
 Finding sites that can be cut by restriction enzymes.
 Designing primer sequences for the polymerase chain
reaction (PCR).
 Development of models for the functioning of various cells,
tissues and organs.
 Prediction of functional gene products.
 Tracing the evolutionary trees of genes.
 Prediction of the 3-dimensional structure of proteins.
 Molecular modelling of biomolecules.
 Designing drugs for medical treatment.
 Handling vast amounts of biological data that could not
otherwise be managed.
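One of the applications listed above, finding sites that can be cut by restriction enzymes, can be illustrated with a short Python sketch. It assumes a simple exact-match search on one strand; the function name `find_sites` and the example DNA string are made up for illustration, while GAATTC is the recognition sequence of the enzyme EcoRI.

```python
def find_sites(dna: str, site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of `site` in `dna`."""
    dna, site = dna.upper(), site.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        # Step one base forward so overlapping occurrences are also found.
        start = dna.find(site, start + 1)
    return positions


ECORI = "GAATTC"  # recognition sequence of EcoRI
example_dna = "AAGAATTCGGCTTGAATTCA"  # made-up sequence, for illustration only

ecori_positions = find_sites(example_dna, ECORI)
print(ecori_positions)  # [2, 13]
```

Real tools also search the complementary strand and handle ambiguous bases; this sketch leaves both out to keep the idea visible.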
DATABASE
 Biological databases are libraries of life-science information, collected from scientific experiments,
published literature, high-throughput experiment technology, and computational analysis.
 They hold biological data such as protein sequences, molecular structures and
DNA sequences in an organized form.
 Software tools allow the data to be manipulated, for example by inserting, updating or deleting records.
Scientists and researchers from all over the world enter their experimental data and results in a biological
database so that they are available to a wider audience.
 Many biological databases are free to use and contain a huge collection of a variety of biological data.
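The insert, update and delete operations mentioned above can be sketched with Python's built-in `sqlite3` module. This is only a toy illustration: the table name `sequences`, its columns, and the accession codes `DEMO001` and `DEMO002` are hypothetical and do not reflect the schema of any real biological database.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences ("
    "accession TEXT PRIMARY KEY, organism TEXT, sequence TEXT)"
)

# Insert: two made-up records.
conn.execute(
    "INSERT INTO sequences VALUES (?, ?, ?)",
    ("DEMO001", "Escherichia coli", "ATGGCC"),
)
conn.execute(
    "INSERT INTO sequences VALUES (?, ?, ?)",
    ("DEMO002", "Homo sapiens", "ATGTTT"),
)

# Update: correct the sequence stored for one record.
conn.execute(
    "UPDATE sequences SET sequence = ? WHERE accession = ?",
    ("ATGGCCTAA", "DEMO001"),
)

# Delete: remove the other record.
conn.execute("DELETE FROM sequences WHERE accession = ?", ("DEMO002",))

remaining = conn.execute(
    "SELECT accession, sequence FROM sequences"
).fetchall()
print(remaining)  # [('DEMO001', 'ATGGCCTAA')]
```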
TYPES OF DATABASE
 Biological databases fall into three basic
types:
 1. Primary database
 2. Secondary database
 3. Composite database
PRIMARY DATABASE
 Primary databases are also called archival databases.
 They are populated with experimentally derived data such as nucleotide sequences, protein
sequences or macromolecular structures.
 Experimental results are submitted directly into the database by researchers, and the data are
essentially archival in nature.
Examples –
Nucleic acid databases: GenBank and DDBJ
Protein databases: PDB, Swiss-Prot, PIR, TrEMBL, MetaCyc, etc.
SECONDARY DATABASE
 The data stored in these databases are the analyzed results of primary databases.
Computational algorithms are applied to the primary data, and the meaningful and informative
results are stored in the secondary database.
 The data here are highly curated (that is, processed and checked before being presented in the database).
Because of this curation, a secondary database typically offers more interpreted knowledge than the primary
database it is derived from.
Examples –
InterPro (protein families, motifs, and domains)
UniProt Knowledgebase (sequence and functional information on proteins)
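The idea of deriving secondary data by running an algorithm over primary records can be sketched as follows. The two "primary" records and their accession codes are made up, and the dictionary layout is an assumption for illustration; the pattern searched for is the N-glycosylation motif N-{P}-[ST]-{P}, written here as a regular expression.

```python
import re

# Hypothetical primary records: accession -> raw protein sequence.
primary_records = {
    "DEMO_P1": "MKNGSAVLNPTQ",
    "DEMO_P2": "MAAALLKK",
}

# N-glycosylation motif N-{P}-[ST]-{P}; the lookahead lets overlapping
# occurrences be reported as well.
MOTIF = re.compile(r"(?=(N[^P][ST][^P]))")


def annotate(records: dict[str, str]) -> dict[str, dict]:
    """Build derived (secondary) entries from raw (primary) sequences."""
    secondary = {}
    for accession, sequence in records.items():
        hits = [match.start() for match in MOTIF.finditer(sequence)]
        secondary[accession] = {
            "length": len(sequence),
            "motif_positions": hits,  # 0-based start positions
        }
    return secondary


secondary_db = annotate(primary_records)
print(secondary_db)
```

In `DEMO_P1` the motif is found at position 2 (NGSA); the later "NPT" is skipped because proline is excluded at the second position.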
COMPOSITE DATABASE
 The data entered in these databases are first compared and then filtered based on the desired
criteria.
 The initial data are taken from primary databases and then merged based on
certain conditions.
 Composite databases contain non-redundant data, which makes sequence searches faster.
Examples –
OWL, NRDB and Swiss-Prot + TrEMBL
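The merge-and-filter step described above can be sketched in a few lines of Python. The source names `SourceA` and `SourceB`, the accession codes and the sequences are all hypothetical, and treating "redundant" as "identical sequence, ignoring case" is an assumption made for this sketch; real composite databases apply their own criteria.

```python
def build_composite(*sources):
    """Merge records from several sources, keeping each distinct sequence once.

    Each source is a (name, {accession: sequence}) pair. The result maps every
    distinct sequence to the list of source entries that contained it.
    """
    composite = {}
    for source_name, records in sources:
        for accession, sequence in records.items():
            key = sequence.upper()  # compare sequences case-insensitively
            composite.setdefault(key, []).append(f"{source_name}:{accession}")
    return composite


source_a = ("SourceA", {"A1": "MKTAYIAK", "A2": "MLLPQR"})
source_b = ("SourceB", {"B7": "mktayiak", "B8": "MGGHH"})

composite_db = build_composite(source_a, source_b)
for sequence, origins in composite_db.items():
    print(sequence, origins)
```

Four input records collapse to three entries, because `A1` and `B7` carry the same sequence. A search now scans three sequences instead of four, which is the speed benefit the slide refers to.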
APPLICATIONS OF DATABASE
 The sequencing of genomes and proteins aroused interest in bioinformatics and created the need for
databases of biological sequences.
 These data are processed into useful knowledge by data mining before being stored in databases.
 Besides primary, secondary and composite databases, many specialized biological databases exist for
RNA molecules, protein-protein interactions, genome information, metabolic pathways, phylogenetic information, etc.
 Present biological databases also have drawbacks.
 Bioinformatics tools that draw on these databases are used for gene prediction, sequence analysis,
phylogenetic analysis, and the prediction of protein structure, function and molecular interactions.
Their purposes include discovering new genes and conserved regions in protein families, estimating
evolutionary relationships among organisms, predicting the 3D structure of drug targets to explore
mechanisms and discover new drugs, and studying protein-protein interactions to explore signaling pathways.
TO UNDERSTAND COMPUTATIONAL BIOLOGY
COMPUTATIONAL BIOLOGY
 Computational biology is an interdisciplinary field that develops and applies computational methods to analyse
large collections of biological data, such as genetic sequences, cell populations or protein samples, to make new
predictions or discover new biology.
 Computational biology is concerned with solutions to issues that have been raised by studies in
bioinformatics. Both disciplines are generally considered facets of the rapidly expanding fields of data science
and biotechnology.
 Computational biology is useful in scientific research, including the examination of how proteins interact with
each other through the simulation of protein folding, motion, and interaction.
 While computational biology emphasizes the development of theoretical methods, computational simulations,
and mathematical modeling, bioinformatics emphasizes informatics and statistics.
 Though the two fields are interrelated, bioinformatics and computational biology differ in the kinds of needs
they address.
COMPUTATIONAL BIOLOGY IN
BIOINFORMATICS
 Together, computational biology and bioinformatics form an interdisciplinary field
that develops and applies computational methods to analyse large
collections of biological data, such as genetic sequences, cell
populations or protein samples, to make new predictions or discover
new biology.
 The key difference between the two is that bioinformatics is a
multidisciplinary field that combines biological knowledge with
computer programming and large data sets, while computational
biology is a multidisciplinary field that uses computer science,
statistics, and mathematics to help solve problems in biology.
APPLICATIONS OF COMPUTATIONAL
BIOLOGY
 Functional prediction involves assessing the sequence and structural similarity between an
unknown and a known protein and analyzing the proteins’ interactions with other molecules. Such
analyses may be extensive, and thus computational biology has become closely aligned with
systems biology, which attempts to analyze the workings of large interacting networks of biological
components, especially biological pathways.
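The sequence-similarity part of functional prediction can be illustrated with a minimal sketch. It assumes the sequences are already aligned to the same length with no gaps, uses plain percent identity as the similarity score, and applies an arbitrary 70% cut-off; the protein labels and sequences are made up. Real analyses rely on proper alignment tools and also consider structure and molecular interactions.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def predict_function(query, known_proteins, threshold=70.0):
    """Return (label, score) of the most similar known protein.

    The label is None when no known protein reaches the threshold
    (70% is an arbitrary choice for this sketch).
    """
    best_label, best_score = None, 0.0
    for label, sequence in known_proteins.items():
        score = percent_identity(query, sequence)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Hypothetical annotated proteins and an unknown query.
known = {
    "kinase-like": "MKTAYIAKQR",
    "transporter-like": "MLLPQRSTVW",
}
query = "MKTAYLAKQR"

prediction = predict_function(query, known)
print(prediction)  # ('kinase-like', 90.0)
```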
THANK YOU
