Syllabus
• Unit 1
• Introduction to Bioinformatics: Definition - Computational
Biology; Biological Data Acquisition: The form of biological
information. Retrieval methods for DNA sequence, protein
sequence, and protein structure information; Databases –
Format and Annotation: Conventions for database indexing
and specification of search terms, Common sequence file
formats. Annotated sequence databases - primary sequence
databases, protein sequence, and structure databases;
Organism-specific databases; Data – Access
• Unit 2
• Biocomputing: Introduction to String Matching Algorithms.
Database Search Techniques - Local versus global. Sequence
Comparison and Alignment Techniques - Pairwise and
Multiple sequence alignment. Use of Scoring Matrices.
Dynamic programming algorithms: Needleman-Wunsch and
Smith-Waterman. Heuristic Methods of sequence alignment:
BLAST and PSI-BLAST. Multiple Sequence Alignment and
software tools for pairwise and multiple sequence alignment.
Phylogenetic analysis - PHYLIP.
• Unit 3
• Profiles, motifs, and feature identification using tools like
PROSITE. Automated Gene Prediction - ORF finding;
Visualization tool - PyMOL. Introduction to Signaling
Pathways. Machine Learning Methods in Bioinformatics -
Introduction to MATLAB.
Books to Refer
• Bioinformatics: Concepts, Skills & Applications – Rastogi
et al.
• Essential Bioinformatics – Xiong
• Developing Bioinformatics Computer Skills – Gibas &
Jambeck
• An Introduction to Bioinformatics Algorithms – Jones and
Pevzner
• Introduction to Bioinformatics – Krawetz & Womble
• Introduction to Bioinformatics – V. Kothekar
What is Bioinformatics?
[Figure: diagram labeled Statistics, Biology, and Computer Science]
Bioinformatics is an interdisciplinary scientific field that develops and uses
computational tools to collect, store, analyze, and interpret large amounts of
biological data, such as DNA, RNA, and protein sequences, to understand living
systems and disease.
It harnesses computer science, mathematics, physics, and biology.
Computer Science
Bioinformatics applies techniques from machine learning, data mining, AI,
optimization, visualization, and simulation, and develops new techniques
as required.
Computational Biology
• Bioinformatics is limited to sequence, structural, and functional analysis of
genes and genomes, and related statistics.
• Computational biology encompasses all biological areas that involve model
building, simulations, and theoretical methods.
• E.g., mathematical modelling of population dynamics, ADME (absorption,
distribution, metabolism, and excretion), and organ functioning
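As a rough illustration of the "model building and simulation" side of computational biology, population dynamics can be sketched with the logistic growth model dN/dt = rN(1 - N/K). The model choice, the function name `logistic_growth`, and all parameter values below are assumptions made for this example, not part of the course material.

```python
def logistic_growth(n0, r, k, dt, steps):
    """Simulate dN/dt = r * N * (1 - N / K) with forward Euler steps.

    n0: initial population size, r: growth rate,
    k: carrying capacity, dt: time step, steps: number of steps.
    """
    sizes = [n0]
    n = n0
    for _ in range(steps):
        # Growth slows down as the population approaches the carrying capacity.
        n = n + dt * r * n * (1 - n / k)
        sizes.append(n)
    return sizes


# Hypothetical parameters chosen only for illustration.
trajectory = logistic_growth(n0=10.0, r=0.5, k=1000.0, dt=0.1, steps=400)
print(f"start={trajectory[0]:.1f}, end={trajectory[-1]:.1f}")
```

With a small time step the simulated population rises monotonically and levels off near the carrying capacity `k`.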
Biology
• Bioinformatics involves the application of computational and statistical
techniques to the analysis and interpretation of biological data.
• Various types of biological data are used in bioinformatics, providing
insights into the structure, function, and relationships of biological entities.
• Genomic Data
• Transcriptomic Data
• Proteomic Data
• Metabolomic Data
• Structural Data
• Functional Genomics Data
• Phylogenetic Data
• Biological Literature and Annotations
Omics in Bioinformatics
• Bioinformatics plays a crucial role in processing, analyzing, and interpreting
the massive data sets generated by high-throughput techniques.
• Genomics: study of the entire genome, i.e. the complete set of an organism's
DNA including all its genes. Data: DNA sequences, genome assemblies, gene
annotations, and SNPs
• Transcriptomics: study of all RNA molecules (gene expression and regulation).
Data: RNA sequencing (RNA-Seq) data, microarray data, and information on
alternative splicing and isoform expression
• Proteomics: study of the entire set of proteins. Data: mass spectrometry data,
protein-protein interaction networks, and protein structural information
• Metabolomics: study of metabolites in biological systems. Data: mass
spectrometry and nuclear magnetic resonance (NMR) spectroscopy data on
metabolite concentrations and profiles
• Epigenomics: study of heritable changes in gene function that do not involve
alterations to the underlying DNA sequence. Data: DNA methylation, histone
modifications, and chromatin structure
• Pharmacogenomics: study of how genetic variation affects individual responses
to drugs, aiming to personalize medicine. Data: genetic variations relevant to
drug metabolism, efficacy, and adverse reactions
• Metagenomics: study of the structure and function of all nucleotide sequences
isolated and analyzed from all the organisms (typically microbes) in a bulk
sample. Data: DNA sequences from mixed microbial populations, functional gene
annotations, and taxonomic information
• Immunomics: study of the immune system. Data: immune system components,
antibody-antigen interactions, vaccine generation, and immune cell signaling
• Interactomics: study of the interactions between biomolecules, such as
protein-protein interactions, to understand cellular functions and signaling
pathways. Data: protein-protein interaction networks, signaling pathway data,
and information on molecular interactions
Data Acquisition
• Systematic collection of biological data from various sources for analysis,
interpretation, and further investigation.
• Stages of data acquisition:
• Data Generation: Obtaining raw biological data through experimental methods
(sequencing, microarrays, X-ray diffraction, mass spectrometry, etc.)
• Data Retrieval: Collecting existing biological data from public repositories
(genomic, proteomic, and expression data).
• Data Integration: Combining data from multiple sources to create a unified
dataset (e.g., composite databases).
• Data Cleaning and Preprocessing: Preparing data for analysis by handling
missing values and outliers and by normalizing values.
• Data Annotation: Categorizing, describing, or labeling data.
• Metadata Collection: Recording additional contextual information (e.g.,
patient metadata).
• Data Storage and Sharing: Storing acquired data in a structured and accessible
manner.
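The cleaning and preprocessing stage listed above can be sketched in a few lines of Python. This is a minimal example under stated assumptions: missing values are marked as `None` and imputed with the median, outliers are removed with the 1.5 × IQR rule, and the remaining values are min-max scaled. The function name `clean_and_normalize` and the sample values are hypothetical, and real pipelines often use other rules.

```python
import statistics


def clean_and_normalize(values):
    """Impute missing values, drop IQR outliers, then min-max scale to [0, 1]."""
    # 1. Missing values: replace None with the median of the observed values.
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    filled = [median if v is None else v for v in values]

    # 2. Outliers: keep values within 1.5 * IQR of the quartiles.
    q1, _, q3 = statistics.quantiles(filled, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [v for v in filled if low <= v <= high]
    outliers = [v for v in filled if not (low <= v <= high)]

    # 3. Normalization: rescale the kept values to the range [0, 1].
    lo, hi = min(kept), max(kept)
    span = (hi - lo) or 1.0  # avoid division by zero if all values are equal
    normalized = [(v - lo) / span for v in kept]
    return normalized, outliers


# Made-up expression measurements with one missing value and one extreme value.
raw_expression = [5.1, 4.8, None, 5.3, 50.0, 4.9, 5.0]
normalized, outliers = clean_and_normalize(raw_expression)
print("normalized:", normalized)
print("outliers:", outliers)
```

Whether to impute, drop, or flag problematic values depends on the experiment. The order used here (impute, filter, scale) is one common choice, not a fixed rule.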
Types of DNA sequences and gene data
• Genomic DNA: the entire genome
• cDNA: complementary DNA synthesized from a mature mRNA using reverse
transcriptase (used for making copies, PCR, and functional genomics)
• Recombinant DNA: artificially created DNA (cloning, GMOs, and
transgenic animals)
• ESTs (Expressed Sequence Tags): short sub-sequences of transcribed DNA
• GSSs (Genome Survey Sequences): short sub-sequences of genomic DNA
origin (dbGSS)
• SNPs (single nucleotide polymorphisms)
• Gene-gene associations
• Gene-disease associations
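To make the cDNA entry in the list above concrete, the sketch below derives the first-strand cDNA sequence from an mRNA sequence. Reverse transcriptase synthesizes a DNA strand complementary to the mRNA, so when both sequences are written 5' to 3', the cDNA is the reverse complement of the mRNA with T in place of U. The function name `first_strand_cdna` and the example sequence are hypothetical.

```python
# Base-pairing rules for an RNA template copied into DNA.
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """Return the first-strand cDNA (5'->3') for an mRNA written 5'->3'."""
    # Read the mRNA from its 3' end and pair each base with its DNA complement.
    return "".join(COMPLEMENT[base] for base in reversed(mrna.upper()))


mrna = "AUGGCC"  # toy mRNA fragment, 5'->3'
print(first_strand_cdna(mrna))  # GGCCAT
```

A second-strand synthesis step would give a DNA copy with the same sequence as the mRNA, with U replaced by T.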