SlideShare a Scribd company logo
Bioinformatics
Dr. Shailendra Singh Khichi
What is Bioinformatics
• Bioinformatics is an interdisciplinary research
area at the interface between computer science
and biological science
• Luscombe et al. in defining bioinformatics as a
union of biology and informatics:
• Bioinformatics uses computers for storage,
retrieval, manipulation, and distribution of
information related to biological
macromolecules such as DNA, RNA, and
proteins
• Bioinformatics differs from a related field known as
computational biology.
• Bioinformatics is limited to sequence, structural, and
functional analysis of genes and genomes and their
corresponding products.
• However, computational biology encompasses all
biological areas that involve computation.
• For example, mathematical modeling of ecosystems,
population dynamics, application of the game theory in
behavioral studies, and phylogenetic construction using
fossil records all employ computational tools, but do not
necessarily involve biological macromolecules.
• The first sequence alignment algorithm was
developed by Needleman and Wunsch in 1970.
• This was a fundamental step in the development of
the field of bioinformatics, which paved the way for
the routine sequence comparisons and database
searching practiced by modern biologists.
• The first protein structure prediction algorithm was
developed by Chou and Fasman in 1974.
Goals
• The ultimate goal of bioinformatics is to better understand
a living cell and how it functions at the molecular level.
• By analyzing raw molecular sequence and structural data,
bioinformatics research can generate new insights and
provide a “global” perspective of the cell.
• Cellular functions are mainly performed by proteins whose
capabilities are ultimately determined by their sequences.
• Therefore, solving functional problems using sequence and
sometimes structural approaches has proved to be a fruitful
endeavor.
Scope
• Bioinformatics consists of two subfields:
1. The development of computational tools and
databases and
2. Application of these tools and databases in
generating biological knowledge to better
understand living systems.
• The tool development includes: writing software
for sequence, structural, and functional analysis, as
well as the construction and curating (organizing) of
biological databases.
• Tools are used in 3 areas of genomic and molecular biological
research:
1. molecular sequence analysis,
2. molecular structural analysis,
3. and molecular functional analysis.
1. Sequence analysis: sequence alignment, sequence database
searching, motif and pattern discovery, gene and promoter finding,
reconstruction of evolutionary relationships, and genome assembly
and comparison.
2. Structural analyses: protein and nucleic acid structure analysis,
comparison, classification, and prediction.
3. Functional analyses: gene expression profiling, protein–protein
interaction prediction, protein subcellular localization prediction,
metabolic pathway reconstruction, and simulation
• These 3 aspects are integrated with each other
• For eg: protein structure prediction depends on
sequence alignment data;
• clustering of gene expression profiles requires the use
of phylogenetic tree construction methods derived in
sequence analysis.
• Sequence-based promoter prediction is related to
functional analysis of coexpressed genes.
• Gene annotation include
1.Distinction between coding and noncoding sequences,
2. Identification of translated protein sequences,
3. Determination of the gene’s evolutionary relationship
with other known genes.
Applications
• knowledge-based drug design,
• forensic DNA analysis, and
• Agricultural biotechnology.
• Computational studies of protein–ligand interactions provide a rational basis
for the rapid identification of novel leads for synthetic drugs.
• Knowledge of the 3D structures of proteins allows molecules to be designed
that are capable of binding to the receptor site of a target protein with great
affinity and specificity.
• This informatics-based approach significantly reduces the tie and cost
necessary to develop drugs with higher poteny fewer side effects, and less
toxicity than using the traditional trial-and-error approach.
LIMITATIONS
• Bioinformatics depends on experimental science to
produce raw data for analysis.
• It, in turn, provides useful interpretation of
experimental data and important leads for further
experimental research.
• The quality of bioinformatics predictions depends on
the quality of data and the sophistication of the
algorithms being used
• Sequence data from high throughput analysis often
contain errors.
• If the sequences are wrong or annotations incorrect,
the results from the downstream analysis are
misleading as well.
Program Database Query Typical uses
BLASTN Nucleotide Nucleotide
Mapping oligonucleotides, cDNAs, and PCR
products to a genome; screening repetitive
elements; cross-species sequence exploration;
annotating genomic DNA; clustering sequencing
reads; vector clipping
BLASTP Protein Protein
Identifying common regions between proteins;
collecting related proteins for phylogenetic
analyses
BLASTX Protein
Nucleotide
translated into
protein
Finding protein-coding genes in genomic DNA;
determining if a cDNA corresponds to a known
protein
TBLASTN
Nucleotide
translated into
protein
Protein
Identifying transcripts, potentially from multiple
organisms, similar to a given protein; mapping a
protein to genomic DNA
TBLASTX
Nucleotide
translated into
protein
Nucleotide
translated into
protein
Cross-species gene prediction at the genome or
transcript level; searching for genes missed by
traditional methods or not yet in protein
databases
WHAT IS A DATABASE
A database is a computerized archive used to store and organize data in such a way
that information can be retrieved easily via a variety of search criteria.
Databases are composed of computer hardware and software for data management.
The objective of the development of a database is to organize data in a set of structured
records to enable easy retrieval of information.
Each record, also called an entry, should contain a number of fields that hold the actual
data items.
To retrieve a particular record from the database, a user can specify a particular piece of
information, called value, to be found in a particular field and expect the computer to
retrieve the whole data record. This process is called making a query.
Types of DATABASES Format
• Flat file
• Relational databases
• Object oriented database
1. Flat File format:
• Originally, databases all used a flat file format, which is a long text
file that contains many entries separated by a delimiter (І)
• The entire text file does not contain any hidden instructions for
computers to search for specific information or to create reports
based on certain fields from each record.
• The text file can be considered a single table
• To search a flat file for a particular piece of information, a computer
has to read through the entire file, an obviously inefficient process
• Relational Databases
• Instead of using a single table as in a flat file database,
relational databases use a set of tables to organize data.
• Each table, also called a relation, is made up of columns
and rows
• Columns represent individual fields. Rows represent values
in the fields of records.
• The columns in a table are indexed according to a common
feature called an attribute, so they can be cross-referenced
in other tables
• To execute a query in a relational database, the system
selects linked data items from different tables and
combines the information into one report.
• Therefore, specific information can be found more quickly
from a relational database than from a flat file database.
Relational Databases
• Object-Oriented Databases
• One of the problems with relational databases
is that the tables used do not describe
complex hierarchical relationships between
data items.
• To overcome the problem, object-oriented
databases have been developed that store
data as objects
• an object can be considered as a unit that
combines data and mathematical routines
that act on the data
Types of Biological Databases
• Biological databases can be roughly divided into three categories:
primary databases, secondary databases, and specialized databases.
Primary databases contain original biological data. They are archives
of raw sequence or structural data submitted by the scientific
community.
Eg: GenBank and Protein Data Bank (PDB)
Secondary databases contain computationally processed or manually
curated information, based on original information from primary
databases.
• Translated protein sequence databases containing functional
annotation belong to this category.
• Eg: SWISS-Prot and Protein Information Resources (PIR)
• Specialized databases are those that cater to a
particular research interest.
• For example, Flybase, HIV sequence database,
and Ribosomal Database Project are
databases that specialize in a particular
organism or a particular type of data.
Primary Databases
• There are 3 major public sequence databases that store raw nucleic acid
sequence data produced and submitted by researchers worldwide:
• GenBank,
• European Molecular Biology Laboratory (EMBL)
• DNA Data Bank of Japan (DDBJ)
• which are all freely available on the Internet.
• Most of the data in the databases are contributed directly by authors with
a minimal level of annotation.
• These three public databases closely collaborate and exchange new data
daily. They together constitute the International Nucleotide Sequence
Database Collaboration.
• Although the three databases all contain the same sets of raw data, each
of the individual databases has a slightly different kind of format to
represent the data.
• For the three-dimensional structures of biological macromolecules, there
is only one centralized database, the PDB.
• This database archives atomic coordinates of macromolecules (both
proteins and nucleic acids) determined by x-ray crystallography and NMR.
It uses a flat file format to represent protein name, authors, experimental
details, secondary structure, cofactors, and atomic coordinates.
• Secondary Databases
• Sequence annotation information in the primary database is often
minimal.
• To turn the raw sequence information into more sophisticated
biological knowledge, much post processing of the sequence
information is needed.
• which contain computationally processed sequence information
derived from the primary databases.
• some are simple archives of translated sequence data from
identified open reading frames in DNA, whereas others provide
additional annotation and information related to higher levels of
information regarding structure and functions.
• A prominent example of secondary databases is SWISS-PROT, which
provides detailed sequence annotation that includes structure,
function, and protein family assignment. The sequence data are
mainly derived from TrEMBL, a database of

More Related Content

What's hot

Introduction of bioinformatics
Introduction of bioinformaticsIntroduction of bioinformatics
Introduction of bioinformatics
Dr NEETHU ASOKAN
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
AkanshaChauhan15
 
NCBI
NCBINCBI
NCBI National Center for Biotechnology Information
NCBI National Center for Biotechnology InformationNCBI National Center for Biotechnology Information
NCBI National Center for Biotechnology Information
Thapar Institute of Engineering & Technology, Patiala, Punjab, India
 
Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics
Sachin Kumar
 
Applications of bioinformatics
Applications of bioinformaticsApplications of bioinformatics
Structural genomics
Structural genomicsStructural genomics
Structural genomics
Vaibhav Maurya
 
Express sequence tags
Express sequence tagsExpress sequence tags
Express sequence tags
Dhananjay Desai
 
Tools and database of NCBI
Tools and database of NCBITools and database of NCBI
Tools and database of NCBI
Santosh Kumar Sahoo
 
Protein database
Protein databaseProtein database
Protein database
Khalid Hakeem
 
Genomics(functional genomics)
Genomics(functional genomics)Genomics(functional genomics)
Genomics(functional genomics)
IndrajaDoradla
 
Sequence analysis - Bioinformatics
Sequence analysis - BioinformaticsSequence analysis - Bioinformatics
Sequence analysis - Bioinformatics
Pratik Parikh
 
Blast and fasta
Blast and fastaBlast and fasta
Blast and fasta
ALLIENU
 
Protein data bank
Protein data bankProtein data bank
Protein data bank
Alichy Sowmya
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
prateek kumar
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
Somdutt Sharma
 
Swiss PROT
Swiss PROT Swiss PROT
Proteins databases
Proteins databasesProteins databases
Proteins databases
Hafiz Muhammad Zeeshan Raza
 
UniProt
UniProtUniProt
UniProt
AmnaA7
 

What's hot (20)

Introduction of bioinformatics
Introduction of bioinformaticsIntroduction of bioinformatics
Introduction of bioinformatics
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
NCBI
NCBINCBI
NCBI
 
NCBI National Center for Biotechnology Information
NCBI National Center for Biotechnology InformationNCBI National Center for Biotechnology Information
NCBI National Center for Biotechnology Information
 
Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics
 
Applications of bioinformatics
Applications of bioinformaticsApplications of bioinformatics
Applications of bioinformatics
 
Structural genomics
Structural genomicsStructural genomics
Structural genomics
 
Express sequence tags
Express sequence tagsExpress sequence tags
Express sequence tags
 
Tools and database of NCBI
Tools and database of NCBITools and database of NCBI
Tools and database of NCBI
 
Protein database
Protein databaseProtein database
Protein database
 
Genomics(functional genomics)
Genomics(functional genomics)Genomics(functional genomics)
Genomics(functional genomics)
 
Sequence analysis - Bioinformatics
Sequence analysis - BioinformaticsSequence analysis - Bioinformatics
Sequence analysis - Bioinformatics
 
Blast and fasta
Blast and fastaBlast and fasta
Blast and fasta
 
Protein data bank
Protein data bankProtein data bank
Protein data bank
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Swiss PROT
Swiss PROT Swiss PROT
Swiss PROT
 
Protein Structure Prediction
Protein Structure PredictionProtein Structure Prediction
Protein Structure Prediction
 
Proteins databases
Proteins databasesProteins databases
Proteins databases
 
UniProt
UniProtUniProt
UniProt
 

Similar to Bioinformatics

Biological data base
Biological data baseBiological data base
Biological data base
kishoreGupta17
 
Bioinformatics__Lecture_1.ppt
Bioinformatics__Lecture_1.pptBioinformatics__Lecture_1.ppt
Bioinformatics__Lecture_1.ppt
sirwansleman
 
biological databases.pptx
biological databases.pptxbiological databases.pptx
biological databases.pptx
science lover
 
Bioinformatics مي.pdf
Bioinformatics  مي.pdfBioinformatics  مي.pdf
Bioinformatics مي.pdf
nedalalazzwy
 
Basics Of Bioinformatics .pptx
Basics Of Bioinformatics .pptxBasics Of Bioinformatics .pptx
Basics Of Bioinformatics .pptx
Mohdkaifkhan18
 
Introduction to bioinformatics
Introduction to bioinformaticsIntroduction to bioinformatics
Introduction to bioinformatics
maulikchaudhary8
 
Data retreival system
Data retreival systemData retreival system
Data retreival system
Shikha Thakur
 
Introduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASEIntroduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASE
PrashantSharma807
 
Biological data bioinformatics
Biological data bioinformatics Biological data bioinformatics
Biological data bioinformatics
AakifahAmreen
 
BioInformatics Tools -Genomics , Proteomics and metablomics
BioInformatics Tools -Genomics , Proteomics and metablomicsBioInformatics Tools -Genomics , Proteomics and metablomics
BioInformatics Tools -Genomics , Proteomics and metablomics
AyeshaYousaf20
 
Sequence submission tools ............pptx
Sequence submission tools ............pptxSequence submission tools ............pptx
Sequence submission tools ............pptx
Cherry
 
Presentation.pptx
Presentation.pptxPresentation.pptx
Presentation.pptx
AshuAsh15
 
Biological databases
Biological databases Biological databases
Biological databases
SEKHARREDDYAMBATI
 
Introduction to databases.pptx
Introduction to databases.pptxIntroduction to databases.pptx
Introduction to databases.pptx
sworna kumari chithiraivelu
 
Primary Bioinformatics Database.pptx
Primary Bioinformatics Database.pptxPrimary Bioinformatics Database.pptx
Primary Bioinformatics Database.pptx
Vandana Yadav03
 
BASIC OF BIOINFORMATICS.pptx
BASIC OF BIOINFORMATICS.pptxBASIC OF BIOINFORMATICS.pptx
BASIC OF BIOINFORMATICS.pptx
DevaprasadPanda
 
Bioinformatics (Exam point of view)
Bioinformatics (Exam point of view)Bioinformatics (Exam point of view)
Bioinformatics (Exam point of view)
Sijo A
 
Data Retrieval Systems
Data Retrieval SystemsData Retrieval Systems
Data Retrieval Systems
Saramita De Chakravarti
 
Introduction to Bioinformatics-1.pdf
Introduction to Bioinformatics-1.pdfIntroduction to Bioinformatics-1.pdf
Introduction to Bioinformatics-1.pdf
kigaruantony
 

Similar to Bioinformatics (20)

Biological data base
Biological data baseBiological data base
Biological data base
 
Bioinformatics__Lecture_1.ppt
Bioinformatics__Lecture_1.pptBioinformatics__Lecture_1.ppt
Bioinformatics__Lecture_1.ppt
 
biological databases.pptx
biological databases.pptxbiological databases.pptx
biological databases.pptx
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Bioinformatics مي.pdf
Bioinformatics  مي.pdfBioinformatics  مي.pdf
Bioinformatics مي.pdf
 
Basics Of Bioinformatics .pptx
Basics Of Bioinformatics .pptxBasics Of Bioinformatics .pptx
Basics Of Bioinformatics .pptx
 
Introduction to bioinformatics
Introduction to bioinformaticsIntroduction to bioinformatics
Introduction to bioinformatics
 
Data retreival system
Data retreival systemData retreival system
Data retreival system
 
Introduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASEIntroduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASE
 
Biological data bioinformatics
Biological data bioinformatics Biological data bioinformatics
Biological data bioinformatics
 
BioInformatics Tools -Genomics , Proteomics and metablomics
BioInformatics Tools -Genomics , Proteomics and metablomicsBioInformatics Tools -Genomics , Proteomics and metablomics
BioInformatics Tools -Genomics , Proteomics and metablomics
 
Sequence submission tools ............pptx
Sequence submission tools ............pptxSequence submission tools ............pptx
Sequence submission tools ............pptx
 
Presentation.pptx
Presentation.pptxPresentation.pptx
Presentation.pptx
 
Biological databases
Biological databases Biological databases
Biological databases
 
Introduction to databases.pptx
Introduction to databases.pptxIntroduction to databases.pptx
Introduction to databases.pptx
 
Primary Bioinformatics Database.pptx
Primary Bioinformatics Database.pptxPrimary Bioinformatics Database.pptx
Primary Bioinformatics Database.pptx
 
BASIC OF BIOINFORMATICS.pptx
BASIC OF BIOINFORMATICS.pptxBASIC OF BIOINFORMATICS.pptx
BASIC OF BIOINFORMATICS.pptx
 
Bioinformatics (Exam point of view)
Bioinformatics (Exam point of view)Bioinformatics (Exam point of view)
Bioinformatics (Exam point of view)
 
Data Retrieval Systems
Data Retrieval SystemsData Retrieval Systems
Data Retrieval Systems
 
Introduction to Bioinformatics-1.pdf
Introduction to Bioinformatics-1.pdfIntroduction to Bioinformatics-1.pdf
Introduction to Bioinformatics-1.pdf
 

Recently uploaded

Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
timhan337
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
Celine George
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
camakaiclarkmusic
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Atul Kumar Singh
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
vaibhavrinwa19
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
Peter Windle
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
Nguyen Thanh Tu Collection
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 
Francesca Gottschalk - How can education support child empowerment.pptx
Francesca Gottschalk - How can education support child empowerment.pptxFrancesca Gottschalk - How can education support child empowerment.pptx
Francesca Gottschalk - How can education support child empowerment.pptx
EduSkills OECD
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
heathfieldcps1
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
Delapenabediema
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Jisc
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
Special education needs
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
Jean Carlos Nunes Paixão
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
Levi Shapiro
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
Peter Windle
 

Recently uploaded (20)

Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
 
Francesca Gottschalk - How can education support child empowerment.pptx
Francesca Gottschalk - How can education support child empowerment.pptxFrancesca Gottschalk - How can education support child empowerment.pptx
Francesca Gottschalk - How can education support child empowerment.pptx
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
 

Bioinformatics

  • 2. What is Bioinformatics • Bioinformatics is an interdisciplinary research area at the interface between computer science and biological science • Luscombe et al. in defining bioinformatics as a union of biology and informatics: • Bioinformatics uses computers for storage, retrieval, manipulation, and distribution of information related to biological macromolecules such as DNA, RNA, and proteins
  • 3. • Bioinformatics differs from a related field known as computational biology. • Bioinformatics is limited to sequence, structural, and functional analysis of genes and genomes and their corresponding products. • However, computational biology encompasses all biological areas that involve computation. • For example, mathematical modeling of ecosystems, population dynamics, application of the game theory in behavioral studies, and phylogenetic construction using fossil records all employ computational tools, but do not necessarily involve biological macromolecules.
  • 4. • The first sequence alignment algorithm was developed by Needleman and Wunsch in 1970. • This was a fundamental step in the development of the field of bioinformatics, which paved the way for the routine sequence comparisons and database searching practiced by modern biologists. • The first protein structure prediction algorithm was developed by Chou and Fasman in 1974.
  • 5. Goals • The ultimate goal of bioinformatics is to better understand a living cell and how it functions at the molecular level. • By analyzing raw molecular sequence and structural data, bioinformatics research can generate new insights and provide a “global” perspective of the cell. • Cellular functions are mainly performed by proteins whose capabilities are ultimately determined by their sequences. • Therefore, solving functional problems using sequence and sometimes structural approaches has proved to be a fruitful endeavor.
  • 6. Scope • Bioinformatics consists of two subfields: 1. The development of computational tools and databases and 2. Application of these tools and databases in generating biological knowledge to better understand living systems. • The tool development includes: writing software for sequence, structural, and functional analysis, as well as the construction and curating (organizing) of biological databases.
  • 7. • Tools are used in 3 areas of genomic and molecular biological research: 1. molecular sequence analysis, 2. molecular structural analysis, 3. and molecular functional analysis. 1. Sequence analysis: sequence alignment, sequence database searching, motif and pattern discovery, gene and promoter finding, reconstruction of evolutionary relationships, and genome assembly and comparison. 2. Structural analyses: protein and nucleic acid structure analysis, comparison, classification, and prediction. 3. Functional analyses: gene expression profiling, protein–protein interaction prediction, protein subcellular localization prediction, metabolic pathway reconstruction, and simulation
  • 8.
  • 9. • These 3 aspects are integrated with each other • For eg: protein structure prediction depends on sequence alignment data; • clustering of gene expression profiles requires the use of phylogenetic tree construction methods derived in sequence analysis. • Sequence-based promoter prediction is related to functional analysis of coexpressed genes. • Gene annotation include 1.Distinction between coding and noncoding sequences, 2. Identification of translated protein sequences, 3. Determination of the gene’s evolutionary relationship with other known genes.
  • 10. Applications • knowledge-based drug design, • forensic DNA analysis, and • Agricultural biotechnology. • Computational studies of protein–ligand interactions provide a rational basis for the rapid identification of novel leads for synthetic drugs. • Knowledge of the 3D structures of proteins allows molecules to be designed that are capable of binding to the receptor site of a target protein with great affinity and specificity. • This informatics-based approach significantly reduces the tie and cost necessary to develop drugs with higher poteny fewer side effects, and less toxicity than using the traditional trial-and-error approach.
  • 11. LIMITATIONS • Bioinformatics depends on experimental science to produce raw data for analysis. • It, in turn, provides useful interpretation of experimental data and important leads for further experimental research. • The quality of bioinformatics predictions depends on the quality of data and the sophistication of the algorithms being used • Sequence data from high throughput analysis often contain errors. • If the sequences are wrong or annotations incorrect, the results from the downstream analysis are misleading as well.
  • 12. Program Database Query Typical uses BLASTN Nucleotide Nucleotide Mapping oligonucleotides, cDNAs, and PCR products to a genome; screening repetitive elements; cross-species sequence exploration; annotating genomic DNA; clustering sequencing reads; vector clipping BLASTP Protein Protein Identifying common regions between proteins; collecting related proteins for phylogenetic analyses BLASTX Protein Nucleotide translated into protein Finding protein-coding genes in genomic DNA; determining if a cDNA corresponds to a known protein TBLASTN Nucleotide translated into protein Protein Identifying transcripts, potentially from multiple organisms, similar to a given protein; mapping a protein to genomic DNA TBLASTX Nucleotide translated into protein Nucleotide translated into protein Cross-species gene prediction at the genome or transcript level; searching for genes missed by traditional methods or not yet in protein databases
  • 13. WHAT IS A DATABASE A database is a computerized archive used to store and organize data in such a way that information can be retrieved easily via a variety of search criteria. Databases are composed of computer hardware and software for data management. The objective of the development of a database is to organize data in a set of structured records to enable easy retrieval of information. Each record, also called an entry, should contain a number of fields that hold the actual data items. To retrieve a particular record from the database, a user can specify a particular piece of information, called value, to be found in a particular field and expect the computer to retrieve the whole data record. This process is called making a query.
  • 14. Types of DATABASES Format • Flat file • Relational databases • Object oriented database 1. Flat File format: • Originally, databases all used a flat file format, which is a long text file that contains many entries separated by a delimiter (І) • The entire text file does not contain any hidden instructions for computers to search for specific information or to create reports based on certain fields from each record. • The text file can be considered a single table • To search a flat file for a particular piece of information, a computer has to read through the entire file, an obviously inefficient process
  • 15. • Relational Databases • Instead of using a single table as in a flat file database, relational databases use a set of tables to organize data. • Each table, also called a relation, is made up of columns and rows • Columns represent individual fields. Rows represent values in the fields of records. • The columns in a table are indexed according to a common feature called an attribute, so they can be cross-referenced in other tables • To execute a query in a relational database, the system selects linked data items from different tables and combines the information into one report. • Therefore, specific information can be found more quickly from a relational database than from a flat file database.
  • 17. • Object-Oriented Databases • One of the problems with relational databases is that the tables used do not describe complex hierarchical relationships between data items. • To overcome the problem, object-oriented databases have been developed that store data as objects • an object can be considered as a unit that combines data and mathematical routines that act on the data
  • 18.
  • 19. Types of Biological Databases • Biological databases can be roughly divided into three categories: primary databases, secondary databases, and specialized databases. Primary databases contain original biological data. They are archives of raw sequence or structural data submitted by the scientific community. Eg: GenBank and Protein Data Bank (PDB) Secondary databases contain computationally processed or manually curated information, based on original information from primary databases. • Translated protein sequence databases containing functional annotation belong to this category. • Eg: SWISS-Prot and Protein Information Resources (PIR)
  • 20. • Specialized databases are those that cater to a particular research interest. • For example, Flybase, HIV sequence database, and Ribosomal Database Project are databases that specialize in a particular organism or a particular type of data.
  • 21. Primary Databases • There are 3 major public sequence databases that store raw nucleic acid sequence data produced and submitted by researchers worldwide: • GenBank, • European Molecular Biology Laboratory (EMBL) • DNA Data Bank of Japan (DDBJ) • which are all freely available on the Internet. • Most of the data in the databases are contributed directly by authors with a minimal level of annotation. • These three public databases closely collaborate and exchange new data daily. They together constitute the International Nucleotide Sequence Database Collaboration. • Although the three databases all contain the same sets of raw data, each of the individual databases has a slightly different kind of format to represent the data. • For the three-dimensional structures of biological macromolecules, there is only one centralized database, the PDB. • This database archives atomic coordinates of macromolecules (both proteins and nucleic acids) determined by x-ray crystallography and NMR. It uses a flat file format to represent protein name, authors, experimental details, secondary structure, cofactors, and atomic coordinates.
  • 22. • Secondary Databases • Sequence annotation information in the primary database is often minimal. • To turn the raw sequence information into more sophisticated biological knowledge, much post processing of the sequence information is needed. • which contain computationally processed sequence information derived from the primary databases. • some are simple archives of translated sequence data from identified open reading frames in DNA, whereas others provide additional annotation and information related to higher levels of information regarding structure and functions. • A prominent example of secondary databases is SWISS-PROT, which provides detailed sequence annotation that includes structure, function, and protein family assignment. The sequence data are mainly derived from TrEMBL, a database of