BIOINFORMATICS
Total no. of slides - 17
What is Bioinformatics?“Bioinformatics is the field of science in which . Biology
, Computer Science , and informationTechnology
merge to form a single discipline”
-
NCBI
Roughly, bioinformatics describes any use of computers to
handle biological information. bioinformatics to them is a
synonym for "computational molecular biology"- the use of
computers to characterize the molecular components of
living things.
Areas where bioinformatics is
applied
GENOMICS
Genomic
feature
prediction
Sequencing
data analysis
PROTEOMICS
Protein 3D
structure
modeling
Drug design
SYSTEMS
BIOLOGY
Gene set
enrichment
Pathway
analysis
PHENOTYPE
Image
analysis
Integration
Domain knowledge for
bioinformatics
• DNA – Deoxyribonucleic acid
• It is the hereditary material
• The information in DNA is stored as a code made up of four chemical bases:
adenine (A), guanine (G), cytosine (C), and thymine (T).
• Human DNA consists of about 3 billion bases, and more than 99 percent of
those bases are the same in all people.
• The order, or sequence, of these bases determines the information available
for building and maintaining an organism, similar to the way in which letters of
the alphabet appear in a certain order to form words and sentences.
• An important property of DNA is that it can replicate, or make copies of itself.
APPROACH
BIOLOGICAL QUESTION
GENERATE DATA
TRANSLATE INTO COMPUTER SOLVABLE TASK.
DEVELOP AN ALGORITHM
IMPLEMENT ALGORITHM
RUN ALGORITHM
CONDENSE RESULT IN HUMAN READABLE FORM
ANSWER BIOLOGICAL QUESTION
Some Basic Examples
Our example is Vibrio cholerae, the pathogenic bacterium that causes cholera; here is
the nucleotide sequence appearing in the ori of Vibrio cholerae:
The computational
problem• May be to find replication sites or ORI sites.
• Operating under the assumption that DNA is a language of its own, let's borrow Legrand's method
and see if we can find any surprisingly frequent "words" within the ori of Vibrio cholerae.
• So we have a pseudocode to find this ,
• PatternCount(Text, Pattern)
• count ← 0
• for i ← 0 to |Text| − |Pattern|
• if Text(i, |Pattern|) = Pattern
• count ← count + 1
• return count
Let us Solve!
Hypothesis
Assume we know the sequence of ORI i.e CGTGGGACG
AIM : to count no. of CGTGGGACG s in given sequence.
Python code :
s=input(“dna sequence:”) #vibrio cholerae sequence example
p=‘CGTGGGACG’
count=0
for i in range(len(s)-len(p)+1): #for loop
if s[i:i+len(p)]==p): #condition
cnt+=1 #addn. of count
print(count) #PRINTS NO. OF TIMES OF REPETITION
Traditional Biology vs High
throughput Biology
• Traditional Biology
Hypothesis
Experimental
design
Experiment Eyeballing Evaluation
Hypothesis Experimental
design
Experiment
Data
analysis
Evaluation
Bioinformatics
• High Throughput Biology
Bioinformatics is
an
interdisciplinary
field
• Bioinformatics requires
dedication and
continuity
• Bioinformatics data
analysis is a full
research experiment in
itself
• We get the most out of
our research if we work
as an interdisciplinary
research team
throughout
Some examples of
bioinformatics tools
BLAST
BLAST (Basic Local Alignment Search Tool) comes under the category of homology and
similarity tools. It is a set of search programs designed for the Windows platform and is used
to perform fast similarity searches regardless of whether the query is for protein or DNA.
FASTA
FAST homology search All sequences .An alignment program for protein sequences created
by Pearsin and Lipman in 1988. The program is one of the many heuristic algorithms
proposed to speed up sequence comparison.
Application of Programmes in
Bioinformatics
JAVA in Bioinformatics
Physiome Sciences' computer-based biological simulation technologies and
Bioinformatics Solutions' PatternHunter are two examples of the growing
adoption of Java in bioinformatics.
PERL in Bioinformatics
String manipulation, regular expression matching, file parsing, data format
interconversion etc are the common text-processing tasks performed in
bioinformatics.
Bioinformatics
ProjectsBIOPYTHON
Biopython is a set of freely available tools for biological computation written in Python
by an international team of developers. It is a distributed collaborative effort to develop
Python libraries and applications which address the needs of current and future work in
bioinformatics.
BIOJAVA
BioJava is an open-source software project dedicated to provide Java tools to process
biological data. BioJava is a set of library functions written in the programming language
Java for manipulating sequences, protein structures, file parsers, Common Object
Request Broker Architecture (CORBA) interoperability, Distributed Annotation System
(DAS), access to AceDB, dynamic programming, and simple statistical routines.
Some Applications of
Bioinformatics
The science of bioinformatics has many beneficial uses in the
modern day world . These include the following:
• Molecular medicine
• Microbial genome applications
• Agriculture
• Animals
• Comparative studies
Current Applications of Bioinformatics
(Covid 19)
• the routine detection of SARS-CoV-2 infection,
• the reliable analysis of sequencing data,
• the tracking of the COVID-19 pandemic and
evaluation of containment measures,
• the study of coronavirus evolution,
• the discovery of potential drug targets and
development of therapeutic strategies.
Tools for drug design (Covid 19
– EVBC)
WEBLIOGRAPHY
• http://evbc.uni-jena.de/tools/coronavirus-tools/
• https://arxiv.org/ftp/arxiv/papers/0911/0911.4230.pdf
• https://www.quora.com/What-are-the-names-of-software-used-in-
bioinformatics
• https://bioinformatics.csiro.au/

Bioinformatics

  • 1.
  • 2.
    What is Bioinformatics?“Bioinformaticsis the field of science in which . Biology , Computer Science , and informationTechnology merge to form a single discipline” - NCBI Roughly, bioinformatics describes any use of computers to handle biological information. bioinformatics to them is a synonym for "computational molecular biology"- the use of computers to characterize the molecular components of living things.
  • 3.
    Areas where bioinformaticsis applied GENOMICS Genomic feature prediction Sequencing data analysis PROTEOMICS Protein 3D structure modeling Drug design SYSTEMS BIOLOGY Gene set enrichment Pathway analysis PHENOTYPE Image analysis Integration
  • 4.
    Domain knowledge for bioinformatics •DNA – Deoxyribonucleic acid • It is the hereditary material • The information in DNA is stored as a code made up of four chemical bases: adenine (A), guanine (G), cytosine (C), and thymine (T). • Human DNA consists of about 3 billion bases, and more than 99 percent of those bases are the same in all people. • The order, or sequence, of these bases determines the information available for building and maintaining an organism, similar to the way in which letters of the alphabet appear in a certain order to form words and sentences. • An important property of DNA is that it can replicate, or make copies of itself.
  • 5.
    APPROACH BIOLOGICAL QUESTION GENERATE DATA TRANSLATEINTO COMPUTER SOLVABLE TASK. DEVELOP AN ALGORITHM IMPLEMENT ALGORITHM RUN ALGORITHM CONDENSE RESULT IN HUMAN READABLE FORM ANSWER BIOLOGICAL QUESTION
  • 6.
    Some Basic Examples Ourexample is Vibrio cholerae, the pathogenic bacterium that causes cholera; here is the nucleotide sequence appearing in the ori of Vibrio cholerae:
  • 7.
    The computational problem• Maybe to find replication sites or ORI sites. • Operating under the assumption that DNA is a language of its own, let's borrow Legrand's method and see if we can find any surprisingly frequent "words" within the ori of Vibrio cholerae. • So we have a pseudocode to find this , • PatternCount(Text, Pattern) • count ← 0 • for i ← 0 to |Text| − |Pattern| • if Text(i, |Pattern|) = Pattern • count ← count + 1 • return count
  • 8.
    Let us Solve! Hypothesis Assumewe know the sequence of ORI i.e CGTGGGACG AIM : to count no. of CGTGGGACG s in given sequence. Python code : s=input(“dna sequence:”) #vibrio cholerae sequence example p=‘CGTGGGACG’ count=0 for i in range(len(s)-len(p)+1): #for loop if s[i:i+len(p)]==p): #condition cnt+=1 #addn. of count print(count) #PRINTS NO. OF TIMES OF REPETITION
  • 9.
    Traditional Biology vsHigh throughput Biology • Traditional Biology Hypothesis Experimental design Experiment Eyeballing Evaluation Hypothesis Experimental design Experiment Data analysis Evaluation Bioinformatics • High Throughput Biology
  • 10.
    Bioinformatics is an interdisciplinary field • Bioinformaticsrequires dedication and continuity • Bioinformatics data analysis is a full research experiment in itself • We get the most out of our research if we work as an interdisciplinary research team throughout
  • 11.
    Some examples of bioinformaticstools BLAST BLAST (Basic Local Alignment Search Tool) comes under the category of homology and similarity tools. It is a set of search programs designed for the Windows platform and is used to perform fast similarity searches regardless of whether the query is for protein or DNA. FASTA FAST homology search All sequences .An alignment program for protein sequences created by Pearsin and Lipman in 1988. The program is one of the many heuristic algorithms proposed to speed up sequence comparison.
  • 12.
    Application of Programmesin Bioinformatics JAVA in Bioinformatics Physiome Sciences' computer-based biological simulation technologies and Bioinformatics Solutions' PatternHunter are two examples of the growing adoption of Java in bioinformatics. PERL in Bioinformatics String manipulation, regular expression matching, file parsing, data format interconversion etc are the common text-processing tasks performed in bioinformatics.
  • 13.
    Bioinformatics ProjectsBIOPYTHON Biopython is aset of freely available tools for biological computation written in Python by an international team of developers. It is a distributed collaborative effort to develop Python libraries and applications which address the needs of current and future work in bioinformatics. BIOJAVA BioJava is an open-source software project dedicated to provide Java tools to process biological data. BioJava is a set of library functions written in the programming language Java for manipulating sequences, protein structures, file parsers, Common Object Request Broker Architecture (CORBA) interoperability, Distributed Annotation System (DAS), access to AceDB, dynamic programming, and simple statistical routines.
  • 14.
    Some Applications of Bioinformatics Thescience of bioinformatics has many beneficial uses in the modern day world . These include the following: • Molecular medicine • Microbial genome applications • Agriculture • Animals • Comparative studies
  • 15.
    Current Applications ofBioinformatics (Covid 19) • the routine detection of SARS-CoV-2 infection, • the reliable analysis of sequencing data, • the tracking of the COVID-19 pandemic and evaluation of containment measures, • the study of coronavirus evolution, • the discovery of potential drug targets and development of therapeutic strategies.
  • 16.
    Tools for drugdesign (Covid 19 – EVBC)
  • 17.
    WEBLIOGRAPHY • http://evbc.uni-jena.de/tools/coronavirus-tools/ • https://arxiv.org/ftp/arxiv/papers/0911/0911.4230.pdf •https://www.quora.com/What-are-the-names-of-software-used-in- bioinformatics • https://bioinformatics.csiro.au/

Editor's Notes

  • #3 This is the question that your experiment answers
  • #5 Summarize your research in three to five points.
  • #9 Establish hypothesis before you begin the experiment. This should be your best educated guess based on your research.