(BS)-Course
BIOINFORMATICS
Hasnain Israr
Department of Medical Lab Technology (MLT)
Abasyn University Islamabad
Learning Objectives
• To train students to analyze genetic data for research
• To learn the details of protein structure and bioactive configuration
• To learn about computational analysis of protein structure and function
• To learn about computational tools for protein and nucleic acid analysis
Recommended Books
Evaluation System
• Assignments 05%
• Quizzes 15%
• Presentations 05%
• Midterm Exams 25%
• Final Exams 50%
Topic-1
Introduction to Information Technology and
Bioinformatics Basic Concepts
Bioinformatics
The combination of
Biology and Informatics
Originally referred to the use of computational
tools to organize and analyze genetic and
protein sequence data
The term is commonly credited to Dr. Hwa Lim (1988), although
Paulien Hogeweg and Ben Hesper had used it as early as 1970
Bioinformatics
It is an interdisciplinary approach requiring
advanced knowledge of computer science,
mathematics, and statistical methods for
understanding biological phenomena at
the molecular level
Bioinformatics
The NCBI defines bioinformatics as:
"Bioinformatics is the field of science in which
biology, computer science, and information
technology merge into a single discipline"
What is Bioinformatics
Easy Answer
Using computers to solve molecular biology
problems; Intersection of molecular biology and
computer science
Hard Answer
Computational techniques (e.g. algorithms, artificial
intelligence, databases) for management and
analysis of biological data and knowledge
What is Bioinformatics
Bioinformatics = Biology + Information
Computational methods are necessary to analyze the
massive amount of information coming out of
the genome projects
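As a minimal sketch of what "using computers to solve molecular biology problems" can look like, the short Python example below computes two basic properties of a DNA string. The sequence and the function names are made up for illustration and do not come from the course material.

```python
# Toy example: two basic sequence computations on a DNA string.
# The sequence used below is hypothetical, not real genomic data.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, G<->C)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


example = "ATGCGCTA"  # hypothetical 8-base sequence
print(gc_content(example))          # 0.5
print(reverse_complement(example))  # TAGCGCAT
```

Doing this by hand is easy for eight bases; for genome-scale data the same calculation is only practical by computer.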
Bioinformatics
Biological Data + Computer Analysis
Bioinformatics
• It is an interdisciplinary field that develops methods and software tools for understanding biological data
• As an interdisciplinary field of science, bioinformatics combines computer science, statistics, mathematics, and engineering to analyze and interpret biological data
Components of Bioinformatics
• Creation of databases
• Development of algorithms and statistics
• Analysis and interpretation of data
National Center for Biotechnology Information
(NCBI)
The NCBI houses a series of
databases relevant to biotechnology and
biomedicine and is an important resource for
bioinformatics tools and services
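The NCBI databases can also be queried by programs rather than through the website. The sketch below assumes NCBI's E-utilities web service (not named on the slides) and only builds a request URL; it does not contact the network. The helper name `efetch_url` is hypothetical, and the accession is used only as an example.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities web service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(db: str, accession: str, rettype: str = "fasta") -> str:
    """Build an EFetch URL that asks one NCBI database for one record."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# "nuccore" is the nucleotide database; NM_000546 is only an example accession.
url = efetch_url("nuccore", "NM_000546")
print(url)
```

Opening the printed URL in a browser, or fetching it from a script, should return the record as FASTA text.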
National Center for Biotechnology Information
(http://www.ncbi.nlm.nih.gov/)
Developed by the National Library of Medicine (NLM) at the
National Institutes of Health (NIH)
A comprehensive website for biologists, including:
• Biology-related databases
• Tools for viewing and analyzing data
• Automated systems for storing and retrieving data
NCBI Home Page
DNA/RNA Domain of NCBI
• GenBank (submitter-owned)
• RefSeq (Reference Sequence) (annotated)
• SRA (Sequence Read Archive) (short sequence reads, typically less than 1000 bp)
• PopSet (Population Data Set) (related DNA sequences derived from population, phylogenetic, and mutation studies)
• BioSample
What's the difference between PubMed and
PubMed Central?
PubMed
PMC
NLM Catalog
MeSH
NCBI Book Shelf
Taxonomy Domain of NCBI
• The NCBI Taxonomy Database
DNA/RNA Domain of NCBI
• GenBank
• RefSeq
• SRA
• PopSet
• BioSample
GenBank
RefSeq
GenBank vs RefSeq
• A GenBank genome assembly contains assembled genome sequences submitted by investigators
• A RefSeq genome assembly represents an NCBI-derived copy of a submitted GenBank assembly
• RefSeq records are maintained by NCBI
• NCBI staff may remove short sequences or reported contaminants from the assembly
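The idea of a "derived copy" can be illustrated with a toy filter. This is only a sketch under stated assumptions: the function name, the length threshold, and the contig data are all hypothetical, and it does not represent NCBI's actual curation procedure.

```python
from typing import Dict, Set


def derive_curated_copy(
    assembly: Dict[str, str],
    min_length: int,
    contaminants: Set[str],
) -> Dict[str, str]:
    """Return a copy of the assembly without short or flagged sequences.

    The submitted assembly itself is left unchanged.
    """
    return {
        name: seq
        for name, seq in assembly.items()
        if len(seq) >= min_length and name not in contaminants
    }


# Hypothetical submitted assembly: sequence name -> bases.
submitted = {
    "contig1": "ATGCGTACGTTAGC",
    "contig2": "ATG",             # shorter than the threshold
    "contig3": "GGCCTTAAGGCCTT",  # reported as a contaminant
}

curated = derive_curated_copy(submitted, min_length=10, contaminants={"contig3"})
print(sorted(curated))  # ['contig1']
print(len(submitted))   # 3
```

The submitted record stays intact while the derived copy is cleaned, which mirrors the split between the archival GenBank assembly and the NCBI-maintained RefSeq copy.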
GenBank vs RefSeq
• All RefSeq genome assemblies include annotation
• The GenBank assembly is an archival record that is owned by the submitter. In rare cases where NCBI makes updates to the GenBank assembly, for example to remove contaminated sequences, the original submitter will be notified
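In practice the two kinds of assembly can be told apart by their accession prefix: GenBank assembly accessions begin with `GCA_` and RefSeq assembly accessions begin with `GCF_`. The small sketch below checks that prefix. The function name is hypothetical, and the accession strings are used only as format examples.

```python
def assembly_source(accession: str) -> str:
    """Tell a GenBank assembly accession from a RefSeq one by its prefix."""
    if accession.startswith("GCA_"):
        return "GenBank"
    if accession.startswith("GCF_"):
        return "RefSeq"
    return "unknown"


print(assembly_source("GCA_000001405.29"))  # GenBank
print(assembly_source("GCF_000001405.40"))  # RefSeq
print(assembly_source("NM_000546"))         # unknown (not an assembly accession)
```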
GenBank vs RefSeq
PopSet
SRA
BioSample
