Features of Biological
Databases
CHARU SHARMA
B.Sc(H) BOTANY 3rd
YEAR
Biological Database
 It is a collection of data that is
structured, searchable, updated
periodically and cross-referenced.
 Stores biological data in electronic
form.
 Purpose-
Systematic organization of data
Availability of biological data
Analysis of computed biological data
HISTORY
 Insulin was the first protein to be sequenced;
it is composed of 51 amino acids.
 The sequence was included in the “Atlas of
Protein Sequence and Structure”, published in 1965 by
Margaret Dayhoff.
 The Atlas became the basis for the PIR database.
 The first nucleic acid sequenced was yeast
alanine tRNA, composed of 77 nucleotides.
 The first free-living organism whose genome was
sequenced was the bacterium
Haemophilus influenzae, in 1995, by Craig
Venter's team.
Features of Biological
Databases
1. Heterogeneity
2. High volume data
3. Uncertainty
4. Data curation
5. Data integration
6. Data sharing
7. Dynamics
1. Data Heterogeneity
Availability of diverse and complex
data types.
Data Types :
 Sequence- Nucleotide, Protein
 Graph - Data describing relationships
among entities can be captured
as a graph. This includes pathway data,
genetic maps and structural taxonomy.
 High dimensional data –
Data generated from microarray
experiments that involve thousands of
genes and hundreds of experimental
conditions.
 Shapes –
It consists of 3D molecular structural
data.
Example- Docking
 Temporal data –
For studying dynamics of any biological
system.
Example- Developmental biology
 Patterns –
There are patterns within the
genome that characterize biological
entities.
Example-Regulatory sequence
(promoter)
 Scalar and Vector fields –
 Extracted features data –
Numerical data obtained from
one or a combination of the
above-mentioned data types
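
The data types listed above map onto ordinary data structures. The following is a minimal Python sketch, not taken from the slides: the sequence, the pathway fragment and the expression values are toy data, and the function names are hypothetical.

```python
# Toy illustrations of three of the data types; all values are made up.

# Sequence data: a nucleotide sequence is a string over A, C, G, T.
dna = "ATGGCCATTGTAATGGGCCGC"

def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)

# Graph data: a simplified pathway fragment as an adjacency mapping
# (each compound points to the compounds it is converted into).
pathway = {
    "glucose": ["glucose-6-phosphate"],
    "glucose-6-phosphate": ["fructose-6-phosphate"],
    "fructose-6-phosphate": [],
}

def reachable(graph: dict, start: str) -> set:
    """All nodes reachable from start, found by depth-first search."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen

# High dimensional data: an expression matrix with one row per gene
# and one column per experimental condition.
expression = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [4.0, 4.0, 4.0],
}

def mean_expression(matrix: dict) -> dict:
    """Mean expression level of each gene across all conditions."""
    return {gene: sum(vals) / len(vals) for gene, vals in matrix.items()}
```

A real microarray data set would have thousands of rows rather than two, which is why such data are usually held in array libraries or databases rather than plain dictionaries.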
2. High volume data
In addition to being highly
heterogeneous, biological data are
voluminous to support comprehensive
investigations in various fields and
directions.
3. Uncertainty
Biological data carry a great deal of
uncertainty because they represent biological
phenomena that are observed and
assumed.
4. Data curation
 Biological data are collected from
various sources across different
structural and functional boundaries.
 There are always chances of missing
links.
 To fill these gaps, the data are analyzed and
curated via automated methods.
5. Data integration
After years of research, across
different structural and functional
scales, data are collected from
laboratories worldwide, integrated
in a database and
made available for use.
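
One way to picture integration is as a merge of records that share an identifier, with disagreements flagged for curation. The sketch below is only an illustration under that assumption: the identifiers, field names and the `integrate` function are hypothetical, and real databases use far more elaborate pipelines.

```python
def integrate(*sources):
    """Merge records keyed by identifier.

    Returns the merged records and a list of conflicts, one tuple
    (record_id, field, kept_value, rejected_value) per disagreement.
    The value seen first is kept; later conflicting values are only reported.
    """
    merged, conflicts = {}, []
    for source in sources:
        for record_id, fields in source.items():
            target = merged.setdefault(record_id, {})
            for key, value in fields.items():
                if key in target and target[key] != value:
                    conflicts.append((record_id, key, target[key], value))
                else:
                    target[key] = value
    return merged, conflicts

# Hypothetical submissions from two laboratories.
lab_a = {"X001": {"organism": "yeast", "length": 77}}
lab_b = {"X001": {"length": 76, "molecule": "tRNA"},
         "X002": {"organism": "human"}}

merged, conflicts = integrate(lab_a, lab_b)
```

Here `conflicts` records that the two laboratories disagree on the length of `X001`, which is the kind of discrepancy a curator would then resolve.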
6. Data sharing
 Biological data is shared via
databases.
 Purpose:
For inspection by the scientific community
For cross verification
To validate data and prevent repetition of
work
7. Dynamics
 New data is generated every day in
laboratories.
 Sometimes this new data
contradicts the old data.
 So, it is necessary to develop new
organizational database schemes to
incorporate new data.
CLASSIFICATION
Classification of biological
databases
o Data type
o Maintainer status
o Data access
o Data source
o Database design
o Organism
1. Data type
 Sequence database
a. Nucleotide database : GenBank, EMBL-Bank
b. Protein database : Swiss-Prot, PIR
 Structure database - PDB, NDB, DALI, MSD
 Microarray database - ArrayExpress (follows the MIAME reporting standard)
 Chemical database - PubChem
 Pathway database - KEGG, BioSilico
 Enzyme database - ExPASy, REBASE
 Disease database - OMIM, OMIA
 Literature database - PubMed, Scopus
2. Maintainer status
 NCBI, EMBL
 Academic group or scientist
 Commercial company
3. Data access
 Publicly available
 Available with copyright
 Browsing only, accessible but not
downloadable
 Academic but not freely available
 Restricted
4. Data source
a) Primary database (archival)
Original data are submitted directly by researchers.
Examples:
Nucleotide - GenBank, EMBL, DDBJ
Protein - UniProt
Structure - PDB
Literature - Medline (PubMed)
b) Secondary database (curated)
- Results of analysis of primary databases.
- Curated either manually or by automated
methods
Examples: Prosite , Pfam , RefSeq
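
To illustrate how a secondary record can be derived from primary data, the sketch below scans toy protein sequences for a PROSITE-style motif, N-{P}-[ST]-{P} (the N-glycosylation site pattern), rewritten as a regular expression. The sequences, identifiers and the `annotate` function are hypothetical, and this is not how any particular database is actually built.

```python
import re

# PROSITE-style pattern N-{P}-[ST]-{P} written as a regular expression.
# The lookahead lets overlapping occurrences be reported.
N_GLYCOSYLATION = re.compile(r"(?=N[^P][ST][^P])")

def annotate(primary: dict) -> dict:
    """Map each sequence id to the 1-based start positions of the motif."""
    return {
        seq_id: [m.start() + 1 for m in N_GLYCOSYLATION.finditer(seq)]
        for seq_id, seq in primary.items()
    }

# Toy "primary" records: identifier -> protein sequence (made up).
primary_records = {
    "toy1": "MKNGSAANPSL",
    "toy2": "MAAAA",
}

# Derived "secondary" records: identifier -> motif positions.
secondary_records = annotate(primary_records)
```

In `toy1` the motif starts at position 3 (N-G-S-A); the later N at position 8 is followed by P and so does not match.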
5. Database design
 Flat files
 Relational database (SQL)
 Object oriented database
 Exchange/publication technologies
(FTP, HTML, SOAP, CORBA, XML)
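
The difference between a flat file and a relational design can be sketched by parsing a small text file and loading it into a table. The example assumes FASTA-style text as the flat-file format and SQLite as the relational engine; neither is named in the slides, and the records, table name and column names are invented for illustration.

```python
import sqlite3

# A toy flat file in FASTA style: a ">" header line followed by sequence lines.
FLAT_FILE = """>seq1 toy nucleotide record
ATGGCC
ATTGTA
>seq2 another toy record
GGGCCC
"""

def parse_fasta(text):
    """Yield (identifier, description, sequence) tuples from FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield (header[0], header[1], "".join(chunks))
            ident, _, desc = line[1:].partition(" ")
            header, chunks = (ident, desc), []
        else:
            chunks.append(line)
    if header is not None:
        yield (header[0], header[1], "".join(chunks))

def load(records):
    """Load records into an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequence ("
        "id TEXT PRIMARY KEY, description TEXT, residues TEXT)"
    )
    conn.executemany("INSERT INTO sequence VALUES (?, ?, ?)", records)
    conn.commit()
    return conn

conn = load(parse_fasta(FLAT_FILE))

# Once the data are relational, questions become SQL queries.
rows = conn.execute(
    "SELECT id, length(residues) FROM sequence ORDER BY id"
).fetchall()
```

The flat file has to be re-read and re-parsed for every question, whereas the relational table can be indexed and queried directly.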
6. Organism
 Bacteria
 Virus
 Human
THANK YOU
