WHAT is a database?
• A collection of data that needs to be:
– Structured
– Searchable
– Updated (periodically)
– Cross-referenced
• Challenge:
– To change “meaningless” data into useful information that can be
accessed and analysed in the best way possible.
For example:
HOW would YOU organise all biological sequences so that the
biological information is optimally accessible?
You need an appropriate database management system (DBMS)
DBMS
• Internal organization
– Controls speed and
flexibility
• A set of programs that
– Store
– Extract
– Modify
[Diagram: the USER(S) store, extract and modify data in the Database through the DBMS]
DBMS organisation types
• Flat file databases (flat DBMS)
– Simple, restrictive, table
• Hierarchical databases (hierarchical DBMS)
– Simple, restrictive, tree of parent-child records
• Relational databases (RDBMS)
– Complex, versatile, tables
• Object-oriented databases (ODBMS)
– Complex, versatile, objects
Relational databases
• Data is stored in multiple related tables
• Data relationships across tables can be
one-to-one, many-to-one or many-to-many
• A few rules allow the database to be
viewed in many ways
• Let's convert the “course details” to a
relational database
Our flat file database: course details (FLAT DATABASE 2)
Name       Depart.    Course   E1  E2  E3  P1  P2  …..
Student 1  Chemistry  Biology  A   B   B   A   C   …..
Student 2  Ecology    Maths    A   D   A   A   A   …..
. . .
Student 2  Ecology    Biology  A   B   A   A   A   …..
Student 1  Chemistry  English  A   A   A   A   A   …..
. . .
Student 1  Chemistry  Maths    C   C   B   A   A   …..
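The repetition in the flat table can be made visible with a few lines of Python. This is only an illustrative sketch: the rows are copied from the table above, while the names FLAT_ROWS and repeated_pairs are hypothetical and not part of the slides.

```python
from collections import Counter

# Rows copied from the flat "course details" table:
# (Name, Department, Course, E1, E2, E3, P1, P2)
FLAT_ROWS = [
    ("Student 1", "Chemistry", "Biology", "A", "B", "B", "A", "C"),
    ("Student 2", "Ecology", "Maths", "A", "D", "A", "A", "A"),
    ("Student 2", "Ecology", "Biology", "A", "B", "A", "A", "A"),
    ("Student 1", "Chemistry", "English", "A", "A", "A", "A", "A"),
    ("Student 1", "Chemistry", "Maths", "C", "C", "B", "A", "A"),
]


def repeated_pairs(rows):
    """Count how often each (name, department) pair is stored."""
    return Counter((name, dept) for name, dept, *_ in rows)


counts = repeated_pairs(FLAT_ROWS)
for (name, dept), n in sorted(counts.items()):
    print(f"{name} / {dept}: stored {n} times")
```

Every extra course repeats the student's name and department, which is the redundancy that normalization removes.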
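Normalize (1NF) …
• We remove repeating records (rows): repeated names, departments and
courses move into their own tables and are referenced by ID
Students
sID  Name      dID
1    Student1  1
2    Student2  2
Courses
cID  Course
1    Biology
2    Maths
3    English
Departments
dID  Department
1    Chemistry
2    Ecology
Results
sID  cID  E1  E2  E3  P1  P2  …..
1    1    A   B   B   A   C   …..
2    2    A   D   A   A   A   …..
. . .
2    1    A   B   A   A   A   …..
1    3    A   A   A   A   A   …..
. . .
1    2    C   C   B   A   A   …..
Primary keys: sID, cID and dID in the student, course and department tables
Foreign keys: dID in the student table; sID and cID in the results table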
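As a rough sketch of how primary and foreign keys could be declared for this step, the following uses Python's built-in sqlite3 module with an in-memory database. The table names (department, student, course, result) are assumptions chosen for illustration; the column names and rows follow the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (
    dID INTEGER PRIMARY KEY,
    Department TEXT NOT NULL UNIQUE
);
CREATE TABLE student (
    sID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    dID INTEGER NOT NULL REFERENCES department(dID)
);
CREATE TABLE course (
    cID INTEGER PRIMARY KEY,
    Course TEXT NOT NULL UNIQUE
);
CREATE TABLE result (
    sID INTEGER NOT NULL REFERENCES student(sID),
    cID INTEGER NOT NULL REFERENCES course(cID),
    E1 TEXT, E2 TEXT, E3 TEXT, P1 TEXT, P2 TEXT,
    PRIMARY KEY (sID, cID)
);
""")

conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [(1, "Chemistry"), (2, "Ecology")])
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Student1", 1), (2, "Student2", 2)])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(1, "Biology"), (2, "Maths"), (3, "English")])
conn.executemany("INSERT INTO result VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1, 1, "A", "B", "B", "A", "C"),
    (2, 2, "A", "D", "A", "A", "A"),
    (2, 1, "A", "B", "A", "A", "A"),
    (1, 3, "A", "A", "A", "A", "A"),
    (1, 2, "C", "C", "B", "A", "A"),
])

# A student pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO student VALUES (3, 'Student3', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print("foreign key violation rejected:", fk_rejected)
```

The foreign keys are what let the DBMS refuse rows that refer to a missing student, course or department.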
Normalize (2NF) …
• We remove redundant fields (columns)
Students
sID  Name      dID
1    Student1  1
2    Student2  2
Courses
cID  Course
1    Biology
2    Maths
3    English
Grades
gID  Grade
1    A
2    B
3    C
Departments
dID  Department
1    Chemistry
2    Ecology
Projects
wID  Project
1    E1
2    E2
3    E3
4    P1
5    P2
Results
sID  cID  gID  wID
1    1    1    1
1    1    2    2
1    1    2    3
1    1    1    4
1    1    3    5
2    1    1    1
2    1    1    2
2    1    2    3
2    1    1    4
2    1    1    5
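The move from one column per project (E1 … P2) to one row per grade can be sketched as a small transformation. The lookup dictionaries mirror the gID and wID tables of this slide; the function name to_long and the variable names are hypothetical.

```python
# One wide row from the previous step: (sID, cID, E1, E2, E3, P1, P2)
WIDE_ROWS = [
    (1, 1, "A", "B", "B", "A", "C"),
]

# Lookup tables as on the slide (Project -> wID, Grade -> gID).
PROJECT_IDS = {"E1": 1, "E2": 2, "E3": 3, "P1": 4, "P2": 5}
GRADE_IDS = {"A": 1, "B": 2, "C": 3}


def to_long(rows):
    """Turn each wide row into one (sID, cID, gID, wID) row per project."""
    columns = ("E1", "E2", "E3", "P1", "P2")
    long_rows = []
    for sID, cID, *grades in rows:
        for column, grade in zip(columns, grades):
            long_rows.append((sID, cID, GRADE_IDS[grade], PROJECT_IDS[column]))
    return long_rows


long_rows = to_long(WIDE_ROWS)
for row in long_rows:
    print(row)
```

For Student1 in Biology this reproduces the first five rows of the sID/cID/gID/wID table. A new project then needs a new row in the project table rather than a new column.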
Relational Databases
• What have we achieved?
– No repeating information
– Less storage space
– Better reality representation
– Easy modification/management
– Easy usage of any combination of records
Remember:
the DBMS has programs to access and edit this
information, so it does not matter that tables of
numeric keys are hard for a human to read
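To illustrate why the numeric keys are not a problem in practice, here is a minimal sketch in which a join rebuilds readable rows from the key-only table. It uses Python's sqlite3 module; the table names are assumptions, and only the five result rows of Student1 in Biology are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (sID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE course  (cID INTEGER PRIMARY KEY, Course TEXT);
CREATE TABLE grade   (gID INTEGER PRIMARY KEY, Grade TEXT);
CREATE TABLE project (wID INTEGER PRIMARY KEY, Project TEXT);
CREATE TABLE result  (sID INTEGER, cID INTEGER, gID INTEGER, wID INTEGER);
""")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Student1"), (2, "Student2")])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(1, "Biology"), (2, "Maths"), (3, "English")])
conn.executemany("INSERT INTO grade VALUES (?, ?)",
                 [(1, "A"), (2, "B"), (3, "C")])
conn.executemany("INSERT INTO project VALUES (?, ?)",
                 [(1, "E1"), (2, "E2"), (3, "E3"), (4, "P1"), (5, "P2")])
conn.executemany("INSERT INTO result VALUES (?, ?, ?, ?)",
                 [(1, 1, 1, 1), (1, 1, 2, 2), (1, 1, 2, 3),
                  (1, 1, 1, 4), (1, 1, 3, 5)])

# The join replaces every key with the text it stands for.
readable = conn.execute("""
    SELECT s.Name, c.Course, p.Project, g.Grade
    FROM result AS r
    JOIN student AS s ON s.sID = r.sID
    JOIN course  AS c ON c.cID = r.cID
    JOIN grade   AS g ON g.gID = r.gID
    JOIN project AS p ON p.wID = r.wID
    ORDER BY r.sID, r.cID, r.wID
""").fetchall()

for row in readable:
    print(row)
```

The stored data stays compact, and a readable view is produced only when someone asks for it.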
Accessing database information
• A request for data from a database is
called a query
• Queries can be of three forms:
– Choose from a list of parameters
– Query by example (QBE)
– Query language
Query by Example (QBE) reports allow end users to query, insert, update, and delete
values in a database table or view.
In the QBE build wizard, you choose which data to display in the report. Or, you can
allow end users to make their own queries in the QBE report's customization form.
Because the QBE system formulates the actual query, QBE is easier to learn than
formal query languages, such as the standard Structured Query Language (SQL).
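The idea that the QBE system formulates the actual query can be sketched as follows: the user fills in example values, and a helper turns them into SQL. This is a hypothetical illustration and not how any particular QBE product works; the function query_by_example, the table course_details and the list ALLOWED_COLUMNS are invented for the example.

```python
import sqlite3

# Columns a user may fill in on the (hypothetical) example form.
ALLOWED_COLUMNS = ("Name", "Department", "Course")


def query_by_example(conn, example):
    """Build and run a SELECT from a dict of column -> example value."""
    unknown = set(example) - set(ALLOWED_COLUMNS)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    sql = "SELECT Name, Department, Course FROM course_details"
    if example:
        # Column names come from the allow-list; values are bound as parameters.
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in example)
    sql += " ORDER BY Name, Course"
    return conn.execute(sql, tuple(example.values())).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course_details (Name TEXT, Department TEXT, Course TEXT)")
conn.executemany("INSERT INTO course_details VALUES (?, ?, ?)", [
    ("Student 1", "Chemistry", "Biology"),
    ("Student 2", "Ecology", "Maths"),
    ("Student 2", "Ecology", "Biology"),
    ("Student 1", "Chemistry", "English"),
    ("Student 1", "Chemistry", "Maths"),
])

# The user only supplies an example value, never any SQL.
chemistry_rows = query_by_example(conn, {"Department": "Chemistry"})
for row in chemistry_rows:
    print(row)
```

The same request written directly in a query language would be a SELECT with a WHERE clause on Department, which is what the helper generates.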
Distributed databases
• From local to global attitude
• Data appears to be in one location but is most definitely
not
• A definition: Two or more data files in different locations,
periodically synchronized by the DBMS to keep data in
all locations consistent (A,B,C)
• An intricate network for combining and sharing
information
• Administrators praise fast network technologies!!!
• Users praise the internet!!!
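Periodic synchronization between locations can be sketched with three in-memory copies standing in for sites A, B and C. The rule used here (the record with the highest version number wins) is an assumption made for illustration; real distributed DBMSs use more elaborate protocols. All names and the example sequences are hypothetical.

```python
def synchronize(*sites):
    """Make every site hold the newest version of every record.

    Each site is a dict mapping a record key to (version, value).
    """
    merged = {}
    for site in sites:
        for key, (version, value) in site.items():
            if key not in merged or version > merged[key][0]:
                merged[key] = (version, value)
    # Push the merged state back to every location.
    for site in sites:
        site.clear()
        site.update(merged)
    return merged


site_a = {"seq1": (1, "ATGC")}
site_b = {"seq1": (2, "ATGCA"), "seq2": (1, "GGCC")}
site_c = {}

synchronize(site_a, site_b, site_c)
print(site_a == site_b == site_c)
print(site_a["seq1"])
```

Between two synchronization runs the sites can disagree, which is why the definition above speaks of periodic rather than continuous consistency.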
Three main Points
• Database proliferation
– Dozens to hundreds at the moment
• More and more scientific discoveries result
from inter-database analysis and mining
• Rising complexity of required data combinations
– E.g. translational medicine: “from bench to
bedside” (genomic data vs. clinical data)
Proliferation = great and rapid increase in numbers;
Grid = a network of evenly spaced horizontal and vertical lines (Dutch: rooster);
Semantic = related to the meaning;
Biological databases
• Like any other database
– Data organization for optimal analysis
• Data is of different types
– Raw data (DNA, RNA, protein sequences)
– Curated data (annotated DNA, RNA and protein
sequences and structures, expression data)
A few biological databases
• Nucleotide Databases
Alternative Splicing, EMBL-Bank, Ensembl, Genomes Server, Genome,
MOT, EMBL-Align, Simple Queries, dbSTS Queries, Parasites, Mutations,
IMGT
• Genome Databases
Human, Mouse, Yeast, C. elegans, FlyBase, Parasites
• Protein Databases
Swiss-Prot, TrEMBL, InterPro, CluSTr, IPI, GOA, GO, Proteome Analysis,
HPI, IntEnz, TrEMBLnew, SP_ML, NEWT, PANDIT
• Structure Databases
PDB, MSD, FSSP, DALI
• Microarray Database
ArrayExpress
• Literature Databases
MEDLINE, Software Biocatalog, Flybase Archives
• Alignment Databases
BAliBASE, Homstrad, FSSP
A short word on problems
• Even today we face some key limitations
– There is no standard format
• Every database or program has its own format
– There is no standard nomenclature
• Every database has its own names
– Data is not fully optimized
• Some datasets have missing information without any
indication of it
– Data errors
• Data is sometimes of poor quality, erroneous, misspelled
• Error propagation resulting from computer annotation

1.Databases for bioinformatics and its types
