What is a “Database”• A Database is a collection of related data. – Facts that can be recorded and have specific meaning. – Represents some aspect of the real world. – A database is a logically coherent collection of data. Random assortment of data cannot be construed as a database. – A database is designed, built, and populated for a specific purpose.
What is a Database Management System (DBMS or DMS)• A general purpose software system that facilitates the definition, storing, manipulating, security, organization, retrieval, and sharing of data in a database.• The descriptive information about a database is called meta-data.• To interact with a DBMS you must issue transactions.
Types of Database Management Systems• Hierarchical• Network• Relational• Object Oriented• XML• Experimental models such as Spatial-Temporal Note: We will concentrate on Relational
Relational Database• Characterized by data being arranged to appear as a table of data with rows and columns.• Each column has a unique meaning, name, and data type.• Each Row must have one or more columns.• A Database must have one or more related tables.• There may be several database instances on one machine.• How the data is physically organized and stored on the machine is hidden from the programmer and user.
Definitions• Attribute: A single data item related to a database. AKA Field, column• Candidate Key: A field or group of fields that a could be a primary key.• Cursor: The specific record or tuple in a table or view that the database is currently pointing to.• Data Mining: Automated data analysis techniques used to uncover previously undetected relationships or information.
Other Definitions• Domain: The set of allowed values for an attribute.• Entity: A single object about which data can be stored in a database table. Examples: Person, specimen, or location.• ER Diagram: A graph that shows the tables and the relationships between each one.• Foreign Key: An attribute(s) in a table that is a primary key in another table.
Other Definitions• Functional Dependency: When one attribute is related to another. Usually uni-directional. Examples: Social Security Number -> Name.• Join: The operation of putting the information in multiple tables together in one.• Normalization: The operation if reducing the amount of redundant information in a database.• SQL: The Structured Query Language standard.
Other Definitions• Tuple: A row or record.• View: A view is a “virtual table” that is generated on the fly when the view is accessed. The view is generally created using a pre-defined transaction. Example: You may want to generate a table of personnel without privacy information and with information from other tables includes such as zip code, city, state.
Database Design• Define the purpose of the Database• Gather requirements• Gather data items based upon requirements (In other words, what data will be needed to satisfy the requirements for data storage, retrieval and reporting). – Name each attribute using standard naming conventions. See if there exist standards for the particular area. Example: Darwin Core Standard for biological information systems. – Define the attribute of each data item. – Define range of data values. – Define compatibility checks on the data.
Database Design (cont)• Group related data items into an entity or an object. (examples: person, class, organization)• Take an entity and define the physical tables.• Normalize Tables: – 1st Normal Form: Each attribute must be autonomous and there must not be any repeating groups. – 2nd Normal Form: Is in 1NF and Only attributes allowed that are directly related to the Primary Key. – 3rd Normal Form: 2NF and there are no transitive dependencies. Note: Most databases seldom go beyond 3NF.
Database Design (cont)• Create any additional tables needed to support the requirements. – Attribute look up ( Examples: specimen sex, form etc. – Cross reference tables needed for normalization.• Create Views.• Create indexes.• Decide on visibility of tables and data along with security features.• Create backup and logging strategy.
Example## Table structure for table species#CREATE TABLE species(SpeciesID int(8) NOT NULL auto_increment,GenusID int(32) NOT NULL,FamilyName varchar(128) NOT NULL,GenusName varchar(128) NOT NULL,Variety varchar(128),SpeciesEpithet varchar(128),SpeciesNameAuthors varchar (128),SpeciesDescribedYear char(4),DateIdentified date DEFAULT 0000-00-00,IdentifiedBy varchar(128) NOT NULL,PRIMARY KEY (SpeciesID));
Example# Table Structure for table classificationCREATE TABLE classification(GenusID int(32) NOT NULL auto_increment,GenusName varchar(128) NOT NULL,FamilyName varchar(128) NOT NULL,OrderName varchar(128) NOT NULL,ClassName varchar(128),PhylumName varchar(128) NOT NULL,KingdomName varchar(128) NOT NULL,PRIMARY KEY (GenusID));
Example## Table structure for table specimen#CREATE TABLE specimen(MorphBankSpecimenID int(32) NOT NULL,SpeciesID int(8) NOT NULL,CatalogNumber int (32) NOT NULL AUTO_INCREMENT,DateLastModified date NOT NULL default 0000-00-00,InstitutionCode varchar(128),CollectionCode varchar(128),ScientificName varchar(128),BasisOfRecord char(1),SubSpecies varchar (128),TypeStatus varchar (255),TypeName varchar (128),CollectionNumber varchar (128),