database glossary.doc.doc.doc.doc
Transcript

  • 1. database-glossarydocdocdocdoc1200.doc 19/7/2010 http://databases.about.com/library/glossary/blglossary.htm?PM=ss13_databases

    <A> Database Management Systems / Software:

    Microsoft Access
    Definition: Microsoft Access is an entry-level database that offers a flexible environment for database developers and users. It uses the familiar Microsoft Office interface and allows for integration with larger-scale enterprise databases such as Microsoft's SQL Server and Oracle.

    Cold Fusion
    Definition: Cold Fusion, a product of Allaire Corporation, is a suite of development tools designed to facilitate web integration of databases. It features the Cold Fusion Markup Language (CFML), which expands upon the features provided by the Hypertext Markup Language (HTML) and the Extensible Markup Language (XML). CFML allows developers to create web-integrated databases without the complexity inherent in full-scale programming languages such as Java and C++.

    IBM DB2
    Definition: DB2 is a relational database system developed by IBM Corporation, originally for use on large mainframe computer systems. It has since been ported to a variety of platforms including SunOS, Solaris, Linux, Windows 95/98/NT/2000 and HP-UX.

    dBase
    Definition: dBase is a relational database management system first marketed by Ashton-Tate Corporation in the early 1980s. The data formatting conventions used by dBase quickly became industry standards that are still in use today. The dBase Corporation provides support for legacy and future applications.

    Delphi
    Definition: Delphi, a product of the Borland Corporation, is a rapid application development platform that takes a visual approach to building applications. Delphi offers specialized features for database connectivity and application development.

    FoxPro
    Definition: Microsoft Visual FoxPro is a development environment catering to the needs of database developers. Supported platforms include FoxPro, SQL Server and Oracle.

    INGRES
    Definition: INGRES is a relational database system produced by Computer Associates. It runs under a wide variety of operating systems and supports the industry-standard Structured Query Language.

    MySQL
  • 2. Definition: MySQL is a relational database management system that implements many industry standards, including SQL and ODBC, along with C and Perl APIs. MySQL is available free of charge under the GNU General Public License (GPL) and under a commercial license for commercial use.

    Oracle
    Definition: Oracle is a powerful relational database management system that offers a large feature set. Along with Microsoft SQL Server, Oracle is widely regarded as one of the two most popular full-featured database systems on the market today.

    Online Analytical Processing (OLAP)
    Definition: Online Analytical Processing software allows for the real-time analysis of data stored in a database. The OLAP server is normally a separate component that contains specialized algorithms and indexing tools to efficiently process data mining tasks with minimal impact on database performance.

    Online Transaction Processing (OLTP)

    Paradox
    Definition: Paradox is a relational database management system produced by the Corel Corporation.

    Postgres
    Definition: Postgres is an object-oriented relational database management system (sometimes referred to as an object-relational database). It began as a research project at the University of California, Berkeley and is available in several free and commercial versions.

    Microsoft SQL Server
    Definition: Microsoft SQL Server is a powerful relational database management system catering to high-end users with advanced needs. Along with Oracle, Microsoft SQL Server is widely regarded as one of the two main full-featured database systems on the market today.

    <B> Database Terminology:

    Raw Data
  • 3. Definition: Data consists of a series of facts or statements that may have been collected, stored, processed and/or manipulated but have not been organized or placed into context. When data is organized, it becomes information. Information can be processed and used to draw generalized conclusions or knowledge.
    Example: A file listing all of the orders placed through an online service is an example of data. If we sort the data by ZIP code and summarize the number of orders that come from each city, we have created information. We can create knowledge by taking this information and making statements such as "Most orders for Widget X come from the northeastern United States."

    Information
    Definition: Information is processed data, ordered in a meaningful way.

    Knowledge
    Definition: Knowledge consists of generalized conceptual statements that have been developed through the analysis of information.

    Database
    Definition: A database is a collection of information organized into interrelated tables of data and specifications of data objects.

    Relation / Table
    Definition: A database relation is a predefined row/column format (that defines an entity) for storing information in a relational database. Relations are equivalent to tables.

    Record / Row
    Definition: In a relational database, a row consists of one set of attributes (or one tuple) corresponding to one instance of the entity that a table schema describes.
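The progression from data to information to knowledge in the Raw Data example can be sketched in a few lines of Python. This is only an illustration: the order records, ZIP codes and the `orders_per_zip` helper are hypothetical and do not come from the glossary.

```python
from collections import Counter

# Hypothetical raw data: one (order_id, zip_code, product) record per order.
orders = [
    (1, "10169", "Widget X"),
    (2, "33196", "Widget X"),
    (3, "10169", "Widget X"),
    (4, "21046", "Widget Y"),
    (5, "10169", "Widget X"),
]


def orders_per_zip(records, product):
    """Summarize raw order records into a count of orders per ZIP code."""
    return Counter(zip_code for _, zip_code, prod in records if prod == product)


# Information: the raw data organized and summarized.
widget_x_by_zip = orders_per_zip(orders, "Widget X")

# Knowledge: a generalized statement drawn from that information.
top_zip, top_count = widget_x_by_zip.most_common(1)[0]
print(f"Most orders for Widget X come from ZIP {top_zip} ({top_count} orders)")
```

The list of tuples plays the role of the order file, the `Counter` is the summarized information, and the printed sentence is the kind of conclusion the glossary calls knowledge.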
  • 4. Attribute / Field / Column
    Definition: An attribute is a single data item related to a database object. Database tables are composed of individual columns corresponding to the attributes of the object. The database schema associates one or more attributes with each database entity.
    Example: In the following database table, the attributes are <Name, ID, Extension>.

        Name     ID    Extension
        Jim      124   7075
        Valeri   128   0853
        Bob      192   4214

    Domain
    Definition: The domain of a database attribute is the set of all allowable values that attribute may assume.
    Examples: A field for gender may have the domain {male, female, unknown}, where those three values are the only permitted entries in that column.

    Tuple
    Definition: Tuple is a term from set theory which refers to a collection of one or more attributes.

    Cardinality
    Definition: In set theory, cardinality refers to the number of members in a set. When applied to database theory, the cardinality of a table refers to the number of rows (or tuples) contained in the table.
    Examples: The table below has cardinality 5:

        Name        Age   SSN           Phone Extension
        Rob         28    123-45-6789   1242
        Amy         34    987-65-4321   9281
        Elizabeth   34    111-22-3333   9312
        Jim         42    333-22-1111   3214
        Mike        29    999-99-9999   2314
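The Domain and Cardinality entries above can be tried out with a short sketch. It assumes Python's built-in sqlite3 module with an in-memory database. The `people` table, the gender assigned to each name, and the `try_insert` helper are hypothetical, since the sample table has no gender column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CHECK constraint restricts the gender column to its domain.
conn.execute(
    """
    CREATE TABLE people (
        name   TEXT NOT NULL,
        gender TEXT NOT NULL CHECK (gender IN ('male', 'female', 'unknown'))
    )
    """
)
conn.executemany(
    "INSERT INTO people (name, gender) VALUES (?, ?)",
    [
        ("Rob", "male"),
        ("Amy", "female"),
        ("Elizabeth", "female"),
        ("Jim", "male"),
        ("Mike", "unknown"),
    ],
)


def try_insert(name, gender):
    """Return True if the row is accepted, False if it violates the domain."""
    try:
        conn.execute(
            "INSERT INTO people (name, gender) VALUES (?, ?)", (name, gender)
        )
        return True
    except sqlite3.IntegrityError:
        return False


# 'n/a' lies outside the domain, so the database refuses the row.
rejected = not try_insert("Pat", "n/a")

# Cardinality is simply the number of rows in the table.
cardinality = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(rejected, cardinality)
```

A CHECK constraint is only one way to enforce a domain. A lookup table referenced by a foreign key is a common alternative.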
  • 5. Key
    Definition: A database key is an attribute used to sort and/or identify data in some manner. Each table has a primary key which uniquely identifies records. Foreign keys are used to cross-reference data between relational tables.

    Primary Key
    Definition: The primary key of a relational table uniquely identifies each record in the table. It can either be a normal attribute that is guaranteed to be unique (such as Social Security Number in a table with no more than one record per person) or it can be generated by the DBMS (such as a globally unique identifier, or GUID, in Microsoft SQL Server).

    Candidate Key
    Definition: A candidate key is a combination of attributes that can be used to uniquely identify a database record. Each table may have one or more candidate keys. One of these candidate keys is selected as the table's primary key.
    Examples: There are a large number of candidate keys in the sample table below. Some of these are <SSN>, <Phone Extension>, <Name, SSN>, and <Name, Age, SSN>. Note that <Age> is not a candidate key in this case because Amy and Elizabeth share the same age. Strictly speaking, a candidate key is also minimal: no attribute can be removed from it without losing uniqueness. By that stricter definition, <Name, SSN> and <Name, Age, SSN> are superkeys rather than candidate keys.

        Name        Age   Social Security No (SSN)   Phone Extension   Department Code
        Rob         28    123-45-6789                1242              001
        Amy         34    987-65-4321                9281              002
        Elizabeth   34    111-22-3333                9312              002
        Jim         42    333-22-1111                3214              005
        Mike        29    999-99-9999                2314              004

    Foreign Key
    Definition: A foreign key is a field in a relational table that matches the primary key column (e.g. department code) of another table (Department). The foreign key can be used to cross-reference tables.

    Index
  • 6. Definition: An index is a database feature used for locating data quickly within a table. Indexes are defined by selecting a set of commonly searched attributes on a table and using the appropriate platform-specific mechanism to create an index.
    Example: Personnel information may be stored in a Human Resources department's employee table. Clerks find that they often search the table for employees by last name but get slow query responses. Defining an index on the table consisting of the last name attribute would speed up these queries.

    Data Mining
    Definition: Data mining is the use of automated data analysis techniques to uncover previously undetected relationships among data items. Data mining often involves the analysis of data stored in a data warehouse. Three of the major data mining techniques are regression, classification and clustering.

    Data Warehouse
    Definition: A data warehouse is a centralized database that captures information from various parts of an organization's business processes. This information can later be analyzed to determine predictive relationships through the use of data mining techniques.

    Enterprise
    Definition: An enterprise is an organization that uses computers and applications. In general use, the term refers to businesses and organizations that operate on a large scale. Applications that are designed for these organizations are often referred to as enterprise applications.
    Example: A multinational company that has interconnected computer users located around the world could be considered an enterprise. The network operating system that they use can be referred to as an enterprise operating system. The database that stores their global sales information is both an enterprise application and an enterprise database.

    Entity
    Definition: An entity is a single object (e.g. student, course, department, project) about which data can be stored. It is the "subject" of a table.
    Entities and their interrelationships are modeled through the use of entity-relationship diagrams.
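The Index entry on this slide describes clerks searching an employee table by last name. A minimal sketch of that scenario follows, assuming Python's built-in sqlite3 module. The `employees` table, its rows, the index name `idx_employees_last_name` and the `query_plan` helper are all hypothetical, and other database platforms create and report indexes differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT)"
)
conn.executemany(
    "INSERT INTO employees (last_name, first_name) VALUES (?, ?)",
    [("Smith", "Ann"), ("Jones", "Bob"), ("Reynolds", "Cy")],
)


def query_plan(sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT id, first_name FROM employees WHERE last_name = ?"

# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(lookup, ("Jones",))

# Index the commonly searched attribute, then ask for the plan again.
conn.execute("CREATE INDEX idx_employees_last_name ON employees (last_name)")
plan_after = query_plan(lookup, ("Jones",))

print(plan_before)
print(plan_after)
```

The exact wording of the plan text varies between SQLite versions, but after the index is created the plan names `idx_employees_last_name` instead of describing a full table scan.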
  • 7. Entity-Relationship Diagram
    Definition: An entity-relationship diagram is a specialized graphic that illustrates the interrelationships (e.g. 1 to 1, 1 to N, N to N) between entities in a database.
    Also Known As: ER Diagram, E-R Diagram, entity-relationship model

    Flat File
    Definition: Flat files are data files that contain records with no structured relationships. Additional knowledge, such as the file format properties, is required to interpret these files. Modern database management systems use a more structured approach to file management (such as one defined by the Structured Query Language) and therefore have more complex storage arrangements.
    Example: Many database management systems offer the option to export data to a comma-delimited file. This type of file contains no inherent information about the data, and interpretation requires additional knowledge. For this reason, this type of file can be referred to as a flat file.

    Normalization
    Definition: Normalization is the process of structuring a relational database schema such that most ambiguity is removed. The stages of normalization are referred to as normal forms and progress from the least restrictive (First Normal Form) through the most restrictive (Fifth Normal Form). Generally, most database designers do not attempt to implement anything higher than Third Normal Form or Boyce-Codd Normal Form.

    Boyce-Codd Normal Form (BCNF)
    Definition: A relation is in Boyce-Codd Normal Form (BCNF) if every determinant is a candidate key.

    First Normal Form (1NF)
  • 8. Definition: A relation is said to be in First Normal Form (1NF) if and only if each attribute of the relation is atomic. More simply, to be in 1NF, each column must contain only a single value and each row must contain the same columns.
    Example: The following table is NOT in First Normal Form:

        Manager   Employees
        Jim       Susan, Rob, Beth
        Mary      Alice, John, Asim
        Renee     Mike
        Joe       Alan, Tim

    Here is an alternative option that IS in 1NF.

        Manager   Employee
        Jim       Susan
        Jim       Rob
        Jim       Beth
        Mary      Alice
        Mary      John
        Mary      Asim
        Renee     Mike
        Joe       Alan
        Joe       Tim

    Second Normal Form (2NF)
    Definition: In order to be in Second Normal Form, a relation must first fulfill the requirements to be in First Normal Form. Additionally, each nonkey attribute in the relation must be functionally dependent upon the primary key.
    Example: The following relation is in First Normal Form, but not Second Normal Form:

        Order #   Customer          Contact Person    Total
        1         Acme Widgets      John Doe          $134.23
        2         ABC Corporation   Fred Flintstone   $521.24
        3         Acme Widgets      John Doe          $1042.42
        4         Acme Widgets      John Doe          $928.53
  • 9. In the table above, the order number serves as the primary key. Notice that the customer and total amount are dependent upon the order number -- this data is specific to each order. However, the contact person is dependent upon the customer. An alternative way to accomplish this would be to create two tables:

        Customer          Contact Person
        Acme Widgets      John Doe
        ABC Corporation   Fred Flintstone

        Order #   Customer          Total
        1         Acme Widgets      $134.23
        2         ABC Corporation   $521.24
        3         Acme Widgets      $1042.42
        4         Acme Widgets      $928.53

    The creation of two separate tables eliminates the dependency problem experienced in the previous case. In the first table, contact person is dependent upon the primary key -- customer name. The second table only includes the information unique to each order. Someone interested in the contact person for each order could obtain this information by performing a JOIN operation.

    Third Normal Form (3NF)
    Definition: In order to be in Third Normal Form, a relation must first fulfill the requirements to be in Second Normal Form. Additionally, all attributes that are not dependent upon the primary key must be eliminated.
    Examples: The following table is NOT in Third Normal Form:

        Company           City       State   ZIP
        Acme Widgets      New York   NY      10169
        ABC Corporation   Miami      FL      33196
        XYZ, Inc.         Columbia   MD      21046

    In this example, the city and state are dependent upon the ZIP code. To place this table in 3NF, two separate tables would be created -- one containing the company name and ZIP code and the other containing city, state, ZIP code pairings.
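The 3NF decomposition just described can be sketched with Python's built-in sqlite3 module. The table names `companies` and `zip_codes` are hypothetical, but the rows are the ones from the sample table. The JOIN at the end shows that the original four-column view can still be rebuilt from the two smaller tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- City and state depend only on the ZIP code, so they get their own table.
    CREATE TABLE zip_codes (
        zip   TEXT PRIMARY KEY,
        city  TEXT NOT NULL,
        state TEXT NOT NULL
    );
    -- The company table keeps only the ZIP code as a foreign key.
    CREATE TABLE companies (
        company TEXT PRIMARY KEY,
        zip     TEXT NOT NULL REFERENCES zip_codes (zip)
    );
    INSERT INTO zip_codes VALUES
        ('10169', 'New York', 'NY'),
        ('33196', 'Miami',    'FL'),
        ('21046', 'Columbia', 'MD');
    INSERT INTO companies VALUES
        ('Acme Widgets',    '10169'),
        ('ABC Corporation', '33196'),
        ('XYZ, Inc.',       '21046');
    """
)

# Rebuild the original Company / City / State / ZIP view with a JOIN.
rows = conn.execute(
    """
    SELECT c.company, z.city, z.state, z.zip
    FROM companies AS c
    JOIN zip_codes AS z ON z.zip = c.zip
    ORDER BY c.company
    """
).fetchall()

for row in rows:
    print(row)
```

Each city and state is now stored once per ZIP code, at the cost of a join whenever the full address is needed.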
  • 10. This may seem overly complex for daily applications and indeed it may be. Database designers should always keep in mind the tradeoffs between higher level normal forms and the resource issues that complexity creates.

    Fourth Normal Form (4NF)
    Definition: To be in Fourth Normal Form, a relation must first be in Boyce-Codd Normal Form. Additionally, a given relation may not contain more than one multivalued attribute.
    Examples: The following relation is NOT in Fourth Normal Form:

        Manager   Child   Employee
        Jim       Beth    Alice
        Mary      Bob     Jane
        Mary      NULL    Adam

    Each manager can have more than one child and each manager can supervise more than one employee. Therefore, this relation is not in Fourth Normal Form. The creation of two separate relations for the Manager/Child and Manager/Employee relationships would put this relation in Fourth Normal Form.

    Functional Dependency
    Definition: A functional dependency occurs when one attribute in a relation uniquely determines another attribute. This can be written A -> B, which is the same as stating "B is functionally dependent upon A."
    Examples: In a table listing employee characteristics including Social Security Number (SSN) and name, it can be said that name is functionally dependent upon SSN (or SSN -> name) because an employee's name can be uniquely determined from their SSN. However, the reverse statement (name -> SSN) is not true because more than one employee can have the same name but different SSNs.

    Lock
    Definition: Database management systems use locks to provide concurrency control. Common uses of locks are to ensure that only one user can modify a record at a time and that data cannot be read while it is being modified. Locking mechanisms can be enforced at the row, page or table level.

  • 11. Metadata
    Definition: Metadata is literally "data about data." This term refers to information about data itself -- perhaps the origin, size, formatting or other characteristics of a data item. In the database field, metadata is essential to understanding and interpreting the contents of a data warehouse.
    Example: The eXtensible Markup Language (XML) is a metadata format used to define other data objects.

    Replication
    Definition: Replication is the process of sharing information between databases (or any other type of server) to ensure that the content is consistent between systems. Replication is normally used to increase the number of database servers available to clients, thereby reducing the load on each.

    Form
    Definition: A database form can be used to facilitate database data entry and/or retrieval operations. A database developer/administrator usually designs a form which can then be used by personnel without any specific database skills to perform repetitive tasks.
    Examples: The picture below shows an example form from a Microsoft Access database:
    [Image: example form from a Microsoft Access database]

    Report
  • 12. Definition: A database report presents information retrieved from a table or query in a preformatted, attractive manner.
    Examples: A sample Microsoft Access report is shown below:
    [Image: sample Microsoft Access report]

    Repository
    Definition: A repository is a collection of resources that can be accessed to retrieve information. Repositories often consist of several databases tied together by a common search engine.

    Transaction
    Definition: A transaction is a group of database commands that is treated as a single atomic event. Distributed transactions are coordinated using the two phase commit system.

    Two Phase Commit
    Definition: Two Phase Commit is the process by which a relational database ensures that distributed transactions are performed in an orderly manner. In this system, transactions may be terminated by either committing them or rolling them back.
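The idea of a transaction as a single atomic event can be sketched on one database with Python's built-in sqlite3 module. This is not a two phase commit, which coordinates several databases. The `accounts` table, its balances, and the `transfer` and `balances` helpers are hypothetical.

```python
import sqlite3

# isolation_level=None leaves transaction control to the explicit
# BEGIN / COMMIT / ROLLBACK statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")


def transfer(amount, source, target):
    """Move amount between two accounts as one atomic transaction."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK constraint: undo the credit as well.
        conn.execute("ROLLBACK")
        return False


def balances():
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(30, "alice", "bob")       # both updates are committed
failed = transfer(500, "alice", "bob")  # neither update survives
print(ok, failed, balances())
```

In the second call the credit to `bob` has already been applied when the debit fails, so without the rollback the table would be left half updated.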
  • 13. Query
    Definition: Queries are the primary mechanism for retrieving information from a database and consist of questions presented to the database in a predefined format. Many database management systems use the Structured Query Language (SQL) standard query format.

    Structured Query Language (SQL)
    Definition: The Structured Query Language is an industry-standard language used for manipulation of data in a relational database. The major SQL commands of interest to database users are SELECT, INSERT, JOIN and UPDATE.

    SELECT
    Definition: The SELECT statement in SQL is the primary mechanism for retrieving information from a relational database.
    Examples: Given the following table:

        Members
        ID   LastName   Age
        1    Smith      25
        2    Jones      42
        3    Reynolds   36

    This SQL statement:

        SELECT LastName FROM Members WHERE Age > 30

    Would produce the following results:

        LastName
        Jones
        Reynolds

    INSERT
  • 14. Definition: The INSERT SQL command is used to add records to a table within a database.
    Examples: Given the following simple table:

        Members
        ID   LastName   Age
        1    Smith      25
        2    Jones      42

    The following SQL statement could be used to add a new record:

        INSERT INTO Members VALUES (3, 'Reynolds', 36)

    Which would produce the new table:

        ID   LastName   Age
        1    Smith      25
        2    Jones      42
        3    Reynolds   36

    JOIN
    Definition: The SQL JOIN clause is used to combine the data contained in two relational database tables based upon a common attribute.
    Examples: Given the following two tables:

        Customers
        CustomerID   CompanyName       Phone
        12           ABC Corporation   123-4567
        49           XYZ, Inc.         765-4321

        Orders
        OrderID   CustomerID   Amount
        4021      12           $842.21
        8532      12           $582.20
        8192      49           $12.43

  • 15. The following JOIN statement could be used:

        SELECT Customers.CompanyName, Orders.Amount FROM Customers JOIN Orders ON Customers.CustomerID = Orders.CustomerID

    Which would display the following results:

        CompanyName       Amount
        ABC Corporation   $842.21
        ABC Corporation   $582.20
        XYZ, Inc.         $12.43

    Alternatively, the JOIN can be performed implicitly with a SELECT statement such as:

        SELECT CompanyName, Amount FROM Customers, Orders WHERE Customers.CustomerID = Orders.CustomerID

    The join condition must still be specified in the WHERE clause. Without it, the query returns every combination of customer and order rows (a Cartesian product) rather than matching rows on the shared CustomerID column. The WHERE clause can also be used to further refine the results. For example, if we only wanted results from ABC Corporation we could use the statement:

        SELECT CompanyName, Amount FROM Customers, Orders WHERE Customers.CustomerID = Orders.CustomerID AND CompanyName = 'ABC Corporation'

    UPDATE
    Definition: The UPDATE statement in SQL is used to edit values for attributes in one or more records of a relational table.
    Example: Given the following table:

        Members
        ID   LastName   Age
        1    Smith      25
        2    Jones      42
        3    Reynolds   36

  • 16. Assume that the member Jones recently changed her last name to McGuire. This change could be effected using the following SQL statement:

        UPDATE Members SET LastName = 'McGuire' WHERE ID = 2

    COMMIT
    Definition: The COMMIT statement in SQL marks the final step in the processing of a database transaction. The alternative is to use the ROLLBACK command to cancel the proposed database changes.
    Examples: The COMMIT statement is used in the following manner:

        BEGIN TRANSACTION [transaction_name]
        ... SQL Statement(s) ...
        COMMIT TRANSACTION [transaction_name]

    Rollback
    Definition: The ROLLBACK statement in SQL cancels the proposed changes in a pending database transaction. The transaction can be rolled back completely by specifying the transaction name in the ROLLBACK statement. A partial rollback can also be accomplished by specifying a savepoint name in lieu of the transaction name. The alternative to rolling back a transaction is to use the COMMIT command to make the proposed changes part of the relational database.
    Examples: The ROLLBACK statement is used in the following manner to cancel an entire transaction:

        BEGIN TRANSACTION [transaction_name]
        ... SQL Statement(s) ...
        ROLLBACK TRANSACTION [transaction_name]

  • 17. The ROLLBACK command can also be used to cancel part of a transaction in the following manner:

        BEGIN TRANSACTION [transaction_name]
        ... SQL Statement(s)
        SAVE TRANSACTION savepoint_name
        SQL Statement(s)
        ROLLBACK TRANSACTION savepoint_name

    NULL
    Definition: The NULL SQL keyword is used to represent either a missing value or a value that is not applicable in a relational table.
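A partial rollback of the kind shown on this slide can be tried with Python's built-in sqlite3 module, as a rough sketch. SQLite spells the statements `SAVEPOINT name` and `ROLLBACK TO name` rather than `SAVE TRANSACTION` and `ROLLBACK TRANSACTION`. The savepoint name `before_delete` and the DELETE statement are hypothetical additions; the Members rows and the UPDATE come from the earlier examples.

```python
import sqlite3

# isolation_level=None leaves transaction control to the explicit
# BEGIN / SAVEPOINT / ROLLBACK TO / COMMIT statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Members (ID INTEGER PRIMARY KEY, LastName TEXT, Age INTEGER)"
)
conn.execute(
    "INSERT INTO Members VALUES (1, 'Smith', 25), (2, 'Jones', 42), (3, 'Reynolds', 36)"
)

conn.execute("BEGIN")
# First change: the rename from the UPDATE example.
conn.execute("UPDATE Members SET LastName = 'McGuire' WHERE ID = 2")

# Mark a point that later statements can be rolled back to.
conn.execute("SAVEPOINT before_delete")
conn.execute("DELETE FROM Members WHERE Age > 30")

# Undo only the DELETE; the UPDATE made before the savepoint is kept.
conn.execute("ROLLBACK TO before_delete")

# Make the surviving change permanent.
conn.execute("COMMIT")

names = [row[0] for row in conn.execute("SELECT LastName FROM Members ORDER BY ID")]
print(names)
```

After the commit all three members are still present and member 2 carries the new last name, which is the behavior the savepoint form of ROLLBACK is meant to provide.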