Wassenar497.doc Document Transcript

  • 1. The College of New Jersey
    Advanced Open Source Databases
    April 25, 2002
    David Wassenar
    Submitted in partial fulfillment of the requirements of Computer Science Course CMSC 497
  • 2. Abstract MySQL, one of the most commonly used relational databases on websites, has grown in popularity in recent years. Originally intended mainly for small-scale and web-based databases, it is now also used in many large-scale environments. Although the database has a very large user base and extensive documentation, little other information about it is available. The documentation is helpful, but it leaves many questions unanswered, and the unique aspects of MySQL are often lost in the technical detail. Our research group looked into many features of MySQL, from installation and internal configuration, through replication for large databases, to building applications on top of it. This paper discusses the two-tier architecture of MySQL and the common database features that MySQL does not support.
  • 3. One of the most distinctive features of MySQL is how the database itself works. It is hard to compare the code of MySQL with that of other major databases, such as DB2, Oracle, or SQL Server, because they are all closed source, but it is fairly safe to say that the way the MySQL code works is unusual. The MySQL database functions as two separate parts, each with its own layers of abstraction. The first part is the MySQL server program. If programmers had to write socket connections or connection pooling every time they wrote a database application, development time would be tremendous. Instead, as in other RDBMSs (Relational Database Management Systems), the server code handles all of this. The MySQL server code takes care of establishing connections, connection pooling, handling requests, and other server-related actions. When a program connects to a MySQL database, the first thing its request passes through is the MySQL server code. The MySQL server is entirely separate from the tables and files that store the information. Its only concerns are validating requests, handling them, and passing them on. This may sound like a simple job, but for a database it is not necessarily so. Depending on the size of the database’s user pool, the database could receive thousands of hits per minute. The server has to make sure every request is valid, manage each separate connection, and pass the requests on to the table-handler. The table-handler is the second part of the database. While in most RDBMSs the server and the table-handler come together, MySQL has several different
  • 4. choices. Also, since MySQL is open source, other developers can write table-handlers for the database. In a database such as Oracle it is possible to write utilities, but it is very hard, if not impossible, for outside developers to write add-ons to the Oracle code itself. The table-handler receives the query from the MySQL server and handles it from there. The MySQL server never sees any of the processing. All it knows is that it makes a request and gets back a result. It has no idea how the table-handler controls the data or how the data is stored. It has no concept of indexes, primary keys, foreign keys, or tablespaces. The table-handler, in turn, has no concept of connections, server sockets, or the applications making the requests. There are several benefits derived from the two-level architecture:
      • Full abstraction. This makes the MySQL server code and the table-handler each easier to test and debug.
      • Table-handlers can be designed separately, by other people, without having to worry about what the MySQL server does.
      • Different table-handlers can be used for different table types.
      • Features can be added on a per-table-handler basis.
    Below is a list of the main features of each level of code:
  • 5. MySQL Server:
      • Contains the portability code, allowing the MySQL database to run on multiple operating systems and platforms.
      • Handles all communication with the client or application.
      • Manages the access privileges of users and verifies that they are allowed to access the information they request.
      • Parses and optimizes SQL code.
      • Executes the SQL code.
      • Takes care of higher-level issues such as replication.
    Table-Handler:
      • Runs the query on the database. (While the MySQL server executes the SQL code, it doesn’t actually run it; it just breaks it down into simpler instructions for the table-handler.)
      • Decides how to write the data to memory.
      • Handles the file system used to store the data in files.
      • Handles access to the information.
      • Handles features such as transactions, keys, and table-, row-, or group-level locking.
    Currently there are three main types of table-handler in use: MyISAM, BDB, and InnoDB. Each has its own strengths and weaknesses. MyISAM is the default table-handler. BDB and InnoDB are included with the install in most versions; however, some configuration changes are needed before they can be used. The default type, MyISAM, lacks many of the features that BDB and InnoDB provide. For example, it does not support foreign keys or transactions.
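
    The division of labor described above can be sketched in a few lines of Python. This is only an illustration of the idea, not MySQL's actual code or interfaces: the class names (TableHandler, ListHandler, CountingHandler, Server) and their methods are hypothetical, and the "server" here only checks privileges and routes requests, leaving all storage decisions to whichever handler a table was created with.

```python
from abc import ABC, abstractmethod


class TableHandler(ABC):
    """Storage layer: knows about rows, nothing about users or connections."""

    @abstractmethod
    def insert(self, table, row):
        ...

    @abstractmethod
    def scan(self, table):
        ...


class ListHandler(TableHandler):
    """Hypothetical handler that keeps each table as a plain Python list."""

    def __init__(self):
        self._tables = {}

    def insert(self, table, row):
        self._tables.setdefault(table, []).append(dict(row))

    def scan(self, table):
        return [dict(r) for r in self._tables.get(table, [])]


class CountingHandler(ListHandler):
    """Hypothetical handler that adds a feature (a write counter) of its own."""

    def __init__(self):
        super().__init__()
        self.writes = 0

    def insert(self, table, row):
        super().insert(table, row)
        self.writes += 1


class Server:
    """Front end: checks privileges, then hands the request to the handler."""

    def __init__(self):
        self._handlers = {}   # table name -> handler chosen for that table
        self._grants = set()  # (user, table) pairs allowed to access a table

    def create_table(self, table, handler):
        self._handlers[table] = handler

    def grant(self, user, table):
        self._grants.add((user, table))

    def _handler_for(self, user, table):
        if (user, table) not in self._grants:
            raise PermissionError(f"{user} may not access {table}")
        return self._handlers[table]

    def insert(self, user, table, row):
        self._handler_for(user, table).insert(table, row)

    def select(self, user, table):
        return self._handler_for(user, table).scan(table)
```

    Note that Server never inspects how rows are stored, and neither handler knows which user made a request, which is the separation the two lists above describe.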
  • 6. BDB was the first table type that allowed transactions. However, BDB tables support only table-level locking. This means that while an insert, delete, or update query is running on a table, the whole table is locked against any other query. Once the query completes, the table is unlocked. While this is more than MyISAM supports, it is not sufficient for larger databases. Innobase Oy released InnoDB in 2001. Since then, it has been gaining share among the table-handler choices. Like BDB, it supports transactions. However, it also supports row-level locking. With row-level locking, when an update, insert, or delete query is running, only the affected rows are locked. A user can still access the rest of the table. This is very important in large databases, where hundreds of inserts, updates, and deletes might be running per second. InnoDB also has a large number of other features. It is the only table-handler that allows declared foreign keys. In smaller databases foreign keys and foreign key integrity are not a large issue, but in large-scale databases they can be a very important factor. InnoDB also provides an Oracle-style consistent read, also known as multi-versioned concurrency control. This allows selects to run without interfering with other queries at all. Again, this feature is available in MySQL only through the InnoDB table type. The information above makes several of the benefits of table-handlers clear. InnoDB was released by Innobase Oy, a company that is separate from MySQL AB. This was possible only because of the two-tier architecture of the code. It would be impossible to write another table-handler for a database such as Oracle or SQL Server. Note also all the extra features that the InnoDB table-handler adds. Some of these features will cause the database to slow down. MySQL AB purposely provides
  • 7. MyISAM with stripped-down features. Transactions, for example, require a great deal more system resources and memory. It is possible to implement a database using both types of table: MyISAM for non-transactional tables, and InnoDB for tables requiring foreign keys or transactions. The diagram illustrates how each table-handler maintains its own tables and how the handlers can run at the same time. MyISAM has three file types: .FRM, .MYI, and .MYD. The .FRM file stores the structure of the table and other information relating to the table. The .MYI file, or MyISAM index file, contains the indexes for the table. The .MYD file, or MyISAM data file, contains the actual data in the table. Thus, each table has its own three files. InnoDB uses a feature also found in Oracle called a tablespace. The size of the tablespace is defined in the my.cnf file. Each table will create more tablespaces as it needs more room. The benefit of this is that the data is split over multiple files, so tables can be larger than 2 gigabytes. Many operating systems have a maximum file size of 2 gigabytes. With MyISAM, once a table’s .MYD file grew that large, nothing more could be added to it. With the information distributed over multiple files, this isn’t a problem.
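
    To make the difference between table-level and row-level locking concrete, here is a minimal Python sketch. It is not how BDB or InnoDB implement locking; the class names (TableLockStore, RowLockStore) and the try_lock/unlock methods are hypothetical, and the only point is that with one lock per table any two writers conflict, while with one lock per row only writers touching the same row do.

```python
import threading


class TableLockStore:
    """Table-level locking: one lock guards every row in the table."""

    def __init__(self):
        self._lock = threading.Lock()

    def try_lock(self, row_id):
        # The row id is ignored: any write needs the single table lock.
        return self._lock.acquire(blocking=False)

    def unlock(self, row_id):
        self._lock.release()


class RowLockStore:
    """Row-level locking: each row gets its own lock, created on demand."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()  # protects the dictionary of row locks

    def try_lock(self, row_id):
        with self._guard:
            lock = self._locks.setdefault(row_id, threading.Lock())
        return lock.acquire(blocking=False)

    def unlock(self, row_id):
        self._locks[row_id].release()
```

    With TableLockStore, a writer holding row 1 makes try_lock(2) fail until it unlocks; with RowLockStore, try_lock(2) succeeds immediately and only a second try_lock(1) is refused.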
  • 8. One of the reasons that MySQL is a very fast and efficient database is that many features supported in other databases are not supported in it. Some of these features are planned for later versions; some are not. Most of the features to be implemented are based on user demand. MySQL AB decides which features to include by what the user population, particularly those with support contracts, wishes to see in MySQL. The largest feature known to be missing from MySQL was transactions. However, this is easily corrected by using InnoDB tables. InnoDB tables, as discussed above, provide many enhancements for larger databases that require features such as transactions. InnoDB tables also handle foreign keys, another feature that was missing from the base distribution of MySQL. The next most common complaint is that MySQL lacks sub selects. A sub select is an SQL statement nested inside another SQL statement. MySQL AB plans to have sub selects implemented by version 4.3; the current version is 4.0.1. There is a workaround for sub selects: almost anything done with a sub select can be done with a join. Two features that go hand in hand and are not supported by MySQL are triggers and stored procedures. While these two features are not used by everyone, those who are accustomed to them may find their absence a real pain. MySQL AB plans to implement them eventually, but does not yet have a timetable. A stored procedure is a piece of code stored on the server, which can be called upon when needed. It is almost like a library that is located very close to the database, so it tends to be very efficient. An example of a stored procedure might be to reduce an item’s inventory by one when a purchase is made. A
  • 9. trigger often works with a stored procedure. A trigger is set off by a particular event. Using the same example, a trigger might be activated when something is bought. The trigger then calls the stored procedure to decrement inventory. The main workaround is simply to write applications differently so that they do not rely on triggers or stored procedures. The last major feature not supported in MySQL is views. Views are mainly used for security reasons. They are well suited to hiding data from those who do not need to see it, without the need to set up intricate permissions on every column of a table. MySQL has been used primarily as a database on the web, where there is very little need for views. MySQL AB has no plans to implement views at this time, as there is simply very little user demand for them.
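
    The two workarounds mentioned above, replacing a sub select with a join and moving trigger or stored-procedure logic into the application, can be sketched as follows. This is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for a MySQL connection (SQLite happens to accept both query forms, which lets the results be compared), and the items and purchases tables, their columns, and the helper function names are all hypothetical.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory database with hypothetical inventory tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, stock INTEGER);
        CREATE TABLE purchases (id INTEGER PRIMARY KEY, item_id INTEGER);
        INSERT INTO items VALUES (1, 'widget', 5), (2, 'gadget', 3), (3, 'gizmo', 0);
        INSERT INTO purchases (item_id) VALUES (1), (1), (2);
    """)
    return conn


def purchased_names_subselect(conn):
    """Names of items bought at least once, written with a sub select."""
    rows = conn.execute(
        "SELECT name FROM items "
        "WHERE id IN (SELECT item_id FROM purchases) ORDER BY name"
    )
    return [name for (name,) in rows]


def purchased_names_join(conn):
    """The same question answered with a join instead of a sub select."""
    rows = conn.execute(
        "SELECT DISTINCT items.name FROM items "
        "JOIN purchases ON purchases.item_id = items.id ORDER BY items.name"
    )
    return [name for (name,) in rows]


def record_purchase(conn, item_id):
    """Application-side stand-in for a trigger plus stored procedure:
    insert the purchase and decrement stock in one transaction."""
    with conn:  # commits if both statements succeed, rolls back otherwise
        conn.execute("INSERT INTO purchases (item_id) VALUES (?)", (item_id,))
        conn.execute("UPDATE items SET stock = stock - 1 WHERE id = ?", (item_id,))
```

    Calling record_purchase(conn, 2) does in application code what the trigger in the example would have done on the server; the cost is that every application writing to purchases has to remember to go through this function.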
  • 10. Summary No one can deny the power of the MySQL database. It is already gaining in market share and usage. Many have thought of it as a stripped-down database, but a closer look proves that view wrong. Its two-tier architecture and open source business plan allow MySQL to continue growing strongly. In a future market where many small databases will be prevalent, MySQL will be found more and more. The features missing from MySQL can all be worked around. While some may find the lack of stored procedures, triggers, and sub selects a pain, there are many ways to get around them. Compared with the thousands of dollars more that Oracle costs, those features may turn out to be far less important than they first seem. Databases used to be found only in large-scale, multibillion-dollar businesses. Now, even a small eCommerce website is likely to be running some form of relational database.
  • 11. Bibliography
    DuBois, Paul. “MySQL.” New Riders, 2000.
    DuBois, Paul. “MySQL and Perl for the Web.” New Riders, 2001.
    The MySQL website. http://www.mysql.com/