The College of New Jersey
Advanced Open Source Databases
April 25, 2002
Submitted in partial fulfillment of the requirements of
Computer Science Course CMSC 497
MySQL, one of the relational databases most commonly used on websites,
has been growing in popularity in recent years. Originally intended mainly for small-scale
and web-based databases, it is now being used in many large-scale
environments as well. While the database has a very large user base and extensive
documentation, there is very little other information available on it.
While the documentation is helpful, many questions remain unanswered.
MySQL has many unique aspects; however, they are often lost in the
Our research group looked into many features of MySQL, from installation and
internal configuration, through replication for large databases, to
building applications on top of it. This paper discusses the two-tier architecture of
MySQL, as well as common database features that MySQL does not support.
One of the most distinctive features of MySQL is how the database itself works.
It is hard to compare the code of MySQL with that of other major databases, such as DB2,
Oracle, or SQL Server, because they are all closed source. Even so, the way the
MySQL code is organized appears to be unusual. The MySQL database actually functions as
two separate parts, each with its own layers of abstraction. The first part is the MySQL server
If a programmer had to write socket connections or connection pooling
each time they wrote an application for a database, the
amount of time needed to write each application would be tremendous. Instead, as in other
RDBMSs (Relational Database Management Systems), the server code handles all
of this. The MySQL server code takes care of establishing connections, connection
pooling, handling requests, and other server-related actions.
When a program attempts to connect to the MySQL database, the first thing it
passes through is the MySQL server code. The MySQL server is totally separate from
the tables and files that store the information. All it does is validate
the request, handle it, and pass it on.
While this may sound like a very simple job, it is not necessarily so with a
database. Depending on the size of the database’s user pool, the database could be
getting thousands of hits per minute. The server has to make sure all of the requests are
valid, handle each of the separate connections, and pass the requests on to the table
The table-handler is the second part of the database. While in most RDBMSs
the server and the table-handler are bundled together, MySQL offers several different
table-handlers to choose from. Also, since MySQL is open source, it is possible for other
developers to develop table-handlers for the database. In a database such as Oracle it is
possible to write utilities, but it is very hard, if not impossible, for outside developers
to write add-ons to the actual Oracle code.
The table-handler receives the query from the MySQL server and handles it from
there. The MySQL server never actually sees any of the processing. All that it
knows is that it makes a request and gets back a result. The server has no idea how the
table-handler controls the data, or how the data is stored. It has no concept of indexes,
primary keys, foreign keys, or tablespaces. The table-handler, in turn, has no concept of
connections, server sockets, or the applications making the requests.
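The separation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not MySQL's actual code: the class names (TableHandler, InMemoryHandler, Server), the method names, and the simple user check are all assumptions made for the example.

```python
from abc import ABC, abstractmethod


class TableHandler(ABC):
    """Storage layer: knows how rows are kept, nothing about clients."""

    @abstractmethod
    def insert(self, table, row):
        """Store one row (a dict) in the named table."""

    @abstractmethod
    def scan(self, table):
        """Return every row of the named table as a list of dicts."""


class InMemoryHandler(TableHandler):
    """Hypothetical handler that keeps rows in Python lists."""

    def __init__(self):
        self._tables = {}

    def insert(self, table, row):
        self._tables.setdefault(table, []).append(dict(row))

    def scan(self, table):
        return [dict(r) for r in self._tables.get(table, [])]


class Server:
    """Front layer: validates the request, then delegates to a handler."""

    def __init__(self, handler, allowed_users):
        self._handler = handler
        self._allowed = set(allowed_users)

    def handle(self, user, action, table, row=None):
        # The server checks access; it never touches the stored data itself.
        if user not in self._allowed:
            raise PermissionError(f"access denied for {user}")
        if action == "insert":
            self._handler.insert(table, row or {})
            return None
        if action == "select":
            return self._handler.scan(table)
        raise ValueError(f"unsupported action: {action}")
```

Because Server depends only on the TableHandler interface, a different handler (for example one that writes to files) could be swapped in without changing the server class.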
There are several benefits derived from the two-level architecture.
• Full abstraction. This makes testing and debugging both the MySQL server code
and the table-handler easier.
• Table-handlers can be designed separately, by other people, without having to
worry about what the MySQL server does.
• Different table-handlers can be used for different table types.
• Features can be added based on table-handlers.
Below is a list of the main features of each level of code:
MySQL server:
• Contains the portability code, allowing the MySQL database to run on multiple
operating systems and platforms.
• Handles all communication with the client or application.
• Takes care of the access privileges of users, and verifies whether they are allowed to
access the information they are requesting.
• Parses and optimizes SQL code.
• Executes the SQL code.
• Takes care of higher-level issues such as replication.
Table-handler:
• Runs the query on the database. (While the MySQL server executes the SQL
code, it does not actually run it against the data. It just breaks it down into simpler
instructions.)
• Decides how to write the data to memory.
• Handles the file system used in storing the data in files.
• Handles accessing the information.
• Handles features such as transactions, keys, and table-, row-, or group-level locking.
Currently there are three main types of table-handlers in use: MyISAM,
BDB, and InnoDB. Each table-handler has its own strengths and weaknesses.
MyISAM is the default table-handler. BDB and InnoDB are included with the install in
most versions; however, some configuration changes are needed before they can be used. The
default type, MyISAM, lacks many of the features that BDB and
InnoDB provide. For example, it does not support foreign keys or transactions.
BDB was the first table type that allowed transactions. However, BDB tables lock at the
page level rather than the row level. This means that when an insert, delete, or update
query is run on a table, every row stored on the affected page is locked out from other
queries, not just the row being changed. Once the query is completed,
the page is unlocked. While this is more than MyISAM supports, it is not sufficient for
Innobase Oy released InnoDB in 2001. Since then, it has been gaining
share among the available table-handlers. Like BDB, it supports transactions. However, it
also supports row-level locking. With row-level locking, when an update,
insert, or delete query is running, only the affected rows are locked. A user can still
access the rest of the table. This is very important in large databases, where hundreds of
inserts, updates, and deletes might be running per second.
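The difference between locking a whole table and locking only the affected rows can be sketched as follows. This is a simplified model built on Python's threading.Lock, not a description of how any MySQL table-handler implements locking; the class and function names are hypothetical.

```python
import threading
from collections import defaultdict


class TableLevelLocking:
    """One lock guards the whole table."""

    def __init__(self):
        self._lock = threading.Lock()

    def try_lock(self, row_id):
        # row_id is ignored: any write locks every row in the table.
        return self._lock.acquire(blocking=False)

    def unlock(self, row_id):
        self._lock.release()


class RowLevelLocking:
    """Each row gets its own lock, created on first use."""

    def __init__(self):
        self._locks = defaultdict(threading.Lock)

    def try_lock(self, row_id):
        return self._locks[row_id].acquire(blocking=False)

    def unlock(self, row_id):
        self._locks[row_id].release()


def second_writer_can_proceed(table):
    """Lock row 1, then report whether row 2 can be locked at the same time."""
    assert table.try_lock(1)
    got_row_2 = table.try_lock(2)
    if got_row_2:
        table.unlock(2)
    table.unlock(1)
    return got_row_2
```

With TableLevelLocking the second writer is refused even though it wants a different row; with RowLevelLocking it can proceed, which is why row-level locking matters when many writes arrive at once.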
InnoDB also has a large number of other features. It is the only table-handler that
allows declared foreign keys. In smaller databases foreign keys and foreign key
integrity are not a large issue, but in large-scale databases they can be very
important. InnoDB also provides an Oracle-style consistent read, also known as
multiversion concurrency control. This allows selects to run without interfering with other
queries. Again, this feature is available in MySQL only through the InnoDB table type.
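The idea behind a consistent read can be illustrated with a toy versioned store. This is a minimal sketch under simplifying assumptions (a single global commit counter, no transactions, no cleanup of old versions) and is not how InnoDB is implemented; VersionedStore and its methods are hypothetical names.

```python
class VersionedStore:
    """Keeps every committed version of a value so readers never block writers."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_number, value), oldest first
        self._commit = 0

    def write(self, key, value):
        # Each write adds a new version instead of overwriting the old one.
        self._commit += 1
        self._versions.setdefault(key, []).append((self._commit, value))

    def snapshot(self):
        # A reader remembers the commit number at which its query started.
        return self._commit

    def read(self, key, snapshot):
        # Return the newest version that existed when the snapshot was taken.
        for commit, value in reversed(self._versions.get(key, [])):
            if commit <= snapshot:
                return value
        return None
```

A select that took its snapshot before a later update keeps seeing the older value, so it neither waits for the update nor interferes with it.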
The information above makes several of the benefits of table-handlers clear.
InnoDB was released by Innobase Oy, a company that is separate from MySQL AB. This
was possible only because of the two-tier architecture of the code. It would be
impossible to write another table-handler for a database such as Oracle or SQL Server.
The extra features added to the InnoDB table-handler should also be noted. Some of
these features will cause the database to slow down. MySQL AB purposely provides
MyISAM with a stripped-down feature set. Transactions, for example, require a great deal
more system resources and memory. It is possible to implement a database using both
types of table: MyISAM for non-transactional tables, and InnoDB for tables requiring
foreign keys or transactions.
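As a rough sketch of mixing table types, the helper below builds CREATE TABLE statements that name the table-handler per table. It assumes the TYPE= table option used by MySQL versions of this period (later releases spell it ENGINE=); the function name and the example tables are hypothetical, and the helper only builds strings rather than talking to a server.

```python
ALLOWED_TYPES = {"MyISAM", "BDB", "InnoDB"}


def create_table_sql(name, column_defs, table_type="MyISAM"):
    """Build a CREATE TABLE statement that selects a table-handler."""
    if table_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown table type: {table_type}")
    columns = ", ".join(column_defs)
    # TYPE= is assumed here; newer MySQL versions use ENGINE= instead.
    return f"CREATE TABLE {name} ({columns}) TYPE={table_type}"


# A read-mostly log table can stay on the default handler...
page_hits = create_table_sql("page_hits", ["url VARCHAR(255)", "hits INT"])
# ...while a table that needs transactions uses InnoDB.
orders = create_table_sql("orders", ["id INT PRIMARY KEY", "total INT"], "InnoDB")
```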
The diagram illustrates how each table-handler maintains its own tables, and how the
handlers can run at once. MyISAM has three file types: .frm, .MYI, and .MYD. The .frm file
saves the structure of the table, and contains information relating to the table. The .MYI
file, or MyISAM index file, contains the indexing for the table. The .MYD, or MyISAM data
file, contains the actual data contained in the table. Thus, each table will have its own
three files. InnoDB utilizes a feature also found in Oracle called a tablespace. The size of
the tablespace is defined in the my.cnf file. The tablespace can be made up of more than
one data file, so the data is split over multiple files and
tables can be larger than 2 gigabytes. Many operating systems have a maximum file size of 2
gigabytes. With MyISAM, once a .MYD file grew that large, nothing more could be added to it.
With the information distributed over multiple files, this isn’t a problem.
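The arithmetic behind the file size limit is simple enough to sketch. The 2-gigabyte figure comes from the text above; the function names and the 5-gigabyte example table are hypothetical.

```python
MAX_FILE_BYTES = 2 * 1024**3  # the 2-gigabyte per-file limit mentioned above


def fits_in_single_file(table_bytes, max_file_bytes=MAX_FILE_BYTES):
    """True if the table's data could live in one file, as a .MYD file must."""
    return table_bytes <= max_file_bytes


def data_files_needed(table_bytes, file_bytes=MAX_FILE_BYTES):
    """How many files of the given size are needed when data may span files."""
    return -(-table_bytes // file_bytes)  # ceiling division


five_gb = 5 * 1024**3
print(fits_in_single_file(five_gb))  # False: too large for a single file
print(data_files_needed(five_gb))    # 3 files of 2 gigabytes each
```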
One of the reasons that MySQL is a very fast and efficient database is that many
features supported in other databases are not supported in it. Some of these features are
planned for later versions; some are not. Most of the features to be
implemented are based on user demand. MySQL AB judges what features to include by
what the user population, particularly those with support contracts, wish to see in
The largest feature that was known to be missing from MySQL was transactions.
However, this is easily corrected by using InnoDB tables. InnoDB tables, as discussed
above, provide many enhancements for larger databases requiring features such as
transactions. InnoDB tables also will handle foreign keys, another feature that was
missing from the base distribution of MySQL.
The next most common complaint is that MySQL lacks sub selects. A
sub select is an SQL statement nested inside another SQL statement. MySQL AB
plans to have sub selects implemented by version 4.3; the current
version is 4.0.1. There is a workaround for sub selects: almost anything done with a sub
select can be done through the use of a join.
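The workaround can be demonstrated with a small example. Because the sub select form would not run on the MySQL versions discussed here, the sketch uses Python's built-in sqlite3 module as a stand-in database that accepts both forms; the customers and orders tables and their contents are made up for the illustration.

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory database with two small tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
        INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
    """)
    return conn


# Customers who have placed at least one order, written with a sub select.
SUBSELECT = """
    SELECT name FROM customers
    WHERE id IN (SELECT customer_id FROM orders)
    ORDER BY name
"""

# The same question written as a join; DISTINCT removes the duplicate
# rows produced by customers with more than one order.
JOIN = """
    SELECT DISTINCT c.name FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.name
"""


def names(conn, sql):
    """Run a query and return the first column of each row."""
    return [row[0] for row in conn.execute(sql)]
```

Both queries return the same customers; only the join form relies on features MySQL already provides.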
Two features that go hand in hand and are not supported by MySQL are triggers
and stored procedures. While not everyone uses these two features, those who are
used to them may find their absence painful. MySQL AB plans to implement
them eventually, but does not yet have a timetable. A stored procedure is a piece of code
stored on the server, which can be called upon. It is almost like a library that is
located very close to the database, so it tends to be very efficient. An example of a stored
procedure might be one that reduces an item’s inventory by one when a purchase is made. A
trigger often works with a stored procedure. A trigger is set off by a certain event.
Using the same example, a trigger might be activated when something is bought.
The trigger then calls the stored procedure to decrement inventory. The main
workaround is to write applications differently, so that they do not rely on these features.
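One way to write the application differently is to have it perform the inventory update itself, in the same transaction as the purchase. The sketch below uses Python's built-in sqlite3 module as a stand-in database so that it runs anywhere; on MySQL the same approach would need a transactional table type. The table names, column names, and function names are hypothetical.

```python
import sqlite3


def build_shop():
    """Create a throwaway in-memory database with one item in stock."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, quantity INTEGER);
        CREATE TABLE purchases (id INTEGER PRIMARY KEY, item_id INTEGER);
        INSERT INTO inventory VALUES (1, 2);
    """)
    return conn


def record_purchase(conn, item_id):
    """Decrement inventory and record the purchase as one unit of work."""
    with conn:  # commits on success, rolls back if an exception is raised
        updated = conn.execute(
            "UPDATE inventory SET quantity = quantity - 1 "
            "WHERE item_id = ? AND quantity > 0",
            (item_id,),
        )
        if updated.rowcount != 1:
            raise ValueError("item unknown or out of stock")
        conn.execute("INSERT INTO purchases (item_id) VALUES (?)", (item_id,))
```

Every code path that records a purchase has to go through record_purchase, which is the price of doing in the application what a trigger would otherwise enforce inside the database.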
The last major feature not supported in MySQL is views. Views are mainly used for
security reasons. They are great at hiding data from those who do not need to see it,
without having to set up intricate permissions on every column of a
table. MySQL has been primarily a database used on the web, where there is very
little need for views. MySQL AB has no plans to implement views at this time, as there
is simply very little user demand for them.
No one can deny the power of the MySQL database. It is already gaining in
market share and usage. Many have thought of it as a stripped-down database, but a
closer look proves that wrong. Its two-tier architecture and open source business
model allow MySQL to continue growing strongly. In a future market where many
small databases are prevalent, MySQL will be found more and more.
The features not contained within MySQL can all be worked around. While some
may find the lack of stored procedures, triggers, and sub selects a pain, there are many
ways to get around them. Compared with the thousands of dollars more that Oracle costs,
those features may turn out to be far less important than they seem. It used to be that
databases were used only in large-scale, multibillion-dollar businesses. Now, even a small
eCommerce website is likely to be running some form of relational database.
DuBois, Paul. “MySQL.” New Riders, 2000.
DuBois, Paul. “MySQL and Perl for the Web.” New Riders, 2001.
The MySQL website. http://www.mysql.com/