Os Owens


Published in: Business, Technology

    1. 1. Programming with SQLite <ul><li>Michael Owens </li></ul><ul><li>[email_address] </li></ul><ul><li>http://www.mikesclutter.com </li></ul>
    2. 2. Purpose of this Session <ul><li>To learn about SQLite's architecture and how its various subsystems work together. </li></ul><ul><li>To learn subtle but important points critical to programming SQLite in order to avoid problems and get the best performance. </li></ul><ul><li>To understand SQLite's similarities and differences with other databases, its features and limitations, in order to know when best to use it, and when not to. </li></ul>
    3. 3. Outline <ul><li>Architecture: How SQLite is put together. </li></ul><ul><li>The C API: How it is designed and how it works in relation to the architecture. </li></ul><ul><li>Transactions: What you must know to avoid deadlocks and maximize concurrency. </li></ul><ul><li>Optimizations: ... and other dirty hacks. </li></ul><ul><li>Features: What SQLite has that others don't. </li></ul><ul><li>Limitations: What SQLite doesn't have that others do. </li></ul>
    4. 4. Outline <ul><li>Applications: Different ways SQLite is put to use – web, embedded, GUI, etc. – and how it is suited to each environment. </li></ul>
    5. 5. I. Introducing SQLite
    6. 6. What is SQLite? <ul><li>It is an embedded , relational database implemented in pure, portable, ANSI C. </li></ul><ul><li>It is designed to plug directly into your programs, scripts, or web applications, equipping them with an on-board, lightweight relational database engine that has no external dependencies (e.g. installation, configuration, administration, etc.). </li></ul><ul><li>It consists of a compact library and a command line utility that doubles as a shell. </li></ul>
    7. 7. What is SQLite? <ul><li>It supports a large subset of ANSI SQL. </li></ul><ul><li>It supports transactions, views, indexes, triggers, subqueries, check constraints and a variety of other features found in relational databases. </li></ul><ul><li>The open source community has created a variety of extensions, allowing you to use it with languages such as Perl, Python, Ruby, Java, PHP, Tcl, .NET, Pike, Scheme, Lua, Smalltalk, Objective C, Delphi, Ada, Haskell, BASIC, ... (your favorite language here), yadda, yadda. </li></ul>
    8. 8. What is SQLite? <ul><li>It is designed to be small, portable, reliable, customizable, efficient, and free. </li></ul><ul><li>SQLite was originally created by Richard Hipp, who is responsible for most of its major design decisions and implementation. </li></ul><ul><li>99% of SQLite's code is written and maintained by Hipp and Dan Kennedy, an Australian living in Thailand. </li></ul>
    9. 9. History <ul><li>The idea of SQLite was inspired by a government project Richard was working on. It employed a big-name RDBMS that often went down. </li></ul><ul><li>When the database puked, Richard's program could not run. Worse, he got the blame for the program not running (not the database). </li></ul>
    10. 10. History <ul><li>After seeing a professional DBA struggle with the database over a period of days, Richard saw the need for something simpler: an embedded relational database which could be managed by programmers. No network connectivity to worry about, no shared libraries to keep up with, no baroque client APIs to hassle with. </li></ul><ul><li>Richard started coding in May of 2000, and released SQLite 1.0 in August. It used GNU gdbm as storage layer which was soon replaced by his own custom B-tree layer that supported transactions. </li></ul>
    11. 11. History <ul><li>Version 2 followed within the year. By 2001, many open source projects were beginning to use it. Several languages had bindings for it. </li></ul><ul><li>In 2004, SQLite had a major upgrade to version 3. </li></ul><ul><li>Version 3 is in many respects a completely different database: Internationalization, C API triples in size, manifest typing, B-tree optimizations, more SQL features, etc. </li></ul>
    12. 12. History <ul><li>Other organizations have been involved in SQLite's development, such as Google and AOL. </li></ul><ul><li>Google has been instrumental in the full text search feature and has conducted extensive testing resulting in many performance and security improvements. </li></ul><ul><li>AOL funded some major features in version 3. </li></ul><ul><li>There are many others that both use and contribute financially to SQLite, but wish to remain anonymous. </li></ul>
    13. 13. Who Uses SQLite? <ul><li>Apple: Safari, Mail, Core Data, Aperture </li></ul><ul><li>Adobe: AIR, Lightroom </li></ul><ul><li>Google: Gears </li></ul><ul><li>Mozilla: Firefox </li></ul><ul><li>Sun: Solaris 10 </li></ul><ul><li>Your cell phone (if it runs Symbian). </li></ul><ul><li>PHP, Nokia, Dlink, Philips, Palm, and lots of embedded products who don't advertise the fact. </li></ul>
    14. 14. SQLite's Design Philosophy <ul><li>Flexibility: Code is well-documented, easy to understand and modular. It is easy to customize, modify, extend, hack, etc. </li></ul><ul><li>Compactness: Most databases strive to be big. SQLite strives to stay small, and resists bloat at all costs. Big features are still possible, but they are added as conditionally-compiled or dynamically loadable extensions. Core library will never outgrow the embedded space. </li></ul>
    15. 15. SQLite's Design Philosophy <ul><li>Reliability: Test everything. Line for line, SQLite's test suite is as large as (or larger than) its core code base. </li></ul><ul><li>Portability: Run everywhere. It is written in ANSI C with an OS abstraction layer at the bottom that keeps the code uncluttered with #ifdef s. </li></ul><ul><li>Stability: The API is designed so as to never, ever change, if at all possible. Your programs should never break as a result of the API. </li></ul>
    16. 16. How Flexible? <ul><li>The C API provides many features with which to change/extend SQLite's behaviour (almost half is devoted just to extension/customization). </li></ul><ul><li>The C API provides a module interface with which to dynamically load custom extensions. </li></ul><ul><li>The source includes many features as compile-time options and/or extensions. </li></ul><ul><li>The source is small enough for a single programmer to dig around under the hood and get an idea of what is going on. </li></ul>
    17. 17. How Flexible? <ul><li>No licensing restrictions. SQLite has no copyright. It is public domain. Instead, it has a blessing: May you do good and not evil. May you find forgiveness for yourself and forgive others. May you share freely, never taking more than you give. </li></ul><ul><li>All code contributors are required to sign affidavits disavowing any copyright interests in code. The affidavits are kept in a secret, fireproof box in the Bat Cave. </li></ul>
    18. 18. How Compact? <ul><li>Core library consists of about 30,000 lines of C. </li></ul><ul><li>Library is about 250K in size and will never get any larger. It can be stripped down to almost half of that depending on what you don't compile in. </li></ul><ul><li>SQLite is designed to work well in embedded environments. There are applications where SQLite runs on smart cards. </li></ul>
    19. 19. How Reliable? <ul><li>SQLite is paranoid (in the healthy sense). All transaction settings are very conservative by default – safety before speed (but you are given the means to deviate from this). </li></ul><ul><li>Half of SQLite's source distribution is devoted to testing, providing over 97% code coverage. </li></ul><ul><li>SQLite has a good track record (seven years). </li></ul><ul><li>However, since SQLite runs in your address space, C/C++ programmers can still screw things up with bad pointers. </li></ul>
    20. 20. How Portable? <ul><li>Both the library itself as well as its database format operate across 32/64 bit architectures and byte orders </li></ul><ul><li>SQLite compiles and runs on Windows, Linux, Mac OS X, BSD, Solaris, AIX, HP-UX, Symbian, WinCE, VX Works, OS/2, and the NetBSD toaster. </li></ul><ul><li>SQLite databases are binary compatible: they work natively on all systems without any need for conversion. You could create a database on a SPARC and use it on your cell phone without any changes. </li></ul>
    21. 21. How Portable? <ul><li>However, databases are still sensitive to differences in library versions. A 2.x library cannot read 3.x databases. </li></ul><ul><li>Later versions of 3.x are binary compatible back to 3.0. Earlier versions may read later versions, but they obviously cannot utilize later features. </li></ul>
    22. 22. How Portable?
        #define sqlite3OsOpenReadWrite
        #define sqlite3OsOpenExclusive
        #define sqlite3OsOpenReadOnly
        #define sqlite3OsDelete
        #define sqlite3OsFileExists
        #define sqlite3OsFullPathname
        #define sqlite3OsIsDirWritable
        #define sqlite3OsSyncDirectory
        #define sqlite3OsTempFileName
        #define sqlite3OsRandomSeed
        #define sqlite3OsSleep
        #define sqlite3OsCurrentTime
        #define sqlite3OsEnterMutex
        #define sqlite3OsLeaveMutex
        #define sqlite3OsInMutex
        #define sqlite3OsThreadSpecificData
        #define sqlite3OsMalloc
        #define sqlite3OsRealloc
        #define sqlite3OsFree
        #define sqlite3OsAllocationSize
        #define sqlite3OsDlopen
        #define sqlite3OsDlsym
        #define sqlite3OsDlclose
        <ul><li>If your platform is not supported, you can create your own custom OS abstraction layer by implementing about 20 functions. </li></ul>
    23. 23. And ... <ul><li>Good article on SQLite: http://technology.guardian.co.uk/weekly/story/0,,2107239,00.html </li></ul>
    24. 24. II. Architecture
    25. 25. Subsystems <ul><li>8 modular subsystems logically grouped into a front end compiler, virtual machine, and backend storage system. </li></ul><ul><li>Front end compiles a query into a virtual machine program. </li></ul><ul><li>Virtual machine executes the program, which manipulates the backend. </li></ul><ul><li>Backend performs storage, retrieval, transactions, and locking. </li></ul>
    26. 26. Subsystems <ul><li>OS abstraction layer provides a compatibility layer for porting to different platforms. All modules thus use a common API for OS services (e.g. file and directory access, threads, memory allocation, locking primitives, dynamic shared objects, etc.). </li></ul>
    27. 28. The Virtual Database Engine <ul><li>Everything in SQLite is performed in the form of a virtual machine language. </li></ul><ul><li>This machine language is executed by the Virtual Database Engine (VDBE). </li></ul><ul><li>There are about 120 opcodes in the VDBE instruction set. They cover everything from starting transactions, reading and writing data, to manipulating the stack. </li></ul><ul><li>You can print the VDBE code for any SQL statement with the EXPLAIN command. </li></ul>
    28. 29. sqlite> CREATE TABLE x (a,b,c);
        sqlite> INSERT INTO x VALUES (1,2,3);
        sqlite> EXPLAIN SELECT * FROM x;
        addr  opcode           p1    p2    p3
        ----  ---------------  ----  ----  -------
        0     Goto             0     12
        1     Integer          0     0     # x
        2     OpenRead         0     2
        3     SetNumColumns    0     3
        4     Rewind           0     10
        5     Column           0     0     # x.a
        6     Column           0     1     # x.b
        7     Column           0     2     # x.c
        8     Callback         3     0
        9     Next             0     5
        10    Close            0     0
        11    Halt             0     0
        12    Transaction      0     0
        13    VerifyCookie     0     1
        14    Goto             0     1
        15    Noop             0     0
    29. 30. The Virtual Database Engine <ul><li>Every subsystem above the VDBE serves to create VDBE instructions, every subsystem below serves to carry them out. </li></ul><ul><li>The VDBE is the heart of SQLite. </li></ul><ul><li>For reference, all VDBE instructions are documented in the source file vdbe.c in the SQLite distribution. Also documented online at http://www.sqlite.org/opcode.html </li></ul>
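The EXPLAIN command is not limited to the command line shell; it can be sent through any binding. The following is a minimal sketch that assumes Python's standard `sqlite3` module (the slides themselves use the shell and the C API) and an in-memory database. The opcodes it prints depend on the SQLite version linked into Python, so the listing will most likely differ from the 3.3-era program shown in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (a, b, c)")
conn.execute("INSERT INTO x VALUES (1, 2, 3)")

# EXPLAIN yields one result row per VDBE instruction.
# The first two columns are the instruction address and the opcode name.
program = conn.execute("EXPLAIN SELECT * FROM x").fetchall()
for instruction in program:
    addr, opcode = instruction[0], instruction[1]
    print(addr, opcode, *instruction[2:5])

conn.close()
```

Comparing the output against the opcode reference for the installed version is a quick way to see how a given query will be executed.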
    30. 31. III. The API
    31. 32. Functions <ul><li>The C API is made up of about 80 functions. </li></ul><ul><li>The API is composed of two logical parts: one for query processing, the other for writing custom extensions (user-defined functions, etc.). </li></ul><ul><li>Each function has both a UTF-8 and UTF-16 variant. </li></ul><ul><li>SQLite is fanatical about API stability and backward compatibility. Any function not marked as experimental should never change. </li></ul>
    32. 33. Query Processing: Round 1 <ul><li>The C API talks to both the front end to compile a command, and to the back end (VDBE) to execute it. </li></ul><ul><li>The C API employs two data structures and three functions to execute commands </li></ul><ul><li>Data structures: connection handle ( sqlite3 ) and statement handle ( sqlite3_stmt ). </li></ul>
    33. 34. Query Processing: Round 1 <ul><li>Functions: prepare (compile), step, and finalize. </li></ul><ul><li>The connection handle represents a single connection to a database (transaction context), and the statement handle represents a single SQL statement. </li></ul>
    34. 35. Example: Pseudocode
        # Open connection
        c1 = open('foods.db')
        # Compile a statement
        stmt = c1.prepare('SELECT * FROM episodes')
        # Execute and iterate over results
        while stmt.step()
          print stmt.column('name')
        end
        # Finalize statement
        stmt.finalize()
        c1.close()
    35. 36. Example: C
        #include <stdio.h>
        #include <string.h>
        #include <sqlite3.h>

        int main(int argc, char **argv)
        {
            int rc, i, ncols;
            sqlite3 *cnx;
            sqlite3_stmt *stmt;
            const char *sql;
            const char *tail;

            /* Connect to database */
            sqlite3_open("db", &cnx);

            /* Prepare statement */
            sql = "SELECT * FROM x";
            sqlite3_prepare(cnx, sql, (int)strlen(sql), &stmt, &tail);

            /* Get the number of columns in statement */
            ncols = sqlite3_column_count(stmt);
    36. 37. A Simple Example: C (cont.)
            /* Iterate over result set. */
            while(sqlite3_step(stmt) == SQLITE_ROW) {
                for(i=0; i < ncols; i++) {
                    fprintf(stderr, "'%s' ", sqlite3_column_text(stmt, i));
                }
            }

            /* Finalize */
            sqlite3_finalize(stmt);

            /* Close database */
            sqlite3_close(cnx);

            return 0;
        }
    37. 38. Query Processing: Round 2 <ul><li>You compile a SQL command with sqlite3_prepare() , which takes a connection and SQL command as input and produces a statement structure output. </li></ul><ul><li>The statement handle holds the compiled VDBE program and other associated resources needed to execute it. </li></ul><ul><li>You execute a statement using sqlite3_step() , which takes the statement as input and executes its VDBE code, which for SELECT commands is a stepwise process (hence the name). </li></ul>
    38. 39. Query Processing: Round 2 <ul><li>For SELECT commands, sqlite3_step() processes a single row with each call and returns SQLITE_DONE when it reaches the end of the result set. </li></ul><ul><li>With INSERT, UPDATE, DELETE, and other commands, a single call to sqlite3_step() does the job. </li></ul><ul><li>When the statement is complete, you use sqlite3_finalize() to deallocate the statement and its associated resources. </li></ul>
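The prepare, step, finalize cycle described above can be sketched in a scripting binding as well. The example below assumes Python's standard `sqlite3` module, which hides the three C calls behind cursor objects; the comments map each Python call to the C function it roughly corresponds to. The table and its contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")             # roughly sqlite3_open()
conn.execute("CREATE TABLE x (a, b, c)")
conn.executemany("INSERT INTO x VALUES (?, ?, ?)", [(1, 2, 3), (4, 5, 6)])

cur = conn.execute("SELECT * FROM x")          # statement is compiled here
rows = []
while True:
    row = cur.fetchone()                       # one row per call, like a step
    if row is None:                            # end of result set reached
        break
    rows.append(row)

cur.close()                                    # release the statement
conn.close()                                   # roughly sqlite3_close()
print(rows)
```

The loop is written out by hand to mirror the C example; in everyday Python code one would simply iterate over the cursor.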
    39. 40. Data Structures, Locks, and Storage
        c1 = open('foods.db')
        c2 = open('foods.db')
        stmt1 = c1.prepare('SELECT * FROM episodes')
        stmt2 = c1.prepare('SELECT * FROM episodes')
        stmt3 = c2.prepare('INSERT INTO episodes...')
        stmt4 = c2.prepare('UPDATE episodes ...')
        while stmt1.step()
          print stmt1.column('name')
          stmt4.step()
        end
        ...
    40. 42. B-Tree and Pager <ul><li>The two modules of interest here are the B-tree and pager. The VDBE program, specifically its instructions, use these modules to operate on the database. </li></ul><ul><li>The B-tree's job is navigation and order. Every table in the database is represented as an individual B-tree. </li></ul><ul><li>B-trees are made up of pages. The first page of a table's B-tree is called its root page . </li></ul>
    41. 43. B-Tree and Pager <ul><li>The root pages of all tables in the database are stored in the system catalog, which consists of a single table called sqlite_master . Its root page is the first page in the database. </li></ul><ul><li>Rows are stored in pages. </li></ul><ul><li>Each row in a table is assigned an integer primary key value (whether you create one in the schema or not) which the B-tree uses to uniquely identify rows and arrange them in pages within tables. </li></ul>
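Both points on this slide can be observed from SQL. The sketch below assumes Python's standard `sqlite3` module and a hypothetical `episodes` table declared without any primary key; it reads the root page recorded in `sqlite_master` and the implicit integer key, which SQL exposes as `rowid`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE episodes (name TEXT)")   # no explicit primary key
conn.executemany("INSERT INTO episodes (name) VALUES (?)",
                 [("first",), ("second",)])

# The system catalog records the root page of each table's B-tree.
catalog = conn.execute(
    "SELECT name, rootpage FROM sqlite_master WHERE type = 'table'"
).fetchall()

# Every row still received an integer key, visible as rowid.
keyed = conn.execute(
    "SELECT rowid, name FROM episodes ORDER BY rowid"
).fetchall()

print(catalog)
print(keyed)
conn.close()
```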
    42. 44. B-Tree and Pager <ul><li>The B-tree traverses pages in a table using cursors. A cursor is essentially just a pointer to a row in a table. </li></ul><ul><li>While the B-tree navigates through and manipulates pages and tables, it knows nothing about the database file or anything on disk. </li></ul>
    43. 45. B-Tree and Pager <ul><li>Disk access is the pager's job. When the B-tree needs a page from the database, it asks the pager to get it. </li></ul><ul><li>The pager then transfers the page from disk into a memory region called the page cache , making it available to the B-tree. </li></ul><ul><li>The B-tree can only access pages through the page cache. </li></ul>
    44. 46. Query Processing: Round 3 <ul><li>Now let's go back through the example again and take a detailed look at how the API, VDBE, and storage system work together to execute a SQL statement. </li></ul><ul><li>First sqlite3_open() creates a new connection to the database. Internally, each connection structure contains a B-tree object which in turn contains a pager object. </li></ul><ul><li>Next sqlite3_prepare() compiles a SQL command, stashing the VDBE code in the sqlite3_stmt structure. </li></ul>
    45. 47. Query Processing: Round 3
        /* Stripped down: */
        /* The gist of executing a query. */
        sqlite3_open("db", &cnx);
        sqlite3_prepare(cnx, SQL, sqllen, &stmt, NULL);
        while(sqlite3_step(stmt) == SQLITE_ROW) {
            /* Do something with row. */
        }
        sqlite3_finalize(stmt);
    46. 49. sqlite> EXPLAIN SELECT * FROM x;
        addr  opcode           p1    p2    p3
        ----  ---------------  ----  ----  -------
        0     Goto             0     12
        1     Integer          0     0     # x
        2     OpenRead         0     2
        3     SetNumColumns    0     3
        4     Rewind           0     10
        5     Column           0     0     # x.a
        6     Column           0     1     # x.b
        7     Column           0     2     # x.c
        8     Callback         3     0
        9     Next             0     5
        10    Close            0     0
        11    Halt             0     0
        12    Transaction      0     0
        13    VerifyCookie     0     1
        14    Goto             0     1
        15    Noop             0     0
    47. 50. sqlite> EXPLAIN SELECT * FROM x;
        addr  opcode           p1    p2    p3       what happens during sqlite3_step()
        ----  ---------------  ----  ----  -------  ----------------------------------
        0     Goto             0     12             Start. Goto instruction 12
        1     Integer          0     0     # x
        2     OpenRead         0     2              Open a read-only cursor on table whose root page is 2.
        3     SetNumColumns    0     3
        4     Rewind           0     10             Position cursor to first row.
        5     Column           0     0     # x.a    Load cols from B-Tree row.
        6     Column           0     1     # x.b
        7     Column           0     2     # x.c
        8     Callback         3     0              Return SQLITE_ROW
        9     Next             0     5              Move cursor, start over or end
        10    Close            0     0              Close cursor and end
        11    Halt             0     0              Return SQLITE_DONE
        12    Transaction      0     0              Start transaction
        13    VerifyCookie     0     1              Sanity check(s)
        14    Goto             0     1              Goto instruction 1
        15    Noop             0     0
    48. 51. VDBE Instruction Cheat Sheet <ul><li>OpenRead : Open a read-only cursor for the database table whose root page is P2 in a database file. The database file is determined by an integer from the top of the stack. </li></ul><ul><li>SetNumColumns : This opcode sets the number of columns for cursor P1 to P2 . Before the Column instruction can be executed on a cursor, this opcode must be called to set the number of fields in the table. </li></ul><ul><li>Rewind : Position cursor P1 on the first entry of its table, so that the next Column or Next instruction on P1 refers to that entry. If the table is empty then jump to P2 . </li></ul><ul><li>Column : Use cursor given by P1 and push onto the stack the value of the column given by P2 . </li></ul><ul><li>Callback : The top P1 values on the stack are formed into a single row associated with the statement handle. Stop VDBE execution and return SQLITE_ROW . </li></ul><ul><li>Next : Advance cursor P1 so that it points to the next key/data pair in its table or index. If there are no more key/data pairs then fall through to the following instruction. But if the cursor advance was successful, jump immediately to P2 . </li></ul>
    49. 54. Query Processing: Round 3 <ul><li>Every statement (handle) is associated with a single connection. It uses its connection's resources (e.g. B-tree and pager object) to carry out the VDBE program. </li></ul><ul><li>The first call to sqlite3_step() starts execution of the VDBE program. </li></ul><ul><li>The first few instructions attend to transaction-related matters, and the real work starts with the OpenRead instruction, which allocates a cursor and places it on the root page in table x. </li></ul>
    50. 55. Query Processing: Round 3 <ul><li>The new cursor is identified by an index given in the P1 operand, which in this case is 0. It is to be positioned on the page given in P2, which is page 2. </li></ul><ul><li>Thus OpenRead causes the pager to load page 2 from the database into the page cache, and uses the B-tree to point the cursor to the first row in that page. </li></ul>
    51. 56. Query Processing: Round 3 <ul><li>Following OpenRead , SetNumColumns allocates space in the statement handle to hold a single result row. Rewind then advances the cursor from the root page down to the first leaf page, onto the first record. The VDBE keeps executing instructions until it reaches the Callback instruction, which causes it to yield. </li></ul>
    52. 57. Query Processing: Round 3 <ul><li>At this point, the VDBE program has positioned the cursor on the first row, loaded its columns into memory (stored in the statement structure), and made it available for reading. </li></ul><ul><li>sqlite3_step() returns SQLITE_ROW to indicate this. We then fetch data from the row using sqlite3_column_text() in the while loop body. </li></ul>
    53. 58. Query Processing: Round 3 <ul><li>The subsequent call to sqlite3_step() resumes operation at the Next instruction, which moves the cursor to the next row. If there is another row, Next puts the cursor on it and loops back to the instruction given in operand P2 – instruction 5 – Column . </li></ul><ul><li>The next three Column instructions load the values for columns a , b , and c of that row into memory and then proceed to Callback , which again leads to sqlite3_step() returning SQLITE_ROW , and the while loop begins anew. </li></ul>
    54. 59. Query Processing: Round 3 <ul><li>Each subsequent call to sqlite3_step() repeats this process until the cursor reaches the end of the table. </li></ul><ul><li>At the last record, the Next instruction does not loop back, but rather falls through to the following instruction – Close – which closes the cursor, followed by Halt , which terminates the VDBE program. </li></ul>
    55. 60. Query Processing: Round 3 <ul><li>In this case, sqlite3_step() returns SQLITE_DONE , indicating there are no more rows to be had. In our example, this breaks the while loop and the program proceeds to sqlite3_finalize() . </li></ul><ul><li>This then is how everything works together, in terms of a simple SELECT command, traversing all rows in table x . </li></ul>
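The stepwise behaviour walked through above can be mimicked with a toy interpreter. The sketch below is purely illustrative and is not how SQLite is implemented: `run_vdbe` is a hypothetical Python generator that understands only the control-flow opcodes from the EXPLAIN listing, treats every other opcode as a no-op, and uses a plain list of tuples in place of the B-tree. Each value it yields stands for one sqlite3_step() call returning SQLITE_ROW, and the generator finishing stands for SQLITE_DONE.

```python
# The program from the EXPLAIN listing, as (opcode, p1, p2) triples.
PROGRAM = [
    ("Goto", 0, 12), ("Integer", 0, 0), ("OpenRead", 0, 2),
    ("SetNumColumns", 0, 3), ("Rewind", 0, 10), ("Column", 0, 0),
    ("Column", 0, 1), ("Column", 0, 2), ("Callback", 3, 0),
    ("Next", 0, 5), ("Close", 0, 0), ("Halt", 0, 0),
    ("Transaction", 0, 0), ("VerifyCookie", 0, 1), ("Goto", 0, 1),
    ("Noop", 0, 0),
]

def run_vdbe(program, table):
    """Toy interpreter: yields one row per Callback, ends at Halt."""
    pc = 0          # program counter
    cursor = 0      # index of the current row in `table`
    stack = []
    while True:
        op, p1, p2 = program[pc]
        if op == "Goto":
            pc = p2
            continue
        if op == "Rewind":
            cursor = 0
            if not table:               # empty table: jump to P2
                pc = p2
                continue
        elif op == "Column":
            stack.append(table[cursor][p2])
        elif op == "Callback":
            row = tuple(stack[-p1:])    # top P1 stack values form the row
            del stack[-p1:]
            yield row                   # "return SQLITE_ROW"; resume later
        elif op == "Next":
            if cursor + 1 < len(table):
                cursor += 1
                pc = p2                 # loop back to the first Column
                continue
        elif op == "Halt":
            return                      # "return SQLITE_DONE"
        pc += 1                         # everything else is a no-op here

result = list(run_vdbe(PROGRAM, [(1, 2, 3), (4, 5, 6)]))
print(result)
```

Using a generator keeps the program counter, cursor and stack alive between rows, which is the role the statement handle plays between calls to sqlite3_step().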
    56. 61. IV. Transactions, Locks, and Cursors
    57. 62. Concurrency <ul><li>SQLite's concurrency model is different than most other databases you might be familiar with </li></ul><ul><li>Some common patterns you use with other databases may not work like you expect in SQLite. </li></ul><ul><li>To write good code, you must understand the transaction model, the locking model, cursors and how they all relate to each other. </li></ul><ul><li>This is the most important thing to understand in programming SQLite when concurrency is involved. </li></ul>
    58. 63. Locking <ul><li>SQLite uses database level locking. </li></ul><ul><li>SQLite implements locking using file locking on the database file. </li></ul><ul><li>SQLite keeps three different file locks to implement five lock states (UNLOCKED, SHARED, RESERVED, PENDING, and EXCLUSIVE): a reserved byte, a pending byte, and a shared region. </li></ul><ul><li>File locking is abstracted by the OS Interface. </li></ul>
    59. 65. Think in Pages <ul><li>To best envision the lock model, think in terms of pages and the pager. </li></ul><ul><li>Every operation involves reading pages, writing pages, or removing pages. </li></ul><ul><li>An INSERT writes pages. </li></ul><ul><li>An UPDATE writes pages. In this case, it writes modified versions of pre-existing pages to the database. </li></ul><ul><li>A DELETE may even be performed solely by writing pages – modified page with row removed. </li></ul>
    60. 66. Think in Pages <ul><li>Also, make one more distinction: modifying a page versus writing a page. </li></ul><ul><li>Modifying a page means reading a page from the database into memory and changing the memory image (in the page cache). </li></ul><ul><li>Writing a page means actually writing a modified page back to the database file. </li></ul><ul><li>Also, think in terms of readers and writers. Multiple readers, but only one writer may work at a time. </li></ul>
    61. 67. UNLOCKED <ul><li>Default state of connection when you first open the database. </li></ul><ul><li>No locks. </li></ul><ul><li>No access. </li></ul><ul><li>No interference. </li></ul>
    62. 68. SHARED <ul><li>The read lock. </li></ul><ul><li>In order to read a page from the database, the pager must first obtain a SHARED lock. </li></ul><ul><li>SELECT statements can run entirely in SHARED. </li></ul><ul><li>SHARED is only sufficient for reading. Pager cannot modify pages with a SHARED lock. </li></ul>
    63. 69. RESERVED <ul><li>To modify a page, the pager has to go from SHARED to RESERVED. </li></ul><ul><li>Again, modified pages are stored in a localized memory cache inside the pager, called the page cache . In RESERVED, all modified pages stay in the page cache. They are not written back to the database file. </li></ul><ul><li>You can control the size of the page cache from SQL using the cache_size pragma. </li></ul>
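The `cache_size` pragma mentioned above can be read and set like any other SQL statement. This is a small sketch assuming Python's standard `sqlite3` module; the value 500 is arbitrary. A positive value is a number of pages, and in recent SQLite versions a negative value is interpreted as a size in kibibytes instead, so the default printed first may well be negative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Read the current page cache size for this connection.
default_size = conn.execute("PRAGMA cache_size").fetchone()[0]

# Ask for a cache of 500 pages, then read the setting back.
conn.execute("PRAGMA cache_size = 500")
new_size = conn.execute("PRAGMA cache_size").fetchone()[0]

print(default_size, new_size)
conn.close()
```

The setting applies only to the connection that issued it and lasts until that connection is closed.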
    64. 70. RESERVED <ul><li>Also, when the pager enters RESERVED, it creates the rollback journal . </li></ul><ul><ul><li>The journal is instrumental in every transaction. </li></ul></ul><ul><ul><li>It makes it possible to rollback a transaction. </li></ul></ul><ul><ul><li>It stores the original copies of all modified database pages. </li></ul></ul>
    65. 71. RESERVED <ul><li>When the pager gets the RESERVED lock, it also grabs and holds the PENDING lock. </li></ul><ul><li>This starts a process of attrition, as no new readers can enter the database because the PENDING lock is unavailable. </li></ul><ul><li>So in RESERVED, the writer can make changes in memory, and no new readers may enter. </li></ul>
    66. 72. RESERVED
        owens@linux $ sqlite3 foods.db
        SQLite version 3.3.17
        Enter ".help" for instructions
        sqlite> begin;
        sqlite> update foods set type_id=0;
        sqlite>
        <ul><li>The UPDATE statement causes the session to get a RESERVED lock and create the journal. </li></ul><ul><li>Here, the journal is stored in a file named foods.db-journal , residing in the current working directory. </li></ul>
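The appearance of the journal file can be checked from a script. The sketch below assumes Python's standard `sqlite3` module, a throwaway database in a temporary directory, and a made-up `foods` table; it forces the rollback-journal mode discussed in these slides with `PRAGMA journal_mode = DELETE`, in case the local build defaults to something else. `isolation_level=None` stops the Python module from issuing its own BEGIN statements.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foods.db")
journal = path + "-journal"

conn = sqlite3.connect(path, isolation_level=None)
conn.execute("PRAGMA journal_mode = DELETE")
conn.execute("CREATE TABLE foods (id INTEGER PRIMARY KEY, type_id INTEGER)")
conn.execute("INSERT INTO foods (type_id) VALUES (1)")

conn.execute("BEGIN")
conn.execute("UPDATE foods SET type_id = 0")   # first page modification
during = os.path.exists(journal)               # journal present mid-transaction?
conn.execute("COMMIT")
after = os.path.exists(journal)                # journal present after commit?
conn.close()

print("journal during transaction:", during)
print("journal after commit:", after)
```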
    67. 73. EXCLUSIVE <ul><li>The moment a connection tries to commit anything to the database, it has to go from RESERVED to EXCLUSIVE. </li></ul><ul><li>It cannot get an EXCLUSIVE lock until all other SHARED locks are released (readers are gone). </li></ul>
    68. 74. RESERVED vs. Exclusive <ul><li>In order for a connection to modify a page (in memory), it must first get a RESERVED lock. </li></ul><ul><li>In order for a connection to write that page back to the database file, it must first get EXCLUSIVE. </li></ul><ul><li>RESERVED is making modifications in the page cache. EXCLUSIVE is putting those modifications back into the database file. </li></ul><ul><li>The pager enters EXCLUSIVE because either you issue a COMMIT, or it has run out of room to hold modified pages in the page cache. </li></ul>
    69. 75. Read Scenario
        db = open('foods.db')
        db.exec('BEGIN')
        db.exec('SELECT * FROM episodes')
        db.exec('SELECT * FROM episodes')
        db.exec('COMMIT')
        db.close()
        <ul><li>UNLOCKED → PENDING → SHARED → UNLOCKED </li></ul>
    70. 76. Write Scenario
        db = open('foods.db')
        db.exec('BEGIN')
        db.exec('UPDATE episodes set ...')
        db.exec('COMMIT')
        db.close()
        <ul><li>UNLOCKED → PENDING → SHARED → RESERVED → PENDING → EXCLUSIVE → UNLOCKED </li></ul>
    71. 77. Auto-commit mode <ul><li>Auto-commit mode is where every statement runs in its own transaction. It is the default. </li></ul><ul><li>Each modifying statement involves creating, committing, and clearing out the rollback journal. On bulk inserts, this can add up. </li></ul><ul><li>Ultimately, it is speed versus space. </li></ul><ul><li>The moment you issue a BEGIN, auto-commit mode is off and all subsequent statements run in a single transaction until COMMIT or ROLLBACK. </li></ul>
    72. 78. Auto-commit mode <ul><li>If you want speed, run in a single transaction. This may result in a large journal file depending on how much you change/insert (complete copy of DB in the worst case). </li></ul><ul><li>If you want space, run in auto-commit. This will keep the journal file smaller on average, but may result in many fsyncs . (Also, look into PRAGMA synchronous ). </li></ul>
    73. 79. Auto-commit mode <ul><li>In most applications, especially where concurrency matters, don't use auto-commit. You minimize journal syncs and this is the only way to reliably avoid deadlocks (which we'll address shortly). </li></ul><ul><li>In bulk loading, where there is probably only a single connection, consider using a combination. Group a bunch of operations together in a transaction, commit them, and start over. This way, you are limiting both fsync s and journal size. </li></ul>
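The combined approach for bulk loading, committing in groups rather than per row or all at once, might look like the sketch below. It assumes Python's standard `sqlite3` module; `bulk_insert`, the `foods` table and the batch size of 1000 are all hypothetical choices for illustration, and the right batch size depends on the workload.

```python
import sqlite3

def bulk_insert(conn, rows, batch_size=1000):
    """Insert rows, committing once per batch instead of once per row."""
    for start in range(0, len(rows), batch_size):
        conn.execute("BEGIN")
        conn.executemany("INSERT INTO foods (name) VALUES (?)",
                         rows[start:start + batch_size])
        conn.execute("COMMIT")

# isolation_level=None leaves transaction control to the explicit
# BEGIN/COMMIT statements above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE foods (id INTEGER PRIMARY KEY, name TEXT)")

bulk_insert(conn, [("food %d" % i,) for i in range(2500)], batch_size=1000)
count = conn.execute("SELECT count(*) FROM foods").fetchone()[0]
print(count)
conn.close()
```

With 2500 rows and a batch size of 1000 this runs three transactions, so the journal never has to cover more than one batch of changes.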
    74. 80. SELECT-UPDATE Loops <ul><li>A common operation in programming with databases is to iterate over a result set while making modifications to records therein. </li></ul><ul><li>If you are not careful about how you do this, you can wind up with either </li></ul><ul><li>Deadlocks. </li></ul><ul><li>Unexpected results. </li></ul>
    75. 81. SELECT-UPDATE Loops <ul><li>A common way to do this on other databases is to open a cursor with one connection and update records with another. </li></ul>
    76. 82. This Won't Work
        c1 = open('foods.db')
        c2 = open('foods.db')
        stmt = c1.prepare('SELECT * FROM episodes')
        while stmt.step()
          c2.exec('UPDATE episodes SET …')
        end
        stmt.finalize()
        c1.close()
        c2.close()
    77. 83. Why Not? <ul><li>Locking. </li></ul><ul><li>c1 has a SHARED lock because of the SELECT statement. </li></ul><ul><li>c2 must get an EXCLUSIVE lock to commit the UPDATE statement's changes, which is impossible as long as c1 is in SHARED. </li></ul><ul><li>c2 will get SQLITE_BUSY every time. </li></ul>
    78. 84. Solution: Single Connection
        c1 = open('foods.db')
        c1.exec('BEGIN')
        stmt = c1.prepare('SELECT * FROM episodes')
        while stmt.step()
          sql = 'UPDATE episodes SET …'
          c1.exec(sql)
        end
        stmt.finalize()
        c1.exec('COMMIT')
        c1.close()
    79. 85. We Still Have Problems <ul><li>If this code is operating in a concurrent environment, this example can still fail. </li></ul><ul><li>If another connection has a RESERVED lock, our UPDATE will fail. </li></ul><ul><li>Perhaps a little brute force might help ... We'll just keep looping until the UPDATE goes through. Eventually that other connection will finish what it's doing and our UPDATE will succeed. Right? </li></ul>
    80. 86. Brute Force
        stmt = c1.prepare('SELECT * FROM episodes')
        while stmt.step()
          sql = 'UPDATE episodes SET …'
          while c1.exec(sql) != SQLITE_OK
            # Keep trying until it works
          end
        end
    81. 87. Actual Example
        #!/usr/bin/env lua
        require "sqlite3"

        db = sqlite3.new()
        stmt = sqlite3_stmt.new()

        -- Connect to database
        sqlite3.open(db, "foods.db")

        -- Start a transaction
        if sqlite3.exec(db, "BEGIN") ~= SQLITE_OK then
            print('BEGIN FAILED: ' .. sqlite3.errmsg(db))
            return false
        end

        -- Compile a SELECT statement
        sql = 'SELECT id, type_id, name FROM foods ORDER BY id LIMIT 1'
        sqlite3.prepare(db, sql, stmt);

        -- Execute it. This is where an EXCLUSIVE lock will stop us
        local rc = sqlite3.step(stmt)

        -- Check the value. If not SQLITE_ROW, we have a problem.
        if rc ~= SQLITE_ROW then
            print("SELECT FAILED: " .. sqlite3.errmsg(db))
            os.exit(1)
        end
    82. 88.
        -- Iterate over result set
        while rc == SQLITE_ROW do
            -- Get the record id
            local id = sqlite3.column_int(stmt, 0)
            print("Fetched row: id="..id)

            -- Update the row. Keep trying until it goes through
            sql = 'UPDATE foods SET type_id = 100 WHERE id=' .. id
            while sqlite3.exec(db, sql) ~= SQLITE_OK do
                print('UPDATE FAILED: ' .. sqlite3.errmsg(db))
                os.execute("sleep 1")
            end

            -- Next row
            rc = sqlite3.step(stmt)
        end

        -- Finalize
        sqlite3.finalize(stmt);

        -- Commit transaction
        if sqlite3.exec(db, "COMMIT") ~= SQLITE_OK then
            print('COMMIT FAILED: ' .. sqlite3.errmsg(db))
            return false
        end

        sqlite3.close(db)
    83. 89. Result: Deadlock <ul><li>This code prevents the other connection from ever finishing, at least if that code takes the same brute force approach as us – looping forever until it works. </li></ul><ul><li>Why? Because our connection never gives up its SHARED lock, preventing the other connection from being able to get to EXCLUSIVE. As long as we both keep retrying, we deadlock. </li></ul><ul><li>And we also lock everyone else out of the database as well. </li></ul>
    84. 90. Result: Deadlock <ul><li>Who is at fault here? We are. </li></ul><ul><li>We made no attempt to ensure that we could ever get a RESERVED lock. </li></ul><ul><li>We never backed off once we couldn't get the RESERVED lock, causing our SHARED lock to prevent the guy in RESERVED from committing. </li></ul><ul><li>The other guy had RESERVED. He has the right of way. We have to ROLLBACK and start over from scratch. </li></ul>
    85. 91. Resolution <ul><li>Start with the right transaction. </li></ul><ul><li>In this example, we know we are going to modify the database. So we should start with a RESERVED lock. </li></ul><ul><li>This way, we never get in the way of another connection in RESERVED, and we don't have to start over from scratch. </li></ul><ul><li>If everyone follows this protocol, then this deadlock cannot occur. </li></ul>
    86. 92. Transaction Entry Points <ul><li>BEGIN (DEFERRED): No locks until you do something. Auto-commit runs in this form. </li></ul><ul><li>BEGIN IMMEDIATE: Start with a RESERVED lock. First get a SHARED lock, and then get a RESERVED lock. If it is not possible to get the RESERVED lock, return to UNLOCKED. </li></ul><ul><li>BEGIN EXCLUSIVE: Start with an EXCLUSIVE lock. </li></ul>
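The entry points above can be tried out with two connections to the same file. What follows is a minimal sketch using Python's standard `sqlite3` module rather than the C API; the helper `try_begin` and the connection names are made up for illustration, and it assumes the default rollback-journal mode.

```python
import os
import sqlite3
import tempfile

def try_begin(conn, mode):
    """Attempt BEGIN <mode>; True on success, False if the database is locked."""
    try:
        conn.execute(f"BEGIN {mode}")
        return True
    except sqlite3.OperationalError:
        return False

path = os.path.join(tempfile.mkdtemp(), "foods.db")
# isolation_level=None leaves transaction control to us;
# timeout=0 makes a busy database fail at once instead of waiting.
writer = sqlite3.connect(path, isolation_level=None, timeout=0)
other = sqlite3.connect(path, isolation_level=None, timeout=0)
writer.execute("CREATE TABLE episodes (id INTEGER PRIMARY KEY, name TEXT)")

writer_ok = try_begin(writer, "IMMEDIATE")      # writer now holds RESERVED
deferred_ok = try_begin(other, "DEFERRED")      # no lock taken yet, so this succeeds
other.execute("SELECT count(*) FROM episodes").fetchall()  # SHARED coexists with RESERVED
other.execute("COMMIT")
immediate_ok = try_begin(other, "IMMEDIATE")    # a second RESERVED is refused
writer.execute("COMMIT")
after_commit_ok = try_begin(other, "IMMEDIATE") # free again once the writer is done
other.execute("ROLLBACK")
```

The refused `BEGIN IMMEDIATE` leaves the second connection holding no locks at all, which is what lets the first writer finish.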
    87. 93. Suggested Approach
        c1 = open('foods.db')
        while c1.exec('BEGIN IMMEDIATE') != SQLITE_OK
        end
        stmt = c1.prepare('SELECT * FROM episodes')
        while stmt.step()
          # Will always work because we're in RESERVED
          c1.exec('UPDATE episodes SET …')
        end
        stmt.finalize()
        c1.exec('COMMIT')
        c1.close()
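As a rough, runnable counterpart to the suggested approach, here is a sketch in Python's standard `sqlite3` module. The helper `begin_immediate`, its retry count and delay, and the `seen` column are assumptions added for illustration. Unlike the pseudocode, it bounds the retries and sleeps between attempts instead of spinning.

```python
import sqlite3
import time

def begin_immediate(conn, attempts=5, delay=0.05):
    """Retry BEGIN IMMEDIATE a bounded number of times; True once RESERVED is held."""
    for _ in range(attempts):
        try:
            conn.execute("BEGIN IMMEDIATE")
            return True
        except sqlite3.OperationalError:
            # We hold no locks while waiting, so the other writer can finish.
            time.sleep(delay)
    return False

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE episodes (id INTEGER PRIMARY KEY, name TEXT, seen INTEGER DEFAULT 0)")
conn.executemany("INSERT INTO episodes (name) VALUES (?)", [("Pilot",), ("The Stakeout",)])

if begin_immediate(conn):
    # fetchall() reads every id up front, so no read cursor is open during the UPDATEs.
    for (episode_id,) in conn.execute("SELECT id FROM episodes").fetchall():
        conn.execute("UPDATE episodes SET seen = 1 WHERE id = ?", (episode_id,))
    conn.execute("COMMIT")

seen = [row[0] for row in conn.execute("SELECT seen FROM episodes ORDER BY id")]
```

If `begin_immediate` returns `False`, the caller has lost nothing and can report the failure or try again later.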
    88. 94. SELECT-UPDATE Loop Rules <ul><li>If you are going to modify rows while reading them from the database, your code has to use one and only one connection (transaction context) with which to do so. </li></ul><ul><li>If other code is hitting the same database, you (and the other code) should start in at least RESERVED before modifying the database. This will keep both of you from clashing. </li></ul>
    89. 95. Pop Quiz
        sql = 'UPDATE ...'
        stmt = c1.prepare(sql)
        while stmt.step() != SQLITE_DONE
          # Keep trying
        end
        stmt.finalize()
    <ul><li>Say stmt.step() returns SQLITE_BUSY . What does SQLITE_BUSY mean here? </li></ul><ul><li>Can't get SHARED? Can't get RESERVED? Can't get EXCLUSIVE? </li></ul>
    90. 96. Pop Quiz <ul><li>Answer: You don't know. </li></ul><ul><li>I cite this example because you may be tempted to do something like this using bound parameters. </li></ul><ul><li>How do you make this work? Start with BEGIN IMMEDIATE . </li></ul><ul><li>Then SQLITE_BUSY means that you couldn't get the reserved lock. And, BEGIN IMMEDIATE automatically releases the SHARED lock when it fails so that the other connection can complete. </li></ul>
    91. 97. Pop Quiz
        # The correct place to apply brute force.
        while c1.exec('BEGIN IMMEDIATE') != SQLITE_OK
        end
        sql = 'UPDATE ...'
        stmt = c1.prepare(sql)
        while stmt.step() != SQLITE_DONE
          # Keep trying
        end
        stmt.finalize()
        c1.exec('COMMIT')
    92. 98. Read Consistency <ul><li>The suggested approach for the SELECT-UPDATE loop still presents problems. </li></ul><ul><li>Even though it is deadlock resistant, it can still lead to unpredictable results. </li></ul><ul><li>The reading cursor can be adversely affected by the UPDATE, causing it to re-read or even skip over records, depending on what the UPDATE does. </li></ul><ul><li>In short, you should not modify a table if there are any reading cursors on it – even though both cursors reside in the same transaction. </li></ul>
    93. 99. Example
        select_sql = 'SELECT * FROM foods WHERE type_id > 0 ORDER BY type_id'
        update_sql = 'UPDATE foods SET type_id=type_id+1 WHERE '
        stmt = c1.prepare(select_sql)
        while stmt.step()
          id = sqlite3_column_int(stmt, 0)
          c1.exec(update_sql + 'id=' + id)
        end
    94. 100. Cursor Sensitivity <ul><li>This example (if it were real) would work fine. </li></ul><ul><li>But add an index on type_id. </li></ul><ul><li>What happens now? </li></ul><ul><li>If that were a cron job, you could be getting a big email tomorrow. </li></ul><ul><li>What happened? The index we created, combined with the type_id > 0 expression, caused SQLite to follow the index path, and that path was always changing because of the UPDATE. </li></ul>
    95. 101. Actual Example
        #!/usr/bin/env lua
        require "sqlite3"

        --[==[
        Assumes the following in database:
        CREATE TABLE discounts ( product_id INTEGER PRIMARY KEY, value INT );
        INSERT INTO discounts (value) VALUES (1);
        ]==]

        function print_value(db)
          local sql = "SELECT value FROM discounts"
          local stmt = sqlite3_stmt.new()
          local rc = sqlite3.prepare(db, sql, stmt);
          if rc ~= SQLITE_OK then
            error(sqlite3.errmsg(db))
          end
          sqlite3.step(stmt)
          print(string.format(" RESULT: value = %i ", sqlite3.column_int(stmt, 0)))
          sqlite3.finalize(stmt);
        end
    96. 102.
        -- Connect to database
        db = sqlite3.new()
        stmt = sqlite3_stmt.new()
        sqlite3.open(db, "test.db")

        -- Drop/recreate discounts table, if exists
        clear_table(db)

        -- Create an index on the value column
        sqlite3.exec(db, "CREATE INDEX discounts_value_idx ON discounts(value)")

        sql = "SELECT * FROM discounts WHERE value > 0"
        rc = sqlite3.prepare(db, sql, stmt);
        if rc ~= SQLITE_OK then
          print('SQL ERROR: ' .. sqlite3.errmsg(db))
          print(rc)
          os.exit(1)
        end

        -- Iterate through result set
        while sqlite3.step(stmt) == SQLITE_ROW do
          local id = sqlite3.column_int(stmt, 0)
          local value = sqlite3.column_int(stmt, 1)
          print(string.format("SQLITE_ROW: id=%-2i value=%-2i", id, value))

          -- Increment value by 1
          sqlite3.exec( db, "UPDATE discounts SET value=value+1 " ..
                            " WHERE product_id=" .. id )
        end
    97. 103.
        -- Close statement handle
        sqlite3.finalize(stmt);

        -- Print the current value
        print_value(db)

        -- Close database
        sqlite3.close(db)
    98. 104. Cursor Sensitivity <ul><li>Our UPDATE constantly moved the current record in front of the current cursor position, extending the result path. After the UPDATE, our cursor was no longer sitting on the current record; it was effectively sitting behind it. Therefore, the next record is always the current record. </li></ul><ul><li>Why does this work in other databases but not in SQLite? </li></ul>
    99. 105. Cursor Sensitivity <ul><li>In other databases, the client libraries often take a different approach. When you do a SELECT, you get your own copy of the result set sent back to you. It exists outside of the database, and is therefore insulated from further database changes. </li></ul><ul><li>Or, some databases provide server-side cursors, which work similarly to SQLite. </li></ul>
    100. 106. Read Consistency <ul><li>Also, in other databases (take PostgreSQL, for example), you could perform the UPDATE in a separate transaction. Therefore, even if you are using cursors, the SELECT still won't see the UPDATE changes because PostgreSQL uses multi-version concurrency control. This is true of any MVCC database (e.g. MySQL InnoDB, Oracle). </li></ul><ul><li>Even for non-MVCC databases, SQL:2003 provides options that make cursors sensitive or insensitive to changes. </li></ul>
    101. 107. Read Consistency <ul><li>In SQLite, all cursors are sensitive to changes. </li></ul><ul><li>In earlier versions, SQLite used table locks to prevent a table from being modified if it had read cursors on it. Now, it allows it. In either case, you still should take the same approach. </li></ul><ul><li>You have to ensure that you don't actively change the result set as you make changes (unless you are aware of the side effects you may be producing). How do you do this? </li></ul>
    102. 108. Consider Temporary Tables <ul><li>Temporary tables can be quite handy. </li></ul><ul><li>Make a copy of the set you want to modify (or enough of it to navigate and make the changes you want). It will be immune to your changes. </li></ul><ul><li>This is roughly the equivalent of the traditional client API of other databases. You effectively have your own copy. </li></ul><ul><li>Using temporary tables, you won't bloat your database file with ephemeral data, and you don't have to bother with vacuuming the space afterwards. </li></ul>
    103. 109. Example
        c1 = open('foods.db')
        c2 = open('foods.db')
        c2.exec('CREATE TEMPORARY TABLE temp_episodes AS SELECT * from episodes')
        stmt = c1.prepare('SELECT * FROM episodes')
        while stmt.step()
          print stmt.column('name')
          c2.exec('UPDATE temp_episodes SET …')
        end
        stmt.finalize()
    104. 110.
        c2.exec('BEGIN IMMEDIATE')
        # Use conflict resolution to do the update
        # in a single step
        c2.exec('REPLACE INTO episodes SELECT * FROM temp_episodes')
        c2.exec('COMMIT')
        c1.close()
        c2.close()
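A variation on the temporary-table idea, sketched with Python's standard `sqlite3` module: copy only the keys of the rows to be changed, walk that copy, and update the real table. The table and index names come from the earlier discounts example; the temporary table name `work` and the sample rows are made up here. Because the read cursor sits on the copy, the index on `value` no longer steers it.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE discounts (product_id INTEGER PRIMARY KEY, value INT)")
conn.execute("CREATE INDEX discounts_value_idx ON discounts(value)")
conn.executemany("INSERT INTO discounts (value) VALUES (?)", [(1,), (2,), (3,)])

conn.execute("BEGIN IMMEDIATE")
# Snapshot the keys of the rows we intend to change.
conn.execute(
    "CREATE TEMPORARY TABLE work AS "
    "SELECT product_id FROM discounts WHERE value > 0"
)
# The read cursor walks the snapshot, so the UPDATEs cannot move rows under it.
for (product_id,) in conn.execute("SELECT product_id FROM work"):
    conn.execute(
        "UPDATE discounts SET value = value + 1 WHERE product_id = ?",
        (product_id,),
    )
conn.execute("COMMIT")

values = [row[0] for row in conn.execute("SELECT value FROM discounts ORDER BY product_id")]
```

Each row is incremented exactly once, whatever index SQLite would have picked for the original `WHERE value > 0` scan.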
    105. 111. Page Cache 101 <ul><li>One way to increase concurrency is to reduce the time a connection is in EXCLUSIVE. </li></ul><ul><li>Ideally, a connection should only enter EXCLUSIVE when a COMMIT is issued, not because of cache exhaustion. </li></ul><ul><li>Thus, the connection's time in EXCLUSIVE should only entail flushing pages to disk, not doing more work. </li></ul>
    106. 112. Page Cache 101 <ul><li>The way to do this is to ensure that the cache is large enough to hold all dirty pages. </li></ul><ul><li>To correctly size the cache, you must have an idea of how it works. </li></ul><ul><li>Page cache is composed of three kinds of pages: Pinned, dirty, and unused. </li></ul>
    107. 114. Page Cache 101: Page States <ul><li>When B-tree reads a page (a cursor is on it), it's referenced. </li></ul><ul><li>When B-tree modifies a page, it's dirty. </li></ul><ul><li>When B-tree is done with a page, it is unreferenced (put on free list). </li></ul>
    108. 115. Page Cache 101: Page Lists <ul><li>Pager keeps three lists: All, free, and dirty. </li></ul><ul><li>Free pages are pages with no references (B-tree is not using them). </li></ul><ul><li>Only free pages that are not dirty can be recycled. </li></ul>
    109. 116. Page Cache 101: Cache Growth <ul><li>Cache can only grow as large as the limit set by the cache_size PRAGMA. </li></ul><ul><li>Available pages = cache_size – dirty – referenced </li></ul><ul><li>When there are no more available pages, the next page modification attempt will force pager to try for an EXCLUSIVE lock. </li></ul>
    110. 117. Page Cache 101: Allocation <ul><li>Page allocation is pretty simple. </li></ul><ul><li>If current cache size < cache_size, allocate a new page from the heap. </li></ul><ul><li>If current cache size = cache_size, try to recycle a page. Search through the free list and look for the first non-dirty page. </li></ul><ul><li>If it can't find any usable pages, go to EXCLUSIVE and write out modified pages to the database. This will free up dirty pages for reuse. </li></ul>
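The growth formula and the allocation steps can be restated as a toy model. This is only a sketch of the rules as the slides describe them, written in Python; it is not SQLite's pager code, and the function names and return strings are invented for illustration.

```python
def available_pages(cache_size, dirty, referenced):
    """Available pages = cache_size - dirty - referenced."""
    return cache_size - dirty - referenced

def next_page_source(cache_size, cached, free_clean):
    """Where the next page comes from, following the three allocation steps."""
    if cached < cache_size:
        return "allocate from heap"
    if free_clean > 0:
        return "recycle free page"
    return "go EXCLUSIVE and flush dirty pages"

# Default cache of 2,000 pages, nothing dirty yet, about 5 pages referenced by a SELECT.
default_headroom = available_pages(2000, dirty=0, referenced=5)
```

With a full cache and no clean free pages, the model lands on the third branch, which is the case the cache sizing is meant to avoid.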
    111. 118. Page Cache 101: Overhead <ul><li>Each cursor references 2 pages (page it points to and its parent). </li></ul><ul><li>Each index cursor also references 2 pages. </li></ul><ul><li>If you get SQLite to use multiple indexes, then they are merged into a temp table, which is managed by a separate page cache. In this case, the main page cache is therefore unaffected. </li></ul>
    112. 119. Page Cache 101: Overhead <ul><li>Every active statement always references page 1 (the sqlite_master table). </li></ul><ul><li>You are basically looking at approx 4 or 5 referenced pages for the average SELECT, which is very small compared to the default cache size (2,000 pages). </li></ul><ul><li>So the number of pages available for modification is approx cache_size – 5, depending on what you are trying to do. If you have 100 cursors open on the same connection, however, you will obviously have far more referenced pages than that, and correspondingly fewer available for modification. </li></ul>
    113. 120. Page Cache 101: Cache Sizing <ul><li>To estimate records per page, use SQLite analyzer. </li></ul><ul><li>Records/page = Num entries/Primary pages (see next slide). </li></ul><ul><li>Take the upper limit of the number of records you expect to modify on average, compute pages, and adjust cache size accordingly using PRAGMA cache_size or default_cache_size . </li></ul>
    114. 121. Page Cache 101: Cache Sizing
        *** Table FOODS w/o any indices **************************************
        Percentage of total database.......... 27.5%
        Number of entries..................... 412
        Bytes of storage consumed............. 11264
        Bytes of payload...................... 7245        64.3%
        Average payload per entry............. 17.58
        Average unused bytes per entry........ 4.67
        Average fanout........................ 10.00
        Fragmentation......................... 60.0%
        Maximum payload per entry............. 49
        Entries that use overflow............. 0           0.0%
        Index pages used...................... 1
        Primary pages used.................... 10
        Overflow pages used................... 0
        Total pages used...................... 11
        Unused bytes on index pages........... 942         92.0%
        Unused bytes on primary pages......... 982         9.6%
        Unused bytes on overflow pages........ 0
        Unused bytes on all pages............. 1924        17.1%
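Plugging the analyzer figures into the sizing rule gives a worked example. In this Python sketch the 412 entries and 10 primary pages come from the report; the 200,000-record batch and the 5 pages of headroom for referenced pages are assumed numbers chosen for illustration. The estimate also assumes modified records are packed as densely as the table average and ignores index pages, so treat it as a lower bound.

```python
import math
import sqlite3

entries = 412          # "Number of entries" in the analyzer report
primary_pages = 10     # "Primary pages used" in the analyzer report
records_per_page = entries / primary_pages          # 41.2

records_to_modify = 200_000                         # assumed upper limit for one transaction
dirty_pages = math.ceil(records_to_modify / records_per_page)
referenced_pages = 5                                # rough figure for an average SELECT
cache_pages = dirty_pages + referenced_pages

conn = sqlite3.connect(":memory:")
# PRAGMA values cannot be bound as parameters; cache_pages is a plain int computed above.
conn.execute(f"PRAGMA cache_size = {cache_pages}")
configured = conn.execute("PRAGMA cache_size").fetchone()[0]
```

The setting applies to this connection only and has to be issued again each time the database is opened.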
    115. 122. Summary <ul><li>SELECT-UPDATE loops must be done with a single connection. </li></ul><ul><li>If you are at all worried about concurrency and avoiding deadlocks, you should always use BEGIN IMMEDIATE before modifying the database. </li></ul>
    116. 123. Summary <ul><li>Cursors are sensitive to changes in tables and indexes. If your modifications potentially change your result set, you should consider using temporary tables. </li></ul><ul><li>Good write concurrency comes from minimizing time in EXCLUSIVE, and therefore maximizing time in RESERVED. </li></ul><ul><li>You maximize your time in RESERVED by making sure your page cache is large enough to hold your modifications (within reason). </li></ul>
    117. 124. V. Features
    118. 125. Unique Features <ul><li>In-memory databases. Databases don't have to exist in OS files. They can be created in memory. </li></ul><ul><li>Virtual tables. Virtual tables allow you to make tables out of information that exists outside of SQLite. By implementing a specific set of callback functions, you can enable SQLite to scan non-relational data sources as if they were native SQLite tables. For example, you could implement a virtual table over your file system and read file entries as if they were individual rows. </li></ul>
    119. 126. Unique Features <ul><li>Conflict resolution. SQLite has built-in rules to overcome integrity violations. If you try to insert a record whose primary key value is already in the table, conflict resolution gives you the option of automatically overwriting the record, or failing. Furthermore, conflict resolution can be defined at different levels ranging from the SQL command all the way down to the table definition. </li></ul><ul><li>Attaching databases. You can attach multiple database files to a single connection and read from and write to them as if they belonged to a single database. </li></ul>
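A short illustration of conflict resolution and attached databases, using Python's standard `sqlite3` module. The `foods` table echoes the deck's examples, while the rows and the schema name `archive` are made up here. Autocommit mode is used because ATTACH cannot run inside a transaction.

```python
import sqlite3

# isolation_level=None keeps the connection in autocommit mode.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE foods (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO foods VALUES (1, 'Bagels')")

# Default resolution (ABORT): a duplicate primary key fails.
try:
    conn.execute("INSERT INTO foods VALUES (1, 'Muffins')")
    default_failed = False
except sqlite3.IntegrityError:
    default_failed = True

# Statement-level resolution: overwrite the record, or skip the insert quietly.
conn.execute("INSERT OR REPLACE INTO foods VALUES (1, 'Muffins')")
conn.execute("INSERT OR IGNORE INTO foods VALUES (1, 'Scones')")
name = conn.execute("SELECT name FROM foods WHERE id = 1").fetchone()[0]

# Attach a second (here in-memory) database and copy across schemas in one statement.
conn.execute("ATTACH DATABASE ':memory:' AS archive")
conn.execute("CREATE TABLE archive.foods AS SELECT * FROM main.foods")
archived = conn.execute("SELECT count(*) FROM archive.foods").fetchone()[0]
```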
    120. 127. Unique Features <ul><li>Manifest typing. SQLite is dynamically typed, and works a lot like a scripting language where other databases are statically typed. This can provide a great deal of flexibility in many of the same ways. </li></ul>
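Manifest typing is easy to see with `typeof()`. In this sketch (Python's standard `sqlite3` module; the table `t` and its values are invented for illustration) a column declared INTEGER accepts whatever it is given. The declared type acts as a preference: text that looks like a number is converted, and anything else is stored as it arrived.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany(
    "INSERT INTO t (x) VALUES (?)",
    [(123,), ("456",), ("abc",), (1.5,), (None,)],
)

# typeof() reports the storage class of each stored value, row by row.
stored_types = [row[0] for row in conn.execute("SELECT typeof(x) FROM t ORDER BY rowid")]
```

Here `'456'` ends up stored as an integer, while `'abc'` stays text in the same column.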
    121. 128. Interesting Features <ul><li>User-defined functions and aggregates. Half of SQLite's API is geared for creating user-defined functions and aggregates, allowing you to easily add your own SQL functions. </li></ul><ul><li>Loadable modules. These complement user-defined functions, allowing you to store them in dynamically loadable libraries, and load them from SQL at run time rather than having to statically compile your own custom library. </li></ul>
    122. 129.
        #include <sqlite3ext.h>
        SQLITE_EXTENSION_INIT1

        static void hello_newman( sqlite3_context *context,
                                  int argc, sqlite3_value **argv)
        {
            /* -1: read up to the terminator; SQLITE_STATIC: the string is a constant */
            sqlite3_result_text(context, "Hello Jerry", -1, SQLITE_STATIC);
        }

        int newman_init( sqlite3 *db, char **pzErrMsg,
                         const sqlite3_api_routines *pApi )
        {
            SQLITE_EXTENSION_INIT2(pApi);
            /* Registered with 0 arguments, matching select hello_newman() */
            sqlite3_create_function( db, "hello_newman", 0, SQLITE_ANY, 0,
                                     hello_newman, 0, 0 );
            return 0;
        }
    123. 130.
        owensmk $ gcc --shared examples/newman.c -o newman.so
        owensmk $ ./sqlite3
        SQLite version 3.4.0
        Enter ".help" for instructions
        sqlite> .load newman.so newman_init
        sqlite> select hello_newman();
        Hello Jerry
        sqlite>
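Language bindings usually expose the same function-registration API directly, so a loadable module is not always needed. Here is a sketch of the equivalent in Python's standard `sqlite3` module, with a small aggregate added for comparison; the aggregate `longest` and its test table are invented for illustration.

```python
import sqlite3

def hello_newman():
    """Scalar SQL function taking no arguments."""
    return "Hello Jerry"

class Longest:
    """Aggregate returning the longest text value it is given."""
    def __init__(self):
        self.best = ""

    def step(self, value):
        # Called once per row in the group.
        if value is not None and len(value) > len(self.best):
            self.best = value

    def finalize(self):
        # Called once at the end to produce the aggregate's result.
        return self.best

conn = sqlite3.connect(":memory:")
conn.create_function("hello_newman", 0, hello_newman)
conn.create_aggregate("longest", 1, Longest)

greeting = conn.execute("SELECT hello_newman()").fetchone()[0]
conn.execute("CREATE TABLE foods (name TEXT)")
conn.executemany("INSERT INTO foods VALUES (?)", [("Bagels",), ("Rye",), ("Pastrami",)])
longest_name = conn.execute("SELECT longest(name) FROM foods").fetchone()[0]
```

Functions registered this way belong to the one connection; a loadable module is the route when several programs need the same function.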
    124. 131. Interesting Features <ul><li>Full text search. The goal of FTS is to be able to store and search hundreds of thousands of documents (perhaps a million). The stable version is FTS2, which is currently developed by Scott Hess at Google. See sqlite.org/cvstrac/wiki?p=FtsTwo for more info. </li></ul><ul><li>Pluggable backends. Allows you to write custom I/O handlers to replace SQLite's file I/O. Firefox uses this to implement asynchronous I/O so that its database is more responsive over NFS. See test_async.c in source for more information. </li></ul>
    125. 132. VII. Limitations
    126. 133. Query Optimization <ul><li>Query Optimizer. SQLite does not have a sophisticated query optimizer. While it does collect some basic index statistics, it will not use fancy algorithms to find optimal paths to materialize large multi-way joins. </li></ul><ul><li>SQLite cannot utilize multiple indexes without you specifically formulating your SQL to do so (using INTERSECT). </li></ul>
        SELECT * FROM foods WHERE rowid IN
          (SELECT rowid FROM foods WHERE name='Bagels'
           INTERSECT
           SELECT rowid FROM foods WHERE type_id=1);
    127. 134. Concurrency <ul><li>SQLite uses coarse-grained (database-level) locking. This limits how many concurrent connections can be in a given database at a given time, and what they can do. </li></ul><ul><li>Understanding how SQLite implements locking and transactions (while somewhat involved) is the single most important thing to using it well for applications that involve write-concurrency. That is the principal topic of this talk. </li></ul><ul><li>Read concurrency runs like a bat out of Hell. </li></ul>
    128. 135. Network File Systems <ul><li>The problem is not that SQLite doesn't work on network file systems. Rather, bugs in a given implementation may lead to undefined behaviour or corrupt databases. </li></ul><ul><li>SQLite is very dependent on file locking. If the network file system does not implement locking exactly like it should, then SQLite cannot reliably know when to stay out of the database when it is busy, or when to enter it when it's not. </li></ul><ul><li>SQLite can work on network file systems, if they implement locking correctly. </li></ul>
    129. 136. SQL Implementation <ul><li>There are some SQL related constructs that SQLite does not implement, such as foreign-key constraints, GRANT and REVOKE, complete ALTER TABLE support, and nested transactions, among other things. </li></ul><ul><li>For details, see http://www.sqlite.org/omitted.html . </li></ul>
    130. 137. Database Size <ul><li>SQLite is not meant for data warehousing. </li></ul><ul><li>When SQLite initializes the rollback journal, it uses a bitmap to track dirty pages. This bitmap must be large enough to represent every page in the database. Therefore, the bitmap consumes 256 bytes for every 1 MB of database. </li></ul><ul><li>Thus, for a 100 GB database, each transaction would require allocating 25 MB of RAM each time a connection enters RESERVED. </li></ul>
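The 25 MB figure follows directly from the 256-bytes-per-megabyte rate quoted above. A small Python check of the arithmetic, where the function name is invented and the rate is taken from the slide rather than measured:

```python
BITMAP_BYTES_PER_MB = 256   # rate stated on the slide

def journal_bitmap_bytes(database_mb):
    """Bytes of dirty-page bitmap needed for a database of the given size in MB."""
    return database_mb * BITMAP_BYTES_PER_MB

database_mb = 100 * 1024                        # 100 GB expressed in MB
bitmap_bytes = journal_bitmap_bytes(database_mb)
bitmap_mb = bitmap_bytes / (1024 * 1024)        # convert bytes back to MB
```

For comparison, a 1 GB database at the same rate needs 256 KB per transaction, which is why this only matters at warehouse scale.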
    131. 138. VI. Optimizations (and other dirty hacks)‏
    132. 139. Improving Concurrency <ul><li>Adjust the page cache. </li></ul><ul><li>Consider turning off synchronous writes (but know the risks). PRAGMA synchronous=OFF disables flushing the journal to disk before modified pages are written to the database file (~50x speedup). If the system goes down, part of the journal file may be lost, and the database can end up inconsistent or corrupted. </li></ul><ul><li>Use incremental BLOB I/O for large binary data transfers, as it minimizes memory allocation. You don't have to load the whole blob into the page cache to get it into the database. </li></ul>
    133. 140. Improving Speed <ul><li>For single connections, consider exclusive locking mode (PRAGMA locking_mode ). It minimizes file operations associated with locking and the journal. It reuses the journal (truncating rather than closing) and keeps all locks intact between transactions (rather than releasing and reacquiring them). This can pay big dividends in embedded applications. </li></ul><ul><li>Consider the SQLite Amalgamation. It's an optimizations smackdown: the whole code base in a single source file. Can yield 35% performance improvement for optimized compiles. </li></ul>
    134. 141. Improving Speed <ul><li>Consider your page size. If you are storing large chunks of binary data, a larger page size could help performance. Also, larger page sizes increase the fanout on indexes, which can be helpful if you have very large tables. See PRAGMA page_size in the documentation. </li></ul>
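The PRAGMAs named in this section can be set from any binding. Here is a sketch with Python's standard `sqlite3` module, where the file name, the `blobs` table, and the 8192-byte page size are arbitrary choices for illustration. The page size has to be set before the first table is created, and `synchronous = OFF` carries the corruption risk described above.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "tuned.db")
conn = sqlite3.connect(path, isolation_level=None)

# Page size only takes effect on a database that has no content yet.
conn.execute("PRAGMA page_size = 8192")
# Trade durability for speed: no waiting for the journal to reach disk.
conn.execute("PRAGMA synchronous = OFF")
# Keep locks between transactions; sensible only when this is the sole connection.
locking_mode = conn.execute("PRAGMA locking_mode = EXCLUSIVE").fetchone()[0]

conn.execute("CREATE TABLE blobs (id INTEGER PRIMARY KEY, data BLOB)")

page_size = conn.execute("PRAGMA page_size").fetchone()[0]
synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
conn.close()
```

Of the three, only the page size is stored in the database file. The other two are per-connection settings and have to be reapplied after every open.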
    135. 142. VIII. Review
    136. 143. Where SQLite Works <ul><li>As in the story that led to SQLite's creation, there are times when you simply don't need a big database. You need a way to manage complex relationships and moderate amounts of storage that travels along with your application. </li></ul><ul><li>Embedded applications are a good example. SQLite is perhaps most popular among embedded developers, as it is designed with features geared specifically for environments with limited resources. </li></ul>
    137. 144. Where SQLite Works <ul><li>For example, people have ripped out the entire front end – leaving only VDBE and storage modules, resulting in a library size of around 64K. </li></ul><ul><li>They precompiled all of their app's queries on PC, and shipped the VDBE code as part of the application. They created custom API call(s) to initialize statement handles from precompiled VDBE code. </li></ul><ul><li>Try doing that with the SQL Server desktop engine. </li></ul>
    138. 145. Where SQLite Works <ul><li>SQLite is much more customizable than most other databases. It has 30,000 lines of code to PostgreSQL's 400,000. </li></ul><ul><li>Provided that SQLite meets your specifications, which system would you rather hack on if you really had to get your hands dirty? (PostgreSQL's code is immaculate, don't get me wrong. I'm referring to overall size and learning curve here.) </li></ul>
    139. 146. Where SQLite Works <ul><li>Traditional applications: (C/C++ clients, utilities, etc.) It adds a lot of power in a small amount of space. SQLite is a better fopen() . It offers an application the ability to use a file format that includes built-in ACID transactions. Equally good for configuration files as well. </li></ul><ul><li>SQLite is a cheap calculator. In-memory databases can do fast, flexible computation with minimal programming. </li></ul><ul><li>Indexing (bookmarks, messages, etc.) </li></ul>
    140. 147. Where SQLite Works <ul><li>Scripts, scripts, scripts: throw-away, automation, log-file analysis, text searching, quotas, etc. SQLite is a very convenient tool with which to aggregate and process data. </li></ul><ul><li>Web development: Great for managing session state, configuration, and prototyping. Manifest typing can come in handy for prototyping the database that will support the application. </li></ul>
    141. 148. Where SQLite Works <ul><li>“Relationalizing” interfaces. Virtual tables give you the ability to represent all kinds of different data (and interfaces) in table form. </li></ul><ul><li>You could write a virtual table to interface with /proc to give real-time reporting on processes, filtering and sorting with SQL. You could create an interface to a log file, or to view the state of your firewall (like pftop but in table form). SNMP? </li></ul><ul><li>You could then create SQL scripts to run automated monitoring tasks which use complex SQL expressions and look for specific conditions. </li></ul>
    142. 149. Where SQLite Doesn't Work <ul><li>As a replacement for Oracle. Massive size, massive features, SQL compliance, a built-in Java virtual machine, thousands of concurrent users, etc. SQLite isn't a data warehouse; it is a sophisticated backpack. </li></ul><ul><li>Client/Server and/or networking (e.g. network file systems). NFS can work but varies with application and implementation. </li></ul>
    143. 150. Where SQLite Doesn't Work <ul><li>High write concurrency. 100,000 hits/day is a conservative estimate for website use, although it has been demonstrated to handle 10 times that. </li></ul><ul><li>It is not hard to make the call on when to use a larger RDBMS. When your data becomes large enough, complex enough, or critical enough to start discussing things like replication, point-in-time recovery, or table spaces, it's clearly time to go bigger. </li></ul>