2011 06-sq lite-forensics


Published in: Technology

  1. 1. SQLite Forensics: The little database that could
  2. 2. About viaForensics
     viaForensics is an innovative digital forensics and security company providing expert services to:
     • Law Enforcement
     • Government Agencies
     • Corporations
     • Attorneys/Individuals
  3. 3. What’s the problem?
     • We want to recover as much data from devices as possible
     • People delete data, mostly the data we want!
     • SQLite is a very popular data storage format
     • Currently no advanced SQLite recovery tool on the market (but stay tuned)
  4. 4. What is SQLite?
     • SQLite is a widely used, lightweight database contained in a single cross-platform file, used by developers for structured data storage
     • Used in most smartphones (iPhone, Android, Symbian, webOS)
     • Used in major operating systems and applications (Apple OS X, Google Chrome and Chrome OS, Firefox)
  5. 5. Why do developers need structured data storage?
     • Applications need to store and retrieve data
     • In the past, and still today, developers have created their own file formats
     • But why reinvent the wheel for basic data storage?
     • SQLite is free, open, high quality and takes care of the messy details
  6. 6. Core SQLite characteristics (from their FAQ)
     • Transactions are atomic, consistent, isolated, and durable (ACID) even after system crashes and power failures.
     • Zero-configuration - no setup or administration needed.
     • A complete database is stored in a single cross-platform disk file.
     • Small code footprint: 190KiB - 325KiB fully configured
     • Cross-platform and easy to port to unsupported systems.
     • Sources are in the public domain. Use for any purpose.
     • Standalone command-line interface (CLI) client
  7. 7. SQL = Structured Query Language
     • SQL is the language used to interact with many databases, including SQLite
     • Basic functions: Create, Read, Update and Delete (CRUD)
     • Transactions: Start a change and it either completes in its entirety (commit) or not at all (rollback)
     • Very powerful, many variations
  8. 8. SQL – basic commands
     • SELECT – queries data from a table or tables
       – SELECT rowid, address, date, text FROM message;
     • INSERT INTO – adds a data row to a table
       – INSERT INTO message (ROWID, address, date, text) VALUES (NULL, '3128781100', 1282844546, 'text message');
     • UPDATE – updates data rows in tables
       – UPDATE message SET date=1282846291 WHERE rowid=4;
     • DELETE – deletes data rows in tables
       – DELETE FROM message WHERE rowid=4;
     • Many good tutorials online
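The four commands above can be tried without touching any evidence file. The following is a minimal sketch using Python's standard sqlite3 module against an in-memory database; the four-column message table is a cut-down stand-in assumed for illustration, not the full iPhone schema.

```python
import sqlite3

# In-memory database with a cut-down, hypothetical version of the message table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message ("
    "ROWID INTEGER PRIMARY KEY AUTOINCREMENT, address TEXT, date INTEGER, text TEXT)"
)

# INSERT: passing NULL for ROWID lets SQLite assign the next auto-increment value.
for n in range(4):
    conn.execute(
        "INSERT INTO message VALUES (NULL, ?, ?, ?)",
        ("3128781100", 1282844546 + n, "text message %d" % (n + 1)),
    )

# UPDATE one row, then read it back with SELECT.
conn.execute("UPDATE message SET date = 1282846291 WHERE rowid = 4")
updated = conn.execute(
    "SELECT rowid, address, date, text FROM message WHERE rowid = 4"
).fetchone()

# DELETE the row and count what is left.
conn.execute("DELETE FROM message WHERE rowid = 4")
remaining = conn.execute("SELECT COUNT(*) FROM message").fetchone()[0]
conn.commit()
conn.close()

print(updated, remaining)
```

Working on a scratch database like this keeps experiments away from the file under examination, which should only ever be opened as a copy.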
  9. 9. Viewing a SQLite database – command line
     • Command line apps
       – sqlite3 for full SQLite functions
       – sqlite3_analyzer for db metadata
     • Linux/Mac/Windows versions
     • Represents latest version
     • Full source code and documentation
     • http://www.sqlite.org/download.html
  10. 10. Example sqlite3 session
      Run sqlite3 on database file
          ahoog@linux-wks-001:~/sqlite$ ./sqlite3 iPhone-3G-313-sms.db
          SQLite version 3.7.4
          Enter ".help" for instructions
          Enter SQL statements terminated with a ";"
          sqlite>
      List tables in database
          sqlite> .tables
          _SqliteDatabaseProperties  msg_group
          group_member               msg_pieces
          message
      Examine schema (structure) of message table
          sqlite> .schema message
          CREATE TABLE message (ROWID INTEGER PRIMARY KEY AUTOINCREMENT, address TEXT, date
          INTEGER, text TEXT, flags INTEGER, replace INTEGER, svc_center TEXT, group_id INTEGER,
          association_id INTEGER, height INTEGER, UIFlags INTEGER, version INTEGER, subject TEXT,
          country TEXT, headers BLOB, recipients BLOB, read INTEGER);
  11. 11. Example sqlite3 session - continued
      View record “4” in 2 formats
          sqlite> .headers on
          sqlite> SELECT * FROM message WHERE ROWID = 4;
          ROWID|address|date|text|flags|replace|svc_center|group_id|association_id|height|UIFlags|version|subject|country|headers|recipients|read
          4|(312) 898-4070|1282844546|Sure is a nice day out |3|0||3|1282844546|0|4|0||us|||1
          sqlite> .mode line
          sqlite> SELECT * FROM message WHERE ROWID = 4;
                   ROWID = 4
                 address = (312) 898-4070
                    date = 1282844546
                    text = Sure is a nice day out
                   flags = 3
                 replace = 0
              svc_center =
                group_id = 3
          association_id = 1282844546
                  height = 0
                 UIFlags = 4
                 version = 0
                 subject =
                 country = us
                 headers =
              recipients =
                    read = 1
  12. 12. sqlite3_analyzer – very useful in forensic analysis
          ahoog@linux-wks-001:~/sqlite$ ./sqlite3_analyzer iPhone-3G-313-sms.db
          /** Disk-Space Utilization Report For iPhone-3G-313-sms.db

          Page size in bytes.................... 2048
          Pages in the whole file (measured).... 14
          Pages in the whole file (calculated).. 14
          Pages that store data................. 13          92.9%
          Pages on the freelist (per header).... 0            0.0%
          Pages on the freelist (calculated).... 0            0.0%
          Pages of auto-vacuum overhead......... 1            7.1%
          Number of tables in the database...... 7
          Number of indices..................... 4
          Number of named indices............... 3
          Automatically generated indices....... 1
          Size of the file in bytes............. 28672
          Bytes of user payload stored.......... 1833         6.4%

          *** Page counts for all tables with their indices ********************

          MESSAGE............................... 3           21.4%
          SQLITE_MASTER......................... 3           21.4%
          _SQLITEDATABASEPROPERTIES............. 2           14.3%
          MSG_PIECES............................ 2           14.3%
          <snip>
  13. 13. Viewing a SQLite database – SQLite Database Browser
      • Freeware, public domain, open source visual tool used to create, design and edit database files compatible with SQLite
      • Windows/Linux/Mac
      • Supports SQLite 3.x
      • Last updated 12/2009
      • http://sqlitebrowser.sourceforge.net/
      • Many other (free) options listed at: http://www.sqlite.org/cvstrac/wiki?p=ManagementTools
  14. 14. Viewing a SQLite database – SQLite Database Browser
  15. 15. Viewing a SQLite table – SQLite Database Browser
  16. 16. SQLite – database header format
      • The first 100 bytes of the database file comprise the database file header.
      • First 5 of 22 fields:
      Offset | Size | Description
      0      | 16   | The header string: "SQLite format 3\000"
      16     | 2    | The database page size in bytes. Must be a power of two between 512 and 32768 inclusive, or the value 1 representing a page size of 65536.
      18     | 1    | File format write version. 1 for legacy; 2 for WAL.
      19     | 1    | File format read version. 1 for legacy; 2 for WAL.
      20     | 1    | Bytes of unused "reserved" space at the end of each page. Usually 0.
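The five fields in the table above can be read straight out of the first 21 bytes of a file. The sketch below assumes Python's struct module and a hypothetical helper named parse_header; it creates a throwaway database with a 2048-byte page size so there is something to parse.

```python
import os
import sqlite3
import struct
import tempfile

MAGIC = b"SQLite format 3\x00"

def parse_header(header: bytes) -> dict:
    """Decode the first five fields of the 100-byte SQLite database header."""
    if len(header) < 100:
        raise ValueError("need the full 100-byte header")
    # Big-endian: 16-byte string, 2-byte page size, then three 1-byte fields.
    magic, page_size, write_ver, read_ver, reserved = struct.unpack(
        ">16sHBBB", header[:21]
    )
    if magic != MAGIC:
        raise ValueError("not an SQLite 3 database")
    return {
        # A stored value of 1 represents a page size of 65536.
        "page_size": 65536 if page_size == 1 else page_size,
        "write_version": write_ver,
        "read_version": read_ver,
        "reserved_bytes": reserved,
    }

# Build a small scratch database to parse (illustration only).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA page_size = 2048")
    conn.execute("CREATE TABLE t (x)")
    conn.commit()
    conn.close()
    with open(path, "rb") as f:
        info = parse_header(f.read(100))

print(info)
```

The same function works on a header carved out of a disk image, since it only needs the raw bytes and never opens the file through SQLite.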
  17. 17. SQLite – Organized in Pages
      • Database consists of one or more pages, logical units which store data
      • Pages are numbered beginning with 1
      • A page is one of the following:
      Page type        | Description
      B-Tree page      | B-Tree pages are part of the tree structures used to store database tables and indexes.
      Overflow page    | Overflow pages are used by particularly large database records that do not fit on a single B-Tree page.
      Free page        | Free pages are pages within the database file that are not being used to store meaningful data. (or so they think!)
      Pointer-map page | Part of the auto-vacuum system
      Locking page     | The page covering file offset 1 GiB, reserved for the file-locking mechanism; it never stores data (SQLite locks the whole file, not individual rows)
  18. 18. B+tree and B-Tree formats – on-disk data structure
      • Data structure which represents sorted data in a way that allows for efficient insertion, retrieval and removal of records
      • Optimized for storage devices (vs. in memory) by minimizing the number of disk accesses.
      • In a B+tree, all data is stored in the leaves of the tree instead of in both the leaves and the intermediate branch nodes.
      • A single B-Tree structure is stored using one or more database pages. Each page contains a single B-Tree node.
  19. 19. B+Tree graphical representation
      http://www.sqlite.org/fileformat.html#table_btrees
  20. 20. SQLite storage classes and data types
      • Only 5 storage classes/data types:
        1. NULL: The value is a NULL value.
        2. INTEGER: The value is a signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value.
        3. REAL: The value is a floating point value, stored as an 8-byte IEEE floating point number.
        4. TEXT: The value is a text string, stored using the database encoding (UTF-8, UTF-16BE or UTF-16LE).
        5. BLOB: The value is a blob of data, stored exactly as it was input. Often used to store binary data
  21. 21. SQLite storage classes – on disk example
      • 5 storage classes in hex on disk:
      • NULL: 0x00
      • INTEGER (4-byte): 0x4c76a782 = 1282844546
      • REAL: 0x41B1EC2EC004D9D7 = 300691136.018949
        – http://babbage.cs.qc.edu/IEEE-754/64bit.html
      • TEXT (ASCII): 0x53757265206973 = Sure is
      • BLOB: hard to represent binary here…see Text
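The conversions on this slide can be reproduced without an online calculator. This is a small sketch using Python's struct module, assuming big-endian byte order as SQLite stores INTEGER and REAL values.

```python
import struct

# INTEGER: 4-byte big-endian signed integer.
integer = struct.unpack(">i", bytes.fromhex("4c76a782"))[0]

# REAL: 8-byte big-endian IEEE 754 double.
real = struct.unpack(">d", bytes.fromhex("41B1EC2EC004D9D7"))[0]

# TEXT: raw bytes decoded with the database encoding (UTF-8 assumed here).
text = bytes.fromhex("53757265206973").decode("utf-8")

print(integer, real, repr(text))
```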
  22. 22. Variable Integers – saving space, adding confusion
      • A variable-length integer or "varint" uses less space for small positive values.
      • Used in SQLite metadata (row headers, b-tree indexes, etc.)
      • A varint is between 1 and 9 bytes in length.
      • The varint consists of either zero or more bytes which have the high-order bit set followed by a single byte with the high-order bit clear, or nine bytes, whichever is shorter. The lower seven bits of each of the first eight bytes and all 8 bits of the ninth byte are used to reconstruct the 64-bit twos-complement integer.
      • Varints are big-endian: bits taken from the earlier bytes of the varint are more significant than bits taken from the later bytes.
      • http://www.sqlite.org/fileformat.html#varint_format
      • Clear? How about an example ->
  23. 23. Variable Integers – example
      • Let’s say you find the following hex varint: 0x8CA06F
        – Examine each byte: if it is >= 0x80 (high-order bit set), it is not the last byte
        – So, we have 3 bytes: 0x8C 0xA0 0x6F (since 0x6F < 0x80 it’s the last byte). Here’s how to convert:
      Original bytes    | 0x8C      | 0xA0      | 0x6F
      Convert to binary | 1000 1100 | 1010 0000 | 0110 1111
      Remove MSB*       |  000 1100 |  010 0000 |  110 1111
      Concatenate       | 000110001000001101111
      In hex/decimal    | Hex: 0x03106F   Decimal: 200,815
      * MSB: Most significant bit (left most bit)
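The conversion steps above translate into a few lines of code. The following is a sketch of a varint decoder in Python; the function name read_varint is chosen here for illustration and is not part of any SQLite API.

```python
def read_varint(buf: bytes, offset: int = 0):
    """Decode one SQLite varint; return (value, number_of_bytes_consumed)."""
    value = 0
    for i in range(9):
        byte = buf[offset + i]
        if i == 8:
            # The ninth byte contributes all 8 of its bits.
            value = (value << 8) | byte
            break
        # Bytes 1-8 contribute their lower 7 bits, most significant first.
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:
            # High-order bit clear: this was the last byte.
            break
    # Interpret the result as a 64-bit twos-complement integer.
    if value >= 1 << 63:
        value -= 1 << 64
    return value, i + 1

value, length = read_varint(bytes.fromhex("8CA06F"))
print(value, length)
```

Returning the byte count alongside the value matters in practice, because varints are packed back to back in record headers and the caller needs to know where the next one starts.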
  24. 24. Freelist / Free page list
      • When information is deleted from the database, pages containing that data are no longer in active use.
      • Unused pages are stored on the freelist and are reused when additional pages are required.
      • Forensic value: “Freelist leaf pages contain no information. SQLite avoids reading or writing freelist leaf pages in order to reduce disk I/O.” Because these pages are not rewritten when they are freed, the deleted content typically remains on them until the page is reused.
  25. 25. Rollback journal
      • Created when a database is going to be updated
      • Before a page is modified, its original unmodified content is written into the rollback journal.
      • The rollback journal is always located in the same directory as the database file and has the same name as the database file but with the string "-journal" appended
      • Excellent source of forensic data if recoverable
      • Recoverable on many systems though some are now writing to tmpfs/RAM disks
  26. 26. Write Ahead Log (WAL)
      • New technique just introduced in 3.7.0
      • Generally faster and disk I/O is more sequential (which helps us in advanced recovery)
      • All changes to the database are recorded by writing frames into the WAL.
      • Transactions commit when a frame is written that contains a commit marker.
      • A single WAL can and usually does record multiple transactions.
      • Periodically, the content of the WAL is transferred back into the database file in an operation called a "checkpoint".
      • Forensic value: recovery of WAL files
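Since both the rollback journal and the WAL live next to the database file, a first triage step is simply to look for them. The sketch below assumes the standard suffixes "-journal" and "-wal" and a hypothetical helper named companion_files; the demo files are dummies created only to exercise it.

```python
import os
import tempfile

def companion_files(db_path: str) -> dict:
    """Return {path: size_in_bytes} for journal/WAL files found beside db_path."""
    found = {}
    for suffix in ("-journal", "-wal"):
        candidate = db_path + suffix
        if os.path.isfile(candidate):
            found[candidate] = os.path.getsize(candidate)
    return found

# Demo with dummy files standing in for a database and its rollback journal.
with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "sms.db")
    for name, data in ((db, b"x"), (db + "-journal", b"abc")):
        with open(name, "wb") as f:
            f.write(data)
    found = companion_files(db)
    journal_name = os.path.basename(next(iter(found)))

print(journal_name, list(found.values()))
```

This only covers files that are still allocated; deleted journals would need to be carved from unallocated space.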
  27. 27. Record Format
      • A record contains a header and a body, in that order. The header:
        – begins with a single varint which determines the total number of bytes in the header. The varint value is the size of the header in bytes including the size varint itself.
        – Following the size varint are one or more additional varints, one per column. These additional varints are called "serial type" numbers and determine the datatype of each column
        – After the final header varint, the record data immediately follows
        – The bytes immediately before the header hold two more varints: the payload length, then the rowid (the auto-increment integer assigned by the system). In small records these occupy 2 bytes.
  28. 28. Record Format – visual representation
      • http://www.sqlite.org/fileformat.html#record_format
  29. 29. Record Format
      Header Value   | Data type and size
      0              | An SQL NULL value (type SQLITE_NULL). This value consumes zero bytes of space in the record's data area.
      1              | An SQL integer value (type SQLITE_INTEGER), stored as a big-endian 1-byte signed integer.
      2              | An SQL integer value (type SQLITE_INTEGER), stored as a big-endian 2-byte signed integer.
      3              | An SQL integer value (type SQLITE_INTEGER), stored as a big-endian 3-byte signed integer.
      4              | An SQL integer value (type SQLITE_INTEGER), stored as a big-endian 4-byte signed integer.
      5              | An SQL integer value (type SQLITE_INTEGER), stored as a big-endian 6-byte signed integer.
      6              | An SQL integer value (type SQLITE_INTEGER), stored as a big-endian 8-byte signed integer.
      7              | An SQL real value (type SQLITE_FLOAT), stored as an 8-byte IEEE floating point value.
      8              | The literal SQL integer 0 (type SQLITE_INTEGER). The value consumes zero bytes of space in the record's data area. Values of this type are only present in databases with a schema file format (the 32-bit integer at byte offset 44 of the database header) value of 4 or greater. (iOS4 uses this)
      9              | The literal SQL integer 1 (type SQLITE_INTEGER). The value consumes zero bytes of space in the record's data area. Values of this type are only present in databases with a schema file format (the 32-bit integer at byte offset 44 of the database header) value of 4 or greater. (iOS4 uses this)
      10,11          | Not used. Reserved for expansion.
      bytes * 2 + 12 | Even values greater than or equal to 12 are used to signify a blob of data (type SQLITE_BLOB) (n-12)/2 bytes in length, where n is the integer value stored in the record header.
      bytes * 2 + 13 | Odd values greater than 12 are used to signify a string (type SQLITE_TEXT) (n-13)/2 bytes in length, where n is the integer value stored in the record header.
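The serial type table above reduces to a small lookup plus two formulas. Here is a sketch in Python; serial_type_info is a name invented for this example.

```python
# Serial types 0-9 have fixed meanings and sizes (in bytes of record body).
FIXED_TYPES = {
    0: ("NULL", 0),
    1: ("INTEGER", 1),
    2: ("INTEGER", 2),
    3: ("INTEGER", 3),
    4: ("INTEGER", 4),
    5: ("INTEGER", 6),
    6: ("INTEGER", 8),
    7: ("REAL", 8),
    8: ("INTEGER literal 0", 0),
    9: ("INTEGER literal 1", 0),
}

def serial_type_info(n: int):
    """Map a record-header serial type to (storage class, body size in bytes)."""
    if n in FIXED_TYPES:
        return FIXED_TYPES[n]
    if n < 0 or n in (10, 11):
        raise ValueError("serial type %d is reserved or invalid" % n)
    if n % 2 == 0:
        return ("BLOB", (n - 12) // 2)   # even values >= 12
    return ("TEXT", (n - 13) // 2)       # odd values >= 13

# 0x29 = 41 is an odd value, so it describes a text string.
print(serial_type_info(0x29))
```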
  30. 30. Recovery from allocated SQLite with strings
          ahoog@linux-wks-001:~/sqlite$ strings iPhone-3G-313-sms.db | less
          <snip>
          msg_group
          (314) 267-6611us
          (920) 277-1869us
          (312) 898-4070us
          (312) 401-1679us
          (414) 331-5030us
          Piece of cake! Cant wait to try em out on Sunday
          text/plain2text_0002.txt
          image/jpeg1IMG_6807.jpg
          ?Check out mccalister
          text/plain2text_0002.txt
          image/jpeg1IMG_6807.jpg
          <snip>
  31. 31. Carving SQLite files
      • File header readily identifiable
          0000000: 5351 4c69 7465 2066 6f72 6d61 7420 3300  SQLite format 3.
          0000010: 0800 0101 0040 2020 0000 00fc 0000 0000  .....@  ........
          0000020: 0000 0000 0000 0000 0000 000f 0000 0001  ................
      • Sample scalpel entry:
          # extension  case       size    header            footer
          #            sensitive
          sqlitedb     y          819200  SQLite\x20format
      • Other tools like FTK/EnCase also carve
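The idea behind the scalpel entry, scanning raw data for the header string, can be sketched in a few lines of Python. This only locates candidate offsets; it does not extract files or apply a maximum size the way a real carver does, and find_sqlite_headers is a hypothetical name.

```python
MAGIC = b"SQLite format 3\x00"

def find_sqlite_headers(image: bytes):
    """Return every offset in a raw image where the SQLite header string occurs."""
    offsets = []
    pos = image.find(MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = image.find(MAGIC, pos + 1)
    return offsets

# Synthetic "disk image": padding, a header, filler bytes, a second header.
image = b"\x00" * 512 + MAGIC + b"\xff" * 1002 + MAGIC + b"\x00" * 100
offsets = find_sqlite_headers(image)
print(offsets)
```

For a real image the data would be read in chunks (or memory-mapped) rather than loaded whole, and each hit would then be checked against the rest of the header before being treated as a database.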
  32. 32. Carving SQLite files – OS specific findings
      • iOS
        – Good recovery of both allocated and “latent” SQLite files
      • Android
        – Excellent recovery but high repetition due to log-structured file system repeating SQLite header
      • Other common file systems
        – Good recovery from typical magnetic media devices running FAT, FAT32, NTFS, HFS, etc.
  33. 33. SQLite in Hex (really the only way to look at it)
          0002270: 0000 0000 0000 0000 004d 0d12 0029 0445  .........M...).E
          0002280: 0101 0001 0401 0101 0011 0000 0128 3331  .............(31
          0002290: 3229 2038 3938 2d34 3037 304c 77d8 a257  2) 898-4070Lw..W
          00022a0: 696c 6c20 796f 7520 676f 2067 6574 206d  ill you go get m
          00022b0: 6520 6120 636f 6666 6565 3f03 0003 4c77  e a coffee?...Lw
          00022c0: d8a2 0000 0075 7301 3f0c 1200 2904 2901  .....us.?...).).
      Name           | Type    | Header | Converted        | Body Value / notes
      Rowid – actual | Varint  | 0x0d   | 13               | So rowid = 13
      Header Size    | Varint  | 0x12   | 18               | Length of header is 18 bytes (header size + 17 columns)
      Rowid          | NULL    | 0x00   | 0                | NULL tells SQLite on insert to determine next auto increment
      Address        | Text    | 0x29   | (41 - 13)/2 = 14 | (312) 898-4070 [14 chars - convert 0x29 to decimal, calc size]
      Date           | Integer | 0x04   | 4-byte integer   | 0x4c77d8a2 in decimal is 1282922658 [recognize number format?]
      Text           | Text    | 0x45   | (69 - 13)/2 = 28 | Will you go get me a coffee?
      Flags          | Integer | 0x01   | 1-byte integer   | 0x03 = 3
      Replace        | Integer | 0x01   | 1-byte integer   | 0x00 = 0
      Svc_center     | Text    | 0x00   | NULL             | No value, not represented in data at all
      Group_id       | Integer | 0x01   | 1-byte integer   | 0x03 = 3
      Association_id | Integer | 0x04   | 4-byte integer   | 0x4c77d8a2 in decimal is 1282922658 [recognize number format?]
      Height         | Integer | 0x01   | 1-byte integer   | 0x00 = 0
      UIFlags        | Integer | 0x01   | 1-byte integer   | 0x00 = 0
      Version        | Integer | 0x01   | 1-byte integer   | 0x00 = 0
      Subject        | Text    | 0x00   | NULL             | No value, not represented in data at all
      Country        | Text    | 0x11   | (17 - 13)/2 = 2  | us
      Headers        | Blob    | 0x00   | NULL             | No value, not represented in data at all
      Recipients     | Blob    | 0x00   | NULL             | No value, not represented in data at all
      Read           | Integer | 0x01   | 1-byte integer   | 1 [Last data byte]
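The manual walk-through above can be checked by decoding the same bytes programmatically. The sketch below copies the 79 bytes of the cell from the hex dump (starting at the 0x4d payload-length byte) and parses them in Python. The function names are invented for this example, and treating the date as Unix epoch seconds is an assumption suggested by the slide's "recognize number format?" hint.

```python
import datetime
import struct

# Cell bytes copied from the hex dump, from 0x4d up to the final 0x01.
CELL = bytes.fromhex(
    "4d0d1200290445"
    "01010001040101010011000001283331"
    "3229203839382d343037304c77d8a257"
    "696c6c20796f7520676f20676574206d"
    "65206120636f666665653f0300034c77"
    "d8a2000000757301"
)
COLUMNS = ["ROWID", "address", "date", "text", "flags", "replace", "svc_center",
           "group_id", "association_id", "height", "UIFlags", "version",
           "subject", "country", "headers", "recipients", "read"]
INT_SIZES = (1, 2, 3, 4, 6, 8)  # body sizes for serial types 1..6

def read_varint(buf, pos):
    """Return (unsigned value, position after the varint)."""
    value = 0
    for i in range(9):
        byte = buf[pos + i]
        if i == 8:
            value = (value << 8) | byte
            break
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:
            break
    return value, pos + i + 1

def parse_leaf_cell(cell):
    """Decode payload length, rowid and column values from a table leaf cell."""
    payload_len, pos = read_varint(cell, 0)
    rowid, pos = read_varint(cell, pos)
    header_start = pos
    header_len, pos = read_varint(cell, pos)
    serial_types = []
    while pos < header_start + header_len:
        st, pos = read_varint(cell, pos)
        serial_types.append(st)
    values = []
    for st in serial_types:  # pos now points at the record body
        if st == 0:
            values.append(None)
        elif 1 <= st <= 6:
            size = INT_SIZES[st - 1]
            values.append(int.from_bytes(cell[pos:pos + size], "big", signed=True))
            pos += size
        elif st == 7:
            values.append(struct.unpack(">d", cell[pos:pos + 8])[0])
            pos += 8
        elif st in (8, 9):
            values.append(st - 8)
        elif st >= 12:
            size = (st - 12) // 2  # same result as (st - 13) // 2 for odd st
            raw = cell[pos:pos + size]
            pos += size
            values.append(raw if st % 2 == 0 else raw.decode("utf-8"))
        else:
            raise ValueError("reserved serial type %d" % st)
    return payload_len, rowid, values

payload_len, rowid, values = parse_leaf_cell(CELL)
row = dict(zip(COLUMNS, values))
row["ROWID"] = rowid  # the record stores NULL here; the real key is the cell rowid
# Assumption: the date column holds Unix epoch seconds.
when = datetime.datetime.fromtimestamp(row["date"], datetime.timezone.utc)
print(payload_len, rowid, row["address"], row["text"], when.isoformat())
```

A recovery tool built on this structure would run the same decoding at many candidate offsets and keep only the results that look plausible, which is the approach the next slide describes.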
  34. 34. Advanced Technique
      • Use the well-defined SQLite structure to develop a program to recover SQLite rows
      • Row header and data values “decay” over time due to
        – Being (partially) re-allocated
        – Fragmentation
        – We compensated for this with a simple probability engine which determines the likelihood that a sequence of bytes represents the row header we are interested in
      • Underlying file system can have great impact, from FAT to HFS Plus (iPhone) and YAFFS2 (Android)
      • Look for journal files and WAL data too
  35. 35. Contact Us
      Andrew Hoog, CIO
      ahoog@viaforensics.com
      http://viaforensics.com
      1000 Lake St, Suite 203
      Oak Park, IL 60301
      Tel: 312-878-1100 | Fax: 312-268-7281