Presentation Agenda <ul><li>Brief History
Technology Overview
The Storage Engine Adventure
Roadmap
The Team
Wrap-Up </li></ul>
Brief BlackRay History
What is BlackRay? <ul><li>BlackRay is a relational, in-memory database
Supports SQL, utilizes PostgreSQL drivers
Fulltext (Tokenized) Search in Text fields
Object-Oriented API Support
Persistence via Files, Transaction support
Scalable and Fault Tolerant
Open Source, Open Community
Available under the GPLv2 </li></ul>
BlackRay History <ul><li>Concept of BlackRay was developed in 1999 for a Web Phone Directory at Deutsche Telekom
Development of current BlackRay started in 2005, as a Data Access Library
First Production Use at Deutsche Telekom in 2006
Evolution to Data Engine in 2007 and 2008
Open Source under GPLv2 since June 2009 </li></ul>
Why Build Another Database?  <ul><li>Rather unique set of requirements: </li><ul><li>Phone Directory with approx. 80 Milli...
All queries in the 10-500 Millisecond range
Approximately 2000 concurrent users
Over 500 Queries per Second (sustained)
Updates once a day may only take several minutes
Index needs to be tokenized (SQL: CONTAINS)
Phonetic search
Extensive Wildcard queries (leading/midspan/trailing) </li></ul></ul>
Decision to Implement BlackRay <ul><li>Decision was formed in Mid 2005
Designed as a lightweight data access engine
Implementation in C++ for performance and maintainability
APIs for access across the network from different languages (Java, C++)
Feature set was designed for our specific business case in a particular project </li></ul>
Current Release <ul>Current 0.10.0 – Released December 2009 <ul><li>Complete rewrite of SQL Parser (boost::spirit2)
PostgreSQL client compatibility (via network protocol) to allow JDBC/ODBC... via PostgreSQL driver
Rewritten CLI tools
Major bugfixes (potential memory leaks)
Better Authentication suppor for Instances </li></ul></ul>
Some more details <ul><li>Written in C++
Relies heavily on boost
Compiles well with gcc and Sun Studio 12
Behaves well on Linux, Solaris, OpenSolaris and MacOS
Complete 64 Bit development and platform support
Use of cmake for multi platform support </li></ul>
Upcoming SlideShare
Loading in …5
×

The Adventure: BlackRay as a Storage Engine

1,378 views
1,306 views

Published on

This is Felix' talk at the 2010 edition of MySQL Con in Santa Clara, CA

0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,378
On SlideShare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
Downloads
8
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

The Adventure: BlackRay as a Storage Engine

  1. 2. Presentation Agenda <ul><li>Brief History
  2. 3. Technology Overview
  3. 4. The Storage Engine Adventure
  4. 5. Roadmap
  5. 6. The Team
  6. 7. Wrap-Up </li></ul>
  7. 8. Brief BlackRay History
  8. 9. What is BlackRay? <ul><li>BlackRay is a relational, in-memory database
  9. 10. Supports SQL, utilizes PostgreSQL drivers
  10. 11. Fulltext (Tokenized) Search in Text fields
  11. 12. Object-Oriented API Support
  12. 13. Persistence via Files, Transaction support
  13. 14. Scalable and Fault Tolerant
  14. 15. Open Source, Open Community
  15. 16. Available under the GPLv2 </li></ul>
  16. 17. BlackRay History <ul><li>Concept of BlackRay was developed in 1999 for a Web Phone Directory at Deutsche Telekom
  17. 18. Development of current BlackRay started in 2005, as a Data Access Library
  18. 19. First Production Use at Deutsche Telekom in 2006
  19. 20. Evolution to Data Engine in 2007 and 2008
  20. 21. Open Source under GPLv2 since June 2009 </li></ul>
  21. 22. Why Build Another Database? <ul><li>Rather unique set of requirements: </li><ul><li>Phone Directory with approx. 80 Million subscribers
  22. 23. All queries in the 10-500 Millisecond range
  23. 24. Approximately 2000 concurrent users
  24. 25. Over 500 Queries per Second (sustained)
  25. 26. Updates once a day may only take several minutes
  26. 27. Index needs to be tokenized (SQL: CONTAINS)
  27. 28. Phonetic search
  28. 29. Extensive Wildcard queries (leading/midspan/trailing) </li></ul></ul>
  29. 30. Decision to Implement BlackRay <ul><li>Decision was formed in Mid 2005
  30. 31. Designed as a lightweight data access engine
  31. 32. Implementation in C++ for performance and maintainability
  32. 33. APIs for access across the network from different languages (Java, C++)
  33. 34. Feature set was designed for our specific business case in a particular project </li></ul>
  34. 35. Current Release <ul>Current 0.10.0 – Released December 2009 <ul><li>Complete rewrite of SQL Parser (boost::spirit2)
  35. 36. PostgreSQL client compatibility (via network protocol) to allow JDBC/ODBC... via PostgreSQL driver
  36. 37. Rewritten CLI tools
  37. 38. Major bugfixes (potential memory leaks)
  38. 39. Better Authentication suppor for Instances </li></ul></ul>
  39. 40. Some more details <ul><li>Written in C++
  40. 41. Relies heavily on boost
  41. 42. Compiles well with gcc and Sun Studio 12
  42. 43. Behaves well on Linux, Solaris, OpenSolaris and MacOS
  43. 44. Complete 64 Bit development and platform support
  44. 45. Use of cmake for multi platform support </li></ul>
  45. 46. Technology Overview
  46. 47. Why call it Data Engine? <ul><li>BlackRay is a hybrid between a relational database and a search engine -> thus we call it „ data engine “
  47. 48. Database features: </li><ul><li>Relational structure, with Join between tables
  48. 49. Wildcards and index functions
  49. 50. SQL and JDBC/ODBC </li></ul><li>Search Engine Features </li><ul><li>Fulltext retrieval (token index)
  50. 51. Phonetic and similar approximation search
  51. 52. Extremely low latency </li></ul></ul>
  52. 53. BlackRay Architecture Redo Log Snapshots
  53. 54. Data Universe <ul><li>BlackRay features a 5-Perspective Index </li><ul><li>Layer 1: Dictionary
  54. 55. Layer 2: Postings
  55. 56. Layer 3: Row Index
  56. 57. Layer 4: Multi-Token Layer
  57. 58. Layer 5: Multi-Value Layer </li></ul><li>Layer 1 and 2 comprise a fully inverted Index
  58. 59. Statistics in this Index used for Query Plan Building
  59. 60. All data - index and raw output - are held in memory </li></ul>
  60. 61. Getting Data Into BlackRay <ul><li>Loading requires a running BlackRay Instance
  61. 62. Schemas, Tables, and Indexes are created using an XML description language
  62. 63. Standard loader utility to load CSV data
  63. 64. Bulk loading is done with logging disabled (no single transactions for better performance)
  64. 65. Indexing is done with maximum degree of parallelism depending on CPUs
  65. 66. After all data is indexed, a snapshot can be taken </li></ul>
  66. 67. Basic Load Performance Data <ul><li>German yellowpage and whitepage data </li><ul><li>60 Million subscribers
  67. 68. 100 Million phone numbers
  68. 69. Raw data approx 16GB </li></ul><li>Indexing performance </li><ul><li>Total index size 11GB
  69. 70. Time to index: 33 Minutes, on dual 2GHz Xeon (Linux)
  70. 71. Updates: 500MB, 400K rows in approx 4 minutes </li></ul><li>Time to load snapshot: 200 Seconds for 11GB (limited only by HDD read performance) </li></ul>
  71. 72. Snapshots and Data Versioning <ul><li>Persistence is done via file based snapshots
  72. 73. A Snapshot is one flat file
  73. 74. Snapshots consist of all schemas in one instance
  74. 75. Snapshots have a version number
  75. 76. To make a backup of the data, simply copy the snapshot file to a backup media
  76. 77. It is possible to load an older snapshot: Data is version controlled if older snapshots are stored
  77. 78. Note: Currently Snapshots are not fully portable. </li></ul>
  78. 79. Transactions in BlackRay <ul><li>BlackRay supports transactions via a Redo Log
  79. 80. All commands that modify data are logged if requested
  80. 81. In case of a crash, the latest snapshot will be loaded
  81. 82. Replay of the transaction log will then bring the database back to a consistent state
  82. 83. Redo Log is eliminated when a snapshot is persisted
  83. 84. For better performance snapshots should be taken periodically, ideally after each bulk update </li></ul>
  84. 85. Native Query APIs <ul><li>C++, Java and Python Object Oriented APIs are available
  85. 86. Built with ICE (ZeroC) as the Network and Object Brokerage Protocol
  86. 87. Query Objects are constructed using an Object Builder
  87. 88. Execution via the network, Results as Objects
  88. 89. Load balacing and failover built into the protocol
  89. 90. Native support for Ruby, PHP and C# upcoming </li></ul>
  90. 91. Native Query APIs <ul><li>Queries can use any combination of OR, AND
  91. 92. Index functions (phonetic, sysnonyms, stopwords) can be stacked
  92. 93. Token search (fulltext) and wildcard are also supported
  93. 94. Advantage of the APIs: </li><ul><li>Minimized overhead
  94. 95. Very low latency
  95. 96. High availability </li></ul></ul>
  96. 97. SQL Interface <ul><li>BlackRay implements the PostgreSQL server socket interface
  97. 98. All PostgreSQL compatible frontends can be utilized against BlackRay
  98. 99. JDBC, ODBC, native drivers using socket interface are all supported
  99. 100. Limitations: SQL is only a reasonable subset of SQL92, Metadata is not yet fully implemented...
  100. 101. Currently not all index functions can be accessed via SQL statements </li></ul>
  101. 102. Management Features <ul><li>Management Server acts as central broker for all Instances
  102. 103. Command line tools for administration
  103. 104. SNMP management: </li><ul><li>Health check of Instances
  104. 105. Statistics , including access counters and performance measurements </li></ul></ul>
  105. 106. Clustering and Replication <ul><li>BlackRay supports a multi-node setup for fault tolerance
  106. 107. Cluster have N query nodes and single update node
  107. 108. All query nodes are equal and require the full index data available
  108. 109. Index creation is done offline on the update node
  109. 110. Query nodes can reload snapshots to receive updated data, via shared storage or local copy
  110. 111. Load-balancing handled by native APIs </li></ul>
  111. 112. Administration <ul><li>In-Memory Databases require very little administrative tasks
  112. 113. Configuration via one file per Instance
  113. 114. Disk layout etc all are of no importance
  114. 115. Backups are performed by copying snapshots
  115. 116. Recovery is done by restoring a snapshot
  116. 117. No daily administration required
  117. 118. SNMP allows remote supervision with common tools </li></ul>
  118. 119. Our Adventure: BlackRay As A Storage Engine
  119. 120. Why even bother? <ul><li>In Fall 2009, we embarked on a little adventure to implement BlackRay as a storage engine
  120. 121. The old Engine had only a minimal SQL interface and we lacked the expertise to build it ourselves
  121. 122. Plugging into the MySQL ecosystem seemed like a very pleasant choice
  122. 123. The features of BlackRay would make it a good query cache for large disk tables..... </li></ul>
  123. 124. Our First Problem <ul>BlackRay does not support a simple table scan..... <ul><li>It may seem strange, but due to it's design as an in-memory index, we do not separate table and index
  124. 125. Each column index basically is the data of the column
  125. 126. BlackRay distinguishes select and output columns, both of which remain in RAM
  126. 127. The index therefore was never designed to be forced back into a row format, for a simple table walk </li></ul></ul>
  127. 128. Possible Solution? <ul>So, we can walk the Index instead? <ul><li>Rather than scanning the table, it is possible to scan the index instead
  128. 129. This only works for the columns markes „searchable“
  129. 130. Causes nasty errors when trying to select against result-only columns
  130. 131. In tokenized index columns, getting the data back out means concatenation with a blank between values – not nice, as tokenizing can follow complex rules
  131. 132. Requires Refactoring of our Layer 3 (Row-Index) </li></ul></ul>
  132. 133. Next Issue <ul>Optimizing Queries <ul><li>The BlackRay Optimizer uses the Layer 1/2 (Inverted Index) and Layer 4 (Multi-Tokens) Data to chose a Query Path
  133. 134. In BlackRay „SELECT text FROM t WHERE text LIKE '*pattern1*' AND text LIKE '*pattern2* is extremely efficient as the inverted index has all the data
  134. 135. Even with OR this is an efficient Query, due to the fact that we can immediately chose the smaller query first and eliminate double matches </li></ul></ul>
  135. 136. Next Issue <ul><li>Optimizing in the Storage Engine Interface? </li><ul><li>In BlackRay, the Optimizer uses the AST from the SQL Parser to figure out what to optimize
  136. 137. Based on a field or single Index level, the number of matches really are not useful
  137. 138. Without utilizing the Layer2 and Layer4 structures, we lose performance by several orders of magnitude
  138. 139. Personal Opinion: The MySQL Optimizer really seems to like table scans, and tricking it with random vs sequential read cost did not do the trick </li></ul></ul>
  139. 140. Functions in the Index <ul><li>Columns can take functions to be used on the data upon indexing, and when select is carried out
  140. 141. The most common functions are </li><ul><ul><li>TOKENIZE – to support multi-token indexes
  141. 142. PHONETIC – match against defined phonetic rules
  142. 143. ALIAS – match a token against words with similar meaning </li></ul></ul><li>Internally these functions could be considered Meta-Columns on the Index
  143. 144. To be able to chose the proper column, we need to know what function was used in the select </li></ul>
  144. 145. Functions in the Index <ul>Consider this Query: <ul>SELECT text FROM t WHERE text LIKE fx_phonetic('maier') </ul><li>Functions can take more than one parameter, and may be nested
  145. 146. We could not quite figure out how to explain this to the MySQL Parser
  146. 147. The function data would need to be available to the Index to chose where to look </li></ul>
  147. 148. Threading Models... <ul><li>BlackRay has a highly optimized Threading Model
  148. 149. In RAM, we do not expect I/O-waits, so a model of two dedicated Threads per CPU core works really well
  149. 150. Locking in the Index is built around this model
  150. 151. „One Thread per Conection“ requires at least a careful review of the way critical data structures are accessed </li></ul>
  151. 152. .... Our Conclusion <ul><li>Currently, BlackRay really does not fit too well into the storage engingine architecture
  152. 153. Did we lose all hope? Absolutely not.....
  153. 154. BlackRay Applications could really benefit from being able to utilize MySQL features, including the Archive Engine as well as temporary tables in Heap
  154. 155. Thanks to the excellent Blog and postings by Brian Aker, which allowed us to not make all beginner mistakes ourselves </li></ul>
  155. 156. Lessons Learned <ul><li>We need to spend more time on this topic to get the issues worked out
  156. 157. Most of the issues require a significant change to the way BlackRay works
  157. 158. Without modifications of the way the Optimizer can be interacted with, this makes very little sense </li></ul>
  158. 159. Possible Alternatives (Best Bet) <ul><li>BlackRay supports the PostgreSQL interfaces
  159. 160. Queries could be fed through a BlackRay frontend, passing parts of the Query to PostgreSQL (or even MySQL) – this can be done with a rather simple JDBC intermediate layer
  160. 161. It might be even more straightforward to modify the MySQL SQL processor to directly interact with the BlackRay core, allowing better interaction with the BlackRay Optimizer </li></ul>
  161. 162. Project Roadmap
  162. 163. Immediate Roadmap <ul><li>Planned 0.11.0 – Due in July 2010 </li><ul><li>Support for ad-hoc data manipulation via SQL (INSERT/UPDATE/DELETE)
  163. 164. Pluggable Function architecture (loadable libraries)
  164. 165. Make all index functions available in SQL
  165. 166. Support for Prepared Statements (ODBC/JDBC)
  166. 167. Improved thread and memory management (Perftools?) </li></ul><li>BlackRay Admin Console (Remora) 0.11 </li><ul><li>Engine Statistics via GUI
  167. 168. Cluster Node management </li></ul></ul>
  168. 169. Midterm Roadmap <ul><li>Scalability Features </li><ul><li>Sharding & Partitioning Options
  169. 170. Federated Search </li></ul><li>Fully portable snapshot format (across platforms)
  170. 171. Query Performance Analyzer
  171. 172. Improved Statistics Module with GUI
  172. 173. Removal of ICE as a core component
  173. 174. BlackRay as a Storage Backend for SUN OpenDS LDAP Engine </li></ul>
  174. 175. Midterm Roadmap <ul><li>Security Features </li><ul><li>Improved User and Access Control concepts
  175. 176. SSL for all connections
  176. 177. External User Store (LDAP/OpenSSO/PAM...) </li></ul><li>Increased Platform support </li><ul><li>Windows 7 and Windows Server platforms
  177. 178. Embedded platforms </li></ul><li>Other, random features by popular request. </li></ul>
  178. 179. The Team behind BlackRay
  179. 180. SoftMethod GmbH <ul><li>SoftMethod GmbH initiated the project in 2005
  180. 181. Company was founded in 2004 and currently has 10 employees
  181. 182. Focus of SoftMethod is high performance software engineering
  182. 183. Product portfolio includes telco/contact center and LDAP applications
  183. 184. SoftMethod also offers load testing and technical software quality assurance support. </li></ul>
  184. 185. Development Team <ul><li>SoftMethod GmbH initiated the project in 2005 </li><ul><li>Thomas Wunschel, Architect and Lead Developer
  185. 186. Felix Schupp, Initiator and Project Sponsor
  186. 187. Mike Alexeev, Key Contributor (SQL/Functions)
  187. 188. Souvik Roy, Performance Analysis and Tools </li></ul></ul>
  188. 189. Wrap-Up
  189. 190. What to do next <ul><li>Get BlackRay: </li><ul><li>Register yourself on http://forge.softmethod.de
  190. 191. SVN checkout available at http://svn.softmethod.de/opensource/blackray/trunk </li></ul><li>Get Involved </li><ul><li>Anyone can register and create tickets, news etc
  191. 192. We have an active mailing list for discussion as well </li></ul><li>Contribute </li><ul><li>We require a signed Contributor agreement before being allowed commit access to the repository </li></ul></ul>
  192. 193. Contact Us <ul><li>Website: http://www.blackray.org
  193. 194. Twitter: http://twitter.com/dataengine
  194. 195. Facebook http://facebook.com/dataengine
  195. 196. Mailing List: http://lists.softmethod.de
  196. 197. Download: http://sourceforge.net/projects/blackray
  197. 198. Felix: [email_address]
  198. 199. Thomas: [email_address] </li></ul>

×