Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Реляционные или нереляционные (Josh Berkus)

2,191 views

Published on

Published in: Technology
  • Be the first to comment

Реляционные или нереляционные (Josh Berkus)

  1. 1. Relational vs. Non-Relational Josh Berkus PostgreSQL Experts Inc. Open Source Bridge 2010
  2. 2. 2003: <ul><li>MySQL
  3. 3. PostgreSQL
  4. 4. FireBird
  5. 5. BerkeleyDB
  6. 6. Derby
  7. 7. HSQLDB
  8. 8. SQLite </li></ul>
  9. 9. 2003: <ul><li>MySQL
  10. 10. PostgreSQL
  11. 11. FireBird
  12. 12. BerkeleyDB
  13. 13. Derby
  14. 14. HSQLDB
  15. 15. SQLite </li></ul>
  16. 16. Postgres vs. MySQL
  17. 17. 2010 <ul><li>MySQL
  18. 18. PostgreSQL
  19. 19. FireBird
  20. 20. BerkeleyDB
  21. 21. Derby
  22. 22. HSQLDB
  23. 23. SQLite </li></ul>
  24. 24. “ NoSQL movement”
  25. 25. All non-relational databases
  26. 26. Are not the same
  27. 27. Graph Key-value Document Hierarchical Distributed Neo4J HyperGraphDB Jena CouchDB BerkeleyDB-XML Solr Memcached Tokyo Cabinet db4o RIAK Cassandra Hypertable MySQL NDB MongoDB
  28. 28. All relational databases
  29. 29. Are not the same
  30. 30. Embedded MPP OLTP Streaming C-Store SQLite Firebird HSQL PostgreSQL MySQL Oracle SQL Server TeraData Greenplum Aster LucidDB MonetDB Streambase Truviso
  31. 31. NoFins
  32. 33. “ NoSQL movement”
  33. 34. Mythbust #2
  34. 35. “ revolutionary”
  35. 36. There
  36. 37. are
  37. 38. no
  38. 39. new
  39. 40. database
  40. 41. designs
  41. 42. There are only new implementations and combinations
  42. 43. “ A database storing application-friendly formatted objects, each containing collections of attributes which can be searched through a document ID, or the creation of ad-hoc indexes as needed by the application.”
  43. 44. CouchDB, 2007 Pick, 1965
  44. 45. CouchDB, 2007 embeddable Pick JSON storage REST API map/reduce
  45. 46. “ revolutionary”
  46. 47. “ r evolutionary”
  47. 48. “ renaissance of non-relational databases”
  48. 49. Mythbust #3
  49. 50. “ non-relational databases are toys”
  50. 51. Bigtable Google
  51. 52. Dynamo Amazon
  52. 53. Memcached FaceBook
  53. 54. Pick, Cach é US Vetrans' Administration
  54. 55. Mythbust #4
  55. 56. “ Relational databases will become obsolete”
  56. 57. “ Three decades past, the relational empire conquered the hierarchical hegemony. Today, an upstart challenges the relational empire's dominance ...” --Philip Wadler, Keynote VLDB, Rome, September 2001 XML Databases 2001
  57. 58. Anyone remember XML databases?
  58. 59. No?
  59. 60. What happened?
  60. 61. established relational and non-relational databases hybridized XML
  61. 62. Oracle XML PostgreSQL XML2 BerkeleyDB XML DB2
  62. 63. Mythbust #5
  63. 64. “ Relational databases are for when you need ACID transactions.”
  64. 65. Transactions ≠ Relational
  65. 66. Robust Transactions without Relationality: SQL Without Transactions: BerkeleyDB Amazon Dynamo MySQL MyISAM MS Access
  66. 67. Mythbust #6
  67. 68. “ Users are adopting noSQL for web-scale performance”
  68. 70. Horizontal Scalability Note: data in the above chart is extremely dated. Some databases were tested over a year ago.
  69. 71. Mythbust #7
  70. 73. You
  71. 74. do
  72. 75. not
  73. 76. have
  74. 77. to choose
  75. 78. one database.
  76. 79. Choose the database system which fits your current application goals. or ...
  77. 80. Use more than one together MySQL + Memcached PostgreSQL + CouchDB or ...
  78. 81. Use a Hybrid MySQL NDB PostgreSQL Hstore HadoopDB
  79. 82. But what about relational vs non-relational?
  80. 83. Relational OLTP Databases* <ul><li>Transactions: more mature support </li><ul><li>including multi-statement </li></ul><li>Constraints: enforce data rules absolutely
  81. 84. Consistency: enforce structure 100%
  82. 85. Complex reporting: keep management happy!
  83. 86. Vertical scaling (but not horizontal) * mature ones, anyway </li></ul>
  84. 87. SQL vs. Not SQL SQL promotes: <ul><li>portability
  85. 88. managed changes over time
  86. 89. multi-application access
  87. 90. many mature tools </li></ul>
  88. 91. SQL vs. Not SQL But … SQL is a full programming language, and you have to learn it to use it.
  89. 92. SQL vs. Not SQL No-SQL allows: <ul><li>programmers as DBAs
  90. 93. no impedance
  91. 94. fast interfaces
  92. 95. fast development </li></ul>
  93. 96. SQL vs. Not SQL … but: may involve learning complex proprietary APIs db.things.find({j: {$ne: 3}, k: {$gt: 10}});
  94. 97. The main reason to use SQL-RDBMSs
  95. 98. “ Immortal Data”
  96. 99. “ Immortal Data” your data has a life independent of this specific application implementation
  97. 100. How do I choose?
  98. 101. Define the problem you are trying to solve
  99. 102. <ul><li>“ I need a database for my blog”
  100. 103. “ I need to add 1000's of objects per second on a low-end device.”
  101. 104. “ I need my database to enforce business rules across several applications.”
  102. 105. “ I want my application to be location-sensitive.”
  103. 106. “ I need to cache data an access it 100K times per second.”
  104. 107. “ I need to produce summary reports across 2TB of data.”
  105. 108. “ I have a few hundred government documents I need to serve on the web and mine”
  106. 109. “ I need to know who-knows-who-knows-who.”
  107. 110. “ I need to data mine 1000's of 30K bug reports per minute.” </li></ul>
  108. 111. Define the features you actually need <ul><li>many connections
  109. 112. multi-server scalability
  110. 113. complex query logic
  111. 114. APIs
  112. 115. redundancy
  113. 116. data integrity
  114. 117. schema/schemaless
  115. 118. data mining </li></ul>
  116. 119. fit the database to the task
  117. 120. “I need a database for my blog”
  118. 121. Use anything! <ul><li>MySQL
  119. 122. PostgreSQL
  120. 123. MongoDB
  121. 124. SQLite
  122. 125. CouchDB
  123. 126. Flatfiles
  124. 127. DBaseIII
  125. 128. Something you wrote yourself </li></ul>
  126. 129. “I need my database to unify several applications and keep them consistent.”
  127. 130. PostgreSQL “ OLTP SQL-Relational Database” “ It's not just a database: it's a development platform”
  128. 131. Postgres 9 beta out now!
  129. 132. “I need my application to be location-aware.”
  130. 133. PostGIS “ Geographic Relational Database”
  131. 134. PostGIS <ul><li>Queries across “contains” “near” “closest”
  132. 135. Complex geometric map objects </li><ul><li>polygons
  133. 136. lines (roads, etc) </li></ul></ul>or now … CouchDB Spatial and Spatialite!
  134. 137. “I need to store 1000's of event objects per second on embedded hardware.”
  135. 138. db4object “ Embedded Key-Value Store”
  136. 139. db4object “ Embedded Key-Value Store” BerkeleyDB, Redis, TokyoCabinet, MongoDB
  137. 140. db4object <ul><li>German Train System
  138. 141. Insert 1000's of objects per second
  139. 142. Low-end embedded console computer
  140. 143. Simple access in native programming language (Java, .NET) </li></ul><ul><li>compromise: embedded SQL database: SQLite </li></ul>
  141. 144. “I need to access 100K objects per second over 1000's of connections.”
  142. 145. memcached “ Distributed In-Memory Key-Value Store”
  143. 146. memcached <ul><li>Use: public website
  144. 147. Used for caching 1000's of serialized objects per second
  145. 148. Available for 100000's of requests per second across 1000's of connections
  146. 149. Cache each object only once per site
  147. 150. Supplements a relational database </li></ul>Alternatives: Redis, KyotoTyrant, etc.
  148. 151. “I need to produce complex summary reports over 2TB of data.”
  149. 152. LucidDB “ Relational Column-Store”
  150. 153. LucidDB <ul><li>For reporting and analysis
  151. 154. Large quantities of data (TB)
  152. 155. Complex OLAP & analytics queries
  153. 156. Business intelligence
  154. 157. compliments a transactional database </li></ul>
  155. 158. “I have 100's of government documents I need to serve on the web and mine for data.”
  156. 159. CouchDB “ Document Store”
  157. 160. CouchDB <ul><li>CividDB Project
  158. 161. Storing lots and lots of government documents
  159. 162. Don't know what's in them
  160. 163. Don't know how they are structured
  161. 164. Store them, figure out structure later by data mining. </li></ul>It's also good for mobile applications!
  162. 165. “I have a social application and I need to know who-knows- who-knows- who-knows- who-knows-who.”
  163. 166. Neo4j “ Graph Database”
  164. 167. Neo4j <ul><li>Social Network Website
  165. 168. 6 degrees of separation
  166. 169. “you may also like”
  167. 170. type and degrees of relationship </li></ul>
  168. 171. “I get 1000's of 30K bug reports per minute and I need to mine them for trends.”
  169. 172. Hadoop “ Massively Parallel Data Mine”
  170. 173. Hadoop + HBase <ul><li>Massive bug report data feed
  171. 174. 1000's of bug reports per minute
  172. 175. Each bug report 2-45K of data
  173. 176. Need to extract trends and correlate inexact data
  174. 177. Summarize in daily & weekly reports </li></ul>
  175. 178. Conclusion <ul><li>Different database systems do better at different tasks. </li><ul><li>every database feature is a tradeoff
  176. 179. no database can do all things well </li></ul><li>Relational vs. non-relational doesn't matter </li><ul><li>pick the database(s) for the project or the task </li></ul></ul>
  177. 180. Questions? <ul><li>PostgreSQL Project </li><ul><li>www.postgresql.org
  178. 181. [email_address] </li></ul><li>PostgreSQL Experts </li><ul><li>www.pgexperts.com </li><ul><li>www.pgexperts.com/documents.html </li></ul></ul><li>Open Source Database Survey </li><ul><li>Selena Deckleman
  179. 182. Open Source Database Survey: www.ossdbsurvey.org </li></ul></ul><ul><li>PostgreSQL BOF, Tonight 7pm </li></ul>Copyright 2010 Josh Berkus, distributable under the creative commons attribution license, except for 3rd-party images which are property of their respective owners. Special thanks to Selena Deckelman for the bunnies image and Gavin Roy for NoSQL performance tests.

×