Beyond SQL

Published in: Technology
  • Intro
  • To be clear: scaling is about getting bigger, not getting faster.
  • The web is read-heavy and write-light, at high volume.
  • We shoehorn objects into relational models. That has served us well, but other options exist.
  • CouchDB pros: good at being disconnected for a long time, good replication. Cons: REST only, scales only through replication.
    Tokyo Cabinet: great replacement for Berkeley DB (BDB).
  • Replica pairs and master-slave configurations.
    MVCC (multiversion concurrency control) in CouchDB vs. update-in-place in MongoDB.
    MVCC is very good for master-master setups, often-offline clients, and problems that require versioning.
    Update-in-place offers killer write speeds and fast updates.
    MongoDB replication is really geared toward failover and updates.
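
The MVCC vs. update-in-place contrast can be sketched with two toy in-memory stores. This is only an illustration of the two write models, not how CouchDB or MongoDB are implemented; the class and method names (`MvccStore`, `InPlaceStore`, `expected_rev`) are made up for the example.

```python
class RevisionConflict(Exception):
    """Raised when a writer's revision no longer matches the stored one."""


class MvccStore:
    """Toy multiversion store: every write appends a new revision."""

    def __init__(self):
        self._versions = {}  # key -> list of document revisions, oldest first

    def get(self, key):
        """Return (revision_number, document) for the latest revision."""
        revs = self._versions[key]
        return len(revs), dict(revs[-1])

    def history(self, key):
        """Return every stored revision, oldest first."""
        return [dict(doc) for doc in self._versions[key]]

    def put(self, key, doc, expected_rev=0):
        """Append a revision only if the caller has seen the latest one."""
        revs = self._versions.get(key, [])
        if expected_rev != len(revs):
            raise RevisionConflict(
                f"{key}: writer saw rev {expected_rev}, store is at rev {len(revs)}"
            )
        self._versions[key] = revs + [dict(doc)]
        return len(revs) + 1


class InPlaceStore:
    """Toy update-in-place store: one copy per key, overwritten on write."""

    def __init__(self):
        self._docs = {}

    def put(self, key, doc):
        # No revision check and no history: the last write simply wins.
        self._docs[key] = dict(doc)

    def get(self, key):
        return dict(self._docs[key])
```

The MVCC store pays for bookkeeping on every write but can detect a stale writer and keep old versions around, which is what makes it friendly to master-master and offline use. The in-place store does less work per write, which is where the write speed comes from.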
  • SQL plans usually go into production untested, since they can change with new data. That can mean production goes down. Statistical query optimizers change their plans when the table statistics change. That means UNPREDICTABILITY.
    Wicked plans. Mongo runs multiple plans and learns from them. First one to the finish wins.
  • Latency is bad. CouchDB is cool and all, and REST and HTTP are awesome. But the point of using noSQL is usually speed. Why give that up for something as simple as connections?
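
A rough sketch of the "first one to the finish wins" idea, using Python generators as stand-ins for query plans. Everything here (`full_scan`, `index_lookup`, `race`, `run_query`, the "one yield per unit of work" convention) is a hypothetical toy and says nothing about MongoDB's real optimizer internals.

```python
def full_scan(docs, wanted):
    """Candidate plan: look at every document."""
    matches = []
    for doc in docs:
        yield  # one unit of work per document examined
        if doc["user"] == wanted:
            matches.append(doc)
    return matches


def index_lookup(index, wanted):
    """Candidate plan: jump straight to the matches through an index."""
    yield  # one unit of work for the index probe
    return list(index.get(wanted, []))


def race(plans):
    """Step every plan in turn; return (name, result) of the first to finish."""
    while plans:
        for name, plan in plans.items():
            try:
                next(plan)
            except StopIteration as done:
                return name, done.value
    raise ValueError("no plans to race")


def run_query(shape, make_plans, cache):
    """Reuse the remembered winner for this query shape, else race and remember."""
    plans = make_plans()
    if shape in cache:
        name = cache[shape]
        return race({name: plans[name]})
    name, result = race(plans)
    cache[shape] = name
    return name, result
```

The point of the sketch is that the winner is picked by actually running the candidates side by side rather than by trusting table statistics, and the choice is then remembered per query shape.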
  • Canonical example: Website Analytics.
  • LRU’ed out
    caching
    logfiles
    history/activity/streams
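
The capped-collection behaviour in this note can be approximated with a bounded deque. This is a toy: a real MongoDB capped collection is sized in bytes and ages entries out in insertion order, while this sketch caps by document count, and `CappedLog` is a made-up name.

```python
from collections import deque


class CappedLog:
    """Toy fixed-size collection: keeps only the newest `max_docs` entries."""

    def __init__(self, max_docs):
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        # When the deque is full, appending silently drops the oldest entry.
        self._docs.append(doc)

    def tail(self, n):
        """Return up to `n` of the most recent entries, oldest of those first."""
        if n <= 0:
            return []
        return list(self._docs)[-n:]

    def __len__(self):
        return len(self._docs)
```

That shape is why the fits listed here are caches, logfiles, and activity streams: the newest entries matter, and nobody has to write a cleanup job for the old ones.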
  • Modifier Operations
    Update if Current
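
A minimal sketch of what these two ideas mean, using a plain dict as a stand-in for a collection. In MongoDB itself the increment would be a `$inc` modifier on an update with the upsert option; the helper names below (`inc_upsert`, `update_if_current`) are hypothetical and only mimic the semantics.

```python
def inc_upsert(collection, key, field, amount=1):
    """Add `amount` to `field`, creating the document first if it is missing."""
    doc = collection.setdefault(key, {"_id": key})
    doc[field] = doc.get(field, 0) + amount
    return doc


def update_if_current(collection, key, expected, changes):
    """Apply `changes` only if the stored fields still match `expected`."""
    doc = collection.get(key)
    if doc is None or any(doc.get(f) != v for f, v in expected.items()):
        return False  # the document changed underneath us; caller should re-read
    doc.update(changes)
    return True
```

For the website analytics case, a page-view counter becomes a single call per hit, with no "does the row exist yet?" round trip:

```python
pages = {}
inc_upsert(pages, "/home", "hits")
inc_upsert(pages, "/home", "hits")
```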
  • Nice mapping to dynamically typed languages.
    Makes migrations and updates to the database smooth. (Think of MySQL locking a table to add a column.)
  • If code isn’t solid, you can get into trouble.
  • Preserves datatypes, unlike straight JSON.
    Binary data like videos and images can be stored directly in BSON, rather than as attachments as in CouchDB.
    Still have to deal with form strings in many cases.
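
To see what "straight JSON" loses, here is a small standard-library sketch. It does not use BSON at all; it only shows the JSON side of the comparison. The sample record and the `to_json` helper are invented for the example.

```python
import base64
import json
from datetime import datetime, timezone

# Made-up sample document with a date and some binary data.
record = {
    "uploaded": datetime(2009, 11, 1, 12, 0, tzinfo=timezone.utc),
    "thumbnail": b"\x89PNG\r\n",
}


def to_json(doc):
    """Encode for plain JSON: dates become ISO strings, bytes become base64 text."""

    def encode(value):
        if isinstance(value, datetime):
            return value.isoformat()
        if isinstance(value, bytes):
            return base64.b64encode(value).decode("ascii")
        raise TypeError(f"cannot encode {type(value).__name__}")

    return json.dumps(doc, default=encode)


def json_round_trip(doc):
    """Encode to JSON text and parse it back again."""
    return json.loads(to_json(doc))
```

Calling `json.dumps(record)` directly raises `TypeError`, and after `json_round_trip(record)` both fields come back as plain strings, so the application has to remember which strings were dates and which were binary. A typed binary format avoids that bookkeeping.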
  • RDBMS still great for most apps.
  • Beyond SQL

    1. Beyond SQL: Rise of the noSQL rebellion. Matt Kern, @lightcap, matt@codebenders.com. codes: github.com/lightcap, github.com/codebenders, codebenders.com
    2. What’s wrong with SQL, anyhow?
    3. Nothing!
    4. Nothing! Okay, a lot…
    5. SQL monoliths
    6. Too Big To FAIL
    7. scale up
    8. scale up vs.
    9. scale up vs. scale out
    10. scaling vs optimization
    11. It’s a web world
    12. in-memory database • key-value store • column-oriented • document-oriented
    13. Lotus Notes?!?!
    14. noSQL armies
    15. No Joins!
    16. SELECT
            Program,
            Month,
            ThisYearTotalRevenue,
            PriorYearTotalRevenue
        FROM (
            SELECT
                ISNULL(ThisYear.Program, PriorYear.Program) as Program,
                ISNULL(ThisYear.Month, PriorYear.Month) as Month,
                ISNULL(ThisYear.TotalRevenue, 0) as ThisYearTotalRevenue,
                ISNULL(PriorYear.TotalRevenue, 0) as PriorYearTotalRevenue
            FROM (
                SELECT Program, Month, SUM(TotalRevenue) as TotalRevenue
                FROM PVMonthlyStatusReport
                WHERE Year = @FinancialYear
                GROUP BY Program, Month
            ) as ThisYear
            FULL OUTER JOIN (
                SELECT Program, Month, SUM(TotalRevenue) as TotalRevenue
                FROM PVMonthlyStatusReport
                WHERE Year = (@FinancialYear - 1)
                GROUP BY Program, Month
            ) as PriorYear ON
                ThisYear.Program = PriorYear.Program
                AND ThisYear.Month = PriorYear.Month
        ) as Revenue
        WHERE
            Program = 'Bikes'
        ORDER BY
            Month
    17. (Same query as slide 16, shown again.)
    18. Well, if you really want…
    19. Proprietary noSQL • Google BigTable • Amazon Dynamo AKA:
    20. Open Source noSQL • MongoDB • Cassandra Project (Facebook) • Hadoop: HBase (Apache) • Dynomite • Tokyo Cabinet • CouchDB (Apache) • Hypertable • Voldemort (LinkedIn) • SimpleDB
    21. MongoDB Hawtness
    22. Auto Sharding
    23. Failover and redundancy
    24. Query Optimization AI
    25. Dynamic Queries
    26. Relax? No.
    27. Bad ass write performance
    28. Upserts
    29. Capped Collections
    30. Multikeys
    31. Lockless updates
    32. Schema-free
    33. Schema-free?
    34. BSON = Binary JSON
    35. MapReduce • Massively Parallel • Huge datasets • Loves sharded data • Good enough for Google.
    36. 64-bit vs 32-bit
    37. Libraries
    38. Good Fits
    39. Mashups!
    40. Session data
    41. Volatile Data
    42. New uses every day
    43. Thanks!
