Your SlideShare is downloading. ×
0
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

North Bay Ruby Meetup 101911

258

Published on

North Bay Ruby Meetup slides. Resources: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis, http://nosql.mypopescu.com/

North Bay Ruby Meetup slides. Resources: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis, http://nosql.mypopescu.com/

Published in: Technology
1 Comment
0 Likes
Statistics
Notes
  • Resources: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis, http://nosql.mypopescu.com/, http://dbmsmusings.blogspot.com/, http://nosqltapes.com/
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • Be the first to like this

No Downloads
Views
Total Views
258
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
0
Comments
1
Likes
0
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • Transcript

    • 1. Devs, Ops, & Data A gentle introduction to the world of DataEngine Yard
    • 2. Agenda•Context•RDBMs •Do’s and Don’ts•NoSQL •Survey•General Advice•QuestionsEngine Yard 2
    • 3. I work with Data!• Data Engineer @ Engine Yard• Organizer of the DFW Big My Team Data Group (@dfwbigdata) is Hiring!• Previous life • Sr Web Developer • Data Architect • Student: MS in CSCI & MS in Info Mgmt from WashU STLEngine Yard 3
    • 4. The Universe Relational NoSQL/CoSQL “The rest” VS VSEngine Yard 4
    • 5. Relational World• ACID Properties - Atomicity • Either all of a transaction’s actions are committed or none are - Consistency • Any transaction the database performs will take it from one consistent state to another - Isolation • Operations cannot access data that has been modified during a transaction that has not yet completed - Durability • DBMS recover the committed transaction updates against any kind of system failure (hardware or software)Engine Yard 6
    • 6. Relational World• Relational Model - How data should be formatted (Normalized)• Unified Language for Querying - SQL• Data - Tabular, structured, relatively centralized• Scale vertically - Go up until no more - Sharding goes against the model!• Established theory & algorithmsEngine Yard 7
    • 7. RDBMS Performance• Size of your data matters - Migrations - Schema changes• Hardware matters - Disk IO - RAM• Restores are not Magic• Debug your queries - explain() - slow query logs• Indexes!Engine Yard 8
    • 8. A bit of context• CAP - Consistency • All clients consistent view of data - Availability • Clients have access to read & write data - Partition Tolerance • System won’t fail if individual nodes can’t communicate• Horizontal Scaling• Data interaction is DB-specific• Data - Unstructured - Large quantitiesEngine Yard 10
    • 9. A bit of context Data Model Column- C Key/Value Document Graph Oriented o n s Single Membase, i Master MongoDB Neo4j Redis* s t e n Multi- Cassandra, c Master/ Riak CouchDB HBase, Dynamo Hypertable yEngine Yard 11
    • 10. A bit of context http://blog.nahurst.com/visual-guide-to-nosql-systemsEngine Yard 12
    • 11. Survey of NoSQL Stores• Disk-backed in-memory database• Datatype Server (awesome!)• Has notion of transactions• Pros: - Blazing fast, easy to set up• Cons: - May not be best for large databases• Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory). - Stock prices. Analytics. Real-time data collection. Real-time communicationEngine Yard 13
    • 12. Survey of NoSQL Stores• Document Oriented DB (Erlang)• Flexible replication (MM, MS)• Pros: - DB consistency, ease of use - MVCC - write operations do not block reads• Cons: - Needs compacting from time to time• Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important. - CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi- site deployments.Engine Yard 14
    • 13. Survey of NoSQL Stores• Dynamo-based key/value store• Pros: - Fault tolerance - Distributed - Scalable• Cons: - Learning curve is a bit steep - multi-site replication in commercial version only• Best used: If you want something Cassandra- like (Dynamo-like) but simpler. - Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt.Engine Yard 15
    • 14. Survey of NoSQL Stores• Document Oriented DB (binary JSON)• Memory Mapped Files, Schema-less• Pros: - Easy to get started - 2 Ruby ORMS (MongoId, MongoMapper)• Cons: - Cluster reconfiguration tricky• Best used: If you need dynamic queries. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks. - For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.Engine Yard 16
    • 15. Survey of NoSQL Stores• Column-based DB - Facebook• Pros: - Best of BigTable (column families) and Dynamo - Querying by column, range of keys• Cons: - Bloat and complexity (Java)• Best used: When you write more than you read (logging). If every component of the system must be in Java. - Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is real time data analysis.Engine Yard 17
    • 16. Survey of NoSQL Stores• Graph DB written in Java• Pros: - Native way to describe relationships - Advanced path-finding with multiple algorithms• Cons: -?• Best used: For graph-style data. Neo4j is quite different from the others in this sense. - Social relations, public transport links, road maps, network topologies.Engine Yard 18
    • 17. Survey of NoSQL Stores• MapReduce Framework• Distributed FS, task tracker, ... (full Ecosystem)• Pros: - Process extremely large data volumes• Cons: - Bloat and complexity (Java), steep learning curve• Best used: When you write more than you read (logging). If every component of the system must be in Java. - Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is real time data analysis.Engine Yard 19
    • 18. “The rest”• Full Text Search - Awesome search capabilities - Helps reduce load of your database• Caches - Very fast! Use in any application where low-latency data access - Membase • Memcache compatible, but with persistence and clustering • All nodes are identical (master-master replication)Engine Yard 20
    • 19. Keep in mind• Data and query models• Durability needs• Scalability needs• Partition needs - data on multiple servers?• Consistency - reads!• Server performance• Analytical workloadEngine Yard 21
    • 20. Advice• Give them a try! - Fast to set up• Data Models matter - Going from RDBMS to NoSQL will require a conversion step so plan for it• No silver bullet - Your app will likely use different repositories for specific usagesEngine Yard 22
    • 21. Questions?Engine Yard 23

    ×