
Moving from SQL Server to MongoDB


Moving from SQL Server to MongoDB - Lessons Learned

Published in: Technology
  • Thanks guys. I'm available for presentations and consulting if you are interested.
  • A good on-the-ground presentation on SQL vs NOSQL, the + & - for their use-case.


  1. BuzzNumbers Presentation: Moving From SQL Server to MongoDB
  2. Today’s Presentation
     • Problems faced with Social Media Monitoring/Analytics
     • Why choose NoSQL over SQL
     • Why choose MongoDB
     • NOSQL vs SQL Schema Design
     • Infinite scalability with commodity hardware & .NET
     • Why we still use .NET (why not Ruby/Java/Python)
     • Lessons Learned
  3. NOSQL at BuzzNumbers: About BuzzNumbers
  4. About BuzzNumbers
     • SaaS Web Product Company
     • Web and Social Media Analytics
     • Collect “big data” web content
     • Near-Realtime data capture
     • News, Blogs, Social Media, etc.
     • Scraping, APIs, Feeds
     • Analytics & Business Intelligence
     • BI, Text, Sentiment, Locations, NLP, Machine Learning
  5. BuzzNumbers Project Team
     • Nick Holmes a Court - @nickhac
     • Brett Anderson - @brehtt
     • Steve Casey - @stevencasey
     • Jacinto Santamaria
     • Chris Fulstow - @chrisfulstow
     • Josie Kidd - @jose9
  6. NOSQL at BuzzNumbers: Problems Faced at BuzzNumbers
  7. Problems faced at BuzzNumbers
     • Large and fast-growing DB Tables
     • Lots of Reads/Writes from data collection 24/7
     • Massive Table Scans for user reports (< 3 sec SLA)
     • Large Joins (10+ Tables) with Nested Views
     • Complex Queries (Aggregates, WHERE clauses, FullText)
     • FullText Search Indexes needed real-time updates
     • Read/Write Contention
     • Rapid Index fragmentation, Slow rebuilds
     • DB Locks occurring (with no implicit Transactions)
     • Blocking Transactions (both small/large tables)
  8. Outgrew SQL Server Enterprise 2008
     • “Free” Software from MSFT via BizSpark
     • Tried everything with SQL Enterprise
         • Significant SQL Performance Tuning
         • Dirty Reads (nolock), Offline Index Rebuilds
         • Replication / Clustering / Multi-Instance
     • Problems
         • Schema changes impossible with uptime requirements
         • DBA tasks made system unavailable for hours/days
         • Hardware / SQL DBA got very expensive
         • Web users experienced annoying / unnecessary waits on non-complex queries that were blocked because of joins
  9. BuzzNumbers NOSQL Presentation: Why NOSQL over SQL
  10. What is NOSQL
     • New generation of “Databases”
     • “Not Only SQL” - Mostly Open Source
     • NOSQL distributed databases are designed to deliver
         • Distributed “Big Data” storage
         • Distributed processing of queries/calculations
     • NOSQL Examples include
         • Google - BigTable
         • Yahoo - Hadoop (30k+ Nodes)
         • Facebook - Cassandra
         • FourSquare - MongoDB
  11. Why NoSQL over SQL
     • SQL
         • Guaranteed consistency
         • Transactions
         • Schemas / DataTypes
         • Joins / Foreign Keys
         • TSQL/PL-SQL (Views, Procs)
         • Scale Up (hardware)
         • Many Benefits including
             • Ease of use
             • Many developers skilled in SQL
             • Trusted for decades / Proven
     • NoSQL
         • Eventual Consistency
         • No Transaction Support
         • Key/Value Data (mostly)
         • Flat Data (no joins)
         • Key Lookups / MapReduce / Code
         • Scale out (distributed)
         • Many Benefits including
             • Performance / Scale
             • Lower license costs
             • Solves Web2 problems
  12. Why NoSQL over SQL
     • CAP Theorem
         • Consistency
         • Availability
         • Partition tolerance
     • Only 2 of 3 are Possible
         • Consistency / Availability: RDBMS
         • Availability / Partition tolerance: NOSQL
         • Consistency / Partition tolerance: Availability Issues (No one wants this)
  13. BuzzNumbers NOSQL Presentation: Why MongoDB for NOSQL?
  14. NOSQL Providers
  15. Who uses Mongo?
  16. Why Mongo
     • Proven for multiple usage scenarios
     • High performance (eventual consistency)
     • Data stored in JSON (not only Key/Value)
     • Supports Multiple Indexes (Anywhere in JSON)
     • Easy to Install, Easy to Use (Linux/Windows)
     • Easy to Scale for High Volume Writes (Sharding)
     • Easy to Scale for High Volume Reads (Replica Sets)
     • Automatic Failover and Redundancy (Replica Sets)
     • REST Interface and Drivers for Ruby/.NET/Java/etc.
     • Easy to Query via multiple techniques
         • Key/Value, Mongo Query, JavaScript, MapReduce
  17. BuzzNumbers NOSQL Presentation: Moving from SQL Schema to No-Schema
  18. BuzzNumbers NOSQL Presentation
     • RDBMS Schema (Tables)
     • Mongo Collection (JSON)
  19. BuzzNumbers NOSQL Presentation
     • RDBMS Schema
     • Mongo JSON Document
  20. BuzzNumbers NOSQL Presentation
     • RDBMS Schema
     • Mongo JSON Document
     • One Document Per Website Per Day
  21. BuzzNumbers NOSQL Presentation
     • RDBMS Schema
     • Mongo JSON Document
     • Pre-Aggregate SUM/COUNT/AVG Calculations using UPSERT
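
Slides 20 and 21 describe keeping one document per website per day and maintaining running totals with upserts. The sketch below imitates that pattern in plain JavaScript with an in-memory `Map`, so it runs without a server. The field names (`site`, `day`, `mentions`, `sentimentSum`) and the helpers `recordMention` and `averageSentiment` are hypothetical and not taken from the BuzzNumbers schema. In MongoDB itself the same effect would typically come from an `update` using the `$inc` operator with the upsert option enabled.

```javascript
// In-memory stand-in for a collection keyed by (site, day).
// Field and function names are illustrative assumptions.
const dailyStats = new Map();

function recordMention(site, day, sentiment) {
  const key = `${site}|${day}`;
  // Upsert: create the document on first sight, otherwise reuse it.
  let doc = dailyStats.get(key);
  if (!doc) {
    doc = { site, day, mentions: 0, sentimentSum: 0 };
    dailyStats.set(key, doc);
  }
  // Increment the running totals, as $inc would.
  doc.mentions += 1;
  doc.sentimentSum += sentiment;
  return doc;
}

// AVG is derived at read time from the stored SUM and COUNT.
function averageSentiment(site, day) {
  const doc = dailyStats.get(`${site}|${day}`);
  return doc ? doc.sentimentSum / doc.mentions : null;
}

recordMention("example.com", "2011-05-01", 1);
recordMention("example.com", "2011-05-01", -1);
recordMention("example.com", "2011-05-01", 3);
console.log(averageSentiment("example.com", "2011-05-01")); // 1
```

Because the totals are updated at write time, a report only reads one small document per site per day instead of scanning raw rows.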
  22. BuzzNumbers NOSQL Presentation
     • RDBMS Schema
     • Mongo JSON Document
     • Store Line Items with rich data as Nested Arrays.
     • Use JavaScript or MapReduce to Query
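
Slide 22 stores line items as nested arrays and queries them with JavaScript or MapReduce. A minimal sketch of the idea using plain JavaScript arrays follows. The document shape (`items`, `source`, `sentiment`) is a hypothetical example, and `flatMap` and `reduce` here only imitate the map and reduce phases; they do not call MongoDB's own `mapReduce`.

```javascript
// Hypothetical documents: each holds its line items in a nested array.
const docs = [
  { site: "a.example", items: [
    { source: "blog", sentiment: 2 },
    { source: "news", sentiment: -1 } ] },
  { site: "b.example", items: [
    { source: "blog", sentiment: 4 } ] },
];

// Map phase: emit one [key, value] pair per nested item.
const emitted = docs.flatMap(d => d.items.map(i => [i.source, i.sentiment]));

// Reduce phase: sum the values for each key.
const totals = emitted.reduce((acc, [key, value]) => {
  acc[key] = (acc[key] || 0) + value;
  return acc;
}, {});

console.log(totals); // { blog: 6, news: -1 }
```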
  23. Basic SQL vs Mongo Syntax
     • SELECT * FROM Clients
         db.clients.find()
     • SELECT * FROM Clients WHERE ClientID = 1
         db.clients.find({"ClientID": 1})
     • INSERT INTO Clients (ClientID, Name) VALUES (1, 'ACME')
         db.clients.insert({"ClientID": 1, "Name": "ACME"})
     • CREATE TABLE / ALTER TABLE
         Just start inserting: db.clients.insert({JSON HERE})
     • CREATE INDEX
         db.clients.ensureIndex({"ClientID": 1, "Name": 1}) (createIndex in current MongoDB releases)
  24. Basic SQL vs Mongo Syntax
     • SELECT * FROM Clients
         db.clients.find()
     • SELECT * FROM Clients WHERE ClientID = 1
         db.clients.find({"ClientID": 1})
     • INSERT INTO Clients (ClientID, Name) VALUES (1, 'ACME')
         db.clients.insert({"ClientID": 1, "Name": "ACME"})
     • CREATE TABLE
         Just start inserting
     • CREATE INDEX
         db.clients.ensureIndex({"ClientID": 1, "Name": 1}) (createIndex in current MongoDB releases)
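
The mappings on slides 23 and 24 rest on two ideas: a collection comes into existence on first insert, and a `find` filter matches documents whose fields equal the given values. A rough in-memory sketch of those semantics follows. `TinyDb` and `TinyCollection` are made-up names, and only exact-equality filters are modelled, not MongoDB's full query language.

```javascript
class TinyCollection {
  constructor() { this.docs = []; }
  // Documents need no shared schema; each is stored as given.
  insert(doc) { this.docs.push({ ...doc }); }
  // An empty filter matches everything, like SELECT * with no WHERE.
  find(filter = {}) {
    return this.docs.filter(doc =>
      Object.entries(filter).every(([k, v]) => doc[k] === v));
  }
}

class TinyDb {
  constructor() { this.collections = {}; }
  // No CREATE TABLE step: a collection is created on first use.
  collection(name) {
    if (!this.collections[name]) this.collections[name] = new TinyCollection();
    return this.collections[name];
  }
}

const db = new TinyDb();
db.collection("clients").insert({ ClientID: 1, Name: "ACME" });
// A new field needs no ALTER TABLE; it simply appears on this document.
db.collection("clients").insert({ ClientID: 2, Name: "Initech", Country: "AU" });

console.log(db.collection("clients").find().length);         // 2
console.log(db.collection("clients").find({ ClientID: 1 })); // [ { ClientID: 1, Name: 'ACME' } ]
```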
  25. BuzzNumbers NOSQL Presentation: Infinite Scale with .NET and NOSQL
  26. Infinite Scale with .NET
     • Use .NET for Rapid Product Development
         • Web Applications (IIS, ASP.NET, User Databases)
         • Server Applications (Scraping, Apps, Services, Data)
         • Scheduled Tasks / Backend Jobs
     • Use Open Source for Infinite Scale on Linux
         • MongoDB for Big Data Storage
         • SOLR (distributed Lucene) for Full Text Indexing
         • .NET Drivers Available for Mongo/SOLR
  27. Infinite Scale with .NET
     • Cloud Hosting for Low Cost Scale
         • Rackspace Cloud ($200 per month per 4GB-RAM server)
         • Windows and Ubuntu - Image/Clone/API support
         • Zabbix Monitoring - notify when near capacity
         • Amazon/Heroku/dotCloud alternatives
     • Tips to deliver fantastic performance at scale
         • Indexes MUST fit in RAM (Disk Reads are Slow)
         • SSDs are worth the extra price
         • 4GB RAM / 160GB Disk seems to be the optimum price/performance per node in a distributed system
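
The "indexes must fit in RAM" tip can be turned into a back-of-envelope check. The sketch below assumes a fixed average size per index entry and a share of RAM reserved for indexes. Both figures (`bytesPerEntry`, `ramFraction`) are placeholder assumptions, not measurements from BuzzNumbers or MongoDB, and real index sizes should be read from the database's own statistics.

```javascript
const GiB = 1024 ** 3;

// Estimated index size in bytes, assuming every entry costs the same.
function estimateIndexBytes(docCount, bytesPerEntry) {
  return docCount * bytesPerEntry;
}

// True if the estimate fits in the share of RAM set aside for indexes.
function indexFitsInRam(docCount, bytesPerEntry, ramBytes, ramFraction = 0.5) {
  return estimateIndexBytes(docCount, bytesPerEntry) <= ramBytes * ramFraction;
}

// 100M documents at an assumed 40 bytes per entry on a 4 GB node.
console.log(indexFitsInRam(100e6, 40, 4 * GiB)); // false (4e9 bytes exceeds 2 GiB)
// 10M documents under the same assumptions.
console.log(indexFitsInRam(10e6, 40, 4 * GiB));  // true (4e8 bytes is under 2 GiB)
```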
  28. BuzzNumbers NOSQL Presentation: Why we stay with .NET?
  29. Why we stay with .NET
     • Visual Studio best IDE!!!
     • SQL Server great database for most Data
     • Proven Tech Stack (low corporate risk)
     • Lots of support (MSFT and Consultants)
     • Large online community with code samples
     • Many Open Source libraries
     • ASP.NET MVC RAZOR is RAD
     • Non-Complex Sysadmin for Windows Servers
     • Drivers/Integration available for most OSS Projects
     • Lots of Agile/Scrum/TDD/CI/Project Management tools
     • Lots of smart .NET web developers & engineers
  30. BuzzNumbers NOSQL Presentation: Lessons Learned
  31. Lessons Learned
     • “Big Data” is not 100M records, but 1BN+
     • Don’t scale until you need to (premature optimisation costs big time)
     • A SQL RDBMS solves most problems, but scale-up costs are prohibitive for startups, so plan in advance for when you might need to switch
     • Mixing SQL for SmallData and NOSQL for BigData delivers both ease/speed of development and performance
     • Mongo/SOLR works well to solve specific performance problems
     • Not all problems are equal: optimise each solution per performance problem
     • Don’t go NOSQL unless you absolutely need to
     • Very early technology with lots of learning overhead, risks and production issues
     • Skilled .NET/Mongo/SOLR engineers are very hard to find
     • If client/data segmentation is possible, multiple SQL instances can deliver
     • Ensure Indexes fit in Memory
     • Spend time planning your schema in advance, based on query requirements
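
The point about client/data segmentation amounts to routing each client to one of several independent SQL instances. One possible sketch follows, assuming a static list of instances and a simple modulo on a numeric client ID. The instance names and the `instanceFor` helper are hypothetical, and a real deployment would likely prefer an explicit lookup table so that clients can be moved between instances.

```javascript
const instances = ["sql-a", "sql-b", "sql-c"]; // hypothetical instance names

// Deterministically map a numeric client ID to one instance.
function instanceFor(clientId) {
  if (!Number.isInteger(clientId) || clientId < 0) {
    throw new RangeError("clientId must be a non-negative integer");
  }
  return instances[clientId % instances.length];
}

console.log(instanceFor(1)); // sql-b
console.log(instanceFor(3)); // sql-a
```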
  32. BuzzNumbers NOSQL Presentation: Interested to learn more?
  33. Thanks for your time
     • Speak with one of the Buzz Team tonight
     • Join our Team? We’re Hiring!
         • Web Developers
         • Software Engineers
         • UX / Web Designers
     • Immediate and Future roles… Talk to us!