NoSQL Slideshare Presentation

Can NoSQL technologies hold up to the specific requirements that apply to the Telco domain?

This is the SlideShare presentation by Ericsson researcher Nicolas Seyvet accompanying his blog post "NoSQL for Telco":
http://labs.ericsson.com/blog/nosql-for-telco

Slide notes
  • Individualization of content
    • In the salary lists of the 1970s, every record had exactly one job
    • In the salary lists of the 2000s, we need 5 job columns! Or 8? Or 15?
  • Database developers all know the ACID acronym. It says that database transactions should be:
    Atomic: Everything in a transaction succeeds or the entire transaction is rolled back.
    Consistent: A transaction cannot leave the database in an inconsistent state.
    Isolated: Transactions cannot interfere with each other.
    Durable: Completed transactions persist, even when servers restart etc.
    These qualities seem indispensable, and yet they are incompatible with availability and performance in very large systems. For example, suppose you run an online book store and you proudly display how many of each book you have in your inventory. Every time someone is in the process of buying a book, you lock part of the database until they finish so that all visitors around the world will see accurate inventory numbers. That works well if you run The Shop Around the Corner but not if you run Amazon.com.
    Amazon might instead use cached data. Users would not see the inventory count at this second, but what it was, say, an hour ago when the last snapshot was taken. Also, Amazon might violate the “I” in ACID by tolerating a small probability that simultaneous transactions could interfere with each other. For example, two customers might both believe that they just purchased the last copy of a certain book. The company might risk having to apologize to one of the two customers (and maybe compensate them with a gift card) rather than slowing down their site and irritating myriad other customers.
    There is a computer science theorem that quantifies the inevitable trade-offs. Eric Brewer’s CAP theorem says that if you want consistency, availability, and partition tolerance, you have to settle for two out of three. (For a distributed system, partition tolerance means the system will continue to work unless there is a total network failure. A few nodes can fail and the system keeps going.)
    An alternative to ACID is BASE:
    Basic Availability
    Soft-state
    Eventual consistency
    Rather than requiring consistency after every transaction, it is enough for the database to eventually be in a consistent state. (Accounting systems do this all the time. It’s called “closing out the books.”) It’s OK to use stale data, and it’s OK to give approximate answers.
    It’s harder to develop software in the fault-tolerant BASE world compared to the fastidious ACID world, but Brewer’s CAP theorem says you have no choice if you want to scale up. However, as Brewer points out in his presentation on the topic, there is a continuum between ACID and BASE. You can decide how close you want to be to one end of the continuum or the other according to your priorities.
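    As a rough illustration of that continuum (not from the presentation), the sketch below contrasts a strictly consistent inventory read, which locks the row inside a JDBC transaction, with a BASE-style read that serves a periodically refreshed, possibly stale cached count. The inventory table, its columns, and the refresh job are invented for illustration.

```java
import java.sql.*;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class InventoryReads {

    // ACID-style: lock the row so no concurrent purchase can interfere.
    // Accurate, but every reader and writer serializes on the same row.
    // (Error handling and rollback omitted for brevity; assumes the row exists.)
    static int consistentCount(Connection con, long bookId) throws SQLException {
        con.setAutoCommit(false);
        try (PreparedStatement ps = con.prepareStatement(
                "SELECT in_stock FROM inventory WHERE book_id = ? FOR UPDATE")) {
            ps.setLong(1, bookId);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                int count = rs.getInt(1);
                con.commit();            // release the lock
                return count;
            }
        }
    }

    // BASE-style: serve whatever the last snapshot said; stale is acceptable.
    private static final ConcurrentMap<Long, Integer> snapshot = new ConcurrentHashMap<>();

    static int approximateCount(long bookId) {
        return snapshot.getOrDefault(bookId, 0);
    }

    // Called from a periodic background job ("closing out the books").
    static void refreshSnapshot(Connection con) throws SQLException {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT book_id, in_stock FROM inventory")) {
            while (rs.next()) {
                snapshot.put(rs.getLong(1), rs.getInt(2));
            }
        }
    }
}
```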
  • The Sex Pistols had shown that barely-constrained fury was more important to their contemporaries than art-school structuralism, giving anyone with three chords and something to say permission to start a band. Eric Brewer, in what became known as Brewer's Conjecture, said that as applications become more web-based we should stop worrying about data consistency, because if we want high availability in these new distributed applications, then guaranteed consistency of data is something we cannot have, thus giving anyone with three servers and a keen eye for customer experience permission to start an internet scale business. Disciples of Brewer (present that day or later converts) include the likes of Amazon, EBay, and Twitter.
     
     
    What he said was there are three core systemic requirements that exist in a special relationship when it comes to designing and deploying applications in a distributed environment (he was talking specifically about the web but so many corporate businesses are multi-site/multi-country these days that the effects could equally apply to your data-centre/LAN/WAN arrangement).
  • Consistent vs. available is a business decision, not an engineering one.
  • NoSQL systems tend to sacrifice full C and A in order to keep P at any given time -> all in all, eventually A or C.
    Databases are great at this because they focus on ACID properties and give us Consistency by also giving us Isolation, so that when Customer One is reducing books-in-stock by one, and simultaneously increasing books-in-basket by one, any intermediate states are isolated from Customer Two, who has to wait a few milliseconds while the data store is made consistent.
     
    Once you start to spread data and logic around different nodes then there's a risk of partitions forming. A partition happens when, say, a network cable gets chopped, and Node A can no longer communicate with Node B. With the kind of distribution capabilities the web provides, temporary partitions are a relatively common occurrence and, as I said earlier, they're also not that rare inside global corporations with multiple data centres.
  • Easier scalability is the first aspect highlighted by Wiederhold. NoSQL databases like Couchbase and 10gen's MongoDB, he said, can be scaled up to handle much bigger data volumes with relative ease. If your company suddenly finds itself deluged by overnight success, for example, with customers coming to your web site in droves, a relational database would have to be painstakingly replicated and re-partitioned in order to scale up to meet the new demand. Wiederhold cited social and mobile gaming vendors as the prime example of this kind of situation: an endorsement or a few well-timed tweets can spin up semi-dormant gaming servers and bring them to capacity in mere hours. Because of the distributed nature of non-relational databases, to scale NoSQL all you need to do is add machines to the cluster to meet demand.
  • Could we store these, if they came in a stream?
  • As data increases, there may be many StoreFiles on HDFS, which is not good for performance. HBase will therefore automatically pick up a couple of the smaller StoreFiles and rewrite them into a bigger one; this process is called minor compaction. In certain situations, or when triggered by a configured interval (once a day by default), major compaction runs automatically. Major compaction drops deleted or expired cells and rewrites all the StoreFiles in the Store into a single StoreFile, which usually improves performance. However, because major compaction rewrites all of the Store's data, a lot of disk I/O and network traffic can occur during the process. This is not acceptable on a heavily loaded system, so you might want to run it during a low-load period.
    Rather than let HBase auto-split your regions, manage the splitting manually [11]. With growing amounts of data, splits will continually be needed. Since you always know exactly which regions you have, long-term debugging and profiling are much easier with manual splits; it is hard to trace the logs to understand region-level problems if regions keep splitting and getting renamed. Data offlining bugs + unknown number of split regions == oh crap! If an HLog or StoreFile was mistakenly left unprocessed by HBase due to a weird bug and you notice it a day or so later, you can be assured that the regions specified in those files are the same as the current regions, and you have fewer headaches trying to restore or replay your data. You can also finely tune your compaction algorithm: with roughly uniform data growth, it is easy to cause split/compaction storms as the regions all hit roughly the same data size at the same time. With manual splits, you can let staggered, time-based major compactions spread out your network I/O load.
    How do I turn off automatic splitting? Automatic splitting is determined by the configuration value hbase.hregion.max.filesize. Setting it to Long.MAX_VALUE is not recommended in case you forget about manual splits; a suggested setting is 100 GB, which would result in > 1 hr major compactions if reached.
    What's the optimal number of pre-split regions to create? Mileage will vary with your application. You could start low with 10 pre-split regions per server and watch as data grows over time. It is better to err on the side of too few regions and do a rolling split later. A more complicated answer is that this depends upon the largest StoreFile in your region: with growing data, that will get larger over time. You want the largest region to be just big enough that the Store compact-selection algorithm only compacts it because of a timed major compaction; otherwise, the cluster can be prone to compaction storms, as the algorithm decides to run major compactions on a large series of regions all at once. Note that compaction storms are due to the uniform data growth, not the manual split decision.
    If you pre-split your regions too thin, you can increase the major compaction interval by configuring HConstants.MAJOR_COMPACTION_PERIOD. If your data size grows too large, use the (post-0.90.0 HBase) org.apache.hadoop.hbase.util.RegionSplitter script to perform a network-I/O-safe rolling split of all regions. A minimal sketch of pre-splitting a table from the Java client follows this note.
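    Hedged illustration, not from the slides: assuming the pre-1.0 HBase Java client API of that era, the sketch below creates a hypothetical pre-split table and raises hbase.hregion.max.filesize so that size-based auto-splitting effectively never triggers. The table name, column family, and split points are invented; the same pre-splitting can also be done with the org.apache.hadoop.hbase.util.RegionSplitter utility mentioned above.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // Make size-based auto-splitting very unlikely (the text above suggests
        // a large value such as 100 GB rather than Long.MAX_VALUE).
        conf.setLong("hbase.hregion.max.filesize", 100L * 1024 * 1024 * 1024);

        HBaseAdmin admin = new HBaseAdmin(conf);

        HTableDescriptor desc = new HTableDescriptor("events");   // hypothetical table
        desc.addFamily(new HColumnDescriptor("d"));                // hypothetical family

        // Pre-split into regions on fixed row-key boundaries so compactions
        // and region load stay predictable from day one.
        byte[][] splitKeys = new byte[][] {
            Bytes.toBytes("2"), Bytes.toBytes("4"),
            Bytes.toBytes("6"), Bytes.toBytes("8")
        };
        admin.createTable(desc, splitKeys);
        admin.close();
    }
}
```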
  • Either way this is not good for business. Amazon claims (http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it) that just an extra one tenth of a second on their response times would cost them 1% in sales. Google said (http://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html) they noticed that just a half-second increase in latency caused traffic to drop by a fifth.
    Where both sides agree though is that the answer to scale is distributed parallelisation not, as was once thought, supercomputer grunt.
     
    If they're not working in parallel, you have no chance of getting the problem done in a reasonable amount of time. This is a lot like anything else: if you have a really big job to do, you get lots of people to do it. So if you are building a bridge, you have lots of construction workers. That's parallel processing also. So a lot of this will end up being "how do we mix parallel processing and the internet?"
  • In our account of the history of NoSQL development, we’ve concentrated on big data running on clusters. While we think this is the key thing that drove the opening up of the database world, it isn't the only reason we see project teams considering NoSQL databases. An equally important reason is the old frustration with the impedance mismatch problem. The big data concerns have created an opportunity for people to think freshly about their data storage needs, and some development teams see that using a NoSQL database can help their productivity by simplifying their database access even if they have no need to scale beyond a single machine.
  • The recent trend in discussing NoSQL databases is to highlight their schemaless nature: a popular feature that allows developers to concentrate on the domain design without worrying about schema changes. This is especially true with the rise of agile methods [Agile Methods], where responding to changing requirements is important. Discussions, iterations, and feedback loops involving domain experts and product owners are important to derive the right understanding of the data; these discussions must not be hampered by a database's schema complexity. With NoSQL data stores, changes to the schema can be made with the least amount of friction, improving developer productivity (“The Emergence of NoSQL,” p. 9). We have seen, though, that developing and maintaining an application in the brave new world of schemaless databases requires careful attention to be given to schema migration.
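    To make the "least friction" point concrete, here is a hedged, hypothetical HBase sketch (pre-1.0 client API): adding a previously unseen attribute to a row is just another Put with a new column qualifier; there is no ALTER TABLE or migration step. The table, family, and qualifier names are invented for illustration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SchemalessPut {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "subscribers");   // hypothetical table

        Put put = new Put(Bytes.toBytes("subscriber-42"));
        // Attribute written since day one.
        put.add(Bytes.toBytes("d"), Bytes.toBytes("msisdn"), Bytes.toBytes("+46700000000"));
        // A brand-new attribute (think "the 8th job column"): no schema change
        // is needed, the qualifier simply starts existing for rows that carry it.
        put.add(Bytes.toBytes("d"), Bytes.toBytes("secondJob"), Bytes.toBytes("consultant"));

        table.put(put);
        table.close();
    }
}
```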
  • A document-oriented database is a computer program designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data. Document-oriented databases are one of the main categories of so-called NoSQL databases, and the popularity of the term "document-oriented database" (or "document store") has grown with the use of the term NoSQL itself. In contrast to well-known relational databases and their notions of "relations" (or "tables"), these systems are designed around an abstract notion of a "document".
    By storing and managing data based on columns rather than rows, a column-oriented architecture overcomes query limitations that exist in traditional row-based RDBMSs. Only the necessary columns in a query are accessed, reducing I/O activity by skipping data the query does not need. This enables it to work with very large data volumes (terabytes to petabytes), commonly referred to as Big Data.
    Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets.
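    As a hedged sketch of "only the necessary columns in a query are accessed", an HBase scan can be restricted to a single column so the client never receives the rest of the row's data; this loosely mirrors the SELECT col1 FROM table query evaluated later in the deck. The table, family, and qualifier names below are assumptions, not from the presentation.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class SingleColumnScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "events");                 // hypothetical table

        Scan scan = new Scan();
        scan.addColumn(Bytes.toBytes("d"), Bytes.toBytes("col1")); // fetch col1 only
        scan.setCaching(1000);                                     // batch rows per RPC

        long rows = 0;
        try (ResultScanner scanner = table.getScanner(scan)) {
            for (Result r : scanner) {
                byte[] col1 = r.getValue(Bytes.toBytes("d"), Bytes.toBytes("col1"));
                if (col1 != null) {
                    rows++;                                        // ...or aggregate the value
                }
            }
        }
        System.out.println("rows with col1: " + rows);
        table.close();
    }
}
```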
  • http://codahale.com/you-cant-sacrifice-partition-tolerance/
    Some systems cannot be partitioned. Single-node systems (e.g., a monolithic Oracle server with no replication) are incapable of experiencing a network partition. But practically speaking these are rare; add remote clients to the monolithic Oracle server and you get a distributed system which can experience a network partition (e.g., the Oracle server becomes unavailable).
    Network partitions aren’t limited to dropped packets: a crashed server can be thought of as a network partition. The failed node is effectively the only member of its partition component, and thus all messages to it are “lost” (i.e., they are not processed by the node due to its failure). Handling a crashed machine counts as partition-tolerance.

Transcript

  • 1. NoSQL for Telco. Data Research Day 2013. Prepared by Nicolas Seyvet, with help from N. Hari Kumar and P. Matray
  • 2. Who AM I? › Software Developer 10+ years at Ericsson › HLR, PGM, IMS-M, MMS, MTV, BCS › Joined Research late 2012 –BMUM -> BUSS (5+ years) –DUCI (<6 months) › Active member in various /// groups –Linux (ELX, UMWP, etc.), Agile, SWAN, EQNA › Open source contributor Ericsson Internal | 2013-06-03 | Page 2
  • 3. The Plan › Why NoSQL? › CAP › Research activities › Market trends Ericsson Internal | 2013-06-03 | Page 3
  • 4. NoSQL: Why? Data Research Day 2013
  • 5. NoSQL: Why? Trends – Usual Suspects: Gossip, SDN, Internet (hypertext, RSS, wikis, blogs, tagging, user generated content, RDF, ontologies). Gartner Data Center TCO Report, June 2012. Ericsson Internal | 2013-06-03 | Page 5
  • 6. NoSQL: Why? Trends: Architecture. › Multicore › Parallelization/Distributed › Cloud › Schemaless. 1980s: mainframe applications; 1990s: database as integration hub; 2000s: decoupled services. Ericsson Internal | 2013-06-03 | Page 6
  • 7. Two Ways to Scale: go BIG or go many? PARTITION (replication). Ericsson Internal | 2013-06-03 | Page 7
  • 8. CAP: Consistency, Availability, Partition tolerance. Data Research Day 2013
  • 9. CAP Theorem. Brewer’s Conjecture: “Of three properties of shared-data systems – data Consistency, system Availability and tolerance to network Partitions – only two can be achieved at any given moment in time.” › 2000 Prof. Eric Brewer, PODC Conference Keynote › 2002 Seth Gilbert and Nancy Lynch, ACM SIGACT News 33 (2). Ericsson Internal | 2013-06-03 | Page 9
  • 10. CAP Theorem: the business decision. Under partition: CONSISTENT or AVAILABLE. Ericsson Internal | 2013-06-03 | Page 10
  • 11. CAP Summary (Available / Consistent / Partition Tolerance). CA: traditional relational systems (MySQL, PostgreSQL, etc.). AP: Dynamo-like systems (Voldemort, Riak, Cassandra, CouchDB): requests will complete at any node, possibly violating consistency. CP: HBase, MongoDB, Redis, BigTable-like systems: requests will complete at nodes that have quorum. Ericsson Internal | 2013-06-03 | Page 11
  • 12. Why NoSQL now? › Trends “Internet size”, Cluster friendly Rapid development / Solution oriented Polyglot Persistence Schemaless Ericsson Internal | 2013-06-03 | Page 12
  • 13. Research Activities: Telco Applicability, Aggregation, Event Streams. Data Research Day 2013
  • 14. HBase: BigTable/Columnar. › ZooKeeper (cluster): coordination, master selection, root region lookup, node registration, … › Hadoop (cluster): data files, Write-Ahead Log (WAL), rack aware, default data replication x3 › HBase: 1 elected master / many region servers. Master: region allocation, failover, log splitting, load balancing; one active (elected), many standing by. Region servers: hold regions, handle I/O requests, in-memory data (MemStore), split regions, compact regions. Ericsson Internal | 2013-06-03 | Page 14
  • 15. Telco Applicability Study: HBase for HLR data? › Comprehensive report › Using HBase is DOABLE! OK! Ericsson Internal | 2013-06-03 | Page 15
  • 16. HBase Bulk Processing: Event Processing & Aggregation. › 100 million rows. Queries evaluated: SELECT col1 FROM table; SELECT SUM(col1) FROM table WHERE col2=val2 GROUP BY col3. › Map/Reduce › Scan › Co-processor. CPU, RAM, Network, Schema. Ericsson Internal | 2013-06-03 | Page 16
  • 17. Bulk Processing Scaling out/Horizontally › 100 Million rows › Linear scaling! SELECT SUM(col1) FROM table WHERE col2=val2 GROUP BY col3 Ericsson Internal | 2013-06-03 | Page 17
  • 18. Read/Write: 100,000 iterations, periodic degradation. › 150,000,000 rows › row = key + 1 column (1K). Entire cluster up and running: 8 nodes (1 Master / 7 slaves). Ericsson Internal | 2013-06-03 | Page 18
  • 19. Robustness Killing Them Softly… Master Slaves Ericsson Internal | 2013-06-03 | Page 19
  • 20. How Much Data Can It Fit? ITK / Constellation / CEA. › Network produces events – RNC, SGSN, S-&R-KPI – Traffic DPI – GTP-C › CEA (Perfmon) – correlated events. 1000+ K events/s, 10+ K events/s, 10,000,000 subscribers. Staging data on HDFS; Map/Reduce; Put.. Put.. Put…; HBase BulkLoader; HBase PutLoader; lookup data. Ericsson Internal | 2013-06-03 | Page 20
  • 21. The Upcoming Fight Storkluster 18 machines Ericsson Internal | 2013-06-03 | Page 21 Bigdata 2 machines
  • 22. What about HDFS? Small files (250 B): › It scales! › TestDFSIO benchmark: > 3000 GB/s reads, > 2000 GB/s writes. Larger files (1 KB): › But…. it is not that simple… (chart labels: CPU; CPU and I/O; Network). Ericsson Internal | 2013-06-03 | Page 22
  • 23. What about End to End (writing to HBase included)? 100 K events/s, 200 K events/s. › It scales! › And it gets… more complicated. Ericsson Internal | 2013-06-03 | Page 23
  • 24. But…. › Within ~2 hours: – Rows/s: 7 K/s – CPU: x2 – IO: 100%. Ericsson Internal | 2013-06-03 | Page 24
  • 25. HDFS Curse: Compaction Storm. › Remember what we were doing? – Hint: creating lots of small files to add to HBase?.. › Major compaction storm! – Manage compaction and region splitting. M/R; HBase BulkLoader. Ericsson Internal | 2013-06-03 | Page 25
  • 26. Conclusion › Scalability … Scalability… Scalability › It works but it is not so easy… › Recommendation: – Polyglot data storage Ericsson Internal | 2013-06-03 | Page 26
  • 27. Ericsson Internal | 2013-06-03 | Page 27
  • 28. NoSQL Data Research Day 2013
  • 29. NoSQL: The name › It is not about saying SQL is bad or should not be used › ”An accidental neologism” – Martin Fowler › A twitter hash › No prescriptive definition, just observations of common characteristics – “Any database that is not a Relational Database” – Running well on clusters (scalable) – schemaless › Polyglot persistence – Using different stores in different circumstances Ericsson Internal | 2013-06-03 | Page 29 The term was coined at a meetup with the creators behind some prominent emerging databases ... then there was a conference ... ... and a mailing list ... ... the name caught on ... ... then there were more conferences ... ... and here we are!
  • 30. NoSQL: Why? Trend No 2/4: Connectedness. Internet: hypertext, RSS, wikis, blogs, tagging, user generated content, RDF, ontologies. M2M. Application. Ericsson Internal | 2013-06-03 | Page 30
  • 31. NoSQL: Why? Trend No 3/4: Content Individualization Schemaless •Extend at runtime •De-normalize •Domain design (not schema migration) › Individualization of content › Decentralization Ericsson Internal | 2013-06-03 | Page 31
  • 32. NoSQL Landscape › 4 emerging categories Key-Value Graph BigTable Document (NewSQL) DBN Ericsson Internal | 2013-06-03 | Page 32 (Object)
  • 33. Consistency. “A system is consistent if an update is applied to all relevant nodes at the same logical time.” Strong consistency (Atomicity, Consistency, Isolation, Durability: ACID), weak consistency, eventual consistency (inconsistency window). NoSQL solutions DO support transactions. Standard database replication (or caching) IS NOT strongly consistent; as such, any solution making use of those is by definition eventually consistent at best. Ericsson Internal | 2013-06-03 | Page 33
  • 34. Partition Tolerance / Availability › “The network will be allowed to lose arbitrarily many messages sent from one node to another” [..] › “For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response ” Gilbert and Lynch, SIGACT 2002 CP: Requests will complete at nodes that have quorum AP: Requests will complete at any node possibly violating consistency High latency ~= Partition Ericsson Internal | 2013-06-03 | Page 34
  • 35. HBase Bulk Processing: Event Processing & Aggregation. › 100 million rows. Queries evaluated: SELECT col1 FROM table; SELECT SUM(col1) FROM table WHERE col2=val2 GROUP BY col3. Ericsson Internal | 2013-06-03 | Page 35