The document discusses scalable storage systems and key-value stores as an alternative to traditional databases. It provides an overview of vertical and horizontal scalability. Traditional databases are not well-suited for scalable systems due to their complexity, wasted features, and multi-step query processing. Key-value stores offer simpler data models and interfaces that are designed from the start for scaling across hundreds of machines. Performance comparisons show key-value stores significantly outperforming traditional databases. The document also outlines how key-value storage systems work at the aggregation and storage layers.
Storage Systems for High Scalable Systems Presentation - andyman3000
Presentation from http://www.hfadeel.com/Blog/?p=151 on the kinds of storage systems that players like Facebook or Google use for their extreme scalability requirements.
http://www.hfadeel.com/Blog/?p=151
1. Level 300. Storage Systems for Scalable Systems. Haytham ElFadeel, Researcher in Computer Sciences.
2. Agenda. Introduction. Glance at scalable systems. What are the available storage solutions? The problem with the current solutions. The problem with the database. The next generation of storage systems. Key-value store systems. Performance comparison. How it works. Discussion, Q/A.
3. Glance at Scalable Systems. Scalable systems: scalability is the ability to provide better performance when you add more computing power. The performance gained should be in proportion to the added computing power. Examples: Google, Yahoo, Facebook, Amazon, eBay, Orkut, Google App Engine, etc.
4. Glance at Scalable Systems. Types of scalability. Vertical scalability: adding resources within the same logical unit to increase capacity, for example adding more CPUs or expanding the storage or the memory. Horizontal scalability: adding multiple logical units of resources and making them work together as a single unit. Think of clustering, distributed systems, and load balancing.
5. Vertical Scalability vs. Horizontal Scalability. Vertical scaling: limited; hardware only. Horizontal scaling: not limited; software and hardware.
6. Vertical Scalability vs. Horizontal Scalability. Haytham ElFadeel quote: If you need scalability urgently, vertical scaling is probably the easiest route, but be aware that vertical scaling gets more and more expensive as you grow, and while infinite horizontal linear scalability is difficult to achieve, infinite vertical scalability is impossible.
7. Vertical Scalability vs. Horizontal Scalability. Haytham ElFadeel quote: On the other hand, horizontal scalability doesn’t require you to buy more and more expensive hardware. It’s meant to be scaled using commodity storage and server solutions. But horizontal scalability isn’t cheap either: the application has to be built from the ground up to run on multiple servers as a single application.
8. Glance at Scalable Systems. Facebook: more than 200,000,000 active users. 50,000 photos uploaded per minute. The most active social network on the Web. Facebook chat: the main challenge is maintaining the users' status. Load distribution should depend on the users and their friends, to avoid traveling between servers. Building a system that must scale from the start to serve 100,000,000 users is really hard.
9. Glance at Scalable Systems. Amazon: more than 10,000,000 transactions every holiday season. The reliability of the user's shopping cart is not optional. Google, Yahoo, Microsoft, Kngine, etc.: processing huge amounts of data, more than 1 TB. Sorting the index by rank value, which means sorting more than 1 TB of data. Saving the crawled web pages.
10. The Available Storage Solutions. Memory: just a data structure :) Disk: text file { XML, Protocol Buffer, JSON }; binary file { serialized, custom format }. Database: { MySQL, SQL Server, SQLite, Oracle }.
11. The Available Storage Solutions. Memory: just a data structure :) (what about capacity?). Disk: text file { XML, Protocol Buffer, JSON } (bad performance); binary file { serialized, custom format } (not portable, questions about performance). Database: { MySQL, SQL Server, SQLite, Oracle } (bad performance, complex, huge latency).
12. The Problem with the Database. Causes: Old and very complex system. Many wasted features. Many steps to process a SQL query. Needs administration, and more.
13. The Problem with the Database. Causes: old and very complex system. The RDBMS is a very complex system, much like an operating system: thread scheduling, deadlock monitor, resource manager, I/O manager, pages manager, execution plan manager, case manager, memory manager, transaction manager, etc. Most DBMS architectures, designs, and algorithms date from around the 1970s: different hardware and platform properties; old architecture, design, and algorithms. Please review resource #1.
14. The Problem with the Database. Causes: many wasted features. Today's systems have very rich feature sets, simply because their designers think that ‘one size fits all’: CLR types, CLR integration, replication, functions, policies, relations, transactions, stored procedures, ACID, etc. You can even call a web service from SQL Server! All this clutter makes the database look like a platform and a development environment.
15. The Problem with the Database. Causes: many steps to process the query. Parse the query. Build the expression tree and resolve the relational algebra expression. Optimize the expression tree. Choose the execution plan. Start executing. Please review resources #2, #3.
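The planning stage described in slide 15 can be observed from the outside. The following is a minimal sketch, assuming SQLite (via Python's standard `sqlite3` module) as a stand-in for the larger engines named in the slides; the table `kv` and its contents are made up for illustration. `EXPLAIN QUERY PLAN` asks the engine to report the plan it chose without running the query, which shows that even a single-key lookup goes through parsing and planning before execution.

```python
import sqlite3

# In-memory database; the table name and row are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO kv VALUES (?, ?)", ("user:42", "alice"))

query = "SELECT v FROM kv WHERE k = ?"

# Ask the engine which plan it picked. The human-readable description
# is the last column of each row; its wording varies by SQLite version.
plan_steps = [
    row[-1]
    for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("user:42",))
]

# Only now run the query itself.
result = conn.execute(query, ("user:42",)).fetchone()

for step in plan_steps:
    print("plan:", step)
print("result:", result)
```

A key-value store skips these stages entirely because a `get` can only ever mean one thing: look up this key.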
16. The Problem with the Database. Effects: bad performance (throughput, resource usage, latency); not scalable.
17. The Problem with the Database. Effects: bad performance (throughput, resource usage, latency). Even the fastest DBMS, MySQL, can’t provide more than 5,000 queries per second*. Add to this the resources consumed and the high latency. * Depends on the configuration.
18. The Problem with the Database. Effects: not scalable. The database is not designed to scale. Even if you add a new machine and partition the database, you will never get an acceptable performance improvement. Please review resource #1.
19. The Problem with the Database. The database gives us ACID. Atomicity: a transaction is all or nothing. Consistency: only valid data is written to the database. Isolation: all transactions behave as if they happened serially, and the data stays correct. Durability: what you write is what you get back; committed data survives failures.
20. The Problem with the Database. The problem with ACID is that it gives you too much; it trips you up when you are trying to scale a system across multiple nodes. Downtime is unacceptable, so your system needs to be reliable. Reliability requires multiple nodes to handle machine failures. To make a scalable system that can handle lots and lots of reads and writes, you need many more nodes.
21. The Problem with the Database. Once you try to scale ACID across many machines, you hit problems with network failures and delays. The algorithms don't work in a distributed environment at any acceptable speed. It’s a dead end.
22. The Next Generation of Storage Systems. Long ago, many research teams and companies discovered that the database is the main bottleneck: many wasted features, bad performance, and not designed for scalable systems.
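To make "all or nothing" and "only valid data is written" concrete, here is a small sketch using Python's standard `sqlite3` module on a single machine. The `accounts` table, the names, and the `transfer` helper are hypothetical and exist only for this illustration. The connection is used as a context manager, which commits if the block succeeds and rolls back if it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; returns False if the transfer is rejected."""
    try:
        with conn:  # commit on success, roll back on any exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint when funds are short.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


failed = transfer(conn, "alice", "bob", 150)     # rejected: alice has only 100
balances_after_failure = balances(conn)          # the credit to bob is undone too
succeeded = transfer(conn, "alice", "bob", 60)
balances_after_success = balances(conn)
```

In the rejected transfer the credit to `bob` has already been applied when the debit fails, and the rollback removes it again. Keeping that guarantee is cheap on one machine; the following slides argue that it becomes expensive once the two rows live on different nodes.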
23. The Next generation of Storage Systems Building large systems on top of a traditional RDBMS data storage layer is no longer good enough. This talk explores the landscape of new technologies available today to augment your data layer to improve performance and reliability. Please review resource #4
24. Key-Value Storage Systems. Simple data model: just key-value pairs. Every value is assigned to a key. No complex features such as relations, ACID, or SQL queries. Simple interface: Get(key), Put(key, value), Delete(key) (optional).
25. Key-Value Storage Systems. Designed from the start to scale to hundreds of machines. Designed to be reliable, even if 50% of the machines crash. No extra work is required to add a new machine: just plug it in and it will work in harmony with the rest. Many open source projects (C++, Java, Lisp).
29. Key-Value Storage Systems. Now you should make your decision: take the blue pill and see the truth, or take the red pill and stay in wonderland.
30. Key-Value Storage Systems. Key-value storage systems, and other systems, are built around the CAP concept. Consistency: your data is correct all the time; what you write is what you read. Availability: you can read and write your data all the time. Partition tolerance: if one or more nodes fail, the system still works, and it becomes consistent again when the failed nodes come back online.
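The whole interface from slide 24 fits in a few lines. This is a minimal in-memory sketch, not the API of any of the systems named in this deck; the class name `InMemoryKeyValueStore` and the choice of `str` keys with `bytes` values are assumptions made for the example.

```python
from typing import Dict, Optional


class InMemoryKeyValueStore:
    """Toy single-process store exposing the Get/Put/Delete interface."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Every value is assigned to exactly one key; a second put overwrites.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # A missing key yields None instead of raising.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Optional operation; reports whether the key existed.
        return self._data.pop(key, None) is not None


store = InMemoryKeyValueStore()
store.put("user:42", b"alice")
first_read = store.get("user:42")
deleted = store.delete("user:42")
second_read = store.get("user:42")
```

There is no query language, schema, or join to plan: the value is opaque to the store, and any structure inside it is the application's business.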
31. Key-Value Storage Systems. One node, performance comparison (Web). MySQL: 3,030 sets/second; 4,670 gets/second. Redis: 11,200 sets/second (3.7x MySQL); 9,840 gets/second (2.1x MySQL). Tokyo Tyrant: 9,030 sets/second (3.0x MySQL); 9,250 gets/second (2.0x MySQL). Please review resource #5.
32. Key-Value Storage Systems. Two high-end nodes, performance comparison (Web). Redis: 89,230 sets/second; 85,840 gets/second.
33. Key-Value Storage Systems. One node, performance comparison. SQL Server: 2,900 sets/second; 3,500 gets/second. Vina*: 10,100 sets/second (3.4x SQL Server); 9,970 gets/second (2.8x SQL Server). * Vina: key-value storage system used inside Kngine.
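The multipliers in slide 31 are simply each store's throughput divided by MySQL's for the same operation. The short calculation below reproduces them from the figures on the slide; the dictionary layout and the `speedup` helper are only for illustration.

```python
# Throughput figures copied from slide 31 (operations per second, one node).
mysql = {"sets": 3030, "gets": 4670}
candidates = {
    "Redis": {"sets": 11200, "gets": 9840},
    "Tokyo Tyrant": {"sets": 9030, "gets": 9250},
}


def speedup(candidate, baseline):
    """Ratio of candidate to baseline throughput, rounded to one decimal."""
    return {op: round(candidate[op] / baseline[op], 1) for op in baseline}


ratios = {name: speedup(figures, mysql) for name, figures in candidates.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio['sets']}x sets, {ratio['gets']}x gets")
```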
34. How It Works. Any key-value storage system consists of two primary layers: the aggregation layer and the storing layer.
35. How It Works. Any key-value storage system consists of two primary layers. Aggregation layer: manages the instances, replication, and distribution. Storing layer: one or many disk-based hash tables.
37. How It Works (Aggregation Layer). Receives the requests. Routes each request to the target node. Manages partitioning and replicas. Partitioning and replication are done by a consistent hashing algorithm (explained on the board). Please review resource #6.
38. Key-Value Storage Systems. Amazon Dynamo (paper). Facebook Cassandra (open source). Tokyo Cabinet/Tyrant (open source). Redis (open source). MongoDB (open source).
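A storing layer can be sketched with nothing more than an on-disk mapping. The example below uses Python's standard `dbm.dumb` module as a stand-in: it is a simple, portable on-disk key-value file, not a tuned disk-based hash table like the ones real systems use. The class name `DiskStore` and the file name `node0` are hypothetical.

```python
import dbm.dumb
import os
import tempfile
from typing import Optional


class DiskStore:
    """One storing-layer instance: a persistent key -> bytes mapping."""

    def __init__(self, path: str) -> None:
        # "c" opens the database for read/write, creating it if needed.
        self._db = dbm.dumb.open(path, "c")

    def put(self, key: str, value: bytes) -> None:
        self._db[key.encode("utf-8")] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._db.get(key.encode("utf-8"))

    def delete(self, key: str) -> bool:
        encoded = key.encode("utf-8")
        if encoded in self._db:
            del self._db[encoded]
            return True
        return False

    def close(self) -> None:
        self._db.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "node0")

    store = DiskStore(path)
    store.put("user:42", b"alice")
    store.close()

    # Reopen to show that the value was written to disk, not just memory.
    reopened = DiskStore(path)
    persisted = reopened.get("user:42")
    missing = reopened.get("user:99")
    reopened.close()
```

Each node runs one or more such instances; deciding which node holds which key is left to the aggregation layer.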
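The routing job of the aggregation layer can be sketched with a small consistent hash ring. This is an illustrative sketch, not the implementation used by Dynamo, Cassandra, or any other system named in this deck. The class name, the node names, the use of SHA-256, and the figure of 100 virtual points per node are all assumptions. Each node is hashed to many points on a ring; a key belongs to the first node found clockwise from the key's own hash, and replicas go to the next distinct nodes after it.

```python
import bisect
import hashlib
from typing import List, Set, Tuple


def ring_hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ConsistentHashRing:
    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._nodes: Set[str] = set()
        self._ring: List[Tuple[int, str]] = []  # sorted (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        if node in self._nodes:
            return
        self._nodes.add(node)
        # Several virtual points per node spread the load more evenly.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (ring_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._nodes.discard(node)
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def nodes_for(self, key: str, replicas: int = 1) -> List[str]:
        """Primary owner first, then the next distinct nodes clockwise."""
        if not self._ring:
            return []
        wanted = min(replicas, len(self._nodes))
        index = bisect.bisect_left(self._ring, (ring_hash(key), ""))
        owners: List[str] = []
        while len(owners) < wanted:
            node = self._ring[index % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
            index += 1
        return owners


ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
keys = [f"key{i}" for i in range(1000)]
before = {k: ring.nodes_for(k)[0] for k in keys}
replica_owners = ring.nodes_for("user:42", replicas=2)

ring.remove_node("node-b")  # simulate a machine leaving the cluster
after = {k: ring.nodes_for(k)[0] for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only keys that lived on the removed node change owner; everything else stays where it was. That property is what lets a cluster add or lose machines without reshuffling most of its data.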
40. References. #1 The End of an Architectural Era (It’s Time for a Complete Rewrite). Paper. #2 Database Systems, Paul Beynon-Davies. Book. #3 Inside SQL Server engine, MS Press. Book. #4 Drop ACID and Think About Data. Highscalability.com. #5 Redis vs MySQL vs Tokyo Tyrant. Colin Howe’s Blog. #6 Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web. Paper. #7 Dynamo: Amazon’s Highly Available Key-value Store. Paper. #8 Redis, Tokyo Tyrant project. #9 Consistent Hashing. Tom White Blog.
41. Resources. High Scalability blog: Highscalability.com. It’s all about innovation blog: Hfadeel.com/blog. All Things Distributed: Allthingsdistributed.com. Tom White blog: lexemetech.com.
42. Thanks… Dear all, all of my presentation content is open source. Please feel free to use, copy, and redistribute it.