Nowadays developed NoSQL systems offer a number of advantages:
• the ability to horizontally scale out throughput over many servers,
• a simple call level interface or protocol,
• supporting weaker consistency models in contrast to ACID (Atomicity, Consistency, Isolation, Durability) guaranteed properties for transactions in most traditional RDBMS (Relational Database Management System)
• efficient use of distributed indexes and RAM for data storage,
I Have a NoSQL Toaster - Troy .NET User Group - July 2017Matthew Groves
The document provides an overview of NoSQL databases and Couchbase. It begins with an introduction from the presenter and then discusses why organizations adopt NoSQL databases and the types of NoSQL databases. The bulk of the document focuses on document databases and Couchbase, covering common operations, the use of N1QL, and example applications. It also briefly discusses key-value, columnar, and graph databases. The document concludes with information on where to find Couchbase and answers to frequently asked questions.
My toaster stores data without SQL and without tables. But making a choice based on what something doesn’t have isn’t terribly useful. “NoSQL” is an increasingly inaccurate catch-all term that covers a lot of different types of data storage. Let’s make more sense of this new breed of database management systems and go beyond the buzzword. In this session, the four main data models that make up the NoSQL movement will be covered: key-value, document, columnar and graph. How they differ and when you might want to use each one will be discussed.
This session will be looking at the whole ecosystem, with a more detailed focus on Couchbase, Cassandra, Riak KV, and Neo4j.
I have a NoSQL Toaster - ConnectJS - October 2016Matthew Groves
NoSQL is a catch-all term that covers a lot of different types of data storage. Is it really helpful to group them together by one thing they don't have? Think about it like this: my toaster is as much NoSQL as any database! So, how can we make more sense of this new breed of database management systems?
This document summarizes a presentation about MongoDB given by Sean Laurent of StudyBlue, Inc. It introduces MongoDB and its features, including its document-oriented data model, flexible schemas, querying and updating capabilities, indexing, drivers, replication and sharding for scalability. Useful MongoDB tools are also discussed like the mongo shell, mongostat, mongotop and MMS monitoring. The document concludes with MongoDB's strengths like horizontal scaling and rapidly evolving schemas, and weaknesses like not being a drop-in replacement for SQL and limitations around transactions.
Exploring, understanding and monitoring macOS activity with osqueryZachary Wasserman
How can osquery help with security, devops, compliance and IT?
This talk from MacDevopsYVR 2018 provides an introduction to osquery for mac administrators (and is relevant to a wider audience).
A story of Netflix and AB Testing in the User Interface using DynamoDB - DAT3...Amazon Web Services
Netflix runs hundreds of multivariate A/B tests a year, many of which help personalize the experience in the UI. This causes an exponential growth in the number of user experiences served to members, with each unique experience resulting in a unique JS/CSS bundle. Pre-publishing millions of permutations to the CDN for each build of each UI simply does not work at Netflix scale. In this session, we discuss how we built, designed, and scaled a brand new Node.js service, Codex. Its sole responsibility is to build personalized JS/CSS bundles on the fly for members as they move through the Netflix user experience. We’ve learned a ton about building a horizontally scalable Node.js microservice using core AWS services. Codex depends on Amazon S3 and Amazon DynamoDB to meet the streaming needs of our 100 million customers.
CouchDB is an open-source document-oriented NoSQL database that uses JSON to store data. It was created by Damien Katz in 2005 and became an Apache project in 2008. CouchDB stores documents in databases and provides a RESTful API for reading, adding, editing and deleting documents. It uses MVCC for concurrency and handles updates in a lockless and optimistic manner. CouchDB follows the CAP theorem and can be partitioned across multiple servers for availability. It uses MapReduce to index and query documents through JavaScript views. Replication allows synchronizing copies of databases by comparing changes. Data can also be migrated to mobile clients through integrations.
This document introduces Couchbase, an open-source distributed operational database. It discusses how Couchbase provides scalability and high performance through its architecture which allows independent scaling of data, querying, and indexing workloads. It also highlights Couchbase capabilities like JSON document storage, N1QL querying, asynchronous writes, SDKs, full-text search, and cross data center replication. Examples of Couchbase uses at large companies like eBay are also presented.
I Have a NoSQL Toaster - Troy .NET User Group - July 2017Matthew Groves
The document provides an overview of NoSQL databases and Couchbase. It begins with an introduction from the presenter and then discusses why organizations adopt NoSQL databases and the types of NoSQL databases. The bulk of the document focuses on document databases and Couchbase, covering common operations, the use of N1QL, and example applications. It also briefly discusses key-value, columnar, and graph databases. The document concludes with information on where to find Couchbase and answers to frequently asked questions.
My toaster stores data without SQL and without tables. But making a choice based on what something doesn’t have isn’t terribly useful. “NoSQL” is an increasingly inaccurate catch-all term that covers a lot of different types of data storage. Let’s make more sense of this new breed of database management systems and go beyond the buzzword. In this session, the four main data models that make up the NoSQL movement will be covered: key-value, document, columnar and graph. How they differ and when you might want to use each one will be discussed.
This session will be looking at the whole ecosystem, with a more detailed focus on Couchbase, Cassandra, Riak KV, and Neo4j.
I have a NoSQL Toaster - ConnectJS - October 2016Matthew Groves
NoSQL is a catch-all term that covers a lot of different types of data storage. Is it really helpful to group them together by one thing they don't have? Think about it like this: my toaster is as much NoSQL as any database! So, how can we make more sense of this new breed of database management systems?
This document summarizes a presentation about MongoDB given by Sean Laurent of StudyBlue, Inc. It introduces MongoDB and its features, including its document-oriented data model, flexible schemas, querying and updating capabilities, indexing, drivers, replication and sharding for scalability. Useful MongoDB tools are also discussed like the mongo shell, mongostat, mongotop and MMS monitoring. The document concludes with MongoDB's strengths like horizontal scaling and rapidly evolving schemas, and weaknesses like not being a drop-in replacement for SQL and limitations around transactions.
Exploring, understanding and monitoring macOS activity with osqueryZachary Wasserman
How can osquery help with security, devops, compliance and IT?
This talk from MacDevopsYVR 2018 provides an introduction to osquery for mac administrators (and is relevant to a wider audience).
A story of Netflix and AB Testing in the User Interface using DynamoDB - DAT3...Amazon Web Services
Netflix runs hundreds of multivariate A/B tests a year, many of which help personalize the experience in the UI. This causes an exponential growth in the number of user experiences served to members, with each unique experience resulting in a unique JS/CSS bundle. Pre-publishing millions of permutations to the CDN for each build of each UI simply does not work at Netflix scale. In this session, we discuss how we built, designed, and scaled a brand new Node.js service, Codex. Its sole responsibility is to build personalized JS/CSS bundles on the fly for members as they move through the Netflix user experience. We’ve learned a ton about building a horizontally scalable Node.js microservice using core AWS services. Codex depends on Amazon S3 and Amazon DynamoDB to meet the streaming needs of our 100 million customers.
CouchDB is an open-source document-oriented NoSQL database that uses JSON to store data. It was created by Damien Katz in 2005 and became an Apache project in 2008. CouchDB stores documents in databases and provides a RESTful API for reading, adding, editing and deleting documents. It uses MVCC for concurrency and handles updates in a lockless and optimistic manner. CouchDB follows the CAP theorem and can be partitioned across multiple servers for availability. It uses MapReduce to index and query documents through JavaScript views. Replication allows synchronizing copies of databases by comparing changes. Data can also be migrated to mobile clients through integrations.
This document introduces Couchbase, an open-source distributed operational database. It discusses how Couchbase provides scalability and high performance through its architecture which allows independent scaling of data, querying, and indexing workloads. It also highlights Couchbase capabilities like JSON document storage, N1QL querying, asynchronous writes, SDKs, full-text search, and cross data center replication. Examples of Couchbase uses at large companies like eBay are also presented.
This document discusses NoSQL databases and how they can be used on Microsoft Azure. It introduces the presenters, Eva Gjeci and Vito Flavio Lorusso, as technology evangelists for Microsoft focused on Azure. It then provides an overview of different types of NoSQL databases and examples of how they can be implemented on Azure, including using virtual machines, Azure services like DocumentDB, Redis Cache, and Table Storage, as well as hosted database services. The document demonstrates setting up MongoDB and Cassandra on Azure virtual machines and using DocumentDB and its querying capabilities. It also discusses optimizing NoSQL database storage and high availability options on Azure.
Couchbase 101 provides an overview of Couchbase including:
- Key concepts of Couchbase such as its use as a key-value store and document store using JSON documents.
- Single node and cluster-wide operations for reading, writing and updating documents.
- Cross data center replication (XDCR) to replicate data between geographically distributed clusters.
- Indexing and querying features including secondary indexes, views, and the new N1QL query language.
Slides: NoSQL Data Modeling Using JSON Documents – A Practical ApproachDATAVERSITY
After three decades of relational data modeling, everyone’s pretty comfortable with schemas, tables, and entity-relationships. As more and more Global 2000 companies choose NoSQL databases to power their Digital Economy applications, they need to think about how to best model their data. How do they move from a constrained, table-driven model to an agile, flexible data model based on JSON documents?
This webinar is intended for architects and application developers who want to learn about new JSON document data modeling approaches, techniques, and best practices. This webinar will show you how to get started building a JSON document data model, how to migrate a table-based data model to JSON documents, and how to optimize your design to enable fast query performance.
This webinar will provide practical, experience-based advice and best practices for modeling JSON documents, including:
- When to embed or not embed objects in your JSON document
- Data modeling using a practical data access pattern approach
- Indexing your JSON documents
- Querying your data using N1QL (SQL for JSON)
Using MongoDB to Build a Fast and Scalable Content RepositoryMongoDB
Presented by Mike Obrebski, Senior Solution Architect, Nuxeo
MongoDB can be used in the Nuxeo Platform as a replacement for traditional SQL databases. Nuxeo's content repository, which is the cornerstone of this open source software platform, can now completely rely on MongoDB for data storage. This presentation will explain the motivation for using MongoDB and will discuss different implementation strategies. In this session, you will learn more about the migrations to MongoDB and how we were able to achieve increased performance gains.
This document provides an overview of NoSQL and MongoDB. It discusses trends driving the adoption of NoSQL databases like increasing data sizes, more connectedness, and individualization. It covers the different types of NoSQL databases and MongoDB in particular. Key concepts discussed include the CAP theorem, MongoDB's document-oriented data model, and basic CRUD operations in MongoDB using the shell.
This document provides an overview of PostCSS, including what it is, how it works, popular plugins, and how to create your own PostCSS plugin. Some key points:
- PostCSS is a tool for transforming CSS with JS plugins that parses CSS into an AST, passes it through plugins, and outputs modified CSS. It allows for variables, mixins, future syntax support, and more.
- Popular plugins include Autoprefixer, CSSnext, PreCSS, StyleLint, PostCSS Assets, and CSSNano.
- To create a plugin, make a node module with index.js, require PostCSS, and use the PostCSS API to modify the AST and output CSS.
This document provides an overview of Apache Cassandra presented by Christopher Batey, a Technical Evangelist for Apache Cassandra. It begins with introductions and background on Batey. It then covers key aspects of Cassandra including distributed databases, Cassandra use cases, replication, fault tolerance, data modeling with Cassandra Query Language (CQL) and the Java driver. Examples are provided around modeling customer event data in Cassandra and querying that data.
A decade ago, the database was assumed to be a solved problem. Relational databases (PostgreSQL, MySQL, SQLite to name a few) were dominating the database market and hierarchical databases (LDAP, DNS) where regarded as niche solutions. The NoSQL revolution surely changed the concept of what a database can be. At the same time, the popularity of mobile devices exploded. This talk will dive into how data structures are persisted and queried on mobile devices today, and try to revive the old question: is the database really a solved problem?
- Amazon EKS and AWS Fargate were highlighted as two services for running containers on AWS.
- Amazon EKS allows users to easily run Kubernetes on AWS and manages the control plane, while AWS Fargate allows running containers without managing servers.
- AWS Fargate launches containers without needing to provision or manage servers, providing a simpler experience for developing and operating containerized applications.
This document discusses Docker containers and CoreOS. It summarizes Sebastien Goasguen's background working with high performance computing, cloud computing, and various open source projects. It then discusses how Docker simplifies application deployment and portability using containers and image sharing. CoreOS is introduced as a Linux distribution optimized for Docker with tools like etcd and Fleet for managing distributed applications across containers. Kubernetes is presented as a system for orchestrating Docker containers across multiple hosts and providing services like replication and high availability.
This talk will cover lessons learned at Community Engine regarding MongoDB, including: why we moved away from an Hybrid solution using SQL and MongoDB; an outline of the technologies and what we learned using MongoDB on Amazon Web Services; the MongoDB C# driver; MongoDB with SOLR for Full Text Search; how we do migration, deployment and more.
The document introduces cloud computing and cloud services. It defines cloud computing as using computing resources delivered over a network as a service. It then provides information about Radu Vunvulea and his background. The document outlines an agenda to cover cloud computing topics but does not fill in the agenda items. It also includes sections on cloud characteristics, cloud providers, market forecasts, Azure components, execution models, machine sizes, data management, queues, caching, and other Azure services.
[JSDC 2016] Codex: Conditional Modules Strike BackAlex Liu
Netflix runs hundreds of multivariate AB tests a year, many of which help personalize the experience in the UI. This causes an exponential growth in the number of user experiences we serve to members, with each unique experience resulting in a unique JS/CSS bundle. Pre-publishing million of permutations to the CDN for each build of each UI simply does not work at Netflix scale.
Instead, we've taken a novel approach by standing up a brand new Node.js service: Codex. Codex's sole responsibility is to build personalized JS/CSS bundles on the fly for our members as they move through the Netflix user experience. This frees up our UI teams to innovate rapidly on the UI itself, without having to worry about the costs of infrastructure and the complexity of pre-publishing to the CDN.
As we stood up Codex, we learned a ton about building a horizontally scalable Node.js microservice. This talk is the story of how we built, designed, and scaled that service to meet the needs of our 80 million customers.
A Presentation on MongoDB Introduction - HabilelabsHabilelabs
This document introduces MongoDB, an open-source document-oriented database. It discusses how MongoDB stores data as JSON-like documents rather than in tables, supports dynamic schemas, and is horizontally scalable. Some common use cases for MongoDB are also listed, including single view applications, IoT, mobile, real-time analytics, personalization, catalogs, and content management. The document concludes by covering how Habilelabs uses MongoDB with Node.js for REST API projects.
Full stack development with Node and NoSQL - Austin Node.JS Group - October ...Matthew Groves
In this session, we will talk about what is different about this generation of web applications and how a solid development approach must consider the latency, throughput and interactivity demand by users across both mobile devices, web browsers, and IoT. We will demonstrate how to include Couchbase in such applications to support a flexible data model and easy scalability required for modern development. We will demonstrate how to create a full stack application focusing on the CEAN stack which is composed of Couchbase, Express Framework, AngularJS, and Node.js.
About Matt:
Matthew D. Groves is a guy who loves to code. It doesn't matter if it's C#, jQuery, or PHP: he'll submit pull requests for anything. He has been coding ever since he wrote a QuickBASIC point-of-sale app for his parent's pizza shop back in the 90s. He currently works as a Developer Advocate for Couchbase. His free time is spent with his family, watching the Reds, and getting involved in the developer community. He is the author of AOP in .NET (published by Manning), and is also a Microsoft MVP.
For more information on Couchbase, check out http://developer.couchbase.com
This document summarizes Vitaly Friedman's talk on responsive design techniques and tricks. The talk covered resolution independence using SVG/icon fonts, content choreography with Flexbox, compressive images that maintain quality at different sizes, conditional loading of assets based on breakpoints, and lazy loading of JavaScript and social buttons. It also discussed maintaining aspect ratios for images and videos across screens, and serving different video files for different devices. The overall message was that responsive design requires a new mindset and pragmatic solutions rather than rigid rules.
Slides from workshop held on 12/14 in Asbury Park, NJ
http://www.meetup.com/Jersey-Shore-Tech/events/148118762/?gj=ro2_e&a=ro2_gnl&rv=ro2_e&_af_eid=148118762&_af=event
Couchbase Chennai Meetup: Developing with Couchbase- made easyKarthik Babu Sekar
This session provided an overview of Couchbase Solutions and whats latest and greatest in the new release. This session also talks about how easy is to develop with Couchbase and query the database
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...Alex Pruden
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty, is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
More Related Content
Similar to i_have_a_nosql_toaster_-_qcon_-_june_2017.pptx
This document discusses NoSQL databases and how they can be used on Microsoft Azure. It introduces the presenters, Eva Gjeci and Vito Flavio Lorusso, as technology evangelists for Microsoft focused on Azure. It then provides an overview of different types of NoSQL databases and examples of how they can be implemented on Azure, including using virtual machines, Azure services like DocumentDB, Redis Cache, and Table Storage, as well as hosted database services. The document demonstrates setting up MongoDB and Cassandra on Azure virtual machines and using DocumentDB and its querying capabilities. It also discusses optimizing NoSQL database storage and high availability options on Azure.
Couchbase 101 provides an overview of Couchbase including:
- Key concepts of Couchbase such as its use as a key-value store and document store using JSON documents.
- Single node and cluster-wide operations for reading, writing and updating documents.
- Cross data center replication (XDCR) to replicate data between geographically distributed clusters.
- Indexing and querying features including secondary indexes, views, and the new N1QL query language.
Slides: NoSQL Data Modeling Using JSON Documents – A Practical ApproachDATAVERSITY
After three decades of relational data modeling, everyone’s pretty comfortable with schemas, tables, and entity-relationships. As more and more Global 2000 companies choose NoSQL databases to power their Digital Economy applications, they need to think about how to best model their data. How do they move from a constrained, table-driven model to an agile, flexible data model based on JSON documents?
This webinar is intended for architects and application developers who want to learn about new JSON document data modeling approaches, techniques, and best practices. This webinar will show you how to get started building a JSON document data model, how to migrate a table-based data model to JSON documents, and how to optimize your design to enable fast query performance.
This webinar will provide practical, experience-based advice and best practices for modeling JSON documents, including:
- When to embed or not embed objects in your JSON document
- Data modeling using a practical data access pattern approach
- Indexing your JSON documents
- Querying your data using N1QL (SQL for JSON)
Using MongoDB to Build a Fast and Scalable Content RepositoryMongoDB
Presented by Mike Obrebski, Senior Solution Architect, Nuxeo
MongoDB can be used in the Nuxeo Platform as a replacement for traditional SQL databases. Nuxeo's content repository, which is the cornerstone of this open source software platform, can now completely rely on MongoDB for data storage. This presentation will explain the motivation for using MongoDB and will discuss different implementation strategies. In this session, you will learn more about the migrations to MongoDB and how we were able to achieve increased performance gains.
This document provides an overview of NoSQL and MongoDB. It discusses trends driving the adoption of NoSQL databases like increasing data sizes, more connectedness, and individualization. It covers the different types of NoSQL databases and MongoDB in particular. Key concepts discussed include the CAP theorem, MongoDB's document-oriented data model, and basic CRUD operations in MongoDB using the shell.
This document provides an overview of PostCSS, including what it is, how it works, popular plugins, and how to create your own PostCSS plugin. Some key points:
- PostCSS is a tool for transforming CSS with JS plugins that parses CSS into an AST, passes it through plugins, and outputs modified CSS. It allows for variables, mixins, future syntax support, and more.
- Popular plugins include Autoprefixer, CSSnext, PreCSS, StyleLint, PostCSS Assets, and CSSNano.
- To create a plugin, make a node module with index.js, require PostCSS, and use the PostCSS API to modify the AST and output CSS.
This document provides an overview of Apache Cassandra presented by Christopher Batey, a Technical Evangelist for Apache Cassandra. It begins with introductions and background on Batey. It then covers key aspects of Cassandra including distributed databases, Cassandra use cases, replication, fault tolerance, data modeling with Cassandra Query Language (CQL) and the Java driver. Examples are provided around modeling customer event data in Cassandra and querying that data.
A decade ago, the database was assumed to be a solved problem. Relational databases (PostgreSQL, MySQL, SQLite to name a few) were dominating the database market and hierarchical databases (LDAP, DNS) where regarded as niche solutions. The NoSQL revolution surely changed the concept of what a database can be. At the same time, the popularity of mobile devices exploded. This talk will dive into how data structures are persisted and queried on mobile devices today, and try to revive the old question: is the database really a solved problem?
- Amazon EKS and AWS Fargate were highlighted as two services for running containers on AWS.
- Amazon EKS allows users to easily run Kubernetes on AWS and manages the control plane, while AWS Fargate allows running containers without managing servers.
- AWS Fargate launches containers without needing to provision or manage servers, providing a simpler experience for developing and operating containerized applications.
This document discusses Docker containers and CoreOS. It summarizes Sebastien Goasguen's background working with high performance computing, cloud computing, and various open source projects. It then discusses how Docker simplifies application deployment and portability using containers and image sharing. CoreOS is introduced as a Linux distribution optimized for Docker with tools like etcd and Fleet for managing distributed applications across containers. Kubernetes is presented as a system for orchestrating Docker containers across multiple hosts and providing services like replication and high availability.
This talk will cover lessons learned at Community Engine regarding MongoDB, including: why we moved away from an Hybrid solution using SQL and MongoDB; an outline of the technologies and what we learned using MongoDB on Amazon Web Services; the MongoDB C# driver; MongoDB with SOLR for Full Text Search; how we do migration, deployment and more.
The document introduces cloud computing and cloud services. It defines cloud computing as using computing resources delivered over a network as a service. It then provides information about Radu Vunvulea and his background. The document outlines an agenda to cover cloud computing topics but does not fill in the agenda items. It also includes sections on cloud characteristics, cloud providers, market forecasts, Azure components, execution models, machine sizes, data management, queues, caching, and other Azure services.
[JSDC 2016] Codex: Conditional Modules Strike BackAlex Liu
Netflix runs hundreds of multivariate AB tests a year, many of which help personalize the experience in the UI. This causes an exponential growth in the number of user experiences we serve to members, with each unique experience resulting in a unique JS/CSS bundle. Pre-publishing million of permutations to the CDN for each build of each UI simply does not work at Netflix scale.
Instead, we've taken a novel approach by standing up a brand new Node.js service: Codex. Codex's sole responsibility is to build personalized JS/CSS bundles on the fly for our members as they move through the Netflix user experience. This frees up our UI teams to innovate rapidly on the UI itself, without having to worry about the costs of infrastructure and the complexity of pre-publishing to the CDN.
As we stood up Codex, we learned a ton about building a horizontally scalable Node.js microservice. This talk is the story of how we built, designed, and scaled that service to meet the needs of our 80 million customers.
A Presentation on MongoDB Introduction - HabilelabsHabilelabs
This document introduces MongoDB, an open-source document-oriented database. It discusses how MongoDB stores data as JSON-like documents rather than in tables, supports dynamic schemas, and is horizontally scalable. Some common use cases for MongoDB are also listed, including single view applications, IoT, mobile, real-time analytics, personalization, catalogs, and content management. The document concludes by covering how Habilelabs uses MongoDB with Node.js for REST API projects.
Full stack development with Node and NoSQL - Austin Node.JS Group - October ...Matthew Groves
In this session, we will talk about what is different about this generation of web applications and how a solid development approach must consider the latency, throughput and interactivity demand by users across both mobile devices, web browsers, and IoT. We will demonstrate how to include Couchbase in such applications to support a flexible data model and easy scalability required for modern development. We will demonstrate how to create a full stack application focusing on the CEAN stack which is composed of Couchbase, Express Framework, AngularJS, and Node.js.
About Matt:
Matthew D. Groves is a guy who loves to code. It doesn't matter if it's C#, jQuery, or PHP: he'll submit pull requests for anything. He has been coding ever since he wrote a QuickBASIC point-of-sale app for his parent's pizza shop back in the 90s. He currently works as a Developer Advocate for Couchbase. His free time is spent with his family, watching the Reds, and getting involved in the developer community. He is the author of AOP in .NET (published by Manning), and is also a Microsoft MVP.
For more information on Couchbase, check out http://developer.couchbase.com
This document summarizes Vitaly Friedman's talk on responsive design techniques and tricks. The talk covered resolution independence using SVG/icon fonts, content choreography with Flexbox, compressive images that maintain quality at different sizes, conditional loading of assets based on breakpoints, and lazy loading of JavaScript and social buttons. It also discussed maintaining aspect ratios for images and videos across screens, and serving different video files for different devices. The overall message was that responsive design requires a new mindset and pragmatic solutions rather than rigid rules.
Slides from workshop held on 12/14 in Asbury Park, NJ
http://www.meetup.com/Jersey-Shore-Tech/events/148118762/?gj=ro2_e&a=ro2_gnl&rv=ro2_e&_af_eid=148118762&_af=event
Couchbase Chennai Meetup: Developing with Couchbase- made easyKarthik Babu Sekar
This session provided an overview of Couchbase Solutions and whats latest and greatest in the new release. This session also talks about how easy is to develop with Couchbase and query the database
Similar to i_have_a_nosql_toaster_-_qcon_-_june_2017.pptx (20)
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...Alex Pruden
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty, is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor IvaniukFwdays
At this talk we will discuss DDoS protection tools and best practices, discuss network architectures and what AWS has to offer. Also, we will look into one of the largest DDoS attacks on Ukrainian infrastructure that happened in February 2022. We'll see, what techniques helped to keep the web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on Ukraine experience
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyScyllaDB
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsDianaGray10
Join us to learn how UiPath Apps can directly and easily interact with prebuilt connectors via Integration Service--including Salesforce, ServiceNow, Open GenAI, and more.
The best part is you can achieve this without building a custom workflow! Say goodbye to the hassle of using separate automations to call APIs. By seamlessly integrating within App Studio, you can now easily streamline your workflow, while gaining direct access to our Connector Catalog of popular applications.
We’ll discuss and demo the benefits of UiPath Apps and connectors including:
Creating a compelling user experience for any software, without the limitations of APIs.
Accelerating the app creation process, saving time and effort
Enjoying high-performance CRUD (create, read, update, delete) operations, for
seamless data management.
Speakers:
Russell Alfeche, Technology Leader, RPA at qBotic and UiPath MVP
Charlie Greenberg, host
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/temporal-event-neural-networks-a-more-efficient-alternative-to-the-transformer-a-presentation-from-brainchip/
Chris Jones, Director of Product Management at BrainChip , presents the “Temporal Event Neural Networks: A More Efficient Alternative to the Transformer” tutorial at the May 2024 Embedded Vision Summit.
The expansion of AI services necessitates enhanced computational capabilities on edge devices. Temporal Event Neural Networks (TENNs), developed by BrainChip, represent a novel and highly efficient state-space network. TENNs demonstrate exceptional proficiency in handling multi-dimensional streaming data, facilitating advancements in object detection, action recognition, speech enhancement and language model/sequence generation. Through the utilization of polynomial-based continuous convolutions, TENNs streamline models, expedite training processes and significantly diminish memory requirements, achieving notable reductions of up to 50x in parameters and 5,000x in energy consumption compared to prevailing methodologies like transformers.
Integration with BrainChip’s Akida neuromorphic hardware IP further enhances TENNs’ capabilities, enabling the realization of highly capable, portable and passively cooled edge devices. This presentation delves into the technical innovations underlying TENNs, presents real-world benchmarks, and elucidates how this cutting-edge approach is positioned to revolutionize edge AI across diverse applications.
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU und die Lizenzen nach dem CCB- und CCX-Modell sind für viele in der HCL-Community seit letztem Jahr ein heißes Thema. Als Notes- oder Domino-Kunde haben Sie vielleicht mit unerwartet hohen Benutzerzahlen und Lizenzgebühren zu kämpfen. Sie fragen sich vielleicht, wie diese neue Art der Lizenzierung funktioniert und welchen Nutzen sie Ihnen bringt. Vor allem wollen Sie sicherlich Ihr Budget einhalten und Kosten sparen, wo immer möglich. Das verstehen wir und wir möchten Ihnen dabei helfen!
Wir erklären Ihnen, wie Sie häufige Konfigurationsprobleme lösen können, die dazu führen können, dass mehr Benutzer gezählt werden als nötig, und wie Sie überflüssige oder ungenutzte Konten identifizieren und entfernen können, um Geld zu sparen. Es gibt auch einige Ansätze, die zu unnötigen Ausgaben führen können, z. B. wenn ein Personendokument anstelle eines Mail-Ins für geteilte Mailboxen verwendet wird. Wir zeigen Ihnen solche Fälle und deren Lösungen. Und natürlich erklären wir Ihnen das neue Lizenzmodell.
Nehmen Sie an diesem Webinar teil, bei dem HCL-Ambassador Marc Thomas und Gastredner Franz Walder Ihnen diese neue Welt näherbringen. Es vermittelt Ihnen die Tools und das Know-how, um den Überblick zu bewahren. Sie werden in der Lage sein, Ihre Kosten durch eine optimierte Domino-Konfiguration zu reduzieren und auch in Zukunft gering zu halten.
Diese Themen werden behandelt
- Reduzierung der Lizenzkosten durch Auffinden und Beheben von Fehlkonfigurationen und überflüssigen Konten
- Wie funktionieren CCB- und CCX-Lizenzen wirklich?
- Verstehen des DLAU-Tools und wie man es am besten nutzt
- Tipps für häufige Problembereiche, wie z. B. Team-Postfächer, Funktions-/Testbenutzer usw.
- Praxisbeispiele und Best Practices zum sofortigen Umsetzen
Discover top-tier mobile app development services, offering innovative solutions for iOS and Android. Enhance your business with custom, user-friendly mobile applications.
How information systems are built or acquired puts information, which is what they should be about, in a secondary place. Our language adapted accordingly, and we no longer talk about information systems but applications. Applications evolved in a way to break data into diverse fragments, tightly coupled with applications and expensive to integrate. The result is technical debt, which is re-paid by taking even bigger "loans", resulting in an ever-increasing technical debt. Software engineering and procurement practices work in sync with market forces to maintain this trend. This talk demonstrates how natural this situation is. The question is: can something be done to reverse the trend?
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Dandelion Hashtable: beyond billion requests per second on a commodity serverAntonios Katsarakis
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite efforts to optimize hashtables, that go as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state-of-the-art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open-addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line-chaining. This design offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. In a commodity server and a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
6. ~3200 BC: invention of writing in Mesopotamia
~300 BC: Library of Alexandria
~1041 AD: Movable type invented inChina
1450: Movable type invented in Europe
1822: Babbage’s paper on the application of difference engines
1943: Colossus, the first programmable digital computer
1957-1960: IBM SABRE, the first commercial use of a database
1970: EF Codd proposes the relational database model
1974: Ingres released
1979: Commercial launch of Oracle database
1988: Object databases appear
1989: Microsoft SQL Server version 1
1991: HTTP v0.9
1995: MySQL’s initial release
2005: CouchDB released
2006: Google’s BigTable paper
2007: Amazon’s Dynamo paper
2008-2009: The NoSQL Cambrian explosion: Riak, MongoDB,Cassandra, Redis
2010: Couchbase arrives on the scene
30. "ClientProductCode","Title","Quantity"
"TShirtCBS","Couchbase logo t-shirt (small)",29
"TShirtCBM","Couchbase logo t-shirt (medium)",72
"TShirtCBL","Couchbase logo t-shirt (large)",34
"TShirtCBXL","Couchbase logo t-shirt (extra large)",12
"StickerCBLogo","Couchbase logo sticker",256
"PenCBLogo","Couchbase logo pen",365
32. private static void LoadStockList() {
var csv = new CsvReader(File.OpenText("swag.csv"));
while (csv.Read()) {
var record = csv.GetRecord<dynamic>();
var document = new Document<dynamic> {
Id = record.ClientProductCode,
Content = new {
record.Title,
record.Quantity
}
};
_bucket.Insert(document);
Console.WriteLine($"Added document '{record.ClientProductCode}'");
}
}
This session was originally conceived by Matthew Revell
I have modified it, but it is largely based on his work, so I want to give him credit
We're here because "relational" is the norm
So right away I can tell the age of my audience
I took some liberties with my own doodle of the QCon logo
I hope you'll forgive me
But Couchbase is happy to sponsor the event, please come by the booth if you haven't already
We ‘always’ been doing SQL, anything else must be an aberration?
Well I’m not so sure.
In 3200 B.C. we have the first evidence of human writing
Then later the printing press
The first commercial database is in 1960 (SABRE by American Airlines and IBM making a better way to handle airline ticket bookings)
It was not relational, it was more like a filesystem hierarchy
1970 – EF Codd proposes the relational db model, and came up with SQL later
1979 – the necessary sacrifices to the dark forces were made and the Oracle database is released
So maybe around 1990, relational databases become “the norm”, about 20-30 years to where we default to stick things in tables
In 2005, people start looking at old ideas (because there’s nothing new underneath the sun), and a document database was created called CouchDB (which was a different spin on Lotus Notes)
2008 – explosion of NoSQL databases, influenced by two academic papers from Google and Amazon, who were losing money because they couldn’t keep relational DBs online – so Amazon created a massively distributed, fault tolerant, highly available, eventually consistent db called Dynamo, and you can use a version of that today, which is called DynamoDB
So these papers started a lot of open source activity, and we’ll look at those in a bit more detail today
2010 – the heavens open and Couchbase decends to earth (disclaimer, I work for Couchbase)
That’s why I say this toaster is a NoSQL Toaster!
I can operate it fully without using any SQL query on it
It can store data, it’s not tables, it’s bread
I can get that data back out (as toast)
It’s a good marketing shorthand to say “these databases are different than what we’ve been doing for 30 years”
But defining something by what it’s not – only useful for a little while
So let’s dive in and figure out what NoSQL means?
Is it because they’re scalable?
Well… not really… not all of them are scalable (and sometimes they aren’t even if their creators say they are)
Is it commodity hardware?
A lot of them can use commodity hardware for flexible and affordable scaling
THEY DON’T HAVE TO. Not all NoSQL databases were designed with scaling and commodity in mind.
But this is the sort of thing that people mention a lot with nosql dbs
So what is it that defines a NoSQL database?
Is it because they’re easy to develop against?
Well… not really… if you try Cassandra, it’s not straightforward totally easy, can still be a bit inflexible
Does it have to do with ACID vs BASE?
Atomic
Consistent
Isolated
Durable
Vs
Basic Availability
Soft-state
Eventual Consistency
Well, no… FoundationDB is a NoSQL database that is ACID compliant (Apple purchased them)
Many NoSQL databases tradeoff some level of ACIDity for scalability or performance reasons
Is it that there is no Schema?
Maybe… kinda… sorta… even if your database isn’t imposing a rigid schema on you
This doesn’t mean there isn’t a schema, it’s just that the database isn’t enforcing one
You as a developer need to impose a schema on it
Is it that they are denormalized?
This is pretty close.
Data doesn’t need to be normalized, and there may be some benefit to non-normalized data (performance or flexibility or both)
Doesn’t mean you can’t split data up, you just do it differently/broader
Before I get into some of the details about NoSQL databases
Here’s a disclaimer
If you’re starting out on a new project or refactoring an existing project, don’t use NoSQL
Because it’s cool, because there’s peer pressure, because you want more stuff on your resume
A lot of problems are SOLVED WELL by existing technology that you understand well and have a good toolset around
So if you have a problem that’s not going to scale massively, or it fits nicely into a relational model, and you can deal with object impedance mismatch and OR/Ms and you’re comfortable with that, then you might as well use a relational database and leave the road less traveled for another time
This is the Blobfish, some call it the ugliest fish in the world
But this is a picture at surface level. This is actually a deep sea fish
When it’s in its normal high-pressure environment it looks more like this
It’s still not going to win any beauty pageants, but it definitely looks more like a standard fish
My point is this: whatever data store you choose should be the right one for the job. There’s a term “Polyglot persistence” which just means that.
So maybe if part of your app is just saving log data, maybe that’s a good place to start with a nosql db. If another part is highly relational, then use a relational db.
Maybe stick with relational for the “database of record”, but use nosql for caching, or for session, or for shopping cart. Wherever it makes sense.
So maybe SQL is right for your project
But there are some cases where maybe relational ISN’T a good fit or at least not the best fit
With ANY system really
Vertical scaling is the easiest: faster server, more ram, beefier machine, but that gets expensive, impractical
Horizontal scaling: with relational databases, I’ve got two words for you “sharding scheme”
With SQL Server there are a variety of sharding methods (vertical, range, hash, directory), but you have to worry about joining across shards, denormalizing, rebalancing, what if you change the schema?
People do this, and people get it to work. Depending on your use case, it can be reasonable for certain use cases where the data can be easily divided, it can be a nightmare otherwise.
This is where distributed databases can help.
You can do this with relational databases, but many NoSQL databases are built with this in mind
If you have a database cluster, and you need additional capacity, you can rack up another cheap server and add it to the cluster.
Now:
You don’t have to upgrade your hardware
licensing may cost you less
Your capacity is flexible
There are some other benefits!
This is always hard with a relational database.
Keeping things online. Especially for writes.
CAP Theorum – Consistent Available Partition-tolerant, applies to when nodes go down or there’s a network split. When operating normally, there is no such tradeoff.
CAP theorem states that a distributed system can have three attributes
And that you can only guarantee two of them at a time
That word “guarantee” is in the academic sense
If your network and servers are working as expected, you get all three.
You might get all three to 99999 but you can't guarantee it
We’ll see more of this graph later
Something simple may not be so bad
But something complex…
Coming up with good schemas in the first place
And then maintaining them across the lifetime of your app
Has anyone had to take a database offline (or put into read-only) in order to do a schema migration?
In many cases, this just isn’t acceptable: to tell a user that you can’t use this because we’re doing schema maintenance
There are lots of types of NoSQL databases
These 4 seem to be the main categories that people are interested in, esp in open source world
And these will be the ones you’ll likely encounter in your research about what to use
Graph is still a bit of the odd one out
NoSQL databases are converging, and databases overall are converging
In Couchbase, we’ve introduced N1QL, which is a full SQL implementation PLUS SOME to query documents
Riak has introduced indexing and other features to help with querying
SQL Server and PostgreSQL have introduced some great JSON features
DocumentDB for Azure has a SQL implementation
So it’s an exciting time where databases are adapting and changing and converging
Logos for: MongoDB, CouchDB, MarkLogic, Couchbase, Lotus Notes, RethinkDB, OrientDB, Xbase, RavenDB, Existdb
These are the top two most relevant document databases
MongoDB has huge traction, Couchbase is easily the runner up (I’d say that even if they didn’t pay my mortgage)
It understands the format of the document you’re dealing with
It’s not just a bag you throw things in, it has understanding of the data
These are often the best general replacement for relational databases
MarkLogic is XML, I learned recently it can also do JSON
I guess there could be a YAML database or something else like that…
But no one likes XML (animation to cross it off or whatever)
BSON is a binary implementation of JSON that Mongo uses, can be exported to “Extended JSON”
so really what we’re talking about here is JSON
So what we’re talking about in document databases is storing JSON
It’s human readable, compact (compared to XML), can contain arrays, objects, hierarchies
It can do something to the document (or documents) without you having to
-pull it out
-do something to it
-put it back
Take a quick run through how couchbase handles documents
Here are some new terms in NoSQL
Buckets – is a collections of documents, or equivalent of a database
Documents – Kinda like tables, but not
Views – map/reduce functions to do interesting queries
N1QL – Couchbase has a query language that you can use to query/operate on documents as well
Here is a CSV of data for Swag
There’s a code, title/description, and quantity
Here’s some C# that pulls that in and puts it into couchbase
There’s a URL to one or more nodes
The bucket name
Taking the values from the CSV
I’m making the swag code into the document key
And I’m taking the other values, stuffing them into an anonymous object (which will be converted to JSON)
This is a screenshot of the Couchbase Console, it’s the admin view
And this is what it would look like if you imported into Couchbase
Pull out a single document with a GET statement
Document databases have these operations, typically simple crud operations
This is (initially) why these databases are called “nosql”, because you aren’t using SQL to do these core operations
Do more interesting stuff with M/R or with N1QL queries in Couchbase
Often what we’re doing with relational databases is starting with an object that has hierarchy, and twisting it around to get it to fit into database tables
If we can avoid it, why go through that pain of an impedence mismatch? Instead just put the whole thing into a document
Some things that couchbase is good for
Great for dev, less impedence mismatch
Scales well, works as a cluster, easy to ramp up – just rack up another server, join the cluster, and it redistributes data
Fast, because there is a built-in cache so you’ll often be reading/writing to RAM and not having to wait on the disk
Drawback: When nodes go down or network splits: highly consistent, but eventually available
Consistency means that even during a network partition, there is only a single up to date copy of the data.
Couchbase works this way. So, if a node goes down, you can no longer access that document.
That sounds dire, but with Couchbase, there are replica documents stored on other nodes. So, when the nodes goes down, you can get read-only copies. And, Couchbase has an “auto failover” feature. When a node is detected as being down, it can compensate by promoting the replicas to active documents. This does take time, so during that period you do not get availability.
Similar to document databases
Except the value is an opaque blob, the database doesn’t care what the value is and doesn’t reason about it
Couchbase can act as a key/value store
Amazon dynamodb is a service from amazon
Redis is an in-memory data store
Riak, which is a faithful implementation of the Dynamo paper from Amazon
Mix of in-memory and distributed data stores
Let’s focus on distributed data stores (cross out in-memory caches)
It’s “content agnostic”
It’s very flexible in what you can store, usually very fast, not terribly flexible for querying
I’m not a Riak expert, I’m not even a Riak user, I’ll try to represent it fairly
Connects to riak
Create a bucket called ‘actors’
Let’s store actors as the value and the role as the key
“James Bond” => “Sean Connery”
This is the gotcha with dynamo-style systems
You’ve got a number of servers and a number of apps writing to them and reading from them at the same time
Riak tries its best not to ever refuse a write
So let’s say you lose a bunch of servers, someone tripped over the power cord
And riak, dynamodb, etc, will carry on receiving the writes and taking over the workload for now
Or maybe there’s a network split; some apps are writing to one half, the other are writing to the other half
So you might have conflicting values
With a strongly consistent database, writes could be refused. This is trade-off with something like Riak
[animation]
So app 1 writes Sean Connery, app 2 writes Timothy Dalton, and maybe app 3 writes in Roger Moore
But the truth is: there can be only one!
So when we’re talking about distributed databases, there’s really only two options, Availability or Consistency
Availability means that when there is a network partition, you can still perform reads and writes on any piece of data
Even if that data lived on a server that can no longer be accessed
The question is, what happens if that node comes back? There’s potential for a conflict (also known as “siblings”)
You can resolve this conflict in a number of ways: at the database level with timestamp or last-write-wins
Or bubble it up to a user
Or write some sort of custom merging code
If availability is more important that consistency, this is the system you go with.
Riak has introduced a lot of cool stuff to help with this problem: Version Vectors, explicit siblings, and other stuff
But in many cases it’s going to be something that you have to think about, figure out, and fix
I would recommend you try out Riak
These come from google’s BigTable paper
These are the hard ones
Store data in columns instead of rows
These are some Columnar databases
For web devs working with open source tools, we focus on these two
Let’s have a very quick look at cassandra
It’s taken the data model idea from BigTable
But it’s taken data distribution from the dynamo paper
So you’ve got eventual consistency of something like Riak with data model that kinda looks relational but absolutely isn’t
Keyspaces are like databases
Column Families are a bit like tables
Rows are rows, but different than you know them
Columns are key-value pairs associated with a row
So what does that actually mean
This image borrowed from DataStax website
The keyspace is ‘blog’ (but please don’t use Cassandra to run a blog)
Zoom into column families
‘Users’ is a column family (kinda looks like a table)
Each row has a key
So if we get jbellis, we get name and state, which are columns (k/v pairs associated with the row)
They can be anything, each row doesn’t have to have same columns
You can do foreign keys, but manually
It looks relational, but it really has a key/value approach
Why do this?
Google likes it for analytics. Instead of updating something in place, you can just add another column (there is an upper limit to columns, but it’s large)
Let’s say we had a gas meter instead of a blog. Gas reader takes a reading every 15 minutes
So a column family could be a gas meter
A row could be a day
Each column would be every 15-minute reading
It’s very quick to do an aggregate of row, get an average reading per day, something like that
This is an example I took from the datastax.com site on getting started
This looks like SQL, but it’s CQL, and it’s more like a subset of SQL: it’s lacking things like JOINs and subqueries
Here I’m connecting to the cluster
Picking a keyspace
Executing an insert
And then selecting out that user and printing it to console
Impedance mismatch – don’t use it for a blog, use it where you need to throw a lot of data quickly, do range queries across large bits of data
Scales well – but apparently is an Ops headache, since it uses the JVM
So when we’re talking about distributed databases, there’s really only two options, Availability or Consistency
Availability means that when there is a network partition, you can still perform reads and writes on any piece of data
Even if that data lived on a server that can no longer be accessed
The question is, what happens if that node comes back? There’s potential for a conflict (also known as “siblings”)
You can resolve this conflict in a number of ways:
at the database level with timestamp or last-write-wins
At the client level, do a merge or even let the user resolve the conflict, you cannot guarantee that a merge will never have to be handled manually
If availability is more important that consistency, this is the system you go with.
Really big data: Hbase says “billions of rows times millions of columns”
Graph is kinda the odd one out, but worth mentioning
The others are trying to store run-of-the-mill data
With a graph, you’re storing data, but you’re also storing rich connections between the data
Graph databases are based on the ideas of Euler who a polymath in the 1700s, contributed to math, physics, astronomy, logic, etc
The “7 bridges of…” problem
So Euler came up with a whole area of math to deal with that problem
The problem is this: figure out a walk through the city that would cross each bridge once (and only once). You don’t have to start or stop at the same spot.
He proved that there was no way to do that, but in doing so Euler used this problem to establish the entire field of graph theory
Graph databases are all about the WAY that things are connected together
CosmosDb is an azure offering from Microsoft, it’s “multi-model”, so it can act as all the types we’ve talked about as well as graph
OrientDb is also multi-model, it can act as document database or graph
Graph databases are all about the WAY that things are connected together
So this image has some random-looking things
Each of the items are all different things, database only cares about the relationships between them
These relationships are directional
So you could ask “give me all the enemies of Luke Skywalker who have appeared in The Force Awakens”
Supermarket: “give me all the people who bought baby food and who also bought beer”, decide to give them discounts or coupons, etc
Images:
https://www.flickr.com/photos/mlemos/581278438
https://www.flickr.com/photos/pmiaki/5351186496
https://commons.wikimedia.org/wiki/File:Starwars-tatooine.jpg
https://www.flickr.com/photos/stevetroughton/16889877817
https://commons.wikimedia.org/wiki/File:Star_Wars_The_Force_Awakens.jpg
https://commons.wikimedia.org/wiki/File:MCM_London_May_15_-_Stormtrooper_(18246218651).jpg
https://commons.wikimedia.org/wiki/File:Empirestrikesback2.png
NoSQL is a term that I have a love/hate relationship with
Groups together a bunch of different databases into one big tent
"NoSQL" may mean something different to someone you're talking to
It's more helpful to drill into specifics: document, key/value, etc
So, for your next web project, take a minute to think about the database you’re going to use
Don’t use something just because it’s fashionable
Don’t use something just because you’ve always used it
Consider using multiple data stores
Think about what you’re going to need long term
Polyglot persistence
Mongo: Features N1QL, XDCR, Full Text Search, Mobile & Sync. Memory-first architecture and proven, easy scaling.
CouchDb: Couchbase started as a whole new piece of software that was basically a combination of memcache and CouchDb a long time ago, but has grown far beyond that. Couchbase isn’t a fork or vice versa. They share an acronym and they are both NoSQL. Like MySQL and SQL Server, for instance.
Open source apache license for community edition, enterprise edition on a faster release schedule, some advanced features, and support license.
Couchbase is software you can run in the cloud on a VM or on your own data center. CosmosDb is a manage cloud service, but there is a emulator you can run locally.