The document discusses polyglot persistence: using multiple data persistence technologies, such as SQL and NoSQL databases, together. It notes that while relational databases have long dominated, NoSQL databases offer alternatives for needs such as large data volumes, high speed, and flexibility. The document outlines the different types of NoSQL databases and considerations for choosing between them, and advocates selecting databases based on their strengths, applying principles such as separation of layers and reusable services to integrate diverse data stores in a polyglot approach.
MongoDB has collections which contain documents with fields. Collections are grouped into databases. Documents can be queried, inserted, updated, and deleted. MongoDB does not use joins but instead supports embedding related data or denormalization. It is suitable for flexible schemas, growing data volumes, high write volumes, and distributed systems.
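The embedding-versus-referencing choice described above can be sketched with plain Python dicts standing in for MongoDB documents (all field and user names here are invented for illustration):

```python
# Embedded: related comments live inside the post document,
# so one read fetches everything together.
post_embedded = {
    "_id": 1,
    "title": "Intro to Documents",
    "author": "alice",
    "comments": [
        {"user": "bob", "text": "Nice post"},
        {"user": "carol", "text": "Thanks!"},
    ],
}

# Referenced: comments are separate documents that point back
# to the post by id, like a foreign key in a relational schema.
post_referenced = {"_id": 1, "title": "Intro to Documents", "author": "alice"}
comments = [
    {"_id": 10, "post_id": 1, "user": "bob", "text": "Nice post"},
    {"_id": 11, "post_id": 1, "user": "carol", "text": "Thanks!"},
]

def comments_for(post, all_comments):
    # With references, the "join" happens in application code,
    # since MongoDB itself does not join collections.
    return [c for c in all_comments if c["post_id"] == post["_id"]]
```

Embedding favors read efficiency when the related data is always fetched together; references keep documents small when the related side grows without bound.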
The document discusses schema design basics for MongoDB, including terms, considerations for schema design, and examples of modeling different types of data structures like trees, single table inheritance, and many-to-many relationships. It provides examples of creating indexes, evolving schemas, and performing queries and updates. Key topics covered include embedding data versus normalization, indexing, and techniques for modeling one-to-many and many-to-many relationships.
This document discusses MongoDB, a document-oriented NoSQL database. It begins with an introduction to NoSQL databases and their advantages over relational databases. MongoDB is then described in more detail, including its features like storing dynamic JSON documents, indexing, replication, querying and MapReduce capabilities. Examples of CRUD operations on MongoDB documents are provided. The document concludes by discussing when MongoDB may be applicable and who uses it, as well as comparing MongoDB to SQL databases.
The Challenges of Describing Best Tagging Practices for JATS
Jeffrey Beck, Technical Information Specialist, National Center for Biotechnology Information (NCBI), U.S. National Library of Medicine, National Institutes of Health; Co-chair, NISO Journal Article Tag Suite (JATS) Standing Committee
Ruby on CouchDB - SimplyStored and RockingChair, by Jonathan Weiss
Presentation by Jonathan Weiss about Ruby on CouchDB at Ruby User Group Berlin in March 2010. It presents SimplyStored, a friendly object wrapper for Ruby, and RockingChair, an in-memory CouchDB for speeding up your tests.
Back to Basics Webinar 1 - Introduction to NoSQL, by Joe Drumgoole
The document provides information about an introductory webinar on NoSQL databases and MongoDB. It includes the webinar agenda which covers why NoSQL databases exist, the different types of NoSQL databases including key-value, column, graph and document stores, and details on MongoDB including how it uses JSON-like documents, ensures data durability through replica sets, and scales through sharding. It also advertises a follow up webinar on building a first MongoDB application and provides a registration link.
This PPT gives information about:
1. Database basics
2. Indexes
3. PHP MyAdmin Connect & Pconnect
4. MySQL Create
5. MySQL Insert
6. MySQL Select
7. MySQL Update
8. MySQL Delete
9. MySQL Truncate
10. MySQL Drop
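The statement types listed above can be sketched in a runnable form using Python's built-in sqlite3 module as a stand-in for MySQL (table and column names are invented; note that SQLite has no TRUNCATE, so an unqualified DELETE plays that role):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
# INSERT
cur.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
# UPDATE
cur.execute("UPDATE users SET name = ? WHERE name = ?", ("bob", "alice"))
# SELECT
rows = cur.execute("SELECT name FROM users").fetchall()
# DELETE (an unqualified DELETE also serves as SQLite's TRUNCATE)
cur.execute("DELETE FROM users WHERE name = ?", ("bob",))
# DROP
cur.execute("DROP TABLE users")
conn.close()
```

In MySQL proper, `TRUNCATE TABLE users` would reset the table faster than row-by-row DELETE.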
Back to Basics Webinar 3: Schema Design Thinking in Documents, by MongoDB
This is the third webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar will explain the architecture of document databases.
Back to Basics Webinar 3 - Thinking in Documents, by Joe Drumgoole
- The document discusses modeling data in MongoDB based on cardinality and access patterns.
- It provides examples of embedding related data for one-to-one and one-to-many relationships, and references for large collections.
- The document recommends considering read/write patterns and embedding objects for efficient access, while breaking out data if it grows too large.
Montreal SQL Saturday: Moving Data from a NoSQL DB to Azure Data Lake, by Diponkar Paul
NoSQL databases have grown in popularity in recent years due to their flexible data modeling and scale-up capabilities, and they are widely used in the big data landscape. This demo-rich session will explain the differences between SQL and NoSQL, and present an end-to-end solution for moving data from the NoSQL database MongoDB using Azure Data Factory.
MongoDB’s basic unit of storage is a document. Documents can represent rich, schema-free data structures, meaning that we have several viable alternatives to the normalized, relational model. In this talk, we’ll discuss the tradeoff of various data modeling strategies in MongoDB.
Back to Basics Webinar 2: Your First MongoDB Application, by MongoDB
The document provides instructions for installing and using MongoDB to build a simple blogging application. It demonstrates how to install MongoDB, connect to it using the mongo shell, insert and query sample data like users and blog posts, update documents to add comments, and more. The goal is to illustrate how to model and interact with data in MongoDB for a basic blogging use case.
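The blog workflow the webinar demonstrates (insert a post, query it, push a comment) can be sketched with an in-memory stand-in for a MongoDB collection; the mongo-shell equivalents are noted in comments, and all collection, field, and user names here are hypothetical:

```python
posts = []  # stands in for the db.posts collection

# insert -- like db.posts.insertOne({...})
posts.append({"_id": 1, "author": "alice", "title": "Hello", "comments": []})

def find_one(coll, **criteria):
    # query -- like db.posts.findOne({author: "alice"})
    return next(
        (d for d in coll if all(d.get(k) == v for k, v in criteria.items())),
        None,
    )

# update -- like db.posts.updateOne({_id: 1}, {$push: {comments: {...}}})
post = find_one(posts, author="alice")
post["comments"].append({"user": "bob", "text": "First!"})
```

With a real server the same three steps would go through the mongo shell or a driver, but the document shapes are identical.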
Dev Jumpstart: Schema Design Best Practices, by MongoDB
New to MongoDB? We’ll discuss the tradeoff of various data modeling strategies in MongoDB. This talk will jumpstart your knowledge of how to work with documents, evolve your schema, and common schema design patterns. MongoDB’s basic unit of storage is a document. No prior knowledge of MongoDB is assumed.
This document summarizes an agenda for a schema design workshop. The workshop covers basic schema and patterns, schema design, sharding, and replication in MongoDB. It includes examples of schema design for relational databases and MongoDB, including embedding, linking, inheritance patterns, one-to-many relationships, and many-to-many relationships. The goals are to learn data modeling with MongoDB through labs and understand implications of replication and sharding.
This document discusses schema design patterns for MongoDB. It begins by comparing terminology between relational databases and MongoDB. Common patterns for modeling one-to-one, one-to-many, and many-to-many relationships are presented using examples of patrons, books, authors, and publishers. Embedded documents are recommended when related data always appears together, while references are used when more flexibility is needed. The document emphasizes focusing on how the application accesses and manipulates data when deciding between embedded documents and references. It also stresses evolving schemas to meet changing requirements and application logic.
Hagedorn 2013: Beyond Darwin Core - Stable Identifiers and then quickly beyon..., by Gregor Hagedorn
A brief discussion where to use Linked Open Data http-identifiers and where DOIs are more appropriate. And beyond: what do we really want? Where can we get more, if we use resolvable identifiers? What distinguishes a web from a database?
NoSQL, SQL, NewSQL - methods of structuring data, by Tony Rogerson
Today’s environment is a polyglot database, that is to say, it’s made up of a number of different database sources and possibly types. In this session we’ll look at some of the options of storing data – relational, key/value, document etc. I’ll overview what is SQL, NoSQL and NewSQL to give you some context for today’s world of data storage.
Robert Stam discusses schema design in MongoDB. Some key points:
- MongoDB uses collections instead of tables and documents instead of rows.
- Schema design focuses on how an application will access and use data, rather than on data storage.
- Relationships can be modeled through embedding documents, linking documents through IDs, or embedding IDs in other documents. The best approach depends on read/write patterns.
- Examples show modeling one-to-one, one-to-many, and many-to-many relationships through embedding, linking, and linking in both directions between collections.
- Categories can be modeled as documents, arrays, or hierarchical paths.
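The three category representations in the last bullet can be sketched as plain Python dicts (names are illustrative, not from the talk):

```python
# 1. Category as its own document, linked by id.
category_doc = {"_id": "books", "name": "Books"}
product_linked = {"name": "MongoDB Guide", "category_id": "books"}

# 2. Categories embedded as an array of tags on the product.
product_array = {"name": "MongoDB Guide", "categories": ["books", "databases"]}

# 3. Hierarchical (materialized) path: the full ancestry in one string,
#    so "everything under /books" becomes a simple prefix match
#    (in MongoDB, an anchored regex query against an indexed field).
product_path = {"name": "MongoDB Guide", "category_path": "/books/databases/nosql"}

def in_subtree(product, prefix):
    return product["category_path"].startswith(prefix)
```

Each shape trades query flexibility against update cost: moving a subtree is cheap with linked documents but requires rewriting paths in the materialized-path form.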
MongoDB Schema Design: Four Real-World Examples, by Mike Friedman
This document discusses different schema designs for common use cases in MongoDB. It presents four cases: (1) modeling a message inbox, (2) retaining historical data within limits, (3) storing variable attributes efficiently, and (4) looking up users by multiple identities. For each case, it analyzes different modeling approaches, considering factors like query performance, write performance, and whether indexes can be used. The goal is to help designers choose an optimal schema based on their application's access patterns and scale requirements.
The document discusses schema design in MongoDB. It explains that MongoDB uses documents rather than tables and rows. Documents can be normalized, with links between separate documents, or denormalized by embedding related data. Embedding is recommended for fast queries but normalized schemas also work well. The document also covers indexing strategies, profiling database performance, and provides examples of schema designs for events, seating, and a feed reader application.
Indexes are references to documents that are efficiently ordered by key and maintained in a tree structure for fast lookup. They improve the speed of document retrieval, range scanning, ordering, and other operations by enabling the use of the index instead of a collection scan. While indexes improve query performance, they can slow down document inserts and updates since the indexes also need to be maintained. The query optimizer aims to select the best index for each query but can sometimes be overridden.
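The trade-off described above — fast ordered lookups versus extra work on every write — can be modeled in miniature with a sorted key list and binary search (Python's bisect module stands in for the B-tree a real database maintains; the `age` field and sample data are invented):

```python
import bisect

docs = [{"_id": i, "age": a} for i, a in enumerate([40, 22, 35, 22, 51])]

# Build the "index": (key, document position) pairs kept in key order.
index = sorted((d["age"], i) for i, d in enumerate(docs))
keys = [k for k, _ in index]

def range_scan(lo, hi):
    # Return docs with lo <= age < hi via the index -- no full collection scan.
    start = bisect.bisect_left(keys, lo)
    stop = bisect.bisect_left(keys, hi)
    return [docs[pos] for _, pos in index[start:stop]]

def insert(doc):
    # Every insert must also maintain the index -- the write-side cost.
    docs.append(doc)
    pos = len(docs) - 1
    bisect.insort(index, (doc["age"], pos))
    keys.insert(bisect.bisect_left(keys, doc["age"]), doc["age"])
```

The read side touches only the matching slice of the index; the write side pays two ordered insertions, which is the maintenance overhead the summary mentions.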
This document discusses document databases and schema design when modeling data. It introduces key concepts like embedding related data within documents for optimal performance and flexibility compared to traditional relational schemas. Examples are provided for how to model one-to-one, one-to-many, and many-to-many relationships either through embedding or referencing other documents. General recommendations suggest embedding by default for relationships where related data is often accessed together, while referencing may be better for scaling or inconsistent many-to-many relationships. The goal is to design schemas that match how the application will use the data for best results.
Building a Scalable Inbox System with MongoDB and Java, by antoinegirbal
Many user-facing applications present some kind of news feed/inbox system. You can think of Facebook, Twitter, or Gmail as different types of inboxes where the user can see data of interest, sorted by time, popularity, or other parameter. A scalable inbox is a difficult problem to solve: for millions of users, varied data from many sources must be sorted and presented within milliseconds. Different strategies can be used: scatter-gather, fan-out writes, and so on. This session presents an actual application developed by 10gen in Java, using MongoDB. This application is open source and is intended to show the reference implementation of several strategies to tackle this common challenge. The presentation also introduces many MongoDB concepts.
Schema Design by Example ~ MongoSF 2012, by hungarianhc
This document summarizes a presentation about schema design in MongoDB. It discusses embedding documents, linking documents through references, and using geospatial data for check-ins. Examples are given for modeling blog posts and comments, places with metadata, and user profiles with check-in histories. The document emphasizes designing schemas based on application needs rather than relational normalization.
This document discusses various indexing strategies in MongoDB to help scale applications. It covers the basics of indexes, including creating and tuning indexes. It also discusses different index types like geospatial indexes, text indexes, and how to use explain plans and profiling to evaluate queries. The document concludes with a section on scaling strategies like sharding to scale beyond a single server's resources.
NoSQL databases only unfold their full strength when you also embrace their concepts of usage and schema design. These slides give an overview of MongoDB's features and concepts.
Do More with Postgres - NoSQL Applications for the Enterprise, by EDB
NoSQL capabilities in Postgres are opening up new avenues for solving enterprise challenges without having to adopt new technologies that bring risk and instability to data management. EnterpriseDB has made it easier for developers to get started using the NoSQL capabilities in Postgres to develop Web 2.0 applications, deploying on Amazon with a new development environment featuring application frameworks, a web server and code samples.
This presentation covers how developers tap the powers of Postgres by addressing:
* How to use JSON and HSTORE side by side with ANSI SQL to create powerful, robust and scalable Web 2.0 data-driven applications
* Getting started with PGXDK (Postgres Extended Datatype Developer Kit), a free AMI that simplifies the development of Postgres-based applications that integrate NoSQL technologies with JSON and Python
* Code samples for writing applications with Postgres using dynamic new capabilities
If you would like to learn more about building your NoSQL Applications with Postgres please email sales@enterprisedb.com.
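The "JSON side by side with ANSI SQL" idea can be sketched with Python's built-in sqlite3 as a stand-in for Postgres: a relational column and a schemaless document column share one row. (Postgres itself would use a json/jsonb column and the `->>` operator, with HSTORE as the analogous flat key-value type; the table and field names below are invented.)

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profiles (id INTEGER PRIMARY KEY, email TEXT, attrs TEXT)"
)
# The attrs column holds an arbitrary JSON document next to fixed columns.
conn.execute(
    "INSERT INTO profiles (email, attrs) VALUES (?, ?)",
    ("a@example.com", json.dumps({"theme": "dark", "beta": True})),
)

# One query path serves both the relational predicate and the document data.
email, attrs = conn.execute(
    "SELECT email, attrs FROM profiles WHERE id = 1"
).fetchone()
theme = json.loads(attrs)["theme"]
```

In Postgres the last step would be pushed into SQL, e.g. `SELECT attrs->>'theme' FROM profiles WHERE id = 1`, keeping document access inside the query.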
Databases have been around for decades and were highly optimised for data aggregation during that time. It is not only big data that has changed the database landscape massively in recent years: many of today's most popular databases are open-source projects.
After this talk you will be able to decide whether a database can make your work more efficient, and in which direction to look.
Towards a rebirth of data science (by Data Fellas), by Andy Petrella
Nowadays, Data Science is buzzing all over the place.
But what is a so-called Data Scientist?
Some will argue that a Data Scientist is a person able to report and present insights in a data set. Others will say that a Data Scientist can handle a high throughput of values and expose them in services. Yet another definition includes the capacity to create meaningful visualizations on the data.
However, we are entering an age where velocity is key. Not only is the velocity of your data high; the time to market is also shortened. Hence, the time between the moment you receive a data set and the moment you can deliver added value is crucial.
In this talk, we'll review legacy Data Science methodologies and what they meant in terms of delivered work and results.
Afterwards, we'll move towards the different concepts, techniques, and tools that Data Scientists will have to learn and adopt in order to accomplish their tasks in the age of Big Data.
The talk closes with the Data Fellas view of a solution to these challenges, especially through the Spark Notebook and the Shar3 product we develop.
Manage Online Profiles with Oracle NoSQL Database (THT10972, v1.1), by Robert Greene
The document discusses managing online customer profiles with Oracle NoSQL Database. It describes the need for flexible and scalable customer profile data to enable personalization, recommendations, and other business functions. Oracle NoSQL Database is presented as a solution to store evolving customer profile data across many fields in a distributed, globally accessible way to support business needs. The document provides an overview of implementing customer profiles using Oracle NoSQL Database's JSON schemas and AVRO serialization.
Schema on read is obsolete. Welcome metaprogramming, by Lars Albertsson
How fast can you modify your data collection to include a new field, make all the necessary changes in data processing and storage, and then use that field in analytics or product features? For many companies, the answer is a few quarters, whereas others do it in a day. This data agility latency has a direct impact on companies' ability to innovate with data. Schema-on-read has been a key strategy for lowering that latency: as the community has shifted towards storing data outside relational databases, we no longer need to make a series of schema changes through the whole data chain, coordinated between teams to minimise operational risk. Schema-on-read comes with a cost, however. Errors that we used to catch during testing or in early test deployments can now sneak into production undetected and surface as product errors or hard-to-debug data quality problems later than with schema-on-write solutions.
In this presentation, we will show how we have rejected the tradeoff between slow schema change rate and quality to achieve the best of both worlds. By using metaprogramming and versioned pipelines that are tested end-to-end, we can achieve fast schema changes with schema-on-write and the protection of static typing. We will describe the tools in our toolbox - Scalameta, Chimney, Bazel, and custom tools. We will also show how we leverage them to take static typing one step further and differentiate between domain types that share representation, e.g. EmailAddress vs ValidatedEmailAddress or kW vs kWh, while maintaining harmony with data technology ecosystems.
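The contrast the talk draws can be sketched in miniature: schema-on-write validates at ingest and fails fast, while schema-on-read accepts any record and surfaces errors only at use time. (This is a toy illustration with invented field names, not the Scalameta/Chimney tooling the talk describes.)

```python
from dataclasses import dataclass

@dataclass
class Event:  # the "written" schema: shape is fixed at ingest time
    user_id: int
    email: str

def ingest_on_write(raw: dict) -> Event:
    # Validation happens at write time; a bad record never enters storage.
    if not isinstance(raw.get("user_id"), int):
        raise ValueError("user_id must be an int")
    return Event(user_id=raw["user_id"], email=raw["email"])

def read_on_read(raw: dict) -> str:
    # Schema-on-read: no check at ingest; a malformed record only fails
    # here, deep in a pipeline, as the hard-to-debug error described above.
    return raw["email"].lower()
```

Static typing extends the same idea into the code: once `Event` exists, the compiler (or a type checker) propagates a schema change to every consumer automatically.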
This document proposes a software architecture to address complex and dynamic data modeling challenges. The proposed solution has four main components: [1] An OSGi-based architecture for modularity, reusability and dynamic updates. [2] A graph database (Neo4j) to flexibly store relationships and enable natural queries. [3] A user interface built with AngularJS and D3.js for rich, data-driven visualization. [4] The use of a "mad developer" to implement the architecture. The architecture aims to reduce complexity, support dynamic data and provide a flexible yet user-friendly interface.
Considerations for using NoSQL technology on your next IT project - Akmal Cha... (jaxconf)
Over the past few years, we have seen the emergence and growth of NoSQL technology. This has attracted interest from organizations looking to solve new business problems. There are also examples of how this technology has been used to bring practical and commercial benefits to some organizations. However, since it is still an emerging technology, careful consideration is required in finding the relevant developer skills and choosing the right product. This presentation will discuss these issues in greater detail. In particular, it will focus on some of the leading NoSQL products, such as MongoDB, Cassandra, Redis, and Neo4j and will discuss their architectures and suitability for different problems. Short demonstrations, using Java, are planned to give the audience a feel for the practical aspects of such products.
The document introduces Neo4j, an open-source graph database. It discusses how graph databases have evolved over time to model different types of connected data, from social networks to spatial information. It provides examples of using Neo4j to represent social network, spatial, financial, and genomic data as graphs. The document also summarizes the core components of Neo4j and reasons for using a graph database approach.
This document summarizes a presentation on NoSQL and multi-model databases. It begins with an introduction to NoSQL databases, describing them as non-relational systems designed for big data and scalability. The main NoSQL models are outlined as key-value, document, columnar, and graph databases. Document databases are discussed in more detail. The presentation then covers multi-model databases, which combine features of document and graph databases, and allows for flexible querying. Popular multi-model databases like OrientDB and ArangoDB are presented. Finally, the document concludes with a demo of OrientDB's querying capabilities.
Evolution of the DBA to Data Platform Administrator/Specialist - Tony Rogerson
DBAs used to be relational-database centric, for instance managing Microsoft SQL Server or Oracle. In this changing world of polyglot database environments their role has expanded, not just into new platforms beyond SQL but also into new legal governance, modelling techniques, architecture, etc. They need a base knowledge of Kimball, Inmon, Data Vault, the CAP theorem, Lambda architecture, Big Data, Data Science and so on.
NoSQL, Neo4j for Java Developers, OracleWeek-2012 - Eugene Hanikblum
This seminar covered using Neo4j, a graph database for Java developers. It began with an overview of big data and NoSQL databases, including key-value stores, column databases, document databases, and graph databases. It then discussed why graph databases are useful when data is highly interconnected. The remainder of the seminar focused on Neo4j, explaining what it is, how it compares to relational databases, and how to model and query data using Neo4j and tools like Cypher, Spring Data Neo4j, the Neo4j browser, and Neoclipse plugin. Code examples were also provided.
SwanseaCon 2017 presentation on Making MySQL Agile-ish. Relational Databases are not usually considered part of the Agile Programming movement but there are many new features in MySQL to make it easier to include it. This presentation covers how MySQL is moving to help support agile development while maintaining the traditional 'non agile' stability expected from a database.
Our own Sean Doherty was in Madrid this week, presenting at the Red Hat Partner summit on the rise of big data and what it means for the future of the RDBMS in the enterprise. Check out his presentation!
Neo4j provides a graph database and tools for working with connected data. It helps customers gain insights from data relationships. Neo4j has thousands of customers worldwide, supports various deployment options, and its graph queries and algorithms help with tasks like recommendations, fraud detection, and knowledge discovery. It provides visualizations, analytics and machine learning tools to make graph data accessible and help users understand connections.
ER/Studio and DB PowerStudio Launch Webinar: Big Data, Big Models, Big News! - Embarcadero Technologies
Watch the accompanying webinar presentation at http://embt.co/BigXE6
These are the slides for the ER/Studio and DB PowerStudio Launch Webinar: Big Data, Big Models, Big News! on September 18, 2014
MySQL Without the SQL -- Oh My! Longhorn PHP Conference - Dave Stokes
You can now use MySQL without needing to know Structured Query Language (SQL) with the MySQL Document Store. Access JSON documents and/or relational tables using the new X DevAPI
What is a distributed data science pipeline, and how to build one with Apache Spark and friends - Andy Petrella
What was a data product before the world changed and got so complex.
Why distributed computing/data science is the solution.
What problems does that add?
How to solve most of them using the right technologies, like Spark Notebook, Spark, Scala, Mesos and so on, in an accompanying framework
The document provides an overview of the MySQL Document Store, which allows storing and querying JSON documents within MySQL tables without requiring SQL. It is built on the MySQL JSON data type and X DevAPI. Key features highlighted include the ability to work with both relational tables and document collections together using various programming languages, transactions, and casting collections as tables. The document store is available in MySQL 5.7 and 8 via a plug-in.
Similar to NoSQL Search Roadshow Zurich 2013 - Polyglot persistence with NoSQL
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... - Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer's life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 - Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack - shyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Full-RAG: A modern architecture for hyper-personalization - Zilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor... - SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
What do a Lego brick and the XZ backdoor have in common? - Speck&Tech
ABSTRACT: At first glance, a Lego brick and the XZ backdoor might seem to have in common only the fact that both are building blocks, or dependencies, of creative and software projects. In reality, a Lego brick and the case of the XZ backdoor share much more than that.
Join the presentation to dive into a story of interoperability, standards and open formats, and then discuss the important role contributors play in a sustainable open source community.
BIO: An advocate of free software and of standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia Association, where she was involved in several events, migrations and training activities related to LibreOffice. Previously she worked on LibreOffice migrations and training courses for several public administrations and private companies. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager, and when she is not pursuing her passion for computers and for Geeko she cultivates her curiosity about astronomy (which is where her nickname deneb_alpha comes from).
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Building RAG with self-deployed Milvus vector database and Snowpark Container...Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Generative AI Deep Dive: Advancing from Proof of Concept to Production - Aggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
UiPath Test Automation using UiPath Test Suite series, part 6 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, as a test automation solution, with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf - Malak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
2. Abstract
Alternative data persistence technologies like NoSQL emerged more than ten years ago, but we developers hesitate to open our horizons to these new approaches. Why should we?
Relational databases have dominated the IT industry for a long time and served us very well. Everybody knows SQL and is used to the relational data model with all its advantages and disadvantages.
But those who look beyond their borders will find a wealth of NoSQL technologies and products.
Every product has its own properties and characteristics. How can we differentiate them? Is it all about smart decisions, or do we have more possibilities? We will go into the world of NoSQL and explain the different kinds of NoSQL products, when to use them, and what polyglot persistence is about.
6. Michael Lehmann @lehmamic
Senior Software Engineer @Zühlke since 2012
.NET enterprise and cloud applications
Roman Kuczynski @qtschi
Senior Software Engineer @Zühlke since 2011
Data(base) architectures, BI and Big Data
17. Scaling by replication: the same records (e.g. "Roger Federer", "N. Djokovic") are duplicated across several nodes.
18. Impedance mismatch using relational databases
public class BlogPost
{
public int Id { get; set; }
public string Content { get; set; }
public List<string> Tags { get; set; }
}
19. Design for the relational model
public class BlogPost
{
    public int Id { get; set; }
    public string Content { get; set; }
    public List<Tag> Tags { get; set; }
}
public class Tag
{
    public int Id { get; set; }
    public BlogPost BelongsTo { get; set; }
    public string Name { get; set; }
}
BlogPost
- Id (int)
- Content (varchar)
Tag
- Id (int)
- BlogPostId (int)
- Name (varchar)
20. NoSQL databases increase productivity
var post = new BlogPost
{
Id = 1,
Content = "Any text content",
Tags = new List<string> { "NoSQL", "Cloud", "PolyglotPersistence" }
};
collection.Insert(post);
39. The graph data model
Nodes [1] "John", [2] "Sara", [3] "Maria", [4] "Steve" and [5] "Joe", connected by "friend" edges.
55. Common data tier design
A layered architecture (User Interface / Presentation -> Domain -> DAL -> Resources), where a single RDBMS serves search, transactions, caching, blobs, triggers and reporting, with object-relational mapping between the object and relational worlds.
60. Resources
NoSQL Distilled
Authors: Martin Fowler, Pramod J. Sadalage
ISBN: 978-0321826626
Making Sense of NoSQL
Authors: Dan McCreary, Ann Kelly
ISBN: 978-1617291074
Links
http://nosql-database.org/
http://en.wikipedia.org/wiki/NoSQL
Do you know that? Put your hands up in the air! Almost everybody is comfortable with SQL. (Roman)
We developers are familiar with relational databases and we know what we get when we use them. We are familiar with SQL (Structured Query Language): SQL is a standard; ACID operations and transactions; a relational schema that implies referential integrity and its constraints (PK, FK) to ensure data integrity; data consistency must be ensured. SQL is our holy cow (untouchable, established and accepted). (Roman)
Today relational databases such as SQL Server are a de facto standard. We choose from products (SQL Server, Oracle, MySQL...), not from technologies, for (almost) every data persistence solution. This reminds us of the Swiss army knife that can be used for everything. But what would you say if you build a house and the electrician arrives with only a Swiss army knife? (Michael)
Before we dive deeper, let's introduce ourselves. Michael: a senior software engineer at Zühlke since 2012, focusing on enterprise and cloud application development in .NET. Roman: a senior software engineer at Zühlke since 2011, focusing on data(base) architectures and technologies including BI and Big Data. Agenda: for the next 40 minutes we are going to give a short briefing on alternative NoSQL technologies, and discuss what it means to use NoSQL databases in a polyglot persistence environment. (Both)
Sure, you may ask: why should we decide for anything else than a SQL database? In almost every case we are fine using SQL. What reasons do we have to use another technology, where everybody has to learn something new? (Michael)
You are right to challenge our statement to use other technologies than SQL. But nowadays we have other circumstances than a few years ago (with cloud computing and Big Data). There are some business drivers that challenge RDBMS. We didn't invent those business drivers. Let's have a look at the most important ones. (Michael)
One of the main drivers is Big Data. Hence, our first driver is volume: the amount of data grows from MB to TB to PB, and more users, applications and devices access the data. RDBMS have reached their physical or financial limits, so we need scalable and affordable solutions. (Roman)
Our second driver is velocity: the frequency of incoming data increases. We want to work with the newest, up-to-date data without delays (think of a tweet arriving too late), moving from daily batches to "realtime". The speed at which data arrives and gets out increases; in other words, the whole data lifecycle gets accelerated. (Roman)
Another aspect is variability: data originates from different devices, users and other sources (mobile data, web content, sensors...). It is often unstructured; the amount of structured data is relatively small. (Roman)
Agility is our last driver and the second main aspect beside Big Data: productivity of development work, responsiveness to change, time to market and low costs. (Roman)
All these business drivers challenge and put pressure on relational databases. Let's see what options we have to tackle those problems. (Michael)
NoSQL databases emerged in the market to meet those business drivers. Hence, the characteristics of NoSQL databases address these problems. There are two main reasons to use NoSQL: scaling and productivity. (Michael)
Why scaling? First of all, most NoSQL technologies are built to scale out well. Scaling out means: we build a database cluster based on commodity (cheap) hardware, instead of only spending money on a single-box solution. Hence, we get availability, better performance due to load balancing, and capacity. (Michael)
There are two techniques for scaling. With sharding we distribute our data to different nodes, e.g. customers [a-m] on node #1 and customers [n-z] on node #2. Some NoSQL databases even provide auto-sharding: we can just add new nodes to the cluster and the database will redistribute the data. (Michael)
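The deck's code samples are in C#; as a language-neutral illustration, here is a minimal Python sketch of the range-based sharding described above. The node names and letter ranges are illustrative, not from any particular product:

```python
# Route each customer record to a node based on the first letter of its key,
# mimicking range-based sharding ([a-m] -> node-1, [n-z] -> node-2).
def shard_for(key, ranges):
    """Return the node responsible for `key`, given (last_letter, node) pairs."""
    first = key[0].lower()
    for last_letter, node in ranges:
        if first <= last_letter:
            return node
    raise KeyError(key)

RANGES = [("m", "node-1"), ("z", "node-2")]

assert shard_for("Federer", RANGES) == "node-1"
assert shard_for("Nadal", RANGES) == "node-2"
```

Auto-sharding would amount to recomputing `RANGES` when a node joins and migrating the affected keys.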
With replication we duplicate our data on different nodes. This gives us better failover and increases performance through parallel access. (Michael)
As developers, we want to store our in-memory object structures. Using relational databases, we have to map our objects to a relational model. Here we talk about the so-called impedance mismatch. => a simple example: a list of strings.
In general, NoSQL databases are schemaless, which brings advantages in development productivity. Usually data is stored in XML or JSON format, which allows us to store our objects straightforwardly. The data format is version tolerant: if we change our data structures in code (i.e. we add or remove a property), the NoSQL database does not care about it, whereas in a relational database we have to update the schema and migrate the data as well. Schemaless does not mean having no schema: we do have an implicit schema, but it is not enforced by the database! (2nd slide on schemaless and version tolerance) (Michael)
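A small Python sketch of this version tolerance (document fields are illustrative): documents written by an old and a new application version coexist in the same collection, and the implicit schema lives in the reading code.

```python
# A schemaless collection tolerates documents from two application versions;
# the reader, not the database, handles the missing field.
collection = [
    {"id": 1, "content": "old post"},                     # written before "tags" existed
    {"id": 2, "content": "new post", "tags": ["NoSQL"]},  # written by newer code
]

def tags_of(doc):
    return doc.get("tags", [])  # the implicit schema is enforced here

assert tags_of(collection[0]) == []
assert tags_of(collection[1]) == ["NoSQL"]
```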
Let’s talk about consistencyScaling out and the fact, that no schema is present, influences data consistency.1. Having no schema means: We cannot enforce data integrity by the database.Roman
NoSQL databases generally don’t support ACID Operations (transactions)They rather provide eventual consistentcy.Roman
Let’s illustrate this by an example:We want to book a hotel room and we see the room is free.Roman
We book the room.On ZH server the room is still free, because update has not been processed yet.Roman
Data is inconsistent for a short period of time.We call this the inconsistency window!As soon as updates are processed on ZH, data will be eventually consistent!Means: Nobody else has booked the room on ZH server before updates had been processed!Roman
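The inconsistency window can be modelled in a few lines of Python. This is a toy sketch, not any product's replication protocol: writes go to a primary and are applied to the replica asynchronously.

```python
# Toy model of eventual consistency: the replica lags the primary until
# pending updates are applied, creating an "inconsistency window".
primary, replica, pending = {}, {}, []

def write(key, value):
    primary[key] = value
    pending.append((key, value))  # replication happens later

def replicate():
    while pending:
        k, v = pending.pop(0)
        replica[k] = v

write("room-101", "booked")
assert replica.get("room-101") is None   # inside the inconsistency window
replicate()
assert replica["room-101"] == "booked"   # eventually consistent
```

A second `write` to the same key on the replica side before `replicate()` runs is exactly the double-booking conflict discussed next.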
BUT: what happens if someone else has booked the room on ZH before synchronization? (Roman)
To avoid conflicts, we can wait for a commit from ZH to complete the update, or we define one server as the master so that only the master accepts changes. BUT: in the case of such a conflict, do we really need strong consistency? What is more important: no conflict, or the risk of losing customers? (Roman)
We may handle conflicts by business means: discounts, spare rooms. (Roman)
To avoid conflicts we take latency into account. We have to reconsider when strong consistency is really required. Hence, we have to balance between performance and strong consistency. (Roman)
At the beginning we spoke about the Swiss army knife. Now we want to discover the tools available to build our solution. Let's open the toolbox. (Roman)
We have a great variety of NoSQL database products, for example Riak, MongoDB, Cassandra, HBase, Neo4j, etc. (Roman)
There are so many products that there is no way to know all these databases in detail. But we can classify them along their main characteristics. Nowadays four NoSQL categories are generally accepted in the NoSQL community. (Michael)
We start with the simplest databases, the so-called key-value stores, originally developed for distributed caches (e.g. web sites). Typical characteristics: data is stored and accessed by a unique key, comparable to a C# dictionary; the data is just a bucket (or set) of any data, which generally cannot be queried; very fast to write and read; easy to scale. Typical products: Memcached, Redis, Riak. (Michael)
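The dictionary analogy can be made concrete in a few lines of Python. The key naming scheme is illustrative; the point is that values are opaque buckets, addressable only by key:

```python
# A key-value store behaves like a dictionary of opaque buckets:
# values are written and read by key, never queried by their content.
store = {}

def put(key, value):
    store[key] = value

def get(key):
    return store.get(key)  # None if absent, like a cache miss

put("session:42", {"user": "roger", "cart": ["racket"]})
assert get("session:42")["user"] == "roger"
assert get("missing") is None
```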
The most widely used NoSQL databases are document stores. Typical products: CouchDB, MongoDB, RavenDB (written in .NET!), and OrientDB (a hybrid that is also a graph database). As the name implies, these databases store data as documents. The whole document is a serialized object tree (an aggregate), which makes this kind of database very intuitive and easy to work with. (Michael)
Here is an example of such a document. Generally documents are stored in XML or JSON (BSON) format. These databases are query enabled, so we can search for the value of a given property in hierarchical documents, and we can apply indexes on data fields to optimise the queries. (Michael)
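Querying by a property inside a hierarchical document can be sketched in Python. The dot-path syntax is modelled loosely on document-store query languages; the documents and field names are illustrative:

```python
# Find documents whose nested property (addressed by a dot path) equals a value,
# the way a document store queries hierarchical JSON documents.
docs = [
    {"id": 1, "author": {"name": "Michael"}, "tags": ["NoSQL"]},
    {"id": 2, "author": {"name": "Roman"}, "tags": ["BigData"]},
]

def find(collection, path, value):
    def lookup(doc):
        for part in path.split("."):
            doc = doc.get(part, {})
        return doc
    return [d for d in collection if lookup(d) == value]

assert [d["id"] for d in find(docs, "author.name", "Roman")] == [2]
```

An index would simply precompute `lookup` for every document so the scan is avoided.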
Column-family stores are the closest to the tables of relational databases. They are also known as wide-column databases or big tables. Typical products: Cassandra, HBase, Hypertable. (Roman)
Column-family stores are semi-schematic: you can think of a column-family store as one huge table with lots of columns. These columns are organised in so-called column families, which are the equivalent of tables in an RDBMS. Data is stored as rows and accessed by a row key and the column name. Not every row has to contain the same columns; even more, columns can be dynamically added to or removed from a row. (Roman)
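The (row key, column name) addressing can be sketched in Python as nested dictionaries. The row keys and column names are illustrative:

```python
# Column-family layout: data addressed by (row key, column name),
# and rows need not share the same set of columns.
table = {}  # row_key -> {column_name: value}

def put(row, column, value):
    table.setdefault(row, {})[column] = value

put("user:1", "name", "Michael")
put("user:1", "city", "Zurich")
put("user:2", "name", "Roman")   # this row has no "city" column

assert table["user:1"]["city"] == "Zurich"
assert "city" not in table["user:2"]
```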
Last but not least, graph databases are a bit exotic. Typical products: Neo4j, InfiniteGraph, OrientDB. (Michael)
Relational databases and the NoSQL categories we have discussed so far are not strong at modelling complex relationships. Imagine you have a graph like this (see example): how would you model and query the nodes and relationships in SQL? You see there are some limitations: queries to traverse a graph would decrease performance drastically. Using an RDBMS, the data model defines how to query the relationships. Graph databases take another architectural approach and focus on the relationships in the data. Data is modelled as a graph with nodes and edges; edges are the relationships between the nodes and can carry data as well. Special query languages like Cypher for Neo4j allow us to traverse the graph intuitively. Example: see illustration. (Michael)
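The kind of traversal a graph database answers natively ("friends within N hops") can be sketched as a breadth-first search in Python. The names come from the friend-graph slide; the exact edge topology here is assumed for illustration:

```python
# Breadth-first traversal over a friend graph: the query a graph database
# (e.g. via Cypher) answers without expensive relational joins.
from collections import deque

edges = {
    "John": ["Sara", "Maria"],
    "Sara": ["John", "Joe"],
    "Maria": ["John", "Steve"],
    "Joe": ["Sara"],
    "Steve": ["Maria"],
}

def within_hops(start, hops):
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue                      # don't expand beyond the hop limit
        for neighbour in edges.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    seen.discard(start)
    return seen

assert within_hops("John", 1) == {"Sara", "Maria"}
assert within_hops("John", 2) == {"Sara", "Maria", "Joe", "Steve"}
```

In SQL the two-hop query already needs a self-join per hop, which is exactly the performance problem described above.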
After this excurse, we now have a variety of databases to store our dataNot only NoSQL databases, but also the Relational databases. Yes, they still do have a right to exist, and they have a place in our toolbelt.Examples:Facebook, EBay CassandraCERN for ATLAS Detector used for Large Hadron Collider CassandraForbes.com for articles MongoDBSalesforce Marketing Cloud MongoDBAdobe, HP Neo4JRDBMS still can be used!Michael
But now, which database is the best for my application? All of the databases have their advantages and disadvantages in a given scenario.
Roman
Do we really have to pick only one database? In any scenario, we have to take trade-offs into account. Ideally, we want to use the most suitable database for every job.
Roman
Why don't we do exactly that? We can use different databases within one application for different storage scenarios. Martin Fowler calls this polyglot persistence.
Michael
Title: Use the right tool for the job
Scenario webshop:
- Caching: Redis (KV)
- Session storage: Redis (KV)
- Shopping cart: Redis (KV)
- Product catalog: RavenDB (Doc)
- Recommendation engine: Neo4j (Graph)
- Financial transactions: MS SQL Server (RDBMS)
- Reporting: MS SQL Server (RDBMS)
- Event logging: Cassandra (CF)
Michael
This is nice. But it introduces some new issues.
Roman
First of all, we need people who
- have knowledge of developing with these databases (APIs, detailed characteristics, advantages and disadvantages), and
- have knowledge of operating these databases (administration, installation and upgrades, monitoring, backup and restore, performance tuning, storage management).
Roman
Besides the skills, polyglot persistence impacts our architecture. Using multiple databases in the same application (system) increases the complexity of the code and the architecture. We need to design the architecture to handle this complexity.
Roman
In the past, SQL databases were used as so-called integration databases:
- Relational databases share a common, platform-independent interface: SQL.
- Multiple applications accessed the same database, so databases acted as integration platforms.
- The database was the master of the data, and its model ensured data quality (consistency, integrity).
With NoSQL databases, the circumstances are different:
- A NoSQL database cannot enforce the schema or strong consistency.
- This means the application must become the master of the data and is responsible for ensuring schema and consistency.
Roman
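What "the application becomes the master of the data" means in practice can be sketched as validation before every write: since the schemaless store cannot reject a malformed document, the application does. A minimal hand-rolled sketch with an invented order schema (in production one would typically use a validation library rather than this):

```python
# The application's own schema: field name -> expected type.
ORDER_SCHEMA = {"order_id": int, "customer": str, "total": float}

def validate(doc, schema):
    """The application, not the database, enforces the schema."""
    for field, ftype in schema.items():
        if field not in doc:
            raise ValueError(f"missing field: {field}")
        if not isinstance(doc[field], ftype):
            raise ValueError(f"wrong type for field: {field}")
    return doc

orders = []  # stands in for the schemaless database

def save_order(doc):
    """Only validated documents ever reach the store."""
    orders.append(validate(doc, ORDER_SCHEMA))

save_order({"order_id": 1, "customer": "Alice", "total": 19.99})
try:
    save_order({"order_id": 2, "total": 5.0})  # 'customer' is missing
except ValueError as err:
    print(err)  # missing field: customer

print(len(orders))  # 1 -- the malformed document was rejected
```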
Polyglot persistence does not fit the integration-database idea: we no longer have one master database. This means we cannot ensure data quality across multiple databases if they are accessed by multiple applications.
Roman
To overcome this problem, we need a central unit that is the master of the data. This is where application databases come into play: the application owns a database and is the master of the data. Hence, the data can be accessed by exactly one application, and no other application accesses the database.
Roman
In an enterprise environment we do not have just one application, but multiple applications using the same data. Since application databases don't work with multiple applications, we need an enhancement of that design!
Michael
We all know the term SOA. A service plays the role of the application in the "application database" scenario:
- The service is the master of the data and can ensure the schema and consistency.
- Multiple applications can access the service.
- They never access the database directly!
Michael
We have discussed how to ensure data quality and schema with polyglot persistence. Now let's look at how we can avoid a mess in our code. The key is to structure our code in layers.
Michael
For this we use the well-known layered architecture:
- Presentation
- Business
- Resource access (DAL)
- Resources (databases)
Michael
Usually, the data access layer contains all of the logic to access the data. It is:
- bloated, heavy, and application-specific,
- not reusable,
- expensive to maintain.
Michael
Example: the reusability of a chair. Object-related vs. interface-related: the interface can be reused, not the object. Interfaces are reusable, not objects!
Michael
Especially with NoSQL databases, we should aspire towards small, reusable data services:
- Each provides functionality accessing only one database. Example: a caching service.
- Because these services provide not only data access but also (domain-independent) business logic, they are part of the business layer.
- Hence, the formerly bloated data access layer gets split into several independent services and moves into the business layer.
- In fact, each service has its own DAL; that's why we placed it across the layers.
- Data services are domain-independent: they don't implement domain-specific logic!
Michael
We now have several independent data services:
- They can be used by multiple applications.
- Each data service ensures data consistency and schema in the database it owns.
- Data consistency across several data services is not guaranteed by the data services themselves.
Example: a webshop (catalog / shopping cart / order system / recommendation engine / financial transactions).
A few minutes ago we talked about a central unit to ensure data consistency across several databases. To meet this requirement for our data services, we introduce business services:
- They provide the domain logic and use one or more data services.
- They are responsible for guaranteeing data consistency.
Michael
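The split described above can be sketched as two data services, each owning its store, plus a business service that composes them and keeps them consistent at checkout. The service names and the checkout logic are invented for illustration; a real business service would also need transactional safeguards (e.g. compensation) that this sketch omits.

```python
class CartService:
    """Data service: master of the shopping-cart store."""
    def __init__(self):
        self._carts = {}  # its own database; no one else touches it
    def get(self, user):
        return self._carts.get(user, [])
    def add(self, user, item):
        self._carts.setdefault(user, []).append(item)
    def clear(self, user):
        self._carts.pop(user, None)

class OrderService:
    """Data service: master of the order store."""
    def __init__(self):
        self._orders = []
    def create(self, user, items):
        self._orders.append({"user": user, "items": items})
        return len(self._orders) - 1  # order id

class CheckoutService:
    """Business service: domain logic spanning several data services."""
    def __init__(self, carts, orders):
        self.carts, self.orders = carts, orders
    def checkout(self, user):
        items = self.carts.get(user)
        if not items:
            raise ValueError("empty cart")
        order_id = self.orders.create(user, items)
        self.carts.clear(user)  # keep both stores consistent
        return order_id

carts, orders = CartService(), OrderService()
checkout = CheckoutService(carts, orders)
carts.add("alice", "book")
order_id = checkout.checkout("alice")
print(order_id, carts.get("alice"))  # 0 []
```

Note that applications would only ever call `CheckoutService`; neither the cart store nor the order store is reachable directly.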
Relational databases give us confidence because they are established, proven, and robust. But that does not mean we should use them for everything.
At the very beginning, we compared our affection for relational databases to a Swiss army knife (one tool fits all). Now we have a toolbox full of individual tools to do our job, like professionals. That means we have a variety of database technologies and products, we know their strengths, and we know how to use them.
REMEMBER: Use the right tool for the job!
Roman
References:
- NoSQL Distilled
- Making Sense of NoSQL
- http://www.datastax.com/documentation/gettingstarted/index.html
- http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/de//archive/bigtable-osdi06.pdf
- http://nosql-database.org/
- http://en.wikipedia.org/wiki/NoSQL
Roman