RavenDB is a NoSQL database written in .NET Core, which makes it a natural fit for .NET applications.
Here I'm sharing my experience of using RavenDB in a SaaS product: covering its pros and cons, comparing it to other NoSQL databases (CosmosDB, MongoDB, etc.) and explaining the reasons for choosing NoSQL over SQL.
4. NoSQL types
Alex Klaus alex-klaus.com
Main types
• Document store
• Key-value store
• Graph
• Column store
Other types
• Multimodel database
• Object database
• Tabular
• Tuple store
• Triplestore
8. When not to NoSQL?
OLAP/OLTP
Small project / Simple DB structure
No need to scale
Ad hoc queries
Immediate consistency
Unclear requirements
Inexperienced developers
9. How to shoot yourself in the foot with document NoSQL?
Normalised data storage
Excessive JOINs in queries
Overuse of immediate consistency
10. Reasons to NoSQL
• Cheaper to scale
• Convenience of development
11. Reason to NoSQL #1: Cheaper to scale
Options:
1. Scale up – more powerful machine
2. Scale out – cluster of smaller ones
12. Scaling out SQL server
Shared-disk system
• Single point of failure
• MS SQL provides FCI (Failover Cluster Instance) running in WSFC (Windows Server Failover Clustering), which leverages Cluster Shared Volumes
Sharding
• Extra work to
  • query data (e.g. joining data across shards)
  • control data integrity across shards
  • rebalance the sharding from time to time
• Node failure makes that shard’s data unavailable
• Azure SQL offers Elastic queries and Elastic transactions
13. Scale out NoSQL: Better sharding
1. Natural unit of distribution
2. Avoid cross-node JOINs
3. Avoid cross-node transactions
Sharding in MongoDB
Aggregate orientation
16. SQL databases optimise storage usage, whereas NoSQL databases optimise CPU usage
CPU is the most expensive resource in data centres, storage is the cheapest
The most frequent operation against the DB is querying data
JOINs and GROUP BYs on a normalised DB hammer the CPU
NoSQL hosting: costs
17. Reason to NoSQL #2: Convenience of development
Integration database (old way of thinking)
• Used in multiple applications, developed by separate teams
• Complex structure
• Poor coordination between teams
Application database (new way of thinking)
• Used in a single application
• Tailored to the app’s requirements
• One team maintains the app and the DB
18. Impedance Mismatch
Normalised database
Translate objects for reading & writing
Use of ORM
Impedance mismatch: the difference between the relational model (structure of tables and rows) and the in-memory data structures (rich objects)
19. What’s wrong with ORM?
Complex data mapping leads to expensive maintenance
One extra learning curve
Poor performance
OrmHate by Martin Fowler
20. Reason to NoSQL: No Impedance Mismatch
NoSQL data model uses aggregates or graphs
No need for an ORM
The same structures in memory as they are stored on disk
21. Reason to NoSQL: Better integration with tech stack
22. Tech stacks for NoSQL
MongoDB + Node.js (MEAN stack)
CosmosDB + EF Core 3
RavenDB (client library comes out-of-the-box)
23. SQL vs NoSQL
NoSQL
• Run at scale
• Convenience of development
SQL
• OLAP/OLTP
• Small project / Simple DB structure
• No need to scale
• Ad hoc queries
• Immediate consistency
• Unclear requirements
• Inexperienced developers
25. Why RavenDB?
• Well-established DB: 9 years since the release of v1
• Support for all OSes, containers and cloud providers
• Good integration with .NET Core: written in .NET Core with a focus on the .NET infrastructure
Oren Eini aka Ayende Rahien
27. RavenDB: .NET integration
using Raven.Client.Documents;

// A plain model class: no attributes or mapping configuration
public class Contact
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Connect to the server and pick the database
using (var store = new DocumentStore
{
    Urls = new[] { "http://live-test.ravendb.net" },
    Database = "Test"
})
{
    store.Initialize();

    var contact = new Contact { FirstName = "Alex", LastName = "Klaus" };

    // The session tracks the new entity and sends it on SaveChanges()
    using (var session = store.OpenSession())
    {
        session.Store(contact);
        session.SaveChanges();
    }
}
28. RavenDB: .NET integration
// `store` is the initialised DocumentStore from the previous slide

// Query whole documents
using (var session = store.OpenSession())
{
    var query = from c in session.Query<Contact>()
                where c.FirstName == "Alex"
                select c;
    List<Contact> alexContacts = query.ToList();
    Console.WriteLine(alexContacts.FirstOrDefault()?.LastName);
}

public class Contact
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Project a single property instead of loading whole documents
using (var session = store.OpenSession())
{
    var query = from c in session.Query<Contact>()
                where c.FirstName == "Alex"
                select c.LastName;
    List<string> alexContacts = query.ToList();
    Console.WriteLine(alexContacts.FirstOrDefault());
}
39. “Features” in query language
Lack of high-level wrappers for date manipulation (like SqlFunctions in .NET), but string manipulations work:
from c in session.Query<Client>()
where c.FirstName.StartsWith("A")
select c.LastName;
NoSQL definition:
No clear definition. No NoSQL foundation.
It’s a “non SQL” / “non relational” DB.
Now it’s read as “Not only SQL”.
4 main types.
No clear border between key-value and document stores.
Many DBs support several NoSQL types (every one of them offers a document-oriented store).
People affiliated with NoSQL databases aggressively advertise them as a solution for everything.
Early adopters swear allegiance to NoSQL and claim to have passed the point of no return to SQL.
Is NoSQL the silver bullet? SQL is not obsolete; it’s still the better tool in some cases.
Second Industrial Revolution
OLAP. On-Line Analytical Processing has been available in MS SQL since 1998 and in Oracle since 1996. Columnar databases like Apache Cassandra are also well suited for OLAP-like workloads.
Simple DB structure. A simple structure means simple data mapping.
No need to scale. SQL databases optimise storage usage, whereas NoSQL databases optimise CPU usage.
Ad hoc queries. Normalised data is easy to query. NoSQL queries need indexes, and one index per query is common (e.g. MongoDB, RavenDB, Amazon DynamoDB). CosmosDB indexes all properties by default. RavenDB supports auto-indexes. MongoDB can scan a collection without an index.
Immediate consistency. A denormalised data structure and the use of aggregates imply duplicated data in collections, which makes immediate consistency harder to achieve.
Unclear requirements. Changing aggregates is expensive.
Inexperienced developers. NoSQL demands knowledge of DDD and CQRS. Eric Evans coined the term DDD (Domain-Driven Design) in 2003. CQRS (Command Query Responsibility Segregation) was introduced by Greg Young in 2010 and is based on CQS (Command–Query Separation), described by Bertrand Meyer in the 1980s.
What’s left outside
- ACID transactions
ACID (Atomicity, Consistency, Isolation, Durability) is far less important when data is modelled as entities and aggregates. Heavy use of transactions may be a red flag.
All main NoSQL vendors support fully ACID-compliant transactions.
- Data Integrity
Lack of validation can be mitigated by
- correctly designing DDD aggregates, so that the number of references outside the aggregate boundaries is drastically reduced;
- updating the outside references in ACID transactions to enforce all-or-nothing execution.
A shared-disk file system uses a SAN (Storage Area Network) to give cluster nodes direct disk access.
Sharding puts different data on separate nodes, each of which does its own reads and writes.
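The routing idea behind sharding can be sketched in a few lines. This is only an illustration under stated assumptions: `ShardRouter`, `StableHash` and `ShardFor` are hypothetical names, and hash-modulo routing is just one simple strategy, not how any particular vendor implements it.

```csharp
using System;
using System.Text;

// Hypothetical helper: maps a document key to one of N shards (nodes).
public static class ShardRouter
{
    // FNV-1a hash. It is stable across processes, unlike string.GetHashCode()
    // in .NET Core, which is randomised per process.
    public static uint StableHash(string key)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619;
            }
            return hash;
        }
    }

    // Every key always lands on the same shard, which then does its own reads and writes.
    public static int ShardFor(string key, int shardCount)
    {
        if (shardCount <= 0) throw new ArgumentOutOfRangeException(nameof(shardCount));
        return (int)(StableHash(key) % (uint)shardCount);
    }

    public static void Main()
    {
        foreach (var key in new[] { "contacts/1", "contacts/2", "orders/1" })
            Console.WriteLine($"{key} -> shard {ShardFor(key, 3)}");
    }
}
```

With plain modulo routing, changing the number of shards remaps most keys, which is one reason rebalancing shows up as extra work on the slide.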
The point of aggregates is to design data structures so that data commonly accessed together is stored together.
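A minimal sketch of an aggregate, assuming a hypothetical order model (`Order`, `OrderLine` and `AggregateDemo` are illustrative names, not taken from the talk). The order and its lines are always read and written together, so they live in one object that serialises to one JSON document:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical child entity: it only exists as part of its order.
public class OrderLine
{
    public string Product { get; set; }
    public int Quantity { get; set; }
    public decimal UnitPrice { get; set; }
}

// Hypothetical aggregate root.
public class Order
{
    public string Id { get; set; }
    public string CustomerName { get; set; } // denormalised copy, duplicated on purpose
    public List<OrderLine> Lines { get; set; } = new List<OrderLine>();

    public decimal Total() => Lines.Sum(l => l.Quantity * l.UnitPrice);
}

public static class AggregateDemo
{
    public static Order BuildSample() => new Order
    {
        Id = "orders/1",
        CustomerName = "Alex Klaus",
        Lines =
        {
            new OrderLine { Product = "Keyboard", Quantity = 2, UnitPrice = 9.50m },
            new OrderLine { Product = "Mouse", Quantity = 1, UnitPrice = 20.00m }
        }
    };

    public static void Main()
    {
        var order = BuildSample();
        // The whole aggregate becomes a single JSON document: no JOINs to read it back.
        var options = new JsonSerializerOptions { WriteIndented = true };
        Console.WriteLine(JsonSerializer.Serialize(order, options));
        Console.WriteLine(order.Total());
    }
}
```

Because the aggregate is self-contained, it is also a natural unit to place on a single node when the data is sharded.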
The implementation details differ from vendor to vendor.
The application code will look exactly the same whether the DB is running on one node or on a big cluster.
Replace “Master-Slave” with “Parent-Child”
Change level of redundancy – 3, 5, 7 nodes
Easy scale up – one node at a time (2 min)
CPU & RAM are heavily involved in building indexes for NoSQL data
Once you combine the costs of hosting a highly resilient database with the costs of maintenance and scaling, NoSQL options can be very cost-effective in the long run, outweighing all the cons.
Old way – separate DBA team
New way – fits microservices. No need for a normalised DB structure
A normalised database forces devs to translate rich objects into a relational representation to store them on disk, and to translate them back when reading.
Complex data mapping leads to expensive maintenance (changes on either side need to be reflected properly in the mapping).
One extra gruelling learning curve, as devs need to understand in depth how the ORM works on top of how the DB works.
Poor performance, as the under-the-hood interactions with the database aren’t pretty.
Nowadays we rarely work with the database in isolation. Usually, the database is part of a complex environment where a back-end or full-stack developer performs various operations, leveraging the tools and technologies at their disposal.
The ability to communicate with the database using the tech stack of a dev’s preference, with less thinking about DB topology, structure, etc., does make a difference. How much of a difference? It’s subjective; you have to try it first.
MongoDB is 10 years old. CosmosDB is 9 years old.
For comparison, Microsoft released SQL Server 7 nine years after v1.
RavenDB is now at v4.2
Open source. Free community license, and free on AWS
Oren works 24/7 – he is the architect at Raven, blogs on Raven, does tech support, speaks at conferences and writes books.
Cloud hosting – better pricing compared to CosmosDB.
Free on AWS
Creating a new record:
- Simple models. No data annotations like in EF. No boilerplate
Querying records:
- LINQ support (executed on DB)
- No manual data mapping
Raven Studio: Created record
Querying in Raven Studio
Store derived classes in one collection
Query them together or independently
Cosmonaut SDK is close.
CosmosDB + EF Core 3
Auto-indexes and manual indexes
Map-reduce for grouping
Text search (tokenizing words)
Docs – Official website (Beginners, Samples). Worse than Angular’s docs, comparable to PostgreSQL’s
Community – Relatively small on Stack Overflow, not many articles.
Old articles (new engine since 2018)
Tech support – on Google Groups. Prompt response from devs. Bug fixes.
Raven supports server-side JavaScript execution
Executes the JavaScript in Jint (open source JavaScript Interpreter for .NET)