Being RDBMS Free
Alternative Approaches to Data Persistence
DAVID HOERSTER
About Me
C# MVP (Since April 2011)
Sr. Solutions Architect at Confluence
One of the Conference Organizers for Pittsburgh TechFest
Past President of Pittsburgh .NET Users Group and organizer of recent Pittsburgh Code Camps and other Tech Events
Twitter - @DavidHoerster
Blog – http://blog.agileways.com
Email – david@agileways.com
Goals
To allow you to achieve a zen-like state by never having to decide between a left and right outer join
Goals
Show that a non-relational solution can be considered an option
Identify areas of a traditional application that could use a non-relational solution
Introduce some non-relational tools
Show how those tools would be used in a .NET solution (CODE!)
Traditional Architecture
Data persistence is central to application
Generally monolithic
Jack of all trades; master of none
Traditional Architecture
[Diagram: Client, Web Server, App Server, and a single Data Repository holding App Data, Session, Cache (?), Full Text Search, and Audit]
Consider…
An online employment application
Wizard interface, with 9-12 steps
Most data is 1:1 across steps, but some data is 1:many
How to best structure 1:1 data
◦ 6-8 tables, linked by ID?
◦ Or one wide table with lots of nullable columns?
◦ What about joining?
How about 1:many data
◦ Several tables with 1:* relationships, which also needs to be joined
Don’t forget searching!!!
[Diagram: Applicant table linked to General, Disclosure, Attestation, Skills, Empl, and Educat’n tables]
Database Thaw
I'm confident to say that if you starting a new strategic enterprise
application you should no longer be assuming that your
persistence should be relational. The relational option might be the
right one - but you should seriously look at other alternatives.
-- Martin Fowler (http://martinfowler.com/bliki/PolyglotPersistence.html)
Monolithic Data Persistence
Provides consistency, but…
Is it always the best tool for every job?
Is it easy for prototyping / rapid development?
Consider
◦ How data will be used
◦ What kinds of data you’ll have
Why Non-Relational
Use Case – Company Intranet / CMS
Overall objective is a CMS-like app for a company’s intranet content
Usage is mostly read-only, with pages and attachments
◦ Pages, attachments, searching, admin, etc.
Traditional database could be multiple tables with 1:1 relationships and some 1:many relationships
Lots of joins for a page
…or a single document
What if…
We could break some pieces out
◦ Flatten structures for querying
◦ Highly efficient search services
◦ Pub/sub hubs
◦ Remote caching with excellent performance
◦ Session management outside a DB for load balanced environments
How would app then be architected?
…but consider the costs
Learning curve
Distributed systems
Compensating transactions
Consider this with
◦ Data
◦ Searching
◦ Caching/Session
◦ Auditing
Data Storage
Typically, RDBMS is the de facto standard
◦ SQL Server
◦ MySQL
◦ PostgreSQL
◦ Oracle (Yikes!!)
But do you really need it?
Data Storage
Get all the orders for user ‘David’ in last 30 days
SELECT c.FirstName, c.MiddleName, c.LastName, soh.SalesOrderID, soh.OrderDate,
sod.UnitPrice, sod.OrderQty, sod.LineTotal,
p.Name as 'ProductName', p.Color, p.ProductNumber,
pm.Name as 'ProductModel',
pc.Name as 'ProductCategory',
pcParent.Name as 'ProductParentCategory'
FROM SalesLT.Customer c INNER JOIN SalesLT.SalesOrderHeader soh
ON c.CustomerID = soh.CustomerID
INNER JOIN SalesLT.SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID
INNER JOIN SalesLT.Product p ON sod.ProductID = p.ProductID
INNER JOIN SalesLT.ProductModel pm ON p.ProductModelID = pm.ProductModelID
INNER JOIN SalesLT.ProductCategory pc ON p.ProductCategoryID = pc.ProductCategoryID
INNER JOIN SalesLT.ProductCategory pcParent ON pc.ParentProductCategoryID = pcParent.ProductCategoryID
WHERE c.FirstName = 'David'
AND soh.OrderDate > (GETDATE()-30)
Data Storage
Wouldn’t it be great if it were something like this?
SELECT FirstName, MiddleName, LastName, SalesOrderID, OrderDate,
UnitPrice, OrderQty, LineTotal, ProductName, Color, ProductNumber,
ProductModel, ProductCategory, ProductParentCategory
FROM CustomerSales
WHERE FirstName = 'David'
AND OrderDate > (GETDATE()-30)
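As a rough sketch of what that flattened query could look like from .NET, the snippet below filters a denormalized read model with LINQ. The CustomerSale class, its trimmed property list, and the CustomerSalesQueries helper are hypothetical names invented for illustration; the data is an in-memory list, not a real store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened read model: one record per order line, with the
// customer, product, and category names already copied in (no joins needed).
public class CustomerSale
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int SalesOrderID { get; set; }
    public DateTime OrderDate { get; set; }
    public decimal UnitPrice { get; set; }
    public int OrderQty { get; set; }
    public string ProductName { get; set; }
    public string ProductCategory { get; set; }
}

public static class CustomerSalesQueries
{
    // Mirrors: WHERE FirstName = @firstName AND OrderDate > (now - 30 days).
    // "now" is passed in so the query is repeatable in tests.
    public static List<CustomerSale> RecentOrdersFor(
        IEnumerable<CustomerSale> sales, string firstName, DateTime now)
    {
        var cutoff = now.AddDays(-30);
        return sales
            .Where(s => s.FirstName == firstName && s.OrderDate > cutoff)
            .OrderBy(s => s.OrderDate)
            .ToList();
    }
}
```

The trade-off is that the names are duplicated on every record, so a product rename means rewriting the affected records.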
Data Storage
Maybe a document database can be of use
A number of them are out there
◦ MongoDB
◦ RavenDB
◦ Couchbase
Consolidated structures without relational ties to other collections
Object databases
Why Document Database
Quick prototyping
Application usage that lends itself to persisting objects
Consider how your data will be used before adopting one
Avoid “cool factor”
Consider performance
◦ “NoSQL is so much faster...”
◦ Um, not always…
Looking at MongoDB
Server can have databases
Databases contain collections (like a table)
Collections contain documents (like rows)
Documents can be structured, have hierarchies, indexes, primary key
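To make "structured, with hierarchies" concrete, here is a minimal sketch of a CMS page modeled as one document with its attachments nested inside it. The Page and Attachment shapes and the PageJson helper are assumptions made up for this example, and System.Text.Json stands in for the database: MongoDB itself stores BSON and names the primary key field _id.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical child object; it lives inside the page, not in its own table.
public class Attachment
{
    public string FileName { get; set; }
    public string ContentType { get; set; }
}

// Hypothetical page document: scalar fields plus nested collections.
public class Page
{
    public string Id { get; set; }          // primary key, e.g. "hr/benefits"
    public string Title { get; set; }
    public string Body { get; set; }
    public List<string> Tags { get; set; } = new List<string>();
    public List<Attachment> Attachments { get; set; } = new List<Attachment>();
}

public static class PageJson
{
    private static readonly JsonSerializerOptions Options =
        new JsonSerializerOptions { WriteIndented = true };

    // The whole hierarchy is written as a single document.
    public static string ToJson(Page page)
    {
        return JsonSerializer.Serialize(page, Options);
    }

    // Reading the document back restores the page and its children, no joins.
    public static Page FromJson(string json)
    {
        return JsonSerializer.Deserialize<Page>(json, Options);
    }
}
```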
Working with Mongo’s C# Client
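using System;
using System.Collections.Generic;
using System.Linq;
using MongoDB.Driver;
using MongoDB.Driver.Linq;

// Wraps one MongoDB collection behind IContext<T> (legacy 1.x C# driver API).
public class MongoContext<T> : IContext<T> where T : class, new() {
    private readonly IDictionary<String, String> _config;
    private readonly MongoCollection<T> _coll;

    public MongoContext(IDictionary<String, String> config) {
        _config = config;
        var client = new MongoClient(config["mongo.serverUrl"]);
        var server = client.GetServer();
        var database = server.GetDatabase(config["mongo.database"]);
        _coll = database.GetCollection<T>(config["mongo.collection"]);
    }

    // Use the driver's LINQ provider so Where clauses run on the server;
    // FindAll().AsQueryable() would load every document and filter in memory.
    public IQueryable<T> Items {
        get { return _coll.AsQueryable(); }
    }
}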
Working with Mongo’s C# Client
Encapsulate my queries and commands
public class FindPageById : ICriteria<Page> {
    private readonly String _id;

    public FindPageById(String pageId)
    {
        _id = pageId;
    }

    public IEnumerable<Page> Execute(IContext<Page> ctx)
    {
        return ctx.Items.Where(p => p.Id == _id);
    }
}
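One benefit of putting queries behind a context is that they can be exercised without a database. The sketch below assumes shapes for IContext<T> and ICriteria<T> (the slides use them without showing their definitions), adds a hypothetical InMemoryContext<T>, and trims Page down to two properties.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape: the slides use IContext<T> without defining it.
public interface IContext<T> where T : class, new()
{
    IQueryable<T> Items { get; }
}

// Assumed shape: the slides use ICriteria<T> without defining it.
public interface ICriteria<T> where T : class, new()
{
    IEnumerable<T> Execute(IContext<T> ctx);
}

// Hypothetical stand-in for MongoContext<T>, backed by a plain list.
public class InMemoryContext<T> : IContext<T> where T : class, new()
{
    private readonly List<T> _items;

    public InMemoryContext(IEnumerable<T> items)
    {
        _items = items.ToList();
    }

    public IQueryable<T> Items
    {
        get { return _items.AsQueryable(); }
    }
}

// Trimmed-down page type, for illustration only.
public class Page
{
    public string Id { get; set; }
    public string Title { get; set; }
}

// Same query object as on the slide; it only depends on IContext<Page>.
public class FindPageById : ICriteria<Page>
{
    private readonly string _id;

    public FindPageById(string pageId)
    {
        _id = pageId;
    }

    public IEnumerable<Page> Execute(IContext<Page> ctx)
    {
        return ctx.Items.Where(p => p.Id == _id);
    }
}
```

Because the query never mentions Mongo, swapping the context is enough to run it in a unit test.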
Working with Mongo’s C# Client
Invoke my query/command
public class TemplateController : MyBaseController {
    private readonly IContext<Page> _pageCtx;

    public TemplateController(IContext<Page> ctx) : base() {
        _pageCtx = ctx;
    }

    [HttpGet]
    public IportalPageMetadata Section(String cat, String page) {
        var id = String.Format("{0}/{1}", cat, page);
        var thePage = new FindPageById(id)
            .Execute(_pageCtx)
            .FirstOrDefault();
        ...
    }
}
Working with Mongo’s C# Client
Writing to Mongo is just as simple...
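[HttpPost]
public async Task<Boolean> Post(Page page)
{
    // await is only legal in an async method, so this returns Task<Boolean>
    var userId = await GetUserId();
    // Run the CreatePage command against the page context, then hand the page to _searchPage
    new CreatePage(page, userId)
        .Execute(_pages);
    _searchPage.Insert(page);
    return true;
}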
Evolving Architecture
[Diagram: Client, Web Server, App Server; a Data Repository holding Search, Some data (?), Session, and Cache (?); a Document Repository with Write and Query paths]
How do you search?
◦ LIKE ‘%blah%’ ?
◦ Dynamic SQL
◦ Full-Text
LIKE and Dynamic SQL can be quick to create
◦ Tough to maintain
Full-Text gives power
◦ Limited in search options
Search
A number of search tools are out there, such as
◦ Lucene
◦ Solr
Lucene is a search engine
◦ Embed in apps
◦ .NET port (Lucene.NET)
Solr is a search service
◦ Built on Lucene
◦ Connect apps to it
Searching with Solr
Disconnected from your application
Search content via HTTP REST calls
Can use SolrNet as a client
◦ https://github.com/mausch/SolrNet
Document-based
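Since Solr is reached over HTTP, a search is ultimately just a URL. The sketch below builds a select URL using standard Solr query parameters (q, fl, hl, hl.fl, hl.simple.pre, hl.simple.post, hl.fragsize, wt). The SolrUrl helper is hypothetical, and the host and core name in the usage note are placeholders, not values from the talk.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SolrUrl
{
    // baseUrl is the URL of a Solr core, supplied by the caller.
    public static string BuildSelect(string baseUrl, string searchString)
    {
        var parameters = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("q", searchString),
            // Stored fields to return for each hit
            new KeyValuePair<string, string>("fl", "title,body,lastModified"),
            // Highlighting: which fields, how to wrap matches, snippet size
            new KeyValuePair<string, string>("hl", "true"),
            new KeyValuePair<string, string>("hl.fl", "title,body"),
            new KeyValuePair<string, string>("hl.simple.pre", "<strong><em>"),
            new KeyValuePair<string, string>("hl.simple.post", "</em></strong>"),
            new KeyValuePair<string, string>("hl.fragsize", "100"),
            // Ask for a JSON response
            new KeyValuePair<string, string>("wt", "json")
        };

        // Percent-encode every name and value so user input cannot break the URL.
        var query = string.Join("&", parameters.Select(p =>
            Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));

        return baseUrl.TrimEnd('/') + "/select?" + query;
    }
}
```

For example, SolrUrl.BuildSelect("http://localhost:8983/solr/pages", "employee benefits") yields a URL that any HTTP client can GET. A client library such as SolrNet builds this kind of request and parses the response for you.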
Searching with Solr
using System;
using System.Collections.Generic;
using SolrNet;
using SolrNet.Commands.Parameters;

public class SolrSearchProvider<T> {
    private readonly ISolrOperations<T> _solr;

    public SolrSearchProvider(ISolrOperations<T> solr) { _solr = solr; }

    public IEnumerable<T> Query(String searchString) {
        var options = new QueryOptions {
            // Stored fields to return for each hit
            Fields = new[] { "title", "body", "lastModified" },
            // Wrap matched terms in title/body snippets with a fragment size of 100
            Highlight = new HighlightingParameters {
                BeforeTerm = "<strong><em>",
                AfterTerm = "</em></strong>",
                Fields = new[] { "title", "body" },
                Fragsize = 100
            }
        };
        return _solr.Query(new SolrQuery(searchString), options);
    }
}
Evolving Architecture
[Diagram: Client, Web Server, App Server; a Data Repository holding Some data (?), Session, and Cache (?); a Search Service with Query and Write paths; a Document Repository with Write and Query paths]
Session and Cache Data
Generally short-lived for users
Fairly static for cached data
Key/value stores can serve us well here
◦ Redis
Redis has two good .NET client libraries
◦ StackExchange.Redis
◦ ServiceStack.Redis
Using Redis
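using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using StackExchange.Redis;

public class RedisSessionManager : ISessionManager {
    // One multiplexer is shared by all instances; it is meant to be reused
    private static ConnectionMultiplexer _redis = null;
    private readonly IDictionary<String, String> _config;

    public RedisSessionManager(IDictionary<String, String> config) {
        if (_redis == null) {
            _redis = ConnectionMultiplexer.Connect(config["session.serverUrl"]);
        }
        _config = config;
    }

    // Not shown on the slide; assumed to be a thin wrapper over GetDatabase()
    private IDatabase RedisDatabase {
        get { return _redis.GetDatabase(); }
    }

    public async Task<Boolean> CreateSessionAsync(String portalId, String userId, String fullName) {
        var time = DateTime.UtcNow.ToString();
        // Assumes "session.timeout" holds a whole number of minutes
        var timeout = Int32.Parse(_config["session.timeout"]);
        var vals = new HashEntry[] {
            new HashEntry("userid", userId), new HashEntry("login", time),
            new HashEntry("lastAction", time), new HashEntry("fullName", fullName)
        };
        // Store the session as a hash, then let Redis expire the key on its own
        await RedisDatabase.HashSetAsync(portalId, vals);
        return await RedisDatabase.KeyExpireAsync(portalId, TimeSpan.FromMinutes(timeout));
    }
}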
Using Redis
public async Task<Boolean> ExtendSessionAsync(String portalId) {
    // Assumes "session.timeout" holds a whole number of minutes
    var timeout = Int32.Parse(_config["session.timeout"]);
    // Record the activity, then push the key's expiry out again
    await RedisDatabase.HashSetAsync(portalId, "lastAction",
        DateTime.UtcNow.ToString());
    return await RedisDatabase.KeyExpireAsync(portalId,
        TimeSpan.FromMinutes(timeout));
}

public async Task<Boolean> ExpireSessionAsync(String portalId) {
    // Deleting the key ends the session immediately
    return await RedisDatabase.KeyDeleteAsync(portalId);
}
Using Redis
At login (to stick session id in a cookie):
    await Session.CreateSessionAsync(portalId, userId, fullName);
Upon log out:
    await Session.ExpireSessionAsync(sessionCookie.Value);
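The create / extend / expire calls above amount to a sliding timeout: every extension restarts the clock, and a session that is not extended in time disappears. The sketch below imitates that behavior in memory so it can be reasoned about without a Redis server. InMemorySessionStore and its injected clock are hypothetical, it is not thread-safe, and it is not a replacement for Redis in a load-balanced setup.

```csharp
using System;
using System.Collections.Generic;

public class InMemorySessionStore
{
    private class Entry
    {
        public Dictionary<string, string> Fields = new Dictionary<string, string>();
        public DateTime ExpiresAt;
    }

    private readonly Dictionary<string, Entry> _sessions = new Dictionary<string, Entry>();
    private readonly TimeSpan _timeout;
    private readonly Func<DateTime> _utcNow;   // injected clock, so tests can move time

    public InMemorySessionStore(TimeSpan timeout, Func<DateTime> utcNow)
    {
        _timeout = timeout;
        _utcNow = utcNow;
    }

    // Like HashSet + KeyExpire: store the fields and start the timeout.
    public void Create(string sessionId, string userId, string fullName)
    {
        var now = _utcNow();
        var entry = new Entry { ExpiresAt = now + _timeout };
        entry.Fields["userid"] = userId;
        entry.Fields["login"] = now.ToString("o");
        entry.Fields["lastAction"] = now.ToString("o");
        entry.Fields["fullName"] = fullName;
        _sessions[sessionId] = entry;
    }

    // Like ExtendSessionAsync: only a live session can be extended.
    public bool Extend(string sessionId)
    {
        Entry entry;
        if (!TryGetLive(sessionId, out entry)) return false;
        var now = _utcNow();
        entry.Fields["lastAction"] = now.ToString("o");
        entry.ExpiresAt = now + _timeout;
        return true;
    }

    // Like KeyDelete: returns true if a session was removed.
    public bool Expire(string sessionId)
    {
        return _sessions.Remove(sessionId);
    }

    public bool IsActive(string sessionId)
    {
        Entry entry;
        return TryGetLive(sessionId, out entry);
    }

    // Treats an entry past its expiry as gone, and cleans it up.
    private bool TryGetLive(string sessionId, out Entry entry)
    {
        if (_sessions.TryGetValue(sessionId, out entry) && entry.ExpiresAt > _utcNow())
            return true;
        _sessions.Remove(sessionId);
        entry = null;
        return false;
    }
}
```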
Evolving Architecture
[Diagram: Client, Web Server, App Server; a Data Repository holding Some data (?); a Search Service with Query and Write paths; a Document Repository with Write and Query paths; a Session/Cache Service]
Why Data Store
We’re left with a relational database that has little left to do
◦ Transactional data in document store
◦ Search documents in Solr
◦ Session, caching, etc. in key/value or caching service like Redis
What it probably ends up acting as is…
Evolving Architecture
[Diagram: Client, Web Server, App Server; an Event Store (2-3 flat tables of event data); a Search Service with Query and Write paths; a Document Repository with Write and Query paths; a Session/Cache Service; Queue?]
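An event store with a couple of flat tables can be pictured as an append-only log plus code that replays it. The sketch below is one possible reading of that idea, not the design from the talk: StoredEvent, InMemoryEventStore, PageProjection, and the event type names are all hypothetical, and a list stands in for the table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One flat row per event; Payload would typically hold serialized data.
public class StoredEvent
{
    public long Sequence { get; set; }
    public string StreamId { get; set; }
    public string EventType { get; set; }
    public string Payload { get; set; }
    public DateTime OccurredUtc { get; set; }
}

public class InMemoryEventStore
{
    private readonly List<StoredEvent> _events = new List<StoredEvent>();

    // Events are only ever appended, never updated or deleted.
    public StoredEvent Append(string streamId, string eventType, string payload, DateTime occurredUtc)
    {
        var evt = new StoredEvent
        {
            Sequence = _events.Count + 1,
            StreamId = streamId,
            EventType = eventType,
            Payload = payload,
            OccurredUtc = occurredUtc
        };
        _events.Add(evt);
        return evt;
    }

    // Returns one stream's events in the order they were written.
    public IReadOnlyList<StoredEvent> ReadStream(string streamId)
    {
        return _events
            .Where(e => e.StreamId == streamId)
            .OrderBy(e => e.Sequence)
            .ToList();
    }
}

public static class PageProjection
{
    // Replays a page's events to work out its current title (null if deleted).
    public static string CurrentTitle(IEnumerable<StoredEvent> events)
    {
        string title = null;
        foreach (var e in events)
        {
            if (e.EventType == "PageCreated" || e.EventType == "PageRenamed")
                title = e.Payload;
            else if (e.EventType == "PageDeleted")
                title = null;
        }
        return title;
    }
}
```

The same log doubles as the audit trail, and projections like this one are what would feed the document repository and search service.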
(D)Evolved Architecture
[Diagram: Client, Web Server, App Server; Event Store; Search Service with Query and Write paths; Doc Repo with Write and Query paths; Session/Cache Service; Queue?]
(D)Evolved Architecture
Pick and choose what components work best
Don’t use them just to use them
Proof-of-Concept / Prototype
Why look to be RDBMS free
Searching
◦ More than just full-text needs
Data
◦ Choose a system that lets you model the business
◦ Not the other way around
Caching / Session Values / PubSub
◦ Is offloading necessary?
◦ Ensure performance
Maintenance and support are big factors to consider
Consider data usage/architecture before just jumping in
Tools
MongoDB
◦ http://mongodb.org
◦ RoboMongo http://robomongo.org
◦ Perf Best Practices http://info.mongodb.com/rs/mongodb/images/MongoDB-Performance-Best-Practices.pdf
◦ Operations Best Practices http://info.mongodb.com/rs/mongodb/images/10gen-MongoDB_Operations_Best_Practices.pdf
Solr
◦ http://lucene.apache.org/solr/
Redis
◦ http://redis.io/
◦ Redis Manager http://redisdesktop.com/
