A Tale of 3 Databases
Chris Skardon
@cskardon | chris@tournr.com
Before We Begin
Contents
• Prologue
• Chapter 1 – SQL
• Chapter 2 – RavenDB
• Chapter 3 – Neo4j
• Epilogue
Prologue
What is Tournr?
• Competition running site
• Any competition,
• Sport (running, swimming)
• Game (scrabble, chess)
• Other…
• Help organisers and competitors have a permanent record
• Unified Ranking
Chapter 1
SQL
WISA
The traditional entry point for a .NET developer
• Windows
• Internet Information Services (IIS)
• MS SQL Server 2012
• Entity Framework (ORM)
• ASP.NET
The Good
• Lots of common knowledge
• SQL is well-known
• Standard
• Libraries
• Language
• Skills
• Cheap!
INTERLUDE – The Generalist
• The Swiss-Army Knife of development
• Good at lots of things
• But you’re not going to want to cut a tree down with it
• I can write SQL
• But I’m no expert
The Bad
• Changes to the model involve writing SQL
• Lots of SQL
• Running scripts against DBs,
• then resetting,
• then…
• Slooooooow turnaround
• Feels so 2000
6 Months Later…
• Adding features to Tournr was becoming a bit onerous
• Lookup Tables to more Lookup Tables
• Playing around with RavenDB
• *PING* Maybe a good idea for Tournr?
Chapter 2
Document DB (RavenDB)
WISA to WIRA
• Windows / IIS / RavenDB / ASP.NET
• Took about a month, maybe a smidge more
• This is partly due to a lack of users – so migration not an issue
• Threw away a lot of code (this is good btw)
The Good
• RavenDB has 1st class .NET integration
• Really – it’s awesome
• Supports LINQ
• Adding new features – very quick
• Tournr would not exist without RavenDB
• Simple to set up (dev wise at least)
• Embeddable
• Testing against an actual DB instance
• Dynamic Indexing
Example
using (var session = RavenSession)
{
    // Load the document, change it, and let the session track the change.
    var tournament = session.Load<Tournament>(id);
    tournament.Name = "New Name";
    session.SaveChanges();
}
INTERLUDE – Disclaimer
• Struggled with TECH X eh? That’s because:
• You’re doing it wrong…
• You don’t understand…
• You’re being vindictive…
• <insert your own reason here>
• Probably totally correct (except the vindictive one), and these are really MY experiences!
The Bad
• The documentation is sucky
• Niche DB
• Getting help is sometimes hard
• Especially for edge cases
• Hiring
• Forced into a way of architecting your code
• Ayende’s way or the Highway
• Manual linking of documents
• Documents grow
• Hosting
A year or so later…
• Tournr was doing well
• Adding features was going OK, but…
• Beginning to hit the wall of the design
• Duplicates appearing in the documents (by design obvs)
• Drawing out a new feature – looked like a graph
• But I’m not using a Graph DB
• I could though right?
Chapter 3
Graph DB (Neo4j)
Am I being foolish?
• A question for every day
• Negatives:
• Switching DBs - you can’t add new stuff whilst doing it
• I have users this time
• Positives:
• I’ve been using Neo4j for 3 years or so now, so have experience
• Fits the model better
JUST DO IT!
WIRA to WINA
• Windows / IIS / Neo4j / ASP.NET
• NuGet -> Neo4jClient
• Actually:
• Azure Web Apps / GrapheneDB / ASP.NET
• Is AWAGA as good?
Membership
• Important Quick Win
• MVC3 Membership -> MVC5 Identity
• A pain in itself
• The only ‘architectural’ change aside from DB changes
• Why?
• No existing Neo4j Membership implementation
• An existing Neo4j Identity implementation
• Switched over, and it worked
• No-one more surprised than me (seriously)
The “Process”
1. Remove RavenDB
2. Break build entirely
3. Fix build
4. Start putting Model 1 into place
5. Decide to go with Model 2
6. Pull out Model 1 code
7. Replace with Model 2 code
8. Repeat
The ‘Models’
• Evolved over time
• First model pretty much thrown out
• Getting to market was priority
• Ease of writing queries / code over correctness
Model 1
USER -[:HAS_PERSONAL_DETAILS]-> PERSONAL DETAILS
Model 2
USER
• PersonalDetails { get; }
• Email { get; }
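As a rough Cypher sketch of the difference between the two models: only USER, PERSONAL DETAILS, HAS_PERSONAL_DETAILS, PersonalDetails and Email come from the slides. The Gender and Country properties, the sample values, and the idea that the nested object ends up stored as a JSON string are assumptions for illustration.

```cypher
// Model 1: personal details live on their own node, linked to the user.
CREATE (u:User {Email: 'someone@example.com'})
CREATE (pd:PersonalDetails {Gender: 'F', Country: 'GB'})
CREATE (u)-[:HAS_PERSONAL_DETAILS]->(pd);

// Model 2: personal details are serialized (assumed here: a JSON string)
// into a single property on the user node. One node, no relationship.
CREATE (u:User {
  Email: 'someone@example.com',
  PersonalDetails: '{"Gender":"F","Country":"GB"}'
});
```

The trade-off shows up at query time. With Model 1 a filter such as `WHERE pd.Country = 'GB'` is possible, whereas with Model 2 the details are opaque to Cypher until the property is pulled back out into the model.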
Points of change – New stuff
• Custom Json Serializers
• Not needed before
• Cypher Queries
• All DB access was changed
• Single biggest area of changes
Points of change – Removed stuff
• Controller DB Access code
• Raven’s approach not really appropriate for Neo4j
So… It compiles – job done!
• Looks like it’s working…
• Uh oh – problems
• Missing / Broken functionality
• Performance is SHOCKINGLY bad
• 30+ seconds to load a page
• Don’t worry Neo4j fans! It’s my fault…
Problems you say… why?
• Change of mindset required
• First initial code change – simple, and dumb
• Multiple queries = SLOW
• Always try to get the most out of one query
• Literally – if a controller action required 2 calls, make that 1
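A minimal sketch of what "make that 1" can look like in Cypher. The Tournament and Competitor labels, HAS_COMPETITOR and the Name value are borrowed from the example query elsewhere in this deck. The shape of the queries is an assumption, not Tournr's actual code.

```cypher
// Two round trips: one call for the tournament, another for its competitors.
MATCH (t:Tournament)
WHERE t.Name = 'Awesome Comp #1'
RETURN t;

MATCH (t:Tournament)-[:HAS_COMPETITOR]->(c:Competitor)
WHERE t.Name = 'Awesome Comp #1'
RETURN c;

// One round trip: fetch the tournament and collect its competitors together.
// OPTIONAL MATCH keeps the tournament row even when it has no competitors,
// in which case collect(c) returns an empty list.
MATCH (t:Tournament)
WHERE t.Name = 'Awesome Comp #1'
OPTIONAL MATCH (t)-[:HAS_COMPETITOR]->(c:Competitor)
RETURN t, collect(c) AS competitors;
```

The single query returns a bigger result shape, but the page pays for only one trip to the database.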
The Good
• Quick & Easy to install
• Good modelling
• Adding new features quick
• Documentation
• Cypher
MATCH (t:Tournament)-[:HAS_COMPETITOR]->(c:Competitor)
WHERE t.Name = 'Awesome Comp #1'
RETURN c
The Bad
• Niche DB
• Support
• Hiring
• Cypher can cause muddles
• Hosting
Epilogue
Or “Was it worth it?”
Model
[Diagram: graph model with User, Comp, 750m and 1.5Km nodes, linked by IS_OWNER, IS_COMPETING (×2) and HAS_EVENT (×2) relationships]
To
Code - Max
[Chart: IL Cyclomatic Complexity, Max LOC per Method, Methods per Type]
Code - Averages
[Chart: IL Cyclomatic Complexity, LOC per Method, Methods per Type]
Future Development
• With Model – easier to ‘see’
• Some frustrations over how to do some things
Appendix 1
Or “Where the book analogy shows signs of being tired”
What should you take from this?
• Converting from SQL / N.E.Other Database
• Totally Doable
• Is it easy?
• Yes and no
• Is it worth it?
• Depends on your use-case
• Does a graph model provide a closer analogy to your work space?
• Would I do it again?
• Yes
Further Reading
Links!
• RavenDB – http://ravendb.net/
• Neo4j – http://neo4j.org/
• Tournr – https://www.tournr.com/
• XClave - http://www.xclave.co.uk/
The End.
Questions?
@cskardon | chris@tournr.com


Editor's Notes

  • #2 Hello! I’m going to try to keep these notes bits with relevant info regarding the slides – for example, this is the intro slide, that’s my name, and my twitter and email address. Tournr is my website, and the title is self explanatory – but in case… I’m a freelance software engineer, working out of Cornwall, but for companies in a variety of locations, both remotely and in-house if needs be. I focus on .NET development, and at the moment I manage the Neo4jClient (.NET Neo4j Client) and a few other projects.
  • #3 These are MY opinions, a lot is probably wrong, and I could have done things in lots of different ways, but it’s worked (spoiler!) for me. As with most things in development, there are lots of ways to do things, some do work, some don’t, and experimentation is always recommended. It’s also worth noting this isn’t a guide or ‘how to’ – code wise – there’s going to be very little, mainly as I’m not sure code is helpful for such a high level overview – the examples I will use will be just to illustrate differences, nothing major.
  • #4 This is the overview of the talk – it’s styled as a book with chapters. I may well regret this analogy / scheme later on. Generally speaking this talk is about the conversion of a project (website) from one db to another, to another, so in light of that, it’s worth having a brief overview of that project. Next up we’ll go over version 0, or the SQL version, Then we’ll hit RavenDB – a .NET document DB – which powered version 1 of the site Finally we get to the section you’re probably most interested in – V2, or the Graph (Neo4j) edition.
  • #5 Without further ado – let’s discuss Tournr.
  • #6 Tournr is a Sports / Games competition running site It’s designed to help an organiser manage a competition, and help competitors keep track of their competing history over all of their competitions. It doesn’t care what the sport/game is – It has a unified ranking system underpinning everything allowing me to rank competitors against each other (as you’d expect ranking to), which allows me to help sponsors / interested parties find the best people in the world (!!) at a given thing. It started about 2/3 years ago on the back of a competition I ran (and continue to for Surf Kayaking), and has been slowly building up usage, both in sport types and competitors.
  • #7 Microsoft SQL Server – a relational database (RDBMS) – allowing you to store everyday things in tables, and link them together with handy constraints and keys, giving you all-important referential integrity. The daddy of databases, and the de facto standard for most development. I can count on one finger the number of times I’ve been in a ‘project startup’ meeting where the DB choice hasn’t been automatically set as an RDBMS of one sort or another. Who am I to break that trend?
  • #8 Tournr began life as an ASP.NET MVC3 (Model View Controller) application, and as such was based on the standard ‘.NET stack’ at the time. So, it was hosted on IIS on a Windows Server, used SQL Server 2012 for its database (accessed via EF), all displayed in ASP.NET pages using the Razor engine. Even today, if you fire up a new MVC application in VS, you’ll get a SQL Server EF based application (albeit with SQL Server Express / whatever it’s called today). The names of the various servers / technologies change quite a bit, so switching from MVC3 to MVC5, for example, is non-trivial. It’s worth noting that up until this point I had been a desktop developer – from WinForms to WPF (via some PubSub services) up to and including (ahem) Silverlight. I needed this to be easy to begin with!
  • #9 Let’s start off on a positive note. If you want to be developing against a framework which has bucketloads of information about it – including ‘Getting Started’ type docs – this is the one to go with. Of the three dev choices in this talk this is by far the best documented, and that’s all down to history. SQL Server is well known – there may be X iterations, but each is similar to the previous. SQL itself is well known, so problems there can be easily ironed out. SQL & .NET together – well known, with standard libraries (EF is at 5 or 6 now I think); leaving that aside, you have had access to SQL from .NET since day 1 (hello SqlDataAdapter). If you were looking to hire someone, you can get a lot of developers with SQL experience (even I have SQL experience). It’s relatively cheap to host, due to the number of hosting companies (from Azure to WinHosts and those in-between), and you can split your hosting depending on how you feel. Regardless – competition drives down prices!
  • #10 I think this covers most developers: you might specialise in certain areas, but the development space nowadays is too big to be an ‘expert’ at it all. At best I think you can really get focused in a couple of areas, but there is the risk of being lost in that space if you get too narrow in focus. Over time I’ve written software for a variety of clients, as an employee and as a freelancer, and the moving of jobs / projects doesn’t allow you to really go ‘deep’ in a technology. Generally I think you tend to get specialised for a given project – and then move to another, and have to develop different skillsets – all of which makes for a rounded developer. Case in point – I’ve never needed to write SQL particularly. I know some stuff, and I can write basic SQL for adding, editing and deleting things, but will have to look it up on Google. In general I’ve worked with DBAs who are specialised in that area – whilst I’m specialised in the code – and I think it maps together nicely. With that in mind…
  • #11 I don’t like writing SQL – No, that’s not true - I don’t like writing lots of SQL – in particular scripts for updating tables, fields etc – get painful over time, you end up having a large list of files that need to be executed in a given order – and that invariably leads to mistakes (for me at least) and long hard release procedures. Now, EF does a pretty good job at this – but still – it’s more than I want to do! I found the turnaround to add new features was getting slower and slower and more complicated – which obviously caused delays in fixing problems, or even just adding new things I wanted. This is definitely a problem for me – I’m pretty sure if I was super confident with SQL – this wouldn’t have mattered – but then I probably would lack confidence elsewhere. Also – I’m writing a ‘cutting edge’ site – shouldn’t I be using cutting edge technology – use this as a learning experience?
  • #12 When you start a new project, you are full of ideas, some good – some not so. Inevitably, those ‘not so’ creep into the program. I had a few of those, tables looking up values in other tables which were in turn… you get the point. I was nearing the point of giving up, when I started to play with RavenDB at my job (still permy at that point!) – As part of a ‘tech talks’ event that I’d been running. The challenge had been to build a fully fledged web application in under an hour from scratch. Using RavenDB I’d got that working – and that made me think about converting Tournr. The problem is – that any database conversion takes time, and is totally non-trivial.
  • #13 RavenDB is a native .NET document database built by Oren Eini (Ayende Rahien). I started using it around version 2.0x and whilst using it – it went to 2.5x. Searching for RavenDB will pop it up easily enough, and usually there are samples to get you going – but we’ll cross that bridge later. A document DB is a database that (simply put) stores documents, generally JSON objects, and works on the principle that a given document normally contains all the data you need – you don’t tend to go back too much to get more info, or perform joins etc.
  • #14 So we’re starting to modify the basic template, but still sticking with native .net stuff – and that’s ok! It took me about a month, maybe a bit more to do the conversion – I was blessed with a few things – firstly I had very few users – it had only been used to run one competition (my own) and all the people registered were friends, so they were accommodating. I also don’t charge to use Tournr – so what can they complain about eh? I was also willing to dump a lot of code – this is important early on as it meant I could release myself from the circles I’d got tied into. The code became a bit more lean, and didn’t have the DB projects, buuuut did have some setup – you win some, you lose some.
  • #15 So – the good. Well, RavenDB is awesomeballs in .NET – really – it’s exceptional; the interface and usability are top notch initially. It supports LINQ – though that can catch you out, as there are some things LINQ does that it doesn’t (think ‘.Contains’). I could turn around new features in days, sometimes hours, without needing to worry about scripting; as long as I understood what happened when adding/removing properties I was all good. Genuinely – Tournr would not exist without RavenDB. I think I would have given up in the SQL world – Raven got me going, and fast. It’s simple to set up: just run the .exe with a /install flag and you’re off. You can embed the DB, so testing can be done against an actual DB instance – which is very powerful. Dynamic Indexing is super-cool: generally you don’t set up indexes, but Raven learns how you use it and starts to create indexes for you based on your usage. Initially these are dynamic, but if used enough they become static indexes.
  • #16 This is just a quick example of the code to load a tournament, change its name and save it. The id in this case can be anything really – a long, a guid, a string – it doesn’t matter as long as it uniquely identifies the object, and is attached to an ‘Id’ property. Changing the name here is trivial; if you were to change the contents of a list on the object, it’s all tracked, and saving the changes will save it all correctly. Simple, easily readable code.
  • #17 Before I get onto the ‘Bad’ bits of Raven, I just want to clarify – whenever anyone in tech says ‘oh I don’t like X’, inevitably there will be someone who will say “oh you struggled with X did you? Maybe that’s because…” and then, depending on who they are, they will vent with a variety of options: you’re doing it wrong, you don’t understand, you’re being vindictive. It’s hard, particularly if you’ve invested time/money in a solution, to hear someone bad-mouth it. So as a disclaimer – I fully accept that I probably did use RavenDB incorrectly, or didn’t really understand it properly. In my defence I’d say that’s easily done with most things, and time was my biggest concern…
  • #18 OK, the BAD. The docs are sucky – it’s really hard to find a good example of how to do X. Which ties into the whole Niche DB type affair – it’s hard to get help sometimes. If, say, StackOverflow has a million users (I know), only a very small % use Raven, even fewer can probably help, and your questions can easily get lost in the quagmire. You could argue this is the case with a lot of edge cases no matter the tech – and I’d agree. Hiring – that’s right – not so easy. Put ‘ravendb’ as a requirement on your job spec, and see the numbers of applicants fall away. The way Raven is written makes it work very well as long as you adhere to Ayende’s way of thinking; break out of that for whatever reason and watch out – there be sharks! Ayende is probably right, but I don’t like being forced by a technology to do something – if I want to break it, then I should be able to. It’s a document database, not relational, so of course you have to manually keep all those references, which will eventually lead to some referential integrity issues down the line. Documents grow: what might be simple and small gradually becomes bigger and bigger. This is almost definitely a case where I messed up, but still – it’s there and it happened to me, and obviously the bigger the document the slower the page load. Hosting – you have one choice if you don’t want to set it up yourself, and being a Swiss Army Knife, that’s the route I’d rather take. It’s not hugely expensive, but it is that or nothing, and the licensing costs for your own server are not cheap.
  • #19 So a year down the line, Tournr is growing, I’ve taken on all of the Surf Kayak competitions in the UK and branched out to Ireland, but I started to hit some design issues. I would find duplicates in my documents – where a competitor registered for two classes in a competition, their data would be duped in some places, updating a user involved hunting through docs to see where they were registered etc. I was wanting to add a new feature to Tournr, and I was talking it over with my Partner and I drew out what I thought it should look like, (drawing a graph- NO WAY!) and I said to her “Problem is it’s a graph and I’m not using a GraphDB”, to which she replied – “Why don’t you?” It’s the nice thing about non-techy people – they don’t see the tech problems – they just see solutions.
  • #20 Neo4j is a graph database made in Java (hence the j) but not used by just Java. It is (on reflection) not the best naming choice. It’s been around for a little while now, 4-5 years – I’m not sure exactly – googling or asking others will probably help there. A graph DB is a collection of documents linked together by relationships. The importance of the relationship is the key, you need the relationship to give context, and meaning to your data. Visually it’s very easy to understand – but you all know that – you’re here already.
  • #21 Tournr has been around at this point for a couple of years, and the downside (plus side) is that it had users. Downside? Switching a DB in this case now means migrating users, their passwords, everything. Ideally – no-one should know what I’ve done (except of course for you). Whilst doing the conversion I can’t add new things – but then, depending on how you view it, it was harder to add new things anyway – so a bit of a chicken/egg thing there. I did (at least) have experience of Neo4j this time though, which is a big plus – I’d been using it since 1.5/6 I think, so I was pretty sure I could get it done. And it does fit the model better – a lot better.
  • #22 Sometimes you just have to
  • #23 The final acronym change! Still keeping with the W, I and A, which handily means we’re one step closer to ‘WINE’. To use Neo4j from .NET you need the Neo4jClient from NuGet (though now – a new driver for 3.0x is coming…). A little side note – Tournr is (and actually always has been) on Azure, originally Azure Websites, and now Azure Web Apps. I also (as with Raven) prefer to have the DB hosted – I’m lazy – and this time we’re with GrapheneDB (which is also hosted in Azure – so that’s nice).
  • #24 Quick wins are always important – they give you a boost in confidence. I only allowed myself one change of code that wasn’t strictly DB related, and that was the change from MVC3 Membership to MVC5 Identity. Why add the extra work? Well – 2 reasons. There was no existing implementation of Membership using Neo4j. There was an existing implementation of Identity for Neo4j – and I’d helped write it. From the point of view of MVC5, Identity just gels better. I’d actually upgraded Tournr to MVC5 a while ago, and this was really the last bit of legacy code to go. There are a few things that happened here that I’m going to gloss over – mainly about switching to use OWIN for membership stuff – but suffice to say I added the package in, pretty much built it, and it worked. I was massively surprised. Obviously I’d not migrated users – so no existing account could log in – but the site started and that was a big plus.
  • #25 It kind of went downhill from there. Changing a DB fundamentally breaks your code, big time. I’m firmly in the ‘break it, break it big’ camp – and so I removed the RavenDB NuGet packages and checked out the red squigglies of doom (like spell checks – but you can’t add them to your dictionary). My general approach is to fix the build – typically comment out offending code, and once compiling, start to add the code in place to get the functionality back. I’m also a ‘try it’ type of developer – so I tried a model out, then another, usually evolving between the models, until I got to where I wanted to be. I suspect this is not the last evolution, but I’m getting there.
  • #26 So – the models. As I said, I evolve my models – I find I rarely get it right first time, so I may as well start and be prepared to throw it away. The priority for me during this period was to get Tournr up to the point it was at with Raven, regardless of code quality / speed – as long as it was functional I would be happy. I didn’t have to release this version – merely get it working.
  • #27 Seems pretty standard, and indeed it is, the problem stems from my RavenDB prior code – in which I had a lot of nested types – as it’s easy and Raven accepts them all. Personal Details is really a collection of things like ‘Gender’, ‘Country Flag’ etc, probably things that could be a property on the actual ‘User’ itself. I have no plans (at the moment!) to search by those terms, so don’t need access to them via Cypher.
  • #28 By shifting to Model 2 I lose some expressiveness – and you could see it as a shortcut/HACK – and you’d probably be right. Code wise we’re looking at the FASTEST route to market, not the best, or ‘most correct’. Which is one of the nice things about not being constrained by a DB (hello Raven): you can do what you want, even if it looks a bit … hmmm… The trade-off: fewer nodes, so easier code, but less flexibility for querying. If I do have a property ‘Country’, for example, in the PersonalDetails property – I can’t query by it. My view is that if I need to do that, I can bring it out, or change the model to suit – it’s a bit of a pain, but not too hard.
  • #29 The main changes come from the addition of custom Json Serializers – this allows me to serialize my classes in the way I want to. Neo4j doesn’t support nested structures – so (for example) you can’t have a property that is of type ‘Dictionary’ and expect Neo to store it; in fact Neo4jClient will B0rk at you. So to have nested complex types you have to custom serialize/deserialize them – which can be a bit of a pain, especially compared to Raven, which took these things in its stride. The other major change was the way the code interacted with the DB. With Raven – no scripts or query languages – just ‘SaveChanges’ and it’s over. With Neo (similar to SQL) you need to write the queries yourself. Now, that can be painful – it can also be beneficial. You do get to understand more of what you’re doing, and how it’s doing it. Whilst a huge pain for the initial conversion, I’m pretty pleased with the changes. I’m certainly happier that I know what’s happening.
  • #30 To be honest, there wasn’t much to remove, once you take out the Raven code itself, and strip the base class mechanisms for the controllers. What you do find is that the controllers become a lot less cluttered with DB access code – and become more ‘business’ focused.
  • #31 So after much time, let’s say 1-2 months for a reasonably complicated set of code that had been evolving for a year/year and a half – my first goal had been – get it converted, get it converted quickly and get it out so it was being used. Largely I achieved those goals – I had a compiled version that very much looked like it worked. So what went wrong? Obviously – missing / broken functionality – why? With such a huge amount of change, the code was bound to have issues. With raven you test against a live DB (embedded) you can’t do that with Neo – so the tests even if they’d made sense wouldn’t have worked. I can live with that – I knew it would be the case – but the real shocker wasn’t the missing functionality – it was the performance. It was terrible. Taking 30 seconds + to open a competition site that at most was a second or so previously. Don’t worry Neo4j fans – it’s my fault
  • #32 So I have a converted and compiling code – base all good Run and go to a Tournament page – ok took a while – maybe that’s good ol’ ASP.NET warmup… Go to another comp… uh oh… still taking time… Spend a while debugging – Debug.WriteLines everywhere, TimeStamps, and notice the pattern that the time is coming from DB hits. Riiiight Why? Let’s think back with some wavey lines to a few slides back when I mentioned I was all about getting it to market – functionality important – performance not so. Turns out – I was wrong. First cut – multiple calls to the DB – analogous to the way I was using Raven – difference being that I’ve not loaded the entire Document as I would have with Raven, so I’m being – well – stupid. Or even well stupid. My first change to minimize the calls ends up with a pretty big results class, but only one call to the DB – which comes back super fast (well – fast) – and so stage two of the conversion – eliminating multiple calls begins… In essence this is a MindSet thing – Raven spoiled me, I wouldn’t have done that with SQL, so why did I think it was ok with any other DB? Thinking back – Raven really was doing what I was supposed to do – getting as much in one go as I would need to display and then displaying it – Well played Ayende. Well played.
  • #33 So at this point Tournr has been live on Neo4j for, I think, 5 months. It’s gone pretty well: I’ve added some new (and pretty big) features easily, and only had one howler of a bug (which I won’t be elucidating on). Needless to say, it was a Cypher query I messed up. Let’s evaluate the DB. It is easy to install, similar to Raven, but this time with some PowerShell goodness. The model looks like what you draw, and this is awesome when it comes to working out what to do, or how something is being done. I can add new features quickly, mainly because of the modelling and indeed the Cypher language. First, I think it’s important to mention the documentation. Neo is very good on this, better than pretty much any other open source or even closed source company I’ve seen, and they really should be congratulated on it. Without the docs I think it would be a significant struggle to use. Cypher: mentioned in passing, but oh so useful. Cypher allows you to draw your model in ASCII, which lets you describe what you want (perhaps better than almost any other language). The COBOL of the graph world ;)
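As a small illustration of ‘drawing the model in ASCII’, here is a sketch of a Cypher pattern. The labels, relationship types, Name property and the tournament name are invented for the example and are not taken from Tournr. Nodes are written in round brackets and relationships as arrows, so the query reads much like the whiteboard diagram.

```cypher
// Hypothetical model: a competitor enters a class, and a tournament has classes.
// (node)-[:RELATIONSHIP]->(node) is the 'ASCII art' part.
MATCH (p:Competitor)-[:ENTERED_IN]->(c:Class)<-[:HAS_CLASS]-(t:Tournament)
WHERE t.Name = 'Example Open'
RETURN p.Name AS competitor, c.Name AS class
```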
  • #34 Again, Neo4j is a niche DB. It’s not huge, and as such it has similar problems to Raven. Getting support is generally OK: you can ask on StackOverflow and usually get a response quickly. Sometimes it can take longer, depending on whether Michael is up (actually, does Michael even sleep??), but it’s a consideration. You are much more on the edge than with comfort techs like SQL Server. The same applies to Neo4j when it comes to skills: not many people will have it on their CV, and even fewer will have combinations like Neo4j & .NET (though I am one of those). Cypher can cause muddles. I’m reasonably confident / competent with it, but I still sometimes come unstuck. Always run it in the browser if you can, and read the docs. I pretty much read the docs for ‘MERGE’ every time. Hosting, again: niche DB = niche hosting, though there are more options than with Raven, and both are good companies.
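One reason MERGE is worth re-reading the docs for: it matches or creates the whole pattern it is given, so merging a full path in one go can create duplicate nodes when only part of the path already exists. The sketch below shows the commonly recommended shape, merging each node on its own key and then the relationship. The labels, relationship type, Id property and parameters are hypothetical, and this is not necessarily the query behind the bug mentioned earlier.

```cypher
// Hypothetical labels, properties and parameters, for illustration only.
// Merge each node on its own key first...
MERGE (p:Competitor {Id: $competitorId})
MERGE (c:Class {Id: $classId})
// ...then merge the relationship between the two bound nodes.
MERGE (p)-[:ENTERED_IN]->(c)
```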
  • #35 So, the epilogue. Was it worth the effort? We’ll try to cover a few basics here, nothing too deep.
  • #36 There are 3 main areas to look at: the model (is it clearer and easier?), the code (is that clearer and easier?), and future development (can this be done quickly, easily and without huge consequences?).
  • #37 Obviously this is a vastly simplified design of the model, but the difference really is as shown. If you’re looking at the DB directly, it’s a case of looking at either a single (huge) JSON document (in Raven) or a collection of semantically easier relationship/node pairings (in Neo4j). In the past, I’ve had to edit Raven docs directly where I’ve messed up code, and that involves whipping out a text editor, copying, pasting, fixing, recopying, etc. With Neo4j, I find I can query a lot more easily than before.
  • #38 These are the max values the code has reached, so worst case, and you can see the max complexity is higher. Generally, this is where queries have become large to accommodate the ‘single query’ approach. Higher cyclomatic complexity does mean it’s going to be more painful to find/fix errors in those areas, but the performance tradeoff is worth it given how much slower the multiple queries were. The max LOC is also bigger, but again that’s very much down to the single query issue. Methods per type stays pleasingly at the same level.
  • #39 Averages tell a better story: no real change in complexity or lines of code across the conversion. Methods per type has dropped a little, and that’ll be down to some abstracting out of code. Nothing too major.
  • #40 The obligatory LOC slide. What it shows is a LOC increase, but I should add this isn’t a totally fair comparison: Tournr has had new features added, so I would totally expect it to be up. Note the big chunk in ‘Comments’. Guess where I commented out some Raven code?
  • #41 The new model makes it easier to add things to Tournr: I know what I need to do and can see how it works. Is it perfect? No. It’s harder than Raven to just add new features, but on the flip side I have more control, and more visibility over what I’m doing / adding.
  • #42 We all thought it was over – but what I quickly want to cover is what you should take away from this.
  • #43 It’s always do-able to convert from one DB to another, but it may not be appropriate for all cases, and there should absolutely be no ‘gung ho’ “we’re using DB X because I saw a demo and it looked cool” attitude. Always, always evaluate the DB. Does the model fit? Do you have expertise, or the time to get expertise? What if your expert dies? Can you replace them easily? With the DBs here, there are different answers for each, and for some (like Neo4j) you might need to hire a Java developer to do a real job with .NET. Is it easy? Yes and no. The model is easier in Neo4j, and the code is harder than Raven but easier than SQL, so it’s a mixed bag. You do need to learn the DB, specifically Cypher, to get the most out of Neo4j, but that’s not too painful. Is it worth it? In my view, yes (dependent on the use case of course). Again, does it make sense? Can you justify the changes to yourself or ‘the man’? Would I do it again? Yes.
  • #44 Some things for you to read up on if you want…
  • #45 Raven: the Raven DB. Neo4j: obvs. Tournr: well, we’ve talked about it now. Xclave: where I write about stuff…