15. Wide-range and niche companies
Finding the perfect job for your hipster-esque coding needs
Percentage distribution for top 3 endorsed skills for selected companies.
18. Case study: Minimalist social network
Epic battle!
Let’s consider a social network with 1 000 000 users, each having 50 friends.
SQL has to “fake” relationships (don’t we all?).
[Figure: SQL representation vs. graph representation]
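A scaled-down sketch of the contrast, assuming a hypothetical `friendship` join table (a list of pairs) on the SQL side and per-user adjacency sets on the graph side. The names and the tiny data set are illustrative only, and the table scan stands in for an unindexed self-join.

```python
# Hypothetical data: (user, friend) rows, as a SQL join table would store them.
friendship = [
    ("alice", "bob"), ("alice", "carol"),
    ("bob", "dave"), ("carol", "dave"), ("carol", "erin"),
]

def fof_join_table(rows, user):
    """Friends-of-friends via a self-join: each hop scans the whole table."""
    friends = {f for (u, f) in rows if u == user}
    return {f2 for (u, f2) in rows if u in friends} - friends - {user}

# Graph side: each user holds direct references to its friends.
adjacency = {}
for u, f in friendship:
    adjacency.setdefault(u, set()).add(f)

def fof_adjacency(adj, user):
    """Friends-of-friends by following references from the start node only."""
    friends = adj.get(user, set())
    result = set()
    for f in friends:
        result |= adj.get(f, set())
    return result - friends - {user}
```

Both functions return the same answer; the difference is that the second one only touches the start user's neighbourhood.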
19. Minimalist social network (cont’d)
S14E04: You have 0 friends
Also consider an asymmetric (directed) relationship: who are my followers?
Reversing the direction of a traversal would be difficult with non-native graph processing.
For that, you must either create a costly reverse-lookup index for each traversal or perform a
brute-force search through the original index.
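The two options can be sketched as follows, assuming a hypothetical forward index `follows` that maps each user to the users they follow. This is a toy model of the trade-off, not any particular database's implementation.

```python
# Hypothetical forward index: who each user follows.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": [],
}

def followers_brute_force(index, user):
    """Option 1: scan every entry of the forward index."""
    return sorted(u for u, targets in index.items() if user in targets)

def build_reverse_index(index):
    """Option 2: pay up front for a second, reverse-lookup structure."""
    reverse = {u: [] for u in index}
    for u, targets in index.items():
        for t in targets:
            reverse.setdefault(t, []).append(u)
    return reverse
```

The brute-force scan costs time proportional to the whole index on every query; the reverse index costs extra storage and must be kept in sync on every write.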
The results are in!
20. Native Graph: Index-free adjacency
Lightning McQueen
Index-free adjacency ensures lightning-fast retrieval without the need for indexes.
Query time is proportional only to the portion of the graph searched, not to the total graph size.
Each node directly references its adjacent nodes, acting as a micro-index for all nearby nodes.
Bidirectional joins are effectively precomputed and stored in the database as relationships.
Relationships – rather than over-reliance on indexes – are used for efficient traversals.
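A minimal in-memory sketch of the idea, assuming a hypothetical `Node` class whose `neighbours` list holds direct object references. It illustrates the principle only and says nothing about Neo4j's actual storage layout.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbours = []  # direct references: the node's own "micro-index"

def connect(a, b):
    """Store the relationship on both ends, so either direction is a lookup."""
    a.neighbours.append(b)
    b.neighbours.append(a)

def reachable_within(start, depth):
    """Names reachable within `depth` hops; only nearby nodes are touched."""
    seen = {start.name}
    frontier = [start]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for n in node.neighbours:
                if n.name not in seen:
                    seen.add(n.name)
                    next_frontier.append(n)
        frontier = next_frontier
    return seen - {start.name}
```

Adding a million unrelated nodes elsewhere would not change the work done by `reachable_within`, since no global index is consulted.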
21. Index-free adjacency (cont’d)
For native graph databases, node records point to lists of relationships, labels and properties.
Graph data is kept in store files, each of which contains data for a specific part of the graph (nodes, relationships, labels, properties).
Example: The node store is a fixed-size record store, where each record is 15 bytes in length.
The database can directly compute a record’s location, at cost O(1).
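The O(1) lookup is plain arithmetic: offset = record ID × record size. A sketch using the 15-byte figure above; the `bytearray` store and helper names are assumptions, and the internal field layout of a record is not modelled.

```python
RECORD_SIZE = 15  # bytes per node record, as stated on the slide

def record_offset(node_id):
    """Byte offset of a record: one multiplication, no search."""
    return node_id * RECORD_SIZE

def read_record(store, node_id):
    """Slice the fixed-size record out of the store."""
    start = record_offset(node_id)
    return bytes(store[start:start + RECORD_SIZE])

# Hypothetical store with room for four records; fill record 2 with dummy data.
store = bytearray(RECORD_SIZE * 4)
store[record_offset(2):record_offset(3)] = b"\x01" + bytes(RECORD_SIZE - 1)
```

For example, the record with ID 100 starts at byte 1500, whatever the size of the store.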
Let's get dirty!
22. Index-free adjacency (cont’d‘d)
//TODO: find super awesome pun!
With fixed-sized records and pointer-like record IDs, traversals are implemented simply
by chasing pointers around a data structure, which can be performed at very high speed.
Neo4j 2.x could store about 34 billion nodes. Neo4j 3.x introduces dynamic pointer compression,
which lifts that limit and allows practically unbounded graph sizes.
Conceptually, it all comes down to this:
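A deliberately simplified sketch of pointer chasing, assuming each node record stores the ID of its first relationship and each relationship record stores the ID of the next one for the same start node. The real record layout is richer; the list-based stores and the `NO_REL` sentinel are illustrative assumptions.

```python
NO_REL = -1  # hypothetical sentinel meaning "end of chain"

# node_store[node_id] -> ID of the first relationship in that node's chain
node_store = [0, NO_REL, NO_REL]

# rel_store[rel_id] -> (start_node, end_node, next_rel_of_start_node)
rel_store = [
    (0, 1, 1),       # node 0 -> node 1, then continue with relationship 1
    (0, 2, NO_REL),  # node 0 -> node 2, end of chain
]

def neighbours(node_id):
    """Walk the relationship chain by record ID; each step is an O(1) lookup."""
    result = []
    rel_id = node_store[node_id]
    while rel_id != NO_REL:
        _start, end, next_rel = rel_store[rel_id]
        result.append(end)
        rel_id = next_rel
    return result
```

Because record IDs act like pointers, listing a node's neighbours never consults an index; it only follows the chain.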
23. Index-free adjacency (cont’d‘d’d)
And find I'm king of the hill, top of the heap!
Neo4j 2.x lazy loading on-heap object-cache:
Neo4j 3.x relies only on a scalable, high-performance LRU-K off-heap page cache.
24. Key features for Neo4j
Fully ACID database.
Scalability and HA capabilities.
Intuitive data queries using Cypher.
Open source.
Neo4j takes things seriously: relationships are considered first class citizens!
Is returning something random considered eventual consistency?
25. Cypher
‘Member ASCII art? (っ◕‿◕)っ
Powerful and expressive query language, often cited as requiring 10x to 100x less code than SQL for connected-data queries.
Declarative language for describing patterns in graphs visually using an ASCII-art syntax.
Comes with a profiler / interactive query planner.
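To give a feel for the ASCII-art syntax, here is a friends-of-friends query held in a Python string. The `Person` label, `FRIEND` relationship type and `name` property are assumed for illustration. The Python function is only a hand-written equivalent over an in-memory dictionary, not how Neo4j evaluates the query.

```python
# Cypher draws the pattern: nodes in (), relationships in -[]->.
FOF_QUERY = """
MATCH (me:Person {name: $name})-[:FRIEND]->(friend)-[:FRIEND]->(fof)
WHERE fof <> me
RETURN DISTINCT fof.name
"""

# Hypothetical in-memory graph: name -> names of friends.
friends = {"alice": ["bob"], "bob": ["alice", "carol"], "carol": []}

def fof_names(graph, name):
    """Imperative equivalent of the declarative pattern above."""
    result = set()
    for friend in graph.get(name, []):
        for fof in graph.get(friend, []):
            if fof != name:  # mirrors WHERE fof <> me
                result.add(fof)
    return result
```

The query states what shape to find; the loops spell out how to find it.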
26. Looks come first (and you know it)
Visual models are the easiest for humans to comprehend. Even the ER model is itself a graph!
Businesses need tools for capturing multiple-domain semantics within a visual data model.
Data interconnectivity and topology are at least as important as the data itself.
Let's Get Visual! Visual!
27. Making sense of data
The value of data isn’t represented by its volume, but by our capacity to understand the
relationships between its constituent elements.
Graph databases offer analytical and discovery capabilities
that no other persistence solution can provide.
Moreover, modern data increasingly has an obvious graph-like structure. SQL does not
naturally support graph-specific operations (e.g. DFS, BFS).
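For reference, both traversals are a few lines over an adjacency list. A minimal sketch on a hypothetical four-node graph; the names `graph`, `bfs` and `dfs` are illustrative.

```python
from collections import deque

# Hypothetical directed graph as an adjacency list.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

def bfs(adj, start):
    """Breadth-first search: visit nodes level by level using a queue."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for n in adj[node]:
            if n not in seen:
                seen.add(n)
                queue.append(n)
    return order

def dfs(adj, start, seen=None):
    """Depth-first search: follow each branch to its end before backtracking."""
    seen = seen if seen is not None else set()
    seen.add(start)
    order = [start]
    for n in adj[start]:
        if n not in seen:
            order.extend(dfs(adj, n, seen))
    return order
```

On this graph, `bfs(graph, "a")` visits a, b, c, d, while `dfs(graph, "a")` visits a, b, d, c.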
With a traditional approach, such queries often take too long to be run on demand.
That’s not necessarily the case for graphs!
Go graph like all the other cool kids!