69. Distributed Graphs at Scale with Cassandra and Titan
● Structured Knowledge Representation
● Index-free adjacency: like memory pointers, but on disk
● Navigation between nodes in constant time
● Graph != No schema
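The index-free adjacency bullet can be pictured with a small in-memory analogy. This is only a sketch: the `Vertex` class, the `broader` edge label, and the "Viruses" concept are hypothetical, and a real graph store would keep these references on disk rather than in Python objects. The point is that each vertex holds direct references to its neighbors, so following an edge does not require a lookup in a global index.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """Hypothetical vertex that stores its own adjacency (no global index)."""
    name: str
    # Edge label -> list of directly referenced neighbor vertices.
    out_edges: dict = field(default_factory=dict)

    def connect(self, label, other):
        # Store a direct reference to the neighbor on the vertex itself.
        self.out_edges.setdefault(label, []).append(other)

    def neighbors(self, label):
        # One dictionary access on the vertex, independent of graph size.
        return self.out_edges.get(label, [])


# Illustrative data only; "Viruses" is an assumed broader concept.
ebola = Vertex("Ebola Virus")
viruses = Vertex("Viruses")
ebola.connect("broader", viruses)

print([v.name for v in ebola.neighbors("broader")])
```

Reaching a neighbor costs the same however many vertices the graph holds, which is the property the slide describes as navigation in constant time.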
70. Michael Laing
Architect, Edge Engineering (soon to include OLAP w Spark Streaming)
New York Times
michael.laing@nytimes.com
Editor's Notes
Back to 1964
Not me
Nice shoes though
GE 225 – 8KB
Rethink
Flatten
Global
Reliable
Manageable
Mesh
Resilient
Balancing
Nodes w Roles
Self organizing
Messaging everywhere
Global
Header = Metadata: routing, timestamp, source
Designed to fail fast
This powers The New York Times Semantic Platform, a mechanism for accessing all of our knowledge about our indexing concepts: people, places, organizations, and descriptors. This is an example of a more complex traversal. Say we have a concept called Ebola Virus and want to retrieve all of its ancestors, that is, all concepts with a broader meaning than Ebola Virus. In theory, we need to iterate through adjacent edges and vertices until we can’t find any broader term.
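The ancestor traversal described above could be sketched as follows. This is a minimal illustration in plain Python, not the Titan or Gremlin query used on the platform: the `BROADER` mapping, the concept names other than Ebola Virus, and the `ancestors` function are all assumptions made for the example.

```python
from collections import deque

# Hypothetical hierarchy: each concept maps to its directly broader concepts.
BROADER = {
    "Ebola Virus": ["Viruses", "Hemorrhagic Fevers"],
    "Hemorrhagic Fevers": ["Diseases"],
    "Viruses": ["Microbiology"],
    "Diseases": ["Medicine and Health"],
    "Microbiology": ["Science"],
}


def ancestors(concept, broader=BROADER):
    """Return every concept reachable by repeatedly following 'broader' edges."""
    seen = {concept}      # guards against cycles and repeated visits
    order = []            # ancestors in breadth-first discovery order
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for parent in broader.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    # The loop ends once no unvisited broader term remains.
    return order


print(ancestors("Ebola Virus"))
```

The loop stops when the queue is empty, which corresponds to the point where no further broader term can be found. The `seen` set keeps the walk finite even if the hierarchy accidentally contains a cycle.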