Everything is not a graph problem. But, there are plenty.


  1. 1. Everything is not a graph problem. Lessons from life in the trenches. Denise Koessler Gosnell, Ph.D. Senior Graph Consultant, DataStax @denisekgosnell
  2. 2. Theory vs. Reality? © DataStax, All Rights Reserved.
  3. 3. Agenda
  4. 4. The Graph
  5. 5. The Graph: The vision (diagram labels: person, service, device, product, device, address, store, email, credit card)
  6. 6. The Graph: Reality
  7. 7. What does your problem need?
  8. 8. The Graph: The starting point • Across all of my data, how many [___] are there? • How many of [___] happened in the past x amount of time? • What most closely matches [___]?
  9. 9. Goals?
  10. 10. Goals?
  11. 11. Goals?
  12. 12. Not a graph problem.
  13. 13. Status
  14. 14. You have graph data. What do you plan on doing with it?
  15. 15. How many sessions are needed?
  16. 16. How many sessions are needed? Brynn Lender is organizing a conference with 4 speakers: Arya, Bran, Rob, and Sansa. Logistical issues: 1. Arya’s and Sansa’s talks require the same room, as do Bran’s and Rob’s. 2. Mr. Lender does not want to have Bran’s and Sansa’s talks at the same time. 3. Arya and Bran are helping each other with their presentations and need to attend each other’s talks.
  17. 17. We have graph data?
  18. 18. How many sessions are needed?
  19. 19. How many sessions are needed? (diagram labels: Bran, Sansa, Arya, Rob)
  20. 20. How many sessions are needed? (diagram labels: Bran, Sansa, Arya, Rob)
  21. 21. How many sessions are needed? (diagram labels: Bran, Sansa, Arya, Rob)
  22. 22. How many sessions are needed? (diagram labels: Bran, Sansa, Arya, Rob)
  23. 23. How many sessions are needed? (diagram labels: Bran, Sansa, Arya, Rob)
  24. 24. How many sessions are needed? (diagram labels: Bran, Sansa, Arya, Rob)
  25. 25. How many sessions are needed? (diagram labels: Bran, Sansa, Arya, Rob)
  26. 26. How many sessions are needed? Session 1, Session 2, Session 3 (diagram labels: Bran, Sansa, Arya, Rob)
  27. 27. Graph Analytics Problems.
  28. 28. Common Qs • For all options, what is the best route from city a to city b? • What is a way to schedule [____]? • How do I place alarms in a building to uniquely identify a room according to alerts?
  29. 29. Common Algorithms • All-pairs shortest path • Maximum clique • Chromatic number • Vertex degree distribution • PageRank • …
  30. 30. Status
  31. 31. Graph Database Problems.
  32. 32. First: Graph Based Entity Resolution
  33. 33. Telecom Social Networks
  34. 34. Telecom Social Networks
  35. 35. Telecom Social Fingerprinting
  36. 36. Telecom Social Fingerprinting: August 2017
  37. 37. Telecom Social Fingerprinting: August 2017, September 2017
  38. 38. Telecom Social Fingerprinting: August 2017, September 2017
  39. 39. Telecom Social Fingerprinting: August 2017, September 2017
  40. 40. Telecom Social Fingerprinting: August 2017, September 2017
  41. 41. Telecom Social Fingerprinting: August 2017, September 2017
  42. 42. Telecom Social Fingerprinting
  43. 43. Second: Graph DB as a Master Identity Store
  44. 44. The Graph: We’re back. (diagram labels: person, service, device, product, device, address, store, email, credit card)
  45. 45. Good? (diagram labels: person, person, person)
  46. 46. What are you trying to read from your database?
  47. 47. Better. (diagram labels: person master_uuid, person master_uuid, address, device, payment, device, flight)
  48. 48. Best? (diagram labels: provenance, provenance, provenance, provenance, person master_uuid, person master_uuid, address, device, payment, device, flight)
  49. 49. Status:
  50. 50. Status:
  51. 51. Everything is not a graph problem. (but there are plenty) Denise Koessler Gosnell, Ph.D. Senior Graph Consultant, DataStax @denisekgosnell
  52. 52. Contact • GitHub: github.com/datastax/graph-examples, @DeniseKGosnell • Twitter: @DeniseKGosnell • Email: Denise.Gosnell@datastax.com
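
The scheduling question on slides 15 to 26 can be treated as a graph coloring problem: each speaker is a vertex, each logistic issue adds a conflict edge, and the number of sessions needed is the chromatic number of that conflict graph. The sketch below is only an illustration under that reading, not code from the talk. The function name min_sessions and the brute-force search are assumptions chosen for clarity; exhaustive search is only sensible for a graph this small.

```python
from itertools import product

# Conflict edges taken from the three logistic issues on slide 16.
CONFLICTS = [
    ("Arya", "Sansa"),  # issue 1: both talks need the same room
    ("Bran", "Rob"),    # issue 1: both talks need the same room
    ("Bran", "Sansa"),  # issue 2: must not run at the same time
    ("Arya", "Bran"),   # issue 3: they attend each other's talks
]


def min_sessions(conflicts):
    """Return (k, assignment) for the smallest number of sessions k.

    Hypothetical helper: tries k = 1, 2, ... and, for each k, every way of
    assigning one of k sessions to each speaker, until no conflict edge
    joins two speakers in the same session.
    """
    speakers = sorted({name for edge in conflicts for name in edge})
    for k in range(1, len(speakers) + 1):
        for sessions in product(range(1, k + 1), repeat=len(speakers)):
            assignment = dict(zip(speakers, sessions))
            if all(assignment[a] != assignment[b] for a, b in conflicts):
                return k, assignment
    # Only reached when there are no speakers at all.
    return 0, {}


k, schedule = min_sessions(CONFLICTS)
print(k)         # number of sessions needed
print(schedule)  # one valid speaker -> session assignment
```

Arya, Bran, and Sansa conflict pairwise, so they form a triangle and no two of them can share a session. Rob only conflicts with Bran and can join either of the other two sessions. The search therefore reports three sessions, which agrees with the Session 1, Session 2, Session 3 labels on slide 26.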

Editor's Notes

  • What I hope you get out of this talk: Life in the trenches


    Target audience: architects, PMs, CTOs.

    2017 was the year of the graph. What have we learned?


  • This talk:
    Life in the trenches
    What problems have gone wrong
    How I classify “graph problems”


    About me…
  • Describe:
    non-graph problems,
    graph analytics problems,
    graph database problems

    Include:
    -- overlooked architectural issue
    -- recommendation
  • Has anyone in here been tasked with building this? The new singular system that will have it all?

    Story #1: This is my story of failure: tasked to build the one graph that ruled them all.
  • The world makes so much sense like this. Obviously a graph.

    Grand vision: “the one graph that rules it all”

  • -- overlooked architectural issue – these are the types of questions I was being asked.
    -- recommendation


    Entity resolution:
    -- The workload didn’t match the data model.
    -- We had a huge Swiss Army knife and were only opening cans of beans: we had a graph, but only needed bar charts.
    -- We needed the simple and we had the powerful.
    -- I lost sight of what my manager really needed.
  • Is this really what your problem needs?

    Remember the one graph that I had to build? At the end of the day, my C-suite wanted this.
  • Essentially: we were letting the process of “how we think about the data” drive the actual implementation details.

  • My ask: use the right tool for the right problem.
  • Ok. You have graph data. But do you need to analyze it or query it?
  • Let’s do a “for instance”.
  • Essentially --
  • Query the audience:
    4 sessions?
    3?
    2?

    Graph problem?
  • Build a conflict graph
  • Arya’s and Sansa’s talks require the same room, as do Bran’s and Rob’s.
  • Mr. Lender does not want to have Bran’s and Sansa’s talks at the same time.
  • Arya and Bran are helping each other with their presentations and need to attend each other’s talks.


    … WTF where are you going with this Denise??
  • This duality exists with any problem, not just graph problems: algorithmic work vs. storage and retrieval.

    What are the warning signs that show you are going down the wrong path?
    -- aka: what to look for to know that you are about to get fired, before you get fired

    Trade-offs:
    -- everything in one system, with bad SLAs
    or
    -- data copied to multiple places in the system, with the complexity of (1) maintaining the duplicated data and (2) routing each query to the data model designed to answer it performantly

    Life in the trenches

    Entity resolution:
    -- The workload didn’t match the data model.
    -- We had a huge Swiss Army knife and were only opening cans of beans: we had a graph, but only needed bar charts.
    -- We needed the simple and we had the powerful.
    -- I lost sight of what my manager really needed.

    Entity resolution with a MySQL database:
    -- I answered today’s question, but didn’t look far enough down the road to answer the next question.
    -- I’ve got a hammer, and I am hammering in nails.
    -- OK, now remove the nails.
    -- Now open the can of beans with the hammer.
  • Is there a situation in which graph databases have been useful for resolving entities?

    2 examples
  • First story: 2012
  • The churn problem: a special version of the ER problem.

    After trying SVMs, ANNs, linear models, you name it, this is what we arrived at.
  • Subgraph from time t - 1 (aka last month)
  • Induced subgraph for time t
  • This is the identity that, in production, we would confidently label as being associated with the identity from August.
  • How good was this? See red, you are dead.
  • Is there a situation in which graph databases have been useful for resolving entities?

    2 examples
  • How do we use a graph database to make something like this happen?
  • We don’t get to skip that step anymore just because we are in a graph DB.
  • WHY the blue vertices?

    WHAT becomes a property? ---> WHAT DO YOU NEED TO READ FROM YOUR DB?
    We don’t get to skip that step anymore.


    1 QUESTION: WHAT DO YOU WANT OUT?

  • What is best? It all depends. Vertices typically become things that require edges between master identities.

    Before you hammer away at the keyboard, do some planning. What do you need back out? That is what will define a solid graph architecture for your application.
  • This duality exists with any problem, not just graph problems: algorithmic work vs. storage and retrieval.

    Trade-offs:
    -- everything in one system, with bad SLAs
    or
    -- data copied to multiple places in the system, with the complexity of (1) maintaining the duplicated data and (2) routing each query to the data model designed to answer it performantly
  • If you take anything away, it is that the graph community as a whole is at a critical point.

    If we keep approaching graphs as something that “CAN SOLVE EVERYTHING”, we need to reconsider. That approach could have a wave of ramifications for the whole community.

    Use the right tool for the right problem.
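
The telecom fingerprinting notes above compare the subgraph from time t - 1 (August) with the induced subgraph for time t (September) to decide which new identity belongs to a churned one. The slides and notes do not say how that comparison was scored, so the sketch below swaps in a simple stand-in: Jaccard overlap between contact sets. The function names, the 0.5 threshold, and the sample data are all hypothetical and exist only to illustrate the idea.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_identities(previous, current, threshold=0.5):
    """Link identities that are new at time t to identities that vanished after t - 1.

    `previous` and `current` map an identity to the set of identities it
    contacted in that month. The threshold is an assumed value, not one
    taken from the talk.
    """
    matches = {}
    for new_id, new_contacts in current.items():
        if new_id in previous:
            continue  # seen last month too, so not a new identity
        best_id, best_score = None, 0.0
        for old_id, old_contacts in previous.items():
            if old_id in current:
                continue  # still active, so not a churn candidate
            score = jaccard(new_contacts, old_contacts)
            if score > best_score:
                best_id, best_score = old_id, score
        if best_id is not None and best_score >= threshold:
            matches[new_id] = (best_id, best_score)
    return matches


# Hypothetical contact sets for two consecutive months.
august = {"alice": {"bob", "carol", "dave"}, "erin": {"frank"}}
september = {"new_sim_1": {"bob", "carol", "dave", "gina"}, "erin": {"frank"}}

matches = match_identities(august, september)
print(matches)
```

In this toy data the new September identity shares three of four distinct contacts with the vanished August identity, so it clears the assumed threshold and is linked. A production system would need a more careful score and evaluation, which is what the "How good was this?" note is about.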