
Graphs for AI and ML

Jim Webber - neo4j

Published in: Software

  1. Graphs for AI and ML
     Dr. Jim Webber, Chief Scientist, Neo4j, @jimwebber
  2. Overview
     ● Some definitions
     ● Accidental Skynet
     ● Graph theory
     ● Contemporary graph ML
     ● The future of graph AI
  3. A Bluffer’s Guide to AI-cronyms
     ● ML - Machine Learning
       ○ Finding functions from historical data to guide future interactions within a given domain
     ● AI - Artificial Intelligence
       ○ The property of a system that it appears intelligent to its users
       ○ Often, but not always, using ML techniques
       ○ Or ML implementations that can be cheaply retrained to address neighbouring domains
  4. Often conflated with
     ● Predictive analytics
       ○ Use past data to predict the future
     ● General purpose AI
       ○ ML with transfer learning such that learned experiences in one domain can be applied elsewhere
     ● Human-like AI
  5. Where are we today?
     ML all the things
  6. Extract all the features!
     • What do we do? Turn it into vectors and pump it through a classification or regression model
     • That’s actually not a bad thing
     • But we can do so much before we even get to ML…
     • … if we have graph data
  7. Credit: Graph Algorithms, Hodler and Needham, O’Reilly 2019
  8. http://www.bbc.co.uk/london/travel/downloads/tube_map.html
  9. Labeled Property graph model
     • Nodes with optional properties and optional labels
     • Named, directed relationships with optional properties
     • Relationships have exactly one start node and one end node
       ○ Which may be the same node
  10. Fearless querying
      MATCH path = (:author {name: 'Jim Webber'})-[*]->(:character {name: 'The Doctor'})
      RETURN path
      OR
      MATCH (me:author {name: 'Jim Webber'}),
            (doc:character {name: 'The Doctor'}),
            path = shortestPath((me)-[*]->(doc))
      RETURN path
  11. Take a step back
      We can be smarter about this
  12. Realtime Predictive Analytics (circa 2008)
      [Figure: image equation of the form "+ + ="; images not recoverable]
  13. Not AI, but extremely effective
      Credit: https://medium.com/basecs/breaking-down-breadth-first-search-cebe696709d9
  14. Credit: https://www.networkworld.com/article/3211410/lan-wan/the-10-most-powerful-companies-in-enterprise-networking.html
  15. Toolkit matures into proper database
      • Cypher and Neo4j server make real-time graph analytical patterns simple to apply
      • Amazing and humane to implement
  16. [Figure: example purchase graph]
      Person: Firstname: Mickey, Surname: Smith, DoB: 19781006
      Products: SKU 5e175641 (Badgers Nadgers Ale); SKU 2555f258 (Peewee Pilsner); SKU 49d102bc (Baby Dry Nights); SKU 49d102bc (XBox 360)
      Categories: beer, alcoholic drinks, nappies, baby, console, consumer electronics
      Relationships: BOUGHT (person to product), MEMBER_OF (product to category)
  17. Young fathers pattern
      [Figure] Person (Firstname: *, Surname: *, DoB: 1996 > x > 1972); Category: beer; Category: nappies; Category: game console; relationship label: BOUGHT
  18. Business opportunity
      [Figure] Person (Firstname: *, Surname: *, DoB: 1996 > x > 1972); Category: beer; Category: nappies; Category: game console; relationship label: !BOUGHT (on game console)
  19. [Figure: the pattern sketched as nodes]
      (daddy), (beer), (nappies), (console) and three anonymous nodes (), (), ()
  20. Flatten the graph
      (d)-[:BOUGHT]->()-[:MEMBER_OF]->(n)
      (d)-[:BOUGHT]->()-[:MEMBER_OF]->(b)
      (d)-[:BOUGHT]->()-[:MEMBER_OF]->(c)
  21. Include any labels
      (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category)
      (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category)
      (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(c:Category)
  22. Add a MATCH clause
      MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
            (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category)
  23. Constrain the Pattern
      MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
            (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category),
            (c:Category)
      WHERE NOT((d)-[:BOUGHT]->()-[:MEMBER_OF]->(c))
  24. Add property constraints
      MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
            (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category),
            (c:Category)
      WHERE n.category = "nappies"
        AND b.category = "beer"
        AND c.category = "console"
        AND NOT((d)-[:BOUGHT]->()-[:MEMBER_OF]->(c))
  25. Profit!
      MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
            (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category),
            (c:Category)
      WHERE n.category = "nappies"
        AND b.category = "beer"
        AND c.category = "console"
        AND NOT((d)-[:BOUGHT]->()-[:MEMBER_OF]->(c))
      RETURN DISTINCT d AS daddy
  26. Results
      ==> +---------------------------------------------+
      ==> | daddy                                       |
      ==> +---------------------------------------------+
      ==> | Node[15]{name:"Rory Williams",dob:19880121} |
      ==> +---------------------------------------------+
      ==> 1 row
      ==> 0 ms
      ==> neo4j-sh (0)$
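The pattern built up over slides 20 to 25 (bought something in "nappies" and "beer", never bought anything in "console") can also be read as plain set logic. The following is a minimal Python sketch of that reading, not the talk's implementation: the dictionaries, the function names `categories_bought` and `find_targets`, and the purchase data (including the person "Amy Pond") are assumptions made for illustration.

```python
# Toy stand-in for the graph; all purchase data here is hypothetical.
BOUGHT = {
    "Rory Williams": ["Peewee Pilsner", "Baby Dry Nights"],
    "Mickey Smith": ["Badgers Nadgers Ale", "Baby Dry Nights", "XBox 360"],
    "Amy Pond": ["XBox 360"],
}
MEMBER_OF = {
    "Peewee Pilsner": ["beer"],
    "Badgers Nadgers Ale": ["beer"],
    "Baby Dry Nights": ["nappies"],
    "XBox 360": ["console"],
}


def categories_bought(person):
    """Follow (person)-[:BOUGHT]->()-[:MEMBER_OF]->(category)."""
    return {
        category
        for product in BOUGHT.get(person, [])
        for category in MEMBER_OF.get(product, [])
    }


def find_targets(must_have, must_not_have):
    """People who bought from every must_have category and from no must_not_have one."""
    return sorted(
        person
        for person in BOUGHT
        if must_have <= categories_bought(person)
        and not (must_not_have & categories_bought(person))
    )


targets = find_targets({"beer", "nappies"}, {"console"})
print(targets)
```

The subset test plays the role of the two positive MATCH paths, and the empty-intersection test plays the role of the `WHERE NOT(...)` clause.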
  27. Facebook Graph Search
      Which sushi restaurants in NYC do my friends like?
      See http://maxdemarzi.com/
  28. Graph Structure
  29. Simple Query, Intelligent Results
      MATCH (:Person {name: 'Jim'})
            -[:IS_FRIEND_OF]->(:Person)
            -[:LIKES]->(restaurant:Restaurant)
            -[:LOCATED_IN]->(:Place {location: 'New York'}),
            (restaurant)-[:SERVES]->(:Cuisine {cuisine: 'Sushi'})
      RETURN restaurant
  30. Search structure
  31. Graph Theory
      • Rich knowledge of how graphs operate in many domains
      • Off-the-shelf algorithms to process those graphs for information, insight, predictions
      • Low barrier to entry
      • Amazingly powerful
  32. Triadic Closure
      [Figure] Nodes: Kyle, Stan, Kenny
  33. Triadic Closure
      [Figure, two panels] Nodes: Kyle, Stan, Kenny; relationship label: FRIEND
  34. Structural Balance
      [Figure] Nodes: Cartman, Craig, Tweek
  35. Structural Balance
      [Figure, two panels] Nodes: Cartman, Craig, Tweek; relationship label: FRIEND
  36. Structural Balance
      [Figure, two panels] Nodes: Cartman, Craig, Tweek; relationship label: ENEMY
  37. Structural Balance
      [Figure, two panels] Nodes: Kyle, Stan, Kenny; relationship label: FRIEND
  38. Structural Balance is a key predictive technique
      And it’s domain-agnostic
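To make the structural balance idea concrete: in the usual formulation (as in Easley and Kleinberg), a triangle is balanced when all three pairs are friends, or when two friends share a common enemy, which is the same as saying the product of its edge signs is positive. The sketch below assumes that formulation. The function name `unbalanced_triangles` and the friend/enemy assignments are hypothetical; the slides' own edges did not survive extraction.

```python
from itertools import combinations

FRIEND, ENEMY = 1, -1


def unbalanced_triangles(signs):
    """Return triangles whose edge signs multiply to a negative number.

    signs maps frozenset({a, b}) to FRIEND (+1) or ENEMY (-1).
    """
    nodes = sorted({node for edge in signs for node in edge})
    found = []
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset(pair) for pair in ((a, b), (b, c), (a, c))]
        if all(edge in signs for edge in edges):
            product = signs[edges[0]] * signs[edges[1]] * signs[edges[2]]
            if product < 0:
                found.append((a, b, c))
    return found


# Hypothetical relationships, using the slide's character names.
signs = {
    frozenset(("Kyle", "Stan")): FRIEND,
    frozenset(("Stan", "Kenny")): FRIEND,
    frozenset(("Kyle", "Kenny")): FRIEND,
    frozenset(("Cartman", "Craig")): FRIEND,
    frozenset(("Cartman", "Tweek")): FRIEND,
    frozenset(("Craig", "Tweek")): ENEMY,
}
unstable = unbalanced_triangles(signs)
print(unstable)
```

An unbalanced triangle is a prediction that something will change: under this toy data, either Craig and Tweek reconcile or Cartman ends up taking a side.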
  39. Allies and Enemies
      [Figure] Nodes: UK, Germany, France, Russia, Italy, Austria
  40. Allies and Enemies
      [Figure] Nodes: UK, Germany, France, Russia, Italy, Austria
  41. Allies and Enemies
      [Figure] Nodes: UK, Germany, France, Russia, Italy, Austria
  42. Allies and Enemies
      [Figure] Nodes: UK, Germany, France, Russia, Italy, Austria
  43. Allies and Enemies
      [Figure] Nodes: UK, Germany, France, Russia, Italy, Austria
  44. Allies and Enemies
      [Figure] Nodes: UK, Germany, France, Russia, Italy, Austria
  45. Predicting WWI [Easley and Kleinberg]
  46. Strong Triadic Closure
      If a node has strong relationships to two neighbours, then these neighbours must have at least a weak relationship between them. [Wikipedia]
  47. Triadic Closure (weak relationship)
      [Figure] Nodes: Kenny, Stan, Cartman
  48. Triadic Closure (weak relationship)
      [Figure, two panels] Nodes: Kenny, Stan, Cartman; relationship label: FRIEND 50%
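The Strong Triadic Closure property on slide 46 can be checked mechanically: for every node, each pair of its strong neighbours should be connected by some relationship, strong or weak. A minimal sketch follows, assuming undirected ties stored in a dictionary. The function name `stc_violations` and the tie strengths are hypothetical.

```python
from itertools import combinations


def stc_violations(ties):
    """Find (node, b, c) where node has strong ties to b and c but b and c are unlinked.

    ties maps frozenset({a, b}) to "strong" or "weak".
    """
    strong = {}
    for edge, strength in ties.items():
        if strength == "strong":
            a, b = tuple(edge)
            strong.setdefault(a, set()).add(b)
            strong.setdefault(b, set()).add(a)
    violations = []
    for node in sorted(strong):
        for b, c in combinations(sorted(strong[node]), 2):
            if frozenset((b, c)) not in ties:
                violations.append((node, b, c))
    return violations


# Hypothetical strengths: Stan is a strong friend of both Kenny and Cartman.
ties = {
    frozenset(("Stan", "Kenny")): "strong",
    frozenset(("Stan", "Cartman")): "strong",
}
before = stc_violations(ties)

# Adding even a weak Kenny-Cartman tie satisfies the property.
ties[frozenset(("Kenny", "Cartman"))] = "weak"
after = stc_violations(ties)
print(before, after)
```

Each violation reads as a predicted relationship: the missing link is expected to appear, at least in weak form.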
  49. Weak relationships
      • Relationships can have “strength” as well as intent
        ○ Think: weighting on a relationship in a property graph
      • Weak links play another super-important structural role in graph theory
        ○ They bridge neighbourhoods
  50. Local Bridges
      [Figure] Nodes: Kenny, Stan, Kyle, Sally, Bebe, Wendy, Cartman; relationship labels: FRIEND, FRIEND 50%, ENEMY
  51. Local Bridge Property
      “If a node A in a network satisfies the Strong Triadic Closure Property and is involved in at least two strong relationships, then any local bridge it is involved in must be a weak relationship.” [Easley and Kleinberg]
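A local bridge, in Easley and Kleinberg's definition, is a relationship whose two endpoints have no neighbours in common. That makes detection a one-line set intersection per edge. The sketch below assumes an undirected edge list. The function name `local_bridges` and the edges are hypothetical, since the slide's diagram did not survive extraction.

```python
def local_bridges(edges):
    """Return edges whose endpoints share no common neighbour."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return sorted(
        tuple(sorted((a, b)))
        for a, b in edges
        if not (neighbours[a] & neighbours[b])
    )


# Two hypothetical friend triangles joined by a single Kyle-Sally link.
edges = [
    ("Kenny", "Stan"), ("Stan", "Kyle"), ("Kyle", "Kenny"),
    ("Sally", "Bebe"), ("Bebe", "Wendy"), ("Wendy", "Sally"),
    ("Kyle", "Sally"),
]
bridges = local_bridges(edges)
print(bridges)
```

By the Local Bridge Property quoted above, if Kyle has two strong ties inside his own triangle and satisfies Strong Triadic Closure, the Kyle-Sally link in this toy data would have to be a weak one.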
  52. University Karate Club
  53. Graph Partitioning
      • (NP) Hard problem
      • Repeatedly remove the spanning links between dense regions
      • Or recursively merge nodes into ever larger “subgraph” nodes
      • Choose your algorithm carefully: some are better than others for a given domain
      • Can be used to (almost exactly) predict the break-up of the karate club!
  54. University Karate Clubs (predicted by Graph Theory)
  55. University Karate Clubs (what actually happened!)
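One way to "repeatedly remove the spanning links between dense regions" is to score every edge by how many shortest paths cross it and delete the highest scorer until the graph splits, in the style of Girvan and Newman. The talk does not say which algorithm produced its karate club prediction, so the choice of method here is an assumption. The sketch uses a toy graph of two triangles joined by one link, not the real karate club data, and the function names are illustrative.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style shortest-path edge betweenness for an unweighted graph."""
    score = {}
    for source in adj:
        dist, paths, preds, order = {source: 0}, {source: 1}, {source: []}, []
        queue = deque([source])
        while queue:  # breadth-first search, counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], paths[w], preds[w] = dist[v] + 1, 0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        credit = {v: 0.0 for v in order}
        for w in reversed(order):  # push path credit back towards the source
            for v in preds[w]:
                share = paths[v] / paths[w] * (1.0 + credit[w])
                edge = frozenset((v, w))
                score[edge] = score.get(edge, 0.0) + share
                credit[v] += share
    return score


def components(adj):
    """Connected components as a list of node sets."""
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        part, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in part:
                part.add(v)
                stack.extend(adj[v] - part)
        seen |= part
        parts.append(part)
    return parts


def split_in_two(edges):
    """Remove highest-betweenness edges until the graph falls into two parts."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    while len(components(adj)) < 2:
        scores = edge_betweenness(adj)
        if not scores:
            break  # nothing left to remove
        a, b = max(scores, key=scores.get)
        adj[a].discard(b)
        adj[b].discard(a)
    return sorted(sorted(part) for part in components(adj))


# Toy graph: two triangles (1,2,3) and (4,5,6) joined by the link 3-4.
toy_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
parts = split_in_two(toy_edges)
print(parts)
```

Recomputing betweenness after every removal is expensive on large graphs, which is one reason the slide's advice to choose the algorithm per domain matters.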
  56. Graph Algorithms in Neo4j
      Clustering
      • Label Propagation
      • Union Find / Weakly Connected Components
      • Strongly Connected Components
      • Triangle-Count / Clustering Coefficient
      Centrality
      • PageRank
      • Betweenness
      • Closeness
      • Degree
      Path Finding
      • Breadth-first search
      • Depth-first search
      • Single-source shortest path
      • All-pairs shortest path
      • Minimum weight spanning tree
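As an illustration of one entry in the Centrality column, here is a minimal power-iteration PageRank in plain Python. It is a textbook sketch, not Neo4j's implementation: the function name `pagerank`, the damping factor of 0.85, the fixed 50 iterations and the three-node example graph are all assumptions.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank; ranks sum to 1.

    out_links maps each node to the list of nodes it points at.
    """
    nodes = sorted(set(out_links) | {t for ts in out_links.values() for t in ts})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_links.get(v, [])
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for t in nodes:
                    new[t] += damping * rank[v] / n
        rank = new
    return rank


# Hypothetical graph: a and b both point at c, and c points back at a.
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
print(sorted(ranks, key=ranks.get, reverse=True))
```

Node c collects rank from two sources and passes it on to a, while b receives only the baseline share, so the ordering is c, a, b.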
  57. Amazing Native Graph Performance
  58. Credit: https://reezocar.blob.core.windows.net/blog/2015/09/k2000.jpg
  59. Learning to stop bad guys
      Find and stop spammers
      Extract graph structure over time
      Not message content!
      (Fakhraei et al, KDD 2015)
      Result: find and classify 70% of spammers with 90% accuracy
  60. Graph ML
      Much of modern graph ML is still about turning graphs into vectors
      Graph2Vec and friends
      Highly complementary techniques
      Mixing structural data and features gives better results
      Better data into the model, better results out
      But we don’t always have to vectorize graphs...
  61. Knowledge Graphs
      • Semantic domain knowledge for inference and understanding
      • E.g. eBay Google Assistant
      • What’s the next best question to ask when a potential customer says they want a bag?
      • Price? Function? Colour?
      • Depends on context! Demographic, history, user journey.
      • Richly connected data makes the system seem intelligent
      • But it’s “just” data and algorithms in reality
  62. Graph Convolutional Neural Networks
      A general architecture for predicting node and relationship attributes in graphs. (Kipf and Welling, ICLR 2017)
      Credit: Andrew Docherty (CSIRO), YowData 2017 https://www.youtube.com/watch?v=Gmxz41L70Fg
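For readers who want to see what one graph convolution step computes, the sketch below applies the propagation rule from Kipf and Welling, H' = ReLU(D^-1/2 (A + I) D^-1/2 H W), using plain Python lists. It is a single untrained forward pass under assumed inputs: the function name `gcn_layer`, the two-node graph, the features and the weights are all made up for illustration.

```python
import math


def gcn_layer(adjacency, features, weights):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalisation by the square roots of the degrees.
    norm = [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]
    n_in, n_out = len(weights), len(weights[0])
    # Aggregate neighbour features, then apply the weight matrix and ReLU.
    aggregated = [
        [sum(norm[i][k] * features[k][f] for k in range(n)) for f in range(n_in)]
        for i in range(n)
    ]
    return [
        [
            max(0.0, sum(aggregated[i][f] * weights[f][o] for f in range(n_in)))
            for o in range(n_out)
        ]
        for i in range(n)
    ]


# Two connected nodes, one input feature each, two output features.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0, -1.0]]
hidden = gcn_layer(adjacency, features, weights)
print(hidden)
```

In this toy case each node ends up with the average of its own and its neighbour's feature, which shows how the layer mixes structure with features before any learning takes place.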
  63. Graph Networks for Structured Causal Models
      • Position paper from Google, MIT, Edinburgh
      • Structured representations and computations (graphs) are key
      • Goal: generalize beyond direct experience
        ○ Like human infants can
      https://arxiv.org/pdf/1806.01261.pdf
  64. Credit: @markhneedham
  65. Thanks for listening
      Ask the experts session tomorrow 14:30
      @jimwebber
