Word Puzzles with Neo4j and Py2neo

Neo4j is a graph database (nodes and relationships) and is the perfect fit for some types of problem. Within that domain Neo4j can be much, much faster than a SQL database and easier to query. Py2neo is a Python binding to Neo4j. The live presentation showed how to create word transformation puzzles, e.g. getting from "stores" to "slaked" by one-letter transformations where each intermediate step is a valid word. One solution is "stores"->"stored"->"stared"->"staked"->"slaked".

  1. 1. Presented by Grant Paton-Simpson Word Puzzles with Neo4j and Py2neo
  2. 2. Overview ● Brief look at graph databases & Neo4j ● Introduction to word transformation game ● Getting suitable words ● Adding words and relationships into Neo4j ● Querying graph data to generate puzzles
  3. 3. Graph Databases – a NoSQL option
  4. 4. NoSQL – when is it a good fit? ● SQL has its origins in the 1970s and may not be fresh and shiny any more but ... ● … we shouldn't choose NoSQL for reasons of fashion. ● Venerable SQL often a better choice for standard hierarchies e.g. countries that have cities that have suburbs etc
  6. 6. Graph Databases ● Graph databases much, much better for related data with: – lots of different links between same nodes – different numbers of links between nodes e.g. 3 hops to one peer and 7 hops to another – lots of peer-to-peer links
  7. 7. Substantial Benefits ● Massive performance benefits (going exponential as number of links grows) ● Structural harmony – between structure of data and structure of data storage (what you draw on the whiteboard might look very similar to how your data is actually structured) – between questions of data and query language used to answer them
  8. 8. Word transformations ● Start with one word and get to the other by single-letter transformations word-by-word ● E.g. starting with “stores” get to “slaked” – BTW there are 96 alternative ways in 5 moves or fewer: stores → stored → stared → staked → slaked
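
The rule on this slide can be written down as a small check. The following is only a sketch: the function names are made up for illustration and are not taken from the talk's code.

```python
def differs_by_one_letter(a, b):
    """True when a and b have the same length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


def is_valid_chain(chain, vocabulary):
    """True when every word is in the vocabulary and each step changes one letter."""
    if any(word not in vocabulary for word in chain):
        return False
    return all(differs_by_one_letter(a, b) for a, b in zip(chain, chain[1:]))


# The solution quoted on the slide, checked against a vocabulary
# that (for this sketch) contains only the chain itself.
chain = ["stores", "stored", "stared", "staked", "slaked"]
print(is_valid_chain(chain, set(chain)))
```
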
  9. 9. Puzzle taster Get from 'sloven' to 'closed' in no more than 5 steps (there are 10 unique solutions) sloven ? closed
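
For intuition about what "in no more than 5 steps" means, here is a plain-Python stand-in for the graph query: a depth-limited search over an in-memory word graph. It is a sketch under stated assumptions, not the talk's method (the talk queries Neo4j). The six-word list is a toy example, so the counts it gives say nothing about the 96 or 10 solutions mentioned on the slides.

```python
def build_neighbours(words):
    """Map each word to the set of words that differ from it by one letter."""
    words = sorted(set(words))
    neighbours = {word: set() for word in words}
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1:
                neighbours[a].add(b)
                neighbours[b].add(a)
    return neighbours


def find_paths(neighbours, start, end, max_steps):
    """All paths from start to end that never repeat a word and use at most max_steps moves."""
    paths = []

    def extend(path):
        if path[-1] == end:
            paths.append(list(path))
            return
        if len(path) - 1 == max_steps:  # moves used so far
            return
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in path:
                path.append(nxt)
                extend(path)
                path.pop()

    extend([start])
    return paths


# Toy vocabulary (an assumption for this sketch, not the talk's word list).
toy_words = ["stores", "stored", "stared", "stated", "staked", "slaked"]
graph = build_neighbours(toy_words)
for path in find_paths(graph, "stores", "slaked", max_steps=5):
    print(" -> ".join(path))
```

Lowering `max_steps` to 4 drops the longer route through "stated", which is the same knob the puzzle uses to control difficulty.
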
  10. 10. Getting a simple word list ● How hard could it be? ● Lesson #1 – Scrabble lists and similar are useless – only want lists with standard words otherwise puzzles too hard ● Lesson #2 – have to decide about taboo/profane words ● Lesson #3 – the number of words affects the number of ONE_LETTER_DIFF relationships a lot ● Lesson #4 – clever optimisation not needed if restricting self to ordinary words ● Word list used: SCOWL (Spell Checker Oriented Word Lists)
  11. 11. Filtering words ● Needed to turn é to e ● Needed to eliminate possessives e.g. cat's (as used in the phrase “the cat's whiskers”) ● Needed to leave out capitalised words
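
The three filtering rules above could look something like this in Python. This is a minimal sketch, assuming the raw list arrives as one word per entry; the function names are hypothetical and the talk's actual filtering code is not shown in the transcript.

```python
import unicodedata


def strip_accents(word):
    """Turn accented letters into their plain form, e.g. 'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFKD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_words(raw_words):
    """Apply the slide's rules: fold accents, drop possessives, drop capitalised words."""
    cleaned = set()
    for raw in raw_words:
        word = strip_accents(raw.strip())
        if not word:
            continue
        if "'" in word:           # possessives such as cat's
            continue
        if word != word.lower():  # capitalised words such as proper nouns
            continue
        cleaned.add(word)
    return sorted(cleaned)


print(clean_words(["café", "cat's", "Paris", "stores", "stores\n"]))
```
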
  12. 12. For each word, identifying words different by one letter only Disclaimer: the code worked but probably some super-smart optimisations would be possible involving n-dimensional space or something
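
One well-known way to find these pairs without comparing every word against every other word is to group words by "wildcard" patterns, where one letter at a time is replaced by a placeholder. This is a swapped-in technique offered as a sketch; it is not the code from the talk, and the function name is made up. It assumes the words contain only letters, so `_` is safe as a placeholder.

```python
from collections import defaultdict
from itertools import combinations


def one_letter_pairs(words):
    """Return the set of (a, b) pairs, with a < b, that differ by exactly one letter."""
    buckets = defaultdict(list)
    for word in sorted(set(words)):
        for i in range(len(word)):
            # 'stores' goes into buckets '_tores', 's_ores', 'st_res', ...
            pattern = word[:i] + "_" + word[i + 1:]
            buckets[pattern].append(word)

    pairs = set()
    for group in buckets.values():
        # Distinct words sharing a pattern differ only at the blanked position.
        pairs.update(combinations(group, 2))
    return pairs


pairs = one_letter_pairs(["stores", "stored", "stared", "staked", "slaked"])
for a, b in sorted(pairs):
    print(a, b)
```
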
  13. 13. Adding data to Neo4j ● Create nodes and relationships ● Lots of room for optimisations ● Only need to build database once so 15 minutes is not worth reducing ● My Neo4j and Py2neo is beginner level but I was able to solve my problem
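
As a rough illustration of the node and relationship creation step, the sketch below only builds Cypher statements and their parameters; it does not talk to a database. The `Word` label and `name` property are assumptions, `ONE_OFF` is the relationship name used on the Cypher slides, and the `$name` parameter syntax assumes a reasonably recent Neo4j version. The statements would then be handed to whatever client is in use; recent Py2neo releases appear to accept a Cypher string plus parameters via `Graph.run`, but check the version you have.

```python
def node_statement(word):
    """Cypher and parameters that create one Word node if it is missing."""
    return "MERGE (:Word {name: $name})", {"name": word}


def relationship_statement(word_a, word_b):
    """Cypher and parameters that link two existing Word nodes."""
    cypher = (
        "MATCH (a:Word {name: $a}), (b:Word {name: $b}) "
        "MERGE (a)-[:ONE_OFF]->(b)"
    )
    return cypher, {"a": word_a, "b": word_b}


def build_load_plan(words, pairs):
    """Nodes first, then relationships, so every MATCH can find its nodes."""
    plan = [node_statement(word) for word in sorted(set(words))]
    plan += [relationship_statement(a, b) for a, b in sorted(pairs)]
    return plan


plan = build_load_plan(
    ["stores", "stored", "stared"],
    [("stored", "stores"), ("stared", "stored")],
)
for cypher, params in plan:
    print(cypher, params)
```

Using `MERGE` rather than `CREATE` is a design choice in this sketch: it makes the load safe to re-run, at some cost in speed.
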
  14. 14. Py2neo and Cypher
  15. 15. Cypher Syntax as ASCII Art (Really!) Word Word ONE_OFF (Word) -[ONE_OFF]->(Word)
  16. 16. Cypher Syntax as ASCII Art (Really!) Word Word ONE_OFF (Word) -[ONE_OFF]->(Word) How cool is this?
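
To show how the ASCII-art pattern extends to a puzzle query, here is a sketch that builds a variable-length path query as a string. The `Word` label and `name` property are assumptions, and the helper name is hypothetical; the actual queries used in the talk are not in the transcript. The hop limit is written into the query text because Cypher does not accept a parameter for variable-length bounds.

```python
def path_query(start, end, max_steps):
    """Build Cypher (and parameters) for paths of 1..max_steps ONE_OFF hops."""
    if isinstance(max_steps, bool) or not isinstance(max_steps, int) or max_steps < 1:
        raise ValueError("max_steps must be a positive integer")
    cypher = (
        "MATCH p = (a:Word {name: $start})"
        f"-[:ONE_OFF*1..{max_steps}]-"
        "(b:Word {name: $end}) "
        "RETURN [n IN nodes(p) | n.name] AS words"
    )
    return cypher, {"start": start, "end": end}


cypher, params = path_query("sloven", "closed", 5)
print(cypher)
print(params)
```

The pattern leaves out the arrowhead so that a relationship stored in one direction can be walked either way.
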
  17. 17. Example Output
  18. 18. Matching chart
  19. 19. Live Demo – Suggestions for Start Word
  20. 20. “sloven” to “closed” solution(s)
  21. 21. Resources ● Neo4j ● Py2neo ● SCOWL
  22. 22. About Catalyst