ETL into Neo4j

Learn some of the ways to load data into Neo4j quickly.

ETL into Neo4j Presentation Transcript

  • 1. ETL into Neo4j Max De Marzi
  • 2. About Me • Built the Neography Gem (Ruby wrapper for the Neo4j REST API) • Playing with Neo4j since 10/2009 • My Blog: • Find me on Twitter: @maxdemarzi • Email me: • GitHub:
  • 3. Agenda • ETL your mind • ETL with Batch and the REST API • ETL with Gremlin and Groovy • ETL with the Batch Importer • ETL from SQL
  • 4. ETL your Mind: You have to start there
  • 5. More Relational than Relational: Stop thinking about how tables are related. Start thinking about relationships.
  • 6. Objects like to mingle: Optimized for “trees” of data, versus optimized for seeing the forest and the trees, and the branches, and the trunks
  • 7. SELECT skills.*, user_skill.* FROM users JOIN user_skill ON users.id = user_skill.user_id JOIN skills ON user_skill.skill_id = skills.id WHERE users.id = 1
  • 8. START user = node(1) MATCH user -[user_skill]-> skill RETURN skill, user_skill
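To make the contrast between slides 7 and 8 concrete, here is a minimal Python sketch that is not from the talk. It answers the same question ("which skills does user 1 have?") once by scanning a join table and once by following relationships stored with the node. The table and field names mirror the SQL above; the sample rows and the `level` property are invented for illustration.

```python
# Illustrative data only: rows and the "level" property are assumptions.
skills = [{"id": 10, "name": "ruby"}, {"id": 11, "name": "neo4j"}]
user_skill = [
    {"user_id": 1, "skill_id": 10, "level": "expert"},
    {"user_id": 1, "skill_id": 11, "level": "fan"},
]


def skills_via_join(user_id):
    """Relational style: scan the join table, then look up each skill row."""
    skill_by_id = {s["id"]: s for s in skills}
    return [
        (skill_by_id[us["skill_id"]]["name"], us["level"])
        for us in user_skill
        if us["user_id"] == user_id
    ]


# Graph style: each node carries its own outgoing relationships,
# so there is no join table to scan.
graph = {
    1: [("ruby", {"level": "expert"}), ("neo4j", {"level": "fan"})],
}


def skills_via_traversal(user_id):
    """Graph style: start at the node and follow its relationships."""
    return [(skill, rel["level"]) for skill, rel in graph.get(user_id, [])]
```

Both functions return the same pairs; the difference is that the traversal only touches data attached to the starting node.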
  • 9. Property Graph
  • 10. Relational tables: Language (language_code, language_name, word_count); LanguageCountry (language_code, country_code, primary); Country (country_code, country_name, flag_uri). Property graph: Language (name, code, word_count) -[IS_SPOKEN_IN {as_primary}]-> Country (name, code, flag_uri)
  • 11. As a node property: name: “Canada”, languages_spoken: “[ ‘English’, ‘French’ ]”. As relationships: language: “English” and language: “French” nodes connected by spoken_in to country nodes name: “USA”, name: “Canada”, name: “France”
  • 12. One flat table: Country (name, flag_uri, language_name, number_of_words, yes_in_language, no_in_language, currency_code, currency_name). As a graph: Country (name, flag_uri) -[SPEAKS]-> Language (name, number_of_words, yes, no), plus a Currency node (code, name)
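The remodelling on slide 12 can be sketched in a few lines of Python. This is an illustration under assumptions, not code from the talk: it splits rows of the flat Country table into Country, Language and Currency nodes plus SPEAKS relationships, deduplicating nodes by name or code. The slide does not name the relationship between Country and Currency, so none is created here.

```python
def split_country_rows(rows):
    """Split flat Country rows into nodes and SPEAKS relationships."""
    countries, languages, currencies = {}, {}, {}
    speaks = []
    for row in rows:
        # Dict keys deduplicate nodes that repeat across rows.
        countries[row["name"]] = {
            "name": row["name"],
            "flag_uri": row["flag_uri"],
        }
        languages[row["language_name"]] = {
            "name": row["language_name"],
            "number_of_words": row["number_of_words"],
            "yes": row["yes_in_language"],
            "no": row["no_in_language"],
        }
        currencies[row["currency_code"]] = {
            "code": row["currency_code"],
            "name": row["currency_name"],
        }
        edge = (row["name"], "SPEAKS", row["language_name"])
        if edge not in speaks:
            speaks.append(edge)
    return countries, languages, currencies, speaks
```

A country with two languages takes two rows in the flat table but becomes one Country node with two SPEAKS relationships.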
  • 13. ETL with Batch and the REST API
  • 14. Batch command from REST API: Great for importing Facebook/Twitter friends. Keep each request under 10k commands. Preferably send a request every 2k to 5k commands.
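The sizing advice on slide 14 amounts to chunking the work before sending it. The Python sketch below only builds JSON payloads and sends nothing. The command shape (`method`, `to`, `body`, `id`) follows the legacy Neo4j REST batch endpoint as I understand it, so treat it as an assumption and check the documentation for your server version. The function name and the default of 2000 commands per request are illustrative choices.

```python
import json


def batch_payloads(records, size=2000):
    """Return one JSON batch payload per chunk of at most `size` records."""
    if not 0 < size < 10000:
        raise ValueError("keep each request under 10k commands")
    payloads = []
    for start in range(0, len(records), size):
        chunk = records[start:start + size]
        # Command ids are local to a request, so restart at 0 per chunk.
        commands = [
            {"method": "POST", "to": "/node", "body": record, "id": i}
            for i, record in enumerate(chunk)
        ]
        payloads.append(json.dumps(commands))
    return payloads
```

Each returned string would be the body of one batch request, so 4,500 friends at the default size become three requests.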
  • 15. Using Batch from Neography
  • 16. Why Batch: Transactional: if any command fails, nothing is committed. Ordered: responses are guaranteed to come back in the same order as sent. Suited to continuous loading/updating of nodes and relationships, in spurts or streaming.
  • 17. ETL with Gremlin and Groovy
  • 18. Commit every 1000 changes or so, and make sure to stop the transaction at the very end to commit the last few changes. Look into auto-indexing to make life easier; it is disabled by default. See the docs for a trick to make it a full-text index instead of an exact index.
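The commit pattern from slide 18 is easy to get wrong at the tail end, so here is a minimal Python sketch of it. The talk uses Gremlin and Groovy; this swaps in a hypothetical `FakeGraph` class that only counts changes and commits, so the control flow can be shown without a database.

```python
class FakeGraph:
    """Stand-in for a transactional graph; it only counts commits."""

    def __init__(self):
        self.pending = 0
        self.committed = 0
        self.commits = 0

    def add(self, item):
        self.pending += 1

    def commit(self):
        self.committed += self.pending
        self.pending = 0
        self.commits += 1


def load(graph, items, commit_every=1000):
    """Add items, committing every `commit_every` changes."""
    for count, item in enumerate(items, start=1):
        graph.add(item)
        if count % commit_every == 0:
            graph.commit()
    # Without this final commit the last partial batch would be lost.
    if graph.pending:
        graph.commit()
```

Loading 2,500 items this way produces three commits: two full batches of 1,000 and one final batch of 500.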
  • 19. A crazy format is OK: Id :: Title :: Genre|Genre|Genre. But it’s preferable to steer clear of escape characters like “|”. The string location of the data file is converted to a URL, then processed one line at a time. A movie vertex is created, a genre vertex is created unless it already exists (index lookup), and an edge from movie to genre is created. Full walk-through on
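The loading steps on slide 19 can be sketched in Python instead of the Groovy used in the talk. Plain dictionaries stand in for the graph and for the genre index; the edge label `hasGenre` and the function names are hypothetical choices made for this illustration.

```python
def parse_movie_line(line):
    """Parse 'Id :: Title :: Genre|Genre|Genre' into its three parts."""
    movie_id, title, genres = [part.strip() for part in line.split("::")]
    return int(movie_id), title, [g.strip() for g in genres.split("|")]


def load_movies(lines):
    """Create movie vertices, get-or-create genre vertices, link them."""
    movies, genre_index, edges = {}, {}, []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        movie_id, title, genres = parse_movie_line(line)
        movies[movie_id] = title
        for genre in genres:
            # Index lookup: only create the genre vertex if it is missing.
            genre_index.setdefault(genre, len(genre_index) + 1)
            edges.append((movie_id, "hasGenre", genre))
    return movies, genre_index, edges
```

The index lookup is what keeps a genre shared by many movies from being created more than once.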
  • 20. ETL with the Batch Importer
  • 21. Installation Walk-Through
  • 22. Testing it: 7.5M nodes, 42M relationships in just over 3 minutes on a laptop.
  • 23. Loading it into Neo4j: Full walk-through on
  • 24. When to use the Batch Importer? • 1st time loading or periodic reloading • When you need Speed • When you don’t mind a little Java
  • 25. ETL from SQL
  • 26. Identities who vouched for each other: row_number() and INTO are our friends
  • 27. The “term” vouched for will serve as our relationship type, status is a relationship property.
  • 28. Notice there are no node ids. These are automatic; clkao is node 1
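Slides 26 to 28 describe turning vouch records into a nodes file and a relationships file. The Python sketch below shows the idea under assumptions: node ids are implied by line order in the nodes file (as row_number() would assign them), the term becomes the relationship type, and status becomes a relationship property. The tab-separated layout and header names are modelled on the batch importer format as I recall it, and every identity other than clkao is invented.

```python
def build_import_files(vouches):
    """vouches: (voucher, vouchee, term, status) tuples -> (nodes, rels) text."""
    node_ids = {}
    for voucher, vouchee, _term, _status in vouches:
        for name in (voucher, vouchee):
            # First appearance decides the id, like a row number.
            node_ids.setdefault(name, len(node_ids) + 1)

    # No id column in the nodes file: line order implies the node id.
    nodes_lines = ["name"] + list(node_ids)
    rels_lines = ["start\tend\ttype\tstatus"]
    for voucher, vouchee, term, status in vouches:
        rels_lines.append(
            f"{node_ids[voucher]}\t{node_ids[vouchee]}\t{term}\t{status}"
        )
    return "\n".join(nodes_lines), "\n".join(rels_lines)
```

Because ids come from position, the nodes file and the relationships file must be generated from the same ordering.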
  • 29. No time to get coffee >8-[
  • 30. What about multiple types of nodes? No problem, just add the MAX(node_id) from the first table. Full walk-through at: Need help? E-mail me, catch me on Google chat or Skype. Please don’t be shy…. and read my blog:
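The MAX(node_id) trick from slide 30 is an id offset: each later table numbers its rows from 1 and adds the highest id handed out so far. A small Python sketch, with hypothetical table labels and names, shows that the resulting ids never collide.

```python
def assign_ids(tables):
    """tables: ordered (label, names) pairs -> {(label, name): node_id}."""
    ids = {}
    offset = 0  # plays the role of MAX(node_id) over earlier tables
    for label, names in tables:
        for row_number, name in enumerate(names, start=1):
            ids[(label, name)] = offset + row_number
        offset += len(names)
    return ids
```

With two identities and one project, the project's row number 1 becomes node id 3.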
  • 31. Thank you!