Wanderu – Lessons from Building a Travel Site with Neo4j - Eddy Wong @ GraphConnect NY 2013

Wanderu is a consumer-focused search engine for buses and trains. Eddy will recount the architectural, modeling and other technical “lessons learned” and “lessons unlearned” in implementing our geospatial and search features using Neo4j in the context of a NoSQL polyglot solution.

Published in: Technology

Transcript

  • 1. Wanderu: Lessons Learned • Lessons Learned and Unlearned from Building a Travel Site with Graphs and Neo4j • Eddy Wong, CTO, Wanderu.com, @eddywongch
  • 2. About Wanderu.com • Search Engine for (Intercity) Buses and Trains
  • 3. Demo
  • 4. From pt A to pt B • A Shortest Path Problem as a function of depart, arrive, price, duration, date times • Stations: A: NYC, Philly, B: DC • Trips: MEG, $9, 11/07/2013; MEG, $4, 11/07/2013; BOLT, $13, 11/07/2013 • Nomenclature: Stations, Trips
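The shortest-path framing on slide 4 can be sketched in a few lines of Python. This is a minimal illustration, not Wanderu's implementation: it weighs trips by price only, and the assignment of each fare to a station pair (and the `PHL` code for Philly) is an assumption, since the transcript lists the fares without saying which edge each one labels.

```python
import heapq

# Hypothetical trips: (depart, arrive, carrier, price_usd). Carriers and
# prices echo the slide; which station pair each fare belongs to is assumed.
TRIPS = [
    ("NYC", "PHL", "MEG", 9),
    ("PHL", "DC", "MEG", 4),
    ("NYC", "DC", "BOLT", 13),
]


def cheapest_path(trips, source, target):
    """Return (total_price, [stations]) for a cheapest route, or None."""
    graph = {}
    for depart, arrive, _carrier, price in trips:
        graph.setdefault(depart, []).append((arrive, price))

    # Dijkstra: the heap holds (cost so far, station, path taken).
    heap = [(0, source, [source])]
    settled = set()
    while heap:
        cost, station, path = heapq.heappop(heap)
        if station == target:
            return cost, path
        if station in settled:
            continue
        settled.add(station)
        for nxt, price in graph.get(station, []):
            if nxt not in settled:
                heapq.heappush(heap, (cost + price, nxt, path + [nxt]))
    return None
```

Calling `cheapest_path(TRIPS, "NYC", "DC")` returns a route and its total price. A real search would also have to weigh duration and departure times, which the slide lists as inputs.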
  • 5. Lessons • Architectural • Modeling • Geo • Legend: Learned, UnLearned, Idea
  • 6. Our Story • 2 yr startup, Tech started about 1+ yr ago • Beta in Mar 2013, Launch in Aug 2013 • Knew nothing about Neo4j when we started (Jun 2012) • Did not like the relational model: wanted schema-less and no self-joins • Wanted a graph model
  • 7. Workflow Scraping Bus Websites JSON Non-uniform Data Server Store Uniform Data
  • 8. Architectural Lessons Art: MC Escher
  • 9. Our Situation • Data is written only in one direction • Users search for paths, then segments • Searches are done by date • Needed online capability • Trip info (price/avail) could change on some
  • 10. Solution Scraping Bus Websites JSON Uniform Data Non-uniform Data Replica Mechanism Nodes & Edges Neo4j Mongo Conn MongoDB
  • 11. MongoConnector • MongoDB Lab project, open source, unsupported • Uses Replica Mechanism: Oplog • Eventually Consistent (not real time) • Written in Python • Main methods: Upserts and Deletes, passes doc • Implement DocMgr->Neo4jDocMgr->py2neo • We can add new properties easily on the fly
  • 12. Polyglot Arch BOS, NYC BOS, PHL NYC, DC NYC, PHL Scraping Bus Websites JSON Non-uniform Data Replica Mechanism MongoDB REST Server Nodes & Edges Neo4j Mongo Conn
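The "Upserts and Deletes, passes doc" idea from slide 11 might look roughly like the sketch below. It is a stand-in under stated assumptions: the class name `GraphDocManager`, the document fields `_id`, `depart` and `arrive`, and the in-memory dicts that replace Neo4j are all hypothetical. It does not reproduce the actual MongoConnector or py2neo interfaces.

```python
class GraphDocManager:
    """Hypothetical document manager that mirrors trip documents into a
    graph. Plain dicts and sets stand in for Neo4j; this is not the
    MongoConnector or py2neo API."""

    def __init__(self):
        self.nodes = set()    # station keys
        self.edges = {}       # edge id -> (depart, arrive)
        self.doc_edge = {}    # document _id -> edge id
        self.edge_docs = {}   # edge id -> set of document _ids

    def upsert(self, doc):
        """Create or refresh the edge backing one trip document."""
        depart, arrive = doc["depart"], doc["arrive"]
        edge_id = f"{depart}-{arrive}"
        self.remove(doc)  # clear any earlier mapping for this document
        self.nodes.update((depart, arrive))
        self.edges[edge_id] = (depart, arrive)
        self.doc_edge[doc["_id"]] = edge_id
        self.edge_docs.setdefault(edge_id, set()).add(doc["_id"])

    def remove(self, doc):
        """Drop the document; delete its edge once no trip uses it.
        Station nodes are kept, since other routes may still need them."""
        edge_id = self.doc_edge.pop(doc["_id"], None)
        if edge_id is None:
            return
        docs = self.edge_docs[edge_id]
        docs.discard(doc["_id"])
        if not docs:  # last trip on this route is gone
            del self.edge_docs[edge_id]
            del self.edges[edge_id]
```

Because every change arrives as a whole document, extra properties on the document need no schema change on the graph side, which is one way to read the slide's "add new properties easily on the fly".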
  • 13. Modeling Lessons Art: MC Escher
  • 14. Our Story • We tried to “dump” all data into Neo4j • Edges had dates -> too many Edges -> “Super Node Problem” • Query perf was terrible (1+ mins) and worse as # edges increased • Tried Gremlin -> No improvements • Needed range queries on Edges
  • 15. “Dehydrate” • Don’t store everything in Neo4j, only metadata • Use Neo4j as a “connection index” • Don’t store entities in Nodes, only keys • Don’t store heavy properties in Edges
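A small sketch of what "dehydrating" could mean in practice: many dated trip documents collapse into one key-only edge per station pair, so the graph stays small while the heavy, dated data lives elsewhere. The function name `build_connection_index` and the document fields are assumptions made for illustration.

```python
def build_connection_index(trip_docs):
    """Reduce full trip documents to station keys and key-only edges."""
    nodes, edges = set(), set()
    for doc in trip_docs:
        depart, arrive = doc["depart"], doc["arrive"]
        nodes.update((depart, arrive))
        # No dates, prices or other heavy properties on the edge.
        edges.add((depart, arrive))
    return nodes, edges


# Hypothetical documents: two dated BOS-NYC trips and one NYC-DC trip.
docs = [
    {"depart": "BOS", "arrive": "NYC", "date": "2013-11-07", "price": 15},
    {"depart": "BOS", "arrive": "NYC", "date": "2013-11-08", "price": 18},
    {"depart": "NYC", "arrive": "DC", "date": "2013-11-07", "price": 13},
]
nodes, edges = build_connection_index(docs)
```

Here three documents yield only two edges. Adding more dates for the same pair adds documents but no new edges, which is the property that sidesteps the "too many Edges" problem described on slide 14.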
  • 16. Neo4j Model source: Wes Freeman, Tobias Lindaaker
  • 17. Our Solution • Serve paths from Neo4j • Segments from MongoDB (with date constraints) • Back to “Joins” • “Join” across Neo4j + MongoDB: 1 != 525d9031e6c9236072114387
  • 18. Joins across DBs • MongoDB: Stations = Neo4j: Nodes (BOS, NYC, DC, ...) • Forget seq ids generated by dbs • Use a human-created “UUID” string for id • MongoDB: Trips = Neo4j: Edges (BOS-NYC, BOS-DC, NYC-DC, ...) • Convert pair into id: depart-arrive • For example: BOSNYC
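The cross-database "join" from slides 17 and 18 can be sketched as two steps: enumerate paths over the key-only graph, then fetch dated segments for each hop by the shared depart-arrive id. In-memory structures stand in for Neo4j and MongoDB here, and the names `find_paths`, `segments_for`, the `route` field and the prices are hypothetical.

```python
from datetime import date

# Graph side (stand-in for Neo4j): station keys only.
EDGES = {"BOS": ["NYC", "DC"], "NYC": ["DC"]}

# Document side (stand-in for MongoDB): dated trips keyed by route id.
TRIPS = [
    {"route": "BOS-NYC", "date": date(2013, 11, 7), "price": 15},
    {"route": "NYC-DC", "date": date(2013, 11, 7), "price": 13},
    {"route": "NYC-DC", "date": date(2013, 11, 8), "price": 11},
]


def find_paths(edges, source, target, path=None):
    """Depth-first enumeration of simple paths in the key-only graph."""
    path = (path or []) + [source]
    if source == target:
        return [path]
    found = []
    for nxt in edges.get(source, []):
        if nxt not in path:
            found.extend(find_paths(edges, nxt, target, path))
    return found


def segments_for(path, trips, travel_date):
    """For each hop, select trips by route id and travel date."""
    result = []
    for depart, arrive in zip(path, path[1:]):
        route_id = f"{depart}-{arrive}"  # same human-readable id in both stores
        result.append(
            [t for t in trips if t["route"] == route_id and t["date"] == travel_date]
        )
    return result
```

For example, `find_paths(EDGES, "BOS", "DC")` yields the candidate paths, and `segments_for(path, TRIPS, date(2013, 11, 7))` returns the matching trips per hop. The join works only because both stores use the same human-readable key, which is the point of the slide's `1 != 525d9031e6c9236072114387` remark about database-generated ids.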
  • 19. Geo Lessons Art: MC Escher
  • 20. Hybrid Solution • Google Autocomplete • Google Maps • MongoDB station geo lookup
  • 21. Lessons of Lessons • Really understand the Neo4j Runtime Model • Pick universal human generated ids • Join across dbs better than RDBMS: 10s paths x 100s segments vs. 500k x 500k • Glad to have picked Neo4j: doing content gen and more geo features now
  • 22. Useful Links • Neo4j Internals slideshare.net/thobe/an-overview-of-neo4j-internals • Aseem’s Lessons Learned with Neo4j http://aseemk.com/talks/neo4j-lessons-learned#/14 • Wes Freeman, Neo4j Internals http://wes.skeweredrook.com/graphdb-meetup-may-2013.pdf • MongoConnector blog.mongodb.org/post/29127828146/introducing-mongo-connector
