BlaBlaCar Elastic Search Feedback

How ElasticSearch was deployed at BlaBlaCar

Published in: Travel, Technology, Education

Transcript

  1. ElasticSearch feedback
  2. Introduction
  3. Nicolas Blanc - BlaBlArchitect (@thewhitegeek)
     Timeline: Sinfomic (1999), 2001, 2005, 2008, 2012
  4. What is BlaBlaCar?
  5. 3,000,000 members in Europe
  6. 10 countries
     ● France
     ● Spain
     ● Italy
     ● UK
     ● Poland
     ● Portugal
     ● Netherlands
     ● Belgium
     ● Luxembourg
     ● Germany (new)
  7. Growth (chart: axis marks at 25 million and 50 million, from January 2008 to January 2013)
  8. Infrastructure
     ● 2 front web servers
     ● 2 MySQL masters (+4 slaves on SSD)
     ● 1 private cloud (KVM + Open vSwitch): Redis, Memcache, RabbitMQ/workers
     ● 1 ElasticSearch cluster
  9. Changing the search engine
  10. What exists? Why change?
      MySQL database
      ● Relational DB (lots of joins needed)
      ● Plain SQL queries
      ● Home-made geographical search
      Recent problems
      ● New features mean more complex queries
      ● Scalability: performance depends on DB load
  11. Initial requirements
      Scalability
      ● A trip search must complete in less than 200 ms
      ● The system side of the solution must be easy to maintain
      ● It must be possible to cluster it (also to avoid a SPOF)
      Low code impact on the existing application
      ● Same features as today (geographical search)
      ● Minimize the developers' work
      ● Add one missing feature: facets
  12. Initial competitors
      ● SenseiDB
  13. Why ElasticSearch
      ✔ Easiest clustering
      ✔ Good performance when indexing
      ✔ Little code to write to use it
      ✔ Schemaless
      ✔ Based on Lucene
      ✔ Written in Java (we need to code a grouping feature)
  14. ElasticSearch has won, now let's migrate our search!
  15. Changing our mindset
      Object in a relational database
      ● Can be spread across multiple tables
      ● Lots of information available through JOINs
      Object in a document-oriented database
      ● Only one big index for these objects
      ● All information needs to be in the object, not in multiple tables
  17. Defining our objects well
      Need to know what we want to search
      ● Searching trips (front office usage)
      ● Searching members (back office usage)
      ● Searching FAQ (front office usage)
      Think of all the needed fields
      ● The ones used for queries
      ● The ones used for filters
      ● The ones used for facets
  18. Defining the index well
      System point of view
      ● Number of nodes in the cluster
      ● Number of shards
      ● Number of replicas
      Application point of view
      ● Define the type and attributes of all fields (mapping)
      ● Use parent/child or nested documents to improve indexing
      ● How to push documents from the DB?
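
The system-side and application-side choices above end up in a single index-creation request body. The following is a minimal sketch of assembling that body in Python. The function name `build_index_body` and the example values are assumptions for illustration, not the deck's actual configuration. `number_of_shards` and `number_of_replicas` are the standard ElasticSearch index settings.

```python
import json


def build_index_body(shards, replicas, mappings):
    """Return the JSON body for an index-creation request."""
    if shards < 1 or replicas < 0:
        raise ValueError("shards must be >= 1 and replicas >= 0")
    return {
        "settings": {
            "number_of_shards": shards,      # fixed once the index is created
            "number_of_replicas": replicas,  # can be changed on a live index
        },
        "mappings": mappings,
    }


# Hypothetical values, for illustration only.
body = build_index_body(1, 1, {"trip": {"properties": {"price": {"type": "float"}}}})
print(json.dumps(body, indent=2))
```

Because the shard count cannot be changed on an existing index, it is worth settling before the first document is pushed.
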
  19. Indexing: using a river or not?
      River advantages
      ● Plugs directly into our source backend
      ● An ElasticSearch API exists to code a new one
      River problems
      ● Not easy to add business logic on some fields
      ● Really hard when your DB is unconventional
      ● Requires a full reindex of all the documents
  20. Indexing: our manual way
      We wrote an asynchronous indexer
      ● Written in Java
      ● Applies business logic when fetching from the DB
      ● Fetches from multiple DBs/sources
      ● Uses the Java ES library
      ● Easy interface
      ● Send {"trip":1234567} and the server answers {"OK"}
  21. One index sample: Trip
  22. Defining our Trip object well
      Think of all the needed fields
      ● The ones used for queries
        Trip departure date, from where, to where, user id
      ● The ones used for filters
        User ratings, price, vehicle, seats left, is user blocked (a blocked user is a user who performed a forbidden action on the website)
      ● The ones used for facets
        User ratings, price, vehicle
  23. Defining our Trip index well
      Think of all the system requirements
      ● The cluster has 2 nodes
      ● We keep the default configuration for shards/replicas
      Think of the object mapping
      ● For each field:
        ● Define the type (string, long, geo_point, date, float, boolean)
        ● Define the scope (include_in_all)
        ● Define the analyzer (for type string)
  24. Trip mapping
      "trip": {
        "properties": {
          "is_user_blocked": {"type": "boolean", "include_in_all": false},
          "user_ratings": {"type": "long", "include_in_all": false},
          "from": {"type": "geo_point", "include_in_all": false},
          "price": {"include_in_all": false, "type": "float"},
          "price_euro": {"type": "float", "include_in_all": false},
          "seats_left": {"include_in_all": false, "type": "long"},
          "seats_offered": {"include_in_all": false, "type": "long"},
          "to": {"include_in_all": false, "type": "geo_point"},
          "trip_date": {"format": "dateOptionalTime", "include_in_all": false, "type": "date"},
          "vehicle": {"include_in_all": false, "type": "string"},
          "userid": {"include_in_all": false, "index": "not_analyzed", "type": "string"}
        }
      }
  25. Indexing events well
      Which modifications send a change event
      ● All trip creations/deletions/modifications
      ● Member modifications (blocked or not)
      ● New ratings from other members
      ● A seat has been reserved
      ● A member changes their vehicle
      A change event is a call to the internal indexer
      ● Send {"trip":123456} to the indexer (create/update)
      ● Send {"tripd":123456} to the indexer (delete)
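
The two-message protocol above can be sketched as a small dispatcher. The deck says the real indexer is written in Java and does not show its code, so this Python version is only an illustration. The names `ACTIONS` and `parse_event` are hypothetical, and the only parts taken from the slides are the `trip` and `tripd` keys and their meaning.

```python
import json

# Message keys from the slides: "trip" means create/update, "tripd" means delete.
ACTIONS = {"trip": "index", "tripd": "delete"}


def parse_event(raw):
    """Parse one indexer message and return (action, trip_id)."""
    message = json.loads(raw)
    if not isinstance(message, dict) or len(message) != 1:
        raise ValueError("expected a JSON object with exactly one key")
    (key, trip_id), = message.items()
    if key not in ACTIONS:
        raise ValueError("unknown event type: %s" % key)
    return ACTIONS[key], int(trip_id)


print(parse_event('{"trip": 123456}'))
print(parse_event('{"tripd": 123456}'))
```

Keeping the message down to a type and an id means the caller never builds the document itself. The indexer fetches the current state from the database, so repeated or out-of-order events still converge on the same document.
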
  26. Sample trip index query
      {
        "query": {
          "filtered": {
            "query": {"match_all": {}},
            "filter": {
              "and": [
                {"geo_distance": {"distance": "40.14937866995km", "from": {"lat": 48.856614, "lon": 2.3522219}}},
                {"geo_distance": {"distance": "40.14937866995km", "to": {"lat": 45.764043, "lon": 4.835659}}},
                {"range": {"price": {"from": 0, "include_lower": false}}}
              ]
            }
          }
        },
        "sort": [{"trip_date": {"order": "asc"}}],
        "filter": {"term": {"is_user_blocked": false}},
        "from": 0,
        "size": 10
      }
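
A request like the one above is usually generated from the search form rather than written by hand. Here is a minimal sketch that rebuilds the same body from two coordinates and a radius. The function `build_trip_query` and its parameters are assumptions for illustration. The structure mirrors the slide, including the `filtered` query and `and` filter of the ElasticSearch versions of that era, which later versions removed.

```python
def build_trip_query(origin, destination, radius_km, size=10, offset=0):
    """Build a trip search body; origin and destination are (lat, lon) pairs."""

    def near(field, point):
        lat, lon = point
        return {
            "geo_distance": {
                "distance": "%skm" % radius_km,
                field: {"lat": lat, "lon": lon},
            }
        }

    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "and": [
                        near("from", origin),
                        near("to", destination),
                        # Exclusive lower bound: keep only trips with price > 0.
                        {"range": {"price": {"from": 0, "include_lower": False}}},
                    ]
                },
            }
        },
        "sort": [{"trip_date": {"order": "asc"}}],
        "filter": {"term": {"is_user_blocked": False}},
        "from": offset,
        "size": size,
    }


# Coordinates taken from the slide's sample query; the 40 km radius is illustrative.
query = build_trip_query((48.856614, 2.3522219), (45.764043, 4.835659), 40)
```
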
  27. The real world
      A trip now has more than 30 fields
      ● (FAQ is around 25 fields)
      ● (members even more...)
      To build a trip document we need 3 different SQL queries
      ● (FAQ: 2 different SQL queries)
      ● (Member: 10 different SQL queries)
      The Trip index has only 1 shard (grouping)
  28. And now the caveats
  29. Preloaded scripts
      We use MVEL scripts to improve scoring
      ● They are not clustered
      ● Each node needs to have the scripts
      ● A node restart is needed to add or modify them
      Solution: Chef (tool from Opscode)
      All node configurations are centralized in the Chef repository
  30. Grouping documents
      Home-made patches to ElasticSearch (based on work by Martijn van Groningen for lusini.de)
      Soon in ElasticSearch (I hope so much)
  31. Mapping modification
      On a running index:
      ● Changing a type is not allowed
      ● Changing an analyzer is not allowed
      Solution: index alias
      1) Changing the mapping → create a new index
      2) When the new index is up to date → switch the alias
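
Step 2 above, switching the alias, can be done in one request to the ElasticSearch `_aliases` endpoint, so searches never see a moment without an index behind the alias. The sketch below only builds the request body. The function name and the index names `trip_v1` and `trip_v2` are hypothetical, since the deck does not give its naming scheme.

```python
def alias_swap_actions(alias, old_index, new_index):
    """Body for POST /_aliases that repoints `alias` from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: the application always queries the alias "trip".
swap = alias_swap_actions("trip", "trip_v1", "trip_v2")
```

Both actions travel in the same request, which is what makes the switch safe to run while the application keeps querying the alias.
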
  32. IO limits
      We have only 2 nodes
      ● The Trip index is around 2 GB
      ● But there is only 1 shard for the Trip index
      ● Can index 100 trips per second on a busy evening
      Solution: we installed Intel SSDs (while waiting for a distributed grouping feature)
  33. Choosing the analyzer
      Some fields must not be analyzed
      ● If you use ISO codes for countries (IT for Italy or DE for Germany are ignored in some cases)
      The global analyzer has limits
      ● Accents from countries like France, Germany or Spain are not always parsed correctly
      ● One analyzer per country is difficult to implement in some cases
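
One plausible reason for the ignored ISO codes is stop-word removal: once lowercased, "IT" and "DE" look like common words in some languages. The toy model below is not ElasticSearch's analyzer. Its stop-word list and function names are assumptions, chosen only to show why such a field is better mapped with `"index": "not_analyzed"`, as the `userid` field is in the Trip mapping.

```python
# Illustrative stop-word list only; real analyzers ship their own per-language lists.
STOP_WORDS = {"it", "de", "the", "a"}


def toy_analyze(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [token for token in text.lower().split() if token not in STOP_WORDS]


def index_terms(value, analyzed):
    """Terms stored for a field value, depending on whether it is analyzed."""
    return toy_analyze(value) if analyzed else [value]


print(index_terms("IT", analyzed=True))   # the country code disappears
print(index_terms("IT", analyzed=False))  # the exact value is kept
```
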
  34. OK, sweet. What's next?
  35. Using ElasticSearch to ease log analysis
  36. By the way... we're hiring!
      Dev, HTML ninja, leader, ...
      Come and see me right now, or send me your friends (and we have beer, table football and an arcade cabinet)
  37. Thank you!
      Follow us: @covoiturage
      Apply now: join@BlaBlaCar.com
