BlaBlaCar Elastic Search Feedback


How ElasticSearch was deployed at BlaBlaCar


Published in Travel, Technology, Education
Transcript

  • 1. 1/37 ElasticSearch feedback
  • 2. 2/37 Introduction
  • 3. 3/37 Nicolas Blanc - BlaBlArchitect
    Sinfomic (1999), @thewhitegeek (2001) (2005) (2008) (2012)
  • 4. 4/37 What is BlaBlaCar?
  • 5. 5/37 3,000,000 members in Europe
  • 6. 6/37 10 countries (previously 9)
      ● France
      ● Spain
      ● Italy
      ● UK
      ● Poland
      ● Portugal
      ● Netherlands
      ● Belgium
      ● Luxembourg
      ● NEW: Germany
  • 7. 7/37 Growth [chart: January 2008 to January 2013; axis marks at 25 million and 50 million]
  • 8. 8/37 Infrastructure
    2 front web servers
    2 MySQL masters (+ 4 slaves on SSD)
    1 private cloud (KVM + Open vSwitch)
      ● Redis
      ● Memcache
      ● RabbitMQ/workers
    1 ElasticSearch cluster
  • 9. 9/37 Changing the Search Engine
  • 10. 10/37 What exists? Why change?
    MySQL database
      ● Relational DB (lots of joins needed)
      ● Plain SQL queries
      ● Home-made geographical search
    Recent problems
      ● New features mean more complex queries
      ● Scalability: performance depends on DB load
  • 11. 11/37 Initial requirements
    Scalability
      ● A trip search needs to complete in less than 200 ms
      ● The system part of the solution must be easy to maintain
      ● It must be possible to cluster it (also to avoid a SPOF)
    Low code impact on the existing application
      ● Same features as today (geographical search)
      ● Minimize the developers' work
      ● Add one missing feature: facets
  • 12. 12/37 Initial Competitors
    SenseiDB
  • 13. 13/37 Why ElasticSearch
      ✔ Easiest clustering
      ✔ Good performance when indexing
      ✔ Little code to write to use it
      ✔ Schemaless
      ✔ Based on Lucene
      ✔ Written in Java (we needed to code a grouping feature)
  • 14. 14/37 ElasticSearch has won, now migrate our search!
  • 15. 15/37 Changing our mindset
    Object in a relational database
      ● Can be split across multiple tables
      ● Lots of information reachable through JOINs
    Object in a document-oriented database
      ● Only one big index for these objects
      ● All information needs to be in the object, not spread across multiple tables
  • 17. 17/37 Defining our objects well
    Need to know what we want to search
      ● Searching trips (front office usage)
      ● Searching members (back office usage)
      ● Searching the FAQ (front office usage)
    Think of all the needed fields
      ● The ones used for queries
      ● The ones used for filters
      ● The ones used for facets
  • 18. 18/37 Defining the index well
    System point of view
      ● Number of nodes in the cluster
      ● Number of shards
      ● Number of replicas
    Application point of view
      ● Define the type and attributes of all fields (mapping)
      ● Use parent/child or nested documents to improve indexing
      ● How to push documents from the DB?
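The system-level and application-level choices on this slide end up in one index-creation request body. The following is a minimal sketch of such a body, assuming the standard `settings` / `mappings` layout; the shard and replica counts and the one-field mapping are illustrative values, not figures taken from the talk.

```python
import json


def build_index_body(shards, replicas, mappings):
    """Return the JSON body of an index-creation request.

    `shards` and `replicas` are the system-level choices,
    `mappings` carries the application-level field definitions.
    """
    if shards < 1:
        raise ValueError("an index needs at least one shard")
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": mappings,
    }


# Illustrative values only (assumption): 1 shard, 1 replica, one field.
body = build_index_body(1, 1, {"trip": {"properties": {"price": {"type": "float"}}}})
print(json.dumps(body, indent=2))
```

The shard count is fixed when the index is created, which is one reason it is worth deciding before the first document is pushed.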
  • 19. 19/37 Indexing: using a river or not?
    River advantages
      ● Plugs directly into our source backend
      ● An ElasticSearch API exists to code a new one
    River problems
      ● Not easy to add business logic on some fields
      ● Really hard when your DB is unconventional
      ● Full reindex of all the documents
  • 20. 20/37 Indexing: our manual way
    We wrote an asynchronous indexer
      ● Written in Java
      ● Applies business logic when fetching from the DB
      ● Fetches from multiple DBs/sources
      ● Uses the Java ES library
      ● Easy interface
      ● Send {"trip":1234567} and the server answers {"OK"}
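The shape of such an asynchronous indexer can be sketched as a queue plus a background worker. This is only an illustration in Python, not the Java service from the talk: `fetch_trip`, the `AsyncIndexer` class and the in-memory `index` dict are hypothetical stand-ins for the SQL fetch, the indexer and the ElasticSearch client.

```python
import json
import queue
import threading


def fetch_trip(trip_id):
    """Hypothetical stand-in for the SQL queries and business logic."""
    return {"id": trip_id, "is_user_blocked": False}


class AsyncIndexer:
    """Accepts messages immediately and indexes them in the background."""

    def __init__(self, fetch, store):
        self._fetch = fetch
        self._store = store  # stand-in for the search index (assumption)
        self._jobs = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, message):
        """Parse a message such as '{"trip": 1234567}' and queue the work."""
        trip_id = json.loads(message)["trip"]
        self._jobs.put(trip_id)
        return "OK"  # the caller is answered before indexing happens

    def _run(self):
        while True:
            trip_id = self._jobs.get()
            try:
                self._store[trip_id] = self._fetch(trip_id)
            finally:
                self._jobs.task_done()

    def wait(self):
        """Block until every queued job has been processed."""
        self._jobs.join()


index = {}
indexer = AsyncIndexer(fetch_trip, index)
reply = indexer.submit('{"trip": 1234567}')
indexer.wait()
print(reply, index[1234567])
```

Because the caller only sends an identifier, the business logic that builds the document stays in one place, which is the advantage the slide claims over a river.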
  • 21. 21/37 One index sample: Trip
  • 22. 22/37 Defining our Trip object well
    Think of all the needed fields
      ● The ones used for queries
          ● Trip date of departure, from where, to where, user id
      ● The ones used for filters
          ● User ratings, price, vehicle, seats left, is user blocked (a blocked user is a user who performed a forbidden action on the website)
      ● The ones used for facets
          ● User ratings, price, vehicle
  • 23. 23/37 Defining our Trip index well
    Think of all the system requirements
      ● The cluster has 2 nodes
      ● We keep the default configuration for shards/replicas
    Think of the object mapping
      ● For each field:
          ● Define the type (string, long, geo_point, date, float, boolean)
          ● Define the scope (include_in_all)
          ● Define the analyzer (for type string)
  • 24. 24/37 Trip Mapping
    "trip": {
      "properties": {
        "is_user_blocked": {"type": "boolean", "include_in_all": false},
        "user_ratings": {"type": "long", "include_in_all": false},
        "from": {"type": "geo_point", "include_in_all": false},
        "price": {"include_in_all": false, "type": "float"},
        "price_euro": {"type": "float", "include_in_all": false},
        "seats_left": {"include_in_all": false, "type": "long"},
        "seats_offered": {"include_in_all": false, "type": "long"},
        "to": {"include_in_all": false, "type": "geo_point"},
        "trip_date": {"format": "dateOptionalTime", "include_in_all": false, "type": "date"},
        "vehicle": {"include_in_all": false, "type": "string"},
        "userid": {"include_in_all": false, "index": "not_analyzed", "type": "string"}
      }
    }
  • 25. 25/37 Indexing events well
    Which modifications send a change event
      ● All trip creations/deletions/modifications
      ● Member modifications (blocked or not)
      ● New ratings from other members
      ● A seat has been reserved
      ● A member changes their vehicle
    A change event is a call to the internal indexer
      ● Send {"trip":123456} to the indexer (create/update)
      ● Send {"tripd":123456} to the indexer (delete)
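The two message shapes map naturally onto a small dispatch table. The sketch below shows one way the receiving side could decode them; only the `trip` and `tripd` keys come from the slide, while the function name and the action labels are assumptions made for illustration.

```python
import json

# Message key -> action. The keys are from the slide; the labels are assumptions.
ACTIONS = {"trip": "index", "tripd": "delete"}


def parse_event(message):
    """Turn '{"trip": 123456}' into ("index", 123456), and so on."""
    payload = json.loads(message)
    if not isinstance(payload, dict) or len(payload) != 1:
        raise ValueError("expected a JSON object with exactly one key")
    key, doc_id = next(iter(payload.items()))
    if key not in ACTIONS:
        raise ValueError(f"unknown event type: {key!r}")
    return ACTIONS[key], int(doc_id)


print(parse_event('{"trip": 123456}'))
print(parse_event('{"tripd": 123456}'))
```

Rejecting unknown keys early keeps a typo in a producer from silently turning into a missed update.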
  • 26. 26/37 Sample trip index query
    {
      "query": {
        "filtered": {
          "query": {"match_all": {}},
          "filter": {
            "and": [
              {"geo_distance": {"distance": "40.14937866995km", "from": {"lat": 48.856614, "lon": 2.3522219}}},
              {"geo_distance": {"distance": "40.14937866995km", "to": {"lat": 45.764043, "lon": 4.835659}}},
              {"range": {"price": {"from": 0, "include_lower": false}}}
            ]
          }
        }
      },
      "sort": [{"trip_date": {"order": "asc"}}],
      "filter": {"term": {"is_user_blocked": false}},
      "from": 0,
      "size": 10
    }
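An application would normally build this body from the search form rather than hard-code it. Here is a minimal sketch of such a builder, reusing the query DSL exactly as it appears on the slide (later ElasticSearch releases changed this syntax); the function name, its parameters and the paging arithmetic are assumptions.

```python
import json


def build_trip_search(origin, destination, radius_km, page=0, page_size=10):
    """Build the search body for trips between two (lat, lon) points."""

    def near(field, point):
        lat, lon = point
        return {
            "geo_distance": {
                "distance": f"{radius_km}km",
                field: {"lat": lat, "lon": lon},
            }
        }

    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "and": [
                        near("from", origin),
                        near("to", destination),
                        # price strictly greater than 0
                        {"range": {"price": {"from": 0, "include_lower": False}}},
                    ]
                },
            }
        },
        "sort": [{"trip_date": {"order": "asc"}}],
        # top-level filter, as on the slide: hides trips of blocked users
        "filter": {"term": {"is_user_blocked": False}},
        "from": page * page_size,
        "size": page_size,
    }


# Coordinates taken from the sample query; the 40 km radius is illustrative.
body = build_trip_search((48.856614, 2.3522219), (45.764043, 4.835659), 40)
print(json.dumps(body, indent=2))
```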
  • 27. 27/37 The Real World
    A trip now has more than 30 fields
      ● (FAQ is around 25 fields)
      ● (members even more...)
    To build a trip document we need 3 different SQL queries
      ● (FAQ: 2 different SQL queries)
      ● (Member: 10 different SQL queries)
    The Trip index has only 1 shard (grouping)
  • 28. 28/37 And now the caveats
  • 29. 29/37 Preloaded Scripts
    We use MVEL scripts to improve scoring
      ● They are not clustered
      ● Each node needs to have the scripts
      ● A node restart is needed to add or modify them
    Solution: Chef (tool from Opscode)
    All node configurations are centralized in a Chef repository
  • 30. 30/37 Grouping documents
    Home-made patches to ElasticSearch (based on work by Martijn van Groningen for lusini.de)
    Soon in ElasticSearch (I hope so much)
  • 31. 31/37 Mapping modification
    On a running index:
      Changing a type is not allowed
      Changing an analyzer is not allowed
    Solution: index alias
      1) Changing the mapping → create a new index
      2) When the new index is up to date → switch the alias
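Step 2 can be done in a single request to the `_aliases` endpoint, so searches never see a moment without an index behind the alias. A minimal sketch of the request body follows; the alias and index names are hypothetical.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Body for POST /_aliases that moves `alias` from one index to another.

    Both actions travel in the same request, so the alias is
    repointed in one step.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names: the application always queries "trips".
swap = build_alias_swap("trips", "trips_v1", "trips_v2")
print(json.dumps(swap, indent=2))
```

The application only ever addresses the alias, so a mapping change needs no code change or redeploy on the search side.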
  • 32. 32/37 IO limits
    We have only 2 nodes
      ● The Trip index is around 2 GB
      ● But only 1 shard for the Trip index
      ● Can index 100 trips per second on a busy evening
    Solution: we installed Intel SSDs (while waiting for the distributed grouping feature)
  • 33. 33/37 Choosing the analyzer
    Some fields must not be analyzed
      ● If you use ISO codes for countries (IT for Italy or DE for Germany are ignored in some cases)
    A global analyzer has limits
      ● Accented characters from countries like France, Germany or Spain are not always parsed correctly
      ● One analyzer per country is difficult to implement in some cases
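One likely reason country codes vanish is that an analyzer lowercases the text and drops stop words, and "it" or "de" are common words in some languages. The toy below imitates that behaviour to show the effect; it is not ElasticSearch's analyzer, and the stop-word list is a small hypothetical sample. The mapping fragment at the end uses the same `not_analyzed` option as the `userid` field of the Trip mapping, applied to a hypothetical `country` field.

```python
# Hypothetical, deliberately tiny stop-word list (assumption).
STOP_WORDS = {"it", "de", "the", "a"}


def analyze(text):
    """Toy analyzer: lowercase, split on whitespace, drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]


def keep_as_is(text):
    """Toy equivalent of a not_analyzed field: one token, untouched."""
    return [text]


# Mapping fragment for a hypothetical "country" field.
country_mapping = {
    "country": {"type": "string", "index": "not_analyzed", "include_in_all": False}
}

for code in ("IT", "DE", "FR"):
    print(code, analyze(code), keep_as_is(code))
```

With the toy analyzer, "IT" and "DE" produce no tokens at all, so a search on those codes has nothing to match, while the untouched variant keeps every code searchable as an exact term.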
  • 34. 34/37 OK, sweet. What's next?
  • 35. 35/37 Using ElasticSearch to ease log analysis
  • 36. 36/37 By the way… We're hiring!!!
    Dev, HTML Ninja, leader, …
    Come and see me right now… or send me your friends
    (And we have beer, table football and an arcade cabinet)
  • 37. 37/37 Thank you!
    Follow us! @covoiturage
    Apply now: join@BlaBlaCar.com