A search engine in a world of events and microservices - SF Pot @Meetic

19 million searches per day @ Meetic
A search engine in a world of events and microservices

Me and myself
@seb_legall
Tech Lead back-end @ Meetic

• Started in 2001
• Active in 15 countries
• Dating leader in Europe
• Millions of Monthly Active Users
• 150 people in IT teams

Meetic Before
Webservices
Backoffices
Mobile Web
WAP
Desktop
Cronjobs
…

Meetic Now
Exposition Layer
Event bus
Consumers
Micro-services
BackOffices

John and Ygritte, a game of microservices
Code is coming.

Once upon a time in a far far away galaxy
country…
John wants to meet someone
Ygritte wants to meet someone

Ygritte knows what kind of John she wants
Ygritte knows what she wants for herself.
She is looking for a John who matches her criteria.
$ curl -XGET 'https://api.meetic.fr/search?height=188&eyes=brown&region=north&country=westeros'

John knows nothing
John is looking for love, wherever it comes from.
(Except from Casterly Rock)
$ curl -XGET https://api.meetic.fr/search/1234/shuffle
(1234 is John’s ID)

Summary
• Let’s introduce the search microservice
• An overview of the code architecture in a microservice
• Indexing data
• What happens when Ygritte updates her profile?
• John signs up on Meetic. How is his profile indexed?
• Searching people
• Overview of the advanced search feature used by Ygritte
• How does Meetic suggest profiles that may interest John?

Introducing the search microservice
It says “search”. Not necessarily “find”.

The search microservice
The search microservice has one responsibility and only one :
Searching Meetic users
Some Meetic features :
• Advanced Search
• Shuffle (Tinder like)
• Online profiles
• Similar Profiles
• Etc.

The search microservice
In order to do so, the search microservice should :
• Be responsible for the way data is stored
• Be aware of any data update that falls within its scope
• Return a list of profile IDs when called

The search microservice
Event bus
Consumer
Search
Exposition Layer

About code structure in a microservice @ Meetic
Don’t blame your messy code, blame the design pattern.
(or the architect, or the tech lead, or the product owner)

The hexagonal workflow
[Diagram: a GET request reaches the Handler (Application layer). The Handler calls the Repository (Domain layer). The DAO (Infrastructure layer) implements the Repository and populates the Domain Object.]

Hexagonal design pattern
Domain management
Infrastructure implementation
(repositories)
Request management

Hexagonal design pattern
Define the “profile” domain object
Implement the repository using Guzzle
Handle the “search” request
Define how profiles should be retrieved
Implement the repository using Doctrine
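
The three layers above can be sketched in a few lines. This is a minimal illustration in Python rather than Meetic's actual Symfony code: the class names (`Profile`, `ProfileRepository`, `InMemoryProfileDao`, `SearchHandler`) and the in-memory rows are assumptions made for the example, and the in-memory DAO stands in for a Guzzle or Doctrine implementation.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


# Domain layer: the object and the contract, with no storage details.
@dataclass(frozen=True)
class Profile:
    id: int
    eyes: str


class ProfileRepository(ABC):
    @abstractmethod
    def find_by_eyes(self, eyes: str) -> list[Profile]:
        """Return the profiles whose eye colour matches."""


# Infrastructure layer: one concrete way of fetching profiles.
# (Hypothetical in-memory stand-in for an HTTP or database DAO.)
class InMemoryProfileDao(ProfileRepository):
    def __init__(self, rows: list[dict]):
        self._rows = rows

    def find_by_eyes(self, eyes: str) -> list[Profile]:
        # The DAO populates domain objects from raw rows.
        return [Profile(r["id"], r["eyes"]) for r in self._rows if r["eyes"] == eyes]


# Application layer: handles the request by calling the domain interface only.
class SearchHandler:
    def __init__(self, repository: ProfileRepository):
        self._repository = repository

    def handle(self, params: dict) -> dict:
        profiles = self._repository.find_by_eyes(params["eyes"])
        return {"memberId": [p.id for p in profiles]}


dao = InMemoryProfileDao([{"id": 1234, "eyes": "brown"}, {"id": 5678, "eyes": "blue"}])
handler = SearchHandler(dao)
response = handler.handle({"eyes": "brown"})
```

Because the handler depends only on the repository interface, swapping one DAO for another does not touch the application or domain code.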

Ygritte
…already has an account on Meetic

Ygritte updates her Meetic Profile
Components: Event bus, Consumer, Picture, Exposition Layer
PUT
{
  "id": 6789,
  "picture": "me.jpg"
}
PUT
{
  "id": 6789
}
UPSERT
{
  "id": 6789,
  "has_photo": true
}
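
The last step of this flow, where the consumer turns a picture event into an upsert on the search document, might look roughly like the sketch below. It is written in Python for illustration only; `InMemoryIndex` and `on_picture_updated` are hypothetical names, and the dictionary stands in for the real search index.

```python
class InMemoryIndex:
    """Hypothetical stand-in for the search index, keyed by member ID."""

    def __init__(self):
        self.documents = {}

    def upsert(self, doc_id: int, partial: dict) -> None:
        # Create the document if it is missing, otherwise merge the new fields in.
        self.documents.setdefault(doc_id, {"id": doc_id}).update(partial)


def on_picture_updated(event: dict, index: InMemoryIndex) -> None:
    # The event carries only the member ID; the search index stores
    # a derived flag rather than the picture itself.
    index.upsert(event["id"], {"has_photo": True})


index = InMemoryIndex()
index.upsert(6789, {"eyes": "brown"})      # document already indexed
on_picture_updated({"id": 6789}, index)    # event consumed from the bus
```

An upsert keeps the consumer idempotent: replaying the same event leaves the document unchanged.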

John
…has just discovered Meetic. He wants to sign up.

John signs up on Meetic
(What should be done)
Components: Event bus, Consumer, Profile, Picture, Exposition Layer
POST
{
  "id": 1234,
  "birthday": "1989-01-12",
  "picture": "me.jpg"
}
POST
{
  "id": 1234,
  "birthday": "1989-01-12"
}
POST
{
  "id": 6789,
  "birthday": "1989-01-12",
  "has_photo": true,
  "paymentStatus": "free",
  …
}
GET /1234/pictures
GET /payment-status/1234
GET …
POST
{
  "id": 1234
}

Theory vs reality
Calling microservices
+ If any database changes, this workflow stays unchanged
+ Avoids duplicated business logic
- Doesn’t scale very well because of the number of HTTP calls needed
- Takes a lot of time to implement

John signs up on Meetic
(What is really done)
Components: Event bus, Consumer, Profile, Picture, Search, Exposition Layer
POST
{
  "id": 1234,
  "birthday": "1989-01-12"
}
POST /reload/6789
SELECT * FROM PROFILE
LEFT JOIN PAYMENT
WHERE ID = 1234
POST
{
  "id": 1234,
  "birthday": "1989-01-12",
  "eye": "brown",
  "paymentStatus": "free",
  …
}
POST
{
  "id": 1234,
  "birthday": "1989-01-12",
  "picture": "me.jpg"
}
POST
{
  "id": 1234
}

Theory vs reality
Querying the database
+ The search microservice stays responsible for its data
+ Allows batch processing
- Works only because databases are not yet split
- Changes to the database have to be replicated in the search microservice

Managing big (SQL) queries with Symfony
… using the compiler pass

What does the reload query look like?
SELECT
ID,
BIRTHDAY,
…,
(SELECT * FROM …),
FROM MEETIC.MEMBER
INNER JOIN …
LEFT JOIN…
INNER JOIN…
LEFT JOIN…
LEFT JOIN…
INNER JOIN…
WHERE (
SELECT T FROM MEETIC.T WHERE …
)
AND …
OR…

Keeping the reload query maintainable
Since we chose to get data directly from the database when creating a
new document in the index, the SQL query is huge and complex.
• We need it to be easily shared, reviewed and updated by DBAs
• We need to keep it isolated so changes in the DB schema can be reflected in a single file
• We want to be able to just copy-paste it and check if it works.

Managing big (SQL) queries with Symfony
Step #1: The application handles the request by calling the domain interface.

Step #2: The domain describes how objects should be manipulated.

Step #3: The infrastructure actually manipulates the data.

Injecting resources at compile time
Service declaration :

Injecting resources at compile time
We use a Symfony CompilerPass to inject a file’s content as a string into the DAO.
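
The idea of keeping the SQL in its own file and handing its content to the DAO as a plain string can be sketched outside Symfony as well. The Python example below is an analogy under stated assumptions, not the CompilerPass itself: the file is read once at wiring time instead of at container compile time, and `ReloadDao`, `reload.sql`, the table names and the columns are all hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


class ReloadDao:
    """Receives the SQL text as a plain string; it never touches the file system."""

    def __init__(self, connection: sqlite3.Connection, reload_sql: str):
        self._connection = connection
        self._reload_sql = reload_sql

    def load_document(self, member_id: int) -> dict:
        row = self._connection.execute(self._reload_sql, {"id": member_id}).fetchone()
        return {"id": row[0], "birthday": row[1], "paymentStatus": row[2]}


# Hypothetical query file, written here only so the example is self-contained.
# In a real project it would live in the repository, where DBAs can review it.
sql_file = Path(tempfile.mkdtemp()) / "reload.sql"
sql_file.write_text(
    "SELECT p.id, p.birthday, pay.status "
    "FROM profile p LEFT JOIN payment pay ON pay.member_id = p.id "
    "WHERE p.id = :id"
)

# Throwaway in-memory database with a made-up schema.
connection = sqlite3.connect(":memory:")
connection.executescript(
    "CREATE TABLE profile (id INTEGER, birthday TEXT);"
    "CREATE TABLE payment (member_id INTEGER, status TEXT);"
    "INSERT INTO profile VALUES (1234, '1989-01-12');"
    "INSERT INTO payment VALUES (1234, 'free');"
)

# Wiring step: the file is read once and its content is injected into the DAO.
dao = ReloadDao(connection, sql_file.read_text())
document = dao.load_document(1234)
```

Since the query lives in a standalone `.sql` file, it can be copied into a database client and run as-is, which is the property the slides ask for.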

Ygritte
is using the search engine

Ygritte uses the search engine
Components: Exposition Layer, Search
POST
{
  "query": {
    "term": {
      "eyes": "brown"
    }
  }
}
GET /search?eye=brown&hair=brown
GET /search?eye=brown&hair=brown
{
  "memberId": [
    1234,
    456786
  ]
}

Keeping the search logic clear
Templating ElasticSearch queries with twig

What does an ElasticSearch query look like?
Most of the time, the ElasticSearch query contains the larger part of the business logic.
• Very long
• Lots of strange JSON keys
• Lots of parameters
And yet…

Why not generate queries in PHP?
• Managing big PHP arrays through if…else is quite hard
• It becomes harder to understand what the query actually does.

Keeping the business logic clear with twig
Templating the JSON query with twig lets us easily see what the query actually does.

Keeping the business logic clear with twig
Injecting parameters becomes easier.
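
To illustrate the templating idea without a Twig setup, here is a small Python sketch that uses the standard library's `string.Template` as a stand-in for Twig. The template content, the `bool`/`must` shape and the `render_search_query` helper are assumptions for the example; only the `eyes` and `hair` criteria come from the slides.

```python
import json
from string import Template

# The query stays readable as JSON; only the parameters are placeholders.
QUERY_TEMPLATE = Template("""
{
  "query": {
    "bool": {
      "must": [
        {"term": {"eyes": $eyes}},
        {"term": {"hair": $hair}}
      ]
    }
  }
}
""")


def render_search_query(eyes: str, hair: str) -> dict:
    # json.dumps quotes and escapes each value so the rendered text stays valid JSON.
    rendered = QUERY_TEMPLATE.substitute(eyes=json.dumps(eyes), hair=json.dumps(hair))
    # Parsing the result catches a broken template early.
    return json.loads(rendered)


query = render_search_query("brown", "brown")
```

The template reads like the query that is eventually sent, which is harder to achieve when the same structure is assembled from nested arrays and conditionals.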

John
is using the shuffle feature

John uses the shuffle
Components: Exposition Layer, Search
POST
{
  "query": {
    "bool": {
      "must": {
        "term": {
          "eyes": "brown"
        }
      },
      "must_not": {
        "terms": {
          "id": [876, 4567]
        }
      }
    }
  }
}
GET /search/1234/shuffle
GET /search/1234/shuffle
{
  "memberId": [
    5678,
    09876
  ]
}
GET /interaction/1234/black-list
GET /referential/1234/search

Building ElasticSearch query from multiple sources
…using Guzzle promises

Building query from multiple sources
[Diagram: the same hexagonal workflow as before (Request, Handler, Repository, Domain Object across the Application, Domain and Infrastructure layers), with an Enricher added after the Handler and three DAOs instead of one.]

Optimizing response time with Guzzle promises
HTTP calls take time.
Guzzle promises let us use
parallelism in order to save
precious milliseconds.
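
The same idea, starting several calls at once and waiting for all of them, can be sketched with Python's `asyncio` in place of Guzzle promises. The two fetch functions below simulate the black-list and referential calls from the shuffle slide with a short sleep; their return payloads are assumptions, as is the `build_shuffle_query` helper.

```python
import asyncio


async def fetch_black_list(member_id: int) -> list[int]:
    # Stand-in for GET /interaction/{id}/black-list (payload shape assumed).
    await asyncio.sleep(0.05)
    return [876, 4567]


async def fetch_search_criteria(member_id: int) -> dict:
    # Stand-in for GET /referential/{id}/search (payload shape assumed).
    await asyncio.sleep(0.05)
    return {"eyes": "brown"}


async def build_shuffle_query(member_id: int) -> dict:
    # Both calls start immediately; we wait roughly for the slowest one
    # instead of for the sum of both.
    black_list, criteria = await asyncio.gather(
        fetch_black_list(member_id),
        fetch_search_criteria(member_id),
    )
    return {
        "query": {
            "bool": {
                "must": [{"term": {k: v}} for k, v in criteria.items()],
                "must_not": {"terms": {"id": black_list}},
            }
        }
    }


query = asyncio.run(build_shuffle_query(1234))
```

With sequential calls the waiting time grows with every extra source; with concurrent calls it stays close to the slowest single call.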

Optimizing with Guzzle promises
[Figure: two charts of calls 1 to 6 plotted against time. Axes: Calls, Time.]

Conclusion
Is that all theoretical or is it actually working in production?

Search microservice in production
• 19 million hits per day
• ~ 10 servers on 2 DC needed to be “Disaster Recovery Plan” friendly
• Search route AVG response time : ~ 163 ms
• Shuffle route AVG response time : ~ 336 ms
