An agile system design that benefits search engines, online ad serving, and recommender systems. It makes deploying machine learning models easy and efficient, and it completes the ecosystem of run-time serving, user feedback, and model training.
3. The Situations
Business: promotions, black lists, special rules
Product: UX changes, frequent releases, low latency, A/B tests
Engineering: reuse of code modules, isolation of modules, configuration driven
Service/Operation: live updates and no-downtime deployments, monitoring, fewer machines
6. The Middleware Serving Container
Let the WD focus on UX
Agile software development and deployment
Provide horizontal capabilities: execution model, communication mechanisms, data marshalling
Engineers focus on application logic
Single request parallel execution ability
A production playground for Research/Science
8. Search Engine
[Diagram: Boost.Asio (HTTP 1.1); Search Handler and Admin Handler; processors XML Formatter, Spell Check, Query Parser, Inverted Index; two thread pools]
9. Execution Model
Processor: logical unit of processing (module)
Workflow: directed acyclic graph stitched with processors
[Diagram: User Profile, Model B, Model A, Inverted Index, Cache]
10. The Workflow
[Diagram: workflows built from Cache, User Profile, Model A, Model B, and Inverted Index; control-flow labels START, BRANCH (Known User / Unknown User), Cache Hit, Cache Miss, END]
11. Control flow vs. U shape
Control flow:
1. Different types of processors: BRANCH, FORK, JOIN, etc.
2. Hard to describe in configuration
3. Early exit makes the workflow complicated
4. Code path might be complicated
U shape:
1. Only one kind of processor
2. Configuration is simple: a chain of processors
3. Easy to exit early
4. Fixed/limited code path: easy to test and debug
5. Natural for cache layers
6. Keeps application logic together
7. Easy to split into different containers
[Diagram: User Profile, Model B, Model A, Inverted Index, Cache]
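The U-shape points about early exit and cache layers can be sketched as follows. This is a minimal illustration, not the deck's implementation: the Query and Result aliases, the Execution class, the serve helper, and the two example processors are all assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal types; the slides do not define Query or Result.
using Query = std::string;
using Result = std::string;

class Execution;

// Only one kind of processor: everything implements Match.
class Processor {
public:
    virtual ~Processor() = default;
    virtual Result Match(const Query& query, Execution& execution) = 0;
};

// One Execution per request; match() hands the query to the next processor.
class Execution {
public:
    explicit Execution(std::vector<Processor*> chain) : chain_(std::move(chain)) {}
    Result match(const Query& query) {
        if (next_ == chain_.size()) return Result();  // bottom of the U
        Processor* p = chain_[next_++];
        return p->Match(query, *this);
    }
private:
    std::vector<Processor*> chain_;
    std::size_t next_ = 0;
};

// A cache layer: look up on the way down, fill on the way up.
class CacheProcessor : public Processor {
public:
    Result Match(const Query& query, Execution& execution) override {
        auto hit = cache_.find(query);
        if (hit != cache_.end()) return hit->second;  // early exit on a hit
        Result result = execution.match(query);       // downward part
        cache_[query] = result;                       // upward part
        return result;
    }
private:
    std::map<Query, Result> cache_;
};

// Stand-in for the expensive processor at the end of the chain.
class InvertedIndexProcessor : public Processor {
public:
    int calls = 0;
    Result Match(const Query& query, Execution&) override {
        ++calls;
        return "docs-for:" + query;
    }
};

// Runs one request through a fresh Execution over the configured chain.
Result serve(const std::vector<Processor*>& chain, const Query& query) {
    Execution execution(chain);
    return execution.match(query);
}
```

With the chain Cache, Inverted Index, a repeated query returns from CacheProcessor without touching the index. The early exit is a plain return, so no BRANCH or JOIN processor is needed.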
12. The Processor
All processors implement the virtual function “Match”
The container calls the “Match” function of each processor along the workflow
Built as a shared object (dynamically linked library)
The container opens and loads the processor from a .so file
(Java: an OSGi bundle as a jar file)
Supports live updates
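Loading a processor from a .so file might look roughly like the sketch below, which uses the POSIX dlopen/dlsym calls. The exported factory name create_processor, the CreateFn signature, and the LoadedProcessor struct are hypothetical; the slides do not say how the container locates the processor inside the library.

```cpp
#include <dlfcn.h>  // POSIX dlopen, dlsym, dlerror, dlclose
#include <cassert>
#include <string>

class Processor;  // opaque here; defined by the container in a real system

// Hypothetical factory signature that a processor library would export
// with C linkage, e.g. extern "C" Processor* create_processor();
using CreateFn = Processor* (*)();

struct LoadedProcessor {
    void* handle = nullptr;     // library handle returned by dlopen
    CreateFn create = nullptr;  // factory resolved with dlsym
    std::string error;          // non-empty when loading failed
    bool ok() const { return handle != nullptr && create != nullptr; }
};

LoadedProcessor load_processor(const std::string& path) {
    LoadedProcessor lp;
    lp.handle = dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL);
    if (lp.handle == nullptr) {
        const char* msg = dlerror();
        lp.error = msg ? msg : "dlopen failed";
        return lp;
    }
    dlerror();  // clear any stale error before calling dlsym
    void* sym = dlsym(lp.handle, "create_processor");  // assumed symbol name
    const char* msg = dlerror();
    if (msg != nullptr || sym == nullptr) {
        lp.error = msg ? msg : "create_processor not found";
        dlclose(lp.handle);
        lp.handle = nullptr;
        return lp;
    }
    lp.create = reinterpret_cast<CreateFn>(sym);
    return lp;
}
```

A live update would then amount to loading the new .so, creating the new processor, swapping it into the workflow, and calling dlclose on the old handle once no request is still using it. On older glibc versions the program has to be linked with -ldl.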
13. Execution interface
Result Match(Query query, Execution execution) {
    // could do something with query
    // downward part in the U shape
    Result result = execution.match(query);
    // could do something with result
    // upward part in the U shape
    return result;
}
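To make the downward and upward halves concrete, here is a small compilable sketch that records the order in which processors are entered and resumed. The Query and Result structs, the Execution class, TracingProcessor, and trace_request are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal types; the slides do not define Query or Result.
struct Query { std::string text; };
struct Result { std::string payload; };

class Execution;

class Processor {
public:
    virtual ~Processor() = default;
    virtual Result Match(const Query& query, Execution& execution) = 0;
};

// One Execution per request: it remembers how far down the chain we are.
class Execution {
public:
    explicit Execution(std::vector<Processor*> chain) : chain_(std::move(chain)) {}
    Result match(const Query& query) {
        if (next_ == chain_.size()) return Result{};  // bottom of the U
        Processor* p = chain_[next_++];
        return p->Match(query, *this);
    }
private:
    std::vector<Processor*> chain_;
    std::size_t next_ = 0;
};

// Logs when it is entered (downward) and when it resumes (upward).
class TracingProcessor : public Processor {
public:
    TracingProcessor(std::string name, std::vector<std::string>* log)
        : name_(std::move(name)), log_(log) {}
    Result Match(const Query& query, Execution& execution) override {
        log_->push_back("down:" + name_);        // downward part
        Result result = execution.match(query);  // rest of the chain
        log_->push_back("up:" + name_);          // upward part
        return result;
    }
private:
    std::string name_;
    std::vector<std::string>* log_;
};

// Sends one query through a three-processor chain and returns the trace.
std::vector<std::string> trace_request() {
    std::vector<std::string> log;
    TracingProcessor profile("UserProfile", &log);
    TracingProcessor model("ModelA", &log);
    TracingProcessor index("InvertedIndex", &log);
    std::vector<Processor*> chain{&profile, &model, &index};
    Execution execution(chain);
    execution.match(Query{"shoes"});
    return log;
}
```

The trace runs down:UserProfile, down:ModelA, down:InvertedIndex and then back up in reverse order, so each processor's pre- and post-processing stay together in one function.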
14. Ads Serving Engine
[Diagram: Boost.Asio (HTTP 1.1); Search Handler and Admin Handler; processors XML Formatter, User Profile, Query Parser, Inverted Index; Client Library; two thread pools]
15. Change for Asynchronous Calls
void Match(Query query, Result result, Execution execution) {
    // do something
    // downward part in the U shape
    execution.match(query, result, execution);
}
void Deliver(Query query, Result result, Execution execution) {
    // do something
    // upward part in the U shape
    execution.deliver(query, result, execution);
}
[Diagram: User Profile]
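The split into Match and Deliver lets a processor return immediately and continue the chain later, when a remote reply arrives. The sketch below simulates that with a plain task queue standing in for the I/O layer. All types, the queue, and the drain helper are assumptions, and the redundant execution argument from the slide's calls is dropped for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical request/response types for the sketch.
struct Query { std::string user; std::string profile; };
struct Result { std::vector<std::string> log; bool done = false; };
using Task = std::function<void()>;  // stand-in for an I/O completion

class Execution;

class Processor {
public:
    virtual ~Processor() = default;
    virtual void Match(Query& query, Result& result, Execution& execution) = 0;
    virtual void Deliver(Query& query, Result& result, Execution& execution) = 0;
};

class Execution {
public:
    explicit Execution(std::vector<Processor*> chain) : chain_(std::move(chain)) {}
    void match(Query& query, Result& result) {    // continue downward
        if (pos_ < chain_.size()) {
            Processor* p = chain_[pos_++];
            p->Match(query, result, *this);
        } else {
            deliver(query, result);               // bottom of the U: turn around
        }
    }
    void deliver(Query& query, Result& result) {  // continue upward
        if (pos_ > 0) {
            Processor* p = chain_[--pos_];
            p->Deliver(query, result, *this);
        } else {
            result.done = true;                   // back at the top
        }
    }
private:
    std::vector<Processor*> chain_;
    std::size_t pos_ = 0;
};

// Issues a simulated remote call and resumes the chain when it completes.
class UserProfileProcessor : public Processor {
public:
    explicit UserProfileProcessor(std::queue<Task>* io) : io_(io) {}
    void Match(Query& query, Result& result, Execution& execution) override {
        result.log.push_back("match: UserProfile");
        io_->push([&query, &result, &execution] {  // runs when the reply arrives
            query.profile = "profile-of-" + query.user;
            execution.match(query, result);
        });
    }
    void Deliver(Query& query, Result& result, Execution& execution) override {
        result.log.push_back("deliver: UserProfile");
        execution.deliver(query, result);
    }
private:
    std::queue<Task>* io_;
};

class InvertedIndexProcessor : public Processor {
public:
    void Match(Query& query, Result& result, Execution& execution) override {
        result.log.push_back("match: InvertedIndex " + query.profile);
        execution.match(query, result);
    }
    void Deliver(Query& query, Result& result, Execution& execution) override {
        result.log.push_back("deliver: InvertedIndex");
        execution.deliver(query, result);
    }
};

// Runs pending completions, as an I/O thread would.
void drain(std::queue<Task>& io) {
    while (!io.empty()) {
        Task task = std::move(io.front());
        io.pop();
        task();
    }
}
```

Calling execution.match on the chain User Profile, Inverted Index returns as soon as the profile request is queued. The rest of the chain runs only once drain executes the completion, so no thread sits blocked while waiting.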
16. Thread pools
Separate I/O threads and worker threads into two different pools
Asynchronous calls make sense only when there would otherwise be waiting/idling
For example: calling out-of-box (remote) services
Keeping a thread busy without switching tasks is more efficient
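One way to picture the two pools is the sketch below: a small fixed-size pool class, instantiated once for I/O and once for workers, with a request hopping from the I/O pool to the worker pool and back. The ThreadPool class and handle_request are illustrative assumptions built on the C++ standard library, not the deck's Boost.Asio setup.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Minimal fixed-size pool: threads pull tasks from a shared queue.
class ThreadPool {
public:
    explicit ThreadPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            threads_.emplace_back([this] { run(); });
    }
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : threads_) t.join();
    }
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }
private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mu_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and queue drained
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock
        }
    }
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};

// I/O thread accepts the request, a worker does the CPU-bound part,
// and an I/O thread sends the reply. The doubling is placeholder "work".
int handle_request(ThreadPool& io_pool, ThreadPool& worker_pool, int request) {
    auto reply = std::make_shared<std::promise<int>>();
    std::future<int> done = reply->get_future();
    io_pool.post([&io_pool, &worker_pool, reply, request] {
        worker_pool.post([&io_pool, reply, request] {
            int score = request * 2;  // CPU-bound step on a worker thread
            io_pool.post([reply, score] { reply->set_value(score); });
        });
    });
    return done.get();  // the caller blocks here only for the demo
}
```

Worker threads never wait on sockets and I/O threads never run model code, which is the separation the slide argues for. Depending on the toolchain, the program may need to be built with -pthread.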
17. Administration & Operation Interface
Two virtual functions of the Processor
Get_status: shows the processor-specific status
Exec_cmd: executes a specific task inside the processor
No-downtime application deployment
Update configuration without code change
Deploy code changes from another shared object file
Visualized configuration
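The two administration hooks could be sketched as below, using a black-list processor as the example. The exact signatures of Get_status and Exec_cmd are not given in the slides, so the string-based forms here, the BlackListProcessor class, and the admin routing helper are assumptions.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Administration side of the processor interface (signatures assumed).
class Processor {
public:
    virtual ~Processor() = default;
    virtual std::string Get_status() const = 0;
    virtual std::string Exec_cmd(const std::string& cmd, const std::string& arg) = 0;
};

// Example processor whose black list can be changed while serving.
class BlackListProcessor : public Processor {
public:
    std::string Get_status() const override {
        return "blacklist entries=" + std::to_string(blocked_.size());
    }
    std::string Exec_cmd(const std::string& cmd, const std::string& arg) override {
        if (cmd == "add") { blocked_.insert(arg); return "ok"; }
        if (cmd == "remove") { return blocked_.erase(arg) > 0 ? "ok" : "not found"; }
        if (cmd == "clear") { blocked_.clear(); return "ok"; }
        return "unknown command: " + cmd;
    }
private:
    std::set<std::string> blocked_;
};

// What an admin handler might do: route a command to a named processor.
std::string admin(const std::map<std::string, Processor*>& processors,
                  const std::string& name, const std::string& cmd,
                  const std::string& arg) {
    auto it = processors.find(name);
    if (it == processors.end()) return "no such processor: " + name;
    if (cmd == "status") return it->second->Get_status();
    return it->second->Exec_cmd(cmd, arg);
}
```

An operator could then add a black-list entry or read a status line through the admin handler without restarting the container.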
19. Replay & Offline Simulation
Some people do:
Have another set of code to simulate
Some other people do:
Have another setup identical to the production system
Prepare the log and copy it over to simulation clients
Have multiple clients send requests and save results
Copy the results back to the research platform
Configure to use the standard I/O interface
Use Hadoop Streaming to simulate over hundreds of machines
Must-have for efficient research
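Switching the container to standard I/O means the same workflow code can act as a Hadoop Streaming mapper, which reads lines on stdin and writes tab-separated key/value lines on stdout. A minimal sketch follows, where run_workflow is a hypothetical stand-in for the production workflow and the one-query-per-line log format is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for running the production workflow on one query.
std::string run_workflow(const std::string& query) {
    return "result-for:" + query;
}

// Replays a query log: one query per input line,
// one "query<TAB>result" line per output line. Returns the replay count.
std::size_t replay(std::istream& in, std::ostream& out) {
    std::string line;
    std::size_t replayed = 0;
    while (std::getline(in, line)) {
        if (line.empty()) continue;  // skip blank log lines
        out << line << '\t' << run_workflow(line) << '\n';
        ++replayed;
    }
    return replayed;
}
```

A main function that calls replay(std::cin, std::cout) could be passed to Hadoop Streaming as the mapper, so the logged traffic is replayed in parallel without a second simulation code base.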
20. Recommender Engine
[Diagram: Boost.Asio (HTTP 1.1); Search Handler and Admin Handler; processors User Profile, Model A, Model B, Inverted Index; Client Library; Redis Adapter; two thread pools]