GNES is Generic Neural Elastic Search (OSSEU19 Lyon)

GNES: Generic Neural Elastic Search
Linux Foundation OSS EU, Lyon, France. Oct. 29 LF AI Summit
Han Xiao @

Agenda
- What is GNES?
- Breakdown: neural, elastic and search
- GNES principles and tech highlights
- Demo: GNES on Docker Swarm
- Demo: GNES flow API
- Scalability benchmark
- Summary

hxiao87 hanxiao @hxiao
GNES is Generic Neural Elastic Search
GNES [jee-nes] is a cloud-native semantic search system based on deep neural networks.
GNES enables large-scale index and semantic search for text-to-text, image-to-image,
video-to-video and any-to-any content form.
- Cloud-Native
- Semantic Search based on DNN
- End2End Generic Solution

GNES resources
Version: v0.0.46
Github: https://github.com/gnes-ai/gnes
5 direct contributors from Tencent
2 community contributors
Homepage: https://gnes.ai
Docs: https://doc.gnes.ai
PyPI: https://pypi.org/project/gnes
Docker Hub: https://cloud.docker.com/u/gnes/repository/docker/gnes/gnes
GNES Board: https://board.gnes.ai
Blog: https://hanxiao.github.io
Call for more contributors!

Breakdown of Neural, Search and Elastic

Preliminaries: Neural, Elastic, and Search
Find semantically similar text in a large database
■ Microservices are a software development technique—a variant of the
service-oriented architecture (SOA) architectural style that structures an
application as a collection of loosely coupled services.
■ Microservices - also known as the microservice architecture - is an architectural
style that structures an application as a collection of services that are.
■ Microservices architecture is a term used to describe the practice of breaking up
an application into a series of smaller, more specialised parts, each of which
communicate with one another across common interfaces such as APIs and REST
interfaces like HTTP.

Find semantically similar text in a large database
- How to quantize the semantics? Vector representation of the doc; state-of-the-art NLP model
- Does it work on super-long/short documents as well? Segment long documents into sentences; domain/app-specific preprocessing
- How to define similarity? Distance metrics (L2, Hamming, etc.)
- How to store the semantics? Vector indexing, e.g. Faiss; faster, lighter and distributed database
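
To make the "distance metrics" item concrete, here is a minimal sketch of L2 and Hamming distance in plain Python. The `binarize` helper is a hypothetical toy quantizer added for illustration only; it is not how GNES or Faiss produce binary codes.

```python
import math
from typing import Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two dense vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def binarize(vec: Sequence[float]) -> int:
    """Toy quantizer (assumption): one bit per dimension, set when the value is positive."""
    code = 0
    for value in vec:
        code = (code << 1) | (1 if value > 0 else 0)
    return code
```

L2 works on the raw float vectors, while Hamming works on compact binary codes, which is why binary indexers can be lighter at the cost of precision.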

Find semantically similar text, image, video in a large database
- Does it work on large/small image, long/short video as well?
- Vector representation of the image/video
- Segment image/video into patches
- State-of-the-art CV model
- Faster, lighter and distributed database
- Domain/app-specific preprocessing

Find semantically similar text, image, video in a large database
- Does it work on large/small image, long/short video as well?
- Preprocessor: segment image/video into patches; domain/app-specific preprocessing
- Encoder: vector representation of the doc; state-of-the-art CV model
- Indexer: faster, lighter and distributed database

A good neural search is only possible
when document and query are comparable semantic units.

Minimum & Optimum Semantic Unit

Minimum & Optimum Semantic Unit
Preprocessor

Runtime in GNES
- Typical ML system: Train, Inference
- Typical search system: Index, Query
- GNES: Train, Index, Query
Train-time is not for everyone. Most users will just use our pretrained model from GNES Hub.

Router is required to enable elasticity

Four fundamental microservices
To summarize, we have four fundamental components in GNES:
- Preprocessor: transforming a real-world object into a list of workable semantic units, aka chunks;
- Encoder: representing chunks as vectors;
- Indexer: storing the vectors in memory or on disk for fast access;
- Router: forwarding messages between microservices, e.g. batching, mapping, reducing.
Runtimes: Train, Index, Query
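
The division of labor between the four components can be sketched in a single process as below. This is a toy under stated assumptions: the function and class names are hypothetical, the "encoder" is a hashed bag-of-words rather than a neural network, and none of this is the GNES API.

```python
import math
import re
import zlib
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple

DIM = 64  # toy embedding size (assumption)


def preprocess(doc: str) -> List[str]:
    """Preprocessor: split a document into sentence-level chunks."""
    return [s.strip() for s in re.split(r"[.!?\n]+", doc) if s.strip()]


def encode(chunk: str) -> List[float]:
    """Encoder: hashed bag-of-words vector, L2-normalized (stand-in for a DNN)."""
    vec = [0.0] * DIM
    for word, count in Counter(re.findall(r"\w+", chunk.lower())).items():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class Indexer:
    """Indexer: keeps vectors in memory and answers top-k cosine queries."""

    def __init__(self) -> None:
        self._items: List[Tuple[int, str, List[float]]] = []

    def add(self, doc_id: int, chunk: str, vec: List[float]) -> None:
        self._items.append((doc_id, chunk, vec))

    def query(self, vec: List[float], top_k: int = 3) -> List[Tuple[float, int, str]]:
        scored = [(sum(a * b for a, b in zip(vec, v)), d, c) for d, c, v in self._items]
        return sorted(scored, reverse=True)[:top_k]


def route_batches(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Router: group incoming messages into fixed-size batches."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_index(docs: Dict[int, str], batch_size: int = 2) -> Indexer:
    """Index runtime: preprocess -> route in batches -> encode -> index."""
    indexer = Indexer()
    for doc_id, doc in docs.items():
        for batch in route_batches(preprocess(doc), batch_size):
            for chunk in batch:
                indexer.add(doc_id, chunk, encode(chunk))
    return indexer
```

At query time the same `preprocess` and `encode` steps run on the query, and `Indexer.query` returns the closest chunks, which is what keeps document and query comparable.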

Which (microservice) does what (logic) and when (runtime)
Understanding how GNES works basically means knowing this and designing the corresponding workflow.

GNES Key Tenets and Highlights

Highlight 1:
Cloud-Native and Elastic

Highlights
Cloud-native: GNES is all-in-microservice: encoder, indexer, preprocessor and router all run statelessly and independently in their own containers. Scaling, load balancing and automated recovery come off-the-shelf in GNES.
Monolith: encoder, indexer and preprocessor coupled in one process
GNES microservice architecture: encoder, indexer and preprocessor as separate services

Cloud-Native and Elastic
Index time Query time

GNES Board https://board.gnes.ai

Yaml-ize everything
- Encoder: Word2Vec (18 lines)
- Encoder: BERT (36 lines)
- Indexer: Binary+LevelDB (9 lines)
- Preprocessor: Chinese (7 lines)

With vs. Without Code/Model Separation
Without separation (change lives in code):
1. Change on model
2. Update encoder.py
3. Rebuild the project into a package
4. Take the old version offline
5. Deploy the new package
6. Back online: serve users
With separation (change lives in YAML):
1. Change on model
2. Update YAML
3. Roll out and serve users, staying online
✓ Immutable codebase
✓ Minimum rollout time
✓ Version-controlled model
✓ Easy A/B test, side-by-side comparison
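
A minimal sketch of what code/model separation buys: the codebase ships a fixed registry of encoders, and a config file alone selects and parameterizes one. The encoder names and `load_encoder` are hypothetical, and JSON stands in for YAML here only to keep the example within the standard library.

```python
import json
from typing import Callable, Dict, List

Encoder = Callable[[str], List[float]]

# Registry of encoder factories shipped in the (immutable) codebase.
ENCODERS: Dict[str, Callable[..., Encoder]] = {}


def register(name: str):
    """Decorator that records an encoder factory under a config-visible name."""
    def decorator(factory):
        ENCODERS[name] = factory
        return factory
    return decorator


@register("CharCountEncoder")
def char_count_encoder(scale: float = 1.0) -> Encoder:
    return lambda text: [scale * len(text)]


@register("WordCountEncoder")
def word_count_encoder(scale: float = 1.0) -> Encoder:
    return lambda text: [scale * len(text.split())]


def load_encoder(config_text: str) -> Encoder:
    """Build an encoder purely from configuration; no code change is needed."""
    config = json.loads(config_text)
    return ENCODERS[config["encoder"]](**config.get("parameters", {}))
```

Swapping `{"encoder": "CharCountEncoder"}` for `{"encoder": "WordCountEncoder", "parameters": {"scale": 2.0}}` changes the served model without rebuilding the package, and both configs can be versioned and compared side by side.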

Highlight 3:
Model as Docker, Docker as Plugin

Challenges to AI OSS maintainers
What is the most sustainable way to incorporate the latest NLP/CV/AI models into a framework?
As the developer of bert-as-service (one of the most popular AI OSS projects in 2018), I was often asked by the community:
“Han, can you support model X and make it X-as-a-service?”

Challenges to AI OSS maintainers
Popular design philosophy of an AI framework:
- Rewrite the code and claim it is better than the original one
- Wrap the code (e.g. C -> Python) and provide a better interface
bert-as-service

Not sustainable because you can't match the speed of AI
AI development nowadays
You as an OSS maintainer

Not sustainable because you can't handle the dependencies
Four pieces required to run an AI model:
- dependencies: packages or libraries required to run the algorithm, e.g. ffmpeg, libcuda, tensorflow;
- codes: the implementation of the logic, which can be written in Python, C, Java or Scala with the help of TensorFlow, PyTorch, etc.;
- a small config file: the arguments abstracted from the logic for better flexibility during training and inference, for example batch_size, index_strategy and model_path;
- big data files: the serialization of the model structure and the learned parameters, e.g. a pretrained VGG/BERT model.

Leave the full autonomy to the model developer

Model as Docker, Docker as plugin.
This is the GNES solution in the face of accelerating
innovation on new models from the AI community.
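
One way to picture "Model as Docker" is a bundle that names the four pieces (dependencies, code, config, data) and renders them into a Dockerfile. The `ModelBundle` class below is a hypothetical illustration, not part of GNES; only standard Dockerfile instructions (FROM, RUN, ADD, ENTRYPOINT) are emitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModelBundle:
    base_image: str                                         # system-level dependencies
    pip_packages: List[str] = field(default_factory=list)   # Python dependencies
    code_files: List[str] = field(default_factory=list)     # implementation of the logic
    config_file: str = "model.yml"                          # small config file
    data_urls: List[str] = field(default_factory=list)      # big data files

    def to_dockerfile(self, entrypoint: List[str]) -> str:
        """Render the four pieces as Dockerfile text."""
        lines = [f"FROM {self.base_image}"]
        if self.pip_packages:
            lines.append("RUN pip install --no-cache-dir " + " ".join(self.pip_packages))
        for url in self.data_urls:
            lines.append(f"ADD {url} /model/")
        lines.append("ADD " + " ".join(self.code_files + [self.config_file]) + " ./")
        quoted = ", ".join(f'"{part}"' for part in entrypoint)
        lines.append(f"ENTRYPOINT [{quoted}]")
        return "\n".join(lines)
```

The model developer owns everything inside the image; the framework only needs to know how to start the container, which is what makes the image usable as a plugin.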

GNES Hub: gnes-ai/hub

Easy to use on every level

Generic and Universal

Explainable
Scores are explainable down to atom-level:
- nested score structure
- explainable on every op
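
A nested score structure can be sketched as a small tree in which every node records the op that produced its value and the scores it was computed from. The `Score` class and helper names are hypothetical and do not mirror the GNES protobuf messages.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Score:
    value: float
    op: str                                   # operation that produced this value
    operands: List["Score"] = field(default_factory=list)

    def explain(self, indent: int = 0) -> str:
        """Render the score tree, one op per line, children indented."""
        line = " " * indent + f"{self.op} = {self.value:.3f}"
        return "\n".join([line] + [s.explain(indent + 2) for s in self.operands])


def leaf(name: str, value: float) -> Score:
    """An atomic score, e.g. a raw similarity returned by an indexer."""
    return Score(value, name)


def weighted(score: Score, weight: float) -> Score:
    return Score(score.value * weight, f"multiply(weight={weight})", [score])


def average(scores: List[Score]) -> Score:
    return Score(sum(s.value for s in scores) / len(scores), "avg", scores)
```

Calling `explain()` on a document-level score walks back through every op down to the chunk-level similarities it was derived from.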

Demo: Build a Poem Semantic Search
https://github.com/gnes-ai/demo-poems-ir

Demo: Build a Poem Semantic Search
Steps:
1. Define the workflow
   a. Which microservices do I need?
   b. How should they connect with each other?
2. Specify each microservice
   a. YAML config
   b. Additional Python files / Dockerfile
Microservices: Preprocessor, Encoder, Vector-Indexer, Doc-Indexer, Router

Define the workflow
Index time Query time

Specify each component
All microservices start with a base image: gnes/gnes:latest-alpine
Encoder design:
- use pytorch-transformers
- need GPU and CUDA support
- need to download the pretrained model in advance
- configure the encoder and pooling strategy

encode/Dockerfile
FROM pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-runtime
# install the model library and download the pretrained BERT weights at build time
RUN pip install pytorch-transformers --no-cache-dir --compile && \
    python -c "from pytorch_transformers import *; x='bert-base-uncased'; BertModel.from_pretrained(x); BertTokenizer.from_pretrained(x)"
ADD *.py *.yml ./
RUN pip install gnes --no-cache-dir --compile
ENTRYPOINT ["gnes", "encode", "--yaml_path", "transformer.yml", "--read_only"]
gnes/demo-poem:encode

encode/transformer.yml
!PipelineEncoder
components:
  - !PyTorchTransformers
    parameters:
      model_dir: $TORCH_TRANSFORMERS_MODEL
      model_name: bert-base-uncased
  - !PoolingEncoder
    parameters:
      pooling_strategy: REDUCE_MEAN
      backend: torch
gnes_config:
  name: my_transformer  # a customized name
  is_trained: true  # indicate the model has been trained
  work_dir: /workspace
  batch_size: 128
Encoder pipeline: BERT, then Pooling
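
What a mean-pooling step does to per-token BERT outputs can be sketched in plain Python as below. This assumes padding tokens are excluded through a 0/1 mask, which is a common convention but an assumption about how REDUCE_MEAN behaves here; `reduce_mean` is a hypothetical helper, not the PoolingEncoder implementation.

```python
from typing import List, Sequence


def reduce_mean(token_vectors: Sequence[Sequence[float]], mask: Sequence[int]) -> List[float]:
    """Average the vectors of real tokens (mask == 1), ignoring padding."""
    if len(token_vectors) != len(mask):
        raise ValueError("one mask entry is required per token")
    kept = [vec for vec, keep in zip(token_vectors, mask) if keep]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]
```

The result is one fixed-size vector per chunk regardless of its token count, which is what the vector indexer needs.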

gnes/demo-poem:encode

vector-index/Dockerfile
- Install faiss
- Configure it
FROM continuumio/anaconda3
RUN apt-get update && apt-get install -y build-essential && \
    /opt/conda/bin/conda install faiss-cpu -c pytorch
ADD *.py *.yml ./
RUN /opt/conda/bin/pip install -U gnes --no-cache-dir --compile
ENTRYPOINT ["/opt/conda/bin/gnes", "index", "--yaml_path", "faiss.yml"]
gnes/demo-poem:index

vector-index/faiss.yml
!FaissIndexer
parameters:
  num_dim: 768
  index_key: HNSW32
  data_path: /workspace/idx.faiss
gnes_config:
  name: my_faiss_indexer  # a customized name

gnes/demo-poem:index

Fulltext indexer and Score Function
# fulltext indexer
!DictIndexer
gnes_config:
  name: my_fulltext_indexer

# score function
!Chunk2DocTopkReducer
parameters:
  reduce_op: avg
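
The idea behind a chunk-to-doc reducer with `reduce_op: avg` can be sketched as follows: chunk-level hits are grouped by document, averaged, and the best documents are kept. `reduce_chunks_to_docs` is a hypothetical function written for illustration, not the Chunk2DocTopkReducer code.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def reduce_chunks_to_docs(chunk_hits: List[Tuple[int, float]], top_k: int) -> List[Tuple[int, float]]:
    """Average chunk-level scores per document and return the top_k documents."""
    per_doc: Dict[int, List[float]] = defaultdict(list)
    for doc_id, score in chunk_hits:
        per_doc[doc_id].append(score)
    doc_scores = [(doc_id, sum(s) / len(s)) for doc_id, s in per_doc.items()]
    # Higher score means a better match in this sketch.
    return sorted(doc_scores, key=lambda item: item[1], reverse=True)[:top_k]
```

With `avg`, a document with one strong and one weak chunk can rank below a document with a single moderately strong chunk; other reduce ops such as max would rank them differently.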

Index!

Query!

GNES Flow
a Pythonic Way to Build Cloud-Native Neural Search Pipelines

GNES Flow is to GNES what Keras is to TensorFlow
Motivation
- a readable and brief idiom to define pipelines: index, query, train, etc.
- make GNES easier to debug locally

GNES Flow highlights
- chain multiple add() functions to build a pipeline;
- use self-defined names instead of ports to a service;
- modify a pipeline’s component via set();
- run a pipeline on multiple orchestration layers, e.g. multi-thread, multi-process,
Docker Swarm, Kubernetes;
- serialize/deserialize a pipeline to/from a binary file, an SVG image, Docker Swarm/Kubernetes config files.
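
The chaining idiom in the highlights above (chained add(), named services instead of ports, set() to modify a component) can be illustrated with a toy builder. `ToyFlow` and its method signatures are hypothetical; this is not the GNES Flow class and does not start any service.

```python
import copy
from typing import Any, Dict, List, Optional


class ToyFlow:
    """Hypothetical stand-in for a flow builder; not the GNES implementation."""

    def __init__(self) -> None:
        self.services: Dict[str, Dict[str, Any]] = {}
        self._last: Optional[str] = None

    def add(self, kind: str, name: str, recv_from: Optional[str] = None, **params: Any) -> "ToyFlow":
        """Append a service; by default it receives from the previously added one."""
        if name in self.services:
            raise ValueError(f"duplicate service name: {name}")
        flow = copy.deepcopy(self)  # return a new flow so the original can be reused
        flow.services[name] = {"kind": kind, "recv_from": recv_from or self._last, "params": params}
        flow._last = name
        return flow

    def set(self, name: str, **params: Any) -> "ToyFlow":
        """Return a copy with one service's parameters updated."""
        flow = copy.deepcopy(self)
        flow.services[name]["params"].update(params)
        return flow

    def edges(self) -> List[str]:
        """List connections as 'sender -> receiver', referring to services by name."""
        return [f"{s['recv_from']} -> {n}" for n, s in self.services.items() if s["recv_from"]]
```

For example, `ToyFlow().add("Preprocessor", "prep").add("Encoder", "enc", replicas=1).add("Indexer", "vec_idx")` describes a three-stage index pipeline, and `.set("enc", replicas=3)` yields a scaled-out variant without touching the original.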

Define a Pipeline

Scale out a Pipeline

Define an index flow

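Use the index flow
# `flow` is the index flow defined on the previous slide;
# read_flowers() is assumed to be a generator yielding raw image bytes
with flow.build(backend='process') as f:
    f.index(txt_file='poems.txt', batch_size=20)
with flow.build(backend='swarm') as f:
    f.index(bytes_gen=read_flowers(), batch_size=64)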

Jupyter Notebook for Flower Search

Track the scalability on every new release/master

GNES is ...
- Cloud-native, all-in-microservice
- Generic semantic search solution using DNN
- Elastic workflow optimized and tailored for search scenarios
- A different mindset for building sustainable AI OSS
- Grow with the community
GNES is NOT ...
- Yet another collection of AI algorithms
- A generic framework for doing every ML task (e.g. clustering)


My other open-source projects
- Fashion-MNIST (GitHub watch/star/fork: 263 / 6736 / 1407): most popular AI open-source project of 2017 (0.3% chance); Google Scholar > 1300 publications
- bert-as-service (GitHub watch/star/fork: 165 / 5808 / 1174): most popular AI open-source project of 2018 (0.22% chance)
