GNES: Generic Neural Elastic Search
Linux Foundation OSS EU, Lyon, France. Oct. 29, LF AI Summit
Han Xiao
Agenda
- What is GNES?
- Breakdown: neural, elastic and search
- GNES principles and tech highlights
- Demo: GNES on docker swarm
- Demo: GNES flow API
- Scalability benchmark
- Summary
What is GNES?
hxiao87 hanxiao @hxiao
GNES is Generic Neural Elastic Search
GNES [jee-nes] is a cloud-native semantic search system based on deep neural networks.
GNES enables large-scale index and semantic search for text-to-text, image-to-image,
video-to-video and any-to-any content form.
- Cloud-native
- Semantic search based on DNN
- End-to-end generic solution
GNES resources
Version: v0.0.46
Github: https://github.com/gnes-ai/gnes
5 direct contributors from Tencent
2 community contributors
Homepage: https://gnes.ai
Docs: https://doc.gnes.ai
PyPI: https://pypi.org/project/gnes
Docker Hub: https://cloud.docker.com/u/gnes/repository/docker/gnes/gnes
GNES Board: https://board.gnes.ai
Blog: https://hanxiao.github.io
Call for more contributors!
Breakdown of Neural, Search and Elastic
"Neural"
Preliminaries: Neural, Elastic, and Search
Find semantically similar text in a large database
■ Microservices are a software development technique—a variant of the
service-oriented architecture (SOA) architectural style that structures an
application as a collection of loosely coupled services.
■ Microservices - also known as the microservice architecture - is an architectural
style that structures an application as a collection of services that are.
■ Microservices architecture is a term used to describe the practice of breaking up
an application into a series of smaller, more specialised parts, each of which
communicate with one another across common interfaces such as APIs and REST
interfaces like HTTP.
Preliminaries: Neural, Elastic, and Search
Find semantically similar text in a large database
Questions:
- How to quantize the semantics?
- Does it work on super-long/short documents as well?
- How to define similarity?
- How to store the semantics?
Answers:
- Vector representation of the doc
- Vector indexing, e.g. Faiss
- Distance metrics (L2, Hamming, etc.)
- Segment long document into sentences
- State-of-the-art NLP model
- Faster, lighter and distributed database
- Domain/app-specific preprocessing
Preliminaries: Neural, Elastic, and Search
Find semantically similar text, image, video in a large database
Questions:
- How to quantize the semantics?
- Does it work on large/small image, long/short video as well?
- How to define similarity?
- How to store the semantics?
Answers:
- Vector representation of the image/video
- Vector indexing, e.g. Faiss
- Distance metrics (L2, Hamming, etc.)
- Segment image/video into patches
- State-of-the-art CV model
- Faster, lighter and distributed database
- Domain/app-specific preprocessing
Preliminaries: Neural, Elastic, and Search
Find semantically similar text, image, video in a large database
Questions:
- How to quantize the semantics?
- Does it work on large/small image, long/short video as well?
- How to define similarity?
- How to store the semantics?
Answers:
- Vector representation of the doc
- Vector indexing, e.g. Faiss
- Distance metrics (L2, Hamming, etc.)
- Segment image/video into patches
- State-of-the-art CV model
- Faster, lighter and distributed database
- Domain/app-specific preprocessing
Encoder
Indexer
Preprocessor
A good neural search is only possible
when document and query are comparable semantic units.
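The three roles named on the preceding slides can be sketched in a few lines of plain Python. This is only an illustration and not GNES code: `preprocess`, `encode`, `l2` and `BruteForceIndexer` are hypothetical names, the "encoder" is a hashed bag-of-words stand-in for a neural model, and the index is a brute-force L2 scan rather than Faiss.

```python
import math
import re
import zlib

DIM = 16  # toy vector size (assumption); a real encoder outputs far more dimensions


def preprocess(doc):
    """Preprocessor: segment a document into sentence-level chunks."""
    return [s.strip() for s in re.split(r"[.!?]+", doc) if s.strip()]


def encode(chunk):
    """Encoder stand-in: hashed bag-of-words vector, L2-normalised."""
    vec = [0.0] * DIM
    for word in re.findall(r"\w+", chunk.lower()):
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def l2(a, b):
    """Distance metric: Euclidean (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class BruteForceIndexer:
    """Indexer stand-in: stores chunk vectors and scans all of them at query time."""

    def __init__(self):
        self.items = []  # list of (doc_id, chunk, vector)

    def add(self, doc_id, chunks, vectors):
        self.items.extend((doc_id, c, v) for c, v in zip(chunks, vectors))

    def query(self, vector, top_k=1):
        ranked = sorted(self.items, key=lambda item: l2(vector, item[2]))
        return [(doc_id, chunk) for doc_id, chunk, _ in ranked[:top_k]]


if __name__ == "__main__":
    indexer = BruteForceIndexer()
    docs = {"d0": "Roses are red. Violets are blue!", "d1": "The sea is wide."}
    for doc_id, text in docs.items():
        chunks = preprocess(text)
        indexer.add(doc_id, chunks, [encode(c) for c in chunks])
    # The query goes through the same preprocessing and encoding as the documents.
    print(indexer.query(encode("Roses are red"), top_k=2))
```

The query is a sentence and the indexed units are sentences, which is the "comparable semantic units" condition stated above.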
Minimum & Optimum Semantic Unit
Preprocessor
"Search"
Runtime in GNES
- Typical ML system: Train, Inference
- Typical search system: Index, Query
- GNES: Train, Index, Query
Train-time is not for everyone. Most users will just use our
pretrained model from GNES Hub.
Train
Runtime
"Elastic"
Router is required to enable elasticity
Four fundamental microservices
To summarize, we have four fundamental components in GNES:
- Preprocessor: transforms a real-world object into a list of workable semantic units, aka chunks;
- Encoder: represents each chunk as a vector;
- Indexer: stores the vectors in memory or on disk in a way that allows fast access;
- Router: forwards messages between microservices, e.g. batching, mapping, reducing.
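The three router behaviours named above (batching, mapping, reducing) can be sketched as plain functions. This is a toy illustration, not the GNES router implementation: the function names are hypothetical, and it assumes each hit is a `(score, doc_id)` pair where a higher score means a better match.

```python
import heapq
from itertools import islice


def batch_route(messages, batch_size):
    """Batching: split a stream of messages into fixed-size batches."""
    it = iter(messages)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def map_route(message, num_workers):
    """Mapping: hand the same message to every downstream worker."""
    return [message for _ in range(num_workers)]


def reduce_route(partial_results, top_k):
    """Reducing: merge per-shard hit lists into one global top-k list."""
    merged = [hit for part in partial_results for hit in part]
    return heapq.nlargest(top_k, merged)


if __name__ == "__main__":
    print(list(batch_route(range(5), batch_size=2)))
    print(reduce_route([[(0.9, "a"), (0.2, "b")], [(0.7, "c")]], top_k=2))
```

Keeping this logic in a separate service is what lets the encoder or indexer be replicated without the other components knowing how many replicas exist.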
Train Index Query
Understanding how GNES works is basically to know which microservice does what logic at which runtime, and to design the corresponding workflow.
GNES Key Tenets and Highlights
Highlight 1:
Cloud-Native and Elastic
Highlights
Cloud-native: GNES is all-in-microservice: encoder, indexer, preprocessor and router all run statelessly and independently in their own containers. Scaling, load balancing and automated recovery come off the shelf in GNES.
[Figure: a monolith, with encoder, indexer and preprocessor coupled in one process, next to the GNES microservice architecture, with encoder, indexer and preprocessor as separate services]
Cloud-Native and Elastic
Index time Query time
GNES Board https://board.gnes.ai
Highlight 2:
State-of-the-art
Yaml-ize everything
- Encoder: Word2Vec (18 lines)
- Encoder: BERT (36 lines)
- Indexer: Binary+LevelDB (9 lines)
- Preprocessor: Chinese (7 lines)
With vs. Without Code/Model Separation
Without code/model separation: change on model -> update encoder.py -> rebuild the project into a package -> take the old version offline -> deploy the new package -> back online, serve users
With code/model separation: change on model -> update the encoder YAML -> roll out and serve users (old version online)
✓ Immutable codebase
✓ Minimum rollout time
✓ Version-controlled model
✓ Ease AB test, side-by-side comparison
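A minimal sketch of the idea behind code/model separation: the code stays fixed and only a config entry changes when the model changes. Everything here is hypothetical (the registry, the class names, the dict-based config); GNES itself uses YAML files, which are replaced by plain dicts below so the example needs only the standard library.

```python
REGISTRY = {}


def register(cls):
    """Make a component class constructible by name from a config."""
    REGISTRY[cls.__name__] = cls
    return cls


@register
class Word2VecEncoder:
    def __init__(self, model_path, dim=300):
        self.model_path = model_path
        self.dim = dim


@register
class BertEncoder:
    def __init__(self, model_name, pooling="mean"):
        self.model_name = model_name
        self.pooling = pooling


def build(config):
    """Instantiate whichever component the config names, with its parameters."""
    cls = REGISTRY[config["type"]]
    return cls(**config.get("parameters", {}))


if __name__ == "__main__":
    # Switching the model is a config change only; build() and the classes are untouched.
    old = build({"type": "Word2VecEncoder", "parameters": {"model_path": "w2v.bin"}})
    new = build({"type": "BertEncoder", "parameters": {"model_name": "bert-base-uncased"}})
    print(type(old).__name__, "->", type(new).__name__)
```

Because each config can be kept under version control, two versions of a model can be served side by side by pointing two deployments at two config files.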
Highlight 3:
Model as Docker, Docker as Plugin
Challenges to AI OSS maintainers
What is the most sustainable way to incorporate the latest NLP/CV/AI models into a framework?
As the developer of bert-as-service (one of the most popular AI OSS projects in 2018), I was often asked by the community:
“Han, can you support model X and make it X-as-a-service?”
Challenges to AI OSS maintainers
Popular design philosophies of an AI framework:
- Rewrite the code and claim it is better than the original
- Wrap the code (e.g. C -> Python) and provide a better interface
bert-as-service
Not sustainable because you can't match the speed of AI
AI development nowadays
You as an OSS maintainer
Not sustainable because you can't handle the dependencies
Four pieces required to run an AI model:
- dependencies: packages or libraries required to run the algorithm, e.g. ffmpeg, libcuda, tensorflow;
- code: the implementation of the logic, which can be written in Python, C, Java or Scala with the help of TensorFlow, PyTorch, etc.;
- a small config file: the arguments abstracted from the logic for better flexibility during training and inference, for example batch_size, index_strategy and model_path;
- big data files: the serialization of the model structure and the learned parameters, e.g. a pretrained VGG/BERT model.
Leave the full autonomy to the model developer
Model as Docker, Docker as plugin.
This is the GNES solution in the face of accelerating
innovation on new models from the AI community.
GNES Hub: gnes-ai/hub
Easy to use on every level
Generic and Universal
Explainable
Scores are explainable down to atom-level:
- nested score structure
- explainable on every op
Demo: Build a Poem Semantic Search
https://github.com/gnes-ai/demo-poems-ir
Steps:
1. Define the workflow
   a. Which microservices do I need?
   b. How should they connect with each other?
2. Specify each microservice
   a. YAML config
   b. Additional Python files / Dockerfile
Components: Preprocessor, Encoder, Vector-Indexer, Doc-Indexer, Router
Define the workflow
Index time Query time
Specify each component
All microservices start from a base image: gnes/gnes:latest-alpine
Encoder design:
- use pytorch-transformers
- needs GPU and CUDA support
- needs the pretrained model downloaded in advance
- configure the encoder and the pooling strategy
encode/Dockerfile
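# Base image with PyTorch and CUDA runtime for GPU inference
FROM pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-runtime

# Install pytorch-transformers and download the pretrained BERT model at build time
RUN pip install pytorch-transformers --no-cache-dir --compile && \
    python -c "from pytorch_transformers import *; x='bert-base-uncased'; BertModel.from_pretrained(x); BertTokenizer.from_pretrained(x)"

# Copy the extra Python files and the YAML config into the image
ADD *.py *.yml ./

RUN pip install gnes --no-cache-dir --compile

# Start the encoder microservice with the YAML config
ENTRYPOINT ["gnes", "encode", "--yaml_path", "transformer.yml", "--read_only"]
gnes/demo-poem:encode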
encode/transformer.yml
!PipelineEncoder
components:
  - !PyTorchTransformers
    parameters:
      model_dir: $TORCH_TRANSFORMERS_MODEL
      model_name: bert-base-uncased
  - !PoolingEncoder
    parameters:
      pooling_strategy: REDUCE_MEAN
      backend: torch
gnes_config:
  name: my_transformer  # a customized name
  is_trained: true  # indicate the model has been trained
  work_dir: /workspace
  batch_size: 128
Pooling
BERT
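To make the pooling step concrete: a BERT-style model returns one vector per token, and a mean-pooling stage collapses them into one vector per chunk. The function below is a plain-Python sketch of that reduction, assuming padding positions are masked out; `reduce_mean` is a hypothetical helper and not the GNES PoolingEncoder itself.

```python
def reduce_mean(token_vectors, mask):
    """Average the token vectors whose mask entry is truthy.

    token_vectors: list of equal-length lists of floats, one per token.
    mask: list of 0/1 flags, 0 marking padding tokens to ignore.
    """
    kept = [vec for vec, m in zip(token_vectors, mask) if m]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


if __name__ == "__main__":
    tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]  # last token is padding
    print(reduce_mean(tokens, mask=[1, 1, 0]))
```

The output vector has the same dimensionality as a single token vector, which is what the indexer's dimension setting has to match.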
vector-index/Dockerfile
- Install faiss
- Configure it
# Anaconda base image, used here to install faiss via conda
FROM continuumio/anaconda3

RUN apt-get update && apt-get install -y build-essential && \
    /opt/conda/bin/conda install faiss-cpu -c pytorch

# Copy the extra Python files and the YAML config into the image
ADD *.py *.yml ./

RUN /opt/conda/bin/pip install -U gnes --no-cache-dir --compile

# Start the indexer microservice with the YAML config
ENTRYPOINT ["/opt/conda/bin/gnes", "index", "--yaml_path", "faiss.yml"]
gnes/demo-poem:index
vector-index/faiss.yml
!FaissIndexer
parameters:
  num_dim: 768
  index_key: HNSW32
  data_path: /workspace/idx.faiss
gnes_config:
  name: my_faiss_indexer  # a customized name
  work_dir: /workspace
Fulltext indexer and Score Function
!DictIndexer
gnes_config:
  name: my_fulltext_indexer
  work_dir: /workspace

!Chunk2DocTopkReducer
parameters:
  reduce_op: avg
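The vector indexer returns matches at chunk level, while the user wants documents. A reducer with an averaging op can be pictured as below. This is a hedged sketch, not the GNES Chunk2DocTopkReducer: the function name and the `(doc_id, score)` hit format are assumptions, and a higher score is taken to mean a better match.

```python
from collections import defaultdict


def chunk_to_doc_topk(chunk_hits, top_k):
    """Group chunk-level hits by document, average their scores, keep the top-k docs.

    chunk_hits: iterable of (doc_id, chunk_score) pairs.
    Returns a list of (doc_id, average_score), best first.
    """
    scores = defaultdict(list)
    for doc_id, score in chunk_hits:
        scores[doc_id].append(score)
    doc_scores = {doc_id: sum(s) / len(s) for doc_id, s in scores.items()}
    return sorted(doc_scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


if __name__ == "__main__":
    hits = [("a", 0.5), ("b", 1.0), ("a", 1.0), ("c", 0.25)]
    print(chunk_to_doc_topk(hits, top_k=2))
```

Other reduce ops (for example max or sum) would only change the line that computes `doc_scores`.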
Index!
Query!
GNES Flow
a Pythonic Way to Build Cloud-Native Neural Search Pipelines
GNES Flow is to GNES what Keras is to TensorFlow
Motivation
- a readable and brief idiom to define pipelines: index, query, train, etc.
- make GNES easier to debug locally
GNES Flow highlights
- chain multiple add() functions to build a pipeline;
- use self-defined names instead of ports to a service;
- modify a pipeline’s component via set();
- run a pipeline on multiple orchestration layers, e.g. multi-thread, multi-process, Docker Swarm, Kubernetes;
- serialize/deserialize a pipeline to/from a binary file, an SVG image, Docker Swarm/Kubernetes config files.
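The first three highlights (chained add(), names instead of ports, set() to modify a component) describe a builder pattern. The toy class below illustrates that pattern only; `MiniFlow` and its arguments are hypothetical and are not the gnes.flow API.

```python
class MiniFlow:
    """Hypothetical toy pipeline builder, for illustration only."""

    def __init__(self):
        self.services = {}  # name -> settings, in insertion order
        self._last = None   # name of the most recently added service

    def add(self, kind, name, recv_from=None, **settings):
        """Add a service; by default it receives from the previously added one."""
        if name in self.services:
            raise ValueError("duplicate service name: %s" % name)
        source = recv_from if recv_from is not None else self._last
        self.services[name] = dict(kind=kind, recv_from=source, **settings)
        self._last = name
        return self  # returning self is what makes chaining possible

    def set(self, name, **settings):
        """Modify an existing service by its name."""
        self.services[name].update(settings)
        return self

    def edges(self):
        """List (sender, receiver) pairs, addressed by name rather than port."""
        return [(s["recv_from"], name)
                for name, s in self.services.items() if s["recv_from"]]


if __name__ == "__main__":
    flow = (MiniFlow()
            .add("preprocessor", "prep")
            .add("encoder", "enc")
            .add("indexer", "vec_idx")
            .add("indexer", "doc_idx", recv_from="prep")
            .set("enc", replicas=3))
    print(flow.edges())
```

Since the whole pipeline is a plain data structure, turning it into a Docker Swarm or Kubernetes config is a matter of serializing `services`, which is the idea behind the last two highlights.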
Define a Pipeline
Scale out a Pipeline
Define an index flow
Use the index flow
# `flow` is the index flow defined on the previous slide
with flow.build(backend='process') as f:
    f.index(txt_file='poems.txt', batch_size=20)

with flow.build(backend='swarm') as f:
    f.index(bytes_gen=read_flowers(), batch_size=64)
Jupyter Notebook for Flower Search
Scalability Benchmark
Track the scalability on every new release/master
Summary
GNES is …
- Cloud-native, all-in-microservice
- Generic semantic search solution using DNN
- Elastic workflow optimized and tailored for search scenarios
- A different mindset for building sustainable AI OSS
- Grow with the community
GNES is NOT ...
- Yet another collection of AI algorithms
- a generic framework for doing every ML task (e.g. clustering)
Thanks for your attention!
My other open-source projects
263 6736 1407
Fashion-MNIST
Most popular AI open-source project of 2017 (0.3% chance)
Google Scholar > 1300 publications
165 5808 1174
bert-as-service
Most popular AI open-source project of 2018 (0.22% chance)

More Related Content

Similar to GNES is Generic Neural Elastic Search (OSSEU19 Lyon)

TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsStijn Decubber
 
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...Edge AI and Vision Alliance
 
Solution sap hana
Solution sap hana Solution sap hana
Solution sap hana 希典 陈
 
Age of Language Models in NLP
Age of Language Models in NLPAge of Language Models in NLP
Age of Language Models in NLPTyrone Systems
 
Redfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureRedfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureBruno Cornec
 
The Big Data Puzzle, Where Does the Eclipse Piece Fit?
The Big Data Puzzle, Where Does the Eclipse Piece Fit?The Big Data Puzzle, Where Does the Eclipse Piece Fit?
The Big Data Puzzle, Where Does the Eclipse Piece Fit?J Langley
 
Red Hat Storage 2014 - Product(s) Overview
Red Hat Storage 2014 - Product(s) OverviewRed Hat Storage 2014 - Product(s) Overview
Red Hat Storage 2014 - Product(s) OverviewMarcel Hergaarden
 
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloLinaro
 
Why documentation osidays
Why documentation osidaysWhy documentation osidays
Why documentation osidaysBastian Feder
 
The Nuxeo vision for 2009 and beyond
The Nuxeo vision for 2009 and beyondThe Nuxeo vision for 2009 and beyond
The Nuxeo vision for 2009 and beyondNuxeo
 
Transforming Application Delivery with PaaS and Linux Containers
Transforming Application Delivery with PaaS and Linux ContainersTransforming Application Delivery with PaaS and Linux Containers
Transforming Application Delivery with PaaS and Linux ContainersGiovanni Galloro
 
final proposal-Implement and create new documentation toolchain
final proposal-Implement and create new documentation toolchainfinal proposal-Implement and create new documentation toolchain
final proposal-Implement and create new documentation toolchainParamkusham Shruthi
 
substrate: A framework to efficiently build blockchains
substrate: A framework to efficiently build blockchainssubstrate: A framework to efficiently build blockchains
substrate: A framework to efficiently build blockchainsservicesNitor
 
Spark Pipelines in the Cloud with Alluxio with Gene Pang
Spark Pipelines in the Cloud with Alluxio with Gene PangSpark Pipelines in the Cloud with Alluxio with Gene Pang
Spark Pipelines in the Cloud with Alluxio with Gene PangSpark Summit
 
Comparing IaaS: VMware vs OpenStack vs Google’s Ganeti
Comparing IaaS: VMware vs OpenStack vs Google’s GanetiComparing IaaS: VMware vs OpenStack vs Google’s Ganeti
Comparing IaaS: VMware vs OpenStack vs Google’s GanetiGiuseppe Paterno'
 
ERTS 2008 - Using Linux for industrial projects
ERTS 2008 - Using Linux for industrial projectsERTS 2008 - Using Linux for industrial projects
ERTS 2008 - Using Linux for industrial projectsChristian Charreyre
 
Dataverse in the European Open Science Cloud
Dataverse in the European Open Science CloudDataverse in the European Open Science Cloud
Dataverse in the European Open Science Cloudvty
 
Spark Pipelines in the Cloud with Alluxio
Spark Pipelines in the Cloud with AlluxioSpark Pipelines in the Cloud with Alluxio
Spark Pipelines in the Cloud with AlluxioAlluxio, Inc.
 
Plebeia, a new storage for Tezos blockchain state
Plebeia, a new storage for Tezos blockchain statePlebeia, a new storage for Tezos blockchain state
Plebeia, a new storage for Tezos blockchain stateJun Furuse
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsTimothy Spann
 

Similar to GNES is Generic Neural Elastic Search (OSSEU19 Lyon) (20)

TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
 
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
 
Solution sap hana
Solution sap hana Solution sap hana
Solution sap hana
 
Age of Language Models in NLP
Age of Language Models in NLPAge of Language Models in NLP
Age of Language Models in NLP
 
Redfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined InfrastructureRedfish and python-redfish for Software Defined Infrastructure
Redfish and python-redfish for Software Defined Infrastructure
 
The Big Data Puzzle, Where Does the Eclipse Piece Fit?
The Big Data Puzzle, Where Does the Eclipse Piece Fit?The Big Data Puzzle, Where Does the Eclipse Piece Fit?
The Big Data Puzzle, Where Does the Eclipse Piece Fit?
 
Red Hat Storage 2014 - Product(s) Overview
Red Hat Storage 2014 - Product(s) OverviewRed Hat Storage 2014 - Product(s) Overview
Red Hat Storage 2014 - Product(s) Overview
 
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
 
Why documentation osidays
Why documentation osidaysWhy documentation osidays
Why documentation osidays
 
The Nuxeo vision for 2009 and beyond
The Nuxeo vision for 2009 and beyondThe Nuxeo vision for 2009 and beyond
The Nuxeo vision for 2009 and beyond
 
Transforming Application Delivery with PaaS and Linux Containers
Transforming Application Delivery with PaaS and Linux ContainersTransforming Application Delivery with PaaS and Linux Containers
Transforming Application Delivery with PaaS and Linux Containers
 
final proposal-Implement and create new documentation toolchain
final proposal-Implement and create new documentation toolchainfinal proposal-Implement and create new documentation toolchain
final proposal-Implement and create new documentation toolchain
 
substrate: A framework to efficiently build blockchains
substrate: A framework to efficiently build blockchainssubstrate: A framework to efficiently build blockchains
substrate: A framework to efficiently build blockchains
 
Spark Pipelines in the Cloud with Alluxio with Gene Pang
Spark Pipelines in the Cloud with Alluxio with Gene PangSpark Pipelines in the Cloud with Alluxio with Gene Pang
Spark Pipelines in the Cloud with Alluxio with Gene Pang
 
Comparing IaaS: VMware vs OpenStack vs Google’s Ganeti
Comparing IaaS: VMware vs OpenStack vs Google’s GanetiComparing IaaS: VMware vs OpenStack vs Google’s Ganeti
Comparing IaaS: VMware vs OpenStack vs Google’s Ganeti
 
ERTS 2008 - Using Linux for industrial projects
ERTS 2008 - Using Linux for industrial projectsERTS 2008 - Using Linux for industrial projects
ERTS 2008 - Using Linux for industrial projects
 
Dataverse in the European Open Science Cloud
Dataverse in the European Open Science CloudDataverse in the European Open Science Cloud
Dataverse in the European Open Science Cloud
 
Spark Pipelines in the Cloud with Alluxio
Spark Pipelines in the Cloud with AlluxioSpark Pipelines in the Cloud with Alluxio
Spark Pipelines in the Cloud with Alluxio
 
Plebeia, a new storage for Tezos blockchain state
Plebeia, a new storage for Tezos blockchain statePlebeia, a new storage for Tezos blockchain state
Plebeia, a new storage for Tezos blockchain state
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python Processors
 

Recently uploaded

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 

Recently uploaded (20)

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

GNES is Generic Neural Elastic Search (OSSEU19 Lyon)

  • 1. GNES: Generic Neural Elastic Search. Linux Foundation OSS EU, Lyon, France. Oct. 29, LF AI Summit. Han Xiao
  • 2. Agenda
      - What is GNES?
      - Breakdown: neural, elastic and search
      - GNES principles and tech highlights
      - Demo: GNES on Docker Swarm
      - Demo: GNES Flow API
      - Scalability benchmark
      - Summary
  • 4. GNES is Generic Neural Elastic Search. GNES [jee-nes] is a cloud-native semantic search system based on deep neural networks. GNES enables large-scale indexing and semantic search for text-to-text, image-to-image, video-to-video and any-to-any content forms. Cloud-native. Semantic search based on DNN. End-to-end generic solution.
  • 6. GNES resources
      - Version: v0.0.46
      - Github: https://github.com/gnes-ai/gnes (5 direct contributors from Tencent, 2 community contributors)
      - Homepage: https://gnes.ai
      - Docs: https://doc.gnes.ai
      - PyPI: https://pypi.org/project/gnes
      - Docker Hub: https://cloud.docker.com/u/gnes/repository/docker/gnes/gnes
      - GNES Board: https://board.gnes.ai
      - Blog: https://hanxiao.github.io
      Call for more contributors!
  • 9. Preliminaries: Neural, Elastic, and Search. Find semantically similar text in a large database, for example:
      ■ Microservices are a software development technique, a variant of the service-oriented architecture (SOA) architectural style that structures an application as a collection of loosely coupled services.
      ■ Microservices - also known as the microservice architecture - is an architectural style that structures an application as a collection of services that are.
      ■ Microservices architecture is a term used to describe the practice of breaking up an application into a series of smaller, more specialised parts, each of which communicates with the others across common interfaces such as APIs and REST interfaces like HTTP.
  • 10. Preliminaries: Neural, Elastic, and Search. Find semantically similar text in a large database.
      Questions:
      - How to quantize the semantics?
      - Does it work on super-long/short documents as well?
      - How to define similarity?
      - How to store the semantics?
      Building blocks:
      - Vector representation of the doc
      - State-of-the-art NLP model
      - Segment long documents into sentences
      - Domain/app-specific preprocessing
      - Distance metrics (L2, Hamming, etc.)
      - Vector indexing, e.g. Faiss
      - Faster, lighter and distributed database
  • 11. Preliminaries: Neural, Elastic, and Search. Find semantically similar text, image, video in a large database.
      Questions:
      - How to quantize the semantics?
      - Does it work on large/small images and long/short videos as well?
      - How to define similarity?
      - How to store the semantics?
      Building blocks:
      - Vector representation of the image/video
      - State-of-the-art CV model
      - Segment image/video into patches
      - Domain/app-specific preprocessing
      - Distance metrics (L2, Hamming, etc.)
      - Vector indexing, e.g. Faiss
      - Faster, lighter and distributed database
  • 12. Preliminaries: Neural, Elastic, and Search. The same questions and building blocks as on the previous slide, now grouped under three components: Encoder, Indexer, Preprocessor.
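
To make the "vector representation + distance metric" idea from the preliminaries concrete, here is a minimal brute-force sketch. It is not GNES code: the three-dimensional vectors are hypothetical stand-ins for encoder output, and a real deployment would replace the linear scan with a vector index such as Faiss.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length float vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming_distance(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def top_k(query, index, k, distance=l2_distance):
    """Brute-force search: the k (doc_id, distance) pairs closest to the query."""
    scored = [(doc_id, distance(query, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Hypothetical vectors standing in for what an encoder would produce.
index = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.2],
    "doc_c": [0.8, 0.2, 0.1],
}
results = top_k([1.0, 0.0, 0.0], index, k=2)
```

Swapping `distance=hamming_distance` works the same way for binary codes, which is why the choice of metric and the choice of encoder have to be made together.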
  • 13. A good neural search is only possible when document and query are comparable semantic units.
  • 14. Minimum & Optimum Semantic Unit
  • 15. Minimum & Optimum Semantic Unit (Preprocessor)
  • 17. Runtime in GNES
      - Typical ML system: Train, Inference
      - Typical search system: Index, Query
      - GNES: Train, Index, Query
      Train-time is not for everyone. Most users will just use our pretrained models from GNES Hub.
  • 19. Router is required to enable elasticity
  • 20. Four fundamental microservices. To summarize, we have four fundamental components in GNES:
      - Preprocessor: transforms a real-world object into a list of workable semantic units, aka chunks;
      - Encoder: represents chunks as vectors;
      - Indexer: stores the vectors in memory or on disk in a way that allows fast access;
      - Router: forwards messages between microservices, e.g. batching, mapping, reducing.
      Runtimes: Train, Index, Query
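
The division of labour between the four components can be sketched in a single process. This is a toy illustration, not the GNES implementation: every function and class name below is hypothetical, the "encoder" is a trivial hand-written feature function, and in GNES each step would run as its own containerized microservice.

```python
from typing import Dict, Iterable, List, Tuple


def preprocess(doc: str) -> List[str]:
    """Preprocessor stand-in: split a document into chunks (here, sentences)."""
    return [part.strip() for part in doc.split(".") if part.strip()]


def encode(chunk: str) -> List[float]:
    """Encoder stand-in: a toy 2-d vector (vowel ratio, length / 100)."""
    vowels = sum(ch in "aeiou" for ch in chunk.lower())
    return [vowels / len(chunk), len(chunk) / 100.0]


def batch(items: List[str], size: int) -> Iterable[List[str]]:
    """Router stand-in: group chunks into fixed-size batches."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


class MemoryIndexer:
    """Indexer stand-in: keeps (doc_id, chunk_offset) -> vector in memory."""

    def __init__(self) -> None:
        self.vectors: Dict[Tuple[str, int], List[float]] = {}

    def add(self, doc_id: str, offset: int, vector: List[float]) -> None:
        self.vectors[(doc_id, offset)] = vector


def index_document(doc_id: str, doc: str, indexer: MemoryIndexer,
                   batch_size: int = 2) -> int:
    """Index-time workflow: preprocess -> batch -> encode -> index."""
    offset = 0
    for chunk_batch in batch(preprocess(doc), batch_size):
        for chunk in chunk_batch:
            indexer.add(doc_id, offset, encode(chunk))
            offset += 1
    return offset  # number of chunks indexed


indexer = MemoryIndexer()
n_chunks = index_document(
    "poem-1", "Roses are red. Violets are blue. Sugar is sweet.", indexer)
```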
  • 21. Which (microservice) does what (logic) and when (runtime): understanding how GNES works basically means knowing and designing the corresponding workflow.
  • 22. GNES Key Tenets and Highlights
  • 24. Highlights. Cloud-native: GNES is all-in-microservice: encoder, indexer, preprocessor and router all run statelessly and independently in their own containers. Scaling, load balancing and automated recovery come off-the-shelf in GNES. Figure: a monolith (everything coupled in one process) vs. the GNES microservice architecture (Encoder, Indexer and Preprocessor as separate services).
  • 25. Cloud-Native and Elastic (index time, query time)
  • 26. GNES Board https://board.gnes.ai
  • 29. Yaml-ize everything
      - Encoder, Word2Vec (18 lines)
      - Encoder, BERT (36 lines)
      - Indexer, Binary+LevelDB (9 lines)
      - Preprocessor, Chinese (7 lines)
  • 30. With vs. Without Code/Model Separation
      - Without separation: change on model → update encoder.py → rebuild the project into a package → take the old version offline → deploy the new package → bring it online and serve users.
      - With separation (encoder + YAML): change on model → update the YAML → roll out and serve users.
      - Benefits: ✓ immutable codebase ✓ minimum rollout time ✓ version-controlled model ✓ easier A/B tests and side-by-side comparison
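
The code/model separation above can be illustrated with a small config-driven factory. This is only a sketch under stated assumptions: JSON stands in for YAML so that the example needs nothing beyond the standard library, and `HashEncoder`, `LengthEncoder`, `REGISTRY` and the `type`/`parameters` keys are hypothetical names, not GNES classes or GNES config fields.

```python
import hashlib
import json


class HashEncoder:
    """Toy encoder: the first num_dim bytes of a SHA-256 digest, scaled to [0, 1]."""

    def __init__(self, num_dim=4):
        if not 1 <= num_dim <= 32:
            raise ValueError("num_dim must be in 1..32 (SHA-256 yields 32 bytes)")
        self.num_dim = num_dim

    def encode(self, text):
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [digest[i] / 255.0 for i in range(self.num_dim)]


class LengthEncoder:
    """Toy encoder: a single feature, the text length."""

    def encode(self, text):
        return [float(len(text))]


# The codebase stays immutable; only the config selects and tunes a model.
REGISTRY = {"HashEncoder": HashEncoder, "LengthEncoder": LengthEncoder}


def build_from_config(config_text):
    """Instantiate the encoder named in the config with its parameters."""
    config = json.loads(config_text)
    encoder_cls = REGISTRY[config["type"]]
    return encoder_cls(**config.get("parameters", {}))


old = build_from_config('{"type": "LengthEncoder"}')
new = build_from_config('{"type": "HashEncoder", "parameters": {"num_dim": 8}}')
```

Rolling out the "new" model here means shipping a different config string, which is also what makes version control and side-by-side comparison of models cheap.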
  • 31. Highlight 3: Model as Docker, Docker as Plugin
  • 32. Challenges to AI OSS maintainers. What is the most sustainable way to incorporate the latest NLP/CV/AI models into a framework? As the developer of bert-as-service (one of the most popular AI OSS projects in 2018), I was often asked by the community: “Han, can you support model X and make it X-as-a-service?”
  • 33. Challenges to AI OSS maintainers. Popular design philosophies of an AI framework (e.g. bert-as-service):
      - Rewrite the code and claim it is better than the original one
      - Wrap the code (e.g. C -> Python) and provide a better interface
  • 34. Not sustainable because you can't match the speed of AI (figure: AI development nowadays vs. you as an OSS maintainer)
  • 35. Not sustainable because you can't handle the dependencies. Four pieces are required to run an AI model:
      - dependencies: packages or libraries required to run the algorithm, e.g. ffmpeg, libcuda, tensorflow;
      - code: the implementation of the logic, which can be written in Python, C, Java or Scala with the help of TensorFlow, PyTorch, etc.;
      - a small config file: the arguments abstracted from the logic for better flexibility during training and inference, for example batch_size, index_strategy and model_path;
      - big data files: the serialization of the model structure and the learned parameters, e.g. a pretrained VGG/BERT model.
  • 36. Leave the full autonomy to the model developer
  • 37. Model as Docker, Docker as plugin. This is the GNES solution in the face of accelerating innovation on new models from the AI community.
  • 38. GNES Hub: gnes-ai/hub
  • 39. Easy to use on every level
  • 40. Generic and Universal
  • 41. Explainable. Scores are explainable down to atom level:
      - nested score structure
      - explainable on every op
  • 42. Demo: Build a Poem Semantic Search https://github.com/gnes-ai/demo-poems-ir
  • 43. Demo: Build a Poem Semantic Search. Steps:
      1. Define the workflow
         a. Which microservices do I need?
         b. How should they connect with each other?
      2. Specify each microservice
         a. YAML config
         b. Additional Python files / Dockerfile
      Components in the diagram: Preprocessor, Encoder, Vector-Indexer, Doc-Indexer, Router
  • 44. Define the workflow (index time, query time)
  • 45. Specify each component. All microservices start from a base image: gnes/gnes:latest-alpine. Encoder design:
      - use pytorch-transformers
      - needs GPU and CUDA support
      - needs the pretrained model downloaded in advance
      - configure the encoder and the pooling strategy
  • 46. encode/Dockerfile (image: gnes/demo-poem:encode)
      FROM pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-runtime
      RUN pip install pytorch-transformers --no-cache-dir --compile && \
          python -c "from pytorch_transformers import *; x='bert-base-uncased'; BertModel.from_pretrained(x); BertTokenizer.from_pretrained(x)"
      ADD *.py *.yml ./
      RUN pip install gnes --no-cache-dir --compile
      ENTRYPOINT ["gnes", "encode", "--yaml_path", "transformer.yml", "--read_only"]
  • 47. encode/transformer.yml (BERT followed by pooling)
      !PipelineEncoder
      components:
        - !PyTorchTransformers
          parameters:
            model_dir: $TORCH_TRANSFORMERS_MODEL
            model_name: bert-base-uncased
        - !PoolingEncoder
          parameters:
            pooling_strategy: REDUCE_MEAN
            backend: torch
      gnes_config:
        name: my_transformer  # a customized name
        is_trained: true  # indicate the model has been trained
        work_dir: /workspace
        batch_size: 128
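
The `REDUCE_MEAN` pooling strategy in this config turns a sequence of per-token vectors into one fixed-size chunk vector. The sketch below shows the general idea in plain Python. It is an illustration, not the GNES `PoolingEncoder`: the `reduce_mean` helper and its `mask` argument are hypothetical, and whether GNES excludes padding tokens in exactly this way is an assumption.

```python
def reduce_mean(token_vectors, mask):
    """Average the vectors of real tokens (mask == 1), ignoring padding (mask == 0)."""
    kept = [vec for vec, keep in zip(token_vectors, mask) if keep]
    if not kept:
        raise ValueError("at least one unmasked token is required")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Two real tokens followed by one padding position (hypothetical 2-d outputs).
tokens = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
mask = [1, 1, 0]
pooled = reduce_mean(tokens, mask)
```

Whatever the sequence length, the pooled vector has the encoder's hidden size, which is why the Faiss indexer later in the demo can be configured with a fixed `num_dim`.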
  • 48. gnes/demo-poem:encode
  • 49. vector-index/Dockerfile (image: gnes/demo-poem:index). Install faiss, then configure it:
      FROM continuumio/anaconda3
      RUN apt-get update && apt-get install -y build-essential && \
          /opt/conda/bin/conda install faiss-cpu -c pytorch
      ADD *.py *.yml ./
      RUN /opt/conda/bin/pip install -U gnes --no-cache-dir --compile
      ENTRYPOINT ["/opt/conda/bin/gnes", "index", "--yaml_path", "faiss.yml"]
  • 50. vector-index/faiss.yml
      !FaissIndexer
      parameters:
        num_dim: 768
        index_key: HNSW32
        data_path: /workspace/idx.faiss
      gnes_config:
        name: my_faiss_indexer  # a customized name
        work_dir: /workspace
  • 51. gnes/demo-poem:index
  • 52. Fulltext indexer and Score Function
      Fulltext indexer:
      !DictIndexer
      gnes_config:
        name: my_fulltext_indexer
        work_dir: /workspace
      Score function:
      !Chunk2DocTopkReducer
      parameters:
        reduce_op: avg
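
Because documents are indexed as chunks, chunk-level hits have to be folded back into document-level scores at query time. The sketch below shows one plausible reading of a top-k reducer with `reduce_op: avg`. The function name and the input layout are hypothetical, and the exact semantics of the GNES `Chunk2DocTopkReducer` are an assumption here; the sketch also assumes that a higher score means a better match.

```python
from collections import defaultdict


def chunk_to_doc_topk(chunk_hits, top_k):
    """Average chunk scores per document, then keep the top_k documents.

    chunk_hits: iterable of (doc_id, chunk_score) pairs; higher is better.
    """
    per_doc = defaultdict(list)
    for doc_id, score in chunk_hits:
        per_doc[doc_id].append(score)
    doc_scores = {doc_id: sum(s) / len(s) for doc_id, s in per_doc.items()}
    ranked = sorted(doc_scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


hits = [("poem-1", 0.9), ("poem-2", 0.8), ("poem-1", 0.5), ("poem-3", 0.2)]
ranked = chunk_to_doc_topk(hits, top_k=2)
```

In this example poem-2 outranks poem-1 even though poem-1 owns the single best chunk, because averaging penalizes its weaker second chunk; a max-style reduction would rank them the other way round.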
  • 55. GNES Flow: a Pythonic way to build cloud-native neural search pipelines
  • 56. GNES Flow is to GNES what Keras is to TensorFlow. Motivation:
      - a readable and brief idiom to define pipelines: index, query, train, etc.
      - make GNES easier to debug locally
  • 57. GNES Flow highlights
      - chain multiple add() functions to build a pipeline;
      - refer to a service by a self-defined name instead of a port;
      - modify a pipeline's component via set();
      - run a pipeline on multiple orchestration layers, e.g. multi-thread, multi-process, Docker Swarm, Kubernetes;
      - serialize/deserialize a pipeline to/from a binary file, an SVG image, or Docker Swarm/Kubernetes config files.
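
The chaining idiom described in these highlights can be mimicked with a few lines of plain Python. The `ToyFlow` class below is a hypothetical illustration of the builder pattern only: it is not the GNES Flow API, and its method signatures, the `replicas` parameter and the `describe()` helper are all assumptions made for the sketch.

```python
class ToyFlow:
    """Toy pipeline builder: named services, chained add(), in-place set()."""

    def __init__(self):
        self._services = {}  # name -> spec dict; dicts keep insertion order

    def add(self, kind, name, **params):
        """Append a service under a self-defined name and return self for chaining."""
        if name in self._services:
            raise ValueError(f"service name already used: {name}")
        self._services[name] = {"kind": kind, **params}
        return self

    def set(self, name, **params):
        """Modify an existing service by name and return self for chaining."""
        self._services[name].update(params)
        return self

    def describe(self):
        """Render the pipeline as 'name[kind xreplicas] -> ...'."""
        return " -> ".join(
            f"{name}[{spec['kind']} x{spec.get('replicas', 1)}]"
            for name, spec in self._services.items()
        )


flow = (ToyFlow()
        .add("preprocessor", name="prep")
        .add("encoder", name="enc", replicas=1)
        .add("indexer", name="vec_idx")
        .set("enc", replicas=3))  # scale out the encoder by name, not by port
summary = flow.describe()
```

Returning `self` from every builder method is what makes the chain possible, and addressing services by name keeps the pipeline definition independent of where each service ends up running.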
  • 59. Scale out a Pipeline
  • 60. Define an index flow
  • 61. Use the index flow
      with flow.build(backend='process') as f:
          f.index(txt_file='poems.txt', batch_size=20)

      with flow.build(backend='swarm') as f:
          f.index(bytes_gen=read_flowers(), batch_size=64)
  • 62. Jupyter Notebook for Flower Search
  • 64. Track the scalability on every new release/master
  • 66. GNES is ...
      - Cloud-native, all-in-microservice
      - A generic semantic search solution using DNN
      - An elastic workflow optimized and tailored for search scenarios
      - A different mindset for building sustainable AI OSS
      - Growing with the community
      GNES is NOT ...
      - Yet another collection of AI algorithms
      - A generic framework for doing every ML task (e.g. clustering)
  • 67. Thanks for your attention!
  • 69. My other open-source projects
      - Fashion-MNIST (263 / 6736 / 1407): most popular AI open-source project of 2017 (0.3% chance); Google Scholar > 1300 publications
      - bert-as-service (165 / 5808 / 1174): most popular AI open-source project of 2018 (0.22% chance)