Computer vision in the cloud and beyond
Roman Storchak
CTO@DataI
Main tasks in CV for surveillance
● Verification (1:1, Border control)
● Age/gender/emotion recognition (sometimes other properties)
● Identification (1:N, Surveillance)
● Events and action recognition
Interest & Difficulty
ⒸDataI
Let's build one together
Definition of product (technical) success:
Our CV product has to answer these questions:
● Who?
● Where?
● When?
● What?
ⒸDataI
Why bother? Let's use an API (or SDK)
● Amazon Rekognition
● Face++
● Meerkat
● Azure Cognitive Services
● Google Cloud Vision (no face recognition)
● VeriLook SDK
● iFace SDK
● Cognitec VACS SDK
● Luxand FACE SDK
● Affectiva SDK
● Betaface SDK
ⒸDataI
Amazon Rekognition
Prices:
- Rekognition: 10-12 c$/min
- S3 + data transfer or AWS Kinesis streams: >3 c$/TB
~70k$ per store per month
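A rough back-of-the-envelope check of the per-store figure. It assumes (this is not stated on the slide) that each camera streams to Rekognition 24 hours a day over a 30-day month, and it ignores storage and transfer costs. Under those assumptions the sketch solves for how many cameras a ~70k$ monthly bill would correspond to.

```python
# Assumptions (not from the slide): 24/7 streaming, 30-day month,
# S3 / Kinesis costs ignored.
MINUTES_PER_MONTH = 60 * 24 * 30  # 43,200 minutes


def monthly_cost_per_camera(usd_per_minute: float) -> float:
    """Rekognition video cost for one continuously streaming camera."""
    return usd_per_minute * MINUTES_PER_MONTH


low = monthly_cost_per_camera(0.10)   # 10 c$/min
high = monthly_cost_per_camera(0.12)  # 12 c$/min

store_budget = 70_000                 # ~70k$ per store per month
cameras_min = store_budget / high     # cameras covered at the higher price
cameras_max = store_budget / low      # cameras covered at the lower price

print(f"per camera: {low:.0f}-{high:.0f} $/month")
print(f"cameras covered by 70k$: {cameras_min:.1f}-{cameras_max:.1f}")
```

With these assumptions a single camera costs a few thousand dollars per month, so the store-level figure corresponds to roughly a dozen or so always-on cameras.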
ⒸDataI
Module pipeline
Modules (from the diagram): Face detector, Face alignment, Age + Gender models, FacePrint, Face Search, Person detector, Body Keypoint classifier, Face selector, Action recognition, Tracker
Output: TRACK with metadata; Decisions, BA material
ⒸDataI
Object for detection
HEAD
FACE
Upper BODY
BODY
NDA
ⒸDataI
Objects and attributes
Objects Static Attributes Dynamic
Attributes
Body Body Embedding Location
Actions
Head Count
Face Embedding//ID
Age, Gender,
Race
Emotions
Head Pose
ⒸDataI
Edge or Cloud?
https://sirinsoftware.com/blog/edge-computing/
Edge-to-Cloud
Modules (from the diagram): Face detector, Face alignment, Age + Gender models, Face Embedding, Face Search, Person detector, Body Keypoint classifier, Face selector, Action recognition, Tracker
Output: TRACK with metadata; Decisions, BA material
Deployment zones: CLOUD, On-Premise
ⒸDataI
Video Streaming (Edge)
1. OpenCV
+ Very simple to use
- Python GIL limits multithreading
2. NVIDIA DeepStream SDK
+ Fast & flexible
- Only Jetson & Tesla
ⓒ NVIDIA DeepStream SDK
ⒸDataI
Video Streaming (Edge)
3. Gstreamer
+ plugin-based architecture
+ easy/fast Video Record (any supported format)
+ easy/fast overlay draw (using Cairo)
+ wide variety of ready-to-use plugins
- difficult to understand
- weak support from the GStreamer community
- too many plugins have internal bugs
ⒸDataI & Taras Lishchenko
Many plugins have internal bugs
ⒸDataI
General Architecture
ⒸDataI & Taras Lishchenko
Simple GStreamer Pipeline
filesrc ! decodebin ! fakesink
ⒸDataI & Taras Lishchenko
GStreamer Basics
GStreamer version: 1.12.4
Launch:
● gst-launch-1.0 filesrc location=video01.mp4 ! decodebin ! fakesink
Check:
● gst-inspect-1.0 filesrc
ⒸDataI & Taras Lishchenko
Min. GStreamer Pipeline
Supported video sources
Live:
● Web camera
● RTSP camera
Not live:
● Video file (supported format)
● Multiple files:
○ video01.mp4, video02.mp4, video03.mp4 ...
Pipeline steps:
● acquire frames from the video source
● decode
● convert to RGB
ⒸDataI & Taras Lischenko
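The three steps of the minimal pipeline (acquire, decode, convert to RGB) can be written as a gst-launch-1.0 style description. The sketch below only builds the description string; it does not run GStreamer. The helper name `build_pipeline` and the final `appsink name=sink` stage are illustrative assumptions, not taken from the slides. The elements themselves (`filesrc`, `rtspsrc`, `v4l2src`, `decodebin`, `videoconvert`, `appsink`) are standard GStreamer elements.

```python
def build_pipeline(source: str, location: str) -> str:
    """Return a gst-launch-1.0 style description: acquire -> decode -> RGB."""
    # Hypothetical mapping from a source kind to a GStreamer source element.
    sources = {
        "file": f'filesrc location="{location}"',
        "rtsp": f'rtspsrc location="{location}"',
        "webcam": f'v4l2src device="{location}"',
    }
    if source not in sources:
        raise ValueError(f"unsupported source: {source}")
    stages = [
        sources[source],           # acquire frames from the video source
        "decodebin",               # decode
        "videoconvert",            # convert pixel format ...
        "video/x-raw,format=RGB",  # ... to RGB (caps filter)
        "appsink name=sink",       # hand frames to the application (assumed)
    ]
    return " ! ".join(stages)


print(build_pipeline("file", "video01.mp4"))
print(build_pipeline("rtsp", "rtsp://camera.example/stream"))
```

The resulting string could be passed to `gst-launch-1.0` (with `fakesink` in place of `appsink`) or to a pipeline parser in application code.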
NX
Number of simultaneous video streams?
ⒸDataI & Taras Lishchenko
Min & Mean FPS / Num Video Streams
PC specs (Oper, shop-000001): Intel(R)
Core(TM) i7-7700K CPU @ 4.20GHz
ⒸDataI & Taras Lishchenko
15X
Number of simultaneous video streams?
ⒸDataI & Taras Lishchenko
“Video” Pipelines Templates
plugins = [FILESRC, TSDEMUX, H264PARSE, IDENTITY, FPS, TEE_HEAD,
QUEUE, AVDEC_H264, VIDEOSCALE, VIDEORATE, CAPS_FULL, VIDEOCONVERT,
OBJECT_DETECTOR, OBJECT_TRACKER,
VIDEOCONVERT, DYNAMIC_OVERLAY, VIDEOCONVERT,
I420_CAPS, VIDEOCONVERT, X264ENC, SPLITMUXSINK,
TEE_TAIL, QUEUE, SPLITMUXSINK]
ⒸDataI & Taras Lishchenko
Problems with GStreamer
- Buffer offsets/timestamps
  - offset: constant for live video; in range [0, maxint] otherwise
    - Solution: GstIdentity (force the offset to increment over [0, maxint])
  - timestamp: CLOCK_MONOTONIC (the project requires CLOCK_REALTIME to sync video with annotations)
    - Solution: store a CLOCK_MONOTONIC -> CLOCK_REALTIME map
- x264enc is too slow (requires a lot of CPU power)
  - Solution: use plugins (h264parse) for DVR without conversion from RGB (drawback: can't draw on a non-RGB buffer)
- Python has limitations compared to the C version:
  - Passing objects from Python to C buffers (metadata)
    - Solution: DIY Python wrappers for C libs that work with GStreamer objects
ⒸDataI & Taras Lishchenko
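A minimal sketch of the monotonic-to-wall-clock mapping mentioned above, using only the Python standard library. The class name `ClockMap` is hypothetical, and how the monotonic value is extracted from a GStreamer buffer is outside this sketch. The idea is to sample both clocks once, remember the offset, and apply it to every later timestamp.

```python
import time


class ClockMap:
    """Maps monotonic timestamps (seconds) to wall-clock time (seconds since epoch)."""

    def __init__(self) -> None:
        # Sample both clocks back to back and remember the difference.
        self.offset = time.time() - time.monotonic()

    def to_realtime(self, monotonic_ts: float) -> float:
        """Convert a monotonic timestamp to a wall-clock timestamp."""
        return monotonic_ts + self.offset


clock_map = ClockMap()
frame_ts = time.monotonic()                # stand-in for a buffer timestamp
wall_ts = clock_map.to_realtime(frame_ts)  # timestamp usable for annotations
print(wall_ts)
```

A single offset drifts if the wall clock is adjusted (for example by NTP), so a long-running service would probably need to refresh the mapping periodically.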
Sync frame-by-frame
+ Guarantees buffer order
- N-1 waiting points
- Drops buffers
ⒸDataI & Taras Lishchenko
CPU vs GPU
ⒸDataI & Taras Lishchenko
Sync batch mode
+ Guarantees buffer order
+ Easy to sync annotations with frame data
- N waiting points
- Drops buffers
- GStreamer should emit buffers with a small delay, to reduce wait time in the batch collector
ⒸDataI & Taras Lishchenko
Async Batch mode
+ No waiting points
+ No need for GStreamer to emit buffers faster
+ Better GPU load
- Order between streams is not guaranteed
- Hard to sync frames with annotations
- Additional complexity to handle queues
ⒸDataI & Taras Lishchenko
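One way to picture the batch collector in async mode is a queue that is drained up to a batch size or until a deadline, whichever comes first. This is a sketch under assumptions: the function name `collect_batch`, the `(stream_id, frame_no)` tags, and the timeout value are illustrative and do not come from the slides.

```python
import queue
import time


def collect_batch(frames: "queue.Queue", batch_size: int, timeout_s: float) -> list:
    """Take up to batch_size items, waiting at most timeout_s in total."""
    batch = []
    deadline = time.monotonic() + timeout_s
    while len(batch) < batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(frames.get(timeout=remaining))
        except queue.Empty:
            break
    return batch


frames = queue.Queue()
# Each item is tagged with (stream_id, frame_no) so that annotations can be
# matched back to frames even though order between streams is not guaranteed.
for item in [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2)]:
    frames.put(item)

first = collect_batch(frames, batch_size=4, timeout_s=0.05)
second = collect_batch(frames, batch_size=4, timeout_s=0.05)
print(first, second)
```

The second call returns a partial batch after the timeout, which is the trade-off between GPU load and latency that the batch modes above describe.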
Object detection
src/detectors
ⒸDataI & Taras Lishchenko
Evolution
1. Face detection (hard to track):
   a. Dlib
      i. bad for small faces (need to upscale the image -> decreased performance)
      ii. too slow (works on CPU)
   b. MTCNN
      i. bad for many faces (performance decreases as the number of faces increases, due to the architecture)
2. Person detection:
   a. Haar cascade
      i. poor quality
      ii. too slow (works on CPU)
   b. Blob person detector
      i. not robust to noisy images
      ii. too slow due to background subtraction
   c. TinyYolov2 (Darkflow)
   d. Mobilenet-SSD
ⒸDataI & Taras Lishchenko
CPU usage
Test            Min      Max      Mean
test_all_cpu    26.0656  28.6466  27.1232
test_on_7_cpu   26.9883  32.5846  28.8077
test_on_6_cpu   26.9680  31.8769  29.0395
test_on_5_cpu   27.0186  32.4739  30.0355
test_on_4_cpu   28.0154  37.4240  32.2915
test_on_3_cpu   34.6486  44.4604  37.7983
test_on_2_cpu   48.0222  60.0206  50.0859
test_on_1_cpu   83.4291  96.7086  88.9920
Model: BodyEmbeddings
CPU: i7-7700HQ CPU @ 2.80GHz
TensorFlow (explanation):
● intra_op_parallelism_threads=[0, NUM_CORES] (0 - best)
● inter_op_parallelism_threads=[0, NUM_CORES] (0 - best)
Conclusion: some models can be executed in parallel without a huge performance loss if each uses at most half of the total number of cores.
ⒸDataI & Taras Lishchenko
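To make the conclusion concrete, the mean column of the table can be expressed relative to the all-CPU run. The sketch assumes the values are run times (lower is better) and that "all CPUs" means 8 logical CPUs; neither is stated on the slide.

```python
# Mean values copied from the table; units are whatever the slide used.
# Assumption: key 8 stands for test_all_cpu (8 logical CPUs).
mean_by_cpus = {
    8: 27.1232,
    7: 28.8077,
    6: 29.0395,
    5: 30.0355,
    4: 32.2915,
    3: 37.7983,
    2: 50.0859,
    1: 88.9920,
}

baseline = mean_by_cpus[8]
# Ratio > 1 means the restricted run is that many times the all-CPU mean.
relative = {n: mean / baseline for n, mean in mean_by_cpus.items()}

for n in sorted(relative, reverse=True):
    print(f"{n} CPU(s): x{relative[n]:.2f}")
```

Under these assumptions, restricting the model to 4 CPUs raises the mean by about 19%, while going below that grows the cost much faster.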
One Session vs Multiple Sessions
Model: Body Embeddings
Conclusion: performance benefits from a single graph in a single session only on CPU, and the difference is not significant (12%).
ⒸDataI & Taras Lishchenko
Resize Methods
Nearest-neighbour resize with different implementations
Conclusion: OpenCV is faster at resizing. Use the nearest-neighbour method for maximum performance.
ⒸDataI & Taras Lishchenko
np.stack instead of np.concatenate (Batch)
Batch size: 20
Image size: 640x360x3
Conclusion: when collecting images into a batch:
- put them in a list
- call np.stack(list)
ⒸDataI & Taras Lishchenko
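A small sketch of the batching pattern from the slide, using the same batch size and image size. The zero-filled frames are placeholders, and the array layout (height, width, channels) is an assumption about how 640x360x3 images are stored.

```python
import numpy as np

BATCH_SIZE = 20
HEIGHT, WIDTH, CHANNELS = 360, 640, 3  # 640x360x3 image, stored as H x W x C

# Collect frames in a plain Python list first ...
frames = [np.zeros((HEIGHT, WIDTH, CHANNELS), dtype=np.uint8) for _ in range(BATCH_SIZE)]

# ... then build the batch in one call; np.stack adds the leading batch axis.
batch = np.stack(frames)

# The np.concatenate equivalent needs an explicit new axis on every frame.
batch_concat = np.concatenate([f[np.newaxis] for f in frames], axis=0)

print(batch.shape)
```

Both calls produce the same array; `np.stack` simply expresses the intent directly.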
Object Tracking
ⒸDataI & Taras Lishchenko
Evolution
- Dlib Tracker
- IOU (base version)
  - no thresholding by confidence
  - configurable track start, drop, IOU thresholds, etc.
- IOU (extended) (with Hungarian Algorithm)
ⒸDataI & Taras Lishchenko
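A minimal sketch of IOU-based association between existing tracks and new detections. It uses a greedy matching step, which is an assumption about the base version; the extended version on the slide uses the Hungarian algorithm instead, and that is not shown here. Function names, the box format `(x1, y1, x2, y2)`, and the threshold value are illustrative.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_greedy(tracks, detections, iou_threshold=0.3):
    """Pair each track with its best unused detection at or above the threshold."""
    pairs, used = [], set()
    for t_idx, t_box in enumerate(tracks):
        best_iou, best_d = iou_threshold, None
        for d_idx, d_box in enumerate(detections):
            if d_idx in used:
                continue
            score = iou(t_box, d_box)
            if score >= best_iou:
                best_iou, best_d = score, d_idx
        if best_d is not None:
            used.add(best_d)
            pairs.append((t_idx, best_d))
    return pairs


tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (50, 50, 60, 60)]
pairs = match_greedy(tracks, detections)
print(pairs)
```

Detections left unmatched (index 2 here) would be candidates for starting new tracks, and tracks left unmatched would be candidates for dropping, subject to the configurable thresholds.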
Problem #1 with Body Detector & IOU Tracking
ⒸDataI
Face Selector
Negative classes
● asymmetric Accuracy: 0.940
● bad Accuracy: 0.951
● bad_angle Accuracy: 0.875
● bad_manual Accuracy: 0.824
● blurred Accuracy: 0.951
● many Accuracy: 0.850
● no_face Accuracy: 0.953
● no_landmarks Accuracy: 0.918
● not_inside Accuracy: 0.941
Positive class
As close to an ISO/IEC 19794-5:2005 compliant photo as possible
ⒸDataI & Oleksiy Udod
ISO/IEC 19794-5:2005
ⒸDataI
Identification pipeline
Face detector Face alignment
Age + Gender
models
Face
Embedding
Face Search
ⒸDataI
Closed set vs Open set search
Closed-set evaluation:
● cumulative match characteristics
(CMC) curves
● receiver operating characteristic
(ROC) curves.
Open-set evaluation
● detection and identification rate
(DIR) curves (TPIR,FPIR,FNIR,...)
ⒸDataI
Open-set video evaluation
Gallery: a set of images of interest.
Probes: a set of images used for querying.
In our case a probe might be:
● the best-quality face image among all images within the same person track
● any face image from the video
ⒸDataI & Vlad Khizanov
Metrics for Open Set Evaluation
Definition: a query succeeds if its top result has similarity greater than t.
FPIR(t) = # of successful non-mate search queries / # of queries
TPIR(t) = # of successful mate search queries / # of queries
MISS(t) = # of unsuccessful search queries / # of queries
FPIR(t) + TPIR(t) + MISS(t) = 1
Note: sometimes *FPIR = 1 - FPIR
Note: here TPIR(t) = TPIR(t, 1), where 1 is the rank
Mate searches are those for which the person in the search image has a face image in the enrolled dataset.
Non-mate searches are those for which the person in the search image does not have a face image in the enrolled dataset.
ⒸDataI & Vlad Khizanov
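The definitions above translate directly into a few lines of code. The sketch follows the slide's definition literally (a query counts as successful when its top similarity exceeds t), and the query data is made up for illustration.

```python
def open_set_rates(queries, t):
    """queries: list of (top_similarity, is_mate). Returns (FPIR, TPIR, MISS) at threshold t."""
    n = len(queries)
    fp = sum(1 for sim, is_mate in queries if sim > t and not is_mate)
    tp = sum(1 for sim, is_mate in queries if sim > t and is_mate)
    miss = n - fp - tp
    return fp / n, tp / n, miss / n


# Hypothetical search results: (similarity of the top result, mate search?)
queries = [(0.91, True), (0.75, True), (0.40, True), (0.82, False), (0.30, False)]

fpir, tpir, miss = open_set_rates(queries, t=0.7)
print(fpir, tpir, miss)

# Sweep the threshold to get the (FPIR, TPIR) points of a parametric curve.
curve = [open_set_rates(queries, k / 100)[:2] for k in range(101)]
```

Sweeping `t` from 0.0 to 1.0 in steps of 0.01 gives the points that are usually plotted as the open-set curve.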
Metrics on chart
Metrics are usually visualized as a curve in parametric form:
x(t) = FPIR(t)
y(t) = TPIR(t)
t = 0.0, 0.01, 0.02, …, 1.0
Note: in practice it is useful to pick an optimal threshold.
FNIR=1-TPIR
FPIR
FPIR@FNIR
ⒸDataI & Vlad Khizanov
Extreme Value Machine
Given the conditions for the Margin Distribution Theorem, the probability that $x'$ is included in the boundary estimated by $x_i$ is given by:
$$\Psi(x_i, x'; \kappa_i, \lambda_i) = \exp\left(-\left(\frac{\lVert x_i - x' \rVert}{\lambda_i}\right)^{\kappa_i}\right) \quad (1)$$
where $\lVert x_i - x' \rVert$ is the distance of $x'$ from sample $x_i$, and $\kappa_i$, $\lambda_i$ are Weibull shape and scale parameters respectively, obtained from fitting to the smallest pairwise margin estimate.
ⒸDataI & Oleksiy Udod
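The inclusion probability Ψ can be evaluated with a one-line function once the Weibull parameters are known. Fitting the shape and scale is not shown, and the parameter values below are arbitrary illustrations.

```python
import math


def evm_inclusion_probability(distance: float, kappa: float, lam: float) -> float:
    """Psi = exp(-(distance / lam) ** kappa) for Weibull shape kappa and scale lam."""
    return math.exp(-((distance / lam) ** kappa))


# Arbitrary example parameters (not fitted to any data).
kappa, lam = 2.0, 0.5
for d in (0.0, 0.25, 0.5, 1.0):
    print(d, evm_inclusion_probability(d, kappa, lam))
```

The probability is 1 at zero distance and decays as the probe moves away from the sample, with the decay rate controlled by the fitted shape and scale.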
https://arxiv.org/pdf/1506.06112.pdf
ⒸDataI & Oleksiy Udod
@135k distractors
Edge-to-Cloud
Modules (from the diagram): Face detector, Face alignment, Age + Gender models, Face Embedding, Face Search, Person detector, Body Keypoint classifier, Face selector, Action recognition, Tracker
Output: TRACK with metadata; Decisions, BA material
Deployment zones: CLOUD, On-Premise
ⒸDataI
Serving with RESTful API
Components (from the diagram):
● Market server 1 ... Market server N: local Kafka broker, MirrorMaker
● Kafka brokers 1, 2, 3 (N partitions each)
● Consumers A, B, C (auto scaling group)
● RESTful API: Model1; RESTful API: Model2..n
● Storage, AWS
ⒸDataI & Konstantin Bulgakov
Serving in Kafka Streams
Components (from the diagram):
● Market server 1 ... Market server N: local Kafka broker, MirrorMaker
● Kafka brokers 1, 2, 3 (N partitions each)
● Consumers A, B, C with TF (auto scaling group)
● Storage, AWS
ⒸDataI & Konstantin Bulgakov
Msg size distribution
Size distribution for 14K records was measured at the producer side, counting the str length of every message.
*1 str element ~ at least 1 byte
ⒸDataI & Olesia Stestsiuk
t2.medium vs t2.xlarge
ⒸDataI & Olesia Stestsiuk
Pyflame
● based on the Linux ptrace(2) system call, not sys.settrace()
● no modification of the source code required
● can profile embedded Python interpreters such as uWSGI
● can profile multi-threaded Python programs
● written in C++, with attention to speed and performance
● usually introduces less overhead than the builtin profile (or cProfile) modules, and also emits richer profiling data
Just run:
sudo pyflame -s 600 -r 0.001 --threads -p 1493 | ./flamegraph.pl > 10_min_every_milisec.svg
http://eng.uber.com/pyflame/
ⒸDataI & Olesia Stestsiuk
How to read Flame Graphs
● Each box represents a function in the stack (a "stack frame").
● The y-axis shows stack depth (number of frames on the stack). The top box shows the function
that was on-CPU. Everything beneath that is ancestry. The function beneath a function is its
parent, just like the stack traces shown earlier.
● The x-axis spans the sample population. It does not show the passing of time from left to right, as
most graphs do. The left to right ordering has no meaning (it's sorted alphabetically to maximize
frame merging).
● The width of the box shows the total time it was on-CPU or part of an ancestry that was on-CPU
(based on sample count). Functions with wide boxes may consume more CPU per execution than
those with narrow boxes, or, they may simply be called more often. The call count is not shown
(or known via sampling).
● The sample count can exceed elapsed time if multiple threads were running and sampled
concurrently.
ⒸDataI & Olesia Stestsiuk
Benchmark your code
Know your product requirements
Use APIs and SDKs even if they are not free
When you use something, know the limits of its execution
End-to-end testing in data-driven products is a disaster and the best technical feedback.
Roman Storchak, PhD,
CTO @ DatAI
roman.storchak@gmail.com
https://www.linkedin.com/in/storchak/
+38(063)617-61-15

Data Summer Conf 2018, “How we build Computer vision as a service (ENG)” — Roman Storchak, CTO at DatAI

  • 1.
  • 2. Computer vision in the cloud and beyond Roman Storchak CTO@DataI
  • 3. Main tasks in CV for surveillance ● Verification (1:1, Border control) ● Age/Gender/emotion recognition (sometimes other properties) ● Identification (1:N, Surveillance) ● Events and action recognition Interest & Difficulty ⒸDataI
  • 4. Let's build one together Definition of product (technical) success: Our CV product has to answer these questions: ● Who? ● Where? ● When? ● What? ⒸDataI
  • 5. Why bother? Let's use an API (or SDK) ● Amazon Rekognition ● Face++ ● Meerkat ● Azure Cognitive Services ● Google Cloud Vision (no face recognition) ● VeriLook SDK ● iFace SDK ● Cognitec VACS SDK ● Luxand FACE SDK ● Affectiva SDK ● Betaface SDK ⒸDataI
  • 6. Amazon Rekognition Prices: - Rekognition: 10-12 c$/min - S3 + data transfer or AWS Kinesis streams >3 c$/TB ~70k$ per store per month ⒸDataI
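As a rough sanity check of the ~70k$ figure, the sketch below multiplies the per-minute price by a month of continuous video. The 24/7 operation and the 30-day month are assumptions made for illustration, and the slide does not state how many cameras a store has, so the camera count is only what the quoted numbers would imply.

```python
# Back-of-the-envelope cost of cloud video analysis, using the slide's price range.
PRICE_PER_MIN_USD = (0.10, 0.12)   # 10-12 c$/min, taken from the slide
MINUTES_PER_MONTH = 60 * 24 * 30   # assumption: cameras run 24/7, 30-day month


def monthly_cost_per_camera(price_per_min: float) -> float:
    """Cost in USD of analysing one continuous stream for a month."""
    return price_per_min * MINUTES_PER_MONTH


def cameras_for_budget(budget_usd: float, price_per_min: float) -> float:
    """Number of continuous streams a monthly budget would pay for."""
    return budget_usd / monthly_cost_per_camera(price_per_min)


low, high = (monthly_cost_per_camera(p) for p in PRICE_PER_MIN_USD)
print(f"per camera per month: {low:.0f}-{high:.0f} USD")

fewest = cameras_for_budget(70_000, PRICE_PER_MIN_USD[1])
most = cameras_for_budget(70_000, PRICE_PER_MIN_USD[0])
print(f"cameras covered by 70k USD: {fewest:.1f}-{most:.1f}")
```

Under these assumptions a single always-on camera costs a few thousand dollars per month, so the quoted store-level figure corresponds to roughly 13 to 16 continuously analysed streams, before storage and transfer.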
  • 7. Module pipeline Face detector Face alignment Age + Gender models FacePrint Face Search Person detector Body Keypoint classifier Face selector Action recognition Tracker TRACK with metadata Decisions, BA material ⒸDataI
  • 8. Object for detection HEAD FACE Upper BODY BODY NDA ⒸDataI
  • 9. Objects and attributes Objects Static Attributes Dynamic Attributes Body Body Embedding Location Actions Head Count Face Embedding//ID Age, Gender, Race Emotions Head Pose ⒸDataI
  • 11. Edge-to-Cloud Face detector Face alignment Age + Gender models Face Embedding Face Search Person detector Body Keypoint classifier Face selector Action recognition Tracker TRACK with metadata Decisions, BA material CLOUD On-Premise ⒸDataI
  • 12. OpenCV + Very simple to use - GIL Python limitations for multithreading NVIDIA DeepStream SDK + Fast & Flexible - Only Jetson & Tesla Video Streaming (Edge) ⓒ nVidia DeepStream SDK ⒸDataI
  • 13. Video Streaming (Edge) 3. Gstreamer + plugin-based architecture + easy/fast Video Record (any supported format) + easy/fast overlay draw (using Cairo) + wide variety of ready-to-use plugins - difficult to understand - weak support by Gstreamer community - too many plugins have internal bugs ⒸDataI & Taras Lishchenko
  • 14. Many plugins have internal bugs ⒸDataI
  • 16. Simple Gstreamer Pipeline filesrc ! decodebin ! fakesink ⒸDataI & Taras Lishchenko
  • 17. Gstreamer Basics Gstreamer Version: 1.12.4 Launch: ● gst-launch-1.0 filesrc ! decodebin ! fakesink Check: ● gst-inspect-1.0 filesrc ⒸDataI & Taras Lishchenko
  • 18. Min. Gstreamer Pipeline Supported video sources Live: ● Web camera ● RTSP camera Not-live: ● Video file (supported format) ● Multiple files: ○ video01.mp4, video02.mp4, video03.mp4 ... ● acquire frames from video source ● decode ● convert to RGB ⒸDataI & Taras Lishchenko
  • 19. NX Number of simultaneous video streams? ⒸDataI & Taras Lishchenko
  • 20. Min & Mean FPS / Num Video Streams PC specs (Oper, shop-000001): Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz ⒸDataI & Taras Lishchenko
  • 21. 15X Number of simultaneous video streams? ⒸDataI & Taras Lishchenko
  • 22. “Video” Pipelines Templates plugins = [FILESRC, TSDEMUX, H264PARSE, IDENTITY, FPS, TEE_HEAD, QUEUE, AVDEC_H264, VIDEOSCALE, VIDEORATE, CAPS_FULL, VIDEOCONVERT, OBJECT_DETECTOR, OBJECT_TRACKER, VIDEOCONVERT, DYNAMIC_OVERLAY, VIDEOCONVERT, I420_CAPS, VIDEOCONVERT, X264ENC, SPLITMUXSINK, TEE_TAIL, QUEUE, SPLITMUXSINK] ⒸDataI & Taras Lishchenko
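The template idea on this slide can be sketched as a list of element descriptions joined into a gst-launch style string. This is only an illustration: the element names used below are standard GStreamer elements, but the `build_pipeline` helper, the `location` parameter and the file name are hypothetical, and the project's own elements (object detector, tracker, overlay) and tee branches are not reproduced.

```python
from typing import Iterable

# Standard GStreamer element names; the property value is filled in later.
FILESRC = "filesrc location={location}"
TSDEMUX = "tsdemux"
H264PARSE = "h264parse"
AVDEC_H264 = "avdec_h264"
VIDEOCONVERT = "videoconvert"
FAKESINK = "fakesink"


def build_pipeline(plugins: Iterable[str], **params: str) -> str:
    """Join element descriptions with ' ! ', the gst-launch link operator."""
    return " ! ".join(plugin.format(**params) for plugin in plugins)


plugins = [FILESRC, TSDEMUX, H264PARSE, AVDEC_H264, VIDEOCONVERT, FAKESINK]
description = build_pipeline(plugins, location="video01.ts")  # hypothetical file
print(description)
```

The resulting string is in the form accepted by `gst-launch-1.0` and by `Gst.parse_launch` in the Python bindings. Keeping elements as named constants makes it easy to swap one stage (for example the sink) without rewriting the whole description.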
  • 23. Problems with Gstreamer - Buffer offset/timestamps - offset: for live video the offset is constant, for non-live video it is in the range [0, maxint] - Solution: GstIdentity (force offset increment in [0, maxint]) - timestamp: uses CLOCK_MONOTONIC (the project requires CLOCK_REALTIME to sync video with annotations) - Solution: store a map CLOCK_MONOTONIC -> CLOCK_REALTIME - X264enc is too slow (requires a lot of CPU power) - Solution: use plugins (h264parse) for DVR without conversion from RGB (Drawback: can't draw on a non-RGB buffer) - Python has limitations compared to the C version: - Passing objects from Python to C buffers (Metadata) - Solution: DIY Python wrappers for C libs that work with Gstreamer objects ⒸDataI & Taras Lishchenko
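One way to read the "store a map" solution for timestamps is to record the offset between the monotonic clock and the wall clock once, then apply it to every buffer timestamp. The sketch below does that with the Python standard library only; `ClockMap` is a hypothetical name, and a real pipeline would additionally have to convert GStreamer buffer timestamps (nanoseconds relative to the pipeline clock) into monotonic seconds first.

```python
import time


class ClockMap:
    """Converts monotonic timestamps to wall-clock time using a stored offset."""

    def __init__(self) -> None:
        # Sample both clocks back to back; offset = wall clock - monotonic clock.
        self.offset = time.time() - time.monotonic()

    def to_realtime(self, monotonic_ts: float) -> float:
        """Map a monotonic timestamp (seconds) to seconds since the epoch."""
        return monotonic_ts + self.offset


clock_map = ClockMap()
buffer_ts = time.monotonic()               # stand-in for a buffer timestamp
wall_ts = clock_map.to_realtime(buffer_ts)
print(wall_ts)
```

Because the wall clock can be adjusted (NTP, manual changes) while the monotonic clock cannot, a long-running process would probably need to refresh the offset periodically; that policy is not covered here.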
  • 24. Sync frame-by-frame + Guaranteed buffer order - N-1 waiting points - Dropped buffers ⒸDataI & Taras Lishchenko
  • 25. CPU vs GPU ⒸDataI & Taras Lishchenko
  • 26. Sync Batch mode + Guaranteed buffer order + Easy to sync annotations with frame data - N waiting points - Dropped buffers - Gstreamer should emit buffers with a small delay, to reduce wait time in the batch collector ⒸDataI & Taras Lishchenko
  • 27. Async Batch mode + No waiting points + No need for Gstreamer to emit buffers faster + Better GPU load - Order between streams is not guaranteed - Hard to sync frames with annotations - Additional complexity to handle queues ⒸDataI & Taras Lishchenko
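The async batch mode can be pictured as several streams pushing frames into one shared queue while a consumer takes whatever is available, up to a batch size. The sketch below is a simplified single-threaded illustration with the standard library; `collect_batch`, its parameters and the `(stream_id, frame_no)` tuples are hypothetical stand-ins for the real collector and frame buffers.

```python
import queue
from typing import Any, List


def collect_batch(frames: "queue.Queue[Any]", max_size: int, timeout_s: float) -> List[Any]:
    """Take up to max_size items; stop early if the queue stays empty for timeout_s."""
    batch: List[Any] = []
    while len(batch) < max_size:
        try:
            batch.append(frames.get(timeout=timeout_s))
        except queue.Empty:
            break  # do not wait for slow streams, run inference on a partial batch
    return batch


frames: "queue.Queue[Any]" = queue.Queue()
for stream_id in range(3):
    for frame_no in range(2):
        # Tag every frame with its stream so annotations can be routed back later.
        frames.put((stream_id, frame_no))

first = collect_batch(frames, max_size=4, timeout_s=0.01)
second = collect_batch(frames, max_size=4, timeout_s=0.01)
print(first)
print(second)
```

Tagging each item with its stream id is what makes it possible to re-associate results with frames afterwards, which is the part the slide lists as hard in this mode.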
  • 29. Evolution 1. Face detection (hard to track): a. Dlib i. bad for small faces (need to upscale image -> decreased performance) ii. too slow (works on CPU) b. MTCNN i. bad for many faces (performance decreases as the number of faces increases, due to the architecture) 2. Person detection: a. Haar cascade i. poor quality ii. too slow (works on CPU) b. Blob person detector i. not invariant to noisy images ii. too slow due to Background Subtraction c. TinyYolov2 (Darkflow) d. Mobilenet-SSD ⒸDataI & Taras Lishchenko
  • 30. CPU usage
      Model: BodyEmbeddings
      CPU: i7-7700HQ CPU @ 2.80GHz

      Test            Min      Max      Mean
      test_all_cpu    26.0656  28.6466  27.1232
      test_on_7_cpu   26.9883  32.5846  28.8077
      test_on_6_cpu   26.9680  31.8769  29.0395
      test_on_5_cpu   27.0186  32.4739  30.0355
      test_on_4_cpu   28.0154  37.4240  32.2915
      test_on_3_cpu   34.6486  44.4604  37.7983
      test_on_2_cpu   48.0222  60.0206  50.0859
      test_on_1_cpu   83.4291  96.7086  88.9920

      Tensorflow (explanation): ● intra_op_parallelism_threads=[0, NUM_CORES] (0 - best) ● inter_op_parallelism_threads=[0, NUM_CORES] (0 - best)
      Conclusion: Some models can be executed in parallel without a large performance loss, as long as each one uses no more than half of the total number of cores.
      ⒸDataI & Taras Lishchenko
  • 31. One Session vs Multiple Sessions Model: Body Embeddings Conclusion: Performance benefits from a Single Graph in a Single Session only on CPU, and the difference is not significant (12%) ⒸDataI & Taras Lishchenko
  • 32. Resize Methods Nearest Neighbours Resize with different implementations Conclusion: OpenCV is faster at resizing. Use the Nearest Neighbours method for maximum performance ⒸDataI & Taras Lishchenko
  • 33. np.stack instead of np.concatenate (Batch) Batch size: 20 Image size: 640x360x3 Conclusion: When collecting images into a batch: - put them in a list - np.stack(list) ⒸDataI & Taras Lishchenko
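The batching recipe from this slide, sketched with the slide's batch size and image size. The zero-filled frames are placeholders, and no timing is measured here; the snippet only shows the two calls and checks that they produce the same array.

```python
import numpy as np

BATCH_SIZE = 20
HEIGHT, WIDTH, CHANNELS = 360, 640, 3   # a 640x360 RGB frame

# Collect frames in a plain Python list first...
frames = [np.zeros((HEIGHT, WIDTH, CHANNELS), dtype=np.uint8) for _ in range(BATCH_SIZE)]

# ...then build the batch with a single call; np.stack adds a new leading axis.
batch = np.stack(frames)

# The np.concatenate equivalent needs the batch axis added to every frame by hand.
batch_concat = np.concatenate([frame[np.newaxis] for frame in frames], axis=0)

print(batch.shape)
print(np.array_equal(batch, batch_concat))
```

Appending to a list and stacking once avoids reallocating a growing array for every incoming frame.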
  • 34. Object Tracking ⒸDataI & Taras Lishchenko
  • 35. Evolution - Dlib Tracker - IOU (base version) - no thresholding by confidence - Configurable track start, drop, IOU thresholds, etc. - IOU (extended) (with Hungarian Algorithm) ⒸDataI & Taras Lishchenko
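A minimal sketch of IOU-based association between existing tracks and new detections. It uses a greedy match, which is an assumption about how a "base version" might look and not the authors' implementation; the function names, box format and threshold value are illustrative.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_greedy(tracks: Dict[int, Box], detections: List[Box],
                 iou_threshold: float = 0.5) -> Dict[int, int]:
    """Give each track the unused detection with the highest IOU above the threshold."""
    assignments: Dict[int, int] = {}
    used = set()
    for track_id, track_box in tracks.items():
        best_idx, best_iou = -1, iou_threshold
        for idx, det in enumerate(detections):
            score = iou(track_box, det)
            if idx not in used and score >= best_iou:
                best_idx, best_iou = idx, score
        if best_idx >= 0:
            assignments[track_id] = best_idx
            used.add(best_idx)
    return assignments


tracks = {1: (0, 0, 10, 10), 2: (50, 50, 60, 60)}
detections = [(51, 50, 61, 60), (1, 0, 11, 10), (200, 200, 210, 210)]
matches = match_greedy(tracks, detections)
print(matches)
```

Unmatched detections (index 2 here) would start new tracks, and tracks left without a match for several frames would be dropped. The extended version on the slide replaces the greedy loop with an optimal assignment; `scipy.optimize.linear_sum_assignment` on a cost matrix of `1 - IOU` is one common way to do that.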
  • 36. Problem #1 with Body Detector & IOU Tracking ⒸDataI
  • 38. Negative classes ● asymmetric Accuracy: 0.940 ● bad Accuracy: 0.951 ● bad_angle Accuracy: 0.875 ● bad_manual Accuracy: 0.824 ● blurred Accuracy: 0.951 ● many Accuracy: 0.850 ● no_face Accuracy: 0.953 ● no_landmarks Accuracy: 0.918 ● not_inside Accuracy: 0.941 Positive class As close to ISO/IEC 19794-5:2005 compliant photo as possible ⒸDataI & Oleksiy Udod
  • 40. Identification pipeline Face detector Face alignment Age + Gender models Face Embedding Face Search ⒸDataI
  • 41. Closed set vs Open set search Closed-set evaluation: ● cumulative match characteristics (CMC) curves ● receiver operating characteristic (ROC) curves. Open-set evaluation ● detection and identification rate (DIR) curves (TPIR,FPIR,FNIR,...) ⒸDataI
  • 42. Open-set video evaluation Gallery: a set of images of interest. Probes: a set of images used for querying. In our case a probe might be: ● the best-quality face image among all images within the same person track ● any face image from the video ⒸDataI & Vlad Khizanov
  • 43. Metrics for Open Set Evaluation Definition: A query succeeds if its top result has similarity greater than t. FPIR(t) = # of successful non-mate search queries / # of queries TPIR(t) = # of successful mate search queries / # of queries MISS(t) = # of unsuccessful search queries / # of queries FPIR(t) + TPIR(t) + MISS(t) = 1 Note: sometimes *FPIR = 1 - FPIR Note: here TPIR(t) = TPIR(t, 1), where 1 is a rank Mate searches are those for which the person in the search image has a face image in the enrolled dataset. Non-mate searches are those for which the person in the search image does not have a face image in the enrolled dataset. ⒸDataI & Vlad Khizanov
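The three rates can be computed directly from the definitions on this slide. The sketch below follows them literally: a query counts as successful when its top similarity exceeds t, and all three rates are normalised by the total number of queries. The `Query` tuple layout and the sample scores are made up for illustration.

```python
from typing import List, Tuple

# Each query: (similarity of the top result, True if it is a mate search)
Query = Tuple[float, bool]


def open_set_rates(queries: List[Query], t: float) -> Tuple[float, float, float]:
    """Return (FPIR, TPIR, MISS) at threshold t, each as a share of all queries."""
    n = len(queries)
    fp = sum(1 for sim, is_mate in queries if sim > t and not is_mate)
    tp = sum(1 for sim, is_mate in queries if sim > t and is_mate)
    miss = sum(1 for sim, _ in queries if sim <= t)
    return fp / n, tp / n, miss / n


queries = [(0.9, True), (0.8, True), (0.4, True), (0.7, False), (0.2, False)]
fpir, tpir, miss = open_set_rates(queries, t=0.5)
print(fpir, tpir, miss)
```

Calling `open_set_rates` for a range of thresholds gives the points of the FPIR/TPIR curve. A fuller evaluation would also check that the top result of a mate search is the correct identity, which this literal reading of the definition does not do.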
  • 44. Metrics on chart Metrics are usually visualized as a curve in parametric form: x(t) = FPIR(t) y(t) = TPIR(t) t = 0.0, 0.01, 0.02, …, 1.0 Note: in practice it is useful to pick an optimal threshold. FNIR = 1 - TPIR FPIR FPIR@FNIR ⒸDataI & Vlad Khizanov
  • 45. Extreme Value Machine Given the conditions for the Margin Distribution Theorem, the probability that $x'$ is included in the boundary estimated by $x_i$ is given by: $\Psi(x_i, x'; \kappa_i, \lambda_i) = \exp\left(-\left(\frac{\lVert x_i - x' \rVert}{\lambda_i}\right)^{\kappa_i}\right)$ (1) where $\lVert x_i - x' \rVert$ is the distance of $x'$ from sample $x_i$, and $\kappa_i$, $\lambda_i$ are Weibull shape and scale parameters respectively, obtained from fitting to the smallest pairwise margin estimate. ⒸDataI & Oleksiy Udod https://arxiv.org/pdf/1506.06112.pdf
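Equation (1) is a Weibull-shaped decay of the inclusion probability with distance, exp(-(d / λ_i)^κ_i) with d the distance between x_i and x'. A small sketch of evaluating it follows; the Euclidean distance is an assumption (the slide does not say which distance is used for face embeddings), and the parameter values in the example are arbitrary rather than fitted.

```python
import math
from typing import Sequence


def psi(x_i: Sequence[float], x_prime: Sequence[float],
        kappa_i: float, lambda_i: float) -> float:
    """Probability of inclusion: exp(-(||x_i - x'|| / lambda_i) ** kappa_i)."""
    # Assumption: Euclidean distance between the two embedding vectors.
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_prime)))
    return math.exp(-((distance / lambda_i) ** kappa_i))


same_point = psi([0.0, 0.0], [0.0, 0.0], kappa_i=2.0, lambda_i=5.0)
at_scale = psi([0.0, 0.0], [3.0, 4.0], kappa_i=2.0, lambda_i=5.0)  # distance is 5
print(same_point, at_scale)
```

The probability is 1 at the sample itself and falls to exp(-1) when the distance equals the scale parameter λ_i, regardless of the shape κ_i. A probe whose probability stays below a chosen threshold for every gallery sample would be treated as unknown.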
  • 46. ⒸDataI & Oleksiy Udod @135k distractors
  • 47. Edge-to-Cloud Face detector Face alignment Age + Gender models Face Embedding Face Search Person detector Body Keypoint classifier Face selector Action recognition Tracker TRACK with metadata Decisions, BA material CLOUD On-Premise ⒸDataI
  • 48. Serving with RESTful API Market server N Storage AWS Consumers auto scaling group Local kafka broker Mirror Maker Market server 1 Local kafka broker Mirror Maker Consumer A Consumer B Consumer C Kafka broker 1 Kafka broker 2 Kafka broker 3 N partition N partition N partition RESTful API Model1 RESTful API Model2..n ⒸDataI & Konstantin Bulgakov
  • 49. Serving in Kafka Streams Market server N Storage AWS Consumers with TF, auto scaling group Local kafka broker Mirror Maker Market server 1 Local kafka broker Mirror Maker Consumer A Consumer B Consumer C Kafka broker 1 Kafka broker 2 Kafka broker 3 N partition N partition N partition ⒸDataI & Konstantin Bulgakov
  • 50. Msg size distribution Size distribution for 14K records was measured at the producer side, counting the str length of every message *1 str element ~ at least 1 byte ⒸDataI & Olesia Stestsiuk
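The footnote about string length versus bytes can be illustrated as follows: the character count of a message is a lower bound on its encoded size, and the two differ as soon as non-ASCII characters appear. The sample messages and the choice of UTF-8 are assumptions for the example.

```python
from typing import List, Tuple


def message_sizes(messages: List[str]) -> List[Tuple[int, int]]:
    """Return (number of characters, number of UTF-8 bytes) for each message."""
    return [(len(msg), len(msg.encode("utf-8"))) for msg in messages]


# Hypothetical producer-side payloads.
messages = ['{"track_id": 1, "age": 31}', '{"name": "Zoë"}']
sizes = message_sizes(messages)
for chars, nbytes in sizes:
    print(chars, nbytes)
```

Measuring the encoded length at the producer gives the size that actually travels through the broker, before any compression.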
  • 51. t2.medium vs t2.xlarge ⒸDataI & Olesia Stestsiuk
  • 52. Pyflame ● based on the Linux ptrace(2) system call, not sys.settrace() ● no modification of the source code required ● profiling embedded Python interpreters like uWSGI ● profiling multi-threaded Python programs ● written in C++, with attention to speed and performance ● Pyflame usually introduces less overhead than the builtin profile (or cProfile) modules, and also emits richer profiling data. Just: sudo pyflame -s 600 -r 0.001 --threads -p 1493 | ./flamegraph.pl > 10_min_every_milisec.svg http://eng.uber.com/pyflame/ ⒸDataI & Olesia Stestsiuk
  • 53. How to read Flame Graphs ● Each box represents a function in the stack (a "stack frame"). ● The y-axis shows stack depth (number of frames on the stack). The top box shows the function that was on-CPU. Everything beneath that is ancestry. The function beneath a function is its parent, just like the stack traces shown earlier. ● The x-axis spans the sample population. It does not show the passing of time from left to right, as most graphs do. The left to right ordering has no meaning (it's sorted alphabetically to maximize frame merging). ● The width of the box shows the total time it was on-CPU or part of an ancestry that was on-CPU (based on sample count). Functions with wide boxes may consume more CPU per execution than those with narrow boxes, or, they may simply be called more often. The call count is not shown (or known via sampling). ● The sample count can exceed elapsed time if multiple threads were running and sampled concurrently. ⒸDataI & Olesia Stestsiuk
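The "width equals sample count" rule can be made concrete with the folded-stack text that flamegraph.pl consumes: one line per distinct stack, frames separated by semicolons, followed by a space and the number of samples. The sketch below sums, for each function, the samples of every stack it appears in. The stack names and counts are invented, and summing per function name across different stacks is a simplification of what the graph draws per box.

```python
from collections import Counter
from typing import Iterable


def inclusive_samples(folded_lines: Iterable[str]) -> Counter:
    """Count, per function, the samples in which it was anywhere on the stack."""
    totals: Counter = Counter()
    for line in folded_lines:
        stack, _, count = line.rpartition(" ")
        # set() so that a recursive function is counted once per sample.
        for frame in set(stack.split(";")):
            totals[frame] += int(count)
    return totals


folded = [
    "main;read_frame 30",
    "main;detect;resize 20",
    "main;detect;infer 50",
]
totals = inclusive_samples(folded)
for frame, samples in totals.most_common():
    print(frame, samples)
```

Here `detect` is wide because its children `resize` and `infer` were on-CPU, not because `detect` itself did the work, which is the distinction between a box's own time and its ancestry described above.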
  • 54. ⒸDataI & Olesia Stestsiuk
  • 56. Know your product requirements
  • 57. Use APIs and SDKs even if they are not free
  • 58. When you are using something, know the limits of the execution
  • 59. End-to-end testing in data-driven products is a disaster and the best technical feedback.
  • 60. Roman Storchak, PhD, CTO @ DatAI roman.storchak@gmail.com https://www.linkedin.com/in/storchak/ +38(063)617-61-15