Solr and Machine Vision
Scott Cote and Trevor Grant
Lucidworks / IBM
ABOUT US
Trevor Grant
• PMC: Apache Mahout, Apache Streams
• IBM: Open Source Evangelist, “AI Engineer”
• @rawkintrevo
• www.rawkintrevo.org
Scott Cote
• Organizer: DFW Data Science, Mahout Fan
• Lucidworks: Senior Software Engineer (Fusion Core Team)
• @scottccote @dfwdatascience
ACT 1 The Maths
DEEP LEARNING: AN OVERVIEW
• Deep learning is an exciting new technology with numerous applications, such as detecting cats in pictures, creating nonsensical manuscripts, “completing” unfinished symphonies, magically returning your company to profitability after decades of poor management through clever application of buzzwords, etc.
WHO’S INTERESTED IN
“DEEP LEARNING”
WHAT I THINK YOU DO
DEEP LEARNING: SLOW
TRAINING / PREDICTION
TIMES
ALTERNATIVELY- EXPENSIVE
(AND STILL SLOW)
BECAUSE YOU DON’T
IMAGE DETECTION
                      Haar Cascade Filters               Deep Learning
Speed of training     Days                               Months
Speed of prediction   Ultrafast                          Not great
Accuracy              Slightly lower                     Higher to MUCH higher (domain)
Type of recognition   Well understood problem (faces)    Poorly understood problem (dark matter)
Best Use-case
•  You understand the domain
•  You can use multiple methods
•  You have limited resources:
   •  Limited Time
   •  Limited Compute Power
   •  Limited $$$
DON’T HURT YOUR EYES
(IMAGE DETECTION PUN)
”FAST PREDICTION” IS
RELATIVE
REAL TIME VIDEO- OK, NOT
GOOD ENOUGH
LAST MEMES FOR A WHILE
LESS HATER-Y
• “Neural nets are universal function approximators.”
  - Jake Manix, talk an hour ago.
• When milliseconds count, we can’t afford to approximate.
  - Me, now.
ANCIENT PARADIGM
• Fast (training and prediction time)
• Right (highest accuracy)
• Cheap (in dollars and in hardware)
Plotted against these: GPU Deep Learning, Haar-Cascade Filters, CPU Deep Learning
CASCADE FILTER OVERVIEW
• Scans for areas that match certain patterns.
• Historical Context of Cascade Filters
CASCADE FILTER
CASCADE FILTER
(AREAWISE)
EIGENFACES (FACIAL RECOGNITION) OVERVIEW
• Similar to Principal Component Analysis:
  - We seek to reduce the dimensionality of images (tens of thousands of individual pixels) to a composition of “eigenfaces”.
  - A face (as a 250x250 image) is represented as a vector of length 62500 (250 x 250 = 62500 pixels).
  - If we decompose it into a combination of 130 eigenfaces, we can represent a face with a vector of length 130.
  - Advantages over “Deep Learning”:
    - Quicker to identify a face
    - Quicker to retrain
    - Can instantaneously add a new face to the dataset
• History of Eigenfaces:
WHY NOT LANDMARK
RECOGNITION
EIGENFACES (FACIAL
RECOGNITION)
EIGENFACES (PIXELWISE)
Squares represent pixels…
EIGENFACES (PIXELWISE)
22 85 54 123
56 187 92 91
111 204 103 245
8 247 155 212
239 87 99 84
Squares represent pixels…
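As a rough illustration of how a grid of pixel values like the one above turns into a single vector, here is a minimal sketch. The row-major ordering and the helper name `flatten_image` are assumptions made for the example, not something the slides specify.

```python
def flatten_image(pixels):
    """Flatten a 2-D grid of grey values into one vector, row by row."""
    return [value for row in pixels for value in row]

# The 5 x 4 grid of pixel values shown on the slide.
pixels = [
    [22, 85, 54, 123],
    [56, 187, 92, 91],
    [111, 204, 103, 245],
    [8, 247, 155, 212],
    [239, 87, 99, 84],
]

face_vector = flatten_image(pixels)  # length 5 * 4 = 20
```

A 250x250 image flattened the same way gives the length-62500 vector mentioned in the overview.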
EIGENFACES (FACIAL
RECOGNITION)
Matrix of Faces: row i is the ith image, column j is the jth pixel position
EIGENFACES: SINGULAR VALUE DECOMPOSITION
Matrix of Faces = U x V
EIGENFACES: MATRIX V
Matrix of Faces = U x V (Eigenfaces)
EIGENFACES: MATRIX U
Matrix of Faces = U x V
Linear combinations of Eigenfaces required to form the Nth Face:
[Nth face] = 2.456 x [eigenface] - 7.2345 x [eigenface] + 0.4125 x [eigenface]
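The decomposition on these slides can be tried out on a toy matrix with NumPy. This is only a sketch: it uses NumPy's dense SVD rather than the distributed stochastic SVD from Mahout used in the talk, and the matrix size, random data, and the names `faces`, `k`, `eigenfaces` and `coeffs` are all assumptions. No mean-centering is applied, since the slides do not mention it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the matrix of faces: 6 "images" of 4x4 = 16 pixels each.
# Row i is the i-th image, column j is the j-th pixel position.
faces = rng.integers(0, 256, size=(6, 16)).astype(float)

# Thin SVD: faces = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(faces, full_matrices=False)

k = 3                      # number of eigenfaces to keep (130 in the talk)
eigenfaces = Vt[:k]        # each row is one eigenface of length 16
coeffs = U[:, :k] * s[:k]  # row i holds the weights that rebuild face i

# Rank-k approximation: every face is a linear combination of the eigenfaces.
approx = coeffs @ eigenfaces
```

Here the singular values are folded into `coeffs`, so each face is described by `k` numbers instead of 16, which mirrors the 62500-to-130 reduction described earlier.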
NEW FACES
y: the new face vector
X: V Transpose (each column is an eigenface)
NEW FACES
Simple Regression (OLS, Ordinary Least Squares): regress y on X to obtain β
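The regression step for a new face can be sketched as follows. The eigenfaces here are random orthonormal vectors standing in for real ones, and `true_beta` reuses the three weights from the earlier slide purely for illustration; everything apart from the OLS idea itself is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical eigenfaces: 3 orthonormal columns over 16 pixel positions.
X, _ = np.linalg.qr(rng.normal(size=(16, 3)))

# A "new face" y built from known weights, so the fit can be checked.
true_beta = np.array([2.456, -7.2345, 0.4125])
y = X @ true_beta

# Ordinary least squares: find beta minimising ||y - X beta||^2.
beta, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
```

Because the columns of `X` are orthonormal in this sketch, the OLS solution coincides with the plain projection `X.T @ y`, which is part of why recognising a new face is cheap.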
RECAP
• Cascade Filters: facial detection (is there a “face” in this picture, and where?)
• Eigenfaces: facial recognition (WHO am I looking at?)
• Neural nets / deep learning: could do both in one pass, but very, very slowly.
ACT 2 Real-time Facial Recognition
CREATING THE EIGENFACES: COMPUTING
• Apache Spark: an in-memory map-reduce engine (it has a weak ML library, but we won’t use it).
• Apache Mahout: provides a Distributed Stochastic Singular Value Decomposition method. (Also provides a mathematically expressive Scala DSL and GPU/CPU acceleration.)
• Creating the eigenfaces: the Spark job took 45 minutes on a desktop with 32GB RAM and 8 CPUs @ 3.9GHz, but I was also watching Rick and Morty.
• THIS JOB CAN BE GPU ACCELERATED BY CHANGING ONE DEPENDENCY.
CREATING THE EIGENFACES: DATASET
• University of Mass. Faces in the Wild Dataset: 10k images of labeled faces from the internet. Each image is 250x250 (62500 pixels).
10k Faces Dataset Matrix (10,000 x 62500)
- Each row corresponds to 1 image of a face
- Each column corresponds to a given pixel position
APACHE MAHOUT ON APACHE SPARK CALCULATES EIGENFACES
10k Faces Dataset Matrix = Linear Combos x Eigenfaces
OPEN CV DETECTS FACES IN
VIDEO FRAME
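SCALE THE IMAGE TO 250X250
MAHOUT DECOMPOSES FACERECT INTO LINEAR COMBINATION OF EIGENFACES VECTOR
The scaled image grid
1 2 3
4 5 6
7 8 9
is flattened into the column vector (1, 2, 3, 4, 5, 6, 7, 8, 9).
Ordinary Least Squares against the Eigenfaces then yields the Linear Combination of Eigenfaces.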
SEARCH SOLR FOR
MATCHING VECTOR
DOCUMENT
THE QUERY, RESPONSE, AND
DOCUMENTS
{
name_s: “Richard Hatch”,
e0_d : 1.512
e1_d : 5.125
e2_d : -15.1256
e3_d : 4.241
…
e129_d : 1.245
...
call_sign_s : “Apollo”
last_seen_dt : 2017-02-08T08:52:12
alias_s : “Tom Zarek”
...
}
Query: score ALL documents by Euclidean distance to the query vector and return them in ascending order.
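What this query asks for can be mimicked in a few lines of plain Python. This is a sketch of the ranking logic only, not of how Solr evaluates it; the three-coefficient documents, the `e` key and the function names are made up for the example (the talk uses 130 coefficients per face).

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length coefficient vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_documents(query, documents, rows=5):
    """Score every document against the query and return the closest first."""
    scored = [
        {"name_s": doc["name_s"], "calcDist": euclidean(query, doc["e"])}
        for doc in documents
    ]
    return sorted(scored, key=lambda d: d["calcDist"])[:rows]

# Made-up documents with three coefficients each.
documents = [
    {"name_s": "Apollo", "e": [1.5, 5.1, -15.1]},
    {"name_s": "Starbuck", "e": [9.0, -2.0, 4.0]},
    {"name_s": "Caprica 6", "e": [-7.0, 3.0, 0.5]},
]
results = rank_documents([1.0, 5.0, -15.0], documents)
```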
THE QUERY, RESPONSE, AND
DOCUMENTS
Response:
[ { “name_s” : “Apollo”, “calcDist” : 1256.254, “lastseen_pdt”: 1979-05-11T08:41:25},
{ “name_s” : “Tom Zarek”, “calcDist” : 1826.529, “lastseen_pdt”: 2017-02-07T08:41:25},
{ “name_s” : “Starbuck”, “calcDist” : 5826.529, “lastseen_pdt”: 2017-09-14T15:22:56},
{ “name_s” : “Caprica 6”, “calcDist” : 7119.525, “lastseen_pdt”: 2017-09-14T08:41:25},
…
]
RECOGNIZE OR NEW ENTITY?
Response → Recognize?
• Yes: Done
• No: Add Person to Solr, then Done
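The decision in this flow might look like the sketch below. The distance threshold `max_dist` and the in-memory `known_names` list are assumptions; the slides do not say how "recognize" is decided. The `HUMAN-0001` style of placeholder name is borrowed from a later slide.

```python
def recognize_or_add(results, known_names, max_dist):
    """Return (name, is_new) given a result list sorted by ascending distance."""
    if results and results[0]["calcDist"] <= max_dist:
        return results[0]["name_s"], False
    # No close enough match: mint a placeholder identity to be written to Solr.
    new_name = "HUMAN-%04d" % (len(known_names) + 1)
    known_names.append(new_name)
    return new_name, True

known_names = []
response = [{"name_s": "Apollo", "calcDist": 1256.254}]
match = recognize_or_add(response, known_names, max_dist=2000.0)
stranger = recognize_or_add(response, known_names, max_dist=1000.0)
```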
WHO DOES THE WORK?
Local
• Advantages:
  - Edge device can use context clues to make the final decision
• Disadvantages:
  - Requires more hardware at the edge to “think”
On Solr
• Advantages:
  - Leverage advantages of Solr
  - Less hardware required at the edge
• Disadvantages:
  - “Contextual clues” must be encoded in the query
ACT 3 Building your own Cylons
DRONES ARE GETTING CHEAP
• Drone 2-Pack
  - $99.99
  - Controlled via smartphone
• FPV Camera
  - $39.99 / ea
  - Video over WiFi via RTSP
Video-enabled drones for ~$90 each
CHALLENGES AND OPPORTUNITIES
Challenge:
- Cascade filters frame the face inconsistently
- “Ghost Faces”
- Eigenfaces are not robust to facial expressions, changes in light, etc.
Opportunity:
- Video gives us a lot more “context clues” than still frames.
- People don’t sporadically disappear and reappear.
- Someone seen recently is more likely to be present than someone seen long ago.
OPENCV DETECTS FACES IN
A VIDEO FRAME
2 PROBLEMS
1.  The face is inconsistently detected (Eigenfaces is sensitive to this).
2.  Shadows, patterns on clothes, etc. cause “ghost faces” to be identified sporadically.
SOLUTION: CLUSTERING/FILTERING/WINDOWING
• Proposal: Cluster faces by location in frame. If a cluster has fewer than N faces, remove all faces in that cluster (i.e. ghost clusters).
• Problem-2: People move around the frame over time.
• Proposal-2: Break frames up into a sliding window of M seconds.
• Problem-3: Clustering/machine learning can be somewhat computationally expensive.
• Proposal-3: Canopy clustering (an old but still effective method: one-pass clustering).
CANOPY CLUSTERING
• Create N-second window
• Cluster faces in window
• Quick and dirty clustering, but effective:
  - First point is the “center”.
  - All points within distance t2 are “in that cluster”.
  - If a point is not within t2 of any cluster, it becomes a new cluster center.
t2 = max square width
OPENCV DETECTS FACES IN VIDEO FRAME
t2 = max square width
First rect: new cluster
Second rect: within one width of first rect (same cluster)
Third rect: within one width of first rect (same cluster)
Fourth rect: NOT within one width of first rect (new cluster)
Fifth rect: within one width of first rect (same cluster)
Finally, any cluster with fewer than two entities in the window gets filtered out.
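The five-rectangle walk-through can be reproduced with a small single-pass clusterer. This is a simplified sketch of the procedure as the slides describe it (one threshold, first matching centre wins); it is not Mahout's canopy implementation, and the rectangle centres and the value of `t2` below are invented for the example.

```python
import math

def canopy_cluster(points, t2):
    """Single pass: join the first centre within t2, otherwise start a new cluster."""
    clusters = []  # each cluster: {"center": (x, y), "members": [(x, y), ...]}
    for p in points:
        for c in clusters:
            if math.dist(p, c["center"]) <= t2:
                c["members"].append(p)
                break
        else:
            # Not within t2 of any existing centre: p becomes a new centre.
            clusters.append({"center": p, "members": [p]})
    return clusters

def drop_ghosts(clusters, min_size=2):
    """Discard clusters with fewer than min_size detections in the window."""
    return [c for c in clusters if len(c["members"]) >= min_size]

# Centres of five detected rects within one window; the fourth is a "ghost".
rect_centres = [(100, 100), (105, 98), (97, 103), (300, 220), (102, 101)]
clusters = canopy_cluster(rect_centres, t2=50)
true_faces = drop_ghosts(clusters)
```

With these inputs the first cluster collects four detections and the fourth rect forms a one-member cluster that the filter removes.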
CANOPY CLUSTERING TO
REMOVE “GHOST” FACES
CLUSTERING BECAUSE WE
DON’T KNOW HOW MANY
TRUE FACES THERE ARE
SETTING THE “LOOSE DISTANCE”
Half the width of largest rectangle is the “Loose Distance”
SETTING THE “TIGHT DISTANCE”
Half the width of largest rectangle is the “Loose Distance”
ADAPTIVE HYPER-PARAMETERS
• A very simple machine learning algorithm adapts itself in real time to the input it is receiving…
• A.I. is a strong buzzword but...
A BETTER WAY TO SOLR (1)
SEARCH SOLR FOR
MATCHING VECTOR
DOCUMENT
A BETTER WAY TO SOLR
{
name_s: “Richard Hatch”,
e0_d : 1.512
e1_d : 5.125
e2_d : -15.1256
e3_d : 4.241
…
e129_d : 1.245
...
call_sign_s : “Apollo”
last_seen_dt : 2017-02-08T08:52:12
alias_s : “Tom Zarek”
...
}
LEVERAGE PAYLOAD CAPABILITY OF TERM FIELDS
• New Query
  q=*:*
  &sort=dist(
    2
    ,payload(e_dpf,e0)
    ,payload(e_dpf,e1)
    ...
    ,payload(e_dpf,e129)
    ,x_e0
    ,x_e1
    ...
    ,x_e129
  ) asc
  &rows=5
{
name_s : “Richard Hatch”
,e_dpf : “e0|1.512 e1|5.125 … e129|1.245”
…
,call_sign_s : “Apollo”
,last_seen_dt : “2017-02-08T08:52:12”
,alias_s : “Tom Zarek”
…
}
Thank you Erik Hatcher (SOLR-1485  https://issues.apache.org/jira/browse/SOLR-1485)
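A client could build the payload field value and the sort clause along these lines. The field name `e_dpf`, the `e0|1.512` term format, and the `dist` and `payload` functions come from the slide; the helper names, and the choice to inline literal query values where the slide shows `x_e0` style placeholders, are assumptions for this sketch.

```python
def to_payload_field(coeffs, prefix="e"):
    """Encode coefficients as space-separated term|payload pairs."""
    return " ".join(
        "%s%d|%s" % (prefix, i, repr(float(c))) for i, c in enumerate(coeffs)
    )

def dist_sort_clause(query_coeffs, field="e_dpf", prefix="e"):
    """Build a dist(2, ...) sort clause: document payloads first, then query values."""
    n = len(query_coeffs)
    doc_side = ["payload(%s,%s%d)" % (field, prefix, i) for i in range(n)]
    query_side = [repr(float(c)) for c in query_coeffs]
    return "dist(2," + ",".join(doc_side + query_side) + ") asc"

field_value = to_payload_field([1.512, 5.125, -15.1256])
sort_clause = dist_sort_clause([1.0, 2.5])
```

Storing all coefficients in one payload field replaces the 130 separate `e0_d` … `e129_d` fields of the earlier document layout.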
THESE METHODS
BETTER SCALING
Cluster1 Cluster2 Cluster3
Cluster2a Cluster2b
WINDOWING
• A video is just a stream of frames.
• Apache Flink gives us a nice API for splitting/joining the stream, as well as creating windows and applying functions to the windows. (Other bonuses too.)
ENTER THE STREAM:
OPENCV DETECTS FACES
ENTER THE STREAM: MAHOUT CANOPY CLUSTER
An n-m-second sliding window:
Every m seconds this window emits a set of clusters based on the last n seconds of data. For example, with 5-1, every 1 second a new set of “face zones” is emitted, based on the faces detected in the previous 5 seconds.
(Or 0.5 / 0.1: every 10th of a second, based on the last half second.)
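The n-m window semantics can be illustrated without any streaming engine. The sketch below is plain Python, not the Flink API; the `(timestamp, item)` event format, the function name, and the convention that a window ending at time `end` covers `(end - size, end]` are assumptions.

```python
def sliding_windows(events, size, slide):
    """Return [(window_end, items)], one window every `slide` seconds.

    Each window holds the events whose timestamp t satisfies end - size < t <= end.
    """
    windows = []
    if not events:
        return windows
    last = max(t for t, _ in events)
    k = 1
    while (k - 1) * slide < last:
        end = k * slide
        items = [item for t, item in events if end - size < t <= end]
        windows.append((end, items))
        k += 1
    return windows

# Face detections stamped with the second at which they were seen.
events = [(0.5, "a"), (1.5, "b"), (2.5, "c"), (6.5, "d")]
windows = sliding_windows(events, size=5, slide=1)  # the "5-1" window
```

Each emitted list is what would be handed to the clustering step for that window.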
ENTER THE STREAM: A LAG
Here a small lag is introduced.
“APPLY THE CLUSTERS”
BASED ON FIT CANOPIES
Face Cluster 1 Face Cluster 1 Face Cluster 1 Face Cluster 1 Face Cluster 2 (only 1 image- Ghost)
STORE OUR MEMORIES IN
SOLR
METHOD1: AVERAGING
1.  Take all Face Rects in Cluster.
2.  Average them All together.
3.  Search Solr for this averaged image.
4.  If this “Average Face” matches a face in the cluster (within
some distance tolerance) we assign that name to every face
in the cluster- and write all faces to Solr as that person’s name.
5.  Otherwise- we create a new name, and write all faces to Solr under the new
Name.
6.  This really doesn’t work very well at all.
7.  ADVANTAGE: Minimize network traffic/SOLR taxation
STORE OUR MEMORIES IN
SOLR
METHOD2: “VOTING”
1.  Search EACH face
2.  Get list of names in results
3.  Assign points based on rank or distance
4.  Aggregate points across all rects, highest points “wins”- if winner has some
minimum threshold, assign that name.
5.  Otherwise- we create a new name, and write all faces to Solr under the new
Name.
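One possible form of the voting step is sketched here. The slides leave the scoring open ("rank or distance"), so the reciprocal-rank points (1, 1/2, 1/3, ...) and the `min_points` threshold are assumptions chosen only to make the example concrete.

```python
from collections import defaultdict

def vote(result_lists, min_points):
    """Aggregate rank-based points across per-face result lists.

    Returns the winning name, or None if no name reaches min_points.
    """
    points = defaultdict(float)
    for results in result_lists:
        for rank, hit in enumerate(results):
            points[hit["name_s"]] += 1.0 / (rank + 1)  # assumed scoring scheme
    if not points:
        return None
    winner = max(points, key=points.get)
    return winner if points[winner] >= min_points else None

# Search results for three face rects from the same cluster.
result_lists = [
    [{"name_s": "Apollo"}, {"name_s": "Starbuck"}],
    [{"name_s": "Apollo"}, {"name_s": "Tom Zarek"}],
    [{"name_s": "Starbuck"}, {"name_s": "Apollo"}],
]
winner = vote(result_lists, min_points=2.0)
```

A return value of `None` corresponds to step 5: no winner clears the threshold, so a new name is created.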
PUNCHLINE:
• Second benefit of Eigenfaces over “deep learning”: quickly add faces.
WHY APACHE SOLR
- Capable of storing large amounts of data
- Scales to petabytes; text oriented
- Numeric compute friendly
- Many ways to store different types of data
WHY APACHE MAHOUT
- Engine agnostic (Spark/Flink/Standalone/RYO)
- Native acceleration on CPU/GPU/CUDA
- Possible to accelerate BLAS operations on ANY arch (edge devices)
- Mathematically expressive Scala
WHY APACHE FLINK
- Sophisticated windowing functions
- Complex Event Library
- Scales linearly (1 drone vs. army of drones)
TECHNICALLY “BORG-STYLE” AI, NOT CYLONS
• A finer technical point for those familiar with the Cylons and the Borg
• “Hive Mind” Architecture
“NEW HUMAN-0001”
“OH, hai HUMAN-0001”
LEARNING PROPAGATES
QUICKLY
SHAPE OF THINGS TO COME.
The “science fiction” of 10 years ago is today the domain of hobbyists.
The demo presented here is “Science Fair” grade AI.
Vlad Putin was recently talking about how “it is undesirable for anyone to monopolize AI”. (Yay Apache!)
DEMO Here’s a fun video while I set up
Thank You

Solr and Machine Vision - Scott Cote, Lucidworks & Trevor Grant, IBM

  • 1. Solr and Machine Vision Scott Cote and Trevor Grant Lucidworks / IBM
  • 2. ABOUT US Trevor Grant  PMC: Apache Mahout Apache Streams  IBM: Open Source Evangelist “AI Engineer”    @rawkintrevo  www.rawkintrevo.org Scott Cote  Organizer: DFW Data Science Mahout Fan  Lucidworks: Senior Software Engineer (Fusion Core Team)  @scottccote @dfwdatascience
  • 3. ACT 1 The Maths
  • 4. DEEP LEARNING: AN OVERVIEW  Deep learning is an exciting new technology with numerous applications, such as detecting cats in pictures, creating nonsensical manuscripts, “completing” un finished symphonies, magically returning your company to profitability after decades of poor management through clever application of buzzwords, etc.
  • 7. WHAT I THINK YOU DO
  • 8. DEEP LEARNING: SLOW TRAINING / PREDICTION TIMES
  • 11. IMAGE DETECTION Haar Cascade Filters Deep Learning Speed of training Days Months Speed of prediction Ultrafast Not great Accuracy Slightly lower Higher to MUCH higher (domain) Type of recognition Well understood problem (faces) Poorly understood problem (darkmatter) Best Use-case •  You understand the domain •  You can use multiple methods •  You have limited resources: •  Limited Time •  Limited Compute Power •  Limited $$$
  • 12. DON’T HURT YOUR EYES (IMAGE DETECTION PUN)
  • 14. REAL TIME VIDEO- OK, NOT GOOD ENOUGH
  • 15. LAST MEMES FOR A WHILE
  • 16. LESS HATER-Y  “Neural Nets are universal function approximates” - Jake Manix, talk an hour ago.  When milliseconds count- we can’t afford to approximate. - Me, Now.
  • 17. ANCIENT PARADIGM Fast (Training and Prediction Time) Right (Highest accuracy) Cheap (In dollars and in hardware) GPU Deep Learning Haar-Cascade Filters CPU Deep Learning
  • 18. CASCADE FILTER OVERVIEW  Scans for areas that match certain patterns.  Historical Context of Cascade Filters
  • 22. EIGENFACES (FACIAL RECOGNITION) OVERVIEW  Similar to Principal Component Analysis- ­  We week reduce dimensionality of images (tens of thousands of individual pixels) to a composition of “eigenfaces” ­  A face (as a 250x250 image) is represented as a vector of length 62500 (250 x 250 = 62500 pixels) ­  If we decompose into a combination of 130 Eigenfaces, we can represent a face with a vector of length 130. ­  Advantages over “Deep Learning” ­  Quicker to identify face ­  Quicker to retrain ­  Can instantaneously add new face to dataset  History of Eigenfaces:
  • 29. EIGENFACES (PIXELWISE) 22 85 54 123 56 187 92 91 111 204 103 245 8 247 155 212 239 87 99 84 Squares represent pixels…
  • 30. EIGENFACES (PIXELWISE) 22 85 54 123 56 187 92 91 111 204 103 245 8 247 155 212 239 87 99 84 Squares represent pixels…
  • 33. EIGENFACES (FACIAL RECOGNITION) Matrix of Faces ith Image jth Pixel Position
  • 35. EIGENFACES: MATRIX V Matrix of FacesU V (Eigenfaces)x =
  • 36. EIGENFACES: MATRIX U Matrix of FacesU Vx = Linear combinations of Eigenfaces required to form the Nth Face = 2.456 x - 7.2345 x + 0.4125 x
  • 37. NEW FACES y V Transpose (each column is eigenface)
  • 38. NEW FACES y X Simple Regression (OLS) Ordinary Least Squares β
  • 39. RECAP  Cascade Filters: Facial Detection (where/is there a ‘face’ in this picture)  Eigen faces: Facial Recognition (WHO am I looking at?)  Neural nets / deep learning- could do both in one pass- very very slow.
  • 40. ACT 2 Real-time Facial Recognition
  • 41. CREATING THE EIGENFACES: COMPUTING  Apache Spark- an In-Memory Map-Reduce Engine (has weak ML library, however we won’t use).  Apache Mahout- Provides Distributed Stochastic Singular Value Decomposition method. (Also provides Mathematically expressive Scala DSL, and GPU/CPU acceleration)  Creating Eigen faces- Spark Job took 45 minutes on Desktop with 32GB RAM, 8CPUs @ 3.9GHz, but also I was watching Rick And Morty.  THIS JOB CAN BE GPU ACCELERATED BY CHANGING ONE DEPENDENCY.
  • 42. CREATING THE EIGENFACES: DATASET  University of Mass. Faces in the Wild Dataset: 10k images of labeled faces from the internet. Each image is 250x250 (62500 pixels) 10k Faces Dataset Matrix (10,000 x 62500) Each row corresponds to 1 image of a face Each column corresponds to a given pixel position
  • 43. APACHE MAHOUT ON APACHE SPARK CALCULATES EIGENFACES 10k Faces Dataset Matrix = Linear Combos x Eigenfaces
  • 44. OPENCV DETECTS FACES IN VIDEO FRAME
  • 45. SCALE THE IMAGE TO 250X250
  • 46. MAHOUT DECOMPOSES FACERECT INTO LINEAR COMBINATION OF EIGENFACES VECTOR Eigenfaces → Ordinary Least Squares → Linear Combination of Eigenfaces
  • 48. THE QUERY, RESPONSE, AND DOCUMENTS Document: { name_s: “Richard Hatch”, e0_d: 1.512, e1_d: 5.125, e2_d: -15.1256, e3_d: 4.241, …, e129_d: 1.245, ..., call_sign_s: “Apollo”, last_seen_dt: 2017-02-08T08:52:12, alias_s: “Tom Zarek”, ... }
  • 49. THE QUERY, RESPONSE, AND DOCUMENTS Document: { name_s: “Richard Hatch”, e0_d: 1.512, e1_d: 5.125, e2_d: -15.1256, e3_d: 4.241, …, e129_d: 1.245, ..., call_sign_s: “Apollo”, last_seen_dt: 2017-02-08T08:52:12, alias_s: “Tom Zarek”, ... } Query
  • 50. THE QUERY, RESPONSE, AND DOCUMENTS Query: ALL documents, sorted by Euclidean distance in ascending order
  • 51. THE QUERY, RESPONSE, AND DOCUMENTS Response: [ { “name_s” : “Apollo”, “calcDist” : 1256.254, “lastseen_pdt”: 1979-05-11T08:41:25}, { “name_s” : “Tom Zarek”, “calcDist” : 1826.529, “lastseen_pdt”: 2017-02-07T08:41:25}, { “name_s” : “Starbuck”, “calcDist” : 5826.529, “lastseen_pdt”: 2017-09-14T15:22:56}, { “name_s” : “Caprica 6”, “calcDist” : 7119.525, “lastseen_pdt”: 2017-09-14T08:41:25}, … ]
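The ranking that Solr performs for this query can be imitated locally to see what the response means. The sketch below is plain Python, not Solr: the `coeffs` key and the function name `rank_by_distance` are hypothetical, while `name_s` and `calcDist` mirror the field names on the slide.

```python
import math


def rank_by_distance(query, documents, rows=5):
    """Return the `rows` documents closest to `query` by Euclidean distance."""
    scored = []
    for doc in documents:
        # Distance between the query face and the stored eigenface coefficients.
        d = math.dist(query, doc["coeffs"])
        scored.append({"name_s": doc["name_s"], "calcDist": d})
    # Ascending order: the smallest distance is the best match.
    scored.sort(key=lambda r: r["calcDist"])
    return scored[:rows]


docs = [
    {"name_s": "Starbuck", "coeffs": [3.0, 4.0]},
    {"name_s": "Apollo", "coeffs": [1.0, 0.0]},
]
response = rank_by_distance([0.0, 0.0], docs)
```

With two-dimensional toy coefficients, "Apollo" (distance 1.0) ranks ahead of "Starbuck" (distance 5.0), the same ordering principle as the response on the slide.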
  • 53. WHO DOES THE WORK? Local  Advantages:  - Edge device can use context clues to make the final decision  Disadvantages:  - Requires more hardware at the edge to “think” On Solr  Advantages:  - Leverage advantages of Solr  - Less hardware required at the edge  Disadvantages:  - “Contextual clues” must be encoded in the query Response Recognize?
  • 54. ACT 3 Building your own Cylons
  • 55. DRONES ARE GETTING CHEAP  Drone 2-Pack ­  $99.99 ­  Controlled via Smartphone  FPV Camera ­  $39.99 / ea ­  Video over Wifi via RTSP Video enabled drones for ~$90 each
  • 56. CHALLENGES AND OPPORTUNITIES Challenge: ­  Cascade Filters inconsistently frame face ­  “Ghost Faces” ­  Eigenfaces not robust to facial expressions, changes in light, etc.
  • 57. CHALLENGES AND OPPORTUNITIES Opportunity: ­  Video gives us a lot more ”context clues” than still frames. ­  People don’t sporadically disappear and appear ­  Someone seen recently is more likely to be present than someone seen long ago.
  • 58. OPENCV DETECTS FACES IN A VIDEO FRAME
  • 64. 2 PROBLEMS 1.  The face is inconsistently detected (Eigenfaces is sensitive to this) 2.  Shadows, patterns on clothes, etc. cause “ghost faces” to be identified sporadically.
  • 65. OPENCV DETECTS FACES IN A VIDEO FRAME
  • 66. SOLUTION: CLUSTERING/ FILTERING/WINDOWING  Proposal: Cluster faces by location in frame. If a cluster has fewer than N faces, remove all faces in that cluster (i.e. ghost clusters).  Problem-2: People move around the frame over time.  Proposal-2: Break frames up into a sliding window of M seconds.  Problem-3: Clustering/machine learning can be somewhat computationally expensive.  Proposal-3: Canopy clustering (an old but still effective method: 1-pass clustering).
  • 67. CANOPY CLUSTERING  Create N Second Window  Cluster Faces in Window  Quick and dirty clustering, but effective. ­  First point is “center” ­  All points within distance t2 are “in that cluster” ­  If a point is not within t2 of any cluster, it becomes a new cluster center.
  • 68. OPENCV DETECTS FACES IN VIDEO FRAME t2 = max square width
  • 69. CANOPY CLUSTERING TO REMOVE “GHOST” FACES t2 = max square width. First rect: new cluster. Second rect: within one width of first rect (same cluster). Third rect: within one width of first rect (same cluster). Fourth rect: NOT within one width of first rect (new cluster). Fifth rect: within one width of first rect (same cluster). Finally, any cluster with fewer than two entries in the window gets filtered out.
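The five-rectangle walkthrough can be sketched in a few lines of plain Python. This is the single-threshold simplification described on the slides (textbook canopy clustering uses two thresholds), not Mahout's implementation. Representing a rect as `(x, y, w, h)`, measuring distance between rect centers, and the names `canopy_filter` and `min_size` are all assumptions made for illustration.

```python
import math


def canopy_filter(rects, min_size=2):
    """One-pass clustering of face rects; drops clusters smaller than min_size."""
    if not rects:
        return []
    # t2 is set adaptively to the width of the widest rect in the window.
    t2 = max(w for _, _, w, _ in rects)
    clusters = []  # list of (center_point, member_rects)
    for rect in rects:
        x, y, w, h = rect
        center = (x + w / 2, y + h / 2)
        for cluster_center, members in clusters:
            if math.dist(center, cluster_center) <= t2:
                members.append(rect)   # close enough: join the existing cluster
                break
        else:
            clusters.append((center, [rect]))  # otherwise start a new cluster
    # Clusters with too few members are treated as "ghost" faces and removed.
    return [members for _, members in clusters if len(members) >= min_size]


window = [
    (100, 100, 50, 50),   # first rect: new cluster
    (105, 98, 48, 48),    # second rect: same cluster
    (95, 102, 52, 52),    # third rect: same cluster
    (400, 300, 40, 40),   # fourth rect: far away, new cluster (a ghost)
    (102, 101, 50, 50),   # fifth rect: same cluster as the first
]
kept = canopy_filter(window)
```

Here the lone far-away rect forms a cluster of one and is filtered out, leaving a single cluster of four rects.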
  • 70. CLUSTERING BECAUSE WE DON’T KNOW HOW MANY TRUE FACES THERE ARE
  • 71. SETTING THE “LOOSE DISTANCE” Half the width of largest rectangle is the “Loose Distance”
  • 72. SETTING THE “TIGHT DISTANCE” Half the width of largest rectangle is the “Loose Distance”
  • 73. ADAPTIVE HYPER- PARAMETERS  A very simple machine learning algorithm adapts itself in real time to the input it is receiving…  A.I. is a strong buzzword but...
  • 74. A BETTER WAY TO SOLR (1)
  • 76. A BETTER WAY TO SOLR Document: { name_s: “Richard Hatch”, e0_d: 1.512, e1_d: 5.125, e2_d: -15.1256, e3_d: 4.241, …, e129_d: 1.245, ..., call_sign_s: “Apollo”, last_seen_dt: 2017-02-08T08:52:12, alias_s: “Tom Zarek”, ... }
  • 77. LEVERAGE PAYLOAD CAPABILITY OF TERM FIELDS  New Query:  q=*:*  &sort=dist(2, payload(“e_dpf”,“e0”), payload(“e_dpf”,“e1”), ..., payload(“e_dpf”,“e129”), x_e0, x_e1, ..., x_e129) asc  &rows=5 (x_e0 … x_e129 are the coefficients of the query face)  Document: { name_s: “Richard Hatch”, e_dpf: “e0|1.512 e1|1.512 … e129|1.245”, …, call_sign_s: “Apollo”, last_seen_dt: “2017-02-08T08:52:12”, alias_s: “Tom Zarek”, … } Thank you Erik Hatcher (SOLR-1485 https://issues.apache.org/jira/browse/SOLR-1485)
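Writing 130 payload terms by hand is error-prone, so the field value and the sort clause are better generated. The sketch below only builds strings that follow the pattern on the slide. The helper names are hypothetical, the terms are assumed to be named e0, e1, … as in the document's `e_dpf` value, and whether a given Solr version accepts the exact quoting or negative constants should be checked against its function query documentation.

```python
def encode_payload_field(coeffs):
    """Encode coefficients as a delimited-payload string: 'e0|1.512 e1|5.125 ...'."""
    return " ".join(f"e{i}|{float(c)!r}" for i, c in enumerate(coeffs))


def build_sort(query_coeffs, field="e_dpf"):
    """Build a sort clause: Euclidean distance (power 2) between stored and query vectors."""
    # First half of the argument list: the stored coefficients, read from payloads.
    stored = [f"payload({field},e{i})" for i in range(len(query_coeffs))]
    # Second half: the query face's coefficients as numeric constants.
    query = [repr(float(c)) for c in query_coeffs]
    return "dist(2," + ",".join(stored + query) + ") asc"


field_value = encode_payload_field([1.512, 5.125])
sort_clause = build_sort([1.5, -2.0])
```

The two halves of the `dist` argument list must have the same length, which is easy to get wrong when the list is typed out manually.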
  • 79. BETTER SCALING Cluster1 Cluster2 Cluster3 Cluster2a Cluster2b
  • 80. WINDOWING  A video is just a stream of Frames  Apache Flink gives us a nice API for splitting/joining the stream, as well as creating windows and applying functions to the windows. (Other bonuses too)
  • 81. ENTER THE STREAM: OPENCV DETECTS FACES
  • 82. ENTER THE STREAM: MAHOUT CANOPY CLUSTER An n-m-second sliding window: every m seconds this window emits a set of clusters based on the last n seconds of data. For example, 5-1: every 1 second, a new set of “face zones” based on the faces detected in the previous 5 seconds.
  • 83. MAHOUT CANOPY CLUSTER An n-m-second sliding window: every m seconds this window emits a set of clusters based on the last n seconds of data. For example, 5-1: every 1 second, a new set of “face zones” based on the faces detected in the previous 5 seconds. (Or 0.5 / 0.1: every 10th of a second, based on the last half second.)
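The n-m window semantics can be illustrated without a streaming engine. The sketch below is plain Python over a list of timestamped detections and does not use the Flink API; the function name `sliding_windows`, the half-open interval `[end - size, end)`, and aligning the first window to the first event are assumptions chosen for the example.

```python
def sliding_windows(events, size, slide):
    """Emit (window_end, items) every `slide` seconds over the last `size` seconds.

    `events` is a list of (timestamp, item) pairs sorted by timestamp.
    """
    if not events:
        return []
    t0 = events[0][0]
    t_last = events[-1][0]
    windows = []
    k = 1
    while True:
        end = t0 + k * slide
        start = end - size
        # Each window covers the half-open interval [start, end).
        items = [item for t, item in events if start <= t < end]
        windows.append((end, items))
        if end > t_last:
            break          # the last event has been covered
        k += 1
    return windows


detections = [(0, "a"), (1, "b"), (2, "c"), (6, "d")]
windows = sliding_windows(detections, size=5, slide=1)
```

With the 5-1 setting from the slide, a detection stays visible for five consecutive windows and then ages out, which is what lets a briefly seen "ghost" disappear once its window passes.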
  • 84. ENTER THE STREAM: A LAG Here a small lag is introduced.
  • 85. “APPLY THE CLUSTERS” BASED ON FIT CANOPIES Face Cluster 1 Face Cluster 1 Face Cluster 1 Face Cluster 1 Face Cluster 2 (only 1 image- Ghost)
  • 87. STORE OUR MEMORIES IN SOLR METHOD1: AVERAGING 1.  Take all face rects in the cluster. 2.  Average them all together. 3.  Search Solr for this averaged image. 4.  If this “Average Face” matches a face in the cluster (within some distance tolerance), we assign that name to every face in the cluster and write all faces to Solr under that person’s name. 5.  Otherwise, we create a new name and write all faces to Solr under the new name. 6.  This really doesn’t work very well at all. 7.  ADVANTAGE: Minimizes network traffic/Solr taxation
  • 88. STORE OUR MEMORIES IN SOLR METHOD2: “VOTING” 1.  Search Solr for EACH face rect in the cluster. 2.  Get the list of names in the results. 3.  Assign points based on rank or distance. 4.  Aggregate points across all rects; the highest total “wins”. If the winner meets some minimum threshold, assign that name. 5.  Otherwise, we create a new name and write all faces to Solr under the new name.
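The voting steps can be sketched as follows. The slides do not say how points are assigned, so the reciprocal-rank scoring (1 point for first place, 1/2 for second, and so on), the `min_points` threshold, and the function name `vote` are assumptions made for illustration.

```python
from collections import defaultdict


def vote(results_per_face, min_points):
    """Pick a name for a cluster from the ranked search results of each face.

    `results_per_face` holds one ranked list of names per face rect.
    Returns the winning name, or None if no name reaches `min_points`.
    """
    points = defaultdict(float)
    for results in results_per_face:
        for rank, name in enumerate(results):
            points[name] += 1.0 / (rank + 1)   # reciprocal-rank points
    if not points:
        return None
    winner = max(points, key=points.get)
    # Below the threshold the caller would create a new name instead.
    return winner if points[winner] >= min_points else None


cluster_results = [
    ["Apollo", "Starbuck"],
    ["Apollo", "Tom Zarek"],
    ["Starbuck", "Apollo"],
]
name = vote(cluster_results, min_points=2.0)
```

In this toy cluster "Apollo" collects 2.5 points against 1.5 for "Starbuck", so it wins at a threshold of 2.0; raising the threshold to 3.0 would return `None` and trigger the new-name branch.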
  • 89. PUNCHLINE:  Second benefit of Eigenfaces over “deep learning”: quickly add new faces
  • 90. WHY APACHE SOLR ­ Capable of storing large amounts of data ­ Scales to petabytes, text oriented  ­ Numeric compute friendly ­ Many ways to store different types of data
  • 91. WHY APACHE MAHOUT ­ Engine Agnostic (Spark/Flink/Standalone/RYO) ­ Native acceleration on CPU/GPU/CUDA ­ Possible to accelerate BLAS operations on ANY arch (edge devices)  ­ Mathematically expressive Scala
  • 92. WHY APACHE FLINK ­ Sophisticated Windowing Functions ­ Complex Event Library ­ Scales linearly (1 drone vs Army of Drones)
  • 93. TECHNICALLY “BORG-STYLE” AI, NOT CYLONS   A finer technical point for those familiar with the Cylons and the Borg  “Hive Mind” Architecture
  • 94. LEARNING PROPAGATES QUICKLY “NEW HUMAN-0001” “OH, hai HUMAN-0001”
  • 95. SHAPE OF THINGS TO COME. The “Science Fiction” of 10 years ago is today the domain of hobbyists. The demo presented here is “Science Fair” grade AI. Vladimir Putin was recently talking about how “it is undesirable for anyone to monopolize AI”. (Yay Apache!)
  • 96. DEMO Here’s a fun video while I set up