This document discusses using polysemous codes to perform large-scale search over visual signatures. Polysemous codes allow product quantization codes to be interpreted as both compact binary codes for efficient Hamming distance search and codes that preserve distance information for accurate nearest neighbor search. The key ideas are to learn an index assignment that maps similar product quantization codes to binary codes with smaller Hamming distance, and to directly optimize this assignment to match the distances between codebook centroids. This allows using a single code representation for both fast Hamming search and precise distance search, without increasing memory requirements. The document provides examples of applying polysemous codes to build a large graph connecting images based on visual similarity.
2. Problem statement
Given a query, find the closest match(es) in a large database of “entities”
Example entities: image, video, text, post, user, ad, …
Example applications:
• video copy detection (query = video, database = video)
• blog recommendation (query = user, database = blogs)
• ad placement (query = user, database = ads)
→ very large-scale problems
8. 1.1 Binary codes
[Figure: 3-bit Hamming cube with the eight codes 000, 001, 010, 011, 100, 101, 110, 111 at its vertices]
Idea: design/learn a function mapping the original space into a
compact Hamming space
Neighbors in the Hamming space should reflect neighbors in the original space
Advantages: compact descriptor, fast distance computation
LSH example: random projection + thresholding
[Charikar’02] shows that the Hamming distance gives a cosine similarity estimator
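The LSH example above (random projection + thresholding) fits in a few lines. The following is an illustrative NumPy sketch with assumed helper names, not code from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_lsh(dim, n_bits):
    """One random Gaussian direction per bit (hypothetical helper)."""
    return rng.standard_normal((n_bits, dim))

def binary_encode(x, projections):
    """Random projection + thresholding: keep only the sign of each projection."""
    return (projections @ x > 0).astype(np.uint8)

def hamming(a, b):
    """Fast distance computation: count differing bits."""
    return int(np.count_nonzero(a != b))

proj = train_lsh(dim=128, n_bits=64)
x = rng.standard_normal(128)
y = x + 0.1 * rng.standard_normal(128)   # near neighbor of x
z = rng.standard_normal(128)             # unrelated vector
```

Per [Charikar’02], the probability that a bit differs equals angle(x, y)/π, so hamming(code_x, code_y) should come out far smaller than hamming(code_x, code_z).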
9. 1.2 Product quantization (PQ)
y = y1 y2 y3 y4
Decompose the feature space as a product space
• use a distinct quantizer in each subspace, typically k-means
• estimate distances by using look-ups and additions only
[Jégou’11]
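A toy product quantizer illustrates the scheme. Everything below (the tiny k-means, the split into 4 subspaces, 16 centroids per subspace) is an assumed illustration, not the talk's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, n_iter=20):
    """Tiny k-means for illustration (real systems use an optimized library)."""
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        assign = ((X[:, None, :] - centroids[None]) ** 2).sum(-1).argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = X[assign == j].mean(axis=0)
    return centroids

def pq_train(X, n_sub, k):
    """Decompose the feature space as a product: one codebook per subspace."""
    return [kmeans(s, k) for s in np.split(X, n_sub, axis=1)]

def pq_encode(x, codebooks):
    """Code = index of the nearest centroid in each subspace, e.g. [2 63 27 227]."""
    parts = np.split(x, len(codebooks))
    return [int(((c - p) ** 2).sum(-1).argmin()) for p, c in zip(parts, codebooks)]

def pq_distance(x, code, codebooks):
    """Asymmetric distance: per-subspace look-up tables + additions only."""
    parts = np.split(x, len(codebooks))
    luts = [((c - p) ** 2).sum(-1) for p, c in zip(parts, codebooks)]
    return sum(lut[idx] for lut, idx in zip(luts, code))

# toy data: 500 vectors of dim 32, 4 subquantizers with 16 centroids each
X = rng.standard_normal((500, 32))
codebooks = pq_train(X, n_sub=4, k=16)
code = pq_encode(X[0], codebooks)
```

The look-up tables are built once per query; every database distance then costs only one table read and one addition per subspace.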
10. 1.3 Binary codes vs PQ

                      Binary codes (ITQ) [Gong’11]   Product quantization
  example code        [01000110…01]                  [2 63 27 227]
  comparison          context-free                   needs quantizer centroids
  speed               1,190M comparisons / sec       222M comparisons / sec
  precision           0.143                          0.442

Seen as competing methods in the literature.
How to get the best of both worlds?
12. 2.1 A naïve approach
Encode all DB items with both binary and PQ codes:

q_bin = binary_encode(x)          # compute query binary code
d_min = ∞
for i = 1..n                      # loop over database items
    db_bin = db_bin_codes[i]      # get binary code for item i
    if hamming(q_bin, db_bin) < threshold
        db_pq = db_pq_codes[i]    # get PQ code for item i
        d = PQ_distance(x, db_pq)
        if d < d_min
            nearest_neighbor, d_min = i, d
13. 2.2 A naïve approach
Encode all DB items with binary and PQ codes
→ memory increase (x2)
Could we use the same codes for both the Hamming and
PQ distances?
→ polysemous codes
14. 2.3 Channel optimized vector quantization
Channel-optimized vector quantizers: “pseudo-Gray coding”
Minimize the overall expected distortion (both from source and channel)
Optimize the index assignment → neighboring codes encode similar info
Example of a single-bit channel error: enc=01001100 → dec=01011100
15. 2.4 Index assignment optimization
Given a k-means quantizer, learn a permutation of the codes such that the
binary comparison reflects centroid distances
17. 2.5 The polysemous approach
Interpret PQ codes as binary codes:

q = encode(x)                     # compute query code
d_min = ∞
for i = 1..n                      # loop over database items
    db = db_codes[i]              # get code for item i
    if hamming(q, db) < threshold
        d = PQ_distance(x, db)
        if d < d_min
            nearest_neighbor, d_min = i, d

→ no memory increase
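A self-contained Python sketch of this loop, on toy data: the index-assignment optimisation of the next sections is skipped, codes are random bytes, and query_luts stands in for the per-subspace PQ distance tables. The point is that the single code array feeds both the Hamming filter and the PQ refinement:

```python
import numpy as np

rng = np.random.default_rng(0)

def hamming(a, b):
    """Hamming distance between two PQ codes read as concatenated bit strings."""
    return sum(bin(int(ai) ^ int(bi)).count("1") for ai, bi in zip(a, b))

def polysemous_search(query_code, query_luts, db_codes, threshold):
    """One code per item serves both interpretations:
    cheap Hamming test on the raw bytes first, exact PQ distance second."""
    best, d_min, n_filtered = None, np.inf, 0
    for i, code in enumerate(db_codes):
        if hamming(query_code, code) < threshold:                    # binary view
            d = sum(lut[idx] for lut, idx in zip(query_luts, code))  # PQ view
            if d < d_min:
                best, d_min = i, d
        else:
            n_filtered += 1              # skipped without any PQ computation
    return best, d_min, n_filtered

# toy setup: 4 sub-quantizers of 256 centroids -> 4-byte codes (random stand-ins)
db_codes = rng.integers(0, 256, size=(1000, 4))
query_code = db_codes[42]                          # pretend the query encodes like item 42
query_luts = [rng.random(256) for _ in range(4)]   # stand-in distance tables
best, d_min, n_filtered = polysemous_search(query_code, query_luts, db_codes, threshold=16)
```

With a well-chosen threshold, most database items are rejected by the byte-level Hamming test and never reach the (more expensive) look-up stage.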
18. 2.6 Objective function
Find a permutation π() such that the Hamming distance between permuted indices matches the distance between centroids.
The slide annotates the objective, optimized over all pairs of centroids (c_i, c_j):

    minimize over π:   Σ_{i,j}  w(d(c_i, c_j)) × ( f( h(π(i), π(j)) ) − d(c_i, c_j) )²

where w() is a weighting to favor nearby centroids, h(,) is the Hamming distance between the permuted indices, and f() is a monotonous (linear) function to correct the scale.
22. 2.7 Optimization
Simulated annealing:
• initialization: random permutation
• swap two entries in the permutation
• converges in approx. 200k iterations (<10s)
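A toy version of the annealing procedure, under assumed simplifications: 16 centroids in 8-D, 5,000 iterations instead of ~200k, an exponential weighting for w, and the scale-correcting f of slide 2.6 taken as the identity:

```python
import numpy as np

rng = np.random.default_rng(0)

k = 16                                    # toy: 16 centroids -> 4-bit indices
centroids = rng.standard_normal((k, 8))
d_cent = np.linalg.norm(centroids[:, None] - centroids[None], axis=-1)

def hamming(a, b):
    return bin(a ^ b).count("1")

def cost(perm):
    """Weighted mismatch between centroid distances and the Hamming
    distances of the permuted indices (f = identity for simplicity)."""
    total = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            w = np.exp(-d_cent[i, j])     # assumed weighting favoring nearby centroids
            h = hamming(int(perm[i]), int(perm[j]))
            total += w * (d_cent[i, j] - h) ** 2
    return total

perm = rng.permutation(k)                 # initialization: random permutation
c = c_init = cost(perm)
temp = 1.0
for _ in range(5000):                     # the talk reports ~200k swaps in <10 s
    i, j = rng.integers(0, k, size=2)
    perm[i], perm[j] = perm[j], perm[i]   # propose: swap two permutation entries
    c_new = cost(perm)
    if c_new < c or rng.random() < np.exp((c - c_new) / temp):
        c = c_new                         # accept the swap
    else:
        perm[i], perm[j] = perm[j], perm[i]   # revert
    temp *= 0.999                         # cool down
```

The Metropolis acceptance rule lets early iterations escape local minima; as the temperature drops, the search keeps only improving swaps.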
27. 3.1 Building a graph on images
Testbed: Flickr100M
• public dataset of CC images
• described with AlexNet FC7 features
normalized, PCA to 256D, encoded as 32 bytes,
coarse quantizer size 4096
Each image in turn is a query
• compute 100-NN
• build index = 14h, search = 7h
• storage for the graph = 2 x 40 GB RAM
28. 3.2 Graph modes
Graph seen as a Markov model
→ compute stationary distribution [Cho’12]
Sparse matrix-vector multiplication
• 200 iterations (30s / iter)
• mode = local maximum over nodes
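The stationary-distribution computation can be sketched on a toy sparse graph (scipy assumed; the graph below is a random stand-in for the image k-NN graph, and the real run uses 200 iterations at ~30 s each):

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)

# toy stand-in for the image graph: random sparse adjacency over 100 nodes
n = 100
rows, cols = rng.integers(0, n, 500), rng.integers(0, n, 500)
A = sp.csr_matrix((np.ones(500), (rows, cols)), shape=(n, n))
A = A + A.T + sp.identity(n)          # symmetrize; self-loops avoid dangling nodes

# row-stochastic transition matrix of the Markov model
P = (sp.diags(1.0 / np.asarray(A.sum(axis=1)).ravel()) @ A).tocsr()

pi = np.full(n, 1.0 / n)              # start from the uniform distribution
for _ in range(200):                  # slide: 200 sparse matrix-vector iterations
    pi = P.T @ pi                     # one step of the chain

# modes = nodes whose stationary probability is a local maximum over neighbors
modes = [i for i in range(n) if all(pi[i] >= pi[j] for j in P[i].indices)]
```

Each iteration is a single sparse matrix-vector product, so the cost scales with the number of graph edges rather than with n².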
29. 3.3 Paths in the graph
Almost all images are connected: find path between pairs of images
→ morphing from one image to another
Which paths?
• shortest path
• minimize sum of distances
• minimize max of distances
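The "minimize max of distances" option can be solved with a Dijkstra-style search where a path's cost is its largest edge weight. A sketch on a hypothetical 3-image graph:

```python
import heapq

def minimax_path(graph, src, dst):
    """Dijkstra variant where a path's cost is its largest edge weight
    (the 'minimize max of distances' option)."""
    best = {src: 0.0}
    heap = [(0.0, src, [src])]
    while heap:
        bottleneck, node, path = heapq.heappop(heap)
        if node == dst:
            return bottleneck, path
        for nxt, w in graph.get(node, []):
            cand = max(bottleneck, w)          # path cost = worst edge so far
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                heapq.heappush(heap, (cand, nxt, path + [nxt]))
    return float("inf"), []

# hypothetical 3-image graph with pairwise feature distances as edge weights
graph = {
    0: [(1, 0.3), (2, 0.9)],
    1: [(0, 0.3), (2, 0.4)],
    2: [(0, 0.9), (1, 0.4)],
}
```

Here minimax_path(graph, 0, 2) prefers the detour 0 → 1 → 2 (worst edge 0.4) over the direct edge of weight 0.9, which tends to produce smoother image morphings than the plain shortest path.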