Mobile Visual Search (MVS) is a fascinating research field with many open challenges and opportunities, which have the potential to impact the way we organize, annotate, and retrieve visual data (images and videos) using mobile devices.
This talk is structured in four parts:
1. Opportunities: where I present recent and relevant numbers of the mobile computing market, particularly in the field of photography apps, social networks, and mobile search.
2. Basic concepts: where I explain the basic MVS pipeline and discuss the three main MVS scenarios and associated challenges.
3. Technical aspects: where I briefly cover topics such as feature extraction, indexing, descriptor matching, and geometric verification, discuss the state of the art in these fields, and comment on open problems and research opportunities.
4. Examples and applications: where I show representative examples of academic research and commercial apps in this field.
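The descriptor-matching step named in part 3 can be illustrated with a minimal sketch. The talk does not say which matching strategy it covers; the nearest-neighbour ratio test used here, the `match_descriptors` helper, the `ratio=0.8` threshold, and the toy two-dimensional descriptors are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(query, database, ratio=0.8):
    """Return (query_index, database_index) pairs that pass the ratio test.

    Hypothetical helper: a brute-force nearest-neighbour search, not the
    indexing scheme of any particular MVS system.
    """
    matches = []
    for qi, q in enumerate(query):
        # Rank all database descriptors by distance to this query descriptor.
        ranked = sorted((euclidean(q, d), di) for di, d in enumerate(database))
        if len(ranked) < 2:
            continue
        (best, best_idx), (second, _) = ranked[0], ranked[1]
        # Keep the match only if it is clearly closer than the runner-up.
        if best < ratio * second:
            matches.append((qi, best_idx))
    return matches


# Toy data: the third query descriptor is ambiguous and gets rejected.
query = [[0.0, 0.0], [5.0, 5.0], [2.0, 2.0]]
database = [[0.1, 0.0], [4.0, 4.0], [10.0, 10.0], [5.5, 4.5]]
matches = match_descriptors(query, database)
print(matches)
```

In a full pipeline the surviving matches would typically be passed on to geometric verification, which is outside the scope of this sketch.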
IJERA (International journal of Engineering Research and Applications) is International online, ... peer reviewed journal. For more detail or submit your article, please visit www.ijera.com
Similarity based Dynamic Web Data Extraction and Integration System from Sear... - IDES Editor
There is an explosive growth of information on the World Wide Web, which poses a challenge to Web users who want to extract essential knowledge from it. Search engines help narrow down the search by returning Search Engine Result Pages (SERPs). Web Content Mining is one of the techniques that help users extract useful information from these SERPs. In this paper, we propose two similarity-based mechanisms: WDES, to extract desired SERPs and store them in a local repository for offline browsing, and WDICS, to integrate the requested contents and enable the user to perform the intended analysis and extract the desired information. Our experimental results show that WDES and WDICS outperform DEPTA [1] in terms of Precision and Recall.
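For readers unfamiliar with the two evaluation measures, the following minimal sketch shows how Precision and Recall are computed for an extraction task. The record identifiers and the `precision_recall` helper are hypothetical; the numbers are not results from the paper.

```python
def precision_recall(extracted, relevant):
    """Compute (precision, recall) for a set of extracted items.

    precision = correctly extracted / all extracted
    recall    = correctly extracted / all relevant
    """
    extracted, relevant = set(extracted), set(relevant)
    true_positives = len(extracted & relevant)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 records extracted, 5 records actually relevant.
extracted = {"r1", "r2", "r3", "r4"}
relevant = {"r2", "r3", "r4", "r5", "r6"}
precision, recall = precision_recall(extracted, relevant)
print(precision, recall)
```

Here three of the four extracted records are relevant, and three of the five relevant records were found.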
Manufacturers of everyday products worldwide produce a multitude of items that are intended for one use only. A disposable is a product designed for a single use, after which it is recycled or disposed of as solid waste. The term often implies cheapness and short-term convenience rather than medium- to long-term durability. The term is also sometimes used for products that may last several months, to distinguish them from similar products that last indefinitely.
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Rekognition (MAC203) - Amazon Web Services
This session will introduce you to Amazon Rekognition, a new service that makes it easy to add image analysis to your applications. With Rekognition, you can detect objects, scenes, and faces in images. You can also search and compare faces. Rekognition’s API lets you easily build powerful visual search and discovery into your applications. With Amazon Rekognition, you only pay for the images you analyze and the face metadata you store. There are no minimum fees and there are no upfront commitments.
To get started with Rekognition, simply log in to the Rekognition console to try the service with sample photos or your own photos. Join this session and learn more about Amazon Rekognition!
FACE EXPRESSION IDENTIFICATION USING IMAGE FEATURE CLUSTRING AND QUERY SCHEME... - Editor IJMTER
Web mining techniques are used to analyze web page contents and usage details. Human facial images are shared on the internet and tagged with additional information. Auto face annotation techniques annotate facial images automatically, and the annotations are used in online photo search and management. Classification techniques are used to assign the facial annotations. Supervised or semi-supervised machine learning techniques are used to train the classification models, with labeled facial images used in the training process. Noisy and incomplete labels are referred to as weak labels. Search-based face annotation (SBFA) assigns annotations by mining weakly labeled facial images available on the World Wide Web (WWW). An unsupervised label refinement (ULR) approach is used to refine the labels of web facial images with machine learning techniques. The ULR scheme enhances label quality using a graph-based and low-rank learning approach. The training phase consists of facial image collection, facial feature extraction, feature indexing, and label refinement learning steps. Similar face retrieval and voting-based face annotation are carried out in the testing phase. A Clustering-Based Approximation (CBA) algorithm is applied to improve scalability. A bisecting K-means clustering based algorithm (BCBA) and a divisive clustering based algorithm (DCBA) are used to group the facial images. A multi-step gradient algorithm is used for the label refinement process. The web face annotation scheme is enhanced to improve label quality with low refinement overhead. A noise reduction method is integrated with the label refinement process. A duplicate name removal process is integrated with the system. The indexing scheme is enhanced with weight values for the labels. Social contextual information is used to manage query facial image relevancy issues.
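The voting-based annotation step in the testing phase can be sketched as a simple majority vote over the labels of the most similar retrieved faces. The abstract does not specify the voting rule, so the unweighted vote, the `annotate_by_voting` helper, the `top_k` parameter, and the example names are assumptions made for illustration.

```python
from collections import Counter


def annotate_by_voting(retrieved_labels, top_k=5):
    """Pick the most frequent label among the top_k retrieved faces.

    retrieved_labels is assumed to be ordered from most to least similar.
    Returns None when nothing was retrieved.
    """
    votes = Counter(retrieved_labels[:top_k])
    if not votes:
        return None
    label, _count = votes.most_common(1)[0]
    return label


# Hypothetical (weak) labels of retrieved faces, most similar first.
retrieved = ["alice", "bob", "alice", "carol", "alice", "bob", "bob", "bob"]
name = annotate_by_voting(retrieved, top_k=5)
print(name)
```

A weighted variant, in which closer faces or higher-weight labels count for more, would fit the weighted indexing scheme mentioned above, but the exact weighting is not given in the abstract.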
Mobile Web Browsing Based On Content Preserving With Reduced Cost - Eswar Publications
The Internet has drastically changed everyday life, and web browsing on compact devices has become especially common. This encourages people to bring their innovations and skills to a previously unimaginable world. With this in mind, it is necessary to look more closely at how web data is accessed and accounted for. Developed countries widely use flat-rate pricing, which is independent of data usage. Developing countries, however, still largely rely on "pay as you use" pricing, which leads to high usage bills. To address this problem, we propose a cost-effective technique that reduces data consumption in mobile web browsing and thereby lowers bills under usage-based pricing.
The key idea of our approach is to leverage the user's data plan to compute a cost quota for each web request, and to use a network middle-box to automatically adapt any web page to that quota. We use a simple but effective content adaptation technique that decides which image or data best fits the mobile display at low cost and high resolution. The approach also relies on data mining to mine the requested and required data. The mined data is filtered by the content adaptation technique and fitted to the display. A notable feature is that only the important web content requested by the user is shown. A feedback process is used to retrieve only the required data and to improve the best-fit resolution. With the proposed system, mobile web browsing becomes cheaper, and the approach provides a basis for future work in the field of mobile browsing.
A recent direction in Business Process Management studied methodologies to control the execution of Business Processes under several sources of uncertainty in order to always get to the end by satisfying all constraints. Current approaches encode business processes into temporal constraint networks or timed game automata in order to exploit their related strategy synthesis algorithms. However, the proposed encodings can only synthesize single-strategies and fail to handle loops. To overcome these limits I will discuss a recent approach based on supervisory control. The approach considers structured business processes with resources, parallel and mutually exclusive branches, loops, and uncertainty. I will discuss an encoding into finite state automata and prove that their concurrent behavior models exactly all possible executions of the process. After that, I will introduce tentative commitment constraints as a new class of constraints restricting the executions of a process. Finally, I will discuss a tree decomposition of the process that plays a central role in modular supervisory control.
In his ignite talk „The Digital Transformation of Education: A Hyper-Disruptive Era through Blockchain and Generative AI,“ Dr. Alexander Pfeiffer delves into the intricate challenges and potential benefits associated with integrating blockchain technologies and generative AI into the educational landscape. He scrutinizes consensus algorithms and explores sustainable methods of operating blockchain systems, while also examining how smart contracts and transactions can be tailored to meet the specific needs of the educational sector. Alexander underscores the importance of establishing secure digital identities and ensuring robust data protection, while simultaneously casting a critical eye on potential risks and vulnerabilities. The topic of digital identities, facilitated through tokenization, forms a bridge between storing data using blockchain-based databases and the increasingly urgent need for content verification of AI-generated material.
Alexander explores the profound alterations occurring in teaching methodologies, assignment creation, and evaluation processes, shedding light on the hyper-disruptive impact these changes are having on both research and practical applications in education. The production of textual content by educators and students is analyzed with a focus on ensuring clear traceability of content sources and editors, and its proper citation, a critical aspect in the responsible use of AI. In addition to generative text and graphics, AI plays a crucial role in future learning and assignment practices, particularly through adaptive game-based learning and assessment. Alexander will provide a brief glimpse into his game „Gallery-Defender,“ a prototype demonstrating how AI and blockchain can be effectively implemented in serious gaming scenarios.
Furthermore, he emphasizes the imperative for ongoing education and professional development for educational personnel, advocating for a proactive stance in addressing the (legal) challenges associated with AI-generated images and text. This ignite talk aims to provide a balanced and critically reflective perspective on hyper-disruptive technologies, setting the stage for further discourse and exploration in the subsequent discussion.
The simulation of melee combat is central to many contemporary and traditional strategic games and simulations. In order to elevate this element of play from mere exercises of stats comparison and dice rolling to a meaningful experience of play, strategy games rely on a rich plethora of cultural motifs as deciding factors of their mechanic design. Using the example of Samurai-themed skirmishing games, my talk elaborates on the impact that (popular) culture and other inspirations have on gaming experiences. It provides concrete examples from Japanese history, its traditional cinema, and postmodern Western reflections of Japanese cultural practices. Based on these insights, it compares four tabletop strategy games, muses on which phenomena they have adapted in their mechanics, and asks why they may or may not succeed in capturing a cultural essence via their rules.
Ultimately, this comparative approach shall serve to decipher the interplay of dice mechanics and aesthetic properties as the longing for a dramatic ideal in tabletop gaming and encourage participants to reflect on the idea in a subsequent, shared gaming experience.
How does a development team expand on an already existing game?
We will look at the two community-driven and committee-led expansions to the abandoned tabletop game 'GuildBall' and explore the stages of development that the game went through. The art- and lore-driven approach employed will show us how rough sketches and concept ideas become a fully fledged ruleset and ultimately miniatures that can be put on the table. We will also explore pitfalls in rules design such as overcomplicating abilities, a lack of streamlining across the game, or simply creating expansions that break the game instead of the mold.
Exploring the development and production pipelines for miniatures in the tabletop wargaming industry. Including a look at the career route taken by the speaker, a case study on developing anatomical archetypes for consistent design outcomes, and a brief look at the various production methods available to the industry.
In recent years, we have experienced exponential growth in the amount of data generated by IoT devices. This data has to be processed under strict low-latency constraints that cannot be met by conventional computing paradigms and architectures. On top of this, given that we recently hit the limit codified by Moore's law, satisfying the low-latency requirements of modern applications will become even more challenging in the future. In this talk, we discuss challenges and possibilities of heterogeneous distributed systems in the Post-Moore era.
In the modern world, we are permanently using, leveraging, interacting with, and relying upon systems of ever higher sophistication, ranging from our cars, recommender systems in eCommerce, and networks when we go online, to integrated circuits when using our PCs and smartphones, security-critical software when accessing our bank accounts, and spreadsheets for financial planning and decision making. The complexity of these systems coupled with our high dependency on them implies both a non-negligible likelihood of system failures, and a high potential that such failures have significant negative effects on our everyday life. For that reason, it is a vital requirement to keep the harm of emerging failures to a minimum, which means minimizing the system downtime as well as the cost of system repair. This is where model-based diagnosis comes into play.
Model-based diagnosis is a principled, domain-independent approach that can be generally applied to troubleshoot systems of a wide variety of types, including all the ones mentioned above. It exploits and orchestrates techniques for knowledge representation, automated reasoning, heuristic problem solving, intelligent search, learning, stochastics, statistics, decision making under uncertainty, as well as combinatorics and set theory to detect, localize, and fix faults in abnormally behaving systems.
In this talk, we will give an introduction to the topic of model-based diagnosis, point out the major challenges in the field, and discuss a selection of approaches from our research addressing these challenges. For instance, we will present methods for the optimization of the time and memory performance of diagnosis systems, show efficient techniques for a semi-automatic debugging by interacting with a user or expert, and demonstrate how our algorithms can be effectively leveraged in important application domains such as scheduling or the Semantic Web.
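One classical formulation of the fault-localization step treats diagnoses as minimal hitting sets of conflict sets, where a conflict set is a group of components that cannot all be working correctly given the observations. The sketch below illustrates that formulation by brute force. It is not one of the speakers' algorithms; the `minimal_diagnoses` helper and the component names are hypothetical, and the conflict sets are assumed to be given rather than computed by a reasoner.

```python
from itertools import combinations


def minimal_diagnoses(conflicts):
    """Enumerate minimal hitting sets of the conflict sets, smallest first.

    Brute force over all component subsets: exponential in the number of
    components, so suitable for tiny illustrative systems only.
    """
    components = sorted(set().union(*conflicts)) if conflicts else []
    diagnoses = []
    for size in range(len(components) + 1):
        for candidate in combinations(components, size):
            cand = set(candidate)
            # Skip supersets of a smaller diagnosis: they are not minimal.
            if any(d <= cand for d in diagnoses):
                continue
            # A diagnosis must contain a component from every conflict set.
            if all(cand & c for c in conflicts):
                diagnoses.append(cand)
    return diagnoses


# Hypothetical circuit with multipliers M1..M3 and adders A1, A2.
conflicts = [{"A1", "M1", "M2"}, {"A1", "A2", "M1", "M3"}]
diagnoses = minimal_diagnoses(conflicts)
print(diagnoses)
```

For this input the sketch reports two single-fault diagnoses and two double-fault diagnoses. Practical diagnosis systems avoid exhaustive enumeration, which is where the time and memory optimizations mentioned in the abstract become relevant.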
Function-as-a-Service (FaaS) is the latest paradigm of cloud computing in which developers deploy their code as serverless functions, while the entire underlying platform and infrastructure is completely managed by cloud providers. Each cloud provider offers a huge set of cloud services and many libraries to simplify development and deployment, but only inside their clouds, often in a single cloud region. With such „help“ of cloud providers, users are locked into the resources and services of the selected cloud provider, which are often limited. Moreover, such a heterogeneous and distributed environment of multiple cloud regions and providers challenges scientists to engineer cloud applications, often in the form of serverless workflows. In this talk, I will present our design principle „code once, run everywhere, with everything“. In particular, I will present challenges and our approaches and techniques for programming, modeling, orchestrating, and running distributed serverless workflow applications in federated FaaS.
As the network softwarization trend started by SDN and NFV keeps evolving, the hardware/software continuum becomes more relevant than ever, offering new offloading/acceleration opportunities at node and network-wide scales. This talk will review evolving transformations behind network softwarization with a special focus on network refactoring and offloading trends leading to “fluid networks planes”, characterized by multiple candidate options for the specific HW/SW embodiment and the location of chained network functions, from the edge to the core, from one administrative provider to another, from programmable silicon to portable lightweight virtualized containers. The talk will overview concrete examples from the literature with a special focus on the role of Machine Learning in assisting key (automated) decision-making steps. Lastly, the talk will conclude with a glimpse of ongoing ML work applied to YouTube video QoE prediction in live 5G networks.
The dynamics of networks enables the function of a variety of systems we rely on every day, from gene regulation and metabolism in the cell to the distribution of electric power and communication of information. Understanding, steering and predicting the function of interacting nonlinear dynamical systems, in particular if they are externally driven out of equilibrium, relies on obtaining and evaluating suitable models, posing at least two major challenges. First, how can we extract key structural system features of networks if only time series data provide information about the dynamics of (some) units? Second, how can we characterize nonlinear responses of nonlinear multi-dimensional systems externally driven by fluctuations, and consequently, predict tipping points at which normal operational states may be lost? Here we report recent progress on nonlinear response theory extended to predict tipping points and on model-free inference of network structural features from observed dynamics.
When it comes to integrating digital technologies into the classroom in higher education, many teachers face similar challenges. Nevertheless, it is difficult for teachers to share experiences because it is usually not possible to transfer successful teaching scenarios directly from one area to another, as subject-specific characteristics make it difficult to reuse them. To address this problem, instructional scenarios can be described as patterns that have been used previously in educational contexts. Patterns can capture proven teaching strategies and describe instructional scenarios in a consistent structure that can be reused. Because priorities for content, methods, and tools are different in each domain, a consensus-tested taxonomy was first developed with the goal of modeling a domain-independent database to collect digital instructional practices. In addition, this presentation will present preliminary insights into a data-driven approach to identifying effective instructional practices from interdisciplinary data as patterns. A web-based application will be developed for this that can both collect teaching/learning scenarios and individually extract scenarios from patterns for a learning platform.
The advent of fog and edge computing has prompted predictions that they will take over the traditional cloud for information processing and knowledge extraction in Internet of Things (IoT) systems. Notwithstanding the fact that fog and edge computing have undoubtedly large potential, these predictions are probably oversimplified and wrongly portray the relations between cloud, fog and edge computing.
Concretely, fog and edge computing have been introduced as an extension of the cloud services towards the data sources, thus forming the computing continuum. The computing continuum enables the creation of a new type of services, spanning across distributed infrastructures, supporting various IoT applications. These applications have a large spectrum of requirements, burdensome to meet with "distant" cloud data centers. However, the introduction of the computing continuum raises multiple challenges for management, deployment and orchestration of complex distributed applications, such as: increased network heterogeneity, limited resource capacity of edge devices, fragmented storage management, high mobility of edge devices and limited support of native monolithic applications. These challenges primarily concern the complexity and the large diversity of the devices, managed by different entities (cloud providers, universities, private institutions), which range from single-board computers such as Raspberry Pis to powerful multi-processor servers.
Therefore, in this talk, we will discuss novel algorithms for low latency, scalable, and sustainable computing over heterogeneous resources for information processing and reasoning, thus enabling transparent integration of IoT applications. We will tackle the heterogeneity challenge of dynamically changing topologies of the computing infrastructure and present a novel concept for sustainable processing at scale.
East–west orientation is a new trend in the layout of photovoltaic power systems. This lecture presents an evaluation of an east–west oriented photovoltaic power system. It compares east–west oriented and south oriented photovoltaic systems in terms of cost of energy and technical requirements. In addition, the benefits of using an east–west oriented photovoltaic system are discussed.
Randomized Signature or random feature selection are two instances of machine learning, where randomly chosen structures appear to be highly expressive. We analyze several aspects of the theory behind it, show that these structures have several theoretically attractive properties and introduce two classes of examples from finance (joint works with Christa Cuchiero, Lukas Gonon, Lyudmila Grigoryeva, Martin Larsson, and Juan-Pablo Ortega).
We live in a “digital” world, the separation between physical and virtual makes (almost) no sense anymore. Here, the Corona pandemic has also acted as an accelerator/magnifier demonstrating that the future of our digital society is here with all its possibilities, but also shortcomings.
In his talk, Hannes Werthner will briefly reflect on the history of computer science, and then discuss the need for an interdisciplinary response to these shortcomings. Such an answer is the Digital Humanism, which looks at this interplay of technology and humankind, it analyzes, and, most importantly, tries to influence the complex interplay of technology and humankind, for a better society and life. In the second part he will discuss this approach, and show what was achieved since its first workshop in 2019, and what lies ahead.
In the latest years, we have witnessed a growing number of media transmitted and stored on computers and mobile devices. For this reason, there is an actual need to employ smart compression algorithms to reduce the size of our media files. However, such techniques are often responsible for severe reduction of user perceived quality. In this talk we present several approaches we have developed to restore degraded images and videos to match their original quality, making use of Generative Adversarial Networks. The aim of the talk is to highlight the main features of our research work, including the advantages of our solution, the current challenges and the possible directions for future improvements.
Recommendation systems today are widely used across many applications such as in multimedia content platforms, social networks, and ecommerce, to provide suggestions to users that are most likely to fulfill their needs, thereby improving the user experience. Academic research, to date, largely focuses on the performance of recommendation models in terms of ranking quality or accuracy measures, which often don’t directly translate into improvements in the real-world. In this talk, we present some of the most interesting challenges that we face in the personalization efforts at Netflix. The goal of this talk is to sunshine challenging research problems in industrial recommendation systems and start a conversation about exciting areas of future research.
Epistemic Interaction - tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
A tale of scale & speed: How the US Navy is enabling software delivery from l... (sonjaschweigert1)
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Securing your Kubernetes cluster: a step-by-step guide to success! (KatiaHIMEUR1)
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
The new frontiers of AI in RPA with UiPath Autopilot™ (UiPathCommunity)
In this free online event, organized by the Italian UiPath Community, you can explore the new features of Autopilot, the tool that integrates Artificial Intelligence into the development and use of automations.
📕 Together we will look at some examples of using Autopilot in different tools of the UiPath Suite:
Autopilot for Studio Web
Autopilot for Studio
Autopilot for Apps
Clipboard AI
GenAI applied to Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
Elevating Tactical DDD Patterns Through Object Calisthenics (Dorra BARTAGUIZ)
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
DevOps and Testing slides at DASA Connect (Kari Kakkonen)
Slides by Rik Marselis and me from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We closed with a lovely workshop in which the participants explored different ways to think about quality and testing in different parts of the DevOps infinity loop.
UiPath Test Automation using UiPath Test Suite series, part 4 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... (Ramesh Iyer)
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Key Trends Shaping the Future of Infrastructure.pdf (Cheryl Hung)
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
Mobile Visual Search
1. Mobile Visual Search
Oge Marques
Florida Atlantic University
Boca Raton, FL - USA
TEWI Kolloquium, 24 Jan 2012
2. Take-home message
Mobile Visual Search (MVS) is a fascinating research
field with many open challenges and opportunities
which have the potential to impact the way we
organize, annotate, and retrieve visual data (images
and videos) using mobile devices.
3. Outline
• This talk is structured in four parts:
1. Opportunities
2. Basic concepts
3. Technical details
4. Examples and applications
5. Mobile visual search: driving factors
• Age of mobile computing
http://60secondmarketer.com/blog/2011/10/18/more-mobile-phones-than-toothbrushes/
6. Mobile visual search: driving factors
• Smartphone market
http://www.idc.com/getdoc.jsp?containerId=prUS23123911
7. Mobile visual search: driving factors
• Smartphone market
http://www.cellular-news.com/story/48647.php?s=h
8. Mobile visual search: driving factors
• Why do I need a camera? I have a smartphone…
http://www.cellular-news.com/story/52382.php
9. Mobile visual search: driving factors
• Why do I need a camera? I have a smartphone…
http://www.cellular-news.com/story/52382.php
10. Mobile visual search: driving factors
• Powerful devices
– 1 GHz ARM Cortex-A9 processor, PowerVR SGX543MP2, Apple A5 chipset
http://www.apple.com/iphone/specs.html
http://www.gsmarena.com/apple_iphone_4s-4212.php
11. Mobile visual search: driving factors
Social networks and mobile devices (May 2011)
http://jess3.com/geosocial-universe-2/
12. Mobile visual search: driving factors
• Social networks and mobile devices
– Motivated users: image taking and image sharing are huge!
http://www.onlinemarketing-trends.com/2011/03/facebook-photo-statistics-and-insights.html
13. Mobile visual search: driving factors
• Instagram:
– 13 million registered (although not necessarily active) users (in 13 months)
– 7 employees
– Several apps based on it!
http://venturebeat.com/2011/11/18/instagram-13-million-users/
14. Mobile visual search: driving factors
• Food photo sharing!
http://mashable.com/2011/05/09/foodtography-infographic/
15. Mobile visual search: driving factors
• Legitimate (or not quite…) needs and use cases
http://www.slideshare.net/dtunkelang/search-by-sight-google-goggles
https://twitter.com/#!/courtanee/status/14704916575
16. Mobile visual search: driving factors
• A natural use case for CBIR with QBE (at last!)
– The example is right in front of the user!
[FIG1] A snapshot of an outdoor mobile visual search system being used. The system augments the viewfinder with information about the objects it recognizes in the image taken with a camera phone.
Girod et al., IEEE Multimedia 2011
18. MVS: technical challenges
• How to ensure low latency (and interactive
queries) under constraints such as:
– Network bandwidth
– Computational power
– Battery consumption
• How to achieve robust visual recognition in spite
of low-resolution cameras, varying lighting
conditions, etc.
• How to handle broad and narrow domains
19. MVS: Pipeline for image retrieval
Girod et al., IEEE Multimedia 2011
22. Part III - Outline
• The MVS pipeline in greater detail
• Datasets for MVS research
• MPEG Compact Descriptors for Visual Search
(CDVS)
23. MVS: descriptor extraction
• Interest point detection
• Feature descriptor computation
Girod et al., IEEE Multimedia 2011
24. Interest point detection
• Numerous interest-point detectors have been
proposed in the literature:
– Harris Corners (Harris and Stephens 1988)
– Scale-Invariant Feature Transform (SIFT) Difference-of-
Gaussian (DoG) (Lowe 2004)
– Maximally Stable Extremal Regions (MSERs) (Matas et al.
2002)
– Hessian affine (Mikolajczyk et al. 2005)
– Features from Accelerated Segment Test (FAST) (Rosten
and Drummond 2006)
– Hessian blobs (Bay, Tuytelaars and Van Gool 2006)
– etc.
Girod et al., IEEE Signal Processing Magazine 2011
25. Interest point detection
• Different tradeoffs in repeatability and complexity:
– SIFT DoG and other affine interest-point detectors are slow to
compute but are highly repeatable.
– SURF interest-point detector provides significant speed up over
DoG interest-point detectors by using box filters and integral
images for fast computation.
• However, the box filter approximation causes significant anisotropy, i.e.,
the matching performance varies with the relative orientation of query
and database images.
– FAST corner detector is an extremely fast interest-point
detector that offers very low repeatability.
• See (Mikolajczyk and Schmid 2005) for a comparative
performance evaluation of local descriptors in a common
framework.
Girod et al., IEEE Signal Processing Magazine 2011
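The speed-up attributed to SURF comes from evaluating box filters on an integral image (summed-area table), where the sum over any rectangle costs four lookups regardless of its size. The following is a minimal sketch of that idea only, not of the SURF detector itself; the function names and the toy image are illustrative assumptions.

```python
def integral_image(img):
    """Build a summed-area table with one extra zero row and column."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of img[top:bottom][left:right] using four table lookups."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


# Toy 3x4 "image" (hypothetical pixel values).
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
sat = integral_image(image)
inner = box_sum(sat, 1, 1, 3, 3)  # covers the pixels 6, 7, 10, 11
```

Because every box sum has constant cost, a detector can evaluate large filters at many scales without resampling the image, which is where the trade-off against repeatability mentioned above comes from.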
26. Feature descriptor computation
• After interest-point detection, we compute a
visual word descriptor on a normalized patch.
• Ideally, descriptors should be:
– robust to small distortions in scale, orientation, and
lighting conditions;
– discriminative, i.e., characteristic of an image or a small
set of images;
– compact, due to typical mobile computing constraints.
Girod et al., IEEE Signal Processing Magazine 2011
27. Feature descriptor computation
• Examples of feature descriptors in the literature:
– SIFT (Lowe 1999)
– Speeded Up Robust Feature (SURF) interest-point
detector (Bay et al. 2008)
– Gradient Location and Orientation Histogram (GLOH)
(Mikolajczyk and Schmid 2005)
– Compressed Histogram of Gradients (CHoG)
(Chandrasekhar et al. 2009, 2010)
• See (Winder, (Hua,) and Brown CVPR 2007, 2009) and
(Mikolajczyk and Schmid PAMI 2005) for comparative
performance evaluation of different descriptors.
Girod et al., IEEE Signal Processing Magazine 2011
28. Feature descriptor computation
• What about compactness?
– Several attempts in the literature to compress off-the-shelf
descriptors did not lead to the best rate-constrained
image-retrieval performance.
– Alternative: design a descriptor with compression in
mind.
Girod et al., IEEE Signal Processing Magazine 2011
29. Feature descriptor computation
• CHoG (Compressed Histogram of Gradients)
(Chandrasekhar et al. 2009, 2010)
– Based on the distribution of gradients within a patch of pixels
– Histogram of gradient (HoG)-based descriptors [e.g. (Lowe
2004), (Bay et al. 2008), (Dalal and Triggs 2005), (Freeman and
Roth 1994), and (Winder et al. 2009)] have been shown to be
highly discriminative at low bit rates.
Girod et al., IEEE Signal Processing Magazine 2011
30. CHoG: Compressed Histogram of Gradients
[Diagram: Patch → Gradients (dx, dy) → Spatial binning → Gradient distributions for each bin → Histogram compression → CHoG Descriptor (bit string)]
Figure source: Bernd Girod, Mobile Visual Search
Chandrasekhar et al., CVPR 09, 10
31. Encoding descriptor’s location information
• Location Histogram Coding (LHC)
– Rationale: Interest-point locations in images tend to cluster spatially.
Excerpt shown on the slide: “LHC provides two key benefits. First, encoding the locations of a set of N features as the histogram reduces the bit rate by log(N!) compared to encoding each feature location in sequence [47]. Here, we exploit the fact that the features can be sent in any order. Consider the sample space that represents N features. There are N! number of codes that represent the same feature set because the order does not matter. Thus, if we fix the ordering for the …”
[FIG S3] Interest-point locations in images tend to cluster spatially.
Girod et al., IEEE Signal Processing Magazine 2011
32. Encoding descriptor’s location information
• Location Histogram Coding (LHC)
• Method:
1. Generate a 2D histogram from the locations of the descriptors.
• Divide the image into spatial bins and count the number of features within each spatial bin.
2. Compress the binary map, indicating which spatial bins contain features, and a sequence of feature counts, representing the number of features in occupied bins.
3. Encode the binary map using a trained context-based arithmetic coder, with the neighboring bins being used as the context for each spatial bin.
Excerpt shown on the slide: “… different coding gains. In our experiments, Hessian Laplace has the highest gain followed by SIFT and SURF interest-point detectors. Even if the feature points are uniformly scattered in the image, LHC is still able to exploit the ordering gain, which results in log(N!) saving in bits. In our experiments, we found that quantizing the (x, y) location to four-pixel blocks is sufficient for GV. If we use a simple fixed-length coding scheme, then the rate will be log(640/4) + log(640/4) ≈ 14 b/feature for a VGA size image. Using LHC, we can transmit the same location data with ≈ 5 b/descriptor; ≈ 12.5 times reduction in data compared to a 64-b floating point representation and ≈ 2.8 times rate reduction compared to fixed-length coding [48].”
[FIG S4] We represent the location of the descriptors using a location histogram. The image is first divided into evenly spaced blocks. We enumerate the features within each spatial block by generating a location histogram.
Girod et al., IEEE Signal Processing Magazine 2011
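The first two LHC steps, and the bit-rate arithmetic behind them, can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the function names, the toy 16×12 image, and the feature locations are hypothetical, a 640×480 VGA frame is assumed for the fixed-length rate, and the context-based arithmetic coder of step 3 is not implemented.

```python
import math


def location_histogram(points, width, height, block=4):
    """Step 1: count features per block x block spatial bin."""
    cols, rows = math.ceil(width / block), math.ceil(height / block)
    hist = [[0] * cols for _ in range(rows)]
    for x, y in points:
        hist[int(y) // block][int(x) // block] += 1
    return hist


def binary_map_and_counts(hist):
    """Step 2: split the histogram into an occupancy map and the
    sequence of counts for occupied bins (raster order)."""
    occupancy = [[1 if c > 0 else 0 for c in row] for row in hist]
    counts = [c for row in hist for c in row if c > 0]
    return occupancy, counts


def fixed_length_bits(width, height, block=4):
    """Bits per feature when each quantized (x, y) is coded independently."""
    return math.log2(width / block) + math.log2(height / block)


def ordering_gain_bits(n):
    """log2(n!): bits saved in total by not transmitting the feature order."""
    return math.lgamma(n + 1) / math.log(2)


# Hypothetical feature locations in a toy 16x12 image.
points = [(1, 1), (2, 3), (9, 2), (10, 10), (11, 9), (8, 11)]
hist = location_histogram(points, width=16, height=12)
occupancy, counts = binary_map_and_counts(hist)
vga_rate = fixed_length_bits(640, 480)       # roughly 14 bits per feature
gain = ordering_gain_bits(len(points))       # log2(6!) bits for 6 features
```

In a full coder, `occupancy` would be fed to the arithmetic coder and `counts` coded separately; the sketch stops at producing those two inputs.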
33. MVS: feature indexing and matching
• Goal: produce a data structure that can quickly return a short
list of the database candidates most likely to match the query
image.
– The short list may contain false positives as long as the correct match
is included.
– Slower pairwise comparisons can be subsequently performed on just
the short list of candidates rather than the entire database.
Girod et al., IEEE Multimedia 2011
34. MVS: feature indexing and matching
• Vocabulary Tree (VT)-Based Retrieval
Excerpt shown on the slide: “… clustering is applied to the training descriptors assigned to that cluster, to generate k smaller clusters. This recursive division of the descriptor space is repeated until there are enough bins to ensure good classification performance. Figure B1 shows a VT with only two levels, branching factor k = 3, and 3^2 = 9 leaf nodes. In practice, VT can be much larger, for example, with height 6, branching factor k = 10, and containing 10^6 = 1 million nodes. The associated inverted index structure maintains two lists for each VT leaf node, as shown in Figure B2. … During a query, the VT is traversed for each feature in the query image, finishing at one of the leaf nodes. The corresponding lists of images and frequency counts are subsequently used to compute similarity scores between these images and the query image. By pulling images from all these lists and sorting them according to the scores, we arrive at a subset of database images that is likely to contain a true match to the query image.”
Figure B. (1) Vocabulary tree and (2) inverted index structures.
Girod et al., IEEE Multimedia 2011
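The query path through a vocabulary tree and its inverted index can be illustrated with a toy example. The sketch below assumes a hand-built two-level tree with branching factor 2, two-dimensional descriptors, and a plain sum of frequency counts as the similarity score; all of these, along with the class and function names and the image ids, are simplifying assumptions rather than the scoring used in the cited work.

```python
from collections import defaultdict


class Node:
    """One vocabulary-tree node; leaves carry a visual-word id."""
    def __init__(self, center, children=(), leaf_id=None):
        self.center = center
        self.children = list(children)
        self.leaf_id = leaf_id


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def quantize(root, descriptor):
    """Descend the tree, picking the nearest child center at each level."""
    node = root
    while node.children:
        node = min(node.children, key=lambda c: sq_dist(c.center, descriptor))
    return node.leaf_id


def build_inverted_index(root, database):
    """Map each leaf id to {image_id: frequency count}."""
    index = defaultdict(lambda: defaultdict(int))
    for image_id, descriptors in database.items():
        for d in descriptors:
            index[quantize(root, d)][image_id] += 1
    return index


def shortlist(root, index, query_descriptors, top_k=2):
    """Score images by summed counts over the leaves the query visits."""
    scores = defaultdict(int)
    for d in query_descriptors:
        for image_id, count in index.get(quantize(root, d), {}).items():
            scores[image_id] += count
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_k]


# Hand-picked cluster centers (hypothetical; normally learned by
# hierarchical k-means on training descriptors).
root = Node(None, [
    Node((0, 0), [Node((0, 0), leaf_id=0), Node((0, 2), leaf_id=1)]),
    Node((10, 10), [Node((10, 10), leaf_id=2), Node((10, 12), leaf_id=3)]),
])
database = {
    "cd_cover": [(0, 0.1), (0.2, 1.9), (10, 10.2)],
    "book": [(10, 11.8), (9.9, 12.1), (0.1, 0)],
    "poster": [(10.1, 9.9)],
}
index = build_inverted_index(root, database)
candidates = shortlist(root, index, [(0, 0), (0.1, 2.1), (10, 10)])
```

The short list may include false positives such as "book"; the later geometric verification stage is what removes them.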
35. MVS: geometric verification
• Goal: use location information of features in
query and database images to confirm that the
feature matches are consistent with a change in
view-point between the two images.
Girod et al., IEEE Multimedia 2011
36. MVS: geometric verification
• Method: perform pairwise matching of feature descriptors and evaluate geometric consistency of correspondences.
[FIG4] In the GV step, we match feature descriptors pairwise and find feature correspondences that are consistent with a geometric model. True feature matches are shown in red. False feature matches are shown in green.
Girod et al., IEEE Multimedia 2011
37. MVS: geometric verification
• Techniques:
– The geometric transform between the query and database
image is usually estimated using robust regression
techniques such as:
• Random sample consensus (RANSAC) (Fischler and Bolles 1981)
• Hough transform (Lowe 2004)
– The transformation is often represented by an affine
mapping or a homography.
• GV is computationally expensive, which is why it’s
only used for a subset of images selected during the
feature-matching stage.
Girod et al., IEEE Multimedia 2011
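The RANSAC idea can be sketched with a model simpler than the affine mapping or homography named above: a 2D similarity transform (rotation, uniform scale, translation), which two correspondences determine exactly when points are written as complex numbers, q = a·p + b. The choice of model, the tolerance, the iteration count, and the sample matches are all assumptions made for illustration.

```python
import random


def fit_similarity(p1, q1, p2, q2):
    """Solve q = a*p + b for complex a, b from two correspondences."""
    if p1 == p2:
        return None
    a = (q1 - q2) / (p1 - p2)
    b = q1 - a * p1
    return a, b


def ransac_similarity(matches, iterations=200, tol=2.0, seed=0):
    """matches: list of (query_point, database_point) as complex numbers.
    Returns the model with the most inliers and the inlier indices."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        (p1, q1), (p2, q2) = rng.sample(matches, 2)
        model = fit_similarity(p1, q1, p2, q2)
        if model is None:
            continue
        a, b = model
        inliers = [i for i, (p, q) in enumerate(matches)
                   if abs(a * p + b - q) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Five matches generated by a = 2j (90 degree rotation, scale 2) and
# b = 5 + 3j, followed by two deliberately wrong matches.
true_a, true_b = 2j, 5 + 3j
query_points = [0j, 10 + 0j, 10j, 10 + 10j, 3 + 7j]
matches = [(p, true_a * p + true_b) for p in query_points]
matches += [(4 + 4j, 100 + 100j), (8 + 1j, -50j)]

model, inliers = ransac_similarity(matches)
```

The number of inliers is the usual acceptance criterion: a database image is declared a match only if enough correspondences agree with one transform, which is also why this step is reserved for the short list.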
38. MVS: geometric reranking
• Speed-up step between Vocabulary Tree building and Geometric Verification.
[FIG5] An image retrieval pipeline can be greatly sped up by incorporating a geometric reranking stage. (Pipeline: Query Data → VT → Geometric Reranking → GV → Identify Information)
Girod et al., IEEE Signal Processing Magazine, July 2011, p. 69
39. Fast geometric reranking
• The location geometric score is computed as follows:
a) features of two images are matched based on VT quantization;
b) distances between pairs of features within an image are calculated;
c) log-distance ratios of the corresponding pairs (denoted by color) are calculated; and
d) histogram of log-distance ratios is computed.
• The maximum value of the histogram is the geometric similarity score.
– A peak in the histogram indicates a similarity transform between the query and database image.
• The time required to calculate a geometric similarity score is one to two orders of magnitude less than using RANSAC. Typically, we perform fast geometric reranking on the top 500 images and RANSAC on the top 50 ranked images.
[FIG S7] The location geometric score: (a) matching, (b) pairwise distances, (c) log-distance ratios, (d) histogram.
Girod et al., IEEE Multimedia 2011
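Steps (b) to (d) of the location geometric score can be sketched directly. The example assumes the matching of step (a) has already been done, so that `query_pts[i]` and `db_pts[i]` are the locations of the i-th matched feature pair; the bin width of 0.2 and the sample coordinates are hypothetical choices, and `math.dist` requires Python 3.8 or later.

```python
import math
from itertools import combinations


def geometric_score(query_pts, db_pts, bin_width=0.2):
    """Histogram the log-distance ratios of all matched feature pairs and
    return (peak value, histogram). The peak is the similarity score."""
    histogram = {}
    for i, j in combinations(range(len(query_pts)), 2):
        dq = math.dist(query_pts[i], query_pts[j])   # distance in query image
        dd = math.dist(db_pts[i], db_pts[j])         # distance in database image
        if dq == 0 or dd == 0:
            continue
        bin_id = math.floor(math.log(dq / dd) / bin_width)
        histogram[bin_id] = histogram.get(bin_id, 0) + 1
    return max(histogram.values(), default=0), histogram


# Four consistent matches (query is the database layout scaled by 2)
# plus one wrong match in the last position.
db_pts = [(0, 0), (10, 0), (0, 10), (10, 10), (7, 1)]
query_pts = [(0, 0), (20, 0), (0, 20), (20, 20), (5, 50)]

score, hist = geometric_score(query_pts, db_pts)
peak_bin = max(hist, key=hist.get)
```

All six pairs among the four consistent matches share the same distance ratio and land in one bin, while the pairs involving the wrong match scatter across other bins. No transform is ever estimated, which is why this check is much cheaper than RANSAC.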
40. Datasets for MVS research
• Stanford Mobile Visual Search Data Set
(http://web.cs.wpi.edu/~claypool/mmsys-2011-dataset/stanford/)
– Key characteristics:
• rigid objects
• widely varying lighting conditions
• perspective distortion
• foreground and background clutter
• realistic ground-truth reference data
• query data collected from heterogeneous low and high-end
camera phones.
Chandrasekhar et al., ACM MMSys 2011
41. Stanford Mobile Visual Search (SMVS)
Data Set
• Limitations of popular datasets
Data Database Query Classes Rigid Lighting Clutter Perspective Camera
Set (#) (#) (#) Phone
√ √ √
ZuBuD 1005 115 200 √ −
√ √ √ −
Oxford 5062 55 17 √ √ ×
INRIA 1491 500 500 −
√ − √ −
UKY 10200 2550 2550 −
√ −
√ √ −
ImageNet 11M 15K 15K −
√ √ √ √ −
√
SMVS 1200 3300 1200
Table 1: Comparison of different data sets. “Classes” refers to the number of distinct objects in the data set. “Rigid” refers to whether or not the objects in the database are rigid. “Lighting” refers to whether or not the query images capture widely varying lighting conditions. “Clutter” refers to whether or not the query images contain foreground/background clutter. “Perspective” refers to whether the data set contains typical perspective distortions. “Camera-phone” refers to whether the images were captured with mobile devices. SMVS is a good data set for mobile visual search applications.
Chandrasekhar et al., ACM MMSys 2011
[Figure 2 image rows: ZuBuD, Oxford, INRIA, UKY, ImageNet]
Figure 2: Limitations with popular data sets in are easier vision. The left most image in each row is
First, we note that indoor categories computer than out-
The resolution of the query images varies for each camera database image, and the E.g., some categories like CDs, ZuBuD,and
door categories. other 3 images are query images. DVDs INRIA and UKY consist of images ta
one. We provide the original JPEG compressed high qual- at the book time and location. ImageNets is not suitable for image retrieval applications. The Oxford data
same
covers achieve over 95% accuracy. The most challeng-
has different faades of the same building labelled with the same name.
42. SMVS Data Set: categories and examples
• Number of query and database images per category
Chandrasekhar et al., ACM MMSys 2011
43. SMVS Data Set: categories and examples
• DVD covers
http://web.cs.wpi.edu/~claypool/mmsys-2011-dataset/stanford/mvs_images/dvd_covers.html
44. SMVS Data Set: categories and examples
• CD covers
http://web.cs.wpi.edu/~claypool/mmsys-2011-dataset/stanford/mvs_images/cd_covers.html
45. SMVS Data Set: categories and examples
• Museum paintings
http://web.cs.wpi.edu/~claypool/mmsys-2011-dataset/stanford/mvs_images/museum_paintings.html
46. Other MVS data sets
ISO/IEC JTC1/SC29/WG11/N12202 - July 2011, Torino, IT
47. Other MVS data sets
ISO/IEC JTC1/SC29/WG11/N12202 - July 2011, Torino, IT
48. Other MVS data sets
• Distractor set
– 1 million images of various resolution and content
collected from FLICKR.
ISO/IEC JTC1/SC29/WG11/N12202 - July 2011, Torino, IT
49. MPEG Compact Descriptors for Visual Search (CDVS)
• Objectives
– Define a standard that:
• enables design of visual search applications
• minimizes lengths of query requests
• ensures high matching performance (in terms of reliability and
complexity)
• enables interoperability between search applications and visual databases
• enables efficient implementation of visual search functionality on mobile
devices
• Scope
– It is envisioned that (as a minimum) the standard will specify:
• bitstream of descriptors
• parts of descriptor extraction process (e.g. key-point detection) needed
to ensure interoperability
Bober, Cordara, and Reznik (2010)
50. MPEG CDVS
• Requirements
– Robustness
• High matching accuracy shall be achieved at least for images of textured
rigid objects, landmarks, and printed documents. The matching accuracy
shall be robust to changes in vantage point, camera parameters, lighting
conditions, as well as in the presence of partial occlusions.
– Sufficiency
• Descriptors shall be self-contained, in the sense that no other data are
necessary for matching.
– Compactness
• Shall minimize lengths/size of image descriptors
– Scalability
• Shall allow adaptation of descriptor lengths to support the required
performance level and database size.
• Shall enable design of web-scale visual search applications and databases.
Bober, Cordara, and Reznik (2010)
51. MPEG CDVS
• Requirements (cont’d)
– Image format independence
• Descriptors shall be independent of the image format
– Extraction complexity
• Shall allow descriptor extraction with low complexity (in terms of
memory and computation) to facilitate video rate implementations
– Matching complexity
• Shall allow matching of descriptors with low complexity (in terms of
memory and computation).
• If decoding of descriptors is required for matching, such decoding shall
also be possible with low complexity.
– Localization
• Shall support visual search algorithms that identify and localize matching
regions of the query image and the database image
• Shall support visual search algorithms that provide an estimate of a
geometric transformation between matching regions of the query image
and the database image
Bober, Cordara, and Reznik (2010)
52. MPEG CDVS
• Summarized timeline
Table 1. Timeline for development of MPEG standard for visual search.
When            Milestone                            Comments
March 2011      Call for Proposals is published      Registration deadline: 11 July 2011.
                                                     Proposals due: 21 November 2011.
December 2011   Evaluation of proposals              None
February 2012   1st Working Draft                    First specification and test software model that
                                                     can be used for subsequent improvements.
July 2012       Committee Draft                      Essentially complete and stabilized specification.
January 2013    Draft International Standard         Complete specification. Only minor editorial
                                                     changes are allowed after DIS.
July 2013       Final Draft International Standard   Finalized specification, submitted for approval
                                                     and publication as International Standard.
… that among several component technologies for image retrieval, such a standard should focus primarily on defining the format of descriptors and parts of their extraction process (such as interest point detectors) needed to ensure interoperability …
… existing standards, such as MPEG Query Format, HTTP, XML, JPEG, and JPSearch.
Girod et al., IEEE Multimedia 2011
53. MPEG CDVS
• CDVS: evaluation framework
– Experimental setup
• Retrieval experiment: intended to assess performance of
proposals in the context of an image retrieval system
ISO/IEC JTC1/SC29/WG11/N12202 - July 2011, Torino, IT
54. MPEG CDVS
• CDVS: evaluation framework
– Experimental setup
• Pair-wise matching experiments: intended for assessing
performance of proposals in the context of an application
that uses descriptors for the purpose of image matching.
[Diagram blocks: Image A, Extract descriptors; Image B, Extract descriptors; Match; Annotations; Check accuracy of search results; Report]
ISO/IEC JTC1/SC29/WG11/N12202 - July 2011, Torino, IT
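A minimal sketch of how a pair-wise matching experiment of this kind could be scored is shown below. The slide does not state which accuracy measures the framework reports, so the true-positive and false-positive rates used here are an assumption, as are the function names and the toy set-overlap matcher.

```python
def evaluate_pairwise(pairs, match_score, threshold):
    """Score a matcher against annotated image pairs.

    pairs: iterable of (descriptors_a, descriptors_b, is_match), where
    is_match is the ground-truth annotation for the pair.
    match_score: function returning a similarity score for two descriptor sets.
    threshold: hypothetical decision threshold on that score.
    """
    tp = fp = positives = negatives = 0
    for desc_a, desc_b, is_match in pairs:
        declared = match_score(desc_a, desc_b) >= threshold
        if is_match:
            positives += 1
            tp += declared
        else:
            negatives += 1
            fp += declared
    return {
        "true_positive_rate": tp / positives if positives else 0.0,
        "false_positive_rate": fp / negatives if negatives else 0.0,
    }


def shared_words(desc_a, desc_b):
    """Toy matcher: number of descriptor IDs the two images share."""
    return len(set(desc_a) & set(desc_b))


annotated_pairs = [
    ({1, 2, 3}, {2, 3, 4}, True),
    ({1, 2}, {3, 4}, False),
    ({1, 2, 3}, {7, 8, 9}, True),
    ({1, 2}, {1, 2}, False),
]
print(evaluate_pairwise(annotated_pairs, shared_words, threshold=2))
```

Sweeping the threshold and plotting the two rates against each other would give a receiver operating characteristic for the descriptor under test.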
55. MPEG CDVS
• For more info:
– https://mailhost.tnt.uni-hannover.de/mailman/listinfo/cdvs
– http://mpeg.chiariglione.org/meetings/geneva11-1/geneva_ahg.htm (Ad hoc groups)
57. Examples
• Academic
– Stanford Product Search System
• Commercial
– Google Goggles
– Kooaba: Déjà Vu and Paperboy
– SnapTell
– oMoby (and the IQ Engines API)
– pixlinQ
– Moodstocks
58. Stanford Product Search (SPS) System
• Local feature based visual search system
• Client-server architecture
[Diagram. Client: Query Image → Feature Extraction → Feature Compression → (Query Data over Network) → Server: VT Matching → GV → (Identification Data over Network) → Client: Display]
[FIG6] Stanford Product Search system. Because of the large database, the image-recognition server is placed at a remote location. In most systems [1], [3], [7], the query image is sent to the server and feature extraction is performed. In our system, we show that by performing feature extraction on the phone we can significantly reduce the transmission delay and provide an interactive experience.
[TABLE 2] Summary of references for modules in a matching pipeline.
Module: list of references
Feature extraction: Harris and Stephens [17], Lowe [15], [23], Matas et al. [18], Mikolajczyk et al. [16], [22], Dalal and Triggs [41], Rosten and Drummond [19], Bay et al. [20], Winder et al. [27], [28], Chandrasekhar et al. [25], [26], Philbin et al. [40]
Feature indexing and matching: Schmid and Mohr [13], Lowe [15], [23], Sivic and Zisserman [9], Nistér and Stewénius [10], Chum et al. [50], [52], [53], Yeh et al. [51], Philbin et al. [12], Jegou et al. [11], [59], [60], Zhang et al. [54], Chen et al. [58], Perronnin [61], Mikulik et al. [55], Turcot and Lowe [56], Li et al. [57]
GV: Fischler and Bolles [66], Schaffalitzky and Zisserman [74], Lowe [15], [23], Chum et al. [53], [70], [71], Ferrari et al. [68], Jegou et al. [11], Wu et al. [69], Tsai et al. [73]
… list of important references for each module in the matching pipeline in Table 2.
… (500 × 500 pixel resolution) [75] exhibiting challenging photometric and geometric distortions, as shown in Figure 7. For …
IEEE Signal Processing Magazine [70], July 2011
Girod et al., IEEE Multimedia 2011
Tsai et al., ACM MM 2010
59. Stanford Product Search (SPS) System
• Key contributions:
– Optimized feature extraction implementation
– CHoG: a low bit-rate compact descriptor (provides up
to 20× bit-rate saving over SIFT with comparable
image retrieval performance)
– Inverted index compression to enable large-scale
image retrieval on the server
– Fast geometric re-ranking
Girod et al., IEEE Multimedia 2011
Tsai et al., ACM MM 2010
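To put the "up to 20×" figure in perspective, the short calculation below estimates descriptor payload per query. It is a back-of-the-envelope sketch only: it assumes a SIFT descriptor stored as 128 dimensions at 8 bits each, treats 20× as the best case, uses a hypothetical count of 500 features per query, and ignores the cost of coding feature locations.

```python
SIFT_DIMS = 128
BITS_PER_DIM = 8      # assumption: one byte per SIFT dimension
BEST_CASE_SAVING = 20  # "up to 20x" from the slide, taken as the best case


def query_size_bytes(num_features, bits_per_descriptor):
    """Descriptor payload for one query, ignoring location coding."""
    return num_features * bits_per_descriptor / 8


sift_bits = SIFT_DIMS * BITS_PER_DIM          # 1024 bits per descriptor
compact_bits = sift_bits / BEST_CASE_SAVING   # about 51 bits per descriptor

num_features = 500  # hypothetical number of features in a query image
print(query_size_bytes(num_features, sift_bits))
print(query_size_bytes(num_features, compact_bits))
```

Under these assumptions the uncompressed descriptors take about 64 KB per query, while the compact ones take roughly 3 KB.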
60. Stanford Product Search (SPS) System
• Two modes:
– Send Image mode
[Diagram (a). Mobile phone: Image → Image encoding (JPEG) → Wireless network → Visual search server: Image decoding → Descriptor extraction → Descriptor matching (Database) → Search results → Wireless network → Mobile phone: Process and display results]
– Send Feature mode
[Diagram (b). Mobile phone: Image → Descriptor extraction → Descriptor encoding → Wireless network → Visual search server: Descriptor decoding → Descriptor matching (Database) → Search results → Wireless network → Mobile phone: Process and display results]
Figure 2. Mobile visual search architectures. (a) The mobile phone transmits the compressed image, while analysis of the image and retrieval are done entirely on a remote server. (b) Local image features (descriptors) are extracted on the phone and then encoded and transmitted over the network. Such descriptors are then used by the server to perform the search. (c) The mobile phone maintains a cache of the database and sends search requests to the remote server only if the object of interest is not found in this cache, further reducing the amount of data sent over the network.
Mobile image-based retrieval technologies
… including different distances, viewing angles, and lighting conditions, or in the presence of partial occlusions or motion blur.
Most successful algorithms for image-based retrieval today use an approach that is referred to as bag of features (BoF) or bag of words (BoW) [1, 2]. The BoW idea is borrowed from text document retrieval. To find a particular text document, such as a webpage, it’s sufficient to use a few well-chosen words. In the database, the document itself can likewise be represented by a bag of salient words, regardless of where these words appear in the document. For images, robust local features that are characteristic of a particular image take the role of visual words. As with text retrieval, BoF image retrieval does not consider where in the image the features occur, at least in the …
Girod et al., IEEE Multimedia 2011
Tsai et al., ACM MM 2010
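The bag-of-features idea described on this slide can be illustrated with a small Python sketch. It is a toy example under stated assumptions: descriptors are 2-D points rather than real local features, the three-word vocabulary is made up, and histogram intersection is used as the similarity measure only for illustration (the slide does not name one).

```python
import math
from collections import Counter


def quantize(descriptor, vocabulary):
    """Index of the nearest visual word (vocabulary centroid)."""
    return min(range(len(vocabulary)),
               key=lambda k: math.dist(descriptor, vocabulary[k]))


def bag_of_features(descriptors, vocabulary):
    """Histogram of visual words; feature positions in the image are ignored."""
    return Counter(quantize(d, vocabulary) for d in descriptors)


def histogram_intersection(bag_a, bag_b):
    """Number of visual-word occurrences the two bags have in common."""
    return sum(min(count, bag_b[word]) for word, count in bag_a.items())


vocabulary = [(0, 0), (10, 10), (0, 10)]        # hypothetical visual words
query = bag_of_features([(1, 1), (9, 9), (0, 1)], vocabulary)
image_a = bag_of_features([(0, 0), (10, 9), (1, 0)], vocabulary)
image_b = bag_of_features([(0, 9), (1, 10), (10, 10)], vocabulary)

print(histogram_intersection(query, image_a))
print(histogram_intersection(query, image_b))
```

Because only word counts are compared, a database image with the same visual words scores highest regardless of where those features sit in the image, which is why a separate geometric check is applied afterwards.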
61. Stanford Product Search System
• Performance evaluation
– Dataset: 1 million CD, DVD, and book cover images +
1,000 query images (500×500) with challenging
photometric and geometric distortions
[FIG7] Example image pairs from the data set. (a) A clean database picture is matched against (b) a real-world picture with various distortions.
… the client, we use a Nokia 5800 mobile phone with a 300-MHz …
Figure 8 compares the recall for the three schemes: send …
Girod et al., IEEE Multimedia 2011
62. Stanford Product Search System
• Performance evaluation
– Recall vs. bit rate
[Plot: classification accuracy (%), axis from 80 to 100, versus query size (Kbytes), log axis from 10^0 to 10^2, for three schemes: Send feature (CHoG), Send image (JPEG), and Send feature (SIFT).]
Figure 7. Comparison of different schemes with regard to classification accuracy and query size. CHoG descriptor data is an order of magnitude smaller compared to JPEG images or uncompressed SIFT descriptors.
Girod et al., IEEE Multimedia 2011
63. Stanford Product Search System
• Performance evaluation
– Processing times
[TABLE 3] Processing times.
Client-side operations                                      Time (s)
Image capture                                               1–2
Feature extraction and compression (for send image mode)    1–1.5
Server-side operations                                      Time (ms)
Feature extraction (for send image mode)                    100
VT matching                                                 100
Fast geometric reranking (per image)                        0.46
GV (per image)                                              30
… achieve <1 s server processing latency while maintaining high recall.
Transmission delay: The transmission delay depends on the type of network used. In Figure 10, we observe that the data transmission time is insignificant for a WLAN network because of the high …
[FIG9: plot with axis “Communication Time-Out (%)”, 0 to 14, measured over a 3G network; caption truncated in the transcript.]
IEEE Signal Processing Magazine [72], July 2011
Girod et al., IEEE Multimedia 2011
Tsai et al., ACM MM 2010
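The server-side rows of Table 3 can be combined into a rough latency estimate. The sketch below is a simple additive model, which is an assumption, and the candidate counts passed in are hypothetical; only the per-operation times come from the table.

```python
# Per-operation server times from Table 3, in milliseconds.
FEATURE_EXTRACTION_MS = 100.0  # only needed in send image mode
VT_MATCHING_MS = 100.0
RERANK_PER_IMAGE_MS = 0.46
GV_PER_IMAGE_MS = 30.0


def server_latency_ms(num_reranked, num_verified, send_image_mode=False):
    """Additive estimate of server processing time for one query.

    num_reranked: candidates passed through fast geometric reranking.
    num_verified: candidates passed through full geometric verification (GV).
    """
    total = (VT_MATCHING_MS
             + num_reranked * RERANK_PER_IMAGE_MS
             + num_verified * GV_PER_IMAGE_MS)
    if send_image_mode:
        total += FEATURE_EXTRACTION_MS
    return total


# Hypothetical candidate counts: rerank 500 candidates, then verify 10.
print(server_latency_ms(num_reranked=500, num_verified=10))
# Verifying all 500 candidates directly, without reranking first.
print(server_latency_ms(num_reranked=0, num_verified=500))
```

With these illustrative counts, reranking before verification keeps the estimate well under one second, whereas running GV on every candidate costs about 15 seconds. This shows why the cheap per-image reranking step is placed ahead of GV.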
64. Stanford Product Search System
• Performance evaluation
– End-to-end latency
[Plot: stacked bars of response time (seconds), axis from 0 to 12, split into Feature extraction, Network transmission, and Retrieval, for five schemes: JPEG (3G), Feature (3G), Feature progressive (3G), JPEG (WLAN), and Feature (WLAN).]
Figure 8. End-to-end latency for different schemes. Compared to a system transmitting a JPEG query image, a scheme employing progressive transmission of CHoG features achieves approximately four times the reduction in system latency over a 3G network.
Girod et al., IEEE Multimedia 2011
65. Examples of commercial MVS apps
• Google Goggles
– Android and iPhone
– Narrow-domain search and retrieval
http://www.google.com/mobile/goggles
66. SnapTell
• One of the earliest (ca. 2008) MVS apps for iPhone
– Eventually acquired by Amazon (A9)
• Proprietary technique (“highly accurate and robust
algorithm for image matching: Accumulated Signed Gradient
(ASG)”).
http://www.snaptell.com/technology/index.htm
67. oMoby (and the IQ Engines API)
– iPhone app
http://omoby.com/pages/screenshots.php
68. oMoby (and the IQ Engines API)
• The IQ Engines API:
“vision as a service”
http://www.iqengines.com/applications.php
71. Kooaba: Déjà Vu and Paperboy
• “Image recognition in the cloud” platform
http://www.kooaba.com/en/home/developers
72. Kooaba: Déjà Vu and Paperboy
• Déjà Vu
– Enhanced digital memories / notes / journal
http://www.kooaba.com/en/products/dejavu
• Paperboy
– News sharing from printed media
http://www.kooaba.com/en/products/paperboy
73. pixlinQ
• A “mobile visual search solution that enables you to link users to digital content whenever they take a mobile picture of your printed materials.”
– Powered by image recognition from LTU technologies
http://www.pixlinq.com/home
74. pixlinQ
• Example app (La Redoute)
http://www.youtube.com/watch?v=qUZCFtc42Q4
75. Moodstocks
• Real-time mobile image recognition that works offline (!)
• API and SDK available
http://www.youtube.com/watch?v=tsxe23b12eU
76. Moodstocks
• Many successful apps for different platforms
http://www.moodstocks.com/gallery/
78. Concluding thoughts
• Mobile Visual Search (MVS) is coming of age.
• This is not a fad and it can only grow.
• Still a good research topic
– Many relevant technical challenges
– MPEG efforts have just started
• Infinite creative commercial possibilities