This document summarizes Terrance Boult's talk on open world recognition and learning with unknown inputs. It discusses how traditional machine learning assumes all classes are known, while in reality there are many "unknown unknowns" that models are never trained on. It reviews the range of openness in problems, from closed multi-class classification to open set recognition, surveys algorithms that can address open set recognition, such as SVDD, the EVM, and OpenMax, and discusses the challenges of applying these techniques to deep networks. The talk concludes that we cannot anticipate all unknown inputs and that open set recognition remains an important area for further research.
AAAI19-Open.pptx
1. LEARNING AND THE UNKNOWN:
SURVEYING STEPS TOWARD OPEN WORLD RECOGNITION
Terrance E. Boult
IEEE Fellow
El Pomar Prof. of Innovation and Security
University of Colorado Colorado Springs
I’ll post video of this talk at https://github.com/vastab
2. WHAT “CLASS” IS THIS OBJECT?
A few will actually know it.
Most will (quickly) think “I don’t know” – you know you don’t know, which is open set.
The curious might try to look it up and get more data to “learn” it’s a Ctenophore, aka a comb jelly – i.e., we do open world learning.
3. TRAFFICKCAM EXAMPLE (AAAI19 PAPER)
HOTELS-50K: A GLOBAL HOTEL RECOGNITION DATASET
Abby Stylianou, Hong Xuan, Maya Shende, Jonathan Brandt, Richard Souvenir, Robert Pless
Adversaries will try to defeat a system.
4. SURVEILLANCE EXAMPLE: CAR DETECTION/COUNTING
Nature can produce persistent/long-lived unknown inputs, e.g. ice on the camera housing (also bugs, bird doodoo…).
5. Face Verification (mid-JANUS IARPA program)
[Figure: gallery vs. probe face images; match score = 0.72705]
6. Face Verification (mid-JANUS IARPA program, top performer)
L2-Softmax trained, cosine-distance matching.
[Figure: gallery vs. probe face images; match score = 0.99769]
7. WHAT WENT WRONG?
Absence of Evidence is not Evidence of Absence.
Being far from boundaries and training evidence (absence of evidence) implies high “probabilities” in classifiers such as SVM or Softmax.
The open set/world is full of “unknowns” that will be absent in training!
Bayesian reasoning cannot help us, as we cannot normalize without the probability of the unknown inputs.
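To make the point concrete, here is a minimal NumPy sketch (my illustration, not from the slides; the weight matrix and example points are made up) showing how a plain softmax over linear scores assigns near-certain probability to an input far from all training data, simply because it lies deep on one side of the decision boundaries:

```python
import numpy as np

def softmax(z):
    z = z - z.max()            # numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical linear classifier for 3 known classes in a 2-D feature space.
W = np.array([[ 1.0,  0.0],    # class 0 prefers large x
              [-1.0,  0.0],    # class 1 prefers small x
              [ 0.0,  1.0]])   # class 2 prefers large y
b = np.zeros(3)

x_known   = np.array([1.0, 0.2])      # near the training data
x_unknown = np.array([50.0, -40.0])   # far from anything ever seen

print(softmax(W @ x_known + b))    # moderate confidence, roughly [0.6, 0.1, 0.3]
print(softmax(W @ x_unknown + b))  # ~[1.0, 0.0, 0.0]: absence of evidence,
                                   # yet a near-certain "probability"
```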
8. The Range of Openness/Unknowns in problems
[Figure: a spectrum of problems from closed to open]
• Multi-class classification (closed): training and testing samples come from known classes.
• Face verification: a claimed identity, with the possibility of impostors.
• Detection: one class; everything else in the world is negative.
• Open set recognition (open): multiple known classes, many unknown classes.
The paper has > 90 citations covering OSR from 14 different application areas.
9. Thresholding “Probability” vs Open Set
C. Chow, “On optimum recognition error and reject tradeoff,” IEEE Trans. Info. Theory 16(1):41–46, 1970.
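For reference, Chow's rule (stated here from the cited paper; the notation may differ slightly from the slide's figure) rejects an input whenever the largest posterior falls below a threshold:

$$ \text{reject } x \quad \text{if} \quad \max_{k} P(\omega_k \mid x) < 1 - t $$

which is optimal only when the posteriors $P(\omega_k \mid x)$ are the true ones – exactly the assumption that fails for unknown unknowns.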
10. LEARNING IN THE FACE OF UNKNOWN UNKNOWNS:
FORMALIZATION OF OPEN-SET RECOGNITION
The objective balances open space risk against empirical risk/error (Scheirer et al., TPAMI ’13).
𝑉 is the set of valid-class training samples; 𝐾 is the set of known unknowns (backgrounds).
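The objective on the slide was an image; a hedged reconstruction following the TPAMI ’13 formalization, adapted to the slide's 𝑉 and 𝐾 (the exact regularization constant and notation may differ):

$$ f^{*} = \operatorname*{arg\,min}_{f \in \mathcal{H}} \Big\{ R_{\mathcal{O}}(f) + \lambda_r \, R_{\varepsilon}\big(f(V \cup K)\big) \Big\} $$

where $R_{\mathcal{O}}$ is the open space risk and $R_{\varepsilon}$ is the empirical risk on the labeled training data.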
11. OPEN SPACE RISK
“Open space” is the space far from known samples. A simple risk model assigns a constant penalty for labeling anything in open space as other than unknown, measured as a ratio such as:
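The ratio itself was an image on the slide; my reconstruction of the corresponding definition from Scheirer et al. TPAMI ’13:

$$ R_{\mathcal{O}}(f) \;=\; \frac{\int_{\mathcal{O}} f_y(x)\,dx}{\int_{S_o} f_y(x)\,dx} $$

where $\mathcal{O}$ is the open space, $S_o$ is a large ball containing both the open space and the positive training examples, and $f_y$ is the recognition function for class $y$.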
12. WHAT SOLVES OSR?
Any detector that uses pure linear classifiers, linear SVMs, HAAR cascades, or Softmax-based classifiers will almost always have unbounded open set risk, and hence does not solve OSR even with thresholding.
Algorithms that can solve OSR: RBF SVM, GMM, SVDD, KDE, W-SVM, PI-SVM, EVM, NNO.
However, if they include a “bias” or Bayesian normalization, they probably don’t solve OSR.
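As a concrete, hedged illustration of a model with a bounded acceptance region, here is a scikit-learn sketch that wraps one RBF one-class SVM per known class and labels anything rejected by every class model as unknown. The class name, nu value, and toy data are illustrative; this is a simplification in the SVDD spirit, not the talk's W-SVM or EVM code:

```python
import numpy as np
from sklearn.svm import OneClassSVM

class PerClassRejector:
    """One RBF one-class SVM per known class; everything else is rejected as unknown (-1)."""

    def __init__(self, nu=0.1, gamma="scale"):
        self.nu, self.gamma = nu, gamma
        self.models = {}

    def fit(self, X, y):
        for c in np.unique(y):
            self.models[c] = OneClassSVM(kernel="rbf", nu=self.nu,
                                         gamma=self.gamma).fit(X[y == c])
        return self

    def predict(self, X):
        labels = np.full(len(X), -1)           # -1 == unknown
        best = np.full(len(X), -np.inf)
        for c, m in self.models.items():
            score = m.decision_function(X)     # > 0 roughly inside the class support
            better = (score > 0) & (score > best)
            labels[better], best[better] = c, score[better]
        return labels

# Toy usage: two known clusters; a far-away point comes back as unknown (-1).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
clf = PerClassRejector().fit(X, y)
print(clf.predict(np.array([[0, 0], [6, 6], [60, -60]])))  # e.g. [ 0  1 -1]
```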
13. NOVELTY DETECTION, ANOMALY DETECTION, AND DETECTING “OUT-OF-DISTRIBUTION SAMPLES”
There is a long history and many papers on the first two, while the latter is a newer term for similar ideas or for learning with outliers.
Open-Set Recognition ≅ Anomaly/Novelty detection + Multi-class Recognition
[Pipeline: compute a novelty/anomaly score → if the input is an outlier, label it unknown; otherwise assign a “closed set” multi-class label]
These methods do NOT directly address open set recognition, but can be used in sequence to address OSR. However, open set recognition is rarely part of their evaluation.
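A hedged sketch of that two-stage pipeline; the detector and classifier choices (IsolationForest, logistic regression) and the threshold convention are illustrative, not from the talk:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.linear_model import LogisticRegression

UNKNOWN = -1

def fit_two_stage(X, y):
    """Stage 1: anomaly detector on all training data. Stage 2: closed-set classifier."""
    detector = IsolationForest(random_state=0).fit(X)
    classifier = LogisticRegression(max_iter=1000).fit(X, y)
    return detector, classifier

def predict_two_stage(detector, classifier, X):
    is_inlier = detector.predict(X) == 1            # IsolationForest: 1 = inlier, -1 = outlier
    labels = np.full(len(X), UNKNOWN)
    if is_inlier.any():
        labels[is_inlier] = classifier.predict(X[is_inlier])
    return labels
```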
14. Thresholding vs Open Set
Classic machine learning presumes all classes are known and classifies all of feature space.
15. COMPACT ABATING PROBABILITY
W-SVM ≈ One-class RBF SVM × (EVT-scaled) binary RBF SVM
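The compact abating probability (CAP) idea, paraphrased here from the W-SVM line of work with my own notation: if the class score $M_y(x)$ decreases monotonically with the distance $\|x - c_y\|$ from the training data and

$$ M_y(x) \to 0 \quad \text{as} \quad \|x - c_y\| \to \infty, $$

then for any threshold $\tau > 0$ the accepted region $\{x : M_y(x) \ge \tau\}$ is contained in a ball around $c_y$, so thresholding such a model bounds the open space risk.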
16. THERE ARE MANY PROBLEM VARIANTS AND SOLUTIONS
It can be part of zero/few-shot learning:
Xian, Y.; Lampert, C. H.; Schiele, B.; and Akata, Z. 2018. “Zero-shot learning – A comprehensive evaluation of the good, the bad and the ugly.” IEEE TPAMI 2018.
Or open set clustering/incremental learning with no labels:
“Active Sampling for Open-Set Classification without Initial Annotation,” Z.-Y. Liu and S.-J. Huang, AAAI 19 (Tech Session 1: Weakly Supervised Learning 1, Thu 2–3:30, Coral Ballroom 3–5).
Or problems where you have to predict scores on unseen data, e.g.:
“An Open-World Extension to Knowledge Graph Completion Models,” H. Shah et al., AAAI 19 (Tech Session 3: Thu 10:25–11:25, Coral 2).
17. Open World Learning (Bendale–Boult, CVPR15)
[Figure: the open world learning cycle]
• World with knowns (K), known unknowns (KU), and unknown unknowns (UU).
• OSR: recognize known classes or detect as unknown (open-set network / feature training → class label or unknown).
• NU (novel unknowns): collect & label data → LU (labeled unknowns).
• Incremental class learning folds the labeled unknowns back into the recognizer.
18. EXTREME VALUE MACHINE
Uses our “margin distribution theorem” to derive EVT-based non-linear models that are provably open-set and can also do open world/“incremental” learning. Can use classic or deep features.
Rudd et al., TPAMI ’18.
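A hedged sketch of the core EVM ingredient – fitting a Weibull to the smallest half-margins around a training point and using it as a radial inclusion probability. The tail size and helper names are illustrative; this is not the authors' released code:

```python
import numpy as np
from scipy.stats import weibull_min

def fit_inclusion_model(anchor, negatives, tail_size=20):
    """Fit a Weibull to the smallest half-distances from `anchor` to points of other classes."""
    d = np.sort(np.linalg.norm(negatives - anchor, axis=1)) / 2.0
    shape, _, scale = weibull_min.fit(d[:tail_size], floc=0)
    return shape, scale

def inclusion_probability(x, anchor, shape, scale):
    """Probability that x is 'included' by the anchor: decays with distance (a CAP model)."""
    dist = np.linalg.norm(x - anchor)
    return float(weibull_min.sf(dist, shape, loc=0, scale=scale))  # Weibull survival function
```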
19. TOWARDS OPEN-SET DEEP NETWORKS
Abhijit Bendale*, Terrance Boult
*Samsung Research America
University of Colorado Colorado Springs
20. OpenMax for Deep Networks
[Figure: AlexNet FC7/FC8 activations feeding OpenMax; histogram of frequency vs. distance from the mean activation vector (MAV)]
OpenMax is a CAP model using EVT on distances from the MAV.
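A hedged, simplified sketch of OpenMax-style score recalibration, written from my reading of the Bendale–Boult description; the tail size, alpha, the ranking weight, and the assumption of integer class labels are illustrative rather than the released implementation:

```python
import numpy as np
from scipy.stats import weibull_min

def fit_class_weibulls(activations_per_class, tail_size=20):
    """Per class: mean activation vector (MAV) plus a Weibull over the largest distances to it."""
    models = {}
    for c, acts in activations_per_class.items():          # acts: (n_c, d) penultimate activations
        mav = acts.mean(axis=0)
        dists = np.sort(np.linalg.norm(acts - mav, axis=1))[-tail_size:]
        shape, _, scale = weibull_min.fit(dists, floc=0)
        models[c] = (mav, shape, scale)
    return models

def openmax(logits, activation, models, alpha=3):
    """Down-weight the top logits by how far the activation is from each class MAV,
    and route the removed mass into an extra 'unknown' bin."""
    order = np.argsort(logits)[::-1]
    weights = np.ones_like(logits, dtype=float)
    for rank, c in enumerate(order[:alpha]):
        mav, shape, scale = models[c]
        d = np.linalg.norm(activation - mav)
        w_score = weibull_min.cdf(d, shape, loc=0, scale=scale)   # ~1 => far outside the class tail
        weights[c] = 1.0 - w_score * (alpha - rank) / alpha
    revised = logits * weights
    unknown_logit = np.sum(logits * (1.0 - weights))
    scores = np.exp(np.append(revised, unknown_logit))
    return scores / scores.sum()        # last entry = probability of "unknown"
```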
21. OPEN-SET DEEP NETWORKS
[Figure: the class model (MAV) compared with a real image, a fooling image, and an open-set image]
Real image — Softmax: 0.94 baseball; OpenMax: 0.94 baseball.
Fooling image — Softmax: 1.0 baseball; OpenMax: 0.00 baseball, 0.95 unknown.
Open-set image — Softmax: 0.15 baseball; OpenMax: 0.17 baseball, 0.80 unknown.
22. OPENMAX IS ONLY SOMEWHAT BETTER. WE RESEARCHED WHY.
LeNet++ 2D feature representation.
24. LeNet++ TRAINED ON MNIST
[Figure: 2D features of the MNIST test set (colors); black = NIST letters]
Rather than being far away or “outside” the data, features for unknown inputs generally overlap the known classes.
25. OBSERVATION FROM DEFAULT RESPONSE – LEADING TO OUR APPROACH
Observe that there is a difference in entropy and magnitude: while the open-set response is limited outside the ring of data, most of the unknowns had smaller feature magnitude.
The NeurIPS18 approach seeks to emphasize that difference.
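A hedged PyTorch-style sketch of the two NeurIPS18 losses as I understand them (an entropic open-set loss plus a feature-magnitude "objectosphere" term); the margin xi and the combination weight are illustrative, and this is not the authors' released code:

```python
import torch
import torch.nn.functional as F

def entropic_openset_loss(logits, targets):
    """Knowns (target >= 0): ordinary cross-entropy.
    Unknowns (target == -1): push the softmax toward uniform over the known classes."""
    known = targets >= 0
    loss = torch.zeros(len(targets), device=logits.device)
    if known.any():
        loss[known] = F.cross_entropy(logits[known], targets[known], reduction="none")
    if (~known).any():
        log_probs = F.log_softmax(logits[~known], dim=1)
        loss[~known] = -log_probs.mean(dim=1)   # cross-entropy against uniform targets
    return loss.mean()

def objectosphere_loss(features, targets, xi=10.0):
    """Push feature magnitude above a margin xi for knowns and toward zero for unknowns."""
    norms = features.norm(dim=1)
    known = targets >= 0
    per_sample = torch.where(known,
                             torch.clamp(xi - norms, min=0.0) ** 2,  # knowns: want ||f|| >= xi
                             norms ** 2)                             # unknowns: want ||f|| -> 0
    return per_sample.mean()

# Combined objective (the weight 0.01 is purely illustrative):
# total = entropic_openset_loss(logits, targets) + 0.01 * objectosphere_loss(features, targets)
```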
27. LeNet++ RESPONSES TO KNOWNS AND UNKNOWNS
[Figure: colored dots are test samples from the ten MNIST classes; black dots are samples from unknown unknowns; dashed gray-white lines indicate class borders. The bottom figures are histograms of network scores for knowns (green) and unknowns (red), with a logarithmic vertical axis.]
Dhamija et al. ’18.
30. Matching deep features from VGG2, thresholded at FAR = 10⁻⁴, says this picture of TB matches Barack Obama. So does a commercial system.
Adversarial examples show we do NOT understand how deep networks actually work – “close in input is not close in features”.
Until we do understand, we cannot really do open world deep networks.
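For readers unfamiliar with the phenomenon, a minimal FGSM-style sketch (a generic illustration, not the LOTS method mentioned in the speaker notes) of how a tiny input perturbation can move a network's output far from the original:

```python
import torch

def fgsm_perturb(model, x, y_true, epsilon=0.01):
    """One-step FGSM: a small input change, barely visible to a human, chosen to
    increase the loss -- often enough to move the prediction to another class."""
    x = x.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(x), y_true)
    loss.backward()
    return (x + epsilon * x.grad.sign()).detach()
```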
31. CONCLUSIONS
We cannot anticipate and train for all “unknown inputs”.
Bayesian reasoning cannot help us if we don’t know the probability of “unknown” inputs occurring.
Almost all classical classifiers have unbounded open space risk and make highly confident errors. OSR tools address both.
Traditional deep networks map unknowns on top of knowns.
We are starting to make progress on deep networks, but it’s an area with lots of research potential. Adversarial examples show there are major issues still “unknown”.
32. “Intelligence comes with hard work and curiosity for the unknown.” – Roberto Llamas
Do not fear the unknown — join us in taming it.
https://github.com/vastab
Our code is mostly LIBSVM or BSD-3 “free” licensed.
Editor's Notes
The first part of the talk will explore issues with deep networks dealing with "unknown" inputs, and the general issues of open-set recognition in deep networks. We review our first attempt at open-set deep networks, "OpenMax," and discuss its successes and limitations and why classic "open-set" approaches don't really solve the problem of deep unknowns. We then present our ongoing work, to first appear at NeurIPS 2018, on a new model we call the ObjectoSphere. Using the ObjectoSphere loss begins to address the learning of deep features that can handle unknown inputs. We present examples of its use, first on simple datasets (MNIST/CIFAR) and then on a real-world problem of open-set face recognition. We then move to another type of unknown for deep networks: adversarial examples, image perturbations that are invisible to humans but easily fool deep networks. While open-set recognition tries to deal with inputs that are "far" from known training samples, these adversarial examples are perceptually close in input space but far in feature space. This last part of the talk will discuss various potential theories about the causes of adversarial examples, why we know those theories are not correct, and why they show we don't understand deep networks. We introduce our deep-feature adversarial approach, called LOTS, and return to the examples of object recognition and face recognition, showing how our LOTS adversarial examples can successfully attack even open-set recognition systems.
Why does this happen? We believe the closed-set nature of deep networks forces them to choose from one of the known classes, leading to such artifacts. Our hypothesis is that deep networks suffer from the open space risk to which any discriminative classifier is prone. The softmax layer divides the output space into N half-spaces, each with potentially infinite volume. However, recognition in the real world is open-set, i.e. the recognition system should reject unknown/unseen classes at test time.
Gaussian Mixture Models, kernel density estimators, RBF SVMs, or Support Vector Data Descriptors (SVDD) may, but do not have to, have bounded open space risk. It depends on how they are combined and thresholded. If they include a “bias” they probably don’t solve OSR.
Hi. I am Abhijit Bendale and I will present work on Towards Open-Set Deep Networks. This is joint work with Prof. Terrance Boult at University of Colorado.
To explore the approaches, we used LeNet++, which has a 2D feature space just before the SoftMax classifier.
Observations leading to our NeurIPS 18 spotlight paper.
Performance of the two new losses (red and green) is significantly better than any prior approach.