Experiments with Randomisation
 and Boosting for Multi-instance
        Classification
   Luke Bjerring, James Foulds, Eibe Frank
            University of Waikato




                       September 13, 2011
What's in this talk?
    • What is multi-instance learning?
    • Basic multi-instance data format in WEKA
    • The standard assumption in multi-instance learning
    • Learning decision trees and rules
    • Ensembles using randomisation
    • Diverse density learning
    • Boosting diverse density learning
    • Experimental comparison
    • Conclusions


© THE UNIVERSITY OF WAIKATO • TE WHARE WANANGA O WAIKATO   09/13/11   2
Multi-instance learning
    • Generalized (supervised) learning scenario where each
      example for learning is a bag of instances

           [Figure: a single-instance model maps one feature vector to a
           classification; a multi-instance model maps multiple feature
           vectors (a bag) to a classification.
           Figure based on diagram in Dietterich et al (1997)]


Example applications
    • Applicable whenever an object can best be represented
      as an unordered collection of instances
    • Two popular application areas in the literature:
              − Image classification (e.g. does an image contain a tiger?)
                        • Approach: image is split into regions, each region becomes
                          an instance described by a fixed-length feature vector
                        • Motivation for MI learning: location of object not important for
                          classification, some “key” regions determine outcome
              − Activity of molecules (e.g. does molecule smell musky?)
                        • Approach: instances describe possible conformations in 3D
                          space, based on fixed-length feature vector
                        • Motivation for MI learning: conformations cannot easily be
                          ordered, only some responsible for activity

Multi-instance data in WEKA
  • Bag of data given as value of relation-valued attribute




          [Screenshot: multi-instance ARFF data, showing the bag identifier,
          the relation-valued attribute holding the instances in the bag,
          and the class label]



What's the big deal?
  • Multi-instance learning is challenging because instance-
    level classifications are assumed to be unknown
            − Algorithm is told that an image contains a tiger, but not which
              regions are “tiger-like”
            − Similarly, a molecule is known to be active (or inactive), but
              algorithm is not told which conformation is responsible for this

  • Basic (standard) assumption in MI learning: bag is
    positive iff it contains at least one positive instance
            − Example: molecule is active if at least one conformation is active,
              and inactive otherwise

  • Generalizations of this are possible that assume
    interactions between instances in a bag
  • Alternative: instances contribute collectively to bag label
A synthetic example
  • 10 positive/negative bags, 10 instances per bag




A synthetic example
  • Bag positive iff at least one instance in (0.4,0.6)x(0.4,0.6)
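This synthetic setup is easy to reproduce. A minimal generator (a sketch: the bag size and the target region match the slides, everything else, including the function names, is illustrative):

```python
import random

def make_bag(n_instances=10, rng=random):
    # Instances are drawn uniformly from the unit square.
    return [(rng.random(), rng.random()) for _ in range(n_instances)]

def bag_label(bag):
    # Standard MI assumption: the bag is positive iff at least one
    # instance falls inside the region (0.4, 0.6) x (0.4, 0.6).
    return any(0.4 < x < 0.6 and 0.4 < y < 0.6 for x, y in bag)

rng = random.Random(1)
bags = [make_bag(rng=rng) for _ in range(20)]
labels = [bag_label(b) for b in bags]
```

Note that only the bag-level labels are observable to the learner; which instance made a bag positive stays hidden.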




Assigning bag labels to instances...
  • 100 positive/negative bags, 10 instances per bag




Partitioning generated by C4.5
  • Many leaf nodes, only one of them matters...




Blockeel et al.'s MITI tree learner
  • Idea: home in on big positive leaf node, remove
    instances associated with that leaf node
                               y <= 0.3942 : 443 [0 / 443] (-)
                               y > 0.3942 : 1189
                               |   y <= 0.6004 : 418
                               |   |   x <= 0.6000 : 262
                               |   |   |   x <= 0.3676 : 59 [0 / 59] (-)
                               |   |   |   x > 0.3676 : 128
                               |   |   |   |   x <= 0.3975 : 2 [0 / 2] (-)
                               |   |   |   |   x > 0.3975 : 118
                               |   |   |   |   |   y <= 0.3989 : 1 [0 / 1] (-)
                               |   |   |   |   |   y > 0.3989 : 116 [116 / 0] (+)
                               |   |   x > 0.6000 : 88 [0 / 88] (-)
                               |   y > 0.6004 : 407 [0 / 407] (-)

How MITI works
  • Two key modifications compared to standard top-down
    decision tree inducers:
            − Nodes are expanded in best-first manner, based on proportion of
              positive instances (→ identify positive leaf nodes early)
            − Once a positive leaf node has been found, all bags associated
              with this leaf node are removed from the training data
              (→ all other instances in these bags are irrelevant)

  • Blockeel et al. also use special purpose splitting
    criterion and biased estimate of proportion of positives
  • Our experiments indicate that it is better to use Gini
    index and unbiased estimate of proportion
            →Trees are generally slightly more accurate and
             substantially smaller (also affects runtime)
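The two modifications can be sketched in a toy Python version. This is not the WEKA implementation; the function names, data layout, and the unweighted Gini split search are invented for illustration. The tree is grown best-first by proportion of positives, and a pure positive leaf deactivates all bags it covers:

```python
import heapq
import numpy as np

def best_gini_split(X, y, idx):
    # Exhaustive Gini-index split search over all attributes and
    # thresholds for the instances in `idx`.
    def gini(sub):
        p = y[sub].mean()
        return 2.0 * p * (1.0 - p)
    best_score, best = float("inf"), (0, X[idx, 0].min())
    for a in range(X.shape[1]):
        vals = np.unique(X[idx, a])
        for t in vals[:-1]:
            left, right = idx[X[idx, a] <= t], idx[X[idx, a] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(idx)
            if score < best_score:
                best_score, best = score, (a, t)
    return best

def miti(X, bag_ids, y_bag):
    # Best-first expansion: the priority queue is ordered by the
    # proportion of positive instances, so mostly-positive nodes are
    # expanded early.  Each positive leaf is recorded with its path.
    X, bag_ids = np.asarray(X, float), np.asarray(bag_ids)
    y = np.array([y_bag[b] for b in bag_ids])
    active = np.ones(len(X), dtype=bool)
    counter = 0
    heap = [(-y.mean(), counter, np.arange(len(X)), [])]
    positive_leaves = []
    while heap:
        _, _, idx, path = heapq.heappop(heap)
        idx = idx[active[idx]]                 # drop deactivated instances
        if len(idx) == 0 or not y[idx].any():  # empty or pure negative
            continue
        if y[idx].all():                       # pure positive leaf found:
            positive_leaves.append((list(idx), path))
            active &= ~np.isin(bag_ids, bag_ids[idx])  # deactivate its bags
            continue
        a, t = best_gini_split(X, y, idx)
        left, right = idx[X[idx, a] <= t], idx[X[idx, a] > t]
        if len(left) == 0 or len(right) == 0:
            continue
        for part, step in ((left, (a, t, "<=")), (right, (a, t, ">"))):
            counter += 1
            heapq.heappush(heap, (-y[part].mean(), counter, part, path + [step]))
    return positive_leaves
```

The deactivation step is the multi-instance twist: once one instance of a bag has explained the bag's positive label, the bag's other instances carry no further information under the standard assumption.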

Learning rules: MIRI
  • Conceptual drawback of MITI tree learner: deactivated
    data may have already been used to grow other branches
  • Simple fix based on separate-and-conquer rule learning
    using partial trees:
            ‒ When positive leaf is found, make the path to this leaf into an if-then
              rule, discard the rest of the tree
            ‒ Start (partial) tree generation from scratch on the remaining data to
              generate the next rule
            ‒ Stop when no positive leaf can be made; add default rule

  • Experiments show: resulting rule learner (MIRI) has
    similar classification accuracy to MITI
  • However: rule sets are much more compact than
    corresponding decision trees
Random forests for MI learning
  • Random forests are well-known to be high-performance
    ensemble classifiers in single-instance learning
  • Straightforward to adapt MITI to learn semi-random
    decision trees from multi-instance data
            – At each node, choose random fixed-size subset of
              attributes, then choose best split amongst those
            – Also possible to apply semi-random node expansion (not
              best-first), but this yields little benefit
  • Can trivially apply this to MIRI rule learning as well: it's
    based on partially grown MITI trees
  • Ensemble can be generated in WEKA using
    RandomCommittee meta classifier
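The two ingredients can be sketched as follows (an illustrative Python interface; the actual implementation is WEKA's Java RandomCommittee over semi-random MITI/MIRI base learners):

```python
import random

def semi_random_split_attributes(n_attributes, k, rng):
    # Instead of searching all attributes at a node, draw a fixed-size
    # random subset and search for the best split only among these.
    return sorted(rng.sample(range(n_attributes), k))

def committee_predict(trees, bag):
    # Majority vote over the ensemble; each `tree` here is any
    # callable mapping a bag to 0/1 (hypothetical interface).
    votes = sum(tree(bag) for tree in trees)
    return int(2 * votes >= len(trees))
```

Because each ensemble member sees different random attribute subsets, the trees are diverse even though they are trained on the same bags, which is what makes the vote effective.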

Some experimental results: MITI




Some experimental results: MIRI




Maron's diverse density learning
  • Idea: identify point x in instance space where positive
    bags overlap, centre bell-shaped function at this point
  • Using this function, the probability that instance Bij is positive,
    based on the current hypothesis h, is assumed to be:

        Pr(Bij positive | h) = exp( − Σk sk² (Bijk − xk)² )

    where hypothesis h includes location x, but also a feature
    scaling vector s

  • Instance-level probabilities are turned into bag-level
    probabilities using the noisy-or function:

        Pr(Bi positive | h) = 1 − Πj ( 1 − Pr(Bij positive | h) )


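The bell-shaped instance probability and the noisy-or combination translate directly into code; a minimal sketch in plain Python (function names are illustrative, not the WEKA API):

```python
import math

def instance_prob(inst, x, s):
    # Bell-shaped function centred at the target point x, with one
    # scaling factor per feature: exp(-sum_k (s_k * (inst_k - x_k))^2).
    return math.exp(-sum((sk * (ik - xk)) ** 2
                         for ik, xk, sk in zip(inst, x, s)))

def bag_prob(bag, x, s):
    # Noisy-or: the bag is positive unless every one of its
    # instances fails to be positive.
    neg = 1.0
    for inst in bag:
        neg *= 1.0 - instance_prob(inst, x, s)
    return 1.0 - neg
```

A single instance right at x pushes the bag probability to 1 regardless of how negative the other instances look, which is exactly the standard MI assumption in soft form.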
Boosting diverse density learning
  • Point x and scaling vector s are found using gradient
    descent by maximising bag-level likelihood
  • Problem: very slow, since gradient descent takes a long time to converge
  • QuickDD heuristic: find best point x first, using fixed
    scaling vector s, then optimise s; if necessary, iterate
  • Much faster, similar accuracy on benchmark data (also,
    compares favourably to subsampling-based EMDD)
  • Makes it computationally practical to apply boosting
    (RealAdaboost) to improve accuracy:
            – In this case, QuickDD is applied with weighted likelihood,
              symmetric learning, and localised model
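A toy rendering of the QuickDD heuristic; here the candidate locations are restricted to the training instances and the optimisation of s is a crude per-feature grid search rather than gradient descent (both simplifications of the actual method):

```python
import math

def _bag_prob(bag, x, s):
    # Noisy-or over bell-shaped instance probabilities, as in
    # standard diverse density.
    neg = 1.0
    for inst in bag:
        d = sum((sk * (ik - xk)) ** 2 for ik, xk, sk in zip(inst, x, s))
        neg *= 1.0 - math.exp(-d)
    return 1.0 - neg

def _log_likelihood(bags, labels, x, s, eps=1e-12):
    # Bag-level log-likelihood of the hypothesis (x, s).
    ll = 0.0
    for bag, y in zip(bags, labels):
        p = min(max(_bag_prob(bag, x, s), eps), 1.0 - eps)
        ll += math.log(p) if y else math.log(1.0 - p)
    return ll

def quickdd(bags, labels, s0=1.0, scales=(0.5, 1.0, 2.0, 4.0)):
    # Phase 1: with the scaling vector fixed, try every training
    # instance as the candidate point x and keep the best.
    n_features = len(bags[0][0])
    s = [s0] * n_features
    candidates = [tuple(inst) for bag in bags for inst in bag]
    x = max(candidates, key=lambda c: _log_likelihood(bags, labels, c, s))
    # Phase 2: optimise s for the chosen x, here feature by feature
    # over a small grid (a stand-in for the gradient-based step).
    for k in range(n_features):
        s[k] = max(scales,
                   key=lambda v: _log_likelihood(bags, labels, x,
                                                 s[:k] + [v] + s[k + 1:]))
    return x, s
```

Decoupling the search for x from the search for s is what makes the heuristic fast: each phase is a cheap one-dimensional-ish search instead of a joint gradient descent over all parameters.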


Some experimental results: Boosted DD




So how do the ensembles compare?




But: improvement on “naive” methods?
  • Can apply standard single-instance random forests to
    multi-instance data using data transformations...
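One common "naive" route is to transform each bag into a single fixed-length vector of per-attribute summary statistics, so a standard single-instance random forest applies directly. This is one typical propositionalisation; the slides do not specify which transformation was used:

```python
def bag_to_vector(bag):
    # Summarise each bag by per-attribute min, max and mean,
    # yielding one fixed-length feature vector per bag.
    cols = list(zip(*bag))
    vec = []
    for c in cols:
        vec += [min(c), max(c), sum(c) / len(c)]
    return vec
```

After the transformation, the bag labels become ordinary example labels and any single-instance learner can be trained on the resulting table.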




Summary
  • MITI and MIRI are fast methods for learning compact
    decision trees and rule sets for MI data
  • Randomisation for ensemble learning yields significantly
    improved accuracy in both cases
  • Heuristic QuickDD variant of diverse density learning
    makes it computationally practical to boost DD learning
  • Boosting yields substantially improved accuracy
  • Neither boosting nor randomisation has clear advantage
    in accuracy, but randomisation is much faster
  • However: only a marginal improvement in accuracy compared
    to “naive” methods

Where in WEKA?
  • Location of multi-instance learners in Explorer GUI:




  • Available via package manager in WEKA 3.7, which
    also provides MITI, MIRI, and QuickDD
Details on QuickDD for RealAdaboost
  • Weights in RealAdaboost are updated using odds ratio:


  • Weighted conditional likelihood is used in QuickDD:


  • QuickDD model is thresholded at 0.5 probability to achieve
    local effect on weight updates:


  • Symmetric learning is applied (i.e. both classes are tried as the
    positive class in turn)
            – Of the two models, the one that maximises weighted
              conditional likelihood is added into the ensemble
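For reference, the generic Real AdaBoost weight update based on the odds ratio (the standard formulation of Friedman et al.; the slide's weighted, thresholded variant may differ in detail):

```python
import math

def real_adaboost_weight_update(weights, probs, labels):
    # Standard Real AdaBoost: the base model outputs p = Pr(y=+1 | x),
    # its contribution is half the log odds ratio, and each example
    # weight is multiplied by exp(-y * f(x)) with y in {-1, +1}.
    new = []
    for w, p, y in zip(weights, probs, labels):
        p = min(max(p, 1e-12), 1.0 - 1e-12)   # guard against log(0)
        f = 0.5 * math.log(p / (1.0 - p))     # half log odds ratio
        new.append(w * math.exp(-y * f))
    z = sum(new)
    return [w / z for w in new]               # renormalise to sum to 1
```

Misclassified bags (label disagreeing with the model's odds) have their weights inflated, so the next QuickDD round concentrates on them.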


Random forest vs. bagging and boosting





More Related Content

Similar to Experiments with Randomisation and Boosting for Multi-instance Classification

Initial Experiments on Learning Based Randomized Bin-Picking Allowing Finger...
Initial Experiments on Learning Based Randomized Bin-Picking Allowing Finger...Initial Experiments on Learning Based Randomized Bin-Picking Allowing Finger...
Initial Experiments on Learning Based Randomized Bin-Picking Allowing Finger...
Kensuke Harada
 
Classification.pptx
Classification.pptxClassification.pptx
Classification.pptx
Dr. Amanpreet Kaur
 
Process Mining - Chapter 3 - Data Mining
Process Mining - Chapter 3 - Data MiningProcess Mining - Chapter 3 - Data Mining
Process Mining - Chapter 3 - Data Mining
Wil van der Aalst
 
Process mining chapter_03_data_mining
Process mining chapter_03_data_miningProcess mining chapter_03_data_mining
Process mining chapter_03_data_mining
Muhammad Ajmal
 
Data mining Basics and complete description onword
Data mining Basics and complete description onwordData mining Basics and complete description onword
Data mining Basics and complete description onword
Sulman Ahmed
 
Mini datathon
Mini datathonMini datathon
Mini datathon
Kunal Jain
 
Weka bike rental
Weka bike rentalWeka bike rental
Weka bike rental
Pratik Doshi
 
Activity Monitoring Using Wearable Sensors and Smart Phone
Activity Monitoring Using Wearable Sensors and Smart PhoneActivity Monitoring Using Wearable Sensors and Smart Phone
Activity Monitoring Using Wearable Sensors and Smart Phone
DrAhmedZoha
 
Multi-class Classification on Riemannian Manifolds for Video Surveillance
Multi-class Classification on Riemannian Manifolds for Video SurveillanceMulti-class Classification on Riemannian Manifolds for Video Surveillance
Multi-class Classification on Riemannian Manifolds for Video Surveillance
Diego Tosato
 
Business intelligence and data warehousing
Business intelligence and data warehousingBusiness intelligence and data warehousing
Business intelligence and data warehousing
Vaishnavi
 
in5490-classification (1).pptx
in5490-classification (1).pptxin5490-classification (1).pptx
in5490-classification (1).pptx
MonicaTimber
 
Improving the Model’s Predictive Power with Ensemble Approaches
Improving the Model’s Predictive Power with Ensemble ApproachesImproving the Model’s Predictive Power with Ensemble Approaches
Improving the Model’s Predictive Power with Ensemble Approaches
SAS Asia Pacific
 
Machine learning ppt.
Machine learning ppt.Machine learning ppt.
Machine learning ppt.
ASHOK KUMAR
 
Machine Learning for Everyone
Machine Learning for EveryoneMachine Learning for Everyone
Machine Learning for Everyone
Aly Abdelkareem
 
SAX-VSM
SAX-VSMSAX-VSM
SAX-VSM
Pavel Senin
 
Chapter19
Chapter19Chapter19
Chapter19
Ying Liu
 
Software Methods for Sustainable Solutions
Software Methods for Sustainable SolutionsSoftware Methods for Sustainable Solutions
Software Methods for Sustainable Solutions
BIOVIA
 
Software Methods for Sustainable Solutions
Software Methods for Sustainable SolutionsSoftware Methods for Sustainable Solutions
Software Methods for Sustainable Solutions
George Fitzgerald
 
Distance-based bias in model-directed optimization of additively decomposable...
Distance-based bias in model-directed optimization of additively decomposable...Distance-based bias in model-directed optimization of additively decomposable...
Distance-based bias in model-directed optimization of additively decomposable...
Martin Pelikan
 
Data Mining Module 3 Business Analtics..pdf
Data Mining Module 3 Business Analtics..pdfData Mining Module 3 Business Analtics..pdf
Data Mining Module 3 Business Analtics..pdf
Jayanti Pande
 

Similar to Experiments with Randomisation and Boosting for Multi-instance Classification (20)

Initial Experiments on Learning Based Randomized Bin-Picking Allowing Finger...
Initial Experiments on Learning Based Randomized Bin-Picking Allowing Finger...Initial Experiments on Learning Based Randomized Bin-Picking Allowing Finger...
Initial Experiments on Learning Based Randomized Bin-Picking Allowing Finger...
 
Classification.pptx
Classification.pptxClassification.pptx
Classification.pptx
 
Process Mining - Chapter 3 - Data Mining
Process Mining - Chapter 3 - Data MiningProcess Mining - Chapter 3 - Data Mining
Process Mining - Chapter 3 - Data Mining
 
Process mining chapter_03_data_mining
Process mining chapter_03_data_miningProcess mining chapter_03_data_mining
Process mining chapter_03_data_mining
 
Data mining Basics and complete description onword
Data mining Basics and complete description onwordData mining Basics and complete description onword
Data mining Basics and complete description onword
 
Mini datathon
Mini datathonMini datathon
Mini datathon
 
Weka bike rental
Weka bike rentalWeka bike rental
Weka bike rental
 
Activity Monitoring Using Wearable Sensors and Smart Phone
Activity Monitoring Using Wearable Sensors and Smart PhoneActivity Monitoring Using Wearable Sensors and Smart Phone
Activity Monitoring Using Wearable Sensors and Smart Phone
 
Multi-class Classification on Riemannian Manifolds for Video Surveillance
Multi-class Classification on Riemannian Manifolds for Video SurveillanceMulti-class Classification on Riemannian Manifolds for Video Surveillance
Multi-class Classification on Riemannian Manifolds for Video Surveillance
 
Business intelligence and data warehousing
Business intelligence and data warehousingBusiness intelligence and data warehousing
Business intelligence and data warehousing
 
in5490-classification (1).pptx
in5490-classification (1).pptxin5490-classification (1).pptx
in5490-classification (1).pptx
 
Improving the Model’s Predictive Power with Ensemble Approaches
Improving the Model’s Predictive Power with Ensemble ApproachesImproving the Model’s Predictive Power with Ensemble Approaches
Improving the Model’s Predictive Power with Ensemble Approaches
 
Machine learning ppt.
Machine learning ppt.Machine learning ppt.
Machine learning ppt.
 
Machine Learning for Everyone
Machine Learning for EveryoneMachine Learning for Everyone
Machine Learning for Everyone
 
SAX-VSM
SAX-VSMSAX-VSM
SAX-VSM
 
Chapter19
Chapter19Chapter19
Chapter19
 
Software Methods for Sustainable Solutions
Software Methods for Sustainable SolutionsSoftware Methods for Sustainable Solutions
Software Methods for Sustainable Solutions
 
Software Methods for Sustainable Solutions
Software Methods for Sustainable SolutionsSoftware Methods for Sustainable Solutions
Software Methods for Sustainable Solutions
 
Distance-based bias in model-directed optimization of additively decomposable...
Distance-based bias in model-directed optimization of additively decomposable...Distance-based bias in model-directed optimization of additively decomposable...
Distance-based bias in model-directed optimization of additively decomposable...
 
Data Mining Module 3 Business Analtics..pdf
Data Mining Module 3 Business Analtics..pdfData Mining Module 3 Business Analtics..pdf
Data Mining Module 3 Business Analtics..pdf
 

More from LARCA UPC

Spectral Learning Methods for Finite State Machines with Applications to Na...
  Spectral Learning Methods for Finite State Machines with Applications to Na...  Spectral Learning Methods for Finite State Machines with Applications to Na...
Spectral Learning Methods for Finite State Machines with Applications to Na...
LARCA UPC
 
A query language for analyzing networks
A query language for analyzing networksA query language for analyzing networks
A query language for analyzing networks
LARCA UPC
 
A discussion on sampling graphs to approximate network classification functions
A discussion on sampling graphs to approximate network classification functionsA discussion on sampling graphs to approximate network classification functions
A discussion on sampling graphs to approximate network classification functions
LARCA UPC
 
Overlapping correlation clustering
Overlapping correlation clusteringOverlapping correlation clustering
Overlapping correlation clustering
LARCA UPC
 
Machine Learning Application Development
Machine Learning Application DevelopmentMachine Learning Application Development
Machine Learning Application Development
LARCA UPC
 
Semi-random model tree ensembles: an effective and scalable regression method
Semi-random model tree ensembles: an effective and scalable regression method Semi-random model tree ensembles: an effective and scalable regression method
Semi-random model tree ensembles: an effective and scalable regression method
LARCA UPC
 
Distributed clustering from data streams
Distributed clustering from data streamsDistributed clustering from data streams
Distributed clustering from data streams
LARCA UPC
 
Adaptive pre-processing for streaming data
Adaptive pre-processing for streaming dataAdaptive pre-processing for streaming data
Adaptive pre-processing for streaming data
LARCA UPC
 

More from LARCA UPC (8)

Spectral Learning Methods for Finite State Machines with Applications to Na...
  Spectral Learning Methods for Finite State Machines with Applications to Na...  Spectral Learning Methods for Finite State Machines with Applications to Na...
Spectral Learning Methods for Finite State Machines with Applications to Na...
 
A query language for analyzing networks
A query language for analyzing networksA query language for analyzing networks
A query language for analyzing networks
 
A discussion on sampling graphs to approximate network classification functions
A discussion on sampling graphs to approximate network classification functionsA discussion on sampling graphs to approximate network classification functions
A discussion on sampling graphs to approximate network classification functions
 
Overlapping correlation clustering
Overlapping correlation clusteringOverlapping correlation clustering
Overlapping correlation clustering
 
Machine Learning Application Development
Machine Learning Application DevelopmentMachine Learning Application Development
Machine Learning Application Development
 
Semi-random model tree ensembles: an effective and scalable regression method
Semi-random model tree ensembles: an effective and scalable regression method Semi-random model tree ensembles: an effective and scalable regression method
Semi-random model tree ensembles: an effective and scalable regression method
 
Distributed clustering from data streams
Distributed clustering from data streamsDistributed clustering from data streams
Distributed clustering from data streams
 
Adaptive pre-processing for streaming data
Adaptive pre-processing for streaming dataAdaptive pre-processing for streaming data
Adaptive pre-processing for streaming data
 

Recently uploaded

20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
Zilliz
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 

Recently uploaded (20)

20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME

Experiments with Randomisation and Boosting for Multi-instance Classification

  • 1. Experiments with Randomisation and Boosting for Multi-instance Classification Luke Bjerring, James Foulds, Eibe Frank University of Waikato September 13, 2011
  • 2. What's in this talk? • What is multi-instance learning? • Basic multi-instance data format in WEKA • The standard assumption in multi-instance learning • Learning decision trees and rules • Ensembles using randomisation • Diverse density learning • Boosting diverse density learning • Experimental comparison • Conclusions © THE UNIVERSITY OF WAIKATO • TE WHARE WANANGA O WAIKATO 09/13/11 2
  • 3. Multi-instance learning • Generalized (supervised) learning scenario where each example for learning is a bag of instances [Figure, based on a diagram in Dietterich et al (1997): a single-instance model maps one feature vector to a classification; a multi-instance model maps multiple feature vectors to a single classification]
  • 4. Example applications • Applicable whenever an object can best be represented as an unordered collection of instances • Two popular application areas in the literature: − Image classification (e.g. does an image contain a tiger?) • Approach: image is split into regions, each region becomes an instance described by a fixed-length feature vector • Motivation for MI learning: location of object not important for classification, some “key” regions determine outcome − Activity of molecules (e.g. does molecule smell musky?) • Approach: instances describe possible conformations in 3D space, based on fixed-length feature vector • Motivation for MI learning: conformations cannot easily be ordered, only some responsible for activity
  • 5. Multi-instance data in WEKA • Bag of data given as value of relation-valued attribute [Screenshot: dataset view showing, for each example, the bag identifier, the instances in the bag held by the relation-valued attribute, and the class label]
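The relation-valued attribute on slide 5 can be illustrated with a small hand-written ARFF fragment (a hypothetical two-feature toy dataset, not one of the benchmark sets; each bag's instances are packed into a quoted string, one instance per `\n`-separated row):

```text
@relation toy-mi
@attribute bag-id {bag1,bag2}
@attribute bag relational
  @attribute x numeric
  @attribute y numeric
@end bag
@attribute class {0,1}

@data
bag1,"0.45,0.52\n0.91,0.13",1
bag2,"0.12,0.88\n0.30,0.25",0
```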
  • 6. What's the big deal? • Multi-instance learning is challenging because instance- level classifications are assumed to be unknown − Algorithm is told that an image contains a tiger, but not which regions are “tiger-like” − Similarly, a molecule is known to be active (or inactive), but algorithm is not told which conformation is responsible for this • Basic (standard) assumption in MI learning: bag is positive iff it contains at least one positive instance − Example: molecule is active if at least one conformation is active, and inactive otherwise • Generalizations of this are possible that assume interactions between instances in a bag • Alternative: instances contribute collectively to bag label
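The standard assumption on slide 6 is simple enough to state in one line of code (a minimal sketch; label 1 denotes a positive instance):

```python
def bag_label(instance_labels):
    # Standard MI assumption: a bag is positive iff it contains
    # at least one positive instance.
    return 1 if any(instance_labels) else 0

print(bag_label([0, 0, 1]))  # bag with one positive instance -> 1
print(bag_label([0, 0, 0]))  # all-negative bag -> 0
```

The learner only ever sees the bag-level result of `bag_label`, never the instance labels that produced it.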
  • 7. A synthetic example • 10 positive/negative bags, 10 instances per bag
  • 8. A synthetic example • Bag positive iff at least one instance in (0.4,0.6)x(0.4,0.6)
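Data like the synthetic example on slides 7-8 can be generated in a few lines (a sketch under the slide's stated rule; bag size and target region are taken from the slides, everything else is arbitrary):

```python
import random

def make_bag(rng, n_instances=10):
    # Instances are uniform points in the unit square.
    bag = [(rng.random(), rng.random()) for _ in range(n_instances)]
    # Bag is positive iff at least one instance lies in (0.4,0.6)x(0.4,0.6).
    label = any(0.4 < x < 0.6 and 0.4 < y < 0.6 for x, y in bag)
    return bag, int(label)

rng = random.Random(42)
bags = [make_bag(rng) for _ in range(20)]
print(sum(label for _, label in bags), "of", len(bags), "bags are positive")
```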
  • 9. Assigning bag labels to instances... • 100 positive/negative bags, 10 instances per bag
  • 10. Partitioning generated by C4.5 • Many leaf nodes, only one of them matters...
  • 11. Partitioning generated by C4.5 • Many leaf nodes, only one of them matters...
  • 12. Blockeel et al.'s MITI tree learner • Idea: home in on big positive leaf node, remove instances associated with that leaf node
y <= 0.3942 : 443 [0 / 443] (-)
y > 0.3942 : 1189
|   y <= 0.6004 : 418
|   |   x <= 0.6000 : 262
|   |   |   x <= 0.3676 : 59 [0 / 59] (-)
|   |   |   x > 0.3676 : 128
|   |   |   |   x <= 0.3975 : 2 [0 / 2] (-)
|   |   |   |   x > 0.3975 : 118
|   |   |   |   |   y <= 0.3989 : 1 [0 / 1] (-)
|   |   |   |   |   y > 0.3989 : 116 [116 / 0] (+)
|   |   x > 0.6000 : 88 [0 / 88] (-)
|   y > 0.6004 : 407 [0 / 407] (-)
  • 13. How MITI works • Two key modifications compared to standard top-down decision tree inducers: − Nodes are expanded in best-first manner, based on proportion of positive instances (→ identify positive leaf nodes early) − Once a positive leaf node has been found, all bags associated with this leaf node are removed from the training data (→ all other instances in these bags are irrelevant) • Blockeel et al. also use a special-purpose splitting criterion and a biased estimate of the proportion of positives • Our experiments indicate that it is better to use the Gini index and an unbiased estimate of the proportion → Trees are generally slightly more accurate and substantially smaller (also affects runtime)
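The second modification (bag deactivation) is the unusual step; a minimal sketch of just that step, with bags as dicts and a hypothetical `in_leaf` predicate standing in for membership in the newly created positive leaf:

```python
def deactivate_bags(bags, in_leaf):
    # Once a positive leaf is created, every bag with an instance in that
    # leaf is removed entirely: the bag's positive label is now explained,
    # so its remaining instances carry no further information.
    return [bag for bag in bags
            if not any(in_leaf(inst) for inst in bag["instances"])]

bags = [
    {"id": 1, "instances": [(0.5, 0.5), (0.9, 0.1)]},  # hits the leaf region
    {"id": 2, "instances": [(0.1, 0.2), (0.8, 0.9)]},
]
in_leaf = lambda p: 0.4 < p[0] < 0.6 and 0.4 < p[1] < 0.6
print([bag["id"] for bag in deactivate_bags(bags, in_leaf)])  # -> [2]
```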
  • 14. Learning rules: MIRI • Conceptual drawback of MITI tree learner: deactivated data may have already been used to grow other branches • Simple fix based on separate-and-conquer rule learning using partial trees: ‒ When positive leaf is found, make the path to this leaf into an if-then rule, discard the rest of the tree ‒ Start (partial) tree generation from scratch on the remaining data to generate the next rule ‒ Stop when no positive leaf can be made; add default rule • Experiments show: resulting rule learner (MIRI) has similar classification accuracy to MITI • However: rule sets are much more compact than corresponding decision trees
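The separate-and-conquer loop behind MIRI can be sketched as follows (a simplification, not the WEKA implementation: the hypothetical `grow_rule` helper stands in for growing a partial MITI tree to its first positive leaf, and a rule is just a predicate on instances):

```python
def separate_and_conquer(bags, grow_rule):
    rules = []
    remaining = list(bags)
    while True:
        rule = grow_rule(remaining)  # path to a positive leaf, or None
        if rule is None:
            break                    # no positive leaf can be made
        rules.append(rule)
        # Discard every bag the rule fires on: its label is explained.
        remaining = [b for b in remaining
                     if not any(rule(i) for i in b["instances"])]
    return rules                     # an implicit default rule says "negative"

# Toy grower (purely illustrative): propose a small box around a positive
# bag's instance that covers no remaining negative bag.
def toy_grow_rule(bags):
    for b in bags:
        if not b["label"]:
            continue
        for (x, y) in b["instances"]:
            rule = lambda p, x=x, y=y: abs(p[0] - x) < 0.05 and abs(p[1] - y) < 0.05
            if not any(any(rule(i) for i in nb["instances"])
                       for nb in bags if not nb["label"]):
                return rule
    return None

bags = [
    {"label": 1, "instances": [(0.50, 0.50), (0.90, 0.90)]},
    {"label": 1, "instances": [(0.51, 0.49), (0.10, 0.90)]},
    {"label": 0, "instances": [(0.90, 0.90), (0.10, 0.90)]},
]
rules = separate_and_conquer(bags, toy_grow_rule)
print(len(rules))  # one rule covers both positive bags here
```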
  • 15. Random forests for MI learning • Random forests are well-known to be high-performance ensemble classifiers in single-instance learning • Straightforward to adapt MITI to learn semi-random decision trees from multi-instance data – At each node, choose random fixed-size subset of attributes, then choose best split amongst those – Also possible to apply semi-random node expansion (not best-first), but this yields little benefit • Can trivially apply this to MIRI rule learning as well: it's based on partially grown MITI trees • Ensemble can be generated in WEKA using RandomCommittee meta classifier
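The randomised node expansion is the standard random-forest device; a sketch of choosing the best split among a random fixed-size attribute subset, scored by weighted Gini impurity as in our MITI variant (helper names are ours, not WEKA's):

```python
import random

def gini(pos, neg):
    n = pos + neg
    if n == 0:
        return 0.0
    p = pos / n
    return 2.0 * p * (1.0 - p)

def best_random_split(instances, labels, k, rng):
    # At each node, consider only a random subset of k attributes, then
    # take the best threshold among them by weighted Gini impurity.
    n_attr = len(instances[0])
    best = None  # (weighted impurity, attribute index, threshold)
    for a in rng.sample(range(n_attr), k):
        values = sorted({inst[a] for inst in instances})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for inst, y in zip(instances, labels) if inst[a] <= t]
            right = [y for inst, y in zip(instances, labels) if inst[a] > t]
            score = (gini(sum(left), len(left) - sum(left)) * len(left) +
                     gini(sum(right), len(right) - sum(right)) * len(right))
            if best is None or score < best[0]:
                best = (score, a, t)
    return best

instances = [(0.1, 5.0), (0.2, 6.0), (0.8, 5.0), (0.9, 6.0)]
labels = [0, 0, 1, 1]
print(best_random_split(instances, labels, k=2, rng=random.Random(7)))
# the pure split: attribute 0 at threshold 0.5
```

Drawing a fresh subset at every node (with a different seed per ensemble member) is what makes the trees diverse enough for RandomCommittee to help.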
  • 16. Some experimental results: MITI
  • 17. Some experimental results: MIRI
  • 18. Maron's diverse density learning • Idea: identify point x in instance space where positive bags overlap, centre bell-shaped function at this point • Using this function, probability that instance Bij is positive, based on current hypothesis h, is assumed to be: Pr(Bij positive | h) = exp(−Σk sk²·(Bijk − xk)²), where hypothesis h includes location x, but also a feature scaling vector s: h = (x, s) • Instance-level probabilities are turned into bag-level probabilities using noisy-or function: Pr(bag Bi positive | h) = 1 − Πj (1 − Pr(Bij positive | h))
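Slide 18's two probability functions translate directly into code; a sketch with variable names following the slide (point x, scaling vector s, instance Bij):

```python
import math

def instance_prob(inst, x, s):
    # Bell-shaped function centred at point x with per-feature scaling s:
    # Pr(instance positive | h) = exp(-sum_k s_k^2 * (B_ijk - x_k)^2)
    return math.exp(-sum(sk * sk * (bk - xk) ** 2
                         for bk, xk, sk in zip(inst, x, s)))

def bag_prob(bag, x, s):
    # Noisy-or: the bag is positive if at least one instance is.
    q = 1.0
    for inst in bag:
        q *= 1.0 - instance_prob(inst, x, s)
    return 1.0 - q

x, s = (0.5, 0.5), (1.0, 1.0)
print(instance_prob((0.5, 0.5), x, s))           # instance exactly at x -> 1.0
print(bag_prob([(0.0, 0.0), (0.5, 0.5)], x, s))  # -> 1.0, one instance hits x
```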
  • 19. Boosting diverse density learning • Point x and scaling vector s are found using gradient descent by maximising bag-level likelihood • Problem: very slow; takes a very long time to converge • QuickDD heuristic: find best point x first, using fixed scaling vector s, then optimise s; if necessary, iterate • Much faster, similar accuracy on benchmark data (also, compares favourably to subsampling-based EMDD) • Makes it computationally practical to apply boosting (RealAdaboost) to improve accuracy: – In this case, QuickDD is applied with weighted likelihood, symmetric learning, and localised model
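The first QuickDD stage can be sketched as a search over the training instances as candidate locations for x, with s held fixed (our own simplification of the heuristic; `bag_prob` implements the noisy-or model of slide 18):

```python
import math

def bag_prob(bag, x, s):
    # Noisy-or over bell-shaped instance probabilities (slide 18's model).
    q = 1.0
    for inst in bag:
        q *= 1.0 - math.exp(-sum(sk * sk * (b - xk) ** 2
                                 for b, xk, sk in zip(inst, x, s)))
    return 1.0 - q

def quickdd_point_search(bags, labels, s, eps=1e-9):
    # Stage 1 of QuickDD (sketch): with scaling s fixed, try placing the
    # target point x at every training instance and keep the placement
    # maximising the bag-level log-likelihood. Stage 2 (not shown) would
    # then optimise s with x fixed, iterating if necessary.
    best_x, best_ll = None, -math.inf
    for bag in bags:
        for cand in bag:
            ll = 0.0
            for b, y in zip(bags, labels):
                p = min(max(bag_prob(b, cand, s), eps), 1.0 - eps)
                ll += math.log(p) if y else math.log(1.0 - p)
            if ll > best_ll:
                best_x, best_ll = cand, ll
    return best_x, best_ll

bags = [[(0.5, 0.5), (0.9, 0.1)], [(0.45, 0.55)], [(0.9, 0.9)], [(0.1, 0.1)]]
labels = [1, 1, 0, 0]
x, _ = quickdd_point_search(bags, labels, s=(5.0, 5.0))
print(x)  # a candidate from the region where the positive bags overlap
```

Restricting x to training instances turns a slow gradient descent into a single pass over the data, which is why boosting on top of it becomes affordable.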
  • 20. Some experimental results: Boosted DD
  • 21. So how do the ensembles compare?
  • 22. But: improvement on “naive” methods? • Can apply standard single-instance random forests to multi-instance data using data transformations...
  • 23. Summary • MITI and MIRI are fast methods for learning compact decision trees and rule sets for MI data • Randomisation for ensemble learning yields significantly improved accuracy in both cases • Heuristic QuickDD variant of diverse density learning makes it computationally practical to boost DD learning • Boosting yields substantially improved accuracy • Neither boosting nor randomisation has clear advantage in accuracy, but randomisation is much faster • However: marginal improvement in accuracy compared to “naive” methods
  • 24. Where in WEKA? • Location of multi-instance learners in Explorer GUI: [Screenshot: classifier chooser showing the weka.classifiers.mi package] • Available via package manager in WEKA 3.7, which also provides MITI, MIRI, and QuickDD
  • 25. Details on QuickDD for RealAdaboost • Weights in RealAdaboost are updated using odds ratio: wi ← wi · (Pr(+|Bi) / (1 − Pr(+|Bi)))^(−yi/2), with yi ∈ {−1,+1} • Weighted conditional likelihood is used in QuickDD: Σi wi log Pr(yi | Bi, h) • QuickDD model is thresholded at 0.5 probability to achieve local effect on weight updates • Symmetric learning is applied (i.e. both classes are tried as the positive class in turn) – Of the two models, the one that maximises weighted conditional likelihood is added into the ensemble
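The weight update on slide 25 is the standard RealAdaboost one; a sketch in our notation (y ∈ {−1,+1}, p the base model's probability of the positive class for the bag):

```python
import math

def updated_weight(w, y, p, eps=1e-12):
    # The base model contributes half log-odds f = 0.5*ln(p/(1-p)); the
    # update w <- w * exp(-y*f) is multiplication by the odds ratio
    # p/(1-p) raised to the power -y/2.
    p = min(max(p, eps), 1.0 - eps)
    f = 0.5 * math.log(p / (1.0 - p))
    return w * math.exp(-y * f)

print(updated_weight(1.0, +1, 0.9))  # confidently correct: weight shrinks
print(updated_weight(1.0, -1, 0.9))  # confidently wrong: weight grows
```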
  • 26. Random forest vs. bagging and boosting