SlideShare a Scribd company logo
1 of 9
Download to read offline
Natural Language Processing Labs. By Daanv
2019.07.16
Situation Recognition
Natural Language Processing Labs. By Daanv
Abstract
The problem of producing a concise summary of the situation an image depicts
1) The main activity
2) The participating actors (objects)
3) The roles these participants play in the activity
https://framenet.icsi.berkeley.edu/fndrupal/
FrameNet(Frame Semantics)
: building a lexical database of English that is
both human and machine readable,
based on annotating examples of how words are used in actual texts.
Natural Language Processing Labs. By Daanv
Formal Task Definition
- verb(V) , nouns(N), frames(F)
Ex) Rf = {(agent, boy), (source, cliff), (obstacle, Ø), (destination, water), (place, lake)}
> Predict a situation, S = (v, Rf)
Predict: S = (jumping, {(agent, boy), (source, cliff), (obstacle, Ø), (destination, water), (place, lake)})
Natural Language Processing Labs. By Daanv
Situation Recognition
This task compare performance to baselines that independently recognize activities and objects.
Dataset
> imSitu
Images labeled with situations (over 500 verbs with 125,000 images.)
Metrics
> Accuracy
The evaluation data has situations provided by multiple annotators.
So, the accuracy is high
when the verb prediction(verb) and semantic role-value pair predictions(value) matches one of the annotations.
Systems
> CRF (feature: ImageNet, wordNet) with VGG
- Stochastic gradient descent
- Batch size: 192 / epochs: 30 / learning rate: 1e-5
Natural Language Processing Labs. By Daanv
Situation Recognition
The number of semantic roles a noun can participate in, on a log-scale.
Natural Language Processing Labs. By Daanv
Situation Recognition
The number of nouns that can participate in a sample of semantic roles.
Natural Language Processing Labs. By Daanv
Situation Recognition
The number of nouns that appear with a sample of verbs.
Natural Language Processing Labs. By Daanv
Situation Recognition
Quantitative Results
Qualitative Results
Semantic roles Values for those roles
activity
The standard annotated data The output from CRF model when it correctly predicted the activity
Natural Language Processing Labs. By Daanv
THANK YOU.

More Related Content

Similar to Situation recognition visual semantic role labeling for image understanding

SNLI_presentation_2
SNLI_presentation_2SNLI_presentation_2
SNLI_presentation_2Viral Gupta
 
Ch03 Mining Massive Data Sets stanford
Ch03 Mining Massive Data Sets  stanfordCh03 Mining Massive Data Sets  stanford
Ch03 Mining Massive Data Sets stanfordSakthivel C R
 
RCOMM 2011 - Sentiment Classification with RapidMiner
RCOMM 2011 - Sentiment Classification with RapidMinerRCOMM 2011 - Sentiment Classification with RapidMiner
RCOMM 2011 - Sentiment Classification with RapidMinerbohanairl
 
RCOMM 2011 - Sentiment Classification
RCOMM 2011 - Sentiment ClassificationRCOMM 2011 - Sentiment Classification
RCOMM 2011 - Sentiment Classificationbohanairl
 
Language Technology Enhanced Learning
Language Technology Enhanced LearningLanguage Technology Enhanced Learning
Language Technology Enhanced Learningtelss09
 
Tg noh jeju_workshop
Tg noh jeju_workshopTg noh jeju_workshop
Tg noh jeju_workshopTae-Gil Noh
 
Filtering Inaccurate Entity Co-references on the Linked Open Data
Filtering Inaccurate Entity Co-references on the Linked Open DataFiltering Inaccurate Entity Co-references on the Linked Open Data
Filtering Inaccurate Entity Co-references on the Linked Open Dataebrahim_bagheri
 
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESTHE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESkevig
 
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESTHE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESkevig
 
R Analytics in the Cloud
R Analytics in the CloudR Analytics in the Cloud
R Analytics in the CloudDataMine Lab
 
Translated learning
Translated learningTranslated learning
Translated learningSOYEON KIM
 
A Distributed Architecture System for Recognizing Textual Entailment
A Distributed Architecture System for Recognizing Textual EntailmentA Distributed Architecture System for Recognizing Textual Entailment
A Distributed Architecture System for Recognizing Textual EntailmentFaculty of Computer Science
 
An Effective Rule Miner for Instance Matching in a Web of Data
An Effective Rule Miner for Instance Matching in a Web of DataAn Effective Rule Miner for Instance Matching in a Web of Data
An Effective Rule Miner for Instance Matching in a Web of DataXing Niu
 
Reconciling Event-Based Knowledge through RDF2VEC
Reconciling Event-Based Knowledge through RDF2VECReconciling Event-Based Knowledge through RDF2VEC
Reconciling Event-Based Knowledge through RDF2VECMehwish Alam
 
Logic Programming and Prolog
Logic Programming and PrologLogic Programming and Prolog
Logic Programming and PrologSadegh Dorri N.
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Bhaskar Mitra
 
Latent Relational Model for Relation Extraction
Latent Relational Model for Relation ExtractionLatent Relational Model for Relation Extraction
Latent Relational Model for Relation ExtractionGaetano Rossiello, PhD
 

Similar to Situation recognition visual semantic role labeling for image understanding (20)

AINL 2016: Maraev
AINL 2016: MaraevAINL 2016: Maraev
AINL 2016: Maraev
 
SNLI_presentation_2
SNLI_presentation_2SNLI_presentation_2
SNLI_presentation_2
 
Ch03 Mining Massive Data Sets stanford
Ch03 Mining Massive Data Sets  stanfordCh03 Mining Massive Data Sets  stanford
Ch03 Mining Massive Data Sets stanford
 
RCOMM 2011 - Sentiment Classification with RapidMiner
RCOMM 2011 - Sentiment Classification with RapidMinerRCOMM 2011 - Sentiment Classification with RapidMiner
RCOMM 2011 - Sentiment Classification with RapidMiner
 
RCOMM 2011 - Sentiment Classification
RCOMM 2011 - Sentiment ClassificationRCOMM 2011 - Sentiment Classification
RCOMM 2011 - Sentiment Classification
 
Beyond nouns eccv_2008
Beyond nouns eccv_2008Beyond nouns eccv_2008
Beyond nouns eccv_2008
 
Language Technology Enhanced Learning
Language Technology Enhanced LearningLanguage Technology Enhanced Learning
Language Technology Enhanced Learning
 
Presentation
PresentationPresentation
Presentation
 
Tg noh jeju_workshop
Tg noh jeju_workshopTg noh jeju_workshop
Tg noh jeju_workshop
 
Filtering Inaccurate Entity Co-references on the Linked Open Data
Filtering Inaccurate Entity Co-references on the Linked Open DataFiltering Inaccurate Entity Co-references on the Linked Open Data
Filtering Inaccurate Entity Co-references on the Linked Open Data
 
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESTHE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
 
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESTHE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
 
R Analytics in the Cloud
R Analytics in the CloudR Analytics in the Cloud
R Analytics in the Cloud
 
Translated learning
Translated learningTranslated learning
Translated learning
 
A Distributed Architecture System for Recognizing Textual Entailment
A Distributed Architecture System for Recognizing Textual EntailmentA Distributed Architecture System for Recognizing Textual Entailment
A Distributed Architecture System for Recognizing Textual Entailment
 
An Effective Rule Miner for Instance Matching in a Web of Data
An Effective Rule Miner for Instance Matching in a Web of DataAn Effective Rule Miner for Instance Matching in a Web of Data
An Effective Rule Miner for Instance Matching in a Web of Data
 
Reconciling Event-Based Knowledge through RDF2VEC
Reconciling Event-Based Knowledge through RDF2VECReconciling Event-Based Knowledge through RDF2VEC
Reconciling Event-Based Knowledge through RDF2VEC
 
Logic Programming and Prolog
Logic Programming and PrologLogic Programming and Prolog
Logic Programming and Prolog
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)
 
Latent Relational Model for Relation Extraction
Latent Relational Model for Relation ExtractionLatent Relational Model for Relation Extraction
Latent Relational Model for Relation Extraction
 

More from Danbi Cho

Crf based named entity recognition using a korean lexical semantic network
Crf based named entity recognition using a korean lexical semantic networkCrf based named entity recognition using a korean lexical semantic network
Crf based named entity recognition using a korean lexical semantic networkDanbi Cho
 
Attention boosted deep networks for video classification
Attention boosted deep networks for video classificationAttention boosted deep networks for video classification
Attention boosted deep networks for video classificationDanbi Cho
 
A survey on deep learning based approaches for action and gesture recognition...
A survey on deep learning based approaches for action and gesture recognition...A survey on deep learning based approaches for action and gesture recognition...
A survey on deep learning based approaches for action and gesture recognition...Danbi Cho
 
ELECTRA_Pretraining Text Encoders as Discriminators rather than Generators
ELECTRA_Pretraining Text Encoders as Discriminators rather than GeneratorsELECTRA_Pretraining Text Encoders as Discriminators rather than Generators
ELECTRA_Pretraining Text Encoders as Discriminators rather than GeneratorsDanbi Cho
 
A survey on automatic detection of hate speech in text
A survey on automatic detection of hate speech in textA survey on automatic detection of hate speech in text
A survey on automatic detection of hate speech in textDanbi Cho
 
Zero wall detecting zero-day web attacks through encoder-decoder recurrent ne...
Zero wall detecting zero-day web attacks through encoder-decoder recurrent ne...Zero wall detecting zero-day web attacks through encoder-decoder recurrent ne...
Zero wall detecting zero-day web attacks through encoder-decoder recurrent ne...Danbi Cho
 
Decision tree and ensemble
Decision tree and ensembleDecision tree and ensemble
Decision tree and ensembleDanbi Cho
 
Can recurrent neural networks warp time
Can recurrent neural networks warp timeCan recurrent neural networks warp time
Can recurrent neural networks warp timeDanbi Cho
 
Mitigating unwanted biases with adversarial learning
Mitigating unwanted biases with adversarial learningMitigating unwanted biases with adversarial learning
Mitigating unwanted biases with adversarial learningDanbi Cho
 

More from Danbi Cho (10)

Crf based named entity recognition using a korean lexical semantic network
Crf based named entity recognition using a korean lexical semantic networkCrf based named entity recognition using a korean lexical semantic network
Crf based named entity recognition using a korean lexical semantic network
 
Gpt models
Gpt modelsGpt models
Gpt models
 
Attention boosted deep networks for video classification
Attention boosted deep networks for video classificationAttention boosted deep networks for video classification
Attention boosted deep networks for video classification
 
A survey on deep learning based approaches for action and gesture recognition...
A survey on deep learning based approaches for action and gesture recognition...A survey on deep learning based approaches for action and gesture recognition...
A survey on deep learning based approaches for action and gesture recognition...
 
ELECTRA_Pretraining Text Encoders as Discriminators rather than Generators
ELECTRA_Pretraining Text Encoders as Discriminators rather than GeneratorsELECTRA_Pretraining Text Encoders as Discriminators rather than Generators
ELECTRA_Pretraining Text Encoders as Discriminators rather than Generators
 
A survey on automatic detection of hate speech in text
A survey on automatic detection of hate speech in textA survey on automatic detection of hate speech in text
A survey on automatic detection of hate speech in text
 
Zero wall detecting zero-day web attacks through encoder-decoder recurrent ne...
Zero wall detecting zero-day web attacks through encoder-decoder recurrent ne...Zero wall detecting zero-day web attacks through encoder-decoder recurrent ne...
Zero wall detecting zero-day web attacks through encoder-decoder recurrent ne...
 
Decision tree and ensemble
Decision tree and ensembleDecision tree and ensemble
Decision tree and ensemble
 
Can recurrent neural networks warp time
Can recurrent neural networks warp timeCan recurrent neural networks warp time
Can recurrent neural networks warp time
 
Mitigating unwanted biases with adversarial learning
Mitigating unwanted biases with adversarial learningMitigating unwanted biases with adversarial learning
Mitigating unwanted biases with adversarial learning
 

Recently uploaded

WSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2
 
WSO2Con2024 - Software Delivery in Hybrid Environments
WSO2Con2024 - Software Delivery in Hybrid EnvironmentsWSO2Con2024 - Software Delivery in Hybrid Environments
WSO2Con2024 - Software Delivery in Hybrid EnvironmentsWSO2
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationJuha-Pekka Tolvanen
 
WSO2Con2024 - Facilitating Broadband Switching Services for UK Telecoms Provi...
WSO2Con2024 - Facilitating Broadband Switching Services for UK Telecoms Provi...WSO2Con2024 - Facilitating Broadband Switching Services for UK Telecoms Provi...
WSO2Con2024 - Facilitating Broadband Switching Services for UK Telecoms Provi...WSO2
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2
 
WSO2CON 2024 - OSU & WSO2: A Decade Journey in Integration & Innovation
WSO2CON 2024 - OSU & WSO2: A Decade Journey in Integration & InnovationWSO2CON 2024 - OSU & WSO2: A Decade Journey in Integration & Innovation
WSO2CON 2024 - OSU & WSO2: A Decade Journey in Integration & InnovationWSO2
 
WSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - KanchanaWSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - KanchanaWSO2
 
WSO2CON 2024 - Architecting AI in the Enterprise: APIs and Applications
WSO2CON 2024 - Architecting AI in the Enterprise: APIs and ApplicationsWSO2CON 2024 - Architecting AI in the Enterprise: APIs and Applications
WSO2CON 2024 - Architecting AI in the Enterprise: APIs and ApplicationsWSO2
 
WSO2Con2024 - Low-Code Integration Tooling
WSO2Con2024 - Low-Code Integration ToolingWSO2Con2024 - Low-Code Integration Tooling
WSO2Con2024 - Low-Code Integration ToolingWSO2
 
Driving Innovation: Scania's API Revolution with WSO2
Driving Innovation: Scania's API Revolution with WSO2Driving Innovation: Scania's API Revolution with WSO2
Driving Innovation: Scania's API Revolution with WSO2WSO2
 
WSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AIWSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AIWSO2
 
WSO2Con2024 - Navigating the Digital Landscape: Transforming Healthcare with ...
WSO2Con2024 - Navigating the Digital Landscape: Transforming Healthcare with ...WSO2Con2024 - Navigating the Digital Landscape: Transforming Healthcare with ...
WSO2Con2024 - Navigating the Digital Landscape: Transforming Healthcare with ...WSO2
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2
 
WSO2CON 2024 - Designing Event-Driven Enterprises: Stories of Transformation
WSO2CON 2024 - Designing Event-Driven Enterprises: Stories of TransformationWSO2CON 2024 - Designing Event-Driven Enterprises: Stories of Transformation
WSO2CON 2024 - Designing Event-Driven Enterprises: Stories of TransformationWSO2
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2
 
WSO2Con2024 - Simplified Integration: Unveiling the Latest Features in WSO2 L...
WSO2Con2024 - Simplified Integration: Unveiling the Latest Features in WSO2 L...WSO2Con2024 - Simplified Integration: Unveiling the Latest Features in WSO2 L...
WSO2Con2024 - Simplified Integration: Unveiling the Latest Features in WSO2 L...WSO2
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastPapp Krisztián
 
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With SimplicityWSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With SimplicityWSO2
 
WSO2CON 2024 - How CSI Piemonte Is Apifying the Public Administration
WSO2CON 2024 - How CSI Piemonte Is Apifying the Public AdministrationWSO2CON 2024 - How CSI Piemonte Is Apifying the Public Administration
WSO2CON 2024 - How CSI Piemonte Is Apifying the Public AdministrationWSO2
 

Recently uploaded (20)

WSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - Keynote
 
WSO2Con2024 - Software Delivery in Hybrid Environments
WSO2Con2024 - Software Delivery in Hybrid EnvironmentsWSO2Con2024 - Software Delivery in Hybrid Environments
WSO2Con2024 - Software Delivery in Hybrid Environments
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
WSO2Con2024 - Facilitating Broadband Switching Services for UK Telecoms Provi...
WSO2Con2024 - Facilitating Broadband Switching Services for UK Telecoms Provi...WSO2Con2024 - Facilitating Broadband Switching Services for UK Telecoms Provi...
WSO2Con2024 - Facilitating Broadband Switching Services for UK Telecoms Provi...
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
WSO2CON 2024 - OSU & WSO2: A Decade Journey in Integration & Innovation
WSO2CON 2024 - OSU & WSO2: A Decade Journey in Integration & InnovationWSO2CON 2024 - OSU & WSO2: A Decade Journey in Integration & Innovation
WSO2CON 2024 - OSU & WSO2: A Decade Journey in Integration & Innovation
 
WSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - KanchanaWSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - Kanchana
 
WSO2CON 2024 - Architecting AI in the Enterprise: APIs and Applications
WSO2CON 2024 - Architecting AI in the Enterprise: APIs and ApplicationsWSO2CON 2024 - Architecting AI in the Enterprise: APIs and Applications
WSO2CON 2024 - Architecting AI in the Enterprise: APIs and Applications
 
WSO2Con2024 - Low-Code Integration Tooling
WSO2Con2024 - Low-Code Integration ToolingWSO2Con2024 - Low-Code Integration Tooling
WSO2Con2024 - Low-Code Integration Tooling
 
Driving Innovation: Scania's API Revolution with WSO2
Driving Innovation: Scania's API Revolution with WSO2Driving Innovation: Scania's API Revolution with WSO2
Driving Innovation: Scania's API Revolution with WSO2
 
WSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AIWSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AI
 
WSO2Con2024 - Navigating the Digital Landscape: Transforming Healthcare with ...
WSO2Con2024 - Navigating the Digital Landscape: Transforming Healthcare with ...WSO2Con2024 - Navigating the Digital Landscape: Transforming Healthcare with ...
WSO2Con2024 - Navigating the Digital Landscape: Transforming Healthcare with ...
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
 
WSO2CON 2024 - Designing Event-Driven Enterprises: Stories of Transformation
WSO2CON 2024 - Designing Event-Driven Enterprises: Stories of TransformationWSO2CON 2024 - Designing Event-Driven Enterprises: Stories of Transformation
WSO2CON 2024 - Designing Event-Driven Enterprises: Stories of Transformation
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 
WSO2Con2024 - Simplified Integration: Unveiling the Latest Features in WSO2 L...
WSO2Con2024 - Simplified Integration: Unveiling the Latest Features in WSO2 L...WSO2Con2024 - Simplified Integration: Unveiling the Latest Features in WSO2 L...
WSO2Con2024 - Simplified Integration: Unveiling the Latest Features in WSO2 L...
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With SimplicityWSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
 
WSO2CON 2024 - How CSI Piemonte Is Apifying the Public Administration
WSO2CON 2024 - How CSI Piemonte Is Apifying the Public AdministrationWSO2CON 2024 - How CSI Piemonte Is Apifying the Public Administration
WSO2CON 2024 - How CSI Piemonte Is Apifying the Public Administration
 

Situation recognition visual semantic role labeling for image understanding

  • 1. Natural Language Processing Labs. By Daanv 2019.07.16 Situation Recognition
  • 2. Natural Language Processing Labs. By Daanv Abstract The problem of producing a concise summary of the situation an image depicts 1) The main activity 2) The participating actors (objects) 3) The roles these participants play in the activity https://framenet.icsi.berkeley.edu/fndrupal/ FrameNet(Frame Semantics) : building a lexical database of English that is both human and machine readable, based on annotating examples of how words are used in actual texts.
  • 3. Natural Language Processing Labs. By Daanv Formal Task Definition - verb(V) , nouns(N), frames(F) Ex) Rf = {(agent, boy), (source, cliff), (obstacle, Ø), (destination, water), (place, lake)} > Predict a situation, S = (v, Rf) Predict: S = (jumping, {(agent, boy), (source, cliff), (obstacle, Ø), (destination, water), (place, lake)})
  • 4. Natural Language Processing Labs. By Daanv Situation Recognition This task compare performance to baselines that independently recognize activities and objects. Dataset > imSitu Images labeled with situations (over 500 verbs with 125,000 images.) Metrics > Accuracy The evaluation data has situations provided by multiple annotators. So, the accuracy is high when the verb prediction(verb) and semantic role-value pair predictions(value) matches one of the annotations. Systems > CRF (feature: ImageNet, wordNet) with VGG - Stochastic gradient descent - Batch size: 192 / epochs: 30 / learning rate: 1e-5
  • 5. Natural Language Processing Labs. By Daanv Situation Recognition The number of semantic roles a noun can participate in, on a log-scale.
  • 6. Natural Language Processing Labs. By Daanv Situation Recognition The number of nouns that can participate in a sample of semantic roles.
  • 7. Natural Language Processing Labs. By Daanv Situation Recognition The number of nouns that appear with a sample of verbs.
  • 8. Natural Language Processing Labs. By Daanv Situation Recognition Quantitative Results Qualitative Results Semantic roles Values for those roles activity The standard annotated data The output from CRF model when it correctly predicted the activity
  • 9. Natural Language Processing Labs. By Daanv THANK YOU.