Natural Language Processing Labs. By Daanv
2019.07.16
Situation Recognition
Abstract
Situation recognition: the problem of producing a concise summary of the situation an image depicts, including:
1) The main activity
2) The participating actors (objects)
3) The roles these participants play in the activity
FrameNet (Frame Semantics): https://framenet.icsi.berkeley.edu/fndrupal/
A project building a lexical database of English that is both human- and machine-readable, based on annotating examples of how words are used in actual texts.
Formal Task Definition
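- Given: a set of verbs (V), a set of nouns (N), and a set of frames (F)
- A realized frame R_f pairs each semantic role of a frame f with a noun value, or Ø when the role is unfilled
Ex) R_f = {(agent, boy), (source, cliff), (obstacle, Ø), (destination, water), (place, lake)}
> Predict a situation, S = (v, R_f)
Ex) S = (jumping, {(agent, boy), (source, cliff), (obstacle, Ø), (destination, water), (place, lake)})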
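The situation tuple above maps naturally onto a small data structure. The following is a minimal sketch, not the authors' code: the class name `Situation`, the field names, and the use of `None` to stand for Ø are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Situation:
    """A situation S = (v, R_f): a verb plus a realized frame."""
    verb: str
    # Realized frame: semantic role -> noun value; None stands for Ø (unfilled).
    roles: Dict[str, Optional[str]]

    def filled_roles(self) -> Dict[str, str]:
        """Return only the roles that have a noun value."""
        return {role: noun for role, noun in self.roles.items() if noun is not None}


# The "jumping" example from the slide.
jumping = Situation(
    verb="jumping",
    roles={
        "agent": "boy",
        "source": "cliff",
        "obstacle": None,  # Ø: no obstacle in the image
        "destination": "water",
        "place": "lake",
    },
)
print(jumping.filled_roles())
```

Keeping unfilled roles in the dictionary, rather than dropping them, preserves the fact that the frame for "jumping" has an obstacle role even when nothing fills it.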
Situation Recognition
Performance on this task is compared against baselines that recognize activities and objects independently.
Dataset
> imSitu
Images labeled with situations: over 500 verbs and 125,000 images.
Metrics
> Accuracy
Each evaluation image has situations provided by multiple annotators.
A prediction therefore counts as correct when the predicted verb (verb) and the predicted semantic role-value pairs (value) match one of the annotations.
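The matching rule described above can be sketched as follows. This is an illustrative reading of the slide, not the benchmark's official scoring script: the function names, the `(verb, roles)` annotation format, and the choice to require the whole role-value set to match a single annotation are assumptions.

```python
def score_prediction(pred_verb, pred_roles, annotations):
    """Score one image against several annotated situations.

    annotations: list of (verb, roles) pairs, one per annotator, where
    roles maps each semantic role to a noun (None for an unfilled role).
    Returns (verb_ok, value_ok).
    """
    # The verb is correct if any annotator used the same verb.
    verb_ok = any(pred_verb == verb for verb, _ in annotations)
    # The values are correct if verb and all role-value pairs match one annotation.
    value_ok = any(
        pred_verb == verb and pred_roles == roles for verb, roles in annotations
    )
    return verb_ok, value_ok


def accuracy(flags):
    """Fraction of True entries in a list of per-image results."""
    return sum(flags) / len(flags)


# Two hypothetical annotators who disagree only on the place.
annotations = [
    ("jumping", {"agent": "boy", "source": "cliff", "place": "lake"}),
    ("jumping", {"agent": "boy", "source": "cliff", "place": "river"}),
]
hit = score_prediction(
    "jumping", {"agent": "boy", "source": "cliff", "place": "river"}, annotations
)
miss = score_prediction(
    "jumping", {"agent": "girl", "source": "cliff", "place": "lake"}, annotations
)
value_accuracy = accuracy([hit[1], miss[1]])
```

Accepting a match against any single annotator is what makes the measure tolerant of reasonable disagreement between annotators.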
Systems
> CRF (features: ImageNet, WordNet) with VGG
- Stochastic gradient descent
- Batch size: 192 / epochs: 30 / learning rate: 1e-5
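The slide does not spell out the CRF's potentials. As a rough sketch under an assumed factorization, a situation can be scored as one verb potential plus one potential per (verb, role, noun) triple, then normalized over every candidate situation. The scores below are hand-set toy numbers standing in for network outputs, and all names (`frames`, `verb_score`, `role_score`) are hypothetical.

```python
import math
from itertools import product


def situation_log_score(verb, roles, verb_score, role_score):
    """Unnormalized log-score: verb potential plus one potential per role."""
    return verb_score[verb] + sum(
        role_score.get((verb, role, noun), 0.0) for role, noun in roles.items()
    )


def situation_distribution(frames, nouns, verb_score, role_score):
    """Enumerate every (verb, role assignment) and normalize with a softmax."""
    scored = {}
    for verb, role_names in frames.items():
        for values in product(nouns, repeat=len(role_names)):
            roles = dict(zip(role_names, values))
            key = (verb, tuple(roles.items()))
            scored[key] = situation_log_score(verb, roles, verb_score, role_score)
    log_z = math.log(sum(math.exp(s) for s in scored.values()))
    return {key: math.exp(s - log_z) for key, s in scored.items()}


# Toy setup: two verbs, two nouns, hand-set potentials (assumed values).
frames = {"jumping": ["agent", "source"], "swimming": ["agent"]}
nouns = ["boy", "cliff"]
verb_score = {"jumping": 2.0, "swimming": 0.5}
role_score = {
    ("jumping", "agent", "boy"): 1.5,
    ("jumping", "source", "cliff"): 1.0,
}

dist = situation_distribution(frames, nouns, verb_score, role_score)
best = max(dist, key=dist.get)
print(best)
```

Exhaustive enumeration is only workable for a toy vocabulary; with hundreds of verbs and many nouns, a real system would need to exploit the factorization rather than list every situation.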
Situation Recognition
The number of semantic roles a noun can participate in, on a log-scale.
Situation Recognition
The number of nouns that can participate in a sample of semantic roles.
Situation Recognition
The number of nouns that appear with a sample of verbs.
Situation Recognition
Quantitative Results
Qualitative Results
Each qualitative example lists the activity, its semantic roles, and the values for those roles, for both the standard annotated data and the output from the CRF model when it correctly predicted the activity.
THANK YOU.

Situation Recognition: Visual Semantic Role Labeling for Image Understanding