ESWC - PhD Symposium 2016


  1. Machine-Crowd Annotation Workflow for Event Understanding across Collections & Domains. Oana Inel. Extended Semantic Web Conference, PhD Symposium, May 30th 2016
  2. Too much information... e.g., if you are interested in the topic of “whaling”
  3. … and after a while it all looks the same: it is difficult to form a global picture of a topic
  4. … thus, content without context is difficult to process; events can help create context around content
  5. …, but events are not easy to deal with
     • Events are vague
     • Event semantics are difficult
     • Events can be viewed and interpreted from multiple perspectives
       e.g. a participant’s interpretation: The mayor of the city called the celebration a success.
     • Events can be presented at different levels of granularity
       e.g. spatial disagreement: The celebration took place in every city in the Netherlands.
     • People are not consistent in the way they talk about or use events
       e.g.: The celebration took place last week, fireworks shows were held everywhere.
  6. … a lot of ground truth is needed to learn event specifics
     • Traditional ground truth collection doesn’t scale:
       • there is not really ‘one type of expert’ when it comes to events
       • the annotation guidelines for events are difficult to define
       • the annotation of events can be a tedious process
       • all of the above can result in high inter-annotator disagreement
     • Crowdsourcing could be an alternative
       • but it is still not a robust & replicable approach
  7. … let’s look at some examples
     According to department policy prosecutors must make a strong showing that lawyers' fees came from assets tainted by illegal profits before any attempts at seizure are made.
     The unit makes intravenous pumps used by hospitals and had more than $110 million in sales last year according to Advanced Medical.
  8. … here is what experts annotate on these sentences
     [According] to department policy prosecutors must make a strong [showing] that lawyers' fees [came] from assets tainted by illegal profits before any [attempts] at [seizure] are [made].
     The unit makes intravenous pumps used by hospitals and [had] more than $110 million in [sales] last year according to Advanced Medical.
  9. … here is what the crowd annotates on them
     According to department policy prosecutors must make a [strong [showing]] that lawyers' fees [[came] from assets] [tainted] by illegal profits before any [attempts] at [seizure] are [made].
     The unit [makes] intravenous pumps [used] by hospitals and [[had] more than $110 million in [sales]] last year according to Advanced Medical.
  10. … here is what the machines can detect
      According to department policy prosecutors must [make] a strong showing that lawyers' fees [came] from assets [tainted] by illegal profits before any attempts at seizure are made.
      The unit [makes] intravenous pumps [used] by hospitals and [had] more than $110 million in sales last year according to Advanced Medical.
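      To make the expert/crowd/machine comparison on slides 8-10 concrete, here is a minimal Python sketch of token-level precision and recall between two sets of event-denoting tokens. The token sets are copied from the second example sentence above; the helper token_overlap is illustrative, not part of any released tool.

          def token_overlap(predicted, reference):
              """Token-level precision and recall of predicted event tokens vs. a reference set."""
              true_positives = len(predicted & reference)
              precision = true_positives / len(predicted) if predicted else 0.0
              recall = true_positives / len(reference) if reference else 0.0
              return precision, recall

          # Event tokens marked in "The unit makes intravenous pumps ..." (slides 8-10)
          expert_tokens  = {"had", "sales"}
          crowd_tokens   = {"makes", "used", "had", "sales"}
          machine_tokens = {"makes", "used", "had"}

          p, r = token_overlap(machine_tokens, crowd_tokens)
          print(f"machine vs. crowd: precision={p:.2f}, recall={r:.2f}")  # 1.00, 0.75

      Against the crowd the machine output is precise but incomplete; against the experts it would miss [sales] while adding tokens the experts did not mark, which is exactly the gap more training data is meant to close.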
  11. Research Questions
      • Can crowdsourcing help in improving event detection?
      • Can we provide reliable crowdsourced training data?
      • Can we optimize the crowdsourcing process by using results from NLP tools?
      • Can we achieve a replicable data collection process across different data types and use cases?
  12. Current Hypothesis: A disagreement-based approach to crowdsourcing ground truth is reliable and produces quality results
  13. Preliminary Results - Crowd vs. Experts
      ● 200 news snippets from TimeBank
      ● 3019 tweets published in 2014
        ● potentially relevant tweets for events such as ‘whaling’ and ‘Davos 2014’, among others
      The CrowdTruth approach outperforms state-of-the-art crowdsourcing approaches such as single annotator and majority vote.
      The crowd performs almost as well as the experts, whose annotations are guided by highly specialized linguistic guidelines.
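      As a hedged illustration of the two baselines mentioned above, here is a small Python sketch on hypothetical worker judgments for one candidate event annotation (True = “this is an event”). Neither function is taken from the CrowdTruth codebase; they only show how the baselines aggregate.

          import random

          judgments = [True, True, False, True, False, True, False]  # 7 hypothetical workers

          def majority_vote(votes):
              """Keep the annotation if at least half of the workers picked it."""
              return sum(votes) >= len(votes) / 2

          def single_annotator(votes, seed=0):
              """Randomly sample one worker's judgment (the weakest baseline)."""
              return random.Random(seed).choice(votes)

          print("majority vote:", majority_vote(judgments))        # True (4 of 7)
          print("single annotator:", single_annotator(judgments))  # whatever the sampled worker said

      A disagreement-aware score instead keeps the full vote distribution (4 out of 7 here) rather than collapsing it to a binary decision; a sketch of such a score follows slide 17.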
  14. Current Hypothesis: A disagreement-based approach to crowdsourcing ground truth can be optimised by using results from NLP tools
  15. Preliminary Results - Hybrid Workflow
      [Workflow diagram with stages: entity extraction; events crowdsourcing and linking to concepts; segmentation & keyframes; linking events and concepts to keyframes]
      diveplus.beeldengeluid.nl
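      A minimal sketch of how such a hybrid pipeline could be wired together, assuming hypothetical stub functions (none of these names come from the DIVE+ codebase); the point is how the machine and crowd steps feed each other.

          def extract_entities(synopsis):
              """Machine step: NER over the video synopsis (stubbed as title-case words)."""
              return [w for w in synopsis.split() if w.istitle()]

          def crowdsource_events(synopsis, entities):
              """Crowd step: workers identify events and link them to concepts (stubbed)."""
              return [{"event": "celebration", "participants": entities}]

          def segment_keyframes(video_id):
              """Machine step: shot segmentation and keyframe extraction (stubbed)."""
              return [f"{video_id}/frame-{i}.jpg" for i in range(3)]

          def link_to_keyframes(events, keyframes):
              """Final step: link crowd-verified events and concepts to keyframes."""
              return [(e["event"], kf) for e in events for kf in keyframes]

          synopsis = "The Mayor opened the celebration in Amsterdam"
          events = crowdsource_events(synopsis, extract_entities(synopsis))
          print(link_to_keyframes(events, segment_keyframes("video-42")))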
  16. Preliminary Results - Hybrid Workflow Outcome
      diveplus.beeldengeluid.nl
  17. Approach: Disagreement is Signal
      Principles for disagreement-based crowdsourcing:
      • Do not enforce agreement
      • Capture a multitude of views
      • Take advantage of existing tools, reuse their functionality
      This results in teaching machines to reason in the disagreement space
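      A minimal sketch of disagreement-aware scoring in the spirit of the CrowdTruth metrics (heavily simplified; see http://CrowdTruth.org for the actual definitions). Each worker's judgments over the candidate annotations of one media unit form a binary vector; the unit vector is their sum, and an annotation's score is the cosine between the unit vector and that annotation's one-hot vector.

          from math import sqrt

          def cosine(u, v):
              dot = sum(a * b for a, b in zip(u, v))
              norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
              return dot / norm if norm else 0.0

          annotations = ["makes", "used", "had", "sales"]  # candidates for one sentence
          worker_vectors = [  # hypothetical judgments of 3 workers
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [1, 0, 1, 1],
          ]
          unit_vector = [sum(col) for col in zip(*worker_vectors)]

          for i, annotation in enumerate(annotations):
              one_hot = [1 if j == i else 0 for j in range(len(annotations))]
              print(annotation, round(cosine(unit_vector, one_hot), 2))

      Annotations every worker marks score high, while minority choices score low but are not discarded; this is what reasoning in the disagreement space refers to.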
  18. Overall Methodology
      1. Instantiate the research methodology with specific data and domain
         • Video synopses, news
      2. Identify state-of-the-art IE approaches that can be used
         • NER tools for identifying events and their participating entities in the video synopses
      3. Evaluate the IE approaches and identify their drawbacks
         • Poor performance in extracting events
      4. Combine IE with crowdsourcing tasks in a complementary way
         • Use crowdsourcing for identifying the events and linking them with their participating entities
      5. Evaluate the crowdsourcing results with the CrowdTruth disagreement-first approach
         • Evaluate the input unit, the workers and the annotations
      6. Instantiate the same workflow with different data and/or a different domain
         • Tweets, Twitter
      7. Perform cross-domain analysis
         • Event extraction in video synopses vs. event extraction in tweets
  19. Project Websites
      http://CrowdTruth.org
      http://diveproject.beeldengeluid.nl
      Tools & Code
      http://dev.CrowdTruth.org
      http://github.com/CrowdTruth
      http://diveplus.beeldengeluid.nl
      Data
      http://data.crowdtruth.org
      http://data.dive.beeldengeluid.nl

Editor's Notes

  • Massive amount of information
    One of the main characteristics of today is the massive, even overwhelming amount of information around us
    Just think of all the videos, images and the infinite amount of web pages and tweets that you get as search results when you want to learn about a topic
  • However, this inconceivable amount of information starts to ‘look all the same’ to users, and they are not able to properly consume the information and get an overview of the topic
  • and this happens because content without context is difficult to process.
    but events can help create context around content
  • Experts can be inconsistent - despite the traditional belief that they are always right
  • The crowd overlaps with the experts 88% of the time, i.e. it detects almost the same events as the experts
    But the added value is that the crowd finds even more events and is more specific
    Another point is that the crowd seems to be more consistent :-)
  • And note how little the machines are able to detect from this - they need to learn more, so more training data is needed for them
  • majority vote - the annotation picked by the majority of the workers, i.e. every answer that was picked by at least half of the total number of workers
    single annotator - one judgment randomly sampled from the set of workers annotating the unit; included to show that having more annotators generates better quality data
    CrowdTruth scores consistently above the majority vote and single annotator baselines, and its performance is also comparable to that of domain experts.
    Crowdsourcing tasks where workers choose annotations from a fixed set of options perform better at higher thresholds (e.g. Twitter event extraction), whereas open annotation tasks (event extraction) perform better when the threshold is at its lowest, ensuring the most diverse opinions are accounted for in the resulting ground truth.
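    As a made-up illustration of that threshold choice (the scores below are invented; the unit-annotation score is the simplified cosine sketch after slide 17): the ground truth keeps every annotation whose score clears a threshold, so a low threshold keeps diverse opinions and a high one keeps only consensus.

        scores = {"makes": 0.44, "used": 0.44, "had": 0.65, "sales": 0.44}

        def ground_truth(scores, threshold):
            """Keep every annotation whose disagreement-aware score clears the threshold."""
            return {a for a, s in scores.items() if s >= threshold}

        for t in (0.2, 0.5, 0.7):
            print(t, sorted(ground_truth(scores, t)))
        # 0.2 keeps all four annotations; 0.5 and 0.7 keep only "had"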


  • Message of the results
    Data on which the experiments were performed
  • We have two hypotheses for this
  • Experts are inconsistent
  • Automatic tools detect less, and it is difficult to see what their focus is
    The crowd is much more specific than the experts
    The crowd overlaps a lot with the experts
    Experts find some events difficult
    Experts are not consistent