
Dutch Humor Detection by Generating Negative Examples

Dutch Humor Detection by Generating Negative Examples


This talk was originally presented by Thomas Winters on the 20th of November 2020 at the 29th Belgian Dutch Conference on Machine Learning (Benelearn 2020). The conference awarded this presentation the "Best Video Award".

A video of this talk is also available on https://www.youtube.com/watch?v=U1cShms67ec

For more information, see https://thomaswinters.be/talk/2020benelearn

Abstract:


Dutch Humor Detection by Generating Negative Examples

1. Dutch Humor Detection by Generating Negative Examples. Thomas Winters & Pieter Delobelle, PhD students at DTAI, KU Leuven. firstname.lastname@kuleuven.be. @thomas_wint, thomaswinters.be. @pieterdelobelle, people.cs.kuleuven.be/~pieter.delobelle
2. Humor: intrinsically human! An AI-complete problem?
3. Incongruity-Resolution Theory. Example joke: "Two fish are in a tank. Says one to the other: 'Do you know how to drive this thing?'" Based on: Ritchie, G. (1999). Developing the incongruity-resolution theory.
4. The first sentence of the joke is the setup.
5. The setup evokes an obvious interpretation (fish in an aquarium tank).
6. The last sentence is the punchline.
7. The punchline reveals a hidden interpretation (a drivable, military tank) that resolves the incongruity.
8. This is a human-focused definition! A machine should not only spot the two mental images (the obvious and the hidden interpretation), but also that resolving them is not too hard or too easy for a human!
9. Transformer models: large language models, pretrained on large corpora, outperforming previous neural architectures on most language tasks. GPT-2 & GPT-3 complete any textual prompt; BERT classifies any text sequence / token. Brown, T. B., et al. (2020). Language models are few-shot learners. Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding.
10. Not just for English: Dutch RobBERT! RobBERT is a Dutch RoBERTa-based language model. It vastly outperforms other architectures on a large range of Dutch NLP tasks and generally outperforms other BERT models, especially on small datasets. Easy to use: just import & fine-tune on your task. But can it learn to recognise humor? Delobelle, P., Winters, T., & Berendt, B. (2020). RobBERT: a Dutch RoBERTa-based Language Model.
   from transformers import RobertaTokenizer, RobertaForSequenceClassification
   tokenizer = RobertaTokenizer.from_pretrained("pdelobelle/robbert-v2-dutch-base")
   model = RobertaForSequenceClassification.from_pretrained("pdelobelle/robbert-v2-dutch-base")
11. An early humor detector: hand-designed humor features (e.g. alliteration, antonymy, adult slang, ...), classified with Naive Bayes and Support Vector Machines. Task: one-liners vs news, a neutral corpus & proverbs. Mihalcea, R., & Strapparava, C. (2005). Making computers laugh: Investigations in automatic humor recognition.
12. But is this a good dataset? News & proverbs use completely different types of words than jokes, so looking at word frequencies is often already "enough"! Is this really humor detection?
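To see why word frequencies alone can separate such corpora, here is a minimal bag-of-words Naive Bayes sketch in plain Python. The toy corpora and words are purely illustrative, not the datasets or the classifier from the talk:

```python
import math
from collections import Counter

def train(docs):
    """Count word frequencies per class (docs: list of (text, label))."""
    counts = {}          # label -> Counter of word frequencies
    priors = Counter()   # label -> number of documents
    for text, label in docs:
        priors[label] += 1
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts, priors

def classify(text, counts, priors):
    """Pick the label maximising log P(label) + sum log P(word|label),
    with add-one smoothing over the shared vocabulary."""
    vocab = {w for c in counts.values() for w in c}
    best, best_score = None, -math.inf
    for label, c in counts.items():
        total = sum(c.values()) + len(vocab)
        score = math.log(priors[label] / sum(priors.values()))
        for w in text.lower().split():
            score += math.log((c[w] + 1) / total)
        if score > best_score:
            best, best_score = label, score
    return best

# Toy corpora: jokes and news headlines use almost disjoint vocabularies,
# so even this crude model separates them easily.
docs = [
    ("why did the chicken cross the road", "joke"),
    ("knock knock who is there", "joke"),
    ("government announces new budget measures", "news"),
    ("stock markets fall after interest rate decision", "news"),
]
counts, priors = train(docs)
print(classify("knock knock why did the road cross", counts, priors))  # -> joke
print(classify("markets react to budget decision", counts, priors))    # -> news
```

The classifier never models meaning, only vocabulary overlap, which is exactly why jokes vs news is a suspiciously easy benchmark.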
13. Jokes are fragile! "Two fish are in a tank. Says one to the other: 'Do you know how to drive this thing?'" Winters, T. (2019). Generating philosophical statements using interpolated markov models and dynamic templates.
14. Jokes are fragile! Generate non-jokes using dynamic templates (as in @TorfsBot)! Winters, T. (2019). Generating philosophical statements using interpolated markov models and dynamic templates.
15. For example, replace "fish" with "men"...
16. ...and "tank" with "bar": "Two men are in a bar. Says one to the other: 'Do you know how to drive this thing?'"
17. The result keeps the joke's vocabulary and structure but is no longer a joke: word-based features won't work anymore!
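The substitution idea above can be sketched as follows: abstract some content words of a joke into slots and refill them with words harvested from other jokes, so the surface vocabulary stays joke-like while the double meaning is destroyed. A minimal Python sketch; the slot words and replacement list are hand-picked here for illustration, whereas the actual system derives its templates and fillers automatically:

```python
import random

def generate_non_joke(joke, slot_words, replacements, rng):
    """Replace each slotted word with a random corpus word, breaking the
    joke's hidden interpretation while keeping its surface form."""
    out = []
    for token in joke.split():
        # Strip simple punctuation so "tank." still matches the slot "tank".
        core = token.strip('.,!?"\'')
        if core.lower() in slot_words:
            out.append(token.replace(core, rng.choice(replacements)))
        else:
            out.append(token)
    return " ".join(out)

rng = random.Random(0)
joke = "Two fish are in a tank. Says one to the other: do you know how to drive this thing?"
# Words whose double meaning carries the joke (chosen by hand here).
slot_words = {"fish", "tank"}
# Content words harvested from other jokes in a hypothetical corpus.
replacements = ["men", "bar", "dog", "ladder"]
print(generate_non_joke(joke, slot_words, replacements, rng))
```

Because the output reuses the joke corpus vocabulary and sentence structure, a frequency-based detector can no longer tell it apart from a real joke; only a model with some grasp of meaning can.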
18. Examples of generated Dutch non-jokes (intentionally nonsensical):
   - Het is groen en het is een mummie? Kermit de Waterkant
   - Wat is het toppunt van principe? 1) Wachten totdat een Nederlander gaat twijfelen 2) Een Zuster met een autoladder 3) Een brandwacht brandmeester met een brandmeester van 9 maanden
   - "Ober, kunt u die schrik uit mijn politieman halen? Want ik eet liever alleen."
   - "Mijn hond is heel vreselijk: Hij schreeuwt mij iedere zus de broer." "Maar dat is toch niet zo heel vreselijk?" "Jawel, want ik heb geen rapport!"
   - Wat staat er midden in het bos? De kapper.
   - Er loopt een super vriendelijk blondje langs een armband. Last er een toonbank: "zo, waargaan die mooie mannen heen?" Blondje: "naar de barkeeper als er niets tussen komt..."
   - Hoe heet de vrouw van Sinterklaas? Keukentafel.
   - "Twee tanden zwemmen in de zee en ze zien een stamgast op een stamgast. De ene raad zegt tegen de andere raad: 'Hé kijk! Ons eten op een bord!'"
19. Binary classification of Dutch jokes versus texts from other domains (accuracy):

                  Jokes vs News   Jokes vs Proverbs   Jokes vs Generated
   Naive Bayes        51%               60%                  50%
   LSTM               94%               94%                  47%
   CNN                94%               94%                  47%
   RobBERT            99%               96%                  89%

   A much more challenging dataset! More truthful humor detection?
20. Conclusion: a novel joke detection dataset creation method that easily scales to other languages; illustrated the humor insights of transformer models, strongly outperforming previous neural networks; created the first Dutch humor detectors. https://github.com/twinters/dutch-humor-detection
21. Dutch Humor Detection by Generating Negative Examples. Thomas Winters & Pieter Delobelle, PhD students at DTAI, KU Leuven. firstname.lastname@kuleuven.be. @thomas_wint, thomaswinters.be. @pieterdelobelle, people.cs.kuleuven.be/~pieter.delobelle. Some images based on the works of dooder & alekksall on freepik.com.
