
A look inside Babelfy: Examining the bubble

Presented at CLiN (Computational Linguistics in the Netherlands) conference 26, in December 2015.
Slides created collaboratively with Minh Le.



  1. A look inside Babelfy: Examining the bubble (Filip Ilievski, Minh Le)
  2. Reproducibility: “Reproducibility is a defining feature of science, but the extent to which it characterizes current research is unknown.” (Estimating the reproducibility of psychological science, 2015)
  3. “According to the replicators’ qualitative assessments, only 39 of the 100 replication attempts were successful.” (http://www.nature.com/news/over-half-of-psychology-studies-fail-reproducibility-test-1.18248)
  4. Replication is h.a.r.d. “According to the replicators’ qualitative assessments, only 39 of the 100 replication attempts were successful.” “BUT, whether a replication attempt is considered successful is not straightforward.” (http://www.nature.com/news/over-half-of-psychology-studies-fail-reproducibility-test-1.18248)
  5. Everyone has a replication story to tell ... And here is ours!
  6. Babelfy ● Solution published in 2014
  7. Babelfy ● Solution published in 2014 ● State-of-the-art reported performance
  8. Babelfy ● Solution published in 2014 ● State-of-the-art reported performance ● Great algorithm ○ Multilinguality ○ Joint decision making ○ Combined knowledge from multiple sources ■ NLP (EL, WSD) ■ Semantic Web (BabelNet)
  9. Babelfy ● Solution published in 2014 ● State-of-the-art reported performance ● Great algorithm ○ Multilinguality ○ Joint decision making ○ Combined knowledge from multiple sources ■ NLP (EL, WSD) ■ Semantic Web (BabelNet) ● But: no open source available
  10. Idea: Reimplement ● No open source available ● Curiosity about how the algorithm works in detail ● Accelerate research by opening the source code
  11. Algorithm
      1. Preprocessing (run once)
          1.1. Input: BabelNet graph (undirected, labeled graph of concepts and entities)
          1.2. Re-weight the graph
          1.3. Random walk with restart
          1.4. Output: set of related concepts and entities
      2. Disambiguation (for each document)
          2.1. Generate candidates
          2.2. Construct referent graph
          2.3. Apply densest-subgraph algorithm
          2.4. Disambiguate with a threshold
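The two phases above can be sketched in Python on a toy graph. Everything here is an illustrative stand-in, not the Babelfy implementation: the function names and toy data are hypothetical, and direct-neighbour overlap stands in for the random-walk signatures and the densest-subgraph step.

```python
# Hypothetical sketch of the two Babelfy phases on a toy graph.
# Names, data, and scoring are illustrative, not Babelfy's actual code.

def preprocess(graph):
    # Steps 1.1-1.4, simplified: the "set of related concepts" per node
    # is stood in for by its direct neighbours (instead of a random walk).
    return {v: set(nbrs) for v, nbrs in graph.items()}

def disambiguate(mentions, candidates, signatures, threshold=1):
    # Steps 2.1-2.4, simplified: score each candidate by how strongly its
    # signature overlaps the signatures of the other mentions' candidates,
    # and keep the best candidate per mention if it clears the threshold.
    result = {}
    for mention in mentions:
        others_sig = set()
        for other in mentions:
            if other != mention:
                for cand in candidates[other]:
                    others_sig |= signatures.get(cand, set())
        best, best_score = None, -1
        for cand in candidates[mention]:
            score = len(signatures.get(cand, set()) & others_sig)
            if score > best_score:
                best, best_score = cand, score
        if best_score >= threshold:
            result[mention] = best
    return result

# toy BabelNet-like graph around the slide's example sentence
graph = {
    "Thomas_Muller": {"striker", "Munich"},
    "Thomas_Aquinas": {"theology"},
    "Mario_Gomez": {"striker", "Munich"},
    "striker": {"Thomas_Muller", "Mario_Gomez", "Munich"},
    "Munich": {"Thomas_Muller", "Mario_Gomez", "striker"},
    "theology": {"Thomas_Aquinas"},
}
signatures = preprocess(graph)
linked = disambiguate(
    ["Thomas", "Mario"],
    {"Thomas": ["Thomas_Muller", "Thomas_Aquinas"], "Mario": ["Mario_Gomez"]},
    signatures,
)
```

The joint decision making shows up in `others_sig`: a candidate is rewarded for being related to the other mentions' candidates, which is why the footballer reading of "Thomas" beats the theologian.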
  12. 1) Graph -> Concept set
  13. Algorithm
      1. Preprocessing (run once)
          1.1. Input: BabelNet graph (undirected, labeled graph of concepts and entities)
          1.2. Re-weight the graph
          1.3. Random walk with restart
          1.4. Output: set of related concepts and entities
      2. Disambiguation (for each document)
          2.1. Generate candidates
          2.2. Construct referent graph
          2.3. Apply densest-subgraph algorithm
          2.4. Disambiguate with a threshold
  14. 2) Disambiguation: “Thomas and Mario are strikers playing in Munich.”
  15. 2) Disambiguation: graph construction. “Thomas and Mario are strikers playing in Munich.”
  16. 2) Disambiguation: densest subgraph. “Thomas and Mario are strikers playing in Munich.”
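Step 2.3 relies on a densest-subgraph heuristic. A standard greedy 2-approximation (Charikar-style: repeatedly remove the minimum-degree vertex and remember the densest intermediate subgraph) illustrates the idea; Babelfy's own variant differs in its details, so treat this as a sketch:

```python
# Greedy 2-approximation for the densest subgraph on an undirected graph.
# A standard textbook heuristic, shown for illustration; not Babelfy's
# exact variant, which removes the "weakest" vertex by a semantic score.

def densest_subgraph(graph):
    # graph: {vertex: set of neighbours}, undirected
    g = {v: set(nbrs) for v, nbrs in graph.items()}
    edges = sum(len(nbrs) for nbrs in g.values()) / 2
    best, best_density = set(), 0.0
    while g:
        density = edges / len(g)
        if density > best_density:
            best, best_density = set(g), density
        # remove the minimum-degree vertex and its incident edges
        v = min(g, key=lambda u: len(g[u]))
        for u in g[v]:
            g[u].discard(v)
        edges -= len(g[v])
        del g[v]
    return best

# toy referent graph: a triangle (the mutually related candidates)
# plus a loosely attached pair of spurious candidates
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"},
       "d": {"e"}, "e": {"d"}}
core = densest_subgraph(toy)
```

On the toy input the greedy pass peels off the weakly connected pair and returns the triangle, mirroring how the densest subgraph keeps the candidates that support each other.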
  17. Experimental setup: we ran experiments on a university server with 16/32 GB RAM, 1 TB disk space, and a 16-core CPU @ 2.60 GHz
  18. Experiments ● KORE-50 dataset: 50 ‘difficult’ sentences
  19. Execution time: KORE-50
      1. Preprocessing (run once): 3.5 days
          1.1. Input: BabelNet graph (undirected, labeled graph of concepts and entities)
          1.2. Re-weight the graph
          1.3. Random walk with restart
          1.4. Output: set of related concepts and entities
      2. Disambiguation (for each document): 10.3 hours
          2.1. Generate candidates
          2.2. Construct referent graph
          2.3. Apply densest-subgraph algorithm
          2.4. Disambiguate with a threshold
  20. Results: KORE-50
                                                Precision   Recall   F1
      Recognition                               0.82        0.77     0.79
      Disambiguation                            0.41        0.40     0.40
      Disambiguation (reported by Babelfy)      /           /        0.715
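As a sanity check on the table, F1 is the harmonic mean of precision and recall, which reproduces both rows we measured:

```python
# F1 as the harmonic mean of precision and recall
def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

recognition_f1 = round(f1(0.82, 0.77), 2)     # recognition row
disambiguation_f1 = round(f1(0.41, 0.40), 2)  # disambiguation row
```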
  21. Discussion ● significant accuracy mismatch ○ Reported F1-score: 71.5% ○ Our F1-score: 40.0%
  22. Discussion ● significant accuracy mismatch ● algorithm gaps that matter ○ partial matching for entities, exact matching for concepts: but how do we know whether something is an entity? ○ different parameters for EL and WSD ○ parameters tuned per dataset :-( ○ what happens if the linking confidence is too low?
  23. Discussion ● significant accuracy mismatch ● algorithm gaps that matter ● computationally quite heavy ○ For KORE-50: 3.5 days to preprocess data, 17 mins to generate candidates, 10 hours to disambiguate
  24. Discussion ● significant accuracy mismatch ● algorithm gaps that matter ● computationally quite heavy ● OK, our implementation could be improved ○ currently we use Python + MongoDB + up to 32 GB RAM
  25. Conclusions ● Combining knowledge is an attractive idea, but coping with that knowledge is an additional challenge! ● Babelfy is a great effort that deserves an open-source, reproducible implementation ● Given the level of detail in the paper and the computational costs, reproducing it is very difficult
  26. Do we know what we think we know? ● Reproducibility is a defining feature of science ● Is reproducibility appreciated in the real world? ● Scientific claims should gain trust and credence through the replicability of their supporting evidence
  27. Join us! https://github.com/filipdbrsk/BabelfyReimplementation
  28. Appendix
  29. Discussion on computation ● a lot of requests for generating candidates ● a lot of candidates ● big graphs ● computationally super heavy (better implementation needed) ○ For KORE-50: 3.5 days to preprocess data, 17 mins to generate candidates, 10 hours to disambiguate
  30. Discussion on algorithm ● partial vs exact matching: when to use what? ● partial matching for entities, but how do we know if something is an entity? ● different hyperparameters for EL & WSD -> very confusing ● what happens if the confidence is too low? -> unreported ● tuning of the accuracy per dataset, especially on partial vs full matching
  31. Discussion on implementation ● Python is a poor choice for speed ○ GIL (Global Interpreter Lock) → no multithreading ● MongoDB is scalable but not fast enough ● More RAM is needed
  32. The algorithm in a nutshell ● Finding the semantic signature: ○ Synset graph: an edge between any pair of synsets having a relation of any type ○ Structural weighting: edge weight is proportional to the number of directed triangles ○ Semantic signature: random walk with restart
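The random walk with restart can be sketched as a plain power iteration over an unweighted toy graph. This is a simplification for illustration (edge weights, convergence checks, and the pruning of low-probability nodes are omitted), not the Babelfy implementation:

```python
# Random walk with restart (RWR) by power iteration, as a sketch of how
# one synset's semantic signature could be computed. Unweighted toy graph,
# fixed iteration count; a simplified illustration only.

def rwr(graph, start, restart=0.15, iters=100):
    # graph: {node: set of neighbours}; returns visit probabilities
    scores = {v: 0.0 for v in graph}
    scores[start] = 1.0
    for _ in range(iters):
        # with probability `restart`, jump back to the start node;
        # otherwise spread each node's mass evenly over its neighbours
        nxt = {v: (restart if v == start else 0.0) for v in graph}
        for u, nbrs in graph.items():
            if nbrs:
                share = (1 - restart) * scores[u] / len(nbrs)
                for v in nbrs:
                    nxt[v] += share
        scores = nxt
    return scores

# toy synset graph: triangle a-b-c with an extra node d attached to b
toy = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b"}, "d": {"b"}}
signature = rwr(toy, "a")
```

The signature of a synset is then, in essence, the set of nodes with the highest visit probability; close neighbours of the start node score high and remote nodes decay toward zero.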
  33. The algorithm in a nutshell ● Finding candidates: ○ tight integration with the background knowledge ○ every phrase which contains a noun is checked for existence in BabelNet
  34. The algorithm in a nutshell ● Disambiguating
  35. Implementation ● Structural weighting, 1st attempt:
      for u in vertices:
          for v in adjacent vertices of u:
              for w in adjacent vertices of v:
                  is u adjacent to w?
      → O(E^2)
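The slide's pseudocode, made runnable on a toy directed graph (adjacency sets; the function name and toy data are illustrative, not our actual code). Each edge (u, v) is weighted by the number of directed triangles u → v → w → u it participates in:

```python
# First attempt at structural weighting: for every edge (u, v), scan all
# successors w of v and test whether w points back to u. Simple but slow.

def triangle_weights_naive(adj):
    # adj: {u: set of successors} for a directed graph
    # weight[(u, v)] = number of directed triangles u -> v -> w -> u
    weight = {}
    for u in adj:
        for v in adj[u]:
            count = 0
            for w in adj.get(v, ()):
                if u in adj.get(w, ()):  # does w close the triangle?
                    count += 1
            weight[(u, v)] = count
    return weight

# toy graph: one directed triangle a -> b -> c -> a, plus a chord a -> c
toy = {"a": {"b", "c"}, "b": {"c"}, "c": {"a"}}
weights = triangle_weights_naive(toy)
```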
  36. Implementation ● Structural weighting, 2nd attempt:
      reverse all edges
      for u in vertices:
          for v in adjacent vertices of u:
              intersection(adjacent vertices of v, reversed-adjacent vertices of u)
      → O(E)
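The second attempt, made runnable the same way (illustrative sketch, same toy graph as before). Reversing all edges once gives each node's predecessor set, so the triangle count for edge (u, v) becomes a single hash-set intersection instead of an inner loop with membership tests:

```python
# Second attempt: a triangle u -> v -> w -> u exists exactly when w is both
# a successor of v and a predecessor of u, so precompute predecessor sets
# and replace the inner loop with one set intersection per edge.

def triangle_weights_intersect(adj):
    # reverse all edges once: rev[u] = set of predecessors of u
    rev = {u: set() for u in adj}
    for u, succs in adj.items():
        for v in succs:
            rev.setdefault(v, set()).add(u)
    weight = {}
    for u in adj:
        for v in adj[u]:
            # w must follow v (w in adj[v]) and point back to u (w in rev[u])
            weight[(u, v)] = len(adj.get(v, set()) & rev[u])
    return weight

# same toy graph: one directed triangle a -> b -> c -> a, plus a chord a -> c
toy = {"a": {"b", "c"}, "b": {"c"}, "c": {"a"}}
weights = triangle_weights_intersect(toy)
```

With hash sets, each intersection costs time proportional to the smaller of the two neighbour lists, which is what makes this version fast enough in practice.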
