Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Iterative knowledge extraction from social networks. The Web Conference 2018

788 views

Published on

Knowledge in the world continuously evolves, and ontologies are largely incomplete, especially regarding data belonging to the so-called long tail. We propose a method for discovering emerging knowledge by extracting it from social content. Once initialized by domain experts, the method is capable of finding relevant entities by means of a mixed syntactic-semantic method. The method uses seeds, i.e. prototypes of emerging entities provided by experts, for generating candidates; then, it associates candidates to feature vectors built by using terms occurring in their social content and ranks the candidates by using their distance from the centroid of seeds, returning the top candidates. Our method can run iteratively, using the results as new seeds.

In this paper we address the following research questions: (1) How does the reconstructed domain knowledge evolve if the candidates of one extraction are recursively used as seeds (2) How does the reconstructed domain knowledge spread geographically (3) Can the method be used to inspect the past, present, and future of knowledge (4) Can the method be used to find emerging knowledge?.

This work was presented at The Web Conference 2018, MSM workshop.

Published in: Social Media
  • Be the first to comment

  • Be the first to like this

Iterative knowledge extraction from social networks. The Web Conference 2018

  1. 1. Iterative Knowledge Extraction from Social Networks Marco Brambilla, Stefano Ceri, Florian Daniel, Marco Di Giovanni, Andrea Mauri, Giorgia Ramponi The Web Conference, WWW 2018, Lyon, France MSM'2018
  2. 2. Humans aim at formalizing knowledge
  3. 3. Famous Emerging … Famous: entities on WikiPedia or in other knowledge bases
  4. 4. Famous Emerging … Emerging: entities not present in any knowledge bases
  5. 5. Extracting Emerging Knowledge from Social Media Marco Brambilla, et Al. 2017. Extracting Emerging Knowledge from Social Media. In Proceedings of the 26th International Conference on World Wide Web (WWW '17). DOI: https://doi.org/10.1145/3038912.3052697
  6. 6. Extracting Emerging Knowledge from Social Media @Kasparov63 @LevAronian @MagnusCarlsen @FabianoCaruana
  7. 7. Extracting Emerging Knowledge from Social Media ChessPlayer Tournament
  8. 8. Extracting Emerging Knowledge from Social Media
  9. 9. Extracting Emerging Knowledge from Social Media
  10. 10. Extracting Emerging Knowledge from Social Media Feature vector: [𝑡1, 𝑡2, 𝑡3,..,𝑖1, 𝑖2, 𝑖3, . . ] where: • 𝑡𝑖 are DBpedia types • 𝑖𝑖 are DBpedia instances of Expert Types
  11. 11. Extracting Emerging Knowledge from Social Media
  12. 12. Extracting Emerging Knowledge from Social Media
  13. 13. Extracting Emerging Knowledge from Social Media
  14. 14. Extracting Emerging Knowledge from Social Media
  15. 15. Iterative Extraction Process
  16. 16. 1. How does reconstructed domain knowledge evolve with an iterative process? 2. How does the reconstructed domain knowledge spread geographically? 3. Can the method be used to inspect the past, present, and future of knowledge? 4. Can the method be used to find emerging knowledge?
  17. 17. 1. How does reconstructed domain knowledge evolve with an iterative process? Extremely domain dependent
  18. 18. 1. How does reconstructed domain knowledge evolve with an iterative process? Extremely domain dependent Precision remains rather stable
  19. 19. 1. How does reconstructed domain knowledge evolve with an iterative process? Extremely domain dependent Precision sometimes also increases
  20. 20. 2. How does the reconstructed domain knowledge spread geographically? USA ChessPlayers
  21. 21. 2. How does the reconstructed domain knowledge spread geographically? USA ChessPlayers
  22. 22. 2. How does the reconstructed domain knowledge spread geographically? USA ChessPlayers
  23. 23. 2. How does the reconstructed domain knowledge spread geographically? Iteratively found knowledge spans large geographical areas very fast
  24. 24. 3. Can the method be used to inspect the past, present, and future of knowledge? 2016.01 2016.03 2016.06 2016.09 2016.12 27 candidates Fashion designer experiment
  25. 25. 3. Can the method be used to inspect the past, present, and future of knowledge? 2016.01 2016.03 2016.06 2016.09 2016.12 27 candidates 34 new candidates Fashion designer experiment
  26. 26. 3. Can the method be used to inspect the past, present, and future of knowledge? 2016.01 2016.03 2016.06 2016.09 2016.12 27 candidates 34 new candidates 18 new candidates Fashion designer experiment
  27. 27. 3. Can the method be used to inspect the past, present, and future of knowledge? 2016.01 2016.03 2016.06 2016.09 2016.12 27 candidates 34 new candidates 18 new candidates 16 new candidates Fashion designer experiment
  28. 28. 4. Can the method be used to find emerging knowledge? Fashion designers Finance influencers Fiction Writers Chess Players Emergent Famous
  29. 29. Some updates.. Syntactic features: verbs, nouns and proper nouns
  30. 30. Some updates.. Feature vector of user u: [𝑛1, 𝑛2, 𝑛3, . . ] where: • 𝑛𝑖 are nouns/verbs/proper nouns frequencies in user tweets Syntactic features: verbs, nouns and proper nouns
  31. 31. Some updates.. Feature vector of user u: [𝑛1, 𝑛2, 𝑛3, . . ] where: • 𝑛𝑖 are nouns/verbs/proper nouns frequencies in user tweets Syntactic features: verbs, nouns and proper nouns We consider 𝑛𝑖 as the probabilities that u uses the i-th word in his tweets.
  32. 32. Some updates.. Feature vector of user u: [𝑛1, 𝑛2, 𝑛3, . . ] where: • 𝑛𝑖 are nouns/verbs/proper nouns frequencies in user tweets We consider 𝑛𝑖 as the probabilities that u uses the i-th word in his tweets. Syntactic features: verbs, nouns and proper nouns DISTRIBUTION OF WORDS Entropy as metric
  33. 33. Evaluation with 10 seeds, 10 good candidates and 600 random accounts Finance domain
  34. 34. Conclusion • We show the geographic and temporal spreading of entities extracted by the method. • The method grants a good precision even after some iterations in many domains. • Future work includes the semi-automatic building of a richer domain model, by studying other twitter features (such as verbs and nouns which appear in tweet texts).
  35. 35. THANKS! QUESTIONS? Contacts: marco.brambilla@polimi.it http://datascience.deib.polimi.it Twitter: datascience_mi and marcobrambi Iterative Knowledge Extraction from Social Networks Marco Brambilla, Stefano Ceri, Florian Daniel, Marco Di Giovanni, Andrea Mauri, Giorgia Ramponi

×