Iterative Knowledge Extraction from Social Networks
Marco Brambilla, Stefano Ceri, Florian Daniel, Marco Di Giovanni, Andrea Mauri, Giorgia Ramponi
The Web Conference, WWW 2018, Lyon, France
MSM'2018
Humans aim at formalizing knowledge
Famous vs. Emerging
Famous: entities on Wikipedia or in other knowledge bases
Emerging: entities not present in any knowledge base
Extracting Emerging Knowledge from Social Media
Marco Brambilla et al. 2017. Extracting Emerging Knowledge from Social Media.
In Proceedings of the 26th International Conference on World Wide Web (WWW '17).
DOI: https://doi.org/10.1145/3038912.3052697
@Kasparov63
@LevAronian
@MagnusCarlsen
@FabianoCaruana
ChessPlayer
Tournament
Feature vector: [𝑡1, 𝑡2, 𝑡3, …, 𝑖1, 𝑖2, 𝑖3, …] where:
• 𝑡𝑗 are DBpedia types
• 𝑖𝑗 are DBpedia instances of Expert Types
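A minimal sketch of how such a feature vector could be assembled, assuming each account comes with a list of (instance, type) pairs recognized in its tweets and that each component is a simple occurrence count. The slides do not state how the component values are computed, so the counting scheme, the function name, and the input format are all illustrative assumptions.

```python
from collections import Counter

def build_feature_vector(annotations, types, instances):
    """Build [t1, t2, ..., i1, i2, ...] for one account.

    annotations: list of (instance, type) pairs found in the account's
                 tweets (hypothetical input format).
    types:       ordered list of DBpedia types used as features.
    instances:   ordered list of DBpedia instances of expert types.
    Counting occurrences is an assumption, not the authors' stated method.
    """
    type_counts = Counter(t for _, t in annotations)
    instance_counts = Counter(i for i, _ in annotations)
    # Counter returns 0 for keys that never occurred.
    type_part = [type_counts[t] for t in types]
    instance_part = [instance_counts[i] for i in instances]
    return type_part + instance_part

# Usage with made-up annotations for a single account.
annotations = [
    ("MagnusCarlsen", "ChessPlayer"),
    ("FabianoCaruana", "ChessPlayer"),
    ("Lyon", "City"),
]
vector = build_feature_vector(
    annotations,
    types=["ChessPlayer", "Tournament"],
    instances=["MagnusCarlsen", "LevAronian"],
)
```

The first half of the vector covers types and the second half covers instances, mirroring the [𝑡…, 𝑖…] layout on the slide.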
Iterative Extraction Process
1. How does reconstructed domain knowledge evolve with an iterative
process?
2. How does the reconstructed domain knowledge spread
geographically?
3. Can the method be used to inspect the past, present, and future of
knowledge?
4. Can the method be used to find emerging knowledge?
1. How does reconstructed domain knowledge evolve with an iterative
process?
Extremely domain dependent
Precision remains rather stable
Precision sometimes also increases
2. How does the reconstructed domain knowledge spread geographically?
USA ChessPlayers
Iteratively found knowledge spreads over large geographical areas very quickly
3. Can the method be used to inspect the past, present, and future of
knowledge?
Fashion designer experiment
Timeline: 2016.01, 2016.03, 2016.06, 2016.09, 2016.12
Candidates found: 27 candidates, then 34, 18, and 16 new candidates
4. Can the method be used to find emerging knowledge?
[Figure: Emergent vs. Famous entities in four domains: Fashion designers, Finance influencers, Fiction Writers, Chess Players]
Some updates..
Syntactic features: verbs, nouns and proper nouns
Feature vector of user u: [𝑛1, 𝑛2, 𝑛3, …] where:
• 𝑛𝑖 are the frequencies of nouns, verbs, and proper nouns in the user's tweets
We treat 𝑛𝑖 as the probability that u uses the i-th word in their tweets.
Distribution of words: entropy as metric
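A small sketch of the idea, assuming word frequencies are normalized into a probability distribution and scored with Shannon entropy in bits. The slides only name entropy as the metric, so the normalization, the log base, and the function names here are assumptions, and how the score is used to rank accounts is not shown.

```python
import math
from collections import Counter

def word_probabilities(words):
    """Turn a user's words (nouns, verbs, proper nouns) into probabilities."""
    counts = Counter(words)
    total = sum(counts.values())
    # An empty word list yields an empty distribution.
    return {word: count / total for word, count in counts.items()}

def entropy(probabilities):
    """Shannon entropy in bits of a word distribution (base 2 is an assumption)."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)

# Usage with a made-up list of words taken from one user's tweets.
probs = word_probabilities(["stock", "stock", "market", "bond"])
score = entropy(probs)
```

A user who keeps repeating a few words gets a low entropy, while a user whose vocabulary is spread evenly gets a high one.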
Evaluation with 10 seeds, 10 good candidates and 600 random accounts
Finance domain
Conclusion
• We show the geographic and temporal spread of the entities extracted
by the method.
• The method maintains good precision even after several iterations in
many domains.
• Future work includes the semi-automatic construction of a richer domain
model by studying other Twitter features (such as the verbs and nouns
that appear in tweet texts).
THANKS!
QUESTIONS?
Contacts: marco.brambilla@polimi.it
http://datascience.deib.polimi.it
Twitter: datascience_mi and marcobrambi

Editor's Notes

  • #2 Good afternoon
  • #3 It is well known that humans aim at formalizing knowledge; in fact, there are many knowledge bases
  • #6 Now I want to present our last work, which was published at WWW. The aim of this process is to find emerging entities that belong to the same domain of
  • #16 New process: question removed; add an explanation of the iteration here
  • #29 Animate