filannim@cs.man.ac.uk 
School of Computer Science 
presentation 1st AHA! Workshop, COLING 2014 
Dublin, 23/08/2014 
Mining...

Mining temporal footprints from Wikipedia

Discovery of temporal information is key for organising knowledge, and the task of extracting and representing temporal information from texts has therefore received increasing interest. In this paper we focus on the discovery of temporal footprints from encyclopaedic descriptions. Temporal footprints are time-line periods that are associated with the existence of specific concepts. Our approach relies on the extraction of date mentions and the prediction of lower and upper boundaries that define temporal footprints. We report on several experiments on persons' pages from Wikipedia in order to illustrate the feasibility of the proposed methods.


Mining temporal footprints from Wikipedia

  1. Mining temporal footprints from Wikipedia. Michele Filannino, Goran Nenadic. School of Computer Science. filannim@cs.man.ac.uk. Presentation at the 1st AHA! Workshop, COLING 2014, Dublin, 23/08/2014.
  2. introduction ■ Temporal information is crucial for organising structured and unstructured data ■ Several temporal information extraction (TIE) systems are now available ● thanks to the TempEval challenge series
  3. ManTIME URL: http://www.cs.man.ac.uk/~filannim/mantime.html
  4. Test with long text
  5. temporal footprint: A temporal footprint is a continuous period on the time-line that temporally defines the existence of a particular concept. Reference: Immanuel Kant, Paul Guyer, and Allen W. Wood. 1998. Critique of Pure Reason. Cambridge University Press.
  6. problem: Can we predict temporal footprints from encyclopaedic descriptions of concepts? ■ input: textual description of a concept ■ output: prediction of a temporal interval
  7. [Figure: Examples of temporal footprints, on a time axis from 1000 to 2000. Legend: Object, Person, Historical period. Concepts shown: Web, Cellphone, Computer, Car, Richard Feynman, Bicycle, Carl Friedrich Gauss, French revolution, Age of Enlightenment, Galileo Galilei, Leonardo da Vinci, Christopher Columbus, Renaissance, Arming sword, High Middle Ages, Genghis Khan.]
  12. methodology: 1. date mention extraction 2. outlier filtering 3. normal distribution fitting 4. prediction
  13. date mentions extraction [Figure: histogram of date mention frequency (freq, 0.000 to 0.050) against time (in years, 1360 to 1810).]
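
The date mention extraction step can be illustrated with a small regular-expression sketch. This is not the authors' implementation: the pattern, the 1000 to 2099 year range and the function names `extract_years` and `relative_frequencies` are assumptions made for illustration. It collects four-digit year mentions and turns them into the kind of relative-frequency histogram plotted on the slide.

```python
import re
from collections import Counter

# Hypothetical pattern: standalone four-digit years from 1000 to 2099.
YEAR_PATTERN = re.compile(r"\b(1\d{3}|20\d{2})\b")


def extract_years(text):
    """Return every four-digit year mention in the text, in order of appearance."""
    return [int(match) for match in YEAR_PATTERN.findall(text)]


def relative_frequencies(years):
    """Map each distinct year to its share of all extracted mentions."""
    counts = Counter(years)
    total = sum(counts.values())
    return {year: count / total for year, count in sorted(counts.items())}


if __name__ == "__main__":
    sample = (
        "Galileo was born in 1564. He published in 1610 and again in 1632, "
        "and died in 1642. His 1610 observations were made with a telescope."
    )
    years = extract_years(sample)
    print(years)
    print(relative_frequencies(years))
```

A bare year pattern like this also picks up four-digit numbers that are not dates, which is one reason a filtering step is useful afterwards.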
  14. outlier filtering [Figure: the date mention histogram (freq against time in years, 1360 to 1810) with the γ parameter marked.] The gamma parameter controls the boundaries of the outlier region.
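
One way to picture the role of gamma is the sketch below. The slide does not say which statistic defines the outlier region, so the median absolute deviation (MAD) rule used here is an assumption, as are the function name `filter_outliers` and the default value of `gamma`. Years further than `gamma` MADs from the median are discarded.

```python
from statistics import median


def filter_outliers(years, gamma=3.0):
    """Keep the years within gamma median absolute deviations of the median.

    Assumption: a MAD-based rule stands in for whatever statistic the
    original method uses; gamma only scales the width of the kept region.
    """
    if not years:
        return []
    center = median(years)
    mad = median([abs(year - center) for year in years])
    if mad == 0:
        # More than half of the mentions are identical, so the MAD carries
        # no scale information; keep everything rather than drop the rest.
        return list(years)
    return [year for year in years if abs(year - center) <= gamma * mad]


if __name__ == "__main__":
    mentions = [1564, 1581, 1589, 1610, 1632, 1642, 2009]
    print(filter_outliers(mentions, gamma=3.0))
```

A larger `gamma` widens the accepted region and keeps more distant mentions; a smaller one trims the histogram more aggressively.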
  15. normal distribution fitting [Figure: the date mention histogram (freq against time in years, 1360 to 1810) with a fitted Gaussian and the α parameter marked.] The alpha and beta parameters control the size and offset of the Gaussian bell.
  17. normal distribution fitting [Figure: the date mention histogram (freq against time in years, 1360 to 1810) with a fitted Gaussian and the β parameter marked.] The alpha and beta parameters control the size and offset of the Gaussian bell.
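
The fitting and prediction steps can be sketched as follows. The slide only says that alpha and beta control the size and offset of the bell, so the concrete roles given to them here are assumptions: `alpha` scales the interval half-width in standard deviations and `beta` shifts the centre by a number of years. The function name `predict_footprint` is likewise hypothetical.

```python
from statistics import mean, pstdev


def predict_footprint(years, alpha=1.0, beta=0.0):
    """Fit a normal distribution to the years and return (lower, upper) bounds.

    Assumed parameter roles (not confirmed by the slides):
      alpha -- half-width of the interval, in standard deviations
      beta  -- shift of the interval centre, in years
    Requires at least one year.
    """
    mu = mean(years)          # centre of the fitted Gaussian
    sigma = pstdev(years)     # population standard deviation
    center = mu + beta
    half_width = alpha * sigma
    return (round(center - half_width), round(center + half_width))


if __name__ == "__main__":
    mentions = [1564, 1581, 1589, 1610, 1632, 1642]
    print(predict_footprint(mentions, alpha=1.5, beta=0.0))
```

Under these assumptions the two parameters could be tuned on held-out data so that the predicted interval matches known birth and death years as closely as possible.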
  19. error measure [Figure: a gold interval and a prediction interval, with their overlap and union marked; a second version of the slide shows gold, prediction and union only.] Reference: Fatima De Carvalho. 1996. Histogrammes et indices de proximité en analyse données symboliques. Actes de l'école d'été sur l'analyse des données symboliques. LISE-CEREMADE, Université de Paris IX Dauphine, pages 101–127.
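
The slide gives the error measure only as a picture of the overlap and union of the gold and predicted intervals. A minimal sketch, assuming the error is one minus the ratio of overlap to union, is shown below. The formula and the name `interval_error` are assumptions, though this reading reproduces the E values reported on the results slides (0.204 for Galileo Galilei, 0.159 for Robin Williams, 0.92 for Christopher Columbus).

```python
def interval_error(gold, prediction):
    """Return 1 - overlap/union for two (start_year, end_year) intervals.

    0.0 means the intervals coincide; 1.0 means they do not overlap at all.
    The union is taken as the span from the earliest start to the latest end.
    """
    gold_start, gold_end = gold
    pred_start, pred_end = prediction
    overlap = max(0, min(gold_end, pred_end) - max(gold_start, pred_start))
    union = max(gold_end, pred_end) - min(gold_start, pred_start)
    if union == 0:
        # Both intervals are the same single point in time.
        return 0.0
    return 1 - overlap / union


if __name__ == "__main__":
    # Gold and predicted intervals taken from the results slides.
    print(round(interval_error((1564, 1642), (1556, 1654)), 3))
    print(round(interval_error((1951, 2014), (1953, 2006)), 3))
    print(round(interval_error((1451, 1506), (1366, 2057)), 2))
```

For Galileo Galilei the overlap is 78 years and the union 98 years, so the error is 1 - 78/98, about 0.204.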
  21. strategies: A. RegEx B. RegEx + Filtering C. RegEx + Filtering + Gaussian fitting D. HeidelTime + Filtering + Gaussian fitting
  22. evaluation ■ subject: people ■ lived from 1000 AD to 2014 ● text from Wikipedia web pages ● year of birth and death from DBpedia ■ 228,824 people collected ■ simple definition of temporal footprint ● birth and death dates
  23. [Figure: People per textual length. x-axis: #words (0 to 3750); y-axis: #people (0 to 500).]
  24. aggregate results
      Strategy                                   | Mean Distance Error | Standard Deviation
      RegEx                                      | 0.2636              | 0.3409
      RegEx + Filtering                          | 0.2596              | 0.3090
      RegEx + Filtering + Gaussian fitting       | 0.3503              | 0.2430
      HeidelTime + Filtering + Gaussian fitting  | 0.5980              | 0.2470
  25. results [Figure: MDE (0.0 to 1.0) against #words (1112 to 27804) for the four strategies: RegEx; RegEx + Filtering; RegEx + Filtering + Gaussian fitting; HeidelTime + Filtering + Gaussian fitting.]
  27. results ■ Galileo Galilei (1564-1642), prediction: 1556-1654, E: 0.204
  28. results ■ Robin Williams (1951-2014), prediction: 1953-2006, E: 0.159
  29. other types of temporal footprint? ■ Christopher Columbus will die in 2057?! Prediction: 1366-2057 (1451-1506), E: 0.92. AHA!
  33. physical existence vs. social coverage ■ Anne Frank's footprint is shifted into the future
  36. conclusions ■ How does the methodology behave on different languages? And on different sources? ■ oracle-like side-effect behaviour: • Apple Inc. will be closed down this year • Stanford University will be closed down in 2029 ■ Future work • mixture of normal distributions
  37. Thank you.
  38. Questions? Contact: filannim@cs.man.ac.uk. Visit: tinyurl.com/temporal-footprints
