From Events to Stories: Different ways of structuring the same bag of events over time


Talk given at Soeterbeeck eHumanities Workshop 14 June 2013: Describing event and storyline modelling for Semantics of History, BiographyNet and NewsReader

Published in: Technology, Education
  1. 1. FROM EVENTS TO STORIES: Different ways of structuring the same bag of events over time. Marieke van Erp, VU University Amsterdam
  2. 2. • Events are things that happen at a certain place and time
        • Events are a core building block of many information sources
        • Events are at the heart of many eHumanities research domains
  3. 3. • Events are multidimensional objects
        • Exact event boundaries and elements are difficult to define
        • Sources reporting on events may not have complete information or may promote their own view
  4. 4. • Semantics of History explores the temporal dimension of events
        • BiographyNet explores relations between people and events as well as changes in perspective on these events
        • NewsReader builds upon Semantics of History and BiographyNet and scales them up
  5. 5. • One elderly British gentleman was walking around in a state of shock. His wife had been swimming when the waves struck. (BBC News, Sri Lanka, 26 December 2004)
        • More than 300 people have died and 3,500 were injured after the massive sea surges caused by an earthquake smashed into Thailand's western coast. (BBC News, 27 December 2004)
        • The UK government is to give at least £15m to help the victims of the Asian earthquake which is thought to have killed nearly 60,000 people. (BBC News, 28 December 2004)
        • A huge quake off western Indonesia on 26 December 2004 caused a massive tsunami that killed around 230,000 people around the region. (BBC News, 4 January 2009)
  6. 6. [Diagram: generalisation level of the event dimensions (Event, Location, Participants, Time) against temporal perspective]
        • News articles: low level events, small areas, individuals, short periods of time
        • Historical texts: high level events, bigger areas, group agents, longer periods of time
  7. 7. BiographyNet
  8. 8. BiographyNet
  9. 9. BiographyNet
  10. 10. BiographyNet
  11. 11. NewsReader: Zooming, Linking and Scaling up
        [Diagram labels, in the order they appear on the slide]
        • Total daily stream of documents
        • Archives of decades of news reports
        • Daily document intake of an individual decision maker: 50–3,000
        • ±2,000,000 sources
        • ±25,000,000,000 documents: news, company reports, manager biographies
        • Unknown volume: events, sources and background data consulted
        Challenges:
        • Volumes beyond result list paradigm
        • Duplications, repetitions: new/old
        • Inconsistent and contradictory
        • Coloured and opinionated
        • Incomplete, piece-meal
        • Unauthorised
  12. 12. • the 7.7 magnitude quake (source: Xinhuanet)
        • two quakes, measuring 7.6 and 7.4 (source: Bloomberg)
        • One 7.3-magnitude tremor (source: Jakartapost)
  13. 13. • To link current to previous information, different ways of describing and registering events need to be interconnected
        • To allow reasoning, domain knowledge needs to be captured
        • To provide different perspectives on the same news story, the source of each piece of information needs to be tracked
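
The third requirement can be illustrated with the conflicting magnitude reports from slide 12. This is a minimal sketch, not part of NewsReader: the `Claim` record, the `perspectives` function and the event identifier `"quake"` are hypothetical names introduced here. Treating all three reports as statements about a single event is also a simplifying assumption, since Bloomberg speaks of two quakes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One reported value, stored together with the source that reported it (hypothetical record)."""
    event: str    # identifier of the event the claim is about (made up for this sketch)
    prop: str     # which property of the event is described
    value: float
    source: str


def perspectives(claims, event, prop):
    """Group the reported values for one event property by source."""
    by_source = defaultdict(list)
    for claim in claims:
        if claim.event == event and claim.prop == prop:
            by_source[claim.source].append(claim.value)
    return dict(by_source)


# Magnitudes as reported on slide 12.
claims = [
    Claim("quake", "magnitude", 7.7, "Xinhuanet"),
    Claim("quake", "magnitude", 7.6, "Bloomberg"),
    Claim("quake", "magnitude", 7.4, "Bloomberg"),
    Claim("quake", "magnitude", 7.3, "Jakartapost"),
]

views = perspectives(claims, "quake", "magnitude")
```

Because no value is merged or overwritten, each source's view of the event can still be shown side by side.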
  14. 14. Grounded Annotation Framework (GAF)
        • Keep event mentions separate from event instances
        • Linguistic information captured in separate layer from semantic information
        • Semantic layer can also import non-linguistic information, e.g. coming from sensors
        • Provenance is captured through PROV-O
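
The separation of mentions from instances can be sketched as a small data structure. This is an illustration under stated assumptions, not the GAF implementation: the class names, the identifiers `ev_tsunami_2004`, `m1` and `m2`, and the plain `source` string (standing in for a full PROV-O record) are hypothetical. The `denoted_by` field mirrors the `gaf:denotedBy` property, and the mention texts are taken from the BBC News excerpts on slide 5.

```python
from dataclasses import dataclass, field


@dataclass
class Mention:
    """Linguistic layer: a span of text in one source document."""
    mention_id: str
    text: str
    source: str  # simplified provenance; GAF itself records this through PROV-O


@dataclass
class EventInstance:
    """Semantic layer: one event in the world, independent of any wording."""
    instance_id: str
    event_type: str
    denoted_by: list = field(default_factory=list)  # mirrors gaf:denotedBy

    def sources(self):
        """Return the distinct sources whose mentions denote this instance, sorted."""
        return sorted({mention.source for mention in self.denoted_by})


# One instance, two differently worded mentions from two reports.
tsunami = EventInstance("ev_tsunami_2004", "tsunami")
tsunami.denoted_by.append(
    Mention("m1", "a massive tsunami", "BBC News, 4 January 2009"))
tsunami.denoted_by.append(
    Mention("m2", "the massive sea surges", "BBC News, 27 December 2004"))
```

Keeping the wording on the mention and the event type on the instance means a new report adds a mention without changing the instance it refers to.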
  15. 15. [Timeline diagram, 2004 to 2013, relating "changes in the world" to "publication of sources". Labels: ANNOTATION (NAF, TAF); SEM-EVENT TEMBLOR; SEM-EVENT TSUNAMI; SEM-EVENT USS Jimmy Carter energy weapon; sensor data; direct event report; delayed event report; future event report; tsunami alert system; future tsunami.]
        Quoted sources:
        • "The catastrophe four years ago devastated Indian Ocean community and killed more than 230,000 people, over 170,000 of them in Aceh at northern tip of Sumatra Island of Indonesia."
        • "..., the vessel is the party responsible for the 2004 Indian Ocean tsunami that killed 230,000 people. Apparently, the submarine was able to trigger seismic activity via some kind of directed energy weapon."
  16. 16. [RDF graph: GAF representation of the earthquake and tsunami example]
        • Event instances: naacl:INSTANCE_179, naacl:INSTANCE_181, naacl:INSTANCE_186, naacl:INSTANCE_188, naacl:INSTANCE_197, naacl:INSTANCE_200, naacl:INSTANCE_201, naacl:INSTANCE_202
        • Mentions: naacl:INSTANCE_MENTION_40, naacl:INSTANCE_MENTION_112, naacl:INSTANCE_MENTION_118, naacl:INSTANCE_MENTION_120, with anchors "plates"@en, "shift"@en, "earthquakes"@en, "temblor"@en, "tsunami"@en
        • Classes and types: sem:Event, sem:EventType, sem:Place, owl:objectProperty, wn30:synset-tsunami-noun-1, wn30:synset-earthquake-noun-1, wn30:synset-shift-verb-4, wn30:synset-stable-adjective-1
        • Linked resources: dbpedia:2004_Indian_Ocean_earthquake_and_tsunami, dbpedia:Tectonic_Plate, dbpedia:Sundra_Trunch, dbpedia:USS_Jimmy_Carter_(SSN_23), dbpedia:Veterans_Today, dbpedia:Bloomberg
        • Properties: rdf:type, rdfs:isDefinedBy, skos:exactMatch, sem:subEventOf, sem:hasActor, sem:hasLocation, sem:AccordingTo, sem+:causes, gaf:causes, gaf:denotedBy, str:anchorOf, taf:hasParticipant_nsubj, prov:wasGeneratedBy
        • Annotation relations: colorado:Set_Subset, colorado:cause_effect, taf:causal, with identifiers ending in @e@workshop37_1 (182, 184, c187, 190)
        • Named graphs and provenance: gaf:G2, gaf:G3, gaf:G4, gaf:G5, taf:annotation_2013_03_24, colorado:annotation_2013_03_12
  17. 17. [Diagram: four ways of relating event instances, labelled Topic, Topological, Conceptual and Biographical. Each pattern connects instances of sem:Event, sem:EventType, sem:Actor and sem:Place through rdf:type, sem:eventType, sem:hasPlace and sem:hasActor.]
  19. 19. • Events can be processed and presented in a myriad of ways → interdisciplinary problem
        • To preserve context, perspective and provenance need to be presented → recognised in both humanities and computer science
        • A representation framework needs to separate mentions from instances → GAF is a first step
  NewsReader is funded by the European Union's 7th Framework Programme (ICT-316404). BiographyNet is funded by the Netherlands eScience Center. Partners in BiographyNet are Huygens/ING Institute of the Dutch Academy of Sciences and VU University Amsterdam. Semantics of History is funded by the Network Institute.