The NewsReader Project
• Process financial news in four European languages (English, Dutch, Spanish,
Italian). Extract what happened to whom, when, and where. Align the extracted
information, storing its provenance and discarding nothing. Distinguish unfolding
story lines. Support financial decision-making by explaining current events.
• Consortium
  • VU University Amsterdam
  • The University of the Basque Country
  • Fondazione Bruno Kessler
  • LexisNexis
  • ScraperWiki
• Funded by EU FP7 programme, grant 316404
• Jan. 2013 - Dec. 2015
the problem
• you will not notice the significance of an event
if you don’t know the story behind it
• the story is fragmented and parts are told in different ways
• “How does the process of gathering understanding work?”
example
the Hua Feng enters the harbor and nobody cares
Hua Feng
This research was funded by COMMIT/Metis (NL) and COMBINE (USA ONRG).
www.tradewindsnews.com
en.wikipedia.org
www.trafigura.com
“Our item headlined Success for the Guardian
(26 April, page 2) erroneously linked the
dumping of toxic waste in Ivory Coast from a
vessel chartered by Trafigura with the deaths
of a number of West Africans...”
– Guardian
the task
discover the events that preceded and possibly led to a current event
and summarize these for a human user
• investigate participating actors, places, moments in time
and a limited number of relations between these
• follow all leads in parallel
• forage for information on the World Wide Web and the Semantic Web
• store the connections in an RDF graph, keeping precise track of the
provenance of the connections
• summarize and present the results as a story line
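The idea of storing connections together with their provenance can be sketched as follows. This is a minimal illustration in plain Python, not the project's actual RDF store: the class `ProvenanceGraph`, the `ex:` identifiers, and the `doc:` source labels are all hypothetical. Each statement is a subject-predicate-object triple, and every source that asserts it is remembered, so nothing is discarded when two sources say the same thing.

```python
from collections import defaultdict
from typing import NamedTuple


class Statement(NamedTuple):
    """A subject-predicate-object triple, as in RDF."""
    subject: str
    predicate: str
    obj: str


class ProvenanceGraph:
    """Toy triple store (hypothetical) that records who asserted what."""

    def __init__(self):
        # Maps each statement to the set of sources that assert it.
        self._sources = defaultdict(set)

    def add(self, subject, predicate, obj, source):
        """Record that `source` asserts the given triple."""
        self._sources[Statement(subject, predicate, obj)].add(source)

    def sources(self, subject, predicate, obj):
        """Return a copy of the set of sources asserting the triple."""
        return set(self._sources.get(Statement(subject, predicate, obj), ()))

    def about(self, subject):
        """Return all statements whose subject is `subject`."""
        return [s for s in self._sources if s.subject == subject]


graph = ProvenanceGraph()
# Two hypothetical documents report the same place for the same event.
graph.add("ex:event1", "ex:hasPlace", "ex:placeP", source="doc:A")
graph.add("ex:event1", "ex:hasPlace", "ex:placeP", source="doc:B")
# Only one document names an actor.
graph.add("ex:event1", "ex:hasActor", "ex:actorQ", source="doc:A")
```

Asking `graph.sources("ex:event1", "ex:hasPlace", "ex:placeP")` returns both documents, which is what lets a later summary cite every source behind a claim. In a real RDF setting the same effect is usually achieved with named graphs or quads.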
a bit more detail
• gather textual information from news on the Web,
or knowledge from the Semantic Web, that relates to the current event
(Information Retrieval / Databases)
• in the case of news items, extract (partial) event descriptions from the text
(Natural Language Processing)
• generalize event descriptions to a desirable level of abstraction
(Logical/Spatiotemporal Reasoning)
• construct a story line from a selection of the event descriptions
(Summarization)
• visualize the story line (Visualization)
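The steps above can be read as a pipeline. The sketch below is a toy version under strong assumptions: the news items are hypothetical and already annotated, so `extract` merely copies fields where a real system would run NLP, `generalize` uses a hand-written place lookup instead of spatiotemporal reasoning, and `render` prints text instead of drawing a visualization. All names and data are invented for illustration.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Event:
    what: str
    who: str
    where: str
    when: date
    source: str  # provenance: the document the event came from


# Hypothetical, pre-annotated news items (stand-in for real Web retrieval).
NEWS = [
    {"text": "Ship S enters harbor B.", "what": "entered port",
     "who": "ship S", "where": "harbor B", "when": "2013-03-02", "source": "doc:2"},
    {"text": "Company C charters ship S.", "what": "was chartered",
     "who": "ship S", "where": "harbor A", "when": "2013-01-15", "source": "doc:1"},
    {"text": "Unrelated market report.", "what": "rose",
     "who": "index I", "where": "city Z", "when": "2013-02-01", "source": "doc:3"},
]

# Hypothetical place hierarchy used to lift events to a coarser level.
BROADER = {"harbor A": "country X", "harbor B": "country Y"}


def gather(items, keyword):
    """Retrieval: keep items whose text mentions the keyword."""
    return [i for i in items if keyword.lower() in i["text"].lower()]


def extract(item):
    """Extraction stand-in: copy the pre-annotated fields into an Event."""
    return Event(item["what"], item["who"], item["where"],
                 date.fromisoformat(item["when"]), item["source"])


def generalize(event):
    """Abstraction: replace a place by its broader region when one is known."""
    return replace(event, where=BROADER.get(event.where, event.where))


def story_line(events, limit=3):
    """Summarization: the `limit` most recent events, in chronological order."""
    return sorted(events, key=lambda e: e.when)[-limit:]


def render(events):
    """Presentation: one line per event, with its source."""
    return "\n".join(
        f"{e.when.isoformat()}: {e.who} {e.what} in {e.where} [{e.source}]"
        for e in events
    )


line = story_line([generalize(extract(i)) for i in gather(NEWS, "ship S")])
print(render(line))
```

Keeping `source` on every `Event` means the final story line can still point back to the documents it was built from.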
Will this provide a more efficient way to structure information and news?