Natural language processing techniques are incorporated into the indexing of open datasets to characterize them more fully and to link them with news published in digital media. Given the text of a news item, the system suggests to its author datasets published on datos.gob.es, based on three aspects:
- space-time proximity: datasets whose content covers the location and/or time in which the news is set.
- similarity: datasets that deal with topics similar to those discussed in the news.
- reference: datasets that mention people, places or organizations named in the news. In this case, the entities (people, places, organizations) extracted from a dataset's content will also be added to that dataset as new metadata.
Ideally, the weight given to each of these three lines of recommendation will depend on the content of the news item.
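One way to picture how the three aspects could be combined is a weighted sum of per-aspect scores. The sketch below is only an illustration under stated assumptions: the `Dataset` and `NewsItem` classes, the `score` and `recommend` functions, and the use of Jaccard set overlap for every aspect are all hypothetical choices made here, not the system's actual method. It also assumes that keywords, places, years and entities have already been extracted from the news text and from each dataset.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    # Hypothetical, simplified view of a catalogue entry.
    title: str
    keywords: set = field(default_factory=set)
    places: set = field(default_factory=set)
    years: set = field(default_factory=set)
    entities: set = field(default_factory=set)


@dataclass
class NewsItem:
    # Hypothetical: features assumed to be extracted from the news text.
    keywords: set
    places: set
    years: set
    entities: set


def jaccard(a, b):
    """Overlap of two sets in [0, 1]; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score(news, ds, weights):
    """Weighted sum of the three aspect scores (assumed scoring scheme)."""
    parts = {
        # Space-time proximity: average of place overlap and year overlap.
        "space_time": (jaccard(news.places, ds.places)
                       + jaccard(news.years, ds.years)) / 2,
        # Similarity: overlap of topic keywords.
        "similarity": jaccard(news.keywords, ds.keywords),
        # Reference: overlap of named entities.
        "reference": jaccard(news.entities, ds.entities),
    }
    return sum(weights[name] * value for name, value in parts.items())


def recommend(news, datasets, weights, top_n=3):
    """Return the top_n datasets, highest combined score first."""
    ranked = sorted(datasets, key=lambda ds: score(news, ds, weights),
                    reverse=True)
    return ranked[:top_n]
```

In this sketch the `weights` dictionary (keys `space_time`, `similarity`, `reference`) is supplied by the caller, so making the weights depend on the content of the news would amount to computing that dictionary per news item before calling `recommend`.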