
Links between datasets are an essential ingredient of Linked Open Data. Since the manual creation of links is expensive at large scale, link sets are often created using heuristics, which may lead to errors. In this paper, we propose an unsupervised approach for finding erroneous links. We represent each link as a feature vector in a higher-dimensional vector space, and find wrong links by means of different multidimensional outlier detection methods. We show how the approach can be implemented in the RapidMiner platform using only off-the-shelf components, and present a first evaluation with real-world datasets from the Linked Open Data cloud showing promising results, with an F-measure of up to 0.54, and an area under the ROC curve of up to 0.86.
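The general idea of scoring links as outliers in a feature space can be illustrated with a minimal sketch. It is not the RapidMiner setup described above. It assumes hypothetical binary features per link and swaps in a simple k-nearest-neighbour distance score as one possible outlier detection method; the link names, feature values, and the function `knn_outlier_scores` are all invented for illustration.

```python
import math


def knn_outlier_scores(vectors, k=2):
    """Score each vector by its mean Euclidean distance to its k nearest neighbours.

    A higher score means the vector lies further from the others and is
    therefore a stronger outlier candidate.
    """
    scores = []
    for i, v in enumerate(vectors):
        # Distances from v to every other vector, smallest first.
        dists = sorted(math.dist(v, w) for j, w in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Hypothetical toy data: each link is described by four binary features
# (assumed here, for example, to encode properties of the two linked resources).
links = {
    "a1 sameAs b1": [1, 0, 1, 0],
    "a2 sameAs b2": [1, 0, 1, 0],
    "a3 sameAs b3": [1, 0, 1, 1],
    "a4 sameAs b4": [1, 0, 1, 0],
    "a5 sameAs b5": [0, 1, 1, 0],  # deliberately unlike the other links
}

names = list(links)
scores = knn_outlier_scores([links[n] for n in names], k=2)

# Rank links from most to least suspicious.
ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
for name, score in ranked:
    print(f"{score:.3f}  {name}")
```

In this toy data the link whose feature vector differs most from the rest is ranked first. In practice a threshold on the score, or a fixed number of top-ranked links, would decide which links are flagged for review.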