Human Activities as Linked Data


Pecha Kucha presentation of our paper "Integrating Know-How into the Linked Data Cloud" at the EKAW 2014 conference (28th of November 2014, Linköping, Sweden).

Project website:

Conference website:

* special thanks to Marco Malebolgie for the artwork!

Published in: Science

  1. 1. Integrating Know-How in the Linked Data Cloud Paolo Pareti, Benoit Testu, Ryutaro Ichise, Ewan Klein and Adam Barker “As we all know, a large number of facts are available on the Web. But what about human activities, or know-how? The goal of this talk is to tell you how this kind of knowledge can be made machine understandable and available on the Web.”
  2. 2. Human activities (or know-how) 1. can be represented as Linked Data 2. can be automatically extracted 3. can be automatically interlinked 4. experiment: extracted a large Linked Data dataset 5. evaluation: our system outperforms humans “In particular, the presentation will focus on those five points.”
  3. 3. “If we ask an intelligent system the question ‘What is the population of the capital of New Zealand?’, we would now expect it to answer correctly by accessing knowledge bases available on the Web. But what happens if we ask a seemingly easier question: ‘What do you need to wash your hands?’ In this case, the system would not be able to answer.”
  4. 4. ??? “This is because, to answer this question, the intelligent system would need some understanding of what an activity is, and perhaps what its requirements are. This knowledge, however, is not currently available in existing knowledge bases.”
  5. 5. Why Know-How? “But actually know-how is very useful and has a lot of applications. Know-how is relevant in almost all domains, and it can be common sense know-how available on the Web, or maybe internal know-how of specific organizations, such as standard operating procedures. This knowledge also has applications in fields such as question answering, recommender systems and activity recognition.”
  6. 6. “Human know-how is on the Web, but why is it not accessible? First of all, this knowledge is usually represented in unstructured resources. We can think for example of step-by-step instructions, which are typically represented as text in natural language, or maybe as pictures and videos.”
  7. 7. ? ? ? “But the most serious limitation is that a single document contains only limited information. What happens if we (or a machine) do not understand how to do a specific step, or what a particular ingredient is? In fact, humans often consult multiple resources at the same time to complete a complex task.”
  8. 8. Data “The first step for making know-how machine understandable is by using a structured representation. We can identify several entities in a process, such as steps, methods, requirements and outputs. We can link those entities with each other, depending on which relation exists between them.”
  9. 9. Linked Data “To solve the problem of the isolation of single resources, we have adopted a Linked Data representation. In this way, humans and machines can discover related resources when they are interested in more information about a specific entity. It is important to notice that these are not just links between documents, but between specific entities contained in these documents.”
  10. 10. “Our simple Linked Data representation of know-how is a point of contact between humans and machines. From the human perspective, know-how as Linked Data is a way to manage and find relevant resources which are human understandable. From the machine perspective, this data can be easily used for analysis, inferencing, and it can be extended to more complex representations where required.”
  11. 11. “So all of this is not just an idea. It is actually possible and we have run experiments and evaluated our results.”
  12. 12. “What do we want to achieve exactly, when we talk about machine-understandable activities? While it is true that we want to have a knowledge representation more powerful than simple text in a document, we cannot yet aim to have machines capable of automating all human activities. Therefore we need to start by reaching a first significant but realistic goal.”
  13. 13. “We show the usefulness of this system in a real application. A task currently done by humans is the interlinking of related know-how resources. In particular, the WikiHow community is actively creating this kind of link; for example, between a step of a process and another set of instructions that explains how to do it.”
  14. 14. How to Make a Pancake. Steps: 1. Prepare the mix 2. Pour the mix in a hot pan 3. Cook until golden. Graph: Make a Pancake has_step Prepare the mix; Make a Pancake has_step Pour the mix in a hot pan; Make a Pancake has_step Cook until golden. “This is a simplified example (e.g. missing the relations to specify the order of the steps) of how our system generates a Linked Data representation of a Web document. This can be done in many ways, but when the original document has some degree of structure, this knowledge extraction can be done easily and accurately.”
  15. 15. How to Make a Pancake. Steps: 1. Prepare the mix 2. Pour the mix in a hot pan 3. Cook until golden. Requirements: Eggs, Milk, Flour. Graph: Make a Pancake has_step Prepare the mix; Make a Pancake has_step Pour the mix in a hot pan; Make a Pancake has_step Cook until golden; Make a Pancake requires Eggs; Make a Pancake requires Milk; Make a Pancake requires Flour. “On the Web, most of these resources have some degree of structure. This is because a well-structured set of instructions is better understood by humans, even before machines. This structure usually takes the form of a simple enumeration of steps, methods and requirements.”
  16. 16. > 200,000 procedures > 2,600,000 entities “WikiHow and Snapguide are two large repositories that contain well-organized know-how. We have extracted the knowledge from these websites and obtained a large dataset of over 200,000 procedures decomposed into over 2,600,000 entities. This can be seen as a large-scale extraction of know-how from the Web and conversion to Linked Data.”
  17. 17. How to Install an Operating System create a partition How to Create a Partition “In order to interlink the extracted entities, we have created a system to automatically discover two kinds of links. The first kind is a functional link between a step and another set of instructions that explains how this step can be done.”
  18. 18. DBpedia Guacamole How to Make Guacamole How to Serve Nachos “The second kind of link we discovered is similar to an Input/Output link between two processes. Instead of representing it directly, we represent this link implicitly through the types of the inputs and outputs of processes. In this example, we can infer that there is an Input/Output relation between the two processes, as one requires the object ‘Guacamole’ while the other outputs it.”
  19. 19. Evaluation + 16% precision + ×2 number of links + ×2 coverage + automatic + semantic links “Finally, we evaluated the links extracted by our system against the links generated manually by the WikiHow community. The result was a significant improvement. Our system identified links of better quality, greater in number, and better spread across all resources. All of this on top of being a completely automatic system which creates semantic Linked Data links, more expressive than simple HTML links.”
  20. 20. Know-How as Linked Data? …a dream come true! ● Generated a large dataset of > 200,000 human activities as Linked Data ● Integrated in the Linked Data Cloud ● Outperformed the human baseline “In conclusion, we have seen how know-how can become a useful new resource on the Linked Data Cloud. Our system automated the extraction and the integration of this knowledge on a large scale. Please visit this website if you are interested in this dataset or information about the project. This website also contains a link to an online visualization tool to explore the dataset.”
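
The representation on slides 14 and 15 and the two kinds of links on slides 17 and 18 can be illustrated with a minimal sketch. This is not the authors' system: it uses plain Python tuples in place of real RDF, and exact label matching in place of whatever matching and DBpedia typing the actual pipeline performs. The `has_step` and `requires` relation names come from the slides; `outputs`, `has_method`, and all function names are hypothetical names chosen for this illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def extract_procedure(title, steps, requirements=(), outputs=()):
    """Turn one semi-structured how-to page into triples."""
    triples = [Triple(title, "has_step", step) for step in steps]
    triples += [Triple(title, "requires", item) for item in requirements]
    # "outputs" is an assumed relation name, not taken from the slides.
    triples += [Triple(title, "outputs", item) for item in outputs]
    return triples


def normalize(label):
    """Lower-case a label and drop a leading 'how to'."""
    words = label.lower().split()
    if words[:2] == ["how", "to"]:
        words = words[2:]
    return " ".join(words)


def functional_links(triples, titles):
    """Link a step to a procedure whose normalized title equals the step label.

    Exact matching is a stand-in; "has_method" is a hypothetical relation name.
    """
    by_label = {normalize(title): title for title in titles}
    return [
        Triple(t.obj, "has_method", by_label[normalize(t.obj)])
        for t in triples
        if t.predicate == "has_step" and normalize(t.obj) in by_label
    ]


def input_output_links(triples):
    """Pair each procedure that outputs an entity with those that require it."""
    producers = {}
    for t in triples:
        if t.predicate == "outputs":
            producers.setdefault(t.obj, []).append(t.subject)
    return [
        (producer, t.subject, t.obj)
        for t in triples
        if t.predicate == "requires"
        for producer in producers.get(t.obj, [])
    ]


# Toy data taken from the examples on the slides.
titles = [
    "How to Install an Operating System",
    "How to Create a Partition",
    "How to Make Guacamole",
    "How to Serve Nachos",
]
triples = (
    extract_procedure(
        "Make a Pancake",
        ["Prepare the mix", "Pour the mix in a hot pan", "Cook until golden"],
        requirements=["Eggs", "Milk", "Flour"],
    )
    + extract_procedure("How to Install an Operating System", ["create a partition"])
    + extract_procedure("How to Make Guacamole", [], outputs=["Guacamole"])
    + extract_procedure("How to Serve Nachos", [], requirements=["Guacamole"])
)

print(functional_links(triples, titles))
print(input_output_links(triples))
```

In this sketch the Input/Output relation is never stored as a triple; it is derived on demand from matching `outputs` and `requires` objects, which mirrors the implicit representation described on slide 18. A plain string such as "Guacamole" stands in for the shared DBpedia type.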