Extraction of Adaptation Knowledge from Internet Communities

FGWM’09 Fachgruppentreffen Wissensmanagement at LWA09,
TU Darmstadt, 2009-09-22

Extraction of Adaptation Knowledge
from Internet Communities *

Norman Ihle, Alexandre Hanft, and Klaus-Dieter Althoff

University of Hildesheim
Institute for Computer Science
Intelligent Information Systems Lab

[lastname]@iis.uni-hildesheim.de

* This
is an extended version of the paper presented at the Workshop ”WebCBR:
Reasoning from Experiences on the Web” at ICCBR’09

FGWM @ LWA‘2009 | 2009-09-22 2

Outline
Motivation
CookIIS
CommunityCook
A system for model-based knowledge extraction from
Internet-Communities
Evaluation
Conclusion & Outlook

Extraction of Adaptation Knowledge from Internet Communities

FGWM @ LWA‘2009 | 2009-09-22 3

Motivation
Adaptation is the „Reasoning“ in CBR [Kolodner 1997]
Most CBR-Systems avoid adaptation [Schmidt et al. 2003;
Minor 2006]
Adaptation Knowledge Acquisition (AKA) is cost intensive and
time consuming
− Experts hardly available
− Small number of research papers and systems
− Most systems focus on the case-base as source of knowledge
The Internet is a large source of knowledge, especially user-
generated content


FGWM @ LWA‘2009 | 2009-09-22 4

The Cooking Domain
The cooking domain is well suited for adaptation, because:
The context can be described easily:
1. all ingredients can be listed with exact amount and quality
2. ingredients can be obtained in standardized quantity and in
comparable quality
3. kitchen machines and tools are available in a standardized manner
4. (in case of a failure) the preparation of a meal can start over every
time again from the same initial situation
(except that we have more experience in cooking after each failure)
Cooking is about creativity and variation


FGWM @ LWA‘2009 | 2009-09-22 5

FGWM @ LWA‘2009 | 2009-09-22 6

CookIIS
System for the retrieval and adaptation of recipes in the
cooking domain http://cookiis.iis.uni-hildesheim.de
Competes in the ComputerCookingContest
Given recipes
Different tasks and requirements
− Identification of negations, type of meal and origin of the dish
− Handling of certain diets
− Creation of a three course menu
Developed using the empolis:Information Access Suite
(e:IAS)


FGWM @ LWA‘2009 | 2009-09-22 7

CookIIS knowledge model
Most important component: modelled ingredients
11 different classes, about 1000 concepts
Modelled in English and German with synonyms
Concepts organised in taxonomies
Combined similarity
Other components: tools, origins, methods, etc.
Overall about 2000 modelled concepts
Rules for the recognition of the origin of the dish
Rules for the recognition of the type of meal


FGWM @ LWA‘2009 | 2009-09-22 8

Adaptation in CookIIS

Model-based approach:
Replace unwanted ingredients with similar ones
Similarity is mainly based on taxonomies and using a set-function
offered by e:IAS Rule Engine:
− Parent and Child concepts are retrieved as well as sibling concepts
− Too many similar ingredients are retrieved
In many cases the approach is not appropriate


FGWM @ LWA‘2009 | 2009-09-22 9

Cooking Communities
A number of Internet communities deal with cooking
knowledge
Users upload recipes and discuss them inside comments
They express affirmation, critics and what they changed for
their own variation of the recipe (their personal adaptation)
If they vary the recipe, they name ingredients
Idea: using the CookIIS knowledge model to extract those
ingredients


FGWM @ LWA‘2009 | 2009-09-22 10

CommunityCook: Classification Idea

Comments can be classified according to extracted ingredients into
three categories:
NEW: all ingredients that are discussed, but are not part of the recipe
Add some ingredient
OLD: all ingredients that are discussed and are part of the recipe
“more”/ “less” of an ingredient, explanation for an ingredient
OLDANDNEW: some ingredients that are discussed are part of the
recipe and others not
Replacement of ingredients == adaptation
Specialisation (“for cheese I took parmesan”)
Latter category can be an adaptation suggestion, especially if
ingredients are of the same class of the knowledge model


FGWM @ LWA‘2009 | 2009-09-22 11

CommunityCook: Crawling
Crawling of a large German cooking community:
About 76.000 recipes with 286.000 related comments
HTML source code
Extraction of necessary data by building filters with the
help of an open source tool (HTMLParser)
Recipe title
Single ingredients (amount, measurement, name)
Comments
Statistics
Saved into a database


FGWM @ LWA‘2009 | 2009-09-22 12

CommunityCook: Text Mining case bases
Configuring e:IAS with two case-bases: recipe, comment
Cases representation is based on the modelled ingredients
of the CookIIS knowledge model
Use e:IAS TextMiner to fill cases with concepts from text


FGWM @ LWA‘2009 | 2009-09-22 13

CommunityCook: perform Classification
In the next step we retrieved one recipe and all comments
relating to that recipe
Each comment classified into on of the three categories
Additionally we tried to find phrases in the text that support
and specify the classification
assigned a score to determine the confidence of the
classification
If a pair of ingredients of the same class is found we also
analyse if one concept is the parent concept of the other
no adaptation, but specialisation
About 35.000 comments classified as OLDANDNEW,
16.000 with the subcategory “adaptation”


FGWM @ LWA‘2009 | 2009-09-22 14

CommunityCook: Aggregation
One way:
Aggregation of all classified comments belonging to the
recipe
We counted the number of same classifications per recipe and
aggregated the score by calculating the average and assigning a
bonus for every classification
Second way:
also aggregated all classified ingredients without regarding
the recipe (statistical)


FGWM @ LWA‘2009 | 2009-09-22 15

CommunityCook: Realization
Transformation of data:
Building of adaptation suggestions in database-rows to easily
retrieve those
With regard to the recipe and without (“independent”)


FGWM @ LWA‘2009 | 2009-09-22 16

CommunityCook: Integration into CookIIS
6200 different adaptation suggestions are available for 570
different ingredients
Using the two most common adaptation suggestions per
ingredient (without regard of the recipe) to create
adaptation suggestions
Integration into CookIIS-Workflows:
− If no adaptation suggestion is created with community data, the
model-based adaptation is used


FGWM @ LWA‘2009 | 2009-09-22 17

CommunityCook: Realization

Query: Chicken,
but no cream


FGWM @ LWA‘2009 | 2009-09-22 18

Starting Evaluation
First look:
The class “supplement” of the knowledge model
− Too many different kinds of ingredients are in these class so that the
adaptation suggestions are not adequate
The class “basic” of the knowledge model
− Basic ingredients like flour or egg are just hard to replace
both not in the review, not used in CookIIS
two different evaluations:
One evaluation to review the classification scheme
− Do the classified ingredients represent what was expressed in the original
comment?
One evaluation to review the extracted knowledge
− Are the created adaptation suggestion applicable?


FGWM @ LWA‘2009 | 2009-09-22 19

Evaluation

Evaluation of the extracted knowledge:
Expert survey
− Real chefs review the adaptation suggestions for recipes
− Questionnaire with recipe and adaptation suggestions
− One adaptation suggestion that was extracted from comments
belonging to that recipe (“dependent”)
− Two adaptation suggestions without regard of the recipe
(“independent”) each with two ingredients as replacement suggestion
(as in CookIIS)
50 Questionnaires with 50 dependent and 100 pairs of
independent ingredients


FGWM @ LWA‘2009 | 2009-09-22 20

Evaluation: overall Applicability


FGWM @ LWA‘2009 | 2009-09-22 21

Evaluation 1st vs. 2nd suggestion
Only 11 of the 100
independent
adaptation
suggestions included
no ingredient that can
be used as substitution


FGWM @ LWA‘2009 | 2009-09-22 22

Evaluation: Quality


FGWM @ LWA‘2009 | 2009-09-22 23

Future work

Further improvements on the knowledge model
Usage of adaptation suggestions that were extracted from
recipes similar to the current one
Adding some semantic analysis to improve accuracy
Usage of comments with other classifications for building
variations of recipes
Check the applicability in other domains


FGWM @ LWA‘2009 | 2009-09-22 24

Related work
Systems that give preparation advises for meals:
CHEF [Hammond 1986]
JULIA [Hinrichs 1992]
Adaptation knowledge aquisition:
DIAL [Leake et. al 1996]
CABAMAKA [d'Aquin et al. 2007]
IAKA [Cordier et al. 2008]
Using the web as knowledge source in CBR:
SEASALT [Bach et al. 2007]
EDIR [Plaza 2008]


FGWM @ LWA‘2009 | 2009-09-22 25

Conclusion
Adaptation knowledge is hard to acquire
The World Wide Web is a large source for knowledge
CommunityCook is a system the extracts adaptation
knowledge from web communities in the domain of
cooking
uses an existing knowledge model
The evaluation shows the applicability of the extracted
knowledge


FGWM @ LWA‘2009 | 2009-09-22 26

Thank you for your attention!

Questions?


FGWM @ LWA‘2009 | 2009-09-22 27

Literature
[Bach et al., 2007] Kerstin Bach, Meike Reichle, and Klaus-Dieter Althoff. A domain
independent system architecture for sharing experience. In Alexander Hinneburg, editor,
Proceedings of LWA 2007, Workshop Wissens- und Erfahrungsmanagement, pages 296–
303, Sep. 2007.
[Cordier et al., 2008] Amélie Cordier, Béatrice Fuchs, Léonardo Lana de Carvalho, Jean
Lieber, and Alain Mille. Opportunistic acquisition of adaptation knowledge cases - the iaka
approach. In Althoff et al. [2008], pages 150–164.
[d’Aquin et al., 2007] Mathieu d’Aquin, Fadi Badra, Sandrine Lafrogne, Jean Lieber, Amedeo
Napoli, and Laszlo Szathmary. Case base mining for adaptation knowledge acquisition. In
Manuela M. Veloso, editor, IJCAI, pages 750–755. Morgan Kaufmann, 2007.
[Hammond, 1986] Kristian J. Hammond. Chef: A model of case-based planning. In American
Association for Artificial Intelligence, AAAI-86, Philadelphia, pages 267–271, 1986.
[Hinrichs, 1992] Thomas R. Hinrichs. Problem solving inopen worlds. Lawrence Erlbaum,
1992.
[Leake et al. 1996]: D. Leake, A. Kinley und D. Wilson: Acquiring Case-Adaptation Knowledge:
A hybrid Approach, in: Proceedings of theThirteenth National Conference on Artificial
Intelligence, S. 684-689, AAAI Press, 1996.
[Minor 2006]: M. Minor: Erfahrungsmanagement mit fallbasierten Assistenzsystemen,
Dissertation, Humbolt-Universitat zu Berlin, 2006.
[Plaza, 2008] Enric Plaza. Semantics and experience in the future web. In Althoff et al. [2008],
pages 44–58. invited talk.
[Schmidt et al. 2003]: R. Schmidt, O. Vorobieva und L. Gierl: Case-based Adaptation
Problems in Medicine, in: U. Reimer (Hrsg.): Proceedings of WM2003: Professionelles
Wissensmanagement – Erfahrungen und Visionen, Kollen-Verlag, 2003.

Extraction of Adaptation Knowledge from Internet Communities

Recommended

Recommended

More Related Content

Viewers also liked

Viewers also liked (18)

Similar to Extraction of Adaptation Knowledge from Internet Communities

Similar to Extraction of Adaptation Knowledge from Internet Communities (20)

Recently uploaded

Recently uploaded (20)

Extraction of Adaptation Knowledge from Internet Communities