Beijing
30 July ‘15
Folktales Plot graphs Similarity Experiments
Measuring the Structural and
Conceptual Similarity
of Folktales using Plot Graphs
Victoria Anugrah Lestari & Ruli Manurung
Faculty of Computer Science
Universitas Indonesia
victoria.anugrah@ui.ac.id, maruli@cs.ui.ac.id
Beijing, China
30 July 2015
Folktales
A folktale is a characteristically anonymous, timeless,
and placeless tale circulated orally among a people.
http://onceuponatime.wikia.com/wiki/Rumpelstiltskin_(Fairytale)
http://indonesianfolklore.blogspot.com/2007/10/lutung-kasarung-folklore-from-west-java.html
http://indonesianfolklore.blogspot.com/2007/10/keong-emas-golden-snail-prince-raden.html
Humanities work on folktales
• Vladimir Propp (1928): Morphology of the
(Russian) folktale → story grammars
• Aarne-Thompson-Uther (ATU) index (1910,
1961, 2004): story motifs, hierarchy of folktale
types
Computational work on folktales
• Vaz Lobo & de Matos (2010): latent semantic mapping +
clustering 453 fairy tales from Gutenberg.
• Nguyen et al. (2012): classification based on genre, e.g.
legend, fairytale, jokes, puzzle, urban legend, etc. using
lexical, POS, NE, metadata.
• Nguyen et al. (2013): Ranking based on story types (ATU,
Brunvand) using IR, lexical, SVO triplets.
• Karsdorp & van den Bosch (2013): Topic modelling (L-LDA) for
multiple labelling of ATU motifs (defined by types).
Folktales as narratives
• Narratives: Focus on sequence of related
events → structure
• Models of narrative: Turner (1994), Mateas &
Stern (2003), Pérez y Pérez & Sharples (2004),
etc.
• However: Fisseni & Löwe (2012): People tend
to focus on motifs & content, less on
structure.
Plot graphs (McIntyre & Lapata, 2010)
Goals of this work
• Construct representations that capture
structural & conceptual properties.
• Define similarity metric, use to organize
folktales.
• Compare with bag-of-words (BoW) methods with respect to the ATU index.
Representing folktales as plot graphs
Note that the core structure is linear.
[Figure legend] Node types: action nodes, child nodes, entity nodes.
Edge types: action edges, action-child edges, entity edges.
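Example
[Figure: plot graph built up one action at a time]
Action nodes (in sequence): live, sleep, come, play
Child nodes: lion, forest (live); it, tree (sleep); mouse (come); lion, it (play)
Edge labels: subj, in (live); subj, under (sleep); subj (come); subj, on (play)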
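
The structure in this example can be sketched in Python as follows. This is only an illustration: the class names `ActionNode` and `PlotGraph` are hypothetical, the slides do not specify a data structure, and entity nodes are left out. Action nodes are kept in a list to reflect the linear core structure, and each one carries its child nodes keyed by edge label. Only the first three actions of the example are encoded.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ActionNode:
    """One event in the story: a verb plus its labelled child nodes."""
    verb: str
    children: Dict[str, str] = field(default_factory=dict)  # edge label -> child word


@dataclass
class PlotGraph:
    """Hypothetical container: an ordered (linear) list of action nodes."""
    actions: List[ActionNode] = field(default_factory=list)

    def add_action(self, verb: str, children: Dict[str, str]) -> "PlotGraph":
        # Appending keeps story order; consecutive actions are implicitly
        # connected by an action edge.
        self.actions.append(ActionNode(verb, dict(children)))
        return self

    def action_sequence(self) -> List[str]:
        return [action.verb for action in self.actions]


graph = PlotGraph()
graph.add_action("live", {"subj": "lion", "in": "forest"})
graph.add_action("sleep", {"subj": "it", "under": "tree"})
graph.add_action("come", {"subj": "mouse"})

print(graph.action_sequence())
```

Keeping the actions in a plain list makes the later sequence alignment straightforward, since the alignment only needs the ordered verbs and their children.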
Automatic construction
Stanford CoreNLP SemanticGraph (a.k.a.
dependency parse)
From SemanticGraph to plot graph
Some observation-based heuristics on selecting relations:
• Governors of nsubj (nominal subject), expl (expletive “there”), and aux (auxiliary)
• Add a child if relation(parent, child) is not one of conj, comp, adv, aux, cop, dep, expl, mark
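
A rough Python sketch of how these two heuristics might be applied to dependency triples. It does not call CoreNLP: the input is a hand-written list of (governor, relation, dependent) triples standing in for parser output, the function name `build_actions` is hypothetical, and matching the excluded names as substrings (so that, for example, `ccomp` and `advmod` are also filtered) is an assumption rather than something the slides state.

```python
# Relations whose governor becomes an action node (taken from the slide).
ACTION_RELATIONS = {"nsubj", "expl", "aux"}

# Relation names that block a child (taken from the slide). Matching them as
# substrings is an assumption made for this sketch.
EXCLUDED = ("conj", "comp", "adv", "aux", "cop", "dep", "expl", "mark")


def build_actions(triples):
    """Return (actions, children) from (governor, relation, dependent) triples.

    Words are used directly as identifiers, so repeated tokens would be
    merged; a real implementation would need token indices.
    """
    triples = list(triples)
    actions = []
    for governor, relation, _ in triples:
        if relation in ACTION_RELATIONS and governor not in actions:
            actions.append(governor)

    children = {action: [] for action in actions}
    for governor, relation, dependent in triples:
        blocked = any(name in relation for name in EXCLUDED)
        if governor in children and not blocked:
            children[governor].append((relation, dependent))
    return actions, children


# Hand-written triples for "A lion lives in the forest. The lion was sleeping."
triples = [
    ("lives", "nsubj", "lion"),
    ("lion", "det", "A"),
    ("lives", "prep_in", "forest"),
    ("forest", "det", "the"),
    ("sleeping", "nsubj", "lion"),
    ("sleeping", "aux", "was"),
    ("lion", "det", "The"),
]

actions, children = build_actions(triples)
print(actions)
print(children)
```

Here `lives` and `sleeping` become action nodes because they govern `nsubj` (and `aux`) relations, while the auxiliary `was` is not attached as a child because `aux` is on the exclusion list.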
Construction example
CoreNLP CorefChain
(length >1)
Final result
Measuring plot graph similarity
Story 1: A lion lives in the forest. One day it sleeps under a tree. Then a mouse plays on the lion and disturbs its sleep.
Story 2: A lion eats meat. A lion lives in the jungle. One day it rests under a tree.
Alignment of event sequence
Needleman-Wunsch
Conceptual similarity: Wu-Palmer
Measure path distance between 2 words based on WordNet taxonomy

Word pairs        Similarity
sleep, live       0.25
disturb, rest     0.33
live, eat         0.29
prince, king      0.94
jungle, forest    0.31
palace, house     0.91
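
For reference, the Wu-Palmer score is 2 · depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest common ancestor of the two concepts. The sketch below applies it to a tiny hand-made taxonomy, not to WordNet: the `PARENT` table is hypothetical, the root is given depth 1 by assumption, and each word has a single sense, so the numbers will not match the table above.

```python
# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "person": "entity",
    "prince": "person",
    "king": "person",
    "place": "entity",
    "forest": "place",
}


def path_to_root(node):
    """List of nodes from `node` up to the root, inclusive."""
    path = [node]
    while PARENT[node] is not None:
        node = PARENT[node]
        path.append(node)
    return path


def depth(node):
    # Assumption: the root has depth 1.
    return len(path_to_root(node))


def wu_palmer(a, b):
    """Wu-Palmer similarity on the toy tree: 2*depth(lcs) / (depth(a)+depth(b))."""
    ancestors_of_b = set(path_to_root(b))
    # Walking upward from a, the first node shared with b is the deepest
    # common ancestor (this holds because the taxonomy is a tree).
    lcs = next(n for n in path_to_root(a) if n in ancestors_of_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(wu_palmer("prince", "king"))
print(wu_palmer("prince", "forest"))
```

Words that share a deep ancestor (prince, king) score higher than words whose only common ancestor is the root (prince, forest), which is the behaviour the table illustrates.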
Example mapping

Wu-Palmer similarity:
            eat     live    rest
live        0.29    1       0.33
sleep       0.22    0.25    0.43
play        0.29    0.33    0.43
disturb     0.29    0.33    0.33

Alignment scoring & traceback matrix:
                    eat      live     rest
            0       -1       -2       -3
live        -1      0.29     0        -1
sleep       -2      -0.71    0.54     1
play        -3      -1.71    -0.38    0.96
disturb     -4      -2.71    -1.38    -0.04
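
The scoring matrix can be recomputed with a standard Needleman-Wunsch fill, sketched below with the Wu-Palmer values exactly as printed and a gap penalty of -1 (inferred from the first row and column of the matrix). The function and variable names are illustrative. With these inputs most cells agree with the slide to within 0.01; the (sleep, rest) cell evaluates to 0.43 rather than the printed 1, which is what a sleep/rest similarity of 1 would produce.

```python
# Wu-Palmer values as printed on the slide: SIM[(story-1 verb, story-2 verb)].
SIM = {
    ("live", "eat"): 0.29, ("live", "live"): 1.0, ("live", "rest"): 0.33,
    ("sleep", "eat"): 0.22, ("sleep", "live"): 0.25, ("sleep", "rest"): 0.43,
    ("play", "eat"): 0.29, ("play", "live"): 0.33, ("play", "rest"): 0.43,
    ("disturb", "eat"): 0.29, ("disturb", "live"): 0.33, ("disturb", "rest"): 0.33,
}


def needleman_wunsch(seq1, seq2, sim, gap=-1.0):
    """Global alignment; returns (score matrix, list of aligned pairs)."""
    rows, cols = len(seq1) + 1, len(seq2) + 1
    score = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sim(seq1[i - 1], seq2[j - 1]),  # align both
                score[i - 1][j] + gap,                                # gap in seq2
                score[i][j - 1] + gap,                                # gap in seq1
            )

    # Traceback from the bottom-right cell; None marks a gap.
    aligned, i, j = [], len(seq1), len(seq2)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and abs(
            score[i][j] - (score[i - 1][j - 1] + sim(seq1[i - 1], seq2[j - 1]))
        ) < 1e-9:
            aligned.append((seq1[i - 1], seq2[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and abs(score[i][j] - (score[i - 1][j] + gap)) < 1e-9:
            aligned.append((seq1[i - 1], None))
            i -= 1
        else:
            aligned.append((None, seq2[j - 1]))
            j -= 1
    aligned.reverse()
    return score, aligned


story1 = ["live", "sleep", "play", "disturb"]
story2 = ["eat", "live", "rest"]
score, aligned = needleman_wunsch(story1, story2, lambda a, b: SIM[(a, b)])

for row in score:
    print([round(value, 2) for value in row])
print(aligned)
```

Because a gap costs a full point while even a weak match adds a positive score, the alignment under g = -1 pairs verbs positionally and leaves only the surplus action unaligned.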
Folktale similarity measurement
p_1, p_2 = the two plot graphs being compared
α = weighting for action node similarity
β = weighting for child node similarity
(a_1i, a_2i) = pair of action nodes from alignment of p_1 and p_2
g = gap penalty
(c_1i, c_2i) = pair of child nodes from alignment of p_1 and p_2
n = alignment length of p_1 and p_2
Initial experiment
• Determining values for α, β, and g
• For each story, 5 paraphrases manually created: word
replacement, sentence structure change, insertion/deletion of
phrases & sentences
• Measure similarity between paraphrases & across stories.
Maximize difference.
No.  Title                                   #Words
1    A friend in need is a friend indeed     133
2    Honesty is the best policy              129
3    A town mouse and a country mouse        260
4    How to tell a true princess             382
5    The butterfly lovers                    572
6    Rumpelstiltskin                         1106
http://www.english-for-students.com/Simple-Short-Stories.html
Similarity scores using various parameters

                            α = 0.7, β = 0.3      α = 0.5, β = 0.5      α = 0.3, β = 0.7
g =                         -1     -0.5   0       -1     -0.5   0       -1     -0.5   0
Between paraphrases  Avg    0.83   0.80   0.74    0.83   0.80   0.73    0.83   0.79   0.71
                     Min    0.69   0.61   0.53    0.69   0.60   0.49    0.68   0.58   0.45
Across stories       Avg    0.37   0.30   0.15    0.41   0.32   0.12    0.45   0.33   0.09
                     Max    0.55   0.45   0.25    0.55   0.43   0.20    0.55   0.42   0.16
BP min - AS max             0.14   0.16   0.28    0.14   0.17   0.29    0.13   0.16   0.29
Diff. between avgs          0.46   0.50   0.59    0.42   0.48   0.61    0.38   0.46   0.62
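
The two derived rows can be recomputed from the four rows above them, which also makes it easy to see which setting separates paraphrases from unrelated stories most clearly. The sketch below hard-codes the table values; the dictionary layout and names are illustrative, and it does not claim to show which setting the authors finally adopted.

```python
# (alpha, beta, g) -> scores copied from the table above.
RESULTS = {
    (0.7, 0.3, -1.0): {"bp_avg": 0.83, "bp_min": 0.69, "as_avg": 0.37, "as_max": 0.55},
    (0.7, 0.3, -0.5): {"bp_avg": 0.80, "bp_min": 0.61, "as_avg": 0.30, "as_max": 0.45},
    (0.7, 0.3, 0.0): {"bp_avg": 0.74, "bp_min": 0.53, "as_avg": 0.15, "as_max": 0.25},
    (0.5, 0.5, -1.0): {"bp_avg": 0.83, "bp_min": 0.69, "as_avg": 0.41, "as_max": 0.55},
    (0.5, 0.5, -0.5): {"bp_avg": 0.80, "bp_min": 0.60, "as_avg": 0.32, "as_max": 0.43},
    (0.5, 0.5, 0.0): {"bp_avg": 0.73, "bp_min": 0.49, "as_avg": 0.12, "as_max": 0.20},
    (0.3, 0.7, -1.0): {"bp_avg": 0.83, "bp_min": 0.68, "as_avg": 0.45, "as_max": 0.55},
    (0.3, 0.7, -0.5): {"bp_avg": 0.79, "bp_min": 0.58, "as_avg": 0.33, "as_max": 0.42},
    (0.3, 0.7, 0.0): {"bp_avg": 0.71, "bp_min": 0.45, "as_avg": 0.09, "as_max": 0.16},
}

# Worst-case margin: lowest paraphrase score minus highest cross-story score.
margin = {k: round(v["bp_min"] - v["as_max"], 2) for k, v in RESULTS.items()}
# Average margin: mean paraphrase score minus mean cross-story score.
avg_gap = {k: round(v["bp_avg"] - v["as_avg"], 2) for k, v in RESULTS.items()}

best_by_avg = max(avg_gap, key=avg_gap.get)
best_margin = max(margin.values())
best_by_margin = sorted(k for k, m in margin.items() if m == best_margin)

print("largest difference between averages:", best_by_avg, avg_gap[best_by_avg])
print("largest BP min - AS max:", best_by_margin, best_margin)
```

On both criteria the settings with g = 0 give the widest separation for every weighting in the table.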
Main experiment: BoW comparison
24 fairy tales from Fairy Books of Andrew Lang, grouped into 5 clusters under
ATU (fairy tales):
• Supernatural Adversaries — Bluebeard; Hansel and Gretel; Jack and the
Beanstalk; Rapunzel; The Twelve Dancing Princesses.
• Supernatural or Enchanted Relatives — Beauty and the Beast; Brother
and Sister; East of the Sun, West of the Moon; Snow White and Rose Red;
The Bushy Bride; The Six Swans; The Sleeping Beauty.
• Supernatural Helpers — Cinderella; Donkey Skin; Puss in Boots;
Rumpelstiltskin; The Goose Girl; The Story of Sigurd.
• Magic Objects — Aladdin and the Wonderful Lamp; Fortunatus and His
Purse; The Golden Goose; The Magic Ring.
• Other Stories of the Supernatural — Little Thumb; The Princess and the
Pea.
Measure similarity within clusters & across clusters.
http://www.gutenberg.org/ebooks/30580
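
A sketch of the within/across comparison, under stated assumptions: each story's "within" score is taken as the mean similarity to the other stories in its own cluster and its "across" score as the mean similarity to stories in all other clusters. The slides do not say how the scores were aggregated, so the use of the mean, the function name `within_across`, and the toy similarity values are all hypothetical.

```python
from statistics import mean


def within_across(stories, cluster_of, similarity):
    """Map each story to (mean within-cluster sim, mean across-cluster sim).

    Assumes every cluster has at least two stories and that there are at
    least two clusters, so neither list is empty.
    """
    scores = {}
    for s in stories:
        within = [similarity(s, t) for t in stories
                  if t != s and cluster_of[t] == cluster_of[s]]
        across = [similarity(s, t) for t in stories
                  if cluster_of[t] != cluster_of[s]]
        scores[s] = (mean(within), mean(across))
    return scores


# Toy data: four stories in two clusters with made-up symmetric similarities.
stories = ["A", "B", "C", "D"]
cluster_of = {"A": 1, "B": 1, "C": 2, "D": 2}
PAIR_SIM = {
    frozenset("AB"): 0.8, frozenset("CD"): 0.6,
    frozenset("AC"): 0.2, frozenset("AD"): 0.4,
    frozenset("BC"): 0.3, frozenset("BD"): 0.1,
}

scores = within_across(stories, cluster_of, lambda s, t: PAIR_SIM[frozenset((s, t))])
wins = sum(1 for within, across in scores.values() if within > across)

print(scores)
print(f"within > across for {wins} of {len(stories)} stories")
```

Counting the stories whose within score exceeds their across score gives the kind of summary figure reported in the results table for each method.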
Story type / Story                     Plot graph         Bag of words       Combination
                                       Within   Across    Within   Across    Within   Across
Supernatural adversaries
  Bluebeard                            0.1000   0.1037    0.8629   0.8618    0.4814   0.4586
  Hansel and Gretel                    0.1075   0.1157    0.8492   0.8630    0.4783   0.4894
  Jack and the Beanstalk               0.1050   0.1110    0.9050   0.8891    0.5050   0.5001
  Rapunzel                             0.1000   0.1047    0.8790   0.8575    0.4895   0.4571
  The Twelve Dancing Princesses        0.1125   0.1073    0.8808   0.8631    0.4966   0.4610
Supernatural or enchanted relatives
  Beauty and the Beast                 0.0767   0.0705    0.8803   0.8605    0.4785   0.4397
  Brother and Sister                   0.1233   0.1135    0.8881   0.8722    0.5057   0.4654
  East of the Sun, West of the Moon    0.1117   0.1012    0.8914   0.8571    0.5015   0.4525
  Snow White and Rose Red              0.1200   0.1165    0.8650   0.8566    0.4925   0.4865
  The Bushy Bride                      0.1200   0.1182    0.8862   0.8739    0.5031   0.4960
  The Six Swans                        0.0925   0.1100    0.9006   0.8662    0.5020   0.4881
  The Sleeping Beauty                  0.1125   0.1194    0.8990   0.8918    0.5087   0.5056
Supernatural helpers
  Cinderella                           0.1180   0.1144    0.8150   0.8306    0.4665   0.4725
  Donkey Skin                          0.1040   0.1122    0.8873   0.9025    0.4956   0.5074
  Puss in Boots                        0.1175   0.1095    0.8170   0.8486    0.4672   0.4551
  Rumpelstiltskin                      0.0750   0.0858    0.8467   0.8569    0.4609   0.4478
  The Goose Girl                       0.1240   0.1178    0.8617   0.8624    0.4928   0.4643
  The Story of Sigurd                  0.1080   0.1178    0.8516   0.8670    0.4800   0.4664
Magic objects
  Aladdin and the Wonderful Lamp       0.0975   0.0910    0.8958   0.8664    0.4946   0.4559
  Fortunatus and His Purse             0.1133   0.1185    0.8945   0.8306    0.5039   0.4519
  The Golden Goose                     0.1033   0.1155    0.9006   0.8529    0.5012   0.4611
  The Magic Ring                       0.1033   0.1040    0.9120   0.8960    0.5077   0.4762
Other stories
  Little Thumb                         0.0300   0.1214    0.7444   0.8562    0.3872   0.4675
  The Princess and the Pea             0.0300   0.0405    0.7444   0.7844    0.3872   0.3945
# Similarity within > across           10 (41.67%)        15 (62.50%)        19 (79.16%)
Analysis & Discussion
• Errors in automatic construction (dependency
parses aren’t really semantic graphs), e.g.:
“along came a mouse” vs. “a mouse came”,
coreference errors.
• Consistent with Fisseni & Löwe (2012)
findings: focus more on content & motifs?
• Combination of plot graph + BoW yields best
results.
THANK YOU

LaTeCH 2015: Measuring the Structural and Conceptual Similarity of Folktales using Plot Graphs (Lestari & Manurung)

  • 1. Beijing 30 July ‘15 Folktales Plot graphs Similarity Experiments Measuring the Structural and Conceptual Similarity of Folktales using Plot Graphs Victoria Anugrah Lestari & Ruli Manurung Faculty of Computer Science Universitas Indonesia victoria.anugrah@ui.ac.id, maruli@cs.ui.ac.id Beijing, China 30 July 2015
  • 2. Beijing 30 July ‘15 Folktales Plot graphs Similarity Experiments Folktales Folktales are a characteristically anonymous, timeless, and placeless tale circulated orally among a people. http://onceuponatime.wikia.com/wiki/Rumpelstiltskin_(Fairytale) http://indonesianfolklore.blogspot.com/2007/10/lutung-kasarung-folklore-from-west-java.html http://indonesianfolklore.blogspot.com/2007/10/keong-emas-golden-snail-prince-raden.html 2/24
  • 3. Beijing 30 July ‘15 Folktales Plot graphs Similarity Experiments Humanities work on folktales • Vladimir Propp (1928): Morphology of the (Russian) folktale  story grammars • Aarne-Thompson-Uther (ATU) index (1910, 1961, 2004): story motifs, hierarchy of folktale types 3/24
  • 4. Beijing 30 July ‘15 Folktales Plot graphs Similarity Experiments Computational work on folktales • Vaz Lobo & de Matos (2010): latent semantic mapping + clustering 453 fairy tales from Gutenberg. • Nguyen et al. (2012): classification based on genre, e.g. legend, fairytale, jokes, puzzle, urban legend, etc. using lexical, POS, NE, metadata. • Nguyen et al. (2013): Ranking based on story types (ATU, Brunvand) using IR, lexical, SVO triplets. • Karsdorp & van den Bosch (2013): Topic modelling (L-LDA) for multiple labelling of ATU motifs (defined by types). 4/24
  • 5. Beijing 30 July ‘15 Folktales Plot graphs Similarity Experiments Folktales as narratives • Narratives: Focus on sequence of related events  structure • Models of narrative: Turner (1994), Mateas & Stern (2003), Pérez y Pérez & Sharples (2004), etc. 5/24
  • 6. Beijing 30 July ‘15 Folktales Plot graphs Similarity Experiments Folktales as narratives • Narratives: Focus on sequence of related events  structure • Models of narrative: Turner (1994), Mateas & Stern (2003), Pérez y Pérez & Sharples (2004), etc. • However: Fisseni & Löwe (2012): People tend to focus on motifs & content, less on structure. 5/24
  • 7. Beijing 30 July ‘15 Folktales Plot graphs Similarity Experiments Plot graphs (McIntyre & Lapata, 2010) 6/24
  • 8. Beijing 30 July ‘15 Folktales Plot graphs Similarity Experiments Goals of this work • Construct representations that capture structural & conceptual properties. • Define similarity metric, use to organize folktales. • Compare to BoW-based methods wrt. ATU. 7/24
  • 9. Representing folktales as plot graphs
  • 10. Representing folktales as plot graphs. Figure legend: action nodes; action edges.
  • 11. Representing folktales as plot graphs. Figure legend: action nodes, child nodes; action edges, action-child edges.
  • 12. Representing folktales as plot graphs. Figure legend: action nodes, child nodes, entity nodes; action edges, action-child edges, entity edges.
  • 13. Representing folktales as plot graphs. Note that the core structure is linear. Figure legend: action nodes, child nodes, entity nodes; action edges, action-child edges, entity edges.
  • 14. Example
  • 15. Example (plot graph figure). Action nodes: live. Child nodes: lion, forest. Edge labels: subj, in.
  • 16. Example (plot graph figure). Action nodes: live, sleep. Child nodes: lion, forest, it, tree. Edge labels: subj, in, subj, under.
  • 17. Example (plot graph figure). Action nodes: live, sleep, come. Child nodes: lion, forest, it, tree, mouse. Edge labels: subj, in, subj, under, subj.
  • 18. Example (plot graph figure). Action nodes: live, sleep, come, play. Child nodes: lion, forest, it, tree, mouse, lion, it. Edge labels: subj, in, subj, under, subj, subj, on.
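The node and edge types in the legend and the lion-and-mouse example can be held in a small data structure. The sketch below is only an illustration: the class names, the field names, and the exact attachment of "it" and "lion" to "play" are assumptions, since the figure itself is not recoverable from the transcript.

```python
from dataclasses import dataclass, field


@dataclass
class ActionNode:
    """One event in the story, e.g. the verb 'live' (hypothetical class name)."""
    verb: str
    # Action-child edges as (edge label, child word), e.g. ("subj", "lion").
    children: list = field(default_factory=list)


@dataclass
class PlotGraph:
    # Action nodes in story order. Consecutive nodes are joined by action
    # edges, which is what makes the core structure linear.
    actions: list = field(default_factory=list)

    def action_edges(self):
        """Return the linear chain of (previous verb, next verb) pairs."""
        verbs = [a.verb for a in self.actions]
        return list(zip(verbs, verbs[1:]))


# The example from the slides. Which child of "play" is the subject is an
# assumption based on the sentence "a mouse plays on the lion".
graph = PlotGraph(actions=[
    ActionNode("live", [("subj", "lion"), ("in", "forest")]),
    ActionNode("sleep", [("subj", "it"), ("under", "tree")]),
    ActionNode("come", [("subj", "mouse")]),
    ActionNode("play", [("subj", "it"), ("on", "lion")]),
])

print(graph.action_edges())
```

Entity edges (links between mentions of the same entity, such as "lion" and "it") are left out here; they could be stored as groups of (action index, child index) pairs.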
  • 19. Automatic construction: Stanford CoreNLP SemanticGraph (a.k.a. dependency parse)
  • 20. From SemanticGraph to plot graph. Some observation-based heuristics for selecting relations: • Action nodes: governors of nsubj (nominal subject), expl (expletive “there”), and aux (auxiliary) • Child nodes: add a child if relation(parent, child) is not one of conj, comp, adv, aux, cop, dep, expl, mark
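The two heuristics can be sketched over a plain list of dependency triples. This is a minimal illustration, not the authors' implementation: it does not call CoreNLP, the function names are hypothetical, and it assumes that the short names on the slide (e.g. comp, adv) stand for whole relation families such as ccomp/xcomp and advmod/advcl.

```python
# Relations whose governor is treated as an action node (from the slide).
ACTION_RELATIONS = ("nsubj", "expl", "aux")

# Relation name fragments that block a dependent from becoming a child node.
# Assumption: matched as substrings, so "comp" also covers ccomp and xcomp.
EXCLUDED_FRAGMENTS = ("conj", "comp", "adv", "aux", "cop", "dep", "expl", "mark")


def is_excluded(relation):
    """True if the relation belongs to one of the excluded families."""
    return any(fragment in relation for fragment in EXCLUDED_FRAGMENTS)


def select_nodes(dependencies):
    """Pick action nodes and their children.

    dependencies: list of (governor, relation, dependent) triples in
    sentence order. Words are used directly as node identifiers, which is a
    simplification (a real parse would use token indices).
    """
    actions = []
    for governor, relation, _ in dependencies:
        if relation in ACTION_RELATIONS and governor not in actions:
            actions.append(governor)

    children = {action: [] for action in actions}
    for governor, relation, dependent in dependencies:
        if governor in children and not is_excluded(relation):
            children[governor].append((relation, dependent))
    return actions, children


# Hand-written triples for "A lion once lives in the forest" (illustrative).
deps = [
    ("lives", "nsubj", "lion"),
    ("lion", "det", "A"),
    ("lives", "advmod", "once"),
    ("lives", "prep_in", "forest"),
    ("forest", "det", "the"),
]
actions, children = select_nodes(deps)
print(actions, children)
```

Only direct dependents of action nodes are considered, so the determiners attached to "lion" and "forest" never become child nodes.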
  • 21. Construction example
  • 22. Construction example: CoreNLP CorefChain (length > 1)
  • 25. Final result
  • 26. Measuring plot graph similarity. Story 1: “A lion lives in the forest. One day it sleeps under a tree. Then a mouse plays on the lion and disturbs its sleep.” Story 2: “A lion eats meat. A lion lives in the jungle. One day it rests under a tree.”
  • 28. Alignment of event sequence: Needleman-Wunsch
  • 29. Conceptual similarity: Wu-Palmer. Measures path distance between two words based on the WordNet taxonomy.
        Word pair         Similarity
        sleep, live       0.25
        disturb, rest     0.33
        live, eat         0.29
        prince, king      0.94
        jungle, forest    0.31
        palace, house     0.91
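For reference, the usual definition of the Wu-Palmer measure is given below. The slide does not reproduce the formula, so this is the standard textbook form and not necessarily the exact variant used to produce the scores above. Here depth(·) is the depth of a concept in the WordNet taxonomy and lcs(c1, c2) is the least common subsumer (deepest shared ancestor) of the two concepts.

```latex
\mathrm{sim}_{\mathrm{WP}}(c_1, c_2) =
  \frac{2 \cdot \mathrm{depth}\bigl(\mathrm{lcs}(c_1, c_2)\bigr)}
       {\mathrm{depth}(c_1) + \mathrm{depth}(c_2)}
```

The score lies in (0, 1] and equals 1 when the two concepts are identical, which matches the live/live entry in the example mapping.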
  • 30. Example mapping
        Wu-Palmer similarity (rows: actions of story 1; columns: actions of story 2)
                    eat     live    rest
        live        0.29    1       0.33
        sleep       0.22    0.25    0.43
        play        0.29    0.33    0.43
        disturb     0.29    0.33    0.33
        Alignment scoring & traceback matrix
                            eat     live    rest
                    0       -1      -2      -3
        live        -1      0.29    0       -1
        sleep       -2      -0.71   0.54    1
        play        -3      -1.71   -0.38   0.96
        disturb     -4      -2.71   -1.38   -0.04
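The alignment step can be sketched as a plain Needleman-Wunsch global alignment over the two verb sequences, using the Wu-Palmer scores as match scores. This is a minimal sketch under stated assumptions: the gap penalty of -1 is inferred from the first row and column of the scoring matrix, the function name is hypothetical, and because the inputs are the rounded similarities from the slide, individual cells need not match the slide's matrix exactly.

```python
def needleman_wunsch(rows, cols, sim, gap=-1.0):
    """Global alignment of two sequences.

    sim maps (row item, column item) to a similarity score; gap is the score
    added for aligning an item with nothing. Returns the score matrix and
    the aligned pairs (None marks a gap).
    """
    n, m = len(rows), len(cols)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sim[(rows[i - 1], cols[j - 1])],  # match
                score[i - 1][j] + gap,  # row item aligned with a gap
                score[i][j - 1] + gap,  # column item aligned with a gap
            )

    # Traceback from the bottom-right corner, preferring a match on ties.
    aligned, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and abs(
            score[i][j] - (score[i - 1][j - 1] + sim[(rows[i - 1], cols[j - 1])])
        ) < 1e-9:
            aligned.append((rows[i - 1], cols[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and abs(score[i][j] - (score[i - 1][j] + gap)) < 1e-9:
            aligned.append((rows[i - 1], None))
            i -= 1
        else:
            aligned.append((None, cols[j - 1]))
            j -= 1
    aligned.reverse()
    return score, aligned


story1 = ["live", "sleep", "play", "disturb"]
story2 = ["eat", "live", "rest"]
values = [
    [0.29, 1.0, 0.33],
    [0.22, 0.25, 0.43],
    [0.29, 0.33, 0.43],
    [0.29, 0.33, 0.33],
]
sim = {(a, b): values[i][j] for i, a in enumerate(story1) for j, b in enumerate(story2)}

score, aligned = needleman_wunsch(story1, story2, sim)
print(aligned)
```

Because the alignment is global and order-preserving, the identical pair live/live is not necessarily matched: pairing it would force gaps elsewhere, and with a gap penalty of -1 that can cost more than it gains.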
  • 31. Folktale similarity measurement. Symbols used in the similarity formula:
        p1, p2 = the two plot graphs being compared
        α = weighting for action node similarity
        β = weighting for child node similarity
        (a1i, a2i) = pair of action nodes from the alignment of p1 and p2
        (c1i, c2i) = pair of child nodes from the alignment of p1 and p2
        g = gap penalty
        n = alignment length of p1 and p2
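The formula itself did not survive in the transcript. The sketch below shows one way the listed symbols could fit together: a per-position weighted sum of action and child similarity, with the gap penalty applied at unaligned positions, averaged over the alignment length. This combination is an assumption made for illustration and should not be read as the authors' definition; the function and parameter names are hypothetical.

```python
def folktale_similarity(aligned, action_sim, child_sim, alpha=0.5, beta=0.5, gap=-1.0):
    """Hypothetical combination of the symbols listed on the slide.

    aligned: list of (a1, a2) pairs from the alignment; None marks a gap.
    action_sim, child_sim: functions scoring an aligned pair in [0, 1].
    Assumption: score = (1/n) * sum of alpha*action + beta*child per aligned
    pair, with g added for each gap. The real formula is not in the source.
    """
    if not aligned:
        return 0.0
    total = 0.0
    for a1, a2 in aligned:
        if a1 is None or a2 is None:
            total += gap
        else:
            total += alpha * action_sim(a1, a2) + beta * child_sim(a1, a2)
    return total / len(aligned)


# Toy check with constant similarity functions (illustrative values only).
pairs = [("live", "live"), ("sleep", None)]
same = lambda a, b: 1.0
print(folktale_similarity(pairs, same, same, alpha=0.7, beta=0.3, gap=0.0))
```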
  • 32. Initial experiment • Determining values for α, β, and g • For each story, 5 paraphrases manually created: word replacement, sentence structure change, insertion/deletion of phrases & sentences • Measure similarity between paraphrases & across stories. Maximize difference.
        No.   Title                                  #Words
        1     A friend in need is a friend indeed    133
        2     Honesty is the best policy             129
        3     A town mouse and a country mouse       260
        4     How to tell a true princess            382
        5     The butterfly lovers                   572
        6     Rumpelstiltskin                        1106
        Source: http://www.english-for-students.com/Simple-Short-Stories.html
  • 33. Similarity scores using various parameters
                                   α = 0.7, β = 0.3       α = 0.5, β = 0.5       α = 0.3, β = 0.7
        g =                        -1     -0.5   0        -1     -0.5   0        -1     -0.5   0
        Between paraphrases (BP)
          Avg                      0.83   0.80   0.74     0.83   0.80   0.73     0.83   0.79   0.71
          Min                      0.69   0.61   0.53     0.69   0.60   0.49     0.68   0.58   0.45
        Across stories (AS)
          Avg                      0.37   0.30   0.15     0.41   0.32   0.12     0.45   0.33   0.09
          Max                      0.55   0.45   0.25     0.55   0.43   0.20     0.55   0.42   0.16
        BP min - AS max            0.14   0.16   0.28     0.14   0.17   0.29     0.13   0.16   0.29
        Diff. between avgs         0.46   0.50   0.59     0.42   0.48   0.61     0.38   0.46   0.62
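The two separation rows of the table can be recomputed from the four rows above them, as in the sketch below. The values are copied from the slide; ranking the settings by the difference between averages is only an illustrative criterion, since the slide does not state which setting the authors finally chose.

```python
# (alpha, beta, g) -> (BP avg, BP min, AS avg, AS max), copied from the slide.
results = {
    (0.7, 0.3, -1.0): (0.83, 0.69, 0.37, 0.55),
    (0.7, 0.3, -0.5): (0.80, 0.61, 0.30, 0.45),
    (0.7, 0.3, 0.0): (0.74, 0.53, 0.15, 0.25),
    (0.5, 0.5, -1.0): (0.83, 0.69, 0.41, 0.55),
    (0.5, 0.5, -0.5): (0.80, 0.60, 0.32, 0.43),
    (0.5, 0.5, 0.0): (0.73, 0.49, 0.12, 0.20),
    (0.3, 0.7, -1.0): (0.83, 0.68, 0.45, 0.55),
    (0.3, 0.7, -0.5): (0.79, 0.58, 0.33, 0.42),
    (0.3, 0.7, 0.0): (0.71, 0.45, 0.09, 0.16),
}


def separation(stats):
    """Return (BP min - AS max, BP avg - AS avg), rounded to two decimals."""
    bp_avg, bp_min, as_avg, as_max = stats
    return round(bp_min - as_max, 2), round(bp_avg - as_avg, 2)


# Illustrative criterion: the setting with the largest difference between averages.
best = max(results, key=lambda setting: separation(results[setting])[1])

for setting, stats in results.items():
    print(setting, separation(stats))
print("largest average difference:", best)
```

In every weighting, both separation measures grow as the gap penalty moves from -1 towards 0.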
  • 34. Main experiment: BoW comparison. 24 fairy tales from the Fairy Books of Andrew Lang, grouped into 5 clusters under ATU (fairy tales): • Supernatural Adversaries: Bluebeard; Hansel and Gretel; Jack and the Beanstalk; Rapunzel; The Twelve Dancing Princesses. • Supernatural or Enchanted Relatives: Beauty and the Beast; Brother and Sister; East of the Sun, West of the Moon; Snow White and Rose Red; The Bushy Bride; The Six Swans; The Sleeping Beauty. • Supernatural Helpers: Cinderella; Donkey Skin; Puss in Boots; Rumpelstiltskin; The Goose Girl; The Story of Sigurd. • Magic Objects: Aladdin and the Wonderful Lamp; Fortunatus and His Purse; The Golden Goose; The Magic Ring. • Other Stories of the Supernatural: Little Thumb; The Princess and the Pea. Measure similarity within clusters & across clusters. http://www.gutenberg.org/ebooks/30580
  • 35. Results: similarity within and across clusters, per story
                                               Plot graph          Bag of words        Combination
        Story                                  Within   Across     Within   Across     Within   Across
        Supernatural adversaries
          Bluebeard                            0.1000   0.1037     0.8629   0.8618     0.4814   0.4586
          Hansel and Gretel                    0.1075   0.1157     0.8492   0.8630     0.4783   0.4894
          Jack and the Beanstalk               0.1050   0.1110     0.9050   0.8891     0.5050   0.5001
          Rapunzel                             0.1000   0.1047     0.8790   0.8575     0.4895   0.4571
          The Twelve Dancing Princesses        0.1125   0.1073     0.8808   0.8631     0.4966   0.4610
        Supernatural or enchanted relatives
          Beauty and the Beast                 0.0767   0.0705     0.8803   0.8605     0.4785   0.4397
          Brother and Sister                   0.1233   0.1135     0.8881   0.8722     0.5057   0.4654
          East of the Sun, West of the Moon    0.1117   0.1012     0.8914   0.8571     0.5015   0.4525
          Snow White and Rose Red              0.1200   0.1165     0.8650   0.8566     0.4925   0.4865
          The Bushy Bride                      0.1200   0.1182     0.8862   0.8739     0.5031   0.4960
          The Six Swans                        0.0925   0.1100     0.9006   0.8662     0.5020   0.4881
          The Sleeping Beauty                  0.1125   0.1194     0.8990   0.8918     0.5087   0.5056
        Supernatural helpers
          Cinderella                           0.1180   0.1144     0.8150   0.8306     0.4665   0.4725
          Donkey Skin                          0.1040   0.1122     0.8873   0.9025     0.4956   0.5074
          Puss in Boots                        0.1175   0.1095     0.8170   0.8486     0.4672   0.4551
          Rumpelstiltskin                      0.0750   0.0858     0.8467   0.8569     0.4609   0.4478
          The Goose Girl                       0.1240   0.1178     0.8617   0.8624     0.4928   0.4643
          The Story of Sigurd                  0.1080   0.1178     0.8516   0.8670     0.4800   0.4664
        Magic objects
          Aladdin and the Wonderful Lamp       0.0975   0.0910     0.8958   0.8664     0.4946   0.4559
          Fortunatus and His Purse             0.1133   0.1185     0.8945   0.8306     0.5039   0.4519
          The Golden Goose                     0.1033   0.1155     0.9006   0.8529     0.5012   0.4611
          The Magic Ring                       0.1033   0.1040     0.9120   0.8960     0.5077   0.4762
        Other stories
          Little Thumb                         0.0300   0.1214     0.7444   0.8562     0.3872   0.4675
          The Princess and the Pea             0.0300   0.0405     0.7444   0.7844     0.3872   0.3945
        # Similarity within > across           10 (41.67%)         15 (62.50%)         19 (79.16%)
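The summary row counts, for each method, how many stories are more similar to their own cluster than to the other clusters. The sketch below shows that count on the four "Magic objects" rows only, with values copied from the table; the function and variable names are hypothetical.

```python
METHODS = ("plot graph", "bag of words", "combination")

# Story -> (within, across) for each method in METHODS order.
# Excerpt: only the "Magic objects" cluster from the results table.
rows = {
    "Aladdin and the Wonderful Lamp": (0.0975, 0.0910, 0.8958, 0.8664, 0.4946, 0.4559),
    "Fortunatus and His Purse": (0.1133, 0.1185, 0.8945, 0.8306, 0.5039, 0.4519),
    "The Golden Goose": (0.1033, 0.1155, 0.9006, 0.8529, 0.5012, 0.4611),
    "The Magic Ring": (0.1033, 0.1040, 0.9120, 0.8960, 0.5077, 0.4762),
}


def count_within_wins(table):
    """Count, per method, the stories whose within score exceeds the across score."""
    wins = dict.fromkeys(METHODS, 0)
    for scores in table.values():
        for k, method in enumerate(METHODS):
            within, across = scores[2 * k], scores[2 * k + 1]
            if within > across:
                wins[method] += 1
    return wins


wins = count_within_wins(rows)
for method in METHODS:
    print(f"{method}: {wins[method]} of {len(rows)}")
```

Applied to all 24 rows, the same count gives the 10, 15, and 19 reported in the last row of the table.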
  • 36. Analysis & Discussion • Errors in automatic construction (dependency parses aren’t really semantic graphs), e.g. “along came a mouse” vs. “a mouse came”; coreference errors. • Consistent with the findings of Fisseni & Löwe (2012): people focus more on content & motifs? • The combination of plot graph + BoW yields the best results.
  • 37. Thank you