Peter M. Broadwell
@peterbroadwell
broadwell@library.ucla.edu
Timothy R. Tangherlini
@tango63
tango@humnet.ucla.edu
ElfYelp: Geolocated Topic Models for Pattern
Discovery in a Large Folklore Corpus
ElfYelp: Geolocated Topic Models for Pattern Discovery in a Large Folklore Corpus
Peter M. Broadwell, Timothy R. Tangherlini
#dh2015 - University of Western Sydney - July 2, 2015
Evald Tang Kristensen, Danske Sagn vol. 3, no. 108
Told by Jens Bek and Mikkel Hansen in Lille-Tåning
There were a couple of giants that had had a falling out,
one in Borum-Eshøj, and the other over in Hasle høj. So
they were going to beat each other with maces. The one
over on Borum-Eshøj hit first, but his aim was way off and
he made that water hole over by Brabrand that they call
Brabrand Lake. Then he was going to hit over, but he didn’t
have much strength left, and his mace didn’t reach further
than Gjeding Lake. So there was no point in him hitting
anymore. Now the other one was to go at it, and he started
to hit, but was much stronger. He winds up smacking down
over Borum-Eshøj, and the spiked ball comes off his mace,
and it flies further to the west and makes Lading Lake.
Evald Tang Kristensen’s Danish legends
Evald Tang Kristensen’s Danish legends and their geographical context
• Collected in Denmark between 1867 and 1924
• From 3,500 storytellers, mostly in Jutland
• 20,431 legends mention a resolvable place name
• 6,423 place names -> 2,126 lat/long pairs
• 14,254 places mentioned ≠ place told
Challenge: geo-topic discovery and exploration
• Explore and analyze the relationships between place, meaning and
context across arbitrary regions of the geo-located ETK corpus
• Use techniques from geo-located social recommendation systems:
Where are the elves? What else is in the area? Are there other
areas like this one? What factors make them similar?
Categories in Evald Tang Kristensen’s Danish legends
[Figure labels: Witches and their Sport; Hidden Folk]
Story categories: multi-level indices
Primary categories:
Mound dwellers (Hidden folk)
Elves
Household spirits
Traveling monsters
Water spirits
Wiverns and small creepy-crawlies
Werewolves and nightmares
Religious legends
Death portents
Lights and portents
Heroes and their sport
Churches, monasteries, holy springs
Legends about farms and towns
Diverse place legends
Legends about treasure
Small kings and their feuds... Enemy invasions
Manor lords, ladies and mistresses
Ministers
Diverse people
Robbers, murderers and thieves
Strandings
Plague and illnesses
Secondary categories:
Robber's Christmas eve
On grain, rats and mice
Hidden folk driving or riding
The Devil as a playing companion
Jilted lovers bewitched
Swedes and Poles north of the Limfjord
Destruction of mounds. Animals sick, unrest in the house
Sand movements
Meadows and swamps
Giant graves
Cessation of the destruction of a mound
Changelings, the old child
Bad ministers
Giants build churches
Giants throw stones at churches
The murdered child, mother with the knife and washing it or the clothes
Black dogs and the like show themselves
Smiths in mounds
The church's foundations moved
Witches as revenants
Geo-semantic exploration
                        Borum-Eshøj  Hasle høj  Brabrand  Gjeding Lake  Lading Lake
Elves                   0            0          0         0             0
Giants and their sport  0            0          0         0             0
Mound dwellers          0            0          0         0             0
Geo-semantic exploration
                        Borum-Eshøj  Hasle høj  Brabrand  Gjeding Lake  Lading Lake
Elves                   0            0          0         0             0
Giants and their sport  1            1          1         1             1
Mound dwellers          0            0          0         0             0
There were a couple of giants that had had a falling out, one in Borum-
Eshøj, and the other over in Hasle høj. So they were going to beat each
other with maces. The one over on Borum-Eshøj hit first, but his aim was
way off and he made that water hole over by Brabrand that they call
Brabrand Lake. Then he was going to hit over, but he didn’t have much
strength left, and his mace didn’t reach further than Gjeding Lake. So
there was no point in him hitting anymore. Now the other one was to go at
it, and he started to hit, but was much stronger. He winds up smacking
down over Borum-Eshøj, and the spiked ball comes off his mace, and it
flies further to the west and makes Lading Lake.
Geo-semantic exploration
                        Borum-Eshøj  Hasle høj  Brabrand  Gjeding Lake  Lading Lake
Elves                   0            0          0         0             0
Giants and their sport  1            1          1         1             1
Mound dwellers          1            1          0         1             0
A mound man lived in a mound close to Hasle village (the mound has
now disappeared and been replaced by a gravel pit), and he was invited
once to a birth celebration over at the mound man’s place in Borum-
Eshøj, and he was supposed to be the godfather, but the day of the party,
his wife got sick and he had to stay home. He didn’t want them not to get
his godfather gift, which was going to be a gold hammer, so he went up
on his mound to throw it over there. But as he stood there swinging, the
hammer head fell off the shaft, and it flew to the northwest and landed in
a little dale near Mundelstrup and created Gjeding Lake, but the shaft
made it to Borum-Eshøj without leaving a trace.
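The category/place table built up over the two legends above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the record layout (`categories`, `places`) and the function name `cooccurrence` are hypothetical, and the second legend's mound near Hasle is counted under Hasle høj only because the slide's table does so.

```python
from collections import defaultdict

# Hypothetical toy records: index categories plus the places each legend mentions.
stories = [
    {"categories": ["Giants and their sport"],
     "places": ["Borum-Eshøj", "Hasle høj", "Brabrand", "Gjeding Lake", "Lading Lake"]},
    # Assumption: the mound near Hasle village is resolved to Hasle høj, as in the table.
    {"categories": ["Mound dwellers"],
     "places": ["Hasle høj", "Borum-Eshøj", "Mundelstrup", "Gjeding Lake"]},
]

def cooccurrence(stories):
    """Count, for each category, how many stories mention each place."""
    counts = defaultdict(lambda: defaultdict(int))
    for story in stories:
        for category in story["categories"]:
            # set() so that a place repeated within one story counts once
            for place in set(story["places"]):
                counts[category][place] += 1
    return counts

places = ["Borum-Eshøj", "Hasle høj", "Brabrand", "Gjeding Lake", "Lading Lake"]
categories = ["Elves", "Giants and their sport", "Mound dwellers"]

counts = cooccurrence(stories)
matrix = [[counts[c][p] for p in places] for c in categories]

for category, row in zip(categories, matrix):
    print(f"{category:24s}", row)
```

Mundelstrup is mentioned in the second legend but is not one of the five table columns, so it does not appear in `matrix`.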
Computational folkloristics and the macroscope
[Figure labels: early microscope; macroscope]
Some geo-semantic folklore “scopes”
WitchHunter: Visualizes concentrations of story categories in the
landscape
TrollFinder: For a given area, finds terms and categories that are
“characteristic” of stories mentioning places in the area
GhostScope: Places all storytellers at the center of the landscape,
plots place references to build conceptual maps
TreasureX: Links actual places told to places mentioned, plotting
the references on a map
Börner, Katy. 2011. “Plug-and-Play Macroscopes.”
Communications of the ACM 54 (3): 60–69.
WitchHunter: Category/place co-occurrence
TrollFinder: Finding region-specific terms
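The slides do not say how TrollFinder scores a term as "characteristic" of an area. One common way to rank region-specific terms, shown here purely as an assumption, is a smoothed log-ratio of a term's relative frequency inside the selected area versus the rest of the corpus. The function name, the smoothing constant `alpha`, and the toy token lists are all hypothetical.

```python
import math
from collections import Counter

def characteristic_terms(region_docs, other_docs, top_n=3, alpha=1.0):
    """Rank terms by smoothed log-ratio of in-region vs. out-of-region frequency.

    This is an illustrative scoring rule, not TrollFinder's documented method.
    """
    region = Counter(t for doc in region_docs for t in doc)
    other = Counter(t for doc in other_docs for t in doc)
    vocab = set(region) | set(other)
    # Add-alpha smoothing keeps terms unseen on one side from dividing by zero.
    n_region = sum(region.values()) + alpha * len(vocab)
    n_other = sum(other.values()) + alpha * len(vocab)
    scores = {
        t: math.log((region[t] + alpha) / n_region)
           - math.log((other[t] + alpha) / n_other)
        for t in vocab
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]

# Hypothetical tokenized stories mentioning places inside / outside the area.
region_docs = [["giant", "mace", "lake"], ["giant", "mound", "lake"]]
other_docs = [["witch", "church"], ["minister", "church", "mound"]]

print(characteristic_terms(region_docs, other_docs))
```

Terms that occur on both sides, such as "mound" in the toy data, score near zero and drop out of the top of the ranking.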
GhostScope: Storytellers’ conceptual geographies
Advanced challenges for ElfYelp
• Exploring an unfamiliar region: in an arbitrary location,
what salient geo-semantic features of the corpus are
found here?
• For a given region, how do we find regions that are geo-
semantically similar? (Location recommendation
problem)
• Pairwise comparison of places is computationally
expensive. Using geographical topic models with a set
number of region centroids is easier and also allows
characterization of arbitrary, unlabeled points.
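To make the cost argument concrete: comparing every place with every other place grows quadratically, while comparing a fixed set of region centroids by their geo-topic mixtures stays small. The sketch below assumes each region is summarized by a topic-mixture vector and uses cosine similarity as the comparison. The similarity measure, the region names, and the mixture values are illustrative assumptions, not ElfYelp's documented method.

```python
import math

def n_pairs(n):
    """Number of unordered pairs among n items."""
    return n * (n - 1) // 2

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar(target, mixtures, top_n=2):
    """Return the regions whose topic mixtures are closest to the target's."""
    scores = {
        name: cosine(mixtures[target], vec)
        for name, vec in mixtures.items()
        if name != target
    }
    return sorted(scores, key=lambda name: (-scores[name], name))[:top_n]

# Hypothetical geo-topic mixtures for four region centroids (rows sum to 1).
mixtures = {
    "region_a": [0.87, 0.09, 0.02, 0.02],
    "region_b": [0.80, 0.10, 0.05, 0.05],
    "region_c": [0.05, 0.85, 0.05, 0.05],
    "region_d": [0.10, 0.10, 0.40, 0.40],
}

print(n_pairs(2126))                       # pairs among all 2,126 lat/long points
print(n_pairs(len(mixtures)))              # pairs among the region centroids
print(most_similar("region_a", mixtures))  # nearest regions by topic mixture
```

With the corpus's 2,126 coordinate pairs, exhaustive comparison would need 2,258,875 pairings; the number of centroid comparisons depends only on how many regions the model is given.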
Latent Geographical Topic Analysis (LGTA)
Created to identify regional
topics from collections of
human-tagged, geo-located
photographs (Flickr)
Any point on the map can be
recognized as a mixture of
multiple latent “geo-topics”
that span the landscape
Yin, Zhijun, et al. 2011. “Geographical Topic Discovery and Comparison.”
Proceedings of the 20th International Conference on the World Wide Web. ACM.
Latent Geographical Topic Analysis (LGTA)
1. Treat locations as documents containing story terms/tags
2. Central points of place clusters stand in for the rest
3. Run pLSA math/magic on these points and their tags
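The "pLSA math" in the last step can be sketched with a small EM loop over a locations-by-terms count matrix. This is plain pLSA only, under the assumption that each row is one cluster centroid's bag of story terms/tags; it leaves out the geographical (Gaussian region) component that the LGTA paper adds, and the function name, toy matrix, and parameter defaults are hypothetical.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit plain pLSA by EM on a (documents x terms) count matrix.

    Returns p(z|d) with shape (docs, topics) and p(w|z) with shape (topics, terms).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_terms = counts.shape
    # Random, row-normalized starting points for both distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_terms))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibilities p(z|d,w), shape (docs, topics, terms).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        resp = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: weight responsibilities by the observed counts, then renormalize.
        weighted = counts[:, None, :] * resp
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

# Hypothetical toy data: 4 location-documents x 4 terms, two obvious groups.
counts = np.array([
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 5, 4],
    [0, 0, 4, 5],
], dtype=float)

p_z_d, p_w_z = plsa(counts, n_topics=2)
print(np.round(p_z_d, 3))  # topic mixture per location
print(np.round(p_w_z, 3))  # term distribution per topic
```

On a matrix this cleanly separated, one would expect the two topics to split the first two locations from the last two, though EM depends on its random start and the result should be inspected rather than assumed.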
LGTA geo-topics as regional “core samples”
Lindorm / witches and Satan / hidden folk / maiden revenants: 86.88%
Life cycle and calendrical rituals: 8.65%
Mound dwellers, ghosts and Satan’s influence: 2.22%
Things that happen in fields: 1.24%
ElfYelp: region geo-topics, similar regions, stories
Alternative: non-Gaussian geographical topics
Kling, Christoph Carl, et al. 2014. “Detecting Non-Gaussian Geographical
Topics in Tagged Photo Collections.” Proceedings of the 7th ACM
International Conference on Web Search and Data Mining. ACM.
Next steps/future work
• Enhance feature extraction for locations when building
geo-topics and improve performance with multiple
tags/keywords per document
• Ability to query geo-topic mixtures for any point on the
map (not just region centers): supported by LGTA, but
not yet implemented in ElfYelp
• Incorporate geographic boundaries, references to
toponyms (e.g., “that hill over there”) into the model
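The second item, querying a geo-topic mixture at any point on the map, could look roughly like the sketch below. It assumes each region has a centre, an isotropic Gaussian spread, a prior weight, and a topic mixture, and blends the regions' mixtures by their posterior probability at the query point. All parameter values and names are hypothetical, and treating latitude/longitude as flat Euclidean coordinates is a simplification.

```python
import numpy as np

def topic_mixture_at(point, centers, sigmas, region_weights, region_topics):
    """Blend region topic mixtures by each region's posterior at `point`.

    Assumes isotropic 2-D Gaussians in degree space (an illustrative choice).
    """
    point = np.asarray(point, dtype=float)
    sq_dist = ((centers - point) ** 2).sum(axis=1)
    # Isotropic 2-D Gaussian density of the point under each region.
    density = np.exp(-sq_dist / (2 * sigmas**2)) / (2 * np.pi * sigmas**2)
    # Posterior over regions: prior weight times density, normalized.
    # Note: for points very far from every region this sum can underflow to 0.
    posterior = region_weights * density
    posterior /= posterior.sum()
    return posterior @ region_topics

# Hypothetical model with two regions and two geo-topics.
centers = np.array([[56.0, 10.0], [57.0, 9.0]])   # (lat, long) of region centres
sigmas = np.array([0.3, 0.3])                     # spread of each region, in degrees
region_weights = np.array([0.5, 0.5])             # prior weight of each region
region_topics = np.array([[0.9, 0.1],             # topic mixture of region 0
                          [0.2, 0.8]])            # topic mixture of region 1

at_center = topic_mixture_at([56.0, 10.0], centers, sigmas, region_weights, region_topics)
at_midpoint = topic_mixture_at([56.5, 9.5], centers, sigmas, region_weights, region_topics)
print(np.round(at_center, 3))
print(np.round(at_midpoint, 3))
```

A point sitting on a region centre inherits almost exactly that region's mixture, while a point midway between two equally weighted regions gets an even blend of both.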
ElfYelp:
http://etkspace.scandinavian.ucla.edu/maps/elfyelp.html
The macroscope menagerie:
http://etkspace.scandinavian.ucla.edu/macroscope.html
● WitchHunter
● TrollFinder
● GhostScope
● TreasureX
● The Danish Folklore Nexus
● ...and many more
Please try it out!
Thanks to our funders / supporters
• American Council of Learned Societies
• The National Endowment for the Humanities
• UCLA Council on Research
• Nordic Council of Ministers
• The Institute for Pure and Applied Mathematics (NSF)
A Brief Advertisement--please apply!
More Related Content

Viewers also liked

Concepts Through Time: Tracing Concepts in Dutch Newspaper Discourse using Se...
Concepts Through Time: Tracing Concepts in Dutch Newspaper Discourse using Se...Concepts Through Time: Tracing Concepts in Dutch Newspaper Discourse using Se...
Concepts Through Time: Tracing Concepts in Dutch Newspaper Discourse using Se...Melvin Wevers
 
Unremembering the forgotten
Unremembering the forgottenUnremembering the forgotten
Unremembering the forgotten
Tim Sherratt
 
Capturing Virtual Verse: A Needs Assessment on Access and Preservation of Onl...
Capturing Virtual Verse: A Needs Assessment on Access and Preservation of Onl...Capturing Virtual Verse: A Needs Assessment on Access and Preservation of Onl...
Capturing Virtual Verse: A Needs Assessment on Access and Preservation of Onl...Harriett Green
 
From Crowdsourcing to Knowledge Communities
From Crowdsourcing to Knowledge CommunitiesFrom Crowdsourcing to Knowledge Communities
From Crowdsourcing to Knowledge Communities
Jon Voss
 

Viewers also liked (6)

Concepts Through Time: Tracing Concepts in Dutch Newspaper Discourse using Se...
Concepts Through Time: Tracing Concepts in Dutch Newspaper Discourse using Se...Concepts Through Time: Tracing Concepts in Dutch Newspaper Discourse using Se...
Concepts Through Time: Tracing Concepts in Dutch Newspaper Discourse using Se...
 
2015 07-dh2015
2015 07-dh20152015 07-dh2015
2015 07-dh2015
 
Unremembering the forgotten
Unremembering the forgottenUnremembering the forgotten
Unremembering the forgotten
 
DH 2015
DH 2015DH 2015
DH 2015
 
Capturing Virtual Verse: A Needs Assessment on Access and Preservation of Onl...
Capturing Virtual Verse: A Needs Assessment on Access and Preservation of Onl...Capturing Virtual Verse: A Needs Assessment on Access and Preservation of Onl...
Capturing Virtual Verse: A Needs Assessment on Access and Preservation of Onl...
 
From Crowdsourcing to Knowledge Communities
From Crowdsourcing to Knowledge CommunitiesFrom Crowdsourcing to Knowledge Communities
From Crowdsourcing to Knowledge Communities
 

More from Peter Broadwell

Conspiracy Stories: Building Archives to Facilitate Narrative Analyses of Onl...
Conspiracy Stories: Building Archives to Facilitate Narrative Analyses of Onl...Conspiracy Stories: Building Archives to Facilitate Narrative Analyses of Onl...
Conspiracy Stories: Building Archives to Facilitate Narrative Analyses of Onl...
Peter Broadwell
 
The East Asian Studies Macroscope: Infrastructure for Collaborative Scholars...
The East Asian Studies Macroscope: Infrastructure for Collaborative Scholars...The East Asian Studies Macroscope: Infrastructure for Collaborative Scholars...
The East Asian Studies Macroscope: Infrastructure for Collaborative Scholars...
Peter Broadwell
 
Integration of a Unique Multimedia Collection into Public Linked Open Data R...
Integration of a Unique Multimedia Collection into Public Linked Open Data R...Integration of a Unique Multimedia Collection into Public Linked Open Data R...
Integration of a Unique Multimedia Collection into Public Linked Open Data R...
Peter Broadwell
 
aiSelections: Computational Techniques for Matching Faculty Research Profiles...
aiSelections: Computational Techniques for Matching Faculty Research Profiles...aiSelections: Computational Techniques for Matching Faculty Research Profiles...
aiSelections: Computational Techniques for Matching Faculty Research Profiles...
Peter Broadwell
 
TrollFinder: Geo-Semantic Exploration of a Very Large Corpus of Danish Folklore
TrollFinder: Geo-Semantic Exploration of a Very Large Corpus of Danish FolkloreTrollFinder: Geo-Semantic Exploration of a Very Large Corpus of Danish Folklore
TrollFinder: Geo-Semantic Exploration of a Very Large Corpus of Danish Folklore
Peter Broadwell
 
From Trot to Cultural Technology: The Historical Development of Production Ne...
From Trot to Cultural Technology:The Historical Development of Production Ne...From Trot to Cultural Technology:The Historical Development of Production Ne...
From Trot to Cultural Technology: The Historical Development of Production Ne...
Peter Broadwell
 
Social Network Analysis of Collaborative Composition in Film Scoring via the ...
Social Network Analysis of Collaborative Composition in Film Scoring via the ...Social Network Analysis of Collaborative Composition in Film Scoring via the ...
Social Network Analysis of Collaborative Composition in Film Scoring via the ...
Peter Broadwell
 

More from Peter Broadwell (7)

Conspiracy Stories: Building Archives to Facilitate Narrative Analyses of Onl...
Conspiracy Stories: Building Archives to Facilitate Narrative Analyses of Onl...Conspiracy Stories: Building Archives to Facilitate Narrative Analyses of Onl...
Conspiracy Stories: Building Archives to Facilitate Narrative Analyses of Onl...
 
The East Asian Studies Macroscope: Infrastructure for Collaborative Scholars...
The East Asian Studies Macroscope: Infrastructure for Collaborative Scholars...The East Asian Studies Macroscope: Infrastructure for Collaborative Scholars...
The East Asian Studies Macroscope: Infrastructure for Collaborative Scholars...
 
Integration of a Unique Multimedia Collection into Public Linked Open Data R...
Integration of a Unique Multimedia Collection into Public Linked Open Data R...Integration of a Unique Multimedia Collection into Public Linked Open Data R...
Integration of a Unique Multimedia Collection into Public Linked Open Data R...
 
aiSelections: Computational Techniques for Matching Faculty Research Profiles...
aiSelections: Computational Techniques for Matching Faculty Research Profiles...aiSelections: Computational Techniques for Matching Faculty Research Profiles...
aiSelections: Computational Techniques for Matching Faculty Research Profiles...
 
TrollFinder: Geo-Semantic Exploration of a Very Large Corpus of Danish Folklore
TrollFinder: Geo-Semantic Exploration of a Very Large Corpus of Danish FolkloreTrollFinder: Geo-Semantic Exploration of a Very Large Corpus of Danish Folklore
TrollFinder: Geo-Semantic Exploration of a Very Large Corpus of Danish Folklore
 
From Trot to Cultural Technology: The Historical Development of Production Ne...
From Trot to Cultural Technology:The Historical Development of Production Ne...From Trot to Cultural Technology:The Historical Development of Production Ne...
From Trot to Cultural Technology: The Historical Development of Production Ne...
 
Social Network Analysis of Collaborative Composition in Film Scoring via the ...
Social Network Analysis of Collaborative Composition in Film Scoring via the ...Social Network Analysis of Collaborative Composition in Film Scoring via the ...
Social Network Analysis of Collaborative Composition in Film Scoring via the ...
 

Recently uploaded

Marketing internship report file for MBA
Marketing internship report file for MBAMarketing internship report file for MBA
Marketing internship report file for MBA
gb193092
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
heathfieldcps1
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Thiyagu K
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Jisc
 
Group Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana BuscigliopptxGroup Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana Buscigliopptx
ArianaBusciglio
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
SACHIN R KONDAGURI
 
Normal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of LabourNormal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of Labour
Wasim Ak
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
vaibhavrinwa19
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
timhan337
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
DeeptiGupta154
 
STRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBC
STRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBCSTRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBC
STRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBC
kimdan468
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
Special education needs
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
Levi Shapiro
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
thanhdowork
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
MysoreMuleSoftMeetup
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
Jean Carlos Nunes Paixão
 

Recently uploaded (20)

Marketing internship report file for MBA
Marketing internship report file for MBAMarketing internship report file for MBA
Marketing internship report file for MBA
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 
Group Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana BuscigliopptxGroup Presentation 2 Economics.Ariana Buscigliopptx
Group Presentation 2 Economics.Ariana Buscigliopptx
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
 
Normal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of LabourNormal Labour/ Stages of Labour/ Mechanism of Labour
Normal Labour/ Stages of Labour/ Mechanism of Labour
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 
STRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBC
STRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBCSTRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBC
STRAND 3 HYGIENIC PRACTICES.pptx GRADE 7 CBC
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 

ElfYelp: Geolocated Topic Models for Pattern Discovery in a Large Folklore Corpus

  • 1. Peter M. Broadwell @peterbroadwell broadwell@library.ucla.edu Timothy R. Tangherlini @tango63 tango@humnet.ucla.edu ElfYelp: Geolocated Topic Models for Pattern Discovery in a Large Folklore Corpus
  • 2. ElfYelp: Geolocated Topic Models for Pattern Discovery in a Large Folklore Corpus Peter M. Broadwell, Timothy R. Tangherlini #dh2015 - University of Western Sydney - July 2, 2015 Evald Tang Kristensen, Danske Sagn vol. 3, no. 108 Told by Jens Bek and Mikkel Hansen in Lille-Tåning There were a couple of giants that had had a falling out, one in Borum-Eshøj, and the other over in Hasle høj. So they were going to beat each other with maces. The one over on Borum-Eshøj hit first, but his aim was way off and he made that water hole over by Brabrand that they call Brabrand Lake. Then he was going to hit over, but he didn’t have much strength left, and his mace didn’t reach further than Gjeding Lake. So there was no point in him hitting anymore. Now the other one was to go at it, and he started to hit, but was much stronger. He winds up smacking down over Borum-Eshøj, and the spiked ball comes off his mace, and it flies further to the west and makes Lading Lake.
  • 3. ElfYelp: Geolocated Topic Models for Pattern Discovery in a Large Folklore Corpus Peter M. Broadwell, Timothy R. Tangherlini #dh2015 - University of Western Sydney - July 2, 2015 Evald Tang Kristensen, Danske Sagn vol. 3, no. 108 Told by Jens Bek and Mikkel Hansen in Lille-Tåning There were a couple of giants that had had a falling out, one in Borum-Eshøj, and the other over in Hasle høj. So they were going to beat each other with maces. The one over on Borum-Eshøj hit first, but his aim was way off and he made that water hole over by Brabrand that they call Brabrand Lake. Then he was going to hit over, but he didn’t have much strength left, and his mace didn’t reach further than Gjeding Lake. So there was no point in him hitting anymore. Now the other one was to go at it, and he started to hit, but was much stronger. He winds up smacking down over Borum-Eshøj, and the spiked ball comes off his mace, and it flies further to the west and makes Lading Lake.
  • 4. Evald Tang Kristensen’s Danish legends
  • 5. Evald Tang Kristensen’s Danish legends and their geographical context
      • Collected in Denmark between 1867 and 1924
      • From 3,500 storytellers, mostly in Jutland
      • 20,431 legends mention a resolvable place name
      • 6,423 place names -> 2,126 lat/long pairs
      • 14,254 places mentioned ≠ place told
  • 6. Challenge: geo-topic discovery and exploration
      • Explore and analyze the relationships between place, meaning and context across arbitrary regions of the geo-located ETK corpus
      • Use techniques from geo-located social recommendation systems: Where are the elves? What else is in the area? Are there other areas like this one? What factors make them similar?
  • 7. Categories in Evald Tang Kristensen’s Danish legends (figure labels: Witches and their Sport; Hidden Folk)
  • 8. Story categories: multi-level indices
      Primary categories: Mound dwellers (Hidden folk); Elves; Household spirits; Traveling monsters; Water spirits; Wiverns and small creepy-crawlies; Werewolves and nightmares; Religious legends; Death portents; Lights and portents; Heroes and their sport; Churches, monasteries, holy springs; Legends about farms and towns; Diverse place legends; Legends about treasure; Small kings and their feuds...; Enemy invasions; Manor lords, ladies and mistresses; Ministers; Diverse people; Robbers, murderers and thieves; Strandings; Plague and illnesses
      Secondary categories: Robber's Christmas eve; On grain, rats and mice; Hidden folk driving or riding; The Devil as a playing companion; Jilted lovers bewitched; Swedes and Poles north of the Limfjord; Destruction of mounds. Animals sick, unrest in the house; Sand movements; Meadows and swamps; Giant graves; Cessation of the destruction of a mound; Changelings, the old child; Bad ministers; Giants build churches; Giants throw stones at churches; The murdered child, mother with the knife and washing it or the clothes; Black dogs and the like show themselves; Smiths in mounds; The church's foundations moved; Witches as revenants
  • 9. Geo-semantic exploration
      Category               | Borum-Eshøj | Hasle høj | Brabrand | Gjeding Lake | Lading Lake
      Elves                  | 0           | 0         | 0        | 0            | 0
      Giants and their sport | 0           | 0         | 0        | 0            | 0
      Mound dwellers         | 0           | 0         | 0        | 0            | 0
      There were a couple of giants that had had a falling out, one in Borum-Eshøj, and the other over in Hasle høj. So they were going to beat each other with maces. The one over on Borum-Eshøj hit first, but his aim was way off and he made that water hole over by Brabrand that they call Brabrand Lake. Then he was going to hit over, but he didn’t have much strength left, and his mace didn’t reach further than Gjeding Lake. So there was no point in him hitting anymore. Now the other one was to go at it, and he started to hit, but was much stronger. He winds up smacking down over Borum-Eshøj, and the spiked ball comes off his mace, and it flies further to the west and makes Lading Lake.
  • 10. Geo-semantic exploration (counts after the giants legend)
      Category               | Borum-Eshøj | Hasle høj | Brabrand | Gjeding Lake | Lading Lake
      Elves                  | 0           | 0         | 0        | 0            | 0
      Giants and their sport | 1           | 1         | 1        | 1            | 1
      Mound dwellers         | 0           | 0         | 0        | 0            | 0
  • 11. Geo-semantic exploration
      Category               | Borum-Eshøj | Hasle høj | Brabrand | Gjeding Lake | Lading Lake
      Elves                  | 0           | 0         | 0        | 0            | 0
      Giants and their sport | 1           | 1         | 1        | 1            | 1
      Mound dwellers         | 1           | 1         | 0        | 1            | 0
      A mound man lived in a mound close to Hasle village (the mound has now disappeared and been replaced by a gravel pit), and he was invited once to a birth celebration over at the mound man’s place in Borum-Eshøj, and he was supposed to be the godfather, but the day of the party, his wife got sick and he had to stay home. He didn’t want them not to get his godfather gift, which was going to be a gold hammer, so he went up on his mound to throw it over there. But as he stood there swinging, the hammer head fell off the shaft, and it flew to the northwest and landed in a little dale near Mundelstrup and created Gjeding Lake, but the shaft made it to Borum-Eshøj without leaving a trace.
  • 12. Computational folkloristics and the macroscope (images: early microscope; macroscope)
  • 13. Some geo-semantic folklore “scopes”
      • WitchHunter: Visualizes concentrations of story categories in the landscape
      • TrollFinder: For a given area, finds terms and categories that are “characteristic” of stories mentioning places in the area
      • GhostScope: Places all storytellers at the center of the landscape, plots place references to build conceptual maps
      • TreasureX: Links actual places told to places mentioned, plotting the references on a map
      Börner, Katy. 2011. “Plug-and-Play Macroscopes.” Communications of the ACM 54 (3): 60–69.
  • 14. WitchHunter: Category/place co-occurrence
  • 15. TrollFinder: Finding region-specific terms
  • 16. GhostScope: Storytellers’ conceptual geographies
  • 17. Advanced challenges for ElfYelp
      • Exploring an unfamiliar region: in an arbitrary location, what salient geo-semantic features of the corpus are found here?
      • For a given region, how do we find regions that are geo-semantically similar? (Location recommendation problem)
      • Pairwise comparison of places is computationally expensive. Using geographical topic models with a set number of region centroids is easier and also allows characterization of arbitrary, unlabeled points.
  • 18. Latent Geographical Topic Analysis (LGTA)
      • Created to identify regional topics from collections of human-tagged, geo-located photographs (Flickr)
      • Any point on the map can be recognized as a mixture of multiple latent “geo-topics” that span the landscape
      Yin, Zhijun, et al. 2011. “Geographical Topic Discovery and Comparison.” Proceedings of the 20th International Conference on the World Wide Web. ACM.
  • 19. Latent Geographical Topic Analysis (LGTA)
      1. Treat locations as documents containing story terms/tags
  • 20. Latent Geographical Topic Analysis (LGTA)
      1. Treat locations as documents containing story terms/tags
      2. Central points of place clusters stand in for the rest
      3. Run pLSA math/magic on these points and their tags
  • 21. LGTA geo-topics as regional “core samples”
      • Lindorm / witches and Satan / hidden folk / maiden revenants: 86.88%
      • Life cycle and calendrical rituals: 8.65%
      • Mound dwellers, ghosts and Satan’s influence: 2.22%
      • Things that happen in fields: 1.24%
  • 22. ElfYelp: region geo-topics, similar regions, stories
  • 23. Alternative: non-Gaussian geographical topics
      Kling, Christoph Carl, et al. 2014. “Detecting Non-Gaussian Geographical Topics in Tagged Photo Collections.” Proceedings of the 7th ACM International Conference on Web Search and Data Mining. ACM.
  • 24. Next steps/future work
      • Enhance feature extraction for locations when building geo-topics and improve performance with multiple tags/keywords per document
      • Ability to query geo-topic mixtures for any point on the map (not just region centers): supported by LGTA, but not yet implemented in ElfYelp
      • Incorporate geographic boundaries, references to toponyms (e.g., “that hill over there”) into the model
  • 25. ElfYelp: http://etkspace.scandinavian.ucla.edu/maps/elfyelp.html
      The macroscope menagerie: http://etkspace.scandinavian.ucla.edu/macroscope.html
      • WitchHunter
      • TrollFinder
      • GhostScope
      • TreasureX
      • The Danish Folklore Nexus
      • ...and many more
      Please try it out!
  • 26. Thanks to our funders / supporters
      • American Council of Learned Societies
      • The National Endowment for the Humanities
      • UCLA Council on Research
      • Nordic Council of Ministers
      • The Institute for Pure and Applied Mathematics (NSF)
  • 27. A brief advertisement: please apply!
  • 28. Peter M. Broadwell @peterbroadwell broadwell@library.ucla.edu Timothy R. Tangherlini @tango63 tango@humnet.ucla.edu ElfYelp: Geolocated Topic Models for Pattern Discovery in a Large Folklore Corpus
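
The category/place counts shown on slides 9 to 11 can be reproduced with a few lines of Python. This is only a minimal sketch, not the ElfYelp code: the story records below are hand-written for illustration, the place lists are assumed to have already been resolved to the five names in the slide table, and the function names `cooccurrence` and `as_table` are hypothetical.

```python
from collections import Counter

# Column and row order taken from the table on slides 9-11.
PLACES = ["Borum-Eshøj", "Hasle høj", "Brabrand", "Gjeding Lake", "Lading Lake"]
CATEGORIES = ["Elves", "Giants and their sport", "Mound dwellers"]

def cooccurrence(stories):
    """Count, for each (category, place) pair, how many stories link them."""
    counts = Counter()
    for category, places in stories:
        # set() so a place named twice in one story is counted once.
        for place in set(places):
            counts[(category, place)] += 1
    return counts

def as_table(counts):
    """Lay the counts out as rows (categories) by columns (places)."""
    return [[counts[(category, place)] for place in PLACES]
            for category in CATEGORIES]

# Illustrative records: (category, resolved places mentioned in the story).
stories = [
    ("Giants and their sport",
     ["Borum-Eshøj", "Hasle høj", "Brabrand", "Gjeding Lake", "Lading Lake"]),
    ("Mound dwellers",
     ["Hasle høj", "Borum-Eshøj", "Gjeding Lake"]),
]

table = as_table(cooccurrence(stories))
for category, row in zip(CATEGORIES, table):
    print(f"{category:<24}", row)
```

With these two records the rows match the final table on slide 11: all zeros for Elves, all ones for Giants and their sport, and 1 1 0 1 0 for Mound dwellers.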

Editor's Notes

  1. Tim: Maybe begin with a sample story showing importance of location (or not)…
  2. Tim: Maybe begin with a sample story showing importance of location (or not)… this is good
  3. Tim: Our prior work on this collection (see dh2014) has involved finding ways to make sense of a corpus that’s way too big to label by hand. Now we want to take advantage of the pervasive geographic place references in the collection to ask more questions.
  4. Tim: People weren’t terribly mobile then; storytellers tend to situate stories in their local environment, frequently incorporating references to nearby places. We find regions to be intrinsically more interesting than individual points; note that ETK’s collecting approach basically involved “sampling” regions based on where the good storytellers were; “good” storytellers usually told stories about their local environment.
  5. Tim: We’re getting at the idea that the countryside is much more complex than people have given it credit for being. Through folklore, people projected concerns about contemporary issues onto their local environment - though sometimes in the guise of elves and mound dwellers. Villages could have widely divergent ways of doing this, though conversely, sometimes villages that were very far apart might have quite similar formulations and ways of viewing the landscape (and be completely unaware of it). We want to explore this phenomenon. [Consider: both Echo Park in LA and South Delhi, India have Elf cafes! - found on Yelp and Zomato]
  6. Tim: ETK grouped stories with similar themes into volumes for publication; there are ~38 categories. This is a very shallow and unreliable system, but it’s a start. He also tagged each story with one of ~770 sub-categories, which are even more idiosyncratic but also more fine-grained, so they are potentially useful for computational profiling of the narrative landscape, along with actual keyword counts.
  7. Tim
  8. Tim: The idea is to use opening sample story & maybe one more to demonstrate place/term co-occurrence counts, perhaps with a map. I’ll put the slide together, or just chuck it if it’s not needed.
  9. Tim: The idea is to use opening sample story & maybe one more to demonstrate place/term co-occurrence counts, perhaps with a map. I’ll put the slide together, or just chuck it if it’s not needed.
  10. Tim: The second story used here is DS_01_0_00298 (#298), told by N Nielsen in Viborg. I also found at least one other story that was very similar to the first one, albeit in the Mound Dwellers volume (DS_01_0_01319)
  11. Pete: Early close reading? There are many different kinds of geographically oriented “distant reading” that we can do with this corpus. We refer to the various tools we have developed for these purposes as “macroscopes”; a macroscope is a tool for modeling and exploring highly complex systems.
  12. Pete: Here they are
  13. Pete: Plan to have this open in a browser window for a very quick demo
  14. Pete: Plan to have this open in a browser window for a very quick demo
  15. Pete: Plan to have this open in a browser window for a very quick demo
  16. Pete: Note that existing categories and topics built from keywords do not take geographical context into account in a generative way
  17. Pete: Related: Sizov, Sergej. “GeoFolk: Latent Spatial Semantics in Web 2.0 Social Media.” Proceedings of the Third ACM International Conference on Web Search and Data Mining. ACM, 2010.
  18. Pete
  19. Pete
  20. Pete: The output is a set of geo-topic “core samples” showing the proportions of particular topics in a given place (both the number of core samples and the number of geo-topics can be specified). These topic mixtures fan out from the core point (we use a Gaussian spatial distribution) and blend with the mixtures of the points around it. Given this model, we can “drill” a new core sample at any point on the map to see what’s there.
  21. Pete: This should be skipped in favor of a live demo, ideally. Demonstrate region selection, geo-topic “core sample” readout, similar places readout, info about the region (from TrollFinder), and the ability to drill down to story texts.
  22. Pete: Other, more recent projects have focused on the “location recommendation” challenge using geo-tagged social data. This approach divides the landscape into “cells” around each point and tries to find the best “cut” of the cells such that one topic predominates in the entire shape found. This is useful for marketing purposes, but not for the types of corpus and spatial exploration we’re doing; the LGTA “core sample” approach is a better fit.
  23. Tim: Feel free to summarize our conclusions here as well and restate the research questions this is helping us to address
  24. Tim
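
The “core sample” idea in note 20 can be sketched as follows. This is a hedged illustration rather than the ElfYelp or LGTA implementation: it assumes each region is an isotropic 2-D Gaussian over lat/long treated as planar coordinates, all numbers are invented, and the names `region_weights`, `drill_core_sample` and `most_similar_region` are hypothetical. The topic mixture at a point is the average of the regional mixtures, weighted by how strongly each region's Gaussian covers that point; similar regions are then ranked by cosine similarity of their mixtures, which is one plausible choice among several.

```python
import math

# Invented regions: center (lat, long), spread, prior weight, topic mixture.
regions = [
    {"center": (56.15, 10.20), "sigma": 0.20, "prior": 0.5, "topics": [0.70, 0.20, 0.10]},
    {"center": (56.45, 9.40), "sigma": 0.30, "prior": 0.3, "topics": [0.10, 0.80, 0.10]},
    {"center": (55.40, 10.40), "sigma": 0.25, "prior": 0.2, "topics": [0.65, 0.25, 0.10]},
]

def region_weights(point, regions):
    """p(region | point): prior times Gaussian density, normalized to sum to 1."""
    weighted = []
    for region in regions:
        dx = point[0] - region["center"][0]
        dy = point[1] - region["center"][1]
        var = region["sigma"] ** 2
        density = math.exp(-(dx * dx + dy * dy) / (2 * var)) / (2 * math.pi * var)
        weighted.append(region["prior"] * density)
    total = sum(weighted)
    return [w / total for w in weighted]

def drill_core_sample(point, regions):
    """Topic mixture at an arbitrary point, blended from the regional mixtures."""
    weights = region_weights(point, regions)
    n_topics = len(regions[0]["topics"])
    return [sum(w * r["topics"][k] for w, r in zip(weights, regions))
            for k in range(n_topics)]

def cosine(a, b):
    """Cosine similarity between two topic mixtures."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def most_similar_region(index, regions):
    """Index of the other region whose topic mixture is closest by cosine."""
    target = regions[index]["topics"]
    others = [i for i in range(len(regions)) if i != index]
    return max(others, key=lambda i: cosine(target, regions[i]["topics"]))

mixture = drill_core_sample((56.15, 10.20), regions)
print([round(share, 3) for share in mixture])
print(most_similar_region(0, regions))
```

Because only region-level mixtures are compared, the similarity search avoids the pairwise comparison of individual places that slide 17 describes as expensive.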