“Concise Preservation by Combining Managed Forgetting
and Contextualized Remembering”
EU/FP7 ForgetIT Project (2013-2016)
http://www.forgetit-project.eu
What Triggers Human Remembering of Events?
Large-Scale Analysis of Collective Memory in Wikipedia
Nattiya Kanhabua, Tu Ngoc Nguyen and Claudia Niederée
L3S Research Center, Hannover, Germany
Motivation: ForgetIT Project
Human Forgetting and Remembering
Collective Memory in Wikipedia
Experiments and Discussion
Conclusions
Outline
However, we are facing:
• Dramatic increase in content creation (e.g. digital photos)
• Increasing use of mobile devices with restricted capacity
• Information overload and changing professional and private lives
• Inadvertent forgetting due to lack of systematic preservation
Forgetting plays a crucial role for human remembering and life
(focus on current, relevant information; ignore redundant details)
Managed forgetting ≠ automatic deletion
Instead: a range of forgetting options e.g.
• Resource condensation
• Change of indexing & ranking
• Reduction of redundancy
A computer that forgets intentionally?
And in the context of digital preservation?
Managed forgetting = to remember the right information
Individual memories are subject to a fast
forgetting process [Ebbinghaus, 1885]
• Rapidly forget details -> “less redundancy”
Episodic memory (of one’s past events) is
reconstructed from similar events/context
• Rely on common patterns -> “false memory”
Memory bumps in the forgetting curve are
caused by reminding or triggering through:
• A physical object (e.g. a printed photo)
• A digital memory system
• Different subsequent events
Human Forgetting and Remembering
H. Ebbinghaus, Über das Gedächtnis. Untersuchungen zur experimentellen Psychologie. Duncker & Humblot,
Leipzig, 1885.
E. Tulving, Episodic memory: From mind to brain. Annual review of psychology, vol. 53, no. 1, pp. 1-25, 2002.
“Collective memory is a socially constructed, common image (memory)
of the past of a community, which frames its current understanding
and actions.” [Halbwachs, 1950]
• Crowd phenomenon and important to societal processes
• Not static, as it is determined by the concerns of the present
From Individual Memories to Collective Memory
M. Halbwachs, On collective memory. Chicago: The University of Chicago Press, 1950 (Translation).
Flashbulb memories in cognitive psychology
• Studies of how high-impact events are remembered, e.g.,
the British Royal Wedding or the September 11 attacks
• Aspects: details, confidence, consistency of memory
over time, impact of media coverage
• Qualitative study: limited number of events and users
Collective Memory in Wikipedia
Wikipedia as a source for global memory
• Largest and most up-to-date online encyclopedia
(19M registered users, 30K active editors)
• Social negotiation and construction reflected in
early editing activities and talk pages
• Indicators for identifying real-world events
C. Pentzold, The online encyclopaedia Wikipedia as a global memory place, Memory Studies, 2009.
M. Georgescu, N. Kanhabua, D. Krause, W. Nejdl, and S. Siersdorfer, Extracting event-related information
from article updates in Wikipedia, ECIR'2013.
View logs as the signal for collective memory
• Public page view traffic over a long time span
• Does not directly reflect how people forget, but significant
patterns are a good estimate of public remembering
• Large-scale analysis complements (1) qualitative
studies (2) analyzing article content (scalability)
Contributions
First study to identify catalysts for event memory triggering using
time series analysis techniques:
• temporal correlations in peaking page visits between events,
• a surprise score, i.e., the residual sum of squares of the prediction error, and
• the skewness of view shapes as a catalyst for memories
Identify the relationship between events using different features:
• the role of time passed, the same types of events, the size or magnitude of
events, a nearby city or neighboring country
Analyze over 5500 high-impact events from 11 event categories
Related to the previous study by [Au Yeung and Jatowt, 2011]
• Analyzed references to the past (as an indicator of what is remembered) in a
large news collection to identify which years are most frequently referenced
C.-m. Au Yeung and A. Jatowt, Studying how the past is remembered: Towards computational history
through large scale text mining, CIKM’2011
For a given event, we propose a 3-step approach:
1. Compute “remembering scores” of past events within the same category
2. Rank related past events by the computed remembering scores
3. Identify features (e.g., time, location) having a high correlation with remembering
Our Approach
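Steps 2 and 3 of the approach above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the event names, the score values and the `years_apart` feature are hypothetical, and Pearson correlation is an assumed choice, since the slides do not name the correlation measure used.

```python
from math import sqrt

def rank_events(scores):
    """Step 2: order past events by descending remembering score."""
    return sorted(scores, key=scores.get, reverse=True)

def pearson(xs, ys):
    """Step 3 (assumed measure): Pearson correlation between a feature and the scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    # A constant feature or constant scores carry no correlation signal.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

# Hypothetical remembering scores (output of step 1) for three past events.
scores = {"event_a": 0.2, "event_b": 0.9, "event_c": 0.5}
ranking = rank_events(scores)

# Hypothetical feature: years elapsed between each past event and the new one.
years_apart = {"event_a": 12, "event_b": 1, "event_c": 5}
r = pearson([years_apart[e] for e in ranking], [scores[e] for e in ranking])
print(ranking, round(r, 3))
```

In this toy data the correlation is negative, i.e., more recent events score higher; whether that holds in practice depends on the event category.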
Remembering scores: a linear
combination of three features:
1. Cross-correlation coefficient (CCF)
2. Sum of squared error (SSE)
3. Skewness (Kurtosis)
Measuring Signals for Memory Revival
Remembering = α•CCF + β•SSE + γ•Kurtosis
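The three signals and their linear combination could be computed along the following lines. This is a sketch under stated assumptions: the moving-average forecast behind the SSE, the lag range for the cross-correlation, and the requirement that features are already scaled to a comparable range are not specified on the slides. The third feature follows the formula above and uses kurtosis. The default weights are the ones reported in the experiments (α = 0.5, β = 0.4, γ = 0.1).

```python
from statistics import fmean, pstdev

def zscores(series):
    """Standardize a series; a constant series maps to all zeros."""
    mu, sd = fmean(series), pstdev(series)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in series]

def max_cross_correlation(trigger, past, max_lag=7):
    """Largest cross-correlation coefficient over lags 0..max_lag.

    Assumes equal-length daily view series; `past` may lag `trigger`.
    """
    a, b = zscores(trigger), zscores(past)
    n = len(a)
    best = -1.0
    for lag in range(min(max_lag, n - 1) + 1):
        ccf = sum(x * y for x, y in zip(a, b[lag:])) / n
        best = max(best, ccf)
    return best

def surprise_sse(views, window=3):
    """Sum of squared errors of a naive moving-average forecast (assumed baseline)."""
    sse = 0.0
    for i in range(window, len(views)):
        forecast = fmean(views[i - window:i])
        sse += (views[i] - forecast) ** 2
    return sse

def excess_kurtosis(views):
    """Excess kurtosis of the view counts: peaked, bursty series score high."""
    mu = fmean(views)
    m2 = fmean([(v - mu) ** 2 for v in views])
    m4 = fmean([(v - mu) ** 4 for v in views])
    return m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0

def remembering_score(ccf, sse, kurt, alpha=0.5, beta=0.4, gamma=0.1):
    """Linear combination; inputs are assumed to be normalized beforehand."""
    return alpha * ccf + beta * sse + gamma * kurt

# Hypothetical daily page views of a past event's article.
views = [1, 2, 3, 10, 3, 2, 1]
print(max_cross_correlation(views, views), surprise_sse(views), excess_kurtosis(views))
```

Because SSE and kurtosis are unbounded while the cross-correlation coefficient lies in [-1, 1], some normalization across candidate events (for example min-max scaling) would be needed before the weights are meaningful.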
Features for Triggered Remembering
Temporal similarity:
• Time distance between two events (in days, months or years)
• Time distance based on exponential decay functions
Location similarity:
• Map a geographic hierarchy of event locations as follows:
  city -> state -> country -> neighbor countries -> continent
• Assign 4 scale values: 4 to same city, 3 to same state, 2 to same country, 1 to same continent
Impact of Events:
• Damaged area/properties/cost/fatalities
• Magnitude (for earthquake events)
• Highest winds, lowest pressure (for Atlantic hurricanes)
N. Kanhabua and K. Nørvåg: Determining time of queries for re-ranking search results. ECDL 2010
J. Strötgen, M. Gertz, and C. Junghans: An event-centric model for multilingual document similarity. SIGIR 2011
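The temporal and location features above might be computed as in the sketch below. The half-life parameter of the decay function and the score of 0 for locations that share no level are assumptions, and the neighbor-country level is left out because the slide assigns it no scale value.

```python
import math

def temporal_similarity(days_apart, half_life_days=365.0):
    """Exponential decay of similarity with time distance (half-life is assumed)."""
    return math.exp(-math.log(2) * days_apart / half_life_days)

# Scale values from the slide: 4 same city, 3 same state, 2 same country, 1 same continent.
LEVELS = (("city", 4), ("state", 3), ("country", 2), ("continent", 1))

def location_similarity(loc_a, loc_b):
    """Return the scale value of the most specific level both locations share."""
    for level, value in LEVELS:
        name = loc_a.get(level)
        if name is not None and name == loc_b.get(level):
            return value
    return 0  # assumption: no shared level means no location similarity

christchurch = {"city": "Christchurch", "state": "Canterbury",
                "country": "New Zealand", "continent": "Oceania"}
wellington = {"city": "Wellington", "state": "Wellington",
              "country": "New Zealand", "continent": "Oceania"}
port_au_prince = {"city": "Port-au-Prince", "state": "Ouest",
                  "country": "Haiti", "continent": "North America"}

print(temporal_similarity(365))
print(location_similarity(christchurch, wellington))
print(location_similarity(christchurch, port_au_prince))
```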
Experiments
Datasets:
• Page view statistics, 2007-2013
• A large set of 5,500 events
• From 11 event-related categories
• α = 0.5, β = 0.4, γ = 0.1
Temporal and spatial distributions
• Strong focus on more recent events
• Better coverage with increasing popularity
• Most frequent locations depend on the event type
Temporal and Spatial Distributions
Category: Atlantic Hurricane
Distributions of remembering scores
• Hurricane Sandy (Form date: October 22, 2012, Affected area: Mid-Atlantic)
• Hurricane Hanna (Form date: August 28, 2008, Affected area: US east coast)
Location and time have a low effect on
remembering scores for this category.
Category: Atlantic Hurricane
Top-10 events triggered by the two events
• Hurricane Hanna triggers remembering of Hurricane Gustav, the most recent
hurricane that struck the area of Puerto Rico and the East Coast
• Hurricane Sandy triggers the 1991 Perfect Storm, which initially formed near
Canada and is high-impact and most destructive
Category: Aviation accidents
A mixture of impact factors, such as time and location
• Qantas Flight 32 (engine failure on 4 November 2010) triggers remembering of
(1) Qantas Flight 30 and British Airways Flight 9 (both going to Australia),
and (2) Aero Caribbean Flight 883 (most recent event)
Figure annotations: most recent; same destination; deadliest (two aircraft collided); Concorde
Category: Earthquakes
A series of earthquake events at Christchurch, New Zealand
• The 2010 Canterbury earthquake triggers the 2010 Haiti earthquake (recent and
high-impact), two close-by events, and high-impact historical earthquakes
• The 2011 Christchurch earthquake shows a locality focus, i.e., people seem to be
interested in previous events in the same region
• For the June 2011 Christchurch earthquake, the remembered events are dominated
by its two predecessor events
Look beyond single events, especially if there are
several events in close temporal and spatial proximity.
Category: Terrorist incidents
Interesting observation: semantic similarity between events
• The June 2012 Kaduna church bombings trigger other religion-related terror attacks
• The 2008 Mumbai attacks trigger terror attacks on business, entertainment and hotel targets
Figure annotations (highlighted ranks), first chart: 2nd, 5th, 24th
Figure annotations (highlighted ranks), second chart: 2nd, 7th, 15th
Conclusions
We identified some first patterns for event memory triggering for diverse
event types including natural and manmade disasters as well as
accidents and terrorism.
Our analysis confirmed the influence of closeness in time and location,
but the semantic similarity of events also influences which event
memories are triggered by an event.
In our future work, we plan to deepen our systematic analysis of factors
for revisiting past events and of the combination of those factors.
We also plan to investigate external factors such as media coverage
linking new events to past events or reflection of such relationships in
other types of social media.
What do you remember?
Thanks for your attention!
