Time Based Cluster Analysis for Automatic Blog Generation

Time Based Cluster Analysis for Automatic Blog Generation - Presentation Transcript

  1. Time Based Context Cluster Analysis for Automatic Blog Generation
    • Luca Costabello and Laurent-Walter Goix
    • Telecom Italia, Italy
  2. Context as Blog Content
    • User context is gaining importance
      • Location info
      • Nearby buddies
      • The surrounding environment in general
    • We mine context data to detect daily user actions
    • User actions are converted into natural text
    • Blog posts describing users' days make it possible to detect communities of users with similar behavioral patterns
  3. Context-Based Blog Generation
    • 1) Raw data gathering (daily actions)
    • 2) Offline cluster analysis
    • 3) Blog post generation
  4. System Architecture
  5. Cluster Analysis: Detecting User Actions
    Timestamp            Cell ID                Cell ID Virtual Place
    2007-10-03 11:02:33  222-1-61101-72162201   office,tilab
    2007-10-03 10:59:09  222-1-61101-72162201   office,tilab
    2007-10-03 10:55:46  222-1-61101-72162201   office,tilab
    2007-10-03 10:52:41  222-1-61101-64530928   n/a,n/a
    2007-10-03 10:48:59  222-1-61101-72162201   office,tilab
    2007-10-03 10:45:34  222-1-61101-72162201   office,tilab
    2007-10-03 10:42:11  222-1-61101-64530928   n/a,n/a
    2007-10-03 10:38:47  222-1-61101-72162201   office,tilab
    2007-10-03 10:37:47  222-1-61101-72162201   office,tilab
    2007-10-03 09:27:01  222-1-61101-72157899   office,tilab
    2007-10-03 08:58:11  222-1-61104-72386176   n/a,n/a
    2007-10-03 08:56:28  222-1-24650-121        n/a,n/a
    2007-10-03 08:56:05  222-1-24650-122        n/a,n/a
    2007-10-03 08:54:20  222-1-54650-923        n/a,n/a
    2007-10-03 08:51:31  222-1-61104-72395762   n/a,n/a
    2007-10-03 08:49:16  222-1-61104-72384437   n/a,n/a
    2007-10-03 08:48:47  222-1-61104-72395762   n/a,n/a
    2007-10-03 08:48:18  222-1-61104-72384437   n/a,n/a
    2007-10-03 08:47:50  222-1-61104-72395762   n/a,n/a
    2007-10-03 08:47:21  222-1-61104-72395762   n/a,n/a
    2007-10-03 08:46:51  222-1-61104-72384437   n/a,n/a
    2007-10-03 08:46:20  222-1-61104-72376116   n/a,n/a
    2007-10-03 08:45:15  222-1-61104-72395763   n/a,n/a
    2007-10-03 08:44:02  222-1-61104-72400263   n/a,n/a
    2007-10-03 08:42:33  222-1-61104-72395770   n/a,n/a
    2007-10-03 08:42:02  222-1-61104-72400262   n/a,n/a
    2007-10-03 08:40:08  222-1-24650-1281       residence,home
    2007-10-03 08:36:26  222-1-24650-1281       residence,home
    2007-10-03 08:33:02  222-1-24650-1281       residence,home
    • Cluster 1 (Static)
      • Start: 08:58
      • End: 11:02
      • CGI: 222-1-61101-162201
      • VP CGI: Office, TILab
      • VP Bth: Not available
    • Cluster 2 (Movement)
      • Start: 08:42
      • End: 08:56
      • CGI From: 222-1-24550-1281
      • CGI To: 222-1-24650-121
      • VP CGI From: Residence,home
      • VP CGI To: Office, TILab
      • VP Bth: Not available
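The context history on this slide is a plain list of timestamped cell observations, so a first step is turning each line into a typed record. The following is a minimal sketch, not the authors' code: the names `ContextRecord` and `parse_record` are hypothetical, and treating `n/a,n/a` as "no user-defined label" is an assumption based on how the slide displays unlabeled cells.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class ContextRecord(NamedTuple):
    timestamp: datetime
    cell_id: str            # e.g. "222-1-61101-72162201"
    label: Optional[str]    # user-defined virtual place, None when "n/a,n/a"


def parse_record(line: str) -> ContextRecord:
    """Parse one 'date time cell-id label' line of the context history."""
    date, time, cell_id, label = line.strip().split(maxsplit=3)
    timestamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
    # Assumption: "n/a,n/a" means the user never labeled this cell.
    return ContextRecord(timestamp, cell_id, None if label == "n/a,n/a" else label)


records = [parse_record(line) for line in [
    "2007-10-03 08:40:08 222-1-24650-1281 residence,home",
    "2007-10-03 08:42:02 222-1-61104-72400262 n/a,n/a",
    "2007-10-03 09:27:01 222-1-61101-72157899 office,tilab",
]]
```

Keeping the label as `None` rather than a placeholder string makes the later "is a user-defined label available?" checks explicit.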
  6. Clustering Algorithms Dimensions
    • Location
      • GSM/UMTS Cell IDs
      • User-defined Cell ID Labels
    • Time
      • Chronological order of actions must be respected
    • Categorical attributes: Euclidean distance not available
    • Time must be evaluated according to “temporal distance”
    • Ad-hoc algorithms had to be designed
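Because cell IDs and labels are categorical, two observations can only be compared for equality, while time can be compared by how far apart the timestamps are. The sketch below illustrates that idea under stated assumptions: the helper names, the rule "labels win over cell IDs when both records are labeled", and the 10-minute gap are all hypothetical choices, not taken from the slides.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

Record = Tuple[datetime, str, Optional[str]]  # (timestamp, cell_id, label)


def temporal_distance(a: Record, b: Record) -> timedelta:
    """Absolute time gap between two observations."""
    return abs(a[0] - b[0])


def same_place(a: Record, b: Record) -> bool:
    """Categorical comparison: equality only, no Euclidean distance."""
    if a[2] is not None and b[2] is not None:
        return a[2] == b[2]   # assumed rule: compare labels when both exist
    return a[1] == b[1]       # otherwise fall back to the raw cell ID


def can_share_cluster(a: Record, b: Record,
                      max_gap: timedelta = timedelta(minutes=10)) -> bool:
    """Hypothetical criterion: same place and close enough in time."""
    return same_place(a, b) and temporal_distance(a, b) <= max_gap


r1 = (datetime(2007, 10, 3, 10, 37, 47), "222-1-61101-72162201", "office,tilab")
r2 = (datetime(2007, 10, 3, 10, 38, 47), "222-1-61101-72162201", "office,tilab")
r3 = (datetime(2007, 10, 3, 9, 27, 1), "222-1-61101-72157899", "office,tilab")

print(can_share_cluster(r1, r2))  # same label, one minute apart
print(same_place(r1, r3))         # different cells, same label
print(can_share_cluster(r1, r3))  # same label, but more than an hour apart
```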
  7. Cell-Based Location Data Issues
    • Context updates occur with variable frequency
    • Detecting static situations VS detecting movement
    • Base station concentration affects context data patterns
    • Frequent cell handovers during static actions
  8. Compare&Merge Algorithm
    Context History:
    2007-10-03 11:02:33  222-1-61101-72162201   office,tilab
    2007-10-03 10:59:09  222-1-61101-72162201   office,tilab
    2007-10-03 10:55:46  222-1-61101-72162201   office,tilab
    2007-10-03 10:52:41  222-1-61101-64530928   n/a,n/a
    2007-10-03 10:48:59  222-1-61101-72162201   office,tilab
    2007-10-03 10:45:34  222-1-61101-72162201   office,tilab
    2007-10-03 10:42:11  222-1-61101-64530928   n/a,n/a
    2007-10-03 10:38:47  222-1-61101-72162201   office,tilab
    2007-10-03 10:37:47  222-1-61101-72162201   office,tilab
    2007-10-03 09:27:01  222-1-61101-72157899   office,tilab
    2007-10-03 08:58:11  222-1-61104-72386176   n/a,n/a
    2007-10-03 08:56:28  222-1-24650-121        n/a,n/a
    2007-10-03 08:56:05  222-1-24650-122        n/a,n/a
    2007-10-03 08:54:20  222-1-54650-923        n/a,n/a
    2007-10-03 08:51:31  222-1-61104-72395762   n/a,n/a
    2007-10-03 08:49:16  222-1-61104-72384437   n/a,n/a
    2007-10-03 08:48:47  222-1-61104-72395762   n/a,n/a
    2007-10-03 08:48:18  222-1-61104-72384437   n/a,n/a
    Diagram labels:
    • Context History
    • Preliminary Context Scan
    • Long Temporary Cluster
    • Short Temporary Clusters
    • Temporary Clusters Merge
    • Static Cluster, Movement Cluster, Static Cluster
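The slide only names the stages of Compare&Merge (preliminary scan, long and short temporary clusters, merge), so the following is one possible reading rather than the authors' algorithm. It assumes that a temporary cluster is a run of consecutive records at the same place, that a run lasting at least `min_static` is a static cluster, and that adjacent shorter runs are merged into a single movement cluster. The function names and the 5-minute threshold are hypothetical.

```python
from datetime import datetime, timedelta
from itertools import groupby


def rec(stamp, cell_id, label=None):
    """Build a (timestamp, cell_id, label) record; label is None for 'n/a,n/a'."""
    return (datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), cell_id, label)


def place_key(record):
    """Prefer the user-defined label; fall back to the raw cell ID."""
    _, cell_id, label = record
    return label if label is not None else cell_id


def compare_and_merge(history, min_static=timedelta(minutes=5)):
    """Return (kind, start, end) clusters from a chronologically sorted history."""
    clusters = []
    # Preliminary scan: consecutive records at the same place form a temporary cluster.
    for _, run in groupby(history, key=place_key):
        run = list(run)
        long_enough = run[-1][0] - run[0][0] >= min_static
        kind = "static" if long_enough else "movement"
        if kind == "movement" and clusters and clusters[-1][0] == "movement":
            # Merge step: adjacent short temporary clusters become one movement cluster.
            clusters[-1] = ("movement", clusters[-1][1], run[-1][0])
        else:
            clusters.append((kind, run[0][0], run[-1][0]))
    return clusters


day = "2007-10-03 "
history = [  # a subset of the slide's records, oldest first
    rec(day + "08:33:02", "222-1-24650-1281", "residence,home"),
    rec(day + "08:36:26", "222-1-24650-1281", "residence,home"),
    rec(day + "08:40:08", "222-1-24650-1281", "residence,home"),
    rec(day + "08:42:02", "222-1-61104-72400262"),
    rec(day + "08:44:02", "222-1-61104-72400263"),
    rec(day + "08:56:28", "222-1-24650-121"),
    rec(day + "09:27:01", "222-1-61101-72157899", "office,tilab"),
    rec(day + "10:37:47", "222-1-61101-72162201", "office,tilab"),
    rec(day + "10:38:47", "222-1-61101-72162201", "office,tilab"),
]
clusters = compare_and_merge(history)
for kind, start, end in clusters:
    print(kind, start.time(), end.time())
```

On this subset the sketch yields a static, a movement, and a static cluster. Note that an unlabeled cell appearing in the middle of a stay would split the run in this simplified version, which is the kind of handover sensitivity the slides attribute to Compare&Merge.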
  9. MultiLevel Sliding Window Algorithm
    • For each window iteration:
      • Check if any user-defined label is available
      • Detect user movement
      • Detect the most frequent position
      • Merge window data with the previous window iteration (if the detected position is the same)
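The per-window steps above can be sketched as follows. This is an illustration under several assumptions, not the presented algorithm: windows are fixed-length and non-overlapping, the "multilevel" aspect is not modeled, and movement is assumed whenever the most frequent position covers less than half of the window's records. The names `summarize_window` and `sliding_window_clusters` and the `dominance` parameter are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def rec(stamp, cell_id, label=None):
    """Build a (timestamp, cell_id, label) record; label is None for 'n/a,n/a'."""
    return (datetime.strptime("2007-10-03 " + stamp, "%Y-%m-%d %H:%M:%S"), cell_id, label)


def summarize_window(window, dominance=0.5):
    """Return (kind, position) for one window of records."""
    labels = [label for _, _, label in window if label is not None]
    # Prefer user-defined labels when any are available, else raw cell IDs.
    positions = labels if labels else [cell for _, cell, _ in window]
    # Most frequent position in the window.
    position, count = Counter(positions).most_common(1)[0]
    # Assumed movement rule: no position dominates the window.
    if count / len(window) < dominance:
        return ("movement", None)
    return ("static", position)


def sliding_window_clusters(history, length=timedelta(minutes=15)):
    """Return (kind, position, start, end) tuples from a sorted history."""
    clusters = []
    if not history:
        return clusters
    start, last = history[0][0], history[-1][0]
    while start <= last:
        window = [r for r in history if start <= r[0] < start + length]
        if window:  # empty windows are skipped
            kind, position = summarize_window(window)
            if clusters and clusters[-1][:2] == (kind, position):
                # Same detected position as the previous non-empty window: merge.
                clusters[-1] = (kind, position, clusters[-1][2], window[-1][0])
            else:
                clusters.append((kind, position, window[0][0], window[-1][0]))
        start += length
    return clusters


history = [  # a subset of the slide 5 records, oldest first
    rec("08:33:02", "222-1-24650-1281", "residence,home"),
    rec("08:36:26", "222-1-24650-1281", "residence,home"),
    rec("08:40:08", "222-1-24650-1281", "residence,home"),
    rec("08:42:02", "222-1-61104-72400262"),
    rec("08:44:02", "222-1-61104-72400263"),
    rec("08:46:20", "222-1-61104-72376116"),
    rec("08:49:16", "222-1-61104-72384437"),
    rec("08:54:20", "222-1-54650-923"),
    rec("08:56:28", "222-1-24650-121"),
    rec("09:27:01", "222-1-61101-72157899", "office,tilab"),
    rec("10:37:47", "222-1-61101-72162201", "office,tilab"),
    rec("10:38:47", "222-1-61101-72162201", "office,tilab"),
]
clusters = sliding_window_clusters(history)
```

With a 15-minute window the first cluster is reported as ending at 08:46:20 even though the last home record is at 08:40:08, which illustrates how cluster boundaries in this approach are only accurate to within one window length.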
  10. Algorithms Comparison
    • MultiLevel Sliding Window
      • Precision: lower than Compare&Merge (a 30-minute window leads to an error of less than 30 minutes)
      • Optimal usage: non-labeled areas; frequent cell handovers
      • Critical situations: none
    • Compare&Merge
      • Precision: very high in optimal situations (error of less than 2-5 minutes)
      • Optimal usage: good user labeling; cells with few handover issues
      • Critical situations: frequent cell handovers
  11. Cluster Analysis Accuracy VS User Perception
  12. From Clusters To Blog Post
    • Context Clusters
    • User Preferences
    • Action Detector
    • NLG (Natural Text Generation)
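The slides do not show how the text generation step works, so the following is only a template-based sketch of turning clusters into sentences. The dictionary keys, the sentence templates, and the function names are hypothetical, and user preferences are not modeled; the cluster values are the two summaries shown on slide 5.

```python
def describe(cluster: dict) -> str:
    """Render one cluster as a first-person sentence (hypothetical templates)."""
    if cluster["kind"] == "static":
        return f"From {cluster['start']} to {cluster['end']} I was at {cluster['place']}."
    return (f"Between {cluster['start']} and {cluster['end']} "
            f"I moved from {cluster['origin']} to {cluster['destination']}.")


def blog_post(clusters) -> str:
    """Join cluster sentences in chronological order (HH:MM strings sort correctly)."""
    ordered = sorted(clusters, key=lambda c: c["start"])
    return " ".join(describe(c) for c in ordered)


clusters = [
    {"kind": "static", "start": "08:58", "end": "11:02", "place": "Office, TILab"},
    {"kind": "movement", "start": "08:42", "end": "08:56",
     "origin": "Residence,home", "destination": "Office, TILab"},
]
post = blog_post(clusters)
print(post)
```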
  13. Results
    • Mining context history leads to user pattern discovery
    • Daily actions sharing
    • Detection of user communities, according to daily behaviors
    • Clustering accuracy VS personal memories perception
    • Movement detection
    • Location-labeling importance
    • Any Questions?
    Thank You! luca.costabello@guest.telecomitalia.it

Luca Costabello, 2 years ago

Presented at the Social Web Search and Mining Works more
