Retention of visitors coming to Palco 3.0 via the news in SAPO:  Decoy optimization Jindřich Tandler Luboš Popelínský Jan ...
<ul><ul><li>Readers of SAPO ( www.sapo.pt ) are presented with Palco news. </li></ul></ul><ul><ul><li>News appear occasion...
Main task <ul><ul><li>Task: improve current level of bouncing of users (users coming to  www.palcoprincipal.pt  through ne...
What else could be done? <ul><ul><li>Improving fidelity of already registered users by exploiting the social network </li>...
Current work <ul><ul><li>Data has been collected into the database: </li></ul></ul>
Current work <ul><ul><li>Social network structure has been extracted from the database to Pajek format </li></ul></ul><ul>...
Current work – network properties <ul><ul><li>Exploring social network properties to understand its specific features </li...
Possible future work <ul><ul><li>Clustering </li></ul></ul><ul><ul><li>division of users into groups with similar </li></u...
Data needed <ul><ul><li>Actual rate of bouncing, time spent on the pages, etc. (Google Analytics?) </li></ul></ul><ul><ul>...
Some other questions <ul><ul><li>To what extent is it possible to change the layout and content of Palco pages? </li></ul>...
Data overview
Data – statistics * * February 2010 Query No of results users 68314 listeners 50663 (74%) artists 17660 (26%) friendship t...
Data – users <ul><ul><li>User </li></ul></ul><ul><ul><ul><li>CreatedAt, LastLogin, IsActive </li></ul></ul></ul><ul><ul><l...
Data – users <ul><ul><li>Friend </li></ul></ul><ul><ul><ul><li>InviterUserId, InvitedUserId, StatusId, CreatedAt, UpdatedA...
Data - contact <ul><ul><li>Contact </li></ul></ul><ul><ul><ul><li>country_id, city_id, Place, Address, PhoneNumber, Homepa...
Data – activities <ul><ul><li>Owner_activities </li></ul></ul><ul><ul><ul><li>ActionId, OwnerId, TargetId, CreatorId, Crea...
Data - music <ul><ul><li>Playlist </li></ul></ul><ul><ul><ul><li>listener_id, photo_id, Name, Description, Hash, Slug, Pos...
Data – tags, comments <ul><ul><li>Tag </li></ul></ul><ul><ul><ul><li>Culture, Name, Hash, Slug, CleanSlug, Length, Created...
Data – bands <ul><ul><li>mainstreamBand </li></ul></ul><ul><ul><ul><li>Summary, About, Name, CreatedAt, UpdatedAt, LastFmL...
Upcoming SlideShare
Loading in …5
×

Lubos palcomaio2010

1,471 views
1,379 views

Published on

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
1,471
On SlideShare
0
From Embeds
0
Number of Embeds
479
Actions
Shares
0
Downloads
1
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide
  • … no, but now is changing, user can choose a layout item-based collaborative filtering
  • Lubos palcomaio2010

    1. 1. Retention of visitors coming to Palco 3.0 via the news in SAPO: Decoy optimization Jindřich Tandler Luboš Popelínský Jan Kocourek Knowledge Discovery Lab FI, Masaryk University Brno Czech Republic
    2. 2. <ul><ul><li>Readers of SAPO ( www.sapo.pt ) are presented with Palco news. </li></ul></ul><ul><ul><li>News appear occasionally that redirect the user to PalcoPrincipal ( www.palcoprincipal.pt ). </li></ul></ul><ul><ul><li>Problem: people who come to Palco in this way usually don't stay there long enough. </li></ul></ul><ul><ul><li>Question: How can we help Palco to retain them? </li></ul></ul>Introduction
    3. 3. Main task <ul><ul><li>Task: improve current level of bouncing of users (users coming to www.palcoprincipal.pt through news) </li></ul></ul><ul><ul><li>Hypothesis: Shared interests are more common among the users who clicked on a specific link (are viewing the same content) than among users who didn't (are not). </li></ul></ul><ul><ul><li>The link(s) clicked is probably the only information we can possibly know about users coming from www.sapo.pt </li></ul></ul>
    4. 4. What else could be done? <ul><ul><li>Improving fidelity of already registered users by exploiting the social network </li></ul></ul><ul><ul><li>Hypothesis: socially related users share interests </li></ul></ul><ul><ul><li>e.g. we can identify users of Palco who showed interest in particular items and use their activity to make suggestions to socially related users </li></ul></ul>
    5. 5. Current work <ul><ul><li>Data has been collected into the database: </li></ul></ul>
    6. 6. Current work <ul><ul><li>Social network structure has been extracted from the database to Pajek format </li></ul></ul><ul><ul><li>Using R software and igraph library we can compute assortative mixing coefficient </li></ul></ul><ul><ul><li>This can be used for verification of the second hypothesis (i.e. socially related users share interests) </li></ul></ul><ul><ul><li>R in combination with igraph will be used also for exploring the social network properties to get better understanding of its specific features </li></ul></ul>
    7. 7. Current work – network properties <ul><ul><li>Exploring social network properties to understand its specific features </li></ul></ul><ul><ul><li>Small world phenomenom, degree distribution, degree correlations, community structure, ... </li></ul></ul><ul><ul><li>A new student has been assigned to help with this task (Pavel Kocourek) </li></ul></ul>
    8. 8. Possible future work <ul><ul><li>Clustering </li></ul></ul><ul><ul><li>division of users into groups with similar </li></ul></ul><ul><ul><li>characteristics can be used for content </li></ul></ul><ul><ul><li>recommendation </li></ul></ul><ul><ul><li>Association rule mining </li></ul></ul><ul><ul><li>for discovering direct or indirect relationships between </li></ul></ul><ul><ul><li>web pages in users' browsing behavior </li></ul></ul>
    9. 9. Data needed <ul><ul><li>Actual rate of bouncing, time spent on the pages, etc. (Google Analytics?) </li></ul></ul><ul><ul><li>Clickstream data – browsing history of particular users (important for the main task) </li></ul></ul><ul><ul><li>Clickstream data assigned to the registered Palco users could be useful for mining as well </li></ul></ul><ul><ul><li>(privacy issues?) </li></ul></ul>
    10. 10. Some other questions <ul><ul><li>To what extent is it possible to change the layout and content of Palco pages? </li></ul></ul><ul><ul><li>What does it take to change something? (time, people involved, …) </li></ul></ul><ul><ul><li>What algorithm is currently being used for content recommendation? </li></ul></ul><ul><ul><li>Is it possible to perform AB testing of the site with the old and a new layout/content? </li></ul></ul>
    11. 11. Data overview
    12. 12. Data – statistics * * February 2010 Query No of results users 68314 listeners 50663 (74%) artists 17660 (26%) friendship ties 168342 (avg 2.5) mainstream bands 114062 fans 42228 comments 817370 tags 55018
    13. 13. Data – users <ul><ul><li>User </li></ul></ul><ul><ul><ul><li>CreatedAt, LastLogin, IsActive </li></ul></ul></ul><ul><ul><li>User_profile </li></ul></ul><ul><ul><ul><li>genre_id, Name, Slug, Category, Type, Culture, ModerationStatus, About, CreatedAt, UpdatedAt </li></ul></ul></ul><ul><ul><li>Artist </li></ul></ul><ul><ul><ul><li>MusicUploads, GooglePageRank, GooglePageRankUpdatedAt </li></ul></ul></ul>
    14. 14. Data – users <ul><ul><li>Friend </li></ul></ul><ul><ul><ul><li>InviterUserId, InvitedUserId, StatusId, CreatedAt, UpdatedAt </li></ul></ul></ul><ul><ul><li>Listener_data </li></ul></ul><ul><ul><ul><li>GenderId, BirthDate </li></ul></ul></ul>
    15. 15. Data - contact <ul><ul><li>Contact </li></ul></ul><ul><ul><ul><li>country_id, city_id, Place, Address, PhoneNumber, Homepage </li></ul></ul></ul><ul><ul><li>Country </li></ul></ul><ul><ul><li>City </li></ul></ul><ul><ul><li>State </li></ul></ul>
    16. 16. Data – activities <ul><ul><li>Owner_activities </li></ul></ul><ul><ul><ul><li>ActionId, OwnerId, TargetId, CreatorId, CreatedAt, Slug </li></ul></ul></ul><ul><ul><li>Activity_stream_action </li></ul></ul><ul><ul><ul><li>ActionTag (new event added, band music added, ...), PrivacyTagTitle, PrivacySubscriberTagTitle, ActionTitle </li></ul></ul></ul>
    17. 17. Data - music <ul><ul><li>Playlist </li></ul></ul><ul><ul><ul><li>listener_id, photo_id, Name, Description, Hash, Slug, Position, CreatedAt, UpdatedAt </li></ul></ul></ul><ul><ul><li>Playlist_music </li></ul></ul><ul><ul><ul><li>playlist_id, track_id, Position, CreatedAt </li></ul></ul></ul><ul><ul><li>Track </li></ul></ul><ul><ul><ul><li>album_id, Name, CreatedAt, UpdatedAt, Complete, Downloadable, Position, IdS3, FilenameS3, LinkS3, UploadedS3, Slug, Processed </li></ul></ul></ul><ul><ul><li>Genre </li></ul></ul>
    18. 18. Data – tags, comments <ul><ul><li>Tag </li></ul></ul><ul><ul><ul><li>Culture, Name, Hash, Slug, CleanSlug, Length, CreatedAt </li></ul></ul></ul><ul><ul><li>Comments </li></ul></ul><ul><ul><ul><li>AuthorId, OwnerId, CommentableModel, CommentableId, Aproved, Text, CreatedAt </li></ul></ul></ul>
    19. 19. Data – bands <ul><ul><li>mainstreamBand </li></ul></ul><ul><ul><ul><li>Summary, About, Name, CreatedAt, UpdatedAt, LastFmLink, Hash, Slug </li></ul></ul></ul><ul><ul><li>fans </li></ul></ul><ul><ul><ul><li>mainstreamBand_id, user_id </li></ul></ul></ul><ul><ul><li>Albums </li></ul></ul><ul><ul><ul><li>artist_id, photo_id, Name, Description, ReleaseDate, CreatedAt, UpdatedAt, Hash, Slug </li></ul></ul></ul>

    ×