Parameters
1. Playcounts
2. Playlists
3. Ages
4. IDs
5. Number of friends (degrees)
Compare the averages obtained using RW and RWRW!
Methodology
Utilized lastfm APIs to obtain:
● user info
● number of friends (degree)
RW with UIS-WR
On the fly, we apply the RW formula:
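The RW estimate is assumed here to be the plain mean of the attribute over all visited users, updated as each user is sampled. A minimal sketch under that assumption follows; the `friends` and `playcounts` dictionaries are a hypothetical in-memory stand-in for the lastfm API calls, and the function name is illustrative only.

```python
import random

def random_walk_running_mean(friends, attribute, start, steps, seed=0):
    """Walk `steps` hops and return (running mean of `attribute`, sample count).

    friends:   dict user -> list of friends (hypothetical stand-in for API calls)
    attribute: dict user -> numeric value, e.g. playcount
    Assumes every user has at least one friend.
    """
    rng = random.Random(seed)
    current = start
    mean = 0.0
    n = 0
    for _ in range(steps):
        # Incremental mean update, so no list of samples has to be stored.
        n += 1
        mean += (attribute[current] - mean) / n
        # Move to a uniformly chosen friend; a user may be visited repeatedly.
        current = rng.choice(friends[current])
    return mean, n

# Toy graph: "hub" is friends with everyone, the others only with "hub".
friends = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
}
playcounts = {"hub": 1000, "a": 10, "b": 10, "c": 10}

rw_mean, visited = random_walk_running_mean(friends, playcounts, "hub", 1000)
true_mean = sum(playcounts.values()) / len(playcounts)
```

On this toy graph the walk returns to the hub on every other step, so the running mean ends well above the true mean of the four users.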
Methodology
For RWRW, we apply:
The weight Wv is set to the number of friends (degree).
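A re-weighted estimate with degree weights is commonly written as sum(x_v / W_v) / sum(1 / W_v) over the sampled users. The sketch below assumes that form and compares it with the plain mean. The sample list is made up for illustration, and skipping users with a hidden attribute or zero degree is an assumed policy, not necessarily what the crawler does.

```python
def rw_mean(samples):
    """Plain average over the sampled values (assumed form of the RW estimate)."""
    values = [x for x, _ in samples if x is not None]
    return sum(values) / len(values)

def rwrw_mean(samples):
    """Re-weighted average: sum(x_v / w_v) / sum(1 / w_v), with w_v = degree."""
    num = 0.0
    den = 0.0
    for x, w in samples:
        # Skip users that hide the attribute or report no friends (assumed policy).
        if x is None or w <= 0:
            continue
        num += x / w
        den += 1.0 / w
    return num / den

# Hypothetical (value, degree) samples in visiting order; None = attribute hidden.
samples = [(1000, 3), (10, 1), (1000, 3), (10, 1), (None, 2), (1000, 3), (10, 1)]

plain = rw_mean(samples)
reweighted = rwrw_mean(samples)
```

Dividing each sample by its degree cancels the extra visits that high-degree users receive, so `reweighted` is lower than `plain` whenever high-degree users carry the larger values.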
Results
Crawled for ~10 hours
Number of samples: 48000
Number of age samples: 36363 (not all users show their age)
Results - Ages
RW estimates lower average age values.
After about 25k samples, the age stabilizes.
There is a strong correlation between age and degree.
Results - Playlists
Most users do not have playlists.
RW estimates higher numbers of playlists.
Users with higher degrees tend to have more playlists.
Results - Playcounts
We found some users with playcounts on the order of millions.
RW estimates higher playcounts.
Users with higher degrees tend to have higher playcounts.
Results - IDs
Not yet stable.
RW estimates a lower average ID compared to RWRW.
A user with a lower ID generally has a higher degree.
Results - Degrees
RWRW reduces the bias toward high-degree nodes, which a random walk visits with higher probability.
The RWRW estimate is indeed close to the expected degree value.
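Why the degree average shows the bias so clearly can be sketched as follows. The sketch assumes an undirected, connected friendship graph on which the walk visits a node v with long-run frequency proportional to its degree d_v. The symbols n, V and d_v are introduced here for illustration, and the RWRW form with weight equal to degree is assumed.

```latex
\begin{align*}
\text{RW (plain mean):}\quad
  \frac{1}{n}\sum_{i=1}^{n} d_{v_i}
  &\;\longrightarrow\; \sum_{v \in V} \frac{d_v}{\sum_{u \in V} d_u}\, d_v
  = \frac{\sum_{v \in V} d_v^2}{\sum_{v \in V} d_v}
  \;\ge\; \frac{1}{|V|}\sum_{v \in V} d_v, \\
\text{RWRW } (x_v = w_v = d_v):\quad
  \frac{\sum_{i=1}^{n} d_{v_i}/d_{v_i}}{\sum_{i=1}^{n} 1/d_{v_i}}
  &= \frac{n}{\sum_{i=1}^{n} 1/d_{v_i}} .
\end{align*}
```

Under these assumptions the plain RW mean never falls below the true mean degree, while the RWRW degree estimate reduces to the harmonic mean of the sampled degrees.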
Conclusion
● A simple random walk in a social network generally results in biased averages.
  ○ A node with a higher degree has a higher probability of being discovered.
● RWRW normalizes the averages.
  ○ High variations do not abruptly impact the estimation.
  ○ RWRW reduces the biases of RW.
● Low variance means a smaller difference between RW and RWRW.
● Crawling lastfm poses many challenges
  ○ e.g.: 0 degree, banned users, huge playcounts
Questions
Check the code in:
● http://code.google.com/p/lastfm-rwrw/