Using Social Media Data for Online Television Recommendation Services at RTÉ Ireland
1. USING SOCIAL MEDIA DATA FOR ONLINE TELEVISION RECOMMENDATION
SERVICES AT RTÉ IRELAND
Unit for Information Mining and Retrieval (UIMR)
The Insight Centre for Data Analytics
National University of Ireland, Galway, Ireland
Andrea Barraza-Urbina, Hugo Hromic, Ioana Hulpus, Benjamin Heitmann, Conor Hayes, Neal Cantle
2nd Workshop on Recommendation Systems for
Television and Online Video – RecSysTV
September 19th, 2015
Vienna, Austria
2. Centre for Data Analytics
Agenda
Introduction
• RTÉ Use Case, Challenges, Research Goal
Our Approach
• Data Integration
• Proposed Solution Approach
Conclusion and Future Work
4. Centre for Data Analytics
Introduction
500+ watching hours
Live Broadcast
5. Centre for Data Analytics
Introduction
Recommendation Service
Challenges:
• Lack of personal preference data such as ratings.
• Lack of historical user session information.
• Dynamic inventory and limited life span of recommendable items.
7. Centre for Data Analytics
Introduction
"Amazing acting this evening Rachel #thevixenisback @RCPilkington @RTEFairCity great script"
"Don't think I've ever laughed as much, @damoandivor is hilarious tonight, still trying to catch my breath from all the laughing!"
"Really interesting RTÉ documentary about the late Brian Lenihan. Didn't see it advertised much. Worth a watch"
8. Centre for Data Analytics
RTÉ Project
RTÉ Programme Preferences Implicit RTÉ Programme Preferences
Transfer Learning
9. Centre for Data Analytics
Introduction: Research Goal
Design a Recommender System that can offer programme suggestions to an anonymous user, based solely on information about the user’s current session and inferred preferences of other RTÉ users extracted from social media.
10. Centre for Data Analytics
Agenda
Introduction
• RTÉ Use Case, Challenges, Research Goal
Our Approach
• Data Integration
• Proposed Solution Approach
Conclusion and Future Work
11. Centre for Data Analytics
Data Integration
[Diagram: graph-based data model spanning three layers: SOCIAL MEDIA, RTÉ CONTENT and LINKED DATA. Node labels: USER, TWEET, PROGRAMME (Fair City), EPISODE, MEDIA, LINKS, IMDb, DBpedia URI.]
Example tweet: "Amazing acting this evening Rachel #thevixenisback @RCPilkington @RTEFairCity great script"
12. Centre for Data Analytics
Generating Recommendations
Collaborative Filtering Recommendation
User-Item Matrix
13. Centre for Data Analytics
Generating Recommendations
Collaborative Filtering Recommendation
Item-Item Matrix: number of common users between all pairs of programmes
Limitation: Data Sparsity
14. Centre for Data Analytics
Community-based Recommendation
User – User Graph
Edge types: mentions, reply, retweets
Edge weights: w1, w2, w3
15. Centre for Data Analytics
Window time range:
From 2015-07-29 21:00:00
To 2015-07-30 20:59:59
# Users: 36648
# Tweets: 23232
16. Centre for Data Analytics
Community-based Recommendation
# Users: 34
# Tweets: 18
# Retweets: 30
[Laurenmx15] "I wonder who's gonna move in the old no 23 Slater household.. #eastenders"
[Tobiiiaaas] "Ah psycho Abi is back #EastEnders"
"(On the plus side yay an episode focusing on Dylan)! #Casualty"
17. Centre for Data Analytics
Generating Recommendations
Hybrid Recommendation
18. Centre for Data Analytics
Generating Recommendations
[Diagram: graph-based data model spanning three layers: SOCIAL MEDIA, RTÉ CONTENT and LINKED DATA. Node labels: USER, TWEET, PROGRAMME (Fair City), EPISODE, MEDIA, LINKS, IMDb, DBpedia URI.]
Example tweet: "Amazing acting this evening Rachel #thevixenisback @RCPilkington @RTEFairCity great script"
19. Centre for Data Analytics
Agenda
Introduction
• RTÉ Use Case, Challenges, Research Goal
Our Approach
• Data Integration
• Proposed Solution Approach
Conclusion and Future Work
20. Centre for Data Analytics
Conclusion and Future Work
• RTÉ project presents a representative use case to explore the potential use of microblogging data to enhance Recommendation Systems.
• We proposed novel recommendation approaches that learn the user-item models from Twitter, and apply them to the context of an online TV Player.
• Evaluate our proposed solutions with user testing.
Good Morning,
My name is Andrea Barraza
Today I am presenting the industrial case study at RTÉ Ireland which focuses on the use of social media for online television reco
RTÉ is the national provider of television and radio in Ireland.
The company makes its programmes available online through the RTÉ Player, a video-on-demand service.
The Player lets users catch up on programmes they might have missed or watch live broadcast streams.
The Player holds a wide variety of content:
- from both national and international providers
- from RTÉ's 4 different TV channels
- targeted at a wide audience, including children
Given RTÉ’s large product catalogue, a recommendation service is a desirable feature.
However, the development of such a service faces 3 main challenges:
- Users tend not to log in, and their personal preference data is private information.
- Historical user session information is likewise private.
- Products are only available for a limited time.
However, RTÉ communicates with its users through other platforms and highly encourages them to discuss and share their opinions and preferences concerning programmes in social media.
This provides a key opportunity to use data embedded within social media to better understand users' show preferences.
To further relate with its audience, RTÉ also publishes content using social media channels such as Twitter, Facebook and YouTube.
For example, from the RTÉ Player service a user can tweet about a programme.
And we believe we can find user programme preferences implicit…
If the Player is linked with a Facebook account, users can easily express their views on RTÉ content using buttons such as “Watching it”, “Like” and “Share”.
Also, the Player has a “Tweet” button that allows users to post on Twitter about a given episode. As can be seen, RTÉ highly encourages users to discuss and share their opinions and preferences concerning programmes in social media.
Finally, we highlight that RTÉ also offers content outside the RTÉ Player page. Besides possibly having their own website, programmes could also have: a Facebook page (e.g., to announce related news), Twitter user accounts (e.g., to promote events and gossip) and YouTube video channels (e.g., posting trailers).
The analysis of social media can give us great insights into how user communities are formed around TV shows, the impact of the TV shows, the sentiment(s) expressed with respect to the programmes, as well as insights into their content, and ultimately provide users with valuable recommendations.
How can we leverage knowledge from Social Media for Recommendation?
For a solution, we draw inspiration from transfer learning [10] approaches that deal with the application of a model learnt in a source domain, to draw conclusions about a separate target domain.
We would like to offer to an anonymous user … based on the user’s current session
Recommendations based on the preferences of other users extracted from social media
Current approaches:
Require a rich user profile
Do not consider input from social media
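To make the research goal concrete, here is a minimal sketch of session-based recommendation for an anonymous user. It assumes that programme-to-programme similarities have already been learnt from social media. The function name, the similarity values and the scoring rule (summing similarities over the session) are illustrative assumptions, not the method used in the project.

```python
from collections import defaultdict


def recommend_for_session(session_items, item_similarity, top_n=3):
    """Rank unseen programmes by summed similarity to the session's programmes."""
    scores = defaultdict(float)
    for seen in session_items:
        for candidate, sim in item_similarity.get(seen, {}).items():
            if candidate not in session_items:
                scores[candidate] += sim
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical similarities, as if inferred from other users' tweets.
item_similarity = {
    "Fair City": {"EastEnders": 0.6, "RTÉ News: Six One": 0.1},
    "The Sunday Game": {"RTÉ News: Six One": 0.7, "EastEnders": 0.05},
}

print(recommend_for_session(["Fair City"], item_similarity))
```

No user profile is needed here: the only input tied to the user is the list of programmes in the current session.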
To do this, we integrate data from heterogeneous sources
Our approach relies on the integration between three types of data: social media content, RTÉ content and Linked Data. For seamless integration, flexibility and scalability we propose a graph-based data model.
These connections are automatically created. Currently, for the annotation of the Tweets with Programmes, we use a manually curated set of hashtags as described in section 5.1. Episodes can further be inferred by analysing the date and time of the tweet as well as its content.
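The annotation step just described could look roughly like the sketch below. The hashtag table and the broadcast schedule are hypothetical stand-ins for the curated set and the RTÉ schedule data, and the time-window rule for episodes is only one simple way to use the tweet's date and time.

```python
import re
from datetime import datetime

# Hypothetical curated hashtag set (the real one is maintained manually).
HASHTAG_TO_PROGRAMME = {
    "#faircity": "Fair City",
    "#thevixenisback": "Fair City",
    "#eastenders": "EastEnders",
    "#casualty": "Casualty",
}

# Hypothetical schedule rows: (programme, episode id, start, end).
SCHEDULE = [
    ("Fair City", "episode-A",
     datetime(2015, 7, 29, 20, 0), datetime(2015, 7, 29, 20, 30)),
]

HASHTAG_RE = re.compile(r"#\w+")


def annotate_programmes(text):
    """Return the programmes whose curated hashtags appear in the tweet text."""
    tags = [tag.lower() for tag in HASHTAG_RE.findall(text)]
    return sorted({HASHTAG_TO_PROGRAMME[t] for t in tags if t in HASHTAG_TO_PROGRAMME})


def infer_episode(programme, tweeted_at, schedule=SCHEDULE):
    """Return the episode being broadcast when the tweet was sent, if any."""
    for prog, episode, start, end in schedule:
        if prog == programme and start <= tweeted_at <= end:
            return episode
    return None


tweet = ("Amazing acting this evening Rachel #thevixenisback "
         "@RCPilkington @RTEFairCity great script")
print(annotate_programmes(tweet))
```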
Named entity recognition, linking and disambiguation can be used for linking Tweet content to mentioned DBpedia concepts.
We can obtain a User-Item matrix.
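As an illustration, a User-Item matrix can be built from (user, programme) pairs produced by annotating tweets. The user names and the choice of tweet counts as matrix entries are assumptions made for this sketch.

```python
def build_user_item_matrix(annotated_tweets):
    """Build a dense count matrix from a list of (user, programme) pairs."""
    users = sorted({user for user, _ in annotated_tweets})
    programmes = sorted({programme for _, programme in annotated_tweets})
    user_index = {user: i for i, user in enumerate(users)}
    programme_index = {programme: j for j, programme in enumerate(programmes)}

    matrix = [[0] * len(programmes) for _ in users]
    for user, programme in annotated_tweets:
        # Each tweet about a programme counts as one implicit preference signal.
        matrix[user_index[user]][programme_index[programme]] += 1
    return users, programmes, matrix


# Hypothetical annotated tweets.
annotated_tweets = [
    ("user_a", "EastEnders"),
    ("user_b", "EastEnders"),
    ("user_b", "Casualty"),
    ("user_b", "EastEnders"),
]

users, programmes, matrix = build_user_item_matrix(annotated_tweets)
print(users, programmes, matrix)
```

In practice most entries are zero, which is the data sparsity limitation noted on the slides.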
This graph could be used by several RS approaches described in [4], such as random walk similarity using ItemRank.
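To give a feel for random-walk similarity on such a graph, here is a generic random walk with restart. It is a simplified stand-in and not the ItemRank algorithm itself; the graph, the restart probability and the iteration count are all hypothetical.

```python
def random_walk_with_restart(graph, seed, restart=0.15, iterations=50):
    """Approximate visiting probabilities of a walk that keeps returning to `seed`.

    `graph` maps every node to a dict {neighbour: weight}; every neighbour
    must also appear as a key.
    """
    scores = {node: 0.0 for node in graph}
    scores[seed] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, neighbours in graph.items():
            total = sum(neighbours.values())
            if total == 0:
                # Dangling node: send its probability mass back to the seed.
                nxt[seed] += (1 - restart) * scores[node]
                continue
            for neighbour, weight in neighbours.items():
                nxt[neighbour] += (1 - restart) * scores[node] * weight / total
        # Total mass stays 1, so the restart step adds exactly `restart`.
        nxt[seed] += restart
        scores = nxt
    return scores


# Hypothetical user-programme graph with unit weights.
graph = {
    "user_a": {"Fair City": 1, "EastEnders": 1},
    "user_b": {"EastEnders": 1, "Casualty": 1},
    "Fair City": {"user_a": 1},
    "EastEnders": {"user_a": 1, "user_b": 1},
    "Casualty": {"user_b": 1},
}

scores = random_walk_with_restart(graph, seed="Fair City")
print(sorted(scores, key=scores.get, reverse=True))
```

Programmes that share users with the seed programme end up with higher scores than programmes that are only reachable through longer paths.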
All communities
To form community clusters, the structure of Twitter interactions among users is considered: user mentions user, user retweets posts from other users, and user replies to user.
Note: "user retweets posts from other users" is used to make this clearer. Remember that this subgraph is a weighted user-user graph, so it is considered homogeneous, directed and weighted.
(48 tweets total)
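The weighted, directed user-user interaction graph from these notes can be sketched as below. The numeric weights standing in for w1, w2 and w3 are hypothetical. Weakly connected components are used as a deliberately simple stand-in for community detection, since the notes do not name the clustering algorithm.

```python
from collections import defaultdict

# Hypothetical values for the interaction weights w1, w2, w3.
INTERACTION_WEIGHTS = {"mention": 1.0, "reply": 2.0, "retweet": 3.0}


def build_user_graph(interactions):
    """Aggregate (source, target, kind) interactions into weighted directed edges."""
    graph = defaultdict(lambda: defaultdict(float))
    for source, target, kind in interactions:
        graph[source][target] += INTERACTION_WEIGHTS[kind]
    return {source: dict(targets) for source, targets in graph.items()}


def weakly_connected_components(graph):
    """Group users that are linked by any interaction, ignoring edge direction."""
    undirected = defaultdict(set)
    for source, targets in graph.items():
        for target in targets:
            undirected[source].add(target)
            undirected[target].add(source)

    seen, components = set(), []
    for start in sorted(undirected):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(undirected[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical interactions between users.
interactions = [
    ("u1", "u2", "retweet"),
    ("u1", "u2", "mention"),
    ("u3", "u2", "reply"),
    ("u4", "u5", "mention"),
]

user_graph = build_user_graph(interactions)
communities = weakly_connected_components(user_graph)
print(user_graph, communities)
```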
A specific community with its keywords and associated programmes
EastEnders is a British TV series (soap opera).
Holby City is a spin-off of Casualty… a British TV series (drama series).
Neighbours… also a soap opera, screened in the United Kingdom.
The main intuition is to use graph algorithms over the Linked Data in order to assess semantic relatedness between tweets and ultimately between programmes.
The key to this approach is the URI-URI matrix that captures semantic relatedness between URIs.
For instance, given a tweet about Ryan Tubridy, the computation of semantic relatedness identifies films transmitted by RTÉ in which Ryan Tubridy appears. A novel method for computing the URI-URI associations is Heitmann et al.’s approach [8], a form of spreading activation to provide content-based recommendations, where recommended items are represented by URIs.
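For intuition only, here is a generic spreading-activation pass over a small URI graph. It is not Heitmann et al.'s method; the `ex:` URIs, the decay factor and the number of steps are hypothetical.

```python
def spread_activation(graph, sources, decay=0.5, steps=2):
    """Propagate activation from source URIs to their neighbours for a few steps."""
    activation = {uri: 1.0 for uri in sources}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = {}
        for uri, energy in frontier.items():
            neighbours = graph.get(uri, [])
            if not neighbours:
                continue
            # Each neighbour receives an equal, decayed share of the energy.
            share = energy * decay / len(neighbours)
            for neighbour in neighbours:
                next_frontier[neighbour] = next_frontier.get(neighbour, 0.0) + share
        for uri, energy in next_frontier.items():
            activation[uri] = activation.get(uri, 0.0) + energy
        frontier = next_frontier
    return activation


# Hypothetical Linked Data fragment: a person linked to two programmes,
# which in turn share a genre.
uri_graph = {
    "ex:person": ["ex:programme_1", "ex:programme_2"],
    "ex:programme_1": ["ex:genre_x"],
    "ex:programme_2": ["ex:genre_x"],
}

activation = spread_activation(uri_graph, sources=["ex:person"])
print(activation)
```

The resulting activation values could fill one row of a URI-URI relatedness matrix, with the source URI as the row and the activated URIs as the columns.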
Introduction – RTÉ Use Case, Challenges and Research Goal
The RTÉ project presents a representative use case to explore the potential use of microblogging data to enhance Recommendation Systems. Due to the requirements of this use case we have to treat all users of the RTÉ Player as anonymous.
In this paper, we proposed novel recommendation approaches that learn the user-item models from Twitter, and apply them to the context of an online TV Player. To this end, we designed collaborative filtering as well as hybrid approaches, and showed how Linked Data can potentially be used to further enrich the knowledge extracted from social media.
We plan to evaluate our proposed RS solutions with user testing, both in a controlled environment and on-line. The purpose of the RS service is to increase user retention and increase user engagement with content on the RTÉ Player. Promising metrics to measure user engagement are the length of user sessions, the number of recurrent users, the number of videos watched by a user and the number of tweets sent via the RTÉ Player. Evaluations must be first carried out in a controlled environment with a sample of user volunteers. Later, A/B testing could be used to temporarily test a RS approach on the live service.
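The engagement metrics listed above could be computed from session logs along these lines. The log format, the field names and the use of an anonymous identifier such as a cookie id are assumptions made for this sketch.

```python
from collections import Counter


def engagement_metrics(sessions):
    """Summarise engagement from a non-empty list of session records."""
    if not sessions:
        raise ValueError("at least one session is required")
    sessions_per_user = Counter(s["user"] for s in sessions)
    return {
        "mean_session_minutes": sum(s["minutes"] for s in sessions) / len(sessions),
        # A recurrent user is one seen in more than one session.
        "recurrent_users": sum(1 for n in sessions_per_user.values() if n > 1),
        "videos_per_user": sum(s["videos"] for s in sessions) / len(sessions_per_user),
        "tweets_sent": sum(s["tweets"] for s in sessions),
    }


# Hypothetical session log; "user" is an anonymous identifier.
sessions = [
    {"user": "c1", "minutes": 10, "videos": 2, "tweets": 0},
    {"user": "c1", "minutes": 20, "videos": 1, "tweets": 1},
    {"user": "c2", "minutes": 30, "videos": 3, "tweets": 0},
]

metrics = engagement_metrics(sessions)
print(metrics)
```

In an A/B test, the same summary would be computed separately for the control group and for the group that sees the recommender, and the two sets of figures compared.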
We found that only 1,373 users tweeted about more than one programme (3.5% of retrieved users). We found 46 programmes had at least one shared user with another programme. However, most of the programme pairs had fewer than ten overlapping users. The maximum user overlap found was 145 users between the programmes RTÉ News: Six One and The Sunday Game. These users generated 888 tweets commenting on these two programmes.
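Overlap figures like these come from counting the users shared by each pair of programmes, which is the Item-Item matrix from the slides. Here is a minimal sketch with hypothetical programmes and users.

```python
from itertools import combinations


def common_user_counts(programme_users):
    """Count shared users for every pair of programmes with a non-empty overlap."""
    counts = {}
    for a, b in combinations(sorted(programme_users), 2):
        overlap = len(programme_users[a] & programme_users[b])
        if overlap:
            counts[(a, b)] = overlap
    return counts


# Hypothetical mapping from programme to the set of users who tweeted about it.
programme_users = {
    "Programme A": {"u1", "u2", "u3"},
    "Programme B": {"u2", "u3"},
    "Programme C": {"u9"},
}

overlaps = common_user_counts(programme_users)
print(overlaps)
```

Only pairs with at least one shared user are stored, which keeps the structure small when the overlap is as sparse as reported here.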