This document summarizes research on cross-system user modeling using data from Twitter, Flickr, and Delicious. It finds that a user's tag-based profiles on different systems overlap little, but aggregating profiles reveals more information. Cross-system user modeling significantly improves recommendation quality for new users compared to single-system or content-based approaches. The best strategies adapt to factors like the source and target systems. Overall, cross-system modeling is effective for cold-start recommendations by enriching sparse individual profiles.
Analyzing Cross-System User Modeling on the Social Web
1. Analyzing Cross-System User Modeling on the Social Web. ICWE, Cyprus, June 22, 2011. Fabian Abel, Samur Araujo, Qi Gao, Geert-Jan Houben. Web Information Systems, TU Delft.
2. What we do: Science and Engineering for the Personal Web. Domains: news, social media, cultural heritage, public data, e-learning. Applications: Personalized Recommendations, Personalized Search, Adaptive Systems. Layers: Analysis and User Modeling; Semantic Enrichment, Linkage and Alignment; user/usage data from the Social Web.
3. Pitfalls of User-adaptive Systems. User: "Hi, I'm your new user. Give me personalization!" System: "Hi, I have a new-user problem!" User: "Hi, I'm back and I have new interests." System: "Hi, I don't know that your interests changed!" The user's profiles are spread over Systems A, B, C and D and change over time. How can we tackle these problems?
5. Interweaving public user data with Mypes. Input: Google Profile URI (http://google.com/profile/XY). 1. Account Mapping: get other accounts of the user (SocialGraph API). 2. Social Web Aggregator: aggregate public profile data (blog posts, bookmarks, other media, social networking profiles). 3. Profile Alignment: map profiles to the target user model. 4. Semantic Enhancement: enrich data with semantics. 5. Analysis and user modeling: generate user profiles. Resources shown: WordNet®, FOAF, vCard. Output: aggregated, enriched profile (e.g., in RDF or vCard).
6. In this paper: User Modeling across Twitter, Flickr and Delicious. Twitter and Delicious: 1500 users, 80k + 620k tag assignments (TAS). Flickr and Delicious: 1467 users, 890k + 680k TAS. Example user Bob: Flickr tags "travel", "google IO"; Delicious bookmark http://claimid.com tagged "web", "socialmedia", "identity"; tweet "This is #interesting: http://bit.ly/3gt42f #web".
7. Tag-based user profiles. Tag-based profile of a user u = set of weighted tags, where each entry pairs a tag of interest t with a weight that indicates to what degree the user is interested in t. Lightweight weighting scheme: count how often the user applied the tag.
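The lightweight weighting scheme on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `build_tag_profile`, the lowercasing of tags, and the sample tag assignments are all assumptions made for the example.

```python
from collections import Counter


def build_tag_profile(tag_assignments, normalize=False):
    """Build a tag-based profile: tag -> weight.

    tag_assignments: iterable with one entry per time the user applied a tag.
    The weight is the usage count, or the relative frequency if normalize=True.
    Lowercasing is an assumption of this sketch, not something the slides state.
    """
    counts = Counter(tag.lower() for tag in tag_assignments)
    if not normalize:
        return dict(counts)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}


# Hypothetical tag assignments of one user in one system
bob_delicious = ["web", "socialmedia", "identity", "web", "Web"]
profile = build_tag_profile(bob_delicious)
print(profile)
```

Passing `normalize=True` turns the counts into relative frequencies, which is convenient when profiles of very different sizes are compared.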
9. Characteristics of tag-based profiles. What are the characteristics of the individual tag-based profiles in Twitter, Flickr and Delicious? How do the tag-based profiles of individual users overlap between the different systems?
11. Overlap of tag-based profiles. The overlap of a user's tag-based profiles is less than 10% for more than 90% of the users.
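A rough sketch of how such an overlap statistic could be computed is given below. The slides do not spell out the overlap measure, so this example assumes Jaccard overlap on the tag sets (shared tags divided by all distinct tags); the function names and the sample profiles are hypothetical.

```python
def profile_overlap(profile_a, profile_b):
    """Jaccard overlap of two tag-based profiles (an assumed measure)."""
    tags_a, tags_b = set(profile_a), set(profile_b)
    union = tags_a | tags_b
    if not union:
        return 0.0
    return len(tags_a & tags_b) / len(union)


def share_of_users_below(overlaps, threshold=0.10):
    """Fraction of users whose overlap is strictly below the threshold."""
    if not overlaps:
        return 0.0
    return sum(1 for o in overlaps if o < threshold) / len(overlaps)


# Hypothetical profiles of the same user in two systems
twitter = {"web": 2, "interesting": 1}
delicious = {"web": 3, "socialmedia": 1, "identity": 1}
print(profile_overlap(twitter, delicious))
```

With per-user overlaps collected in a list, `share_of_users_below(overlaps, 0.10)` gives the kind of figure quoted on the slide.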
12. Entropy of tag-based profiles: $entropy(T_u) = -\sum_{t \in T_u} p(t) \log_2 p(t)$, where p(t) = probability that t occurs in T_u, and T_u = tags in user profile P(u). [Plot: entropy of Twitter, Flickr, Delicious, Twitter & Delicious, and Flickr & Delicious profiles.] With respect to entropy, aggregated profiles reveal significantly more information than the service-specific profiles.
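The entropy comparison can be illustrated with a small Python sketch. It assumes the standard Shannon entropy with a base-2 logarithm and estimates p(t) from the tag weights of the profile; both choices, as well as the helper names and sample profiles, are assumptions of this example.

```python
import math
from collections import Counter


def profile_entropy(profile):
    """Shannon entropy (in bits) of a tag-based profile: tag -> weight."""
    total = sum(profile.values())
    if total <= 0:
        return 0.0
    entropy = 0.0
    for weight in profile.values():
        if weight > 0:
            p = weight / total  # estimated probability of the tag
            entropy -= p * math.log2(p)
    return entropy


def aggregate_profiles(*profiles):
    """Aggregate service-specific profiles by summing the tag weights."""
    merged = Counter()
    for profile in profiles:
        merged.update(profile)
    return dict(merged)


# Hypothetical service-specific profiles of one user
twitter = {"web": 1, "interesting": 1}
delicious = {"web": 1, "identity": 1}
aggregated = aggregate_profiles(twitter, delicious)
print(profile_entropy(twitter), profile_entropy(aggregated))
```

In this toy case the aggregated profile has a higher entropy than either service-specific profile, which mirrors the direction of the observation on the slide.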
13. Observations. Profile size varies from system to system (e.g. tag-based Twitter profiles are rather sparse). Tag-based profiles of an individual user overlap only a little (e.g. overlap is less than 10% for more than 90% of the users). Entropy of tag-based profiles: Twitter < Flickr < Delicious < aggregated profiles.
15. Evaluation: Recommending tags / bookmarks. Scenario: a new user arrives at Delicious ("Hi, I'm your new user. Give me personalization!") and the system has no profile yet. Cross-system user modeling provides a profile; a cosine-based recommender then suggests tags to explore and Web sites to bookmark. Ground truth: the actual tags and bookmarks of the user (leave-n-out evaluation). Question: How does cross-system user modeling impact the recommendation quality (in cold-start situations)?
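A cosine-based recommender of the kind named on this slide could look roughly like the following sketch. It assumes that each candidate (for example a bookmark) is represented as a weighted tag vector; that representation, the function names, and the sample data are assumptions, since the transcript does not describe the recommender in detail.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors given as dicts tag -> weight."""
    dot = sum(w * v.get(tag, 0.0) for tag, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(user_profile, candidates, k=10):
    """Rank candidates (id -> tag vector) by similarity to the user profile."""
    scored = [(cosine(user_profile, vec), item) for item, vec in candidates.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by id
    return [item for _, item in scored[:k]]


# Hypothetical profile built from a foreign system, and candidate bookmarks
profile = {"web": 2, "blog": 1}
bookmarks = {
    "site-a": {"web": 1},
    "site-b": {"java": 1},
    "site-c": {"web": 1, "blog": 1},
}
print(recommend(profile, bookmarks, k=2))
```

In a leave-n-out setting, the ranked list would be compared against the user's withheld tags or bookmarks.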
16. User Modeling Building Blocks. 1. Source: which tags should be contained in the profile? 2. Semantic Enrichment: further enrich/align tags? 3. Weighting Scheme: how to weight the tags? [Diagram: tags from System A are analyzed, enriched and weighted to form a profile for System B, e.g. tags t1 to t5 with weights 0.1, 0.1, 0.5, 0.2, 0.1.]
17. User Modeling Building Blocks (in this talk). Source: personal tags from the foreign system, or popular tags from the target system. Semantic Enrichment: enrich tags with similar tags (based on Jaro-Winkler similarity), or apply cross-system rules (if tag A was used in the foreign system, then add tag B). Weighting scheme: personal usage frequency in the foreign system, or global usage frequency in the target system. Examples: a) simJaro(blog, blogs) is high; b) cross-system rule: blog (foreign) → nikon (target). Example tags shown: web, blog, java, blogs, france. The target system requires a profile to compute recommendations.
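The two enrichment options can be sketched as follows. Note the substitution: the slide names Jaro-Winkler similarity, while this example uses `difflib.SequenceMatcher` from the Python standard library as a stand-in string similarity so that the block stays self-contained. The threshold, the choice to let an added tag inherit the weight of the tag that triggered it, the function names, and the rule table are all assumptions.

```python
from difflib import SequenceMatcher


def enrich_by_similarity(profile, vocabulary, threshold=0.85):
    """Add vocabulary tags that are string-similar to a profile tag.

    Stand-in similarity: SequenceMatcher ratio (not Jaro-Winkler).
    An added tag inherits the weight of the similar profile tag.
    """
    enriched = dict(profile)
    for tag, weight in profile.items():
        for candidate in vocabulary:
            if candidate in enriched:
                continue
            if SequenceMatcher(None, tag, candidate).ratio() >= threshold:
                enriched[candidate] = weight
    return enriched


def apply_cross_system_rules(profile, rules):
    """Apply rules of the form: foreign tag -> tags to add for the target system."""
    enriched = dict(profile)
    for foreign_tag, target_tags in rules.items():
        if foreign_tag in profile:
            for target_tag in target_tags:
                enriched.setdefault(target_tag, profile[foreign_tag])
    return enriched


# Hypothetical foreign profile, target vocabulary and rule table
foreign = {"web": 2, "blog": 1, "java": 1}
enriched_sim = enrich_by_similarity(foreign, ["blogs", "france"])
enriched_rules = apply_cross_system_rules(foreign, {"blog": ["nikon"]})
print(enriched_sim)
print(enriched_rules)
```

A real Jaro-Winkler implementation could be swapped in by replacing the single `SequenceMatcher` call; the rest of the sketch does not depend on the similarity function.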
18. Cross-System User Modeling for Cold-start Recommendations. Which user modeling strategies perform best in which context? How do the different building blocks of the user modeling strategies (e.g. source of user data) influence the quality of the tag-based profiles?
20. Tag recommendations: Twitter → Delicious. Cross-system strategies lead to significant improvement (impact of semantic enrichment is rather low). Chart annotations: "Significant improvements regarding all metrics!" and "Improvement regarding P@10, but 'global Delicious trend' performs better regarding MRR & S@1." [Chart: baseline vs. cross-system user modeling; strategies vary the profile source (popular, personal), the tag weights (personal, global tag frequencies) and the enrichment (similarity).]
21. Tag recommendations: Delicious → Twitter. Semantic enrichment (cross-system rules) allows for significant improvement regarding P@10. Chart annotation: "Significant improvements regarding all metrics!" Tag-based profile information from Delicious seems to be more valuable than hashtag-based Twitter profiles. [Chart: baseline vs. cross-system user modeling; strategies vary the profile source (popular, personal), the tag weights (personal, global tag frequencies) and the enrichment (cross-system rules).]
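The metrics referred to in the tag recommendation results (P@10, MRR and S@1) follow standard definitions, sketched below for a single ranked list. MRR is the mean of the reciprocal rank over all evaluated users. The function names and the sample ranking are assumptions for illustration.

```python
def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k recommendations that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / k


def success_at_k(ranked, relevant, k=1):
    """1.0 if at least one relevant item appears in the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is found."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean(values):
    """Average over users, e.g. mean(reciprocal ranks) gives MRR."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical recommendation list and withheld ground truth for one user
ranked = ["java", "web", "blog"]
relevant = {"web", "blog"}
print(precision_at_k(ranked, relevant, k=3),
      success_at_k(ranked, relevant, k=1),
      reciprocal_rank(ranked, relevant))
```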
22. Tag Recommendations: different settings. Cross-system user modeling allows for cold-start tag recommendations in Delicious: Twitter profiles are more appropriate than Flickr profiles. Cross-system user modeling is also beneficial for cold-start tag recommendations in Flickr. Cross-system user modeling has a significant impact on the recommendation performance. To optimize the performance, one has to adapt to the given application setting.
23. Bookmark Recommendations. Cross-system user modeling also achieves significant improvements for cold-start bookmark recommendations. Twitter is again a more appropriate source than Flickr. [Chart: baseline vs. cross-system UM.]
24. Conclusions. Characteristics of distributed tag-based profiles: the overlap of tag-based profiles that an individual user creates at different services is low; aggregated profiles reveal significantly more information (regarding entropy) than service-specific profiles. Performance of cross-system user modeling for cold-start recommendations: cross-system UM leads to tremendous (and significant) improvements of the tag and bookmark recommendation quality; to optimize the performance, one has to adapt the cross-system strategies to the concrete application setting. http://persweb.org
25. Thank you! Fabian Abel, Qi Gao, Geert-Jan Houben, Ke Tao. Datasets: http://wis.ewi.tudelft.nl/icwe2011/um/ Twitter: @persweb http://persweb.org
Editor's Notes
Observations: Even though the size of Flickr profiles is high, the entropy is rather low. Entropy of aggregated profiles is the highest.
Source: which tags do we put into the profile? Semantic enrichment: do we do something further with the tags that are already selected to be in the profile (here: do we add further tags)? Weighting scheme: how do we weigh the tags? In the paper we compare two dimensions: (i) the type of weighting (TF vs. TFxIDF) and (ii) where we count, i.e. whether we take the TF statistics from (a) the personal profile of the foreign system or (b) the "global statistics" of the target system. In this talk, we look only at (ii): we use TF and do not report on TFxIDF.
Here, we do “semantic enrichment” based on “Tag-similarity” (see slide 15: User Modeling Building Blocks)
Here, we do “semantic enrichment” based on “cross-system rules” (see slide 15: User Modeling Building Blocks)
Characteristics: overlap is small; still, one gets significantly more information. Performance: cross-system UM leads to very high improvements for cold-start recommendations (some personal information is better than nothing). To optimize, we need to know the characteristics of the system: we can be stupid and simply aggregate what we can get, which is fine as we will get improvements anyhow, but we can massively optimize if we carefully select the different building blocks of the cross-system UM strategy with respect to the given application (e.g. for recommending bookmarks in Delicious: select tags from the personal Twitter profile, but weigh them according to the global Delicious tag frequencies).
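The strategy in the closing example (personal tags from Twitter, weights from the global Delicious tag frequencies) can be sketched as below, next to the "popular tags" fallback. This is only an illustration under assumptions: the function names, the decision to drop tags that never occur in the target system, and the sample counts are not taken from the talk.

```python
def personal_tags_global_weights(foreign_profile, target_global_counts):
    """Keep the user's foreign tags, weighted by global usage in the target system.

    Tags that do not occur in the target system are dropped (an assumption).
    """
    return {
        tag: target_global_counts[tag]
        for tag in foreign_profile
        if target_global_counts.get(tag, 0) > 0
    }


def popular_profile(target_global_counts, n=10):
    """Non-personalized fallback: the n globally most used target tags."""
    top = sorted(target_global_counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
    return dict(top)


# Hypothetical personal Twitter profile and global Delicious tag counts
twitter_profile = {"web": 2, "interesting": 1, "java": 1}
delicious_global = {"web": 500, "design": 300, "java": 120, "blog": 90}

cross_system = personal_tags_global_weights(twitter_profile, delicious_global)
fallback = popular_profile(delicious_global, n=2)
print(cross_system)
print(fallback)
```

The first profile is personalized through the choice of tags while borrowing the target system's sense of tag importance; the second ignores the user entirely and corresponds to a popularity-style baseline.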