This is the fifth lecture in the Social Web course (2014) at the VU University Amsterdam. Visit the website for more information: http://thesocialweb2014.wordpress.com/
Lecture 5: Personalization on the Social Web (2014)
1. Social Web
2014
Lecture V: Personalization on the Social Web
(some slides adopted from Fabian Abel)
Lora Aroyo
The Network Institute
VU University Amsterdam
2. theory & techniques for how to design & evaluate recommenders & user models to use in Social Web applications
Social Web 2014, Lora Aroyo!
3. Fig. 1 Functional model of tasks and sub-tasks specifically suited for SASs (Ilaria Torre, 2009)
4. User Modeling
How to infer & represent user information that supports a given application or context?
(Kevin Kelly)
5. User Modeling Challenge
• Application has to obtain, understand & exploit information about the user
• Information (need & context) about the user
• Inferring information about the user & representing it so that it can be consumed by the application
• Data relevant for inferring information about the user
6. User & Usage Data is Everywhere
• People leave traces on the Web and on their computers:
• Usage data, e.g., query logs, click-through data
• Social data, e.g., tags, (micro-)blog posts, comments, bookmarks, friend connections
• Documents, e.g., pictures, videos
• Personal data, e.g., affiliations, locations
• Products, applications, services - bought, used, installed
• Not only a user’s behavior, but also interactions of other users:
• “people can make statements about me”
• “people who are similar to me can reveal information about me”
• “social learning”: collaborative recommender systems
7. UM: Basic Concepts
• User Profile = data structure = a characterization of a user at a particular moment; represents what, from a given (system) perspective, there is to know about a user. The data in the profile can be explicitly given by the user or derived by the system
• User Model = definitions & rules for the interpretation of observations about the user & the translation of that interpretation into the characteristics in a user profile; the user model is the recipe for obtaining & interpreting user profiles
• User Modeling = the process of representing the user
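The profile/model distinction above can be made concrete in a few lines of Python (a minimal sketch, not from the slides; the concept names are invented): the user model is the rule that turns observations into a profile, and the profile is just the resulting data structure.

```python
from collections import Counter

def build_profile(observed_concepts):
    """User model: the rule that turns raw observations (here, a list
    of concepts the user interacted with) into a user profile."""
    counts = Counter(observed_concepts)
    total = sum(counts.values())
    # User profile: a snapshot of what the system knows at this moment,
    # as concept -> weight pairs.
    return {concept: n / total for concept, n in counts.items()}

profile = build_profile(["tennis", "tennis", "politics"])
```

Swapping the rule (e.g. decaying old observations) changes the model while the profile stays a plain concept-to-weight mapping.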
8. User Modeling Approaches
• Overlay User Modeling: describe user characteristics, e.g. “knowledge of a user”, “interests of a user” with respect to “ideal” characteristics
• Customizing: user explicitly provides & adjusts elements of the user profile
• User model elicitation: ask & observe the user; learn & improve the user profile successively (“interactive user modeling”)
• Stereotyping: stereotypical characteristics to describe a user
• User Relevance Modeling: learn/infer probabilities that a given item or concept is relevant for a user
Related scientific conference: http://umap2011.org/ Related journal: http://umuai.org/
9. Which approach best suits the conditions of the application?
10. Overlay User Models
• among the oldest user models
• used for modeling student knowledge
• the user is typically characterized in terms of domain concepts & hypotheses of the user’s knowledge about these concepts in relation to an (ideal) expert’s knowledge
• concept-value pairs
11. User Model Elicitation
• Ask the user explicitly, then learn:
• NLP, intelligent dialogues
• Bayesian networks, Hidden Markov models
• Observe the user, then learn:
• Logs, machine learning
• Clustering, classification, data mining
• Interactive user modeling: mixture of direct inputs of a user, observations and inferences
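A toy sketch of the "observe the user" route (the event types and their weights are invented for illustration): every logged interaction nudges the inferred interest in a topic, so the profile improves successively without asking the user anything.

```python
# Assumed weights per observed action type (illustrative values).
EVENT_WEIGHTS = {"click": 1.0, "bookmark": 3.0}

def update_profile(profile, topic, action):
    """Observation-based elicitation: each logged event strengthens
    the inferred interest in its topic."""
    profile[topic] = profile.get(topic, 0.0) + EVENT_WEIGHTS.get(action, 0.0)
    return profile

log = [("tennis", "click"), ("tennis", "bookmark"), ("politics", "click")]
profile = {}
for topic, action in log:
    update_profile(profile, topic, action)
# profile now ranks tennis (4.0) above politics (1.0)
```

A real system would combine this with explicit input from the user (the "interactive" mixture named above).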
13. User Stereotypes
• set of characteristics (e.g. attribute-value pairs) that describe a group of users
• a user is not assigned to a single stereotype - a user profile can feature characteristics of several different stereotypes
15. Can we infer a Twitter-based User Profile?
“I want my personalized news recommendations!”
Diagram: tweets are turned, via semantic enrichment, linkage and alignment, into a user profile that feeds a personalized news recommender; the user modeling step has 4 building blocks (slides 16-19).
(based on slides from Fabian Abel)
16. User Modeling Building Blocks
1. Temporal Constraints: which tweets of the user should be analyzed?
(a) time period: only tweets between a start and end point on the timeline (e.g. June 27 - July 11)
(b) temporal patterns: e.g. weekends only, or morning / afternoon / night
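The two temporal constraints can be expressed directly (a minimal sketch; the 14-day window and the example dates are illustrative):

```python
from datetime import datetime, timedelta

def in_window(tweet_time, end, days=14):
    """(a) time period: keep only tweets from the last `days` before `end`."""
    return end - timedelta(days=days) <= tweet_time <= end

def is_weekend(tweet_time):
    """(b) temporal pattern: separate weekend from weekday behaviour."""
    return tweet_time.weekday() >= 5  # 5 = Saturday, 6 = Sunday

end = datetime(2010, 7, 11)
tweets = [datetime(2010, 6, 27), datetime(2010, 7, 4), datetime(2010, 5, 1)]
recent = [t for t in tweets if in_window(t, end)]  # drops the May tweet
```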
17. User Modeling Building Blocks
2. Profile Type: what type of concepts should represent “interests”?
Example tweet: “Francesca Schiavone won French Open #fo2010”
• hashtag-based (#): fo2010
• entity-based: Francesca Schiavone, French Open
• topic-based (T): Sport
18. User Modeling Building Blocks
3. Semantic Enrichment: further enrich the semantics of tweets?
(a) tweet-based: “Francesca Schiavone won! http://bit.ly/2f4t7a” yields the concepts Francesca Schiavone, French Open, Tennis
(b) further enrichment: following the linked news item “Francesca wins French Open” (“Thirty in women's tennis is primordially old, an age when agility and desire recedes as the …”) adds concepts such as French Open and Tennis
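A toy version of tweet-based enrichment (a hand-made gazetteer stands in for a real NER / entity-linking service; the surface-to-concept mapping below is invented):

```python
# Surface string -> linked concepts; a stand-in for a real entity linker.
GAZETTEER = {
    "francesca schiavone": {"Francesca Schiavone", "Tennis"},
    "french open": {"French Open", "Tennis"},
}

def enrich(tweet):
    """Tweet-based enrichment: map surface strings in the tweet text
    onto the concepts they denote."""
    text = tweet.lower()
    concepts = set()
    for surface, linked in GAZETTEER.items():
        if surface in text:
            concepts |= linked
    return concepts
```

Step (b), further enrichment, would run the same extraction over the article behind the tweet's link and merge the resulting concepts in.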
19. User Modeling Building Blocks
4. Weighting Scheme: how to weight the concepts?
• Concept frequency (TF)
• TFxIDF
• Time-sensitive weighting
Example: the diagram shows weights such as 4, 6 and 3 for weight(Francesca Schiavone), weight(French Open) and weight(Tennis)
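The three weighting schemes can be sketched as follows (a minimal illustration; the 7-day half-life is an assumption, not from the lecture):

```python
import math
from collections import Counter

def tf(concepts):
    """Concept frequency: how often this user mentions each concept."""
    return dict(Counter(concepts))

def tfidf(user_concepts, all_profiles):
    """TFxIDF: concepts frequent for this user but rare across all
    users' profiles get higher weight."""
    weights = {}
    for concept, freq in tf(user_concepts).items():
        df = sum(1 for p in all_profiles if concept in p)
        weights[concept] = freq * math.log(len(all_profiles) / df)
    return weights

def time_decayed(weight, age_days, half_life=7):
    """Time-sensitive weighting: halve a weight every `half_life` days,
    so recent interests dominate the profile."""
    return weight * 0.5 ** (age_days / half_life)
```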
20. Observations
• Profile characteristics:
• Semantic enrichment solves sparsity problems
• Profiles change over time: recent profiles reflect current user demands better
• Temporal patterns: weekend profiles differ significantly from weekday profiles
• Impact on recommendations:
• The more fine-grained the concepts, the better the recommendation performance: entity-based > topic-based > hashtag-based
• Semantic enrichment improves recommendation quality
• Time-sensitivity (adapting to trends) improves performance
21. User Modeling
it is not about putting everything in a user profile
it is about making the right choices
22. User Adaptation
Knowing the user to adapt a system or interface
to improve the system functionality and user experience
23. User-Adaptive Systems
Diagram: observations, data and information about the user feed user modeling, which produces a user profile; profile analysis then drives adaptation decisions.
A. Jameson. Adaptive interfaces and agents. The HCI handbook: fundamentals, evolving technologies and emerging applications, pp. 305–330, 2003.
24. Last.fm adapts to your music taste
Diagram: the history of songs played, likes, bans, pauses and skips feeds user modeling (inferring the current musical taste) into a user profile of interests in genres, artists and tags; the profile is compared with the possible next songs to choose the next song to be played.
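The "compare profile with possible next songs" step can be sketched as cosine similarity between concept vectors (the song names and tag weights are invented; this is an illustration of the idea, not Last.fm's actual algorithm):

```python
import math

def cosine(a, b):
    """Similarity of two concept -> weight dicts (0.0 = no overlap)."""
    dot = sum(a[c] * b[c] for c in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

taste = {"indie": 0.8, "rock": 0.5}          # inferred current musical taste
candidates = {
    "song_a": {"indie": 1.0, "rock": 0.2},
    "song_b": {"jazz": 1.0},
}
# Pick the candidate whose concept vector best matches the profile.
next_song = max(candidates, key=lambda s: cosine(taste, candidates[s]))
```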
25. Issues in User-Adaptive Systems
• Overfitting, “bubble effects”, loss of serendipity problem:
• systems may adapt too strongly to the interests/behavior
• e.g., an adaptive radio station may always play the same or very similar songs
• We search for the right balance between novelty and relevance for the user
• “Lost in Hyperspace” problem:
• when adapting the navigation - i.e. the links on which users can click to find/access information
• e.g., re-ordering/hiding of menu items may lead to confusion
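The "right balance between novelty and relevance" can be made explicit as a blended score (the alpha value and the example scores are illustrative assumptions):

```python
def blended_rank(items, relevance, novelty, alpha=0.7):
    """alpha = 1.0 is pure relevance (filter-bubble risk),
    alpha = 0.0 is pure novelty (loss of relevance)."""
    def score(item):
        return alpha * relevance[item] + (1 - alpha) * novelty[item]
    return sorted(items, key=score, reverse=True)

relevance = {"song_a": 0.9, "song_b": 0.6}
novelty = {"song_a": 0.0, "song_b": 1.0}   # song_a was played many times
ranked = blended_rank(["song_a", "song_b"], relevance, novelty, alpha=0.5)
# with alpha=0.5 the fresh song_b (0.8) beats the overplayed song_a (0.45)
```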
26. What is good user modelling & personalisation?
27. Success Perspectives
• From the consumer perspective of an adaptive system: the adaptive system maximizes satisfaction of the user (hard to measure/obtain)
• From the provider perspective of an adaptive system: the adaptive system maximizes the profit (the influence of UM & personalization may be hard to measure/obtain)
28. Evaluation Strategies
• User studies: ask/observe (selected) people whether you did a good job
• Log analysis: analyze (click) data and infer whether you did a good job
• Evaluation of user modeling:
• measure the quality of profiles directly, e.g. measure overlap with existing (true) profiles, or let people judge the quality of the generated user profiles
• measure the quality of an application that exploits the user profile, e.g. apply the user modeling strategies in a recommender system
29. Evaluating User Modeling in RecSys
[Figure: evaluation setup — a user’s items are split along the time axis into training data and test data (ground truth); the user modeling strategies to compare (X, Y, Z) each feed the recommender, and the quality of the resulting recommendations is measured against the ground-truth items.]
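The setup in this figure can be sketched as a small evaluation harness: split each user's interaction history by time into training and test data, let each (hypothetical) user-modeling strategy produce recommendations from the training part, and score them against the held-out ground truth. All data and strategy names below are invented for illustration.

```python
def evaluate(strategy, train, test, k=3):
    """Precision@k of a strategy's recommendations against held-out test items."""
    recommended = strategy(train)[:k]
    hits = sum(1 for item in recommended if item in test)
    return hits / k

# Toy interaction history, ordered by time; the last two items are held out.
history = ["A", "B", "C", "A", "D", "E"]
train, test = history[:4], set(history[4:])

def strategy_x(train):
    # Recommend most recently seen items first (deduplicated)
    seen = []
    for item in reversed(train):
        if item not in seen:
            seen.append(item)
    return seen

def strategy_y(train):
    # A non-personalized strategy that ignores the user entirely
    return ["D", "E", "F"]

print(evaluate(strategy_x, train, test))  # 0.0 — no held-out item recommended
print(evaluate(strategy_y, train, test))  # 2/3 — two of three hits
```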
30. Possible Metrics
• The usual IR metrics:
• Precision: fraction of retrieved items that are relevant
• Recall: fraction of relevant items that have been retrieved
• F-Measure: (harmonic) mean of precision and recall
• Metrics for evaluating recommendation (rankings):
• Mean Reciprocal Rank (MRR) of first relevant item
• Success@k: probability that relevant item occurs within the
top k
• If a true ranking is given: rank correlations
• Precision@k, Recall@k & F-Measure@k
• Metrics for evaluating prediction of user preferences:
• MAE = Mean Absolute Error
• True/False Positives/Negatives
[Chart: performance of strategy X vs. the baseline over several runs — is strategy X better than the baseline?]
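The metrics listed above are straightforward to implement. In the sketch below, `ranking` is the ordered list of recommended items and `relevant` the ground-truth set; the example data is made up.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k retrieved items that are relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k

def recall_at_k(ranking, relevant, k):
    """Fraction of the relevant items retrieved in the top k."""
    return sum(1 for item in ranking[:k] if item in relevant) / len(relevant)

def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

def reciprocal_rank(ranking, relevant):
    """1/rank of the first relevant item (0 if none); MRR averages this over users."""
    for i, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1 / i
    return 0.0

def success_at_k(ranking, relevant, k):
    """1 if a relevant item occurs within the top k, else 0."""
    return 1.0 if any(item in relevant for item in ranking[:k]) else 0.0

ranking = ["A", "B", "C", "D"]
relevant = {"B", "D"}
print(precision_at_k(ranking, relevant, 2))  # 0.5
print(reciprocal_rank(ranking, relevant))    # 0.5 (first relevant item at rank 2)
```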
31. Example Evaluation
• [Rae et al.] a typical example of how to investigate and evaluate a proposal for improving (tag) recommendations (using social networks)
• Task: test how well the different strategies (different tag contexts) can be used for tag prediction/recommendation
• Steps:
1. Gather a dataset of tag data, part of which is used as input, and test the recommendation on the remaining tag data
2. Use the input data and calculate the predictions for the different strategies
3. Measure the performance using standard (IR) metrics: Precision of the top 5 recommended tags (P@5), Mean Reciprocal Rank (MRR), Mean Average Precision (MAP)
4. Test the results for statistical significance using a t-test, relative to the baseline (e.g. an existing or competitive approach)
[Rae et al. Improving Tag Recommendations Using Social Networks, RIAO’10]
32. Example Evaluation
• [Guy et al.] another example of a similar evaluation approach
• The strategies differ in the way people & tags are used: tag-based systems exhibit complex relationships between users, tags and items, and the strategies aim to find the aspects of these relationships that are relevant for modeling and recommendation
• The baseline is the ‘most popular’ tags: comparing the most popular tags to the tags predicted by a particular personalization strategy shows whether the personalization is worth the effort and is able to outperform this easily available baseline
[Guy et al. Social Media Recommendation based on People and Tags, SIGIR’10]
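The ‘most popular’ baseline mentioned above is trivial to compute, which is exactly why it makes a good yardstick: recommend the globally most frequent tags to everyone. The tag data below is invented for illustration.

```python
from collections import Counter

# (user, tag) assignment pairs across the whole system — invented toy data
tag_assignments = [
    ("u1", "python"), ("u1", "web"), ("u2", "web"),
    ("u2", "music"), ("u3", "web"), ("u3", "python"),
]

def most_popular_tags(assignments, k=2):
    """Non-personalized baseline: the k globally most frequent tags."""
    counts = Counter(tag for _, tag in assignments)
    return [tag for tag, _ in counts.most_common(k)]

print(most_popular_tags(tag_assignments))  # ['web', 'python']
```

A personalization strategy is only worth its extra complexity if it beats this list on metrics such as P@5 or MRR.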
42. Collaborative Filtering
• Memory-based: User-Item matrix: ratings/preferences of users => compute similarity between users & recommend items of similar users
• Model-based: Item-Item matrix: similarity (e.g. based on user ratings) between items => recommend items that are similar to the ones the user likes
• Model-based: Clustering: cluster users according to their preferences => recommend items of users that belong to the same cluster
• Model-based: Bayesian networks: P(u likes item B | u likes item A) = how likely is it that a user who likes item A will also like item B => learn the probabilities from user ratings/preferences
• Others: rule-based, other data mining techniques
[Figure: a small like-graph — u1 and u2 each like several movies; given the overlap in their likes, does u1 like Pulp Fiction?]
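The memory-based variant can be sketched directly from the User-Item matrix: compute cosine similarity between users over their co-rated items, then score the target user's unseen items by similarity-weighted ratings. The rating matrix here is toy data.

```python
from math import sqrt

ratings = {  # user -> {item: rating}, invented data
    "u1": {"Pulp Fiction": 5, "Alien": 4},
    "u2": {"Pulp Fiction": 5, "Alien": 5, "Up": 2},
    "u3": {"Up": 5, "Alien": 1},
}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

def recommend(user, ratings):
    """Score each item the user hasn't rated by similarity-weighted ratings."""
    scores = {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their)
        for item, r in their.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * r
    return max(scores, key=scores.get)

print(recommend("u1", ratings))  # "Up", contributed mostly by the similar user u2
```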
43. Memory vs. Model-based
• Memory-based:
• complete input data is required
• pre-computation not possible
• does not scale well
• high quality of recommendations
• Model-based:
• abstraction (model) of the input data
• pre-computation (partially) possible (the model has to be re-built from time to time)
• scales better
• abstraction may reduce recommendation quality
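The pre-computation point is the essence of the model-based column: an item-item similarity matrix can be built offline (and rebuilt from time to time), so that online recommendation reduces to lookups. A minimal sketch, with invented ratings:

```python
from math import sqrt
from itertools import combinations

ratings = {  # user -> {item: rating}, invented data
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 4, "B": 5, "C": 1},
    "u3": {"B": 2, "C": 5},
}

def item_vectors(ratings):
    """Transpose the User-Item matrix: item -> {user: rating}."""
    vecs = {}
    for user, items in ratings.items():
        for item, r in items.items():
            vecs.setdefault(item, {})[user] = r
    return vecs

def build_model(ratings):
    """Offline step: cosine similarity for every item pair."""
    vecs = item_vectors(ratings)
    sims = {}
    for a, b in combinations(vecs, 2):
        common = set(vecs[a]) & set(vecs[b])
        dot = sum(vecs[a][u] * vecs[b][u] for u in common)
        norm = (sqrt(sum(v * v for v in vecs[a].values())) *
                sqrt(sum(v * v for v in vecs[b].values())))
        sims[frozenset((a, b))] = dot / norm if norm else 0.0
    return sims

model = build_model(ratings)
# A and B are rated similarly by the same users, so they end up closest:
print(model[frozenset(("A", "B"))] > model[frozenset(("A", "C"))])  # True
```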
44. Social Networks & Interest Similarity
• collaborative filtering: ‘neighborhoods’ of people with similar interests & recommending items based on likings in the neighborhood
• limitations: next to ‘cold start’ and ‘sparsity’, the lack of control (over one’s neighborhood) is also a problem, i.e. one cannot add ‘trusted’ people, nor exclude ‘strange’ ones
• hence the interest in ‘social recommenders’, where the presence of social connections defines the similarity in interests (e.g. social tagging on CiteULike):
• does a social connection indicate user interest similarity?
• how much does users’ interest similarity depend on the strength of their connection?
• is it feasible to use a social network as a source for personalized recommendations?
[Lin & Brusilovsky, Social Networks and Interest Similarity: The Case of CiteULike, HT’10]
45. Conclusions
• unilaterally connected pairs have more common items/metadata/tags than non-connected pairs
• similarity is highest for direct connections and decreases as the distance between users in the social network increases
• users in a reciprocal relationship show significantly larger similarity than users in a unidirectional relationship
• traditional item-level similarity may be less reliable for finding similar users in social bookmarking systems
• peers connected by self-defined social connections can be a useful source for cross-recommendation
46. Content-based Recommendations
• Input: characteristics of items & the user’s interests expressed in terms of those characteristics => recommend items whose characteristics meet the user’s interests
• Techniques:
• Data mining methods: cluster items based on their characteristics => infer the user’s interest in the clusters
• IR methods: represent items & users as term vectors => compute the similarity between the user profile vector and the item vectors
• Utility-based methods: a utility function that takes an item as input; the parameters of the utility function are customized via the preferences of a user
47. Example news item: “Government stops renovation of tower bridge” Oct 13th 2011
[Slide: Tower Bridge is a combined bascule and suspension bridge in London, England, over the River Thames. Category: politics, england. Related Twiper news: @bob: Why do they stop to… [more]; @mary: London stops reno… [more]. Tower Bridge today: under construction.]
Content features as a weighted concept vector a = (db:Politics 0.2, db:Sports 0, db:Education 0, db:London 0.2, db:Tower_Bridge 0.4, db:Government 0.1, db:UK 0.1)
Weighting strategy:
- occurrence frequency
- normalize vectors (1-norm → the sum of the vector equals 1)
based on slides from Fabien Abel
48. User’s Twitter history
[Slide: tweets — “RT: Government stops renovation of tower bridge” Oct 13th 2011; “I am in London at the moment” Oct 13th 2011; “I am doing sports” Oct 12th 2011.]
User model as a weighted concept vector u = (db:Politics 0, db:Sports 0.1, db:Education 0, db:London 0.5, db:Tower_Bridge 0.2, db:Government 0.2, db:UK 0)
Weighting strategy:
- occurrence frequency (e.g. smoothed by occurrence time → recent concepts are more important)
- normalize vectors (1-norm → the sum of the vector equals 1)
based on slides from Fabien Abel
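Putting slides 47 and 48 together: the item vector a and the user-model vector u range over the same seven DBpedia concepts, so the IR-style matching from slide 46 is simply a cosine similarity between them. A minimal sketch:

```python
from math import sqrt

# Concept order: Politics, Sports, Education, London, Tower_Bridge, Government, UK
a = [0.2, 0,   0, 0.2, 0.4, 0.1, 0.1]  # the news item (slide 47)
u = [0,   0.1, 0, 0.5, 0.2, 0.2, 0]    # the user's Twitter-based profile (slide 48)

def cosine(x, y):
    """Cosine similarity between two dense concept-weight vectors."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return dot / (sqrt(sum(xi * xi for xi in x)) * sqrt(sum(yi * yi for yi in y)))

# Both vectors are 1-norm normalized (weights sum to 1), as the slides state.
print(round(cosine(a, u), 2))  # 0.67 — the item matches this user fairly well
```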
50. RecSys Issues
• Cold-start problem (new-user problem): no/little data available to infer the preferences of new users
• Changing user preferences: user interests may change over time
• Sparsity problem (new-item problem): item descriptions are sparse, e.g. not many users have rated or tagged an item
• Lack of diversity (overfitting): when adapting too strongly to the preferences of users, they may keep seeing the same/similar recommendations
• Use the right context: users do things that might not be relevant for their user model, e.g. try out things, do stuff for other people
• Research challenge: the right balance between serendipity & personalization
• Research challenge: the right way to use the influence of recommendations on the user’s behavior
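One common answer to the changing-preferences issue (and to the time smoothing mentioned on slide 48) is to weight each observation by an exponential decay on its age, so recent concepts count more. The half-life parameter below is an invented choice, not a prescribed value.

```python
from math import exp

def decayed_weights(observations, now, half_life=7.0):
    """observations: list of (concept, day); returns concept -> decayed weight.

    Each observation contributes exp(-age * ln2 / half_life), so a mention
    half_life days old counts roughly half as much as one from today.
    """
    weights = {}
    for concept, day in observations:
        age = now - day
        weights[concept] = weights.get(concept, 0.0) + exp(-age * 0.693 / half_life)
    return weights

# Toy history: one old "sports" mention, two recent "london" mentions
obs = [("sports", 0), ("london", 13), ("london", 13)]
w = decayed_weights(obs, now=14)
print(w["london"] > w["sports"])  # True — recent mentions dominate old ones
```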
52. Hands-on Teaser
• Your Facebook friends’ popularity in a spreadsheet
• Locations of your Facebook friends
• Tag cloud of your wall posts
image source: http://www.flickr.com/photos/bionicteaching/1375254387/