the effect of correlation coefficients on communities of recommenders
neal lathia, stephen hailes, licia capra
department of computer science, university college london
[email_address]
ACM SAC TRECK, Fortaleza, Brazil: March 2008
Trust, Recommendations, Evidence and other Collaboration Know-how
design methods to solve problems, for example,
… a method to classify content correctly: data -> intelligent process -> predicted ratings
our focus: k-nearest neighbours (kNN)
how do we model kNN collaborative filtering?
a graph of cooperating users, centred on "me": nodes = users; links = weighted according to similarity
to answer this question, we need to find the optimal weighting in terms of accuracy and coverage: the best similarity measure for the dataset, from the many available (and there are more still…)
concordance: proportion of agreement (Somers' d); each pair of ratings is concordant, discordant, or tied
[figure: example rating deviations +0.5, +3.0, -1.5, +1.5, +1.5]
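The slide does not spell out the formula, so the following is only a sketch of one possible concordance score. It assumes that a co-rated item counts as concordant when both users rate it on the same side of their own mean, discordant when they fall on opposite sides, and tied when either rating equals its user's mean, and that the score is (concordant - discordant) / (items - tied). The class name, method names, and sample ratings are hypothetical.

```java
// Hypothetical helper, not taken from the slides.
class ConcordanceSketch {

    // Returns (concordant - discordant) / (items - tied), or 0 if every item is tied.
    // Both arrays hold ratings for the same co-rated items, in the same order.
    static double concordance(double[] a, double[] b) {
        double meanA = mean(a);
        double meanB = mean(b);
        int concordant = 0, discordant = 0, tied = 0;
        for (int i = 0; i < a.length; i++) {
            double da = a[i] - meanA; // deviation from user a's mean rating
            double db = b[i] - meanB; // deviation from user b's mean rating
            if (da == 0.0 || db == 0.0) {
                tied++;               // one user is exactly neutral on this item
            } else if ((da > 0) == (db > 0)) {
                concordant++;         // both above or both below their means
            } else {
                discordant++;         // one above, one below
            }
        }
        int untied = a.length - tied;
        return untied == 0 ? 0.0 : (double) (concordant - discordant) / untied;
    }

    static double mean(double[] v) {
        double sum = 0;
        for (double x : v) sum += x;
        return sum / v.length;
    }

    public static void main(String[] args) {
        double[] a = {4, 5, 3, 1, 2}; // made-up ratings for illustration
        double[] b = {5, 4, 2, 2, 4};
        // 3 concordant, 1 discordant, 1 tied: (3 - 1) / (5 - 1)
        System.out.println(concordance(a, b)); // prints 0.5
    }
}
```

A score near +1 means the two users mostly agree about which items are above or below average for them; a score near -1 means they mostly disagree.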
community view of the graph (a very small example): [figure: graph centred on "me", each link labelled with a similarity value between -0.99 and 0.99]
or, put another way (a very small example): [figure: the same graph, each link labelled good, bad, or none]
what is the best way of generating the graph?
like this? (a very small example) [figure: graph centred on "me", links labelled good, bad, or none]
or like this? (a very small example) [figure: the same graph with a different assignment of good, bad, and none labels]
similarity values depend on the method used: there is no agreement between measures

my profile:         [2] [3] [1] [5] [3]
neighbour profile:  [4] [1] [3] [2] [3]

measure               similarity   interpretation
pearson               -0.50        bad
weighted-pearson      -0.05        near zero
cosine angle           0.76        good
co-rated proportion    1.00        very good
concordance           -0.06        near zero
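To make the disagreement concrete, here is a small sketch that recomputes three of the values from the two five-item profiles on this slide. The Pearson and cosine formulas are the standard ones. The weighted variant is an assumption: it scales Pearson by n/50 for n co-rated items, a common significance weighting that reproduces the -0.05 shown, but the slide does not state which weighting was used. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper, not taken from the slides.
class SimilaritySketch {

    // Pearson correlation over the co-rated items (here all five items are co-rated).
    static double pearson(double[] a, double[] b) {
        double meanA = mean(a), meanB = mean(b);
        double num = 0, sqA = 0, sqB = 0;
        for (int i = 0; i < a.length; i++) {
            double da = a[i] - meanA, db = b[i] - meanB;
            num += da * db;
            sqA += da * da;
            sqB += db * db;
        }
        double den = Math.sqrt(sqA * sqB);
        return den == 0 ? 0 : num / den;
    }

    // ASSUMPTION: significance weighting that scales by n/50 below 50 co-rated items.
    static double weightedPearson(double[] a, double[] b) {
        return pearson(a, b) * Math.min(a.length, 50) / 50.0;
    }

    // Cosine of the angle between the two raw rating vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0, sqA = 0, sqB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            sqA += a[i] * a[i];
            sqB += b[i] * b[i];
        }
        double den = Math.sqrt(sqA * sqB);
        return den == 0 ? 0 : dot / den;
    }

    static double mean(double[] v) {
        double sum = 0;
        for (double x : v) sum += x;
        return sum / v.length;
    }

    public static void main(String[] args) {
        double[] mine = {2, 3, 1, 5, 3};      // "my profile" from the slide
        double[] neighbour = {4, 1, 3, 2, 3}; // "neighbour profile" from the slide
        System.out.println(String.format(Locale.ROOT, "pearson          %.2f", pearson(mine, neighbour)));         // -0.50
        System.out.println(String.format(Locale.ROOT, "weighted-pearson %.2f", weightedPearson(mine, neighbour))); // -0.05
        System.out.println(String.format(Locale.ROOT, "cosine angle     %.2f", cosine(mine, neighbour)));          // 0.76
    }
}
```

The same pair of users comes out as dissimilar, neutral, or similar depending only on the measure, which is the point the slide makes.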
each method will change the distribution of similarity across the graph (nodes = users; links = weighted according to similarity)
… the pearson distribution
… the modified pearson distributions: weighted-PCC, constrained-PCC
… and other measures: somers' d, co-rated, cosine angle
an experiment with random numbers
what happens if we do this?
java.util.Random r = new java.util.Random();
for all neighbours i { similarity(i) = (r.nextDouble() * 2.0) - 1.0; }
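The loop above is pseudocode ("for all neighbours i"), so here is a minimal runnable sketch of the same idea: every neighbour is given a similarity drawn uniformly at random instead of one computed from ratings. Generalising the range to (low, high) is an assumption made so the same method can also cover a narrower range such as the R(0.5, 1.0) baseline named in the results; the class name, method name, neighbour count, and seed are hypothetical.

```java
import java.util.Random;

// Hypothetical helper, not taken from the slides.
class RandomSimilaritySketch {

    // Assigns each neighbour a uniform random similarity between low and high.
    // With low = -1.0 and high = 1.0 this matches (r.nextDouble() * 2.0) - 1.0.
    static double[] randomSimilarities(int neighbours, double low, double high, Random r) {
        double[] similarity = new double[neighbours];
        for (int i = 0; i < neighbours; i++) {
            similarity[i] = low + r.nextDouble() * (high - low);
        }
        return similarity;
    }

    public static void main(String[] args) {
        Random r = new Random(42); // fixed seed only to make runs repeatable
        double[] full = randomSimilarities(5, -1.0, 1.0, r);    // like R(-1.0, 1.0)
        double[] positive = randomSimilarities(5, 0.5, 1.0, r); // like R(0.5, 1.0)
        for (int i = 0; i < full.length; i++) {
            System.out.println("neighbour " + i + ": " + full[i] + " / " + positive[i]);
        }
    }
}
```

Because the weights ignore the ratings entirely, any accuracy such a graph achieves serves as a baseline against which the computed similarity measures can be judged.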
accuracy … cross-validation results in paper; movielens u1 subset…

Neighborhood  Co Rated  Somers' d  PCC     wPCC    R(0.5, 1.0)  Constant(1.0)  R(-1.0, 1.0)
1             0.9449    0.9492     1.1150  0.9596  1.0665       1.0406         1.0341
10            0.8498    0.8355     1.0455  0.8277  0.9595       0.9495         0.9689
30            0.7979    0.7931     0.9464  0.7847  0.8903       0.9108         0.8848
50            0.7852    0.7817     0.9007  0.7733  0.8584       0.8922         0.8498
100           0.7759    0.7728     0.8136  0.7647  0.8222       0.8511         0.8153
153           0.7726    0.7727     0.7817  0.7638  0.8053       0.8243         0.8024
229           0.7717    0.7771     0.7716  0.7679  0.7919       0.7992         0.8058
459           0.7718    0.7992     0.8073  0.8025  0.7773       0.7769         0.7811
coverage … cross-validation results in paper; movielens u1 subset… (best coverage when all of community used)

Neighborhood  Co Rated  Somers' d  PCC      wPCC     Oracle
1             0.67795   0.57165    0.96725  0.61375  0.00495
10            0.15455   0.0999     0.80515  0.1114   0.00495
30            0.0512    0.0407     0.57225  0.04135  0.00495
50            0.03065   0.0266     0.3641   0.0251   0.00495
100           0.01515   0.01645    0.08345  0.01485  0.00495
153           0.00945   0.0122     0.0273   0.01135  0.00495
229           0.00715   0.00965    0.01165  0.00915  0.00495
459           0.00495   0.0054     0.00495  0.00495  0.00495
why do we get these results?
a) our error measures are not good enough?
J. Herlocker, J. Konstan, L. Terveen, and J. Riedl. Evaluating collaborative filtering recommender systems. In ACM Transactions on Information Systems, volume 22, pages 5–53. ACM Press, 2004.
S.M. McNee, J. Riedl, and J.A. Konstan. Being accurate is not enough: How accuracy metrics have hurt recommender systems. In Extended Abstracts of the 2006 ACM Conference on Human Factors in Computing Systems. ACM Press, 2006.
b) is there something wrong with the dataset?
c) is user-similarity not strong enough to capture the best recommender relationships in the graph?
one proposal…
N. Lathia, S. Hailes, L. Capra. Trust-Based Collaborative Filtering. To appear in IFIPTM 2008: Joint iTrust and PST Conferences on Privacy, Trust Management and Security. Trondheim, Norway. June 2008.
is modelling filtering as a trust-management problem a potential solution?
once we do that, more questions arise…
current work
what other graph properties emerge from kNN collaborative filtering?
how does the graph evolve over time?
N. Lathia, S. Hailes, L. Capra. Evolving Communities of Recommenders: A Temporal Evaluation. Research Note RN/08/01, Department of Computer Science, University College London. Under Submission.
N. Lathia, S. Hailes, L. Capra. kNN User Filtering: A Temporal Implicit Social Network. Current Work.
questions?
read more: http://mobblog.cs.ucl.ac.uk

SAC TRECK 2008

  • 1. the effect of correlation coefficients on communities of recommenders neal lathia, stephen hailes, licia capra department of computer science university college london [email_address] ACM SAC TRECK, Fortaleza, Brazil: March 2008 Trust, Recommendations, Evidence and other Collaboration Know-how
  • 2.
  • 3.
  • 4. … a method to classify content correctly: data → intelligent process → predicted ratings. our focus: k-nearest neighbours (kNN)
  • 5. how do we model kNN collaborative filtering?
  • 6. a graph of cooperating users, centred on "me": nodes = users; links = weighted according to similarity
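One way to read slide 6 in code: each link leaving "me" carries a similarity weight, and a rating is predicted from the neighbours at the other end of those links. The sketch below assumes the common mean-centred weighted average for kNN collaborative filtering; the deck does not state the exact prediction formula, and the class name, method name and sample values are hypothetical.

```java
/** Hypothetical sketch: predicting one rating from the links leaving "me" in the graph. */
final class KnnGraphSketch {

    /**
     * Adds the similarity-weighted average of the neighbours' deviations from
     * their own mean ratings to the target user's mean rating.
     * Index i of each array describes neighbour i. Returns NaN when no link
     * carries any weight, i.e. the item cannot be predicted.
     */
    static double predict(double myMean, double[] similarity,
                          double[] neighbourRating, double[] neighbourMean) {
        double weighted = 0.0;
        double norm = 0.0;
        for (int i = 0; i < similarity.length; i++) {
            weighted += similarity[i] * (neighbourRating[i] - neighbourMean[i]);
            norm += Math.abs(similarity[i]);
        }
        return norm == 0.0 ? Double.NaN : myMean + weighted / norm;
    }

    public static void main(String[] args) {
        // Made-up link weights, neighbour ratings and neighbour means.
        double[] similarity = {0.57, -0.43};
        double[] ratings = {4.0, 2.0};
        double[] means = {3.0, 3.5};
        System.out.println(predict(3.2, similarity, ratings, means));
    }
}
```

Under this assumption, changing the similarity measure changes only the link weights, which is why the choice of measure reshapes the whole graph.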
  • 7. to answer this question, we need to find the optimal weighting (judged by accuracy and coverage): the best similarity measure for the dataset, from the many available; and there are more still…
  • 8. concordance: proportion of agreement (figure values: +0.5, +3.0, -1.5, +1.5, +1.5, +/-?); pairs are concordant, discordant or tied; measured with Somers’ d
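The concordance idea on slide 8 can be sketched as a count over co-rated items. The sketch assumes a per-item variant: a pair of ratings is concordant when both users deviate from their own mean in the same direction, discordant when the directions differ, and tied when either deviation is zero, with d = (C - D) / (N - T). The slide names only the three categories and Somers’ d, so the formula, names and data here are assumptions.

```java
/** Hypothetical sketch of a concordance-based similarity over co-rated items. */
final class ConcordanceSketch {

    /**
     * a[i] and b[i] are the two users' ratings for co-rated item i;
     * meanA and meanB are each user's mean rating.
     * Returns (concordant - discordant) / (items - tied), or 0 if every item is tied.
     */
    static double somersD(double[] a, double meanA, double[] b, double meanB) {
        int concordant = 0;
        int discordant = 0;
        int tied = 0;
        for (int i = 0; i < a.length; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            if (da == 0.0 || db == 0.0) {
                tied++;          // at least one user rated exactly at their mean
            } else if ((da > 0) == (db > 0)) {
                concordant++;    // both above or both below their means
            } else {
                discordant++;    // opposite directions
            }
        }
        int denominator = a.length - tied;
        return denominator == 0 ? 0.0 : (double) (concordant - discordant) / denominator;
    }

    public static void main(String[] args) {
        // Made-up ratings for four co-rated items.
        double[] a = {4, 5, 2, 3};
        double[] b = {5, 4, 1, 2};
        System.out.println(somersD(a, 3.0, b, 3.0));
    }
}
```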
  • 9. community view of the graph (a very small example): node "me"; link weights: -0.43, 0.57, -0.50, -0.65, 0.12, 0.87, 0.01, 0.57, 0.84, 0.22, 0.99, 0.82, 0.23, 0.39, 0.11, 0.68, 0.02, 0.41, 0.01, -0.99, 0.78
  • 10. or, put another way (a very small example): node "me"; weights shown: -0.43, 0.57; link labels: good, bad, none, good, good, good, good, none, none, good, bad, bad, good, good, good, good, none, good, good
  • 11. what is the best way of generating the graph?
  • 12. like this? (a very small example): node "me"; weights shown: -0.43, 0.57; link labels: good, bad, none, none, good, bad, bad, good, good, good, good, good, bad, none, none, good, none, bad, bad
  • 13. or like this? (a very small example): node "me"; weights shown: -0.43, 0.57; link labels: good, bad, none, good, good, good, good, none, none, bad, bad, bad, good, good, good, good, none, good, good
  • 14. similarity values depend on the method used: there is no agreement between measures
      my profile:        [2] [3] [1] [5] [3]
      neighbour profile: [4] [1] [3] [2] [3]
      measure              value   reading
      pearson              -0.50   bad
      weighted-pearson     -0.05   near zero
      cosine angle          0.76   good
      co-rated proportion   1.00   very good
      concordance          -0.06   near zero
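Three of the values on slide 14 can be checked directly from the two profiles. The sketch assumes the first five ratings are "my profile" and the last five the neighbour's, and it assumes weighted-pearson scales the coefficient by n/50 when fewer than 50 items are co-rated. The threshold of 50 is not stated on the slide; it is an assumption that happens to reproduce the -0.05 shown. All class and method names are hypothetical.

```java
import java.util.Locale;

/** Hypothetical sketch: pearson, weighted-pearson and cosine on the slide's two profiles. */
final class SimilarityMeasuresSketch {

    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    /** Pearson correlation over co-rated items; 0 when either profile has no variance. */
    static double pearson(double[] a, double[] b) {
        double meanA = mean(a);
        double meanB = mean(b);
        double cov = 0.0, varA = 0.0, varB = 0.0;
        for (int i = 0; i < a.length; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        double denom = Math.sqrt(varA * varB);
        return denom == 0.0 ? 0.0 : cov / denom;
    }

    /** Assumed significance weighting: shrink the coefficient when few items are co-rated. */
    static double weightedPearson(double[] a, double[] b, int threshold) {
        double scale = Math.min(1.0, (double) a.length / threshold);
        return pearson(a, b) * scale;
    }

    /** Cosine of the angle between the two raw rating vectors. */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double denom = Math.sqrt(normA * normB);
        return denom == 0.0 ? 0.0 : dot / denom;
    }

    public static void main(String[] args) {
        double[] mine = {2, 3, 1, 5, 3};
        double[] neighbour = {4, 1, 3, 2, 3};
        System.out.println(String.format(Locale.ROOT, "pearson %.2f", pearson(mine, neighbour)));
        System.out.println(String.format(Locale.ROOT, "weighted-pearson %.2f", weightedPearson(mine, neighbour, 50)));
        System.out.println(String.format(Locale.ROOT, "cosine angle %.2f", cosine(mine, neighbour)));
    }
}
```

With these inputs the three measures round to -0.50, -0.05 and 0.76, matching the slide: the same pair of users looks "bad", "near zero" or "good" depending only on the measure.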
  • 15. each method will change the distribution of similarity across the graph: nodes = users; links = weighted according to similarity
  • 16. … the pearson distribution
  • 17. … the modified pearson distributions: weighted-PCC, constrained-PCC
  • 18. … and other measures: somers’ d, co-rated, cosine angle
  • 19. an experiment with random numbers
  • 20. what happens if we do this? java.util.Random r = new java.util.Random(); for all neighbours i of me { similarity(i) = (r.nextDouble() * 2.0) - 1.0; }
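A runnable take on slide 20's pseudocode, extended to the other two baselines named in the results (R(0.5, 1.0) and Constant(1.0)). The class and method names are hypothetical, and the fixed seed is there only to make runs repeatable; the slide uses an unseeded generator.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch: similarity values that ignore the rating data entirely. */
final class RandomSimilaritySketch {

    /** One uniform random similarity between low and high for each neighbour. */
    static double[] uniform(int neighbours, double low, double high, Random r) {
        double[] similarity = new double[neighbours];
        for (int i = 0; i < neighbours; i++) {
            // Same arithmetic as the slide when low = -1.0 and high = 1.0.
            similarity[i] = low + r.nextDouble() * (high - low);
        }
        return similarity;
    }

    /** The same similarity value for every neighbour. */
    static double[] constant(int neighbours, double value) {
        double[] similarity = new double[neighbours];
        Arrays.fill(similarity, value);
        return similarity;
    }

    public static void main(String[] args) {
        Random r = new Random(2008);                      // arbitrary seed
        double[] anywhere = uniform(5, -1.0, 1.0, r);     // R(-1.0, 1.0)
        double[] positive = uniform(5, 0.5, 1.0, r);      // R(0.5, 1.0)
        double[] flat = constant(5, 1.0);                 // Constant(1.0)
        System.out.println(Arrays.toString(anywhere));
        System.out.println(Arrays.toString(positive));
        System.out.println(Arrays.toString(flat));
    }
}
```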
  • 21. accuracy … cross-validation results in paper; movielens u1 subset…
      Neighborhood  Co Rated  Somers’ d  PCC     wPCC    R(0.5, 1.0)  Constant(1.0)  R(-1.0, 1.0)
      1             0.9449    0.9492     1.1150  0.9596  1.0665       1.0406         1.0341
      10            0.8498    0.8355     1.0455  0.8277  0.9595       0.9495         0.9689
      30            0.7979    0.7931     0.9464  0.7847  0.8903       0.9108         0.8848
      50            0.7852    0.7817     0.9007  0.7733  0.8584       0.8922         0.8498
      100           0.7759    0.7728     0.8136  0.7647  0.8222       0.8511         0.8153
      153           0.7726    0.7727     0.7817  0.7638  0.8053       0.8243         0.8024
      229           0.7717    0.7771     0.7716  0.7679  0.7919       0.7992         0.8058
      459           0.7718    0.7992     0.8073  0.8025  0.7773       0.7769         0.7811
  • 22. coverage … cross-validation results in paper; movielens u1 subset… (best coverage when all of community used)
      Neighborhood  Co Rated  Somers’ d  PCC      wPCC     Oracle
      1             0.67795   0.57165    0.96725  0.61375  0.00495
      10            0.15455   0.0999     0.80515  0.1114   0.00495
      30            0.0512    0.0407     0.57225  0.04135  0.00495
      50            0.03065   0.0266     0.3641   0.0251   0.00495
      100           0.01515   0.01645    0.08345  0.01485  0.00495
      153           0.00945   0.0122     0.0273   0.01135  0.00495
      229           0.00715   0.00965    0.01165  0.00915  0.00495
      459           0.00495   0.0054     0.00495  0.00495  0.00495
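The two evaluation figures on slides 21 and 22 can be sketched as follows. This assumes accuracy is reported as mean absolute error and that the coverage numbers are the fraction of test ratings for which no prediction could be made (they shrink as the neighbourhood grows). Neither definition is spelled out on the slides, and treating unpredictable items as NaN and leaving them out of the error is a further simplification. All names are hypothetical.

```java
/** Hypothetical sketch: error and uncovered fraction over a set of test ratings. */
final class EvaluationSketch {

    /** Mean absolute error over the test ratings that received a prediction (NaN = none). */
    static double meanAbsoluteError(double[] predicted, double[] actual) {
        double total = 0.0;
        int covered = 0;
        for (int i = 0; i < predicted.length; i++) {
            if (!Double.isNaN(predicted[i])) {
                total += Math.abs(predicted[i] - actual[i]);
                covered++;
            }
        }
        return covered == 0 ? Double.NaN : total / covered;
    }

    /** Fraction of test ratings for which no prediction could be made. */
    static double uncoveredFraction(double[] predicted) {
        if (predicted.length == 0) return 0.0;
        int uncovered = 0;
        for (double p : predicted) {
            if (Double.isNaN(p)) uncovered++;
        }
        return (double) uncovered / predicted.length;
    }

    public static void main(String[] args) {
        // Made-up predictions; NaN marks an item no neighbour had rated.
        double[] predicted = {4.0, Double.NaN, 2.0, 3.0};
        double[] actual = {5.0, 3.0, 2.0, 1.0};
        System.out.println(meanAbsoluteError(predicted, actual));
        System.out.println(uncoveredFraction(predicted));
    }
}
```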
  • 23. why do we get these results?
  • 24. a) our error measures are not good enough?
      J. Herlocker, J. Konstan, L. Terveen, and J. Riedl. Evaluating collaborative filtering recommender systems. In ACM Transactions on Information Systems, volume 22, pages 5–53. ACM Press, 2004.
      S.M. McNee, J. Riedl, and J.A. Konstan. Being accurate is not enough: How accuracy metrics have hurt recommender systems. In Extended Abstracts of the 2006 ACM Conference on Human Factors in Computing Systems. ACM Press, 2006.
  • 25. b) is there something wrong with the dataset?
  • 26. c) is user-similarity not strong enough to capture the best recommender relationships in the graph?
  • 27. one proposal… is modelling filtering as a trust-management problem a potential solution? once we do that, more questions arise…
      N. Lathia, S. Hailes, L. Capra. Trust-Based Collaborative Filtering. To appear in IFIPTM 2008: Joint iTrust and PST Conferences on Privacy, Trust Management and Security. Trondheim, Norway. June 2008.
  • 28. current work: what other graph properties emerge from kNN collaborative filtering? how does the graph evolve over time?
      N. Lathia, S. Hailes, L. Capra. Evolving Communities of Recommenders: A Temporal Evaluation. Research Note RN/08/01, Department of Computer Science, University College London. Under Submission.
      N. Lathia, S. Hailes, L. Capra. kNN User Filtering: A Temporal Implicit Social Network. Current Work.
  • 29. questions? read more: http://mobblog.cs.ucl.ac.uk (trust, recommendations, …)