Travel route recommendations using geotagged photos Eli Boyarski
This document describes a travel route recommendation system that uses geotagged photos from Flickr. The system identifies landmarks from photo locations using mean-shift clustering and names them using tags. It estimates travel times between landmarks based on timestamps. A Markov model and topic model are used to generate personalized landmark sequences for a user based on their photo history and interests. The system was tested on over 690,000 photos across several cities and showed improved accuracy over alternative models in predicting landmark visits and travel times. Future work could integrate additional user data and consider time-varying factors.
TTC16: Gadi Bashvitz - Travel Booking. Personalized Maksim Izmaylov
OLSET provides APIs that enable personalization to improve hotel booking conversions for OTAs, TMCs, and other travel providers. Personalization is common in retail but not for travel booking, where hotel attachment rates are low. OLSET's algorithms match travelers to hotels based on profiles, reviews, and other data to provide personalized recommendations tailored for each traveler type. A/B tests showed OLSET generated 13% higher margins and 16% higher conversions than baseline products.
[ECWEB2012] Differential Context Relaxation for Context-Aware Travel Recommend... YONG ZHENG
Context-aware recommendation (CARS) has been shown to be an effective approach to recommendation in a number of domains. However, the problem of identifying appropriate contextual variables remains: using too many contextual variables risks a drastic increase in dimensionality and a loss of accuracy in recommendation. In this paper, we propose a novel treatment of context – identifying influential contexts for different algorithm components instead of for the whole algorithm. Based on this idea, we take traditional user-based collaborative filtering (CF) as an example, decompose it into three context-sensitive components, and propose a hybrid contextual approach. We then identify appropriate relaxations of contextual constraints for each algorithm component. The effectiveness of context relaxation is demonstrated by comparison of three algorithms using a travel data set: a context-ignorant approach, contextual pre-filtering, and our hybrid contextual algorithm. The experiments show that choosing an appropriate relaxation of the contextual constraints for each component of an algorithm outperforms strict application of the context.
Applications of Machine Learning to Location-based Social Networks Joan Capdevila Pujol
This document summarizes an application of machine learning techniques to location-based social networks. It discusses two applications:
1) GeoSRS, a hybrid social recommender system that provides personalized venue recommendations to users. It extracts data from Foursquare using an API, performs text modeling on tip content, and generates recommendations using both collaborative and content-based approaches.
2) Tweet-SCAN, an event discovery technique that identifies dense groups of geolocated tweets close in space, time, and topic to discover real-world events. It extends the DBSCAN clustering algorithm and represents tweet topics using probabilistic models. The technique is evaluated on tweets from Barcelona events.
The document describes a proposed web-based package tour reservation system for R-Rom-D Tour Co., Ltd. to address issues with their current manual system. The new system will allow customers to browse and reserve tours online through a website. It will integrate a customer database and enable reports for management. The 4-month project will include system analysis, design, implementation, and testing before launch.
[Paper presentation] 20160725 A Random Walk Around the City: New Venue Recommendation in Lo... Sanghoon Yoon
A Random Walk Around the City: New Venue Recommendation in Location-Based Social Networks
Anastasios Noulas, ASE/IEEE International Conference on Social Computing, 2012
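Presenter: 김성현 (master's student, Seoul National University Graduate School of Convergence Science and Technology)
Presentation date: 2018.6.
This talk is a post-mortem of taking part in the Naver hackathon: how I approached the given problem, how I handled the situations that came up while solving it, and which methods I used and tried along the way. It also reviews which methods I would try if I did it again now, and my overall problem-solving strategy.
Understanding the problem
Which problem to choose?
Choosing between word-level and character-level models
Model selection
Why LSTM?
Minor choices
Self Attention
Model training and regularization
Start training first
Trying well-known regularization methods
Weight Dropout
Variational Dropout
L2 Weight Decay
Increasing the model size
Weight Averaging
Things I tried
mLSTM
CNN
Things I would like to try now
SGD (+ Momentum), CLR, SWA, Snapshot Ensemble
FFN + Attention
Thoughts on hackathon strategy
Baseline vs. more complex models, Exploitation vs Exploration
With a time limit, LSTM may be risky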
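The 15th BOAZ Big Data Conference - [Rediscovering Reviews team]: Identifying and filtering e-commerce review usefulness BOAZ Bigdata
The Rediscovering Reviews team carried out the following data analysis project.
Review? Re-View!
When we buy things, we get a lot of information from reviews❕ We will tell you how useful the reviews you read are
From judging review usefulness, to topic-based analysis of the factors that determine review usefulness, to extracting representative reviews through clustering
Let's Re-View reviews together
16th cohort, 정수연, Hanyang University, Department of Finance and Management
16th cohort, 문예진, Sogang University, School of Economics / Big Data Science
16th cohort, 이상민, Kyung Hee University, Department of Software Convergence
16th cohort, 황의린, Sookmyung Women's University, Department of Biological Systems / Department of Statistics
16th cohort, 정승연, Yonsei University Graduate School, Computational Linguistics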
Course Overview:
This course offers a comprehensive exploration of recommender systems, focusing on both theoretical foundations and practical applications. Through a combination of lectures, hands-on exercises, and real-world case studies, you will gain a deep understanding of the key principles, methodologies, and evaluation techniques that drive effective recommendation algorithms.
Course Objectives:
Acquire a solid understanding of recommender systems, including their significance and impact in various domains.
Explore different types of recommendation algorithms, such as collaborative filtering, content-based filtering, and hybrid approaches.
Study cutting-edge techniques, including deep learning, matrix factorization, and graph-based methods, for enhanced recommendation accuracy.
Gain hands-on experience with popular recommendation frameworks and libraries, and learn how to implement and evaluate recommendation models.
Investigate advanced topics in recommender systems, such as fairness, diversity, and explainability, and their ethical implications.
Analyze and discuss real-world case studies and research papers to gain insights into the challenges and future directions of recommender systems.
Course Structure:
Introduction to Recommender Systems
Collaborative Filtering Techniques
Content-Based Filtering and Hybrid Approaches
Matrix Factorization Methods
Deep Learning for Recommender Systems
Graph-Based Recommendation Approaches
Evaluation Metrics and Experimental Design
Ethical Considerations in Recommender Systems
Fairness, Diversity, and Explainability in Recommendations
Case Studies and Research Trends
Course Delivery:
The course will be delivered through a combination of lectures, interactive discussions, hands-on coding exercises, and group projects. You will have access to state-of-the-art resources, including relevant research papers, datasets, and software tools, to enhance your learning experience.
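Machine Learning Foundations (a case study approach) lecture notes SANG WON PARK
Built around cases that are widely used in real business, the course lets you practice with actual code in a Jupyter notebook: how to choose an algorithm for existing data, train it, and build a predictive model.
As emphasized at the start of the lectures, the machine learning algorithms themselves are explained in detail in separate later courses. This course focuses entirely on how they are used in practice, so the algorithms are covered only briefly. (A separate advanced course covers the details.)
See http://blog.naver.com/freepsw/221113685916
Code samples: https://github.com/freepsw/coursera/tree/master/ML_Foundations/A_Case_Study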
2. Table of Contents
1. Abstract
2. Introduction
3. User Preference Model
4. Location Based Social Matrix Factorization Model
5. Experimental Analysis
3. Abstract
• In location-based social networks, users can check in at a venue or leave tips.
• Research so far has focused only on users' check-ins, and tips have hardly been studied.
• Existing work mainly considers social influence; the paper argues that venue similarity can also be used to improve recommendation performance.
• Proposal
– A user-location preference model that combines sentiment-analyzed tips with check-in data
– Location recommendation through a matrix factorization algorithm that takes user social influence and venue similarity into account
4. User Preference Model
Tips data processing flow
• Input: Raw tips
• Output: Noun phrases with sentiment score
1. Language detection (English only)
2. Split the tip into sentences and tag each word with its part of speech
3. Look up each word in SentiWordNet to obtain its sentiment score
4. Noun phrase chunking (e.g. good + place = good place)
• The sentiment score of a tip is the sum of the sentiment scores of its phrases, normalized to [-1, 1]
• The implementation is based on NLTK and SentiWordNet 3.0
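The last two steps of the flow (scoring phrases and combining them into one tip score) can be sketched as follows. This is only an illustration under stated assumptions: `TOY_LEXICON` is a hypothetical stand-in for SentiWordNet 3.0 lookups, the noun phrases are assumed to be already chunked, and clipping is used as the normalization because the slide does not say how the sum is brought into [-1, 1].

```python
# Hypothetical toy lexicon standing in for SentiWordNet 3.0 word scores.
TOY_LEXICON = {"good": 0.6, "great": 0.8, "bad": -0.6, "slow": -0.3}


def phrase_score(phrase: str) -> float:
    """Sum the lexicon scores of the words in one noun phrase."""
    return sum(TOY_LEXICON.get(word, 0.0) for word in phrase.lower().split())


def tip_score(noun_phrases: list[str]) -> float:
    """Sum the phrase scores of a tip and bring the result into [-1, 1].

    Assumption: normalization is done by clipping; the slide only states
    that the sum is normalized to [-1, 1].
    """
    total = sum(phrase_score(p) for p in noun_phrases)
    return max(-1.0, min(1.0, total))


if __name__ == "__main__":
    print(tip_score(["good place", "great coffee"]))
    print(tip_score(["slow service"]))
```

A real pipeline would replace the toy lexicon with per-word lookups after language detection and POS tagging, as listed in steps 1 to 3.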
5. User Preference Model
Preference extraction
• Because check-in counts follow a power law distribution, they are mapped to preference values as in the check-in table below
• Sentiment scores are mapped with their distribution in mind, as in the sentiment table below

Number of check-ins    Check-in preference matrix element
1                      2
2                      3
3                      4
4+                     5

Sentiment score        Preference measure
[-1, -0.05]            1
(-0.05, -0.01]         2
(-0.01, 0.01)          3
[0.01, 0.05)           4
[0.05, 1]              5

Fusion
• A single check-in can hardly be said to give enough information about how the user feels, so the sentiment preference is used as well
• $P_{final} = P_c + \mathrm{sgn}(P_c - P_s) \cdot H(|P_c - P_s| - 2)$
• H(x): Heaviside step function (unit step function)
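The two mapping tables on this slide translate directly into small lookup functions. The sketch below is illustrative; the function names are hypothetical, and the interval boundaries follow the tables exactly (closed and open ends included).

```python
def checkin_preference(num_checkins: int) -> int:
    """Map a check-in count to a preference value: 1->2, 2->3, 3->4, 4+->5."""
    if num_checkins < 1:
        raise ValueError("a preference is only defined for at least one check-in")
    return min(num_checkins + 1, 5)


def sentiment_preference(score: float) -> int:
    """Map a tip sentiment score in [-1, 1] to a preference value from 1 to 5."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("sentiment score must lie in [-1, 1]")
    if score <= -0.05:   # [-1, -0.05]
        return 1
    if score <= -0.01:   # (-0.05, -0.01]
        return 2
    if score < 0.01:     # (-0.01, 0.01)
        return 3
    if score < 0.05:     # [0.01, 0.05)
        return 4
    return 5             # [0.05, 1]
```

Note that a single check-in already maps to 2 rather than 1, so the lowest preference value can only come from the sentiment side.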
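6. Location Based Social Matrix Factorization Model
Matrix Factorization
• Probabilistic matrix factorization (PMF)
• $R_{m \times n} \approx U_{m \times l} \times V_{n \times l}^{T}$
– The user-item rating matrix is factorized into (user-latent space matrix) × (item-latent space matrix).
• By Bayesian inference, $p(U, V \mid R, \sigma_R^2, \sigma_U^2, \sigma_V^2) \propto p(R \mid U, V, \sigma_R^2)\, p(U \mid \sigma_U^2)\, p(V \mid \sigma_V^2)$
– Maximizing the expression above yields U and V, from which the matrix R used for recommendation can be built.
• $p(R \mid U, V, \sigma_R^2) = \prod_{i=1}^{m} \prod_{j=1}^{n} \left[ \mathcal{N}(R_{i,j} \mid U_i \times V_j^{T}, \sigma_R^2) \right]^{I_{ij}}$
– $I_{ij}$: indicator function that restricts the product to the cases where user $i$ has rated item $j$
• $p(U \mid \sigma_U^2) = \prod_{i=1}^{m} \mathcal{N}(U_i \mid 0, \sigma_U^2 I)$
• $p(V \mid \sigma_V^2) = \prod_{j=1}^{n} \mathcal{N}(V_j \mid 0, \sigma_V^2 I)$
• $\mathcal{N}(x \mid \mu, \sigma^2)$ is the normal distribution with mean $\mu$ and variance $\sigma^2$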
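With the variances held fixed, maximizing the PMF posterior is equivalent to minimizing a sum of squared errors over the observed entries plus quadratic penalties on U and V. The sketch below fits that objective with plain gradient descent in NumPy. It is a minimal illustration, not the presented system: the function names, the toy matrix, the learning rate, and the penalty weights `lam_u` and `lam_v` (which play the role of the variance ratios) are all assumptions.

```python
import numpy as np


def map_objective(R, mask, U, V, lam_u=0.1, lam_v=0.1):
    """Negative log-posterior up to constants: squared error on observed
    entries plus quadratic penalties from the Gaussian priors on U and V."""
    err = mask * (R - U @ V.T)
    return 0.5 * np.sum(err ** 2) + 0.5 * lam_u * np.sum(U ** 2) + 0.5 * lam_v * np.sum(V ** 2)


def pmf_fit(R, mask, latent_dim=2, lam_u=0.1, lam_v=0.1, lr=0.01, epochs=500, seed=0):
    """Fit U (m x l) and V (n x l) so that U @ V.T approximates R where mask == 1."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = 0.1 * rng.standard_normal((m, latent_dim))
    V = 0.1 * rng.standard_normal((n, latent_dim))
    for _ in range(epochs):
        err = mask * (R - U @ V.T)           # residuals on observed entries only
        grad_U = -err @ V + lam_u * U
        grad_V = -err.T @ U + lam_v * V
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V


if __name__ == "__main__":
    # Toy preference matrix; 0 marks an unobserved user-venue pair.
    R = np.array([[5, 3, 0, 1],
                  [4, 0, 0, 1],
                  [1, 1, 0, 5],
                  [0, 1, 5, 4]], dtype=float)
    mask = (R > 0).astype(float)
    U, V = pmf_fit(R, mask)
    print(np.round(U @ V.T, 2))  # predictions, including the unobserved cells
```

The mask is the matrix form of the indicator $I_{ij}$: unobserved cells contribute nothing to the error term but still receive a prediction from $U V^{T}$.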
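8. Experimental Analysis
Dataset Description
• Four months of Foursquare check-in data (October 24, 2011 to February 20, 2012)
• Noisy and invalid check-ins are filtered out
– Only users with at least one check-in per week are kept (regarded as active users)
– Sudden moves (consecutive check-ins implying a speed above 1200 km/h) are excluded
– Venues whose category information is unavailable are excluded
• 762,315 users and 31,820,144 check-ins
• After filtering: 311,475 users and 21,920,144 check-ins
• New York and London only (because English is the main language there)
• Two users are regarded as friends if they follow each other on Twitter
• 9 parent categories; 400 sub-categories merged into 274 sub-categories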
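The sudden-move rule can be sketched as a speed check between two consecutive check-ins of the same user. This is a hedged illustration: the slide gives only the 1200 km/h threshold, so the use of great-circle (haversine) distance, the `(timestamp, lat, lon)` tuple layout, and the function names are assumptions.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 1200.0  # threshold stated on the slide


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_sudden_move(prev, curr):
    """Return True if moving from prev to curr implies more than 1200 km/h.

    Each argument is a (datetime, latitude, longitude) tuple (assumed layout).
    """
    hours = (curr[0] - prev[0]).total_seconds() / 3600.0
    dist = haversine_km(prev[1], prev[2], curr[1], curr[2])
    if hours <= 0:
        return dist > 0  # same or reversed timestamp at a different place
    return dist / hours > MAX_SPEED_KMH


if __name__ == "__main__":
    new_york = (datetime(2011, 10, 24, 12, 0), 40.7128, -74.0060)
    london = (datetime(2011, 10, 24, 13, 0), 51.5074, -0.1278)
    print(is_sudden_move(new_york, london))  # one hour apart across the Atlantic
```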
9. Experimental Analysis
Social and Inter-venue Influence Modeling
• Social influence
– Similarity is computed from the users' preference vectors (Pearson Correlation Coefficient)
• Inter-venue influence
– A 0/1 based venue similarity network is built from the venues' category information
– Two venues that share a sub-category have a similarity score of 1
– The density of the venue similarity network for New York restaurants is 0.0353
– For London it is 0.0339
Metrics
• Mean Absolute Error (MAE)
• Root Mean Square Error (RMSE)
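The quantities named on this slide can be sketched in a few NumPy functions. Several details are assumptions because the slide does not specify them: Pearson correlation is taken over full preference vectors, each venue is represented as a set of sub-category labels, and network density is the usual fraction of connected venue pairs.

```python
from itertools import combinations

import numpy as np


def pearson(a, b):
    """Pearson correlation coefficient between two preference vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    a_c, b_c = a - a.mean(), b - b.mean()
    denom = np.sqrt(np.sum(a_c ** 2) * np.sum(b_c ** 2))
    return float(np.sum(a_c * b_c) / denom) if denom > 0 else 0.0


def venue_similarity(categories):
    """0/1 similarity matrix: 1 when two venues share at least one sub-category."""
    n = len(categories)
    S = np.zeros((n, n), dtype=int)
    for i, j in combinations(range(n), 2):
        if categories[i] & categories[j]:
            S[i, j] = S[j, i] = 1
    return S


def density(S):
    """Fraction of venue pairs that are connected (assumed density definition)."""
    n = S.shape[0]
    return float(S.sum() / (n * (n - 1))) if n > 1 else 0.0


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return float(np.mean(np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))))


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return float(np.sqrt(np.mean((np.asarray(y_true, float) - np.asarray(y_pred, float)) ** 2)))


if __name__ == "__main__":
    venues = [{"pizza"}, {"pizza", "italian"}, {"sushi"}]  # hypothetical labels
    S = venue_similarity(venues)
    print(S, density(S))
    print(mae([1, 2, 3], [2, 2, 5]), rmse([1, 2, 3], [2, 2, 5]))
```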
10. Experimental Analysis
Hybrid Preference Model Evaluation
The following three models are compared
• Basic model (BM): uses only the check-in preference matrix
• Tip null model (TNM)
– The sentiment preference matrix is randomly shuffled and then fused with the check-in preference matrix
– This preserves the distribution of the preference model
• Hybrid preference model (HPM): uses the hybrid preference matrix
• Variances and learning rate are fixed
• Tested with 80% and 90% of the data used for training
• Latent space dimension is 10
• Results are averaged over 5 repetitions

Dataset               Training   Metric   BM       TNM      HPM
New York Restaurant   90%        RMSE     1.0137   0.8887   0.8524
New York Restaurant   90%        MAE      0.8072   0.7032   0.6204
New York Restaurant   80%        RMSE     1.0386   1.0506   0.9580
New York Restaurant   80%        MAE      0.8103   0.8306   0.7345
London                90%        RMSE     1.1045   0.9864   0.8929
London                90%        MAE      0.9031   0.7889   0.7022
London                80%        RMSE     1.1245   1.0895   1.0119
London                80%        MAE      0.9147   0.8828   0.8075
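The shuffling step of the tip null model can be sketched as a permutation of the observed sentiment preferences among the observed positions, which keeps the value distribution intact while breaking the link between a tip and its user-venue pair. This is one possible reading of "randomly shuffled"; the function name and the boolean `observed` mask are assumptions, and the fusion step is left out.

```python
import numpy as np


def shuffle_observed(sentiment_pref, observed, seed=0):
    """Permute the observed sentiment preferences among the observed cells.

    sentiment_pref: 2-D array of preference values (users x venues).
    observed: boolean array of the same shape, True where a tip exists.
    """
    rng = np.random.default_rng(seed)
    shuffled = sentiment_pref.copy()
    shuffled[observed] = rng.permutation(sentiment_pref[observed])
    return shuffled


if __name__ == "__main__":
    prefs = np.array([[1, 0, 5],
                      [0, 3, 0],
                      [4, 0, 2]], dtype=float)
    observed = prefs > 0
    print(shuffle_observed(prefs, observed))
```

Because only the positions change, any gain of HPM over TNM can be attributed to the tips carrying information about the specific user-venue pair rather than to the changed value distribution.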
11. Experimental Analysis
Location Recommendation Evaluation
LBSMF is compared with the following four models
• Collaborative filtering (CF)
• Probabilistic matrix factorization (PMF)
• SocialMF
– Takes social network influence into account
– Treats the impact of all friends equally
• Social Regularized MF (SRMF)
– Takes social network influence into account
– Also takes a similarity measure into account
• Latent space dimension is 5 and 10
• The remaining settings are the same as in the previous experiment
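The difference between the two social baselines, equal friend impact versus similarity-weighted impact, can be illustrated with simplified penalty terms on the user latent matrix. These are loose sketches in the spirit of the two models, not their exact published objectives; the function names, the `friends` dictionary layout, and the `sim` matrix are assumptions.

```python
import numpy as np


def equal_impact_penalty(U, friends):
    """SocialMF-style term (simplified): each user's latent vector is pulled
    toward the plain average of the friends' vectors, all friends weighted equally."""
    total = 0.0
    for i, fs in friends.items():
        if fs:
            total += float(np.sum((U[i] - U[fs].mean(axis=0)) ** 2))
    return total


def similarity_weighted_penalty(U, friends, sim):
    """SRMF-style term (simplified): each user-friend distance is weighted
    by a similarity score such as a Pearson correlation."""
    total = 0.0
    for i, fs in friends.items():
        for f in fs:
            total += float(sim[i, f]) * float(np.sum((U[i] - U[f]) ** 2))
    return total


if __name__ == "__main__":
    U = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
    friends = {0: [1, 2], 1: [0], 2: [0]}   # hypothetical friendship lists
    sim = np.ones((3, 3))
    sim[0, 1] = sim[1, 0] = 0.5             # users 0 and 1 are less similar
    print(equal_impact_penalty(U, friends))
    print(similarity_weighted_penalty(U, friends, sim))
```

Either term would be added, scaled by a weight, to a matrix factorization objective so that friends end up with nearby latent vectors.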
13. Comments
• Curious about the future work on finding venue semantic similarity from tips
• No cross-validation was done to choose the latent space dimension
• Do the five repeated trials mean five different test-training splits?
• Why was the Pearson Correlation Coefficient used?