Yisou is a team at Creditease that focuses on applying big data technology to risk management. This presentation describes our products and how we accomplish all of this.
This document presents a framework for context-aware service recommendation. It begins with background on web services and the challenge of recommendation given data sparsity. It then discusses using context information like user location and service provider to learn features. A probabilistic matrix factorization approach is introduced to model user-service interactions in a joint low-rank feature space. The framework incorporates learning user-specific and service-specific context-aware features which are combined in a unified model. An experiment evaluates the approach on a real-world dataset and compares to baseline methods.
Recommender system slides for undergraduate, by Yueshen Xu
Slides for undergraduates in an IR class, presented in Chinese.
They mainly cover the background, applications, real cases, ideas, and basic methods of recommender systems.
This document discusses various approaches to text clustering, including K-means clustering, Gaussian mixture models, and matrix factorization. It notes some of the limitations and assumptions of these approaches, such as the need to specify the number of clusters for K-means and the assumption of Gaussian distributions. The document also discusses other approaches like hierarchical clustering and methods that can handle sparse data like text. The goal is to provide an overview of clustering techniques for text without advanced mathematics.
This document provides an overview of hierarchical topic modeling. It begins with background on text summarization and topic modeling. Some key concepts in topic modeling like latent semantic analysis and probabilistic latent semantic indexing (PLSI) are introduced. Popular topic models like latent Dirichlet allocation (LDA) and hierarchical topic models using the Chinese restaurant process are described. Gibbs sampling is discussed as a method for parameter estimation in topic models. The document concludes with examples of hierarchical topic modeling and information on the author's related work.
This document provides an overview of hierarchical topic modeling. It begins with background on text summarization and topic modeling. Topic modeling aims to learn latent topics from a corpus using probabilistic models like PLSI and LDA. Hierarchical topic modeling uses non-parametric Bayesian models like the Chinese Restaurant Process to capture hierarchical structure in topics. The document explains the generative process of nested CRP models and provides examples of hierarchical topics. It also discusses parameter estimation methods and provides supplemental information on probabilistic graphical models and references for further reading.
Yueshen Xu is a fifth-year Ph.D. student in computer science at Zhejiang University in China. He has published papers in several international conferences and journals on topics related to recommender systems, text mining, and natural language processing. He was a visiting student at the University of Illinois at Chicago from 2014 to 2015 and has worked as an intern developing recommendation algorithms.
Learning to recommend with user generated content, by Yueshen Xu
This document discusses recommendation systems that incorporate user generated content (UGC) such as tags, reviews, questions/answers, blogs and tweets. It proposes two new matrix factorization-based recommendation models: 1) UTR-MF, which regularizes user latent factors based on the users' topics of interest learned from UGC, and 2) ITR-MF, which regularizes item latent factors based on the items' topic distributions learned from associated UGC. The models are evaluated on three real-world datasets and are shown to outperform baselines by utilizing UGC to better learn user preferences and item features. Future work could explore incorporating other UGC types like tweets and blogs.
This document discusses social recommender systems. It begins by noting the large size of major social media sites and how recommender systems can help with information overload on these sites. It then discusses how recommender systems are based on the principles of word-of-mouth recommendation and collaborative filtering. Several fundamental recommendation approaches are described, including collaborative filtering, content-based filtering, and hybrid methods. Matrix factorization techniques like singular value decomposition (SVD) are also covered. The document concludes by discussing trends in expanding recommender systems to incorporate more relationship data and higher-dimensional tensor models.
This is an introduction to topic modeling, covering tf-idf, LSA, pLSA, LDA, EM, and some other related material. I know there are certainly some mistakes, so please feel free to correct them. Thank you~
Acoustic modeling using deep belief networks, by Yueshen Xu
This document describes using deep belief networks (DBNs) for acoustic modeling in automatic speech recognition. The method pre-trains a multi-layer neural network as a generative model, one layer at a time, using restricted Boltzmann machines. The pre-trained network is then fine-tuned discriminatively using backpropagation to output phoneme probabilities. The approach achieves better phone recognition than Gaussian mixture models by learning multiple layers of features from data without strong distribution assumptions.
The document summarizes a data mining program held at Renmin University in Beijing from May 21-27, 2012. It discusses the various lecturers and topics covered during the program. Professors Yang, Han, and Pei each gave lectures on their areas of expertise, including classification and transfer learning, information network models, and mining uncertain data. The curriculum focused mainly on data mining and included both basic and advanced concepts. Participants were encouraged to actively engage and ask questions throughout the program.