Improving Collaborative Filtering’s Rating Prediction Coverage in Sparse Datasets by Exploiting User Dissimilarity

Dionisis Margaris and Costas Vassilakis
Proceedings of The Fourth IEEE International Conference on Big Data Intelligence and Computing (DataCom 2018)

Collaborative filtering systems analyze ratings databases to identify users with similar likings and preferences, termed near neighbors, and then generate rating predictions for a user by examining the ratings that the user's near neighbors have given to items the user has not yet rated; recommendations are then formulated from these rating predictions. However, these systems are known to exhibit the “gray sheep” problem, i.e. the situation where no near neighbors can be identified for a number of users, and hence no recommendation can be formulated for them. This problem is more intense in sparse datasets, i.e. datasets with a relatively small number of ratings compared to the number of users and items. In this work, we propose a method for alleviating this problem by exploiting user dissimilarity, under the assumption that if some users have exhibited opposing preferences in the past, they are likely to do so in the future. The proposed method has been evaluated against seven widely used datasets and has proven particularly effective in increasing the percentage of users for which personalized recommendations can be formulated in the context of sparse datasets, while at the same time maintaining or slightly improving rating prediction quality.
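
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which similarity metric or prediction formula the paper uses, so the sketch assumes Pearson correlation and the standard mean-centered weighted prediction. The function names `pearson` and `predict`, the `use_dissimilar` flag, and the toy ratings are all hypothetical. When `use_dissimilar` is off, only positively correlated users count as near neighbors; when it is on, negatively correlated users also contribute, with their deviation from their own mean inverted by the negative weight.

```python
from math import sqrt


def pearson(ratings_u, ratings_v, min_common=2):
    """Pearson correlation over co-rated items (assumed metric).

    Returns None when the users share too few items or when either
    user's deviations over the common items are all zero.
    """
    common = set(ratings_u) & set(ratings_v)
    if len(common) < min_common:
        return None
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    mean_v = sum(ratings_v.values()) / len(ratings_v)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_v = sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return None
    return num / (den_u * den_v)


def predict(ratings, user, item, use_dissimilar=True):
    """Mean-centered weighted prediction for (user, item).

    ratings maps user -> {item: rating}. Returns None when no usable
    neighbor has rated the item, i.e. the case the abstract describes
    as having no near neighbors.
    """
    ratings_u = ratings[user]
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    num = 0.0
    den = 0.0
    for other, ratings_v in ratings.items():
        if other == user or item not in ratings_v:
            continue
        sim = pearson(ratings_u, ratings_v)
        if sim is None or sim == 0:
            continue
        if sim < 0 and not use_dissimilar:
            continue  # classic setting: negatively correlated users are ignored
        mean_v = sum(ratings_v.values()) / len(ratings_v)
        # A negative sim flips the sign of the neighbor's deviation.
        num += sim * (ratings_v[item] - mean_v)
        den += abs(sim)
    if den == 0:
        return None
    return mean_u + num / den


# Toy data (hypothetical): "a" and "b" have opposing tastes.
ratings = {
    "a": {"i1": 5, "i2": 1, "i3": 4},
    "b": {"i1": 1, "i2": 5, "i3": 2, "i4": 1},
}
without_dissimilar = predict(ratings, "a", "i4", use_dissimilar=False)
with_dissimilar = predict(ratings, "a", "i4", use_dissimilar=True)
```

With this toy data, user "a" has no positively correlated neighbor, so `without_dissimilar` is `None`. Once the opposing user "b" is taken into account, a prediction becomes available, and because "b" rated the item below their own average, the prediction for "a" lands above a's average.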

Note: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

Research area: