An Algorithm for Density Enrichment of Sparse Collaborative Filtering Datasets Using Robust Predictions as Derived Ratings

TitleAn Algorithm for Density Enrichment of Sparse Collaborative Filtering Datasets Using Robust Predictions as Derived Ratings
Publication TypeJournal Article
Year of Publication2020
AuthorsMargaris D, Spiliotopoulos D, Karagiorgos G, Vassilakis C
JournalAlgorithms
Volume13
Pagination174
Date Published07/2020
Keywordscollaborative filtering, density enrichment, derived ratings, rating prediction accuracy, rating prediction coverage, Recommender Systems, robust predictions, Sparse Datasets
Abstract

Collaborative filtering algorithms formulate personalized recommendations for a user, first by analysing already entered ratings to identify other users with similar tastes to the user (termed as near neighbours), and then using the opinions of the near neighbours to predict which items the target user would like. However, in sparse datasets, too few near neighbours can be identified, resulting in low accuracy predictions and even a total inability to formulate personalized predictions. This paper addresses the sparsity problem by presenting an algorithm that uses robust predictions, that is predictions deemed as highly probable to be accurate, as derived ratings. Thus, the density of sparse datasets increases, and improved rating prediction coverage and accuracy are achieved. The proposed algorithm, termed as CFDR, is extensively evaluated using (1) seven widely-used collaborative filtering datasets, (2) the two most widely-used correlation metrics in collaborative filtering research, namely the Pearson correlation coefficient and the cosine similarity, and (3) the two most widely-used error metrics in collaborative filtering, namely the mean absolute error and the root mean square error. The evaluation results show that, by successfully increasing the density of the datasets, the capacity of collaborative filtering systems to formulate personalized and accurate recommendations is considerably improved.

DOI10.3390/a13070174
Full Text