Abstract:
In this paper, we introduce a pruning algorithm that removes aged user ratings from the rating database used by collaborative filtering algorithms, in order to (1) improve prediction quality and (2) reduce both the rating database size and the rating prediction generation time. The proposed algorithm needs no additional information about the items' characteristics (e.g. the categories they belong to or their attribute values) and can be used with any rating database that includes timestamps. Furthermore, we propose and validate a method for identifying the most prominent combination of pruning algorithm and pruning level for a given dataset, thus allowing the pruning algorithm and pruning level to be selected in an unsupervised fashion.
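
To make the idea of timestamp-based pruning concrete, the following is a minimal illustrative sketch, not the algorithm proposed in the paper. It assumes a single global cutoff and integer timestamps; the `Rating` record, the `prune_aged_ratings` function, and the sample data are all hypothetical names introduced only for illustration. It uses nothing but the timestamp of each rating, with no item categories or attribute values.

```python
from typing import List, NamedTuple


class Rating(NamedTuple):
    """One entry of a rating database (hypothetical layout)."""
    user: str
    item: str
    value: float
    timestamp: int  # assumed unit: seconds since the epoch


def prune_aged_ratings(ratings: List[Rating], cutoff: int) -> List[Rating]:
    """Keep only the ratings whose timestamp is at or after `cutoff`.

    Assumption: a single global cutoff defines which ratings count as aged.
    """
    return [r for r in ratings if r.timestamp >= cutoff]


# Toy data, invented for this example only.
ratings = [
    Rating("u1", "i1", 4.0, 100),
    Rating("u1", "i2", 2.0, 500),
    Rating("u2", "i1", 5.0, 900),
]

# Ratings older than the cutoff are dropped before any prediction is computed.
pruned = prune_aged_ratings(ratings, cutoff=500)
```

In a sketch like this, the pruning level would correspond to the choice of `cutoff`: a later cutoff discards more ratings and yields a smaller database.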