Enhancing User Rating Database Consistency Through Pruning

Title: Enhancing User Rating Database Consistency Through Pruning
Publication Type: Journal Article
Year of Publication: 2017
Authors: Margaris D, Vassilakis C
Journal: Transactions on Large-Scale Data- and Knowledge-Centered Systems XXXIV
Volume: LNCS, volume 10620
Pagination: 33–64
Keywords: Collaborative filtering, Evaluation, Internet of Things, performance, Recommendation Applications, Social Networks, Web services
Abstract: Recommender systems rely on information about users' past behavior to formulate recommendations about their future actions. However, as time goes by, people's interests and tastes may change: people listen to different singers or even different types of music, watch different types of movies, read different types of books and so on. Such changes introduce inconsistency into the database, since a portion of it no longer reflects the user's current preferences, which is its intended purpose. In this paper, we present a pruning technique that removes aged user behavior data from the ratings database, which are bound to correspond to invalidated preferences of the user. Through pruning, (1) inconsistencies are removed and data quality is improved, (2) better rating prediction generation times are achieved and (3) the ratings database size is reduced. We also propose an algorithm for determining the amount of pruning that should be performed, allowing the pruning algorithm to be tuned and operated in an unsupervised fashion. The proposed technique is evaluated and compared against seven aging algorithms, which reduce the importance of aged ratings, and a state-of-the-art pruning algorithm, using datasets with varying characteristics. It is also validated using two distinct rating prediction computation strategies, namely collaborative filtering and matrix factorization. The proposed technique needs no extra information about the items' characteristics (e.g. the categories they belong to or their attribute values), can be used in any ratings database that includes timestamps, and has been shown to be effective for users-items databases of any size and under two rating prediction computation strategies.
Note: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
URL: https://link.springer.com/article/10.1007%2Fs11761-017-0216-y
DOI: 10.1007/978-3-662-55947-5_3