Can Large Language Models Assess Serendipity in Recommender Systems?

Yu Tokutake; Kazushi Okamoto

doi:10.20965/jaciii.2024.p1263

Abstract

Serendipity-oriented recommender systems aim to counteract the overspecialization of user preferences. However, evaluating a user’s serendipitous response to a recommended item can be challenging owing to its emotional nature. In this study, we address this issue by leveraging the rich knowledge of large language models (LLMs) that can perform various tasks. First, it explores the alignment between the serendipitous evaluations made by LLMs and those made by humans. In this study, a binary classification task was assigned to the LLMs to predict whether a user would find the recommended item serendipitously. The predictive performances of three LLMs were measured on a benchmark dataset in which humans assigned the ground truth to serendipitous items. The experimental findings revealed that LLM-based assessment methods do not have a very high agreement rate with human assessments. However, they performed as well as or better than the baseline methods. Further validation results indicate that the number of user rating histories provided to LLM prompts should be carefully chosen to avoid both insufficient and excessive inputs and that interpreting the output of LLMs showing high classification performance is difficult.

Content from these authors

This article cannot obtain the latest cited-by information.

This article is licensed under a Creative Commons [Attribution-NoDerivatives 4.0 International] license (https://creativecommons.org/licenses/by-nd/4.0/).
The journal is fully Open Access under Creative Commons licenses and all articles are free to access at JACIII official website.
https://www.fujipress.jp/jaciii/jc-about/#https://creativecommons.org/licenses/by-nd

Favorites & Alerts

Corresponding author

Funder information

1.Fund name: Japan Society for the Promotion of Science

Register with J-STAGE for free!