09/07/2025 | Press release
We explore a novel use case for Large Language Models (LLMs) in recommendation: generating natural language user taste profiles from listening histories. Unlike traditional opaque embeddings, these profiles are interpretable and editable, giving users greater transparency and control over their personalization. However, it is unclear whether users actually recognize themselves in these profiles, and whether some users or items are systematically better represented than others. Understanding this is crucial for trust, usability, and fairness in LLM-based recommender systems.
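The sketch below illustrates the kind of profile generation described above: a user's listening history is formatted into a prompt and passed to an LLM, which returns a short natural-language profile the user can read and edit. The `call_llm` wrapper and the prompt wording are illustrative assumptions, not the prompts or models used in the study.

```python
# Minimal sketch of natural-language taste-profile generation.
# `call_llm` is a hypothetical wrapper around whichever LLM is used;
# the prompt wording here is illustrative, not the study's actual prompt.

def build_profile_prompt(listening_history: list[dict]) -> str:
    """Format a listening history into a prompt requesting a taste profile."""
    lines = [
        f"- {item['artist']} - {item['track']} (genre: {item.get('genre', 'unknown')})"
        for item in listening_history
    ]
    return (
        "Below is a user's recent listening history:\n"
        + "\n".join(lines)
        + "\n\nWrite a concise natural-language profile of this user's musical taste "
        "that the user could read, verify, and edit."
    )

def generate_taste_profile(listening_history: list[dict], call_llm) -> str:
    """Return an interpretable, editable profile instead of an opaque embedding."""
    return call_llm(build_profile_prompt(listening_history))
```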
To study this, we generate profiles with three different LLMs and evaluate them along two dimensions: self-identification, assessed through a user study with 64 participants, and recommendation performance, measured in a downstream task. We analyze how both are affected by user attributes (e.g., age, taste diversity, mainstreamness) and item features (e.g., genre, country of origin). Our results show that profile quality varies across users and items, and that self-identification and recommendation performance are only weakly correlated. These findings highlight both the promise and the limitations of scrutable, LLM-based profiling in personalized systems.
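As a concrete illustration of the reported weak relationship, the sketch below computes a rank correlation between per-user self-identification ratings and a per-user recommendation metric. The variable names and the choice of Spearman correlation are assumptions made for illustration, not details taken from the study's analysis.

```python
# Minimal sketch of correlating the two evaluation dimensions, assuming
# per-user self-identification ratings (e.g., Likert scores from the user
# study) and per-user recommendation performance (e.g., NDCG) are available.
from scipy.stats import spearmanr

def correlate_dimensions(self_identification: list[float],
                         rec_performance: list[float]) -> tuple[float, float]:
    """Return the Spearman correlation and p-value between the two dimensions.
    A coefficient near zero would reflect the weak correlation reported here."""
    rho, p_value = spearmanr(self_identification, rec_performance)
    return float(rho), float(p_value)

# Illustrative usage with dummy numbers (not data from the study):
# rho, p = correlate_dimensions([4, 3, 5, 2, 4], [0.31, 0.42, 0.28, 0.35, 0.30])
```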