Information-Consistent Language Model Recommendations through Group Relative Policy Optimization
arxiv.org·9h


Abstract: Large Language Models (LLMs) are increasingly deployed in business-critical domains such as finance, education, healthcare, and customer support, where users expect consistent and reliable recommendations. Yet LLMs often produce different outputs for prompts that differ only slightly in phrasing, even when the prompts are semantically equivalent. Such inconsistency undermines trust, complicates compliance, and disrupts user experience. While personalization is desirable in certain contexts, many enterprise scenarios (such as HR onboarding, customer support, or policy disclosure) require invariant information delivery regardless of phrasing or prior conversational history. Existing approaches, …
