Preference Learning
2024
LLM & RLHF - Paper Reading Notes
Machine Learning (ML)
Large Language Models (LLMs)
Direct Preference Optimization (DPO)
Preference Learning