Soft Best-of-n Sampling for Model Alignment

Best AI papers explained - A podcast by Enoch H. Kang

This paper introduces Soft Best-of-n (Soft BoN) sampling, a generalization of best-of-n (BoN) sampling for aligning large language model (LLM) outputs with human preferences. Standard BoN draws n responses from the base model and returns the one with the highest reward; Soft BoN instead selects among the n candidates with probability proportional to exp(reward/λ), where the temperature parameter λ provides a smooth trade-off between maximizing reward and staying close to the base LLM distribution. The authors provide theoretical guarantees, showing that Soft BoN converges to the optimal tilted distribution, the base distribution exponentially reweighted by reward, at a faster O(1/n) rate than standard BoN in both KL divergence and expected relative reward. They also analyze an additive reward model, showing that blockwise sampling (drawing entire sequences at once) is less sample-efficient than symbolwise sampling (selecting token by token), although symbolwise sampling can be more computationally expensive in practice. The research highlights the delicate balance between λ and n for good alignment and proposes implementing Soft BoN in real-world LLMs as future work.
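
As a rough illustration (not the paper's implementation), here is a minimal Python sketch of the Soft BoN selection rule; the candidate strings and Gaussian rewards below are hypothetical stand-ins for base-model samples and a learned reward model:

```python
import math
import random

def soft_best_of_n(candidates, rewards, lam):
    """Pick one of n candidates with probability proportional to exp(reward / lam).

    This approximates the tilted target pi_lam(y) ~ pi(y) * exp(r(y) / lam):
    lam -> 0 recovers standard best-of-n (argmax reward), while lam -> inf
    recovers plain sampling from the base model.
    """
    m = max(rewards)  # subtract the max before exponentiating, for stability
    weights = [math.exp((r - m) / lam) for r in rewards]
    return random.choices(candidates, weights=weights, k=1)[0]

# Toy demo with stand-in candidates and rewards (not real model outputs).
random.seed(0)
candidates = [f"response-{i}" for i in range(8)]
rewards = [random.gauss(0.0, 1.0) for _ in candidates]
best = candidates[rewards.index(max(rewards))]
for lam in (0.05, 1.0, 100.0):
    picks = [soft_best_of_n(candidates, rewards, lam) for _ in range(1000)]
    print(f"lam={lam}: top-reward candidate chosen "
          f"{picks.count(best) / len(picks):.0%} of the time")
```

Small λ makes the rule behave like hard BoN, nearly always picking the top-reward candidate, while large λ leaves the base distribution essentially untouched, which is the reward-versus-KL trade-off the paper quantifies.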