Best AI papers explained

A podcast by Enoch H. Kang

442 Episodes

MemReasoner: Generalizing Language Models on Reasoning-in-a-Haystack Tasks
Published: 3/27/2025
RAFT: In-Domain Retrieval-Augmented Fine-Tuning for Language Models
Published: 3/27/2025
Inductive Biases for Exchangeable Sequence Modeling
Published: 3/26/2025
InverseRLignment: LLM Alignment via Inverse Reinforcement Learning
Published: 3/26/2025
Prompt-OIRL: Offline Inverse RL for Query-Dependent Prompting
Published: 3/26/2025
Alignment from Demonstrations for Large Language Models
Published: 3/25/2025
Q♯: Distributional RL for Optimal LLM Post-Training
Published: 3/18/2025
Scaling Test-Time Compute Without Verification or RL is Suboptimal
Published: 3/14/2025
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning
Published: 3/14/2025
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Published: 3/14/2025
Revisiting Superficial Alignment Hypothesis
Published: 3/14/2025
Diagnostic Uncertainty: Teaching Language Models to Describe Open-Ended Uncertainty
Published: 3/14/2025
Language Model Personalization via Reward Factorization
Published: 3/14/2025
Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration
Published: 3/14/2025
How Well do LLMs Compress Their Own Chain-of-Thought? A Token Complexity Approach
Published: 3/14/2025
Can Large Language Models Extract Customer Needs as well as Professional Analysts?
Published: 3/13/2025
SpurLens: Finding Spurious Correlations in Multimodal LLMs
Published: 3/13/2025
Improving Test-Time Search with Backtracking Against In-Context Value Verifiers
Published: 3/13/2025
Adaptive Elicitation of Latent Information Using Natural Language
Published: 3/13/2025

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.