October 19, 2024

Note Of KTOTrainer (Kahneman-Tversky Optimization Trainer)

Clay
2024-10-192024-10-19
AI, Machine Learning

Last Updated on 2024-10-19 by Clay

I’ve been intermittently reading about a fine-tuning method called Kahneman-Tversky Optimization (KTO) from various sources like HuggingFace’s official documents and other online materials. It’s similar to DPO as a way to align models with human values, but KTO’s data preparation format is much more convenient, so I’m quickly applying it to my current tasks before making time to study the detailed content in the related papers.

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31