10 November, 2024

Self-Speculative Decoding Implementation: Notes on Implementing a Layer-Skipping Transformer Model

Clay
2024-11-10
AI, Machine Learning, PyTorch

Last Updated on 2024-11-10 by Clay

Introduction

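Self-Speculative Decoding is a variant of Speculative Decoding. The original Speculative Decoding uses a draft model to speed up inference for the target model, which is the model we actually want to run. The draft model produces outputs similar to the target model's while running several times faster, and it is usually distilled from the target model.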
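To make the draft/target relationship concrete, here is a minimal sketch of a greedy draft-then-verify loop in plain Python. It is an illustration under stated assumptions, not the implementation from this post: `speculative_decode`, `toy_target`, `toy_draft`, and the `gamma` parameter are hypothetical names, the "models" are toy functions rather than Transformers, and only greedy decoding is covered (no sampling-based acceptance). In the self-speculative variant, the draft role would presumably be played by the target model itself with some layers skipped, which this sketch does not model.

```python
from typing import Callable, List

# Hypothetical stand-in for a model: any function that maps a token
# sequence to the next token under greedy decoding.
NextToken = Callable[[List[int]], int]


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], max_new_tokens: int,
                       gamma: int = 4) -> List[int]:
    """Greedy draft-then-verify loop (toy sketch, no sampling)."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1) The cheap draft model proposes up to `gamma` tokens.
        proposal: List[int] = []
        for _ in range(min(gamma, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))

        # 2) The target model checks each proposed position. A real
        #    implementation would score all positions in one batched
        #    forward pass; this loop only mimics the accept/reject logic.
        accepted: List[int] = []
        for guess in proposal:
            expected = target(tokens + accepted)
            accepted.append(expected)  # always keep the target's token
            if guess != expected:
                break  # first mismatch: discard the remaining proposals
        tokens.extend(accepted)
    return tokens


def toy_target(seq: List[int]) -> int:
    """Assumed toy target: counts upward modulo 10."""
    return (seq[-1] + 1) % 10


def toy_draft(seq: List[int]) -> int:
    """Assumed toy draft: agrees with the target except after token 3."""
    return 0 if seq[-1] == 3 else (seq[-1] + 1) % 10


if __name__ == "__main__":
    print(speculative_decode(toy_target, toy_draft, [0], max_new_tokens=8))
```

Because every kept token comes from the target model's own greedy choice, the output of this sketch matches what the target would produce on its own. Any speedup in a real system comes from the draft model guessing correctly often enough that the target can verify several tokens per forward pass.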
