Skip to content

November 15, 2024

Optimizing LayerSkip Models with Bayesian Search for an Effective Layer Skipping Strategy

Last Updated on 2024-11-15 by Clay

In self-speculative decoding, since our draft model is derived from part of the target model’s network, finding an optimal 'Layer Skip Strategy' is crucial. We need to skip enough layers to achieve meaningful speedup while ensuring the draft model’s speculative decoding is good enough to avoid frequent rejection by the target model.

Today’s implementation focuses on optimizing my previously implemented LayerSkip model using the Bayesian optimization framework Optuna, to determine which layers to skip.

Read More »Optimizing LayerSkip Models with Bayesian Search for an Effective Layer Skipping Strategy