Note on Calculating VRAM Consumption for Training and Inference of AI Models
I've always relied on rough formulas to estimate how a model's scale translates into GPU VRAM consumption. There are simply too many variables for an exact answer: model architecture, number of layers, attention-mechanism implementation, sequence length, batch size, and the numeric precision used in training or inference all affect the final figure.
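As a rough illustration of the kind of back-of-envelope formula I mean, here's a minimal Python sketch. The function name `estimate_vram_gb` and the constants are my own assumptions, not a definitive recipe: FP16/BF16 weights at 2 bytes per parameter, mixed-precision Adam for training (gradients plus FP32 master weights and moments), and activation memory ignored entirely, since it depends on exactly the variables listed above.

```python
def estimate_vram_gb(num_params: float, training: bool = True) -> float:
    """Rough lower-bound VRAM estimate in GiB for a model with num_params parameters.

    Assumes FP16/BF16 weights and, for training, mixed-precision Adam.
    Activations, KV cache, and framework overhead are deliberately ignored.
    """
    bytes_per_param = 2  # FP16/BF16 weights
    total_bytes = num_params * bytes_per_param
    if training:
        # Gradients (2 bytes) + FP32 master weights (4) + two FP32 Adam
        # moments (8), bringing the total to the classic ~16 bytes/param
        # rule of thumb for mixed-precision Adam training.
        total_bytes += num_params * (2 + 4 + 8)
    return total_bytes / 1024**3


if __name__ == "__main__":
    # Example: a hypothetical 7B-parameter model.
    print(f"inference: ~{estimate_vram_gb(7e9, training=False):.0f} GiB")
    print(f"training:  ~{estimate_vram_gb(7e9, training=True):.0f} GiB")
```

Treat the output as a floor, not a budget: real consumption grows with sequence length and batch size through activations and the KV cache, which this sketch leaves out on purpose.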