[PyTorch] Release GPU / CPU Memory After Delete Model

Problem

Yesterday, I was developing a model merging program. I did not have enough GPU memory to merge the models in one pass, so I tried to merge them layer by layer. In the process, I found that GPU memory is easy to release, but CPU memory is not.

I searched the Internet and found only one useful solution. I also discovered another method that suits my case, so I am recording both of them below.


Solutions

First, let's look at releasing GPU memory.

model.to("cuda:0")
del model
torch.cuda.empty_cache()
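
To confirm the release actually happened, you can watch PyTorch's allocation counter before and after. Here is a minimal sketch, assuming a small model like gpt2 on a single GPU:

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2").to("cuda:0")
print(torch.cuda.memory_allocated())  # nonzero: the weights live on the GPU

del model                     # the allocation counter drops here
torch.cuda.empty_cache()      # the cached blocks go back to the driver, so nvidia-smi reflects it too
print(torch.cuda.memory_allocated())  # 0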


Only a few lines are needed, but this approach does not work for CPU memory.

For the CPU case, many people recommend the following:

import gc

del model
gc.collect()  # explicitly run the garbage collector to reclaim the freed objects
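
To see whether the memory actually comes back, you can watch the process's resident set size. Here is a minimal sketch, assuming psutil is installed:

import gc
import os

import psutil
from transformers import AutoModelForCausalLM

def rss_mb() -> float:
    # Resident set size of the current process, in MB
    return psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2

model = AutoModelForCausalLM.from_pretrained("gpt2")
print(f"Before delete: {rss_mb():.1f} MB")

del model
gc.collect()
print(f"After delete: {rss_mb():.1f} MB")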


In my testing over the past day, this method only worked once. Since I need to submit code that is reliable, I can't gamble on luck.

Some people shared that wrapping the model in another object and then deleting that wrapper is effective. I tried the lambda wrapper they recommended, but it didn't work for me. Then, by chance, I saw the practice of wrapping the model in a list and deleting the list, and that worked!

import gc
from transformers import AutoModelForCausalLM

# Init: keep the model in a list instead of a bare variable
models = []
models.append(AutoModelForCausalLM.from_pretrained("gpt2"))

# Operation: use the model through the list as usual
models[0].generate(...)

# Delete: dropping the list releases the model it holds
del models
gc.collect()


I have tried this method many times, and it has released the memory correctly every time.
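
Applied back to my original merging scenario, the pattern looks roughly like this. This is only a hypothetical sketch: the checkpoint names are placeholders, and the simple weight accumulation stands in for the real merging logic.

import gc

from transformers import AutoModelForCausalLM

# Placeholder checkpoints; in practice these would be different fine-tunes
# of the same base architecture.
checkpoints = ["gpt2", "gpt2"]
merged_state = None

for name in checkpoints:
    # Wrap each model in a list so that deleting the list releases it
    models = [AutoModelForCausalLM.from_pretrained(name)]
    state = models[0].state_dict()

    if merged_state is None:
        merged_state = {k: v.clone() for k, v in state.items()}
    else:
        for k, v in state.items():
            if merged_state[k].is_floating_point():
                merged_state[k] += v

    # Release the current model before loading the next one
    del models, state
    gc.collect()

# Average the accumulated floating-point weights
for k in merged_state:
    if merged_state[k].is_floating_point():
        merged_state[k] /= len(checkpoints)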

Later, when I was optimizing my code, I accidentally discovered another way to clear the memory, which is to move the model onto the meta device:

model.to("meta")


This effectively leaves an empty model architecture without any weights. The meta device is generally used for special purposes, such as inspecting a model's structure without allocating memory, but in this case it helped a lot.
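
A minimal sketch of what this does, using a plain nn.Linear in place of a full model:

import torch
from torch import nn

model = nn.Linear(1024, 1024)
model.to("meta")  # the real weights are dropped; only shapes and dtypes remain

param = next(model.parameters())
print(param.is_meta)  # True
print(param.device)   # meta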

