
April 2022

[Solved] NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Last Updated on 2022-04-30 by Clay

Problem Description

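Recently, I reconfigured the Nvidia GPU driver for my work environment. After installing it and rebooting, however, I could not get any GPU information from the nvidia-smi command. The only message I got was: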


[Solved] RuntimeError: CUDA error: device kernel image is invalid – CUDA kernel errors might be asynchronously reported at some other API call…

Last Updated on 2022-07-27 by Clay

Problem Description

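One of my recent tasks was to refactor an old project into a new training pipeline using PyTorch Lightning, while making sure the scores did not change much. After refactoring one binary classification project, a trial run produced the following error: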
