This is a very rare Tesla GPU with two GP100 chips.
0x10de:0x15fa:3:0:0:NVIDIA Corporation:
0x10de:0x15fa:4:0:0:NVIDIA Corporation:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.64       Driver Version: 430.64       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  PH402 SKU 200       On   | 00000000:03:00.0 Off |                    0 |
| N/A   44C    P0    36W / 140W |      0MiB / 32630MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  PH402 SKU 200       On   | 00000000:04:00.0 Off |                    0 |
| N/A   40C    P0    36W / 140W |      0MiB / 32630MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
please add nvidia "PH402 SKU 200"
Re: please add nvidia "PH402 SKU 200"
Your drivers are too old: current GPU cores require CUDA 11.2 (and a CPU with SSE4.2).
Added the following GPUs that were missing:
0x15fa / GP100GL [DGX Station / PH402 SKU 200]
0x15fb / GP100GL [GP100 SKU 200]
0x15fc / GP100GL [Tesla P100-DGXS-16GB]
0x15ff / GP100GL [GP100 SKU 15ff]
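The CUDA 11.2 requirement mentioned above can be checked against the `CUDA Version:` field in the nvidia-smi banner. The following is only a minimal sketch, assuming the banner has the same layout as the output pasted in this thread; the function names `cuda_version_from_banner` and `version_at_least` are made up for illustration and are not part of any NVIDIA or Folding@home tool.

```shell
#!/bin/sh
# Hypothetical helper: read nvidia-smi banner text on stdin and print
# the "CUDA Version" value it reports (e.g. 10.1).
cuda_version_from_banner() {
    sed -n 's/.*CUDA Version: \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed if version $1 (major.minor) is >= version $2.
version_at_least() {
    have_major=${1%%.*}; have_minor=${1#*.}
    need_major=${2%%.*}; need_minor=${2#*.}
    [ "$have_major" -gt "$need_major" ] ||
        { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

# Usage on a machine with the driver installed:
#   v=$(nvidia-smi | cuda_version_from_banner)
#   if version_at_least "$v" 11.2; then echo "driver OK"; else echo "driver too old"; fi
```

With the 430.64 banner from the first post this reports 10.1, which is below 11.2.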
Re: please add nvidia "PH402 SKU 200"
Great! Thanks for the information.
I'm trying to find a newer driver that works with this GPU.
Some of the most recent driver versions cause OS crashes on this GPU.
Re: please add nvidia "PH402 SKU 200"
After some debugging, it turns out that the crash was caused by a bad NVLink between the two chips.
Now I'm able to get it to work with the latest driver after disabling NVLink while loading the driver.
modprobe nvidia NVreg_RegistryDwords="RMNvLinkControl=1"
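The modprobe call above only lasts until the module is reloaded. One way to keep the same option across reboots is a modprobe.d entry; this is a rough sketch, assuming the module is loaded as `nvidia` and that the distribution reads `/etc/modprobe.d`. The file name `nvidia-nvlink.conf` and the helper function names are arbitrary choices for illustration, not something from this thread.

```shell
#!/bin/sh
# Sketch: persist the NVLink workaround from this thread across reboots.
# Assumption: the file name below is arbitrary; any *.conf name works.
CONF="${CONF:-/etc/modprobe.d/nvidia-nvlink.conf}"

# Print the modprobe.d line equivalent to the manual modprobe call.
nvlink_option_line() {
    printf '%s\n' 'options nvidia NVreg_RegistryDwords="RMNvLinkControl=1"'
}

# Write the option line to the given file (root is needed for /etc/modprobe.d).
write_nvlink_conf() {
    nvlink_option_line > "$1"
}

# Usage (as root):
#   write_nvlink_conf "$CONF"
```

On systems that load the nvidia module from the initramfs, the initramfs may also need to be regenerated before the option takes effect.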
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA PH402 SKU 200           On  |   00000000:03:00.0 Off |                    0 |
| N/A   47C    P0              89W / 140W |      313MiB / 32768MiB |     98%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA PH402 SKU 200           On  |   00000000:04:00.0 Off |                    0 |
| N/A   41C    P0              88W / 140W |      289MiB / 32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      4104      C   ...e/0x23-8.0.3/Core_23.fah/FahCore_23        310MiB |
|    1   N/A  N/A      4112      C   ...e/0x23-8.0.3/Core_23.fah/FahCore_23        286MiB |
+-----------------------------------------------------------------------------------------+