Monday, December 6, 2010

CUDA on Thinkpad T410 with NVIDIA NVS 3100M

I decided to try CUDA on my Thinkpad with Windows 7. The first step was the Windows Getting Started Guide from the NVIDIA Developer Zone (link). Installing the CUDA Toolkit and the GPU Computing SDK went smoothly; the problem appeared later. After running bandwidthTest from the SDK samples I got:

[bandwidthTest]
bandwidthTest.exe Starting...

Running on...

Quick Mode

d:/bld_sdk10_x64.pl/rel/gpgpu/toolkit/r3.2/sdk/SDK10/Compute/C/src/bandwidthTest/bandwidthTest.cu(598) : cudaSafeCall() Runtime API error : CUDA driver version is insufficient for CUDA runtime version.


It seems that the Lenovo drivers ship an old version of the CUDA driver (2.2; you can check this in NVIDIA Control Panel -> System Information -> Components -> NVCUDA.DLL), which is older than the 3.2 runtime from the toolkit. So I tried installing the Developer Driver from the NVIDIA Developer Zone, and here comes the next problem: this particular card isn't supported by that driver! The installer says: "This graphic driver could not find compatible graphic hardware."
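To make the error message concrete, here is a minimal sketch of the comparison behind it. It assumes the usual CUDA encoding of versions as 1000 * major + 10 * minor (the form returned by cudaDriverGetVersion and cudaRuntimeGetVersion); the helper names are my own and not part of any NVIDIA API.

```python
# Sketch: decode CUDA version integers and compare driver vs. runtime.
# Assumption: versions are encoded as 1000 * major + 10 * minor,
# e.g. 3020 for CUDA 3.2. The function names here are hypothetical.


def decode_cuda_version(encoded):
    """Split an encoded version such as 3020 into (major, minor)."""
    major = encoded // 1000
    minor = (encoded % 1000) // 10
    return major, minor


def driver_is_sufficient(driver_encoded, runtime_encoded):
    """The runtime needs a driver that is at least as new as itself."""
    return decode_cuda_version(driver_encoded) >= decode_cuda_version(runtime_encoded)


if __name__ == "__main__":
    lenovo_driver = 2020    # CUDA 2.2, as reported for NVCUDA.DLL
    toolkit_runtime = 3020  # CUDA 3.2, the toolkit used in this post
    print(decode_cuda_version(lenovo_driver),
          driver_is_sufficient(lenovo_driver, toolkit_runtime))
```

With a 2.2 driver and a 3.2 runtime the check fails, which matches the "driver version is insufficient" error above.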

The solution lies in the NVLT.inf file (you can find it in the Display.Driver directory after unpacking the driver):
Find the lines starting with %NVIDIA_DEV.0A6C.01%, paste a copy of each right below the original, and change the last part of the copy (after the final underscore) to 214217AA. In my case I copied two lines:
...
%NVIDIA_DEV.0A6C.01% = Section049, PCI\VEN_10DE&DEV_0A6C&SUBSYS_21C017AA
%NVIDIA_DEV.0A6C.01% = Section049, PCI\VEN_10DE&DEV_0A6C&SUBSYS_214217AA
...
%NVIDIA_DEV.0A6C.01% = Section050, PCI\VEN_10DE&DEV_0A6C&SUBSYS_21C017AA
%NVIDIA_DEV.0A6C.01% = Section050, PCI\VEN_10DE&DEV_0A6C&SUBSYS_214217AA
...

(PCI\VEN_10DE&DEV_0A6C&SUBSYS_214217AA is the hardware ID of the NVS 3100M; you can find it in Device Manager.)
Save the file and run setup.exe - it works!
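The manual edit can also be scripted. Below is a rough sketch that works on the INF text only. The function name is hypothetical, and it assumes one new entry per install section (Section049, Section050, ...) is enough, which matches the two lines I copied by hand. Reading and writing the actual file is left out on purpose: check the file's encoding first and keep a backup.

```python
import re

# Matches the subsystem part of a hardware ID, e.g. SUBSYS_21C017AA.
SUBSYS_RE = re.compile(r"SUBSYS_[0-9A-Fa-f]{8}")


def add_subsystem_entries(inf_text, device_tag, new_subsys):
    """Copy the first device_tag line of each section with new_subsys swapped in.

    Returns the text unchanged if new_subsys is already present, so running
    the function twice does not add duplicate entries.
    """
    if "SUBSYS_" + new_subsys in inf_text:
        return inf_text

    out = []
    patched_sections = set()
    for line in inf_text.splitlines():
        out.append(line)
        stripped = line.strip()
        if not stripped.startswith(device_tag) or "=" not in stripped:
            continue
        # "%TAG% = Section049, PCI\VEN_..." -> "Section049"
        section = stripped.split("=", 1)[1].split(",", 1)[0].strip()
        if section in patched_sections:
            continue
        new_line, count = SUBSYS_RE.subn("SUBSYS_" + new_subsys, line)
        if count == 1:
            out.append(new_line)  # place the copy right below the original
            patched_sections.add(section)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "\n".join([
        r"%NVIDIA_DEV.0A6C.01% = Section049, PCI\VEN_10DE&DEV_0A6C&SUBSYS_21C017AA",
        r"%NVIDIA_DEV.0A6C.01% = Section050, PCI\VEN_10DE&DEV_0A6C&SUBSYS_21C017AA",
    ])
    print(add_subsystem_entries(sample, "%NVIDIA_DEV.0A6C.01%", "214217AA"))
```

For a different card, the device tag and subsystem ID would have to be taken from its own hardware ID in Device Manager.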

After these steps I could run some SDK samples. bandwidthTest:

[bandwidthTest]
bandwidthTest.exe Starting...

Running on...

Device 0: NVS 3100M
Quick Mode

Host to Device Bandwidth, 1 Device(s), Paged memory
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 1547.4

Device to Host Bandwidth, 1 Device(s), Paged memory
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 1641.0

Device to Device Bandwidth, 1 Device(s)
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 9482.9


[bandwidthTest] - Test results:
PASSED


Press <Enter> to Quit...
-----------------------------------------------------------


and deviceQuery:

deviceQuery.exe Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

There is 1 device supporting CUDA

Device 0: "NVS 3100M"
CUDA Driver Version: 3.20
CUDA Runtime Version: 3.20
CUDA Capability Major/Minor version number: 1.2
Total amount of global memory: 229179392 bytes
Multiprocessors x Cores/MP = Cores: 2 (MP) x 8 (Cores/MP) = 16 (Cores)
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Clock rate: 1.47 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)
Concurrent kernel execution: No
Device has ECC support enabled: No
Device is using TCC driver mode: No

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.20, CUDA Runtime Version = 3.20, NumDevs = 1, Device = NVS 3100M


PASSED

Press <Enter> to Quit...
-----------------------------------------------------------
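The numbers in the two listings above can be cross-checked with a bit of arithmetic. This is only a sketch: the helper names are mine, and it assumes that the "MB" in bandwidthTest means 2^20 bytes, which I have not verified against the sample's source here.

```python
# Sketch: sanity-check figures from the deviceQuery and bandwidthTest output.
# Assumption: bandwidthTest reports MB/s with 1 MB = 2**20 bytes.


def total_cores(multiprocessors, cores_per_mp):
    """Core count as deviceQuery derives it: MPs times cores per MP."""
    return multiprocessors * cores_per_mp


def bytes_to_mib(n_bytes):
    """Convert a byte count to mebibytes (2**20 bytes)."""
    return n_bytes / (1 << 20)


def transfer_time_ms(size_bytes, bandwidth_mb_s, bytes_per_mb=1 << 20):
    """Time in milliseconds to move size_bytes at the given bandwidth."""
    return 1000.0 * size_bytes / (bandwidth_mb_s * bytes_per_mb)


if __name__ == "__main__":
    print("cores:", total_cores(2, 8))
    print("global memory (MiB):", bytes_to_mib(229179392))
    print("host-to-device copy (ms):", round(transfer_time_ms(33554432, 1547.4), 2))
```

Under that assumption, the 32 MiB host-to-device copy takes roughly 21 ms, and the card exposes about 218.6 MiB of global memory.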


The card isn't very impressive (16 cores), but I think it's enough to try learning CUDA ;).
