After installing fresh CUDA 4.0 drivers and the SDK, many SDK tests fail (e.g. fastWalshTransform, matrixMul, reduction). This is the output of ./deviceQuery:
Device 0: "GeForce GTX 570"
  CUDA Driver Version / Runtime Version          4.0 / 4.0
  CUDA Capability Major/Minor version number:    2.0
  Total amount of global memory:                 1279 MBytes (1341325312 bytes)
  (15) Multiprocessors x (32) CUDA Cores/MP:     480 CUDA Cores
  GPU Clock Speed:                               1.57 GHz
  Memory Clock rate:                             2100.00 Mhz
  Memory Bus Width:                              320-bit
  L2 Cache Size:                                 655360 bytes
  Max Texture Dimension Size (x,y,z)             1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
  Max Layered Texture Size (dim) x layers        1D=(16384) x 2048, 2D=(16384,16384) x 2048
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and execution:                 Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support enabled:                No
  Device is using TCC driver mode:               No
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           4 / 0

For example, the output of reduction is:
GPU result = 2135772699
CPU result = 2139353471
=> FAILED
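To get a feel for how far apart the two reduction results are, it can help to look at their difference and at the number of bit positions in which they disagree. The following is only a minimal C++ sketch; result_delta and differing_bits are hypothetical helper names and are not part of the CUDA SDK.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical helper (not from the SDK): signed difference between
// the CPU reference value and the value reported by the GPU.
std::int64_t result_delta(std::int64_t cpu, std::int64_t gpu) {
    return cpu - gpu;
}

// Hypothetical helper (not from the SDK): number of bit positions
// in which two 32-bit values differ.
int differing_bits(std::uint32_t a, std::uint32_t b) {
    return static_cast<int>(std::bitset<32>(a ^ b).count());
}
```

For the values above, result_delta(2139353471, 2135772699) gives 3580772. On its own this does not identify the cause; it only quantifies the mismatch.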
Best answer
It was (and still is) a hardware problem; driver updates don't solve it. It may be some kind of memory issue, and it seems to be quite common: we have several NVIDIA cards showing it (even a Tesla!). The only remedies we have found so far are to restart the machine or to increase the voltage a little.
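Since a restart makes the problem go away for a while, one way to watch for it coming back is to rerun the same deterministic computation several times and count how often the result disagrees with a known reference. The following is a minimal host-side C++ sketch under that assumption; count_mismatches is a hypothetical helper, and the compute callable stands in for whatever launches the GPU test and returns its result.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical helper: call `compute` `runs` times and count how many
// times its result differs from `reference`. For a deterministic
// computation, any nonzero count means some run produced a wrong result.
std::size_t count_mismatches(const std::function<long long()>& compute,
                             long long reference,
                             std::size_t runs) {
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < runs; ++i) {
        if (compute() != reference) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

A count that changes from one session to the next, with the same code and the same input, would be consistent with a hardware fault rather than a software bug, although it does not prove it.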