Below are the CUDA-Z results.
CUDA-Z Report
Version: 0.9.231 http://cuda-z.sf.net/
OS Version: Windows AMD64 6.1.7601 Service Pack 1
Driver Version: 347.25
Driver Dll Version: 7.0 (8.17.13.4725)
Runtime Dll Version: 6.0
Core Information
Name | GeForce GTX 960 |
---|---|
Compute Capability | 5.2 |
Clock Rate | 1228 MHz |
PCI Location | 0:1:0 |
Multiprocessors | 8 |
Threads Per Multiproc. | 2048 |
Warp Size | 32 |
Regs Per Block | 65536 |
Threads Per Block | 1024 |
Threads Dimensions | 1024 x 1024 x 64 |
Grid Dimensions | 2147483647 x 65535 x 65535 |
Watchdog Enabled | Yes |
Integrated GPU | No |
Concurrent Kernels | Yes |
Compute Mode | Default |
Stream Priorities | No |
Memory Information
Total Global | 2048 MiB |
---|---|
Bus Width | 128 bits |
Clock Rate | 3505 MHz |
Error Correction | No |
L2 Cache Size | 48 KiB |
Shared Per Block | 48 KiB |
Pitch | 2048 MiB |
Total Constant | 64 KiB |
Texture Alignment | 512 B |
Texture 1D Size | 65536 |
Texture 2D Size | 65536 x 65536 |
Texture 3D Size | 4096 x 4096 x 4096 |
GPU Overlap | Yes |
Map Host Memory | Yes |
Unified Addressing | No |
Async Engine | Yes, Bidirectional |
Performance Information
Memory Copy | |
---|---|
Host Pinned to Device | 2508.83 MiB/s |
Host Pageable to Device | 1433.22 MiB/s |
Device to Host Pinned | 1552.14 MiB/s |
Device to Host Pageable | 1034.73 MiB/s |
Device to Device | 38.3993 GiB/s |
GPU Core Performance | |
Single-precision Float | 2712.13 Gflop/s |
Double-precision Float | 86.5304 Gflop/s |
32-bit Integer | 821.532 Giop/s |
24-bit Integer | 602.154 Giop/s |
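Perhaps because the base PC that the GTX 960 is installed in does not support PCIe 3.0, the result was disappointing: LibSVM, which was my actual goal, ran faster on my laptop's CPU.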
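
As a rough sanity check on the PCIe suspicion, the measured "Host Pinned to Device" rate from the report can be compared with theoretical per-direction PCIe peaks. The sketch below is only an illustration: the per-generation transfer rates and encoding overheads are the standard published figures, the helper names (`theoretical_mb_per_s`, `mib_to_mb`) are made up for this example, and the actual link generation and lane width of the PC are not known from the report, so several combinations are listed rather than one.

```python
# Compare the measured CUDA-Z copy rate with theoretical PCIe peaks.
# Assumption: the real link generation/width of this PC is unknown,
# so a few common combinations are printed for comparison.

# (transfer rate in GT/s per lane, line-encoding efficiency)
PCIE_GENERATIONS = {
    "1.x": (2.5, 8 / 10),      # 8b/10b encoding
    "2.0": (5.0, 8 / 10),      # 8b/10b encoding
    "3.0": (8.0, 128 / 130),   # 128b/130b encoding
}

# "Host Pinned to Device" value taken from the report above
MEASURED_H2D_PINNED_MIB_S = 2508.83


def theoretical_mb_per_s(generation: str, lanes: int) -> float:
    """Theoretical one-way bandwidth in MB/s (10^6 bytes per second)."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    # GT/s * efficiency = usable Gbit/s per lane; / 8 = GB/s; * 1000 = MB/s
    return gt_per_s * efficiency / 8 * 1000 * lanes


def mib_to_mb(mib: float) -> float:
    """Convert MiB (2^20 bytes) to MB (10^6 bytes)."""
    return mib * 1024 * 1024 / 1_000_000


if __name__ == "__main__":
    measured = mib_to_mb(MEASURED_H2D_PINNED_MIB_S)
    print(f"measured host-to-device (pinned): {measured:.0f} MB/s")
    for gen in PCIE_GENERATIONS:
        for lanes in (8, 16):
            peak = theoretical_mb_per_s(gen, lanes)
            print(f"PCIe {gen} x{lanes}: peak {peak:.0f} MB/s, "
                  f"measured is {measured / peak:.0%} of peak")
```

The measured rate is well below even the PCIe 2.0 x16 peak, which is consistent with a slow link but does not by itself prove that the bus is why LibSVM ran slower on the GPU.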