|
Specifications for Compute Capability 1.0
The maximum number of threads per block is 512;
The maximum sizes of the x-, y-, and z-dimension of a thread block are 512, 512, and 64, respectively;
The maximum size of each dimension of a grid of thread blocks is 65535;
The warp size is 32 threads;
The number of registers per multiprocessor is 8192;
The amount of shared memory available per multiprocessor is 16 KB organized into 16 banks (see Section 5.1.2.5);
The total amount of constant memory is 64 KB;
The cache working set for constant memory is 8 KB per multiprocessor;
The cache working set for texture memory varies between 6 and 8 KB per multiprocessor;
The maximum number of active blocks per multiprocessor is 8;
The maximum number of active warps per multiprocessor is 24;
The maximum number of active threads per multiprocessor is 768;
For a texture reference bound to a one-dimensional CUDA array, the maximum width is 213;
For a texture reference bound to a two-dimensional CUDA array, the maximum width is 216 and the maximum height is 215;
For a texture reference bound to a three-dimensional CUDA array, the maximum width is 211, the maximum height is 211, and the maximum depth is 211;
For a texture reference bound to linear memory, the maximum width is 227;
The limit on kernel size is 2 million PTX instructions;
Each multiprocessor is composed of eight processors, so that a multiprocessor is able to process the 32 threads of a warp in four clock cycles. |
|