|
Atomic functions operating on 64-bit integer values in shared memory
Atomic addition operating on 32-bit floating point values in global and shared memory
__ballot()
__threadfence_system()
__syncthreads_count(), __syncthreads_and(), __syncthreads_or()
Surface functions
3D grid of thread blocks
Maximum x-dimension of a grid of thread blocks: 2^31-1; Fermi and earlier GPUs were limited to 65535.
Maximum number of resident blocks per multiprocessor: 16; all earlier GPUs allowed 8.
Maximum number of resident warps per multiprocessor: 64; Fermi allowed 48, and earlier GPUs 32.
Maximum number of resident threads per multiprocessor: 2048; Fermi allowed 1536, and earlier GPUs 1024 or 768 (G80).
Number of 32-bit registers per multiprocessor: 64K; Fermi (T20) has 32K, T10 has 16K, and T8 has 8K.
Maximum width, height, and depth for a 3D texture reference bound to a CUDA array: 4096^3; previously 2048^3.
Maximum number of textures that can be bound to a kernel: 256; previously 128.
Maximum number of surfaces that can be bound to a kernel: 16; previously 8.
|
|
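
To make the per-multiprocessor limits listed above more concrete, here is a minimal host-side sketch in C++ that estimates how many blocks of a kernel can be resident on one multiprocessor, and whether a 1D launch fits in the grid's x-dimension. The names `SmLimits`, `kKepler`, `kFermi`, `resident_blocks`, and `fits_in_grid_x` are hypothetical and are not part of the CUDA API. The estimate is deliberately simplified: it assumes a warp size of 32 and ignores shared memory usage and register allocation granularity, so it should be read as an upper bound only.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical container for the per-multiprocessor limits quoted in the list above.
struct SmLimits {
    int max_blocks;   // maximum resident blocks per multiprocessor
    int max_warps;    // maximum resident warps per multiprocessor
    int max_threads;  // maximum resident threads per multiprocessor
    int registers;    // number of 32-bit registers per multiprocessor
};

// Assumption: 32 threads per warp.
const int kWarpSize = 32;

// Values copied from the list above (new architecture vs. Fermi).
const SmLimits kKepler = {16, 64, 2048, 64 * 1024};
const SmLimits kFermi  = {8, 48, 1536, 32 * 1024};

// Upper bound on the number of blocks resident on one multiprocessor.
// Simplification: shared memory and register allocation granularity are ignored.
inline int resident_blocks(const SmLimits& sm, int threads_per_block, int regs_per_thread) {
    if (threads_per_block <= 0 || regs_per_thread <= 0) {
        return 0;  // invalid launch configuration
    }
    // A partially filled warp still occupies a whole warp slot.
    const int warps_per_block = (threads_per_block + kWarpSize - 1) / kWarpSize;
    const int by_warps   = sm.max_warps / warps_per_block;
    const int by_threads = sm.max_threads / threads_per_block;
    const int by_regs    = sm.registers / (regs_per_thread * warps_per_block * kWarpSize);
    // The tightest of the four limits decides.
    return std::min(std::min(sm.max_blocks, by_warps), std::min(by_threads, by_regs));
}

// True if a 1D launch covering n elements fits in the grid's x-dimension.
inline bool fits_in_grid_x(std::uint64_t n, std::uint32_t threads_per_block,
                           std::uint64_t max_grid_x) {
    if (threads_per_block == 0) {
        return false;  // invalid launch configuration
    }
    const std::uint64_t blocks = (n + threads_per_block - 1) / threads_per_block;
    return blocks <= max_grid_x;
}
```

Under these assumptions, a kernel launched with 256 threads per block and 32 registers per thread gives `resident_blocks(kKepler, 256, 32) == 8` but `resident_blocks(kFermi, 256, 32) == 4`, because the smaller register file is the binding limit on Fermi. Similarly, a 1D launch over 2^26 elements with 256 threads per block needs 262144 blocks, which exceeds the older x-dimension limit of 65535 but fits easily within 2^31-1.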