|
Texture interpolation (FP16 on NV G60/G70):
Bilinear interpolation of one component takes 4 multiplies, 3 adds and 2 subtracts: 9 ops; x4 channels = 36 flops.
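A minimal sketch of how the 9-op figure above can be reached, assuming the weighted-sum form of bilinear filtering. The function names are hypothetical, and the four weight products are treated as precomputed and left out of the tally (an assumption made so the count matches the figure above, not something the notes state).

```python
# Per-component operation tally, taken from the count in the notes.
OPS_PER_COMPONENT = {"mul": 4, "add": 3, "sub": 2}
CHANNELS = 4


def bilinear_weights(fx, fy):
    """Return the four texel weights for fractional offsets fx, fy in [0, 1]."""
    # These are the two subtractions in the tally.
    gx = 1.0 - fx
    gy = 1.0 - fy
    # Weight products: assumed precomputed, not part of the 9-op tally.
    return gx * gy, fx * gy, gx * fy, fx * fy


def bilinear_component(c00, c10, c01, c11, weights):
    """Blend one channel of four neighbouring texels: 4 multiplies, 3 adds."""
    w00, w10, w01, w11 = weights
    return c00 * w00 + c10 * w10 + c01 * w01 + c11 * w11


def flops_per_sample():
    """Ops per component times the number of channels."""
    return sum(OPS_PER_COMPONENT.values()) * CHANNELS
```

Under these assumptions, `flops_per_sample()` reproduces the 36-flop figure; counting the weight products as well would raise it.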
Blend operation (FP16 buffer on G60/G70/X1000): 12 flops.
vec4 add/mul/sub(?)/div(?): 1 flop * 4 = 4 flops
vec4 mad: 2 flops * 4 = 8 flops
nrm:
squareRootOfTheSum = (src0.x*src0.x + src0.y*src0.y + src0.z*src0.z)^(1/2);
dest.x = src0.x * (1 / squareRootOfTheSum);
dest.y = src0.y * (1 / squareRootOfTheSum);
dest.z = src0.z * (1 / squareRootOfTheSum);
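dest.w = src0.w * (1 / squareRootOfTheSum);
"When computing the reciprocal square root, a lookup table is usually used (assuming high precision is not required), so it is usually not counted as a floating-point operation." |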
|
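As a rough tally for the nrm pseudocode above, the sketch below mirrors its arithmetic in Python and counts the operations. The function names and the `count_rsqrt` flag are hypothetical. Counting the reciprocal square root as a single flop when the flag is set is an assumption; the default follows the quoted note and treats it as free.

```python
import math


def nrm(src):
    """Mirror of the nrm pseudocode: scale all four components by 1/length(xyz)."""
    x, y, z, w = src
    # Squared length uses x, y, z only: 3 multiplies and 2 adds.
    length = math.sqrt(x * x + y * y + z * z)
    # Assumes a non-zero xyz length; a zero vector would divide by zero here.
    inv = 1.0 / length
    # 4 multiplies, one per destination component.
    return (x * inv, y * inv, z * inv, w * inv)


def nrm_flops(count_rsqrt=False):
    """Tally the flops of nrm; the reciprocal square root is optional."""
    dot = 3 + 2      # 3 multiplies + 2 adds
    scale = 4        # 4 multiplies
    rsqrt = 1 if count_rsqrt else 0  # assumption: one flop if counted at all
    return dot + scale + rsqrt
```

With the lookup-table convention this gives 9 flops per nrm, and 10 if the reciprocal square root is counted as one operation.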