|
http://igm.univ-mlv.fr/~biri/mlaa-gpu/MLAAGPU.pdf
This technique does not need multiple samples and can be implemented efficiently on the CPU using vector instructions. However, the filter is not linear and requires deep
branching and image-wide knowledge, both of which can be very inefficient on graphics hardware. We introduce an efficient adaptation of the MLAA algorithm that runs flawlessly on mid-range GPUs.
Our implementation adds a total cost of 34 ms (3.49 ms) to the rendering at a resolution of 1248x1024 on an NVIDIA GeForce 8600 GT (GTX 295). The GPU version tends to scale very well, since the cost at 1600x1200 is only 67.5 ms (5.54 ms), a cost increase of 98% (65.3%) for 144% more pixels. We can compare
our results to a standard CPU implementation, which runs in 67 ms at 1024x768 and in 128 ms at 1600x1200 on a 2.20 GHz Core 2 Duo. Note that the CPU timings do not include the costly GPU/CPU/GPU transfers required for real-time rendering. |
|