OpenGL 4.0+ A-buffer demo program

Edison · posted 2010-6-10 17:47

http://blog.icare3d.org/2010/06/ ... le-pass-buffer.html

One of the first things I wanted to try on the GF100 was the new NVIDIA extensions that allow random-access reads/writes and atomic operations on global memory and textures, in order to implement a fast A-Buffer!

It worked pretty well: it provides something like a 1.5x speedup over the fastest previous approach (at least the fastest I know about!), with zero artifacts and support for an arbitrary number of layers in a single geometry pass.

Sample application sources and Win32 executable:
Sources+executable+Stanford Dragon model
Additional models

Be aware that this will probably only run on a Fermi card. In particular it requires: EXT_shader_image_load_store, NV_shader_buffer_load, NV_shader_buffer_store, EXT_direct_state_access
The application uses freeglut to initialize an OpenGL 4.0 context with the core profile.

A-Buffer:
Basically an A-buffer is a simple list of fragments per pixel [Carpenter 1984]. Previous methods to implement it on DX10-generation hardware required multiple passes to capture a useful number of fragments per pixel. They were essentially based on depth peeling, with enhancements that capture more than one layer per geometry pass, like the k-buffer and the stencil-routed k-buffer, which suffer from read-modify-write hazards. Bucket sort depth peeling can capture up to 32 fragments per geometry pass, but with only 32 bits per fragment (just a depth) and at the cost of potential collisions.
All these techniques were complex and basically limited by the maximum of 8 render targets that were writable by the fragment shader.

My technique can handle an arbitrary number of fragments per pixel in a single pass, limited only by the available video memory. In this example, I do order-independent transparency with fragments storing 4x32-bit values containing the RGB color components and the depth.
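As a rough illustration of that memory limit, here is a back-of-the-envelope estimate of the A-buffer footprint. The 16 bytes per fragment follows from the 4x32-bit layout above and the 32 layers match the benchmark below; the 512x512 window size and the 32-bit per-pixel counter are assumptions made for this sketch, not figures from the post.

```python
BYTES_PER_FRAGMENT = 4 * 4  # 4 x 32-bit values (RGB + depth), as described in the post
BYTES_PER_COUNTER = 4       # assumption: one 32-bit fragment counter per pixel


def abuffer_bytes(width, height, max_layers):
    """Bytes needed for the pre-allocated fragment layers plus the counters."""
    pixels = width * height
    fragment_storage = pixels * max_layers * BYTES_PER_FRAGMENT
    counter_storage = pixels * BYTES_PER_COUNTER
    return fragment_storage + counter_storage


# Hypothetical 512x512 window with 32 layers.
total = abuffer_bytes(512, 512, 32)
print(total / 2**20, "MiB")  # prints: 129.0 MiB
```

The cost grows linearly with both the resolution and the layer budget, since every layer is pre-allocated for every pixel whether or not it ends up being used.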

Technique:
The idea is very simple: each fragment is written by the fragment shader at its position into a pre-allocated 2D texture array (or a global memory region) with a fixed maximum number of layers. The layer to write the fragment into is given by a counter stored per pixel in another 2D texture and incremented using an atomic increment (or addition) operation (imageAtomicIncWrap or imageAtomicAdd). After the rendering pass, the A-Buffer contains an unordered list of fragments per pixel, along with its size. To sort these fragments by depth and composite them on the screen, I simply use a single screen-filling quad with a fragment shader. This shader copies all of the pixel's fragments into a local array (probably stored in L1 on Fermi), sorts them with a naive bubble sort, and then combines them front-to-back based on transparency.
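The two passes described above can be sketched on the CPU in plain Python. This is only an illustration of the data flow, not the demo's shader code: the class and function names are hypothetical, the sequential counter update stands in for the GPU's atomic increment, fragments past the layer budget are simply dropped, smaller depth is assumed to be closer to the camera, and a single constant alpha is assumed for every fragment because the stored fragments carry only RGB and depth.

```python
class ABuffer:
    """Per-pixel fragment lists in pre-allocated layers (hypothetical CPU stand-in)."""

    def __init__(self, width, height, max_layers=4):
        self.width = width
        self.height = height
        self.max_layers = max_layers
        pixels = width * height
        self.counters = [0] * pixels  # one fragment counter per pixel
        self.layers = [[None] * max_layers for _ in range(pixels)]

    def write_fragment(self, x, y, rgb, depth):
        """Pass 1: store a fragment in the next free layer of its pixel."""
        i = y * self.width + x
        slot = self.counters[i]  # stands in for the value returned by the atomic add
        self.counters[i] += 1
        if slot < self.max_layers:  # assumption: overflowing fragments are dropped
            self.layers[i][slot] = (rgb[0], rgb[1], rgb[2], depth)


def bubble_sort_by_depth(fragments):
    """Naive bubble sort, nearest fragment first (assumes smaller depth = closer)."""
    frags = list(fragments)
    n = len(frags)
    for i in range(n):
        for j in range(n - 1 - i):
            if frags[j][3] > frags[j + 1][3]:
                frags[j], frags[j + 1] = frags[j + 1], frags[j]
    return frags


def resolve_pixel(abuf, x, y, alpha=0.5, background=(0.0, 0.0, 0.0)):
    """Pass 2: sort one pixel's fragments and blend them front-to-back."""
    i = y * abuf.width + x
    count = min(abuf.counters[i], abuf.max_layers)
    frags = bubble_sort_by_depth(abuf.layers[i][:count])
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through
    for r, g, b, _depth in frags:
        weight = transmittance * alpha
        color[0] += weight * r
        color[1] += weight * g
        color[2] += weight * b
        transmittance *= 1.0 - alpha
    return tuple(c + transmittance * bg for c, bg in zip(color, background))
```

For example, writing a red fragment at depth 0.8 and then a green one at depth 0.2 into the same pixel and resolving with alpha 0.5 gives (0.25, 0.5, 0.0): the green fragment is nearer, so it contributes at full weight and attenuates the red one behind it, regardless of submission order.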

Performance:
To compare performance, this sample also features a standard rasterization mode which renders directly into the color buffer. On the Stanford Dragon example, with a GTX480 and 32 layers in the A-Buffer, the technique ranges between 400 and 500 FPS, and is only 5-20% more costly than a simple rasterization of the mesh.
I also compared performance with the k-buffer, whose code is available online. On the GTX480, with the same model and shading (and 16 layers), I get more than a 2x speedup, without the artifacts of the k-buffer version. Based on those results, I strongly believe that it is also close to 1.5x faster than bucket sort depth peeling, without its depth collision problems.

Artifacts comparison with the K-Buffer:

OpenGL 4.0 K-Buffer

goodayoo · posted 2010-6-10 18:01

Hardly any games use OpenGL these days, do they? What a tragic standard.

莫貘 · posted 2010-6-10 18:10

What about 3DS MAX? What about IDsoft?
That said, OpenGL was not originally designed for games.

66666 · posted 2010-6-10 18:29

Accumulate-buffer???

Edison · posted 2010-6-10 18:42

This is mainly meant to demonstrate OIT.

gz_easy · posted 2010-6-10 18:49

AMD's DX11 demo Mecha has also shown OIT.

66666 · posted 2010-6-10 18:57

What is this OpenGL OIT implemented with?

土星实验室 · posted 2010-6-10 20:13

xx, xx yet again.....

xm-2000 · posted 2010-6-10 21:53

xx, xx yet again.....

66666 · posted 2010-6-10 23:06

Hardly any games use OpenGL these days, do they? What a tragic standard.
goodayoo posted 2010-6-10 18:01

What API other than OpenGL could games on non-Windows platforms use?

disruptor · posted 2010-6-11 12:28

Looks like the images are broken.

Buffer · posted 2010-6-11 12:52

BUFFER, that looks really familiar.

I'll go check which version of OpenGL my software uses.
