OpenGL 4.0+ A-buffer demo program

1#
Posted on 2010-6-10 17:47
http://blog.icare3d.org/2010/06/ ... le-pass-buffer.html
       

One of the first things I wanted to try on the GF100 was the new NVIDIA extensions that allow random-access reads/writes and atomic operations on global memory and textures, in order to implement a fast A-Buffer!
       
It worked pretty well: it provides roughly a 1.5x speedup over the fastest previous approach (at least that I know about), with zero artifacts, and it supports an arbitrary number of layers with a single geometry pass.
       
Sample application sources and Win32 executable:
Sources+executable+Stanford Dragon model
Additional models
       
Be aware that this will probably only run on a Fermi card. In particular it requires: EXT_shader_image_load_store, NV_shader_buffer_load, NV_shader_buffer_store, EXT_direct_state_access.
The application uses freeglut to initialize an OpenGL 4.0 context with the core profile.
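For illustration, the GLSL side of these requirements could be declared as in the preamble below. This is only a sketch based on the extension names above, not the demo's actual shader source; EXT_direct_state_access and the context setup live on the host side and have no GLSL counterpart here.

```glsl
// Hypothetical preamble of the capture fragment shader (GL 4.0 + vendor extensions).
#version 400 core
// Image load/store and image atomics (imageStore, imageAtomicAdd, imageAtomicIncWrap):
#extension GL_EXT_shader_image_load_store : enable
// Loads through raw GPU pointers, used by the global-memory variant of the A-buffer:
#extension GL_NV_shader_buffer_load : enable
```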
       
A-Buffer:
Basically, an A-buffer is a simple list of fragments per pixel [Carpenter 1984]. Previous methods to implement it on DX10-generation hardware required multiple passes to capture an interesting number of fragments per pixel. They were essentially based on depth peeling, with enhancements that allow capturing more than one layer per geometry pass, such as the k-buffer and the stencil-routed k-buffer, which suffer from read-modify-write hazards. Bucket-sort depth peeling captures up to 32 fragments per geometry pass, but with only 32 bits per fragment (just a depth) and at the cost of potential collisions.
All these techniques were complex and fundamentally limited by the maximum of 8 render targets writable by the fragment shader.
       
My technique can handle an arbitrary number of fragments per pixel in a single pass, the only limitation being the available video memory. In this example, I do order-independent transparency, with each fragment storing a 4x32-bit value containing the RGB color components and the depth.
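To make the storage cost concrete, here is a small sketch of how such a fragment could be laid out and what the memory budget looks like. The packing order and the 1024x768 resolution are my own example choices; only the 4x32-bit format and the 32-layer figure come from the post.

```glsl
// One A-buffer fragment = one rgba32f texel (4 x 32 bits):
//   .rgb : shaded color of the fragment
//   .a   : depth, kept for the later sorting pass
vec4 packFragment(vec3 color, float depth) { return vec4(color, depth); }

// Video-memory budget of the pre-allocated storage (example numbers):
//   1024 x 768 pixels x 32 layers x 16 bytes = 384 MiB for the fragments,
//   plus 1024 x 768 x 4 bytes                =   3 MiB for the per-pixel counters.
```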
       
Technique:
The idea is very simple: each fragment is written by the fragment shader at its position into a pre-allocated 2D texture array (or a global memory region) with a fixed maximum number of layers. The layer to write the fragment into is given by a per-pixel counter stored in another 2D texture and incremented with an atomic increment (or addition) operation (imageAtomicIncWrap or imageAtomicAdd). After the rendering pass, the A-buffer contains an unordered list of fragments per pixel, together with its size. To sort these fragments by depth and compose them on screen, I simply render a single screen-filling quad with a fragment shader. This shader copies all of the pixel's fragments into a local array (probably stored in L1 on Fermi), sorts them with a naive bubble sort, and then combines them front-to-back based on transparency.
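As a concrete but hypothetical sketch of that capture pass, written against the newer core GLSL image load/store syntax rather than the GL 4.0 + EXT_shader_image_load_store combination the demo actually targets, and with made-up identifiers:

```glsl
#version 420 core
// Capture pass: every rasterized fragment appends itself to its pixel's list.
layout(rgba32f) coherent uniform image2DArray abufferImg;        // per-pixel fragment storage
layout(r32ui)   coherent uniform uimage2D     abufferCounterImg; // per-pixel fragment count
uniform int abufferMaxLayers;

in vec3 shadedColor;   // color computed by the usual shading

void main()
{
    ivec2 coords = ivec2(gl_FragCoord.xy);

    // The atomic add returns the previous count: that is the layer this fragment goes to.
    uint layer = imageAtomicAdd(abufferCounterImg, coords, 1u);

    if (layer < uint(abufferMaxLayers)) {
        // Store RGB + depth into the pre-allocated layer of the 2D texture array.
        imageStore(abufferImg, ivec3(coords, int(layer)),
                   vec4(shadedColor, gl_FragCoord.z));
    }
    // Fragments beyond abufferMaxLayers are simply dropped in this sketch.
}
```

The resolve pass on the screen-filling quad could then be sketched as below; the constant per-fragment alpha and all names are again my own, and the fixed 32-layer bound matches the allocation assumed above.

```glsl
#version 420 core
// Resolve pass: sort each pixel's fragments by depth and blend them front-to-back.
layout(rgba32f) coherent uniform image2DArray abufferImg;
layout(r32ui)   coherent uniform uimage2D     abufferCounterImg;
uniform vec4  backgroundColor;
uniform float fragAlpha;            // constant transparency used for this sketch

out vec4 outColor;

#define ABUFFER_MAX_LAYERS 32       // must match the allocated depth of abufferImg

void main()
{
    ivec2 coords = ivec2(gl_FragCoord.xy);
    int count = int(min(imageLoad(abufferCounterImg, coords).r, uint(ABUFFER_MAX_LAYERS)));

    // Copy the pixel's fragments into a local array (fast on-chip memory on Fermi).
    vec4 frags[ABUFFER_MAX_LAYERS];
    for (int i = 0; i < count; ++i)
        frags[i] = imageLoad(abufferImg, ivec3(coords, i));

    // Naive bubble sort on the depth stored in .a, nearest fragment first.
    for (int i = 0; i < count - 1; ++i)
        for (int j = 0; j < count - 1 - i; ++j)
            if (frags[j].a > frags[j + 1].a) {
                vec4 tmp = frags[j];
                frags[j] = frags[j + 1];
                frags[j + 1] = tmp;
            }

    // Front-to-back "under" compositing.
    vec3  color         = vec3(0.0);
    float transmittance = 1.0;
    for (int i = 0; i < count; ++i) {
        color         += transmittance * fragAlpha * frags[i].rgb;
        transmittance *= 1.0 - fragAlpha;
    }
    outColor = vec4(color + transmittance * backgroundColor.rgb, 1.0);
}
```

Between the two passes the application would also need to clear the counter image and issue an image memory barrier (glMemoryBarrier with GL_SHADER_IMAGE_ACCESS_BARRIER_BIT in the core API, MemoryBarrierEXT in the extension) so that the resolve shader sees the fragments written by the capture pass.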
       
Performances:
To compare performance, this sample also features a standard rasterization mode that renders directly into the color buffer. On the Stanford Dragon example, with a GTX480 and 32 layers in the A-buffer, the technique runs at 400-500 FPS and is only 5-20% more costly than a simple rasterization of the mesh.
I also compared performance with the k-buffer, whose code is available online. On the GTX480, with the same model and shading (and 16 layers), I get more than a 2x speedup, without the artifacts of the k-buffer version. Based on these results, I strongly believe it is also close to 1.5x faster than bucket-sort depth peeling, without its depth-collision problems.
       
Artifact comparison with the K-Buffer:
[images: OpenGL 4.0 A-buffer (left) vs. K-Buffer (right)]

2#
Posted on 2010-6-10 18:01
Hardly any games use OpenGL these days; it's a tragic standard.

3#
Posted on 2010-6-10 18:10
What about 3DS MAX? What about IDsoft?
That said, OpenGL wasn't designed for games in the first place.

4#
Posted on 2010-6-10 18:29
Accumulate-buffer???

5#
(OP) Posted on 2010-6-10 18:42
This is mainly used to demonstrate OIT.

6#
Posted on 2010-6-10 18:49
AMD's DX11 demo Mecha has also demonstrated OIT.

7#
Posted on 2010-6-10 18:57
What is this OpenGL OIT implemented with?

8#
Posted on 2010-6-10 20:13
xx, xx again.....

9#
Posted on 2010-6-10 21:53
xx, xx again.....

10#
Posted on 2010-6-10 23:06

Quote (goodayoo, posted 2010-6-10 18:01): "Hardly any games use OpenGL these days; it's a tragic standard."

What API are games on non-Windows platforms supposed to use, if not OpenGL?

11#
Posted on 2010-6-11 12:28
Looks like the images are broken.

12#
Posted on 2010-6-11 12:52
BUFFER, that looks really familiar.

Let me go check which version of OpenGL my software uses.
