
GDC 2008 Main Talks

1#
Posted 2007-12-20 00:46
2#
Posted 2008-2-14 01:54
Boss, could you give us a rough translation of the talks?

3#
Posted 2008-2-14 02:32
=.= If you need a translation, there is not much point in reading this anyway.

4#
Posted 2008-2-14 09:50
CRYSIS has two talks. A hot topic is a hot topic, it seems.

5#
Posted 2008-2-14 22:52
There are quite a few on PS3 and XBOX360 too.

6#
Posted 2008-2-15 17:37
Isn't there a Chinese version???

7#
Posted 2008-2-16 01:23
Looking forward to the full PDFs.

8#
OP | Posted 2008-3-1 13:15
AMD document downloads:

GDC 2008
  • Jon Story & Holger Grün. GDC08 MGPU: Slides

  • Holger Gruen, Jon Story & Ignacio Llamas. GDC08 AD3D MGPU: Slides

  • Richard Huddy. DirectX10.1 “DirectX 10 and then some…”: Slides

  • Bill Bilodeau & Peter Lohrmann. Tessellation in a Low Poly World: Slides

  • Nicolas Thibieroz. Ultimate Graphics Performance for DirectX 10 Hardware: Slides

  • Jonathan Zarge & Dan Ginsburg. The Ultimate Developers Toolkit: Slides

NVIDIA document downloads:

GDC 2008











9#
OP | Posted 2008-3-2 11:50

A quick extraction of some key points:


GS not designed for large-expansion algorithms like tessellation
   Due to required ordering and serial execution
   See Andrei Tatarinov’s talk on Instanced Tessellation

Remember you don’t need to use a GS if you are just processing vertices

Be aware of appropriate ALU to TEX hardware instruction ratios:
   4 5D-vector ALU per TEX on AMD [AMD now acknowledges it is a 5D vector design, rather than calling it scalar as in the press release]
   10 scalar ALU per TEX on NVIDIA GeForce 8 series
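
As a rough illustration of how the ratios above might be used, the sketch below compares a shader's instruction counts against a target ALU:TEX ratio. The `ShaderStats` struct and `isLikelyTexBound` helper are hypothetical names, not part of any SDK, and the counts are assumed to come from a compiled shader listing in the hardware's own unit (5D-vector ALU on AMD, scalar ALU on GeForce 8).

```cpp
#include <cassert>

// Hypothetical container for counts read from a compiled shader listing.
struct ShaderStats {
    int aluInstructions;  // ALU ops in the target's own unit (vector or scalar)
    int texInstructions;  // texture fetch instructions
};

// Target ratios quoted in the talk notes.
constexpr double kAmdVectorAluPerTex = 4.0;        // 5D-vector ALU per TEX
constexpr double kGeForce8ScalarAluPerTex = 10.0;  // scalar ALU per TEX

// Returns true when the shader has fewer ALU ops per fetch than the target
// ratio, which suggests (but does not prove) that texture fetches dominate.
bool isLikelyTexBound(const ShaderStats& s, double targetAluPerTex) {
    assert(s.aluInstructions >= 0 && s.texInstructions >= 0);
    return s.aluInstructions < targetAluPerTex * s.texInstructions;
}
```

A shader with 20 scalar ALU ops and 4 fetches falls below the 10:1 GeForce 8 target, so it would be flagged; the same counts are not comparable across vendors because the ALU units differ.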

Check for excessive register usage
   > 10 vector registers is high on GeForce 8 series [on GF8, shader performance starts to suffer at around 10 vector registers]
   Simplify shader, disable loop unrolling
   DX compiler behavior may unroll loops so check output



AMD: Clears
   Always clear Z buffer to enable HiZ
   Clearing of color render targets is not free on Radeon HD 2000 and 3000 series
      Cost is proportional to number of pixels to clear
      The fewer pixels to clear the better!
   Here the rule about minimum work applies:
      Only clear render targets that need to be cleared!
      Exception for MSAA RTs: need clearing every frame
   RT clears are not required for optimal multi-GPU usage

AMD: Depth Buffer Formats
   Avoid DXGI_FORMAT_D24_UNORM_S8_UINT for depth shadow maps
      Reading back a 24-bit format is a slow path
      Usually no need for stencil in shadow maps anyway
   Recommended depth shadow map formats:
      DXGI_FORMAT_D16_UNORM
         Fastest shadow map format
         Precision is enough in most situations
         Just need to set your projection matrix optimally
      DXGI_FORMAT_D32_FLOAT
         High-precision but slower than the 16-bit format
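
To make the "set your projection matrix optimally" remark concrete, here is a small sketch that estimates how coarse one step of a UNORM depth buffer is at a given view-space distance. It assumes a standard Direct3D-style perspective projection (depth 0 at the near plane, 1 at the far plane), as for a spot-light shadow map; `depthStepAt` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>

// Approximate view-space size of one depth-buffer step at distance z,
// for a perspective projection mapping [zNear, zFar] to depth [0, 1]:
//   depth(z) = zFar / (zFar - zNear) * (1 - zNear / z)
// The step is 1 / (levels * d(depth)/dz), with levels = 2^bits - 1.
double depthStepAt(double z, double zNear, double zFar, int bits) {
    assert(zNear > 0.0 && zFar > zNear && z >= zNear && z <= zFar);
    const double levels = std::ldexp(1.0, bits) - 1.0;
    const double slope = (zFar * zNear) / ((zFar - zNear) * z * z);
    return 1.0 / (levels * slope);
}
```

Under these assumptions, moving the near plane from 0.1 to 1.0 with a far plane of 100 shrinks the 16-bit depth step at z = 50 by about a factor of ten, which is why a tight near/far range matters more for DXGI_FORMAT_D16_UNORM than for the wider formats.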




NVIDIA: Clears
   Always Clear Z buffer to enable ZCULL
   Always prefer Clears vs. fullscreen quad draw calls
   Avoid partial Clears
      Note there are no scissored Clears in DX10; they are only possible via draw calls
   Use Clear at the beginning of a frame on any rendertarget or depthstencil buffer
      In SLI mode the driver uses Clears as a hint that no inter-frame dependency exists. It can then avoid synchronization and transfer between GPUs

NVIDIA: Attribute Boundedness
   Interleave data when possible into fewer VB streams:
      at least 8 scalars per stream
   Use Load() from Buffer or Texture instead
   Dynamic VBs/IBs might be in system memory accessed over PCIe:
      maybe CopyResource to USAGE_DEFAULT before using (especially if used multiple times in several passes)
   Passing too many attributes from VS to PS may also be a bottleneck
      packing and Load() also apply in this case
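
The interleaving advice can be sketched on the CPU side as follows. This is only an illustration: `InterleavedVertex` and `interleave` are hypothetical names, and the position/normal/uv layout is an assumed example that happens to give exactly 8 scalars per vertex in a single stream.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One vertex of a single interleaved stream: 3 + 3 + 2 = 8 float scalars.
struct InterleavedVertex {
    float position[3];
    float normal[3];
    float uv[2];
};

static_assert(sizeof(InterleavedVertex) == 8 * sizeof(float),
              "expected 8 tightly packed scalars per vertex");

// Merges three separate attribute arrays into one interleaved array.
// Assumes positions and normals hold 3 floats per vertex and uvs hold 2.
std::vector<InterleavedVertex> interleave(const std::vector<float>& positions,
                                          const std::vector<float>& normals,
                                          const std::vector<float>& uvs) {
    const std::size_t count = positions.size() / 3;
    assert(normals.size() >= 3 * count && uvs.size() >= 2 * count);
    std::vector<InterleavedVertex> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        for (int c = 0; c < 3; ++c) {
            out[i].position[c] = positions[3 * i + c];
            out[i].normal[c] = normals[3 * i + c];
        }
        for (int c = 0; c < 2; ++c) {
            out[i].uv[c] = uvs[2 * i + c];
        }
    }
    return out;
}
```

In a Direct3D 10 input layout, the byte offsets of `normal` and `uv` inside this struct would be the values to put in the `AlignedByteOffset` field of each `D3D10_INPUT_ELEMENT_DESC`, all with the same input slot.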

NVIDIA: Depth Buffer Formats
   Use DXGI_FORMAT_D24_UNORM_S8_UINT
   DXGI_FORMAT_D32_FLOAT should offer very similar performance, but may have lower ZCULL efficiency
   Avoid DXGI_FORMAT_D16_UNORM
      will not save memory or increase performance
   CSAA will increase memory footprint

NVIDIA: ZCULL Considerations
   Coarse Z culling is transparent, but it may underperform if:
      Depth test changes direction while writing depth (== no Z culling!)
      Depth buffer was written using a different depth test direction than the one used for testing (testing is less efficient)
      Stencil writes are enabled while testing (it avoids stencil clear, but may kill performance)
      DepthStencilView has Texture2D[MS]Array dimension (on GeForce 8 series)
      MSAA is used (less efficient)
      Too many large depth buffers are allocated (it’s harder for the driver to manage)
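
The first condition in the list (depth test direction changing while depth writes are on) is the kind of thing a debug-time check in an engine could catch. The sketch below is one possible way to do that, not anything from the talk: `DepthDraw`, `DepthFunc` and `firstDirectionFlip` are hypothetical names, and the draw list is assumed to cover one depth buffer between two Clears.

```cpp
#include <cstddef>
#include <vector>

enum class DepthFunc { Less, LessEqual, Greater, GreaterEqual, Equal, Always };
enum class Direction { None, Less, Greater };

// Maps a comparison function to the direction it implies, if any.
Direction directionOf(DepthFunc f) {
    switch (f) {
        case DepthFunc::Less:
        case DepthFunc::LessEqual:    return Direction::Less;
        case DepthFunc::Greater:
        case DepthFunc::GreaterEqual: return Direction::Greater;
        default:                      return Direction::None;
    }
}

// Hypothetical record of the depth state used by one draw call.
struct DepthDraw {
    DepthFunc func;
    bool depthWrite;
};

// Returns the index of the first depth-writing draw whose test direction
// differs from the direction set by earlier depth-writing draws, or -1.
int firstDirectionFlip(const std::vector<DepthDraw>& draws) {
    Direction established = Direction::None;
    for (std::size_t i = 0; i < draws.size(); ++i) {
        const Direction d = directionOf(draws[i].func);
        if (!draws[i].depthWrite || d == Direction::None) continue;
        if (established == Direction::None) {
            established = d;
        } else if (d != established) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Only the first bullet is modeled here; draws that merely test in the opposite direction without writing depth are not flagged, although the notes say those are also less efficient.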


10#
OP | Posted 2008-3-2 12:22
NVIDIA's tessellation :p














