Title: A Sneak Peek at the Next-Generation API: An Overview of the New Features in Microsoft's DirectX 11
Author: tayuzheng  Time: 2008-8-19 12:57
[In this sponsored feature, part of Gamasutra's XNA microsite, Microsoft's Kevin Gee explains in depth the new features of DirectX 11, from improved multi-threading to Shader Model 5.0 and beyond.]
Recently, at its annual Gamefest conference, Microsoft announced the forthcoming DirectX 11 API set. This technology, whose key features and benefits are discussed in this article, enables developers to take advantage of the latest hardware developments across both CPUs and GPUs, all while easing development pain. Let's take a look at the rich set of DirectX 11 features.
Feature Highlights
Down-level hardware and operating system support
Improved multithreaded device
New hardware stages for tessellation
Improved texture compression
Shader Model 5.0
Compute shader
Additional features
Down-Level Hardware and Operating System Support
Windows Vista and DirectX 10 were engineered to improve the underlying Windows Display Driver Model (WDDM) and create significant opportunities for driver performance improvement. In addition, the DirectX 10 API was designed to be cleaner and simpler, with the near full removal of capability bits, thereby making client code easier to write and removing development pain. DirectX 11 brings enough new features to be a full version update; however, since it builds upon and extends DirectX 10, anyone familiar with DirectX 10 and 10.1 will feel immediately at home with DirectX 11. With DirectX 11, it is possible for developers to target hardware feature levels 10, 10.1, and 11 by using a single set of functions.
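To make the idea of targeting several feature levels through one code path more concrete, here is a minimal sketch in Python. It is not Direct3D code: `FeatureLevel`, `pick_feature_level`, and `supports_tessellation` are hypothetical names invented for this illustration. It only models the pattern of asking for a list of acceptable levels and then branching on whichever level the hardware grants.

```python
from enum import IntEnum


class FeatureLevel(IntEnum):
    """Hypothetical stand-in for the hardware feature levels named in the article."""
    LEVEL_10_0 = 100
    LEVEL_10_1 = 101
    LEVEL_11_0 = 110


def pick_feature_level(requested, supported_by_hardware):
    """Return the highest requested level that the hardware supports, or None."""
    for level in sorted(requested, reverse=True):
        if level in supported_by_hardware:
            return level
    return None


def supports_tessellation(level):
    """Assumption for this sketch: the new tessellation stages need level 11."""
    return level is not None and level >= FeatureLevel.LEVEL_11_0


if __name__ == "__main__":
    requested = [FeatureLevel.LEVEL_11_0, FeatureLevel.LEVEL_10_1, FeatureLevel.LEVEL_10_0]
    dx10_class_card = {FeatureLevel.LEVEL_10_0}
    level = pick_feature_level(requested, dx10_class_card)
    # The same application code runs; only optional paths are switched off.
    print(level.name, "tessellation:", supports_tessellation(level))
```

The point of the design is that the application is written once, and features tied to newer hardware become optional branches rather than separate code bases.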
The timing for the final release of DirectX 11 aligns with the next version of Windows, but the API will also be made available on Windows Vista. Thus, with DirectX 10-class and 10.1-class hardware already in consumers' machines, there will be a lot of hardware to target right from launch.
Improved Multithreaded Device
Earlier releases of Direct3D focused primarily on single-CPU configurations and as such had limited threading support. With DirectX 11, the API has been updated to enable developers to better drive the GPU from a multi-core CPU. DirectX 11 improves scaling on CPUs via changes to both the API model and driver model. Asynchronous device access becomes possible through two key features of the Direct3D 11 device object.
First, improvements in synchronization between the Direct3D device object and the driver enable asynchronous API calls, including resource allocations. Direct3D 11 allows developers more freedom when expressing parallelism by allowing such calls to occur across multiple threads.
Second, the Direct3D device interface now supports multiple rendering contexts: 1) a primary immediate context, which dictates the timeline for work submission to the GPU, and 2) optional deferred contexts, created by the application developer as needed. Work associated with each deferred context can occur on a separate thread/core. This enables GPU commands to be accumulated in parallel to the main rendering work, and then sent to the GPU later when the main context is ready to submit a new task to the GPU.
The following figure shows rendering tasks being queued in parallel to the main immediate context, and being submitted as they become complete.
This feature of DirectX 11 supports Direct3D 10-class and 10.1-class hardware, too, so changes made in the way applications render will benefit existing hardware.
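The split between one immediate context and several deferred contexts can be sketched as follows. This is a toy model in Python, not the Direct3D 11 API: `DeferredContext`, `ImmediateContext`, `record_scene_part`, and `render_frame` are hypothetical names, and Python threads are used only to show the structure (recording on workers, submitting on one thread), not to demonstrate a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


class DeferredContext:
    """Hypothetical recorder: accumulates commands without touching the GPU."""

    def __init__(self):
        self._commands = []

    def draw(self, name):
        self._commands.append(("draw", name))

    def finish_command_list(self):
        # Hand back an immutable snapshot and reset for the next recording.
        recorded, self._commands = tuple(self._commands), []
        return recorded


class ImmediateContext:
    """Hypothetical submitter: the only object that 'talks to the GPU'."""

    def __init__(self):
        self.submitted = []

    def execute_command_list(self, command_list):
        self.submitted.extend(command_list)


def record_scene_part(objects):
    """Runs on a worker thread; each call uses its own deferred context."""
    ctx = DeferredContext()
    for obj in objects:
        ctx.draw(obj)
    return ctx.finish_command_list()


def render_frame(scene_parts, workers=4):
    immediate = ImmediateContext()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the submission order stays
        # deterministic even though recording happens on several threads.
        for command_list in pool.map(record_scene_part, scene_parts):
            immediate.execute_command_list(command_list)
    return immediate.submitted
```

Because each worker owns its deferred context, no locking is needed while recording; the only ordering decision is made on the thread that owns the immediate context.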
New DirectX 11 Hardware Features
Next, let's take a look at some of the hardware-specific features DirectX 11 brings.
New Hardware Stages for Tessellation
DirectX 11 brings three new stages (hull shader, tessellator, and domain shader) to the rendering pipeline. These stages enable flexible, programmable hardware support of tessellation. The hull and domain shaders are programmable parts; the tessellator is fixed function but supports a number of insertion settings providing control over the generated position data.
Hull Shader: This programmable unit runs at the frequency of the source control mesh and allows transforms to be performed on the input data. When discussing applications of the pipeline, we often mention performing a basis change in this shader, from one surface representation to another, for example from a Catmull-Clark quad mesh to Bezier patch controls.
Tessellator: This fixed-function unit can be simply thought of as a data expander and as a place where the IHVs can safely parallelize with the user-provided algorithms. It takes tessellation factors as input and inserts vertices in surface U,V space according to the chosen partitioning scheme.
Domain Shader: This unit executes once for every generated vertex, and as such is the place where surface formulations are evaluated. The inputs to this stage are provided in the surface U,V domain, ready for parametric surface evaluation.
The pipeline supports several input types (quad patch, triangle patch, or even poly-line), which allows developers to target almost any surface formulation. One usage scenario that has been strongly requested is support of sub-division surfaces for rendering characters.
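The division of labor between the three stages can be illustrated with a small CPU-side sketch in Python. It is an assumption-laden toy, not shader code: the function names are hypothetical, the "tessellator" uses plain uniform integer partitioning of a quad patch, and the "domain shader" evaluates a bicubic Bezier patch from a 4x4 grid of control points.

```python
from math import comb


def hull_shader(control_points, tess_factor):
    """Runs once per patch. Here it passes the 4x4 control points through;
    a real hull shader could instead change basis (e.g. to Bezier controls)."""
    return control_points, tess_factor


def tessellator(tess_factor):
    """Fixed-function stand-in: emit a uniform (u, v) grid over the unit quad."""
    n = max(1, int(tess_factor))
    return [(i / n, j / n) for j in range(n + 1) for i in range(n + 1)]


def bernstein(i, t):
    """Cubic Bernstein basis polynomial B_i(t) for i in 0..3."""
    return comb(3, i) * t**i * (1 - t) ** (3 - i)


def domain_shader(control_points, u, v):
    """Runs once per generated vertex: evaluate the bicubic Bezier patch."""
    x = y = z = 0.0
    for row in range(4):
        for col in range(4):
            w = bernstein(row, v) * bernstein(col, u)
            px, py, pz = control_points[row][col]
            x += w * px
            y += w * py
            z += w * pz
    return (x, y, z)


def tessellate_patch(control_points, tess_factor):
    """Chain the three stages for a single quad patch."""
    points, factor = hull_shader(control_points, tess_factor)
    return [domain_shader(points, u, v) for u, v in tessellator(factor)]
```

Note that the tessellator never sees the control points: it only produces (u, v) locations, which is why it can be fixed function while the surface formulation stays fully programmable.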
Sub-Division Surface Approximation Schemes
Charles Loop and Scott Schaefer from Microsoft Research worked on a number of approaches for approximating sub-division surfaces that can be applied to the DirectX 11 pipeline. One of the approaches, provided as a DirectX 10 sample in the DirectX SDK, changes a quad patch basis mesh into Bezier surfaces of fixed tessellation. When applied to the DirectX 11 pipeline, this and other schemes can be used to deliver real-time rendering of sub-division surface meshes.
Improved Texture Compression
Textures in games are often the largest area of memory utilization, so it should be no surprise that further improvements to texture compression are needed to keep working set size and memory bandwidth consumption within the rates required for real-time rendering. DirectX 11 arms developers with new compression formats (BC6 and BC7) to help target high-quality rendering without sacrificing performance. Here we will focus on two specific examples of how DirectX 11 raises the bar for rendering quality. Some of you may be more familiar with the older DXT-style naming convention, which was changed to the block compressed (BC) naming convention for DirectX 10. The newer naming convention is used here.
Compression of High Dynamic Range (HDR) Image Sources
High dynamic range image sources are very common in games these days. When combined with intelligent tone map operators, HDR is often required to make titles look photorealistic. The new block compression (BC) scheme, BC6, has been designed to provide high-quality 6:1 compression of HDR image data with hardware support for decompression.
Here we can see a comparison image for the HDR format. On the left is the HDR original image tone-mapped to a given exposure, and on the right is the equivalent BC6 image. The absolute error image is in the center. Notice how the Abs image contains no obvious blocking errors; the errors we are seeing are generally diffused noise errors. These are visually much less noticeable to the human eye than edges introduced by blocking.
Low Dynamic Range (LDR) / Normal Map Compression
The new BC7 scheme provides support for 8-bit/low dynamic range (LDR) data at 3:1 ratios. Here we compare the results of the new format with the existing block compressed approach, BC3.
You can clearly see the blocking artifacts in the BC3 image, which are drastically reduced in the BC7 image. With this feature, developers and artists can expect more from their linear texture content and normal maps for the same or lower cost in memory size.
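The 6:1 and 3:1 ratios quoted in this section can be checked with a little arithmetic. The sketch below is in Python and rests on details that are not stated in the article: that BC6 and BC7 both store a 4x4 texel block in 128 bits, that the HDR source is three 16-bit floating-point channels (48 bits per texel), and that the LDR source is 8-bit RGB (24 bits per texel). The helper names are hypothetical.

```python
BLOCK_TEXELS = 4 * 4        # block-compressed formats encode 4x4 texel blocks
BC6_BC7_BLOCK_BYTES = 16    # assumption: 128 bits per block for BC6 and BC7


def compression_ratio(source_bits_per_texel, block_bytes=BC6_BC7_BLOCK_BYTES):
    """Uncompressed bytes per 4x4 block divided by compressed bytes per block."""
    source_bytes = BLOCK_TEXELS * source_bits_per_texel / 8
    return source_bytes / block_bytes


def compressed_size_bytes(width, height, block_bytes=BC6_BC7_BLOCK_BYTES):
    """Size of one compressed mip level; partial blocks round up to whole ones."""
    blocks_wide = (width + 3) // 4
    blocks_high = (height + 3) // 4
    return blocks_wide * blocks_high * block_bytes


if __name__ == "__main__":
    print("HDR, 48 bits per texel:", compression_ratio(48))
    print("LDR, 24 bits per texel:", compression_ratio(24))
    print("1024x1024 level:", compressed_size_bytes(1024, 1024), "bytes")
```

Under these assumptions both formats cost one byte per texel, so the different ratios come entirely from the size of the uncompressed source data.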
[Last edited by tayuzheng on 2008-8-19 14:14]
Author: tayuzheng  Time: 2008-8-19 12:58
Shader Model 5.0
DirectX 10 brought you Shader Model 4.0, which included full support for integers and bitwise operators among other features. Direct3D 10.1 added Shader Model 4.1, with support for direct MSAA sample access. DirectX 11 brings Shader Model 5.0, which utilizes object-oriented concepts to help reduce the pain of shader development and brings optional support for double precision. This update to HLSL enables you to bring the full power of the HLSL compiler to bear on the problem of shader specialization using interfaces, objects, and polymorphism. With dynamic shader linkage, developers can more easily author larger, flexible shaders and permute out specialized, optimized versions for use at run time during specific rendering.
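As a loose analogy for writing one shader against interfaces and choosing the concrete implementations at run time, consider the Python sketch below. It is not HLSL and does not reproduce Shader Model 5.0 syntax: `Light`, `DiffuseLight`, `AmbientLight`, and `pixel_shader` are hypothetical names chosen for this illustration.

```python
from abc import ABC, abstractmethod


class Light(ABC):
    """Hypothetical shader interface: every light type must implement shade()."""

    @abstractmethod
    def shade(self, n_dot_l):
        ...


class DiffuseLight(Light):
    def shade(self, n_dot_l):
        # Simple Lambert term, clamped so back-facing surfaces get no light.
        return max(0.0, n_dot_l)


class AmbientLight(Light):
    def __init__(self, level):
        self.level = level

    def shade(self, n_dot_l):
        # Constant contribution, independent of surface orientation.
        return self.level


def pixel_shader(lights, n_dot_l):
    """One shader body written against the interface, not concrete classes."""
    return sum(light.shade(n_dot_l) for light in lights)
```

The shading function is authored once, and a specialized combination is selected simply by passing a different list of objects, for example `pixel_shader([AmbientLight(0.25), DiffuseLight()], 0.5)`, instead of maintaining a separate hand-written shader for every combination of light types.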
Compute Shader
Anyone already familiar with general purpose use of GPUs will be excited to hear about the new compute shader, which brings cross-hardware vendor support for programming the GPU in general purpose ways (GPGPU). There have already been many advances made in applying the huge amount of numerical crunch power GPUs have to large scale computing problems in previously niche markets. With the addition of the compute shader in DirectX 11, Microsoft makes these algorithms possible on the client across a broad range of hardware. Look for exciting new ways that games and other application developers can take advantage of GPUs for tasks other than just rendering.
Key features include communication of data between threads, and a rich set of primitives for random access and streaming I/O operations. These features enable faster and simpler implementations of techniques already in use, such as imaging and post-processing effects, and also open up new techniques that become feasible on Direct3D 11-class hardware.
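To show why communication between threads matters, here is a sequential Python simulation of a classic pattern: a group of threads cooperating through shared memory to sum their inputs in a logarithmic number of steps. It is only a sketch under stated assumptions; `group_sum` and `dispatch_sum` are hypothetical names, and a real compute shader would run the inner loop as parallel threads separated by barriers.

```python
def group_sum(values):
    """Simulate one thread group summing its inputs via shared memory.

    Each pass of the outer loop models the work between two barriers: every
    active 'thread' t adds the element `stride` slots away into its own slot.
    """
    shared = list(values)                    # stands in for group-shared memory
    size = 1
    while size < len(shared):
        size *= 2
    shared += [0] * (size - len(shared))     # pad to a power of two

    stride = size // 2
    while stride > 0:
        for t in range(stride):              # these iterations are independent
            shared[t] += shared[t + stride]
        # a real kernel would place a group barrier here
        stride //= 2
    return shared[0]


def dispatch_sum(data, group_size=4):
    """Sum a large array: one partial sum per group, then combine the partials."""
    partials = [group_sum(data[i:i + group_size])
                for i in range(0, len(data), group_size)]
    return group_sum(partials)
```

Without a way for threads to exchange intermediate results, each partial sum would have to be written out and re-read in a separate rendering pass, which is the kind of workaround that thread communication makes unnecessary.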
Additional Features
Even more exciting features are in store for DirectX 11 than can be covered in this basic introduction, but here are two last-minute things we simply couldn't finish this article without mentioning.
Conservative oDepth
Traditionally, IHVs have had to disable Z acceleration structures and algorithms when shaders write to the depth buffer via the oDepth register. The conservative oDepth feature in DirectX 11 enables shaders to write to the depth buffer under a guarantee that the written value stays within a specified region. This enables the hardware to avoid the full loss in performance by enabling acceleration outside of the guaranteed region.
16K Texture Limits and Texture Clamps
DirectX 11 raises the maximum texture size from 4K to 16K and also provides MIP-LOD control clamps to limit the number of mipmap levels loaded to the GPU.
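A quick calculation shows why a MIP-LOD clamp is a useful companion to 16K textures. The Python sketch below assumes an uncompressed format with 4 bytes per texel and models the clamp simply as "levels more detailed than `min_lod` are not loaded"; both the helper names and that simplified clamp model are assumptions made for this illustration.

```python
def mip_level_count(width, height):
    """Number of levels in a full mip chain down to 1x1 (width, height >= 1)."""
    return max(width, height).bit_length()


def resident_bytes(width, height, bytes_per_texel, min_lod=0):
    """Bytes for all mip levels from `min_lod` (0 = most detailed) downward."""
    total = 0
    for level in range(min_lod, mip_level_count(width, height)):
        w = max(1, width >> level)
        h = max(1, height >> level)
        total += w * h * bytes_per_texel
    return total


if __name__ == "__main__":
    full = resident_bytes(16384, 16384, 4)
    clamped = resident_bytes(16384, 16384, 4, min_lod=2)
    print(mip_level_count(16384, 16384), "levels")
    print(full // 2**20, "MiB with the full chain")
    print(clamped // 2**20, "MiB with the two most detailed levels clamped away")
```

Since each level holds a quarter of the texels of the one above it, almost all of the memory sits in the top one or two levels, so clamping them away for distant or low-priority textures frees most of the cost.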
Summary
We're excited to bring you this newest release of the DirectX API set. This version runs on Windows Vista as well as future versions of Windows, and it will work on your Direct3D 10-class and 10.1-class hardware, while exposing the new features of DirectX 11-class hardware. Many of the features are intended to make developers' lives easier while enabling opportunities for new functionality and performance gains. Look forward to a community tech preview in the November 2008 release of the DirectX SDK and start working with this next step in the evolution of graphics technology.
References
For more information about sub-division surface approximations, see the Sub-Division Surface sample in the DirectX SDK. Also look for the forthcoming Gamefest 2008 talks "Multithreaded Rendering for Games" and "DirectX 11 Tessellation," coming soon at http://msdn.microsoft.com/directx/presentations. Finally, also see Graphics APIs in Windows Vista on MSDN.