|
[In this sponsored feature, part of Gamasutra's XNA microsite, Microsoft's Kevin Gee explains in depth the new features of DirectX 11, from improved multi-threading to Shader Model 5.0 and beyond.]
Recently, at its annual Gamefest conference, Microsoft announced the forthcoming DirectX 11 API set. This technology, whose key features and benefits are discussed in this article, enables developers to take advantage of the latest hardware developments across both CPUs and GPUs, all while easing development pain. Let's take a look at the rich set of DirectX 11 features.
Feature Highlights
- Down-level hardware and operating system support
- Improved multithreaded device
- New hardware stages for tessellation
- Improved texture compression
- Shader Model 5.0
- Compute shader
- Additional features
Down-Level Hardware and Operating System Support
Windows Vista and DirectX 10 were engineered to improve the underlying Windows Display Driver Model (WDDM) and create significant opportunities for driver performance improvement. In addition, the DirectX 10 API was designed to be cleaner and simpler, with the near-full removal of capability bits, thereby making client code easier to write and removing development pain. DirectX 11 brings enough new features to be a full version update; however, since it builds upon and extends DirectX 10, anyone familiar with DirectX 10 and 10.1 will feel immediately at home with DirectX 11. With DirectX 11, it is possible for developers to target hardware feature levels 10, 10.1, and 11 by using a single set of functions.
The timing for the final release of DirectX 11 aligns with the next version of Windows, but the API will also be made available on Windows Vista. Thus, with DirectX 10-class and 10.1-class hardware already in consumers' machines, there will be a lot of hardware to target right from launch.
Improved Multithreaded Device
Earlier releases of Direct3D focused primarily on single-CPU configurations and as such had limited threading support. With DirectX 11, the API has been updated to enable developers to better drive the GPU from a multi-core CPU. DirectX 11 improves scaling on CPUs via changes to both the API model and the driver model. Asynchronous device access becomes possible through two key features of the Direct3D 11 device object.
- First, improvements in synchronization between the Direct3D device object and the driver enable asynchronous API calls, including resource allocations. Direct3D 11 allows developers more freedom when expressing parallelism by allowing such calls to occur across multiple threads.
- Second, the Direct3D device interface now supports multiple rendering contexts: 1) a primary immediate context, which dictates the timeline for work submission to the GPU, and 2) optional deferred contexts, created by the application developer as needed. Work associated with each deferred context can occur on a separate thread/core. This enables GPU commands to be accumulated in parallel to the main rendering work, and then sent to the GPU later when the main context is ready to submit a new task to the GPU.
The following figure shows rendering tasks being queued in parallel to the main immediate context, and being submitted as they become complete.
This feature of DirectX 11 supports Direct3D 10-class and 10.1-class hardware, too, so changes made in the way applications render will benefit existing hardware.
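The record-in-parallel, submit-in-order pattern described above can be sketched in plain C++. This is only an analogy, not Direct3D code: `DeferredContext`, `ImmediateContext`, `CommandList`, and `RenderFrame` are hypothetical stand-ins invented for illustration, and the "commands" are just strings. In the Direct3D 11 API as it later shipped, the corresponding calls are `ID3D11Device::CreateDeferredContext`, `ID3D11DeviceContext::FinishCommandList`, and `ID3D11DeviceContext::ExecuteCommandList`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a recorded command list: a sequence of command
// names. Not a Direct3D type.
using CommandList = std::vector<std::string>;

// Hypothetical stand-in for a deferred context: it only records commands and
// never touches the submission timeline.
class DeferredContext {
public:
    void Draw(const std::string& what) { recorded_.push_back("draw " + what); }

    // Hands the recorded commands over and resets the recorder.
    CommandList Finish() {
        CommandList out;
        out.swap(recorded_);
        return out;
    }

private:
    CommandList recorded_;
};

// Hypothetical stand-in for the immediate context: the only object that
// defines the order in which work is submitted.
class ImmediateContext {
public:
    void Execute(const CommandList& list) {
        submitted_.insert(submitted_.end(), list.begin(), list.end());
    }
    const std::vector<std::string>& Submitted() const { return submitted_; }

private:
    std::vector<std::string> submitted_;
};

// Records one command list per scene part on its own thread, then submits
// the lists from the calling thread in a fixed order.
std::vector<std::string> RenderFrame(const std::vector<std::string>& parts) {
    std::vector<CommandList> lists(parts.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        workers.emplace_back([&lists, &parts, i] {
            DeferredContext ctx;  // one recorder per thread, nothing shared
            ctx.Draw(parts[i] + " (opaque)");
            ctx.Draw(parts[i] + " (transparent)");
            lists[i] = ctx.Finish();  // each thread writes only its own slot
        });
    }
    for (std::thread& t : workers) t.join();

    ImmediateContext immediate;
    for (const CommandList& list : lists) immediate.Execute(list);
    return immediate.Submitted();
}
```

Calling `RenderFrame({"terrain", "characters"})` records both parts concurrently, yet the submitted order is always terrain first, then characters, because only the calling thread decides when each list is executed. On some toolchains, building code that uses `std::thread` requires a threading flag such as `-pthread`.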
New DirectX 11 Hardware Features
Next, let's take a look at some of the hardware-specific features DirectX 11 brings.
New Hardware Stages for Tessellation
DirectX 11 brings three new stages (hull shader, tessellator, and domain shader) to the rendering pipeline. These stages enable flexible, programmable hardware support of tessellation. The hull and domain shaders are programmable parts; the tessellator is fixed function but supports a number of insertion settings providing control over the generated position data.
Hull Shader
This programmable unit runs at the frequency of the source control mesh, which allows transforms to be performed on the input data. When discussing applications of the pipeline, we often mention performing a basis change in this shader from one surface representation to another, for example, from a Catmull-Clark quad mesh to Bezier patch control points.
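To make "basis change" concrete, here is a small C++ sketch of the simplest related case: converting one segment of a uniform cubic B-spline curve into the Bezier control values that describe the same curve. It works on a single coordinate of a curve rather than a full patch, and it is not the Catmull-Clark conversion or any hull shader code; the names `Cubic`, `BSplineToBezier`, `EvalBSpline`, and `EvalBezier` are invented for illustration.

```cpp
#include <array>
#include <cassert>

// Four control values of one cubic segment (one coordinate of a curve).
using Cubic = std::array<double, 4>;

// Converts the control values of a uniform cubic B-spline segment into the
// Bezier control values that describe the same curve segment.
Cubic BSplineToBezier(const Cubic& p) {
    return {
        (p[0] + 4.0 * p[1] + p[2]) / 6.0,
        (4.0 * p[1] + 2.0 * p[2]) / 6.0,
        (2.0 * p[1] + 4.0 * p[2]) / 6.0,
        (p[1] + 4.0 * p[2] + p[3]) / 6.0,
    };
}

// Evaluates a uniform cubic B-spline segment at t in [0, 1].
double EvalBSpline(const Cubic& p, double t) {
    assert(t >= 0.0 && t <= 1.0);
    const double t2 = t * t;
    const double t3 = t2 * t;
    const double s = 1.0 - t;
    return (s * s * s * p[0] +
            (3.0 * t3 - 6.0 * t2 + 4.0) * p[1] +
            (-3.0 * t3 + 3.0 * t2 + 3.0 * t + 1.0) * p[2] +
            t3 * p[3]) / 6.0;
}

// Evaluates a cubic Bezier segment at t in [0, 1] (Bernstein form).
double EvalBezier(const Cubic& b, double t) {
    assert(t >= 0.0 && t <= 1.0);
    const double s = 1.0 - t;
    return s * s * s * b[0] + 3.0 * s * s * t * b[1] +
           3.0 * s * t * t * b[2] + t * t * t * b[3];
}
```

For any control values `p` and any `t` in [0, 1], `EvalBezier(BSplineToBezier(p), t)` matches `EvalBSpline(p, t)` up to rounding: the conversion runs once per segment, and the cheaper Bezier form can then be evaluated as often as needed.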
Tessellator
This fixed-function unit can be simply thought of as a data expander and as a place where the IHVs can safely parallelize with the user-provided algorithms. It takes tessellation factors as input and inserts vertices in surface U,V space according to the chosen partitioning scheme.
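As a rough mental model of the "data expander" role, the C++ sketch below generates U,V points for a quad domain from a single integer factor. This is a deliberate simplification and an assumption of this sketch: it uses one uniform factor for the whole patch and emits points only, with no connectivity. The `UV` struct and `TessellateQuadDomain` function are invented names, not part of any API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A point in the patch's parametric domain, each coordinate in [0, 1].
struct UV {
    double u;
    double v;
};

// Generates the (factor + 1) x (factor + 1) grid of domain points for a quad
// patch split into `factor` equal segments per side, in row-major order.
// Using one uniform integer factor is a simplification made for this sketch.
std::vector<UV> TessellateQuadDomain(int factor) {
    assert(factor >= 1);
    const std::size_t side = static_cast<std::size_t>(factor) + 1;
    std::vector<UV> points;
    points.reserve(side * side);
    for (int row = 0; row <= factor; ++row) {
        for (int col = 0; col <= factor; ++col) {
            points.push_back({static_cast<double>(col) / factor,
                              static_cast<double>(row) / factor});
        }
    }
    return points;
}
```

With a factor of 4, a single patch expands into 25 domain points, which shows how a small amount of input data becomes a much larger stream of vertices for the next stage.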
Domain Shader
This unit executes once for every generated vertex, and as such is the place where surface formulations are evaluated. The inputs to this stage are provided in the surface U,V domain, ready for parametric surface evaluation.
The pipeline supports several input types (quad patch, triangle patch, or even poly-line), which allows developers to target almost any surface formulation. One usage scenario that has been strongly requested is support of sub-division surfaces for rendering characters.
Sub-Division Surface Approximation Schemes
Charles Loop and Scott Schaefer from Microsoft Research worked on a number of approaches for approximating sub-division surfaces that can be applied to the DirectX 11 pipeline. One of the approaches, provided as a DirectX 10 sample in the DirectX SDK, changes a quad patch basis mesh into Bezier surfaces of fixed tessellation. When applied to the DirectX 11 pipeline, this and other schemes can be used to deliver real-time rendering of sub-division surface meshes.
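The kind of per-vertex work a domain shader performs can be illustrated with a plain C++ sketch that evaluates a bicubic Bezier quad patch at a given U,V location. It assumes a 4x4 grid of control points and computes position only; a real domain shader would be written in HLSL and would typically also produce normals and other attributes. `Vec3`, `Patch`, `Bernstein`, `EvaluatePatch`, and the `MakeFlatPatch` helper are names invented for this example.

```cpp
#include <array>
#include <cassert>

struct Vec3 {
    double x;
    double y;
    double z;
};

// 4x4 control points of a bicubic Bezier patch, indexed [row][column].
using Patch = std::array<std::array<Vec3, 4>, 4>;

// Cubic Bernstein basis weights at parameter t.
std::array<double, 4> Bernstein(double t) {
    const double s = 1.0 - t;
    return {s * s * s, 3.0 * s * s * t, 3.0 * s * t * t, t * t * t};
}

// Evaluates the patch position at (u, v), both in [0, 1]. Columns follow u
// and rows follow v.
Vec3 EvaluatePatch(const Patch& patch, double u, double v) {
    assert(u >= 0.0 && u <= 1.0 && v >= 0.0 && v <= 1.0);
    const std::array<double, 4> bu = Bernstein(u);
    const std::array<double, 4> bv = Bernstein(v);
    Vec3 p{0.0, 0.0, 0.0};
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            const double w = bv[row] * bu[col];
            p.x += w * patch[row][col].x;
            p.y += w * patch[row][col].y;
            p.z += w * patch[row][col].z;
        }
    }
    return p;
}

// Hypothetical test data: a flat unit-square patch whose control points sit
// on a regular grid, so the surface point at (u, v) is simply (u, v, 0).
Patch MakeFlatPatch() {
    Patch patch{};
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            patch[row][col] = Vec3{col / 3.0, row / 3.0, 0.0};
        }
    }
    return patch;
}
```

Because the function depends only on the patch and the U,V pair, every generated vertex can be evaluated independently of the others, which is what makes this stage a good fit for wide parallel hardware.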
Improved Texture Compression
Textures in games are often the largest area of memory utilization, so it should be no surprise that further improvements to texture compression are needed to keep working set size and memory bandwidth consumption within the rates required for real-time rendering. DirectX 11 arms developers with new compression formats (BC6 and BC7) to help target high-quality rendering without sacrificing performance. Here we will focus on two specific examples of how DirectX 11 raises the bar for rendering quality. Some of you may be more familiar with the older DXT-style naming convention, which was changed to the block compressed (BC) naming convention for DirectX 10. The newer naming convention is used here.
Compression of High Dynamic Range (HDR) Image Sources
High dynamic range image sources are very common in games these days. When combined with intelligent tone map operators, HDR is often required to make titles look photorealistic. The new block compression (BC) scheme, BC6, has been designed to provide high-quality 6:1 compression of HDR image data with hardware support for decompression.
Here we can see a comparison image for the HDR format. On the left is the HDR original image tone-mapped to a given exposure, and on the right is the equivalent BC6 image. The absolute error image is in the center. Notice how the Abs image contains no obvious blocking errors; the errors we are seeing are generally diffused noise errors. These are visually much less noticeable to the human eye than edges introduced by blocking.
Low Dynamic Range (LDR) / Normal Map Compression
The new BC7 scheme provides support for 8-bit/low dynamic range (LDR) data at 3:1 ratios. Here we compare the results of the new format with the existing block compressed approach, BC3.
You can clearly see the blocking artifacts in the BC3 image, which are drastically reduced in the BC7 image. With this feature, developers and artists can expect more from their linear texture content and normal maps for the same or lower cost in memory size.
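The 6:1 and 3:1 ratios quoted for BC6 and BC7 can be checked with a little arithmetic, shown here as a C++ sketch. The article does not state the block layout or the uncompressed source formats, so the following are assumptions of this sketch: both formats store each 4x4 texel block in 16 bytes, the HDR source has three 16-bit channels per texel, and the LDR source has three 8-bit channels per texel. The function and constant names are invented for illustration.

```cpp
#include <cassert>

// Assumed block layout for BC6 and BC7: 4x4 texels stored in 16 bytes.
constexpr int kBlockTexels = 4 * 4;
constexpr int kBlockBytes = 16;

// Ratio of uncompressed block size to compressed block size for a source
// with the given number of bits per texel.
constexpr int CompressionRatio(int uncompressedBitsPerTexel) {
    return (kBlockTexels * uncompressedBitsPerTexel / 8) / kBlockBytes;
}

// Compressed size in bytes of one mip level. Width and height are assumed to
// be multiples of 4 to keep the sketch short.
constexpr long long CompressedBytes(int width, int height) {
    return static_cast<long long>(width / 4) * (height / 4) * kBlockBytes;
}

// HDR source assumed to be three 16-bit channels: 96 bytes per block.
static_assert(CompressionRatio(48) == 6, "BC6 ratio under the assumptions");
// LDR source assumed to be three 8-bit channels: 48 bytes per block.
static_assert(CompressionRatio(24) == 3, "BC7 ratio under the assumptions");
```

Under these assumptions, a 1024x1024 texture occupies 1,048,576 bytes in either format, one byte per texel, which is the same footprint as a BC3 texture of that size and is consistent with the "same or lower cost in memory size" claim above.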
|
|