
The People Behind DirectX 10: Part 3--Nvidia's Tony Tamasi

Posted on 2006-7-12 03:52
http://www.extremetech.com/article2/0,1697,1986937,00.asp

The People Behind DirectX 10: Part 3--Nvidia's Tony Tamasi
July 11, 2006

By Jason Cross
In the first part of our feature series, The People Behind DirectX 10, we spoke with Microsoft's David Blythe and Chris Donahue about the new graphics API. We found out what new capabilities it offers, how we can expect it to change PC games, why it will only be available for Vista, and more.

In Part 2, we turned our attention to the GPU manufacturers, interviewing ATI's CTO of the PC Business Group, Bob Drebin. Of course, asking one GPU manufacturer about its DX10 ideas only gets you half the story, so today we present an interview with Nvidia's Vice President of Technical Marketing, Tony Tamasi. You'll notice these are exactly the same questions we posed to ATI, and that's intentional: giving both companies identical questions makes it easier for you to compare and contrast the answers.

ExtremeTech: DirectX 10 adds several significant advances to PC graphics—geometry shaders, unified shader interface, better ability to stream out data from parts of the pipeline and read them in again, an integer instruction set, and more uniformity among features (no more messy cap bits)—to name a few. What do you think are the most significant additions, and why?

Tony Tamasi: DX10 fundamentally removes a number of barriers from developers' creativity, which, in the end, is the most exciting thing about the new API. The API adds lots of great new features, which in general should result in content that is more "alive" and richer in a number of ways. From a developer perspective, geometry shaders, stream out, better instancing, and a uniform feature set are all great things. Being able to more easily develop a "single" DX10 code path against a uniform feature set means developers can invest more in their game and less in implementing special-case behavior for a particular IHV's [independent hardware vendor] implementation, such as a part that chooses not to include vertex texturing, for example.
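
To make that "special-case behavior" concrete, here is a minimal sketch of the kind of Direct3D 9 capability probe Tamasi is alluding to; the function name and format choices are illustrative assumptions, not something from the interview. A DX9 engine has to branch on the result (and ship a fallback path), whereas DX10's uniform feature set guarantees vertex texture fetch, so both the check and the fallback disappear.

```cpp
// Sketch of a D3D9-era capability probe for vertex texture fetch (hypothetical usage).
// Under DX10 this support is mandatory, so the branch is no longer needed.
#include <d3d9.h>

bool SupportsVertexTextureFetch(IDirect3D9* d3d, UINT adapter)
{
    // Ask whether an R32F texture can be sampled from the vertex shader stage
    // on this adapter, assuming an X8R8G8B8 display mode.
    HRESULT hr = d3d->CheckDeviceFormat(adapter,
                                        D3DDEVTYPE_HAL,
                                        D3DFMT_X8R8G8B8,
                                        D3DUSAGE_QUERY_VERTEXTEXTURE,
                                        D3DRTYPE_TEXTURE,
                                        D3DFMT_R32F);
    return SUCCEEDED(hr);
}
```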

The improvements in instancing in DX10 will allow developers to populate their worlds full of "stuff" (and yes, that's the technical term). Where before it was impractical to draw blades of grass, or individual leaves, now that becomes practical. Armies of characters, flocking behavior, and better interaction between characters and their environment all become much more tractable with DX10.
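
For readers curious what "better instancing" looks like at the API level, below is a minimal Direct3D 10 sketch; the buffer names, vertex layouts, and the grass scenario are assumptions, and buffer and input-layout creation are omitted. The point is that one DrawInstanced call submits every blade, with per-blade data fed from a second vertex stream instead of thousands of individual draw calls.

```cpp
// Minimal Direct3D 10 instancing sketch (resource creation omitted; names are placeholders).
// A single draw call renders every blade of grass; perInstanceVB holds one offset per blade.
#include <d3d10.h>

void DrawGrassField(ID3D10Device* device,
                    ID3D10Buffer* bladeVB,        // geometry of a single blade
                    ID3D10Buffer* perInstanceVB,  // one world-space offset per blade
                    UINT vertsPerBlade,
                    UINT bladeCount)
{
    ID3D10Buffer* buffers[2] = { bladeVB, perInstanceVB };
    UINT strides[2] = { sizeof(float) * 3, sizeof(float) * 3 };  // assumed float3 layouts
    UINT offsets[2] = { 0, 0 };

    device->IASetVertexBuffers(0, 2, buffers, strides, offsets);
    device->IASetPrimitiveTopology(D3D10_PRIMITIVE_TOPOLOGY_TRIANGLELIST);

    // Slot 1 is declared D3D10_INPUT_PER_INSTANCE_DATA in the input layout (not shown),
    // so the vertex shader sees a different offset for each instance.
    device->DrawInstanced(vertsPerBlade, bladeCount, 0, 0);
}
```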

From an "alive" perspective, I think you'll see much more use of physics in the future. DX10 offers some nice additional functionality, such as stream out for a more computation-like data model, which will allow GPU physics to go that much further. GPU Physics is another tool in the developer's toolbox, allowing them to create dynamic simulations that are orders of magnitude more compelling than before. Thousands of rigid bodies, volumetric and dynamic smoke or fog, millions of colliding particles, flowing water, shattering glass—all will become tools for the developers, allowing them to add another layer of life to their worlds.
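
As a rough sketch of that "computation-like data model," here is how stream out can keep a particle simulation entirely on the GPU, under assumed buffer and shader setup (every name below is hypothetical): a geometry shader advances the particles and streams the results into a second buffer, which becomes the input for the next frame, with no CPU round trip.

```cpp
// Stream-output "ping-pong" sketch for a GPU particle simulation.
// Buffers and the geometry shader (created with a stream-output declaration)
// are assumed to exist elsewhere; names and layouts are placeholders.
#include <d3d10.h>
#include <utility>

void StepParticles(ID3D10Device* device,
                   ID3D10Buffer*& currentState,   // particles read this frame
                   ID3D10Buffer*& nextState,      // particles written this frame
                   ID3D10GeometryShader* simGS,
                   UINT particleCount)
{
    UINT stride = sizeof(float) * 6;  // assumed position + velocity per particle
    UINT offset = 0;

    device->IASetVertexBuffers(0, 1, &currentState, &stride, &offset);
    device->IASetPrimitiveTopology(D3D10_PRIMITIVE_TOPOLOGY_POINTLIST);
    device->GSSetShader(simGS);

    // Route the geometry shader's output into the second buffer instead of the rasterizer.
    device->SOSetTargets(1, &nextState, &offset);
    device->Draw(particleCount, 0);

    // Unbind the stream-output target and swap buffers for the next iteration.
    ID3D10Buffer* nullTarget = NULL;
    device->SOSetTargets(1, &nullTarget, &offset);
    std::swap(currentState, nextState);
}
```

On later iterations, ID3D10Device::DrawAuto can stand in for the explicit Draw call, since the GPU already knows how many vertices the previous pass streamed out.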

Of course, things like geometry shaders will allow a new level of realism in character animation, more pervasive use of shadows, more detailed and dynamic terrain, etc. Just as the visual bar was raised from DX8 to DX9, so will the bar be raised yet again from DX9 to DX10, not only visually, but also from an immersive perspective.

ExtremeTech: We know that DX10 has unified pixel, vertex, and geometry shaders on the API level, but it doesn't dictate the hardware implementation: It's possible to have DX10-compliant graphics cards with discrete pixel and vertex/geometry shaders. Do you think the time is right to bring them together, or is there still some benefit from having separate units at the hardware level?

Tamasi: There are benefits and disadvantages to both approaches. Unified certainly is interesting in that it appears architecturally "new" and different, and has some interesting promise in terms of load balancing and extensibility.

Frankly speaking, however, the graphics industry has gotten very good at extracting a lot of performance from the current vertex/pixel shader architectures, so the competition for anything new architecturally is a highly evolved and efficient architecture. The first rule of any new GPU is to be better at the previous API. Any trade-off which might move you away from that goal has to be evaluated carefully. If you look at some of the existing GPU architectures you can see some pretty big differences in terms of architectural efficiencies, and that is with architectures that, at least at the highest level, would present themselves as "non-unified" architectures. Even within that environment, you can see huge (almost 2x) differences in terms of performance delivered per unit of area or power.

If you assume that the competitors will be bound by the same laws of physics and economics, that alone would put one competitor in a dramatically better position. Would consumers be willing, for example, to pay twice as much for a graphics card that delivered the same performance as another, just because a particular graphics card was unified, or to pay the same price, but run at ½ the performance just because it offered some new architectural block diagram? I doubt it.

There are plenty of examples, including some from recent memory, of architectures that looked new and sounded good on paper, but in implementation fell afoul of the first law of GPUs: every new GPU architecture must be better than the previous GPU architecture, as measured by the things which characterized that previous architecture. Unified will come eventually, but only when being unified delivers the best performance, architectural efficiency, and power efficiency.

ExtremeTech: Graphics cards are huge. We're already in the 300-400 million transistor range on high-end parts. DX10 requires some features that are not inexpensive to implement, in terms of transistor budgets. Are we going to go over half a billion transistors on high-end DX10 cards? Do the DX10 requirements make it hard to economically produce low-cost graphics cards, or hard to produce energy-efficient yet high-performance mobile parts?

Tamasi: DX10 certainly isn't free, but there are innovations going on in many places of GPU architecture beyond strict 3D features. Nvidia has been working very hard on architectural efficiency from both a cost and power perspective, such that we can always meet the needs of all of our customers, including mobile customers. If you look at the current generation of GPUs and compare Nvidia GPUs to those of our competitors, we're able to deliver superior performance with significantly more efficient designs. That philosophy will serve us well as we move into the DX10 generation.

ExtremeTech: With DX10 graphics cards being bigger and more complex, how much harder was it to design than previous generation GPUs? Is your first DX10 card taking more time, more money, more manpower to develop? Can you quantify this?

Tamasi: While some of the complexity can be attributed to DX10, which is certainly an aggressive architectural revolution, building a large GPU that is architecturally efficient, power-efficient, and higher-performance than what has come before is simply harder, and that translates into significantly more time, money, and manpower. Building a modern GPU architecture can now be measured in hundreds of millions of dollars of investment, with teams measured in the many hundreds, if not thousands.

ExtremeTech: When we finally see the first games to support DX10 (such as Crysis or Flight Simulator X), what do you think the biggest differences will be over the same games running in DX9 mode? In other words, what will DX10 bring to the table in the very near term?

Tamasi: We basically addressed this in the response to the first question, but perhaps one other aspect worth discussing is the relationship between speed and image quality. Today, there are API limitations with DX9 that make many techniques impractical from a performance or efficiency perspective. For example, if it is inefficient to draw lots of individual blades of grass because of API or hardware behavior, the developer has to make trade-offs, either by reducing the quality of the grass or by reducing the quality of something else, because drawing that grass would consume more GPU (or CPU) resources than is practical. So, with a new API and architecture, things which before were impractical for speed reasons now become practical, allowing higher levels of image quality and more realistic scenes.
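
As a purely illustrative contrast with the instanced sketch shown earlier, this is the per-object submission pattern that makes thousands of grass blades impractical in the situation Tamasi describes: each blade costs a constant-buffer update plus a draw call, so CPU and API overhead grows linearly with the object count. The structure and names are hypothetical.

```cpp
// Per-object submission sketch: API/CPU cost scales with the number of blades.
// (Names and layouts are placeholders; resource creation is omitted.)
#include <d3d10.h>
#include <vector>

struct BladeConstants { float world[16]; };  // assumed per-blade transform matrix

void DrawGrassOneByOne(ID3D10Device* device,
                       ID3D10Buffer* bladeConstantBuffer,
                       const std::vector<BladeConstants>& blades,
                       UINT vertsPerBlade)
{
    for (const BladeConstants& blade : blades)
    {
        // The CPU touches the API once per blade: update constants, then draw.
        device->UpdateSubresource(bladeConstantBuffer, 0, NULL, &blade, 0, 0);
        device->Draw(vertsPerBlade, 0);
    }
    // Contrast with a single DrawInstanced call, where the per-blade data lives
    // in a vertex buffer and this loop disappears.
}
```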

Speed-wise, there are lots of great features in DX10 that will make things more efficient. Pervasive instancing and things like geometry shaders allow refactoring of the graphics algorithms to move the graphics workload entirely to the GPU, or using new functions of the API to do things on the GPU that simply weren't possible on GPUs before. Those "speed" things can all result in improved image quality, and I expect you'll see developers be able to take advantage of some of those benefits early, the result being richer, more-detailed and more alive worlds. Of course, DX10 has some great features for image quality, both in terms of API-visible functionality like geometry shaders, as well as more consistent and specified behavior for things like texture filtering, antialiasing, and transparency that should also benefit first-generation DX10 games.

ExtremeTech: Looking further out, to when game developers can spend more time with DX10 hardware and really make good use of it, how will games look and behave differently?

Tamasi: Using the past as a predictor of the future: compare a first-generation DX9 title like Half-Life 2, which looked quite good for its time, with a second-generation DX9 engine like Unreal Engine 3, and the difference is quite dramatic. Developers have gone from shaders measuring tens of instructions to shaders measuring in the hundreds of instructions. An order-of-magnitude increase in shader complexity has resulted in some truly stunning improvements. I would expect to see at least that level of difference with DX10, particularly in the domain of geometry shaders and world richness.

As engines begin to be developed from the start as core DX10 engines, I would expect a pretty radical increase in the complexity of what can be presented to the player, especially in terms of "stuff in the world," character detail, and a world that is "dynamic": worlds full of physical simulations; water that flows and interacts with characters as they move through it; a bullet passing through volumetric fog and leaving a vapor trail; snow that accumulates, drifts, and blows in the wind.

ExtremeTech: Are there any features you wanted in DX10 that didn't "make the cut," as it were? Something you're pushing hard for in the next DirectX revision?

Tamasi: There's always a lot more to do feature-wise, which is a great characteristic of graphics in general. We're still far from "done," either in feature set or in capability. Some features that developers have continued to express strong interest in include generalized tessellation and more sophisticated methods for handling transparency, among others.

ExtremeTech: General-purpose GPU computing, including (but not limited to) physics and video processing, is getting to be a bigger deal. What will DX10 cards, and the DX10 API, do to enable easier, faster, and more-robust GP-GPU applications?

Tamasi: Stream out is a big boon to the general data flow of GPU programming, as are integer operations. Geometry shaders, interestingly enough, allow migration of a number of functions that used to be CPU-centric to the GPU, both for what you would think of as graphics algorithms and for many non-graphics algorithms. Of course, the dramatically increasing floating-point processing power of every new GPU generation is a huge win for GP-GPU as well.

ExtremeTech: Isn't the best way to tackle GP-GPU stuff to get away from a graphics API and allow developers to program the GPU in a more direct fashion?

Tamasi: Yes and no. Graphics APIs were built to let developers extract the best performance from graphics processors, so by their nature, graphics APIs are "fast." However, graphics APIs are awkward for expressing non-graphics problems, so you face the constant struggle of generality vs. performance. To some extent, the challenge of GP-GPU programming is not necessarily the API (although that is part of it), but more the challenge of expressing algorithms in a data-parallel manner. Programs traditionally written for serial processing on modern CPUs have been optimized for that style of execution, and the programmers themselves haven't had the experience of developing and optimizing for a highly parallel architecture. There has been some fantastic work on higher-level programming languages for developing non-graphics applications on GPUs, in an effort to make GPU programming more approachable, such as Stanford's Brook. While they simplified GP-GPU programming to a good degree, some of that implementation and abstraction left a lot of performance on the table.
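
As a tiny, purely illustrative example of what "expressing an algorithm in a data-parallel manner" means (plain C++, not Brook or GPU code): the serial version below carries a running value from one iteration to the next, while the reformulated version isolates a pure per-element kernel whose invocations are independent, which is the shape that maps onto GPU threads or stream-language kernels, followed by a separate reduction step.

```cpp
// Serial vs. data-parallel formulation of the same computation (conceptual sketch).
#include <cstddef>
#include <vector>

// Serial habit: a single accumulator, each step depends on the previous one.
float SumOfSquaresSerial(const std::vector<float>& x)
{
    float acc = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i)
        acc += x[i] * x[i];
    return acc;
}

// Data-parallel reformulation: a pure per-element "kernel" plus a separate reduction.
// Every iteration of the map stage is independent, so each could run on its own GPU thread.
float Square(float v) { return v * v; }

std::vector<float> MapSquares(const std::vector<float>& x)
{
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        out[i] = Square(x[i]);
    return out;  // a log-depth parallel reduction would then sum these values
}
```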

So a good GP-GPU interface may look different from a graphics API and may hide some of the "GPU-ness" from the non-graphics programmer. But in the end it needs to balance the generality that allows a wide range of applications to be easily programmed on a GPU against extracting the most performance from the floating-point horsepower of the GPU. Exactly what this interface should look like is an active area of research at a number of places, including the University of North Carolina, Stanford, Microsoft Research, the University of Waterloo, and others.

ExtremeTech: It's disappointing that Vista won't launch for consumers until next year, and by the transitive property of software delays, neither will DX10. This shouldn't hold back the launch of DX10-compliant graphics cards, though, should it? The first DX10 cards are going to have to be the fastest DX9 cards on the market as well, right?

Tamasi: Yes. Every new-generation graphics processor must be good at everything that has come before, as well as delivering on the new features. History is littered with graphics architectures and companies that built graphics hardware that could do some new things fairly well, but couldn't run the existing applications or APIs well enough.

Read what Microsoft techies have to say about DirectX 10.

ExtremeTech: Is there anything you'd like to add? A personal message about DX10 (or your company's DX10 products) you'd like to send to our readers?

Tamasi: We're incredibly excited about the capabilities that DX10 offers, and the amazing content that developers will produce with it. Nvidia has been working on DX10 for many years, and, as we were with Shader Model 3, we expect to be the leader with DX10.