There are a lot of interesting things going on in this screenshot.
The main one is the particle explosion - this has now fully moved over to the GPU so it's not even necessary to send vertexes any more - just bind the buffers, set the shaders, update some constants and make a draw call. Everything else is animated and positioned on the GPU.
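To make the idea concrete, here is a minimal sketch of the kind of closed-form evaluation a vertex shader could do per particle, written as plain C++ so it can be run on the CPU. It is not the engine's actual shader: the struct layouts, the names (`ParticleSeed`, `FrameConstants`, `particlePosition`) and the simple gravity-along-z motion model are all assumptions made for illustration. The point is that the per-particle data is written once, and only a small set of constants changes per frame.

```cpp
#include <cassert>

// Hypothetical per-particle data, written once into a static vertex buffer.
struct ParticleSeed {
    float origin[3];    // spawn position
    float velocity[3];  // initial velocity, units per second
    float spawnTime;    // time the emitter fired, in seconds
};

// Hypothetical per-draw constants: the only data updated each frame.
struct FrameConstants {
    float time;     // current time, in seconds
    float gravity;  // downward acceleration along z, units per second squared
};

struct Vec3 { float x, y, z; };

// What the vertex shader would evaluate for every particle (assumed model):
// p(age) = origin + velocity * age - (0, 0, 0.5 * gravity * age^2)
inline Vec3 particlePosition(const ParticleSeed& s, const FrameConstants& c) {
    const float age = c.time - s.spawnTime;
    assert(age >= 0.0f);  // a particle is never drawn before it spawns
    return Vec3{
        s.origin[0] + s.velocity[0] * age,
        s.origin[1] + s.velocity[1] * age,
        s.origin[2] + s.velocity[2] * age - 0.5f * c.gravity * age * age,
    };
}
```

Because the position depends only on the seed data and the current time, nothing per-particle has to be re-sent from the CPU between frames.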
This is a bit of a juggling act. On the one hand you gain a little by not having to send vertexes, on the other hand you lose a little in state changes and extra draw calls (this scene involves 2 extra calls over the older modes - one for the rocket trail in front, one for the lavaball trail behind - previously all 3 would have been done in a single call). The vertex shader is moderately more complex too.
For now it remains an optional mode and is only used for particle emitters that have high numbers of particles - explosions, teleport splashes, and the like. Simpler emitters with small numbers still go through the regular path.
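The selection between the two paths could be sketched roughly as below. The cutoff value and the names (`kGpuPathMinParticles`, `choosePath`, `ParticlePath`) are hypothetical; no actual threshold is given here, so treat the number as a placeholder to tune by benchmarking.

```cpp
#include <cstddef>

enum class ParticlePath { Regular, Gpu };

// Hypothetical cutoff: a placeholder value, not one taken from the engine.
constexpr std::size_t kGpuPathMinParticles = 256;

// Pick the GPU path only when the optional mode is enabled and the emitter
// is large enough for the extra draw call and state changes to pay off.
constexpr ParticlePath choosePath(std::size_t particleCount, bool gpuModeEnabled) {
    return (gpuModeEnabled && particleCount >= kGpuPathMinParticles)
               ? ParticlePath::Gpu
               : ParticlePath::Regular;
}
```

With these assumed values, a 1024-particle explosion would take the GPU path, while an 8-particle emitter would stay on the regular one.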
In some initial benchmarking it does pull ahead of the other modes in cases where there is a lot of explosive action going on, and it has been nice to port some of my previous CPU-side code to the GPU. There's probably a balancing act between the two where you get the best of both methods.
Another interesting thing is the framerate - it's 888fps, but that's actually slightly down from D3D9. This is something I've noticed fairly consistently on different machines and also with my older experimental D3D11 work - D3D11 is somewhat more fillrate-intensive than D3D9 was, so in scenes where fillrate demand is high it tends to drop off a little more.
The main criminals here would include screen tints and render-to-texture, but it also falls off a little more as resolution goes up. Overdraw (in cases where early-Z isn't working optimally) also hurts it a little more than I'd like. There are other areas where D3D11 seems a little "heavier" than 9 was - changing render targets, or clearing the depth buffer, for example (again, they all seem related). These are more than balanced by its more general performance improvements, but they do exist all the same.
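For a feel of why resolution and overdraw matter so much at this kind of framerate, here is a back-of-the-envelope estimate of pixels written per second. Only the 888fps figure comes from this post; the resolution and overdraw factor used in the example are hypothetical, and the function name `pixelsPerSecond` is just for illustration.

```cpp
// Rough fill estimate: pixels written per second.
// overdraw is the average number of times each screen pixel is written
// in one frame (1.0 means every pixel is touched exactly once).
constexpr double pixelsPerSecond(int width, int height, double overdraw, double fps) {
    return static_cast<double>(width) * static_cast<double>(height) * overdraw * fps;
}
```

At an assumed 1280x720 with an overdraw factor of 2, 888fps works out to about 1.64 billion pixel writes per second, and a full-screen tint adds another full screen of writes every frame on top of that.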
Overall though, at this kind of framerate we're purely in academic/theoretical land - it's already more than fast enough, and remains so even when it does get the drop off. But it is a puzzling and curious observation, all the same.
Wednesday, June 13, 2012
Posted by mhquake at 2:24 AM