This is a big one...
I've completely rewritten the alias model tri fan/strip generator to just use raw triangle lists. The code is now a whole load simpler, being just a single 12-line function (compare with the old gl_mesh.c - yes, I've replaced the entire thing with a 12-line function).
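To give a feel for how little code a raw triangle list needs, here is a minimal sketch of the idea, not the actual DirectQ function. The types `tri_t` and `vert_t` and the name `BuildTriList` are hypothetical, simplified stand-ins for the alias model data; the sketch assumes each triangle simply stores three indexes into a vertex array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the alias model structures. */
typedef struct { int vertindex[3]; } tri_t;          /* three indexes into the vertex array */
typedef struct { float xyz[3]; float st[2]; } vert_t; /* position and texture coordinates */

/* Expands indexed triangles into a raw triangle list: three output verts
   per triangle, in triangle order. 'out' must have room for numtris * 3
   verts. Returns the number of verts written. */
size_t BuildTriList(const tri_t *tris, size_t numtris,
                    const vert_t *verts, size_t numverts, vert_t *out)
{
    size_t n = 0;

    for (size_t i = 0; i < numtris; i++)
    {
        for (int j = 0; j < 3; j++)
        {
            int idx = tris[i].vertindex[j];

            assert(idx >= 0 && (size_t)idx < numverts);
            out[n++] = verts[idx];
        }
    }

    return n;
}
```

The output can then go to the API as one triangle list, with no fan or strip bookkeeping at all.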
The speedup is somewhere on the order of 20% to 30%. No joke, this one goes like a rocket. The big advantage obviously comes from being able to batch-submit an entire alias model in a single API call, and it has me thinking that I can get similar goodness from the world (it will be slightly more difficult because the texture changes from surface to surface).
An earlier version retained the tri fans and strips but submitted them as a single indexed tri list, which was slightly faster again (less data in the submission). I've decided to prefer the slightly slower (by 1% or 2%) version I currently have, as the code is simpler, cleaner, more maintainable, and faster to load. Indexed primitives also impose a hard limit on the number of verts in a model, as some 3D cards still only support 16-bit indexes (which can address at most 65,536 verts).
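For anyone curious what turning fans and strips into a single indexed tri list involves, the following is a rough sketch under my own assumptions rather than the code from that earlier version. The function names are hypothetical, and it assumes each fan or strip occupies a contiguous run of verts starting at `first`. The assert is where the 16-bit index limit bites.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes triangle-list indexes for a strip of 'count' verts starting at
   vertex 'first'. Every other triangle is flipped so the winding stays
   consistent. Returns the number of indexes written: (count - 2) * 3. */
size_t StripToIndexes(uint32_t first, size_t count, uint16_t *out)
{
    size_t n = 0;

    assert(first + count <= 65536); /* 16-bit indexes cannot reach further */

    for (size_t i = 2; i < count; i++)
    {
        if (i & 1)
        {
            out[n++] = (uint16_t)(first + i - 1);
            out[n++] = (uint16_t)(first + i - 2);
        }
        else
        {
            out[n++] = (uint16_t)(first + i - 2);
            out[n++] = (uint16_t)(first + i - 1);
        }

        out[n++] = (uint16_t)(first + i);
    }

    return n;
}

/* Same for a fan: every triangle shares the first vert. */
size_t FanToIndexes(uint32_t first, size_t count, uint16_t *out)
{
    size_t n = 0;

    assert(first + count <= 65536);

    for (size_t i = 2; i < count; i++)
    {
        out[n++] = (uint16_t)first;
        out[n++] = (uint16_t)(first + i - 1);
        out[n++] = (uint16_t)(first + i);
    }

    return n;
}
```

A strip or fan of n verts yields n - 2 triangles, so the index buffer needs (n - 2) * 3 entries per primitive.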
There is a slight memory overhead from not reusing some data, but it's relatively minor and anyway I have room to play with there as a result of keeping the vertex data indexed in memory (as opposed to expanding it out like GLQuake did).
This Christmas has been good to DirectQ. I've just switched over the main world texturechains to a similar triangle-list-based render, and gained some more speed. The complication is that there are two textures in each chain: the world texture and the lightmap. This makes it necessary to submit the current list and begin a new one each time either of them changes, so it can't batch as well as alias models do.
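The submit-on-change logic is easy to picture in code. This is only an illustrative sketch with made-up types (`surf_t`, `batch_t`) and function names, not the DirectQ renderer: it counts how many submissions a chain would need instead of actually drawing anything.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical surface record: texture and lightmap ids plus a triangle count. */
typedef struct { int texture; int lightmap; int numtris; } surf_t;

/* Current batch state. 'flushes' counts how many submissions were made. */
typedef struct { int texture; int lightmap; int numtris; int flushes; } batch_t;

static void FlushBatch(batch_t *b)
{
    if (b->numtris == 0) return; /* nothing accumulated yet */

    /* a real renderer would issue its single draw call here */
    b->flushes++;
    b->numtris = 0;
}

/* Walks a chain of surfaces, starting a new triangle list whenever the
   world texture or the lightmap changes. Returns the number of submissions. */
int DrawChain(const surf_t *surfs, size_t numsurfs)
{
    batch_t b = { -1, -1, 0, 0 };

    assert(surfs != NULL || numsurfs == 0);

    for (size_t i = 0; i < numsurfs; i++)
    {
        if (surfs[i].texture != b.texture || surfs[i].lightmap != b.lightmap)
        {
            FlushBatch(&b);
            b.texture = surfs[i].texture;
            b.lightmap = surfs[i].lightmap;
        }

        b.numtris += surfs[i].numtris;
    }

    FlushBatch(&b); /* submit whatever is left */
    return b.flushes;
}
```

In this sketch, the more surfaces in a row that share both a texture and a lightmap, the fewer submissions are needed, which is why an alias model with its single skin batches better than the world.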
For reference, the machine I'm working on now used to get 105 FPS on demo1. It now hits nearly 140. For kicks I did a benchmark with r_novis 1 and got 80, so this puts us in a position where the impact of Quake's polygon fragmentation is significantly reduced.