Today I got the basics of IQM loading and drawing up and running in RMQ. By basics I mean the bare minimum necessary to get an IQM in-game - the loader, some animation, and minimal drawing. It's untextured, I'm using a sample model, animation just cycles through the frames in the model, and a lot of other stuff is missing.
My general impression is that the loader is disgusting, animation is disgusting and drawing is clean and easy.
First up, the loader. I'm not overly familiar with model formats so I can't say what I would do differently, but I was rather shocked at the amount of load-time processing that's necessary. What seems even more bizarre is that this doesn't appear to have been done to reduce file size or anything like that, as it more or less does the processing in-place. I'm not entirely certain what the thinking behind that was, so I'll move on.
Animation. It's all CPU-side. Software T&L country, 12-year-old technology, steam-power man! DirectQ animates MDLs on the GPU, and that more than doubled framerates in many cases. The format should have been designed around GPU-side animation. Store your bone matrixes in a shader's constant registers, store a bone index with each vertex position, and do a lookup. Bam. Of course the number of constant registers limits the number of bones you can have, but for most practical purposes you're not going to be going overboard with this in a game engine. Any forward-thinking format should really be designed around doing as much as possible on the GPU. Quake is already CPU-bound, and this just makes it worse.
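As a rough illustration of the lookup described above, here is what the per-vertex work amounts to, written in plain C as a stand-in for the vertex shader. The type and function names (`bonematrix_t`, `skinvert_t`, `SkinVertex`) are made up for this sketch and come from neither the IQM spec nor DirectQ, and it assumes one bone per vertex with no blend weights.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
typedef struct { float m[3][4]; } bonematrix_t;                  /* 3x4 affine: rotation + translation */
typedef struct { float pos[3]; unsigned char bone; } skinvert_t; /* position plus one bone index */

/* Per-vertex work a shader would do: fetch the bone matrix by index
   from the palette (the "constant registers") and transform the position. */
static void SkinVertex(const bonematrix_t *palette, size_t numbones,
                       const skinvert_t *in, float out[3])
{
    assert(in->bone < numbones);
    const bonematrix_t *b = &palette[in->bone];

    for (int i = 0; i < 3; i++) {
        out[i] = b->m[i][0] * in->pos[0]
               + b->m[i][1] * in->pos[1]
               + b->m[i][2] * in->pos[2]
               + b->m[i][3];               /* translation column */
    }
}
```

Each 3x4 matrix takes three 4-component constant registers, which is where the limit on bone count comes from.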
On the bright side though it looks like I'm going to be able to support the $frame stuff QC-side. I don't see any reason whatsoever why it wouldn't work.
Drawing. Just set your vertex arrays and make a glDrawElements call, that simple. Right now I'm making no attempt to be efficient about stuff like state change batching, as the priority was just to get a working implementation. I'll come back and handle that later.
More drawing. The format uses 32-bit indexes by default. More software T&L country. If your 3D hardware doesn't support 32-bit indexes - guess what - OpenGL will drop your vertex pipeline to software emulation. Granted, most hardware does nowadays, but Quake remains one area where legacy hardware is still more widely used than normal. I switched the in-memory format to 16-bit indexes, which puts an upper limit of 64k (65,536) vertexes on an IQM for use in RMQ. That's still high enough.
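The narrowing step could look something like the following. This is a minimal sketch, not the actual RMQ code: `NarrowIndexes` is a hypothetical helper, and rejecting the whole model when an index doesn't fit is an assumption about how you'd want to handle oversized models.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: copy 32-bit indexes into a malloc'd 16-bit array.
   Returns NULL if any index exceeds 65535 or if allocation fails;
   the caller frees the result. */
static uint16_t *NarrowIndexes(const uint32_t *src, size_t count)
{
    assert(src != NULL || count == 0);

    uint16_t *dst = malloc(count * sizeof *dst);
    if (!dst)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        if (src[i] > UINT16_MAX) {   /* model has too many vertexes */
            free(dst);
            return NULL;
        }
        dst[i] = (uint16_t)src[i];
    }
    return dst;
}
```

The resulting array would then go to glDrawElements with GL_UNSIGNED_SHORT instead of GL_UNSIGNED_INT.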
Other notes.
Adding a new model type to Quake is moderately painful. New types don't fit in seamlessly, and there are a few places scattered around the engine where you need to handle each type explicitly. Going OO, with a base model_t class and every other type inheriting from it, would be a huge improvement in code cleanliness and would cut the work required.
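For what it's worth, you can get most of that in plain C with a table of function pointers per model type. This is only a sketch of the idea: the `model_t` here is a cut-down, hypothetical version and not Quake's real struct, and `modelfuncs_t`, `IQM_Load`, `IQM_NumFrames` and `Mod_Load` are invented names with placeholder bodies.

```c
#include <assert.h>

typedef struct model_s model_t;

/* Per-type "virtual" functions; each model type supplies its own table. */
typedef struct {
    int (*Load)(model_t *mod, const void *buffer);   /* nonzero on success */
    int (*NumFrames)(const model_t *mod);
} modelfuncs_t;

/* Cut-down base struct for illustration only. */
struct model_s {
    const char         *name;
    const modelfuncs_t *funcs;   /* selected once, when the file type is identified */
    const void         *data;    /* type-specific data */
};

/* Placeholder IQM implementations, just enough to show the dispatch. */
static int IQM_Load(model_t *mod, const void *buffer)
{
    mod->data = buffer;
    return buffer != 0;
}

static int IQM_NumFrames(const model_t *mod)
{
    (void)mod;
    return 1;
}

static const modelfuncs_t iqm_funcs = { IQM_Load, IQM_NumFrames };

/* Generic code calls through the table and never switches on the type. */
static int Mod_Load(model_t *mod, const void *buffer)
{
    assert(mod && mod->funcs);
    return mod->funcs->Load(mod, buffer);
}
```

Adding another model type then means writing one more function table rather than touching every switch statement in the engine.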
I decided not to put IQMs into cache memory as it simplifies a lot of things in the loader and the renderer. Cache memory is really an old-hat concept from the Pentium 60/MS-DOS/8 MB days and should probably be taken out back and sent to a better place anyway.
Think that's about it for now; the next step is to finish one outstanding part of the loader (textures) and clean up things OpenGL-side a bit more. I don't doubt that I'll have something more to report after that.
Monday, May 30, 2011
IQM Implementation Notes - Part 1 (of ???)
Posted by mhquake at 12:34 AM
4 comments:
Getting a clean IQM implementation in DirectQ will take it above DarkPlaces and the other 2 or 3 existing IQM-capable engines. I, myself, have been waiting for this for a long time, as my engine coding efforts have been stalled for a couple of months.
Don't confuse the demo, which is deliberately made to just be a simple-as-possible piece of code giving an idea of how you could use it, with the actual way you would implement it in production. Some data was just manipulated in-place in the demo to keep the code small.
Whether you want to use short or long indices is sort of a load-time decision, but we went with long indices in the format so that large models can be supported.
Sauerbraten, for instance, does GPU skeletal animation with the quaternion data - sending each in 2x4 (rotation, translation) uniforms. It's not really a format thing, since all you're doing is flinging the quaternions and blend indexes/weights at the GPU and doing the blends there. The actual blend indexes/weights are quantized in such a way as to make it easy to send them to the GPU even (as 4 byte vertex attributes that you can just put in each vertex), with the weights already normalized for you.
But how exactly you encode the quaternions (or make them matrixes if that's your fancy, but requires more uniforms than quats) is so variable that the format doesn't really try to assist with that, and instead just tries to encode the animations in a space-efficient way.
If you blend on the CPU, convert the quats to matrixes, generate a blend palette to factor out redundant vertex blends, and just SSE it if really desired. This is pretty much what both DP and Sauerbraten do for the CPU fallbacks - though DP does not yet do the GPU skinning part. There's nothing in the format that should prevent doing this.
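The quat-to-matrix step mentioned above can be sketched as follows. This is the textbook conversion and not code from DP or Sauerbraten; it assumes a unit quaternion stored in (x, y, z, w) order, ignores any per-joint scale, and `mat3x4_t` and `QuatToMatrix` are names invented for the example.

```c
#include <assert.h>

typedef struct { float m[3][4]; } mat3x4_t;   /* hypothetical 3x4 bone matrix */

/* Build a 3x4 matrix from a unit quaternion q = (x, y, z, w)
   and a translation t. Scale is not handled here. */
static void QuatToMatrix(const float q[4], const float t[3], mat3x4_t *out)
{
    assert(q && t && out);

    float x = q[0], y = q[1], z = q[2], w = q[3];

    out->m[0][0] = 1.0f - 2.0f * (y * y + z * z);
    out->m[0][1] = 2.0f * (x * y - w * z);
    out->m[0][2] = 2.0f * (x * z + w * y);
    out->m[0][3] = t[0];

    out->m[1][0] = 2.0f * (x * y + w * z);
    out->m[1][1] = 1.0f - 2.0f * (x * x + z * z);
    out->m[1][2] = 2.0f * (y * z - w * x);
    out->m[1][3] = t[1];

    out->m[2][0] = 2.0f * (x * z - w * y);
    out->m[2][1] = 2.0f * (y * z + w * x);
    out->m[2][2] = 1.0f - 2.0f * (x * x + y * y);
    out->m[2][3] = t[2];
}
```

Doing this once per bone per frame keeps the per-vertex work down to plain matrix multiplies and weighted adds.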
Just a thought.
First you complain about it not using GPU skeletal animation, then about it using 32-bit indices.
I'd be interested in seeing a card that supports GPU skeletal animation but not 32-bit indices.
So which is it? Should it target new or old hardware?