I missed noticing the first anniversary of starting this blog, so I'm going to celebrate it today instead. Hard to believe that it's been a year since I really came back on to the scene, and how much has happened (and changed) during that time. Real-life-wise, 2008 was probably the best year of my life; I certainly had an utter blast, had fantastic times, achieved a lot personally, and have come out of it with some very happy memories. Here's to more of the same!
Engine-wise I really never anticipated what would happen and the way things would turn out. To think that I was aiming for a release of the GL engine in March! With hindsight it should have been obvious even then that I was going nowhere fast with it, and the fact that I've made more releases in shorter time with more valuable additions in them than ever before confirms that I eventually made the right choice.
Unfortunately I don't have a 1.4 release to round off the celebrations with; it will probably take another week or two to knock some of the rougher corners off 1.3, but it will be sooner rather than later.
Friday, January 30, 2009
Posted by mhquake at 7:27 PM
Thursday, January 29, 2009
After having squashed the most recent bug, it's time for a few more updates. I'm pretty much decided that 1.4 is going to be a "nips and tucks" release rather than anything dramatically new. I've had enough huge changes in the last two releases, so it's time to settle down a bit and get the codebase as rock solid as possible before moving on again.
I could probably release what I have now in fact, but there is some outstanding work on alias models I'd like to get done first. I'd also like to fix the TGA writing for mapshots. Moving sky and "gl_flashblend 1" mode to shaders also remains outstanding.
Recent changes/improvements include fixing up FOV so that the viewmodel remains consistent (this was working before 1.3 but I broke it when moving to shaders) - I've changed this so that it gradually disappears with values of FOV below 90 - and adding 8 new custom crosshairs. The classic '+' remains available (and is in fact the "crosshair 1" setting); "crosshair 2" will give you an orange '+' (this has been in since the initial release) and values of "crosshair" from 3 to 10 will use the custom ones. I want to add a menu option for selecting these too.
I'm also starting to think about where to go next in terms of big features. A new particle system is almost definitely going to happen, but this will be an option with the classic system being the default. It might start happening in the following release. I also want to do game changing and port the games, maps and demos menus over from the old GL engine. The maps menu might require some reworking of the listbox code, as the GL engine used to let you enter custom filters for easy access.
I've also been giving further thought to the idea of QC and its ability to call any command or set any cvar. Where I think I'm going to go is to allow an option to just report on what QC is doing, rather than blocking anything. I can certainly see the value this would have for assisting mod developers in debugging their QC. I might even roll it into "developer 1".
Posted by mhquake at 12:43 AM
Wednesday, January 28, 2009
I've just spent the best part of 3 evenings tracking down a bug which turned out to be an evil concoction of Quark cleverness and compiler settings. This is annoying - on the one hand we have a tool that does lots of things in a non-standard way (but which somehow produces output that works with most Quake engines), whereas on the other we have voodoo enhanced instruction sets which for the most part work perfectly, except in this particular instance.
A warning to all engine developers:
DON'T ENABLE THE SSE INSTRUCTION SET IN QUAKE ENGINES.
I don't know what Quark does or how it does it, but it is not compatible with SSE. The longer version - lightmap texcoords are derived from a number of factors: vertex positions, surface texturemins values and surface extents values. Texturemins and extents are derived from texcoords, which are defined in the texinfo lump (offsets and scaling). Normal Quake maps have limited precision here, but Quark enhances the precision in a number of magical ways. Now, compared to a normal Quake map, using SSE instructions will cause extents and texturemins to evaluate as being a little off (16 units, which is really just 1 as they multiply by 16).
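To show why a tiny floating point difference can turn into a full 16 units, here is a rough sketch of how texturemins and extents are derived for one texture axis. It is loosely modelled on the surface extents calculation in the stock Quake loader; the struct layout, the function names and the hand-rolled floor/ceil helpers are my own simplifications for illustration, not the engine's actual code.

```c
#include <assert.h>

/* One texture axis from a texinfo entry: projection vector plus offset.
   Simplified, hypothetical layout for illustration. */
typedef struct {
    float vec[3];
    float offset;
} tex_axis_t;

/* floor(v / 16) and ceil(v / 16) as integers, without needing libm. */
static int floor16(float v)
{
    int i = (int)(v / 16.0f);
    return ((float)i * 16.0f > v) ? i - 1 : i;
}

static int ceil16(float v)
{
    int i = (int)(v / 16.0f);
    return ((float)i * 16.0f < v) ? i + 1 : i;
}

/* verts holds numverts * 3 floats (x, y, z per vertex). */
void calc_axis_extents(const float *verts, int numverts, const tex_axis_t *axis,
                       int *texturemin, int *extent)
{
    float mins = 0.0f, maxs = 0.0f;
    int i, bmin, bmax;

    assert(numverts > 0);

    for (i = 0; i < numverts; i++) {
        const float *v = verts + i * 3;
        /* project the vertex onto the texture axis */
        float val = v[0] * axis->vec[0] + v[1] * axis->vec[1]
                  + v[2] * axis->vec[2] + axis->offset;
        if (i == 0 || val < mins) mins = val;
        if (i == 0 || val > maxs) maxs = val;
    }

    /* snap the projected range outwards to the 16-unit lightmap grid */
    bmin = floor16(mins);
    bmax = ceil16(maxs);
    *texturemin = bmin * 16;
    *extent = (bmax - bmin) * 16;
}
```

With a 64 unit wide quad and a zero offset this gives an extent of 64; nudge the offset by 0.001 and the ceil step tips over, giving 80. Any change in how the dot product rounds can do the same thing.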
Where does the fault for this lie?
I could take the blame myself for switching on SSE (apparently I did so in 1.2). I could point the finger at Quark for being different and insisting on using non-standard formats. I could point the finger at the compile tools used for generating the BSP that way. Hell, I could have even discovered a rare SSE bug that Intel don't know about.
Right now though I'm going to bed.
Posted by mhquake at 1:38 AM
Friday, January 23, 2009
Such are the perils of an engine in development...
I've changed mapshot handling again.
It occurred to me that drawing them at the same aspect ratio as the screen is not the most clever idea: what if you change resolution and now have a different aspect ratio to that which you had when the mapshot was taken?
I could have amended this to write the mapshot out at the same aspect ratio as the screen, but this would mess up positioning/etc. We could also very quickly run out of horizontal space if the resolution was wide enough.
Instead I'm now taking a square rectangle from the center of the screen, which pretty much mimics the old 1.2 method (but doesn't blow up on underwater warps).
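As a minimal sketch of the idea (the struct and function names here are made up for illustration and are not taken from the engine), picking the largest centered square out of a screen of any aspect ratio looks something like this:

```c
/* Hypothetical helper type: top-left corner and side length of the capture area. */
typedef struct {
    int x;
    int y;
    int size;
} square_t;

/* Largest square that fits in a width x height screen, centered on both axes. */
square_t center_square(int width, int height)
{
    square_t sq;

    sq.size = (width < height) ? width : height;
    sq.x = (width - sq.size) / 2;
    sq.y = (height - sq.size) / 2;
    return sq;
}
```

Because the result depends only on the shorter screen dimension, a shot taken at one aspect ratio still draws correctly after switching to another.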
I've also changed them to 128 x 128 PNG images. The home-grown TGA writer had started misbehaving (I'll need to check that out), and while I would have preferred BMP (faster loading and saving) it seems as though Direct3D will gladly save a BMP with an alpha channel, but become unhappy when trying to load it. I'm sure there's a perfectly good reason why the API behaves this way (cue sarcastic tone). 128 x 128 is because the old 256 x 256 size was too slow. Noticeable stalls when a shot is written.
I'm going to amend the save/load screen layout a little; nothing dramatic, just a slight re-alignment of the save game info. I actually had this already done post 1.2 but forgot to copy across the code from the machine I had written it on.
Which is kinda scary - wondering what else I forgot to copy across...
Posted by mhquake at 12:59 AM
Thursday, January 22, 2009
This card is something of a joy to work with. It's a sad reflection on the state of OpenGL drivers that it's only really starting to show its true performance now.
4 x Anisotropic Filtering is faster on it than no Anisotropic Filtering.
Anyway, 1.4 is underway. After the fairly large leaps into the unknown of the previous releases, this one will be consolidating and tidying up more than anything else. Here's the list so far:
- Fixed r_novis 1 crash.
- Removed contents transition code from R_MarkLeaves - doesn't play nice with translucency and doesn't handle entities.
- Fixed lockups in OpenQuartz and other mods where inline bmodels seem to share surfaces.
- Renamed r_anisotropicfilter to gl_texture_anisotropy (consistency with other engines).
- Added FOV to video/view menu.
- Changed widescreen fix so that it's only effective at FOV 90 (consistency with other engines).
- Added fov_compatible cvar to allow player to override this.
- Removed HUD when loading plaque is visible (looks wrong).
- Fixed instanced brush model R_LightPoint start position.
- Added missing sprite types from WinQuake.
- Switched viewsize to 120 when loading plaque is up (still not sure, looks right at intermissions though).
- Increased alias model vertex limits.
Posted by mhquake at 11:53 PM
I'm not certain quite what to make of this one, but it appears as though there is a map editor or compiler out there somewhere that lets inline brush models share surfaces. It doesn't happen with any of the ID1 maps, nor with the Hipnotic or Rogue maps I've tested with, nor with 99% of other maps, but it does happen with some.
This was compounded by the fact that I'm adding these models into the regular texture chains, which causes the texture chain to loop back on itself and become infinite. The fix was to check if a surf has already been added (I just reused visframe for this, as it's not otherwise used here) and not add it again if so.
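A stripped-down sketch of that guard follows. The types are cut-down stand-ins rather than the real engine structs, and stamping visframe with the current frame number is my assumption about how the reuse might look.

```c
#include <stddef.h>

/* Cut-down stand-ins for the engine's surface and texture structs. */
typedef struct surf_s {
    int visframe;                 /* reused as "already chained this frame" stamp */
    struct surf_s *texturechain;  /* next surface using the same texture */
} surf_t;

typedef struct {
    surf_t *chain;                /* head of this texture's chain */
} texture_t;

/* Push surf onto the texture's chain unless it is already there this frame.
   Returns 1 if it was added, 0 if it was skipped. */
int chain_surface(texture_t *tex, surf_t *surf, int framecount)
{
    if (surf->visframe == framecount)
        return 0;

    surf->visframe = framecount;
    surf->texturechain = tex->chain;
    tex->chain = surf;
    return 1;
}
```

Without the check, chaining A, then B, then A again leaves A pointing at B and B pointing at A, so walking the chain never reaches the end.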
Let me just say it again - this doesn't happen with ID1.
This I guess is one of the reasons why I generally hate mods, and why coding to ensure mod compatibility can be such a minefield. Of course there is nothing in the engine to prevent this from being legal, but even the standard Q1 map format doesn't support it. So - assuming my analysis is correct - what we have here is somebody who went and invented their own non-standard map format which just happened to work with regular Quake, purely by accident.
To say I'm a bit annoyed would be putting it mildly.
Posted by mhquake at 1:24 AM
Wednesday, January 21, 2009
Yesterday's release had the hoped-for response, quite a few interesting bugs coming out.
Among the juicier ones are some fairly vicious lock-ups in OpenQuartz - I've traced the majority of these to SCR_BeginLoadingPlaque, which appears to be causing some pain and suffering on the stack. The last remaining one (so far) is in SCR_ModalMessage, happening when you exit the game. One thing in common is that both of these call SCR_UpdateScreen - I note a comment in Host_LoadGame_f that bringing up the loading plaque can't be done there as too much stack space has been used (and also the warning on SCR_UpdateScreen itself), so I'm guessing that I have something similar in other places. I'll pursue this line until something proves otherwise, but for now OpenQuartz is officially unsupported, you run it at your own risk, and if you want to exit it, you'll need to type "disconnect" at the console first. In fact, you should probably also type "quit" at the console.
I'll be prioritizing fixing this as it could affect other mods.
A few others are to do with rendering glitches etc., but lockups take precedence for now.
Posted by mhquake at 10:16 PM
OK, I've decided to let it out and see what happens. :)
There have been so many changes that bug reports would be very welcome, as always. Don't expect that everything new will work perfectly, this is another medium-rare release.
What you need to know:
- You'll need hardware capable of pixel shaders 2.0 and with minimum 3 texture units. There are no fallback, compatibility, or whatever modes. Note that the Intel Integrated Mobile 910 I've done most of the development work for this on meets the requirement.
- Don't use gl_flashblend 1, please. I know it looks ugly, but them is the breaks.
- I've ripped net_dgrm, protocol and cl_demo code from the BJP engine (credit!), so anything that applies there protocol-wise etc also applies here. Anything bad in my engine that works OK in BJP is my own doing.
- I haven't done any testing of connections with a regular NetQuake exe. This should be able to connect to a server just fine, but may cause problems if used as a server.
- You no longer need to specify -heapsize, you can if you want but it will do nothing.
- This should go up to 150% of the speed of the previous release, depending on hardware.
- It can load warpspasm maps - oh yes! In fact, this engine has much higher capacity than even the BJP engine in certain areas.
- Don't try to edit the .rc file in Visual Studio's resource editor - I've used a format that's not compatible, and you'll mess things up.
- If you have any mapshots from the previous engine they'll look funny in this one as I've adjusted them to the same aspect as the screen.
- I haven't fully tested everything new, but I am interested in hearing what does and doesn't work.
Posted by mhquake at 12:37 AM
Tuesday, January 20, 2009
Almost there. A few small bugs left - the old state change bug has now migrated to gl_flashblend 1 mode. I might not prioritize fixing this, as I don't suppose many people play with gl_flashblend 1 anymore, especially not those with PS2.0 capable hardware! If it's going to affect something, I'd rather it affect this than anything else.
The big one is that I go into an infinite loop somewhere when changing between warpspasm demos. This needs to be fixed as a matter of priority, so I'll be focussing on that.
Aside from those two, I'd pretty much consider it releasable. Of course, there's a few items on my to-do list that I'd also like to address, but most of them can hold off until the next release after this one.
Posted by mhquake at 10:39 PM
I've pretty much completed it now, and I'm quite happy with it. I managed to succeed in getting a full resolution rendertarget (the previous shot was half resolution) and a 75% FPS speed by going to vertex-based. It's not as smooth as the old pixel shader, but you'd really have to be looking to notice.
The real irony here is that I also tracked down the real reason for the huge FPS impact of my old render to texture water updates. I won't be reverting them though as my current pixel shader is perfectly good, and doesn't need as much video RAM.
Posted by mhquake at 2:05 AM
Monday, January 19, 2009
I promised myself I'd feature-freeze and concentrate on finishing stuff up so I could release soon, but this was too good an opportunity to pass by.
Yup, it's an almost exact replica of WinQuake's underwater warp.
In its current incarnation it's a major FPS drain, more than halving what you get. Not ideal, so I might sacrifice some precision and move it to vertex-based, but at the same time a part of me just thinks it looks so cool.
Posted by mhquake at 8:40 PM
Something very nice just happened - I've finally hit the magical 100 FPS point on my Intel 910. This was a long time coming, and various fine tunings and optimizations have paid off well. It was definitely one goal I wanted to achieve before releasing, as the previous version only did 70 FPS at best.
I've reverted particles from using point sprites to the more traditional technique of textured primitives. Was never really happy with point sprites - they came close but didn't quite match the classic look. This is better, and has also enabled me to run them through shaders. No real reason for this as I'm currently not doing anything in the shader that couldn't be done in fixed func, but it's good to have the supporting infrastructure in place in case I ever decide to.
I've also amended the dot texture a bit to something that looks more round; funny but I kinda miss the angular blob of classic Q1, but I suppose it's OK.
While I'm doing this I'm building the code up so that it'll be able to support custom particle types. I don't really have any intention of building a custom particle engine just yet, but it's something that might be on the cards at some time in the future.
Working on particles makes me really wish that MS would backport geometry shaders to Direct3D 9. I don't really see any reason why they couldn't be done (only to tempt folks towards Vista lock-in) - Direct3D 9 in its current incarnation already uses the Direct3D 10 shader compiler, and OpenGL can do geometry shaders without requiring Vista - and they would be an absolute boon here. There are lots of ways they could help with fixing up particle systems.
Anyway, the old state change bug is finally gone now on account of having done this, but nonetheless I'm going to press ahead with completing the move of everything to shaders. All that's left is gl_flashblend 1 mode (another place where geometry shaders would help a lot - sigh!) and sky. Then I intend doing a complete review of all state changes in the code, some PIX profiling, some tidying up of other things, and we're into release time!
Posted by mhquake at 12:05 AM
Sunday, January 18, 2009
Today I've moved sprites and 2D drawing over to shaders. This was a really nice thing to do, especially for 2D drawing, as it's enabled me to get rid of a lot of messy code that was required to support the weird way the fixed pipeline does colours (4 bytes packed into a DWORD). I'm aiming for a full shader implementation before I release this one.
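For anyone who hasn't met it, the packed colour format looks roughly like this. This is a sketch of the usual ARGB byte layout that Direct3D 9 colours use; the function names are my own and the rounding choice is an assumption.

```c
#include <stdint.h>

/* Clamp a 0..1 float and convert it to a 0..255 byte value, rounding to nearest. */
static uint32_t to_byte(float c)
{
    if (c < 0.0f) c = 0.0f;
    if (c > 1.0f) c = 1.0f;
    return (uint32_t)(c * 255.0f + 0.5f);
}

/* Pack four float channels into one 32-bit value: alpha in the top byte,
   then red, green and blue. */
uint32_t pack_argb(float r, float g, float b, float a)
{
    return (to_byte(a) << 24) | (to_byte(r) << 16) | (to_byte(g) << 8) | to_byte(b);
}
```

With shaders the colour can simply travel as four floats, so none of this packing and unpacking is needed on the engine side.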
The weird state change bug has now migrated to particles. I'm not going to invest the time required to track it down, as I'm moving everything to shaders anyway, where I'll have complete control over what goes in and out of the pipeline.
As I said before, but I believe it bears repeating, a move to shaders does NOT imply that DirectQ will suddenly start down the gratuitous effects route; I'm using them to replicate the classic Quake rendering effects only.
So given all of this work, where does DirectQ fit in the bigger scheme of things? I don't believe there's room for another general-purpose engine in the current state of things - both DarkPlaces and QRack are the logical culminations of where the engine coding community has been going these past 10 or so years. Both serve their purposes extremely well, and if they do what somebody wants there's no reason to switch from them.
The future for any development outside of those two is in special purpose engines. We've already seen this in the BJP engines and ProQuake, which cater to very specific needs. I believe DirectQ is another such engine, with its goal being a stable, ultra-high capacity engine that will run well on cranky hardware that doesn't play nice with OpenGL. I'm writing it because that goal suits my own needs, but if it also suits anybody else's, it might be worth a check-out.
Posted by mhquake at 8:12 PM
Things coming along reasonably well with the baseline functionality. I've got fullbrights on everything completed, which was a nice thing to be able to strike off the list. I've also fully transitioned from raw shaders to FX files, which has enabled me to remove a lot of messy "Version 1.0" code. The viewmodel drawing has gone into the regular Alias model drawing function too, so we're better integrated.
I've determined that the rendering bug I've been having only occurs when particles are not being drawn, so there's a state change missing from something somewhere that particle drawing fixes up. So far as I can guess at the moment it's probably FVF related, as it really only came about when I'd started removing FVF functionality from the main renders.
I'll probably transition everything from FVF to vertex declarations anyway, this is a really cool and flexible part of the API (although the setup code is horrible) which enables a lot of custom functionality in a really nice way. I'm exploiting it pretty well with Alias models, allowing me to move a lot of the interpolation related code to the GPU.
Posted by mhquake at 12:14 AM
Saturday, January 17, 2009
For the past coupla days I've been playing around with the Windows 7 Beta. I've made it my business to be involved in Windows beta testing for the past few releases, but it's never really gone beyond just installing the new OS. This time I've actually been more active in using it over the course of a working day and sending feedback.
So far I like a lot of what I see; it's really what Vista should have been. I always evaluate OS releases with my Network Admin hat on, as that's the day job and I like to see how much hassle I'm going to get from them.
Specific good stuff includes:
- Performance and resource utilisation; so far as I can see, it's back to XP levels. Of course, by default I disable a lot of the more heavyweight services (like indexing), but doing so never made any appreciable difference on Vista for me.
- Look and feel. The new taskbar is superb, and removing Desktop Gadgets by default was the Right Thing to do.
- Stability. OK, I bluescreened it in hour one by installing the Virtual PC 2007 add-ons, but this on research turned out to be a known issue. Otherwise it's been rock solid.
- Application compatibility. Vastly improved over Vista, several cranky 3rd party apps which persistently crashed under Vista work flawlessly now.
- Tunable UAC. I like the idea of UAC, but the Vista implementation was botched. 7 has 4 levels available, with the default being "don't nag me about anything I've launched myself".
Specific not-so-good stuff includes:
- Microsoft still don't seem to understand corporate networks; all the multimedia and home networking gubbins remains there and in-yer-face even when joined to a Domain. OK, it'll probably be configurable via Group Policy, but a more sensible default for Domain members would be a good idea too.
- The Virtual PC crash - this is something that should have been anticipated and prevented.
- Some of the configuration dialogs are incredibly slow.
- UAC still doesn't integrate cleanly with Group Policy, causing elements like Logon Scripts to fail. At this stage it should be able to detect when something comes from a Logon Script, realise that this is something the Admin is doing, and just let it happen.
- "New ways to do familiar tasks" strikes again, with Control Panels and options having been arbitrarily rearranged. I largely skipped Vista so I don't know how much of this is a hangover and how much is new, but having to basically relearn everything from scratch with each OS rev is not good. Doesn't affect me so badly, as I do most of my config from the GPO editor, but it will affect some.
I'm also impressed by how polished and finished it seems. If this is only Beta 1 it bodes well for the final release, and I'll be following subsequent Betas and RCs with interest.
Posted by mhquake at 8:39 PM
Thursday, January 15, 2009
Fullbrights on world surfaces have just gone in. It's all shader based, which gives simpler setup and faster code.
I had played with the notion of adding a cvar to toggle fullbrights on or off, but I eventually removed it. It was a nice enough idea in principle, but ultimately fullbrights were in software Quake, meaning that they're a fundamental engine feature that you don't mess with.
Some small optimizations in the particle system have been very beneficial; got about 5-6 extra FPS from them. Not rendering and discarding particles when they hit a solid leaf is the name of the game.
Sprite code is starting to have an effect elsewhere now, it's messing with the render states a bit. Perhaps something in the 2D code also? I dunno to be honest, but it's looking as though I'm not restoring state properly somewhere. I'll get there.
Bumped MAX_MAP_LEAFS to 32767 on the basis that each node has 2 children, one of which could potentially be a leaf. Very very unlikely, but it's nice to round off this part of things properly. It can't go any higher as more than 32767 nodes requires a BSP format change.
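The reason for the ceiling is that the on-disk node stores each child as a signed 16-bit value, with negative numbers meaning leafs. A small sketch of the decoding, following the convention in the stock loader (the result struct and function name are invented for illustration):

```c
/* Hypothetical decoded form of one child reference from a disk node. */
typedef struct {
    int is_leaf;   /* 1 if the child is a leaf, 0 if it is another node */
    int index;     /* index into the leafs or nodes array */
} child_ref_t;

/* Disk format convention: child >= 0 is a node index,
   child < 0 encodes leaf number (-1 - child). */
child_ref_t decode_child(short child)
{
    child_ref_t ref;

    if (child >= 0) {
        ref.is_leaf = 0;
        ref.index = child;
    } else {
        ref.is_leaf = 1;
        ref.index = -1 - child;
    }
    return ref;
}
```

With the sign doing double duty there are only 15 bits left for the index itself, which is why going any higher needs a format change.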
Significant current engine limits look something like this:
Crazy-assed modders rejoice! I've enabled liquid surfs on instanced brush models, so you can now put your lava textures on ammo boxes. I'm refusing to enable sky surfs on them for the moment though...
Posted by mhquake at 11:58 PM
Wednesday, January 14, 2009
I'll start by qualifying "unlimited" - the amount of RAM, video RAM, CPU and GPU performance, highest number you can store in an int, and so on will always be limits.
Anyway, I had done a lot of work on removing hard limits in the GL engine, and some of that is now starting to come across to DirectQ. The eventual goal is going to be having the sole limit being what your PC can handle, and this won't happen in the course of 1 or even 2, 3 or 4 releases, but I will aim to get as close to it as possible, and closer each time I release.
In some cases I have retained hard limits; for example I support up to 8192 loaded models, and no more. Memory efficiency is a consideration, and in cases like this I use an array of pointers rather than an array of structs. Sometimes I pre-allocate a fixed number (something along the lines of what the old limit was) and expand dynamically (either by 1 or by another - lower - fixed number), in others I just go dynamic from the outset. Sometimes it's a linked list. It's horses for courses.
Doing it wrong is easy - just bump up the #define'd maxes and watch your memory utilisation balloon. Doing it right requires careful consideration of every single case. Sometimes a bump of the defined max is even the best solution - why bother putting a lot of effort into temp entities when they're only used for beams?
The big killer in Quake is Protocol 15. I seriously wish that everybody could just simply agree to move to a new higher-limits protocol which could be quickly and cleanly implemented (bump the maxes, switch from bytes to shorts, etc). In the absence of that you have to be very very careful that whatever you do doesn't break protocol 15. There are some small areas for optimisation - doing comparisons of origin and angles based on what will actually be sent rather than the raw float values is one. There are other areas where you're in a corner - model and sound numbers are transmitted as bytes, so any increase beyond 256 is a breaking change in and of itself.
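To illustrate the origin and angles point: coordinates go over the wire as 16-bit fixed point in 1/8 units and angles as a single byte, so two floats that differ slightly can encode to the same thing. A rough sketch, using the stock message encoding for the quantisation (the comparison helpers and their names are hypothetical):

```c
/* Coordinates are sent as 16-bit fixed point with 1/8 unit precision. */
static int wire_coord(float f)
{
    return (int)(f * 8.0f);
}

/* Angles are sent as one byte: 256 steps to the full circle. */
static int wire_angle(float f)
{
    return ((int)f * 256 / 360) & 255;
}

/* Returns 1 only if the client would actually receive a different origin. */
int origin_changed(const float *a, const float *b)
{
    int i;
    for (i = 0; i < 3; i++)
        if (wire_coord(a[i]) != wire_coord(b[i]))
            return 1;
    return 0;
}

/* Returns 1 only if the client would actually receive different angles. */
int angles_changed(const float *a, const float *b)
{
    int i;
    for (i = 0; i < 3; i++)
        if (wire_angle(a[i]) != wire_angle(b[i]))
            return 1;
    return 0;
}
```

Comparing the encoded values rather than the raw floats means an entity that has only drifted by a fraction of a unit doesn't generate an update at all.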
I think it's fair to say that in 2009 the really ambitious mapping action is in the single player game. Deathmatch will always be cool, but it's more pure (and purist), and protocol restrictions make sense. Single player lets mappers' imaginations stretch. Colossal maps with thousands of entities, ooooh yeah.
One way I'm thinking of going with this is just bypassing the protocol altogether and writing/reading the data directly. When the client and server reside on the same machine (and in the same executable instance) there is scope for all kinds of clever trickery. I'm not willing to expend the time and effort on a new custom network protocol; there are too many already and the world doesn't need another, especially in what is more of a niche engine that at most 2 or 3 people will ever use for net play.
An alternative is to just rip the BJP protocols; not reinventing the wheel makes sense here, and will create a really nice situation where my engine is protocol-compatible with another one out there, specifically the one that mappers prefer to use. I'm already higher-capacity (but with lower memory usage) in a lot of areas than that engine though, so I may start hitting limits on even those protocols!
Busy busy, as always.
Posted by mhquake at 6:39 PM
Tuesday, January 13, 2009
Here's one that I find quite interesting...
Been considering the relative performance merits of triangle lists vs strips vs fans, with indexing or without. There's a lot of literature on the topic, with much of it being contradictory, not to mention failing to take account of real world usage scenarios.
The idea is to submit as many vertexes in a single API call as possible; a figure of about 300 is mentioned in some of the docs. Problem is - and maybe it's just me - but I fail to see how that's achievable.
First up, a fan or strip is a discrete unit, you can't submit more than one at a time. You can go a long way with strips, but you'll eventually hit the boundary.
Secondly, texture changes mean that you'll have to stop what you're doing and resubmit at some point. The only way I can see around this is to cache a number of textures in spare TMUs and toggle them in a shader, but I haven't done any tests to see how efficient that would be.
Thirdly, there is the small matter of PVS. The primitives you're submitting aren't necessarily going to be contiguous in your VBO. Best case is you sort by leaf and you might get 4 to 5 good blocks, but you run the risk of breaking texture sorting. Alternatively you keep a dynamic VBO and stream into that from system memory in the correct order, but per-frame dynamic VBOs really defeat the purpose of having a VBO in the first place, don't they? Did I mention that you might also break back-to-front order (already partially broken by texture sorting)? It gets worse.
Let's look at indexing. For a given world surf in classic Quake there are 7 floats per vert - 3 xyz, 2 st for diffuse and 2 st for lightmap. For the majority of cases the only reuse you're going to get out of that is in the xyz, which is a grand total of 8 bytes (12 less 4 for the vertex's index). In extreme high poly scenes those 8 bytes will be precious, but in Quake? I ain't so certain. Vertex caching performance may make indexing desirable despite that, but again I'm coming back to low poly. And again, how much data is going to be cachable? Not much.
Fourthly there's the expense of setting up all the infrastructure required to support it. Indexing for certain types of data looks like O(n²) to me... not nice. Ideally your build tools would do it all for you, but once again this is Quake, not la-la-idealistic-hippy-land.
Despite all that I can definitely see cases where certain techniques are advantageous. Strips are faster than fans as the driver can decompose them into individual triangles easier (reuse the last 2 verts and tack on the new one), and there may be certain cases where what's currently a trifan could be represented as a strip instead. Alias models are another good one, there's heavy reuse potential across both vertexes and texcoords, PVS ain't an issue, texture changing ain't an issue. Interpolation might be, which is why I currently run it in a vertex shader - just submit the 2 sets of xyz and the blend factors and let the GPU do the work. It's a heavier submission, but indexing will have greater potential there. Lighting is the big one, and is why I still use DrawPrimitiveUP with them - this potentially changes for every vertex each frame. A caching system may be viable at 10 FPS updates, but interpolation will break that.
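For reference, this is roughly what decomposing strips and fans into individual triangles involves. It's a plain C sketch with made-up function names, not what any driver actually does internally.

```c
/* Expand an n-vertex triangle strip into a triangle list.
   out must have room for 3 * (n - 2) indexes. Returns the triangle count. */
int strip_to_list(const unsigned short *strip, int n, unsigned short *out)
{
    int i, tris = 0;

    for (i = 2; i < n; i++) {
        if (i & 1) {
            /* every second triangle is flipped to keep the winding consistent */
            out[tris * 3 + 0] = strip[i - 1];
            out[tris * 3 + 1] = strip[i - 2];
        } else {
            out[tris * 3 + 0] = strip[i - 2];
            out[tris * 3 + 1] = strip[i - 1];
        }
        out[tris * 3 + 2] = strip[i];
        tris++;
    }
    return tris;
}

/* Expand an n-vertex triangle fan: every triangle shares the first vertex. */
int fan_to_list(const unsigned short *fan, int n, unsigned short *out)
{
    int i, tris = 0;

    for (i = 2; i < n; i++) {
        out[tris * 3 + 0] = fan[0];
        out[tris * 3 + 1] = fan[i - 1];
        out[tris * 3 + 2] = fan[i];
        tris++;
    }
    return tris;
}
```

In both cases each new vertex adds one triangle; the strip reuses the previous two verts while the fan keeps going back to the first one.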
The solution to alias models probably looks like (1) store verts, texcoords and lightnormals in a vertex buffer (thinks - lightnormals are shared for all alias models so could probably be encoded in a texture...), (2) store vert indexes, texcoord indexes and lightnormal indexes in an index buffer, (3) submit the indexes together with the shadelight values (which can be a one time submission per entity) and the interpolation blend factors in a single DrawPrimitive call per entity, and (4) run all the interpolation code currently in GL_DrawAliasModel in a vertex shader.
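The per-vertex work in step (4) is small. As a CPU-side reference in C (the real thing would live in a vertex shader; the function name and the 0-to-1 blend convention are assumptions), the position blend is just a linear interpolation between the two poses:

```c
/* Blend one vertex between two keyframe poses.
   blend is 0 at pose a and 1 at pose b; a, b and out each hold x, y, z. */
void lerp_vertex(const float *a, const float *b, float blend, float *out)
{
    int i;

    for (i = 0; i < 3; i++)
        out[i] = a[i] + (b[i] - a[i]) * blend;
}
```

The two poses never change, so they can sit in static buffers; only the blend factor needs to be sent each frame.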
Mmmmm - this would definitely crack the problem of alias models being something of a bottleneck.
Posted by mhquake at 8:20 PM
Monday, January 12, 2009
The enforced layoff from the more intense stuff is doing me good, I'm getting the chance to tidy up all sorts of smaller nooks and crannies that I'd been letting things slip on.
Today I got skyboxes done, including loading sky from the worldspawn. I keep on bringing up skyboxes as an example of the lack of standards, but they are probably the classic example - they can come from /gfx/env, /env, and I've even seen one mod with them in /gfx; they can be any image format (not too big a bother that); can have an underscore between the name and the suffix or not; having all of them present is strictly optional; and the worldspawn key can be called any one of 4 reasonable things. Thankfully I built my texture loader to be able to cope with exactly that kind of abuse, so it wasn't too difficult to handle it all. Crazy and unnecessary nonetheless.
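To give a feel for the combinations involved, here's a sketch that generates the candidate base names for one side of a skybox. The directory list comes from the cases above; the search order, the rt/bk/lf/ft/up/dn suffixes (the usual convention) and the function name are assumptions, and the image extension is left for the texture loader to try.

```c
#include <stdio.h>

#define SKY_NUM_DIRS  3
#define SKY_NUM_SEPS  2
#define SKY_NUM_SIDES 6

static const char *sky_dirs[SKY_NUM_DIRS] = { "gfx/env/", "env/", "gfx/" };
static const char *sky_seps[SKY_NUM_SEPS] = { "", "_" };
static const char *sky_sides[SKY_NUM_SIDES] = { "rt", "bk", "lf", "ft", "up", "dn" };

/* Fill out with the variant-th candidate base name (no extension) for one side.
   Returns 1 on success, 0 if side or variant is out of range or out is too small. */
int sky_candidate(const char *name, int side, int variant, char *out, size_t outlen)
{
    int written;

    if (side < 0 || side >= SKY_NUM_SIDES)
        return 0;
    if (variant < 0 || variant >= SKY_NUM_DIRS * SKY_NUM_SEPS)
        return 0;

    written = snprintf(out, outlen, "%s%s%s%s",
                       sky_dirs[variant / SKY_NUM_SEPS], name,
                       sky_seps[variant % SKY_NUM_SEPS], sky_sides[side]);
    return written >= 0 && (size_t)written < outlen;
}
```

That's 6 candidate names per side before image formats even come into it, so 36 for a full box.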
Also some useful work on managing the sv.edicts array better; I no longer allocate 8192 at load time, but instead make a count of the entities in the map and allocate that number + 64 (but never less than 512), then expand as required (in batches of 128 - should this be lower?) at run time. It's not safe to expand dynamically at load time as the array is in use then, and doing so requires changing the pointer. This is a massive memory saving; about 5-6 MB for a typical map.
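The sizing rules are simple enough to write down directly. A minimal sketch using the numbers above (function names are invented, and the actual reallocation and pointer fix-up are left out):

```c
#define EDICT_HEADROOM 64    /* spare slots on top of what the map contains */
#define EDICT_MINIMUM  512   /* never start with fewer than this */
#define EDICT_BATCH    128   /* run-time growth step */

/* Number of edicts to allocate at load time for a map with map_entities entities. */
int initial_edicts(int map_entities)
{
    int n = map_entities + EDICT_HEADROOM;
    return (n < EDICT_MINIMUM) ? EDICT_MINIMUM : n;
}

/* New capacity after growing in fixed batches until needed edicts fit. */
int grown_edicts(int current, int needed)
{
    while (current < needed)
        current += EDICT_BATCH;
    return current;
}
```

A smaller batch size would waste less memory but mean more reallocations when a map spawns a lot of entities at once, which is the trade-off behind the "should this be lower?" question.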
Wrote a PCX loader so I now support Link, TGA, PNG, DDS, BMP, JPG and PCX. I think that's enough image formats; Direct3D is capable of doing a handful more but they're a bit more esoteric. The TGA loader seems a bit fussy about formats, there was one I failed to load during testing, but opening it and saving it again in a standard image app worked fine. Obviously some kind of padding at the end. I might replace the builtin loader with one of my own; Microsoft's support for TGAs is a bit suspect anyway.
Been doing some thinking about DrawPrimitive vs DrawPrimitiveUP in D3D. All the documentation and advice gives dire warnings about DrawPrimitiveUP; it's slow, it's not flexible, performance will go through the floor, etc. Yet I've been using it pretty extensively, and have done fairly regular (about weekly) benchmarks in PIX, and I've never seen a difference. The stats fully support this; DrawPrimitiveUP is just as fast, and is perfectly suited to situations where you don't know your vertex data in advance (I still use regular DrawPrimitive when I do). Seems to me that the overhead of locking, writing to, and unlocking a vertex buffer (or a number of them in some cases) each frame is just not worth it. Also seems to me that Microsoft have this fixation with loading everything into big static buffers and then drawing from those, which is probably a hangover from the old days when D3D used Execute Buffers. Still trying to justify a bad decision from over 12 years ago? Who knows.
Posted by mhquake at 11:35 PM
...how I seem to do my most useful work late at night.
Anyway, been a bit busier than I'd originally intended, maybe moving away from the HLSL rut I'd gotten myself into was of benefit here (I'm one of those people that positively needs variety or I go insane).
The famous next release will have a brand new memory system attached to it. This is something I'd been moving towards for some time, with my removal of alias models and sounds from the cache (I can put them back into my own variant on a cache now...) and removal of the cache and zone systems. I had made an abortive start at it, but rolled back when trying to track down an obscure bug in alias model rendering (can't even remember what the real cause was now). Now it's back, and much much better.
What this really means is that DirectQ's memory will now be unlimited (up to the physical RAM in your PC anyway), and it will no longer be necessary to specify -heapsize. Memory is free to shrink or expand as the requirements of a map dictate. This is a Good Thing to have, as it removes a limit in a nice unobtrusive way without causing any hassle for the user.
I also tracked down a really nasty bug in the old QER interpolation tutorials arising from entities that change models but that still keep interpolation data hanging around. This really only manifested for me with the new memory system, where zombie.mdl was switched to h_zombie.mdl, but one of the pose numbers was left at 48. As the amount of memory allocated to a model is no longer in one big contiguous block, this was a bad pointer and it crashed.
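The general shape of a guard against this kind of stale state is sketched below. It is not the fix posted on Inside3D; the struct and field names are hypothetical, and the idea is simply to throw away the old pose numbers whenever the entity's model pointer changes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-entity interpolation state. */
typedef struct {
    const void *model;      /* model the entity uses this frame */
    const void *lastmodel;  /* model that pose1/pose2 were recorded for */
    int pose1, pose2;       /* poses being blended */
    float frame_start_time; /* when the current blend started */
} lerp_entity_t;

/* Resets interpolation if the model changed since the poses were stored.
   Returns 1 if a reset happened, 0 otherwise. */
int lerp_check_model(lerp_entity_t *e, int currentpose, float now)
{
    assert(e != NULL);
    if (e->model == e->lastmodel)
        return 0;
    /* Old pose numbers index into the previous model's data: discard them. */
    e->pose1 = e->pose2 = currentpose;
    e->frame_start_time = now;
    e->lastmodel = e->model;
    return 1;
}
```

With a check like this, a leftover pose 48 from zombie.mdl is never used to index into h_zombie.mdl.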
Anyway, I've posted the fix on Inside3D so hopefully everyone will benefit from this.
Posted by mhquake at 1:33 AM
Sunday, January 11, 2009
Just wrapping up some things before I break for a while. I've decided to use effect files (.fx) instead of standalone shaders, firstly for the sheer convenience, secondly because it's what authoring tools expect, and thirdly because they help to keep the code cleaner and simpler (always a bonus when dealing with Quake's wacky formats).
But damn it, they are slow. I can easily see myself losing the FPS I'd gained back from ditching render to texture as a consequence of just switching over to effect files. I've just brought the instanced brush model code up using effect files and lost 5 FPS as a consequence.
The SDK docs don't help much (at all!) by virtue of the way they just describe what each function does but hold off on any useful "this is the best way to do it" info, although I'm finding myself wondering if that would be a good thing at all, as a lot of D3D performance tips are either mutually exclusive or impossible to implement (at least without colossal supporting infrastructure) in any non-trivial real world application.
This ain't drawing teapots from .X files, Bill!
To say I'm not one bit happy about this would be an understatement. Quake is by no means a graphical heavyweight, and something is seriously wrong somewhere if a simple switch from standalone shaders to .fx files causes such a perf drop. It might be in my implementation, and if so that's cool, but there's part of me thinks a proper SDK would have prevented that.
Posted by mhquake at 7:47 PM
Saturday, January 10, 2009
I'm not going to get much done over the next week or so as there is some Real Life stuff I've been putting off that has now become a priority. Sometimes things suck like that, but all the same it will be good to get it over and out of the way.
Here's where I stand as of today:
This is undergoing a badly needed rewrite. The original code was fine for a straight port, but was quite clumsy to work further with. I'm hoping to be able to bring inline brush models back into the main surface refresh now that I have the problem of software transforms resolved.
Water warp updates
The entire render to texture framework has been stripped out and replaced with a pixel shader update. This maintains full speed at no extra video RAM cost, and is the right thing to do, even though it locks out those who don't have PS 2.0 capable hardware.
I'm not happy with how the poles of the sphere render, it looks wrong on open-sky maps, where it's really noticeable, but it's likely to stay that way for a while yet.
Still only supported on world and (some) 2D textures. Luma support is done but requires a rewrite of the other renders to enable it. I've come full circle and actually prefer the original ID1 textures these days, but people like external textures so I want to get this as complete as possible.
Alias and Instanced bmodels
These have pretty much moved over to a full shader-based render, which is far more efficient for handling transforms and vertex modifications. Instanced models are now lit properly, but seem too dark.
Reverted the original dot texture (the "fuzzy point" in the last release looked wrong), otherwise nothing done. I'm not in a hurry to write a full-on replacement.
Shaders in general
There are places where using shaders is the most efficient solution, but I'm not going down the whole bumpy/specular/per-pixel route. It might look nice, but it doesn't look like Quake.
Loading them is working fine, but they need to be integrated with the main renders. A lot (all?) of external texture packs provide lumas meant for use as an additive blend, which is incorrect. The correct blend is (lightmap + luma) * texture, and this is what I will be doing. To work around this I hack the texture data at load time to increase the intensity. It simplifies the render path which is what's important.
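The difference between the two blends is easy to see per colour channel. The sketch below assumes channel values in the 0 to 1 range and takes "additive" to mean lightmap * texture + luma; the function names are illustrative only.

```c
#include <assert.h>

/* Clamp a colour channel to the displayable 0..1 range. */
float clamp01(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Additive luma (assumed meaning): the luma is added on top of the lit texture. */
float blend_additive(float lightmap, float texture, float luma)
{
    return clamp01(lightmap * texture + luma);
}

/* The blend named in the post: the luma raises the light level, and the
   base texture still modulates the result. */
float blend_fullbright(float lightmap, float texture, float luma)
{
    assert(texture >= 0.0f);
    return clamp01((lightmap + luma) * texture);
}
```

For lightmap 0.25, texture 0.5 and luma 0.5 the additive form gives 0.625 while (lightmap + luma) * texture gives 0.375, which shows why a luma authored for additive blending comes out too dim unless its intensity is boosted at load time.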
I've completely removed r_shadows 1 mode but the cvar is still there. This needs serious work but I don't consider it high priority.
I'm currently benchmarking at about 98 FPS on demo1 on the Intel 910, which I'm going to continue to use as my primary benchmark. This includes pixel shaders/etc and is up from 65 FPS in the previous release (getting rid of render to texture helped a lot there). Oddly enough, WinQuake is faster on this machine, which I suppose is as good an indication of the quality of the chipset as anything.
Quite a few new ones; the most immediately noticeable are that the polyblend will switch itself off if you fire a weapon underwater, and that the tileclear around the status bar sometimes doesn't draw.
I've completely ripped out the old cache and zone systems; I had the hunk removed too but I reverted it while chasing down an alias model vertex bug (that in the end was caused by something completely different - oh well). I'll probably simplify this even more with the eventual intention of going to a full dynamic allocation with no heapsize restrictions.
A lot (nearly all?) of the hard limits have been removed from the client, aside from those which are protocol dependent (and one or two others where the usage is so limited it doesn't seem to make sense to invest the effort). This has been very satisfying to do. The sv.edicts array is still being allocated at full 8192 capacity which puts some overhead on memory requirements; I had solved this in the GL engine so I'll probably spring-clean and port the code.
Overall quite a lot to do still, so a new release will be a way off yet.
Posted by mhquake at 5:10 PM
Friday, January 9, 2009
Since I wrote that last post I've been thinking more and experimenting more with the whole HLSL thing. There are at least a few areas where - even in a stock Quake renderer - use of a shading language makes perfect sense:
- Water warp updates. I'm now satisfied that render to texture is a non-runner for this owing to Video RAM and speed considerations. A pixel shader could do the job just as well (in fact it was effectively a software pixel shader in the original Quake).
- Underwater warp - ditto.
- Fullbrights. Getting the correct formula ((lightmap + fullbright) * texture) is messy with fixed functionality. Texture stage states are fine enough but there's too much setup and takedown involved to make them realistic for all of the special-purpose situations Quake needs.
- Matrix transforms. A lot of Quake content needs to go through a number of matrix transforms, meaning that either you can't batch the renderer properly or you do it in software. Here's where a vertex shader could make things so much easier.
Anyway, I've just written the first on the list, replicating the old software Quake warp as closely as possible, and not only does it avoid the video RAM overhead of render to texture, but it's about 1.5 times faster and looks a lot better. It also means that liquid surfs can now go into a proper vertex buffer rather than needing to come from main memory, so there will be more performance gain.
This is a point of no return; HLSL is definitely on the menu now. It's solved one major stumbling point I've been unhappy about and it's all been gain.
Unfortunately what this means is that people who don't have a Pixel Shader 2.0 card are going to be locked out from this point forward. Sorry about that but I suppose it's progress. The alternative is that I get totally bogged down in trying to build backwards compatible solutions within the limitations of finite resources, meaning everyone would lose.
Posted by mhquake at 3:37 PM
Thursday, January 8, 2009
I've decided to defer any further HLSL experiments until such time as I have been able to give the whole thing more consideration. Primary reasons are:
- Decisions on using individual VertexShader and PixelShader files versus using an all-in-one Effect File need to be made. I'll discuss this further below.
- Concerns over lack of Pixel Shader support on older cards; one of my goals is to have a viable engine that will run on (almost) anything. Vertex Shaders will be very happily emulated in software by Direct3D 9, with performance at least equivalent to fixed functionality. Not so with Pixel Shaders.
- Lack of a really good IDE for developing HLSL shaders. There's really only RenderMonkey and FXComposer - I've already decided that FXComposer brings too much of its own weirdness and assumptions to the party, which leaves RenderMonkey. To be honest, the best I can say about it is that it's "functional".
- Concerns over distribution of shaders. I like to package content as embedded resources, as doing so means that users only need the EXE to get everything, and my content doesn't clash with anyone else's. However, I also like to be able to give people the ability to replace my content with their own if they wish. That bit's easy, it's management of the .rc file, the resource.h, and all the other messing that goes with it that's unpleasant.
Anyway, HLSL won't be making an appearance any time soon, but this brings me back to the original difficulty of handling that damned water warp update. If I can get the miplevel 1 problem resolved I'd be happy to go with a double-sized texture, a 4 x storage overhead per miplevel seems more acceptable than a 16 x overhead.
Posted by mhquake at 8:09 PM
Wednesday, January 7, 2009
It may not look like much, but what you're seeing here is alias frame interpolation running on the GPU.
Only it's not a GPU, it's Direct3D software emulation of a Vertex Shader, which - so far - appears to run a few frames faster than the fixed functionality code I had previously written. Significant room for optimization too, I can put the texcoords into a Vertex Buffer and pass colours and verts as constant data; all kinds of fun in store here!
Still no final decision on what I'm going to do about the water; aside from the fact that Render to Texture will certainly be going.
Posted by mhquake at 3:04 AM
Tuesday, January 6, 2009
It was the beams list all along that was somehow somewhere going crazy and trashing the alias model verts (and who knows what else in other maps). While I'm a bit worried that I haven't quite figured out what was going crazy about it, I'm quite happy to lay the matter to rest for the time being.
So I've taken beams, and for good measure temp entities as well, off the hunk (allocated per map) to a single global static array with fixed size. I don't really mind doing that, as these two were only ever used for lightning bolts and grappling hook ropes, which there will never be an extremely large amount of. In any event, I've bumped the maximum on them to 64 and 128 respectively, which should be sufficient to cover all normal and most abnormal situations.
I've a mental note that I'm still going to investigate alternate means of removing the limits. I think a Hunk allocated linked list for beams seems in order, and maybe take temp entities from the top of the standard entities list (which is already an array of 8192 entity_t pointers).
Note that I said pointers there, so storage overhead for 8192 entities client-side is only 32K. The first 512 are allocated on the Hunk at map startup time, with anything beyond that allocated as required.
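The pointer-table arrangement can be sketched as follows. The 32K figure corresponds to 8192 four-byte pointers on a 32-bit build. In this sketch calloc stands in for the Hunk allocator, the entity_t here is a placeholder rather than the engine's real struct, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_CL_ENTITIES      8192
#define INITIAL_CL_ENTITIES  512

/* Placeholder for the engine's much larger entity_t. */
typedef struct {
    float origin[3];
    int frame;
} entity_t;

/* Only the pointer table is fixed size; entities are allocated separately. */
static entity_t *cl_entities[MAX_CL_ENTITIES];

/* Returns the entity for slot num, allocating it on first use.
   Returns NULL for an out-of-range slot or on allocation failure. */
entity_t *cl_get_entity(int num)
{
    if (num < 0 || num >= MAX_CL_ENTITIES)
        return NULL;
    if (cl_entities[num] == NULL)
        cl_entities[num] = calloc(1, sizeof(entity_t));
    return cl_entities[num];
}

/* Map startup: allocate the first 512 up front. */
void cl_init_entities(void)
{
    for (int i = 0; i < INITIAL_CL_ENTITIES; i++) {
        entity_t *e = cl_get_entity(i);
        assert(e != NULL);
    }
}
```

Slots past 511 cost only their pointer until something actually references them.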
I've decided that the render to texture warp update will very probably have to go. I'm not one bit happy with the FPS loss on the Intel 910; and the video RAM overhead is bordering on the extreme. It requires a target texture 4 x the size (in each dimension!) of the source (otherwise miplevel 1 of the source is used for creating miplevel 0 of the target - ugleeeeee city!)
This means that for 512 x 512 source textures we have a 21 MB video RAM overhead per texture! The start map will eat 84 MB video RAM! I tried to ease this off by using 16 bit render targets, but it still works out as 42 MB for start. I'm considering that to be quite unacceptable.
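One reading that reproduces these figures: a 512 x 512 source needs a 2048 x 2048 target, and a 32 bit target with a full mip chain comes to about 21.3 MB (taking 1 MB as 1,048,576 bytes). The helper below is a hypothetical calculator for that, not engine code.

```c
#include <assert.h>

/* Total bytes for a texture plus its full mip chain down to 1 x 1. */
unsigned long mip_chain_bytes(unsigned width, unsigned height,
                              unsigned bytes_per_texel)
{
    assert(width > 0 && height > 0 && bytes_per_texel > 0);
    unsigned long total = 0;
    for (;;) {
        total += (unsigned long)width * height * bytes_per_texel;
        if (width == 1 && height == 1)
            break;
        if (width > 1) width /= 2;
        if (height > 1) height /= 2;
    }
    return total;
}
```

mip_chain_bytes(2048, 2048, 4) is 22,369,620 bytes, or roughly 21.3 MB, and four such targets land in the region of the 84 MB quoted for start. With 2 bytes per texel the chain is 11,184,810 bytes, which is where the 42 MB figure for 16 bit targets comes from.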
Now that I know the Intel 910 is a respectable enough Pixel Shaders 2.0 performer (it can run RenderMonkey's DoF sample with 9 passes and 120,000 triangles at 10 FPS, and very easily hits 250 FPS for more typical Quake-ey effects: one or two passes, triangle counts in the order of 10,000 or so), it seems an option worth exploring. I am worried though that I might be locking out owners of older GeForces here - even a GeForce 4 is limited to 1.3.
Another option is to revert to surface subdivision and use a vertex shader to evaluate texcoords from the vertexes, which will work even on cards that don't support vertex shaders, as Direct3D can emulate them in software. It's fast too - that 250 FPS quoted above is with a software emulated vertex shader.
Of course this doesn't mean that the engine speeds will suddenly increase by a factor of 3 or so, there's a lot more going on in Quake aside from the render!
Posted by mhquake at 10:17 PM
Been mostly troubleshooting the weird alias model verts bug; it seems as though I'm overshooting a memory allocation somewhere, possibly in the alias model loader, although I fear that as my Hunk_Checks run OK just after R_NewMap it might be at runtime.
The worst of all bugs, a pointer bug. Time to enable PARANOID and do some checking I think...
Posted by mhquake at 12:02 AM
Monday, January 5, 2009
Since it will be another short while before I release an engine update, I've dusted off some old code and made it available. Only this time it's called "MHLight".
The readme says it all:
Not a general purpose lighting utility; this is the latest version of MHColour. I'd originally updated it back in 2007, but never got round to releasing it then. A few final bits of work and here it is. Grab it from the links list over there to your right.
Use it to generate a LIT file for (almost) any BSP; information from fullbright colours in the textures and from lava and slime textures is used to create colour values for nearby lights. Certain special lights (torches, etc) are also coloured.
Bugs? Limitations? I'm sure it has plenty... here's what I know of for sure:
* It seems to become unhappy with Hipnotic rotating brushes - not sure why and can't really be bothered to find out right now. It won't crash, but it won't light the brush either.
* It's based on the original ID Light utility, so all limitations and bugs that apply to that also apply to this. It'll probably choke on really large BSPs. One exception - I added "-extra4" as an option.
* As well as generating a LIT file it will also relight the BSP. LIT files created by this probably won't be redistributable. The reason for this is that I tried it the other way and it didn't work; packing the LIT data into the same face offsets used by the original produced weird results on some brushes. It only takes about half an hour to relight all of ID1 with -extra4 on a modern PC anyway, so it doesn't matter too much.
If anybody does anything cool with this I'd love to know; you can normally find me hanging around on Inside3D.
Posted by mhquake at 7:25 AM
Sunday, January 4, 2009
I think I've got the last piece of ugly Hipnotic and Rogue hackery behind me. The weapon layouts proved to be less painful than I'd feared (I seem to remember last time round I actually understood what was happening in the Hipnotic layout, which is a worrying thing). No idea how well or badly it's going to behave when I start testing out customization of the layout, but time will tell.
I can easily see one possible reason why ID moved the HUD layout into the game code from QII onwards - "let's not have any of that again".
I seem to have another bug in alias model verts. From time to time they go a bit wacky on me, and it's not consistent. I've reverted the code here to the old store-as-byte-plus-translate-and-scale method; seeing if that was the cause was one reason but memory usage was the main one. Anyway, no joy, occasionally one frame will go totally weird. I'm afraid that I'm going to have to put an MDL viewer into the engine to get to the bottom of this... oh well.
Speaking of memory, I'm moving a lot of the storage over to dynamically allocated so that I can remove a lot of the old static arrays and their attendant hard limits. What I'm aiming for is a bit more than just a "high capacity" engine; more like an "unlimited capacity" engine. Of course there are going to be protocol considerations which will restrict how far I can go with this, but the initial goal of removing limits on client-side capacity has gone well so far. After I release I'll probably do server-side, then worry about joining them up after that again. I'd already done most of the groundwork for this in the GL engine, so a lot of it is just code-porting.
Current heapsize requirement for a typical ID1 map is in the order of 15-20 MB, but that includes all models and sounds, entities, efrags, and a lot of other stuff that was previously static.
Posted by mhquake at 11:55 PM
Been slowly getting there; spent a good chunk of time working on the Save/Load menu. I've been aware ever since I released 1.2 that it was in need of some performance and usability improvements, basic things like not reloading the entire list every time you enter the menu, preserving the old selection position, and so on. There were also some bugs present; one case where I'd forgotten to replace "/save/" with host_savedir.string, and another (that had wider implications) where I'd somewhat stupidly assumed that the Windows API CreateDirectory function could take quotes around the directory name (e.g. for if it had spaces in it).
Lesson 1: what's normal behaviour in the shell is not necessarily the way things are in the API - make no assumptions.
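Quoting is shell syntax; the API takes the path string literally, spaces included. A small helper along these lines (the name and signature are hypothetical, and the actual CreateDirectory call is left out so the sketch stays portable) would remove a surrounding pair of quotes before the path goes anywhere near the API.

```c
#include <assert.h>
#include <string.h>

/* Copies in to out, dropping one surrounding pair of double quotes if present.
   Returns 1 on success, 0 if out is too small. */
int strip_quotes(const char *in, char *out, size_t outsize)
{
    assert(in != NULL && out != NULL);
    size_t len = strlen(in);
    if (len >= 2 && in[0] == '"' && in[len - 1] == '"') {
        in++;        /* skip the opening quote */
        len -= 2;    /* and ignore the closing one */
    }
    if (len + 1 > outsize)
        return 0;
    memcpy(out, in, len);
    out[len] = '\0';
    return 1;
}
```

A directory name such as "my saves" (with the quotes typed in) comes out as my saves, which can then be passed on unchanged.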
The remaining item is to decide how to handle saves from the command-line. Right now the selective rebuilding of the menu lists and checking for changes to host_savedir is confined to the menus.
The code for this menu has become quite complex and messy by now, and no doubt another slew of bugs will emerge when it goes public.
Other news; I've decided to get down off my high horse and drop the whole cvar and cmd protection thing. At the end of the day what it boils down to is that Quake is so old a game that anyone using it is going to know what they're getting anyway, so the whole concept seems a bit silly.
With the advent of LIT file support I really need to check out a number of different machines. I had something funny last week where the lightmaps were RGB on one machine but BGR on another, but I think I was working off an older codebase at the time. I definitely want to confirm that all is OK before releasing anything.
So when will I release 1.3? Can't commit at the moment, there's quite a bit that needs to be tightened up, but otherwise it's pretty much in "feature freeze" mode. This was a problem I had with the GL engine earlier on, where I could never discipline myself to stop adding new things, but thankfully I seem to have it licked by now.
Sometime next week is a possibility, all going well.
Posted by mhquake at 12:03 AM
Saturday, January 3, 2009
While taking a break from something moderately large I have brewing (...of which more later...) I'm putting a lot of work into console editing. The objective here is to have a somewhat less aggravating console, which actually lets you do useful stuff. It's a mixture of old QuakeSrc tutorials (which have to be worked over to fit in with other changes I've made) and original code, with most of the code being original.
This is probably the first time in about 6 years I've put a QuakeSrc tutorial into my code, but the need for the feature is there so why reinvent the wheel?
Console features now include:
- Enhanced TAB completion; you will get a list of all commands or cvars (sorted alphabetically, indicating which type each is) that match what you type. Press TAB repeatedly to cycle.
- Cursor stays at the end of what you've partially typed so you can fine-tune.
- Left and right arrow keys work as expected in the console; Delete and Insert functionality.
- Home and End functionality restored for backscrolling.
- Home and End bring you to the start or end of the line if you're in the middle of a line rather than scrolling the console.
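The TAB completion behaviour in the list above (collect every command or cvar matching what was typed, sort alphabetically, cycle on repeated TAB) might be sketched like this. The struct, the type tags and the function names are assumptions for illustration rather than the engine's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One completable name; type is 'C' for a command, 'V' for a cvar. */
typedef struct {
    const char *name;
    char type;
} con_item_t;

static int cmp_items(const void *a, const void *b)
{
    return strcmp(((const con_item_t *)a)->name,
                  ((const con_item_t *)b)->name);
}

/* Copies up to max items whose name starts with partial into out,
   sorted alphabetically. Returns the number of matches stored. */
int con_find_matches(const con_item_t *all, int count, const char *partial,
                     con_item_t *out, int max)
{
    assert(all != NULL && partial != NULL && out != NULL);
    size_t len = strlen(partial);
    int n = 0;
    for (int i = 0; i < count && n < max; i++) {
        if (strncmp(all[i].name, partial, len) == 0)
            out[n++] = all[i];
    }
    qsort(out, (size_t)n, sizeof(out[0]), cmp_items);
    return n;
}

/* Index of the next match when TAB is pressed again; -1 if there are none. */
int con_next_match(int current, int nmatches)
{
    return nmatches > 0 ? (current + 1) % nmatches : -1;
}
```

Typing r with r_shadows, record and r_novis registered would list r_novis, r_shadows, record in that order, and a further TAB on the last entry wraps back to the first.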
I know I said that I wouldn't do it again, but with me being me I went ahead and did it anyway. I'm writing a scriptable HUD system. Not some flaky homebrew scripting language, nor some horrific concoction of XML (ugh!) or whatever, but good old cfg files and cvars. This will entirely replace the status bar, but the default layout will replicate the classic status bar look. I actually have most of it done, aside from weapons and ammo counts (having to tackle the Hipnotic and Rogue hackery in there has never been a pleasant experience).
You'll be able to independently position just about every single HUD element - for convenience I've grouped the inventory items (into keys/sigils/items/weapons/ammo counts), but there's still some flexibility with their positioning. Then just issue a "savehud myhud" command, and put "exec myhud.cfg" into your autoexec. There will also be a defaulthud cvar which you can use to specify a hud file to load automatically for you.
Because there are so many HUD cvars I'm not saving them to the config.
I have ambitions for this, not sure if anyone's ever gonna use it all, but it scratches an itch for me, so mission accomplished. But someday I hope to write a layout interface for it, where you can visually position all of the elements.
Posted by mhquake at 12:13 AM
Thursday, January 1, 2009
Inspired by a recent post on Inside3D, I'm going to be extending the cvar protection system to also include commands. This was a case where somebody was setting a custom bind for the F1 key through QC. Now, I'm of the opinion that keybindings belong to the PLAYER, not to the mod developer, and seeing this kind of thing makes me angry. What if I had my keyboard set up exactly the way I wanted it, and a merry little mod comes along and stomps all over my keybindings? What if I had F1 bound to +attack?
I do appreciate the power of QC for sure, but there are some things that are a step too far, and this is one of them. Mild by comparison to the worst that could be done (the thought of allowing file access through QC gives me the creeps), but still Just Not On all the same.
Postscript (of sorts)
I will probably be releasing with the protection system disabled by default. I'd recommend that everyone who downloads enable it, though.
Posted by mhquake at 9:26 PM
Here we go again!
As I said before, most of these changes are very much behind the scenes, there won't be much new to look at.
- Various changes to the cvar system for better reliability; protection from QC abuse for certain cvars (cvar_allowqc, disabled by default).
- Changed number of console notify lines to 5.
- Protected several cvars from QC abuse (disabled by default).
- Cvar-ized sound_nominal_clip_dist (default 1500).
- Added sound options menu.
- Moved "Customize Controls" to Input menu.
- Fixed mapshots (broke during viewport cleanup).
- Added "mapshot" command for user to take a mapshot wherever they wish.
- Added "r_automapshot" cvar (default 0) to take a mapshot automatically on entering a new level (in maps directory).
- Added mapshot drawing to serverlist menu.
- Moved sound to DirectSound 8.
- Removed parts of old "crappy Windows multimedia base".
- Modified somewhat weird sound startup to be more standardised.
- Removed sounds from cache memory system.
- Got rid of COM_LoadCacheFile (never used).
- Removed entire Cache system.
- Fixed "items/damage2.wav is not precached" bug on maps where you do impulse 255 with no quad in the map as standard.
- Removed Zone and Hunk memory systems; replaced with simplified Heap system.
- Fixed bug where larger player skins would crash R_TranslatePlayerTexture.
- Implemented texture flushing; those unused after 4 maps get flushed.
- Added LIT file support.
- Added -quoth support.
- Removed hard limit on number of warp update textures.
- Adjusted r_wateralpha so it's only effective if scene has translucent water or if r_novis is 1; made it an archive cvar.
- Bumped MAX_DLIGHTS to 128.
- Added r_monolight cvar to set coloured light off (default 1), requires map reload for static light.
- Changed rendering order for entities under translucent water.
- Added extra dynamic lights on many ents and tents, controlled by r_extradlight cvar (default 0).
- Fixed bug where lightning bolts weren't visible.
- Fixed model allocation memory leak (unlikely to be hit, but dangerous all the same!)
- Moved client-side entity lists to dynamic allocation (still fixed size with hard limits though...)
- Tightened up client side entity allocation memory overhead, 512 initial allocation, over 512 (up to 8192) allocated on-demand.
Posted by mhquake at 6:07 PM
1.2 had some good ones in it; these are currently being fixed for 1.3:
- Mapshots are broken; they don't crash the engine but they do take a shot of the top-left corner of the full display, rather than of a reduced size display. Mapshots are kinda set up for the server list menu but not properly finished.
- Player skins larger than the standard size crash the engine with a stack corruption in R_TranslatePlayerTexture.
- No texture flushing! You will eventually run out of video memory if you run enough maps.
- Lightning bolts are not visible; the entity struct has an alphaval of 0 when originally set up and I forgot to adjust it to 255.
- Nasty memory leak in model allocation; if you load more than 512 unique models over the course of a session you'll crash.
- Cvar toggle controls in the menus don't work right at all.
I've also been giving the matter of the HUD some thought. I want to retain the Trad HUD as default, but I also want to provide something a little more up-to-date looking, so that might also make it in.
Posted by mhquake at 2:12 PM