Worked over some of the memory allocation relating to short-lived objects at runtime. Previously these would mostly have lived in their own pre-allocated memory buffers; now I just have a single big buffer (it starts at 1 MB but can grow to 512 MB if needed) that these objects pull memory from as required, and that resets back to the 0 mark at the end of each frame. The same buffer also doubles up as a temp staging buffer for load-time stuff.
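For anyone who wants to see the shape of the idea, here is a minimal sketch of a per-frame buffer of this kind. It is not the engine's actual code: every name in it (`frame_arena_t`, `frame_alloc`, `frame_reset` and so on) is made up for illustration, and it fakes the "can grow" part by grabbing the maximum size from `malloc` up front and tracking a separate committed size, where a real engine would more likely reserve address space and commit pages on demand.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch of a per-frame bump allocator; all names are illustrative. */
#define FRAME_ALIGN 16 /* alignment of each allocation's offset from the base */

typedef struct {
    uint8_t *base;     /* start of the backing memory */
    size_t committed;  /* bytes currently treated as usable (e.g. 1 MB at start) */
    size_t reserved;   /* hard upper limit (e.g. 512 MB) */
    size_t mark;       /* current allocation offset; 0 at the start of a frame */
} frame_arena_t;

/* Assumption: the whole maximum is taken from malloc here purely to keep the
   sketch portable. Returns 1 on success, 0 on failure. */
static int frame_init(frame_arena_t *a, size_t initial, size_t max)
{
    a->base = malloc(max);
    if (!a->base) return 0;
    a->committed = initial < max ? initial : max;
    a->reserved = max;
    a->mark = 0;
    return 1;
}

/* Hand out 'size' bytes by bumping the mark; no per-object free exists. */
static void *frame_alloc(frame_arena_t *a, size_t size)
{
    size_t start = (a->mark + (FRAME_ALIGN - 1)) & ~(size_t)(FRAME_ALIGN - 1);
    if (size > a->reserved || start > a->reserved - size) return NULL;

    size_t end = start + size;
    while (a->committed < end) {
        /* Grow by doubling, clamped to the reserved maximum. */
        size_t grown = a->committed ? a->committed * 2 : FRAME_ALIGN;
        a->committed = grown > a->reserved ? a->reserved : grown;
    }
    a->mark = end;
    return a->base + start;
}

/* Called once at the end of each frame: everything allocated is dropped at once. */
static void frame_reset(frame_arena_t *a)
{
    a->mark = 0;
}

static void frame_shutdown(frame_arena_t *a)
{
    free(a->base);
    a->base = NULL;
    a->committed = a->reserved = a->mark = 0;
}
```

In use you would call something like `frame_init(&arena, 1 << 20, (size_t)512 << 20)` once at startup, `frame_alloc` wherever a short-lived object needs memory, and `frame_reset` at the end of every frame. Because growth never moves the base pointer in this sketch, anything handed out earlier in the frame stays valid until the reset.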
This gives a much cleaner, more robust and slightly faster end result. Framerates are up a percent or so on account of this, and objects just come in and out of the scene perfectly naturally and normally. A lot of slightly hairy cases where I knew it should have been OK but still had a bad feeling about it have also gone away.
That's a fairly cool achievement that has ramifications beyond its immediate effect. A key thing here I want to talk about is that percent-or-so performance improvement.
If I can get a 10% performance improvement with nasty, messy and unmaintainable code - I probably won't bother.
If I can get a 1% performance improvement with something clean and neat - I very probably will.
There's a school of thought that says that you shouldn't sweat over the 1% improvements; focus on the big stuff, and so on. That's very valid in many ways - we are firmly in sub-millisecond territory here.
On the other hand, sub-millisecond can be incredibly important. Depending on your hardware and the load that the map you're running puts on it, that sub-millisecond improvement can be critical.
Let's look at a hypothetical but not too far-fetched scenario. You're running at a 60 Hz refresh rate with vsync enabled. For argument's sake, let's say that each frame needs 16.6 milliseconds to draw; all is good and you'll get your 60fps.
Now let's say that something happens and as a result each frame is now taking 16.8 milliseconds. That's a 0.2 millisecond difference, but it's enough to cause you to miss a vsync interval (at 60 Hz you get roughly 16.67 milliseconds per refresh). Each frame now has to wait for the following refresh, so suddenly you're no longer running at 60fps, you're running at 30.
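The arithmetic can be sketched in a few lines. This assumes plain double-buffered vsync, where a frame that misses a refresh waits for the next one; triple buffering or adaptive sync would behave differently. The function name `vsync_fps` is made up for illustration.

```c
/* Hypothetical helper: effective framerate under strict double-buffered vsync.
   The render time is rounded up to a whole number of refresh intervals,
   because a finished frame cannot be shown until the next refresh. */
static double vsync_fps(double render_ms, double refresh_hz)
{
    double interval_ms = 1000.0 / refresh_hz;

    /* Manual round-up of render_ms / interval_ms (avoids needing libm). */
    long intervals = (long)(render_ms / interval_ms);
    if ((double)intervals * interval_ms < render_ms) intervals++;
    if (intervals < 1) intervals = 1;

    return refresh_hz / (double)intervals;
}
```

With the numbers from the scenario, `vsync_fps(16.6, 60.0)` fits in one interval and gives 60, while `vsync_fps(16.8, 60.0)` spills into a second interval and gives 30.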
That's a sub-millisecond difference that's caused a loss of half of your framerate, and when viewed in these terms sub-millisecond starts looking very important. Obviously the rules have to be interpreted in context, and where context says that what otherwise seems like a no-brainer is in fact arrant nonsense, then you need to ditch your preconceptions and view things differently.
All the same, that would in no way justify the scenario of 10% from nasty code I spun earlier on. That's all context and tradeoffs again, and - in that case - it's more likely the case that if creating a mess gets you a performance improvement then you have some fundamental design flaw which needs to be corrected first.
All interesting stuff.
Wednesday, June 6, 2012
Memory Allocation
Posted by mhquake at 6:52 PM
2 comments:
Are you essentially using a garbage collector now? Does Quake use one already? Not too familiar with the guts of this engine.
It is effectively a (simplified) garbage collector, yes. Quake already uses something slightly similar, but it's per-map whereas I'm now doing it per-frame (with the appropriate modifications to enable it to work fast, of course).