Monday, July 9, 2012

Update

I've been doing some work on code quality recently.  Previously I'd been clean with warning level 3, but with 10 or so warnings disabled; I've now re-enabled 8 of those and cleaned up where required.  The remaining two warnings (conversion/possible loss of data and a signed/unsigned mismatch) are marked for cleaning up later on, but right now generate a very poor signal to noise ratio.

Before re-enabling those 8 warnings I got as clean as I could go with level 4, which amounted to compiling with only one warning (a nasty setjmp that there's no really good way of avoiding with the current architecture) but needing to disable a couple more in some specific cases.  Level 4 is really picky, with some of the warnings it throws being of dubious value.  Unused formal parameter is one that I had a lot of (and did clean out some of), but that's something you're going to be doing anyway while developing.  Another construct that level 4 throws on is "i = i", which I use frequently enough to enable me to set breakpoints at the end of functions quickly and easily and without side-effects.

I've been compiling with "warnings as errors" for at least the past 6 months, and highly recommend it to anyone, but I'm unsure of the merit of warning level 4.  It's more convenient to develop at level 3 and do a level 4 pass semi-regularly as a form of sanity check; with level 4 on all the time it's like being hauled in front of the class by the headmaster on account of getting your Latin past tense variants mixed up.

On to code analysis.  I'm aware that Carmack has been advocating it in recent times, and I gave it a try myself sometime last year, but the tool I used at the time (cppcheck) wasn't really giving me useful information and was slow as molasses.  Having access to premium editions of Visual Studio in work allowed me to try a few others, with one being a trial version of PVS Studio and the other being Microsoft's built-in Analyzer.

PVS Studio was nice enough to use, and (thankfully) the trial version isn't too hobbled - at full functionality but limited to 100 clicks it's good enough for a few passes over a project the size of DirectQ.  It did find some important stuff, and I got reasonably clean with it, but when I went to export the full log file my eyes opened.

There were still some actionable items in it for sure, but for the most part it was riddled with low-level warnings about how this 32-bit program, built for 32-bit platforms, using a 32-bit compiler, linking to 32-bit libs, had some constructs that weren't 64-bit safe.

No shit Sherlock.

Couple that with nitpicking over C vs C++ style casts and advice to use the -Ex version of some Windows API calls, and I decided to call it a day with that.  At least it didn't nag me about endianness issues on a purely x86/x64 platform (note to PVS Studio developers - don't start getting ideas...)

Microsoft's tool by comparison was an awesome experience.  It ran quite a deal slower than PVS Studio (but still at light-speed compared to cppcheck) and the stuff it was digging out was real, actionable, problematic code.  Stuff like locally defining the variable name "cl" for an auto-completion list while it's also globally defined as the client state, potential buffer overflows in cases where the original buffer definition may have been many levels up the call stack and/or even in a completely different module, potential NULL pointer dereferencing, etc.

I didn't even try to get completely clean with it.  For one there are some false positives (but for the most part they do prompt you to go over the code and re-check your assumptions which is quite good) and for another thing some of the external library code I'm using turned out to be incredibly dirty indeed.

Two other observations spring out of this.  One is that compilers can and will swallow (and even make sense of) some of the most awful junk, and two is that my own code quality has visibly improved in recent times - some of the gnarlier older code I had written threw up a lot of problems, but some of my more recent code (file system, surface refresh, MDLs, particles) was squeaky clean.

I'm adding a pass through that to my "do semi-regularly" list.

None of this translates into "totally bug free", of course, and it would be foolish to expect that.  What it does translate into is an extra layer of protection against stupid, careless and embarrassing mistakes.  I've long since lost my programmer ego and I'm well aware that just because I've written a few lines of code that happened to work it doesn't suddenly make me all-capable and all-knowledgable.  I make mistakes and it's great to have tools that help me at least stand a chance of preventing some of them from going public.

No comments: