Tuesday, February 7, 2017

Custom MSBuild Loggers for Blender Devs on Windows

Given that it was a long weekend here in NZ, I decided to take a little time off to work on a little "weekend project" I'd been wanting to work on for ages. (BTW, it wasn't the only weekend project I worked on this weekend, but it's the only one that will be of much interest to anyone else :)

Here's an obligatory screenshot showing the results of my work.

 The "cleaned up" Blender build output as a result of my project: blender_msbuild_loggers


The Story - Motivations
Ever since we moved to effectively only being able to use CMake/MSVC for building Blender on Windows, I've hated the way that the build output looked. Actually, it went beyond simple aesthetics, and veered more into the territory of workflow-impeding-usability-issues.

Specifically, in its native configuration (as seen in master currently), cmake/msvc (via "make.bat", and a "modified" version at that) is an incredibly verbose beast (see this example log file). Every time you compile, it will spew out a lot of crap about how it's skipping regenerating the build files, before proceeding to enumerate the fact that it has found that each and every "project file" (i.e. each internal module/library in Blender becomes a separate project, linked into the big project that builds the actual app) is now "complete", even if said libraries have not changed! Granted, scons also spewed out a lot of stuff, but the CMake/MSVC combo seems to out-do it several times over.

Now, it may have been ok for it to be this verbose if that verbosity didn't come at the cost of impairing the usability of the buildsystem. Specifically, there were 2-3 really critical problems caused by cmake's verbosity:
 1) It was really hard to see what files had (or hadn't) been recompiled. For example, I challenge you to quickly find what files got compiled in the log file linked to above. (For an extra challenge, try viewing this in a Command Prompt window, with the clunky scrolling, and smaller window size).

The problem, you'll quickly discover is that all the output lines are formatted in practically the same way, with little in the way of visual markers to help make the compile lines stand out from the "#Q&*^@#&*^#%&.vcxproj -> *$@^#&*#$##$@#$#^&*^@#^.lib" lines

Now, imagine, that a half a dozen more files have also been compiled, and that a few of those may have spawned some "innoculous" warnings (e.g. like the "non-standard extension: zero-sized array" warning seen in the screenshot above), and it quickly becomes a nightmare to quickly + easily see if CMake had successfully and dutifully recompiled the files you'd asked it to... this matters, since you often do want to check if you'd actually remembered to save the file before recompiling - especially when things still aren't working after the nth iteration of your fix.

 2) Due to all the extra fluff, it's hard to quickly find warning/error messages. With all this junk you have to pour through, it's all too easy (far to easy actually) to simply scroll right past many warnings that you really would've liked to know about (and fix). In fact, it's happened quite a few times recently.

(Also, I must add, that, quite ironically, although MSVC tends to be quite verbose with its "fluff" (in particular, it has a special affinity for being 100-page-essay picky/finicky about #pragmas, "signed vs unsigned", and "non standard extensions"), it tends to be far too silent about warnings/errors that you typically do care about. In my experience, stuff like variable shadowing and unused stuff is often not reported.)

3) The chaotic/unordered multi-threaded output. What doesn't help matters is that all the output ended up being a jumbled mess, due to the multithreaded/multi-process defaults for compilation. TBH, there are days when I'd rather go back to slower single-threaded builds (like what I used religiously in scons for years) vs having the whole thing recompile quickly, simply because it makes it easier to see what's gone wrong and where, and to be able to quickly cut things off when they go wrong (instead of continuing to waste compute time).

AFAIK, there are no cures for this last problem, so I won't dwell more on it here.

However, I will note that personally, I've replaced the "/mp"  argument to the msbuild command with "/maxcpucount:5" so that my machine doesn't lock up while compiling. (Note: Doing "/mp:5" doesn't actually work - I've tried it, so I know.) Why 5? Well, my machine uses an Intel i7-4700MQ, which AFAIK has 4 physical cores (presented to the OS as 8 of 'em, via the hyperthreading stuff IIRC). Now, left to its own devices, msbuild will spawn 8 processes, which will then call CL.exe (i.e. the C-compiler), which will in turn spawn 8 compiler instances to get through the list of files per target. The problem is this: with 8 processes simultaneously bolting along at full throttle, you're going to have fewer slots for UI/other apps (e.g. media players, youtube, etc.) to get a look in. Meaning, you guessed it, laggy/jittery/jumpy mouse movements and unresponsive apps (e.g. you get a freeze every 2 seconds, for 3 seconds), to stuttering playback (highly annoying). Oh, and the fan goes into overdrive, and sounds like a jet engine about to explode... that's perfect for frying ants, but not really something you want to have happen multiple times a day. Playing around, I've found that 5 gives me the best balance between acceptably-long compiles and a usable computer.

---

So, for ages I've been looking for ways to modify the log outputs - to filter and restyle them - to make the saner to work with. (And also, since getting lots of stuff to print can impair performance, if we manage to prune the output, we may be able to get slightly faster builds too!)


The Solution
Several months ago, I started looking into solutions for this during a bit of downtime (and/or after a few particularly frustrating episodes with msvc's logging diarrhea). That's when I stumbled across the CustomLogger techniques that can be used to manually specify your own logger to only report what you want to see, in the way you want it (subject of course to the limits of what info MSBuild is willing to give you).

Some useful links if you're curious:
* Sample/Tutorial -  https://helloacm.com/c-custom-logger-sample-for-msbuild/
* ILogger/Logger Docs (MSDN) - https://msdn.microsoft.com/en-us/library/ms171471(v=vs.120)
* A whole bunch of loggers others have written - https://github.com/mikefourie/MSBuildExtensionPack


My Solution - Development Process/Notes
After a good 7-8 hours of work, I managed to bash together a first working version of a custom logger that fixes the gripes I've got with the default logger. The main features/improvements are:
 1) It does not report those infernal ".vcxproj -> .lib" prints that are good for nothing
 2) For each file that it compiles, it now says 'Compiling => "filename.c"', just like Scons used to (I've always liked this thing about my old buildsystem)
 3) As for modules being relinked/updated, it now prints a 'Linking Lib => "bf_libname"' line, similar to what scons used to do
 4) It does not report all the "Cmake does not need to run" cruft at the start. Only when stuff has changed should it need to print anything there.
 5) I've kept the colour coding for warnings (yellow) and errors (red) that the default logger had. This was a genuine improvement over scons, and I wanted to keep it; doing so however, proved to be an interesting exercise (see below), and led to my decision to actually extend/improve the usage of color shading of the output making the overall result much much better.
 6) It reports number of warnings/errors in the code, so you have an idea of how much stuff to look out for.

Now for some screenshots of the improvements in action:


In case you missed it before, here's a link to the repo where I've uploaded the source code for the logger:

You should be able to download this, then run the following prompt:
   $ build && deploy
to get the logger ready to use. Then, you just need to make a copy of make.bat, and add the following line to the first "msbuild" command you find in that file (near the bottom):
    /logger:CustomLogger.dll /noconsolelogger

You will need to modify the paths in deploy.bat to point to where you've got your Blender source code (root directory, next to the make.bat file).

I've only tested this with VS2013, so your mileage may vary.

--------------

Dev Notes - Color
I really wanted to preserve the color coding for the warning/error prints, so I went off looking for info about how this stuff worked. It turns out that C# does offer some really handy API's for changing the color of the console output (pity it's not available in Python AFAIK), despite being limited to a certain small fixed palette.

In particular, I found the following pages particularly useful:
* MSDN Docs - https://msdn.microsoft.com/en-us/library/system.console.foregroundcolor(v=vs.110).aspx
* Tutorials/Examples (with pictures) - https://www.dotnetperls.com/console-color


Dev Notes - Events
The way that these loggers work is by subscribing to certain types of events generated by MSBuild during the build process. It took me several hours of experimenting with different event types and their parameters to find the combination that I wanted, and to find the right ways of filtering/processing/formatting them for my needs.

In short, if you're going to do this for another project (where you're wanting to just filter/format/augment the output that the default logger provides), just focus on the MessageRaised/BuildMessageEventHandler stream. It contains all of the things you want, along with a few extra things you don't (which for some bizarre reason, show up anyway).

You'll also want to include the WarningRaised and ErrorRaised streams too, since the Logger baseclass actually implements "do nothing" for each stream if you don't provide any handlers yourself. (You may be tempted to just not include the "/noconsolelogger" flag above to get around implementing these, but that's a mistake, because without that option, the default logger will just keep printing all of the things it does already, completely beating the purpose of your special filtering logger)

The long story:
Since my original objective was to get rid of the ".vcxproj -> .lib" prints, I thought that these were getting fired when each project was finished. So, I hooked up events to the ProjectFinished events and nothing else. Whoops. It turns out that "ProjectFinished" fires after each ".vcxproj" file has finished, BUT, it turns out that there's no way to tell if anything happened in that project. Besides, there were also a lot more useless garbage projects in there (aka "ZERO_CHECK.vcxproj" * 1000000  + "ALL_BUILD.vcxproj") that show up at random places/times all over the show. Also, like I said above, if you don't implement handlers for your Warnings/Errors yourself, you're not going to see any at all. So, back to the drawing board!

The next thing I tried was to hook into the "TargetStarted", and then "TaskStarted" events instead. I figured that instead of filtering the projects, let's just forget about the projects/modules, and only report when it's started compiling source code files, as well as the warnings/errors coming out of those.

Whelp... those two were also quite a disaster.
- TargetStarted only spewed out heaps of crap with long GUID's, and generally unintersting stuff that only build-system-optimisation engineers might find interesting when debugging dependency errors.
- TaskStarted was a bit more approachable, but it too spewed out heaps of really fine-grained itty-bitty minutae (see here), such as reading/writing stuff from files, deleting temp files, creating temp files, etc. The one "good" thing that did come out of this whole exercise though, was that I finally glimpsed what I really needed/wanted in the first place - the "Message" task events.

Finally, I was starting to see some stuff that looked a lot more like what I'd been used to seeing. Apparently, all the standard prints are "Messages". The example files "message_info.txt", "message_info2.txt", and "message_info3-dna.txt" offer a taste of what these events were.

So here's what I found out about them:
* "SenderName" - This is the text you see at the start of each line before the colon. In short, this tells us the type of event, and where it came from.
   -> "Message" is a generic piece of text,
   -> "CL" is the compiler,
   -> "LIB" is the module linker,
   -> "Lib" is the big linker, and
   -> "CustomBuild" are things that we (or rather Sergey/Brecht/Campbell/the Jacques B (?)) hacked into the Blender CMake files to support the Blender-specific things we do (e.g. running makesdna/makesrna, etc.). These ones usually have interesting info, but we need to filter them.

* For CL/LIB/Lib events, there are actually two events generated. Usually, the first echos the commandline (so you'll want to suppress this, unless printing verbose-ly to debug when your code won't compile), and the second gives us the name of the file being acted on. I later found that the "name of the file" line was actually the first line of the stdout, with any subsequent events being "leftovers" from the warnings that didn't get lumped in with the warnings they were created for. (Unfortunately, with the multi-threaded output, there's nothing we can do about the mess this causes).

* "e.Message" - This contains the text that usually gets printed from this event. If we want to do anything different, we'll have to pull all the info out of here, as we can't get it anywhere else. Otherwise, it can be just printed as-is.

* "e.Importance", "e.File", etc. - All the other fields present are completely useless from our perspective. Importance is nearly always 1 (well, you do want to filter these, or else it gives you all the low-level "creating/deleting tempfiles" garbage), but file/linenumber/etc. are always empty.  For example,  message_info 1 (the format there was: "SenderName: [file:line] <subcategory>")


Dev Notes - Testing This
In order to generate sufficient buggy code to force lots of recompiles (with interesting stuff), I ended up making a few mock changes in two files.
1) I added a whole bunch of bad code (e.g. using uninitialised vars, bad type assignments, unused code, etc.) to a file (to trigger some warnings - to see how well you can spot them). As it happened, I had been about to start work on a GP bugfix, so that's what I used.

2) I added/removed whitespace (i.e. adding/removing a blank line) from one of the DNA files, to trigger a DNA/RNA rebuild, thus forcing the special custombuild operations to fire up (so that I could check if everything was fine -> message_info3-dna" was the result).

By commenting/uncommenting excess code changes in those files, I could force decent-sized recompiles each time (without having to recompile the entire tree - as touching one of those blenlib headers or DNA_scene_types.h instead would have done). It was a struggle though at times to remember to keep cycling everything, so once in a while, it would do a "useless" run where nothing needed changing.

--------------

Future Development Plans
The logger as it stands is still very much a work in progress still. It is however a lot better than what we have/had by default, and I plan to keep tinkering with it as I start actually using it for serious work.

The following things are planned (as of time of writing). I may end up implementing a few more of these before the day is over - we'll see ;)
* Improve formatting of the warning/error messages - Right now, I'm using the standard API-provided formatting method for this, which should give you what the default loggers do. However, I personally find these warnings far too verbose for my liking, as they actually make it quite hard to see what's going wrong. So, it's very likely that I will go in and hack up some formatting magic to get them closer to the old (gcc < 4) style messages that I feel were head and shoulders above MSVC's verbosity. (Sadly, the newer gcc 4.6+ warnings have gone the way of MSVC, and are unfortunately growing increasingly wordy/bloaded/spaced out).

For starters, I'd like to prune down the filename report to only show the parent directory + the filename. IMO, with Blender's codebase (which this logger is customised for), there's no reason for a competent Blender dev to need any more info. (Also, for files located in "intern" directories - e.g. blenkernel/intern/action.c, I may make it show "blenkernel/action.c" instead, since that fits my mental model more... this is after all first and foremost a tool for my own usage, that I'm sharing in case anyone else finds it useful ;)

* Some of the colour usage needs some tweaks still - For example, I'd like to distinguish more between some of the "Generating" vs "Running" messages (e.g. the "Generating rna_blah_gen.c ... [5-lines later] ... rna_blah_gen.h" message is a prime candidate for being "de-emphasized")

* Investigate ways to all quickly make it show all the build commands, but without showing all of the other "way-too-verbose-that-it's-filtered" crap that MSVC will print. Currently I've got it attached to the Verbosity=Detailed setting, but I feel like unless I strip out all the other prints it might make, I might be better adding a dedicated handler there.

* Optimised progress prints for the scripts/release file copying. In my own make.bat, I've actually silenced that entire process to stop it from printing anything (since it always spews pages and pages out crap about unpacking the Python libraries, EVERY BLOODY TIME YOU COMPILE), but the problem then is that we have no way of knowing how much longer it'll take before all that copying is done. That is bad since there's now an awkward pause after "Build succeeded" and the final "Blender is ready" print. It's likely I'll have to add a second logger for that process to do this.


Final Words
Hopefully some other Windows-based Blender devs out there might find this useful. If we stabilise this enough (and there's enough interest), I'd be happy to look into including this with the official Blender code (maybe via the blender-tools submodule, or as a precompiled library?)

No comments:

Post a Comment