Wednesday, June 12, 2024

[MusicViz Project] Part 2 - Motivations + Rough Directions

This is Part 2 of what will hopefully be a series of posts documenting my attempts to build a music visualiser for automatically creating interesting dynamic visualisations for the back-catalogue of music I've been writing + recording over the past few years. Last time I checked, there's probably somewhere between 3 and 5 hours of "finished" or "presentable" tracks in total, with most averaging about a minute in length (many come in just under that, around 52-55 seconds), only a few reaching 1:10, and only 2-3 blowing out to ~2:30.

Most notably, there are 2 playlists (or really "albums" by this point) of material I produced during the few months I was holed up in my room writing my thesis. During most of the day and night, I'd be listening to these playlists while slaving away in my text editor, desperately trying to make some progress (some days much more successfully than others); then, to take a break / recharge, I'd write or record some music based on fragments that came to mind. Rinse and repeat for several months. As my thesis grew, so too did these playlists, each of which ended up over an hour long.

For several years, I've been wanting to package these up in a suitable format to release into the world. Currently, only a small handful of these tracks have been heard by anyone other than myself, and certainly not these playlists in their entirety. Yes, granted, the expected audience is probably vanishingly small, as they are certainly not "mainstream" and don't fall neatly into established categories... hence, even if/when I do release these, I hardly expect many people to actually listen. Then again, if anyone's interested, I have actually produced a few more hours of similar / evolved material since then LOL - heck, I'm listening to one of the newer playlists as I write this, and even I am surprised by some of the material I recorded a few years ago.


Anyway, in looking into trying to assemble these into a releasable format, and based on prior experiences generating short music videos for small collections of 3-4 tracks I've put together in the past, what becomes clear is that there are significant logistical challenges to doing this:

1) Trying to come up with a different + suitably matching/representative visual for each track based purely on my photo collection (or on manually hand-created art) is prohibitive at this point.

    Not only is it very time consuming (i.e. we're talking about several hundred tracks - in the 100-200 range for each playlist), but in many cases, I just don't have the necessary material to match. (Also, generative AI art via text-based prompts is out of the question, as that again implies a lot of manual effort tagging each track with a suitable prompt.)

2) Scheduling up all of these things in a video editor tool is a real pain

3) Managing all the assets will also be a struggle...

Hence, this led to the idea of trying to auto-generate as many of these as possible (from either the finished audio, or maybe from the constituent parts if necessary, though that would be even more labour intensive), with the possibility of adjusting / tweaking these to work better if the auto-generated approach yields unsatisfactory results (e.g. the colour palette skews in the wrong direction for the mood).

Fortunately, for most of these pieces, I do have copies of all the individual tracks used in each mix / render, though for many of these, it's quite a lot of work to export all the individual data streams needed for analysis. And analysis is required - while half of my original motivating collection was created using MuseScore 2 (and hence can be exported to MIDI), that doesn't hold for any of the violin layering recordings, for which only the original recordings exist, as they were recorded "live" with no retakes / splicing / editing.


While looking into ways of approaching this project, I've run into a few related efforts that have been hugely inspiring.

   1)  smalin's Music Animation Machine  (YouTube / Website)

Over many years, Stephen has meticulously created a corpus of visualisations for various pieces of classical music using his own visualisation toolset. From what I've read, the gist of his approach is that he first finds a recording he likes, matches a MIDI transcription with the recording, then feeds this MIDI encoding into his tool, where he manually sets up the visualisation in a custom way for each piece. In other words, each visualisation is pretty much custom built, with a lot of manual effort + fine-tuning.

One of the most interesting insights from these attempts is the effectiveness of different visual motifs and devices - while some of these are similar to the ones I have in mind, TBH none quite matches what I'm personally trying to do for my own music (which is fine and expected... since I don't expect many people to have similar-sounding music, crafted the way I do it).

   2) Du Kan's work creating Chinese-landscape-painting-inspired art based on the waveforms of abstract modern music

When I first saw this a few years ago, I really fell in love with it. Yes, the music really wasn't stuff that I liked, but many other elements of this do resonate a lot with me. In particular:

* The horizontal format, flowing along the canvas from left to right - This is something I've always felt should be done (but instead of being a fixed frame / full-piece-displayed-at-once thing, it would be slowly scrolling over the canvas - not in jagged stairsteppy janky fashion, but smoothly gliding across)

* The representation of different elements in the music as birds, clouds, mountains, and trees (and even more so that they could manage to be mapped as traditional Chinese ink painting styled ones) - Yes! Yes! Yes!   This is so freaking awesome to see!  (And really, TBH that's exactly what I'm often envisaging when creating certain parts in my own music - e.g. thin high sustained lines around D6 corresponding to "clouds / wispy-cloud-like elements", or 3-note figures anywhere from A5 to G6  (e.g. d'''16 c'''16 d'''8   or    fis'''16 e'''16 fis'''8) often being bird calls)

Rough Outlines for My Own Approach

So what about the visualisations I'm trying to create with this project?

I don't have a fully formed picture yet (i.e. no crystal clear goal in mind), but I have figured out the following rough principles / elements, from which the rest will be built - based mainly on how it all looks in practice, but also on things like practicality of implementation. (Though really, by this point, if I'm honest, I'm fairly fearless when it comes to difficulty of implementation - "At the end of the day, it's 'just' code..." - I've been wrangling code for decades now, several of them professionally, so yeah... I'm confident I can generally beat most code into shape one way or another eventually, especially if it's something I wrote from scratch myself.)

Basics - Base Layer

At the most basic level, let's first look at what the background will look like.

The basic idea here is that the background will be coloured based on the "average sound" / "dominant tone colour" at that point in the piece.  For example:

v1) The basic colour of the background will be formed from the chord formed by all the harmonic lines that are steadily holding / hovering within a 2nd to a 4th or so around some key note, and/or moving up/down nearby slowly  (i.e. what happens a lot with most of my Violin Layering tracks)

v2) The colours of the background will be formed from the cluster of notes forming the repeating ostinato that keeps providing a dynamic "sonic base layer" upon which the rest of the piece is draped. The idea then will be that this cluster of notes will always be represented, but they will pulse / jostle with each other as they each fire up, with all this happening continuously to form a constantly moving / changing backdrop.
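To make this a bit more concrete, here's a minimal sketch of how the background colour could be blended from whatever notes are currently sounding (the held chord in v1, or the ostinato cluster in v2). Note that `TONE_COLOURS` here is just a placeholder pitch-class → RGB map with made-up values - the real mapping would be the tone colours discussed in Part 1 - and the function name is purely illustrative:

```python
# Placeholder pitch-class -> RGB map (values are made up for illustration;
# the real mapping would come from the tone-colour scheme in Part 1).
TONE_COLOURS = {
    0: (200, 60, 60),    # C
    2: (220, 160, 50),   # D
    4: (230, 220, 80),   # E
    5: (90, 180, 90),    # F
    7: (70, 140, 210),   # G
    9: (140, 90, 200),   # A
    11: (210, 110, 170), # B
}

def background_colour(notes):
    """Blend the tone colours of the sounding notes, weighted by loudness.

    `notes` is a list of (midi_pitch, amplitude) pairs - e.g. the steadily
    held harmonic lines (v1), or the ostinato cluster (v2).
    """
    if not notes:
        return (0, 0, 0)
    total = sum(amp for _, amp in notes)
    r = g = b = 0.0
    for pitch, amp in notes:
        cr, cg, cb = TONE_COLOURS.get(pitch % 12, (128, 128, 128))
        w = amp / total  # louder notes pull the blend towards their colour
        r += w * cr
        g += w * cg
        b += w * cb
    return (round(r), round(g), round(b))
```

So a D-F-A cluster at equal loudness would land on the average of those three placeholder colours, and as individual notes swell or fade, the blend would drift accordingly.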


As alluded to in these notes so far, the idea is that:

* The colours being used are the tone colours mentioned in part 1

* The general appearance is that of a bunch of soft/blurry colour blobs that fill the majority of the background. This is similar to the collection of background images I once shot in the garden by deliberately setting my lens out of focus, then positioning my camera near my various polyanthus flowers.

I'm basically going for a similar aesthetic here, except that since this is dynamically generated, these blobs will move and subtly change colour over time, as driven by the music, which helps sell + capture the basic mood that each piece is expressing at any point in time.
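As a rough sketch of that defocused-blob look, a frame could be built up from summed 2D Gaussians - one soft blob per note in the cluster. Everything here (positions, radii, colours) is a placeholder; in the real thing these would be driven by the music:

```python
import numpy as np

def render_blobs(width, height, blobs):
    """Return a (height, width, 3) float image in [0, 1] of soft colour blobs.

    `blobs` is a list of (x, y, radius, (r, g, b)) tuples with colour
    channels in [0, 1]. Each blob is a Gaussian falloff around its centre,
    which gives the soft out-of-focus look.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    img = np.zeros((height, width, 3))
    for x, y, radius, colour in blobs:
        # Gaussian falloff: 1.0 at the centre, fading smoothly outwards.
        falloff = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * radius ** 2))
        img += falloff[..., None] * np.asarray(colour)
    return np.clip(img, 0.0, 1.0)
```

Animating this would then just be a matter of nudging each blob's position, radius, and colour per frame as the corresponding notes pulse and jostle.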

Melodic Lines, Points / Drops, etc.

The current plan is to then analyse the melodic lines and do one of the following with them (sometimes multiple):

1) For long flowing lines, do something similar to the "showcqt" filter from FFmpeg for showing each line.

I experimented briefly with this in a rough prototype visualisation I did last year (write-up about that coming soon in a future post in this series), with the idea being that these scroll from one side of the canvas to the other, horizontally.

Main differences from that one will be:

   - Colours based on my tone colours map

   - Direction of travel was inverted in that prototype

   - Position of the 'cursor' needs work, as does eliminating all those unnecessary lines drawn out to infinity (instead, these should softly fade in)

   - Line thickness should be more directly controlled by loudness

2) For more isolated plucks / pings / dings / etc. - These should show up as firework-like splashes, or water droplets, placed across the frame (based on panning, pitch/octave, etc.). Size is based on loudness, fade-out based on reverb/etc.

3) As mentioned earlier, for certain features in certain registers (mostly high lines), these should end up featuring as "clouds" and "birds"

4) Possibly overlaying the waveform as mountains / hills in the background, above the coloured blobs (as was done for the Chinese Landscape Paintings)?  Certainly, if we start to also play with 3D volume / staging for the positioning and stacking of various elements, we could easily end up with some very interesting things here...
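To sketch how idea (2) might work in code, here's an illustrative mapping from a single note event to splash placement, size, and fade. All the constants and ranges here are guesses to be tuned by eye/ear later, and the function names are just for illustration:

```python
import math

def splash_params(pan, midi_pitch, loudness_db, reverb_s,
                  width=1920, height=1080):
    """Map a pluck/ping note event to splash placement and size.

    pan:         -1.0 (hard left) .. +1.0 (hard right)
    midi_pitch:  e.g. 60 = middle C; higher notes sit higher in the frame
    loudness_db: negative dBFS value, e.g. -6.0
    reverb_s:    approximate reverb tail length in seconds
    """
    x = (pan + 1.0) / 2.0 * width
    # Map MIDI 21..108 (piano range) to bottom..top of the frame.
    y = height * (1.0 - (midi_pitch - 21) / (108 - 21))
    radius = max(5.0, 80.0 + 2.0 * loudness_db)  # louder -> bigger splash
    decay = reverb_s / 3.0  # time constant: longer tails fade more slowly
    return x, y, radius, decay

def splash_radius(radius, decay, t):
    """Radius of the fading splash, t seconds after the note onset."""
    return radius * math.exp(-t / decay)
```

So a centre-panned middle C at -6 dBFS with a 3-second tail would land mid-frame at x = 960, start at radius 68, and fade with a 1-second time constant - and from there it's just tuning until the splashes feel right.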

Future Directions?

It's probably a bit premature to be looking too far ahead, but...

* I'm particularly curious how this approach will turn out when using it to visualise famous pieces of music. (Particular examples include Bohemian Rhapsody and various ABBA tracks, but also a bunch of the big classical pieces.)

* Personally, I hope that this incidentally ends up offering an interesting new way to get a visual summary of different pieces of music - a quick snapshot of what each may be like - better than what we get from waveforms / spectrograms. Both of those are interesting, but I feel they're always way too abstract...
