Thursday, January 7, 2016

Tip: Matrix-Fu for Solving Rigging Problems

Sometimes when trying to figure out rigging problems (e.g. working out a hack to patch around some weirdness with bone rolls, or maybe coding your own little transform tools), it pays to know a few tricks about matrices. In this post, I'm going to try and provide some "practical" advice about working with matrices, particularly for those found in Blender.

Unlike many other guides about these topics you'll likely encounter, I'm going to do away with most of the formalisms that usually come along for the ride, and just tailor this specifically for Blender, and for the sole purpose of describing the 3D transforms of an object or bone. For many years I've been meaning to write a guide like this, so let's get started!

1) What is a Matrix?
We're talking about "transformation matrices" here - not "The Matrix" ;)  Transformation Matrices are basically little tables of numbers which are used to compactly describe the set of transformations applied to an object/bone.

You're most likely to encounter two types of transformation matrix when dealing with 3D math:
  * 3x3  (9 items)     - If you encounter a matrix with 9 items in total (or laid out in a 3x3 layout - 3 columns, 3 rows), then you've got what I'll refer to as a "RotScale" matrix. These types of matrices are typically used to describe rotations and scaling. Rotations and scaling are usually tied together when represented in a matrix, for reasons you'll soon understand.

  * 4x4  (16 items)   - If you encounter a matrix with 16 items in total (or laid out in a 4x4 layout - 4 columns, 4 rows), then you've got what I'll refer to as a "Full Matrix". This is basically your 3x3 RotScale Matrix + the Location Vector + some padding junk to account for all the extra space.

(Side Note: For 2D stuff, you deal with 2x2 and 3x3 matrices respectively, which serve the same roles, but are smaller, as there's less "stuff" to represent)

2) What is a Vector?
Remember how when working in 3D, we often describe points in terms of a set of 3-values:
   - x = How far along the x-axis the point lies
   - y = How far along the y-axis the point lies
   - z = How far along the z-axis the point lies

So, each point can be described as a tuple of coordinates: (x, y, z)

But, what about vectors? Well, as far as the numbers/computer representation goes, points and vectors are basically the exact same thing!

However, be warned that there is a very minor but technical distinction: the vector v = (x1, y1, z1) is basically the "directed line" (i.e. an arrow) that points from the origin (i.e. the point (0, 0, 0) in the middle of the world)  towards the point at (x1, y1, z1).

This distinction only really matters if you're trying to write/talk about what you're doing with really pedantic and anal-retentive mathematician-types who insist on formalism and correctness. In practice, you can get away with using these interchangeably - computers generally don't care, unless whoever built the math toolkit also got carried away with nonsense like this.
Side note: If you ever hear anyone talking about "scalars" when there are vectors around, they're just referring to the "standard, standalone numbers", e.g. 3, 5, 1.1, -123, 0.0, ..., that you usually just refer to as "numbers"

Be aware that there are quite a few special operations that can be performed using vectors, some of which are really useful (and which you've probably heard about if you've been playing around with any node trees recently), but that's really another topic for another post! Again, I've been plotting up some nice notes for getting your head around those.

One important concept to remember here is that of "unit vectors".
 * A unit vector is a vector where the length of the vector (sqrt(x^2 + y^2 + z^2)  - by Pythagoras) = 1
    (Beware: It is NOT a vector where the elements are 1's, or the elements add up to 1!)

 * Unit vectors are important, as they are used to represent a pure direction, without any scaling/stretching along the direction of the vector being applied.

 * To get a unit vector, divide all components of a vector by its length, i.e.
          v' = (x / N, y / N, z / N)
     where N = sqrt(x^2 + y^2 + z^2)
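
Here's a minimal sketch of that normalisation step in plain Python (no Blender required). The function name normalize and the example vector are just my own picks for illustration; inside Blender you'd more likely lean on the vector type from its math module instead of rolling your own.

```python
import math

def normalize(v):
    """Return v scaled to unit length (v is an (x, y, z) tuple)."""
    x, y, z = v
    n = math.sqrt(x * x + y * y + z * z)   # N = length of the vector
    if n == 0.0:
        raise ValueError("cannot normalise a zero-length vector")
    return (x / n, y / n, z / n)

# (3, 0, 4) has length 5, so the unit vector is roughly (0.6, 0.0, 0.8).
unit = normalize((3.0, 0.0, 4.0))
```

Note the zero-length check: a vector with no length has no direction, so there's nothing sensible to return in that case.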

3) Structure of a Matrix
Here we get to the crux of today's little math lesson - the part that IMO isn't stressed enough in most introductions to this stuff. If you understand this, matrices become a whole lot less scary, as instead of a sea of numbers, you'll know how to go about breaking it down and picking it apart.

The basic structure of a 3x3 matrix (or the majority of a 4x4 matrix, occupying the top-left corner) is simply a collection of vectors describing the orientation of the object/bone's axes. In a "full transform" matrix, there's just an additional location vector, describing the object/bone's location in space.

So, let's look at a 4x4 matrix first (the 3x3 is just the top left corner):
     X.x | Y.x | Z.x | L.x
     X.y | Y.y | Z.y | L.y
     X.z | Y.z | Z.z | L.z
     0    |  0   |  0    |  1

Here X, Y, Z, and L are vectors, and .x, .y, .z are the x/y/z components of those vectors.
   - The X/Y/Z vectors represent the directions that the x-axis, y-axis, and z-axis of the object/bone point in.  (In more technical terms, they define a "basis" - i.e. orientation / reference frame in space. If there's no skewing going on, that is, all axes are at right angles to each other, then they form an "orthogonal basis")
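   - The length of each vector (sqrt(x^2 + y^2 + z^2)) is the scale factor for that axis (i.e. "local space scale"). If there's no rotation or skewing going on, you can just read it off the main diagonal (i.e. where the 1's are in the matrix below); once the object/bone is rotated, the diagonal values no longer equal the scale, and you need to compute the lengths.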
   - The "L" vector represents the location of the object/bone in 3D space. It is the point around which the rotation+scaling described by the RotScale part occurs. Alternatively, you can view it as the offset that is applied after performing the rotation and scaling.  (One word of caution: If playing around with this value, the RotScale part still needs to be used when computing the new values)
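
To make the column layout a bit more concrete, here's a rough sketch in plain Python. The matrix M is a made-up example (assumed to be: rotated 90 degrees about Z, scaled 2x along local X, located at (5, 0, 1)), and the helper names column, length and local_to_world are my own, not anything from Blender. The matrix is stored row by row, exactly as it's drawn in the table above.

```python
import math

# Hypothetical example matrix, written row by row in "standard notation".
M = [
    [0.0, -1.0, 0.0, 5.0],
    [2.0,  0.0, 0.0, 0.0],
    [0.0,  0.0, 1.0, 1.0],
    [0.0,  0.0, 0.0, 1.0],
]

def column(m, c):
    """Return the top three entries of column c as an (x, y, z) tuple."""
    return (m[0][c], m[1][c], m[2][c])

def length(v):
    """Length of a vector, by Pythagoras."""
    return math.sqrt(sum(a * a for a in v))

# Columns 0..2 are the axis vectors, column 3 is the location.
X, Y, Z, L = (column(M, c) for c in range(4))
scale = (length(X), length(Y), length(Z))

def local_to_world(m, p):
    """Walk p.x along X, p.y along Y, p.z along Z, then add the offset L."""
    ax, ay, az, loc = (column(m, c) for c in range(4))
    return tuple(p[0] * ax[i] + p[1] * ay[i] + p[2] * az[i] + loc[i]
                 for i in range(3))

# A point 1 unit along the local X-axis.
tip = local_to_world(M, (1.0, 0.0, 0.0))
```

Here X comes out as (0, 2, 0): the local x-axis points down world Y and is twice as long as normal, so scale is (2, 1, 1). Notice that the main diagonal of M reads 0, 0, 1, which is not the scale, because this example is rotated. The point 1 unit along local X lands at (5, 2, 1) once the location offset is added.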

An object/bone with no transforms applied to it will look something like:
   1 | 0 | 0 | 0
   0 | 1 | 0 | 0
   0 | 0 | 1 | 0
   0 | 0 | 0 | 1

A matrix like this is called an "Identity" matrix (and is sometimes referred to as "I"). Can you see why this is the case?



The first column says that the X-axis points 1-unit down the X-axis,
The second says that the Y-axis points 1-unit down the Y-axis,
The third says that the Z-axis points 1-unit down the Z-axis, and
The final column says that there is no location offset.

1-unit means no scaling, as the reference/world-space matrix also uses reference axis-vectors of unit (1-unit) length. Got it?

So, what can we do with this information?
  1) If you want to know where an object is in world space (or where the bone is in pose space), you can read it off from the L vector
  2) If you want to know whether the Z-axis of a bone points up or down, check the Z vector (for a horizontal bone, the sign of the Z.z value is enough; more generally, look at what the whole Z-axis is doing)
  3) If you want to know the local scaling on an axis, take the length of the corresponding column
  4) If you want to change the scaling on any axis (or maybe flip a pesky/inverted axis), multiply the relevant column by the scale factor you want. This applies local-space scaling to that axis (use -1 to flip it)
  5) If you want to introduce some skewing effects, introduce some offsets on the off-diagonal values for one of the axes. That way, you're making it so that instead of having the axes at right angles, one of them is now going to be at a different angle, thus creating some skewing/distortion effects.
  6) If you want to remove scaling, simply divide each column by its length to normalise that axis, and return it to being a "unit vector"
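
A couple of these tricks (scaling/flipping an axis, and removing scale) can be sketched in plain Python as follows. The function names scale_axis and remove_scale and the example matrix are assumptions of mine for illustration, and the matrix is stored row by row as drawn in the tables above - adapt the indexing to whatever layout your matrices actually come in.

```python
import math

def scale_axis(m, c, factor):
    """Return a copy of m with axis column c (0=X, 1=Y, 2=Z) multiplied by factor."""
    out = [row[:] for row in m]
    for r in range(3):              # only the top three rows hold the axis vector
        out[r][c] *= factor
    return out

def remove_scale(m):
    """Return a copy of m with each axis column normalised to unit length."""
    out = [row[:] for row in m]
    for c in range(3):              # columns 0..2 are X, Y, Z; column 3 (L) is left alone
        n = math.sqrt(sum(m[r][c] ** 2 for r in range(3)))
        if n > 0.0:                 # skip degenerate (zero-length) axes
            for r in range(3):
                out[r][c] /= n
    return out

# Hypothetical example: scaled 2x on X and 3x on Y, located at (5, 0, 1).
M = [
    [2.0, 0.0, 0.0, 5.0],
    [0.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]

flipped = scale_axis(M, 2, -1.0)    # flip the Z-axis
unscaled = remove_scale(M)          # axes back to unit length, location untouched
```

Both helpers return copies rather than editing the matrix in place, which makes it a bit harder to accidentally trash the original while experimenting.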

4) Matrix Notations and Conventions
Unfortunately, things might get slightly confusing here, as there are several different competing standards which express things in different ways. I'm going to try my best to keep confusion down to a minimum, though it might not be possible.

First, all the matrices you've seen so far have been shown in "standard notation". This is the way that you'll see matrices laid out on paper if you open any math textbook. If you've heard about linear algebra from school, this is also probably the form you've been taught to see them in.

Second, Blender and OpenGL apparently use what is known as "column major" layout to store matrices in memory. However, plenty of other 3D software/toolkits/etc. you'll come across use "row major", so always check which one you're dealing with.

  TIP: To get from row major to column major, you basically "transpose" the matrix which means that you swap the rows and columns. In math notation, you generally see this done as a little superscript T beside the matrix (or its name)

The implication of the difference between column and row major layout applies when trying to refer to values in the matrix. I'll first show you how to do this with the typical multi-dimensional arrays that are used to implement matrices, where you refer to each cell using 2 indices:

   * Column Major:   L.x = [3][0],    X.z = [0][2]
   * Row Major:        L.x = [0][3],    X.z = [2][0]
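
If you want to convince yourself of those index pairs, here's a small sketch in plain Python that uses text labels instead of numbers, so each cell is easy to track. The names STANDARD, transpose, row_major and col_major are my own for this example.

```python
# Symbolic matrix in "standard notation" (row by row), using labels instead
# of numbers so the cells are easy to track.
STANDARD = [
    ["X.x", "Y.x", "Z.x", "L.x"],
    ["X.y", "Y.y", "Z.y", "L.y"],
    ["X.z", "Y.z", "Z.z", "L.z"],
    ["0",   "0",   "0",   "1"],
]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

row_major = STANDARD               # indexed as [row][column]
col_major = transpose(STANDARD)    # indexed as [column][row]

# Same cells, opposite index order.
lx_pair = (col_major[3][0], row_major[0][3])
xz_pair = (col_major[0][2], row_major[2][0])
```

Both entries of lx_pair are "L.x" and both entries of xz_pair are "X.z". Also note that col_major[0] is the whole X vector (plus the padding 0) in one go, which is handy when you want to grab an entire axis.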

Now, sometimes matrices will be presented as a "flattened array" instead (e.g. the OpenGL getMatrix functions, and also when looking at Blender's matrices via RNA in the Datablocks Outliner View). In this case, you can really start to see what the memory layout looks like, and why it's called "column major"  (to save confusion, I'm only going to show Column Major here):

            0  |  4  |  8  | 12
            1  |  5  |  9  | 13
            2  |  6  | 10 | 14
            3  |  7  | 11 | 15
       (Zero-Based Indices)  - L.x = M[12]

In the RNA UI, 1-based indices are used instead, so:
            1  |  5  |  9  | 13
            2  |  6  | 10 | 14
            3  |  7  | 11 | 15
            4  |  8  | 12 | 16
       (One-Based Indices - RNA UI)  - L.x = M[13]
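
The pattern behind both of those tables is just "column number times 4, plus row number" (plus 1 for the one-based version). Here's a quick sketch in plain Python; flat_index, STANDARD and flat are names I've made up for the example.

```python
def flat_index(column, row, one_based=False):
    """Index of a cell in a flattened column-major 4x4 matrix."""
    index = column * 4 + row       # each column occupies 4 consecutive slots
    return index + 1 if one_based else index

# Labelled matrix in "standard notation" (row by row).
STANDARD = [
    ["X.x", "Y.x", "Z.x", "L.x"],
    ["X.y", "Y.y", "Z.y", "L.y"],
    ["X.z", "Y.z", "Z.z", "L.z"],
    ["0",   "0",   "0",   "1"],
]

# Flatten column by column: all of X first, then Y, then Z, then L.
flat = [STANDARD[r][c] for c in range(4) for r in range(4)]

lx_zero_based = flat_index(3, 0)                 # column 3, row 0
lx_one_based = flat_index(3, 0, one_based=True)  # same cell, RNA UI style
```

With this, lx_zero_based is 12 and lx_one_based is 13, matching the two tables, and flat[12] is indeed the "L.x" cell.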

Now, just to mess with you a little bit, if you ever go to print out a matrix from Blender, either via an API function, or using code as follows, e.g.
   for (int i = 0; i < 4; i++) {
      for (int j = 0; j < 4; j++) {
           printf("%f, ", mat[i][j]);  /* mat[i] is column i: X, Y, Z, then L */
      }
      printf("\n");
   }

you'll likely see something like this:
   X.x, X.y, X.z, 0,
   Y.x, Y.y, Y.z, 0,
   Z.x, Z.y, Z.z, 0,
   L.x, L.y, L.z, 1

Just keep in mind this difference, and you shouldn't have that much trouble taking matrix math from other sources and adapting it for your own uses :)

Closing Words
Hopefully with these tips, you now know enough to get started playing around with matrices a bit more.

I haven't covered here the issue of how matrices are combined (i.e. multiplication), or the shenanigans which go on there (e.g. multiplication order - pre vs post, and how we notate all that). That's another topic for another time :)
