
What is the Matrix (order)?

As if trying to drag memories of vectors and matrices out of the long-discarded parts of your brain reserved for high school mathematics wasn’t hard enough, another curveball is thrown at you when you discover that your textbooks, DirectX, and OpenGL don’t seem to agree on exactly which way round they go.

A typical textbook might mention that DirectX uses row vectors or row-major matrices while OpenGL uses column vectors or column-major matrices; the author will then pick his own favorite style, which may be neither, and proceed to rattle off the fundamentals of affine transforms, leaving you unsure whether they’ll work for you or not. A particularly mean textbook I have switches randomly between row and column vectors depending on how much room the author has on the page.

The OpenGL FAQ, somewhere you might hope would clear the situation up somewhat, is totally cagey about this too. I’ve quoted its answer to the question “Are OpenGL matrices column-major or row-major?” here:

For programming purposes, OpenGL matrices are 16-value arrays with base vectors laid out contiguously in memory. The translation components occupy the 13th, 14th, and 15th elements of the 16-element matrix, where indices are numbered from 1 to 16 as described in section 2.11.2 of the OpenGL 2.1 Specification.

Column-major versus row-major is purely a notational convention. Note that post-multiplying with column-major matrices produces the same result as pre-multiplying with row-major matrices. The OpenGL Specification and the OpenGL Reference Manual both use column-major notation. You can use any notation, as long as it’s clearly stated.

Sadly, the use of column-major format in the spec and blue book has resulted in endless confusion in the OpenGL programming community. Column-major notation suggests that matrices are not laid out in memory as a programmer would expect.

To understand what’s going on and get rid of our confusion, we need to remember how matrix multiplication works. For the purposes of this blog entry I’m just going to use 2 x 2 matrices because I don’t have enough room to do the full 3 x 3 or 4 x 4 matrices that one uses in practice. Here’s how we multiply a 2d vector by a 2 x 2 matrix:

$$\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} xa + yc & xb + yd \end{bmatrix}$$

You might recall from high school that matrix multiplication is only defined when an n x m matrix is multiplied by an m x p matrix, and that the result is an n x p matrix. So in this case we treat the 2d vector as a 1 x 2 matrix (a row vector), and can multiply it by a 2 x 2 matrix to produce a new 1 x 2 matrix, i.e. another 2d vector.

We can’t use a 2 x 1 matrix (a column vector) here, because matrix multiplication isn’t defined that way round: you can’t multiply a 2 x 1 matrix by a 2 x 2 matrix.

So now let’s introduce transforms, which are why we’re using matrices to begin with. Because we’ve simplified to 2 x 2 matrices for our examples, the only standard transforms available to us are scaling and rotation, but the same principles carry over to the other rotations and to translation in the bigger matrices.

The most elegant way to think of a transform matrix is as a collection of basis vectors of the new co-ordinate space we wish to transform the vector’s co-ordinates into. Matrix multiplication multiplies each vector co-ordinate by the matching basis vector and sums the results. Let’s use the identity matrix as an example:

$$\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} x \cdot 1 + y \cdot 0 & x \cdot 0 + y \cdot 1 \end{bmatrix} = \begin{bmatrix} x & y \end{bmatrix}$$

Multiplying any vector by this matrix gives us the vector back. The worked equation shows us that one basis vector gives us the full amount of x and no y, while the other gives us no x and the full amount of y. We can use unit vectors to extract the basis vectors from a matrix in the same way:

$$\begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \end{bmatrix} \qquad \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} c & d \end{bmatrix}$$

Extracting the basis vectors of the identity matrix, we get [ 1 0 ] for the basis vector of the x axis and [ 0 1 ] for the basis vector of the y axis. We can draw these vectors on a very dull graph to prove that they are indeed the unit vectors describing the axes of our untransformed co-ordinate system:

[Figure: the identity basis vectors [ 1 0 ] and [ 0 1 ] drawn along the x and y axes.]

We can do the same for a scale transform:

$$\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} = \begin{bmatrix} s_x x & s_y y \end{bmatrix}$$

[Figure: the scaled basis vectors [ s_x 0 ] and [ 0 s_y ] drawn on the same graph.]

And for a rotation transform:

$$\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} x\cos\theta - y\sin\theta & x\sin\theta + y\cos\theta \end{bmatrix}$$

[Figure: the rotated basis vectors [ cos θ sin θ ] and [ −sin θ cos θ ] drawn on the same graph.]

So that’s the theory out of the way. If you want to know more about 3d vectors, homogeneous space, 4 x 4 matrices, and all that kind of stuff, go pick up a math textbook. All of the above is just to fix in our minds that we fundamentally care about the basis vectors a matrix gives us, and to give us a way to translate textbooks later.

Let’s now get back to the fundamental question of the order of the matrix. This matters because computers don’t have two-dimensional memory: memory addresses are linear, and we have to pick a sensible way to lay out and address a matrix. You might choose something like this:

template <class T>
struct Matrix2 {
  T a, b, c, d;  // laid out contiguously in memory: a, b, c, d
};

We’ve flattened it down, placing the first basis vector contiguously in memory and the second after it. So far this matches what OpenGL claims. Vector and matrix multiplication also seem to work just fine:

template <class T>
Vector2<T> operator*(const Vector2<T>& v, const Matrix2<T>& m) {
  return Vector2<T>{ v.x * m.a + v.y * m.c, v.x * m.b + v.y * m.d };
}
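
Note that the post never defines Vector2; a minimal aggregate matching the usage above (my sketch rather than the original author’s code) would be:

template <class T>
struct Vector2 {
  T x, y;  // two components, stored contiguously like the matrix values
};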

So we’re still not any closer to figuring out what’s going on with all these textbooks and what they’re saying about OpenGL’s ordering. At least we now have a reasonable enough understanding to try an experiment: let’s re-order our naming convention to match what the textbooks say OpenGL does:

$$\begin{bmatrix} a & c \\ b & d \end{bmatrix}$$

All I’ve done is change the names of the values within the matrix; the rules of matrix multiplication still apply. That means that to multiply a vector by this matrix, I have to change the names in the equation to match the names changed in my matrix:

$$\begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a & c \\ b & d \end{bmatrix} = \begin{bmatrix} xa + yb & xc + yd \end{bmatrix}$$

The rules have not changed, just the names. I’m re-iterating that because it’s important. If we didn’t change the names of the values in our multiplication equation, we’d be multiplying different values than before; in fact, we’d be multiplying by the transpose of the matrix we intended.

Our basis vectors haven’t moved either; only the names have changed there too:

$$\begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} a & c \\ b & d \end{bmatrix} = \begin{bmatrix} a & c \end{bmatrix} \qquad \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} a & c \\ b & d \end{bmatrix} = \begin{bmatrix} b & d \end{bmatrix}$$

That’s not what we wanted, and not what the books tell us about OpenGL. How do we extract the columns of the matrix instead of the rows?

The trick is to use column vectors! Now remember that we can’t multiply a 2 x 1 matrix by a 2 x 2 matrix, but we can multiply a 2 x 2 matrix by a 2 x 1 matrix, and the result is another 2 x 1 matrix:

$$\begin{bmatrix} a & c \\ b & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} ax + cy \\ bx + dy \end{bmatrix} = x \begin{bmatrix} a \\ b \end{bmatrix} + y \begin{bmatrix} c \\ d \end{bmatrix}$$

We need to add a different function for dealing with this order of operation, but fortunately the types are different so this is pretty easy. In fact, the only difficult part is that I’ve changed my naming convention part of the way through the blog post, so the previous code uses the wrong names.

For the sake of clarity, here are both functions with the current naming convention, that is, a column-major convention:

template <class T>
Vector2<T> operator*(const Vector2<T>& v, const Matrix2<T>& m) {
  // Row vector post-multiplied by the matrix: the rows hold the basis vectors.
  return Vector2<T>{ v.x * m.a + v.y * m.b, v.x * m.c + v.y * m.d };
}

template <class T>
Vector2<T> operator*(const Matrix2<T>& m, const Vector2<T>& v) {
  // Column vector pre-multiplied by the matrix: the columns hold the basis vectors.
  return Vector2<T>{ v.x * m.a + v.y * m.c, v.x * m.b + v.y * m.d };
}

Now we have a matrix class that can deal with the basis vectors being in rows or columns, and with multiplication by row vectors or column vectors. In fact, the class doesn’t care at all: if you post-multiply a row vector by the matrix, the matrix rows are treated as the basis vectors; if you pre-multiply a column vector by the matrix, the matrix columns are treated as the basis vectors.

The rest of the functions, such as matrix/matrix multiplication, transpose, inverse, etc. remain exactly the same.
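
Matrix-matrix multiplication is the one that sounds most suspicious, so here’s a sketch of my own written against the column-major naming above (not code from the original post):

template <class T>
Matrix2<T> operator*(const Matrix2<T>& m, const Matrix2<T>& n) {
  // Each column of the result is m times the matching column of n,
  // consistent with the column-vector operator above.
  return Matrix2<T>{ m.a * n.a + m.c * n.b,
                     m.b * n.a + m.d * n.b,
                     m.a * n.c + m.c * n.d,
                     m.b * n.c + m.d * n.d };
}

With this one function, (m * n) * v == m * (n * v) for column vectors and v * (m * n) == (v * m) * n for row vectors, so it serves both conventions.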

So why does it matter at all?

The first instance is when we read example transformation matrices off the Internet and want to put them into our class. We need to know the convention used: are the basis vectors in the rows of the matrix as presented, or in the columns? Unfortunately this varies wildly; even Wikipedia arbitrarily flip-flops between the two conventions.

Here’s the example from the entry on Rotation:

$$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{bmatrix}$$

If you’ve been paying attention, you’ll realize that it’s transposed from the rotation transformation I provided above. Fortunately they provided a column vector with it, so we know that the basis vectors are in the columns. If we were going to use row vectors and post-multiply in our code, we’d have to transpose this matrix, just like I did above in the example.

This especially comes up when you’re copying out the various projection matrices.
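
Converting a matrix from the opposite convention is just a transpose; a minimal sketch for the Matrix2 class above:

template <class T>
Matrix2<T> transpose(const Matrix2<T>& m) {
  // Swap the off-diagonal pair: rows become columns and vice versa.
  return Matrix2<T>{ m.a, m.c, m.b, m.d };
}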

Another instance where this matters is when copying sample code from one language and convention into your own. I used the simple names a, b, c, etc. for the values, but another common approach is to use an x,y naming convention or a two-dimensional array. At this point it’s critically important to know whether the first dimension selects a row or a column in that particular implementation.
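
For instance, the same four values in memory support two different readings (my example, not from the post):

template <class T>
struct Matrix2Array {
  T m[2][2];  // is m[0][1] row 0, column 1... or column 0, row 1?
};

// Row-major reading:    m[0][0] and m[0][1] hold the first row.
// Column-major reading: m[0][0] and m[0][1] hold the first column.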

Remember that this is still just a naming convention from the algorithm’s point of view. Consider the following example of obtaining the determinant of a matrix; I’ve presented it in both a row-major and a column-major naming convention:

$$\det = m_{1,1} m_{2,2} - m_{1,2} m_{2,1} = n_{1,1} n_{2,2} - n_{2,1} n_{1,2}$$

In the example, $m_{2,1}$ and $n_{1,2}$ are alternate names for the same matrix value, and the only thing that changes between the two sides is the names used; m just happens to use row-major notation and n column-major notation in the naming.
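
In code, the determinant comes out the same either way (a sketch using the Matrix2 class above):

template <class T>
T determinant(const Matrix2<T>& m) {
  // The same expression whether the names are read row-major or
  // column-major, since det(M) == det(M^T).
  return m.a * m.d - m.b * m.c;
}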

This is important when copying code because, for optimisation reasons, the code will be laid out so that its basis vectors are contiguous in memory. If it uses row vectors, this means it’s using a row-major convention; if it uses column vectors, a column-major convention.

Annoyingly, it’s often hard to figure out which convention code is actually using, and as we’ve seen above, it’s important to get it right so as not to accidentally transpose matrices. When in doubt, look at the functions in the code that perform transformations, if there are any.

The third instance where things get confusing is matrix multiplication. This isn’t specifically a naming and layout convention issue, but the convention you pick does matter because matrix multiplication is not commutative:

$$AB \neq BA$$

To understand why this matters, go back to the geometry examples above and consider the cases of a rotation and a translation. If we first apply a rotation transformation to a co-ordinate, and then a translation, the translation applies to the rotated co-ordinate. But if we first apply the translation, and then the rotation, the rotation applies to the translated co-ordinate.

This matters when we come to putting together our model, view, and projection matrices.

Fortunately there’s an easy rule of thumb to remember here. Transformations apply from the vector outwards:

$$v' = v \, M_{\text{model}} \, M_{\text{view}} \, M_{\text{projection}} \qquad \text{or} \qquad v' = M_{\text{projection}} \, M_{\text{view}} \, M_{\text{model}} \, v$$

In other words, if we’re using row vectors we know they go on the left, so the left-most transformation is applied first; if we’re using column vectors on the right, the right-most transformation is applied first.

Thus for each vertex of a model, we first apply the model matrix to take the co-ordinates of the vertex from model space to world space, then the view matrix to take the world space co-ordinates into camera space, and finally the projection matrix to bundle it all into homogeneous clip space.
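
As a concrete 2 x 2 sketch of the rule (my own example reusing the class and operators above, with rotation standing in for the model matrix and scale for the view matrix):

#include <cmath>

// Assumes the Matrix2/Vector2 types and operators defined earlier,
// with the column-major naming: (a, b) is the first column.
template <class T>
Vector2<T> rotate_then_scale(const Vector2<T>& v, T angle, T s) {
  Matrix2<T> rotation{ std::cos(angle), std::sin(angle),
                       -std::sin(angle), std::cos(angle) };
  Matrix2<T> scale{ s, 0, 0, s };
  // Column vectors: the right-most transform (rotation) applies first.
  return scale * rotation * v;
}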

So finally, does OpenGL really require us to use column-major matrices and column vectors? The answer is: not really. That may have been true of the older fixed-function pipeline, which provided its own matrix stack and functions, but in the modern shader pipeline it’s largely up to you. GLSL supports both post-multiplication and pre-multiplication; in your shader code you just place the vector on the left or the right of the matrix, depending on which convention the matrices you passed in use.

However, GLSL does use a column-major naming convention for its matrix types; performing an array subscript operation on a mat3 or mat4 returns a vec3 or vec4 holding the named column, not the row.
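
Mirroring that in the Matrix2 class above (a purely illustrative helper of my own, nothing to do with GLSL itself):

template <class T>
Vector2<T> column(const Matrix2<T>& m, int i) {
  // With the column-major naming, (a, b) is column 0 and (c, d) is column 1.
  return i == 0 ? Vector2<T>{ m.a, m.b } : Vector2<T>{ m.c, m.d };
}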

For consistency between your shaders and your engine code, and since all the books and references assume OpenGL requires column-major, you may want to just accept column-major to stop yourself going mad. And since the vast majority of example shader code out there assumes column vectors with the matrix on the left, it’s probably a good idea to stick with that; otherwise your lighting might look funny.

