kevinmoran/noacos_derivation.md

## noacos_derivation.md

      
    Raw
  

              noacos_derivation.md
            
          
    How to Calculate a Rotation Matrix to Align Vector A to Vector B in 3D

Based on 'Avoiding Trigonometry' by Íñigo Quílez

Expanded derivation by Kevin - 29/11/18

Introduction

Original article: https://iquilezles.org/www/articles/noacos/noacos.htm
'Avoiding Trigonometry' or 'noacos' is a wonderful article by Íñigo Quílez on avoiding redundant trigonometric function calls in your 3D maths code. It explores the example of calculating a rotation matrix to align two vectors, but the approach of simplifying a piece of code through an understanding of the dot and cross products can be applied in all your linear algebra programming to write simpler, faster and more robust code.
I had a bit of trouble following the original derivation in the article, especially the later parts. I needed to write it out myself in a more verbose way to make sense of it. I'm sharing my expanded derivation in case it helps anyone! I hope this article will be a bit more beginner-friendly.
Prerequisites

A basic understanding of linear algebra. You should know:

What vectors are (mostly 3D vectors for our purposes)
What matrices are (mostly 3-by-3 matrices) and how they are used for 3D rotations
You should know what the dot product is (sometimes called inner product)
You should have some idea what the cross product is.

Important Notes

I am copying the original code sample for the most part, which means I will calculate a 3-by-3 matrix in row-major form. If your code uses column-major matrices, simply transpose the row-major matrix. If you want to calculate a 4-by-4 matrix, the conversion is:
Example mat3   Corresponding mat4
[ a  b  c ]      [ a  b  c  0]
[ d  e  f ]  =>  [ d  e  f  0]
[ g  h  i ]      [ g  h  i  0]
                 [ 0  0  0  1]

The code snippets are written in simple C++. I will use mat3 as my matrix type and vec3 as my vector type. We'll assume we have functions for calculating the dot product, cross product and the length of a vector, along with system functions for calculating the triginometric functions sin(), cos() and acos().
The Naive Approach

The problem outlined by Íñigo is this: We want to calculate the matrix that will rotate a given vector v1 to be aligned with another vector v2. Let's call the function that will do this rotateAlign().
mat3 rotMat = rotateAlign(v1, v2);
assert(dot((rotMat * v1), v2) ~= 1);
This is an extremely useful operation to align an object in a given direction. Let's assume we already have the following function for rotating around a given (normalised) axis u by a given angle angleRadians (specified in radians):
mat3 rotateAxisAngle( vec3 u, float angleRadians )
{
    const float sinA = sinf( angleRadians );
    const float cosA = cosf( angleRadians );
    const float oneMinusCosA = 1.0f - cosA;

    return mat3( (u.x * u.x * oneMinusCosA) + cosA,
                 (u.y * u.x * oneMinusCosA) - (sinA * u.z), 
                 (u.z * u.x * oneMinusCosA) + (sinA * u.y),
                 (u.x * u.y * oneMinusCosA) + (sinA * u.z),  
                 (u.y * u.y * oneMinusCosA) + cosA,      
                 (u.z * u.y * oneMinusCosA) - (sinA * u.x),
                 (u.x * u.z * oneMinusCosA) - (sinA * u.y),  
                 (u.y * u.z * oneMinusCosA) + (sinA * u.x),  
                 (u.z * u.z * oneMinusCosA) + cosA 
                 );
}
The most obvious solution is to calculate the required angle we want to rotate by, find an axis of rotation and call the rotateAxisAngle() function. We know that the cross product gives us a vector orthogonal to the two input vectors, so we normalise that and use it as our axis of rotation. We also know the dot product of two unit vectors is equal to the cosine of the angle between them, so we calculate that and call acos() (the inverse cosine) to get the desired angle.
Note: We use a clamp function here to make sure the dot product is between -1 and 1. Even when both input vectors are of unit length, floating-point error means that we can sometimes get a result that is slightly outside that range. This is usually not a problem for most calculations, except the acosf function in the C standard library is specified to return NaN if given a value outside the range [-1.0f; 1.0f].
Combining this with our rotateAxisAngle() function gives us:
mat3 rotateAlign( vec3 v1, vec3 v2)
{
    vec3 axis = normalize( cross( v1, v2 ) );
    float dotProduct = dot( v1, v2 );
    dotProduct = clamp( dotProduct, -1.0f, 1.0f );
    float angleRadians = acosf( dotProduct );
    mat3 result = rotateAxisAngle( axis, angleRadians );
    return result;
}
This function works! You would be forgiven for getting this far and leaving it at that, but there's some nastiness going on under the hood, it's just been hidden by a function call.
Digging deeper

Let's inline the rotateAxisAngle() call. By this I mean remove the function call, simply insert the code body of the function with corrected names:
mat3 rotateAlign( vec3 v1, vec3 v2)
{
    vec3 axis = normalize( cross( v1, v2 ) );
    float dotProduct = dot( v1, v2 );
    dotProduct = clamp( dotProduct, -1.0f, 1.0f );
    float angleRadians = acosf( dotProduct );

    const float sinA = sinf( angleRadians );
    const float cosA = cosf( angleRadians );
    const float oneMinusCosA = 1.0f - cosA;

    mat3 result( (axis.x * axis.x * oneMinusCosA) + cosA,
                 (axis.y * axis.x * oneMinusCosA) - (sinA * axis.z), 
                 (axis.z * axis.x * oneMinusCosA) + (sinA * axis.y),
                 (axis.x * axis.y * oneMinusCosA) + (sinA * axis.z),  
                 (axis.y * axis.y * oneMinusCosA) + cosA,      
                 (axis.z * axis.y * oneMinusCosA) - (sinA * axis.x),
                 (axis.x * axis.z * oneMinusCosA) - (sinA * axis.y),  
                 (axis.y * axis.z * oneMinusCosA) + (sinA * axis.x),  
                 (axis.z * axis.z * oneMinusCosA) + cosA 
                 );

    return result;
}
Now we can see that something is not right here. We call acosf() on dotProduct to calculate angleRadians, but then immediately undo this by calling cosf() on angleRadians to calculate cosA. We can surely remove this waste of CPU cycles and write a more elegant function.
But there's an obvious problem you might say I'm overlooking. We have to calculate the sin() of the angle too, so there's no avoiding that call to acosf(). Therefore we surely need to know the angle between the two vectors!
As I'm sure you're suspecting by now, we can avoid all of this pointless trigonometry. It's time to go back to our basic linear algebra and get clever.
Understanding the Cross Product

The cross product is a tricky concept to understand when you start learning linear algebra. I remember I learned the formula to pass my exam and promptly forgot it. Worse, I never had an intuition for what the cross product was actually doing. Once I started writing 3D graphics code I had to properly learn what was going on.
The simple definition is this:

The cross product of two vectors is a third vector orthogonal to both, whose length is equal to the sine of the angle between them

This is the missing piece of the puzzle. We never need to know the angle between the two input vectors for our function. All we really need are the sine and cosine of that angle. We can calculate these with basic vector functions and avoid that call to acosf()!
If you learn one thing from this article I hope it's this: There is great power in really understanding the fundamentals of linear algebra when writing 3D graphics code. I strongly recommend digging into some good resources to get a strong intuition for concepts such as the dot product, cross product, or even what a matrix actually is. If you want a free resource I recommend the video series 'Essence of Linear Algebra' available on the 3Blue1Brown Youtube channel:
https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab
Fixing our Function

Let's rewrite rotateAlign(), this time using the dot product to calculate cos(A) and the length of the cross product to calculate sin(A). Also, because we no longer need to call acosf() we'll remove the clamp() function.
mat3 rotateAlign( vec3 v1, vec3 v2)
{
    vec3 axis = normalize( cross( v1, v2 ) );

    const float sinA = length( cross( v1, v2 ) );
    const float cosA = dot( v1, v2 );
    const float oneMinusCosA = 1.0f - cosA;

    mat3 result( (axis.x * axis.x * oneMinusCosA) + cosA,
                 (axis.y * axis.x * oneMinusCosA) - (sinA * axis.z), 
                 (axis.z * axis.x * oneMinusCosA) + (sinA * axis.y),
                 (axis.x * axis.y * oneMinusCosA) + (sinA * axis.z),  
                 (axis.y * axis.y * oneMinusCosA) + cosA,      
                 (axis.z * axis.y * oneMinusCosA) - (sinA * axis.x),
                 (axis.x * axis.z * oneMinusCosA) - (sinA * axis.y),  
                 (axis.y * axis.z * oneMinusCosA) + (sinA * axis.x),  
                 (axis.z * axis.z * oneMinusCosA) + cosA 
                 );

    return result;
}
We're calling the cross product function twice, let's fix that by storing the value the first time:
mat3 rotateAlign( vec3 v1, vec3 v2)
{
    vec3 crossProduct = cross( v1, v2 );
    vec3 axis = normalize( crossProduct );

    const float sinA = length( crossProduct );
    const float cosA = dot( v1, v2 );
    const float oneMinusCosA = 1.0f - cosA;

    mat3 result( (axis.x * axis.x * oneMinusCosA) + cosA,
                 (axis.y * axis.x * oneMinusCosA) - (sinA * axis.z), 
                 (axis.z * axis.x * oneMinusCosA) + (sinA * axis.y),
                 (axis.x * axis.y * oneMinusCosA) + (sinA * axis.z),  
                 (axis.y * axis.y * oneMinusCosA) + cosA,      
                 (axis.z * axis.y * oneMinusCosA) - (sinA * axis.x),
                 (axis.x * axis.z * oneMinusCosA) - (sinA * axis.y),  
                 (axis.y * axis.z * oneMinusCosA) + (sinA * axis.x),  
                 (axis.z * axis.z * oneMinusCosA) + cosA 
                 );

    return result;
}
You should know that normalising a vector means dividing it by its length. That means we're calculating the length of the cross product twice; once in the normalize() function and once for sinA. Fixing that gives us:
mat3 rotateAlign( vec3 v1, vec3 v2)
{
    vec3 crossProduct = cross( v1, v2 );
    const float sinA = length( crossProduct );

    vec3  axis = crossProduct / sinA;

    const float cosA = dot( v1, v2 );
    const float oneMinusCosA = 1.0f - cosA;

    mat3 result( (axis.x * axis.x * oneMinusCosA) + cosA,
                 (axis.y * axis.x * oneMinusCosA) - (sinA * axis.z), 
                 (axis.z * axis.x * oneMinusCosA) + (sinA * axis.y),
                 (axis.x * axis.y * oneMinusCosA) + (sinA * axis.z),  
                 (axis.y * axis.y * oneMinusCosA) + cosA,      
                 (axis.z * axis.y * oneMinusCosA) - (sinA * axis.x),
                 (axis.x * axis.z * oneMinusCosA) - (sinA * axis.y),  
                 (axis.y * axis.z * oneMinusCosA) + (sinA * axis.x),  
                 (axis.z * axis.z * oneMinusCosA) + cosA 
                 );

    return result;
}
This is much improved! Now comes that part where Íñigo is very clever. Remember that normalising the axis of rotation is simply dividing each component by its length, which is sinA. Instead of normalising the axis up front, let's divide each component of the axis by sinA in the final calculation. So we remove the division from line vec3  axis = crossProduct / sinA; and instead divide every instance of axis.x, axis.y and axis.z by sinA:
mat3 rotateAlign( vec3 v1, vec3 v2)
{
    vec3 crossProduct = cross( v1, v2 );
    const float sinA = length( crossProduct );

    vec3 axis = crossProduct; // vec3 axis = crossProduct / sinA;

    const float cosA = dot( v1, v2 );
    const float oneMinusCosA = 1.0f - cosA;

    mat3 result( ((axis.x / sinA) * (axis.x / sinA) * oneMinusCosA) + cosA,
                 ((axis.y / sinA) * (axis.x / sinA) * oneMinusCosA) - (sinA * (axis.z / sinA)), 
                 ((axis.z / sinA) * (axis.x / sinA) * oneMinusCosA) + (sinA * (axis.y / sinA)),
                 ((axis.x / sinA) * (axis.y / sinA) * oneMinusCosA) + (sinA * (axis.z / sinA)),  
                 ((axis.y / sinA) * (axis.y / sinA) * oneMinusCosA) + cosA,      
                 ((axis.z / sinA) * (axis.y / sinA) * oneMinusCosA) - (sinA * (axis.x / sinA)),
                 ((axis.x / sinA) * (axis.z / sinA) * oneMinusCosA) - (sinA * (axis.y / sinA)),  
                 ((axis.y / sinA) * (axis.z / sinA) * oneMinusCosA) + (sinA * (axis.x / sinA)),  
                 ((axis.z / sinA) * (axis.z / sinA) * oneMinusCosA) + cosA 
                 );

    return result;
}
This looks like a mess but you'll soon see where we're going with this! Let's cancel out the six instances of sinA/sinA. Then we'll group all the remaining divisions of sinA on each line together giving us:
mat3 rotateAlign( vec3 v1, vec3 v2)
{
    vec3 axis = cross( v1, v2 );
    const float sinA = length( axis );

    const float cosA = dot( v1, v2 );
    const float oneMinusCosA = 1.0f - cosA;

    mat3 result( (axis.x * axis.x * oneMinusCosA / (sinA * sinA)) + cosA,
                 (axis.y * axis.x * oneMinusCosA / (sinA * sinA)) - axis.z, 
                 (axis.z * axis.x * oneMinusCosA / (sinA * sinA)) + axis.y,
                 (axis.x * axis.y * oneMinusCosA / (sinA * sinA)) + axis.z,  
                 (axis.y * axis.y * oneMinusCosA / (sinA * sinA)) + cosA,      
                 (axis.z * axis.y * oneMinusCosA / (sinA * sinA)) - axis.x,
                 (axis.x * axis.z * oneMinusCosA / (sinA * sinA)) - axis.y,  
                 (axis.y * axis.z * oneMinusCosA / (sinA * sinA)) + axis.x,  
                 (axis.z * axis.z * oneMinusCosA / (sinA * sinA)) + cosA 
                 );

    return result;
}
Now we have the term oneMinusCosA / (sinA * sinA) on every line. Let's factor that out into a variable so we only calculate it once. We'll call this value k for brevity.
mat3 rotateAlign( vec3 v1, vec3 v2)
{
    vec3 axis = cross( v1, v2 );
    const float sinA = length( axis );

    const float cosA = dot( v1, v2 );
    const float k = (1.0f - cosA) / (sinA * sinA);

    mat3 result( (axis.x * axis.x * k) + cosA,
                 (axis.y * axis.x * k) - axis.z, 
                 (axis.z * axis.x * k) + axis.y,
                 (axis.x * axis.y * k) + axis.z,  
                 (axis.y * axis.y * k) + cosA,      
                 (axis.z * axis.y * k) - axis.x,
                 (axis.x * axis.z * k) - axis.y,  
                 (axis.y * axis.z * k) + axis.x,  
                 (axis.z * axis.z * k) + cosA 
                 );

    return result;
}
Now, using the fact that sinA*sinA + cosA*cosA = 1, we can replace (sinA * sinA) with (1 - cosA*cosA). Now we no longer need to calculate sinA at all, so we remove the call to length().
mat3 rotateAlign( vec3 v1, vec3 v2)
{
    vec3 axis = cross( v1, v2 );

    const float cosA = dot( v1, v2 );
    const float k = (1.0f - cosA) / (1.0f - cosA*cosA);

    mat3 result( (axis.x * axis.x * k) + cosA,
                 (axis.y * axis.x * k) - axis.z, 
                 (axis.z * axis.x * k) + axis.y,
                 (axis.x * axis.y * k) + axis.z,  
                 (axis.y * axis.y * k) + cosA,      
                 (axis.z * axis.y * k) - axis.x,
                 (axis.x * axis.z * k) - axis.y,  
                 (axis.y * axis.z * k) + axis.x,  
                 (axis.z * axis.z * k) + cosA 
                 );

    return result;
}
Finally, we know that a*a - b*b is (a+b)*(a-b). Since 1 is equivilent to 1*1, then we can rewrite the denominator of k:
  (1.0f - cosA*cosA)
= (1.0f*1.0f - cosA*cosA)
= ((1.0f + cosA) * (1.0f - cosA))
Giving us:
mat3 rotateAlign( vec3 v1, vec3 v2)
{
    vec3 axis = cross( v1, v2 );

    const float cosA = dot( v1, v2 );
    const float k = (1.0f - cosA) / ((1.0f + cosA) * (1.0f - cosA));

    mat3 result( (axis.x * axis.x * k) + cosA,
                 (axis.y * axis.x * k) - axis.z, 
                 (axis.z * axis.x * k) + axis.y,
                 (axis.x * axis.y * k) + axis.z,  
                 (axis.y * axis.y * k) + cosA,      
                 (axis.z * axis.y * k) - axis.x,
                 (axis.x * axis.z * k) - axis.y,  
                 (axis.y * axis.z * k) + axis.x,  
                 (axis.z * axis.z * k) + cosA 
                 );

    return result;
}
Cancelling out the (1.0f - cosA) terms, we can simplify k and get our final rotateAlign() function:
mat3 rotateAlign( vec3 v1, vec3 v2)
{
    vec3 axis = cross( v1, v2 );

    const float cosA = dot( v1, v2 );
    const float k = 1.0f / (1.0f + cosA);

    mat3 result( (axis.x * axis.x * k) + cosA,
                 (axis.y * axis.x * k) - axis.z, 
                 (axis.z * axis.x * k) + axis.y,
                 (axis.x * axis.y * k) + axis.z,  
                 (axis.y * axis.y * k) + cosA,      
                 (axis.z * axis.y * k) - axis.x,
                 (axis.x * axis.z * k) - axis.y,  
                 (axis.y * axis.z * k) + axis.x,  
                 (axis.z * axis.z * k) + cosA 
                 );

    return result;
}
And there we have it! We have removed every trigonometric function call ( sinf(), cosf() and acosf() ) from rotateAlign() along with the square root required to normalise the axis of rotation!
Conclusion

Now we have a useful function that is shorter, more elegant and doesn't use more CPU cycles than necessary!
I hope I broke the derivation down in a way that was useful to you. Be sure to reread the original article to get Íñigo's thoughts.
If anything is unclear please let me know so I can improve this article if needed. If you notice any mistakes please let me know!