Representing rotations is tricky, whether with Euler angles or quaternions. A rotation has only three degrees of freedom (DOF), so we want a representation in which we can take derivatives with respect to three independent parameters.
In this article, the axis-angle representation is used. We first introduce three formulas for the rotation matrix, which are actually equivalent. Then we choose one of them to derive the derivatives.
Axis-Angle Representation
The axis-angle representation, or more concisely the rotation vector, is a vector $\mathbf{v}$ in $\mathbb{R}^3$. It represents a rotation by the angle $\theta=|\mathbf{v}|$ about the axis $\mathbf{\hat{e}}=\mathbf{\hat{v}}=\mathbf{v}/|\mathbf{v}|$.
What we want is a rotation matrix $\mathbf{R}(\mathbf{v})$ from it. There are 3 formulas:
- Rodrigues’ rotation formula (vector notation)
- Rodrigues’ rotation formula (matrix notation)
- Exponential Map
Rodrigues’ rotation formula
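Suppose we have a vector $\mathbf{u}$ and a rotation vector $\mathbf{v}$. We can separate $\mathbf{u}$ into two components: $(\mathbf{u}\cdot\mathbf{\hat{v}})\mathbf{\hat{v}}$, which is parallel to the axis, and $\mathbf{u}-(\mathbf{u}\cdot\mathbf{\hat{v}})\mathbf{\hat{v}}$, which is perpendicular to it. Perpendicular to both components is a third direction, $\mathbf{\hat{v}}\times\mathbf{\hat{u}}$.
The rotation leaves the parallel component unchanged and turns the perpendicular component by $\theta$ within the plane spanned by that component and $\mathbf{\hat{v}}\times\mathbf{u}$, so: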
\[\begin{align*} \mathbf{R}\mathbf{u}&=(\mathbf{u}\cdot\mathbf{\hat{v}})\mathbf{\hat{v}} + \cos\theta (\mathbf{u}-(\mathbf{u}\cdot\mathbf{\hat{v}})\mathbf{\hat{v}})+\sin\theta\, \mathbf{\hat{v}}\times\mathbf{u}\\ &=\mathbf{u}\cos\theta+(\mathbf{\hat{v}}\times\mathbf{u})\sin\theta+(1-\cos\theta)(\mathbf{u}\cdot\mathbf{\hat{v}})\mathbf{\hat{v}} \end{align*}\]
However, the cross product is awkward to work with, and this formula still does not tell us the rotation matrix $\mathbf{R}$ explicitly.
To get rid of the cross product, we can define a cross-product matrix as follows:
\[\mathbf{V}=[\mathbf{v}]_{\times}=\begin{bmatrix} 0 & -v_z & v_y\\ v_z & 0 & -v_x\\ -v_y & v_x & 0 \end{bmatrix}\]
such that $\mathbf{v}\times\mathbf{u}=[\mathbf{v}]_\times\mathbf{u}$ for any $\mathbf{u}$. The same construction applied to the unit axis gives $[\mathbf{\hat{v}}]_\times=\mathbf{V}/\theta$.
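As a quick sanity check, the identity $\mathbf{v}\times\mathbf{u}=[\mathbf{v}]_\times\mathbf{u}$ can be verified numerically. The following is a minimal pure-Python sketch; the helper names `skew`, `matvec`, and `cross` and the sample vectors are illustrative choices, not part of the derivation.

```python
def skew(v):
    """Cross-product matrix [v]_x as a list of rows."""
    x, y, z = v
    return [[0.0, -z, y],
            [z, 0.0, -x],
            [-y, x, 0.0]]

def matvec(m, u):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][k] * u[k] for k in range(3)) for i in range(3)]

def cross(a, b):
    """Ordinary cross product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

v = [1.0, 2.0, 3.0]    # arbitrary sample vectors
u = [-4.0, 0.5, 2.0]
lhs = cross(v, u)          # v x u
rhs = matvec(skew(v), u)   # [v]_x u
print(lhs, rhs)            # the two lists should agree
```

Writing the cross product as a matrix is what lets the rest of the derivation treat the rotation as a single linear map.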
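Using $\mathbf{\hat{v}}\times(\mathbf{\hat{v}}\times\mathbf{u})=(\mathbf{u}\cdot\mathbf{\hat{v}})\mathbf{\hat{v}}-\mathbf{u}$, we can write Rodrigues’ rotation formula in a more elegant way: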
\[\mathbf{R}\mathbf{u}=\mathbf{u}+\sin\theta[\mathbf{\hat{v}}]_\times\mathbf{u}+(1-\cos\theta)[\mathbf{\hat{v}}]_\times^2 \mathbf{u}\]
Since this holds for every $\mathbf{u}$, we can finally read off the rotation matrix:
\[\mathbf{R}=\mathbf{I}+\sin\theta[\mathbf{\hat{v}}]_\times+(1-\cos\theta)[\mathbf{\hat{v}}]_\times^2\]
Exponential Map
Suppose we already have a rotation matrix $\mathbf{R}(\theta\mathbf{\hat{v}})$ but do not know its explicit form. Now we rotate further about the same axis $\mathbf{\hat{v}}$ by an infinitesimal angle $\varepsilon$. The result should be:
\[\mathbf{R}((\theta+\varepsilon)\mathbf{\hat{v}})=\mathbf{R}(\varepsilon\mathbf{\hat{v}})\mathbf{R}(\theta\mathbf{\hat{v}})\]According to Rodrigues’ rotation formula, when the rotation is small, we have:
\[\mathbf{R}(\varepsilon\mathbf{\hat{v}})\approx\mathbf{I}+\varepsilon[\mathbf{\hat{v}}]_\times\]
where terms of order $\varepsilon^2$ are dropped. We can substitute it into the previous formula:
\[\mathbf{R}((\theta+\varepsilon)\mathbf{\hat{v}})\approx(\mathbf{I}+\varepsilon[\mathbf{\hat{v}}]_\times)\mathbf{R}(\theta\mathbf{\hat{v}})\]
So we have the derivative of the rotation matrix with respect to the rotation angle:
\[\begin{align*} \frac{\partial \mathbf{R}(\theta\mathbf{\hat{v}})}{\partial \theta} &=\lim_{\varepsilon\to0}\frac{\mathbf{R}((\theta+\varepsilon)\mathbf{\hat{v}})-\mathbf{R}(\theta\mathbf{\hat{v}})}{\varepsilon} \\ &=[\mathbf{\hat{v}}]_\times \mathbf{R}(\theta\mathbf{\hat{v}}) \end{align*}\]
This is a differential equation with the initial condition $\mathbf{R}(\mathbf{0})=\mathbf{I}$, and we already know how to solve an equation of this kind. The solution is:
\[\mathbf{R}(\theta\mathbf{\hat{v}})=e^{\theta[\mathbf{\hat{v}}]_\times}=e^{[\mathbf{v}]_\times}=e^{\mathbf{V}}\]
Knowing that $[\mathbf{\hat{v}}]_\times^3=-[\mathbf{\hat{v}}]_\times$, we can prove by Taylor expansion that the exponential map is equivalent to the matrix form of Rodrigues’ rotation formula.
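The Taylor-expansion argument can be sketched as follows. Here $\mathbf{K}=[\mathbf{\hat{v}}]_\times$ is a shorthand introduced only for this sketch, so that $[\mathbf{v}]_\times=\theta\mathbf{K}$ and $\mathbf{K}^3=-\mathbf{K}$, $\mathbf{K}^4=-\mathbf{K}^2$, and so on:
\[\begin{align*} e^{\theta\mathbf{K}} &=\mathbf{I}+\theta\mathbf{K}+\frac{\theta^2}{2!}\mathbf{K}^2+\frac{\theta^3}{3!}\mathbf{K}^3+\frac{\theta^4}{4!}\mathbf{K}^4+\cdots\\ &=\mathbf{I}+\left(\theta-\frac{\theta^3}{3!}+\frac{\theta^5}{5!}-\cdots\right)\mathbf{K}+\left(\frac{\theta^2}{2!}-\frac{\theta^4}{4!}+\frac{\theta^6}{6!}-\cdots\right)\mathbf{K}^2\\ &=\mathbf{I}+\sin\theta\,\mathbf{K}+(1-\cos\theta)\mathbf{K}^2 \end{align*}\]
The odd powers collect into the series of $\sin\theta$ and the even powers into the series of $1-\cos\theta$, which recovers the matrix form of Rodrigues’ rotation formula.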
For further reading, check this page.
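The equivalence between the exponential map and the matrix form of Rodrigues’ formula can also be checked numerically. The sketch below, in plain Python, compares the closed form against a truncated power series of $e^{[\mathbf{v}]_\times}$. The function names, the sample vector, and the number of series terms are assumptions made for illustration.

```python
import math

def skew(v):
    """Cross-product matrix [v]_x as a list of rows."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rodrigues(v):
    """R = I + sin(theta) K + (1 - cos(theta)) K^2 with K = [v_hat]_x."""
    theta = math.sqrt(sum(c * c for c in v))
    if theta < 1e-12:
        return identity()  # treat a negligible angle as no rotation
    k = skew([c / theta for c in v])
    k2 = matmul(k, k)
    eye = identity()
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[eye[i][j] + s * k[i][j] + c * k2[i][j] for j in range(3)]
            for i in range(3)]

def exp_series(v, terms=30):
    """Truncated power series: sum over n < terms of [v]_x^n / n!."""
    big_v = skew(v)
    result = identity()
    term = identity()  # holds [v]_x^n / n! for the current n
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, big_v)]
        result = [[result[i][j] + term[i][j] for j in range(3)]
                  for i in range(3)]
    return result

v = [0.3, -0.7, 1.1]  # arbitrary sample rotation vector
r_closed = rodrigues(v)
r_series = exp_series(v)
max_diff = max(abs(r_closed[i][j] - r_series[i][j])
               for i in range(3) for j in range(3))
print(max_diff)  # should be at floating-point round-off level
```

In practice the closed form is the one to use; the series is shown only to make the equivalence concrete.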
Derivative of Rotation Matrix
We already know how to obtain the rotation matrix from the rotation vector. The next question is: what is the derivative of the rotated vector with respect to the rotation vector?
More specifically, we need:
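\[\frac{\partial \mathbf{R}(\mathbf{v})\mathbf{u}}{\partial v_i}=\lim_{\varepsilon\to0}\frac{\mathbf{R}(\mathbf{v}+\varepsilon\mathbf{\hat{e}}_i)\mathbf{u}-\mathbf{R}(\mathbf{v})\mathbf{u}}{\varepsilon}\]
Differentiating with respect to $\mathbf{v}$ directly is cumbersome. The composition rule used for the exponential map, $\mathbf{R}(\mathbf{v}+d\mathbf{v})=\mathbf{R}(d\mathbf{v})\mathbf{R}(\mathbf{v})$, is exact only when $d\mathbf{v}$ is parallel to $\mathbf{v}$, because rotations about different axes do not commute. However, any rotation close to $\mathbf{R}(\mathbf{v})$ can still be written as a small rotation $\mathbf{R}(\mathbf{p})$ applied on the left:
\[\mathbf{R}(\mathbf{v}+d\mathbf{v})=\mathbf{R}(\mathbf{p})\mathbf{R}(\mathbf{v})\]
for some small $\mathbf{p}$. So if we are solving an optimization problem on $f(\mathbf{R}(\mathbf{v}^{(m)}))$ that needs derivatives in each iteration, we only need the derivative with respect to a new infinitesimal rotation vector $\mathbf{p}$, evaluated at the identity ($\mathbf{p}=\mathbf{0}$):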
\[\mathbf{J}:=\frac{\partial f(e^{[\mathbf{p}]_\times}\mathbf{R}(\mathbf{v}^{(m)}))}{\partial \mathbf{p}}\bigg|_{\mathbf{p}=0}\]
The same holds for the second-order Hessian matrix $\mathbf{H}$. Once the step $\mathbf{p}$ has been computed, we apply the update rule $\mathbf{R}(\mathbf{v}^{(m+1)})=e^{[\mathbf{p}]_\times}\mathbf{R}(\mathbf{v}^{(m)})$.
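To make the update rule concrete, here is a small sketch in plain Python. It assumes a toy objective $f(\mathbf{R})=|\mathbf{R}\mathbf{u}-\mathbf{t}|^2$, estimates $\mathbf{J}$ at $\mathbf{p}=0$ by central differences, and takes fixed-size gradient-descent steps. The objective, the step size, the iteration count, and all function names are assumptions chosen for illustration; a real solver would typically use analytic derivatives and a Gauss-Newton or similar step.

```python
import math

def skew(v):
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, u):
    return [sum(m[i][k] * u[k] for k in range(3)) for i in range(3)]

def exp_map(p):
    """exp([p]_x), evaluated with the matrix form of Rodrigues' formula."""
    theta = math.sqrt(sum(c * c for c in p))
    if theta < 1e-12:
        return identity()
    k = skew([c / theta for c in p])
    k2 = matmul(k, k)
    eye = identity()
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[eye[i][j] + s * k[i][j] + c * k2[i][j] for j in range(3)]
            for i in range(3)]

def cost(rot, u, t):
    """Toy objective f(R) = |R u - t|^2 (assumed for this example)."""
    ru = matvec(rot, u)
    return sum((a - b) ** 2 for a, b in zip(ru, t))

def jacobian_at_identity(rot, u, t, h=1e-6):
    """Central-difference estimate of d f(exp([p]_x) R) / d p at p = 0."""
    jac = []
    for i in range(3):
        p_plus = [0.0, 0.0, 0.0]
        p_minus = [0.0, 0.0, 0.0]
        p_plus[i], p_minus[i] = h, -h
        f_plus = cost(matmul(exp_map(p_plus), rot), u, t)
        f_minus = cost(matmul(exp_map(p_minus), rot), u, t)
        jac.append((f_plus - f_minus) / (2.0 * h))
    return jac

u = [1.0, 0.0, 0.0]   # vector to rotate
t = [0.0, 1.0, 0.0]   # target direction
rot = identity()      # current estimate R(v^(m))
step = 0.25           # assumed fixed step size
for _ in range(50):
    jac = jacobian_at_identity(rot, u, t)
    p = [-step * g for g in jac]     # gradient-descent step in p
    rot = matmul(exp_map(p), rot)    # left-multiplicative update
final_cost = cost(rot, u, t)
print(final_cost)
```

The point of the design is that $\mathbf{p}$ is reset to zero at every iteration, so the derivative is only ever needed at the identity.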
Jacobian of Rotation Matrix
The cross-product matrix is linear in $\mathbf{p}$ and can be written as:
\[[\mathbf{p}]_\times=\mathbf{G}_1 p_1+\mathbf{G}_2 p_2 + \mathbf{G}_3 p_3\]
in which $\mathbf{G}_i=[\mathbf{\hat{e}}_i]_\times$ and $\mathbf{\hat{e}}_i$ is the $i$-th standard basis vector. This is a linear combination, so the first-order derivative is immediate:
\[\frac{\partial [\mathbf{p}]_\times}{\partial \mathbf{p}}\bigg|_{\mathbf{p}=0}=\begin{bmatrix}\mathbf{G}_1 & \mathbf{G}_2 & \mathbf{G}_3\end{bmatrix}\]Or we can write it in derivatives of each component:
\[\frac{\partial [\mathbf{p}]_\times}{\partial p_i}=[\mathbf{\hat{e}}_i]_\times\]
Since $\mathbf{R}(\mathbf{p})=e^{[\mathbf{p}]_\times}=\mathbf{I}+[\mathbf{p}]_\times+O(|\mathbf{p}|^2)$, we have $\frac{\partial\mathbf{R}(\mathbf{p})}{\partial p_i}\bigg|_{\mathbf{p}=0}=[\mathbf{\hat{e}}_i]_\times$.
However, we can make our lives easier by considering the derivative of the rotated vector with respect to the rotation vector. A first-order derivative at $\mathbf{p}=0$ only needs the first-order terms of the Taylor expansion, which are:
\[\begin{align*} \mathbf{R}(\mathbf{p})\mathbf{u}&\approx\mathbf{u}+[\mathbf{p}]_\times \mathbf{u}\\ &=\mathbf{u}-[\mathbf{u}]_\times \mathbf{p} \end{align*}\]
where the second line uses $\mathbf{p}\times\mathbf{u}=-\mathbf{u}\times\mathbf{p}$. So we have:
\[\frac{\partial \mathbf{R}(\mathbf{p})\mathbf{u}}{\partial \mathbf{p}}\bigg|_{\mathbf{p}=0}=-[\mathbf{u}]_\times\]
Hessian of Rotation Matrix
Again, we start with $\mathbf{R}(\mathbf{p})=e^{[\mathbf{p}]_\times}$. The second-order Taylor expansion is:
\[\begin{align*} \mathbf{R}(\mathbf{p}) &\approx\mathbf{I}+[\mathbf{p}]_\times+\frac{1}{2}[\mathbf{p}]_\times[\mathbf{p}]_\times\\ &=\mathbf{I}+\sum_i \mathbf{G}_i p_i+\frac{1}{2}\sum_i\sum_j \mathbf{G}_i \mathbf{G}_j p_i p_j \end{align*}\]
Then each component of the second-order derivative is:
\[\frac{\partial^2 \mathbf{R}(\mathbf{p})}{\partial p_i\partial p_j}\bigg|_{\mathbf{p}=0}=\frac{1}{2}([\mathbf{\hat{e}}_i]_\times [\mathbf{\hat{e}}_j]_\times + [\mathbf{\hat{e}}_j]_\times[\mathbf{\hat{e}}_i]_\times)\]
Since the second-order derivative at $\mathbf{p}=0$ depends only on the second-order term, we have:
\[\begin{align*} \frac{\partial^2 \mathbf{R}(\mathbf{p})\mathbf{u}}{\partial \mathbf{p}^2} &=\frac{\partial^2 }{\partial \mathbf{p}^2}\left(\frac{1}{2}[\mathbf{p}]_\times[\mathbf{p}]_\times\mathbf{u}\right)\\ &=\frac{\partial^2 }{\partial \mathbf{p}^2}\left(\frac{1}{2}\mathbf{p}\times(\mathbf{p}\times\mathbf{u})\right)\\ &=\frac{1}{2}\frac{\partial^2 }{\partial \mathbf{p}^2}\left((\mathbf{p}\cdot\mathbf{u})\mathbf{p}-(\mathbf{p}\cdot\mathbf{p})\mathbf{u}\right) \end{align*}\]
The result is a third-order tensor:
\[(\frac{\partial^2 \mathbf{R}(\mathbf{p})\mathbf{u}}{\partial \mathbf{p}^2})_{i,jk}=\frac{\partial^2 (\mathbf{R}(\mathbf{p})\mathbf{u})_i}{\partial p_j \partial p_k}=\frac{1}{2}(\delta_{ij}u_k+\delta_{ik}u_j-2\delta_{jk}u_i)\]
Written as three matrices, one for each component $i$ of the rotated vector:
\[\begin{align*} (\frac{\partial^2 \mathbf{R}(\mathbf{p})\mathbf{u}}{\partial \mathbf{p}^2})_{1}&=\frac{1}{2}\begin{pmatrix}0 & u_y & u_z\\ u_y & -2u_x & 0 \\ u_z & 0 & -2u_x \end{pmatrix}\\ (\frac{\partial^2 \mathbf{R}(\mathbf{p})\mathbf{u}}{\partial \mathbf{p}^2})_{2}&=\frac{1}{2}\begin{pmatrix}-2u_y & u_x & 0\\ u_x & 0 & u_z \\ 0 & u_z & -2u_y \end{pmatrix}\\ (\frac{\partial^2 \mathbf{R}(\mathbf{p})\mathbf{u}}{\partial \mathbf{p}^2})_{3}&=\frac{1}{2}\begin{pmatrix}-2u_z & 0 & u_x\\ 0 & -2u_z & u_y \\ u_x & u_y & 0 \end{pmatrix} \end{align*}\]
We can check that the two Hessian formulas are consistent, since $\mathbf{u}$ does not depend on $\mathbf{p}$:
\[\frac{\partial^2 (\mathbf{R}(\mathbf{p})\mathbf{u})_i}{\partial p_j \partial p_k}=\sum_s (\frac{\partial^2 \mathbf{R}(\mathbf{p})}{\partial p_j\partial p_k})_{is}u_s\]
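Both results can be cross-checked numerically. The sketch below, in plain Python, compares the Jacobian $-[\mathbf{u}]_\times$ and the Hessian tensor $\frac{1}{2}(\delta_{ij}u_k+\delta_{ik}u_j-2\delta_{jk}u_i)$ against central finite differences of $\mathbf{R}(\mathbf{p})\mathbf{u}$ around $\mathbf{p}=0$. The helper names, the sample vector, and the step size `h` are assumptions made for this illustration.

```python
import math

def skew(v):
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, u):
    return [sum(m[i][k] * u[k] for k in range(3)) for i in range(3)]

def exp_map(p):
    """exp([p]_x), evaluated with the matrix form of Rodrigues' formula."""
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    theta = math.sqrt(sum(c * c for c in p))
    if theta < 1e-12:
        return eye
    k = skew([c / theta for c in p])
    k2 = matmul(k, k)
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[eye[i][j] + s * k[i][j] + c * k2[i][j] for j in range(3)]
            for i in range(3)]

u = [0.3, -1.2, 2.0]   # arbitrary sample vector
h = 1e-3               # finite-difference step (assumed)

def rotated(p):
    return matvec(exp_map(p), u)

def unit(j, scale):
    e = [0.0, 0.0, 0.0]
    e[j] = scale
    return e

# Jacobian: entry (i, j) is d(R(p)u)_i / dp_j at p = 0.
jac_fd = [[(rotated(unit(j, h))[i] - rotated(unit(j, -h))[i]) / (2.0 * h)
           for j in range(3)] for i in range(3)]
jac_exact = [[-x for x in row] for row in skew(u)]

def hessian_fd(i, j, k):
    """Central second difference of (R(p)u)_i in directions j and k."""
    def f(sj, sk):
        p = [0.0, 0.0, 0.0]
        p[j] += sj * h
        p[k] += sk * h
        return rotated(p)[i]
    return (f(1, 1) - f(1, -1) - f(-1, 1) + f(-1, -1)) / (4.0 * h * h)

def hessian_exact(i, j, k):
    d = lambda a, b: 1.0 if a == b else 0.0
    return 0.5 * (d(i, j) * u[k] + d(i, k) * u[j] - 2.0 * d(j, k) * u[i])

jac_err = max(abs(jac_fd[i][j] - jac_exact[i][j])
              for i in range(3) for j in range(3))
hess_err = max(abs(hessian_fd(i, j, k) - hessian_exact(i, j, k))
               for i in range(3) for j in range(3) for k in range(3))
print(jac_err, hess_err)  # both should be small compared with |u|
```

The residual errors come from the finite-difference truncation and shrink as `h` is reduced, until round-off starts to dominate.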