Solving matrix commutator equations?

  • #1
Filip Larsen
Gold Member
I have the matrix relationship $$C = A^{-1} B^{-1} A B$$ that I want to solve for ##A##, where ##A, B, C## are 4x4 homogeneous matrices, e.g. for ##A## the structure is $$A = \begin{pmatrix} R_A & \delta_A \\ 0 & 1 \end{pmatrix}, \quad A^{-1} =\begin{pmatrix} R_A^\intercal & -R_A^\intercal\delta_A \\ 0 & 1 \end{pmatrix} $$ representing a 3x3 rotation matrix ##R_A## and a 3-vector displacement ##\delta_A## (a similar structure holds for ##B## and ##C##), and I know ##B, C\neq I##. Straightforward rearrangements of the equation always leave ##A## both pre- and post-multiplied, so how would I go about solving this, preferably in a way that allows (numeric) calculations that retain the rotation/displacement structure?
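For concreteness, here is a minimal NumPy sketch of the structure I am working with (the function names are just placeholders of mine):

```python
import numpy as np

def homogeneous(R, d):
    """Assemble a 4x4 homogeneous matrix from a 3x3 rotation R and a 3-vector displacement d."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = d
    return T

def inv_homogeneous(T):
    """Closed-form inverse [R^T, -R^T d; 0, 1] of a homogeneous matrix."""
    R, d = T[:3, :3], T[:3, 3]
    return homogeneous(R.T, -R.T @ d)

def commutator(A, B):
    """Group commutator C = A^-1 B^-1 A B of two homogeneous matrices."""
    return inv_homogeneous(A) @ inv_homogeneous(B) @ A @ B
```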
 
  • #2
What I mean by retaining the structure is that the 4x4 equation corresponds to
$$R_C = R_A^\intercal R_B^\intercal R_A R_B$$
and
$$\delta_C = R_A^\intercal ( R_B^\intercal ( R_A \delta_B + \delta_A - \delta_B) - \delta_A),$$
where both ##R_A## and ##\delta_A## are the unknowns. I assume solving for ##R_A## first would make sense, since then the second equation reduces to a vector equation with only ##\delta_A## unknown, so I guess my problem has been reduced to solving for the 3x3 rotations.

The second equation can be written as $$(R_B^\intercal-I) \delta_A = R_A \delta_C - R_B^\intercal ( R_A - I) \delta_B.$$ Note, though, that ##R_B^\intercal-I## is singular for every rotation (its null space is the rotation axis of ##R_B##), so for ##R_B \neq I## this determines ##\delta_A## only up to a component along that axis.
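In case it is useful later, here is a small NumPy sketch (my own illustration) of that vector solve for a known ##R_A##; because of the singularity I use a least-squares solve, which simply leaves the undetermined axis component of ##\delta_A## at zero:

```python
import numpy as np

def solve_delta_A(R_A, R_B, delta_B, delta_C):
    """Solve (R_B^T - I) delta_A = R_A delta_C - R_B^T (R_A - I) delta_B for delta_A.

    R_B^T - I is always rank-deficient (its null space is the rotation axis of R_B),
    so lstsq returns the minimum-norm solution: the component of delta_A along that
    axis is left at zero and is not determined by this equation alone.
    """
    lhs = R_B.T - np.eye(3)
    rhs = R_A @ delta_C - R_B.T @ (R_A - np.eye(3)) @ delta_B
    delta_A, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return delta_A
```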
 
Last edited:
  • #3
I am still stuck trying to solve this equation, so let me try to simplify it further in the hope that someone here can point me in a more fruitful direction towards a solution.

Given that ##A, B, X \in \text{SO(3)}## and given that ##A, B \neq I## and ##A \neq B##, then find any ##X## such that
$$A X = X B .$$
Since this equation represents 9 equations in the 9 unknown elements of ##X##, and there are 9 more equations from the orthogonality of SO(3), i.e. ##X X^T = I##, intuition tells me it should be solvable, but my intuition also tells me that surely there must exist a method for such equations that does not involve explicitly solving for 9 unknowns in 18 equations. However, so far none of the symbolic or numeric methods in my engineering toolbox seem to come close. So, what tool am I missing in my toolbox for this?
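Just to make the equation-counting concrete: the linear part of ##AX = XB## can be written as a 9x9 system using the standard vectorization identity ##\operatorname{vec}(AXB) = (B^\intercal \otimes A)\operatorname{vec}(X)##, which gives ##(I \otimes A - B^\intercal \otimes I)\operatorname{vec}(X) = 0##; the orthogonality equations then still have to be imposed on top. A rough NumPy sketch of that idea (my own illustration, not a full solution):

```python
import numpy as np

def commuting_X_basis(A, B, tol=1e-9):
    """Basis of the linear solution space of A X = X B.

    Uses vec(A X B) = (B^T kron A) vec(X) with column-major (Fortran-order) vec,
    so A X - X B = 0 becomes (I kron A - B^T kron I) vec(X) = 0.  The returned
    3x3 matrices only span the linear solution space; the SO(3) constraint
    X X^T = I still has to be imposed separately.
    """
    M = np.kron(np.eye(3), A) - np.kron(B.T, np.eye(3))
    _, s, Vt = np.linalg.svd(M)
    null_rows = Vt[s < tol * s.max()]
    return [v.reshape(3, 3, order='F') for v in null_rows]
```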
 
  • #4
Filip Larsen said:
I am still stuck trying to solve this equation, so let me try to simplify it further in the hope that someone here can point me in a more fruitful direction towards a solution.

Given that ##A, B, X \in \text{SO(3)}## and given that ##A, B \neq I## and ##A \neq B##, then find any ##X## such that
$$A X = X B .$$
Since this equation represents 9 equations in the 9 unknown elements of ##X##, and there are 9 more equations from the orthogonality of SO(3), i.e. ##X X^T = I##, intuition tells me it should be solvable, but my intuition also tells me that surely there must exist a method for such equations that does not involve explicitly solving for 9 unknowns in 18 equations. However, so far none of the symbolic or numeric methods in my engineering toolbox seem to come close. So, what tool am I missing in my toolbox for this?
This isn't as trivial as it might look. You have forgotten to mention that ##C=[A,B]## is quadratic in ##A##, and so is ##AA^\dagger = I.## Hence, you have either the problem of finding a point on the intersection of quadratic surfaces or solving the word problem in ##SO(3).## The latter is in general unsolvable:
https://en.wikipedia.org/wiki/Presentation_of_a_group#Novikov–Boone_theorem
https://de.wikipedia.org/wiki/Präsentation_einer_Gruppe#Das_Konjugationsproblem
These articles basically say: list the conjugates of ##B## and compare them with ##C.## This is impossible for infinite groups, but we can pretend we are working over a finite field of sufficiently large characteristic. However, this makes the number of elements to be considered explode.

The question is whether a given matrix ##CB^{-1}## is the image ##\iota_A(B^{-1})## of ##B^{-1}## under some inner automorphism ##\iota_A##, and to determine this specific automorphism. This looks like trial and error, too. Maybe you could determine the conjugacy classes of the (few) generators of ##SO(3)##, or first determine how each generator is written as a commutator.

Of course, the situation in ##SO(3)## is way more specific than the general situation in an arbitrary group, so I would look at the algorithms mentioned on those two Wikipedia pages, or give in and do it by hand.
 
  • #5
Thanks for the input. If nothing else it helps me feel less frustrated about not being able to readily solve what appears to be a trivial problem.

fresh_42 said:
you have either the problem of finding a point on the intersection of quadratic surfaces or solving the word problem
I have tried to reduce the ##AX = XB## equation further using the cross product matrix for the rotation ##X##, and that does indeed seem to head towards having 9 quadratic equations in 5 unknowns (or 4, if using a small-angle approximation).

Since the measured ##C## in my original equation grows slowly from the identity (for which ##A=I## is a solution) due to a manual measurement process, it would also make sense that ##A## can be grown slowly too using numeric iteration, perhaps simply by brute-force guessing and combining a sequence of small-angle rotations (e.g. around the 3 primary axes) such that each step makes the residual between ##C## and ##[A,B]## smaller.
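As a rough sketch of what I have in mind (NumPy; the step size, the stopping rule and the choice to post-multiply the trial rotations are arbitrary choices of mine):

```python
import numpy as np

def primary_rot(axis, angle):
    """Rotation about one of the primary axes: 0 = x, 1 = y, 2 = z."""
    c, s = np.cos(angle), np.sin(angle)
    i, j = [(1, 2), (2, 0), (0, 1)][axis]
    R = np.eye(3)
    R[i, i] = R[j, j] = c
    R[i, j], R[j, i] = -s, s
    return R

def residual(A, B, C):
    """Frobenius norm of the mismatch between the measured C and A^T B^T A B."""
    return np.linalg.norm(A.T @ B.T @ A @ B - C)

def grow_A(B, C, step=1e-3, max_iter=20000):
    """Greedy search: starting from A = I, repeatedly apply the small
    primary-axis rotation that reduces the residual the most."""
    A = np.eye(3)
    best = residual(A, B, C)
    for _ in range(max_iter):
        trials = [(residual(A @ primary_rot(ax, sgn * step), B, C), ax, sgn)
                  for ax in range(3) for sgn in (1, -1)]
        r, ax, sgn = min(trials)
        if r >= best:
            break  # no small primary-axis step improves the fit any further
        A, best = A @ primary_rot(ax, sgn * step), r
    return A, best
```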
 
  • #6
Filip Larsen said:
Given that ##A, B, X \in \text{SO(3)}## and given that ##A, B \neq I## and ##A \neq B##, then find any ##X## such that
$$A X = X B .$$
Since this equation represents 9 equations in the 9 unknown elements of ##X##, and there are 9 more equations from the orthogonality of SO(3), i.e. ##X X^T = I##, intuition tells me it should be solvable, but my intuition also tells me that surely there must exist a method for such equations that does not involve explicitly solving for 9 unknowns in 18 equations. However, so far none of the symbolic or numeric methods in my engineering toolbox seem to come close. So, what tool am I missing in my toolbox for this?
@Filip Larsen After all this time, are you still interested in solving your equation? Because by using 3D vector geometry, you can find solutions for your matrix ##X## simply by inspection!

While it's true that an arbitrary ##\text{SO(3)}## rotation matrix ##R##, where ##RR^{T}=I##, contains 9 entries, those entries all depend ultimately on only 3 parameters: a unit-vector ##\boldsymbol{\hat{n}}_{R}\equiv\cos\phi\sin\theta\boldsymbol{\hat{x}}+\sin\phi\sin\theta\boldsymbol{\hat{y}}+\cos\theta\boldsymbol{\hat{z}}## that defines the axis-of-rotation (2 parameters) and a rotation-angle ##\rho## about that axis (1 parameter). This "angle-axis" parameterization of ##R=R\left(\boldsymbol{\hat{n}}_{R};\rho\right)## is expressed in component-form as (https://scipp.ucsc.edu/~haber/ph216/rotation_12.pdf eq.(19)):$$R^{i}{}_{j}\equiv\cos\rho\,\delta^{i}{}_{j}+\left(1-\cos\rho\right)\hat{n}_{R}^{i}\hat{n}_{Rj}-\sin\rho\,\epsilon^{i}{}_{jk}\hat{n}_{R}^{k}\tag{1}$$and has trace:$$\text{tr}R\equiv\delta^{j}{}_{i}R^{i}{}_{j}=2\cos\rho+1\tag{2}$$That is, ##\text{tr}R## depends only on the rotation angle ##\rho##.

This is important because your equation ##AX=XB## necessarily imposes one constraint between the otherwise arbitrary pair of matrices ##A,B##, as can be seen by rewriting it in the form ##A=XBX^{T}\Rightarrow\mathrm{tr}A=\mathrm{tr}\left(XBX^{T}\right)=\mathrm{tr}\left(X^{T}XB\right)=\mathrm{tr}B##. Per (2), this means that the rotation angles ##\alpha,\beta## characterizing ##A,B## must be the same (up to a sign, which can always be absorbed by flipping the direction of one of the rotation axes ##\boldsymbol{\hat{n}}_{A}## or ##\boldsymbol{\hat{n}}_{B}##). In other words, your equation mandates that the general forms of ##A,B## must be:$$A=R\left(\boldsymbol{\hat{n}}_{A};\alpha\right),\;B=R\left(\boldsymbol{\hat{n}}_{B};\alpha\right)\tag{3a,b}$$

To now solve your equation for ##X##, note that ##XBX^{T}=A## can be interpreted as saying that the matrix transformation you seek rotates the rank-2 tensor ##B## into the tensor ##A##. In turn, by (3) this means that ##X## rotates the rank-1 vector ##\boldsymbol{\hat{n}}_{B}## into the vector ##\boldsymbol{\hat{n}}_{A}##; i.e., ##X\boldsymbol{\hat{n}}_{B}=\boldsymbol{\hat{n}}_{A}##. So you're looking for matrices ##X## that simply rotate one unit-vector into another. Denoting by ##\mathcal{P}## the plane spanned by the vectors ##\boldsymbol{\hat{n}}_{A}## and ##\boldsymbol{\hat{n}}_{B}##, two solution-forms for ##X## can be "read-off" from the following diagram:
[Attached figure "Matrix Commutator.png": the unit vectors ##\boldsymbol{\hat{n}}_{A}##, ##\boldsymbol{\hat{n}}_{B}##, the plane ##\mathcal{P}## they span, and the two candidate rotation axes (the ##\parallel## axis along ##\boldsymbol{\hat{n}}_{A}+\boldsymbol{\hat{n}}_{B}##, the ##\perp## axis along ##\boldsymbol{\hat{n}}_{B}\times\boldsymbol{\hat{n}}_{A}##).]

First, consider a rotation-axis ("##\parallel\mathrm{axis}##") in the plane ##\mathcal{P}## that bisects the angle ##\cos^{-1}\left(\boldsymbol{\hat{n}}_{A}\cdotp\boldsymbol{\hat{n}}_{B}\right)## between ##\boldsymbol{\hat{n}}_{A}## and ##\boldsymbol{\hat{n}}_{B}## (i.e., the axis points in the direction of ##\boldsymbol{\hat{n}}_{A}+\boldsymbol{\hat{n}}_{B}##). Rotating ##\boldsymbol{\hat{n}}_{B}## about that axis through an angle of ##\pi## radians sweeps ##\boldsymbol{\hat{n}}_{B}## out of the plane through a half-cone into coincidence with ##\boldsymbol{\hat{n}}_{A}##. Call this solution ##X^{\parallel}##.

Alternatively, choose a rotation-axis ("##\perp\mathrm{axis}##") normal to the plane ##\mathcal{P}## (i.e., the axis points in the direction of ##\boldsymbol{\hat{n}}_{B}\times\boldsymbol{\hat{n}}_{A}##) and rotate ##\boldsymbol{\hat{n}}_{B}## in the plane by the angle ##\cos^{-1}\left(\boldsymbol{\hat{n}}_{A}\cdotp\boldsymbol{\hat{n}}_{B}\right)## to align ##\boldsymbol{\hat{n}}_{B}## with ##\boldsymbol{\hat{n}}_{A}##. This gives the second solution ##X^{\perp}##. Explicitly:$$X^{\parallel}=R\left(\frac{\boldsymbol{\hat{n}}_{A}+\boldsymbol{\hat{n}}_{B}}{\left\Vert \boldsymbol{\hat{n}}_{A}+\boldsymbol{\hat{n}}_{B}\right\Vert };\pi\right),\;X^{\perp}=R\left(\frac{\boldsymbol{\hat{n}}_{B}\times\boldsymbol{\hat{n}}_{A}}{\left\Vert \boldsymbol{\hat{n}}_{B}\times\boldsymbol{\hat{n}}_{A}\right\Vert };\cos^{-1}\left(\boldsymbol{\hat{n}}_{A}\cdotp\boldsymbol{\hat{n}}_{B}\right)\right)\tag{4a,b}$$ So these 2 rotation matrices (as well as their negatives) satisfy ##0=AX^{\parallel,\perp}-X^{\parallel,\perp}B## as desired.
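If it helps, here is a quick numerical spot-check of (4) (a NumPy sketch of my own, using the angle-axis form (1); it assumes ##\boldsymbol{\hat{n}}_{A}## and ##\boldsymbol{\hat{n}}_{B}## are not (anti)parallel):

```python
import numpy as np

def R(n, rho):
    """Angle-axis rotation matrix, eq. (1)."""
    n = np.asarray(n, float) / np.linalg.norm(n)
    K = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])
    return np.cos(rho)*np.eye(3) + (1 - np.cos(rho))*np.outer(n, n) + np.sin(rho)*K

rng = np.random.default_rng(0)
n_A, n_B = (v / np.linalg.norm(v) for v in rng.normal(size=(2, 3)))
alpha = 1.2                                   # common rotation angle, eq. (3a,b)

A, B = R(n_A, alpha), R(n_B, alpha)
assert np.isclose(np.trace(A), np.trace(B))   # necessary condition, eq. (2)

X_par = R(n_A + n_B, np.pi)                                            # eq. (4a)
X_perp = R(np.cross(n_B, n_A), np.arccos(np.clip(n_A @ n_B, -1, 1)))   # eq. (4b)

for X in (X_par, X_perp):
    assert np.allclose(A @ X, X @ B)          # both satisfy A X = X B, eq. (8)
```

Note also that right-multiplying either solution by any rotation about ##\boldsymbol{\hat{n}}_{B}## (which commutes with ##B##) yields further matrices satisfying ##AX=XB##, which may matter if you are after a particular, e.g. smallest-angle, ##X##.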

But masochist that I am, I want to verify this using actual matrix algebra. I'll spare you the display here of the general form of ##R\left(\boldsymbol{\hat{n}}_{R};\rho\right)## with its 9 entries, but you can find it as eq.(20) in the reference cited above. Instead, without loss of generality, I make a judicious choice of cartesian coordinates to simplify the explicit forms of the matrices ##A,B,X^{\parallel},X^{\perp}##. First, I take the ##z##-axis to align with the rotation-axis ##\boldsymbol{\hat{n}}_{A}##:$$
A\equiv R\left(\boldsymbol{\hat{z}};\alpha\right)=\begin{pmatrix}\cos\alpha & -\sin\alpha & 0\\
\sin\alpha & \cos\alpha & 0\\
0 & 0 & 1
\end{pmatrix}
\tag{5}$$Then I choose the ##x,y## axes so that the rotation-axis ##\boldsymbol{\hat{n}}_{B}## lies entirely in the ##xz## plane; i.e., ##\boldsymbol{\hat{n}}_{B}=\pm\sin\theta\boldsymbol{\hat{x}}+\cos\theta\boldsymbol{\hat{z}}##:$$B_{\pm}\equiv R\left(\pm\sin\theta\boldsymbol{\hat{x}}+\cos\theta\boldsymbol{\hat{z}};\alpha\right)$$$$
=\begin{pmatrix}\cos\alpha\cos^{2}\theta+\sin^{2}\theta & -\sin\alpha\cos\theta & \pm\left(1-\cos\alpha\right)\cos\theta\sin\theta\\
\sin\alpha\cos\theta & \cos\alpha & \mp\sin\alpha\sin\theta\\
\pm\left(1-\cos\alpha\right)\cos\theta\sin\theta & \pm\sin\alpha\sin\theta & \cos\alpha\sin^{2}\theta+\cos^{2}\theta
\end{pmatrix}
\tag{6}$$With these choices, eq.(4) becomes:$$
X_{\pm}^{\parallel}\equiv R\left(\pm\sin\frac{\theta}{2}\boldsymbol{\hat{x}}+\cos\frac{\theta}{2}\boldsymbol{\hat{z}};\pi\right)=\begin{pmatrix}-\cos\theta & 0 & \pm\sin\theta\\
0 & -1 & 0\\
\pm\sin\theta & 0 & \cos\theta
\end{pmatrix}
\tag{7a}$$$$
X_{\pm}^{\perp}=R\left(\mp\boldsymbol{\hat{y}};\theta\right)=\begin{pmatrix}\cos\theta & 0 & \mp\sin\theta\\
0 & 1 & 0\\
\pm\sin\theta & 0 & \cos\theta
\end{pmatrix}
\tag{7b}$$Finally, throwing matrices (5), (6) and (7) into Mathematica confirms that:$$0=AX_{\pm}^{\parallel,\perp}-X_{\pm}^{\parallel,\perp}B_{\pm}\tag{8}$$is indeed true.
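(For anyone without Mathematica, the same check also goes through symbolically in SymPy; here is my quick script for the upper-sign case:)

```python
import sympy as sp

a, t = sp.symbols('alpha theta', real=True)
ca, sa, ct, st = sp.cos(a), sp.sin(a), sp.cos(t), sp.sin(t)

A = sp.Matrix([[ca, -sa, 0], [sa, ca, 0], [0, 0, 1]])               # eq. (5)
B = sp.Matrix([[ca*ct**2 + st**2, -sa*ct,  (1 - ca)*ct*st],
               [sa*ct,             ca,     -sa*st],
               [(1 - ca)*ct*st,    sa*st,   ca*st**2 + ct**2]])     # eq. (6), upper sign
X_par = sp.Matrix([[-ct, 0, st], [0, -1, 0], [st, 0, ct]])          # eq. (7a), upper sign
X_perp = sp.Matrix([[ct, 0, -st], [0, 1, 0], [st, 0, ct]])          # eq. (7b), upper sign

for X in (X_par, X_perp):
    assert (A*X - X*B).applyfunc(sp.simplify) == sp.zeros(3, 3)     # eq. (8)
```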

(Whew, I'm exhausted!)
 
  • #7
renormalize said:
are you still interested in solving your equation?
Yes, it is still an ongoing effort for me to establish mathematical and numerical tools to analyze the relationship between the involved transformations, including specifically finding values of ##X## that satisfy ##AX = XB##.

Currently I am trying to use axis-angle quaternions to reduce the number of equations, but I am not there yet.

renormalize said:
note that ##XBX^{T}=A## can be interpreted as saying that the matrix transformation you seek rotates the rank-2 tensor ##B## into the tensor ##A##.
That is a very good observation, giving a much clearer geometrical picture of the possible solutions.

Your two solutions ##X^\parallel## and ##X^\perp## are interesting. I will try to find time to see if they can be used to find solutions that satisfy my needs.

And to clarify my needs, I had perhaps better give a little context for the problem. The original relationship ##C^T D^T C D = E## arises in the residual error analysis of a volume co-registration process, i.e. a process that determines the relative transformation between the same physical object represented by 3D points in two different sensor coordinate systems. Here ##C## is an unknown small-angle/offset system calibration error, ##D## is a controlled large-angle/offset arbitrary transformation applied after co-registration, and ##E## is the subsequent small-angle/offset error observed at ##D##.

When ##D## is the identity (zero rotation and offset) the system is in the co-registration reference position, where everything is assumed to match with zero error in both rotation and offset. Once ##D## is increased in rotation and/or offset the error ##E## also starts to increase, so ##D## is typically increased within the mechanical operating limits of the system to find as large an ##E## as possible. Multiple ##D## with corresponding measured ##E## (for the same unknown ##C##) are performed, and after each I would like to calculate the (approximately) smallest-angle ##C## for that particular pair of ##D## and ##E##. Further down the road it may be interesting to combine the set of ##D## and ##E## to form a statistical estimate of ##C##, but currently I am just looking at what analysis I can throw at each measurement.

As mentioned earlier, the relationship originally involves 4x4 homogeneous transformation matrices (i.e. effectively a combined offset and rotation), but as also mentioned it makes sense to reduce to a rotation-only problem, hence the same structural relationship but with 3x3 rotation matrices instead.

So for the problem you analysed, ##AX = XB##, we thus have ##X = C##, ##A = D^T## and ##B = E D^T##, meaning ##A## and ##B## are large-angle rotations with nearly the same axis and angle. I guess with that setup your ##X^\perp## solution is possibly quite close to the smallest-angle solution I want. When I have time to work on this problem again I may try simply calculating ##X^\perp##, or even just its angle, and see what it gives.
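For my own notes, the calculation I have in mind would be something like this rough sketch (NumPy, my own function names). The axis extraction via the skew-symmetric part assumes the rotation angles of ##A## and ##B## are well away from ##0## and ##\pi## (which should hold since ##D## is a large-angle transformation) and that their axes are not numerically parallel; and since the measured ##E## makes ##\mathrm{tr}\,A = \mathrm{tr}\,B## hold only approximately, the resulting ##X^\perp## will also only approximately satisfy ##AX = XB##.

```python
import numpy as np

def axis_angle(R):
    """Axis and angle of a rotation matrix from its trace and skew-symmetric part.
    Assumes the angle is well away from 0 and pi."""
    angle = np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0))
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return w / np.linalg.norm(w), angle

def rot(n, rho):
    """Angle-axis rotation matrix (same convention as eq. (1) above)."""
    n = np.asarray(n, float) / np.linalg.norm(n)
    K = np.array([[0, -n[2], n[1]], [n[2], 0, -n[0]], [-n[1], n[0], 0]])
    return np.cos(rho)*np.eye(3) + (1 - np.cos(rho))*np.outer(n, n) + np.sin(rho)*K

def calibration_angle_estimate(R_D, R_E):
    """Form A = D^T, B = E D^T, build X_perp per eq. (4b) and return it together
    with its rotation angle as a candidate for the calibration error angle."""
    A, B = R_D.T, R_E @ R_D.T
    n_A, _ = axis_angle(A)
    n_B, _ = axis_angle(B)
    X_perp = rot(np.cross(n_B, n_A), np.arccos(np.clip(n_A @ n_B, -1.0, 1.0)))
    return X_perp, np.arccos(np.clip((np.trace(X_perp) - 1) / 2, -1.0, 1.0))
```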

renormalize said:
Whew, I'm exhausted!
Thanks for the effort, it is really appreciated!
 
Last edited:
