[SOLVED] Orthogonal matrix properties

  • Thread starter
  • Admin
  • #1

Jameson

Administrator
Staff member
Jan 26, 2012
4,043
Problem: Let $O$ be an $n \times n$ orthogonal real matrix, i.e. $O^TO=I_n$. Prove that:

a) Every entry of $O$ is between $-1$ and $1$.
b) If $\lambda$ is an eigenvalue of $O$, then $|\lambda|=1$.
c) $\det O = 1$ or $-1$.

Solution: I want to preface this by saying that although this is a 3-part question and our rules state we should only ask one question at a time, I believe all parts use the same concepts, so it's more efficient to put them together in one thread. If that proves to be untrue, I'll gladly split the thread.

Right now I see the solution for part (c): it uses the fact that $\det(AB)=\det A \det B$.

$1= \det I_n=\det(OO^T)=\det O \det O^T=\det O \det O=(\det O)^2$, thus $\det O$ must be $1$ or $-1$.
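As a quick numerical sanity check (a sketch only, not part of the proof; it assumes numpy, and uses the fact that the Q factor of a QR factorization is orthogonal):

Code:
import numpy as np

rng = np.random.default_rng(0)
# The Q factor of a QR factorization is orthogonal: Q^T Q = I.
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))

print(np.allclose(Q.T @ Q, np.eye(5)))  # True: Q^T Q = I
print(np.linalg.det(Q))                 # approximately +1 or -1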

Any ideas on (a) and (b)?
 

Klaas van Aarsen

MHB Seeker
Staff member
Mar 5, 2012
8,780
Jameson said:
Any ideas on (a) and (b)?
Hi Jameson!

What can you use?
Can you use that the length of each column vector in an orthogonal matrix is 1?
Can you use that an orthogonal matrix does not change the length of a vector?
These are basic properties of an orthogonal matrix.
 

Klaas van Aarsen

MHB Seeker
Staff member
Mar 5, 2012
8,780
Alternatively, suppose $v$ is an eigenvector corresponding to the eigenvalue $\lambda$.
What can you say about $Ov$, $(Ov)^T$, and their product?
 

Deveno

Well-known member
MHB Math Scholar
Feb 15, 2012
1,967
Let \(\displaystyle v\) be an eigenvector for \(\displaystyle O\) with eigenvalue \(\displaystyle \lambda\).

Then:

\(\displaystyle |v| = |Ov| = |\lambda v| = |\lambda||v|\).

This is the quick way to prove (b) (this uses a result proven in a different thread).

To use the result that if \(\displaystyle O_j\) is a column vector of \(\displaystyle O\), then \(\displaystyle |O_j| = 1\), we ought to prove this first.

But, since:

\(\displaystyle |O_j| = \sqrt{\langle O_j,O_j\rangle} = \sqrt{(O_j)^TO_j}\)

and \(\displaystyle (O_j)^TO_j = I_{jj} = 1\) (here I mean the \(\displaystyle j,j\)-th entry of the matrix \(\displaystyle O^TO = I\)), so clearly \(\displaystyle |O_j| = \sqrt{1} = 1\).

If \(\displaystyle u_{ij}\) is the i-th coordinate of \(\displaystyle O_j = (u_{1j},u_{2j},\dots,u_{nj})\), this means that:

\(\displaystyle (u_{ij})^2 \leq (u_{1j})^2 + (u_{2j})^2 + \cdots + (u_{nj})^2 = 1\)

from which it follows that \(\displaystyle |u_{ij}| = \sqrt{(u_{ij})^2} \leq 1\).
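For what it's worth, here is a small numerical illustration of (a), assuming numpy: every column of an orthogonal matrix has unit length, so no entry can exceed 1 in magnitude.

Code:
import numpy as np

rng = np.random.default_rng(1)
# A random orthogonal matrix from the Q factor of a QR factorization.
O, _ = np.linalg.qr(rng.standard_normal((6, 6)))

print(np.allclose(np.linalg.norm(O, axis=0), 1.0))  # every column has length 1
print(np.abs(O).max())                              # largest |entry|: at most 1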
 
Last edited:
  • Thread starter
  • Admin
  • #5

Jameson

Administrator
Staff member
Jan 26, 2012
4,043
Klaas van Aarsen said:
Hi Jameson!

What can you use?
Can you use that the length of each column vector in an orthogonal matrix is 1?
Can you use that an orthogonal matrix does not change the length of a vector?
These are basic properties of an orthogonal matrix.
Hi I like Serena! I'm not sure what properties I can use, to be honest. This class is more applied and computationally based than theoretical, so he hasn't taken much time to discuss this topic in detail. :(

I've been reading through the Wikipedia article on orthogonal matrices; just as you said, the rows and columns must be orthonormal. If it's part of the definition then surely I can use that property. I'm not sure how to generalize this properly for a proof, but I'll try.

Let matrix $A$ be an $n \times n$ orthogonal real matrix. Let us also think of $A$ as consisting of column vectors $[a_1, a_2, \dots, a_n]$ where each $a_i$ is in $\mathbb{R}^n$. Assume that at least one entry in a row/column is greater than 1.

That should violate either $a_i \cdot a_j=0$ (for $i \neq j$) or $\sqrt{a_i \cdot a_i}=1$, but I can't quite see how to get there.

How is this set up so far?
Klaas van Aarsen said:
Alternatively, suppose $v$ is an eigenvector corresponding to the eigenvalue $\lambda$.
What can you say about $Ov$, $(Ov)^T$, and their product?
In this setup it follows by definition of an eigenvector that $Ov=\lambda v$, correct? I want to make sure I'm not mixing something up first.

Assuming that is true, then $(Ov)(Ov)^T=(Ov)(v^TO^T)=O(vv^T)O^T=$?
 
  • Thread starter
  • Admin
  • #6

Jameson

Administrator
Staff member
Jan 26, 2012
4,043
Deveno said:
\(\displaystyle |O_j| = \sqrt{\langle O_j,O_j\rangle} = \sqrt{(O_j)^TO_j}\)

and \(\displaystyle (O_j)^TO_j = I_{jj} = 1\) (here I mean the \(\displaystyle j,j\)-th entry of the matrix \(\displaystyle O^TO = I\)), so clearly \(\displaystyle |O_j| = \sqrt{1} = 1\).
This all makes sense and follows from the definitions of the inner product and an orthogonal matrix. (Yes)

Deveno said:
If \(\displaystyle u_{ij}\) is the i-th coordinate of \(\displaystyle O_j = (u_{1j},u_{2j},\dots,u_{nj})\), this means that:

\(\displaystyle (u_{ij})^2 \leq (u_{1j})^2 + (u_{2j})^2 + \cdots + (u_{nj})^2 = 1\)

from which it follows that \(\displaystyle |u_{ij}| = \sqrt{(u_{ij})^2} \leq 1\).
This part I don't get, especially the inequality. Could you explain more, please, or point me to what I should read up on?
 

Deveno

Well-known member
MHB Math Scholar
Feb 15, 2012
1,967
You should have already proved that if a square matrix has a left inverse, it also has a right inverse and the two are equal. In light of this, from:

\(\displaystyle OO^T = I\) we have \(\displaystyle O^TO = I\). The second equation is more useful.

Traditionally, vectors are written as COLUMNS, so one expresses an inner product as:

\(\displaystyle \langle u,v\rangle = v^Tu\) (row times column).

So \(\displaystyle \langle Ov,Ov\rangle = (Ov)^T(Ov) = (v^TO^T)(Ov) = v^T(O^TO)v = v^TIv = v^Tv = \langle v,v\rangle\).

From this it follows that \(\displaystyle \|Ov\| = \|v\|\) (take the square root of both sides).

But what is each entry of the matrix product \(\displaystyle O^TO\)? Isn't the i,j-th entry the i-th ROW of \(\displaystyle O^T\) (that is, the i-th COLUMN of \(\displaystyle O\)) times (in the sense of an inner product) the j-th column of \(\displaystyle O\)?

Just consider the entries where the row and column have the same index (the diagonal ones). Each of these is a sum of squares that adds up to 1. How can any such square be more than 1 (squares are all non-negative)?

(EDIT: the orthogonality of the columns of \(\displaystyle O\) follows from the definition: consider the inner product of any column with any other. If the two columns are not the same, their inner product is the (matrix) product of the i-th row of \(\displaystyle O^T\) with the j-th column of \(\displaystyle O\), which is the i,j-th entry of the identity matrix, and this is 0 if i ≠ j.)
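A small numerical sketch of both facts (assuming numpy): the entries of \(\displaystyle O^TO\) form the identity matrix, and multiplication by \(\displaystyle O\) preserves length.

Code:
import numpy as np

rng = np.random.default_rng(2)
O, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # a random orthogonal matrix
v = rng.standard_normal(4)

# The (i, j) entry of O^T O is the inner product of columns i and j of O.
print(np.allclose(O.T @ O, np.eye(4)))                       # True: O^T O = I
# Multiplying by O does not change the length of a vector.
print(np.isclose(np.linalg.norm(O @ v), np.linalg.norm(v)))  # True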
 
Last edited:
  • Thread starter
  • Admin
  • #8

Jameson

Administrator
Staff member
Jan 26, 2012
4,043
Bear with me. :confused: I am trying to make sense of this, I promise!

I follow everything on this line.

$\displaystyle \langle Ov,Ov\rangle = (Ov)^T(Ov) = (v^TO^T)(Ov) = v^T(O^TO)v = v^TIv = v^Tv = \langle v,v\rangle$

Since we are dealing with real space, isn't it true that $\langle u,v\rangle =\langle v,u\rangle$? In $\mathbb{R}^n$ the inner product is the dot product, which is commutative.

I have two main questions:

1) I agree that $\langle Ov,Ov\rangle =\langle v,v\rangle$, but if we state that $\displaystyle \|Ov\| = \|v\|$, doesn't that imply that both values are positive, since we'll be taking the square root? How do we justify that those values are positive?

2) I still don't see how we claim that the sum of squares must be 1. Once that is established, it obviously follows that any term in the sum must be less than or equal to 1; otherwise the sum would exceed 1. I just don't see the first part.
 

Deveno

Well-known member
MHB Math Scholar
Feb 15, 2012
1,967
1) The inner product is positive-definite (if \(\displaystyle u \neq 0, \langle u,u\rangle > 0\)).

2) I think you are still not seeing the core idea:

the inner product of the i-th and j-th columns of \(\displaystyle O\) IS the matrix product of the i-th row of \(\displaystyle O^T\) and the j-th column of \(\displaystyle O\). We know ahead of time what all such products are: taken together, these row-times-column products form the identity matrix.
 

Deveno

Well-known member
MHB Math Scholar
Feb 15, 2012
1,967
Maybe this will help:

Suppose we have a 3x3 matrix:

\(\displaystyle A = \begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33} \end{bmatrix}\)

If we want to take the inner product of two columns of \(\displaystyle A\), we can do this two ways, say we want the inner product of column 1: \(\displaystyle u = (a_{11},a_{21},a_{31})\) and column 3: \(\displaystyle v = (a_{13},a_{23},a_{33})\).

Using the standard Euclidean dot-product, we have:

\(\displaystyle \langle u,v\rangle = a_{11}a_{13} + a_{21}a_{23} + a_{31}a_{33}\).

Or, we can form the matrix:

\(\displaystyle A^TA = \begin{bmatrix}a_{11}&a_{21}&a_{31}\\a_{12}&a_{22}&a_{32}\\ a_{13}&a_{23}&a_{33} \end{bmatrix}\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33} \end{bmatrix}\)

in which case the first row of \(\displaystyle A^T\) times the third column of \(\displaystyle A\) (that is to say the 1,3-th entry of the matrix product) also gives the desired inner product: that is if:

\(\displaystyle A^TA = B = (b_{ij}) = \begin{bmatrix}b_{11}&b_{12}&b_{13}\\b_{21}&b_{22}&b_{23}\\b_{31}&b_{32}&b_{33} \end{bmatrix}\)

then:

\(\displaystyle b_{13} = a_{11}a_{13} + a_{21}a_{23} + a_{31}a_{33} = \langle u,v\rangle\).
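A quick numerical check of this identity, assuming numpy (note it holds for any matrix \(\displaystyle A\), orthogonal or not):

Code:
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))  # any 3x3 matrix, not necessarily orthogonal

u = A[:, 0]      # column 1 of A
v = A[:, 2]      # column 3 of A
B = A.T @ A

# The 1,3-th entry of A^T A equals the inner product of columns 1 and 3.
print(np.isclose(B[0, 2], u @ v))  # True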
 

Klaas van Aarsen

MHB Seeker
Staff member
Mar 5, 2012
8,780
Jameson said:
Hi I like Serena! I'm not sure what properties I can use, to be honest. This class is more applied and computationally based than theoretical, so he hasn't taken much time to discuss this topic in detail. :(

I've been reading through the Wikipedia article on orthogonal matrices; just as you said, the rows and columns must be orthonormal. If it's part of the definition then surely I can use that property. I'm not sure how to generalize this properly for a proof, but I'll try.
For an applied course I would expect you can use anything you can look up.
That's what an engineer does.
According to Chris L T521, an engineer has a red ball book in which he can look up the properties of a red ball, rather than trying to deduce its properties from its chemical composition.

Note that these properties are not part of the definition of an orthogonal matrix, but they are consequences of the definition, each of which can be proven.



Jameson said:
Let matrix $A$ be an $n \times n$ orthogonal real matrix. Let us also think of $A$ as consisting of column vectors $[a_1, a_2, \dots, a_n]$ where each $a_i$ is in $\mathbb{R}^n$. Assume that at least one entry in a row/column is greater than 1.

That should violate either $a_i \cdot a_j=0$ (for $i \neq j$) or $\sqrt{a_i \cdot a_i}=1$, but I can't quite see how to get there.

How is this set up so far?
That is the right direction.

Suppose $b$ is one of those column vectors $a_i$.
Then $b \cdot b = 1$.
Writing it out, this is:
$$b \cdot b = b_1^2 + b_2^2 + ... + b_n^2 = 1$$
Note that each of the terms must be at least 0, since they are squares.
So what happens if any of the $b_j$ is either greater than $1$ or less than $-1$?


Jameson said:
In this setup it follows by definition of an eigenvector that $Ov=\lambda v$, correct? I want to make sure I'm not mixing something up first.

Assuming that is true, then $(Ov)(Ov)^T=(Ov)(v^TO^T)=O(vv^T)O^T=$?
Let's try it like this:
$$(Ov)^T(Ov)=(v^TO^T)(Ov)=v^T(O^TO)v$$
Now combine it with what is given: $O^TO=I$ and $Ov=\lambda v$.
 

Jameson

Administrator
Staff member
Jan 26, 2012
4,043
Klaas van Aarsen said:
That is the right direction.

Suppose $b$ is one of those column vectors $a_i$.
Then $b \cdot b = 1$.
Writing it out, this is:
$$b \cdot b = b_1^2 + b_2^2 + ... + b_n^2 = 1$$
Note that each of the terms must be at least 0, since they are squares.
So what happens if any of the $b_j$ is either greater than $1$ or less than $-1$?
If $|b_j|>1$ for any $j$, then its square would be larger than 1, which contradicts $b \cdot b = 1$. That I definitely understand. If we use the fact that the columns are orthonormal, then $b \cdot b = 1$ is justified by the following: $1=\|b\|=\sqrt{b \cdot b} \implies b \cdot b=1$. I don't see how to prove that without using the definition, though.

Klaas van Aarsen said:
Let's try it like this:
$$(Ov)^T(Ov)=(v^TO^T)(Ov)=v^T(O^TO)v$$
Now combine it with what is given: $O^TO=I$ and $Ov=\lambda v$.
Using Deveno's helpful posts, I know that $(Ov)^T(Ov)=(v^TO^T)(Ov)=v^T(O^TO)v$ leads to $v^T I v=v^Tv$. Using the definition of the dot product, that can be rewritten as $(Ov) \cdot (Ov)= v \cdot v$. Since $a \cdot a \ge 0$ for any vector $a$, we can take square roots to get $\|Ov\| = \|v\|$. So we have justified that multiplying by matrix $O$ doesn't change the length of $v$.

I don't know how to go from this to using $\lambda$ though.

Thank you both so much! Deveno is right that I am missing some core concepts and I've found that it takes time for these things to sink in. Hopefully I'll have an "Aha!" moment in the next day or two where it all comes together.
 

Klaas van Aarsen

MHB Seeker
Staff member
Mar 5, 2012
8,780
Jameson said:
If we use the fact that the columns are orthonormal, then $b \cdot b = 1$ is justified by the following: $1=\|b\|=\sqrt{b \cdot b} \implies b \cdot b=1$. I don't see how to prove that without using the definition, though.
Prove what?



Jameson said:
Using Deveno's helpful posts, I know that $(Ov)^T(Ov)=(v^TO^T)(Ov)=v^T(O^TO)v$ leads to $v^T I v=v^Tv$. Using the definition of the dot product, that can be rewritten as $(Ov) \cdot (Ov)= v \cdot v$. Since $a \cdot a \ge 0$ for any vector $a$, we can take square roots to get $\|Ov\| = \|v\|$. So we have justified that multiplying by matrix $O$ doesn't change the length of $v$.

I don't know how to go from this to using $\lambda$ though.
Let's try substituting $Ov=\lambda v$ and $O^TO=I$:
\begin{array}{lcl}
(Ov)^T(Ov) &=& v^T(O^TO)v \\
(\lambda v)^T(\lambda v) &=& v^T I v \\
\lambda^2 v^Tv &=& v^T v \\
\lambda^2 &=& 1 \\
\lambda &=& \pm 1
\end{array}
(We may divide by $v^Tv$ because an eigenvector is nonzero, so $v^Tv > 0$. And if $\lambda$ is complex, the same computation with the conjugate transpose $v^*$ gives $|\lambda|^2 v^*v = v^*v$, hence $|\lambda| = 1$.)
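As a numerical illustration (assuming numpy): a rotation matrix is real and orthogonal with complex eigenvalues, and every eigenvalue has modulus 1.

Code:
import numpy as np

# A rotation by 30 degrees: real orthogonal, with complex eigenvalues.
t = np.pi / 6
O = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])

lam = np.linalg.eigvals(O)
print(lam)                            # the conjugate pair e^{it}, e^{-it}
print(np.allclose(np.abs(lam), 1.0))  # True: |lambda| = 1 for each eigenvalue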
 

Deveno

Well-known member
MHB Math Scholar
Feb 15, 2012
1,967
Some additional facts about orthogonal linear transformations, with a geometric interpretation:

Every column vector of an orthogonal $n \times n$ matrix lies on the unit $(n-1)$-sphere.

Now ask yourself, how can a vector which lies on a unit (n-1)-sphere, possibly have any coordinate > 1?

If we choose a particular vector on the unit (n-1)-sphere, the possible orthogonal vectors remaining lie on an (n-2)-sphere perpendicular to our chosen vector. For example, if we use a 2-sphere, using the earth as a model, the perpendicular vectors to the vector represented by the north pole, lie on the equator (which is a 1-sphere, or circle).

Having chosen some vector on the equator, we now have to choose one of just two points (a 0-sphere), which lie on the line perpendicular to our equatorial vector and co-planar with the equator.

It's sometimes easier to see what is going on in the special case n = 2:

Suppose we have an orthogonal matrix:

\(\displaystyle O = \begin{bmatrix}a&b\\c&d\end{bmatrix}\).

Working directly from the definition \(\displaystyle O^TO = I\), we get:

\(\displaystyle \begin{bmatrix}a&c\\b&d \end{bmatrix}\begin{bmatrix}a&b\\c&d \end{bmatrix} = \begin{bmatrix}1&0\\0&1 \end{bmatrix}\)

So that:

\(\displaystyle a^2 + c^2 = b^2 + d^2 = 1\)
\(\displaystyle ab + cd = 0\), that is:

\(\displaystyle \|(a,c)\|^2 = \|(b,d)\|^2 = 1\)
\(\displaystyle (a,c)\cdot (b,d) = 0\)
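And a numerical check of these two conditions, assuming numpy:

Code:
import numpy as np

rng = np.random.default_rng(4)
O, _ = np.linalg.qr(rng.standard_normal((2, 2)))  # random 2x2 orthogonal matrix
(a, b), (c, d) = O

print(np.isclose(a**2 + c**2, 1.0))  # column (a, c) has unit length
print(np.isclose(b**2 + d**2, 1.0))  # column (b, d) has unit length
print(np.isclose(a*b + c*d, 0.0))    # the two columns are orthogonal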
 

Jameson

Administrator
Staff member
Jan 26, 2012
4,043
I think I almost get it. :D That $2 \times 2$ example really helped something click.

In the identity matrix there are only two possible values by definition, 1 and 0. Whenever $i=j$, the row-column multiplication sums to 1, and in all other cases it results in 0. For each of the $i=j$ cases in the resulting matrix, that entry is computed by something of the form $(a_r)^T(a_r)$, where $a_r$ is a column vector and $1 \le r \le n$. That product can be expressed as $(a_{1r})^2+(a_{2r})^2+\cdots+(a_{nr})^2$. If the magnitude of any of these entries is larger than 1, then the sum will be larger than 1 and we have a contradiction.

Ok, that part is good I think, but I have one last question (for now). Using Deveno's $2 \times 2$ example, I agree that $ab+cd=0$, which is the same as $(a,c) \cdot (b,d)=0$. The only requirement for that to be true is that $ab=-cd$, which doesn't restrict the magnitudes of those four variables to be 1 or less.

EDIT: This can be justified by the fact that all columns are orthonormal. Thank you to I like Serena and Deveno for their patience and wonderful insight! :)
 

Jameson

Administrator
Staff member
Jan 26, 2012
4,043
I want to thank both I like Serena and Deveno for their help once more. Today was my first quiz in this course, and there were questions related to this thread that I am sure I answered correctly only because these two took the time to really help me understand the concepts. (Clapping)