Direct Proof that every zero of p(T) is an eigenvalue of T

zenterix · Feb 5, 2024

I was stuck on this problem so I looked for a solution online.

I was able to reproduce the following proof after looking at the proof on the internet. By this I mean, when I wrote it below I understood every step.

However, it is not a very insightful proof. At this point I did not really obtain any insight as to why this result is true.

I am looking for an alternative proof (possibly a direct proof) and also different ways of interpreting/analyzing this problem.

Here is a proof by contradiction.

Suppose ##p## has degree ##n##.

We know that

1) ##p(T)## is a linear map.

2) ##v\in\ \mathrm{null}(p(T))##

3) A root of ##p## is a number.

Suppose ##r## is a root of ##p##.

Then we can write ##p(x)=(x-r)q(x)## where ##q(x)## is a polynomial with degree ##n-1##.

Then ##p(T)=(T-rI)q(T)##.

Suppose ##r## is not an eigenvalue of ##T##.

Then, ##T-rI## is injective. Its null space is just the zero vector in ##V##.

##p(T)v=0=(T-rI)q(T)v##

Thus, ##q(T)v=0## because ##(T-rI)v\neq 0## since ##v\neq 0##.

By assumption, ##p(T)## with degree ##n## is the smallest degree polynomial operator that maps ##v## to ##0\in V##.

Thus, ##q(T)v=0## shows a contradiction with this assumption.

Therefore, by contradiction, ##r## must be an eigenvalue.

zenterix · Feb 5, 2024

Suppose ##\lambda## is an eigenvalue of ##T## and ##w## is an eigenvector for this eigenvalue.

$$p(T)=a_0I+a_1T+\ldots+a_nT^n\tag{1}$$

$$p(T)w=a_0w+a_1Tw+\ldots+a_nT^nw\tag{2}$$

$$=a_0w+a_1\lambda w+\ldots+a_n\lambda^nw\tag{3}$$

$$=(a_0+a_1\lambda+\ldots+a_n\lambda^n)w\tag{4}$$

$$=p(\lambda)w\tag{5}$$

The problem seems to show that every root is an eigenvalue, but it seems that it is not necessarily the case that every eigenvalue is a root.

The polynomial has degree ##n## and so ##n## complex roots, but ##V## can have dimension ##m>n## and thus there can be more eigenvalues of ##T## than roots of ##p##.

However, suppose that ##\lambda## is a root of ##p##. Then, ##\lambda## is also an eigenvalue. In this case, ##p(\lambda)=0## and from (5) we have

$$p(T)w=p(\lambda)w=0\tag{6}$$

which seems to indicate that ##0## is an eigenvalue of ##p(T)## and the associated eigenspace has dimension equal to the number of independent eigenvectors associated with eigenvalues that are roots.
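
The identity ##p(T)w=p(\lambda)w## in (5) can be checked numerically. The following is only a minimal sketch: the matrix `T`, the eigenpair `(lam, w)` and the coefficients `p` are made-up illustrative values, not taken from the problem.

```python
# Illustrative check of p(T)w = p(lambda)w.
# T, w, lam and p are hypothetical example values, not from the thread.

def mat_vec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def poly_at_operator(coeffs, A, x):
    """Return p(A)x for p(t) = coeffs[0] + coeffs[1]*t + ..."""
    result = [0] * len(x)
    power = list(x)                      # holds A^i x
    for c in coeffs:
        result = [r + c * q for r, q in zip(result, power)]
        power = mat_vec(A, power)
    return result

def poly_at_scalar(coeffs, t):
    """Return p(t) for the same coefficient list."""
    return sum(c * t**i for i, c in enumerate(coeffs))

T = [[2, 1],
     [0, 3]]
w = [1, 1]          # T w = (3, 3) = 3 w, so w is an eigenvector
lam = 3
p = [1, -4, 2]      # p(t) = 1 - 4t + 2t^2

lhs = poly_at_operator(p, T, w)                    # p(T) w
rhs = [poly_at_scalar(p, lam) * wi for wi in w]    # p(lambda) w
print(lhs, rhs)
```

Both sides come out equal because each power ##T^iw## collapses to ##\lambda^iw##.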

pasmith · Feb 5, 2024

zenterix said:

The problem seems to show that every root is an eigenvalue, but it seems that it is not necessarily the case that every eigenvalue is a root.

Let [itex]\lambda[/itex] be an eigenvalue of [itex]T[/itex] with eigenvector [itex]v[/itex]. If [itex]\lambda[/itex] is not a root of [itex]p[/itex] then [itex]p(T)v = p(\lambda)v \neq 0[/itex], which is a contradiction, since we are given that [itex]p(T)v = 0[/itex] for every non-zero [itex]v \in V[/itex].

zenterix · Feb 5, 2024

pasmith said:

Let [itex]\lambda[/itex] be an eigenvalue of [itex]T[/itex] with eigenvector [itex]v[/itex]. If [itex]\lambda[/itex] is not a root of [itex]p[/itex] then [itex]p(T)v = p(\lambda)v \neq 0[/itex], which is a contradiction, since we are given that [itex]p(T)v = 0[/itex] for every non-zero [itex]v \in V[/itex].

I did not get the impression that the assumption was that ##p(T)v=0## for every non-zero ##v\in V##.

Do you agree that ##T## can have more eigenvalues than ##p## has roots?

For example, consider linear map

$$T=\begin{bmatrix} 1 &-1&0\\-1&2&-1\\0&-1&1\end{bmatrix}$$

It has eigenvalues ##0,1##, and ##3##.

Let ##p(x)=(x-1)^2## which has roots ##1,1##. These roots are eigenvalues of ##T##, but ##T## has two eigenvalues that are not roots of ##p##.

If we were to use ##p(x)=x## then ##p(T)=T##.

##p## has a single root ##0## and ##T## has three eigenvalues.
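
As a sanity check on this example, here is a small sketch that scans integer candidates for eigenvalues of ##T## and confirms that ##(T-I)## already kills an eigenvector for the eigenvalue ##1##, so ##p(x)=(x-1)^2## certainly does. The scan range and the vector `v` are choices made here for illustration.

```python
# Sketch: integer eigenvalues of the 3x3 matrix above via det(T - lam*I) = 0.

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def shifted(A, lam):
    """Return A - lam*I."""
    n = len(A)
    return [[A[r][c] - (lam if r == c else 0) for c in range(n)] for r in range(n)]

def mat_vec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

T = [[1, -1, 0],
     [-1, 2, -1],
     [0, -1, 1]]

# Only integers in an arbitrary window are tested; this is not a general solver.
eigenvalues = [lam for lam in range(-5, 6) if det3(shifted(T, lam)) == 0]
print(eigenvalues)

v = [1, 0, -1]                          # candidate eigenvector for eigenvalue 1
print(mat_vec(shifted(T, 1), v))        # (T - I) v
```

Since ##(T-I)v=0## here, ##p(T)v=(T-I)^2v=0## as well, while the eigenvalues ##0## and ##3## are not roots of ##p##.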

zenterix · Feb 5, 2024

The direct proof is actually simple.

Suppose ##r## is a root of ##p## and ##\deg{p}=n##.

Then ##p(x)=(x-r)q(x)## and ##\deg{q}=n-1##.

##p(T)v=(T-rI)q(T)v##

and ##q(T)v\neq 0## since ##p## is the smallest degree polynomial such that ##p(T)v=0## and ##q## has degree smaller than ##p##.

Thus, it must be that ##(T-rI)v=0## so ##r## is an eigenvalue of ##T##.

pasmith · Feb 5, 2024

zenterix said:

I did not get the impression that the assumption was that ##p(T)v=0## for every non-zero ##v\in V##.

Yes; I see that [itex]v[/itex] is specified.

Do you agree that [itex]T[/itex] can have more eigenvalues than [itex]p[/itex] has roots?

Yes. If [itex]v[/itex] has no component in the eigenspace of [itex]\lambda[/itex] then [itex]p(z)[/itex] does not need to contain a factor [itex](z - \lambda)^k[/itex] for any [itex]k \geq 1[/itex]. The roots of [itex]p[/itex] are exactly those eigenvalues of [itex]T[/itex] in whose eigenspaces [itex]v[/itex] has a component.

zenterix · Feb 5, 2024

But now suppose
$$T=\begin{bmatrix}5&1&0\\0&2&-1\\0&-1&1\end{bmatrix}$$
which has eigenvalues ##5,\frac{3\pm\sqrt{5}}{2}##.
Consider the polynomial ##p(x)=1## which doesn't have roots.
Then ##p(T)=I## and there are no non-zero vectors such that ##p(T)v=0##.
Thus, the theorem does not apply in this case.
But now let ##p(x)=x## which has a single root ##0##.
Then, ##p(T)=T##. But ##T## is non-singular so the null space is only the zero vector.
There are once again no non-zero vectors such that ##p(T)v=0##.
So the theorem doesn't apply. After all, ##0## is a root of ##p## but is not an eigenvalue of ##T##.
What about ##p(x)=x-c##, ##p(T)=T-cI##?
If ##c## is not an eigenvalue, then once more there are no non-zero solutions to ##p(T)v=0##. The theorem doesn't apply.
If ##c## is an eigenvalue, then there are non-zero solutions, namely the vectors in the eigenspace for this eigenvalue.
##c## is also a root of ##p##.
Now consider a general polynomial ##p##.
Suppose there is a non-zero ##v_1## such that ##p(T)v_1=0##.
Suppose ##r_1## is a root of ##p##.
##(T-r_1I)q(T)v_1=0##
So ##r_1## is an eigenvalue and ##v_1## is an eigenvector for this eigenvalue.
Suppose we have a second root ##r_2##.
Then by the same logic ##r_2## is an eigenvalue and ##v_1## is an eigenvector for this eigenvalue.
Something is weird or wrong here.

mathwonk · Feb 6, 2024

To say that p(T)v = 0, where deg(p) = n, implies that the vectors v,Tv,(T^2)v,...,(T^n)v are dependent. Hence the lowest degree of such a p, is the dimension of the span of the vectors v,Tv,...,T^rv,......

E.g if v≠0 is an eigenvector of T, with eigenvalue c, then this dimension is one, and the monic p of lowest degree is p(X) = (X-c). Obviously there is no reason, and no possibility, for any other eigenvalues of T to be roots of this polynomial.
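
The dependence among v, Tv, (T^2)v, ... can be turned into a small computation. What follows is only a sketch under stated assumptions: the helper names `solve_in_span` and `minimal_poly_at` are invented for this illustration, the matrix is the 3x3 example from earlier in the thread, and exact rational arithmetic is used so that "dependent" is tested exactly.

```python
from fractions import Fraction

def mat_vec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def solve_in_span(columns, target):
    """Return c with sum(c[j]*columns[j]) == target, or None if target is
    not in the span. Assumes the columns are linearly independent."""
    n, k = len(target), len(columns)
    rows = [[Fraction(columns[j][i]) for j in range(k)] + [Fraction(target[i])]
            for i in range(n)]
    pivot_row = 0
    for col in range(k):
        pivot = next((r for r in range(pivot_row, n) if rows[r][col] != 0), None)
        if pivot is None:
            return None                  # would mean dependent columns
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        pv = rows[pivot_row][col]
        rows[pivot_row] = [x / pv for x in rows[pivot_row]]
        for r in range(n):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # A zero row with a nonzero right-hand side means no solution exists.
    if any(rows[r][k] != 0 for r in range(pivot_row, n)):
        return None
    return [rows[j][k] for j in range(k)]

def minimal_poly_at(T, v):
    """Coefficients (lowest degree first) of the monic least-degree p
    with p(T)v = 0. Assumes v is nonzero."""
    krylov = [list(v)]                   # v, Tv, T^2 v, ... kept independent
    while True:
        nxt = mat_vec(T, krylov[-1])
        c = solve_in_span(krylov, nxt)
        if c is not None:                # T^k v = sum c_i T^i v
            return [-ci for ci in c] + [Fraction(1)]
        krylov.append(nxt)

T = [[1, -1, 0],
     [-1, 2, -1],
     [0, -1, 1]]
print(minimal_poly_at(T, [1, 0, -1]))    # eigenvector for eigenvalue 1
print(minimal_poly_at(T, [1, 0, 0]))     # a vector that is not an eigenvector
```

For the eigenvector the result has degree one, namely X - 1. For (1,0,0) the vectors v, Tv, (T^2)v are independent, so the degree is three and the polynomial is X^3 - 4X^2 + 3X = X(X-1)(X-3).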

I.e. every T has a minimal polynomial, and every root of the minimal polynomial is an eigenvalue (e.g by Cayley-Hamilton). As defined however, the polynomial p in this problem equals the minimal polynomial of the restriction of T to the subspace S spanned by vectors of form (T^r)v, r≥0, and is unrelated to the behavior of T outside this subspace. Every root of this "restricted" minimal polynomial is however also a root of the full minimal polynomial and is thus an eigenvalue of T. In the opposite direction, an eigenvalue c of T occurs as a root of this restricted minimal polynomial if and only if c is associated to an eigenvector belonging to the subspace S.

In general, i.e. even if v≠0 is not an eigenvector of T, but p is the minimal degree polynomial such that p(T)v = 0, and c is a root of p, the polynomial p factors as (X-c)^s.g(X) where c is not a root of g. Then the subspace S spanned by the vectors T^r(v), r≥0, splits into a direct sum of subspaces U and W, on which T has minimal polynomials (X-c)^s, and g(X) respectively. Then T has at least one eigenvector in the subspace U, hence also in S, with eigenvalue c.

mathwonk · Feb 6, 2024

Here are some class notes of mine discussing the difference between a polynomial p, of minimal degree, such that p(T)v = 0 for all v, and such that p(T)v = 0 for one specific v. (My apologies for over - answering.). These are an excerpt from
https://www.math.uga.edu/sites/default/files/laprimexp.pdf

Terminology: A polynomial is called “monic” if the lead coefficient equals one.

Lemma: Every linear operator T:V-->V on a finite dimensional k - vector space V,
satisfies some monic (hence non zero) polynomial over k.
Proof: If dim(V) = n, then dim(Hom(V,V)) = n^2, but k[X] is infinite dimensional,
with basis all monomials {1, X, X^2, X^3,....}. Thus the map k[X]-->Hom(V,V) has a non zero kernel, i.e. for some non zero f, we have f(T) = 0. Dividing through by the leading non-zero coefficient makes the polynomial monic and T still satisfies it. QED.

Lemma: There is a unique monic polynomial of least degree satisfied by T. Indeed this minimal polynomial divides all other polynomials satisfied by T.
Proof: If f,g are two (non zero) polynomials of least degree satisfied by T, and we divide g by f, we get an equation of the form g = qf + r, where deg(r) < deg(f). Since r = g - qf, and T satisfies both f and g, it also satisfies r. Since r has degree less than f, but f has least degree among non zero polynomials satisfied by T, we must have r = 0, i.e. f divides g. Since similarly g divides f, they must be scalar multiples of one another. In particular, if both are monic they are equal. QED.

We give a name to the unique monic polynomial of minimal degree satisfied by T.
Defn: If T:V-->V is a linear map, and dim(V) is finite, the monic polynomial f of least degree with f(T) = 0, i.e. such that (f(T))(v) = 0 for all v in V, is called the minimal polynomial of T.

Note it follows from the proof of existence of the minimal polynomial that it always has degree ≤ n^2 where n = dim(V). In fact the minimal polynomial has degree ≤ n = dim(V), a fact whose proof will be crucial in studying similarity.

Definition: If T:V-->V is a linear map, dim(V) is finite, and v is a vector in V, the unique monic polynomial f of least degree with f(T)(v) = 0, is called the minimal polynomial of T at v.

Remark: There is some such polynomial since the minimal polynomial for T on all of V works. The uniqueness proof for the one of least degree is also the same.

The next result is key to the ideas of this chapter.

Lemma: If T:V-->V is a linear map, dim(V) is finite, and v,w are vectors whose minimal T-polynomials are f,g, then there is a vector u in V whose minimal T- polynomial is the least common multiple of f,g.
Proof: First we prove it in case f,g are relatively prime, in which case their lcm is the product f.g. Then we claim that u = v+w works. Since (f.g)(T) = f(T)∘g(T) = g(T)∘f(T) does annihilate both v and w, it also annihilates their sum. Now let h be any polynomial such that h(T) annihilates v+w. We claim f.g divides h, for which it suffices to show that each of f and g does so. Since f annihilates v and h annihilates v+w, f.h annihilates both v and v+w, hence also w. Hence the minimal T-polynomial for w divides f.h, so g divides f.h. Since f and g are relatively prime, g divides h. A similar argument shows f also divides h. So indeed f.g is the minimal T-polynomial at u = v+w.

Now let the T-minimal polynomials of v,w, namely f,g, be arbitrary. Consider all irreducible factors of f and of g, and let A be the product of those irreducible factors that occur more often in f than in g, and let B be the product of those irreducible factors that occur at least as often in g as in f. Then A and B are relatively prime, and their product A.B = the lcm of f and g. Moreover f = A.p, and g = B.q, for some polynomials p,q. Then since f, g are the T-minimal polynomials of v,w, it follows that A,B are the T-minimal polynomials of p(T)v and q(T)w. Hence A.B = lcm(f,g) is the T-minimal polynomial of u = p(T)v+q(T)w. QED.

Corollary: If m is the minimal polynomial of T on the space V, then there is a vector w in V such that the minimal polynomial of T at w is also m.
Proof: Since m(T) annihilates every vector in V, it follows that for each vector w, the minimal polynomial of T at w has degree at most that of m. Choose w to be a vector whose minimal polynomial has maximal degree among all vectors in V. Then for any other vector v, we claim the minimal polynomial of T at v divides that at w. It will follow that the minimal polynomial of T at w also annihilates every other v, and hence is the minimal polynomial of T for the whole space.

If the divisibility does not hold, some irreducible factor of the T-minimal polynomial at v occurs to a higher power than in the T-minimal polynomial of w. Then the lcm of the two minimal T-polynomials at v and w has degree greater than either of them. Thus by the lemma there is a vector whose T-minimal polynomial has that greater degree, contradicting the choice of w. QED.

Cor: The minimal polynomial of an operator T on V has degree ≤ dimV.
Proof: Let v be a vector whose minimal T-polynomial equals that for T on all of V, and consider the evaluation map at v, namely the map k[X]-->V taking f(X) to (f(T))(v). If n = dim(V), then the n+1 monomials {1,X,X^2,...,X^n}, must have dependent images in V, i.e. the vectors {v,T(v),T^2(v),...,T^n(v)} are linearly dependent in V. Thus some polynomial of degree ≤ n in T vanishes at v, but by choice of v, this polynomial vanishes also on all of V. QED

mathwonk · Feb 6, 2024

@zenterix: your exposition in post #5 of the direct argument is quite nice and clear (except in the last line I think you meant q(T)v instead of v, as the eigenvector).

[ah yes, this explains the "something weird" in your post #7. I.e. it was not v1 but q(T)v1 that was an eigenvector. also you need a hypothesis on the degree of p.]

anyway, your argument in #5 proves that every root of the minimal polynomial of T is an eigenvalue, hence also a root of the characteristic polynomial, a weaker form of the Cayley Hamilton theorem. It makes me want to try to strengthen it to perhaps prove the full result, that the minimal polynomial itself divides the characteristic polynomial.
....hmmm, I don't see how to do more than get the same results also for roots in an extension field of the scalars. e.g. if we are working over the reals, it seems we can also get that all complex roots of the minimal poly are roots of the char. poly.

to get that (X-c)^r divides the characteristic polynomial, I seem to need to look at a cyclic basis for a subspace, and at the actual matrix on that subspace.

but that seems clear and implies the result. i.e. it gives a block matrix with a nilpotent block with determinant (X-c)^r.

then I suspect one can pass to a vector space over field extension where the minimal polynomial splits, and use the fact that the characteristic polynomial is the same, even if we change to a basis over the new field.

i.e perhaps it suffices to prove Cayley Hamilton over an algebraically closed field.
...........
yes, inspired by this question, I have worked out how to prove Cayley Hamilton over a splitting field, via a Jordan matrix, and use that to deduce it over any field.

Bosko · Feb 7, 2024

Is the vector space ##V## over either real or complex numbers ( the scalar field ) ?
The fundamental theorem of algebra can be used.

I assume that the real numbers are the coefficients of the polynomial ##p(x)##.
Based on above mentioned theorem any ##p(x)=a_nx^n+...+a_0## can be written as
$$p(x)=a_n(x-\lambda_1)(x-\lambda_2)...(x-\lambda_n)$$
where ##\lambda##s are zeros of ##p(x)##.
Some pairs of them can be pairs of the complex numbers in the form ##(a+ib)## and ##(a-ib)##
Those "complex" pairs ##(x-\lambda_i)...(x-\lambda_j)## can be removed because ##p(x)## is polynomial of smallest degree => with real zeros.

Regarding ##p(\lambda_i)=0##, ##a_n## can be removed and we get
$$p(T)=(T-\lambda_1I)(T-\lambda_2I)...(T-\lambda_nI)$$

Edit: This is just an idea what can be used. There are more things to do.

nuuskur · Feb 9, 2024

However, it is not a very insightful proof.

Indeed, this is a problem with nonconstructive proofs. Assuming the proof is correct, we don't learn why the statement is true, only what would go wrong if it were false.

The problem seems to show that every root is an eigenvalue

Yes, if ##p## is of minimal degree. In general, no. ##\lambda## is a root of such ##p## if and only if ##\lambda## is a root of characteristic polynomial of ##T##. I.e the roots of minimal polynomial are precisely the eigenvalues of ##T##.

zenterix · Feb 10, 2024

mathwonk said:

To say that p(T)v = 0, where deg(p) = n, implies that the vectors v,Tv,(T^2)v,...,(T^n)v are dependent.

Very interesting and useful new perspective for me.

In what follows I will try to write what you have said in more steps.

##p(T)v## is actually a linear combination of vectors in ##V## (##v,Tv,T^2v,\ldots,T^nv##) which must be linearly dependent since the linear combination is zero and ##p## is a non-zero polynomial.

mathwonk said:

Hence the lowest degree of such a p, is the dimension of the span of the vectors v,Tv,...,T^rv,......

Let me try to justify this statement.

We have l.d. ##v,Tv,\ldots, T^nv##.

For some ##k\in\{0,1,\ldots,n-1\}##, if ##v,Tv,\ldots,T^kv## is l.i. and ##v,Tv,\ldots,T^{k+1}v## is l.d. then

$$T^{k+2}v=T(T^{k+1}v)=T\left (\sum\limits_{i=0}^k \alpha_i T^iv \right )$$

$$=\sum\limits_{i=0}^{k}\alpha_i T^{i+1}v\in\text{span}(v,Tv,\ldots,T^{k+1}v)$$

and thus ##v,Tv,\ldots,T^{k+2}v## are l.d.

Therefore, for any ##m\geq k+1## the vectors ##v,Tv,\ldots,T^mv## are l.d.

Thus, since ##k\in\{0,1,\ldots,n-1\}## there is a minimum ##k## such that ##v,Tv,\ldots, T^kv## are l.d.

This ##k## is the dimension of ##\text{span}(v,Tv,\ldots,T^nv)##.

Let ##A=\text{span}(v,Tv,\ldots T^nv)=\text{span}(v,Tv,\ldots,T^kv)##.

Then ##k=\dim{A}##.

Since there is a linear combination of ##v,Tv,\ldots,T^kv## that is zero with the coefficients not all zero and specifically with the coefficient on ##T^kv## nonzero, there is a polynomial ##q## of degree ##k## such that ##q(T)v=0##.

##q## is the polynomial of smallest degree such that ##q(T)v=0## by the reasoning above.

Here is a question that arises that maybe is interesting (or probably can be answered trivially but I am not seeing why yet):

Can we define ##q(T)## as follows

##v,Tv,\ldots,T^kv## is a basis for ##A##. Extend this basis to a basis of ##V##:

$$v,Tv,\ldots,T^kv,w_1,\ldots , w_{n-k}$$

Define a polynomial operator ##q(T)\in\mathcal{L}(V)## where ##q## has degree ##k## by

$$q(T)(T^iv)=p(T)(T^iv)\ \ \ i=1,2,\ldots, k$$

$$q(T)w_i=p(T)w_i\ \ \ i=1,2,\ldots,n-k$$

Then, for any ##y\in V## we have

$$q(T)y=q(T)\left ( \sum\limits_{i=0}^k \alpha_i T^iv + \sum\limits_{i=1}^{n-k}\beta_iw_i \right )$$

$$=\sum\limits_{i=0}^k\alpha_ip(T)(T^iv)+\sum\limits_{i=1}^{n-k}\beta_ip(T)w_i$$

$$=p(T)y$$

Does such a polynomial operator exist?

We defined ##q(T)## with ##n## equations. ##q## has only ##k+1## coefficients. Does this mean that there are multiple polynomials ##q## such that ##q(T)v=0##?

mathwonk said:

E.g if v≠0 is an eigenvector of T, with eigenvalue c, then this dimension is one, and the monic p of lowest degree is p(X) = (X-c). Obviously there is no reason, and no possibility, for any other eigenvalues of T to be roots of this polynomial.

If we pick a ##v## that is an eigenvector of ##T## of eigenvalue ##\lambda##, then all of ##v,Tv,T^2v,\ldots,T^nv## are in the same subspace of the eigenspace ##E(\lambda,T)##. The dimension of the subspace is ##1##.

##p(T)v## is also in this subspace since this is a linear combination of ##v,Tv,T^2v,\ldots,T^nv##.

Thus, there is a polynomial ##q## of degree ##1## such that ##q(T)v=0##.

Since it is true that ##(T-\lambda I)v=0## then the polynomial ##q(x)=x-\lambda## has degree ##1## and ##q(T)=T-\lambda I## so ##q(T)v=0##.

mathwonk said:

I.e. every T has a minimal polynomial, and every root of the minimal polynomial is an eigenvalue (e.g by Cayley-Hamilton).

We considered the two possible cases above: ##v## is or is not an eigenvector.

In both cases we reached the conclusion that there is a polynomial with minimal degree such that ##q(T)v=0##.

Thus, every ##T## has a minimal polynomial.

As for the second part, I am just about to study Cayley-Hamilton for the first time, so I will come back here when I do.

zenterix · Feb 10, 2024

mathwonk said:

As defined however, the polynomial p in this problem equals the minimal polynomial of the restriction of T to the subspace S spanned by vectors of form (T^r)v, r≥0, and is unrelated to the behavior of T outside this subspace.

I read this part only after going through the reasoning in my previous post.

While going through that reasoning this topic did come up.

The minimal polynomial ##q## has degree ##k\leq n## and ##p## has degree ##n##. However, the operators ##q(T)## and ##p(T)## both map all vectors of ##V## to the same subspace of ##V##.

So why do you say that ##p## (I think you mean ##p(T)##) is restricted to a subspace of ##V##?

zenterix · Feb 10, 2024

mathwonk said:

@zenterix: your exposition in post #5 of the direct argument is quite nice and clear (except in the last line I think you meant q(T)v instead of v, as the eigenvector).

I did not mean ##q(T)v## but we could use this as well. Both ##q(T)v## and ##v## are eigenvectors. I wrote ##v## because ##(T-rI)q(T)v=q(T)(T-rI)v=0##, that is, polynomial operators have the commutativity property.

Thus, isn't the following incorrect?

mathwonk said:

[ah yes, this explains the "something weird" in your post #7. I.e. it was not v1 but q(T)v1 that was an eigenvector. also you need a hypothesis on the degree of p.]

zenterix · Feb 10, 2024

Bosko said:

Is the vector space ##V## over either real or complex numbers ( the scalar field ) ?
The fundamental theorem of algebra can be used.

I assume that the real numbers are the coefficients of the polynomial ##p(x)##.
Based on above mentioned theorem any ##p(x)=a_nx^n+...+a_0## can be written as
$$p(x)=a_n(x-\lambda_1)(x-\lambda_2)...(x-\lambda_n)$$
where ##\lambda##s are zeros of ##p(x)##.
Some pairs of them can be pairs of the complex numbers in the form ##(a+ib)## and ##(a-ib)##
Those "complex" pairs ##(x-\lambda_i)...(x-\lambda_j)## can be removed because ##p(x)## is polynomial of smallest degree => with real zeros.

Regarding ##p(\lambda_i)=0##, ##a_n## can be removed and we get
$$p(T)=(T-\lambda_1I)(T-\lambda_2I)...(T-\lambda_nI)$$

Edit: This is just an idea what can be used. There are more things to do.

The factorization of a polynomial when it has a root is true for polynomials over any field.

WWGD · Feb 10, 2024

Zenterix, coffee now comes decaffeinated too ;).

Bosko · Feb 10, 2024

zenterix said:

The factorization of a polynomial when it has a root is true for polynomials over any field.

The polynomial over the field of real numbers
$$p(x)=(x-1)(x-2)(x^2+1)$$
has zeros ##x_1=1##, ##x_2=2##, ##x_3=i## and ##x_4=-i##.

WWGD · Feb 10, 2024

As Bosko shows, this is true only for some fields; specifically, algebraically closed fields.

mathwonk · Feb 10, 2024

in post #15 you say that v is an eigenvector of T, with eigenvalue r, because q(T).(T-r.I)v = 0. This is wrong. You need to prove that (T-r.I)v = 0, and all you have shown is that applying q(T) to this vector is zero. This is the error I referred to above. All you have shown is that (T-rI)v is in the kernel of q(T), which need not be {0}.

e.g suppose T is a 90 degree rotation in the plane. Then (T^2+1)v=0 for any non zero v. Hence also (T-r)(T^2+1)v = 0 = (T^2+1)(T-r)v. From this you concluded that v is an eigenvector for T with eigenvalue r. Of course T has no eigenvectors, and you yourself pointed out this contradiction. I am just explaining where the contradiction came from.
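
A short numerical sketch of the rotation example, assuming the usual matrix of a 90 degree rotation and an arbitrarily chosen nonzero `v`: it confirms (T^2+1)v = 0 and shows that det(T - rI) = r^2 + 1 stays positive on a few sample real values of r.

```python
# Sketch: the 90 degree rotation satisfies T^2 + I = 0 and has no real eigenvalue.

def mat_vec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

T = [[0, -1],
     [1, 0]]        # rotation by 90 degrees in the standard basis

v = [2, 5]          # arbitrary nonzero vector (illustrative choice)
T2v = mat_vec(T, mat_vec(T, v))
residual = [a + b for a, b in zip(T2v, v)]    # (T^2 + I) v

def char_poly_at(r):
    """det(T - r*I) for this 2x2 matrix, which equals r^2 + 1."""
    return (T[0][0] - r) * (T[1][1] - r) - T[0][1] * T[1][0]

samples = [char_poly_at(r) for r in (-2, -1, 0, 1, 2)]
print(residual)
print(samples)
```

The sampled values are only illustrative; the general statement is that r^2 + 1 > 0 for every real r, so T - rI is invertible over the reals.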

ah yes, I should be more precise, in my comment you quoted above, I said that the equation (T-r)q(T)v1 = 0, implies that q(T)v1 is an eigenvector of T. Actually it only implies q(T)v1 is in the kernel of (T-r), hence either q(T)v1 is an eigenvector of T, with eigenvalue r, or q(T)v1 = 0.
But, either way, one cannot conclude that v1 is an eigenvector of T.

I think I understand what may be leading you astray: perhaps you misread the statement of the original problem. I.e. you seem to be assuming that if p is any polynomial such that p(T)v = 0, then every root of p is an eigenvalue. that is not true, and is not what was stated. It only holds if p is the minimal polynomial of T at v, i.e. if v≠0 and p(T)v=0, and q(T)v ≠ 0 for every polynomial q of lower degree than p, then every root of p is an eigenvalue of T.

zenterix · Feb 10, 2024

mathwonk said:

in post #15 you say that v is an eigenvector of T, with eigenvalue r, because q(T).(T-r.I)v = 0. This is wrong. You need to prove that (T-r.I)v = 0, and all you have shown is that applying q(T) to this vector is zero. This is the error I referred to above. All you have shown is that (T-rI)v is in the kernel of q(T), which need not be {0}.

e.g suppose T is a 90 degree rotation in the plane. Then (T^2+1)v=0 for any non zero v. Hence also (T-r)(T^2+1)v = 0 = (T^2+1)(T-r)v. From this you concluded that v is an eigenvector for T with eigenvalue r. Of course T has no eigenvectors, and you yourself pointed out this contradiction. I am just explaining where the contradiction came from.

ah yes, I should be more precise, in my comment you quoted above, I said that the equation (T-r)q(T)v1 = 0, implies that q(T)v1 is an eigenvector of T. Actually it only implies q(T)v1 is in the kernel of (T-r), hence either q(T)v1 is an eigenvector of T, with eigenvalue r, or q(T)v1 = 0.
But, either way, one cannot conclude that v1 is an eigenvector of T.

I think I understand what may be leading you astray: perhaps you misread the statement of the original problem. I.e. you seem to be assuming that if p is any polynomial such that p(T)v = 0, then every root of p is an eigenvalue. that is not true, and is not what was stated. It only holds if p is the minimal polynomial of T at v, i.e. if v≠0 and p(T)v=0, and q(T)v ≠ 0 for every polynomial q of lower degree than p, then every root of p is an eigenvalue of T.

I reached the equation ##(T-rI)q(T)v=0## and by assumption it cannot be that ##q(T)v=0## therefore it must be that ##(T-rI)v=0##.

mathwonk · Feb 10, 2024

absolutely not! many examples will show that to be wrong. Let e1 = (1,0), and e2 = (0,1), be the standard basis of k^2. define T on k^2 by Te1 = e2, and Te2 = 0. Then T(T(e1)) = T(e2) = 0, but by your argument, since T(e1)≠0, T.T(e1) = 0, would imply T(e1) = 0.

I.e. here T.Te1 = 0, but Te1≠0, implies Te1 is an eigenvector, not e1.

or, perhaps Te1 = 3e2, and Te2 = 5e2. Then (T-5)(T)e1 = (T-5)(3e2) = 3.(T-5)e2 = 0.
by your argument, since Te1 ≠ 0, we would have (T-5)e1 = 0, but (T-5)e1 = 3e2-5e1 ≠ 0. Again here we have (T-5)(T)e1 = 0 and Te1 ≠0, so Te1 is an eigenvector, but not e1 itself.

I.e. all it says when (T-r)q(T)v = 0, and q(T)v ≠ 0, is that q(T) maps v to an eigenvector, not that v is itself an eigenvector. the eigenvector is the non zero vector q(T)v.
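
The second example above (Te1 = 3e2, Te2 = 5e2) is easy to check by machine. This is just a sketch; the helper `shifted_apply` is a name invented here, and the matrix is written with the images of e1 and e2 as its columns.

```python
# Sketch: (T-5)(T)e1 = 0 with Te1 != 0, yet (T-5)e1 != 0.

def mat_vec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def shifted_apply(A, r, x):
    """Return (A - r*I) x."""
    Ax = mat_vec(A, x)
    return [a - r * b for a, b in zip(Ax, x)]

T = [[0, 0],
     [3, 5]]        # columns are Te1 = (0, 3) and Te2 = (0, 5)
e1 = [1, 0]

Te1 = mat_vec(T, e1)
print(Te1)                           # q(T)e1 with q(X) = X; nonzero
print(shifted_apply(T, 5, Te1))      # (T - 5)(T e1)
print(shifted_apply(T, 5, e1))       # (T - 5) e1
```

The middle line is the zero vector, so Te1 is an eigenvector for the eigenvalue 5, while the last line is nonzero, so e1 is not.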

WWGD · Feb 10, 2024

zenterix said:

I reached the equation ##(T-rI)q(T)v=0## and by assumption it cannot be that ##q(T)v=0## therefore it must be that ##(T-rI)v=0##.

You're not necessarily dealing with nonsingular transformations. So your map ##(T-rI)## may have a non-trivial (right-)kernel, as mathwonk pointed out.

mathwonk · Feb 11, 2024

@zenterix: I think I have an example that will illuminate the something weird that you questioned in post #15. It reveals that the something weird becomes something quite interesting, and makes the point about whether it is v or q(T)v that is the eigenvector.

Let T:k^2-->k^2 be defined by sending e1 = (1,0) to 3e1 = (3,0) = Te1, and e2 = (0,1) to 7e2 = (0,7) = Te2. Then both e1 and e2 are eigenvectors, and p(X) = (X-3)(X-7) is the minimal polynomial of T, since it annihilates both e1 and e2, and thus also annihilates any linear combination of them. Consider v= e1+e2 = (1,1). Then v is not an eigenvector, since Tv = Te1+Te2 = 3e1+7e2 = (3,7), is not a multiple of v = (1,1), so v cannot be annihilated by any degree one polynomial (in T), so p(X) is also the minimal polynomial at v. Hence p is the polynomial of least degree such that p(T)v = 0.

Now watch what happens when we apply the argument in the proof to see that both roots of p are indeed eigenvalues of T. I.e. since (T-3)(T-7)v = 0, it follows that (T-7)v is an eigenvector for the eigenvalue 3, since (T-7)v ≠ 0, but T-3 does annihilate this vector. The conclusion cannot be that v is an eigenvector for r=3, since v is not an eigenvector at all.

In fact if we compute we get (T-7)v = (T-7)(e1+e2) = Te1+Te2-7e1-7e2 = 3e1+7e2-7e1-7e2 = -4e1. Now e1 is an eigenvector for r=3, so also -4e1 is an eigenvector for r=3.

So q(T) = (T-7) does not annihilate v, but it does map it to an eigenvector for the eigenvalue 3. Similarly since, as you know, the factors commute, the equation
(T-7)(T-3)v = 0, implies that (T-3) must map v to an eigenvector for r=7. Indeed, (T-3)(e1+e2) = Te1+Te2-3e1-3e2 = 3e1+7e2-3e1-3e2 = 4e2, an eigenvector for r=7.

So you were entirely correct in observing that commuting the factors gave different conclusions, but the error you were questioning in post #15, actually explains this mystery. I.e. although concluding that v is an eigenvector both for r=3 and for r=7 would be a contradiction, concluding instead that (T-7)v is an eigenvector for r=3, and that (T-3)v is an eigenvector for r=7, is not.
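
The same computation can be sketched in a few lines of code, taking T to be the diagonal matrix with entries 3 and 7 and v = (1,1) as in the example. The helper `shifted_apply` is a name made up for this illustration.

```python
# Sketch: for T = diag(3, 7) and v = e1 + e2, the vectors (T-7)v and (T-3)v
# are eigenvectors for 3 and 7 respectively, although v itself is not one.

def mat_vec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def shifted_apply(A, r, x):
    """Return (A - r*I) x."""
    Ax = mat_vec(A, x)
    return [a - r * b for a, b in zip(Ax, x)]

T = [[3, 0],
     [0, 7]]
v = [1, 1]

a = shifted_apply(T, 7, v)           # (T - 7) v
b = shifted_apply(T, 3, v)           # (T - 3) v
print(a, b)

print(shifted_apply(T, 3, a))        # (T - 3)(T - 7) v
print(shifted_apply(T, 7, b))        # (T - 7)(T - 3) v
print(mat_vec(T, v))                 # T v, not a multiple of v
```

Here `a` is -4e1 and `b` is 4e2, and both products vanish, which matches the hand computation.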

Indeed thank you for this, as your healthy curiosity, and power of observation, has clarified something here that I did not understand at all clearly myself.

It seems to be the following phenomenon: Given T and v, let U be the subspace consisting of all vectors of form f(T)v, for all polynomials f, and consider the restricted map T:U-->U. Suppose m is the polynomial of least degree such that m(T)v = 0, and that m = p.q is a factorization of m. Then, on the subspace U, the kernel of p(T) equals the image of q(T). (The kernel contains the image even without the least degree hypothesis. In particular, when p is linear, then q(T)v is in the kernel of p(T), i.e. q(T)v is an eigenvector.) This is a key fact that can be used in proving a map has a Jordan normal form, in case the minimal polynomial has all roots present in the field of scalars.

In the example given here, where the factors p and q are relatively prime (and factor into distinct linear factors), each vector can be written as a linear combination of eigenvectors for the various eigenvalues, and applying q(T) to such a linear combination, just kills off those components that are eigenvectors for the roots of q, leaving only the components that are eigenvectors for a root of p. q(T) then maps these remaining component eigenvectors to eigenvectors with the same eigenvalues, since every polynomial in T maps eigenvectors to eigenvectors with the same eigenvalue, [i.e. if Tv = rv, then T(p(T)v) = p(T)(Tv) = p(T)(rv) = r.(p(T)v).]

mathwonk · Feb 12, 2024

@zenterix: In fact the phenomenon mentioned above (in the parenthetical expression beginning at line -12, post #24), that an equation of form ST = 0, where S and T are linear maps, is equivalent to saying the image of T is contained in the kernel of S, also answers another one of your (unanswered) questions elsewhere:
https://www.physicsforums.com/threads/null-space-of-dual-map-of-t-annihilator-of-range-t.1056845/

Namely to say that, for a Linear map T:V-->W, the functional f in W*, is in the kernel of
T*:W*-->V*, says that the composition fT = 0. I.e. that the image of T is in the kernel of f, or that f is orthogonal to Im(T). I.e. the nullspace of T* = the annihilator of Im(T). This is a fundamentally useful property of linear maps.

Notice also that when there is a dot product present, so that linear functionals are obtained from vectors by dotting with them, the annihilator of a subspace consists of those vectors that are perpendicular to that subspace, so the annihilator is also called the orthogonal complement.
