Chain rule for functions of operators?

pellman · Oct 18, 2010

This is strictly a math question but I figured that since it is something which would show up in QM, the quantum folks might be already familiar with it.

Suppose we have an operator-valued function A(x) of a real parameter x and another function f, both of which have well-defined derivatives.

Consider [tex]\frac{d}{dx}f(A(x))[/tex]

Does this equal

[tex]\frac{df}{dA}\frac{dA}{dx}[/tex]

or

[tex]\frac{dA}{dx}\frac{df}{dA} [/tex]

or something else? Of course, if A and dA/dx commute, then either expression is good. But it is not clear to me that A and dA/dx would necessarily commute.

Fredrik · Oct 18, 2010

How do you define df/dA? (I don't think it's something we even want to define).

pellman · Oct 18, 2010

Fredrik said:

How do you define df/dA? (I don't think it's something we even want to define).

Suppose f: R --> R has a Taylor series representation

[tex]f(u)=\sum_{n=0}^{\infty} \frac{1}{n!}f_n u^n[/tex]

where the f_n are just coefficients. Then

[tex]f'(u)=\sum_{n=0}^{\infty} \frac{1}{n!}f_{n+1} u^n[/tex]

We can similarly put

[tex]f(A)=\sum_{n=0}^{\infty} \frac{1}{n!}f_n A^n[/tex]

[tex]f'(A)=\sum_{n=0}^{\infty} \frac{1}{n!}f_{n+1} A^n[/tex]

For some f this may not work, it may not converge, blah, blah, blah. Let's just assume f is a function for which this works. The actual function I am interested in is [tex]f(A)=e^A[/tex], so f(A) = df/dA anyway.
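As a quick sanity check of this series definition, here is a minimal numerical sketch, assuming finite matrices stand in for the operators. The helper name `series_function`, the 30-term truncation, and the 2x2 rotation generator are illustrative choices, not anything specific to the problem. It confirms that the series for f = exp reproduces a known closed form and that shifting the coefficients by one gives f'(A) = f(A).

```python
import numpy as np

def series_function(coeffs, A):
    """Truncated sum over n of coeffs[n] / n! * A^n for a square matrix A."""
    total = np.zeros(A.shape, dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)   # holds A^n / n!, starting at n = 0
    for n, f_n in enumerate(coeffs):
        total = total + f_n * term
        term = term @ A / (n + 1)              # advance to A^(n+1) / (n+1)!
    return total

theta = 0.7
A = theta * np.array([[0.0, -1.0], [1.0, 0.0]])   # generator of a plane rotation

f_coeffs = [1.0] * 30                          # f = exp, so every f_n equals 1
exp_A = series_function(f_coeffs, A)
fprime_A = series_function(f_coeffs[1:], A)    # f'(A): coefficients shifted by one

# Closed form of exp(A) for this particular A: rotation by the angle theta.
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

print(np.allclose(exp_A, rotation))    # series agrees with the closed form
print(np.allclose(fprime_A, exp_A))    # for f = exp, f'(A) equals f(A)
```

Both checks should print True. Convergence is unproblematic here only because the matrix is small and bounded; for unbounded operators the caveats about convergence still apply.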

pellman · Oct 18, 2010

Maybe I shouldn't be so general. My problem is this:

Suppose we have a time-dependent Hamiltonian H(t). Then we can no longer write

[tex]|\Psi(t)\rangle = e^{-iHt}|\Psi(0)\rangle[/tex]

because [itex]dH/dt \neq 0[/itex]. What we need is an operator Q(t) such that dQ/dt = H.

Then we would have

[tex]i\frac{d}{dt}|\Psi(t)\rangle =i\frac{d}{dt}e^{-iQ(t)}|\Psi(0)\rangle[/tex]

[tex]=He^{-iQ(t)}|\Psi(0)\rangle[/tex]

or would it be

[tex]=e^{-iQ(t)}H|\Psi(0)\rangle[/tex]?

If H does not commute with Q, then the latter would mean we are not dealing with a solution to the Schrödinger equation. So what is

[tex]\frac{d}{dt}e^{-iQ(t)}=?[/tex]

Fredrik · Oct 18, 2010

(I wrote this before I saw your last post).

I think what you're calling df/dA is a directional derivative in the direction of A:

[tex]\lim_{t\rightarrow 0}\frac{f(A+t\frac{A}{\|A\|})-f(A)}{t}[/tex]

Both the df/dA notation and the f'(A) notation seem very inadequate for directional derivatives. You could use something like [itex]D_X f(A)[/itex] for the directional derivative in direction X, at A. Your df/dA would then be [itex]D_A f(A)[/itex]. However, when we take the derivative of exponentials, don't we always do it with respect to a parameter? For example, when we prove that A is self-adjoint if U=exp(itA) is unitary:

[tex]U^\dagger U=I[/tex]

[tex]U^\dagger=U^{-1}[/tex]

[tex]e^{-itA^\dagger}=e^{-itA}[/tex]

Now apply [tex]\frac{d}{dt}\bigg|_0[/tex] to both sides, which gives [itex]-iA^\dagger=-iA[/itex], i.e. [itex]A^\dagger=A[/itex], and we're done.

Added after I read your post #4: If Q(t) commutes with Q(s) for all t and s, then Q'(t) commutes with Q(t) and therefore with exp(iQ(t)), so the two options are equivalent. I need to think about the possibility that Q(t) doesn't commute with Q(s).
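The commuting case can be checked numerically. The sketch below assumes a finite-dimensional toy model: Q(t) = sin(t) A with one fixed Hermitian matrix A (an arbitrary illustrative choice), so that Q(t) commutes with Q(s) for all t and s. The helper `exp_minus_i` is hypothetical and simply builds exp(-iQ) from an eigen-decomposition. A central finite difference of exp(-iQ(t)) is compared with both operator orderings.

```python
import numpy as np

def exp_minus_i(Q):
    """exp(-iQ) for a Hermitian matrix Q, built from its eigen-decomposition."""
    w, V = np.linalg.eigh(Q)
    return (V * np.exp(-1j * w)) @ V.conj().T

A = np.array([[1.0, 1.0 - 1.0j], [1.0 + 1.0j, -1.0]])   # fixed Hermitian matrix

def Q(t):
    return np.sin(t) * A        # Q(t) and Q(s) commute for all t and s

def H(t):
    return np.cos(t) * A        # H = dQ/dt

t, h = 0.8, 1e-5
# Central finite difference approximating d/dt exp(-iQ(t)).
dU_dt = (exp_minus_i(Q(t + h)) - exp_minus_i(Q(t - h))) / (2 * h)

U = exp_minus_i(Q(t))
left = -1j * H(t) @ U           # H placed to the left of the exponential
right = -1j * U @ H(t)          # H placed to the right of the exponential

print(np.allclose(dU_dt, left, atol=1e-6))
print(np.allclose(left, right))
```

Both lines should print True: when the family commutes, the two orderings coincide and match the derivative, up to finite-difference error.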

dextercioby · Oct 18, 2010

If your A is either self-adjoint or unitary in a (rigged) Hilbert space, then you can easily define a function f(A) by the means of the spectral decomposition of A. Then you can compute a derivative, but, of course, under tight conditions of convergence.

Fredrik · Oct 18, 2010

Fredrik said:

I need to think about the possibility that Q(t) doesn't commute with Q(s).

I don't think there are any simple formulas in this case. Note e.g. that d/dt Q(t)²=Q'(t)Q(t)+Q(t)Q'(t). So if we try to apply d/dt to each term of the exponential, things are already weird in the second order term.
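To make the non-commuting case concrete, here is a small numerical sketch under assumed toy choices: Q(t) = t σ_x + t² σ_z built from Pauli matrices, so that H = dQ/dt = σ_x + 2t σ_z does not commute with Q(t) for t ≠ 0. The helper names are hypothetical. It checks the symmetrized product rule for Q(t)² by finite differences, then compares d/dt exp(-iQ(t)) with the two naive candidates.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def Q(t):
    return t * sigma_x + t**2 * sigma_z    # Q(t), Q(s) do not commute for distinct nonzero t, s

def H(t):
    return sigma_x + 2 * t * sigma_z       # H = dQ/dt

def exp_minus_i(M):
    """exp(-iM) for a Hermitian matrix M, built from its eigen-decomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.exp(-1j * w)) @ V.conj().T

def central_diff(F, t, h=1e-5):
    """Central finite-difference approximation of dF/dt."""
    return (F(t + h) - F(t - h)) / (2 * h)

t = 1.0
dQ2 = central_diff(lambda s: Q(s) @ Q(s), t)
symmetrized = H(t) @ Q(t) + Q(t) @ H(t)

dU = central_diff(lambda s: exp_minus_i(Q(s)), t)
U = exp_minus_i(Q(t))

print(np.allclose(dQ2, symmetrized, atol=1e-6))        # Q'Q + QQ' is correct
print(np.allclose(dQ2, 2 * Q(t) @ H(t), atol=1e-6))    # 2QQ' is not
print(np.allclose(dU, -1j * H(t) @ U, atol=1e-3))      # H on the left fails
print(np.allclose(dU, -1j * U @ H(t), atol=1e-3))      # H on the right fails
```

The output should be True followed by three False lines: the second-order term already needs the symmetrized form, and neither ordering of H and exp(-iQ) reproduces the derivative of the exponential for this Q(t).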

pellman · Oct 18, 2010

I see what you mean. Darn. I was hoping this would have a simple answer.

strangerep · Oct 18, 2010

dextercioby said:

If your A is either self-adjoint or unitary in a (rigged) Hilbert space, then you can easily define a function f(A) by the means of the spectral decomposition of A. Then you can compute a derivative, but, of course, under tight conditions of convergence.

Umm,... how does this work when one is dealing with a continuous family of operators such as A(t)?

E.g., for a given time, we have an operator [itex]A_0 = A(t=0)[/itex] (assumed to be self-adjoint, say), so we can spectral-decompose it in terms of its eigenvalues and eigenstates:

[tex]
f(A_0) ~=~ \int da_0 f(a_0) |a_0\rangle \langle a_0| ~~.
[/tex]

But each A(t) will have a different set of eigenvalues and eigenstates in general,

[tex]
f(A_t) ~=~ \int da_t f(a_t) |a_t\rangle \langle a_t| ~~.
[/tex]

so how does one take the t derivative of the LHS without first computing
the time-dependent eigenvalues and eigenstates explicitly?

(Or did I misunderstand you?)
