Substitution to turn a non-linear least squares problem into a linear one

In summary, the conversation discussed using the method of least squares to solve an overdetermined non-linear equation. The speaker proposed substituting a variable in order to turn the problem into a linear one. They then discussed the conditions under which this substitution is allowed and how to accurately solve for the unknown parameters. The conversation concluded that this approach can be used for a variety of non-linear problems, as long as a solveable algebraic expression can be obtained for the unknown parameters.
  • #1
bers
4
0
Hi,
I want to solve an overdetermined non-linear equation with the method of least squares. Assume it's f(x) = 1 + ax + a^2 + b, and I want to estimate a and b. This is non-linear, as I said, so the derivatives of the squared residuals involve a^3 terms and are difficult to solve.
Now I thought about substituting c = a^2 + b, which would turn this into a linear system that LOOKS pretty easy to solve. But when calculating the derivatives of the residuals, do I need to take into account that dc/da != 0 (which would make the terms quite complicated again), or can I simply proceed as if c had been the original variable?
If I am allowed to go this way (I guess I am), what are the conditions - when am I allowed to substitute to turn a non-linear least-squares system into a linear one? Can you point me at any fundamental rule for this? I searched Google quite a bit, but have not found any adequate.

Thanks
bers
 
Mathematics news on Phys.org
  • #2
You can approximate a non-linear problem, about a fixed set of values, by replacing any non-linear formula with its linear approximation. You are allowed to do this any time, at any point but how accurate the approximation is depends on how close to linear the function is at that point. If you are allowed to choose the point, then take the derivative and choose the point where the derivative (or its magnitude) is smallest. If, as often happens, you are given the point, check the derivative to see if "linearizing" will give reasonably accurate results.
 
  • #3
Least squares using non-linear functions can be rather straighforward. It's really just a series of linearized least squares problems (based on an initial guess of the solution) that (hopefully) iteratively converge to a solution. Difficulties can arise if the starting point is far from the solution and/or the matrices are ill-conditioned.
 
Last edited:
  • #4
Hi,
thanks for your answers. I do think, however, I did not point out two important things clearly enough:

1. I am aware of many iterative algorithms for this problem. However, I'm interested in a non-iterative solution. (Would you call this an ad-hoc solution?) Anyway, I want to have a fixed-run-time algorithm for the problem I mentioned, and we all know that linear problems can be solved this way.

2. I am also not interested in an approximation. I am interested whether the substitution will allow me to find the exact solution in a linear way.

So let me summarize what I got to so far. Let's talke about [itex]f(x) = 1 + ax + a^2 + b[/itex]. Using some data [itex]x_i[/itex] and [itex]y_i[/itex], I can define the residual

[itex]R = \sum_{i=1}^n ||f(x_i) - y_i||^2[/itex]

Inserting [itex]f[/itex] with [itex]c[/itex] substituted by [itex]a^2 + b[/itex], I find

[itex]R = \sum_{i=1}^n ||1 + ax_i + c - y_i||^2[/itex]

Now, I need to calculate the derivatives with respect to a and c. Not knowing whether total or partial derivatives are the way to go, let's try both.

[itex]\partial R / \partial a = \sum_{i=1}^n 2 (1 + ax_i + c - y_i) \cdot x_i = 0[/itex]
[itex]\partial R / \partial c = \sum_{i=1}^n 2 (1 + ax_i + c - y_i) \cdot 1 = 0[/itex]

These are pretty easy to solve non-iteratively, because they are linear in a and c, although my original problem is not. That is my main concern. Will the solution I achieve from this system of equations be exact? In the manual examples I calculated, they were, but I cannot be sure this is always the case.

If we go for total derivatives, we need to account for c depending on a, with [itex]\partial c / \partial a = 2a[/itex] and [itex]\partial a / \partial c[/itex] involving roots.

[itex]dR / da = \sum_{i=1}^n 2 (1 + ax_i + c - y_i) \cdot (x_i + \partial c / \partial a) = 0[/itex]
[itex]dR / dc = \sum_{i=1}^n 2 (1 + ax_i + c - y_i) \cdot (1 + \partial a / \partial c)= 0[/itex]

These are now not linear any more, and much more complex to solve. Testing some values fulfilling [itex]f(x_i) = y_i[/itex] and the correct values for a and b will of course make the first term [itex]1 + \dots - y_i[/itex] equal 0, so the derivatives are 0 in both cases.

So, still I am not sure which way is the right one...
 
  • #5
Seems to me your first approach is perfectly valid. But, the problem is even simpler if you let c = 1 + a2 + b. Your equation is then f(x) = ax + c which is very easy to solve for a, c from a least squares standpoint. Then, you can determine b = c - 1 - a2.
 
  • #6
Yes, I confirmed that this works using distorted data (y_i = f(x_i) + some error) with a non-zero residuum. I obtained the same estimates as with Matlab and a non-linear iterative procedure (and got wrong results with the total differential method, by the way).

So that would mean I can obtain a linear formulation for a lot of non-linear problems, such as f(x) = bx + ax + a^2. For example, I could do c = a^2 and then d = b + sqrt(c) to obtain f(x) = dx + c. What I am wondering is:
- Why can't I find any mentioning of these substitutions on Google or elsewhere?
- Under which conditions can I do this substitution? So, what "kinds of" non-linearities can be (perfectly) "linearized", and which ones can't?
 
  • #7
I believe the problem you stated is really a linear problem. Let's say the problem was proposed in reverse. I give you data and ask you to fit it to the equation f(x) = ax + c. After you give me the result I tell you I want to re-define c = 1 + a2 + b. The fact that a valid algebraic manipulation is performed on the parameters doesn't turn a linear problem into a non-linear one.

It seems to me (though I have no idea how to prove it mathematically) that if you can obtain a solveable algebraic expression for the unknown parameters you'll be ok.
 

Related to Substitution to turn a non-linear least squares problem into a linear one

What is substitution in the context of non-linear least squares problems?

Substitution is a mathematical technique used to transform a non-linear least squares problem into a linear one. This is done by replacing the non-linear variables in the problem with linear ones in order to simplify the problem and make it easier to solve.

Why do we need to turn a non-linear least squares problem into a linear one?

By turning a non-linear least squares problem into a linear one, we can use well-known and efficient techniques from linear algebra to solve the problem. This makes it easier to find the optimal solution and reduces the computational complexity of the problem.

How is substitution used to transform a non-linear least squares problem into a linear one?

To transform a non-linear least squares problem into a linear one, we substitute the non-linear variables with linear ones using a linearizing function. This function maps the non-linear variables to linear ones in such a way that the problem becomes linear.

What are the limitations of using substitution to turn a non-linear least squares problem into a linear one?

Substitution can only be used for certain types of non-linear least squares problems. It may not work for highly complex or non-linear problems, and the choice of linearizing function can greatly affect the accuracy of the solution.

Are there any alternatives to using substitution for non-linear least squares problems?

Yes, there are other methods such as Newton's method or gradient descent that can be used to solve non-linear least squares problems. These methods may be more suitable for certain types of problems and may yield more accurate results compared to using substitution.

Similar threads

  • Set Theory, Logic, Probability, Statistics
Replies
4
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
7
Views
673
Replies
2
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
2
Views
1K
  • Linear and Abstract Algebra
Replies
5
Views
1K
Replies
3
Views
2K
Replies
30
Views
2K
  • Differential Equations
Replies
3
Views
1K
Replies
4
Views
2K
  • General Math
Replies
1
Views
2K
Back
Top