Why Do Nonlinear Functions Often Lead to Non-Convex Cost Functions?

In summary, the conversation discusses convexity of cost functions for models built from linear and non-linear functions. It is mentioned that non-linear functions can give rise to non-convex cost functions, and an example is provided to illustrate this. The intuition behind this is also explored, with the conclusion that the second derivatives of the cost function with respect to the weights determine its convexity.
  • #1
pamparana
I am taking an online course on linear regression that discusses the sum-of-squared-differences cost function. One of the points it makes is that this cost function is always convex, i.e. it has only one optimum.

Now, reading a bit more, it seems that non-linear functions tend to give rise to non-convex cost functions, and I am trying to develop some intuition for why. So, suppose I take a random model like:

$$
f(x) = w_0 x^2 + w_1 \exp(x) + \epsilon
$$

And the cost function I choose is the same i.e. I want to minimise:

$$
J(w_0, w_1) = (f(x) - w_0 x^2 + w_1 \exp(x))^2
$$

What is the intuition that the squared term and the exponential term would give rise to non-convexities?
 
  • #2
For the example that you've chosen (after fixing a missing minus sign, so that ##J = (f(x) - w_0 x^2 - w_1 e^x)^2##), the mixed second derivative

$$ \frac{\partial^2 J}{\partial w_0 \partial w_1} = 2 x^2 e^x \geq 0$$

is non-negative.
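As a worked check, the other second derivatives of the same single-point cost can be written out in the same way. This is only a sketch: it assumes ##f(x)## is treated as a fixed observed value and that the sign-corrected cost ##J = (f(x) - w_0 x^2 - w_1 e^x)^2## is the one intended. The vectors ##g## and ##v## are notation introduced here for illustration.

$$
\frac{\partial^2 J}{\partial w_0^2} = 2x^4, \qquad
\frac{\partial^2 J}{\partial w_1^2} = 2e^{2x}, \qquad
\frac{\partial^2 J}{\partial w_0 \partial w_1} = 2x^2 e^x
$$

$$
H = 2\begin{pmatrix} x^4 & x^2 e^x \\ x^2 e^x & e^{2x} \end{pmatrix} = 2\, g g^T,
\qquad g = \begin{pmatrix} x^2 \\ e^x \end{pmatrix},
\qquad v^T H v = 2\,(v_0 x^2 + v_1 e^x)^2 \geq 0
$$

Under these assumptions the determinant of ##H## is zero and both diagonal entries are non-negative, so ##H## is positive semidefinite for every ##x##.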

In a more general case, we will have residuals

$$ r_i = y_i - \sum_\alpha w_\alpha f_\alpha(x_i)$$

and we are minimising

$$ S = \sum_i r_i^2.$$

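The expressions

$$ m_{\alpha\beta}= \frac{\partial^2 S}{\partial w_\alpha \partial w_\beta} = 2 \sum_i f_\alpha(x_i) f_\beta(x_i)$$

are the entries of the Hessian of ##S##, and the definiteness of this matrix determines the convexity of ##S## as a function of the ##w_\alpha##. When ##\beta = \alpha##, ##m_{\alpha\alpha}## is a sum of squares, so it is non-negative. When ##\beta\neq \alpha##, it is possible to find ##m_{\alpha\beta}<0##, depending on the specific functions ##f_\alpha## and the data ##x_i##. A negative off-diagonal entry does not by itself make ##S## non-convex, though: for any vector ##v##, ##\sum_{\alpha\beta} v_\alpha m_{\alpha\beta} v_\beta = 2\sum_i \left(\sum_\alpha v_\alpha f_\alpha(x_i)\right)^2 \geq 0##, so the matrix as a whole is positive semidefinite whenever the model is linear in the ##w_\alpha##.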
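A quick numerical check of the general formula may help here. The sketch below builds the matrix ##m_{\alpha\beta}## for two choices of basis functions and inspects both the off-diagonal entry and the eigenvalues. The helper name `hessian_of_sum_of_squares`, the sample grid, and the second basis pair (##x## and ##e^{-x}##) are assumptions made for illustration; they do not come from the thread.

```python
import numpy as np


def hessian_of_sum_of_squares(basis_functions, x):
    """Return m[a, b] = 2 * sum_i f_a(x_i) * f_b(x_i) for the given basis functions."""
    # Column a of F holds f_a evaluated at every data point x_i.
    F = np.column_stack([f(x) for f in basis_functions])
    return 2.0 * F.T @ F


# Hypothetical data: a symmetric grid of sample points.
x = np.linspace(-2.0, 2.0, 41)

# Basis from the original question: x^2 and exp(x).
m_question = hessian_of_sum_of_squares([np.square, np.exp], x)

# A second, made-up basis chosen so that the off-diagonal entry is negative.
m_negative = hessian_of_sum_of_squares([lambda t: t, lambda t: np.exp(-t)], x)

for name, m in [("x^2, exp(x)", m_question), ("x, exp(-x)", m_negative)]:
    eigenvalues = np.linalg.eigvalsh(m)  # m is symmetric, so eigvalsh applies
    print(name)
    print("  off-diagonal entry:", m[0, 1])
    print("  eigenvalues:", eigenvalues)
```

On this grid the second basis gives a negative off-diagonal entry, while the eigenvalues of both matrices come out non-negative, which is what convexity in the ##w_\alpha## requires.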
 

Related to Why Do Nonlinear Functions Often Lead to Non-Convex Cost Functions?

1. What is nonlinearity and nonconvexity?

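Nonlinearity and nonconvexity are mathematical properties of a function or system. Nonlinearity means that the output is not a linear function of the input, so scaling or adding inputs does not scale or add outputs in the same way. Nonconvexity means that the function fails the convexity condition: somewhere its graph rises above the straight line joining two of its points. A nonconvex function can therefore have several local minima instead of a single global one.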

2. How do nonlinearity and nonconvexity affect scientific research?

Nonlinearity and nonconvexity can complicate scientific research because they make it difficult to find optimal solutions or predict the behavior of a system. They also require more complex mathematical techniques to analyze and model.

3. Can nonlinearity and nonconvexity be found in real-world systems?

Yes, nonlinearity and nonconvexity can be found in many real-world systems, including biological, economic, and physical systems. Examples include chaotic systems, competitive markets, and protein folding.

4. How can nonlinearity and nonconvexity be addressed in research?

One way to address nonlinearity and nonconvexity in research is to use advanced mathematical techniques, such as nonlinear optimization and machine learning algorithms. Another approach is to simplify the system or function being studied to make it more linear or convex.

5. What are some potential applications of studying nonlinearity and nonconvexity?

Studying nonlinearity and nonconvexity has many potential applications, including improving predictions and models in various fields, developing more efficient algorithms and optimization techniques, and gaining a better understanding of complex systems in the natural world.
