Looking for empirical equation in experimental data

In summary, the conversation discusses the use of empirical equations in engineering research and the challenges in finding the most accurate model. Suggestions are made to include interaction terms and use linear algebra to find the closest approximation. The use of higher order terms is also suggested to improve the correlation between the estimated and measured data.
  • #1
Curran919
I am an engineering student working in nuclear research. I am running some experiments and looking for an empirical equation to apply to results in a test section, but I am having trouble making a mental leap. Here is the core of the problem with all of the engineering 'fat' trimmed off:

I have a variable that depends on three others:
G = f(A,B,C)

I have shown that G is more or less linear with respect to each variable for multiple values of the other variables (sorry, I'm an undergrad engineer, so my mathematical notation is lacking):
G=f(A) of O(1) for every B,C
G=f(B) of O(1) for every A,C
G=f(C) of O(1) for every A,B


I would like to say that because of this,
G = f(A)+g(B)+h(C)
or even,
G = aA+bB+cC+d where a,b,c,d are constants

but this would only be true if the slope of f(A) were constant regardless of B and C (and the same for f(B) and f(C)). Of course, it isn't. Is what I've said correct, and if so, is there an alternative conclusion I can make?

G = (A-a)(B-b)(C-c)?
 
  • #2
A common model for empirical work is the linear model. Where things don't fit so well, the experimenter can include interaction terms. Thus, the model might be (for two independent variables x and y)

z = a*x + b*y + c*x*y

Here a, b, and c are constants that need to be fit to the data. You might check some books on the Design of Experiments, as such modeling is often done in that context.
 
  • #3
Consider the function f(A,B,C) = AB+AC+BC. It is linear in each variable, but it is not globally approximated by anything of the form Aa+Bb+Cc for constants a, b and c. If you are only interested in local behavior, you should restrict A, B and C to a limited range. Then maybe you can get an approximate linear form.
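
To make this concrete, here is a short worked step (added for illustration) comparing the slope in A for the two forms:

[tex]\frac{\partial}{\partial A}(AB+AC+BC) = B + C, \qquad \frac{\partial}{\partial A}(Aa+Bb+Cc) = a[/tex]

The first slope changes with B and C while the second is a fixed constant, so a plot of f against A is a straight line for any fixed B and C, yet no single constant a can match all of those lines at once.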

If you know some linear algebra, you could find the linear expression that is "closest" to your set of data-points, in the sense that the sum of squares of the differences between the data points and the values of the linear expression is minimized. If you have values for f(x,y,z) at [tex](x_1,y_1,z_1), (x_2,y_2,z_2),...,(x_n,y_n,z_n)[/tex], solve for the least-squares solution to [tex]Mx=b[/tex], where
[tex]M = \begin{bmatrix} x_1 & y_1 & z_1 & 1 \\ x_2 & y_2 & z_2 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ x_n & y_n & z_n & 1 \end{bmatrix}[/tex]

and

[tex]b = \begin{bmatrix} f(x_1,y_1,z_1) \\ f(x_2, y_2, z_2) \\ \vdots \\ f(x_n , y_n , z_n) \end{bmatrix}[/tex].

That is, solve the normal equations [tex]M^TMx = M^Tb[/tex].



Then your solution [tex]x = \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix}[/tex] will give an approximation [tex]f(x,y,z) \approx ax+by+cz+d [/tex] on these data-points. The more "linear" your function behaves, the better the approximation.
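
As a minimal sketch of the procedure above, assuming NumPy is available: the helper name `fit_linear` and the data points are invented for illustration (the data is synthetic and exactly linear, not measurements from this thread).

```python
import numpy as np

def fit_linear(x, y, z, f):
    """Least-squares fit of f ~ a*x + b*y + c*z + d via the normal equations."""
    # Row i of M is [x_i, y_i, z_i, 1]
    M = np.column_stack([x, y, z, np.ones_like(x)])
    # Solve (M^T M) coeffs = M^T f for coeffs = [a, b, c, d]
    return np.linalg.solve(M.T @ M, M.T @ f)

# Synthetic, exactly linear data (illustrative values only)
rng = np.random.default_rng(0)
x, y, z = rng.uniform(0.0, 1.0, size=(3, 20))
f = 2.0 * x - 1.0 * y + 0.5 * z + 3.0

a, b, c, d = fit_linear(x, y, z, f)
```

On real measurements the fit will not be exact, and the size of the residuals `M @ coeffs - f` indicates how "linear" the data actually is.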

If you suspect it to be of another form, such as a higher degree polynomial or a linear combination of entirely different functions, this can be done similarly. To do this: If you think the function is approximately a linear combination of the functions [tex]g_1(x,y,z),...,g_k(x,y,z)[/tex], substitute [tex]\begin{bmatrix} g_1(x_i,y_i,z_i) & \ldots & g_k(x_i,y_i,z_i) \end{bmatrix}[/tex] for the i'th row of the matrix M, and solve for some k-vector x, which will be the coefficients of the functions. The linear form corresponds to the case where [tex]g_1(x,y,z) = x, g_2(x,y,z) = y, g_3(x,y,z) = z[/tex], and [tex]g_4(x,y,z) = 1[/tex].

Often [tex]M^TM[/tex] will be invertible, giving a unique solution [tex](M^TM)^{-1}M^Tb[/tex], and inverting will not be very difficult as the matrix [tex]M^TM[/tex] is a [tex]k \times k[/tex] matrix, where k is the number of functions you are considering. You should probably constrain your data-set so you can multiply the matrices without difficulty. Hope this helps, good luck.
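
The general version with arbitrary functions can be sketched as follows, again assuming NumPy. The name `fit_basis`, the choice of basis functions and the data are hypothetical. Note that this sketch calls `np.linalg.lstsq` rather than forming the inverse explicitly; it minimizes the same sum of squares, and the rank check stands in for the invertibility condition.

```python
import numpy as np

def fit_basis(basis, points, values):
    """Coefficients of the least-squares combination of the functions in `basis`."""
    x, y, z = points.T
    # Row i of M is [g_1(x_i, y_i, z_i), ..., g_k(x_i, y_i, z_i)]
    M = np.column_stack([g(x, y, z) for g in basis])
    # M^T M is invertible exactly when M has full column rank
    if np.linalg.matrix_rank(M) < M.shape[1]:
        raise ValueError("basis functions are linearly dependent on this data")
    coeffs, *_ = np.linalg.lstsq(M, values, rcond=None)
    return coeffs

# Hypothetical basis: g_1 = x*y, g_2 = z, g_3 = 1
basis = [
    lambda x, y, z: x * y,
    lambda x, y, z: z,
    lambda x, y, z: np.ones_like(x),
]

# Synthetic data generated from known coefficients (illustrative only)
rng = np.random.default_rng(1)
points = rng.uniform(0.0, 1.0, size=(30, 3))
values = points[:, 0] * points[:, 1] + 2.0 * points[:, 2] - 1.0

coeffs = fit_basis(basis, points, values)
```

Swapping in a different list of functions changes the model without touching the solver, which is the point of writing the rows of M in terms of the functions g.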
 
  • #4
Thanks, Jarle, very helpful.

Indeed using [tex]
f(x,y,z) \approx ax+by+cz+d
[/tex] gave a poor correlation between the estimated and the measured readings. I tried:

[tex]
f(x,y,z) \approx axyz+bxy+cxz+dyz+ex+fy+gz+h
[/tex]

and the correlation appears much better. I think I have some outliers in the measurement data, so I will remove a few instances and see what happens. Is there an underlying explanation to the terms that I used, or is it just a mathematical catch-all (or more terms [tex] \approx [/tex] less error)? I tried nixing the terms that seemed to have a low correlation, which was okay for [tex] axyz [/tex], but removing any of the second order terms introduced considerable error.
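
The term-dropping comparison described above can be sketched like this, assuming NumPy. The data is synthetic and was deliberately generated with pairwise interactions only, so it is an illustration of the method and not a reproduction of the measurements in this thread. The names `TERMS` and `rms_residual` are made up for the example.

```python
import numpy as np

# The eight terms of f ~ a*xyz + b*xy + c*xz + d*yz + e*x + f*y + g*z + h
TERMS = {
    "xyz": lambda x, y, z: x * y * z,
    "xy": lambda x, y, z: x * y,
    "xz": lambda x, y, z: x * z,
    "yz": lambda x, y, z: y * z,
    "x": lambda x, y, z: x,
    "y": lambda x, y, z: y,
    "z": lambda x, y, z: z,
    "1": lambda x, y, z: np.ones_like(x),
}

def rms_residual(names, x, y, z, f):
    """RMS error of the least-squares fit that uses only the named terms."""
    M = np.column_stack([TERMS[n](x, y, z) for n in names])
    coeffs, *_ = np.linalg.lstsq(M, f, rcond=None)
    return np.sqrt(np.mean((M @ coeffs - f) ** 2))

# Synthetic data: pairwise interactions plus a little noise (assumed values)
rng = np.random.default_rng(2)
x, y, z = rng.uniform(0.0, 2.0, size=(3, 50))
f = x * y + x * z + y * z + rng.normal(0.0, 0.01, size=50)

full = rms_residual(list(TERMS), x, y, z, f)
# Refit eight times, each time leaving out one term
dropped = {n: rms_residual([m for m in TERMS if m != n], x, y, z, f) for n in TERMS}
```

Because each reduced model is nested inside the full one, its residual on the fitted data can never be smaller than the full model's. That is the "more terms, less error" effect, and it is why a lower fitting error alone does not show that an extra term is physically meaningful.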
 
  • #5


I understand your frustration in trying to find an empirical equation to apply to your experimental data. It can be challenging to make the mental leap from data points to a mathematical equation that accurately represents your findings.

Based on the information provided, it seems that you have identified a linear relationship between G and each of the variables A, B, and C individually. However, you are struggling to find a single equation that encompasses all three variables simultaneously.

Your proposed equations, G = f(A)+g(B)+h(C) and G = aA+bB+cC+d, may not accurately represent the relationship between G and all three variables. This is because the slope of f(A) (or f(B) or f(C)) is not constant when the other variables are held constant. This implies that the relationship between G and each variable is influenced by the other variables.

One alternative conclusion you could make is that the relationship between G and the three variables is not a simple linear one. It may be a more complex relationship, such as a quadratic or exponential one. To determine this, you may need to perform further experiments with a wider range of values for each variable and analyze the data using regression techniques.

Another approach could be to use a statistical model, such as a multiple linear regression, to determine the relationship between G and the three variables. This would allow you to account for the influence of all three variables simultaneously and potentially identify any interactions between them.

In summary, it is important to carefully analyze your data and consider alternative conclusions before trying to fit a mathematical equation to your experimental results. It may also be helpful to consult with a statistician or other experts in your field for guidance on choosing the most appropriate model for your data.
 

Related to Looking for empirical equation in experimental data

1. How do you determine the best empirical equation for experimental data?

To determine the best empirical equation for experimental data, you need to first plot your data and visually analyze the relationship between the variables. You can also use statistical methods such as correlation analysis and curve fitting to determine the best fit equation. Additionally, you can consult with other experts in your field or conduct a literature review to see if similar equations have been used in previous studies.

2. What is the purpose of finding an empirical equation?

The purpose of finding an empirical equation is to describe the relationship between variables in a quantitative manner. This equation can then be used to make predictions and draw conclusions based on the experimental data. It can also help to identify any underlying patterns or trends in the data.

3. Can an empirical equation accurately represent all experimental data?

No, an empirical equation may not be able to accurately represent all experimental data. This is because experimental data can often be complex and may not follow a specific equation or pattern. However, an empirical equation can provide a simplified representation of the data and can be useful for making predictions and drawing conclusions.

4. How can you validate the accuracy of an empirical equation?

An empirical equation can be validated by comparing its predictions to new experimental data that was not used to create the equation. If the predictions are close to the actual values, then the equation can be considered accurate. Additionally, you can also compare the equation to other established equations or theoretical models to see if they are consistent.
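
A minimal sketch of this kind of hold-out check, assuming NumPy and using the two-variable interaction model z = a*x + b*y + c*x*y from the thread. The data, the noise level and the 40/20 split are invented for illustration.

```python
import numpy as np

def design(x, y):
    """Rows [x, y, x*y] for the model z ~ a*x + b*y + c*x*y."""
    return np.column_stack([x, y, x * y])

# Synthetic measurements (assumed coefficients and noise level)
rng = np.random.default_rng(3)
x, y = rng.uniform(0.0, 1.0, size=(2, 60))
z = 2.0 * x + 0.5 * y + 1.5 * x * y + rng.normal(0.0, 0.02, size=60)

# Fit on the first 40 points only; hold out the last 20 for validation
train, test = slice(0, 40), slice(40, 60)
coeffs, *_ = np.linalg.lstsq(design(x[train], y[train]), z[train], rcond=None)

# Compare predictions with the held-out values that the fit never saw
pred = design(x[test], y[test]) @ coeffs
rmse_test = np.sqrt(np.mean((pred - z[test]) ** 2))
```

If `rmse_test` is much larger than the error on the points used for fitting, the equation is probably describing noise in the fitting set and not the underlying relationship.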

5. Are there any limitations to using an empirical equation?

Yes, there are limitations to using an empirical equation. These equations are based on experimental data and may not take into account all possible factors that could affect the relationship between variables. They also may not be applicable to all situations or may not accurately represent all data points. It is important to carefully consider the assumptions and limitations of an empirical equation before using it for analysis or predictions.
