Analyzing a 7-variable data set where multiple variables changed at once

In summary, Matt is looking for software that can help him predict, from his data, which growth conditions produce good films. He is also looking for help understanding the physical principles that may affect film quality.
  • #1
swooshfactory
Hi,

I have some experimental data from a colleague which has been collected with different values for 7 parameters. It isn't nice, as in it's not as if 6 of the parameters were held fixed while the seventh was adjusted. Beyond that, I don't have much data. I'm not up on my statistical significance, but this data will probably prove to not be significant.

I'm wondering if there is some software that would take a shot at analyzing this data and maybe indicate some trends.

Thanks for reading!
 
  • #2
You haven't explained enough about the data for anyone to know what hypothesis about it you wish to test.
 
  • #3
Ok, my apologies. The data is for growing thin films with different growth conditions. Some of the sets of variables produce "good" films, some "mediocre" and some "bad". I cannot formulate a hypothesis based on the data - the conditions seem so different from one data set to the next.

I would like to be able to predict which conditions will produce good films. If more information is needed, I will gladly supply it.

Matt
 
  • #4
swooshfactory said:
I cannot formulate a hypothesis based on the data - the conditions seem so different from one data set to the next.

I would like to be able to predict which conditions will produce good films.

There are two different goals in your previous posts. The question of determining "statistical significance" is much less ambitious than the goal of making good predictions. In the usual sort of statistical testing, one often assumes a "null hypothesis" that essentially says "The conditions don't affect the outcome". One then computes the probability of observing the data under that very unpredictive model. If the probability is low, one may subjectively "reject" the null hypothesis. However, this procedure merely decides that the conditions do affect the outcome, without specifying how you can use the conditions to make a prediction.
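As a rough sketch of that kind of test, a permutation test asks how often a gap between the "good" films and the rest, at least as large as the observed one, would appear if the grades were shuffled at random (that is, if the conditions did not affect the outcome). The temperatures and grades below are invented for illustration, and `permutation_p_value` is a hypothetical helper, not a library function. It is one possible test among many, not necessarily the one intended above.

```python
import random

def permutation_p_value(values, labels, n_perm=5000, seed=0):
    """Estimate how often shuffled labels give a gap >= the observed gap."""
    rng = random.Random(seed)

    def mean_gap(lbls):
        # Absolute difference between the mean of "good" runs and all other runs.
        good = [v for v, l in zip(values, lbls) if l == "good"]
        rest = [v for v, l in zip(values, lbls) if l != "good"]
        return abs(sum(good) / len(good) - sum(rest) / len(rest))

    observed = mean_gap(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real link between condition and grade
        if mean_gap(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical growth temperatures (deg C) and film grades, made up for illustration.
temperature = [400, 410, 420, 430, 500, 510, 520, 530]
grade = ["good", "good", "good", "good", "bad", "bad", "bad", "bad"]

p = permutation_p_value(temperature, grade)
print(f"estimated p-value: {p:.4f}")
```

A small p-value here would only say that temperature and grade are probably related in this sample; it would not say which temperature to choose, which is the gap between significance and prediction described above.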

As to making predictions, I think there are forum members who could advise you if you present the data. (I'm not sure that I'm one of them.) It would also be helpful to know any basic physics that underlies the process.
 
  • #5
swooshfactory said:
I would like to be able to predict which conditions will produce good films. If more information is needed, I will gladly supply it.

Matt

Hi Matt,

This looks like an optimization problem, but Mr. Tashi is right: since there are a myriad of optimization techniques, more details about your project would help determine which one suits you best.

Is there an underlying physical model you can fit your data to? Are there economic constraints on the values of the seven parameters (e.g. the lower a parameter, the cheaper the process)? Would you be able to gather more data if necessary and, if so, are there limits on the time or money available to do it?

But anyway, as a first attempt you could fit a linear model with quadratic (second-order) terms and then, if the model fits well, simply look for the optimal settings within that model.
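A minimal sketch of that idea for a single parameter, assuming the film quality has already been turned into a numeric score. The data, the variable names, and the helpers `solve` and `fit_least_squares` are all made up for illustration. Note that a full quadratic model in seven parameters has 36 coefficients (1 intercept, 7 linear, 7 squared, 21 pairwise), so with little data only a reduced model can be fitted.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_least_squares(rows, y):
    """Least-squares coefficients from the normal equations (X^T X) beta = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data: one growth parameter x and a numeric quality score y.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 5.2, 4.8, 4.1, 1.9]

design = [[1.0, xi, xi * xi] for xi in x]  # columns: intercept, x, x^2
b0, b1, b2 = fit_least_squares(design, y)

# The fitted parabola has an interior maximum only if the x^2 coefficient is negative.
x_best = -b1 / (2 * b2) if b2 < 0 else None
if x_best is not None:
    print(f"score ~ {b0:.2f} + {b1:.2f} x + {b2:.2f} x^2, best x ~ {x_best:.2f}")
```

The normal equations are used here only to keep the sketch dependency-free; a statistics package such as R would fit the same model with better numerical safeguards and report how well it fits.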

As for software for this kind of analysis, I would recommend R (http://cran.r-project.org/). R is a very powerful language for statistical analysis, and you'll be able to analyze your data in many different ways.
 
  • #6
Thank you both for your replies. Overall, there must be some pattern (I would think), since other people in my group have said that they can reproduce good films using certain conditions. The data I have, though, is from someone exploring many different growth conditions.

Some of the physical principles would be something like high temperatures destroy order.

The price is assumed to be the same for each process, but I will consider this in my review.

More data could be collected, but I'm not trained on this machine, so it's possible it won't be for a while.

Thanks again...

Matt
 
  • #7
swooshfactory said:
Thank you both for your replies. Overall, there must be some pattern (I would think), since other people in my group have said that they can reproduce good films using certain conditions. The data I have, though, is from someone exploring many different growth conditions.

Some of the physical principles would be something like high temperatures destroy order.

The price is assumed to be the same for each process, but I will consider this in my review.

More data could be collected, but I'm not trained on this machine, so it's possible it won't be for a while.

Thanks again...

Matt

Well, then you could give the linear model a try and see how it goes. It will also give you hints about which parameters matter most for film quality, which, by the way, you will need a consistent way to measure. So good luck!
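One possible way to make the quality measure consistent is to map the grades to an ordinal score and then take a first look at how strongly each parameter tracks it. The sketch below uses a plain per-parameter Pearson correlation rather than the fitted linear model itself; the encoding `GRADE_SCORE`, the parameter names, and the runs are all assumptions made up for illustration.

```python
from math import sqrt

# Assumed ordinal encoding of the three grades mentioned in the thread.
GRADE_SCORE = {"bad": 0, "mediocre": 1, "good": 2}

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical runs: two of the seven parameters, with invented values and grades.
runs = [
    {"temperature": 400, "pressure": 1.0, "grade": "good"},
    {"temperature": 450, "pressure": 2.5, "grade": "good"},
    {"temperature": 500, "pressure": 1.5, "grade": "mediocre"},
    {"temperature": 550, "pressure": 3.0, "grade": "mediocre"},
    {"temperature": 600, "pressure": 2.0, "grade": "bad"},
    {"temperature": 650, "pressure": 1.2, "grade": "bad"},
]

scores = [GRADE_SCORE[r["grade"]] for r in runs]
params = [k for k in runs[0] if k != "grade"]

# Rank parameters by the absolute strength of their correlation with the score.
ranking = sorted(
    ((name, pearson([r[name] for r in runs], scores)) for name in params),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranking:
    print(f"{name:12s} r = {r:+.2f}")
```

Because several parameters changed at once in the real data, these one-at-a-time correlations can mislead; the coefficients of a linear model fitted to all parameters together are the more appropriate multivariate version of the same check.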
 

Related to Analyzing a 7-variable data set where multiple variables changed at once

1. How do I determine which variables are the most influential in a 7-variable data set where multiple variables changed at once?

To determine which variables are the most influential, you can use techniques such as correlation analysis or regression analysis, which relate each variable to the outcome. Principal component analysis can additionally show which variables tended to change together, although it does not by itself measure their effect on the outcome.

2. How do I handle missing data in a 7-variable data set with multiple variables that changed?

There are several methods for handling missing data, including imputation, deletion, and multiple imputation. Imputation involves replacing missing values with estimated values based on the remaining data. Deletion involves removing cases with missing data, while multiple imputation creates multiple imputed data sets and combines them for analysis.
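The two simplest options can be sketched as follows, assuming missing entries are stored as `None`. The table, the column meanings, and the helpers `listwise_delete` and `mean_impute` are made up for illustration; multiple imputation is more involved and is not shown.

```python
def listwise_delete(rows):
    """Drop every row that has at least one missing (None) value."""
    return [r for r in rows if None not in r]

def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column.

    Assumes every column has at least one observed value.
    """
    means = []
    for col in zip(*rows):
        seen = [v for v in col if v is not None]
        means.append(sum(seen) / len(seen))
    return [[m if v is None else v for v, m in zip(r, means)] for r in rows]

# Hypothetical table: columns are (temperature, pressure); None marks a missing entry.
data = [
    [400.0, 1.0],
    [450.0, None],
    [None, 2.0],
    [550.0, 3.0],
]

complete = listwise_delete(data)  # keeps only fully observed rows
filled = mean_impute(data)        # keeps all rows, fills the gaps

print(complete)
print(filled)
```

With very few runs, deletion can throw away a large share of the data, while mean imputation keeps the rows but understates the spread of the filled-in columns.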

3. Is it possible to determine causality in a 7-variable data set where multiple variables changed at once?

While correlation can reveal relationships between variables, it does not necessarily indicate causation. Regression analysis can adjust for the other measured variables, but on observational data it still cannot rule out confounding by unmeasured factors. To establish causality, you would generally need controlled experiments in which one variable is changed while the others are held fixed.

4. What are some common mistakes to avoid when analyzing a 7-variable data set with multiple changing variables?

Some common mistakes to avoid include using incorrect statistical tests, not properly accounting for confounding variables, and not considering the assumptions and limitations of the analysis techniques used. It is important to carefully plan and design your analysis and to thoroughly check your results for accuracy and validity.

5. How can I effectively communicate my findings from a 7-variable data set with multiple variables that changed at once?

To effectively communicate your findings, it is important to clearly explain your methods, results, and conclusions in a way that is easy for others to understand. You can use visual aids such as graphs and charts to help convey your message and provide context for your data. It is also important to acknowledge any limitations or uncertainties in your findings and to provide suggestions for further research or analysis.
