Creating a confounding variable

  • Thread starter FallenApple
In summary, the conversation discusses the use of linear regression to analyze the relationship between variables X1 and X2 with a response variable Y. It is noted that X1 is causally related to both X2 and Y, which can cause X2 to appear insignificant when included in the regression model with X1. This is due to the algorithm selecting coefficients that minimize the sum of squares, resulting in a higher coefficient for X1 and a lower one for X2. This can affect the statistical significance of X2 in the model.
  • #1
FallenApple
So I have Y, the response, and X1 and X2. I generate Y and X1 from a multivariate normal distribution. Then I manually set X2 to be nearly the same as X1 (identical except that I change a few entries to make X2 distinct from X1).

I ran three separate linear regressions.

lm(Y~X1) -> X1 statistically significant

lm(Y~X2) -> X2 statistically significant

lm(Y~X1+X2) -> X1 statistically significant, X2 not statistically significant

I suppose this makes sense. X1 clearly confounds the relation between X2 and Y, since X1 is causally related to both X2 and Y. But I'm not clear on what is going on mathematically. How does the algorithm detect this? Does it have something to do with holding X1 constant while interpreting X2?
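The setup described in this post can be sketched as a small simulation. This is a minimal sketch in Python with NumPy rather than R's `lm`, and several details are assumptions not given in the post: the sample size (200), the correlation between Y and X1 (0.6), the number of altered entries (10), the random seed, and the helper name `ols`, which is hypothetical.

```python
import numpy as np

def ols(y, *predictors):
    """Fit y ~ 1 + predictors by least squares.

    Returns (coefficients, t-statistics), intercept first.
    Hypothetical helper standing in for R's lm() + summary().
    """
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]                # residual degrees of freedom
    sigma2 = resid @ resid / dof             # estimated error variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, beta / se

rng = np.random.default_rng(0)               # assumed seed
n = 200                                      # assumed sample size

# Y and X1 drawn jointly from a bivariate normal (assumed correlation 0.6).
cov = [[1.0, 0.6], [0.6, 1.0]]
y, x1 = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# X2 is a copy of X1 with a few entries overwritten (assumed: 10 of them).
x2 = x1.copy()
changed = rng.choice(n, size=10, replace=False)
x2[changed] = rng.normal(size=10)

# Three fits, mirroring lm(Y~X1), lm(Y~X2), lm(Y~X1+X2).
results = {
    "Y ~ X1": ols(y, x1),
    "Y ~ X2": ols(y, x2),
    "Y ~ X1 + X2": ols(y, x1, x2),
}
for label, (beta, t) in results.items():
    print(label, "slope t-statistics:", np.round(t[1:], 2))
```

A slope with |t| above roughly 2 corresponds to significance at about the 5% level. Under these assumptions each single-predictor fit should give a large t-statistic, while in the joint fit the t-statistic for X2 will usually be small, although the exact values depend on the seed.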
 
  • #2
The algorithm selects coefficients c1 and c2 and an intercept c0 so as to minimise the sum of squares of (Y - (c0 + c1 X1 + c2 X2)).
Because the fit between X1 and Y is better than that between X2 and Y, it will choose a coefficient with a high absolute value for X1 and a low one for X2. The estimate of c2 therefore falls within the range of values that would be unsurprising under the null hypothesis that the true coefficient is zero (equivalently, the confidence interval around the estimate includes zero), meaning that it is not statistically significant.
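The role of the near-duplication of X1 and X2 can be made explicit with the standard expression for the sampling variance of a slope in a two-predictor least-squares fit. The symbols below are assumptions introduced for illustration and do not appear in the thread: σ² is the error variance, r12 is the sample correlation between X1 and X2, and a hat marks an estimate.

```latex
\operatorname{Var}(\hat{c}_2)
  = \frac{\sigma^2}{\left(1 - r_{12}^2\right)\sum_{i=1}^{n}\left(X_{2i} - \bar{X}_2\right)^2},
\qquad
t_2 = \frac{\hat{c}_2}{\widehat{\operatorname{SE}}(\hat{c}_2)}
```

As r12 approaches 1 the factor (1 - r12²) shrinks, so the standard error of the estimate of c2 grows and its t-statistic falls. The same factor inflates the variance of the estimate of c1. The difference, assuming the altered entries of X2 are unrelated to Y, is that X2 carries no information about Y beyond what X1 already provides, so the true c2 is zero while the true c1 is not.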
 

Related to Creating a confounding variable

What is a confounding variable?

A confounding variable is an outside factor that affects the relationship between the independent and dependent variables in a study, making it difficult to determine the true effect of the independent variable on the dependent variable.

How does a confounding variable impact research?

A confounding variable can lead to inaccurate or misleading results in research, as it can create a false relationship between the independent and dependent variables. This can make it difficult to draw valid conclusions and can undermine the validity of the study.

How can researchers control for confounding variables?

Researchers can control for confounding variables by using research designs such as randomized controlled trials, which assign participants to different groups at random. This helps to minimize the effects of confounding variables because, on average, they are distributed equally among the different groups.

What are some examples of confounding variables?

Some examples of confounding variables include age, gender, socio-economic status, and education level. These variables can impact the relationship between the independent and dependent variables in a study and must be controlled for in order to draw accurate conclusions.

How can researchers identify and address confounding variables?

Researchers can identify potential confounding variables by conducting a thorough literature review, consulting with experts in the field, and using statistical techniques such as multiple regression analysis. Once identified, researchers can control for confounding variables through study design, statistical analysis, or by including them as independent variables in the study.
