Examples of Multiple Linear Regression Models

In summary: yes, the examples discussed in this thread are all MULTIPLE linear regression models of the form Y = β0 + β1X1 + β2X2 + ... + βkXk + ε.
  • #1
kingwinner
1) "Simple linear regression model:
Y = β0 + β1X + ε
E(Y) = β0 + β1X
A linear model means that it is linear in β's, and not necessarily a linear function of X.
The independent variable X could be W^2 or ln(W), and so on, for some other independent variable W."


I have some trouble understanding the last line. I was told that a SIMPLE linear regression model is always a straight-line model: it is a least-squares LINE of best fit. But if X = W^2, then we have E(Y) = β0 + β1W^2, which is not a straight line... how come? Is this allowed?


2) "A SIMPLE linear regression is a linear regression in which there is only ONE independent variable."

Now is the following a simple linear regression or a multiple linear regression?
Y = β0 + β1X + β2X^2 + ε
It has only one independent variable X, so is it simple linear regression? But this just looks a bit funny to me...


3) "A linear regression model is of the form:
Y = β0 + β1X1 + β2X2 + ... + βkXk + ε
If there is more than one independent variable, then the model is called a MULTIPLE linear regression model."


This idea doesn't seem too clear to me. What can the Xi's be? What are some actual examples of multiple linear models? Does a linear model always have to be a straight line or a plane?

Thanks for explaining!
 
  • #2
kingwinner said:
1) I have some trouble understanding the last line. I was told that a SIMPLE linear regression model is always a straight-line model: it is a least-squares LINE of best fit. But if X = W^2, then we have E(Y) = β0 + β1W^2, which is not a straight line... how come? Is this allowed?
Think of it this way. You have a bunch of (x,y) pairs and are trying to find the coefficients a and b for y = ax^2 + b. Introduce a new variable u = x^2. Now the equation you are trying to fit is y = au + b: a straight-line fit. Now imagine you have a different set of (x,y) pairs and this time you are trying to find the coefficients a and b for y = bx^a. Introduce two new variables, u = ln(x) and v = ln(y). Taking the log of both sides of y = bx^a and substituting yields v = au + ln(b): again a straight-line fit, this time in a and ln(b).

A bit of caution with regard to the latter. The linear regression yields the best fit (in the least-squares sense) to v = au + ln(b). This is not necessarily the best fit (in the least-squares sense) to y = bx^a.
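A minimal sketch of the log-transform trick above, using numpy (the data values here are made up for illustration):

```python
import numpy as np

# Synthetic data that follows y = b * x**a exactly, with a = 2, b = 3.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 3.0 * x**2

# Substitute u = ln(x), v = ln(y); then v = a*u + ln(b), a straight line.
u = np.log(x)
v = np.log(y)

# Fit the line v = a*u + c by ordinary least squares (degree-1 polyfit).
a, c = np.polyfit(u, v, 1)
b = np.exp(c)   # recover b from the intercept c = ln(b)

print(a, b)  # recovers a ≈ 2, b ≈ 3 (the data are exact, so essentially exact)
```

Note this minimizes the squared error in log space, which is exactly the caveat mentioned: it need not minimize the squared error of the original y values.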
2) "A SIMPLE linear regression is a linear regression in which there is only ONE independent variable."

Now is the following a simple linear regression or a multiple linear regression?
Y = β0 + β1X + β2X^2 + ε
It has only one independent variable X, so is it simple linear regression? But this just looks a bit funny to me...
No. The X and X^2 are different independent variables as far as the regression goes.
3) "A linear regression model is of the form:
Y = β0 + β1X1 + β2X2 + ... + βkXk + ε
If there is more than one independent variable, then the model is called a MULTIPLE linear regression model."


This idea doesn't seem too clear to me. What can the Xi's be? What are some actual examples of mutliple linear model? Does a linear model always have to be a straight line or a plane?

Fitting salary y to years of schooling s and years of experience e via y = as + be + c is a multiple linear regression. Here, years of schooling and years of experience are independent variables for the regression. Fitting a parabola, y = ax^2 + bx + c, can also be done as a multiple linear regression. Think of x^2 and x as being independent variables as far as the regression is concerned.
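A quick sketch of the parabola-as-multiple-regression idea, using numpy with made-up data:

```python
import numpy as np

# Fit the parabola y = a*x^2 + b*x + c as a multiple linear regression:
# treat x^2 and x as two separate "independent variables" (columns).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x**2 - 1.0 * x + 0.5

# Design matrix with columns [x^2, x, 1].
X = np.column_stack([x**2, x, np.ones_like(x)])

# Ordinary least squares: minimize ||X @ beta - y||^2.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # ≈ [2.0, -1.0, 0.5]
```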
 
  • #3
2) What I think is that the definition of "simple linear model" is not very well defined; it's ambiguous. I looked at the definitions in 3 different textbooks, but still can't really figure out whether e.g. Y = β0 + β1X + β2X^2 + ε is a simple linear model or a multiple linear model. There seems to be only ONE independent variable X (X^2 is also determined by X; it's not a DIFFERENT variable, since once we've measured X, we can determine the values of both X and X^2), but it has a β2 in there. X and X^2 are related, so I don't see how they can be two separate independent variables...
Is there a nicer definition of a "simple linear model"?


3) "A linear regression model is of the form:
Y = β0 + β1X1 + β2X2 + ... + βkXk + ε
If there is more than one independent variable, then the model is called a MULTIPLE linear regression model."


Now I have some confusion relating to the above paragraph.
e.g. Are the following also considered as MULTIPLE linear regression models? These are not quite in the exact same form as Y = β0 + β1X1 + β2X2 + ... + βkXk + ε which has "k" DIFFERENT independent variables X1,X2,...,Xk.
(i) Y = β0 + β1X + β2exp(X) + ε
(ii) Y = β0 + β1X1 + β2X2 + β3(X1X2) + β4(X1^2) + β5(X2^2) + ε

Are those allowed? Why or why not?


Thanks a lot!
 
  • #4
A simple linear regression model has two coefficients. Period.

Your problem is that you are looking at this the wrong way. Y = β0 + β1X + β2X^2 + ε is not a simple model because you have three coefficients: β0, β1, and β2. In a sense, the independent variables for the regression are the βi's. As far as the regression equations are concerned, those X's and Y's are just a bunch of constant N-vectors. The best fit is found by taking the partial derivatives of the sum of the squared errors with respect to each βi: the βi's are the variables. The X's and Y's are not variables as far as the regression equations are concerned. Stop thinking of them as variables and you will have fewer problems.
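Setting those partial derivatives to zero gives the normal equations, X^T X β = X^T y, in which the X's and y's are indeed just fixed numbers. A sketch with numpy (data made up for illustration):

```python
import numpy as np

# The X's and Y's are fixed data vectors; the betas are the unknowns.
# Setting the partial derivatives of the squared error to zero yields
# the normal equations  X^T X beta = X^T y.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 0.9, 4.1, 9.0])

X = np.column_stack([np.ones_like(x), x, x**2])  # columns for beta0, beta1, beta2

beta_normal = np.linalg.solve(X.T @ X, X.T @ y)   # solve the normal equations
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_normal)  # same answer either way
```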
 
  • #5
Thanks for the helpful comments! So I think 2) is solved.

But I am still puzzled by 3) and I would really appreciate if anyone can explain that.

3) "A linear regression model is of the form:
Y = β0 + β1X1 + β2X2 + ... + βkXk + ε
If there is more than one independent variable, then the model is called a MULTIPLE linear regression model."


Now I have some confusion relating to the above paragraph.
e.g. Are the following also considered as MULTIPLE linear regression models? These are not quite in the exact same form as Y = β0 + β1X1 + β2X2 + ... + βkXk + ε which has "k" DIFFERENT independent variables X1,X2,...,Xk.
(i) Y = β0 + β1X + β2exp(X) + ε
(ii) Y = β0 + β1X1 + β2X2 + β3(X1X2) + β4(X1^2) + β5(X2^2) + ε

Are those allowed? Why or why not?
 
  • #6
kingwinner said:
3) "A linear regression model is of the form:
Y = β0 + β1X1 + β2X2 + ... + βkXk + ε
If there is more than one independent variable, then the model is called a MULTIPLE linear regression model."


Now I have some confusion relating to the above paragraph.
e.g. Are the following also considered as MULTIPLE linear regression models? These are not quite in the exact same form as Y = β0 + β1X1 + β2X2 + ... + βkXk + ε which has "k" DIFFERENT independent variables X1,X2,...,Xk.
(i) Y = β0 + β1X + β2exp(X) + ε
(ii) Y = β0 + β1X1 + β2X2 + β3(X1X2) + β4(X1^2) + β5(X2^2) + ε

Are those allowed? Why or why not?


Thanks a lot!
They aren't linear! That's not to say that those might not be better models for the particular situation (not everything is linear), but anything can be approximated by a linear model, and linear models are much, much easier to work with!
 
  • #7
kingwinner said:
(i) Y = β0 + β1X + β2exp(X) + ε
(ii) Y = β0 + β1X1 + β2X2 + β3(X1X2) + β4(X12) + β5(X22) + ε

Are those allowed? Why or why not?
HallsofIvy said:
They aren't linear!

They are linear in the βi's, and as far as linear regression is concerned, that is all that matters. These are linear regression models. Here are a couple that are not linear regressions:

[tex]Y = \beta_0*(1 + \beta_1 X_1)*(1 + \beta_2 X_2)+\varepsilon[/tex]
[tex]Y=\beta_0 + \beta_1X^{\beta_2} + \varepsilon[/tex]
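To see that model (i) really is a linear regression, here is a sketch with numpy (coefficients and data are made up): exp(X) simply becomes another column of the design matrix.

```python
import numpy as np

# Model (i): Y = b0 + b1*X + b2*exp(X) + eps.  Linear in the betas, so
# ordinary least squares applies: the regressor columns are 1, X, exp(X).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 50)
y = 1.0 + 2.0 * x + 0.5 * np.exp(x) + rng.normal(0.0, 0.01, x.size)

X = np.column_stack([np.ones_like(x), x, np.exp(x)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # ≈ [1.0, 2.0, 0.5]
```

The two [tex] models above cannot be handled this way, because the β's appear multiplied together or in an exponent.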
 
  • #8
D H said:
They are linear in the βis, and as far as linear regression is concerned, that is all that matters. These are linear regression models. Here are a couple that are not linear regressions:

[tex]Y = \beta_0*(1 + \beta_1 X_1)*(1 + \beta_2 X_2)+\varepsilon[/tex]
[tex]Y=\beta_0 + \beta_1X^{\beta_2} + \varepsilon[/tex]

Yes, I think the trickiest point to notice when first reading the definition of a linear regression model is that it is linear in the β's, while in calculus, when we say "linear", we usually mean that the function itself is linear, i.e. a straight line or a plane.

"A linear regression model is of the form:
Y = β0 + β1X1 + β2X2 + ... + βkXk + ε "


(i) Y = β0 + β1X + β2exp(X) + ε
(ii) Y = β0 + β1X1 + β2X2 + β3(X1X2) + β4(X1^2) + β5(X2^2) + ε

For (i), X1 = X, X2 = exp(X)
For (ii), X3 = X1*X2, X4 = X1^2, X5 = X2^2
The latter X's depend on the previous X's. In particular, X3 depends on TWO of the previous X's: X1 AND X2, which looks a bit funny to me. Are those allowed? Somehow I am having a lot of trouble understanding this... I understand the general form of a multiple linear regression model, but I don't seem to understand specific examples of it like (i) and (ii).

Once again, your help is greatly appreciated!
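A sketch of model (ii) as an ordinary least-squares fit, using numpy with made-up coefficients and data; the point is that derived regressors such as X1*X2 are allowed, because the regression only cares that the model is linear in the β's:

```python
import numpy as np

# Model (ii): the regressors X1, X2, X1*X2, X1^2, X2^2 are all computed
# from the two measured variables X1 and X2.
rng = np.random.default_rng(1)
x1 = rng.uniform(-1.0, 1.0, 100)
x2 = rng.uniform(-1.0, 1.0, 100)
y = 0.5 + 1.0*x1 - 2.0*x2 + 3.0*x1*x2 + 0.25*x1**2 - 0.75*x2**2

# Design matrix: one column per regressor, plus the intercept column.
X = np.column_stack([np.ones_like(x1), x1, x2, x1*x2, x1**2, x2**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # ≈ [0.5, 1.0, -2.0, 3.0, 0.25, -0.75]
```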
 

Related to Examples of Multiple Linear Regression Models

1. What is the purpose of a linear regression model?

A linear regression model is used to analyze the relationship between a dependent variable and one or more independent variables. It helps to identify the strength and direction of the relationship and make predictions about the dependent variable based on the independent variables.

2. What is the difference between simple linear regression and multiple linear regression?

Simple linear regression involves only one independent variable while multiple linear regression involves more than one independent variable. In simple linear regression, the relationship between the independent and dependent variables is modeled using a straight line, while in multiple linear regression, the relationship is modeled using a linear equation with multiple variables.

3. What is the best way to assess the accuracy of a linear regression model?

The most common way to assess the accuracy of a linear regression model is by calculating the coefficient of determination (R-squared). This measure indicates the proportion of the variation in the dependent variable that can be explained by the independent variables. A higher R-squared value indicates a better fit for the model.
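As a small illustration (numpy, made-up data), R-squared can be computed directly from the residuals of a fitted model:

```python
import numpy as np

# R-squared: fraction of the variance in y explained by the fitted model.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit y = b0 + b1*x by ordinary least squares.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

ss_res = np.sum((y - y_hat)**2)        # residual sum of squares
ss_tot = np.sum((y - y.mean())**2)     # total sum of squares
r_squared = 1.0 - ss_res / ss_tot
print(r_squared)  # close to 1 for this nearly-linear data
```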

4. How do you handle outliers in a linear regression model?

Outliers, or data points that are significantly different from the rest of the data, can have a strong influence on the results of a linear regression model. One approach to handling outliers is to remove them from the dataset before fitting the model. Another approach is to use robust regression methods that are less affected by outliers.

5. Can a linear regression model handle categorical variables?

Yes, a linear regression model can handle categorical variables by using dummy coding. This involves creating dummy variables for each category and including them as independent variables in the model. These dummy variables will have a value of 0 or 1, indicating whether the observation falls into that category or not.
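A sketch of dummy coding with numpy (the category labels and data here are hypothetical): one level is taken as the baseline and each remaining level gets its own 0/1 column.

```python
import numpy as np

# Dummy coding a 3-level categorical variable with levels A, B, C:
# treat A as the baseline and add one 0/1 indicator column per other level.
category = np.array(["A", "B", "A", "C", "B", "C"])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.0, 4.0, 3.0, 7.0, 7.0, 9.0])

is_b = (category == "B").astype(float)
is_c = (category == "C").astype(float)

# Columns: intercept, continuous x, indicator for B, indicator for C.
X = np.column_stack([np.ones_like(x), x, is_b, is_c])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # [intercept, slope, shift for B, shift for C] relative to A
```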
