Model Testing

davemk

New member
Mar 4, 2012
8
Hi folks.

Just looking for some input please.

I have a dataset containing interval data (one dependent and 6 independent variables) and have taken a random 90% sample (approx 300 observations). I've performed a linear stepwise regression on the 90%, in order to obtain a model to predict the dependent variable using a number of input variables. I'm confident that I've done this ok.

The issue comes with testing the model. I'm sure that this is probably a simple step but, for some reason, I'm really struggling with it and would be grateful for some advice.

In order to test the model, I'm using the 10% of the dataset that was not used in the linear regression. I've input the predictor variables into the model, which has given me an expected value for each observation. I now want to compare these to the actual values. I was originally going to use chi-square but that seems to be probability based and I'm not sure it's appropriate.

I've been told Spearman's rho would probably be most appropriate although I'm still not 100% sure that's right. Essentially, I would only be testing whether my predicted values = actual values.


All help appreciated. Thanks in advance.
 

CaptainBlack

Well-known member
Jan 26, 2012
890
To some extent this depends on how clever you want to be. What you want to do is test that the residuals (actual minus predicted) for the held-back sample have zero mean and that they are homoscedastic. With about 30 points you may have difficulty doing much more.

For the first of these I would just test for zero mean using the usual methods.
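As a rough illustration, "the usual methods" could be a one-sample t-test on the hold-out residuals. The sketch below is only that: the function name and the numbers are made up for the example, and it assumes the residuals are roughly normal and independent.

```python
import math
import statistics

def zero_mean_t_statistic(actual, predicted):
    """Return (t, df) for a one-sample t-test that the mean residual is zero."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)
    mean = statistics.fmean(residuals)
    sd = statistics.stdev(residuals)  # sample SD (n - 1 in the denominator)
    t = mean / (sd / math.sqrt(n))
    return t, n - 1

# Made-up hold-out values, for illustration only.
actual = [10.0, 12.0, 9.0, 11.0, 13.0, 10.0]
predicted = [9.0, 12.0, 10.0, 10.0, 12.0, 11.0]
t, df = zero_mean_t_statistic(actual, predicted)
print(f"t = {t:.3f} on {df} degrees of freedom")
```

Compare |t| with the two-sided critical value for the relevant degrees of freedom (about 2.045 at the 5% level for 29 degrees of freedom, i.e. 30 hold-out points). A large |t| suggests the model is biased on the hold-out sample.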

For the latter I would plot the residuals against the input variables and eyeball the data (at least to start with), but there are tests, see http://en.wikipedia.org/wiki/Homoscedasticity for a pointer.
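One formal test of this kind is the Breusch-Pagan test. A minimal sketch of its studentized (Koenker) form for a single input variable is below: regress the squared residuals on the input and use LM = n * R^2, which is compared with a chi-square distribution on 1 degree of freedom. The function names and data are hypothetical, and the full test would regress on all the inputs at once rather than one at a time.

```python
import math

def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def koenker_bp_statistic(residuals, x):
    """LM = n * R^2 from regressing squared residuals on one predictor x.

    With a single predictor, R^2 is the squared Pearson correlation.
    """
    squared = [e * e for e in residuals]
    r = pearson_r(squared, x)
    return len(residuals) * r * r

CHI2_CRIT_1DF_5PCT = 3.841  # 5% critical value, chi-square with 1 df

# Made-up residuals whose spread grows with x, for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
residuals = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8]
lm = koenker_bp_statistic(residuals, x)
print(f"LM = {lm:.3f}, reject constant variance: {lm > CHI2_CRIT_1DF_5PCT}")
```

With only about 30 hold-out points the chi-square approximation is rough, so the plot is still worth looking at first.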

You might also want to test the residuals for normality.
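As one possible normality check, here is a sketch of the Jarque-Bera statistic, which combines the sample skewness and kurtosis of the residuals. This is an assumption about which test to use, not the only option, and its chi-square approximation is known to be poor for small samples, so treat the p-value as a rough guide with around 30 points.

```python
import math

def jarque_bera(residuals):
    """Return (JB statistic, approximate p-value) for a normality check."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments with n (not n - 1) in the denominator.
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square with 2 df is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Made-up residuals, for illustration only.
residuals = [-2.0, -1.0, -1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0]
jb, p_value = jarque_bera(residuals)
print(f"JB = {jb:.4f}, approximate p = {p_value:.3f}")
```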

CB
 

davemk

New member
Mar 4, 2012
8
That's a great help, thank you very much.

I've already plotted the residuals for observed vs expected and histograms for normality, so I'll have a look into the tests within the link you posted (I must admit, I've never heard of those tests so I'll read up on them).

Thanks again. I'll update the thread with my progress asap.
 

davemk

New member
Mar 4, 2012
8
CaptainBlack said: "To some extent this depends on how clever you want to be. [...] With about 30 points you may have difficulty doing much more."

Hello again. If I were to get more data (say 70 observations) in order to test the model, is there a specific test that I could use? At the moment, I've performed a residual analysis and I'm now looking at performing a Wilcoxon test or a Spearman's rho test.

Any thoughts on this process, or alternatives? The procedures in the link above don't appear to be available in SPSS.
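One thing to keep in mind with the Spearman option: a rank correlation measures whether predicted and actual values rise and fall together, not whether they are equal. The sketch below uses made-up numbers and hypothetical function names to show predictions that are all 5 units too high still giving rho = 1, while the mean residual picks up the bias.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: every prediction is exactly 5 units too high.
actual = [10.0, 12.0, 9.0, 15.0, 11.0]
biased = [a + 5.0 for a in actual]
rho = spearman_rho(actual, biased)
mean_residual = sum(a - p for a, p in zip(actual, biased)) / len(actual)
print(f"rho = {rho:.3f}, mean residual = {mean_residual:.1f}")
```

So a high rho on its own would not show that predicted = actual; it would need to be read alongside a check on the residuals themselves.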