Weighting calculation to convert weather data from 6 stations into one

mdhastings · Jun 24, 2013

My forecasting model currently has 6 hard-coded weightings (totaling 100%) for 6 weather stations, and I wish to determine a methodology for producing these weightings (conversion factors, in %) that combine the stations into a single artificial weather station. This is part of forecasting the electricity load in my city: the data from the 6 weather stations (observations of temperature (C), dew point (C) and rel. humidity (%)) are weighted by these weightings and then used in the load model's equation.

The request is for a conversion factor methodology that must capture the relevance of any of the 6 weather stations to the overall load. It must somehow combine temperature and dew point (since rel. humidity is equivalent here) within the weighting. To this end I have applied various regressions of electricity load against these station datasets (say temperature) without success and believe I need to scale or otherwise change my thinking.

Sample data and current weightings in text file.

D H · Jun 24, 2013

Why are you using temperature, dew point, and relative humidity? You might well get better results if you use but two of them, e.g., temperature and humidity. Temperature, dew point, and relative humidity are related, and for relative humidity > 50% the relationship is close to linear. See http://journals.ametsoc.org/doi/pdf/10.1175/BAMS-86-2-225.

Throwing correlated independent variables at a regression model is not a good idea. Correlations between your independent variables mean those variables aren't independent. It is a downright bad idea if the independent variables are linearly related to one another.
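
As a rough illustration of how closely the three quantities are tied together, here is a minimal Python sketch of the rule of thumb from the paper linked above: for relative humidity above 50%, the dew point falls by about 1 C for every 5% drop in relative humidity. The function name is made up for this example, and the approximation is only good to about a degree.

```python
def approx_dew_point(temp_c, rel_humidity_pct):
    """Rule-of-thumb dew point (deg C); intended only for RH above 50%."""
    if rel_humidity_pct <= 50:
        raise ValueError("approximation is intended for RH above 50%")
    # dew point is about 1 C below air temperature per 5% of RH below 100%
    return temp_c - (100.0 - rel_humidity_pct) / 5.0

print(approx_dew_point(25.0, 80.0))  # 25 - 20/5 = 21.0
```

Given temperature and relative humidity, the dew point adds almost no new information in this range, which is why putting all three into one regression invites trouble.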

D H · Jun 24, 2013

You might want to add percent cloud cover during daytime hours to your model, at least during summertime. My AC runs a good deal more on sunny days than on cloudy ones.

mdhastings · Jun 24, 2013

Thanks D.H.
Your points on dew point and rel.humidity are well made and in the second part of my question ("It must somehow combine temperature and dew point (since rel. humidity is equivalent here) within the weighting.")

I have not explained about the load modelling because quite frankly our market is not real time and this adds significant restriction. We actually incorporate the artificially weighted weather station with Fourier series in our load model. Anyway the specific need is to determine a methodology of weighting the 6 weather stations to provide impact with the observed electricity load.

Stephen Tashi · Jun 24, 2013

mdhastings said:

Anyway the specific need is to determine a methodology of weighting the 6 weather stations to provide impact with the observed electricity load.

Empirically, is the load an approximately linear function of the weather variables? It would be best if you can answer this "empirically" in the sense of examining actual data, but it would also be of interest to know if the model results are.

As I visualize the situation, you don't have the option to rewrite the current model and the model accepts only 1 set of weather variables. That doesn't imply a restriction that each variable that you input must be a linear function of the measurements from the 6 weather stations, but you may want it to be a "weighted sum" for the sake of simplicity.

mdhastings · Jun 24, 2013

Stephen Tashi said:

Empirically, is the load an approximately linear function of the weather variables? It would be best if you can answer this "empirically" in the sense of examining actual data, but it would also be of interest to know if the model results are.

Yes. We model the load as a linear function of the weather variables (amongst others). Can you explain what you mean by "It would be best if you can answer this "empirically" in the sense of examining actual data, but it would also be of interest to know if the model results are"?

All else you say is good.

Stephen Tashi · Jun 25, 2013

I think the problem amounts to fitting a linear regression model with some constraints on the model's coefficients.

To illustrate this, suppose there are only 2 weather stations A and B and they each measure 2 variables.

Station A measures data [itex] (X_A, Y_A) [/itex]
Station B measures data [itex] (X_B, Y_B) [/itex]

The load data is [itex] L [/itex].
There is data [itex]Z [/itex] that doesn't come from the stations

The model has the form [itex] L = K_x X + K_y Y + K_z Z + K [/itex]
where the [itex]K[/itex]'s are constants.

The variable [itex] X = \lambda_A X_A + \lambda_B X_B [/itex]
The variable [itex] Y = \theta_A Y_A + \theta_B Y_B [/itex]

With the constraints on the constants:
[itex] \lambda_A + \lambda_B = 1 [/itex]
[itex] \theta_A + \theta_B = 1 [/itex]

and possibly the constraints
[itex] \lambda_A = \theta_A, \ \lambda_B = \theta_B[/itex]

If we substitute for [itex] X [/itex] and [itex] Y [/itex] in the model, it becomes

[tex] L = K_x \lambda_A X_A + K_x \lambda_B X_B + K_y \theta_A Y_A + K_y \theta_B Y_B + K_z Z + K [/tex]

This amounts to a linear regression model in 5 variables, [itex] X_A, X_B, Y_A,Y_B,Z [/itex]

[tex] L = C_A X_A + C_B X_B + D_A Y_A + D_B Y_B + K_z Z + K [/tex]

If we fit such a model to data by least squares, without any constraints on the coefficients, we can find the constants in that model. However, then we need to express them as other constants that satisfy the equations

[itex] C_A = K_x \lambda_A [/itex]
[itex] C_B = K_x \lambda_B [/itex]
[itex] D_A = K_y \theta_A [/itex]
[itex] D_B = K_y \theta_B [/itex]
[itex] \lambda_A + \lambda_B = 1 [/itex]
[itex] \theta_A + \theta_B = 1 [/itex]

So the question is whether the above equations have solutions for the several unknowns. If I have understood the problem correctly, we can think about that.

It would be easier to solve the equations if you drop the last two constraints. The term "weighted average" has a comforting sound, but I see no reason why those constraints make the model more reliable.
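
To see what the two sum constraints do to the equations, here is a sketch that assumes the fitted sums [itex] C_A + C_B [/itex] and [itex] D_A + D_B [/itex] are non-zero. Adding the equations for [itex] C_A [/itex] and [itex] C_B [/itex] and using [itex] \lambda_A + \lambda_B = 1 [/itex] gives [itex] K_x [/itex] directly, and likewise for [itex] K_y [/itex]:

[tex] K_x = C_A + C_B, \quad \lambda_A = \frac{C_A}{C_A + C_B}, \quad \lambda_B = \frac{C_B}{C_A + C_B} [/tex]

[tex] K_y = D_A + D_B, \quad \theta_A = \frac{D_A}{D_A + D_B}, \quad \theta_B = \frac{D_B}{D_A + D_B} [/tex]

So with only the two sum constraints there is a solution, although nothing forces the weights to lie between 0 and 1: a negative fitted coefficient gives a negative weight. The optional constraints [itex] \lambda_A = \theta_A, \ \lambda_B = \theta_B [/itex] hold only if [itex] C_A D_B = C_B D_A [/itex], which an unconstrained least squares fit will not generally satisfy.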

mdhastings · Jun 27, 2013

Thanks Stephen,

You have provided exactly what I wanted. So how do we work out the lambda coefficients? I believe both restrictions are necessary.

Stephen Tashi · Jun 27, 2013

If you want constraints such as [itex] \lambda_A + \lambda_B = 1 [/itex] and you know the values of the [itex] K[/itex]'s, you need to solve the least squares problem involving the [itex]C[/itex]'s with constraints instead of solving it by simple least squares fitting.

For example
[itex] \lambda_A = C_A/K_x [/itex]
[itex] \lambda_B = C_B/K_x [/itex]
So [itex] \lambda_A + \lambda_B = 1 [/itex] is equivalent to [itex] C_A/K_x + C_B/K_x = 1 [/itex], which is a linear equality constraint on the coefficients [itex] C_A,C_B [/itex].

Not knowing the details of how to do regression with constraints, I searched the web using the phrase "linear regression constraints coefficients" and found this PDF of slides http://folk.uio.no/inf9540/CLS.pdf that (supposedly) explains how to solve such a problem (see page 19). It uses matrix notation, which I suppose we can interpret eventually.

Apparently there are also computer packages that solve problems of linear regression with constraints. How are you going to do the computer work on it?
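
For the two-station illustration, the sum-to-one constraint can be handled by substitution rather than with a package. Below is a minimal Python sketch, assuming the target series y is whatever the composite variable X should reproduce (for instance the part of the load attributed to X, divided by a known K_x). The function and variable names are invented for the example, and the data are synthetic.

```python
def fit_lambda_a(x_a, x_b, y):
    """Least-squares lambda_A for y ~ lambda_A*x_a + (1 - lambda_A)*x_b.

    Substituting lambda_B = 1 - lambda_A turns the constrained fit into
    a one-variable regression of (y - x_b) on (x_a - x_b).
    """
    diff = [a - b for a, b in zip(x_a, x_b)]
    num = sum(d * (t - b) for d, t, b in zip(diff, y, x_b))
    den = sum(d * d for d in diff)
    if den == 0:
        raise ValueError("stations report identical data; weights are not identifiable")
    return num / den

# synthetic check: build y from known weights 0.3 and 0.7, then recover them
x_a = [21.5, 19.0, 25.2, 17.8]
x_b = [20.1, 18.2, 26.0, 16.5]
y = [0.3 * a + 0.7 * b for a, b in zip(x_a, x_b)]
lam_a = fit_lambda_a(x_a, x_b, y)
print(round(lam_a, 6), round(1 - lam_a, 6))
```

With N stations the same substitution leaves N-1 free weights and an ordinary regression on the differences, but it still does not guarantee that the weights come out non-negative.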

mdhastings · Jun 28, 2013

Stephen, Again thanks for your advice. I think I have simplified the process. Since the coefficient is the same for both temp and Dew Point of each station, I can combine the data (scaleable??) and then run a regression as per your L=C[XA+XB]+D[YA+YB]+KzZ+K above. The C and D coefficients must add to 1 so if I divide both sides of the equation through by [C+D], I get the weighting out of 1. What do you think?

Stephen Tashi · Jun 28, 2013

mdhastings said:

Since the coefficient is the same for both temp and Dew Point of each station

Do you mean the coefficient [itex] K[/itex]'s in the original model?

run a regression as per your L=C[XA+XB]+D[YA+YB]+KzZ+K above.

My equation gives XA and XB possibly different coefficients.

The C and D coefficients must add to 1 so if I divide both sides of the equation through by [C+D], I get the weighting out of 1.

I thought the idea was to weight the measurements of the same quantity from different weather stations differently. So we want to weight XA and XB differently - if we are using "X" to represent the physical quantity and the "A" and "B" to denote the two different weather stations.

mdhastings · Jun 30, 2013

Sorry, Stephen, I remarked in my opening that "It must somehow combine temperature and dew point (since rel. humidity is equivalent here) within the weighting." I wasn't sure they could be combined within the methodology, so left the thought there. I am still unsure if you can simply add the two numbers (say 21.5C and 10.0C) and use regression as we are doing. I thought there might be some scaling to consider. But clearly that is what the current modelling uses. This, my first go, is really to get the methodology right.

My previous comment meant to combine data of the station (T(C) and DP(C)) as I have said above. I rushed your equation into my comment. My bad.

Hence now correct to L=C[XA+YA]+D[XB+YB]+KzZ+K. [For some reason I keep misinterpreting your notation]

Your first Q. This means it is not the K's. It is simpler. The above obviously refers to your constraints λA=θA, λB=θB.
Now the final constraint is that C and D (and other 4 stations) add to 1.

To find C and D (and other 4 stations):
I have to use these 6 stations with the parent modelling (i.e. from where I get the Z) to obtain C and D (and other 4 stations) and this has some peculiarity. In my forecasting I need to reference the artificial station above which then produces other terms used in the parent of Z. Would you be able to assure me that if instead of parent of Z, I substitute out of it all these other terms and replace with the 6 weather stations this will give C and D etc.

For example the simplified model Z looks like (in R code)
temp_lm <- lm(loadMWH~trend +...+
(wt1+wt2+wd1+wd2)*(sd1+cd1+sd2+cd2+sd3+cd3+sd4+cd4)+
(wt1+wt2+wd1+wd2)*(sy1+cy1)+(wt1+wt2)*(ph1+ph2)+
etc...,data=dataframe, na.action = na.exclude)

where wt1, wt2, wd1 and wd2 are constructed from the artificial weather station
e.g. wd1 <- pmax(wd1-17,0)

and replace with L=C[XA+YA]+D[XB+YB]...H[XF+YF]+KzZ+K to look like

temp_lm <- lm(loadMWH~trend +...+
([XA+YA]+[XB+YB]+...+[XF+YF])*(sd1+cd1+sd2+cd2+sd3+cd3+sd4+cd4)+
([XA+YA]+[XB+YB]+...+[XF+YF])*(sy1+cy1)+(wt1+wt2)*(ph1+ph2) +
etc...,data=dataframe, na.action = na.exclude)

where the sd1 etc and cd1 etc are Fourier terms creating interactions with the 6 weather stations

Or go one step further: remove all interactions and just use the 6 weather stations.
This then matches L=C[XA+YA]+D[XB+YB]...H[XF+YF]+KzZ+K:
temp_lm <- lm(loadMWH~trend +...+
[XA+YA]+[XB+YB]+...+[XF+YF]+
etc...,data=dataframe, na.action = na.exclude)
For simplicity I like the last one - does it work?

My complete thanks for your support on this Stephen. Hope you can help further.

Stephen Tashi · Jul 1, 2013

mdhastings said:

Would you be able to assure me that if instead of parent of Z, I substitute out of it all these other terms and replace with the 6 weather stations this will give C and D etc.

I can't understand questions about 6 weather stations unless they are posed precisely. The simplest way to do that will be to use appropriate notation.

Designate the N weather stations whose measurements are to be somehow weighted, by indexes [itex] "1","2","3"..."N" [/itex] instead of [itex] "A","B","C",.. [/itex].

Use the notation [itex] X[i][j] [/itex] to be the [itex] j[/itex]-th type of measurement taken at the [itex] i[/itex]-th weather station. The types of measurements are indexed by [itex]j = 1,2,...,M[/itex].

I don't know about the wisdom of combining two types of measurement into a single number. I think that debate is a matter of physics, not pure math. I'm assuming that the measurements "types" are the final set of numbers, after you have done all the combining that's going to be done.

Let the variables representing [itex] S [/itex] other measurements, not in the above list, be [itex] Z[1],Z[2],...,Z[S] [/itex].

Let the unconstrained regression model be

[itex] L_u =\sum_{j=1}^M \sum_{i=1}^N C[i][j] X[i][j] + \sum_{i=1}^S A[i] Z[i] + P [/itex]

where the [itex] C[i][j], A[i], P [/itex] are constants.

There are at least two interpretations of what it means to weight the data from weather stations.

One interpretation is that you must assign a set of non-negative weights [itex] w[1],w[2],...,w[N] [/itex] with the constraint [itex] \sum_{i=1}^N w[i] = 1 [/itex], i.e. one weight value per weather station.

Another interpretation is that you may assign a set of non-negative weights [itex] w[i][j] , j = 1,2,...,M, i = 1,2,...,N [/itex], with the constraint that [itex] \sum_{i=1}^N w[i][j] = 1 [/itex] for each [itex] j = 1,2,...,M[/itex], i.e. that you can have one weight per each type of measurement and each weather station.

It is unclear to me what the situation is with the company's current model. I'll guess it is of the form

[tex] L_c = \sum_{m=1}^M K[m] Y[m] + \sum_{i=1}^S B[i] Z[i] + Q [/tex]

where [itex] Y[m] [/itex] is a (single) value for a measurement of type [itex] m [/itex] and [itex] K[m],B[i],Q[/itex] are constants.

I don't know if you can look into the code and data for this model and read the specific values of the constants ( for example, determine that [itex]K[3] = 29.85677[/itex]) or whether you can't do things like that.

Can you clarify the above ambiguities and pose your questions in the framework of the notation or suggest a different notation? (I don't care if you use the forum's LaTeX. It's interesting to learn, but that can be a big distraction.)

I might be able to read R-code with documentation - a dictionary of the variables.

mdhastings · Jul 1, 2013

Stephen I did some LaTeX 21 years ago, so LaTeX it is and please ignore the R code.

I don't know about the wisdom of combining two types of measurement into a single number. I think that debate is a matter of physics, not pure math. I'm assuming that the measurements "types" are the final set of numbers, after you have done all the combining that's going to be done.

Yes the idea was to work backwards to find a methodology to find a weighting for each station. It may be that we should not combine the two sets of numbers (Temp and Dew point). Maybe we run each alone and simply average the two sets of coefficients. So (because I get notation confusion) let me ask explicitly about the Temp set of numbers: I want [itex] Y[m] [/itex] to represent the temp measurement of each station, thus [itex] j = 1 [/itex], and then this:

[tex] L_c = \sum_{m=1}^M K[m] Y[m] + \sum_{i=1}^S B[i] Z[i] + Q [/tex]

where [itex] Y[m] [/itex] is a (single) value for the temp measurement ([itex] j = 1 [/itex]) of station [itex] m [/itex] and [itex] K[m],B[i],Q[/itex] are constants.

When I run this I get [itex] K[m] [/itex] coefficients of 10.20, -10.63, 5.75, 9.53, 4.78, 11.74 for station temperatures 1 to 6, but firstly these do not align with the current weightings and secondly no matter what I do I always get a negative value and do not know how I build the weighting so they total 1.

Please let me know how this aligns with your comments. These last two difficulties are one big headache and that is why I tried to combine temp with either dew point or rel. humid.

Now if I turn my attention to explaining the company's current load modelling (i.e. forecasting the load) :

The regression model would be

[itex] L_u =\sum_{i=1}^S A[i] Z[i] + P [/itex]

where the [itex] A[i] [/itex] and [itex] P [/itex] are constants.

In my modelling the only [itex] Z[1],Z[2],...,Z[S] [/itex] terms that are actual observable data are from the artificial weather station (which I am seeking to build from 6 stations). All up, I have 700+ interactive terms to help forecast a very true curve. Most if not all are interactions with Fourier terms (sd1, cd1 etc. are sin(daily1), cos(daily1), and sy1, cy1 etc. are the yearly terms) and these construct the load curve. (The daily shape of the electricity load is like a sine wave.) I did upload a sample of data but with no comments I think it did not do as I expected.

When in my previous answer I showed (wt1+wt2+wd1+wd2)*(sd1+cd1+sd2+cd2+sd3+cd3+sd4+cd4), I implied that this artificial weather station is interacted with the Fourier terms such that
A[1](wt1*sd1) + A[2](wt1*cd1) + A[3](wt1*sd2) + ... + A[i](wd2*sd4) for 32 of 700+ terms,
where the [itex] A[i] [/itex] are constants and wt1*sd1 is a new interactive term made by multiplication.

Hence, unlike your guess at the company's model, there isn't a weather term standing alone, and thus there are no specific values of the constants to read off; they are all interactions.

I don't know if you can look into the code and data for this model and read the specific values of the constants ( for example, determine that [itex]K[3] = 29.85677[/itex]) or whether you can't do things like that.

Stephen Tashi · Jul 2, 2013

mdhastings said:

Stephen I did some LaTeX 21 years ago, so LaTeX it is and please ignore the R code.

(I don't mind code if it is documented.)

Maybe we run each alone and simply average the two sets of coefficients.

Perhaps some forum member who is an expert on linear regressions knows how to do two regressions separately and combine the results, but I don't. As far as I know you can't average two least squares regressions that predict the same variable [itex] L [/itex] with different sets of variables and claim the average is a least squares regression.

So (because I get notation confusion) let me ask explicitly about the Temp set of numbers: I want [itex] Y[m] [/itex] to represent the temp measurement of each station, thus [itex] j = 1 [/itex], and then this:

[tex] L_c = \sum_{m=1}^M K[m] Y[m] + \sum_{i=1}^S B[i] Z[i] + Q [/tex]

where [itex] Y[m] [/itex] is a (single) value for the temp measurement ([itex] j = 1 [/itex]) of station [itex] m [/itex] and [itex] K[m],B[i],Q[/itex] are constants.

When I run this I get [itex] K[m] [/itex] coefficients of 10.20, -10.63, 5.75, 9.53, 4.78, 11.74 for station temperatures 1 to 6, but firstly these do not align with the current weightings and secondly no matter what I do I always get a negative value

Ok, I understand that [itex] K[2] [/itex] is a negative value. When you say you "run this", I assume this means you use data you have to do a least squares fit to the measurements. Is that correct? I don't understand what "the current weightings" are.

and do not know how I build the weighting so they total 1.

My thought is that you would have to do a linear regression "with constraints on the coefficients". This is a known method (but not well known to me!). We would have to find software to do this or find a detailed explanation of the technique if we want to implement it ourselves. I think this is possible.

Now if I turn my attention to explaining the company's current load modelling (i.e. forecasting the load) :

The regression model would be

[itex] L_u =\sum_{i=1}^S A[i] Z[i] + P [/itex]

where the [itex] A[i] [/itex] and [itex] P [/itex] are constants.

In my modelling the only [itex] Z[1],Z[2],...,Z[S] [/itex] terms that are actual observable data are from the artificial weather station (which I am seeking to build from 6 stations). All up, I have 700+ interactive terms to help forecast a very true curve.

I don't understand what "interactive terms" means. Does it mean "non-linear"?

Most if not all are interactions with Fourier terms (sd1,cd1 etc are sin(daily1), cos(daily1)..or yearly terms sy1, cy1 etc) and these construct the load curve. (The daily shape of the electricity load is like a sine wave).

I'm confused by the mention of a model for a curve that uses a discrete Fourier series versus the earlier discussion of doing a linear regression.

I'll make a guess at what the model is.

It predicts a curve of electricity usage as:

[tex] L(t) = C[0] + \sum_{i=1}^{700+} C[i]\ \cos(\omega[i] t) [/tex]

where the [itex] \omega[i] [/itex] are constants and the [itex] C[i] = C[i](...) [/itex] are functions (possibly non-linear) of the observable data, including the weather data.

From the model for [itex] L(t) [/itex] you can compute the predicted mean daily load for each day [itex] L[i] = \frac{1}{b[i]-a[i]}\int_{a[i]}^{b[i]} L(t) dt [/itex] where the [itex]i[/itex]th day begins at time [itex] a[i] [/itex] and ends at time [itex] b[i] [/itex].

I don't know if you also have actual measured mean daily load data for the days.

The input data to this model does not have a variable for a given type of measurement (e.g. mean daily temperature) from N weather stations. It only has 1 variable representing mean daily temperature, say, [itex]Z[1] [/itex]. You wish to find non-negative weights [itex] w[i][1] [/itex] that sum to one and you wish to set [itex] Z[1] = \sum_{i=1}^N w[i][1] X[i][1]. [/itex]

The problem of finding the optimal set of weights [itex] w[i][1] [/itex] to fit the model's predicted mean load to observed daily mean load data is not a problem of linear regression. It is a problem of non-linear regression. Are we assuming the model is adequately approximated by a linear function?

mdhastings · Jul 2, 2013

Stephen,
Perhaps some background... you have been very patient with my explanations. Thanks.
The company's electricity load model is designed to provide a load forecast for just 1 day (but made up of 48 half-hour intervals that need to be forecast) and the shape of this day's load curve is like a sine wave. All data used in this modelling needs to be in the same half-hour intervals. We have years of these types of data. Apart from the load and weather data we make the other terms up. Thus we prepare sin and cos series that range evenly between -1 and 1 for daily and yearly terms. We have 10 sine daily terms and 10 cos daily terms (labelled sd1, ..., sd10 and cd1, ..., cd10) capturing slightly offsetting day-sized waves. The yearly terms (sy/cy) are set the same way but provide 8 sine and 8 cos offset curves over a year ranging between -1 and 1. We also have "day of the week" and public holiday dummy variables.
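
To make the made-up terms concrete, here is a minimal Python sketch of how such daily series could be generated. It assumes that sdk and cdk are the k-th harmonics of the daily cycle, which is only one reading of "slightly offsetting"; the real series may instead be phase-shifted copies of a single wave. The function name and arguments are invented for the example.

```python
import math

def daily_fourier_terms(interval, n_harmonics=10, intervals_per_day=48):
    """Return {'sd1': ..., 'cd1': ..., ...} for one half-hour interval (0 to 47)."""
    phase = 2.0 * math.pi * interval / intervals_per_day
    terms = {}
    for k in range(1, n_harmonics + 1):
        terms[f"sd{k}"] = math.sin(k * phase)  # k-th daily sine harmonic
        terms[f"cd{k}"] = math.cos(k * phase)  # k-th daily cosine harmonic
    return terms

row = daily_fourier_terms(12)  # a quarter of the way through the day
print(round(row["sd1"], 6), round(row["cd1"], 6))
```

Each value lies between -1 and 1 and depends only on the time of day, so the same 48 rows repeat every day.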

The six weather stations' data is formed into the artificial station by hard-coded current weightings. These weights were provided 7 years ago and my only aim is to work out a methodology to update these. To be honest nothing else matters to me.

We refer to interactions in the regression model as combinations of these terms (e.g. sd1*sy1 combines the two terms by multiplication and is now a new term in the model). In all cases the 700+ interactions take various combinations of the above. This complicates the finding of a methodology.

The following you gave should include interactions
[tex] L(t) = C[0] + \sum_{i=1}^{700+} C[i]\ \cos(\omega[i] t) [/tex]

like
[tex] \cos(\omega t) \cdot \sin(\omega t) [/tex] or
[itex] \cos(\omega t) \cdot dow1 [/itex] or even
[itex] wt1 \cdot \sin(\omega t) \cdot dow1 \cdot \sin(\lambda t) [/itex]

where wt1 is a temperature, omega is daily, dow1 represents Monday and lambda is year

But these are now very difficult to add in.

Please feel free to ask more .. thanks again.

Stephen Tashi · Jul 2, 2013

I have a better understanding of the complexity of the company's model now. I think you use the word "interactions" of variables to mean products (in the sense of multiplications) or, more generally "products of functions of the variables".

Since the company's model is not a linear regression, you can't expect to find the best weights to use in the company's model by finding the best weights to use in a linear regression model. I understand that finding the best weights to use in a linear regression may provide some hint about the best weights to use in the company's model. However, the most reliable solution would be to use the non-linear model itself. Another possibility is to approximate the company's model by a non-linear function that is simpler than the company's model.

If it takes a long time to run the company's model, it may not be practical to use the company's model to determine the best weights. If the model runs quickly, I think (in theory) you should approach the problem as the scenario of minimizing a non-linear function (= mean square error of forecast) with respect to a set of variables (the weights) subject to some given constraints on the variables (that the weights are non-negative and sum to 1). There are various numerical methods for doing this. They amount to systematic forms of trial-and-error but they produce practical results.
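
A minimal Python sketch of one such trial-and-error search follows. The error function here is a toy stand-in: in practice it would rebuild the artificial station from the trial weights, rerun the model, and return the mean square error of the forecast. All names in the code are invented for the illustration, and this is one simple derivative-free scheme, not a recommendation of a particular method.

```python
from itertools import permutations

def fit_simplex_weights(forecast_error, n_stations, step=0.25, tol=1e-6):
    """Search for non-negative weights summing to 1 that lower forecast_error."""
    w = [1.0 / n_stations] * n_stations  # start from equal weights
    best = forecast_error(w)
    while step > tol:
        improved = False
        # try moving `step` of weight from station i to station j
        for i, j in permutations(range(n_stations), 2):
            if w[i] < step:
                continue  # the move would make w[i] negative
            trial = list(w)
            trial[i] -= step
            trial[j] += step
            err = forecast_error(trial)
            if err < best:
                w, best, improved = trial, err, True
        if not improved:
            step /= 2.0  # no transfer helped, so refine the step
    return w, best

# toy stand-in for the real forecast error, with a known best answer
target = [0.5, 0.3, 0.2, 0.0, 0.0, 0.0]
def toy_error(w):
    return sum((a - b) ** 2 for a, b in zip(w, target))

weights, err = fit_simplex_weights(toy_error, 6)
print([round(x, 3) for x in weights])
```

Because every move takes weight from one station and gives it to another, the weights stay non-negative and keep summing to 1 without any extra bookkeeping. The cost is the number of model runs, since each trial calls the error function once.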

Have I understood the situation?

mdhastings · Jul 2, 2013

Stephen,

In my econometrics course we talk about linear in respect to the parameters (coefficients). Hence this is a linear model - we use the lm (linear model) function in R to solve and it takes about 4 minutes to take database input and produce a forecast in a csv output. One of the difficulties is understanding the meaning in using sin and cos terms - but clearly they just build the shape with interactions with day of the week (major component) and artificial weather station. The interaction term [itex] wt1 \cdot \sin(\omega t) \cdot dow1 \cdot \sin(\lambda t) [/itex], whilst complicated in meaning, is still collecting a specific variation in the load.

The model has its problems but under our market rules we probably wouldn't be able to do better. We must forecast with a weather forecast that must be 24 hours old. That is we run the model daily and it generates the forecast for the same time tomorrow plus 48 intervals.

In a sense that is why the weights hard-coded into the program need to be changed - our city has grown.

jim mcnamara · Jul 2, 2013

I've been in utilities for a long time - too long probably - I understand our company models and how we forecast consumption.

We use wind, temperature, insolation, humidity and all kinds of consumption and transmission data/history to forecast requirements, which we integrate with nominations (gas) for our transportation customers. We have way more weather station reading sets than you appear to have. All this matters naught.

I have stayed away because your answers are not really answers. They are sort of indirect descriptions of what you think Stephen needs. It would be fun to help if I had a prayer of understanding what you want.

Please:
Take one of Stephen's questions. Provide a direct answer. You appear to have done that above: One run generates 48 interval estimates. I'm staying away until I can understand.

Your model output cannot be solely based on weather, you have to have historical consumption data. Unless you are solely employing degree days and using some company factor. But that will not deal with load forecasting. That depends entirely on historical data vs current estimates.

US degree days == A degree day is computed as the integral of a function of time that generally varies from an arbitrary temperature base like 20 degrees C. ...Whatever that mensuration method is called in Great Britain, or wherever you are and using British English. Most of the EU has degree day maps and zones. That I have seen anyway.
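
A rough Python sketch of that integral for cooling degree days, assuming half-hourly temperature readings and the 20 degrees C base mentioned above. The function name is invented for the example, and real utilities use their own base temperatures and conventions.

```python
def cooling_degree_days(half_hourly_temps_c, base_c=20.0, intervals_per_day=48):
    """Approximate the degree-day integral by summing temperature excess over time."""
    excess = sum(max(t - base_c, 0.0) for t in half_hourly_temps_c)
    return excess / intervals_per_day  # each reading covers 1/48 of a day

print(cooling_degree_days([22.0] * 48))  # 2 degrees above base for one day -> 2.0
```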

Plus, working in this field I've never encountered the constraints you mention.

Pardon this comment if it is out of bounds --- It sounds like your boss is pretending he needs to be sure you do not try to think. Are these completely regulatory constraints? If they are, then your regulators are worse than ours. And two of them went to jail in the past two years. (New Mexico, USA and not proud of our Public Regulatory Commission)

Are you private, IOU, Municipal (Gov't owned), or some kind of consumer owned cooperative?

I am giving you these questions to see what direct answers, if any, I get back.

mdhastings · Jul 2, 2013

Thanks Jim,

The way we forecast is very different to most since we are not real time - we have to forecast a day ahead. We are a Government retailer that needs to buy its energy under unusual market regulations (delay (weeks) in receipt of load data).

This is an in-house program that was produced by a mathematician who is no longer part of our team. We create an equation using R code which can be written like this

[tex] L(t) = C[0] + \sum_{i=1}^{700+} C[i]\ Z[i][/tex]

where the C[i] are coefficients/parameters and the Z[1],Z[2],... terms can be made up as interaction terms e.g. [itex] \cos(\omega t) \cdot \sin(\omega t) [/itex].
The only data we have is load (dependent variable), and independent variables: weather (T(C), Dew Pt(C) and Rel. humid(%)), weekdays and public holidays. Once we have the load equation we use the weather forecast and run a predict function on the equation. This econometric modeling is different to most but the errors are reasonable for our purposes.

For the linear Q I have referenced Greene's "Econometric Analysis" 4th Ed. He states on p327 [referring to interaction terms (p326)] , "Despite their complex functional forms, these models are intrinsically linear... a distinguishing feature of the linear model is not the relationship among the variables as such but the way the parameters enter the equation". I have to admit I am yet to understand this.

Unfortunately the difficulties with the methodology I'm seeking are mine alone. This is in the sense that after 6 years running with the weights hard-coded into the program and with the changing demographics of our city I feel an update requires understanding of how they were derived. I thought I could work backwards since the weight for a given station applies to all measures (T(C), Dew Pt(C) and Rel. humid(%)), so the constraints are based upon that.

Most of what Stephen and I have covered has focused on getting the weights through the modelling we have described below. The trouble is how to work a way through the sin and cos interaction terms in the Z's, where for example I always get negative coefficients on some stations' temp terms although the restriction [itex]\ \sum_{i=1}^{6} A[i]\ = 1 [/itex] should apply.

[tex] L(t) = C[0]\ + \ \sum_{i=1}^{6} A[i]\ X[i] + \sum_{k} C[k]\ Z[k][/tex]
with station [itex] i [/itex] measuring data [itex]\ (X[i], Y[i], H[i])\ [/itex], where X, Y and H are temp, Dew Pt and Rel. humid - though we only use 1 of the last 2.

or if we combine the Temp and Dew Pt (or Rel. H) as discussed previously with same restriction

[tex] L(t) = C[0]\ + \ \sum_{i=1}^{6} A[i]\ (X[i]+Y[i]) + \sum_{k} C[k]\ Z[k][/tex]

The question is: are we on the right track? What other ways can we derive the stations' weights relative to load?

Thanks for being involved. Again hope all this helps.

Stephen Tashi · Jul 2, 2013

mdhastings said:

Hence this is a linear model - we use the lm (linear model) function in R to solve and it takes about 4 minutes to take database input and produce a forecast in a csv output.

Let me see if I understand what you are saying. Your company's forecast method has two models. You run a linear model and it outputs a set of numbers. That output is used as input to the other model (the one that uses sine and cosine terms and is non-linear)?

jim mcnamara · Jul 2, 2013

We forecast a day ahead in order to buy bulk power or nominate extra gas from a field, it is not unusual.

mdhastings · Jul 2, 2013

Stephen Tashi said:

Let me see if I understand what you are saying. Your company's forecast method has two models. You run a linear model and it outputs a set of numbers. That output is used as input to the other model (the one that uses sine and cosine terms and is non-linear)?

One model that produces a single equation of the form
[tex] L(t) = C[0] + \sum_{i=1}^{700+} C[i]\ Z[i] [/tex]

where the C[i] are coefficients/parameters and the Z[1],Z[2],... terms can be made up as interaction terms e.g. cos(ωt).sin(ωt).

If we then create a data set with exactly the same terms (just a repeat of the above's design matrix so all the sin/cos terms are the same) but use the forecast weather instead of 'observational weather from our artificial station' then using the predict function in R we get our load forecast. Quite simple but effective.

But I'm trying to focus on the inputs to the initial equation. Here we weight the observations of 6 weather stations into the artificial station using hard-coded weights. These weights, I believe, need to be changed and I need the methodology to create the weights. They must be matched against the load with the restrictions of how they are applied for the weather inputs into the initial equation (such as [itex]\ \sum_{i=1}^{6} w[i]\ = 1[/itex])
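
One way to write the difficulty down, as a sketch only: the symbols w[i], T[i] and F[k] are introduced here for illustration and are not names from the program. Suppose the weighted temperature is [itex] wt = \sum_{i=1}^{6} w[i]\ T[i] [/itex], where T[i] is the temperature at station i, and suppose it enters the equation only through interactions with Fourier terms F[k]. Then

[tex] \sum_{k} C[k]\ wt\ F[k] = \sum_{k} \sum_{i=1}^{6} C[k]\ w[i]\ T[i]\ F[k] [/tex]

With the weights fixed, this is linear in the C[k], which is what lm fits. If the weights are to be estimated as well, the unknowns appear as products C[k] w[i]. An unconstrained fit on the terms T[i] F[k] returns one coefficient per (k, i) pair, and those coefficients factor into C[k] times w[i] only if every k gives the same proportions across the stations.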

I very much appreciate both your efforts - I'm sorry I cannot explain this well.

pbuk · Jul 3, 2013

Are you saying that the whole point of averaging current weather data from six local stations is to generate a forecast 24 hours or more ahead of local weather conditions?

That's not how weather works - the weather now is in general a very poor indicator of the weather tomorrow. Have you tried an historical analysis of whatever forecast data are available instead? I'll bet there is a much better fit there, assuming of course that power consumption is in fact strongly correlated to one or more aspects of weather (have you determined this)?

For a short-term (0-6 hours) forecast, current data are more relevant, but the relationship is likely to be highly non-linear, probably chaotic. The only way to determine the "best" parameters for such a model is trial and error: from an initial guess, alter each parameter in turn and see if a better fit is obtained. It is also probably a good idea to scan a larger part of the solution space with a grid or Monte Carlo method. You should first remove collinearity from the weather station parameters by transforming the model to use an average (median or interquartile mean may work best as outliers are likely to be poor predictors), and replacing the individual measurements with the difference between the measurement and the average. You will probably find that these differences have no consistent predictive value, meaning that a search for the "best" weightings of individual measurements is futile.
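A minimal sketch of that transformation for a single time step, using the median as the average. The function name and the sample readings are hypothetical.

```python
from statistics import median


def center_on_median(readings):
    """Split one time step of station readings into (median, deviations)."""
    centre = median(readings)
    return centre, [r - centre for r in readings]


# Hypothetical temperatures (deg C) from six stations; the fourth is an outlier
centre, deviations = center_on_median([20.0, 22.0, 21.0, 35.0, 19.0, 21.5])
# centre is 21.25; deviations are [-1.25, 0.75, -0.25, 13.75, -2.25, 0.25]
```

The regression would then use `centre` as the main weather term and the deviations as extra candidate terms, which shows directly whether any single station adds predictive value beyond the average.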

I would suggest that the best model is likely to be obtained by determining the optimum forecast parameters at each forecast interval (or perhaps at super-intervals of 4 hours or whatever if the computation is expensive) rather than attempting to "model the model" by fitting sinusoidal or other curves.

Stephen Tashi · Jul 3, 2013

mdhastings said:

One model that produces a single equation of the form
[tex] L(t) = C[0] + \sum_{i=1}^{700+} C[i]\ Z[i] [/tex]

where the C[i] are coefficients/parameters and the Z[1], Z[2], ..., Z[i] terms can be made up as interaction terms, e.g. cos(ωt).sin(ωt).

If we then create a data set with exactly the same terms (just a repeat of the above's design matrix so all the sin/cos terms are the same) but use the forecast weather instead of 'observational weather from our artificial station' then using the predict function in R we get our load forecast. Quite simple but effective.

Ok, you can call that one model. But the process you are describing apparently uses two different algorithms based on that model. The first algorithm uses real weather data ( weighted averages of it) and other data to produce some output. Then you use that output plus the weather data from the weather forecast to predict the electrical load.

Even if the first stage uses a linear regression to generate its output, I don't see that the predicted load data of the two stage process is necessarily a linear function of the weather data.

You can test empirically whether the result of the two-stage process is linear by making up various imaginary weather data and seeing whether the error of the load prediction varies linearly. I realize that the final output of the two-stage process is a curve, not a single number. So you need to define some simple measure of how well the curve predicted the actual load. For example, you could define the total error of the prediction to be the mean of the squares of the differences between the predicted loads and the actual load taken over each of the half-hour sections of the curve. See if that single number is approximately a linear function of the weather variables input to the first stage of the process.

For example, if an input to the first stage is "relative humidity", you can vary the "relative humidity" by pretending that all 6 of the weather stations measured the same relative humidity and varying the "relative humidity" over a set of linear increments.

If the error doesn't vary linearly with the weather input, it still may vary in some smooth manner - for example it might vary as a quadratic. It would be helpful to know this.
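A rough sketch of that test. `toy_predict_load` is a made-up stand-in for the whole two-stage process (here a flat 4-interval curve rather than 48 half-hours), so only the error measure and the sweep are meant seriously.

```python
def mean_squared_error(predicted, actual):
    """Mean of squared differences between predicted and actual load curves."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("curves must be non-empty and the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


def error_sweep(predict_load, humidity_values, actual_load):
    """Error of the predicted curve for each trial humidity (same at all stations)."""
    return [mean_squared_error(predict_load(h), actual_load) for h in humidity_values]


def second_differences(ys):
    """Near-zero values over equal increments suggest an approximately linear trend."""
    return [ys[i + 1] - 2 * ys[i] + ys[i - 1] for i in range(1, len(ys) - 1)]


def toy_predict_load(humidity):
    # hypothetical stand-in for the real forecast run
    return [100.0 + 0.5 * humidity] * 4


actual = [130.0] * 4
errors = error_sweep(toy_predict_load, [40, 50, 60, 70, 80], actual)
# errors is [100.0, 25.0, 0.0, 25.0, 100.0]
curvature = second_differences(errors)
# curvature is [50.0, 50.0, 50.0]: constant and non-zero, i.e. quadratic
```

In this toy case the error is quadratic in the input, not linear, which is the kind of smooth but non-linear behaviour worth knowing about.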

This is in the sense that after 6 years running with the weights hard-coded into the program and with the changing demographics of our city I feel an update requires understanding of how they were derived.

One simplistic thought is that each weather station can be considered to represent the best estimate of current weather for some fraction of the city's population (population of people or population of electrical connections). Are the hard coded weights consistent with that thought?

Are there non-weather inputs to the model that reflect the current total population of the city? - or the current total number of various capacity electrical connections? (I assume the electric company has "residential" vs "commercial" types of electric meters.)

mdhastings · Jul 3, 2013

Thanks MrAnchovy

MrAnchovy said:

Are you saying that the whole point of averaging current weather data from six local stations is to generate a forecast 24 hours or more ahead of local weather conditions?

No, I'm saying the artificial weather station made from the 6 stations' weather goes into the initial equation - the Meteorology weather forecast replaces that data in the prediction function for the load forecast. What would your methodology be to find the weights?

mdhastings · Jul 3, 2013

Stephen Tashi said:

Ok, you can call that one model. But the process you are describing apparently uses two different algorithms based on that model.

Stephen, only the data is changed - in this case the weather - so it is the one algorithm. From my understanding this is the normal econometric method.

Stephen Tashi said:

One simplistic thought is that each weather station can be considered to represent the best estimate of current weather for some fraction of the city's population (population of people or population of electrical connections). Are the hard coded weights consistent with that thought?

No, in the sense that the weather stations are not matched to areas of population; unfortunately this again is because of the closed market and the necessary regulations.

Stephen Tashi said:

Are there non-weather inputs to the model that reflect the current total population of the city? - or the current total number of various capacity electrical connections? (I assume the electric company has "residential" vs "commercial" types of electric meters.)

Sorry, there is no data in the model reflecting the current total population of the city nor connections.

Thanks again Stephen. This has been done and staff who were involved with the code's creator say it was done as I have suggested in this forum. Just cannot work this out.

Stephen Tashi · Jul 3, 2013

mdhastings said:

Stephen only the data is changed - in this case the weather so it is the one algorithm - from my understanding this is the normal econometric method.

There must be some results from the first step that are input to the "predict" run. Otherwise there would be no point doing the first step. Perhaps the inputs are not evident to the user.

Can you give a link to some online source that describes this "econometric" method? - or some technical name for it? ( I've never seen the book by Greene.)

This has been done and staff who were involved with the code's creator say it was done as I have suggested in this forum.

I don't know what you mean by "this".

Just cannot work this out.

The general idea seems straightforward to me. You have to try various weights and see how well they predict the load by using historical data in the two step procedure you described ( not by doing a linear regression). Is that impractical?
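A rough sketch of such a search, under strong simplifying assumptions: a one-variable straight-line fit stands in for the full model, only temperature is weighted, and the candidate weights are drawn at random from the set of non-negative weights that sum to 1. All names here are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def holdout_error(weights, station_temps, load, split):
    """Fit on rows before `split`, return mean squared error on the rest."""
    temp = [sum(w * t for w, t in zip(weights, row)) for row in station_temps]
    a, b = fit_line(temp[:split], load[:split])
    sq = [(a + b * t - l) ** 2 for t, l in zip(temp[split:], load[split:])]
    return sum(sq) / len(sq)


def random_weights(k, rng):
    """Draw k non-negative weights summing to 1 (normalised exponentials)."""
    draws = [rng.expovariate(1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def search_weights(station_temps, load, split, trials=2000, seed=0):
    """Random search; starts from equal weights and keeps the best candidate."""
    rng = random.Random(seed)
    k = len(station_temps[0])
    best_w = [1.0 / k] * k
    best_err = holdout_error(best_w, station_temps, load, split)
    for _ in range(trials):
        w = random_weights(k, rng)
        err = holdout_error(w, station_temps, load, split)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err
```

Here `station_temps` is a list of rows (one row per half-hour, one value per station) and `load` is the matching historical load. With the real model, `holdout_error` would run the actual fit and predict steps instead of `fit_line`, which at about 4 minutes per run limits how many candidate weight sets can be tried.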

mdhastings · Jul 3, 2013

Stephen Tashi said:

There must be some results from the first step that are input to the "predict" run. Otherwise there would be no point doing the first step. Perhaps the inputs are not evident to the user.
Can you give a link to some online source that describes this "econometric" method? - or some technical name for it? ( I've never seen the book by Greene.)

I could scan the relevant page in Greene's book - How do I load it into this forum for you to see?

Stephen Tashi said:

I don't know what you mean by "this".

The weighting procedure - 6 stations into 1 - but nobody remembers the details or understands it.

Stephen Tashi said:

The general idea seems straightforward to me. You have to try various weights and see how well they predict the load by using historical data in the two step procedure you described ( not by doing a linear regression). Is that impractical?

Sorry, this just does not make sense to me. I am not giving up - it must be done, but thanks for your strong interest, Stephen.

Stephen Tashi · Jul 3, 2013

mdhastings said:

I could scan the relevant page in Greene's book - How do I load it into this forum for you to see?

One way is to join one of those photo sharing sites like photobucket. Post it there and just post a link to it in your post. If you do much on the web, it's handy to join one of those sites.

Another way is to look at the bottom of the message composition window at the "Additional Options" where it says "manage attachments".

mdhastings · Jul 3, 2013

Stephen Tashi said:

One way is to join one of those photo sharing sites like photobucket. Post it there and just post a link to it in your post. If you do much on the web, it's handy to join one of those sites.

Another way is to look at the bottom of the message composition window at the "Additional Options" where it says "manage attachments".

See pdf Attachment

mdhastings · Jul 3, 2013

Stephen, this is slightly more about the interactions

mdhastings · Jul 3, 2013

Stephen, this pdf from Greene's about the interactions

mdhastings · Jul 3, 2013

Stephen, third try
this pdf from Greene's about the interactions

mdhastings · Jul 3, 2013

Stephen, fourth try
this pdf from Greene's about the interactions
