# Best procedure to determine Lambda to calculate Poisson probability

#### Aetius

##### New member
What is the best procedure to determine Lambda to calculate the Poisson probability? Say I want to calculate P(X ≥ 1), the probability of at least one accident occurring the next day. For this I would calculate the average of daily accidents and divide it by 10. The question is, should I take the previous 10 days? Or should I calculate λ by averaging, e.g., 10-day periods over the last 200 days and dividing by 20? Which would be best?
Thank you very much for any constructive comments.

#### Country Boy

##### Well-known member
MHB Math Helper
Why? Where did "10" come from?

The Poisson distribution gives the probability of "n" occurrences in a given time interval. Lambda is the average number of occurrences in that time interval. What time interval are you given?
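To make the role of the time interval concrete, here is a minimal Python sketch (not from the thread) for the one-day case. It assumes λ is estimated as the mean number of accidents per day and uses $P(X \ge 1) = 1 - e^{-\lambda}$. The function name and the counts in the usage line are hypothetical.

```python
import math


def prob_at_least_one(total_events, n_days):
    """P(X >= 1) for one day, with lambda estimated as events per day."""
    if n_days <= 0:
        raise ValueError("n_days must be positive")
    lam = total_events / n_days       # average occurrences per one-day interval
    return 1.0 - math.exp(-lam)       # 1 - P(X = 0)


# Hypothetical data: 4 accidents observed over 10 days, so lambda = 0.4 per day.
print(prob_at_least_one(4, 10))
```

If the interval of interest were a week instead of a day, λ would have to be the average count per week, not per day.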

#### Klaas van Aarsen

##### MHB Seeker
Staff member
Hi Aetius, welcome to MHB!

Assuming a constant accident rate we will get a more accurate $\lambda$ if we take the average over a longer period of time.
However, if $\lambda$ is expected to change over time, such as in traffic accidents, then we should take a shorter period in which we assume that $\lambda$ is more or less constant.
With a shorter period comes a wider confidence interval for $\lambda$ though, so we can't take it too short.

Intuitively I'd expect that the rate of traffic accidents does not change significantly in 200 days.
That is, unless systematic changes were made in the area with regard to traffic safety and the like.

A first step in finding an acceptable interval in which $\lambda$ is sufficiently constant could be to average it over a shifting interval in time and see if we get a more or less level curve.
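The shifting-interval idea could be sketched roughly as follows. This is only an illustration: `rolling_rate` is a hypothetical helper, and it assumes the data are a plain list of daily accident counts.

```python
def rolling_rate(daily_counts, window):
    """Mean count per day over every run of `window` consecutive days."""
    if window <= 0 or window > len(daily_counts):
        raise ValueError("window must be between 1 and len(daily_counts)")
    total = sum(daily_counts[:window])
    rates = [total / window]
    # Slide the window one day at a time: add the new day, drop the oldest.
    for i in range(window, len(daily_counts)):
        total += daily_counts[i] - daily_counts[i - window]
        rates.append(total / window)
    return rates


# Hypothetical counts; a roughly level sequence would suggest a stable rate.
print(rolling_rate([1, 0, 2, 1, 3], 2))
```

Plotting the returned values against time gives the curve described above. Wider windows give a smoother curve but react more slowly to a real change.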
Suppose we have a total of $k$ traffic accidents in a specific period of $n$ days.
Then a $1-\alpha$ confidence interval for $\lambda$, around the estimate $\hat\lambda = \frac kn$, follows from:
$$\frac 12\chi^2(\alpha/2;\ 2k) \le n\lambda \le \frac 12\chi^2(1-\alpha/2;\ 2k+2)$$
where $\chi^2(p;\ df)$ is the quantile (inverse cumulative distribution function) of the $\chi^2$-distribution at cumulative probability $p$ with $df$ degrees of freedom. Dividing all three parts by $n$ gives the bounds on $\lambda$ itself.
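For readers who want numbers, here is a hedged sketch using only the Python standard library. Instead of calling a $\chi^2$ quantile function, it solves the equivalent Poisson tail equations by bisection: the lower bound on $n\lambda$ is the mean at which $P(X \ge k) = \alpha/2$, and the upper bound is the mean at which $P(X \le k) = \alpha/2$. The function names and the search bracket are my own assumptions, and the sketch is only meant for modest counts.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return min(total, 1.0)


def bisect(f, lo, hi, tol=1e-10):
    """Root of a monotone function f on [lo, hi] by bisection."""
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def exact_rate_interval(k, n, alpha=0.05):
    """Two-sided interval for the daily rate, given k events in n days."""
    bracket = k + 10.0 * math.sqrt(k + 1.0) + 20.0  # assumed generous upper limit
    if k == 0:
        mu_lo = 0.0
    else:
        # Lower bound: P(X >= k | mu) = alpha / 2
        mu_lo = bisect(lambda mu: 1.0 - poisson_cdf(k - 1, mu) - alpha / 2,
                       0.0, bracket)
    # Upper bound: P(X <= k | mu) = alpha / 2
    mu_hi = bisect(lambda mu: poisson_cdf(k, mu) - alpha / 2, 0.0, bracket)
    return mu_lo / n, mu_hi / n


# Hypothetical data: the same rate of 0.5 per day seen over 10 and over 200 days.
print(exact_rate_interval(5, 10))
print(exact_rate_interval(100, 200))
```

Comparing the two printed intervals shows the trade-off discussed above: the same $\hat\lambda$ comes with a much wider interval when it rests on only 10 days.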

#### Aetius

##### New member
Actually, in general you confirmed what I thought. Something I left out is that, in order to improve the relevance of Lambda, I would apply the exponential moving average (EMA) technique. I am thinking that the ideal procedure would be to take, say, the previous 10 days and apply an EMA to them. How does it grab you?
Thank you kindly!

#### Klaas van Aarsen

##### MHB Seeker
Staff member
It sounds as if you're specifically interested in changes of $\lambda$.
That's what you get with a relatively short period that is exponentially weighted on the most recent days.
It also amplifies the noise in the last day.
But if $\lambda$ changes, then one of the assumptions of Poisson is violated. That is, it's not a Poisson distribution any more.
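As an illustration of the weighting, here is a small sketch of an exponentially weighted estimate of the daily rate. It assumes the common convention of a smoothing factor `2 / (span + 1)` and seeds the average with the first day's count. Both are choices made for this example, not something specified in the thread, and `ema_rate` is a hypothetical name.

```python
def ema_rate(daily_counts, span):
    """Exponentially weighted mean of daily counts; the latest days weigh most."""
    if span < 1 or not daily_counts:
        raise ValueError("need span >= 1 and at least one daily count")
    alpha = 2.0 / (span + 1)             # assumed smoothing convention
    estimate = float(daily_counts[0])    # assumed seed: the first observation
    for count in daily_counts[1:]:
        estimate = alpha * count + (1.0 - alpha) * estimate
    return estimate


# Hypothetical counts: a single busy final day pulls the estimate up sharply.
print(ema_rate([0, 0, 4], span=3))
```

With `span=10` the latest day carries a weight of 2/11, about 0.18, against 0.10 in a plain 10-day mean, which is why one unusual day moves the estimate more.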

If you're interested in observing changes in lambda, perhaps a statistical test (e.g. a t-test) is more appropriate to figure out whether there is a significant change with respect to previous days.

#### Aetius

##### New member
Are you sure, Klaas? I say this because the only thing a moving average (exponential or simple) does is give more weight to the latest values. Lambda does change, as it does in the real world, because accidents occur randomly. Suppose that instead of accidents we were researching highway robberies; Lambda would change. Wouldn't it be almost the same as taking another Lambda reading over another 10 days, which would likely produce another Lambda value?
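One way to probe this question is a small simulation: hold the true rate fixed and see how much successive 10-day estimates differ anyway. The sketch below is only an illustration under stated assumptions. It draws Poisson counts with Knuth's multiplication method (suitable for small rates), and the function names, the rate, and the seed are all hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """One Poisson(lam) draw via Knuth's method; fine for small lam."""
    threshold = math.exp(-lam)
    k = 0
    p = 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def ten_day_estimates(true_rate, n_blocks, seed=0):
    """Estimate the daily rate from each of n_blocks separate 10-day periods."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_blocks):
        total = sum(sample_poisson(true_rate, rng) for _ in range(10))
        estimates.append(total / 10)
    return estimates


# The true rate never changes here, yet the 10-day estimates still differ.
print(ten_day_estimates(true_rate=0.5, n_blocks=20, seed=1))
```

If the printed values scatter around the fixed rate, then a different 10-day reading does not by itself show that the underlying λ has moved. The scatter is sampling noise in the estimate.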