Deviance of Binomial generalized linear model

logarithmic · Sep 18, 2010

The formula for the deviance of a binomial generalized linear model is:
[tex]D = 2\sum_i \left[ y_i \log\left(\frac{y_i}{\hat{y}_i}\right) + (n_i-y_i)\log\left(\frac{n_i-y_i}{n_i-\hat{y}_i}\right) \right][/tex],

where the responses [tex]y_i[/tex] are [tex]\text{Binomial}(n_i, p_i)[/tex], and [tex]\hat{y}_i = n_i\hat{p}_i[/tex].

The second log in that equation is undefined when [tex]n_i=y_i[/tex], which of course can happen with non-zero probability.

In R, this formula reproduces the deviance that R reports. So what happens to the deviance when the data contain a point where [tex]n_i=y_i[/tex]? Somehow R still returns a finite deviance in this situation, even though the formula fails.

Also (this is a separate question), in order to calculate the deviance, you need the likelihood function of the saturated model. The likelihood function is [tex]L(\boldsymbol{\mu},\mathbf{y})[/tex] (where [tex]\boldsymbol{\mu}[/tex] is the vector of expected values of the y's), and the likelihood function for the saturated model is found by replacing [tex]\boldsymbol{\mu}[/tex] with [tex]\mathbf{y}[/tex] (i.e. all variation explained, perfect fit, but why?). Since the saturated model is defined as the model whose number of parameters equals the number of observations, how does the above fact follow from the definition? Also, what does the definition even mean? Where in the model equation would you stick the extra parameters, and what are their associated covariates, so that the number of parameters equals the number of observations?

Valkarie · Sep 18, 2010

In answer to your first question: the deviance is summed over all of the data points, including those with [tex]n_i=y_i[/tex]. For such a point the second term has the form [tex]0 \cdot \log 0[/tex], which is taken to be 0, its limiting value, since [tex]x \log x \to 0[/tex] as [tex]x \to 0^+[/tex]. The first term remains, so that point contributes [tex]2 n_i \log(n_i/\hat{y}_i) = -2 n_i \log \hat{p}_i[/tex] to the deviance, which is finite as long as [tex]\hat{p}_i > 0[/tex]. It is zero only if the fitted value also equals the observed value. The same convention handles the first term when [tex]y_i = 0[/tex]. This is why R can still give a finite deviance, even though the formula as written fails.

In answer to your second question: the saturated model has as many parameters as observations, so each [tex]\mu_i[/tex] can be fitted separately. One way to write it is to give every observation its own indicator (dummy) covariate with its own coefficient, so that the linear predictor for observation i is just that coefficient. Maximizing the likelihood then sets each fitted value equal to the observed value, [tex]\hat{\mu}_i = y_i[/tex] (that is, [tex]\hat{p}_i = y_i/n_i[/tex]), which is why [tex]\mu[/tex] is replaced with y in the likelihood.
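
For illustration, here is a minimal Python sketch of the deviance formula from the first post, with any term of the form 0·log(0) set to 0 (the limit of x·log(x) as x goes to 0). This is not R's actual implementation. The helper names `xlogy` and `binomial_deviance` are made up for this example, and the fitted probabilities are assumed to lie strictly between 0 and 1.

```python
import math


def xlogy(x, y):
    """Return x * log(y), using the limiting value 0 when x == 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binomial_deviance(y, n, p_hat):
    """D = 2 * sum[ y*log(y/yhat) + (n-y)*log((n-y)/(n-yhat)) ].

    Assumes 0 < p_hat < 1 for every observation, so that neither
    yhat = n*p_hat nor n - yhat is zero.
    """
    total = 0.0
    for y_i, n_i, p_i in zip(y, n, p_hat):
        y_fit = n_i * p_i  # fitted count, yhat_i = n_i * phat_i
        total += xlogy(y_i, y_i / y_fit)
        total += xlogy(n_i - y_i, (n_i - y_i) / (n_i - y_fit))
    return 2.0 * total


# Made-up data: the second point has y == n, the third has y == 0.
y = [3, 5, 0]
n = [5, 5, 4]
p_hat = [0.5, 0.8, 0.1]
print(binomial_deviance(y, n, p_hat))
```

The point with y = n contributes -2·n·log(p_hat) and the point with y = 0 contributes -2·n·log(1 - p_hat), so both are finite and the total stays finite.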
