Deviance of Binomial generalized linear model

In summary, the deviance of a binomial generalized linear model sums contributions from all data points, including those with n_i=y_i, where a term of the form 0·log(0) is taken to be 0. The saturated model has one parameter per observation, so each fitted mean equals its observed value, and its likelihood function is found by replacing \mu with y.
  • #1
logarithmic
The formula for the deviance of a binomial generalized linear model is:
[tex]D = 2\sum[y_i \log(\frac{y_i}{\hat{y}_i})+(n_i-y_i)\log(\frac{n_i-y_i}{n_i-\hat{y}_i})][/tex].

where the responses y are [tex]Binomial(n_i, p_i)[/tex], and [tex]\hat{y}_i = n_i\hat{p}_i[/tex].

The second log in that equation is undefined when [tex]n_i=y_i[/tex], which of course can happen with non-zero probability.

Evaluating that formula by hand reproduces the deviance that R reports. So what happens to the deviance when the binomial GLM has a data point where [tex]n_i=y_i[/tex]? Somehow R is still able to give a finite deviance in this situation, even though the formula fails.

Also (this is a separate question), in order to calculate the deviance, you need to calculate the likelihood function of the saturated model. The likelihood function is [tex]L(\boldsymbol{\mu},\mathbf{y})[/tex] (mu is the vector of expected values of the y's), and the likelihood function for the saturated model is found by replacing [tex]\boldsymbol{\mu}[/tex] with y (i.e. all variation explained, perfect fit, but why?). Since the saturated model is defined as the model whose number of parameters equals the number of observations, how does the above fact follow from the definition? Also, what does the definition even mean? Where in the model equation would you stick the extra parameters, and what are their associated covariates, so that the number of parameters equals the number of observations?
 
  • #2
In answer to your first question, the deviance of a binomial generalized linear model is calculated from all of the data points, including those with n_i=y_i. When n_i=y_i, the second term has the form 0·log(0), which is taken to be 0 (its limiting value, since x log(x) tends to 0 as x tends to 0). That data point therefore contributes [tex]2 n_i \log(\frac{n_i}{\hat{y}_i})[/tex] from the first term alone, which is finite but generally not zero. The same convention handles the first term when y_i=0. This is why R can still give a finite deviance, even though the formula as written fails.

In answer to your second question, the saturated model has one free parameter per observation, so each p_i can be estimated separately. Maximizing the likelihood then gives [tex]\hat{p}_i = y_i/n_i[/tex], so the fitted values are exactly equal to the observed values, and that is why \mu is replaced with y. One way to put the extra parameters into the model equation is to give every observation its own indicator (dummy) covariate, which is equivalent to a factor with one level per observation; each coefficient is then free to match its own observation.
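As a minimal sketch of how the formula can be evaluated when n_i=y_i (or y_i=0), the following Python treats any term of the form 0·log(0) as 0, its limiting value. The helper names `xlogy` and `binomial_deviance` are illustrative assumptions, not R's internal code, and the fitted probabilities are assumed to lie strictly between 0 and 1.

```python
import math


def xlogy(x, y):
    """Return x * log(y), using the limiting value 0 when x == 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binomial_deviance(y, n, p_hat):
    """Deviance of a binomial fit.

    y, n and p_hat are equal-length sequences of observed successes,
    numbers of trials and fitted probabilities (assumed 0 < p_hat < 1).
    """
    total = 0.0
    for yi, ni, pi in zip(y, n, p_hat):
        mu = ni * pi  # fitted count, \hat{y}_i in the thread's notation
        # Each term vanishes when its multiplier is 0, so y_i = 0 and
        # y_i = n_i both give finite contributions.
        total += xlogy(yi, yi / mu) + xlogy(ni - yi, (ni - yi) / (ni - mu))
    return 2.0 * total


if __name__ == "__main__":
    # Second point has y_i = n_i, third has y_i = 0; the result is finite.
    print(binomial_deviance([3, 5, 0], [5, 5, 4], [0.5, 0.8, 0.1]))
```

For a single point with y_i = n_i = 5 and a fitted probability of 0.8, the second term drops out and the contribution is 2·5·log(5/4), which is finite and non-zero.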
 

Related to Deviance of Binomial generalized linear model

1. What is the deviance of a binomial generalized linear model?

The deviance of a binomial generalized linear model is a goodness-of-fit statistic, not a model in itself. It equals twice the difference between the maximized log-likelihood of the saturated model and that of the fitted model, where the responses are binomial counts (binary data being the special case n_i=1).

2. How is the deviance of a binomial generalized linear model different from that of other models?

The deviance of a binomial generalized linear model is built from the binomial likelihood rather than, for example, the normal or Poisson likelihood. For a normal linear model the deviance reduces to the residual sum of squares, whereas for a binomial model it takes a logarithmic form. The underlying model still allows multiple explanatory variables and accommodates the non-constant variance of binomial data.

3. What is the purpose of the deviance of a binomial generalized linear model?

The deviance is used to assess how well the model describes the relationship between a binary or binomial response variable and one or more explanatory variables. Differences in deviance between nested models are also used for hypothesis testing, as likelihood-ratio tests.

4. How is the deviance of a binomial generalized linear model calculated?

The deviance is twice the difference between the maximized log-likelihood of the saturated model and that of the fitted model, with the fitted values obtained by maximum likelihood estimation. It measures the discrepancy between the observed values and the fitted values and is used to assess the goodness of fit of the model.
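As a sketch of that calculation, write [tex]\mu_i = n_i p_i[/tex] for the mean of y_i and [tex]\hat{y}_i = n_i\hat{p}_i[/tex] for the fitted value. The binomial log-likelihood is

[tex]\ell(\boldsymbol{\mu};\mathbf{y}) = \sum_i \left[ \log\binom{n_i}{y_i} + y_i \log\frac{\mu_i}{n_i} + (n_i-y_i)\log\frac{n_i-\mu_i}{n_i} \right][/tex]

The saturated model sets [tex]\mu_i = y_i[/tex], and the binomial coefficients cancel in the difference, so

[tex]D = 2\left[\ell(\mathbf{y};\mathbf{y}) - \ell(\hat{\mathbf{y}};\mathbf{y})\right] = 2\sum_i \left[ y_i \log\frac{y_i}{\hat{y}_i} + (n_i-y_i)\log\frac{n_i-y_i}{n_i-\hat{y}_i} \right][/tex]

with the convention that a term of the form 0·log(0) equals 0, which covers data points with y_i=0 or y_i=n_i.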

5. What are some limitations of the deviance of a binomial generalized linear model?

Some limitations follow from the model itself: it assumes a linear relationship between the link-transformed mean response and the explanatory variables, and it is not robust to outliers. The chi-squared approximation used to judge the size of the deviance also needs sufficiently large n_i to be reliable. It may also be challenging to interpret the coefficients in the model, especially for interactions between variables.
