There are **several ways of calculating (pseudo) R-squared values for logistic regression models**, with no consensus about which is best. The *RsqGLM* function, now included in the *modEvA* package, calculates those of McFadden (1974), Cox & Snell (1989), Nagelkerke (1991), and Tjur (2009), as well as the squared Pearson correlation between observed and predicted values:

    RsqGLM <- function(obs = NULL, pred = NULL, model = NULL) {
      # version 1.2 (3 Jan 2015)

      model.provided <- !is.null(model)

      if (model.provided) {
        if (!inherits(model, "glm")) stop("'model' must be of class 'glm'.")
        if (!is.null(pred)) message("Argument 'pred' ignored in favour of 'model'.")
        if (!is.null(obs)) message("Argument 'obs' ignored in favour of 'model'.")
        obs <- model$y
        pred <- model$fitted.values

      } else {  # if model not provided
        if (is.null(obs) || is.null(pred)) stop("You must provide either 'obs' and 'pred', or a 'model' object of class 'glm'")
        if (length(obs) != length(pred)) stop("'obs' and 'pred' must be of the same length (and in the same order).")
        # any() collapses the element-wise checks into the single TRUE/FALSE that if() requires
        if (any(!(obs %in% c(0, 1))) || any(pred < 0) || any(pred > 1)) stop("Sorry, 'obs' and 'pred' options currently only implemented for binomial GLMs (binary response variable with values 0 or 1) with logit link.")
        # refit a logistic model on the logit of the predictions to obtain a log-likelihood
        logit <- log(pred / (1 - pred))
        model <- glm(obs ~ logit, family = "binomial")
      }

      # intercept-only (null) model and the two log-likelihoods
      null.mod <- glm(obs ~ 1, family = family(model))
      loglike.M <- as.numeric(logLik(model))
      loglike.0 <- as.numeric(logLik(null.mod))
      N <- length(obs)

      # based on Nagelkerke 1991:
      CoxSnell <- 1 - exp(-(2 / N) * (loglike.M - loglike.0))
      Nagelkerke <- CoxSnell / (1 - exp((2 * N ^ (-1)) * loglike.0))

      # based on Allison 2014:
      McFadden <- 1 - (loglike.M / loglike.0)
      Tjur <- mean(pred[obs == 1]) - mean(pred[obs == 0])
      sqPearson <- cor(obs, pred) ^ 2

      return(list(CoxSnell = CoxSnell, Nagelkerke = Nagelkerke, McFadden = McFadden, Tjur = Tjur, sqPearson = sqPearson))
    }
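
For reference, the five quantities computed in the function body can be written out as formulas. The notation here is an assumption of this sketch rather than something taken from the cited papers: ℓ_M and ℓ_0 stand for the log-likelihoods of the fitted and the intercept-only model (`loglike.M` and `loglike.0` in the code), N for the number of observations, y for the observed values, and p̂ for the predicted probabilities.

```latex
\begin{align}
R^2_{\text{CoxSnell}}   &= 1 - \exp\!\left(-\frac{2}{N}\,(\ell_M - \ell_0)\right) \\
R^2_{\text{Nagelkerke}} &= \frac{R^2_{\text{CoxSnell}}}{1 - \exp\!\left(\frac{2}{N}\,\ell_0\right)} \\
R^2_{\text{McFadden}}   &= 1 - \frac{\ell_M}{\ell_0} \\
R^2_{\text{Tjur}}       &= \bar{\hat{p}}_{y=1} - \bar{\hat{p}}_{y=0} \\
R^2_{\text{Pearson}}    &= \operatorname{cor}(y, \hat{p})^2
\end{align}
```

Written this way, the Nagelkerke value is the Cox & Snell value divided by the largest value the latter can take, which is reached when ℓ_M = 0.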

**Input data** can be either a glm **model** object **or** two vectors: *obs*, the observed binary (0 or 1) values, and *pred*, the corresponding predicted probabilities (only binomial-logit GLMs admitted in this case). The output is a named list of the calculated R-squared values. See also the *Dsquared* function.
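
As a rough illustration of the two kinds of input, the sketch below fits a logistic regression to simulated data and computes the same five measures by hand with base R only. The data, the variable names (`x`, `y`, `mod`, `rsq`, and so on), and the seed are all hypothetical and chosen just for this example. The last step mimics the two-vector route: refitting the observations on the logit of the predicted probabilities should recover the log-likelihood of the original model, which is why both kinds of input are expected to give matching results for a binomial-logit GLM.

```r
# Simulated data (hypothetical; for illustration only)
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))

# Fitted model and its intercept-only (null) counterpart
mod  <- glm(y ~ x, family = binomial)
null <- glm(y ~ 1, family = binomial)

# What the function extracts when given a 'model' object
obs  <- mod$y
pred <- mod$fitted.values
ll_M <- as.numeric(logLik(mod))
ll_0 <- as.numeric(logLik(null))

# Same formulas as in the function body
cox_snell  <- 1 - exp(-(2 / n) * (ll_M - ll_0))
nagelkerke <- cox_snell / (1 - exp((2 / n) * ll_0))
mcfadden   <- 1 - ll_M / ll_0
tjur       <- mean(pred[obs == 1]) - mean(pred[obs == 0])
sq_pearson <- cor(obs, pred)^2

rsq <- c(CoxSnell = cox_snell, Nagelkerke = nagelkerke,
         McFadden = mcfadden, Tjur = tjur, sqPearson = sq_pearson)
print(round(rsq, 3))

# Two-vector route: refit the observations on the logit of the predictions
refit    <- glm(obs ~ qlogis(pred), family = binomial)
ll_refit <- as.numeric(logLik(refit))

# The refit should reproduce the original log-likelihood (up to convergence tolerance)
print(all.equal(ll_refit, ll_M, tolerance = 1e-6))
```

With the function defined as above, `RsqGLM(model = mod)` and `RsqGLM(obs = obs, pred = pred)` should both return these values as a named list.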

REFERENCES

Cox, D.R. & Snell, E.J. (1989) The Analysis of Binary Data, 2nd ed. Chapman and Hall, London

McFadden, D. (1974) Conditional logit analysis of qualitative choice behavior. In: Zarembka, P. (ed.) Frontiers in Econometrics. Academic Press, New York

Nagelkerke, N.J.D. (1991) A note on a general definition of the coefficient of determination. Biometrika, 78: 691-692

Tjur, T. (2009) Coefficients of determination in logistic regression models – a new proposal: the coefficient of discrimination. The American Statistician, 63: 366-372