A number of measures may be used to evaluate the predictions of species distribution (or ecological niche, or bioclimatic envelope…) models against observed presence-absence data (Fielding & Bell 1997; Liu et al. 2011; Barbosa et al. 2013). The ** threshMeasures **function, now included in the modEvA package (Barbosa et al. 2014), calculates a few of these measures. Input data are either a

**model**object of class “glm”, or a vector of the

**obs**erved the presences-absences (1-0) of your species plus another with the corresponding model

**pred**ictions, and the threshold value to separate predicted presences from predicted absences in the model predictions. By default this threshold is 0.5, but you may want to change it depending on a number of criteria (see e.g. Jiménez-Valverde & Lobo 2007, Nenzén & Araújo 2011). You can set

*thresh*to “preval” (species’ prevalence or proportion of presences in the input “obs” data) or to any numerical value between 0 and 1. You can calculate optimal threshold values according to several criteria with the

*optiThresh*function provided in the next post. The results of

*threshMeasures*are given numerically and in a bar plot:

# version 2.5 (20 Jan 2013) threshMeasures.methods = c("AUC", "CCR", "Misclass", "Sensitivity", "Specificity", "Omission", "Commission", "PPP", "NPP", "UPR", "OPR", "PPI", "PAI", "kappa", "TSS", "NMI", "OddsRatio") evaluate <- function(a, b, c, d, N = NULL, measure = "CCR"){ # version 1.1 (24 Jul 2013) # internal function to be used by others in the package # a, b, c, d: elements of the confusion matrix (TP, FP, FN, TN) # N: sample size (total number of observations); calculated automatically if NULL # measure: evaluation measure to use # note: TSS and NMI are not symmetrical ("obs" vs "pred" != "pred" vs "obs") # note that some measures (e.g. NMI, odds ratio) don't work with zeros in (some parts of) the confusion matrix if(is.null(N)) N <- a + b + c + d stopifnot(N == a + b + c + d) if(measure == "CCR") { value <- (a+d)/N } else if(measure == "Sensitivity") { value <- a/(a+c) } else if(measure == "Specificity") { value <- d/(b+d) } else if(measure == "Omission") { value <- c/(a+c) } else if(measure == "Commission") { value <- b/(b+d) } else if(measure == "PPP") { value <- a/(a+b) # also called "Precision" } else if(measure == "NPP") { value <- d/(c+d) } else if(measure == "Misclass") { value <- (b+c)/N } else if(measure == "UPR") { value <- c/(c+d) # this and next 3: Barbosa et al. 2013 Diversity and Distributions } else if(measure == "OPR") { value <- b/(a+b) } else if(measure == "PPI") { value <- ((a+b)/(a+c))-1 } else if(measure == "PAI") { value <- ((c+d)/(b+d))-1 } else if(measure == "kappa") { value <- ((a+d)-(((a+c)*(a+b)+(b+d)*(c+d))/N))/(N-(((a+c)*(a+b)+(b+d)*(c+d))/N)) } else if(measure == "TSS") { value <- (a*d - b*c) / ((a+c) * (b+d)) } else if(measure == "NMI") { value <- 1-((-a*log(a)-b*log(b)-c*log(c)-d*log(d)+(a+b)*log(a+b)+(c+d)*log(c+d))/(N*log(N)-((a+c)*log(a+c)+(b+d)*log(b+d)))) # NMI by Forbes (1995); "1-" missing in Fielding & Bell 1997 (and Manel et al 2001) but Fielding confirmed it was a typo } else if(measure == "OddsRatio") { value <- (a*d)/(c*b) # (c*b)/(a*d) # inverse, would give a (more expectable) unimodal plot of odds against thresholds } else stop("Invalid measure; type 'threshMeasures.methods' for available options.") return(value) } # end evaluate function threshMeasures <- function(obs = NULL, pred = NULL, model = NULL, thresh = 0.5, measures = threshMeasures.methods, simplif = FALSE, plot = TRUE, plot.ordered = FALSE, standardize = TRUE, messages = TRUE, ...) { # version 2.5 (20 Jan 2013) # obs: a vector of observed presences (1) and absences (0) or another binary response variable # pred: a vector with the corresponding predicted values of presence probability, habitat suitability, environmental favourability or alike # model: instead of (and overriding) obs and pred, you can provide a model object of class "glm" # thresh: threshold value to separate predicted presences from predicted absences in 'pred'; may be "preval" or any real number between 0 and 1 # measures: character vector of the evaluation measures to use # plot: logical, whether or not to produce a barplot of the calculated measures # plot.ordered: logical, whether to plot the measures in decreasing order rather than in input order # standardize: logical, whether to change measures that may range between -1 and +1 (namely kappa and TSS) to their corresponding value in the 0-to-1 scale, so thet they can compare directly to other measures # ...: arguments to be passed to the barplot function (if plot = TRUE) if(is.null(model)) { if (is.null(obs) | is.null(pred)) stop("You must provide either the 'obs' and 'pred' vectors, or a 'model' object of class 'glm'") } # end if null model else { if (!all(class(model) %in% c("glm", "lm"))) stop("'model' must be a model object of class 'glm'") if (messages) { if (!is.null(obs)) message("Argument 'obs' ignored in favour of 'model'.") if (!is.null(pred)) message("Argument 'pred' ignored in favour of 'model'.") } obs <- model$y pred <- model$fitted.values } # end if !null model if (NA %in% obs | NA %in% pred) stop("Please remove (rows with) NA from your data") if (length(obs) != length(pred)) stop("'obs' and 'pred' must have the same number of values (and in the same order)") if (!all(obs %in% c(0, 1))) stop ("'obs' must consist of binary observations of 0 or 1") if (standardize & !exists("standard01")) stop ("'standardize' requires the standard01 function; please load that function or use 'standardize = FALSE'") if (!(thresh == "preval" | is.numeric(thresh))) stop("'thresh' must be either 'preval' or a numeric value between 0 and 1") if (thresh == "preval") thresh <- prevalence(obs) obs0 <- obs == 0 obs1 <- obs == 1 pred0 <- pred < thresh pred1 <- pred >= thresh a <- sum(obs1 & pred1) b <- sum(obs0 & pred1) c <- sum(obs1 & pred0) d <- sum(obs0 & pred0) N <- a + b + c + d Nmeasures <- length(measures) measureValues <- as.vector(rep(NA, Nmeasures), mode = "numeric") for (i in 1:Nmeasures) { if (measures[i] == "AUC") measureValues[i] <- AUC(obs = obs, pred = pred, simplif = TRUE) else if (measures[i] %in% threshMeasures.methods) { measureValues[i] <- evaluate(a, b, c, d, N, measure = measures[i]) if (standardize == TRUE & measures[i] %in% c("TSS", "kappa")) { measureValues[i] <- standard01(measureValues[i]) measures[i] <- paste("s", measures[i], sep = "") message(measures[i], " standardized to the 0-1 scale for direct comparability with other measures; use 'standardize = FALSE' if this is not what you wish") } # end if standardize } # end if measures[i] in threshMeasures.methods else { warning("'",measures[i],"' is not a valid measure; type 'threshMeasures.methods' for available options\n\n") next } # end else } # end for i Measures <- matrix(data = measureValues, nrow = Nmeasures, ncol = 1, dimnames = list(measures, "Value")) if (simplif) { # shorter version for use with e.g. optiThresh function return(Measures) } else { prev <- prevalence(obs) conf.matrix <- matrix(c(a, b, c, d), nrow = 2, ncol = 2, byrow = TRUE, dimnames = list(c("pred1", "pred0"), c("obs1", "obs0"))) if (plot) { names(measureValues) <- measures measures.plot <- measureValues if (plot.ordered) { measures.plot <- sort(measures.plot, decreasing = TRUE) } barplot(measures.plot, las = 3, ...) } # end if plot return(list(N = N, Prevalence = prev, Threshold = thresh, ConfusionMatrix = conf.matrix, ThreshMeasures = Measures)) } # end else } # end threshMeasures function

[presented with Pretty R]

Note that some of these measures (like *NMI*,* UPR*,* OPR*,* PPP*,* NPP*) cannot be calculated for thresholds at which there are zeros in the confusion matrix. Note also that, while most of these measures range from 0 to 1, some of them — such as *kappa* and *TSS* — may range from -1 to 1 (Allouche et al. 2006), so their raw scores are not exactly comparable. The *threshMeasures* function includes an option to standardize these measures to 0-1, for which you need the *standard01* function, so that you obtain the standardized versions *skappa* and *sTSS*.

By default, the *threshMeasures* function calculates all the measures listed in *threshMeasures.methods*, but you may specify only a few measures if you wish, as in these examples:

# create some random test data: myobs <- sample(c(0, 1), 150, replace = TRUE) mypred <- runif(150, min = 0, max = 1) # note: you can pratice with real sample data too # calculate some threshold-based measures: threshMeasures(obs = myobs, pred = mypred, thresh = 0.75, measures = c("Sensitivity", "Specificity", "kappa", "TSS", "UPR", "OPR"), standardize = TRUE, ylim = c(0, 1)) # instead of obs and pred, you can provide a model object of class "glm": # mymodel = glm(myspecies ~ var1 + var2 ... ...) threshMeasures(model = mymodel, thresh = "preval", measures = c("Sensitivity", "Specificity", "UPR", "OPR"), ylim = c(0, 1))

The *threshMeasures* function can also be used to calculate the agreement between different presence-absence (or other types of binary) data, as Barbosa et al. (2012) did for comparing mammal distribution data from atlas and range maps. Notice, however, that some of these measures, such as TSS or NMI, are not symmetrical (obs *vs* pred is different from pred *vs* obs).

**References:**

Allouche, O., Tsoar, A. & Kadmon, R. (2006) Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS). *Journal of Applied Ecology*, **43**, 1223–1232.

Barbosa A.M., Brown J.A. & Real R. (2014) modEvA – an R package for model evaluation and analysis. R package, version 0.1.

Barbosa A.M., Estrada A., Márquez A.L., Purvis A. & Orme C.D.L. (2012) Atlas versus range maps: robustness of chorological relationships to distribution data types in European mammals. *Journal of Biogeography* 39: 1391-1400

Barbosa A.M., Real R., Muñoz A.R. & Brown J.A. (2013) New measures for assessing model equilibrium and prediction mismatch in species distribution models. *Diversity and Distributions* 19: 1333-1338 (DOI: 10.1111/ddi.12100)

Fielding A.H. & Bell J.F. (1997). A review of methods for the assessment of prediction errors in conservation presence/absence models. *Environmental Conservation* 24: 38-49

Jiménez-Valverde A. & Lobo J.M. (2007) Threshold criteria for conversion of probability of species presence to either–or presence–absence. *Acta Oecologica* 31: 361–369

Liu C., White M., & Newell G. (2011) Measuring and comparing the accuracy of species distribution models with presence-absence data. *Ecography*, **34**, 232–243.

Nenzén H.K. & Araújo M.B. (2011) Choice of threshold alters projections of species range shifts under climate change. *Ecological Modelling* 222: 3346-3354