Plot predicted values for presences vs. absences

Diagnostic plots are always a nice, visually explicit way to assess our data and predictions, and I’ve just added a new one to the modEvA R package. ThepredPlotfunction plots predicted values separated into observed presences and absences, and coloured according to whether they’re above or below a given prediction threshold (the default threshold is species prevalence, i.e. proportion of presences, in the observed data). It shows whether there’s a visible split in the predicted values for presences and absences (see also functions predDensity and plotGLM in the same package, for alternative/complementary ways of visualizing this). The plot imitates (with permission from the author) one of the graphical outputs of the ‘summary‘ of models built with the ‘embarcadero‘ package (Carlson, 2020), but it can be applied to any ‘glm‘ object or any set of observed and predicted values, and it allows specifying a user-defined prediction threshold. The ‘predPlot‘ function is now included in package modEvA (version >= 2.1, currently available on R-Forge).

predPlot <- function(model = NULL, obs = NULL, pred = NULL, thresh = "preval", main = "Classified predicted values", legend.pos = "bottomright") {
  # version 1.0 (20 Jan 2021)
  if (!is.null(model)) {
    if (!("glm" %in% class(model)) || family(model)$family != "binomial") stop("'model' must be of class 'glm' and family 'binomial'.")
    if (!is.null(obs)) message("Argument 'obs' ignored in favour of 'model'.")
    if (!is.null(pred)) message("Argument 'pred' ignored in favour of 'model'.")
    obs <- model$y
    pred <- model$fitted.values
  } else {
    if (is.null(obs) || is.null(pred)) stop ("You must provide either 'model' or a combination of 'obs' and 'pred'.")
    if (!is.numeric(obs) || !is.numeric(pred)) stop ("'obs' and 'pred' must be numeric.")
    if (length(obs) != length(pred)) stop("'obs' and 'pred' must have the same length.")
  if (!(thresh == "preval" || (is.numeric(thresh) && thresh >= 0 && thresh <= 1))) stop ("'thresh' must be either 'preval' or a numeric value between 0 and 1.")
  if (thresh == "preval")  thresh <- prevalence(obs)

  pred0 <- pred[obs == 0]
  pred1 <- pred[obs == 1]
  opar <- par(no.readonly = TRUE)
  par(mar = c(5, 5.2, 3, 1))

  plot(x = c(0, 1), y = c(-0.5, 1.5), xlab = "Predicted value", type = "n", ylab = "", yaxt = "n", main = main)
  axis(side = 2, at = c(0, 1), tick = FALSE, labels = c("Observed\nabsences", "Observed\npresences"), las = 1)
  abline(v = thresh, lty = 2)
  points(x = pred0, y = sapply(rep(0, length(pred0)), jitter, 10), col = ifelse(pred0 < thresh, "grey", "black"))
  points(x = pred1, y = sapply(rep(1, length(pred1)), jitter, 10), col = ifelse(pred1 < thresh, "grey", "black"))
  if (! && legend.pos != "n")  legend(legend.pos, legend = c("Predicted presence", "Predicted absence"), pch = 1, col = c("black", "grey"))

Usage examples:


# load sample models:

# choose a particular model to play with:
mod <- rotif.mods$models[[1]]

# make some predPlots:
predPlot(model = mod)
predPlot(model = mod, thresh = 0.5)

# you can also use 'predPlot' with vectors of observed and predicted values instead of a model object:
myobs <- mod$y
mypred <- mod$fitted.values
predPlot(obs = myobs, pred = mypred)


Carlson C.J. (2020) embarcadero: Species distribution modelling with Bayesian additive regression trees in R. Methods in Ecology and Evolution, 11: 850-858.

4 thoughts on “Plot predicted values for presences vs. absences

  1. Pingback: Package modEvA 3.0 is now on CRAN! | modTools

  2. Pingback: Plot predicted values for presences vs. absences | R-bloggers

  3. Pingback: Plot predicted values for presences vs. absences | Best New Authors Magazine

  4. Pingback: Plot predicted values for presences vs. absences – Data Science Austria


Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s