Optimization of model thresholds based on a pair of measures

The optiPair function, now included in the modEvA package (Barbosa et al. 2014), follows the optiThresh, threshMeasures and evaluate functions, which should all be loaded along; and extends (thus replacing) the SensSpecCurve function (now deprecated), in the sense that optiPair can be used with any pair of model evaluation measures that balance each other, such as sensitivity-specificity, omission-commission, or underprediction-overprediction (featured in Barbosa et al. 2013). The function plots both measures in the given pair against all thresholds, and calculates the optimal sum, difference and mean of the two measures.

SensSpec

optiPair <-
function (obs = NULL, pred = NULL, model = NULL, 
          measures = c("Sensitivity", "Specificity"), 
          interval = 0.01, plot = TRUE, plot.sum = FALSE, 
          plot.diff = FALSE, ylim = NULL, ...) {
  # version 1.5 (3 Apr 2014)
  # obs: a vector of observed presences (1) and absences (0) or another binary response variable
  # pred: a vector with the corresponding predicted values of presence probability, habitat suitability, environmental favourability or alike
  # model: instead of (and overriding) obs and pred, a model object of class "glm"
  # measures: the pair of measures whose curves to plot, e.g. c("Sensitivity", "Specificity")
  # interval: the interval of thresholds at which to calculate sensitivity and specificity
  # plot: logical indicating whether or not to plot the measures
  # plot.sum and plot.diff will optionally plot the sum (+) and/or the difference (-) between both measures in the pair
  # ... : additional arguments to be passed to the "plot" function
 
  if(length(measures) != 2) stop ("'measures' must contain two elements.")
 
  if(is.null(model)) {
    if (is.null(obs) | is.null(pred)) stop("You must provide either the 'obs' 
and 'pred' vectors, or a 'model' object of class 'glm'")
  }  # end if null model
  else {
    if (!is.null(obs)) message("Argument 'obs' ignored in favour of 'model'.")
    if (!is.null(pred)) message("Argument 'pred' ignored in favour of 'model'.")
    obs <- model$y
    pred <- model$fitted.values
    model <- NULL  # so the message is not repeated for each threshold
  }  # end if model
 
  if (NA %in% obs | NA %in% pred) stop("Please remove (rows with) NA 
                                       from your data")
  if (length(obs) != length(pred)) stop("'obs' and 'pred' must have the same 
                                        number of values (and in the same order)")
  if (!all(obs %in% c(0, 1))) stop ("'obs' must consist of binary observations 
                                    of 0 or 1")
  if (any(pred < 0 | pred > 1)) stop ("'pred' must range between 0 and 1")
 
  measures.values <- optiThresh(obs = obs, pred = pred, interval = interval, measures = measures, optimize = "each", simplif = TRUE)
  measures.values$Difference <- abs(measures.values[ , 1] - measures.values[ , 2])
  measures.values$Sum <- rowSums(measures.values[ , 1:2])
  measures.values$Mean <- rowMeans(measures.values[ , 1:2])
  measures.values$Threshold <- as.numeric(rownames(measures.values))
 
  MinDiff <- min(measures.values$Difference)
  MaxSum <- max(measures.values$Sum)
  MaxMean <- max(measures.values$Mean)
 
  ThreshDiff <- with(measures.values, Threshold[which.min(Difference)])
  ThreshSum <- with(measures.values, Threshold[which.max(Sum)])
  ThreshMean <- with(measures.values, Threshold[which.max(Mean)])
 
  if (plot) {
 
    finite <- as.matrix(measures.values[ , 1:2])
    finite <- finite[is.finite(finite)]
    if(is.null(ylim)) {
      if(plot.sum) ylim <- c(min(finite), MaxSum)
      else ylim <- c(min(finite), max(finite))
    }  # end if null ylim
 
    plot(measures.values[ , 1] ~ measures.values$Threshold, pch = 20, xlab = "Threshold", ylab = measures, ylim = ylim, ...)
    # title(ylab = measures, col.lab = c("black", "grey"))  # error: "col.lab" has the wrong length
    mtext(measures[1], side = 2, line = 3)
    points(measures.values[ , 2] ~ measures.values$Threshold, pch = 20, col = "darkgrey")
    mtext(measures[2], side = 2, line = 2, col = "darkgrey")
 
    abline(h = measures.values[which.min(measures.values$Difference), 1], col = "lightgrey", lty = 2)
    abline(v = ThreshDiff, col = "lightgrey", lty = 2)
 
    if (plot.sum) {
      with(measures.values, points(Sum ~ Threshold, pch = '+'))
      abline(v = ThreshSum, col = "lightgrey", lty = 2)
      abline(h = MaxSum, col = "lightgrey", lty = 2)
    }    # end if plot sum
 
    if (plot.diff) {
      with(measures.values, points(Difference ~ Threshold, pch = '-', col = "grey"))
      abline(v = ThreshDiff, col = "grey", lty = 2)
      abline(h = MinDiff, col = "grey", lty = 2)
    }  # end if plot diff
  }  # end if plot
 
  return(list(measures.values = measures.values, MinDiff = MinDiff, ThreshDiff = ThreshDiff, MaxSum = MaxSum, ThreshSum = ThreshSum, MaxMean = MaxMean, ThreshMean = ThreshMean))
 
}

[Created by Pretty R at inside-R.org]

To use it, just load the required functions and replace the names of your data frame, presence-absence data and predicted values (or else use a “glm” model object directly) in either of the folowing commands:

optiPair(obs = mydata$myspecies, pred = mydata$myspecies_P, measures = c(“Sensitivity”, “Specificity”))

optiPair(model = mymodel, measures = c(“UPR”, “OPR”))

References

Barbosa A.M., Brown J.A. & Real R. (2014) modEvA – an R package for model evaluation and analysis. R package, version 0.1.

Barbosa, A.M., Real, R., Muñoz, A.-R. & Brown, J.A. (2013) New measures for assessing model equilibrium and prediction mismatch in species distribution models. Diversity and Distributions, online early (DOI: 10.1111/ddi.12100).

Advertisements

Comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s