# Prevalence and evenness in binary data

For building and evaluating species distribution models, the porportion of presences of the species and the balance between the number of presences and absences may be issues to take into account (e.g. Jiménez-Valverde & Lobo 2006, Barbosa et al. 2013). The prevalence and the evenness functions, included in the modEvA package (Barbosa et al. 2014), can calculate these measures:

```prevalence <- function (obs, event = 1) {
# version 1.0
# calculates the prevalence (proportion of occurrences) of a value (event) in a vector
# obs: a vector of binary observations (e.g. 1 vs. 0, male vs. female, disease vs. no disease, etc.)
# event: the value whose prevalence we want to calculate (e.g. 1, "present", etc.)
sum(obs == event) / length(obs)
}  # end prevalence function```
```evenness <- function (obs) {
# version 1.3 (18 June 2013)
# calculates the evenness (equilibrium) of cases in a binary vector; result ranges from 0 when all values are the same, to 1 when there are the same number of cases with each value
# obs: a vector of binary observations (e.g. 1 vs. 0, male vs. female, disease vs. no disease, etc.)
values <- unique(obs)
nvalues <- length(values)
if (!(nvalues %in% c(1, 2))) stop("Input vector includes ", nvalues, " different values; 'evenness' is only implemented for binary observations (with 1 or 2 different values).")
proportion <- (sum(obs == values[1])) / length(obs)
if (proportion > 0.5)  balance <- 1 - proportion  else  balance <- proportion
return(2 * balance)  # so evenness ranges between 0 and 1
}  # end evenness function```

Let’s exemplify with 3 sample binary vectors x, y and z:

```(x <- rep(c(0, 1), each = 5))
(y <- c(rep(0, 3), rep(1, 7)))
(z <- c(rep(0, 7), rep(1, 3)))

prevalence(x)
evenness(x)

prevalence(y)
evenness(y)

prevalence(z)
evenness(z)```

[presented with Pretty R]

References

Barbosa A.M., Brown J.A. & Real R. (2014) modEvA – an R package for model evaluation and analysis. R package, version 0.1.

Barbosa A.M., Real R., Muñoz A.R. & Brown J.A. (2013) New measures for assessing model equilibrium and prediction mismatch in species distribution models. Diversity and Distributions, 19: 1333-1338

Jiménez-Valverde A. & Lobo J.M. (2006) The ghost of unbalanced species distribution data in geographical model predictions. Diversity and Distributions, 12: 521–524.