Turn a list of species by site into a species presence-absence table

Imagine you have a data frame with the species that are present in each locality, such as this:

locality      species
Cuba          Pancake bush
Sumatra       Pancake bush
Greenland     Red dwarf
South America Pancake bush
South America Tree whale
Antarctica    Red dwarf

…but what you need is a data frame with one species per column and their presence (1) or absence (0) in each of the sites (rows), such as this:

    locality Pancake.bush Red.dwarf Tree.whale
     Africa            1         0          0
 Antarctica            1         1          0
       Asia            1         0          0
  Australia            1         1          0
     Baffin            1         1          1
      Banks            1         1          0

You can use R’s table function for this, but it can take a bit of fiddling to get the result in an easily manageable format. The splist2presabs function, now included in the fuzzySim package (Barbosa 2014), does this in one step:

splist2presabs <- function(data, sites.col, sp.col, keep.n = FALSE) {
  # version 1.1 (7 May 2013)
  # data: a matrix or data frame with your localities and species (each in a different column)
  # sites.col: the name or index number of the column containing the localities
  # sp.col: the name or index number of the column containing the species names or codes
  # keep.n: logical, whether to get in the resulting table the number of times each species appears in each locality; if false (the default), only the presence (1) or absence (0) are recorded

    length(sites.col) == 1,
    length(sp.col) == 1,
    sites.col != sp.col,
    sites.col %in% 1 : ncol(data) | sites.col %in% names(data),
    sp.col %in% 1 : ncol(data) | sp.col %in% names(data),

  presabs <- table(data[ , c(sites.col, sp.col)])
  presabs <- as.data.frame(unclass(presabs))
  if (!keep.n)  presabs[presabs > 1] <- 1
  presabs <- data.frame(row.names(presabs), presabs)
  names(presabs)[1] <- names(subset(data, select = sites.col))
  rownames(presabs) <- NULL
}  # end splist2presabs function

To try an example, load the splist2presabs function (above) and then do the following:

# get a set of localities and some fake species:
loc <- names(islands)
spp <- c("Pancake bush", "Tree whale", "Red dwarf")
# combine them in a data frame:
data <- data.frame(locality = sample(loc, 100, replace = TRUE), species = sample(spp, 100, replace = TRUE))
# take a look at the data:
# turn these into a presence-absence data frame and check it out:
atlas <- splist2presabs(data, sites.col = 1, sp.col = 2)
# if you'd rather have columns with shorter names, load the spCodes function and do:
data$spcode <- spCodes(data$species)
atlas <- splist2presabs(data, sites.col = 1, sp.col = 3)

[presented with Pretty R]


Barbosa A.M. (2014) fuzzySim: Fuzzy similarity in species’ distributions. R package, version 0.1.



Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s