Imagine you have a data frame with the species that are present in each locality, such as this:
locality        species
Cuba            Pancake bush
Sumatra         Pancake bush
Greenland       Red dwarf
South America   Pancake bush
South America   Tree whale
Antarctica      Red dwarf
…but what you need is a data frame with one species per column and their presence (1) or absence (0) in each of the sites (rows), such as this:
locality     Pancake.bush  Red.dwarf  Tree.whale
Africa                  1          0           0
Antarctica              1          1           0
Asia                    1          0           0
Australia               1          1           0
Baffin                  1          1           1
Banks                   1          1           0
You can use R’s table function for this, but it can take a bit of fiddling to get the result in an easily manageable format. The splist2presabs function, now included in the fuzzySim package (Barbosa 2014), does this in one step:
splist2presabs <- function(data, sites.col, sp.col, keep.n = FALSE) {
  # version 1.1 (7 May 2013)
  # data: a matrix or data frame with your localities and species (each in a different column)
  # sites.col: the name or index number of the column containing the localities
  # sp.col: the name or index number of the column containing the species names or codes
  # keep.n: logical, whether to get in the resulting table the number of times each species appears in each locality; if false (the default), only the presence (1) or absence (0) are recorded

  stopifnot(
    length(sites.col) == 1,
    length(sp.col) == 1,
    sites.col != sp.col,
    sites.col %in% 1 : ncol(data) | sites.col %in% names(data),
    sp.col %in% 1 : ncol(data) | sp.col %in% names(data),
    is.logical(keep.n)
  )

  presabs <- table(data[ , c(sites.col, sp.col)])
  presabs <- as.data.frame(unclass(presabs))
  if (!keep.n)  presabs[presabs > 1] <- 1
  presabs <- data.frame(row.names(presabs), presabs)
  names(presabs)[1] <- names(subset(data, select = sites.col))
  rownames(presabs) <- NULL
  return(presabs)
}  # end splist2presabs function
To try an example, load the splist2presabs function (above) and then do the following:
# get a set of localities and some fake species:
loc <- names(islands)
spp <- c("Pancake bush", "Tree whale", "Red dwarf")

# combine them in a data frame:
data <- data.frame(locality = sample(loc, 100, replace = TRUE),
                   species = sample(spp, 100, replace = TRUE))

# take a look at the data:
head(data)

# turn these into a presence-absence data frame and check it out:
atlas <- splist2presabs(data, sites.col = 1, sp.col = 2)
head(atlas)

# if you'd rather have columns with shorter names, load the spCodes function and do:
data$spcode <- spCodes(data$species)
head(data)
atlas <- splist2presabs(data, sites.col = 1, sp.col = 3)
head(atlas)
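For comparison, here is a minimal sketch of the bare table route mentioned earlier, applied to the six-row species list from the top of this post. It only illustrates the kind of fiddling involved and is not a replacement for splist2presabs. The object names (splist, counts, presabs) are chosen here for illustration.

```r
# the six-row species list shown at the top of the post:
splist <- data.frame(
  locality = c("Cuba", "Sumatra", "Greenland",
               "South America", "South America", "Antarctica"),
  species = c("Pancake bush", "Pancake bush", "Red dwarf",
              "Pancake bush", "Tree whale", "Red dwarf")
)

# cross-tabulate localities (rows) against species (columns):
counts <- table(splist$locality, splist$species)

# as.data.frame(counts) would return a long table with a Freq column,
# so convert the table as a matrix to keep one column per species:
presabs <- as.data.frame.matrix(counts)

# collapse counts above 1 to plain presences:
presabs[presabs > 1] <- 1

# move the localities from the row names into a proper column
# (data.frame also turns "Pancake bush" into "Pancake.bush"):
presabs <- data.frame(locality = rownames(presabs), presabs, row.names = NULL)
presabs
```

Note that the localities come out sorted alphabetically, not in their original order.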
REFERENCES:
Barbosa A.M. (2014) fuzzySim: Fuzzy similarity in species’ distributions. R package, version 0.1.
I need to do the opposite! Any ideas how I can do that?
Try the ‘reshape’ function in base R, or functions in other packages, such as those described at http://www.cookbook-r.com/Manipulating_data/Converting_data_between_wide_and_long_format/
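A minimal sketch of that reverse conversion with base R's reshape, assuming a presence-absence table whose first column holds the localities and whose remaining columns are species. The small atlas table and the column name "presence" are made up here for illustration.

```r
# a small presence-absence table (made-up values):
atlas <- data.frame(locality = c("Cuba", "Greenland", "Sumatra"),
                    Pancake.bush = c(1, 0, 1),
                    Red.dwarf = c(0, 1, 0),
                    Tree.whale = c(0, 0, 1))

# assumption: every column except the first is a species
sp.cols <- names(atlas)[-1]

# stack the species columns into one row per locality-species pair:
long <- reshape(atlas,
                direction = "long",
                varying = sp.cols,
                v.names = "presence",
                timevar = "species",
                times = sp.cols,
                idvar = "locality")

# keep only the presences and drop the 0/1 column:
splist <- long[long$presence > 0, c("locality", "species")]
splist <- splist[order(splist$locality), ]
rownames(splist) <- NULL

# optional: restore spaces in the species names
# (assumes the original names contained no dots)
splist$species <- gsub(".", " ", splist$species, fixed = TRUE)
splist
```

If the table holds counts rather than 0/1 values (keep.n = TRUE), the same filter still returns one row per locality-species pair, so the number of records is not recovered.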