The newest version of R package fuzzySim (3.0) is now on CRAN! It includes new functions such as ‘sharedFav‘, ‘favClass‘, ‘bioThreat‘ and ‘gridRecords‘; improvements to some functions, help files and examples; updated e-mail and citation information [ see citation(“fuzzySim”) ]; clarifications and typo corrections along the reference manual; and some bug fixes (after changes to base R and/or to function dependencies), e.g. to ‘getPreds‘ when applied to raster objects. You should now uninstall the old version of the package and install the new one:
Among other new functionalities, fuzzySim now makes it easier to use variable selection and presence-absence modelling techniques on occurrence points + raster variables, as these are becoming the more common data formats in species distribution modelling (SDM) and ecological niche modelling (ENM). Here’s a worked example:
# download and plot predictor variables: # (make sure their spatial resolution is not finer than the occurrence data) library(raster) worldclim <- getData("worldclim", var = "bio", res = 10) plot(worldclim) plot(worldclim[]) # download and plot species occurrence data: library(rgbif) gbif <- occ_data(scientificName = "Galemys pyrenaicus", hasCoordinate = TRUE, limit = 5000) # GBIF data often have errors, data cleaning (e.g. with package 'scrubr') should be performed! # here we'll just remove records of absence or zero-abundance: absence_rows <- which(gbif$data$occurrenceStatus == "absent" | gbif$data$organismQuantity == 0) if (length(absence_rows) > 0) gbif$data <- gbif$data[-absence_rows, ] presences <- gbif$data[ , c("decimalLongitude", "decimalLatitude")] # plot the occurrence records on the map: points(presences) # crop variables to the extent of the occurrence data (mind that this is just one possible way of cropping/masking to the adequate region of analysis): worldclim_crop <- crop(worldclim, extent(range(presences$decimalLongitude), range(presences$decimalLatitude))) plot(worldclim_crop[]) points(presences)
# model occurrence data as presence-absence in the cropped grid of pixels: library(fuzzySim) # first, get the centroid coordinates and variable values at pixels with and without presence records # (remember, raster resolution should not be finer than occurrence data!) gridded_presences <- gridRecords(rst = worldclim_crop, pres.coords = presences) head(gridded_presences) plot(worldclim_crop[]) points(gridded_presences[gridded_presences$presence == 0, c("x", "y")], col = "red") points(gridded_presences[gridded_presences$presence == 1, c("x", "y")], col = "blue", pch = 20)
# then, build a GLM with variable selection on these presence-absence data: names(gridded_presences) model_GLM <- multGLM(data = gridded_presences, sp.cols = "presence", var.cols = 5:23, id.col = "cells", FDR = TRUE, corSelect = TRUE, step = TRUE, trim = TRUE) summary(model_GLM$models$presence) head(model_GLM$predictions) # finally, get and plot the model predictions (probability and favourability): pred_GLM_raster <- getPreds(data = stack(worldclim_crop), models = model_GLM$models) plot(pred_GLM_raster)
If you’re worried about using a presence-absence modelling algorithm on ‘presence-only’ records, you can compare these predictions with those of a widely used presence-background algorithm (Maxent) on the same data, to check that they are not far off:
library(maxnet) # build a maxent model with all variables: model_maxent_allvars <- maxnet(p = gridded_presences[ , "presence"], data = gridded_presences[ , 5:23], f = maxnet.formula(p = gridded_presences[ , "presence"], data = gridded_presences[ , 5:23], classes = "lq")) # linear + quadratic features # build a maxent model with only the variables selected by multGLM (folowing FDR, correlation, AIC, significance): selected_vars <- model_GLM$variables$presence model_maxent_selvars <- maxnet(p = gridded_presences[ , "presence"], data = gridded_presences[ , selected_vars], f = maxnet.formula(p = gridded_presences[ , "presence"], data = gridded_presences[ , selected_vars], classes = "lq")) # compute and map maxent predictions: pred_maxent_allvars_raster <- raster::predict(worldclim_crop, model_maxent_allvars, type = "cloglog", clamp = TRUE) pred_maxent_selvars_raster <- raster::predict(worldclim_crop, model_maxent_selvars, type = "cloglog", clamp = TRUE) par(mfrow = c(1, 2)) plot(pred_maxent_allvars_raster, main = "Maxent all vars") plot(pred_maxent_selvars_raster, main = "Maxent selected vars")
This and other modelling methods will be taught next week at the CIBIO advanced course on ecological niche modelling. Check out the Training page for other upcoming courses!