Trim off non-significant variables from a model

Stepwise variable selection is a common procedure for simplifying models. It maximizes predictive efficiency in an objective and reproducible way, and is useful when the individual importance of the predictors is not known a priori (Hosmer & Lemeshow, 2000). The step function in R performs such a procedure using an information criterion (AIC) to select the variables, but it often leaves variables in the model that are not significant. Such variables can subsequently be removed with a manual stepwise procedure (e.g. Crawley 2007, p. 442; Barbosa & Real 2010, 2012; Estrada & Arroyo 2012). The modelTrim function, now included in the fuzzySim package (Barbosa, 2015), performs this removal automatically: it repeatedly drops the least significant variable until all remaining variables are significant at the chosen alpha level (0.05 by default). It can also be applied to a full model (i.e., without previous use of the step function), in which case it serves as a backward stepwise selection procedure based on the significance of the coefficients (if method = "summary", the default) or on the significance of the variables themselves (if method = "anova", better when there are categorical variables in the model).

modelTrim <- function(model, method = "summary", alpha = 0.05) {
  # version 1.7 (16 Apr 2013)

  # p-values are stored as an unevaluated expression so that they can be
  # recomputed (with eval) each time the model is updated
  if (method == "summary") {
    p.values <- expression(summary(model)$coefficients[ , 4])
  } else if (method == "anova") {
    p.values <- expression(as.matrix(anova(model, test = "Chi"))[ , 5])
  } else stop("'method' must be either 'summary' or 'anova'")

  # [-1] excludes the first entry (the intercept, or the NULL row of the anova table)
  while (max(eval(p.values)[-1]) > alpha) {
    # drop the term with the largest p-value and refit the model
    model <- update(model, as.formula(paste("~.-", names(which.max(eval(p.values)[-1])))))
    if (length(eval(p.values)) == 1) {  # only intercept remains
      message("No significant variables left in the model.")
      break
    }  # end if length
  }  # end while

  return(model)
}  # end modelTrim function
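
To make the loop above more concrete, here is a minimal sketch of a single removal step done by hand. The data are simulated and the variable names (x1, x2, x3, y) are made up for illustration; the drop is also done unconditionally here, whereas modelTrim only drops a variable while its p-value exceeds alpha.

```r
# Simulated data (hypothetical variable names): only x1 truly affects y
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.5 * dat$x1))

mod <- glm(y ~ x1 + x2 + x3, family = binomial, data = dat)

# p-values of the coefficients, excluding the intercept (first row)
p <- summary(mod)$coefficients[-1, 4]

# name of the variable with the largest p-value
worst <- names(which.max(p))

# remove that variable and refit; modelTrim repeats this step
# for as long as the largest p-value is above alpha
mod2 <- update(mod, as.formula(paste("~ . -", worst)))

names(coef(mod2))
```

Comparing summary(mod) with summary(mod2) shows which variable was dropped and how the remaining coefficients changed after refitting.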


If you have a model object (resulting e.g. from the glm function) called, for example, mod.sp1, load the modelTrim function and then just type

mod.sp1 <- modelTrim(mod.sp1, method = "summary")

summary(mod.sp1)
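
The reason method = "anova" is preferable with categorical variables can be seen by comparing the names that each method works with. The sketch below uses simulated data with made-up variable names (temp, habitat, y) and only base R functions; it is meant as an illustration, not as part of modelTrim itself.

```r
# Simulated data (hypothetical variable names) with one factor predictor
set.seed(2)
n <- 300
dat <- data.frame(
  temp = rnorm(n),
  habitat = factor(sample(c("forest", "scrub", "grass"), n, replace = TRUE))
)
dat$y <- rbinom(n, size = 1, prob = plogis(0.8 * dat$temp))

mod <- glm(y ~ temp + habitat, family = binomial, data = dat)

# one row per coefficient: the factor appears once per non-reference level
coef.names <- rownames(summary(mod)$coefficients)

# one row per term: the factor appears once, under its own name
term.names <- rownames(anova(mod, test = "Chisq"))

coef.names
term.names
```

The coefficient table lists level-specific names such as habitatgrass, which do not correspond to any term in the model formula, whereas the anova table lists habitat as a single term that can be removed as a whole.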

The multGLM function now also includes an option to trim models with this function.

References

Barbosa A.M. (2015) fuzzySim: applying fuzzy logic to binary similarity indices in ecology. Methods in Ecology and Evolution, 6: 853-858

Barbosa A.M. & Real R. (2010) Favourable areas for expansion and reintroduction of Iberian lynx accounting for distribution trends and genetic diversity of the European rabbit. Wildlife Biology in Practice, 6: 34-47

Barbosa A.M. & Real R. (2012) Applying fuzzy logic to comparative distribution modelling: a case study with two sympatric amphibians. The Scientific World Journal, Article ID 428206

Crawley M.J. (2007) The R Book. John Wiley & Sons, Chichester (UK)

Estrada A. & Arroyo B. (2012) Occurrence vs abundance models: Differences between species with varying aggregation patterns. Biological Conservation, 152: 37-45

Hosmer D.W. & Lemeshow S. (2000) Applied Logistic Regression (2nd ed). John Wiley and Sons, New York
