fuzzySim is an R package that builds on some functions preliminarily published on the modTools blog (such as spCodes, splist2presabs, binarySimilarity and simMatrix). It implements fuzzy versions of the latter two functions to account for fuzziness / uncertainty / vagueness when comparing species presence-absence patterns or regional species compositions, and when calculating beta diversity. It also includes functions to generate fuzzy occurrence data, namely through trend surface analysis, inverse distance interpolation, or generalized linear modelling of binary occurrence (including the favourability function), and to make fuzzy comparisons of model predictions, including niche model overlap, fuzzy intersection / union, fuzzy range change, etc.
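To give an idea of what a fuzzy version of a binary similarity index can look like, here is a minimal base-R sketch of a fuzzy Jaccard index. It assumes the usual fuzzy-logic convention that the intersection of two membership values is their minimum and the union is their maximum. The function name fuzzy_jaccard and the example vectors are hypothetical, and this is not the package's own code; for real analyses, use the fuzzySim functions and check their help files.

```r
# Hypothetical helper (not part of fuzzySim): fuzzy Jaccard index.
# x, y: numeric vectors of equal length with values in [0, 1],
# e.g. presence-absence (0/1) or favourability values per site.
fuzzy_jaccard <- function(x, y) {
  stopifnot(length(x) == length(y),
            all(x >= 0 & x <= 1), all(y >= 0 & y <= 1))
  shared <- sum(pmin(x, y))  # fuzzy intersection: minimum at each site
  either <- sum(pmax(x, y))  # fuzzy union: maximum at each site
  shared / either            # NaN if both vectors are all zeros
}

# Made-up binary data: the result matches the classical Jaccard index
# (2 shared presences out of 5 sites occupied by at least one species)
sp1_bin <- c(1, 1, 0, 1, 0, 0, 1, 0)
sp2_bin <- c(1, 0, 0, 1, 1, 0, 0, 0)
fuzzy_jaccard(sp1_bin, sp2_bin)

# Made-up fuzzy data: continuous values are compared directly,
# without first converting them to 0/1 with a threshold
sp1_fuz <- c(0.9, 0.2, 0.6)
sp2_fuz <- c(0.7, 0.4, 0.1)
fuzzy_jaccard(sp1_fuz, sp2_fuz)
```

With 0/1 input the minimum and maximum reduce to logical AND and OR, so the sketch gives the same value as the binary index; with continuous input it uses the degree of presence at each site.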

fuzzySim is publicly available on R-Forge. A simple step-by-step tutorial on installation and usage is available as well. If you use/cite fuzzySim functions, or find articles that do, please let me know so I can keep track of them and help justify the work dedicated to developing the package!

A more graphical and directly mappable version of fuzzySim for QGIS is also in preparation. You can download the current (experimental!) functions from here, place them in your “.qgis2/processing/rscripts” folder (search for it on your computer; you may need to toggle “show hidden files” to see it) and give them a try. You need to have installed QGIS > 2.0 and R with the fuzzySim package, and to tell QGIS (under Processing > Options and configuration > Providers) where your R installation is. Feedback welcome!


If you use a fuzzySim function, please remember to cite:

Barbosa A.M. (2015) fuzzySim: applying fuzzy logic to binary similarity indices in ecology. Methods in Ecology and Evolution 6: 853-858 (DOI: 10.1111/2041-210X.12372)

2 thoughts on “fuzzySim”

  1. Baroni’s similarity is (sqrt(C * D) + C) / (sqrt(C * D) + A + B + C), but in your function it is (sqrt(C * D) + C) / (sqrt(C * D) + A + B - C). Can you correct this?

    • Thanks for your input. The formula in the function is right, but looks different from the original one because the A, B, C and D terms don’t mean the same in both formulas – not only because the letters are switched, but because in Baroni’s formula the B and C are “the number of attributes present in the first but not in the second”, while in our formula A and B are the numbers of attributes present in each element, regardless of whether they are also present in the other element or not. This slight difference makes both formulas equivalent, but the second faster to calculate (noticeably in large datasets) and more coherent with Jaccard’s formula. I see that this may confuse people, though, so I’m adding a note about it to the blog post on the binarySimilarity function and to the package help files. I’ve also added to the function in the blog (https://modtools.wordpress.com/2012/05/23/binarysimilarity/) the possibility of using Baroni’s original formula, so you can check for yourself that the results are the same. Cheers!
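The equivalence described in this reply can also be checked with a few lines of base R. This is only a sketch: the function names baroni_original and baroni_totals and the two example vectors are made up for illustration, and neither function is the package's actual code. The first function counts attributes present in only one of the two elements, as in Baroni's original formula; the second uses the total number of attributes in each element, as described in the reply.

```r
# Hypothetical helpers for illustration only (not the fuzzySim code).
# x, y: binary (0/1) vectors of equal length.

# Original formulation: b and c count attributes present in one
# element but not in the other
baroni_original <- function(x, y) {
  a <- sum(x == 1 & y == 1)  # present in both
  b <- sum(x == 1 & y == 0)  # present in the first only
  c <- sum(x == 0 & y == 1)  # present in the second only
  d <- sum(x == 0 & y == 0)  # absent from both
  (sqrt(a * d) + a) / (sqrt(a * d) + a + b + c)
}

# Formulation from the reply: A and B are the totals present in each
# element, whether or not they are also present in the other
baroni_totals <- function(x, y) {
  A <- sum(x)                # all attributes present in the first
  B <- sum(y)                # all attributes present in the second
  C <- sum(x == 1 & y == 1)  # present in both
  D <- sum(x == 0 & y == 0)  # absent from both
  (sqrt(C * D) + C) / (sqrt(C * D) + A + B - C)
}

# Made-up example data
el1 <- c(1, 1, 0, 1, 0, 0, 1, 0)
el2 <- c(1, 0, 0, 1, 1, 0, 0, 0)

# TRUE if both formulations give the same value for these data
all.equal(baroni_original(el1, el2), baroni_totals(el1, el2))
```

The two denominators match because A + B counts the shared attributes twice, so subtracting C once leaves the attributes present in both, in the first only and in the second only.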

