Simpler Species Distribution Models Yield Better Predictions

Tuesday, September 10, 2013: 8:00 AM
Harris Brake (The Marriott Little Rock)
Seth J. Wenger, Trout Unlimited, Boise, ID
Julian D. Olden, School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA
Species distribution modeling (SDM), or niche modeling, aims to identify correlates of species occurrence, often with the ultimate goal of predicting suitable habitat in other locations (for example, to predict potential species invasions) or under future climates. It is common to use machine learning methods such as boosted regression trees, random forests, or neural networks for such predictions, as these techniques have the flexibility to describe nonlinear relationships and interactions without prior specification. However, the resulting models may be quite complex, and the degree of complexity is not easily described or constrained. We show that models of higher complexity may have inferior transferability to new locations and, potentially, to future climates. In one example with brook trout (Salvelinus fontinalis) in the western United States, we demonstrate that the simplest of five models has the greatest spatial transferability, even though this model has the worst performance when evaluated on the dataset used for model fitting. We attribute this behavior to the fact that species-environment relationships are often highly variable, and overly complex models may be idiosyncratic to a given location; they are, in effect, overfitted. Simpler models that capture only the most important and generalizable species-environment relationships are of more practical use and higher reliability when projecting into new locations and future climates.
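
The contrast between fit on the training data and spatial transferability can be illustrated with a small synthetic sketch. This is not the authors' analysis, and neither toy model is one of the five models in the study: the two regions, the temperature and flow predictors, the 15 degree threshold, and the region-specific "quirk" are all assumptions made up for illustration. A one-parameter threshold model stands in for a simple SDM, and a 1-nearest-neighbour classifier stands in for a highly flexible one.

```python
import random


def simulate_region(n, flips_when, rng):
    """Return (temp, flow, present) tuples for one hypothetical region."""
    sites = []
    for _ in range(n):
        temp = rng.uniform(5.0, 25.0)  # stream temperature in deg C (made up)
        flow = rng.uniform(0.0, 1.0)   # scaled flow index (made up)
        present = temp < 15.0          # shared, generalizable relationship
        if flips_when(flow):           # region-specific, idiosyncratic effect
            present = not present
        sites.append((temp, flow, present))
    return sites


def accuracy(predict, sites):
    """Fraction of sites whose presence/absence is predicted correctly."""
    return sum(predict(t, f) == y for t, f, y in sites) / len(sites)


def fit_simple(train):
    """Simple model: present if temperature is below one fitted threshold."""
    best = max(
        (t for t, _, _ in train),
        key=lambda c: sum((t < c) == y for t, _, y in train),
    )
    return lambda temp, flow: temp < best


def fit_complex(train):
    """Flexible model: 1-nearest-neighbour on both predictors.

    It memorizes every fitting site, including the local flow effect.
    """
    def predict(temp, flow):
        nearest = min(
            train,
            key=lambda s: ((s[0] - temp) / 20.0) ** 2 + (s[1] - flow) ** 2,
        )
        return nearest[2]
    return predict


rng = random.Random(42)
# Region A is used for fitting; region B is the "new location".
# The flow effect differs between regions, the temperature effect does not.
region_a = simulate_region(400, lambda flow: flow > 0.8, rng)
region_b = simulate_region(400, lambda flow: flow < 0.2, rng)

simple_model = fit_simple(region_a)
complex_model = fit_complex(region_a)

results = {
    "simple_fit": accuracy(simple_model, region_a),
    "complex_fit": accuracy(complex_model, region_a),
    "simple_transfer": accuracy(simple_model, region_b),
    "complex_transfer": accuracy(complex_model, region_b),
}
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the flexible model is expected to score higher on the fitting region, because it reproduces the local flow effect, and lower on the new region, where that effect no longer holds. The threshold model ignores the local effect in both regions, so its accuracy should change little between them.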