By Auriel M.V. Fournier and Andrew MacDonald
Math and Statistics Editors
A common misconception is that statistics can make precise predictions about future events. But most of the time there is error associated with a prediction; that is, the prediction falls within a range of values.
For instance, a meteorologist might predict that Quebec City will receive 300 cm of snow this year, plus or minus 20 cm. In other words, the city is predicted to receive between 280 and 320 cm of snow. When talking about something as variable as the weather, uncertainty makes a lot of sense. I (Auriel) grew up around the Great Lakes, where the weather can change at the drop of a hat. But when statistics are used to predict other things, the error can cause unease.
Error can be very frustrating, both for those making the predictions and for those trying to use them. If the error is too large, it can make the prediction less informative – even useless. Because of this, statisticians often try to reduce the error as much as possible by trying to understand what is driving the process they are modeling.
Take climate change. Many models are used to predict it. Their results vary depending on which greenhouse gases are modeled, what time frame or region is considered, and whether the models assume a linear, exponential, or sudden tipping-point increase in temperature.
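To make the snowfall example concrete, here is a minimal sketch of how a "plus or minus" range might be derived from past observations. The yearly totals are made-up numbers chosen for illustration, not real Quebec City records, and the rule used (the average plus or minus two sample standard deviations) is only a rough rule of thumb, not the method a meteorologist would necessarily use. The function name `prediction_range` is hypothetical.

```python
import statistics

# Hypothetical yearly snowfall totals in cm (invented for illustration only).
past_snowfall_cm = [290, 310, 300, 295, 305, 320, 280, 300]


def prediction_range(values, k=2.0):
    """Return (low, center, high) as the mean +/- k sample standard deviations."""
    center = statistics.mean(values)
    spread = statistics.stdev(values)  # sample standard deviation
    margin = k * spread
    return center - margin, center, center + margin


low, center, high = prediction_range(past_snowfall_cm)
print(f"Predicted snowfall: about {center:.0f} cm, "
      f"plausibly between {low:.0f} and {high:.0f} cm")
```

The more the past totals vary from year to year, the wider the range becomes, which is one way to see why a prediction with a large error is less informative.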
The many predictions about the pace and impact of climate change can cause people to question the validity of climate change as a phenomenon. But whether climate change is occurring is not in question. It is the models that predict future events that vary.
These models, like many others, contain error because there are always things in the system that cannot be accounted for. Most systems are incredibly complex, especially natural systems like climate. Current computing power and mathematical models still cannot account for every variable.
“Essentially, all models are wrong, but some are useful” – George E.P. Box
There will always be errors in the models that predict events in our lives: the daily weather, election results, stock markets, advertising, bird migration, traffic flow, and sports outcomes. But statistics can still be useful for making informed guesses and decisions.
Statistical models driven by data can serve to reduce human bias in decision-making. Expert opinion is a valuable tool, but expert opinion paired with data-driven predictions makes decision-making more objective.
Error is an inevitable part of the process. However, analyzing the source of the error can serve to clarify the phenomenon being studied.
For instance, my own work (Auriel’s) involves trying to understand what habitats rails (Rallidae) use during the fall. Our initial analysis had a huge amount of error. This led us to explore other ways of collecting habitat data and to consider other variables, such as the percentage of the wetland covered by a given plant species, which turned out to explain our data better.
Error can be frustrating, but accepting and understanding it is key to the correct interpretation of statistics and understanding the world around us.
Quote Source: George E.P. Box and Norman R. Draper, Empirical Model-Building and Response Surfaces (1987), via Wikiquote
Photo Credits: Auriel Fournier