By Harry Brook
Why should you care about statistics? Statistics surround us, and they are a tool for judging how likely an outcome is to occur. Algorithms are based on statistics. Sounds simple, right? Like any tool, statistics can be abused. Attributed to Mark Twain is the quote: “There are three kinds of lies: lies, damned lies, and statistics.” In fact, statistics are frequently used to sell all sorts of things: medicine, health products, crop inputs and many other products claiming various improvements to your life and the bottom line.
In marketing, statistics are used to sell in a number of ways. Bar graphs and research results give the impression of rigorous scientific research. One way to misrepresent the statistics is to cut off the bottom of a bar graph and show only the top part of the bars. This exaggerates the differences between one treatment and another. Even when the difference is real, that doesn’t necessarily mean the result holds everywhere, just that it might have occurred at one site.
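To see how much a cut-off bar graph can exaggerate, here is a minimal Python sketch. The yields are hypothetical numbers made up for illustration, and `bar_height_ratio` is a helper name invented here, not part of any library. It compares how tall the treated bar looks next to the untreated bar when the axis starts at zero and when it starts just below the data.

```python
def bar_height_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn bar heights (b over a) when the axis begins at axis_start."""
    if axis_start >= min(a, b):
        raise ValueError("axis_start must be below both values")
    return (b - axis_start) / (a - axis_start)


# Hypothetical yields in bushels per acre (assumed for illustration only).
untreated = 50.0
treated = 52.0

honest = bar_height_ratio(untreated, treated, axis_start=0.0)
truncated = bar_height_ratio(untreated, treated, axis_start=48.0)

print(f"Axis starts at 0:  treated bar is {honest:.2f} times as tall")
print(f"Axis starts at 48: treated bar is {truncated:.2f} times as tall")
```

With these made-up numbers, a 4% difference in yield is drawn as a bar twice as tall once the axis starts at 48. The data have not changed, only the picture.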
Statistics are great for cherry picking. If you have multiple site-years of data, you can usually find one site where there is a response to the treatment. If you show just the one site with a profitable response, it gives the impression that this is true for every location. This is where the confidence interval comes in. A confidence interval is a range of values that is likely to contain the true result, and it comes with a confidence level expressed as a percentage. You need a high confidence level to show the results are repeatable. Researchers usually like to have it at 95% or 99%. It is often given as 19 times out of 20 (95%). The p value is closely related. It is the probability of seeing a result at least that large by chance alone if the treatment really had no effect. A p value of .05 means a difference that big would turn up only about 5% of the time when the treatment does nothing. The lower the number, the better.
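The arithmetic behind cherry picking can be sketched in a few lines of Python. This is a simplified illustration, not a full analysis: it assumes every site-year is independent and that the product truly does nothing, and `chance_of_false_winner` is a name made up for this example. At a .05 threshold, each site-year has a 5% chance of looking like a response purely by luck, so the chance that at least one does grows quickly with the number of site-years.

```python
def chance_of_false_winner(n_sites, alpha=0.05):
    """Chance that at least one of n_sites shows a 'significant' response
    when the treatment has no real effect (sites assumed independent)."""
    return 1.0 - (1.0 - alpha) ** n_sites


for n in (1, 5, 10, 20):
    chance = chance_of_false_winner(n)
    print(f"{n:>2} site-years: {chance:.0%} chance of at least one lucky 'response'")
```

Under these assumptions, with 20 site-years there is roughly a 64% chance that at least one site shows a "significant" response to a product that does nothing, which is why a single winning site proves very little.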
There is another way to twist the numbers as well: correlation does not necessarily mean causation. Put simply, just because two things are related doesn’t mean one causes the other. For example, a producer tries a new micronutrient on his crop. At the same time he makes other changes, such as using a new seed drill and applying less or more of other nutrients, and then attributes any gain in yield to the micronutrient product. Meanwhile, there are different weather conditions and other changes to his operation that he ignores. That is due to our innate desire to believe too-good-to-be-true claims.
A very popular sales method is the use of testimonials. A testimonial is simply a statement by someone seen as reliable, or relatable, saying the product worked for them. This is a red flag, as there is seldom any background or numerical information to back up the claims. The trick is to make the people who make those claims look just like you and me. If they were wearing a suit, you wouldn’t believe them.
Other techniques to twist the facts include using skewed or bad visuals, such as taking a two-dimensional result and displaying it in three dimensions. Using misleading percentages is a good one too: “20% higher returns!” Compared to what? Total yield? Profit? Another error is using too small a sample size. In variety trials you often see new varieties with significantly higher yields after only a few station years of testing. As time goes on, those numbers usually decline as the variety is tested under more and varied conditions.
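The small-sample problem can be illustrated with a short simulation. The sketch below is written under loud assumptions: ten hypothetical varieties that all have exactly the same true yield (100), with year-to-year noise drawn from a normal distribution with a standard deviation of 10. None of these numbers come from real trials. It reports how good the top-ranked variety looks, on average, after 3 station years and after 30.

```python
import random
import statistics


def best_variety_mean(n_varieties, n_station_years, rng, true_yield=100.0, sd=10.0):
    """Highest average yield among varieties that are all truly identical."""
    averages = []
    for _ in range(n_varieties):
        yields = [rng.gauss(true_yield, sd) for _ in range(n_station_years)]
        averages.append(statistics.fmean(yields))
    return max(averages)


def average_best(n_station_years, trials=2000, seed=1):
    """Average apparent yield of the top-ranked variety over many simulated trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = [best_variety_mean(10, n_station_years, rng) for _ in range(trials)]
    return statistics.fmean(results)


few = average_best(3)
many = average_best(30)
print(f"Top variety after 3 station years:  {few:.1f}")
print(f"Top variety after 30 station years: {many:.1f}")
```

Because every simulated variety is identical, any apparent winner is pure luck. The "winner" looks noticeably better with few station years than with many, which mirrors the way early yield advantages tend to shrink as testing continues.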
Statistics and “big” data are used to sell us on all sorts of products and ideas. It is imperative that we use a little critical thinking to question some results. You don’t have to be a statistician, but do engage your mind and ask a few questions when you see claims about “new and improved” products.