According to a recent article published in Scientific American, the authors of the study assert that:
"Farmers who have named their cows … probably have a better relationship with them. They’re less fearful, more relaxed and less stressed, so that could have an effect on milk yield."
And in a separate article:
"Placing more importance on knowing the individual animals and calling them by name can — at no extra cost to the farmer — also significantly increase milk production."
Polite farmers and happy cows — who could argue with that?
Correlation Is Not Causation
The article is a cute example of a common problem in BI and decision making: correlation is not the same thing as causation, but people often don’t do enough analysis to know which is which.
I haven’t read the original research on the cows, so I can’t be sure, but there doesn’t seem to be any real evidence for saying there are "no extra costs". The "probably" in the first quote suggests that the stress theory is just a guess. Did the researchers really find a causal effect, or do both higher milk yields and a propensity for naming cows result from something else? Do farmers who name their cows look after them better? Do they spend more money on care?
Without answers to these questions, the information in the study is interesting, but not useful. (Would farmers who named their cows — but made no other changes — really get an increase in milk production?)
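The confounding story above is easy to see in a toy simulation. The numbers and the "care level" variable below are entirely invented for illustration — this is not the study's data — but they show how a hidden factor can make naming look like it boosts yield even when, by construction, it has no effect at all:

```python
import random

random.seed(42)

# Hypothetical model: an unobserved "care level" drives BOTH the chance
# that a farmer names their cows AND the milk yield. Naming itself does
# nothing in this simulation.
farms = []
for _ in range(1000):
    care = random.random()                               # hidden confounder
    names_cows = care > 0.5                              # attentive farmers tend to name cows
    yield_litres = 7000 + 1500 * care + random.gauss(0, 200)
    farms.append((names_cows, yield_litres))

named = [y for n, y in farms if n]
unnamed = [y for n, y in farms if not n]

avg_named = sum(named) / len(named)
avg_unnamed = sum(unnamed) / len(unnamed)

# Named-cow farms show a clearly higher average yield, purely because
# "care" raises both naming and yield.
print(f"named-cow farms:   {avg_named:.0f} litres")
print(f"unnamed-cow farms: {avg_unnamed:.0f} litres")
```

A study that only measured naming and yield would report exactly this gap — and a farmer who acted on it by naming his cows, without changing anything else, would see no benefit.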
Quick! — Should We Discontinue Bread and Milk?
It’s very easy to make incorrect decisions with limited analysis. And the problem is sometimes made worse by an emphasis on fast, simplified, "actionable" information for executives.
If you see a chart like this one, showing a supermarket’s profitability by product, for example, you might be tempted to stop selling loss-making bread and milk:
But what if it’s a deliberate choice? The analysis misses a crucial point: people don’t buy each product in a supermarket independently of the others. Supermarkets (if they are legally allowed to: not in France) routinely sell some "loss-leader" products such as bread, milk, and sugar in order to entice people into their stores.
One real-life example of this phenomenon, according to IBM:
A large UK supermarket chain sold a low-volume, gourmet cheese that was such a slow mover that the merchandising department considered discontinuing it. But a market-basket analysis of a high-value customer group revealed that the cheese was in many of the largest baskets in this customer group, and dropping the product could have risked disappointing or losing some of its most valuable customers.
(I prefer this apparently real example to the famous "beer and diapers" story, which is more legend than fact.)
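The cheese anecdote hinges on looking at basket-level value rather than product-level volume. A minimal sketch of that kind of market-basket analysis — with baskets invented purely for illustration, not IBM's or the supermarket's actual data — might look like this:

```python
from collections import defaultdict

# Toy basket data: each basket is a (total_value, items) pair.
# Values and products are made up to illustrate the pattern.
baskets = [
    (250.0, {"gourmet_cheese", "wine", "steak", "olives"}),
    (310.0, {"gourmet_cheese", "salmon", "wine"}),
    (12.0,  {"bread", "milk"}),
    (18.0,  {"milk", "eggs", "bread"}),
    (275.0, {"gourmet_cheese", "lamb", "wine"}),
    (9.0,   {"bread"}),
]

# For each product, collect the values of the baskets it appears in:
# a slow seller can still anchor the store's most valuable baskets.
values_by_product = defaultdict(list)
for value, items in baskets:
    for item in items:
        values_by_product[item].append(value)

avg_basket_value = {item: sum(v) / len(v) for item, v in values_by_product.items()}

for item, avg in sorted(avg_basket_value.items(), key=lambda kv: -kv[1]):
    print(f"{item:15s} appears in baskets averaging {avg:.2f}")
```

Ranked by units sold, the gourmet cheese looks like a candidate for delisting; ranked by the average value of the baskets it sits in, it looks like something you drop at your peril — the same data, a different (and here more relevant) question.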
Needed: More Analysis — and Smart People
In order to make the right decisions using business intelligence, you have to rule out alternative causal explanations. But understanding when further analysis might be needed, and what to look at, requires a deep knowledge of the business context — which in turn requires smart people who know what they’re doing.
Any successful business intelligence strategy has to take this into account, and include industry/business/data experts in a business intelligence competency center (BICC).
The Future of Decisions
Still, technology can — and should — help. The decision-making process is under-supported by current business intelligence and performance management tools. One key aspect of "decision intelligence" in the future should be the ability to help people identify when they might be making an over-hasty decision on too-limited data…
________________________________________________
Original cow image by publicenergy
Comments
3 responses to “More Milk Please, Ermintrude! A Classic Decision Trap?”
[…] any relationship you find is a working hypothesis, to be validated through further analysis (e.g. correlation is not causation), and expert discussion (as with peer-reviewed science papers, the best way to deal with potential […]
Timo,
I agree with you that there is a lot of frivolous drawing of conclusions from shaky data analysis. Thanks for raising this issue.
I don’t believe that any analysis of data can get past correlation to causation because you never know if there is one more variable out there you overlooked. In addition, you can only estimate bias, not eliminate it. When statistical analysis shows an overwhelming correlation between drunk driving and accidents, you can only assume causation, but if you’re an ambulance driver, you can see it. Air traffic controllers won’t be replaced by decision engines based on statistical models that have revealed the causation of accidents. The sloppy statistics that have been raised to an art form in the medical industry border on scandalous. Numbers will get you only so far, but unless you can see it in vivo, it’s still just a hypothesis.
And about the competency centers – in my experience, they are not executed very well; the staff is rarely picked from the pool of A-list practitioners. It creates another cost center and bureaucracy. I’m not opposed to the idea, only the execution. I remember IBM pushing this idea 25 years ago as the “Information Center” using their proto-BI, -ETL and -Datamart tools APL, APL-DI and ADRS. Perhaps my opinion is colored by that experience.
Neil Raden
Great post. It drives me batty how often idiotic causal chains are “proven” by studies.