Lies, Damn Lies and Statistics

  • Gas is cheap.
  • The job market is strong.
  • Consumer debt is down.
  • The nation’s weather is cold.
  •  “Retail sales are expected to top $22 trillion this year, including $1.3 trillion in e-commerce led by China and the U.S.”  - eMarketer
  •  “U.S. Retailers likely to just meet holiday forecasts.” - Reuters
  •  “Total retail sales tumbled 11 percent to $50.9 billion during the extended Black Friday weekend.” - National Retail Federation (NRF)
  •  “Online sales on Thanksgiving were up 14.3 percent, while Black Friday online sales were up 9.5 percent.” - IBM
  •  “Despite growth, Black Friday has lost its punch.” - Retention Science

  • The NRF data above is based upon a consumer survey, not purchasing data. 
  • The IBM data is from eCommerce “shopping cart” transactions. 
  • Cold weather is good for in-store retail sales - It brings consumers into stores to buy outerwear and accessories.
  • Cold weather is bad for in-store retail sales - It keeps consumers at home, where they shop online and have fewer “impulse” purchases.
To paraphrase Andrew Lang:
“Retail analysts use statistics in the same way that a drunk uses lamp-posts - for support rather than illumination.”

Statistical Disconnects: Super Cats & Deadly Ice Cream 

While media, retailers, wholesalers, politicians, economists and analysts may misinterpret statistics, there is a more significant problem when the data used is itself inaccurate.
Many statistical studies are poorly designed.  
The collection or analysis of data may not match the purpose of the study.
 IQ testing is one classic example of mismatched analysis. The test does not measure its stated subject, intelligence: IQ tests do not measure problem solving, creativity or emotional intelligence. IQ tests are also culturally and racially biased; they use context-dependent questions such as, “Where does milk come from?” Is the correct answer “a cow,” “a supermarket,” or ""? IQ tests may, however, be effective for intelligence benchmarking and for predicting academic achievement.
 “The Pepsi Challenge” data illustrated that customers preferred the taste of Pepsi over that of Coke. There were two “fallacies” in the study. First, the study used “sips” from small paper cups; a sweeter taste is preferred in small sips, but not in larger servings. Second, a possibly more “valuable” data point was not shared: consumers who “discovered” through the survey that they preferred the taste of Pepsi still preferred to buy Coke: “We’re a Coke household.”
“Cherry-picking” of data.  
Statisticians are skilled at choosing which data to include in a study and which to disregard.
Poor data sampling.
In 2008, 132 cats were brought to Manhattan Animal Hospital after falling out of open windows. Of the 132 cats, 128 survived, some from 30-story falls. The New York Times wrote of the miraculous survival instincts of cats: the ability to “adjust rotation orientation” during a fall, “flying-squirrel” aerodynamics, joints and muscles acting as “shock absorbers.” The media discussion lasted for months.
The data point that ended the super-cat conversation concerned data sampling. One woman interviewed said, “My poor cat must have been the exception. She fell out our 9th-story window and died. Of course, I did not bring her body to the hospital, nor did I report it…” The survey included only cats that had survived long enough to be brought to the hospital. When cats that died on impact were included in the study, it was no longer newsworthy.
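To see how much the missing cats matter, here is a minimal sketch of the arithmetic. The hospital figures come from the story above; the number of unreported deaths is purely an assumption chosen for illustration, since nobody counted them.

```python
def survival_rate(survived, died):
    """Share of cats that survived, given counts of survivors and deaths."""
    return survived / (survived + died)

# Figures from the hospital sample described above.
hospital_survived = 128
hospital_died = 132 - 128

# Hypothetical: cats that died on impact and were never brought in.
# This number is invented for illustration only.
unreported_deaths = 100

observed = survival_rate(hospital_survived, hospital_died)
adjusted = survival_rate(hospital_survived, hospital_died + unreported_deaths)

print(f"Survival rate in the hospital sample: {observed:.1%}")
print(f"Survival rate with unreported deaths: {adjusted:.1%}")
```

Change unreported_deaths and the "miracle" shrinks or grows accordingly. The hospital sample alone cannot tell you which version is true.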
Presentation of data may bias interpretation.  
 “Milk with 3.5% Fat” - wow, that sounds like a lot vs. “96.5% Fat Free Milk” - better? This is the fat percentage for whole milk. Low fat milk is usually presented as “Reduced Fat, 2% Milk.”
Poor understanding of “probability.” 
There is no better presentation of the "probability" discussion than Stephen Jay Gould's essay, "The Median Isn't the Message." Gould had been diagnosed with cancer (abdominal mesothelioma) and was informed that he had "a median mortality of eight months." (Note: Gould lived another, prolific, 20 years.)
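A median of eight months says only that half of patients live longer than that; it says nothing about how much longer. A rough sketch, using a made-up list of survival times (these are not Gould's data or any real clinical data), shows how a long right tail can hide behind a short median:

```python
from statistics import mean, median

# Hypothetical survival times in months, invented for illustration.
# Most values are small, but a few patients live for many years.
survival_months = [2, 3, 4, 5, 6, 7, 8, 8, 9, 12, 18, 30, 60, 120, 240]

mid = median(survival_months)
avg = mean(survival_months)
longest = max(survival_months)
beyond_median = sum(1 for m in survival_months if m > mid)

print(f"Median survival:  {mid} months")
print(f"Mean survival:    {avg:.1f} months")
print(f"Longest survival: {longest} months")
print(f"Patients living past the median: {beyond_median} of {len(survival_months)}")
```

In this invented sample the median is eight months, yet the mean is several times higher and one patient lives 20 years. The median is a midpoint, not a deadline.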
Misunderstanding “correlation vs. causation.”  
Fact:  Children with larger feet score better on spelling tests. Interesting.  
Many observers have attempted to find “causality” in this data. Why do children with larger feet spell better?
Did A cause B? Do larger feet cause spelling skills to improve? Is there a nutritional cause? Something about shoes? Foot size is likely genetic; is there a genetic component? Nature over nurture?
Did B cause A? Perhaps skill at spelling causes hormonal changes? Could time spent on academic achievement increase hunger, food consumption and therefore foot size?
Another look at the data shows that children’s feet grow larger as they grow older. Overall, 8-year-olds have larger feet than 5-year-olds, and 15-year-olds have larger feet than 8-year-olds. Older children tend to spell better than their younger counterparts. Foot size and spelling are correlated - but one does not cause the other; age drives both.
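The foot-size example can be sketched as a small simulation. Everything here is an assumption for illustration: the ages, the formulas for foot length and spelling score, and the noise levels are invented, and in this toy model foot size and spelling are linked only through age.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical children aged 5 to 15, 200 per age.
# Both measurements depend on age plus independent random noise.
children = []
for age in range(5, 16):
    for _ in range(200):
        foot_cm = 15 + 0.8 * age + random.gauss(0, 1)
        spelling = 20 + 5 * age + random.gauss(0, 10)
        children.append((age, foot_cm, spelling))

# Correlation across all ages mixed together.
r_all = pearson([c[1] for c in children], [c[2] for c in children])

# Correlation among 8-year-olds only, which holds age constant.
eight = [c for c in children if c[0] == 8]
r_age8 = pearson([c[1] for c in eight], [c[2] for c in eight])

print(f"All ages together: r = {r_all:.2f}")
print(f"8-year-olds only:  r = {r_age8:.2f}")
```

Across all ages the correlation is strong. Within a single age group it should hover near zero, because in this model nothing but age connects the two measurements.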

Fact: As ice-cream consumption increases, so does the local incidence of drowning.
Ice-cream does not cause drowning. Nor do drowning incidents cause local residents to eat more ice-cream. As summer temperatures rise, people buy more ice-cream; they also spend more time in pools and the ocean.

Is More Better?
We have more data and statistics available than ever before, and it’s moving beyond our comprehension. Big data, by definition, is so complex that traditional data processing cannot capture, filter, or process it.
The real question:  What to do with all this data?
We can effectively argue that warm jackets and mittens sell better in cold weather. These statistics are not only correlated; we can likely show causality. This season’s cold weather yielded high sales levels for warm apparel and accessories. The weather, possibly, brought more customers into stores to purchase these items (weather that is too cold may keep customers home, or induce them to shop online). While in stores, these customers may have purchased items they would not have purchased otherwise.
As we cannot, yet, control the weather, and cold-weather apparel is manufactured and purchased 9 months ahead of the winter season… how are we to use this year’s weather data to make better purchasing decisions next year?
Regardless of the weather next year, some cold weather apparel will be sold. Study the choices consumers made this year, the competitive environment, visual and technical trends, and related statistics.  Build great products and present them in a compelling fashion.  Sometimes, you just have to outperform the other guy.

Statistics: to use them is perilous - to not use them is ignorant.  
For 2015, let's resolve to... ask smarter questions, design better studies, challenge results, study probability, understand correlation & causation, and weigh the risks and rewards of acting on statistical data and analysis.

Lies, damn lies, and statistics. In 2015: It's our job to tame, understand, and take advantage of this Mark Twain attribution.

Happy New Year.

(c) David J. Katz - Detroit, Michigan - December 28, 2014
