Let me tell you a story about William Sealy Gosset. William was a Chemistry and Math grad from Oxford University in the class of 1899 (they were partying like it was 1899 back then). After graduating, he took a job with the brewery of Arthur Guinness and Son, where he worked as a mathematician, trying to find the best yields of barley.
But this is where he ran into problems.
One of the most important assumptions in (most) statistical tests is that you have a large enough sample size to draw inferences from your data. You can’t say much if you only have 1 data point. 3? Maybe. 5? Possibly. Ideally, we want at least 20-30 observations, if not more. It’s why when a goalie in hockey, or a batter in baseball, has a great game, you chalk it up to being a fluke, rather than indicative of their skill. Small samples are much more likely to be affected by chance, and thus may not accurately reflect the underlying phenomenon you’re trying to measure. Gosset, on the other hand, couldn’t create 30+ batches of Guinness in order to do the statistics on them. He had a much smaller sample size, and thus “normal” statistical methods wouldn’t work.
Gosset wouldn’t take this for an answer. He started writing up his thoughts, and examining the error associated with his estimates. However, he ran into problems. His mentor, Karl Pearson, of Pearson Product Moment Correlation Coefficient fame, while supportive, didn’t really appreciate how important the findings were. In addition, Guinness had very strict policies on what their employees could publish, as they were worried about their competitors discovering their trade secrets. So Gosset did what any normal mathematician would.
He published under a pseudonym. In a startlingly rebellious gesture, Gosset published his work in Biometrika titled “The Probable Error of a Mean.” (See, statisticians can be badasses too). The name he used? Student. His paper for the Guinness company became one of the most important statistical discoveries of the day, and the Student’s T-distribution is now an essential part of any introductory statistics course.
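To get a feel for why this mattered, here is a rough simulation sketch (mine, not anything taken from Gosset’s paper). It assumes normally distributed data, a sample of only 4 observations, and a made-up helper called `coverage`. It checks how often a “95%” interval around the sample mean actually contains the true mean, first using the large-sample cutoff of 1.960 and then the Student’s t cutoff of 3.182 for 3 degrees of freedom (both standard table values).

```python
import random
import statistics


def coverage(n, critical_value, trials=20000, seed=1908):
    """Fraction of simulated samples whose interval
    mean +/- critical_value * (s / sqrt(n)) contains the true mean (0)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw a small sample from a standard normal (an assumption of this sketch).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = statistics.fmean(sample)
        # Standard error estimated from the sample itself, as in real life.
        sem = statistics.stdev(sample) / n ** 0.5
        if abs(mean) <= critical_value * sem:
            hits += 1
    return hits / trials


z_cov = coverage(4, 1.960)  # large-sample ("normal") cutoff
t_cov = coverage(4, 3.182)  # Student's t cutoff, 3 degrees of freedom

print(f"normal cutoff:  {z_cov:.3f}")
print(f"Student cutoff: {t_cov:.3f}")
```

With only 4 observations, the normal cutoff should fall well short of the advertised 95%, while the t cutoff should land close to it. That gap is the small-sample problem Gosset was wrestling with.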
In Canada, the top three causes of death for men are cancer (31.1%), heart disease (21.6%) and unintentional injuries (5.0%). The top two are the same for women, although with slightly different percentages: cancer and heart disease account for 28.5% and 19.7% of all deaths among women, with stroke (7.0%) coming in third. In the US, men die at an overall rate 1.4-times higher than women, of heart disease 1.6-times more, and are twice as likely to die from an unintentional injury.
In fact, women outlive men by 4.5 years on average worldwide (71.0 years vs 66.5 years). This difference increases to 7 years in the developed world. Not only are men more likely to die from the causes above, men are also more likely to commit suicide than women. This gender difference increased following the recession. A time trend analysis from the UK found that approximately 850 more men, and 155 more women, committed suicide than would have been expected based on historical trends following the 2008 economic downturn, with the highest increases in those regions that were most affected by rising unemployment.
But what leads to these outcomes? Given we live in a world where people can get help when they need it, why should men be dying at a rate that is that much higher than women for (almost) the same diseases? And why are they dying younger than women?
Anyone who follows my writing knows that I’m a big proponent of using stories to talk about science. We’ve discussed how you can use science fiction to teach science, zombies to talk about disease outbreaks, and my TEDx talk discussed using principles of storytelling in how we discuss science. So when I was asked to review (see disclaimer below) Dr Alexandra Levitt’s new book “Deadly Outbreaks: How Medical Detectives Save Lives Threatened by Killer Pandemics, Exotic Viruses and Drug-Resistant Parasites,” I jumped on the opportunity.
The CDC has a program known as the Epidemic Intelligence Service, where individuals trained in fields such as epidemiology, medicine, statistics and veterinary sciences come together to identify causes of diseases. For an overview of the EIS, check out this review of “Inside the Outbreaks” by Travis Saunders over at Obesity Panacea. The EIS was set up by Alexander Langmuir, who has been profiled on the blog, and their work has been instrumental in learning about, and thus containing, disease outbreaks all over the world. Dr Levitt is well positioned to speak on these issues, having worked at the CDC since 1995, although it should be noted that this was written in her free time, not as part of her position at the CDC.
Do those words scare you? If they do, you’re in good company. Mathematical anxiety is a well-studied phenomenon that manifests for a number of different reasons. It’s an issue I’ve talked about before at length, and something that frustrates me no end. In my opinion though, one of the biggest culprits behind this is how math alienates people. Let’s try an example:
If the average of three distinct positive integers is 22, what is the largest possible value of these three integers?
Too easy? How about this one:
The average of the integers 24, 6, 12, x and y is 11. What is the value of the sum x + y?
I do statistics regularly, and I find these tricky. Not because the underlying math is hard, or that they’re fundamentally “difficult,” but because you have to read the question 3 or 4 times just to figure out what they’re asking. This is exacerbated at higher levels, where you need to first understand the problem, and then understand the math.*
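For what it’s worth, once you decode the wording, both questions above come down to the same trick: an average times the number of items gives you the total. Here is a quick sketch of that arithmetic. The variable names are my own, and for the first question I’m assuming it is asking for the largest value any one of the three integers could take.

```python
# Question 1: three distinct positive integers with an average of 22.
total_q1 = 3 * 22  # the three integers must add up to 66
# To make one integer as large as possible, make the other two as small
# as possible while keeping them distinct and positive: 1 and 2.
largest = total_q1 - (1 + 2)

# Brute-force check over every pair of smaller distinct positive integers.
brute_force_largest = max(
    total_q1 - a - b
    for a in range(1, total_q1)
    for b in range(a + 1, total_q1)
    if total_q1 - a - b > b  # third value is distinct and the largest
)

# Question 2: the integers 24, 6, 12, x and y have an average of 11.
total_q2 = 5 * 11  # all five integers must add up to 55
x_plus_y = total_q2 - (24 + 6 + 12)

print(largest)              # 63
print(brute_force_largest)  # 63
print(x_plus_y)             # 13
```

Neither answer needs anything beyond multiplication and subtraction. The hard part is working out what the question wants.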
Last week, my colleague Cristina Russo discussed how sports can be used to teach biology. Today I’m going to discuss a personal example, and how I use sports to explain statistics.
One of my main objectives as a statistics instructor is to take “fear” out of the equation (math joke!), and make my students comfortable with the underlying mathematical concepts. I’m not looking for everyone to become a statistician, but I do want them to be able to understand statistics in everyday life. Once they have mastered the underlying concepts, we can then apply them to new situations. Given that most of my students are athletically minded or have a basic understanding of sports, this is a logical and reasonable place to start.