
I'm seeing this image passed around a lot.

I have a gut-feeling that the information provided this way is somehow incomplete or even erroneous, but I'm not well versed enough in statistics to respond. It makes me think of this xkcd comic, that even with solid historical data, certain situations can change how things can be predicted.


Is this chart as presented useful for accurately showing what the threat level from refugees is? Is there necessary statistical context that makes this chart more or less useful?


Note: Try to keep it in layman's terms :)

12  
This is a good question, but note that such questions cannot logically be answered without specifying useful for what? – gung 19 hours ago
10  
This chart is propaganda. To see why, try to answer questions such as "over what period of time?" and "chance to whom?" and "to which population do these statistics apply?" Then consider just how specific some of the events are (which is one way to make the chances seem extraordinarily small). Why not "chance of being killed by a white teenager from the American south who lives with a single mother?" – whuber 17 hours ago
3  
@LCIII threat level to whom? – AdamO 16 hours ago
6  
That plot seems to show unconditional probabilities. Presumably, the probability of being killed by a refugee in a terrorist attack would be even lower for an astronaut while in space. By the same token however, an astronaut in space is also at lower risk of lightning strikes, though probably at higher risk of being killed by vending machines, if the definition of vending machine is expanded to automatic food delivery systems generally. – generic_user 16 hours ago
11  
Though it makes a compelling case for legislation banning lightning. Particularly lightning from countries with whom our president doesn't have business dealings. – generic_user 16 hours ago

Imagine your job is to forecast the number of Americans that will die from various causes next year.

A reasonable place to start your analysis might be the National Vital Statistics Data final death data for 2014. The assumption is that 2017 might look roughly like 2014. You'll find that approximately 2,626,000 Americans died in 2014:

  • 614,000 died of heart disease.
  • 592,000 died of cancer.
  • 147,000 from respiratory disease.
  • 136,000 from accidents.
  • ...
  • 42,773 from suicide.
  • 42,032 from accidental poisoning (subset of accidents category).
  • 15,809 from homicide.
  • 18 from terrorists (Based on data in the linked report. See link for definitions.)
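To get a feel for the orders of magnitude, here is a rough sketch that turns the counts listed above into shares of all deaths and into ratios against the terrorism count. The numbers are the ones quoted in the list; the function names are just illustrative. The list is not exhaustive and accidental poisoning is a subset of accidents, so the shares are not meant to sum to 100%.

```python
# Approximate 2014 U.S. death counts, as quoted in the list above.
total_deaths = 2_626_000
deaths_by_cause = {
    "heart disease": 614_000,
    "cancer": 592_000,
    "respiratory disease": 147_000,
    "accidents": 136_000,
    "suicide": 42_773,
    "accidental poisoning": 42_032,  # subset of the accidents category
    "homicide": 15_809,
    "terrorism": 18,
}


def share_of_total(count, total=total_deaths):
    """Fraction of all deaths attributed to one cause."""
    return count / total


def ratio_to_terrorism(cause):
    """Number of deaths from `cause` per single death from terrorism."""
    return deaths_by_cause[cause] / deaths_by_cause["terrorism"]


for cause, count in deaths_by_cause.items():
    print(
        f"{cause:>22}: {count:>9,} "
        f"({share_of_total(count):.4%} of all deaths, "
        f"{ratio_to_terrorism(cause):,.0f}x terrorism)"
    )
```

On these figures, homicide alone is several hundred times the terrorism count, and heart disease is tens of thousands of times larger.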

Terrorist incidents in the U.S. are quite rare, so estimating off a single year is going to be problematic. Looking at the time-series, what you see is that the vast majority of U.S. terrorism fatalities came during the 9/11 attacks (See this report from the National Consortium for the Study of Terrorism and Responses to Terrorism.) I've copied their Figure 1 below:

 National Consortium for the Study of Terrorism and Responses to Terrorism, "American Deaths in Terrorist Attacks Fact Sheet"

Immediately you see that you have an outlier, rare events problem. A single outlier is driving the overall number. If you're trying to forecast deaths from terrorism, there are endless issues:

  • What counts as terrorism? (Terrorism can be defined broadly or narrowly.)
  • Is the process stationary? If we take a time-series average, what are we estimating?
  • Changing conditions? What does a forecast conditional on current conditions look like?
  • If the vast majority of deaths come from a single outlier, how do you reasonably model that?

IMHO, the FT graphic picked an overly narrow definition (the 9/11 attacks don't show up in the graphic because the attackers weren't refugees). There are legitimate issues with the chart, but the FT's broader point is correct that terrorism in the U.S. is quite rare.

Life expectancy in the U.S. is about 78.7 years. What has moved life expectancy numbers down in the past has been events like the 1918 Spanish flu pandemic or WWII. Additional risks to life expectancy now might include obesity and opioid abuse.

If you're trying to create a detailed estimate of terrorism risk, there are huge statistical issues, but to understand the big picture requires not so much statistics as understanding orders of magnitude and basic quantitative literacy.

What I would instead be concerned about... (perhaps veering off topic)

Looking back at history, the way huge numbers of people get killed is through disease, genocide, and war. I'd be concerned about some rare terrorist event which triggers something catastrophic (e.g. how the assassination of Archduke Ferdinand helped set off WWI). Or I'd worry about nuclear weapons in the hands of someone crazy.

Thinking about extremely rare but catastrophic events is incredibly difficult. It's a multidisciplinary pursuit and goes far outside of statistics.

Perhaps the only statistical point here is that it's hard to estimate the probability and effects of some event which hasn't happened? (Except to say that it can't be that common or it would have happened already.)
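The "it can't be that common" part can be made slightly more concrete. As a minimal sketch, assuming each period is an independent trial with the same unknown probability p (a strong assumption that the stationarity concerns above call into question), seeing zero events in n trials only rules out values of p for which zero events would have been very unlikely. The function name and the values of n below are purely illustrative.

```python
def upper_bound_zero_events(n_trials, confidence=0.95):
    """Largest per-trial probability p still compatible with zero events.

    Assumes n independent trials with a constant probability p and solves
    (1 - p) ** n_trials = 1 - confidence for p.
    """
    alpha = 1 - confidence
    return 1 - alpha ** (1 / n_trials)


# Illustrative trial counts only; compare with the "rule of three" shortcut 3/n.
for n in (10, 100, 1000):
    print(n, round(upper_bound_zero_events(n), 4), round(3 / n, 4))
```

The bound shrinks only in proportion to 1/n, so a short history with no events says little about events that are rare but catastrophic.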

1  
Very informative, as always (+1). In the same vein of background information I like to keep in mind that the site of the very first confirmed outbreak of the so-called Spanish flu was at Camp Funston, within Fort Riley in Kansas, USA. :-) – Antoni Parellada 16 hours ago
1  
@AntoniParellada I remember the very first time I saw a long time series of period life expectancy (eg. something like this) and how I was shocked by the 1918 flu pandemic. – Matthew Gunn 16 hours ago
6  
Bruce Schneier was interviewed after 9/11, and asked what he recommended New Yorkers do to keep safe in light of the attack. His answer? "Seatbelts." – Cort Ammon 8 hours ago

Agree with gung. The chart is very generalised, especially when expressed as a percent per billion people. That said, it does still provide some useful information: historically and worldwide, refugees don't kill many people in terror attacks. One way in which the information would be more useful is if we were to make use of other prior knowledge:

I imagine the probabilities vary greatly by geographic location. There may be more terrorist attacks by refugees in the Middle East than in the US, given the number of refugees currently in that area. Similarly, we know that there are more lightning strikes in Venezuela, where the Catatumbo River meets Lake Maracaibo (apparently the most lightning-struck place on earth), than in the south of the United Kingdom. It's often better to express the probability of something happening given some other piece of information, which is known as conditional probability (the probability of X occurring given we know Y); otherwise the wrong conclusions can be drawn.


1  
"There are probably more terrorist attacks by refugees in the Middle East than the US" - Sorry, but I don't even remotely see how you could argue that without evidence. – Firebug 16 hours ago
    
Sorry yes just trying to make a point about prior knowledge. Have amended my answer. – MorganBall 16 hours ago
1  
Who uses percent per million? – Carsten S 15 hours ago

Your intuition is correct that the statistic above doesn't tell the whole story. Yes, past refugee terrorist behaviour isn't necessarily a good indicator of future refugee terrorist behaviour, but that isn't the problem. The problem is that even one or two large-scale terrorist attacks would be awful, and statistics isn't appropriate for dealing with such small numbers of things. If we only consider mass murder refugee terror attacks, there have never been any, at least not in the past fifteen years. The figure in the graph comes from the fact there were 3 murders by refugees in terror attacks since 1975[1], which is essentially zero compared to the terror attacks everyone is scared of. But on the basis of statistics alone, that data isn't enough to rule out the chance that a huge terror attack is coming. We can't say "the threat is low", because statistics can't tell us if the threat is low enough.

First off, imagine how upset you would be if a refugee committed a horrible terrorist attack that killed 100 innocent people. Now imagine how much you would hate the idea of a 20% chance of that happening. We would want to put safeguards into the refugee program. Now, let's look at the statistics. How can you prove the probability of a refugee terrorist attack next year is less than 20%? Well, the probability of a terrorist attack went up after the awful September 11 attacks, so you could say there haven't been any attacks in 15 years. If the probability of an attack was 20% per year, then no attacks in 15 years would be pretty unlikely (p=3.5%). So we can be pretty certain the chance of a major terrorist attack next year by a refugee is less than 20%. But now suppose we're interested in the chance of a terrorist attack within five years. Then we only have three independent samples since 2001 (2001-2006, 2006-2011, and 2011-2016). We can't say with any confidence that the probability of a terrorist attack within five years is less than 20%. And a 20% risk within 5 years of a tragic terrorist attack by a refugee where 100 people die is awful, and would certainly justify changes to the refugee program if it were real. And even a 1% chance of an attack in ten years would be awful.
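The arithmetic behind those two statements can be checked with a short sketch. It assumes, as the argument above does, that the periods are independent and that the attack probability is the same in each one; the function name is only illustrative.

```python
def prob_no_attacks(p_per_period, n_periods):
    """Probability of seeing zero attacks in n independent periods,
    each with the same attack probability p_per_period."""
    return (1 - p_per_period) ** n_periods


# 15 one-year periods at a 20% yearly chance: the p=3.5% quoted above.
yearly = prob_no_attacks(0.20, 15)

# Only 3 five-year periods at a 20% five-year chance.
five_year = prob_no_attacks(0.20, 3)

print(f"no attacks in 15 years:        {yearly:.1%}")
print(f"no attacks in 3 five-year runs: {five_year:.1%}")
```

With only three samples, seeing no attacks is about as likely as not even if the five-year risk really were 20%, which is why the data cannot rule it out.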

But notice that the logic I used there could be applied to anything where the odds of it changed 15 years ago. We failed to disprove a threat of something happening, based only off the fact that it happened zero times. It would be equally hard to say the threat of a terrorist attack by a refugee named Tim was less than 20% in five years, but it wouldn't make sense to stop all refugees named Tim. It would be equally hard to prove the threat of a terrorist attack by an astronaut was less than 20% in five years, but you don't see anyone saying we should stop letting astronauts in. Statistics is the wrong approach to dealing with extreme events. If you want to be certain that something won't happen once in ten years, you can't use past experience to convince yourself of the fact. That's why it's wrong to say the graph shows the terrorism threat from refugees is very low. If we're only using history and a single major incident is unacceptable, the no threat is very low.

    
even a 1% chance of an attack in ten years would be awful, that's an expectation value of one casualty per ten years, in a country with 300 million+ inhabitants, which is very close to the statistic described by the Financial Times. And what do you mean by your final sentence the no threat is very low? – gerrit 17 mins ago
