Terrorism Datamining

I’m going to try to explain my position on data mining, data collection and profiling. The government wants to collect lots of data on all of us to help fight terrorism. This won’t work. The amount of data they want to collect is immense. They want to find terrorists by using sophisticated computer programs to mine this data for connections.

Problem 1
If I were a terrorist I wouldn’t use social media for communication. I would use face-to-face meetings, dead letter drops and pay-as-you-go (PAYG) mobiles, using each one for a week and then dumping it. I would never directly contact anyone in charge and all organisation would be off the internet. It’s no use searching these data for patterns. Terrorists don’t use communication methods we can tap into.

Problem 2
The laws of probability. Let us suppose that the software can recognise a terrorist, by using their behaviour traits, 99% of the time and that it incorrectly states an innocent person is a terrorist 1% of the time. Let’s suppose we are monitoring a million people and that there are 100 terrorists in that population. That rate would extrapolate to around 7000 terrorists in the UK. It seems reasonable to assume that there are that many people within our borders willing to cause damage to the state.

The Sums

Of the 100 terrorists, the software finds 99%, so 99 terrorists are taken into custody. 1 terrorist remains at large.

The software also falsely accuses 1% of the remaining 999,900 innocent people, as it is only correct 99% of the time. This means that 9999 honest members of the public are wrongly accused of being terrorists.

So we have taken 10098 people into custody. The chance of you being innocent, given that you have been taken into custody, is 9999/10098, which is about 99%.

The chance of you being guilty if taken into custody is 99/10098, or about 1%.

This is a bad situation.
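
The sums above are easy to check by machine. Here is a minimal Python sketch, assuming the figures used in this post (a million people, 100 terrorists, a 99% hit rate and a 1% false alarm rate). The function name screening_outcome and its parameter names are my own invention for illustration, not part of any real screening software.

```python
def screening_outcome(population, terrorists, hit_rate, false_alarm_rate):
    """Return (terrorists caught, innocents accused, P(guilty | detained))."""
    innocents = population - terrorists
    # round() turns the expected counts into whole people
    caught = round(terrorists * hit_rate)
    wrongly_accused = round(innocents * false_alarm_rate)
    detained = caught + wrongly_accused
    return caught, wrongly_accused, caught / detained


# Figures assumed in the text: 1,000,000 people, 100 terrorists, 99% / 1%
caught, wrongly_accused, p_guilty = screening_outcome(1_000_000, 100, 0.99, 0.01)

print("terrorists caught:", caught)
print("innocents accused:", wrongly_accused)
print(f"chance a detainee is guilty: {p_guilty:.1%}")
```

Change the numbers and run it again: the false alarms swamp the real hits as long as terrorists are rare in the population.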

Data mining isn’t going to work. Neither is profiling or algorithms about people’s behaviour. The numbers don’t make sense. This measure won’t work.

To give you an idea of the scope of this problem I will now expand this to those people who enter this country each year. Let’s assume that the domestic population of the UK is perfectly happy with the State and not partaking in naughty behaviour (a false assumption). According to this BBC report there are 200 million people who enter and leave the country each year. So that means 100 million coming into the country each year. This profiling method assumes we have access to all of the data about every person who enters the country. Let’s look at the numbers and just for niceness let’s increase our software detection accuracy to 99.9%.

99.9% chance of correctly flagging a terrorist.
0.1% chance of wrongly flagging an innocent person.

Let’s also increase the number of terrorists to 10,000 (although this seems remarkably high to me).

9990 terrorists will be correctly detained. This means that 10 will go free.

Of the 99,990,000 innocent travellers, 0.1% will be wrongly flagged, so 99990 innocent people will be detained.

The probability of a detained person being guilty is 9990/109980, or about 9%. This is slightly better but still not a system I want to be a part of.
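
The same figure drops out of Bayes’ theorem if you work with rates rather than head counts. This is a rough sketch under the assumptions in this post (10,000 terrorists among 100 million arrivals, 99.9% hit rate, 0.1% false alarm rate). The function and variable names are hypothetical, and I have used exact fractions so that no rounding creeps in.

```python
from fractions import Fraction


def p_guilty_given_detained(prevalence, hit_rate, false_alarm_rate):
    """Bayes' theorem: P(terrorist | flagged by the software)."""
    true_positives = hit_rate * prevalence
    false_positives = false_alarm_rate * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Assumed in the text: 10,000 terrorists among 100,000,000 arrivals
prevalence = Fraction(10_000, 100_000_000)
p = p_guilty_given_detained(prevalence, Fraction(999, 1000), Fraction(1, 1000))

print(p)                    # exact fraction
print(f"{float(p):.1%}")    # as a percentage, roughly 9%
```

The prevalence is what does the damage: with only one arrival in ten thousand being a terrorist, even a 0.1% false alarm rate produces about ten innocent detainees for every guilty one.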

 

Enjoy your winter break.

 

Normalizing

I wish to register a complaint.

Firstly, the spelling of the title of this communication is deliberately wrong: I’ve used the American English version. Secondly, where is this going?

I like watching Hawaii Five-O. I like seeing the wonderful Hawaiian scenery. I also quite like the characters. I’ve been watching for five seasons now and I still enjoy it. OK, the technology is purely comical and the team can hack into any CCTV system and use facial recognition etc. The plots are far-fetched and they have jumped the shark many, many times. In fact I think I have tweeted about that before.

Now I have a problem with this and many other films and TV series which include dubious science. Having this level of nonsense in the public domain creates the impression that some techniques are essential or that they actually work. I do understand that 5-O is a TV show and that they must get their man. I honestly do get that. I also understand that a major plot device is that they aren’t the police but a “special taskforce” and so they can blow shit up without too much hassle. Much as Inspector Morse gave the impression that Oxford was full of murderers, 5-O gives the impression that violent crime is rampant in the USA’s 4th smallest state [by land area].

Another thing that this TV show pushes into the public perception of acceptability is torture. Suspects are often beaten and are left chained to a chair in a small room to consider their options. I am pretty sure that this helps spread the public perception that torture works. There are many films where the bad people are tortured and then the world is saved. This is all good for governments who indulge in poor, unscientific behaviour. I would suggest you follow the links in these articles to see what I mean.

Ultimately, torture doesn’t work. It’s illegal. It won’t help you get the information you want. It makes you barbaric. For a good discussion about stopping terrorism, I suggest you listen to this episode of Freakonomics. I still watch 5-O. But at the same time I like to think that I can ignore the bullshit aspects of it and use the TV show as 45 minutes of escape and relaxation [apart from the bloody Halloween specials that they do; I hate those shows].

While I’m on terrorism here’s why data-mining doesn’t work.

The Extreme Rule – A Gaussian Explanation

Why do nutters make the headlines? Why is it reportable when a 13-year-old kid decides to go and fight for Islamic State? There are plenty of people who have decided to go and fight, so it shouldn’t surprise us that one of them turns out to be a youngster. It WASN’T news. People lied about their ages and signed up for WW1; that was seen as patriotic [different issue though].

The crazy 5% of our population seems to inhabit 95% of our news time and concern. This is pathetic. The news reporting for the Islamic killer kid should have been more realistic, along the lines of “This kid has decided to go to Syria and fight for a cause he believes in but there are approximately twenty million children who haven’t”.

A crazy person drives down the wrong carriageway on the motorway: it happens, and it makes the news. The reporting is never “out of 200 million journeys this has happened only once”.

Here’s a diagram to show what I mean [with apologies to Carl Gauss].

[Figure: Gauss Again]

The continuum in the top graph is meant to describe how there are stupid people who do stupid stuff and there are nutters who know what they are doing but do it anyway. 95% of the population are somewhere in the middle.

The media seems obsessed with crazy or stupid people who, by definition, do crazy or stupid stuff. This is made even more evident by the amount of this shit on the internet. There is never a realistic version explaining that actually this is a rare event and we shouldn’t be bothered by it.

Humans have a very poor “risk calculating” ability. We are unable to understand probability correctly, partly because it actually gets quite hard and complicated and partly because we are shielded from the correct interpretation of risk by a culpable media.

Some things we don’t seem to understand:

  • Flying is safer than crossing the road
  • MMR is safe
  • The difference between relative risk and absolute risk
  • Crime is falling
  • Education is a good thing [the stupid are celebrated]
  • Weather is not climate

A final hurrah before I go and do some ironing:

Q: If you have twenty people, how many different eleven-player soccer teams could you create from those people (if you specified which position each of them played in)?

A: 6,704,425,728,000

Wow, that’s a lot, you should say.

Q: If you have twenty people, how many different soccer teams could you create from those people?

A: 167,960

Q: Suppose you were the captain. How many different teams could you create from the remaining nineteen people?

A: 92,378

See, you are useless at numbers and probability.
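
If you don’t believe the three answers above, Python’s standard math module will count them for you. This is a small sketch, assuming eleven players per team and Python 3.8 or later (which is when math.perm and math.comb arrived). The variable names are mine.

```python
import math

PEOPLE = 20
TEAM_SIZE = 11  # assumed: a standard eleven-a-side soccer team

# Positions matter: ordered selections (permutations) of 11 from 20
with_positions = math.perm(PEOPLE, TEAM_SIZE)

# Positions don't matter: unordered selections (combinations) of 11 from 20
without_positions = math.comb(PEOPLE, TEAM_SIZE)

# You are the captain, so pick the other 10 from the remaining 19
with_captain = math.comb(PEOPLE - 1, TEAM_SIZE - 1)

print(f"{with_positions:,}")     # 6,704,425,728,000
print(f"{without_positions:,}")  # 167,960
print(f"{with_captain:,}")       # 92,378
```

The gap between the first two answers is a factor of 11!, the number of ways of shuffling eleven players around eleven positions.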