I’m going to try to explain my position on data mining, data collection and profiling. The government wants to collect lots of data on all of us to help fight terrorism. This won’t work. The amount of data they want to collect is immense, and the plan is to find terrorists by running sophisticated computer programs over it and mining it for connections.
Problem 1
If I were a terrorist I wouldn’t use social media for communication. I would use face-to-face meetings, dead letter drops and pay-as-you-go (PAYG) mobiles, using each one for a week and then dumping it. I would never directly contact anyone in charge, and all organisation would be kept off the internet. It’s no use searching these data for patterns if the people you are looking for don’t use communication methods we can tap into.
Problem 2
The laws of probability. Let us suppose that the software can recognise a terrorist from their behaviour traits 99% of the time, and that it incorrectly states an innocent person is a terrorist 1% of the time. Let’s suppose we are monitoring a million people and that there are 100 terrorists in that population. Scaled up to the whole of the UK, that would be around 7000 terrorists. It seems reasonable to suppose that there are that many people within our borders willing to cause damage to the state.
The Sums
Of the 100 terrorists the software finds 99% of them so 99 terrorists are taken into custody. 1 terrorist remains at large.
The software also falsely accuses 1% of the innocent population, as it is only correct 99% of the time. There are 999,900 innocent people among the million, so 9999 honest members of the public are wrongly accused of being terrorists.
So we have taken 10098 people into custody. The chance of you being innocent if taken into custody is 9999/10098, which is about 99%.
The chance of you being guilty if taken into custody is 99/10098, or about 1%.
This is a bad situation.
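If you want to check the sums yourself, here is a minimal sketch in Python. The function name screening_outcome and its parameter names are my own invention for illustration, and it simply assumes that everybody in the population gets screened once.

```python
def screening_outcome(population, terrorists, detection_rate, false_positive_rate):
    """Count who ends up in custody when everyone in `population` is screened.

    All names here are hypothetical; the rates are the ones assumed in the text.
    """
    innocents = population - terrorists
    caught = round(terrorists * detection_rate)               # real terrorists flagged
    wrongly_flagged = round(innocents * false_positive_rate)  # innocent people flagged
    detained = caught + wrongly_flagged
    return {
        "caught": caught,
        "missed": terrorists - caught,
        "wrongly_flagged": wrongly_flagged,
        "detained": detained,
        "p_guilty_if_detained": caught / detained,
    }


# One million people monitored, 100 terrorists, 99% detection, 1% false accusations.
result = screening_outcome(1_000_000, 100, 0.99, 0.01)
print(result)
```

The guilty fraction comes out at just under 1%, because the 9999 false accusations swamp the 99 correct ones.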
Data mining isn’t going to work. Neither is profiling or algorithms about people’s behaviour. The numbers don’t make sense. This measure won’t work.
To give you an idea of the scope of this problem I will now expand this to those people who enter this country each year. Let’s assume that the domestic population of the UK is perfectly happy with the State and not partaking in naughty behaviour (a false assumption). According to this BBC report there are 200 million people who enter and leave the country each year. If half of those journeys are arrivals, that means 100 million people coming into the country each year. This profiling method assumes we have access to all of the data about every person who enters the country. Let’s look at the numbers and, just to be generous, let’s increase our software detection accuracy to 99.9%.
99.9% chance of correctly flagging a terrorist.
0.1% chance of wrongly flagging an innocent person.
Let’s also increase the number of terrorists to 10,000 (although this seems remarkably high to me).
9990 terrorists will be correctly detained. This means that 10 will go free.
99990 innocent people will be detained (0.1% of the 99,990,000 innocent travellers).
The probability of a detained person being guilty is 9990/109980, or about 9%. This is slightly better but not a system I want to be a part of.
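The same answer drops out of Bayes’ theorem if you work with rates instead of head counts. This is only a sketch: the function name p_guilty_given_flagged and its parameters are hypothetical names of mine, and the figures are the assumed ones from the two scenarios in this post.

```python
def p_guilty_given_flagged(base_rate, detection_rate, false_positive_rate):
    """Bayes' theorem: probability that a flagged person really is a terrorist.

    base_rate is the assumed fraction of terrorists among the people screened.
    """
    flagged_and_guilty = base_rate * detection_rate
    flagged_and_innocent = (1 - base_rate) * false_positive_rate
    return flagged_and_guilty / (flagged_and_guilty + flagged_and_innocent)


# Scenario 1: 100 terrorists in a million people, 99% accurate software.
domestic = p_guilty_given_flagged(100 / 1_000_000, 0.99, 0.01)

# Scenario 2: 10,000 terrorists among 100 million arrivals, 99.9% accurate software.
border = p_guilty_given_flagged(10_000 / 100_000_000, 0.999, 0.001)

print(f"domestic: {domestic:.2%}, border: {border:.2%}")
```

Written this way it is easier to see why the result is so poor: the base rate is tiny in both scenarios, so even a very small false accusation rate applied to everybody else produces far more innocent detainees than guilty ones.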
Enjoy your winter break.