Friday, September 28, 2007

When you sign up for a credit card there is an awful lot of small print on the back. No one reads it all (alright I do, but I am an exception). Basically you trust the bank and the local legal system that there is not going to be anything horrendously unreasonable there. There are a couple of bits people do read though. Those are the bits with tick boxes next to them. You have to read these because they are asking for a decision. You either tick or you don’t; it is a call to action. One of the things they pretty much have to ask you is whether they and their partners can send you stuff through the post.

I generally tick the “don’t send” box. This is because what they send me is generally irrelevant junk and a real waste of trees. In a way this is surprising because my bank has a huge amount of information about me. I mentioned in a previous post that I felt there were limits around what could and should be done with this. I also said I would come back to the topic. The limits are really on a couple of things:

1) Who should use it.
2) What it should be used for.

My thoughts on this are as follows...

Who should use it?

Firstly, only people I would feel comfortable with should use this data. Secondly, only people whom I, as the person who uses the card, have allowed to use it should have access to the data. Thirdly, only people I could reasonably expect to have access to the data should have access to it.

Basically I do not want any unpleasant surprises. I do not want to find that the shopkeepers in the local stores have been getting together and gossiping about me, using my credit card details to make sure they are all talking about the same person.

Clearly my credit card provider can use it to provide the credit card. Moreover, if I have said they can send me special offers from partner companies then they can use it to select the offers. I am not sure that I am comfortable with them giving data to those specific companies in order to do this, so if they want to do that then it is best if they contact me to ask me first. I do not mind if they employ people to do the work on it, providing those people are bound to confidentiality. By extension I do not mind if they use companies to do the work providing the companies are similarly bound.

What should it be used for?

Firstly it should never be used for anything that will in any way harm me. Secondly it should only be used for things that will in some way benefit me. Thirdly it should probably not be used for anything that jars with my expectations. If it is something that makes me think "They are doing what???" when I hear it then someone is a bit close to the edge and it might be better to set my expectations, gauge my reactions and maybe even ask me first.

Clearly the data can be used for providing the credit card service. If I have signed up to receive offers, then I would feel it can be used to target those offers. I also feel that it can be used to improve the offerings of the credit card company. I feel that insights from the data could be used to improve offerings from partners of the credit card company. As I do not feel that the partners should have the actual data then I guess it follows that the actual data cannot be used (except as described below) to improve the offerings of the partner companies.

The exception to that is that the credit card company (or their agents) can use the data to produce the insights. So the partner company is not using the data to improve its offering, but in some sense the data is being used, indirectly, to do just that.

Where does all this lead?

I guess the next step is to ask what happens if a single company acts as an agent for multiple parties, taking data from each of them and then providing insight to each but not passing the data back to any of them. At first glance this seems reasonable providing the information providers/insight users are partners. It is probably best if they put the fact of the partnership somewhere where I can find out if I want to look it up, but I have not thought about this enough to be sure whether that is necessary. This is therefore, probably, a good point to finish for now. More on this topic later...

Rufus Evison


Tuesday, September 11, 2007

DNA databases: sending innocent people to prison

Mainstream Privacy and Freedom

Have you ever come across the birthday party problem? The question is how many people do you need to bring together to have a better than 50% chance of two people sharing the same birthday? Clearly if two people match then their birthday parties are likely to conflict and problems may ensue. The probability of a given birthday is assumed to be even throughout the (non-leap) year (1:365). So how many people? The answer is that you only need 23 people (probability 0.5073).

Now imagine that you have a much longer year, say a billion days. What is the number required for a match? To be honest I do not know. I know the way to calculate it, but the numbers get very big very fast. Yes they then divide back down to small numbers, but calculating them is difficult.
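The big numbers can be sidestepped by adding up logarithms instead of multiplying factorials. What follows is only a minimal sketch, assuming every "day" is equally likely and that people are independent of each other; the function names are made up for illustration. On those assumptions the billion-day year needs a little over 37,000 people, which fits the usual rule of thumb of roughly 1.18 times the square root of the number of days.

```python
import math


def collision_probability(people: int, days: int) -> float:
    """Chance that at least two of `people` share one of `days` equally likely days."""
    if people > days:
        return 1.0
    # log P(no shared day) = sum of log(1 - k/days) for k = 0 .. people-1
    log_no_match = sum(math.log1p(-k / days) for k in range(people))
    return -math.expm1(log_no_match)  # 1 - P(no shared day)


def people_needed(days: int, target: float = 0.5) -> int:
    """Smallest group size whose chance of a shared day is greater than `target`."""
    log_target = math.log1p(-target)  # log(1 - target)
    log_no_match = 0.0                # one person alone can never match
    people = 1
    while log_no_match >= log_target:
        # Adding one more person multiplies P(no match) by (1 - people/days).
        log_no_match += math.log1p(-people / days)
        people += 1
    return people


if __name__ == "__main__":
    print(people_needed(365))
    print(round(collision_probability(23, 365), 4))
    print(people_needed(1_000_000_000))
```

Working in logs keeps every intermediate value small, so nothing has to be computed as a huge number and then divided back down.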

What I do have though is a history of testing something similar. In my role as CTO of Clickstream we were dealing with unique IDs. These were created using a (pseudo) random number between +2bn and -2bn. This gives us a 'year' of 4 billion days. Not only that, but we gave each person another user ID calculated in quite a different manner as an independent check. We found that with a population of 100,000 we were getting on the order of 20-40 matches. Now this could have been because the numbers were not truly random, or it could have been the combinatorics as in the birthday party problem. In DNA database matching, a false match of this kind is the equivalent of prosecuting an innocent person.
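As a rough cross-check on the combinatorics, the expected number of matching pairs can be estimated directly. This is a sketch under stated assumptions, not a reconstruction of the Clickstream system: it assumes IDs drawn uniformly and independently from 4 billion values, and the function names are hypothetical. On those idealised assumptions 100,000 people would be expected to produce only about one matching pair, so a count of 20-40 would lean towards the "not truly random" explanation.

```python
from collections import Counter
from typing import Iterable


def expected_matching_pairs(population: int, id_space: int) -> float:
    """Expected number of pairs sharing an ID, assuming uniform independent IDs."""
    pairs = population * (population - 1) / 2  # number of distinct pairs of people
    return pairs / id_space                    # each pair matches with chance 1/id_space


def count_matching_pairs(ids: Iterable[int]) -> int:
    """Count the pairs that share an ID in an actual list of issued IDs."""
    counts = Counter(ids)
    # An ID held by c people contributes c * (c - 1) / 2 matching pairs.
    return sum(c * (c - 1) // 2 for c in counts.values())


if __name__ == "__main__":
    print(expected_matching_pairs(100_000, 4_000_000_000))
    print(count_matching_pairs([1, 2, 2, 3, 3, 3]))
```

Running `count_matching_pairs` over real IDs and comparing the result with `expected_matching_pairs` is one way to see how far a generator strays from the uniform ideal.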

So why am I writing this blog and ranting? Because DNA testing is a bit like IDing.

The population is not truly random.
The probability of an individual match is (we are told) about 1:1bn.
We will get false positives (though with DNA it could lead to innocents in prison).

It is important to bear in mind that the real reason that this matters is that a false positive may equal a wrongful conviction. With a much smaller sample we were seeing about 20-40 ‘wrongful convictions’. The government is now talking about creating a database that is big enough to put thousands of people in prison for no reason.

All of this is without taking into account the flaws in the way the system is run. We are assuming people do things right every time. The only time this has been audited (that I have found) the chances of a false match turned out to be 1:100 rather than 1:1bn. This was due to experimental error that we are assured cannot take place in real life, but I am not sure I trust these assurances.

A probability of 1:100 would mean that the bulk of prosecutions were actually of innocent people. If this doesn’t worry you it should, as there is no reason that you should not be one of the innocent once the drive to gather DNA every time we fly comes in. Forget about whether this is privacy intrusive, forget about whether it is moral, I wish someone would address the question of whether it works and makes sense.
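The arithmetic behind that claim can be sketched as follows. Everything here is an illustrative assumption rather than a real figure: one crime-scene sample is compared against every profile in a database, exactly one true culprit is in there, and the database size of 1,000 is made up to keep the numbers small. The function names are hypothetical too.

```python
def expected_false_matches(database_size: int, false_match_probability: float) -> float:
    """Expected number of innocent profiles that match a single sample."""
    return database_size * false_match_probability


def innocent_share(database_size: int,
                   false_match_probability: float,
                   true_matches: float = 1.0) -> float:
    """Expected fraction of all matches that point at an innocent person."""
    false_matches = expected_false_matches(database_size, false_match_probability)
    return false_matches / (false_matches + true_matches)


if __name__ == "__main__":
    # Audited error rate of 1:100 against a hypothetical 1,000-profile database.
    print(innocent_share(1_000, 1 / 100))
    # Claimed rate of 1:1bn against the same hypothetical database.
    print(innocent_share(1_000, 1e-9))
```

With a 1:100 error rate even this tiny database yields about ten innocent matches for every guilty one, and the share of innocent matches only grows as the database does.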

Wednesday, September 05, 2007

It is starting to work

SideLine Article: Web Optimisation and Promotion

I said I would start to see if I could move my blog from an audience of one. I started, as one must, by tracking site usage. I had, as expected, myself as the only audience. I have since had a very little play and the audience has grown tenfold. Not a big growth, and unless they happen to be relevant people it will not be a sustained growth, but it is heartening that a tiny change with no time spent can move something from invisible to available for people to find. I am kind of busy right now, but will see if these minimal efforts can over time provide some real and sustained growth.

Rufus Evison

Monday, September 03, 2007

Data: It’s all about responsible use

Mainstream Article, Privacy and Data Use

I have been involved in web monitoring practically since there has been a web to monitor. Over the years I have often heard accusations about invasion of privacy and have done my best to make sure that they are not true. While I was at Clickstream this was fairly easy as we were the leaders in privacy friendly monitoring. Really we had to be; we could gather so much data so accurately and completely that, had we not been campaigning for more stringent privacy regulations, we would have been big brother. The online world promised so much in the way of “perfect information” and despite some disappointments it was actually able to deliver if you did things right.

What is now becoming ‘the next big scare’ is that it is not just online data that can be invasive. Articles are appearing in the press about abuses of data from all directions. I hear about the big brother aspects of the supermarket giants, and I cannot help but smile. It is all so familiar. I smile for two reasons:

1) Data is not inherently evil. Doing bad things with data is the potential problem, and at the moment I am not seeing signs that it would be profitable for the supermarkets to turn bad.

2) The supermarkets are just the tip of the iceberg and the bits that are underwater are much larger.

As number two points out, it is not just the supermarkets that have masses of data; they just happen to be the most visible aspect at the moment. Other areas where potentially frightening amounts of data are accumulating include, but are not limited to, Banking, Telecoms, Government, Media (Broadband providers, IPTV, etc.), and even manufacturing gets a look-in. At this point I should emphasise that they are only *potentially* frightening. If there is sufficient interest in this topic I shall write about what the different areas are gathering and some of the implications of this in another article. For the moment it is just worth bearing in mind that all of these sectors are gathering huge amounts of data; more than that, they are turning that data into huge amounts of information.

Information is power and with great power comes great responsibility. This sounds trite and clichéd because it is, but it is also true. The power of data has been loosed upon the world and we are not going to change that without driving ourselves back into the dark ages. Think of it as a bright light shining down upon us, and work out how much shade we need and where we should be shining our torches.

One way of looking at this, which seems to make sense, is by examining explicit consent. If you are burgled you feel violated and invaded because someone has been rummaging through your private things and walking in your home without your permission. If you invite someone in to do a job for you then you feel grateful to them for taking the time and trouble. You may even pay them for the services provided. The loyalty card is the difference between an invasion and an invitation. A loyalty card is saying “here is what I want, watch what I am doing and work out how you can help me”. The same data can be gained from credit card transactions, but the user does not use their card specifically so that the supermarket can make them offers and improve their shopping experience. One is invitation; the other is nosiness at best and at worst full-blown invasion (*). The law acknowledges this and protects the consumer from this kind of direct snooping. It does allow aggregate data gathering, saying this many people do this kind of thing, but not specific snooping (on this date you were in this store doing this).

The worst argument against the way that supermarkets use the data is that they use it to sell you more things that you do not need. The best argument in favour of them is that they change their offerings, ranges and way of working to fit the customer’s needs. If a loyalty card is inviting someone in, then the store deciding to change to match its customers is the workman doing his job. As long as the vendors are incentivised to fulfil our best expectations then the problem of big brother is not a real and present danger.

So the next question is whether the incentives are sufficient to make it in the data holder’s interest to play nicely. I do not have a definitive answer to that. I do have some evidence, but it is not sufficient to be certain, so I can only offer advice and suggestions, not answers. It is my guesstimate that the lifetime value of a happy customer is a greater incentive than the quick profit of an over-sold customer, but I do not yet have the figures to back this up. I am, however, certain that the company I am working for believes this lifetime value argument and that this is what they are providing to their customers. Looking at the retail market it is also what a lot of the experts are saying. Whether they are doing it or not, retailers are at least paying lip service to the idea that customers have the power to walk away and so you need to treat them as well as you can.

An encouraging comment I came across in the retail bulletin seems to support this responsible view:

“Effective retail media solutions in the modern world are all about adding value for consumers and ensuring they receive something personally worthwhile as a trade-off for their time and attention”

The quotation is from Martin Hayward, Director of Consumer Strategy and Futures at Dunnhumby, where I am now working; it is this attitude that lets me feel comfortable working here. As an added benefit, working here will supply me, over time, with the hard evidence to back up this view. Certainly Dunnhumby make their money out of learning how to do the right thing for consumers, so they (we) have a real incentive to show that this is worthwhile.

Rufus Evison

(*) this is misleading because it is only true in context, so I will delve deeper into what can/should be done with credit card data in a later entry.
