The anonymisation challenge
30 November 2012
This article was first published in Data Protection Law & Policy
in November 2012.
For a while now, it has been suggested that
one of the ways of tackling the risks to personal information,
beyond protecting it, is to anonymise it. That means stopping
such information from being personal data altogether. The effect
of anonymisation of personal data is quite radical – take personal
data, perform some magic on it and that information is no longer
personal data. As a result, it becomes free from any
protective constraints. Simple. People's privacy is no
longer threatened and users of that data can run wild with
it. Everybody wins. However, as we happen to be living
in the 'big data society', the problem is that with the amount of
information we generate as individuals, what used to be pure
statistical data is becoming so granular that the real value of
that information is typically linked to each of the individuals
from whom the information originates. Is true anonymisation
actually possible then?
The UK Information Commissioner believes
that given the potential benefits of anonymisation, it is at least
worthwhile having a go at it. With that in mind, the ICO has
produced a chunky code of practice aimed at showing how to manage
privacy risks through anonymisation. According to the code
itself, this is the first attempt ever made by a data protection
regulator to explain how to rely on anonymisation techniques to
protect people's privacy, which says a good deal about
regulators' faith in anonymisation, given that the concept was
already mentioned in the 1995 European data protection
directive. Nevertheless, the ICO is relentless in its defence
of anonymisation as a tool that can help society meet its
information needs in a privacy-friendly way.
The ICO believes that the legal test of
whether information qualifies as personal data or not allows
anonymisation to be a realistic proposition. The reason for
that is that EU data protection law only kicks in when someone is
identifiable taking into account all the means 'likely reasonably'
to be used to identify the individual. In other words, and as
the code puts it, the law is not framed in terms of the mere
possibility of an individual being identified. The definition
of personal data is based on the likely identification of an
individual. Therefore, the ICO argues that although it may
not be possible to determine with absolute certainty that no
individual will ever be identified as a result of the disclosure of
anonymous data, that does not mean that personal data has been
disclosed.
One of the advantages of anonymisation is
that technology itself can help make it even more effective.
As with other privacy-friendly manifestations of technology – such
as encryption and anti-malware software – the practice of
anonymising data is likely to evolve at the same speed as the
means of identification. This is so because technological
evolution is in itself neutral and anonymisation techniques can and
should evolve as the uses of data become more sophisticated.
What is clear is that whilst some anonymisation techniques are weak
because reintroducing personal identifiers is as easy as stripping
them out, technology can also help bulletproof anonymised data.
What makes anonymisation less viable, though,
is the fact that in reality there will always be a risk of
identification of the individuals to whom the data relates.
So the question is how remote that risk must be for anonymisation
to work. The answer is that it depends on the level of
identification that turns non-personal data into personal
data. If personal data and personally identifiable
information were the same thing, it would be much easier to
establish whether a given anonymisation process has been
effective. But they are not, because personal data goes beyond
being able to 'name' an individual. Personal data is about
being able to single out an individual, so the concept of
identification can cover many situations which make anonymisation
genuinely challenging.
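To make the idea of 'singling out' more concrete, the following is a
minimal sketch in Python. It is not taken from the ICO code and is
not a legal test; the records, the field names and the function
singled_out are all hypothetical. It simply counts how many records
share the same combination of attributes once names have been
stripped out, and returns those that are unique.

```python
from collections import Counter

# Hypothetical records with names already removed; every value is invented.
records = [
    {"postcode": "AB1", "birth_year": 1975, "sex": "F"},
    {"postcode": "AB1", "birth_year": 1975, "sex": "F"},
    {"postcode": "AB1", "birth_year": 1982, "sex": "M"},
    {"postcode": "CD2", "birth_year": 1990, "sex": "F"},
    {"postcode": "CD2", "birth_year": 1990, "sex": "F"},
    {"postcode": "CD2", "birth_year": 1990, "sex": "F"},
]

# Assumed set of attributes that could be combined to pick someone out.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def singled_out(rows, keys=QUASI_IDENTIFIERS):
    """Return the rows whose combination of attributes is unique in the dataset."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [row for row in rows if counts[tuple(row[k] for k in keys)] == 1]


# Rows that no other record shares, even though nobody is named.
unique_rows = singled_out(records)
print(unique_rows)
```

In this made-up dataset one record stands alone even though it
contains no name, which is the kind of situation that makes the
line between anonymous and personal data hard to draw. Whether such
a record would actually amount to personal data still depends on the
means likely reasonably to be used to identify the individual.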
The ICO is optimistic about the benefits
and the prospect of anonymisation. In certain cases – mostly
in the context of public sector data uses – it will clearly be
possible to derive value from truly anonymised data. In many
other cases, however, it is difficult to see how anonymisation in
isolation will achieve its end, as data granularity will prevail in
order to maximise the value of the information. In those
situations, the gap left by imperfect anonymisation will need to be
filled by a good and fair level of data protection and, in some
other cases, by the principle of 'privacy by default'. But
that's a different kind of challenge.
Eduardo Ustaran, Partner in the Privacy and Information Group
at Field Fisher Waterhouse LLP