The Department for Education recently carried out a consultation to gather views on sharing data stored in the National Pupil Database. Here is a copy of the submission made by me, Andy Phippen and Terri Dowty. It includes some examples that people might find useful in bringing this problem to life.
Do you agree with the proposal to widen the purposes for which data from the National Pupil Database can be shared? Please explain the reasons for your answer.
We do not agree with this proposal. We understand that one of the motives behind this question is that there is a proposal to sell or give away children’s data, acquired in the process of their compulsory attendance at school, to the private sector. There are a number of intrinsic dangers in requiring citizens to give personal data to the state, and then allowing it to be used commercially. We list them below.
1. Selling data acquired by compulsion fosters a national attitude, encouraged by Government, that children are little more than a potential resource to be used by others, rather than citizens in their own right with rights to privacy.
Example: A drug company pays to access the School Census data, and finds that a greater-than-average number of children are diagnosed with ADHD-type conditions in one particular region. The drug company then contacts local schools and clinics to invite pupils to become involved with drug-testing initiatives, targets local GPs to encourage drug take-up, and carries out a PR initiative citing successful case studies in this region of drug-based interventions using its products.
2. The security of this data cannot be assured once it is out of the UK public sector, and there is likely to be little recourse for children if their data is used inappropriately, or stored inaccurately overseas.
Example: A US educational outsourcing company holds some of the data it has purchased about UK school pupils on a US computer system. It is legally required to inform the authorities about potential visitors to the US who might pose immigration problems. Consequently it is forced to hand over some of the data, but the UK families who originally gave their data to their children’s schools are completely unaware of this, and unable to correct any errors occurring during the data exchange. This is not spotted until a family is stopped at a US airport because the authorities have confused two identities. The family has no recourse.
3. In terms of using the data, it is unlikely that the same ethical controls will exist for commercial companies as for public sector researchers, which represents a further risk to the personal data of children.
Example: A data processing company decides to buy some of the data with the aim of creating a visually attractive alternative database for parents, to allow them to choose schools for their children. It interprets the data poorly, failing to take into account each school’s local conditions, which results in some schools and groups of pupils being unfairly classified as failing by this database. The resultant fall in admissions affects funding in some schools working with vulnerable children, which in turn affects children’s access to some aspects of education.
4. Mosaic identification (identification of individuals by piecing together information from different databases or other sources) is entirely possible using this sort of information, particularly where uncommon combinations of attributes occur. This presents ethical issues for the distribution of such data.
As the Office for National Statistics makes clear:
“Generally, rare combinations of attributes lead to the identification of individuals, for example, a sixteen-year-old widow, a female miner or a single manufacturer in an area. Disclosure control methods are usually applied if ethical, practical or legal considerations require the data to be protected, and the possibility of identification exists.
Statistical disclosure control techniques are currently being used in a wide number of areas of National Statistics, for example the Census, the Neighbourhood Statistics Service and for several social surveys. Different types of data pose different types of problems and inevitably require different solutions”. 
In 2010 the Information Commissioner held that the Youth Justice Board was in breach of the Data Protection Act in collecting purportedly anonymised data that included sector postcode, ethnicity, date of birth and gender (similar to some of the data held in the National Pupil Database). He concluded that this data was sufficient to identify individuals in areas where there were few residents from minority ethnic groups. As a consequence, the Youth Justice Board had to remove this data from its Management Information System.
Therefore while data of this type might, of itself, not contain directly identifiable data (for example, names), this does not, in any way, guarantee anonymity for the individuals within the dataset.
Example 1: A pupil attaining Level 5 in both Mathematics and English in Year 4 is identifiable within a small rural sample in a comparatively low-achieving area. This leads to targeted marketing from commercial companies for paid-for enrichment activities, putting pressure on the parents to provide additional resources.
Example 2: A job applicant confirms to an employer that he has 4 GCSEs and the grades awarded. One of the GCSEs is in an unusual modern foreign language. The certificate date identifies the year of exam, the subjects, and the school. By using data derived from the National Pupil Database and School Census, it is discovered that a pupil from the same school with a GCSE qualification in this language had a statement of Special Educational Needs for Oppositional Conduct Disorder (OCD). Fearing health and safety issues, the company decides not to employ the applicant on this basis.
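The mosaic effect described above can be sketched in a few lines of Python. The records below are entirely hypothetical; the point is only that a dataset containing no names can still hold rows whose combination of attributes occurs exactly once (k = 1 in k-anonymity terms), so the person behind such a row is, in practice, identifiable.

```python
from collections import Counter

def unique_combinations(records):
    """Return attribute combinations that occur exactly once (k = 1).
    Anyone matching such a combination can be singled out, even though
    no name is stored anywhere in the dataset."""
    counts = Counter(records)
    return [combo for combo, k in counts.items() if k == 1]

# Hypothetical quasi-identifiers, similar in kind to those the
# Information Commissioner criticised in the Youth Justice Board case:
# (postcode sector, ethnicity, year of birth, gender).
records = [
    ("AB1 2", "White British", 1998, "F"),
    ("AB1 2", "White British", 1998, "F"),
    ("AB1 2", "White British", 1998, "M"),
    ("AB1 2", "Black African", 1997, "M"),
]

# Two of the four rows are unique, and therefore re-identifiable.
for combo in unique_combinations(records):
    print("Uniquely identifiable:", combo)
```

Real disclosure-control techniques (suppression, aggregation, perturbation) exist precisely to ensure that no published combination has such a small count.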
National Pupil Database information is taken without the consent or knowledge of parents and children; it is derived directly from each school’s Management Information System. It is questionable enough that parents and children have no control over its supply to the Department for Education in the first place. Were the proposal to share the data implemented, this situation would be compounded, and handing the data on to other organisations would represent a significant breach of trust.
This is also part of a much wider debate. Such moves in databank research are likely to erode privacy rights further, and records about children may be seen as an easier route into this general undermining, bypassing the discussion and consent that might be required for adults’ data. Fortunately in this instance the Department for Education is consulting widely, and we are keen that this continues to be the case.
We are also concerned that commercial pressures on all kinds of researchers and practitioners are eroding privacy with almost no public debate. Policies seem to be driven by technology, in the sense that if it is technologically possible, then we must do it. Given the above largely negative implications of sharing data in the manner proposed, we the undersigned wish to register an objection to any changes.
Dr Sandra Leaton Gray, Institute of Education, University of London
Terri Dowty, Truth2Power
Professor Andy Phippen, University of Plymouth