The medical records of NHS England’s 61 million users are set to be gathered in a new centralised database as part of a new scheme called the General Practice Data for Planning and Research (GPDPR). According to NHS Digital, the data will be used to: inform and develop health and social care policy, plan and commission health and care services, take steps to protect public health such as managing the Covid-19 pandemic, enable research, and provide individual care in exceptional cases.
The database will not include names or addresses, or any other data that could directly identify a patient like their NHS number, date of birth, or postcode. NHS Digital claims this will allow the information to remain confidential when it’s accessed by third parties in the healthcare industry. It also says that the data will only be accessible to organisations with a legitimate need for it who match up to stringent criteria, and that the database will never be used for insurance or marketing purposes, promoting or selling products or services, market research or advertising.
But while the scheme was in development for three years, patients were given just over a month to be made aware of the project and opt out if they wished to do so. NHS Digital released the plans on 12 May this year and gave a deadline of 23 June for people to omit data from the GPDPR, which has since been pushed back to 25 August following pressure from the Doctors’ Association UK (DAUK). If patients do not opt out by this time, they will not be able to do so in future.
The information set to be included in the database includes data about: sex, ethnicity, sexual orientation, diagnoses, symptoms, observations, test results, medications, allergies, immunisations, referrals, recalls and appointments, including information about physical, mental and sexual health. Notably, it includes details about which staff have treated patients.
Those in favour of the initiative believe the database could be a big help in advancing understanding of medical issues, but critics have described the move as an “NHS data grab”. Writing to the Guardian, University of Manchester emeritus professor of medical informatics Alan Rector described the assurances of anonymity as “worthless”, adding that “[f]ew people realise how easy it is to identify individuals from medical records, even if obvious personal details are removed.”
All 36 doctors’ surgeries in Tower Hamlets, east London, have agreed to withhold patient data when the collection begins.
Patient data confidentiality
While it is worth acknowledging that “most people would be happy for the NHS to have their health data”, it doesn’t change the fact that the NHS has been involved in some pretty dodgy data dealings in recent years which have damaged public trust. The Care.data initiative, launched in 2014, proved so unpopular that public outcry led to the scheme being scrapped in 2016.
In 2015, the health records of NHS patients at the Royal Free London Trust were transferred, without explicit consent from patients and in a way that did not comply with the UK’s Data Protection Act, to Google DeepMind. In 2019 it was revealed that international pharmaceutical companies had obtained access to NHS patient data, while the recent involvement of big data company Palantir with the NHS Covid-19 datastore has ruffled more than a few feathers.
Since it was announced in early April, GPDPR hasn’t exactly been highly publicised, with information about the initiative primarily shared on the NHS Digital website and via leaflets at GP surgeries, meaning only a small proportion of the public will be aware of how their data is about to be used. This lack of publicity, particularly in light of the historic handling of NHS patient data, has prompted commentators to view the scheme as a “data grab”, with the government capitalising on the pandemic health panic to push GPDPR through under the radar.
Labour’s Shadow Health Minister Alex Norris has said that the “current plans to take data from GPs, assemble it in one place and sell it to unknown commercial interests for purposes unknown has no legitimacy” and that the plans had been “snuck out under the cover of darkness.”
The legality of the move has also been questioned. Campaigners remain unconvinced about the lawfulness of collecting NHS data on this scale without properly consulting patients.
English citizens can opt out of the system by contacting their GP in one of two ways, but how they will be implemented and the exact differences between them aren’t that easy to understand at first glance.
The Type 1 Opt-out allows an individual to prevent their identifiable patient data from being shared outside their GP practice for any purpose other than their own care. This option was introduced in 2013 for data sharing from GP practices but may be discontinued in the future, as a new option known as the National Data Opt-out has since been introduced to cover the broader health and care system.
NHS Digital will not collect any data about patients who have already registered a Type 1 Opt-out. If a person registers for a Type 1 Opt-out after their data has already been shared with NHS Digital, no more of it will be shared in future but NHS Digital will still hold all patient data shared before the Opt-out was registered.
Meanwhile, the National Data Opt-out covers information like hospital data as well as GP data. If someone registers for a National Data Opt-out, NHS Digital won’t share any confidential information about them with other organisations except where there is a legal obligation to do so, such as information about Covid-19 infection, or where it is needed for their personal care.
However, the National Data Opt-out won’t prevent data from being shared by GP practices with NHS Digital, as it is a legal requirement for GP practices to share this data with NHS Digital and the National Data Opt-out does not apply where there is a legal requirement to share data. The information won’t be used to inform research and planning but will still be passed to NHS Digital.
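The difference between the two opt-outs can be sketched in a few lines of Python. This is a simplified reading of the rules as described above, not NHS Digital's actual logic: the names `OptOuts` and `gp_data_flows` are hypothetical, and the sketch only covers GP data for opt-outs registered before collection begins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OptOuts:
    """Opt-outs a patient has registered (hypothetical model)."""
    type_1: bool = False    # Type 1 Opt-out, registered with the GP practice
    national: bool = False  # National Data Opt-out


def gp_data_flows(opt_outs: OptOuts, legal_obligation: bool = False) -> dict:
    """Return where a patient's GP data may go under each opt-out.

    Assumption: the opt-out was registered before the collection started.
    """
    # Only a Type 1 Opt-out stops GP data reaching NHS Digital at all.
    collected = not opt_outs.type_1
    # A National Data Opt-out blocks onward sharing of collected data,
    # unless there is a legal obligation to share it.
    may_be_shared = collected and (not opt_outs.national or legal_obligation)
    return {
        "collected_by_nhs_digital": collected,
        "may_be_shared_onward": may_be_shared,
    }


print(gp_data_flows(OptOuts(type_1=True)))
print(gp_data_flows(OptOuts(national=True)))
```

In this reading, only the Type 1 Opt-out keeps GP data out of the central database; the National Data Opt-out limits what happens to the data once it is there.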
No true anonymity
As well as concerns about NHS competency when it comes to handling sensitive information and the lack of transparency around the initiative and how to opt out, significant privacy concerns remain about the very nature of this supposedly de-identified data.
In 2019, researchers from Belgium’s Université Catholique de Louvain (UCLouvain) and Imperial College London built a model to estimate how easy it would be to deanonymize an arbitrary dataset. They found that a dataset with 15 demographic attributes would render 99.98% of people in Massachusetts unique. For smaller populations, it becomes even easier to identify a person, with the inclusion of town-level data meaning that “it would not take much to reidentify people living in Harwich Port, Massachusetts, a city of fewer than 2,000 inhabitants.”
The proposed database will contain the details of the medical professionals who treat each patient, information which could make it easy to narrow down who a person is. Simply look up the clinician and where they’ve worked and follow the data from there.
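The re-identification risk described above can be illustrated with a toy calculation. The Python sketch below is not the researchers' statistical model; it simply counts how many records become one of a kind as more attributes are combined. The records are invented for illustration, and the function name `unique_fraction` is hypothetical.

```python
from collections import Counter


def unique_fraction(records, attributes):
    """Share of records whose combination of attributes appears only once."""
    keys = [tuple(record[a] for a in attributes) for record in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)


# Invented example records with no names, addresses or dates of birth.
records = [
    {"sex": "F", "diagnosis": "asthma", "clinician": "Dr X"},
    {"sex": "F", "diagnosis": "asthma", "clinician": "Dr Y"},
    {"sex": "F", "diagnosis": "diabetes", "clinician": "Dr X"},
    {"sex": "M", "diagnosis": "asthma", "clinician": "Dr X"},
    {"sex": "M", "diagnosis": "asthma", "clinician": "Dr X"},
    {"sex": "M", "diagnosis": "diabetes", "clinician": "Dr Y"},
]

for attrs in (["sex"], ["sex", "diagnosis"], ["sex", "diagnosis", "clinician"]):
    print(attrs, round(unique_fraction(records, attrs), 2))
```

Even in this tiny example, each extra attribute leaves more records standing alone, and the treating clinician is enough to single out most of them.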
NHS Digital says GPDPR has a secure data environment that researchers will be able to access, from which the data will not need to travel. However, the system still allows copies of the data to be lifted from NHS Digital and placed onto an external site. This could allow users to abuse patient data without NHS Digital’s knowledge or control.
Speaking to Each Other, University of East Anglia Law School associate professor in information technology Dr Paul Bernal said: “There are people within the field of public health who would like to have more data to serve the public good. But there are also people who know the potential of data for exploiting for financial gain or for reasons of control, and those people have been licking their lips at the prospect of getting health data for a very long time.”