In March 2017, the ICO issued an update to its 2014 Report on Big Data in light of the imminent implementation of the GDPR. The updated report adds a focus on artificial intelligence and machine learning to its discussion of big data; the ICO argues that it is the combination of the three that makes up ‘big data analytics’. The report looks at big data analytics from a GDPR perspective and provides practical guidance on compliance.
The ICO report considers the types of personal data used for big data analytics. Such analytics may involve the use of ‘new types of data’, such as ‘observed data’, ‘derived data’ and ‘inferred data’. These data are additional to the personal data consciously provided by the individual. They are collected through sensors and cookies, or produced by machine learning algorithms and analytics methods.
The ICO report suggests that the complexity of big data analytics should not preclude businesses from complying with the data protection rules. Thus, if consent or legitimate interests are relied upon as the legal basis for processing personal data for big data analytics, all the relevant conditions for such processing set out under the GDPR should be met. The ICO also concludes that contract is unlikely to be a valid basis for processing for big data analytics purposes, as it will be difficult to show that the level of processing for big data analytics ‘by its nature’ is ‘necessary’ for the performance of a contract.
Since big data analytics usually repurposes personal data, the ICO suggests that businesses will need to obtain informed consent for any secondary use of personal data. This also covers scenarios where personal data is obtained from other organisations rather than directly from individuals.
The ICO accepts that the data minimisation and data retention principles could be difficult to comply with. Despite this, the ICO insists that businesses that carry out big data analytics must define the aim of such analysis at the outset and ensure that the personal data they use is relevant to that aim and not excessive. Businesses should also ensure that the data retention requirements set out under the GDPR are enforced.
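For technical teams, one way to picture these two principles is as checks tied to a purpose that is declared up front. The following Python sketch is illustrative only: the `Purpose` class, the `minimise` and `is_expired` functions, the field names and the 365-day retention period are all hypothetical and are not taken from the ICO report or the GDPR, which does not prescribe specific retention periods.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Purpose:
    """Hypothetical declaration of an analysis aim, fixed at the outset."""
    name: str
    required_fields: frozenset  # fields considered relevant to the aim
    retention: timedelta        # illustrative retention period (an assumption)


def minimise(record: dict, purpose: Purpose) -> dict:
    """Keep only the fields declared as relevant to the purpose."""
    return {k: v for k, v in record.items() if k in purpose.required_fields}


def is_expired(collected_on: date, purpose: Purpose, today: date) -> bool:
    """True once the record has been held longer than the declared period."""
    return today - collected_on > purpose.retention


# Hypothetical example purpose and record.
churn_analysis = Purpose(
    name="churn analysis",
    required_fields=frozenset({"customer_id", "plan", "last_login"}),
    retention=timedelta(days=365),
)

record = {
    "customer_id": 17,
    "plan": "basic",
    "last_login": "2017-01-05",
    "home_address": "1 Example Street",  # not relevant to the aim, so dropped
}

trimmed = minimise(record, churn_analysis)
expired = is_expired(date(2016, 1, 1), churn_analysis, today=date(2017, 3, 1))
```

Here `trimmed` no longer contains `home_address`, and `expired` flags a record that has outlived the declared period. Whether a given field is 'relevant' or a given period is appropriate remains a legal judgment that code like this can only record, not make.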
Data accuracy and data quality are key issues raised in the updated Big Data report. If big data analytics is based on inaccurate data, machine learning algorithms may make decisions that are erroneous or unjustified. Businesses relying on big data analytics will need to ensure that they build discrimination detection into their machine learning systems to prevent discriminatory outcomes.
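As a rough illustration of what a basic discrimination check might look like in practice, the Python sketch below compares the rate of favourable decisions across groups. The function names, the sample data and the 0.8 threshold are assumptions made for this example; the threshold in particular is an arbitrary placeholder and is not prescribed by the ICO report or the GDPR. A real audit would be considerably broader than a single metric.

```python
from collections import defaultdict


def selection_rates(decisions, groups):
    """Share of favourable decisions (truthy values) per group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        positives[group] += int(bool(decision))
    return {group: positives[group] / totals[group] for group in totals}


def disparate_impact_ratio(decisions, groups):
    """Lowest group selection rate divided by the highest (1.0 = parity)."""
    rates = selection_rates(decisions, groups)
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favourable decision
    return min(rates.values()) / highest


def flag_for_review(decisions, groups, threshold=0.8):
    """True when the ratio falls below the (assumed) review threshold."""
    return disparate_impact_ratio(decisions, groups) < threshold


# Hypothetical model outputs: 1 = favourable decision, 0 = unfavourable.
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(decisions, groups)
needs_review = flag_for_review(decisions, groups)
```

In this made-up sample, group "a" receives a favourable decision 75% of the time and group "b" 25% of the time, so the check flags the model for human review. A flag of this kind indicates a disparity worth investigating; it does not by itself establish unlawful discrimination.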
The ICO provides six key recommendations for compliance with the GDPR:
1) anonymise personal data, where personal data is not necessary for the analysis;
2) be transparent about the use of personal data for big data analytics and provide privacy notices at appropriate stages throughout a big data project;
3) embed a privacy impact assessment process into big data projects to help identify privacy risks and address them;
4) adopt a privacy by design approach in the development and application of big data analytics;
5) develop ethical principles to help reinforce key data protection principles; and
6) implement internal and external audits of machine learning algorithms to check for bias, discrimination and errors.
The ICO's newly released Big Data report is a discussion paper, and the ICO will continue to help organisations comply by issuing further guidance on ‘profiling’ and ‘risk’.
Big data is also a focus of attention at EU level. On 14 March 2017, the European Parliament adopted a resolution on big data (“Resolution on Fundamental Rights Implications of Big Data: Privacy, Data Protection, Non-discrimination, Security and Law-Enforcement”) that urges the private and public sectors to bring their big data practices in line with EU data protection legal standards and safeguards, including the GDPR. The resolution calls on the European Commission, the European Data Protection Board, and national data protection authorities to develop “concrete standards that protect fundamental rights and guarantees associated with the use of data processing and analytics by the private and public sector”.