AI Revolution in GDPR Compliance

Ensuring compliance has become an arduous task as regulations multiply worldwide. Artificial intelligence (AI) can learn from large datasets and flag potential regulatory breaches, which helps streamline compliance. This article discusses the role and applications of AI in General Data Protection Regulation (GDPR) compliance, along with recent developments.

Image credit: New Africa/Shutterstock

Role of AI in GDPR Compliance

The GDPR is a European Union (EU) regulation introduced to protect the personal data and privacy of individuals in the EU. The regulation is directly applicable in all EU member states. It also applies to non-EU organizations that process the personal data of individuals in the EU.

A sharp increase in the maximum fine for breaches is one of the key aspects of the GDPR. Fines of up to €10 million or 2% of an organization’s worldwide annual turnover, whichever is higher, can be levied for lesser breaches, while serious violations can attract fines of up to €20 million or 4% of turnover.
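The fine ceiling can be worked through with a short calculation. The sketch below assumes the two tiers set out in Article 83 of the GDPR (€10 million or 2% of worldwide annual turnover for lesser breaches, €20 million or 4% for serious ones, whichever is higher in each case). The function name and tier labels are illustrative, and the result is only the legal maximum, not the fine a regulator would actually impose.

```python
# Fine ceilings per GDPR Article 83: (fixed cap in euros, percent of turnover).
# The tier labels "lesser" and "serious" are illustrative names.
TIERS = {
    "lesser": (10_000_000, 2),
    "serious": (20_000_000, 4),
}


def max_fine(annual_turnover_eur, tier="lesser"):
    """Return the maximum possible fine in euros for the given tier."""
    fixed_cap, percent = TIERS[tier]
    turnover_based = annual_turnover_eur * percent // 100
    # The ceiling is whichever of the two amounts is higher.
    return max(fixed_cap, turnover_based)


# A firm with EUR 200 million turnover: 2% is EUR 4 million, so the fixed cap applies.
print(max_fine(200_000_000))
# A firm with EUR 2 billion turnover: 2% is EUR 40 million, which exceeds the fixed cap.
print(max_fine(2_000_000_000))
```

For smaller organizations the fixed amount is the binding ceiling, while for large ones the turnover percentage dominates.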

Against this backdrop, AI has emerged as an important technology for supporting GDPR compliance by offering advice, conducting assessments, and asking only the relevant questions. AI can simplify compliance in three areas: interpreting checklists and analyzing risk, preparing adequate replies to requests for information about automated profiling, and identifying and assessing the risk of actual or potential data breaches.

In checklist interpretation, AI facilitates the automation of a compliance checklist and offers guidance on key sections in the regulation. Risk analysis using AI assists in identifying low-risk, medium-risk, and high-risk activities based on the GDPR’s criteria and implementing compliance measures accordingly.

Several studies have adopted AI techniques to automate the legal evaluation of privacy policies. Privacy policies are the primary channel through which organizations inform users about their data collection and sharing practices. However, most studies depended on manually annotated datasets for classification. In one study, a deep learning approach was employed to annotate previously unseen privacy policies automatically using fine-grained and high-level labels from a pre-specified taxonomy.

Natural language processing has received significant attention for text processing in different data protection problems, such as text anonymization, where text data such as healthcare dossiers and police reports are either sensitive in nature or protected by privacy laws, including the GDPR. For instance, natural language processing has been used to assess how well text anonymization actually protects individuals from being re-identified by AI-based methods.

In process mining, AI methods offer insights into business processes using process execution data. Following the formulation of guidelines for GDPR-compliant business processes, process mining techniques have been adapted to check whether process executions comply with data subject rights and to enable GDPR-compliant business process discovery.

Recently, an explicit, reusable, and understandable semantic model was developed to represent GDPR consent. A blockchain-based model can be combined with this model to ensure that organizations comply with GDPR regarding user consent.

Named entity recognition (NER) assists in detecting personally identifiable information in textual documents. NER excels in identifying and classifying named entities existing in a text into pre-defined and specific categories. Studies have shown that NER is effective for automated identification, validation, and monitoring of personally identifiable information in contracts in various scenarios. Convolutional neural networks and naïve Bayes-based NER techniques excel in identifying the nature and presence of personal information in data from social media.
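The idea of tagging personally identifiable information in text can be illustrated with a deliberately simple sketch. Note that this is rule-based pattern matching, a simplified stand-in and not a trained NER model: it only catches identifiers with a regular surface form (here, email addresses and phone numbers), whereas an NER model would also recognize names, places, and organizations from context. The patterns, labels, and sample sentence are all illustrative assumptions.

```python
import re

# Hypothetical patterns for two identifier types with a regular surface form.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def find_pii(text):
    """Return (label, matched_text, start, end) tuples sorted by position."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(), match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])


# Invented sample text; the name "Maria" is NOT detected by these patterns.
sample = "Contact Maria at maria.rossi@example.org or +39 06 1234 5678."
for hit in find_pii(sample):
    print(hit)
```

The output format (label plus character span) mirrors what NER tools typically return, so a trained model could be swapped in behind the same interface.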

Recent Developments

The Italian Public Administration depends on expensive manual analyses to secure personal data and ensure the GDPR compliance of public documents. In a study published in the Journal of Intelligent Information Systems, researchers designed an AI-based framework that determines whether public administration documents written in Italian meet the relevant GDPR requirements.

This framework, designated as artIficial iNTelligence for gdpR compliancE of Public admInistration Documents (INTREPID), can assist the Italian Public Administration in automating its data protection workflows. INTREPID was realized by adapting several linguistic resources for Italian language processing, such as SpaCy and Tint, to the GDPR setting. A text corpus of public documents manually labeled as GDPR compliant or GDPR non-compliant was used to train a text classification model to predict the compliance status of new documents.

In this study, researchers modeled the public document data protection problem as a binary text classification task. In the INTREPID architecture, the first step involved extracting textual content from the labeled PDF documents in the text corpus. This step was followed by text cleaning, text processing, feature extraction, classification, and finally, prediction of GDPR compliance or non-compliance. Two kinds of features, the classical bag-of-words representation and features based on the NER output, were used in the feature extraction stage. The extracted features were used to train a classifier to predict the class of new documents.
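The stages described above can be sketched in a few lines. This is a minimal illustration under several assumptions, not the INTREPID implementation: the entity labels are assumed to come from an upstream NER step, the training sentences are invented toy data in English, and a nearest-centroid classifier with cosine similarity is swapped in so that the example needs only the standard library.

```python
import math
import re
from collections import Counter


def extract_features(text, entity_labels):
    """Bag-of-words counts plus one count per NER label (prefixed 'NER=')."""
    tokens = re.findall(r"\w+", text.lower())  # minimal cleaning and tokenizing
    features = Counter(tokens)
    features.update(f"NER={label}" for label in entity_labels)
    return features


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(examples):
    """Sum the feature vectors of each class into one centroid per label."""
    centroids = {}
    for text, entities, label in examples:
        centroids.setdefault(label, Counter()).update(
            extract_features(text, entities)
        )
    return centroids


def predict(centroids, text, entities):
    """Return the label whose centroid is most similar to the document."""
    features = extract_features(text, entities)
    return max(centroids, key=lambda label: cosine(features, centroids[label]))


# Invented toy corpus: (text, NER labels found in it, class label).
training = [
    ("Resolution approving the annual road maintenance budget", [], "compliant"),
    ("Tender notice for school cleaning services", ["ORG"], "compliant"),
    ("Benefit granted to Mario Rossi, tax code and home address listed",
     ["PER", "LOC"], "non-compliant"),
    ("Medical leave approved for employee Anna Bianchi", ["PER"], "non-compliant"),
]
model = train(training)
print(predict(model, "Subsidy paid to Luca Verdi, home address attached",
              ["PER", "LOC"]))
```

Putting the NER labels into the same vector as the word counts is one simple way to combine the two feature families; a stronger classifier could be trained on the same vectors.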

Additionally, three classification algorithms, namely XGBoost, random forest, and linear support vector machines, were used as base classifiers in this framework. Researchers performed an inter-annotator study to assess how closely the annotations predicted by the framework agreed with annotations by domain experts.
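Agreement between two sets of annotations is often summarized with a chance-corrected statistic. The source does not state which measure the researchers used; the sketch below assumes Cohen's kappa, one common choice, and the label sequences are invented for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both labeled at random with their own label rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented annotations for eight tokens: framework output vs. a domain expert.
framework = ["PER", "PER", "ORG", "O", "O", "LOC", "PER", "O"]
expert = ["PER", "ORG", "ORG", "O", "O", "LOC", "PER", "PER"]
print(round(cohen_kappa(framework, expert), 3))
```

A value of 1 means perfect agreement, while 0 means agreement no better than chance.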

They also evaluated the accuracy of the text classification model in detecting documents with data breaches. Results showed that Tint outperforms SpaCy based on the agreement of the annotation predictions with annotations by domain experts. Additionally, the highest accuracy was achieved by training a classifier with XGBoost and leveraging both bag-of-words and NER-based features.

Although many companies updated their privacy policies after the GDPR came into force, most of the policies are verbose and do not clearly describe users’ rights and companies’ data practices. Many end users struggle to understand how these policies address GDPR requirements. This also makes it harder for enforcement authorities to check manually whether companies’ privacy policies comply with GDPR requirements.

In a paper recently published in the Proceedings of the 21st Workshop on Privacy in the Electronic Society, researchers created a new dataset of 1,080 privacy policies manually annotated by experts with labels representing 18 privacy disclosure requirements of GDPR. They also developed a convolutional neural network-based model that classifies privacy policy content according to these GDPR requirements with 89.2% accuracy.

Researchers applied the model to measure GDPR compliance automatically in the privacy policies of the 9,761 most visited websites. According to the results, 68% of websites failed to comply with at least one GDPR requirement even four years after the GDPR took effect.

In another paper published in the 2020 IEEE 28th International Requirements Engineering Conference (RE), researchers proposed an AI-assisted automated approach to check the completeness of privacy policies against GDPR and overcome the limitations of manual checking, which is both error-prone and time-consuming.

Specifically, they presented a conceptual model to characterize the GDPR-envisaged information content for privacy policies and an AI-assisted approach to classify the information content in privacy policies and measure the completeness of the classified content against GDPR. They evaluated the approach using a case study on 24 unseen privacy policies.

A combination of natural language processing and supervised machine learning was leveraged for classification. A total of 234 real privacy policies obtained from the fund industry were used in the experiment. Results showed that the proposed approach successfully detected 45 of the 47 incompleteness issues in the 24 unseen privacy policies, with eight false positives. This corresponds to 85% precision and 96% recall.
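The reported precision and recall follow directly from the counts given in the study: 45 detected issues out of 47, plus eight false positives. The short check below reproduces that arithmetic; the function and variable names are illustrative.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw detection counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


detected = 45          # incompleteness issues correctly flagged
total_issues = 47      # issues actually present in the 24 unseen policies
false_positives = 8    # flagged issues that were not real

precision, recall = precision_recall(
    detected, false_positives, total_issues - detected
)
print(f"precision={precision:.0%}, recall={recall:.0%}")
# prints "precision=85%, recall=96%"
```

In other words, 45 of the 53 flagged issues were real, and only two genuine issues were missed.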

In a paper published in the Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law, an integrated rule-based and machine learning approach was formulated for automated checking of GDPR compliance. A framework was conceptualized for implementing a document-centric approach to compliance checking in the data supply chain, and specific methods were developed to automate privacy policy compliance checking.

In conclusion, AI simplifies GDPR compliance and helps prevent GDPR breaches that can lead to substantial financial losses for organizations. However, ensuring data privacy and security, collecting good-quality data for AI model training, and addressing the interpretability issues of complex AI models must be prioritized for large-scale adoption of AI for GDPR compliance.

References and Further Reading

Hamdani, R. E., Mustapha, M., Amariles, D. R., Troussel, A., Meeùs, S., Krasnashchok, K. (2021). A combined rule-based and machine learning approach for automated GDPR compliance checking. Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law, 40-49. https://doi.org/10.1145/3462757.3466081

Kingston, J. (2017). Using artificial intelligence to support compliance with the general data protection regulation. Artificial Intelligence and Law, 25(4), 429-443. https://doi.org/10.1007/s10506-017-9206-9

Lorè, F., Basile, P., Appice, A., de Gemmis, M., Malerba, D., Semeraro, G. (2023). An AI framework to support decisions on GDPR compliance. Journal of Intelligent Information Systems, 61, 541-568. https://doi.org/10.1007/s10844-023-00782-4

Torre, D., Abualhaija, S., Sabetzadeh, M., Briand, L., Baetens, K., Goes, P., Forastier, S. (2020). An AI-assisted approach for checking the completeness of privacy policies against GDPR. 2020 IEEE 28th International Requirements Engineering Conference (RE), 136-146. https://doi.org/10.1109/RE48521.2020.00025

Rahat, T. A., Long, M., Tian, Y. (2022). Is Your Policy Compliant? A Deep Learning-based Empirical Study of Privacy Policies' Compliance with GDPR. Proceedings of the 21st Workshop on Privacy in the Electronic Society, 89-102. https://doi.org/10.1145/3559613.3563195

Last Updated: Dec 30, 2023

Written by Samudrapom Dam

Samudrapom Dam is a freelance scientific and business writer based in Kolkata, India. He has been writing articles related to business and scientific topics for more than one and a half years. He has extensive experience in writing about advanced technologies, information technology, machinery, metals and metal products, clean technologies, finance and banking, automotive, household products, and the aerospace industry. He is passionate about the latest developments in advanced technologies, the ways these developments can be implemented in a real-world situation, and how these developments can positively impact common people.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Dam, Samudrapom. (2023, December 30). AI Revolution in GDPR Compliance. AZoAi. Retrieved on September 19, 2024 from https://www.azoai.com/article/AI-Revolution-in-GDPR-Compliance.aspx.

  • MLA

    Dam, Samudrapom. "AI Revolution in GDPR Compliance". AZoAi. 19 September 2024. <https://www.azoai.com/article/AI-Revolution-in-GDPR-Compliance.aspx>.

  • Chicago

    Dam, Samudrapom. "AI Revolution in GDPR Compliance". AZoAi. https://www.azoai.com/article/AI-Revolution-in-GDPR-Compliance.aspx. (accessed September 19, 2024).

  • Harvard

    Dam, Samudrapom. 2023. AI Revolution in GDPR Compliance. AZoAi, viewed 19 September 2024, https://www.azoai.com/article/AI-Revolution-in-GDPR-Compliance.aspx.
