Balancing security, innovation, and regulation in cyber threat (artificial) intelligence

In the last decade, cyber threat intelligence (CTI) has become an essential component of modern, proactive cybersecurity. It allows organisations to anticipate and identify threats and to mitigate their potential impact in advance. A proactive EU approach to cyber threats should prevent them from materialising into incidents.
As in other industries, generative artificial intelligence (GenAI) is being increasingly incorporated into CTI, shifting how cybersecurity professionals investigate threat-related data to mitigate risks.
CTI revolves around collecting, analysing, and parsing large troves of data to uncover and predict threats. GenAI is excellent at rapidly sorting and contextualising such data, saving human teams time and enabling near real-time responses to threats. GenAI could, for example, be used to sift through massive data leaks to identify risks, such as individuals offering to sell access to organisational networks or exploits for vulnerabilities.
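To make that concrete, the brief sketch below shows what such LLM-assisted triage of leaked data might look like. It is purely illustrative and rests on assumptions: the `llm` callable stands in for whichever model API an organisation actually uses, and the prompt, categories, and `TriageResult` structure are invented for the example rather than drawn from any real product.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Any function that takes a prompt string and returns the model's text reply.
# This stands in for a real LLM API client, which is assumed, not specified here.
LLMClient = Callable[[str], str]

@dataclass
class TriageResult:
    snippet_id: str
    category: str   # e.g. "access_for_sale", "exploit_offer", "benign"
    rationale: str  # short model-provided explanation to aid analyst review

TRIAGE_PROMPT = """You are assisting a cyber threat intelligence team.
Classify the following leaked-forum post as one of:
access_for_sale, exploit_offer, benign.
Reply in JSON with keys "category" and "rationale".

Post:
{snippet}
"""

def triage_snippet(snippet_id: str, snippet: str, llm: LLMClient) -> TriageResult:
    """Ask the model to categorise a single leaked-data snippet."""
    raw = llm(TRIAGE_PROMPT.format(snippet=snippet))
    parsed = json.loads(raw)  # real code would validate and handle malformed output
    return TriageResult(snippet_id, parsed["category"], parsed["rationale"])

def flag_risks(snippets: dict[str, str], llm: LLMClient) -> list[TriageResult]:
    """Return only the snippets the model does not consider benign."""
    results = (triage_snippet(sid, text, llm) for sid, text in snippets.items())
    return [r for r in results if r.category != "benign"]
```

Keeping only an identifier, a category, and a short rationale in the output, rather than reproducing full posts, is also one way such a tool could stay closer to the data minimisation principle discussed below.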
On the attackers’ side, state-sponsored hacking groups and cybercriminals are already employing GenAI, and specifically large language models (LLMs) such as ChatGPT, for social engineering and disinformation. Where mainstream AI tools fall short, ‘dark LLMs’ – malicious counterparts of tools like ChatGPT, such as FraudGPT and WormGPT – fill the gaps.
Legal challenges for cyber threat (artificial) intelligence
While both defenders and attackers utilise AI, legal restrictions on data collection and sharing, such as the EU’s General Data Protection Regulation (GDPR), may inhibit organisations seeking to adopt AI-driven CTI solutions. This asymmetry favours actors unhindered by legal constraints.
Over the years, experts have repeatedly argued that CTI collection, analysis, and distribution are lawful under the GDPR even when they involve personal and identifying information, such as names, email addresses, and IP addresses. Article 6 of the GDPR requires a lawful basis for processing personal data, and ‘legitimate interests’ is one such basis on which CTI can rely. GDPR recitals 47, 49 and 50 support processing personal data for preventing fraud, ensuring network and information security, and indicating possible criminal acts or threats to public security.
However, to collect and share CTI data lawfully, organisations must also follow the principles set out in Article 5. This means ensuring that the data collected is relevant, limited to what is necessary, and not used for other purposes.
Using GenAI for CTI could complicate compliance with the Article 5 principles. For example, GenAI models such as ChatGPT and Gemini can produce misleading, inaccurate, or invented outputs, a phenomenon known as ‘hallucinations’. In a CTI context, hallucinations could yield incorrect or harmful intelligence: false positives that trigger needless alerts, or false negatives that miss actual threats.
From a privacy and GDPR perspective, hallucinations could also cause a GenAI-driven CTI solution to expose personal information that was never meant to be revealed, to collect more data than necessary, or to present inaccurate or irrelevant information. These risks are compounded by AI’s lack of transparency and explainability, which makes it difficult for users to trust and validate the information the system produces.
The lack of transparency or traceability could also create legal exposure when an AI-powered CTI solution produces an accurate finding that human analysts then misinterpret.
For example, an AI model could flag customer data exposed on the Dark Web without explaining how it concluded that the data is relevant and accurate. A human analyst could then wrongly dismiss the alert as a false positive and fail to report the leak and notify the authorities, resulting in penalties and legal action.
Lastly, the GDPR has a decentralised enforcement model, under which national data protection authorities interpret its requirements and pursue violations independently. This has led to inconsistencies in enforcement and compliance. Uneven enforcement, combined with the risks described above, may create legal uncertainty, eroding trust and hindering the adoption of AI-powered CTI solutions in Europe.
The AI Act: new regulation, new challenges
The GDPR is not the only EU regulation that would govern AI-powered CTI solutions. The EU AI Act divides AI systems into risk categories – unacceptable, high, limited and minimal – based on the risk they pose to safety, fundamental rights, and trustworthiness.
AI-driven CTI solutions would likely fall under the high-risk category, which has strict obligations, such as maintaining high-quality training datasets, ensuring results are accurate, and providing transparency, model robustness, and security.
Maintaining both accuracy and high transparency could be challenging due to the trade-off that exists between complex models, which offer high accuracy but are difficult to interpret, and simpler models, which are easier to understand but less accurate.
Although research into more transparent and explainable AI models is underway, some models, such as the deep neural networks that underpin LLMs, contain billions of parameters and remain very difficult to interpret.
Clarification and balance
Managing these risks requires the EU, and institutions such as the European Data Protection Board, to establish clear safety and privacy guidelines for the use of GenAI in the cybersecurity sector. Such guidelines should clarify how organisations and vendors can use and develop GenAI-driven cybersecurity products that meet regulatory requirements, and how to balance potentially conflicting obligations.
These guidelines could, for example, instruct organisations on how to recognise and treat GenAI hallucinations, what legal protections they may have in cases of false negatives, or what to do if an AI-powered solution has accidentally revealed personal information.
Sharing case studies, risk scenarios, and mitigations with the public and the industry would allow European users and vendors to operate in a more stable and predictable environment. This would also encourage both innovation and adoption of next-generation security technologies.
A more predictable regulatory environment could also include a shared responsibility model, akin to the one used in cloud services, that divides legal responsibility between the end user, the CTI solution vendor, and the AI model developer.
Emphasising clarity would ensure more consistent enforcement and help vendors tailor their products to European end users. Providing national regulators with common baseline interpretations of these guidelines would go some way towards establishing common ground. This would encourage European organisations to adopt next-generation cybersecurity solutions and give them an edge over their adversaries in the cyber-AI race.