
Reading between the lies: using leak sites to analyse ransomware trends

Leak site data is a tempting source for researchers, but it must be used cautiously. Selection biases, inaccuracy, and manipulation by ransomware groups all pose risks.

In September 2024, a group called Valencia Ransomware announced their presence to the world by posting data from five alleged victims, including an Indian paper producer, a Malaysian pharmaceutical company, and a Californian municipality, on their leak site’s ‘wall of shame.’ 

Since 2020, operating a leak or shame site has practically become a matter of course for the ransomware business model. These sites play an essential role in ransomware groups’ double-extortion schemes. The groups not only encrypt victims’ systems – often rendering them inoperable – but also siphon off some of their data, which they can later threaten to release on leak sites. The goal is simple: the threat of releasing data often significantly increases the cost of non-compliance with ransom demands and can pressure companies to pay even if they are able to restore access to the encrypted data independently. Such threats can be especially effective if the stolen data includes sensitive information, such as customer purchase information, employee addresses and payroll details, or patient health records.

Data from leak sites are frequently used by journalists, practitioners, scholars, and think tank experts to shed light on various aspects of ransomware incidents, such as current ransomware activities, timing with geopolitical events, geographic spread, or type of companies targeted.

In Ransom War: How Cyber Crime Became a Threat to National Security, one of the authors of this post, Max Smeets, used data from the leak site Conti News to assess trends in targeting. Data from Conti News suggested that the group, Conti, had begun to explore targets in markets beyond the Western world. In 2020, Conti only very sporadically released leaked data from victims in non-Western regions. For example, in early December 2020, they put up for sale data from a relatively small Indian company called Ixsight Technologies. They also offered data from a smaller information technology firm in the United Arab Emirates, CORE Information Technology Consultants. But these were exceptions to the rule. In 2021, a shift seems to have occurred. Conti began expanding its operations into other markets, with an emphasis on actively targeting organisations in Latin America. Data from various Latin American companies began to appear on their leak site.

This type of data can be tempting to use because of its accessibility and breadth – it can even be obtained in bulk from scraping websites like ecrime.ch. However, manipulation by ransomware groups, selection biases, and inaccuracy necessitate a cautious approach – one that too often is not taken.

Selection bias

A key limitation in using leak site data is its inherent selection bias. These sites only showcase victims who do not meet ransom demands, meaning we see a skewed picture of ransomware activity. A highly effective ransomware group that secures a high percentage of ransom payments may appear less active than a less successful group that posts more victims online. This hinders cross-group comparisons, complicating the analysis of groups like BlackCat, the culprit behind the UnitedHealth incident; Akira, which has received at least $42 million in ransom payments from hundreds of victims; and their competitors.

This bias also impedes within-group comparisons, making it difficult to discern activity trends of ransomware groups over time. A decline in the volume of leaked data does not necessarily indicate a reduction in ransomware operations; it could also imply that more victims are yielding to ransom demands as the ransomware group becomes better at stealing critical data.
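The two comparisons above can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that every non-paying victim is listed and no paying victim is; real groups do not behave this cleanly. The function name and all figures are hypothetical and are not estimates for any real group.

```python
def expected_leak_posts(victims: int, payment_rate: float) -> float:
    """Expected leak-site listings if only non-paying victims are posted.

    Simplifying assumption (not from any real dataset): each victim who
    refuses to pay is listed exactly once, and paying victims never appear.
    """
    if not 0.0 <= payment_rate <= 1.0:
        raise ValueError("payment_rate must be between 0 and 1")
    return victims * (1.0 - payment_rate)


# Cross-group comparison: the group with more victims shows fewer listings.
effective_group = expected_leak_posts(victims=400, payment_rate=0.75)
struggling_group = expected_leak_posts(victims=200, payment_rate=0.25)
print(effective_group, struggling_group)  # 100.0 150.0

# Same group over time: listings fall while the number of victims rises.
year_one = expected_leak_posts(victims=300, payment_rate=0.50)
year_two = expected_leak_posts(victims=400, payment_rate=0.75)
print(year_one, year_two)  # 150.0 100.0
```

In both scenarios the listing count moves in the opposite direction to the underlying activity, which is why a count of posts alone cannot rank groups or establish a trend without some independent estimate of payment rates.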

Similarly, certain sectors, such as government institutions, may be overrepresented in the data simply because they are less likely to pay due to regulatory restrictions, further distorting trend analyses.

Overestimating and underestimating

In a 2022 article, cyber threat analyst Will Thomas illustrated how relying only on leak site data can lead to a stark underestimation of actual ransomware cases. The example of REvil, the ransomware gang behind the now infamous Kaseya supply-chain attack, is indicative. As Thomas points out, the group’s leak site showed only 288 victims, yet, when seven of its members were arrested by Europol in 2021, they alone were accused of around 7,000 cases of ransomware use.
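As a rough worked example, the two REvil figures quoted above can be turned into a coverage ratio. The function name is hypothetical, and the calculation treats the roughly 7,000 alleged cases as a lower bound on the group's activity, which is itself an assumption, since the figure relates only to the arrested individuals.

```python
def leak_site_coverage(listed: int, known_cases: int) -> float:
    """Share of known cases that were visible on the leak site."""
    if known_cases <= 0:
        raise ValueError("known_cases must be positive")
    return listed / known_cases


listed_on_site = 288   # victims shown on the leak site, as cited in the text
alleged_cases = 7_000  # approximate number of alleged cases, as cited in the text

coverage = leak_site_coverage(listed_on_site, alleged_cases)
print(f"Leak site coverage: {coverage:.1%}")                               # 4.1%
print(f"Undercount factor: about {alleged_cases / listed_on_site:.0f}x")   # about 24x
```

On these figures, the leak site would have captured only around one in every 24 alleged cases.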

Overestimation is also a risk. Ransomware groups have incentives to inflate their presence on leak sites. They may fake listings, post old data, or post other groups’ victims. These tactics can create a false sense of scale, making it look like they have targeted more victims than they have. These inflated numbers serve several purposes. A larger list of victims helps build a stronger reputation within the criminal ecosystem, making the group appear more successful and attracting more affiliates who want to profit from their perceived dominance. An inflated reputation also gives ransomware groups leverage in negotiations. Victims are more likely to pay if they believe the group is powerful and capable of causing widespread damage.
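One of these inflation tactics, re-posting another group's victims, can at least be partially screened for before counting. The sketch below is a minimal illustration assuming listings have already been collected as (group, victim name) pairs; the `Listing` type, the crude name normalisation, and the sample records are all hypothetical. It cannot detect fabricated or recycled listings that appear under a single group only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    group: str   # ransomware group that posted the listing
    victim: str  # victim name as written on the leak site


def normalise(name: str) -> str:
    """Crude normalisation: lowercase and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def cross_posted(listings: list[Listing]) -> dict[str, list[str]]:
    """Return victims claimed by more than one group, keyed by normalised name."""
    groups_by_victim: dict[str, set[str]] = defaultdict(set)
    for item in listings:
        groups_by_victim[normalise(item.victim)].add(item.group)
    return {
        victim: sorted(groups)
        for victim, groups in groups_by_victim.items()
        if len(groups) > 1
    }


# Invented sample data for illustration only.
sample = [
    Listing("GroupA", "Example Paper Ltd."),
    Listing("GroupB", "example paper ltd"),
    Listing("GroupB", "Sample Pharma"),
]
print(cross_posted(sample))  # {'examplepaperltd': ['GroupA', 'GroupB']}
```

Flagged entries still need manual review: a victim may genuinely have been hit twice, and simple string matching will miss renamed or misspelled organisations.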

Imagine you are a cybercrime officer relying on leak site data to assess whether you are effectively combating the ransomware threat. Are you looking for an increase in the number of victims listed, which could suggest that fewer victims are paying ransoms, perhaps due to better defences? Or would you rather see fewer entries, possibly indicating a reduction in overall ransomware activity? Unless you are strictly focused on minimising data leaks, the number of leak-site posts should not be trusted as an indicator of success in combating ransomware.

The more we use it, the less useful leak data becomes

The incentives for these groups to manipulate data increase the more the research community, journalists, and industry practitioners rely on this data to analyse their behaviour and describe trends.

Cybersecurity journalist Valéry Riess-Marchive has pointed out that the uncritical use of this data not only skews research results but may even serve ransomware groups’ interests by amplifying their perceived power.

Ransomware groups are observant. They do not operate in a vacuum but constantly engage with the public. They are aware of the influence their leak sites can exert.

Need for a careful approach

Despite these issues, leak site data can still offer useful insights, for example as early indicators of new or resurgent ransomware groups. However, we must proceed with caution. We should not give in to the temptation of using this data for our analyses simply because it is easily available. It is important to dissect the data critically, recognising both its value and its limitations. If we choose to include leak site data in our analyses, we must clearly acknowledge its potential shortcomings and ensure these are discussed transparently. Ultimately, we are relying on a source maintained by criminal actors who make their living through deception.
