Bad Data, Better Defenses: How Russian Narratives Infiltrate AI

Ongoing Research Project

Disinformation in LLM Training Data

Our First 60 Days

Data poisoning has increasingly become a valuable Trojan horse in hybrid and cognitive warfare strategies. The practice of filling commonly used digital platforms with disinformation has proven itself to be one of the most efficient methods for destabilising democracies. Over the years, Russian disinformation operations have evolved into a critical component of hybrid warfare, systematically targeting the stability of Western democracies by eroding institutional trust and distorting political environments. The emergence of Large Language Models (LLMs) marks a new frontier in this conflict, as these systems risk absorbing and amplifying disinformation narratives present in their training data. Our project focuses on mapping the origins and impact of harmful narratives and identifying how disinformation is propagated from the training data to the LLM responses' pipeline.

Our research will examine LLMs through stress tests consisting of culturally specific disinformation profiles and thematic queries. We aim to investigate the link between AI-generated narratives and their capacity to influence institutional trust and user decision-making and behaviour.

The project status, two months in, is firmly on track following a launch phase that has established a strong theoretical and methodological foundation. The first phase consisted of collecting Kremlin-aligned sources and narratives, which we have gathered from expert consultations, academic sources, previous research projects, open analytical sources and publicly available case databases. The AttackIndex tool, an automated analytical service for processing large-scale information data sets, helped consolidate this dataset. The information we have gathered in this stage will be used to further contribute to the LLM stress-testing stage of the project's methodology.

These datasets are available upon request by contacting researcher Louise Dufrasne Keegan at Louise.dufrasne@ingram.tech.

On February 2nd, 2026, we were invited to attend the AI in Defence Summit to showcase our research project to some of the EU's most influential political leaders and industry experts. Through these discussions with professionals, we gained a better understanding of this project's positioning in the field.

The next phase of this project will focus on data analysis and clustering through topic modelling in order to uncover narratives filtered by relevance to set up our LLM-prompting methodology. Profile cases will be designed and then deployed in order to be able to do so.

We are pleased with the progress made over these first two months and are now focused on the practical work of analysis. Our priority is to move from data collection into the testing phase to better understand how these narratives persist within AI systems. We look forward to sharing our findings as the research continues to develop.