Research Project | January 2026 - June 2026

Disinformation in LLM Training Data

Investigating the relationship between online disinformation and the datasets used to train Large Language Models

About the Project

Russian disinformation has evolved into a critical component of hybrid warfare, systematically targeting the stability of Western democracies by eroding institutional trust and distorting political environments. The emergence of Large Language Models (LLMs) marks a new frontier in this conflict, as these systems risk absorbing and amplifying disinformation narratives present in their training data. The intersection of AI advancement and information manipulation necessitates research to address the rapid, persuasive, and coordinated spread of disinformation that threatens democratic resilience.

How does the presence of online disinformation within training datasets influence the reliability and outputs of major Large Language Models (LLMs), and to what extent does this shape user decision-making across different query themes?

We have launched a 6-month research project (starting January 2026) to investigate the relationship between disinformation and the datasets used to train LLMs. By testing LLMs against localised European disinformation, we aim to map the correlation between training data and disinformation-biased narratives.

To achieve this, the project focuses on mapping narratives and identifying how disinformation propagates from training data into LLM responses. The research will also examine LLMs through stress tests built from culturally specific disinformation profiles and thematic queries. Additionally, we aim to investigate how AI-generated narratives can influence institutional trust, user decision-making, and behaviour.
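As an illustration only, a minimal sketch of how such a stress-test harness could be organised: the profile fields, example structure, and the query_model helper below are placeholders we introduce for clarity, not the project's actual tooling.

from dataclasses import dataclass

@dataclass
class DisinfoProfile:
    # A localised disinformation narrative plus the thematic queries used to probe a model.
    # All field names here are illustrative placeholders.
    narrative_id: str
    locale: str
    theme: str
    queries: list[str]

def query_model(model_name: str, prompt: str) -> str:
    # Stub: replace with a call to the LLM under evaluation.
    raise NotImplementedError

def run_stress_test(models: list[str], profiles: list[DisinfoProfile]) -> list[dict]:
    # Collect one response per (model, profile, query) for later comparative analysis.
    records = []
    for model in models:
        for profile in profiles:
            for query in profile.queries:
                records.append({
                    "model": model,
                    "narrative_id": profile.narrative_id,
                    "locale": profile.locale,
                    "theme": profile.theme,
                    "query": query,
                    "response": query_model(model, query),
                })
    return records

In this framing, the collected records would feed the comparative analysis of model outputs described under the project deliverables.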

This research helps ensure that generative AI develops in a direction that supports democratic agency.

Project Deliverables

Research Manuscript

A comprehensive publication including a comparative analysis of model outputs

Interactive Dashboard

A data dashboard visualising the research results and findings

Policy Brief

A two-page policy recommendation brief addressed to the European Commission

How to Get Involved?

We welcome collaboration with researchers, policymakers, and organisations working at the intersection of AI, disinformation, and democratic resilience. The project is designed to be interdisciplinary and comparative, and we are particularly interested in expertise related to European information environments, computational social science, AI governance, and policy evaluation.

How can you contribute?

  • Sharing expertise on localised disinformation ecosystems, particularly within European contexts
  • Contributing methodological input on LLM evaluation, bias detection, or stress-testing frameworks
  • Providing access to relevant datasets, case studies, or policy perspectives
  • Offering feedback on findings

We are especially keen to engage with contributors who can help bridge technical analysis with societal and policy implications.

How can you participate?

Participants can contribute by sharing insights, co-creating knowledge, or taking part in feedback rounds on the policy brief and dashboard outputs.

Get in Touch

Louise Dufrasne Keegan

Researcher | Ingram Technologies

Contact Louise to get involved in this research project or to learn more about our work on AI, disinformation, and democratic resilience.