Project
This project investigates how collaboration with a disagreeing Large Language Model (LLM) affects individual belief change. In doing so, we assess how digital spaces, contrary to their tendency toward polarization and echo chambers, can become places of constructive discourse. Building on research on intergroup processes, the project examines whether adversarial collaboration, the joint creation of a text with a disagreeing LLM, leads to belief updating. This research question is addressed in two experimental studies. In the first experiment, an LLM is trained either to adopt an opposing position or to co-author a compromise text with participants. The second experiment investigates how congenial collaboration, in which the LLM affirms the participant's beliefs, affects attitudes compared with adversarial collaboration. Across both experiments, belief updating, digital ownership, and acceptance of opposing arguments are measured. In addition, the intensity of voluntary interaction with the LLM and the number of arguments participants contribute are explored. If adversarial collaboration with LLMs proves effective in producing belief updating, the method offers several advantages: unlike discussion groups or citizen assemblies, it is individualized, lower in conflict, and scalable. It thus opens new perspectives for fostering constructive discourse in digital spaces, particularly in everyday internet use, but also in schools, higher education, and other educational contexts. The project thereby contributes to the broader question of how AI technologies can be used to promote democratic discourse.