
Ben Levinstein / Belief and trust in AI

Philosophy | Friday, February 6, 2026, 3:00 pm - 5:00 pm | Skinner Building, Room 1115

On Friday, February 6, we will be joined by Ben Levinstein, Associate Professor of Philosophy at the University of Illinois, Urbana-Champaign, who will present his work on "Belief and trust in AI." The abstract is below.


Contemporary AI systems demonstrate remarkable capabilities, from passing PhD-level exams to sophisticated role-play and common-sense reasoning. As we increasingly delegate consequential decisions to these systems—in hiring, medical diagnosis, and beyond—two fundamental questions arise: Under what conditions can we responsibly trust AI agents? And to what extent can we meaningfully attribute action-guiding beliefs and desires to systems like large language models (LLMs)? This talk develops a two-pronged approach to these challenges. First, I outline general conditions under which delegation to AI systems is rational and ethical, even under uncertainty about their internal states. Second, drawing on insights from radical interpretation (Davidson, Lewis) and contemporary machine learning, I propose four adequacy criteria for identifying belief-like representations in LLMs. I then argue that current interpretability methods fail to satisfy these criteria for both conceptual and empirical reasons. I conclude by exploring alternative approaches to responsible AI deployment when mental state attribution remains uncertain.
