Stepping into the Black Box of AI

Outputs from this project

Forthcoming!

Modern AI systems are composed of so many individual parts – billions of them – that even experts often struggle to meaningfully understand how they arrive at the results they produce: they’re a “black box”. In addition, the human decision-making processes that accompany the deployment of AI systems in the real world – questions of accountability and human oversight – are often just as opaque as the algorithms themselves.

We set out to create an experience that could allow anyone, regardless of their knowledge about AI, to be a meaningful participant in conversations around the deployment of AI systems. We turned a physical black box into an interactive installation where two people can interact with “EdinBot”, a chatbot that presents itself as an expert on the city of Edinburgh. One person uses a tablet mounted on the outside of the box to ask it any question, just as they would with any chatbot. The other person, meanwhile, is invited to step inside the box, where they can intercept some of the “decisions” the algorithm is making and influence its response.

Large language models construct answers by predicting the next token (roughly, a word or word fragment) one at a time. For each token, they typically determine several likely candidates before selecting one, often at random in proportion to the probability assigned to each candidate. In our installation, however, the person inside the box makes this choice themselves at key moments. This lets them steer the model’s response in a direction of their choosing, while also seeing first-hand the range of possible responses and the method the model uses to “choose” between them.
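
To make the intercept idea concrete, here is a minimal sketch of human-in-the-loop token selection using the Hugging Face transformers library. The choice of model (gpt2), the number of candidates shown (top_k=5), and the decision to intercept every token rather than only key moments are illustrative assumptions, not details of EdinBot’s actual implementation.

```python
# Minimal sketch: let a human pick the next token instead of sampling at random.
# Model and top_k are illustrative; EdinBot's real setup may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def generate_with_intercept(prompt: str, max_new_tokens: int = 20, top_k: int = 5) -> str:
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = model(ids).logits[0, -1]   # scores for the next token
        probs = torch.softmax(logits, dim=-1)
        top = torch.topk(probs, top_k)          # the model's most likely candidates
        # Show the candidates and their probabilities to the person in the box.
        for i, (tok, p) in enumerate(zip(top.indices, top.values)):
            print(f"{i}: {tokenizer.decode(int(tok))!r} ({p:.0%})")
        choice = int(input("Pick a token: "))   # the human replaces the sampler
        ids = torch.cat([ids, top.indices[choice].view(1, 1)], dim=-1)
    return tokenizer.decode(ids[0])

print(generate_with_intercept("Edinburgh is famous for"))
```

Sampling at random (weighted by the candidates’ probabilities) is what the model would normally do; here that single step is handed to a person, which is all the installation needs to make the model’s internal choices visible and steerable.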

We set up the installation at a few public-facing events around the University, invited participants to try out both roles, and then asked them to reflect on their experience. What would they (not) trust EdinBot with? What did Edinburgh look like from inside the box? While some said they could trust EdinBot to give high-level tourist information, many commented on its generic and unreliable answers. Encouragingly, many participants said the experience helped them learn more about the workings of LLMs and form their own opinions about the system’s trustworthiness.

Collaborators: Jingjie Li (University of Edinburgh)

Funder: UKRI AI Centre for Doctoral Training in Responsible and Trustworthy in-the-world Natural Language Processing

Project dates: 2024