Archive of Tomorrow – Connecting web archives with users through the health information in the Archive of Tomorrow collection
Outputs from this project
Data:
https://github.com/aurigandrea/NLS-Fellowship-2024
Papers:
Talboom, L., & Kocsis, A. (2025). Digital healing: Metadata and documentation for health web archives. Journal of Open Humanities Data, 11, 1-7. https://doi.org/10.5334/johd.272
Kocsis, A., & Talboom, L. (2025). Engaging audiences with the UK Web Archive: Strategies for general readers, data users, and the digitally curious. Paper presented at Research Infrastructure for the Study of Archived Web Materials Conference, Siegen, Germany. https://www.research.ed.ac.uk/en/publications/engaging-audiences-with-the-uk-web-archive-strategies-for-general
Kocsis, A., & Talboom, L. (2025). From pages to people: Tailoring web archives for different use cases. Paper presented at IIPC General Assembly and Web Archiving Conference, Oslo, Norway. Available via the University of Edinburgh Research Explorer.
Panel discussions:
Beyond Preservation: Engaging Audiences and Researchers with Web Archives, IIPC General Assembly and Web Archiving Conference, National Library of Norway, Oslo, Norway, 9 Apr 2025.
Striking the Balance: Empowering Web Archivists and Researchers, IIPC General Assembly and Web Archiving Conference, National Library of France, Paris, France, 26 Apr 2024.
The project uses research conducted on the UK Web Archive’s Archive of Tomorrow (AoT) dataset as a gateway to interacting with web archives. While web archives might seem intimidating at first glance, they hold a wealth of knowledge for a variety of users, from professional researchers to everyday library visitors who wish to better understand our recent past. By designing new interfaces and resources, the project aims to make web archives more accessible to both researchers and broader audiences.
The Archive of Tomorrow – Talking about Health project ran from 2022 to 2023, collecting online health information. During this time, Andrea Kocsis collaborated with the web archivists at the University of Cambridge on a machine learning pilot study to understand the collection’s true potential. Building on these preliminary results, Andrea used AoT data to make web archives easier to understand, explore their usability, and help users connect with the Library and with each other by discovering our shared discussion on health.
The project had three objectives:
- Develop an interactive web app and display screen to explore and play with the dataset
Aimed at a broad audience, the interactive web platform will serve as a gamified interface to the collection, where users can explore and experiment with the data and the results.
- Contribute Jupyter notebooks to the Data Foundry’s notebook collection
For those who would like to engage more deeply with the dataset through distant reading, the project offers Jupyter notebooks showing how to rehydrate the articles from metadata and how to apply basic Natural Language Processing to them.
- Run entry-level technical workshops
To bridge the potential digital literacy gap between users and dataset creators, the project offers beginner-level, non-coding workshops on distant reading, using the collection as an example.
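To give a flavour of the basic Natural Language Processing mentioned under the notebooks objective, the following is a minimal sketch of a word-frequency count, one common starting point for distant reading. It is not taken from the project's notebooks: the sample texts, the stop word list, and the function names `tokenize` and `term_frequencies` are all hypothetical, and the rehydration step is assumed to have already produced plain text.

```python
import re
from collections import Counter

# Hypothetical sample texts standing in for rehydrated pages;
# they are NOT taken from the Archive of Tomorrow dataset.
SAMPLE_TEXTS = [
    "Vaccines protect public health. Vaccines are widely discussed online.",
    "Public health advice changes as new evidence about vaccines emerges.",
]

# A tiny illustrative stop word list (an assumption); a real analysis
# would use a fuller list from an NLP library.
STOP_WORDS = {"a", "about", "and", "are", "as", "is", "new", "of", "the"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(texts, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(t for t in tokenize(text) if t not in stop_words)
    return counts


if __name__ == "__main__":
    # Print the three most frequent terms across the sample texts.
    for word, count in term_frequencies(SAMPLE_TEXTS).most_common(3):
        print(word, count)
```

Counting terms across many pages at once, rather than reading each page closely, is the basic idea behind distant reading; the same pattern scales from two sentences to a whole collection.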
Collaborators: National Library of Scotland
Funder: National Library of Scotland, The National Librarian’s Research Fellowship in Digital Scholarship 2024-25
Project dates: 2024-2025






