Alumni

Federica Collina

Dissertation Title Information and Communication Technologies (ICT) per il patrimonio museale “dimenticato”: strategie di valorizzazione e comunicazione attraverso tecnologie digitali

Abstract The vast and complex landscape of Italian museums is characterized by a widespread presence of small institutions that, despite safeguarding heritage of immense historical, artistic, and cultural value, often remain marginal, affected by limited resources, low visibility, and reduced visitor numbers. This research work is situated within this context, addressing the challenge of developing effective communication and enhancement strategies for this “forgotten” cultural heritage through the targeted use of Information and Communication Technologies (ICT). The overall objective has been to develop a replicable operational model grounded in strong synergy between museums and research institutions, capable of overcoming the economic and structural constraints typical of peripheral contexts. The research focused on four specific objectives: identifying effective communication strategies through ICT, implementing and analyzing collaboration between museums and research institutions, evaluating the effectiveness of the adopted digital solutions, and enhancing the communicative potential of the collections under study. The methodology involved the analysis and development of digital strategies for four small- to medium-sized institutions within the Italian museum landscape—each with fewer than 40,000 visitors annually—and one larger international case. The Italian case studies include the Anthropology Collection (SMA), the National Museum of the Neoclassical Age in Romagna in Faenza – Palazzo Milzetti, the Museum of the Landscape of the Faenza Apennines – Rocca di Riolo, and the National Archaeological Museum of Ravenna. These are complemented by the Butrint National Park, a UNESCO site with over 200,000 annual visitors, where the specific focus of the research (the Temple of Asclepius) is nevertheless affected by a marginal location that limits its accessibility and use. 
For each case, tailored digital tools for enhancement and communication of cultural heritage were developed, employing a range of technologies including Virtual Museums, narrative audioguides, 3D-printed tactile reproductions, Virtual Reality, 360° Virtual Tours, and 3D reconstructions. The effectiveness of these tools was assessed through User Experience (UX) testing conducted on samples divided by generation. The results show that the adopted digital solutions proved effective in increasing the museums' attractiveness. In conclusion, the research not only outlines the critical issues affecting small museums but also proposes an operational model that, through structured collaboration between research and institutions and the targeted adoption of ICT, can offer new and concrete perspectives for the communication and enhancement of “forgotten” heritage, fostering the inclusion of these institutions within broader cultural networks.

Supervisor Alessandro Iannucci

Co-supervisors Chiara Panciroli and Gustavo Marfia

Keywords Information and Communication Technologies; Forgotten Cultural Heritage; Enhancement; Communication; Cultural Sites

Lucia Giagnolini

Dissertation Title Representing Born-Digital Literary Archives: from the Filesystem to the Knowledge Graph

Abstract The growing production of born-digital archives by contemporary authors generates complex documentary ecosystems that challenge traditional archival descriptive tools. Although the international archival community is increasingly oriented towards Linked Open Data (LOD), semantic models capable of addressing the specificities of literary born-digital materials remain lacking. This research proposes an approach to the representation of born-digital literary archives articulated across four levels: the analysis of authorial practices; the design of a semantic model; the implementation of automated description workflows; and the development of tools for analysis and access.

The phenomenological investigation involved fifty finalists of the Premio Strega and Premio Campiello literary awards, as well as an in-depth examination of the Valerio Evangelisti Archive – the most extensive Italian case study documented to date. The inquiry revealed recurring management patterns, strategies of digital self-representation, and forms of "archival will," alongside a generally limited awareness of documentary value among the authors surveyed. From this empirical analysis, five modelling requirements were identified: the representation of the physical, logical, and conceptual layering of digital materials; the integration of cryptographic integrity measures; the representation of native metadata; and the documentation of provenance and contextual relationships.

On these bases, the Born-Digital Ontology (BoDi) was developed as an extension of the Records in Contexts Ontology (RiC-O). An automated five-phase workflow was subsequently implemented to convert born-digital archives into RDF graphs compliant with BoDi. When tested on the Evangelisti Archive, the process generated over 60 million triples, making explicit the structures, metadata, and relationships needed to transform an extensive and opaque documentary corpus into a formalised knowledge base. Advanced querying through SPARQL and visualisation via a dedicated application enabled both sophisticated interrogation and accessible exploration at multiple levels of engagement.

The convergence of phenomenological inquiry, ontological modelling, automation, and visualisation delineates an integrated and replicable methodological framework for the representation and enhancement of born-digital literary archives as part of contemporary cultural heritage.

Supervisor Francesca Tomasi

Co-supervisors Paolo Bonora and Paola Italia

Keywords Born-Digital Archives; Literary Archives; BoDi; Born-Digital Ontology; Archival Representation; Archival Description; Semantic Web; Linked Open Data; Records in Contexts; Born-Digital Heritage; Valerio Evangelisti

Arcangelo Massari

Dissertation Title HERITRACE: enabling domain expert participation in semantic data curation with integrated provenance and change tracking

Abstract Cultural heritage institutions increasingly adopt Semantic Web technologies for FAIR compliance. In cultural heritage, semantic data is inherently interpretative and requires human curation, yet technical complexity prevents domain experts from contributing. This thesis addresses the usability gap through two research questions: RQ1 asks how to design interfaces enabling domain experts to curate RDF data without requiring technical expertise while maintaining provenance and change tracking; RQ2 asks how technical staff can perform one-time configuration of curation environments for specialized domains.

Two case studies serve distinct roles: OpenCitations Meta validates that barriers exist at scale in systems processing 124 million entities; ParaText serves as a testbed for guerrilla testing. These reveal five convergent requirements: provenance management, change tracking, usability for both domain experts and configurators, flexible customization, and integration with existing RDF collections.

HERITRACE addresses these requirements through a framework built on the OpenCitations Data Model for provenance and change tracking, with the Time Agnostic Library enabling reconstruction of past entity states. Technicians use familiar SHACL shapes and YAML rules to configure entity types, validation constraints, and display settings; the framework then generates interfaces from these configurations. End users create, modify, delete, and merge entities and restore previous states through version history, without awareness of the underlying RDF infrastructure. Evaluation with 9 end users and 10 technicians combines quantitative measures and grounded theory analysis. For RQ1, end users completed curation tasks with 67% to 100% success and above-average usability (SUS 78.9). For RQ2, technicians achieved 90% success with excellent usability (SUS 83.8).

Since most institutions maintain collections in relational databases, preliminary research explores extending RML to enable inverse transformations: RDF serves as a lingua franca, with HERITRACE curating semantic data while inverse mappings transfer modifications back to original databases, allowing institutions to adopt FAIR principles without changing their existing infrastructure.

Supervisor Silvio Peroni

Co-supervisor Anastasia Dimou

Keywords FAIR; Usability; Provenance; Change-Tracking; Cultural Heritage

Margherita Mattioni

Dissertation Title La biblioteca d’autore come modello epistemologico e laboratorio narrativo. Un’indagine filosofica sulla biblioteca di lavoro e sui libri postillati di Umberto Eco

Abstract This doctoral dissertation aims to examine, through a philosophical inquiry, Umberto Eco’s authorial library, exploring in particular the ways and practices through which the renowned intellectual questioned and appropriated his “vegetal memory.” The first part of the thesis introduces the hypothesis of Eco’s “theoretical library,” namely the existence of a system of regulatory principles and procedural norms that may have guided his reading and writing practices. In light of this assumption, the main features of the model reader, empirical author, and model author are analysed, showing how the techniques and strategies employed by both the empirical and the model Eco conform to the principles developed within his literary semiotics, narratology, and, more broadly, his theory of the text. Particular attention is devoted to the intertextual and metanarrative habits that characterize the author’s textual voice across his seven novels. The first section concludes with an analysis of the associative criteria — topological and contextual — that define the physical organization and conceptual structure of his modern library.

The second part focuses on the methodology and guiding principles underlying the analysis of a sample of annotated books from Eco’s library, as well as on the presentation and critical discussion of the interpretive map through which the author’s marginalia and textual interventions have been classified according to their spatial, functional, expressive, linguistic, thematic, and referential properties.

The concluding section explores the hermeneutic and applicative potential of the meta-model, showing how it proves useful not only for deepening our understanding of Eco’s strategies of appropriation and reading–writing practices, but also as a tool for analysing the annotative interventions of other authors.

Supervisor Costantino Marmo

Co-supervisor Riccardo Fedriga

Keywords Umberto Eco; Authors' Libraries; Marginalia; Reading Practices; Theory of Interpretation and Classification

Andrea Schimmenti

Dissertation Title Structuring cultural heritage content and context: integrating LLMs in ontology-driven knowledge graph extraction

Abstract Cultural Heritage institutions have digitized extensive collections and published their metadata through Semantic Web technologies, yet the content and the scholarly contextualization of documents (entities, relationships, events, and interpretations) remain largely inaccessible through semantic querying. Manual Knowledge Graph creation proves prohibitively expensive at scale, while automatic Knowledge Extraction faces critical barriers in CH contexts: limited annotated training data and domain-specific linguistic complexity. This dissertation investigates automatic Knowledge Graph extraction from Cultural Heritage texts in data-scarce scenarios, addressing three research questions: (1) What methodologies and challenges characterize existing CH text-to-KG projects? (2) How can Large Language Models be integrated into ontology-driven Knowledge Extraction pipelines, and what are the limitations and trade-offs? (3) Can LLM-based systems produce sufficiently accurate Knowledge Graphs of scholarly interpretations while preserving provenance and epistemic uncertainty? We conduct a systematic survey of eleven CH projects (2015-2025) and analyze 227 papers, identifying persistent bottlenecks in Named Entity Recognition, Relationship Extraction, and Entity Linking. We introduce Adaptive Text-to-KG for Cultural Heritage (ATR4CH), a five-step methodology coordinating ontology analysis, Competency Question formulation, ground-truth annotation, LLM-based extraction, and multi-layered evaluation. We validate ATR4CH through case studies including authenticity debates, archival finding aids, RAG-based argument extraction, and synthetic training data generation for Aspect-Based Sentiment Analysis. Results establish that LLMs enable ontology-aligned extraction under data scarcity, achieving accuracy sufficient for scholarly workflows.
LLMs augment rather than replace traditional pipelines, providing capabilities for bootstrapping development and serving domains where annotation costs cannot be justified. However, human oversight remains necessary: errors may propagate through pipelines, data alignment represents a persistent bottleneck, and epistemic uncertainty requires continued development. This dissertation advances the state of the art by providing a replicable methodological framework and empirical evidence that LLM-based extraction can bridge the gap between digitization and semantic accessibility of Cultural Heritage repositories.

Supervisor Silvio Peroni

Co-supervisor Francesca Tomasi

Keywords Knowledge Graph; Knowledge Extraction; Cultural Heritage; Large Language Models; Natural Language Processing