Tariq Yousef (Hamburg University) and Daniel Kinitz give a talk at DH 2023 in Graz on "Similarity-Based Clustering of Premodern Arabic Names".
1 Introduction
Data repositories must manage the identity of their entities. In intellectual history, the challenge lies in premodern and therefore non-standardised entity names. Our use case deals with Arabic persons related to manuscripts (scholars, scribes, etc.). Multiple occurrences of the same person, with different spellings and name compositions, must therefore be identified and disambiguated. This paper presents a graph clustering approach that combines literal and numerical properties (name and year of event), with promising results. The particular challenge lies in the vast variability of name variants and in sometimes unspecific dates.
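To make the idea of combining a literal and a numerical property more concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not specify the similarity measure or the clustering algorithm, so this sketch assumes a plain string ratio from the standard library for names, a fixed year tolerance for dates, and connected components as a simple stand-in for graph clustering. The records, the thresholds (`name_threshold`, `max_gap`) and all function names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical records: (transliterated name, year of a related event or None).
# These are invented for illustration and do not come from the paper's data.
RECORDS = [
    ("Muhammad ibn Ahmad al-Halabi", 1650),
    ("Muhammad b. Ahmad al-Halabi", 1652),
    ("Ali ibn Yusuf al-Misri", 1710),
    ("Ali b. Yusuf al-Misri", None),
    ("Muhammad ibn Ahmad al-Halabi", 1820),
]


def name_similarity(a, b):
    """Literal property: character-level similarity in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def years_compatible(y1, y2, max_gap=40):
    """Numerical property: years must lie within max_gap of each other.

    A missing year is treated as compatible with anything (an assumption).
    """
    if y1 is None or y2 is None:
        return True
    return abs(y1 - y2) <= max_gap


def cluster(records, name_threshold=0.85, max_gap=40):
    """Link records that pass both checks, then return connected components."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        (n1, y1), (n2, y2) = records[i], records[j]
        if (name_similarity(n1, n2) >= name_threshold
                and years_compatible(y1, y2, max_gap)):
            parent[find(i)] = find(j)  # add an edge by merging the two sets

    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    for group in cluster(RECORDS):
        print([RECORDS[i] for i in group])
```

With these invented records, the two spellings from the 1650s end up in one group, the two "al-Misri" entries in another, and the identically named entry dated 1820 stays separate because its year is too far away. Note that treating a missing year as compatible can chain otherwise distant records together through connected components, which is one reason a more careful clustering step would be needed in practice.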
See the full programme: https://www.conftool.pro/dh2023/sessions.php
The paper will be published in the conference proceedings.