40 Years of Computing at Newcastle

Department Technical Report Series No. 550

An Assessment of Name Matching Algorithms.

A.J. Lait and B. Randell

University of Newcastle upon Tyne. 1996

Abstract

Many computer applications that record and process personal data must allow for variations in surname spelling, caused for example by transcription errors. A number of name matching algorithms, i.e. algorithms that attempt to identify spelling variations of a name, have been developed; one of the best known is the Soundex algorithm. This paper presents a comparative analysis of several of these algorithms and, based on their relative strengths and weaknesses, proposes a new and improved name matching algorithm, which we call the Phonex algorithm. The analysis takes advantage of the recent creation of a large list of "equivalent surnames", published in the book Family History Knowledge UK [Park]. This list is based on data supplied by some thousands of individual genealogists, and can be presumed to be representative of British surnames and their variations over the last two or three centuries. It thus made it possible to perform what we would argue were objective tests of name matching. The results provide a solid basis for our analysis and for our claims for the merits of the new algorithm, though these claims are unlikely to hold fully for surnames that originate largely from other countries.
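
For readers unfamiliar with Soundex, the following is a minimal sketch of the commonly described American Soundex coding, given only as an illustration of what a name matching key looks like. It is an assumption that this variant corresponds to the one assessed in the report; the abstract does not specify it, and the sketch does not attempt to reproduce Phonex. The names CODES and soundex are illustrative.

```python
# Illustrative sketch of the widely documented American Soundex variant.
# Assumption: this is not necessarily the exact variant assessed in the report.

# Map consonants to their Soundex digit; vowels, Y, H and W have no digit.
CODES = {}
for letters, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return a four-character key: first letter plus up to three digits."""
    # Keep only the letters A-Z, ignoring case, spaces and punctuation.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = letters[0]
    # The first letter's own code suppresses an identical code that follows it.
    prev = CODES.get(letters[0], "")

    for ch in letters[1:]:
        if ch in "HW":
            # H and W are skipped and do not separate letters with equal codes.
            continue
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        # A vowel resets prev, so equal codes on either side are both kept.
        prev = code

    # Pad short keys with zeros to a fixed length of four.
    return result.ljust(4, "0")


if __name__ == "__main__":
    for surname in ("Smith", "Smyth", "Robert", "Rupert"):
        print(surname, soundex(surname))
```

Two surnames are treated as a match when their keys are equal, so "Smith" and "Smyth" match, as do "Robert" and "Rupert". The second pair shows the kind of false match that a comparative assessment of such algorithms has to weigh against the spelling variations they successfully catch.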
Technical Report Abstract No. 550, 30 June 1997