Identifying Ethnicity: comparison of two computer programmes
Since April 1996, the NHS has expected that all hospital trusts will record, and provide as part of the 'contract minimum data set', data relating to the ethnic origin of all 'admitted patients'. Ethnic monitoring requires the self-identification of individuals as belonging to one or more groups, defined in terms of their culture and origin. The NHS has also supported the development of ethnic monitoring procedures in primary care with several pilot sites participating in this process.
However, although ethnicity should now be recorded routinely by all hospital trusts, the data are often incomplete. In the absence of self-assigned ethnicity data, name analysis can offer a useful alternative for identifying South Asian and other populations with distinct names. Since name analysis by visual inspection can be time-consuming and inaccurate, attempts have been made to develop software to perform this function. Two computer programmes designed to assist in such name analysis are currently available in the UK.
Nam Pehchan
This computer programme was developed by Bradford City Council and Health Authority in the 1980s. The software contains a dictionary of South Asian names, which it attempts to match against the complete name or the name stem (usually the first five characters of an individual's name) in order to provide a list of South Asians together with a language and religion marker for each person. In a study by Birmingham University (3), Nam Pehchan had a sensitivity of 96% and a positive predictive value (PPV) of 67.4% against names from Yorkshire, but scored only 88.2% and 58.7% respectively in populations from Thames.
SANGRA
The SANGRA (South Asian Names and Group Recognition Algorithm) programme, developed by the London School of Hygiene and Tropical Medicine, also contains a dictionary of South Asian names. However, its dictionary has been developed from a number of different sources in order to make it more accurate for populations across the UK. Validation has shown a sensitivity of 90.7% and a PPV of 80.1% for London inpatients, with results (according to the designers) much the same across all areas of the UK (4).
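The dictionary lookup described for Nam Pehchan (a match on the complete name, falling back to a match on the name stem) can be sketched roughly as follows. This is only an illustration of the general idea, not the actual algorithm of either programme: the three dictionary entries, their language and religion markers, and the function names are all hypothetical.

```python
# Hypothetical mini-dictionary for illustration only; the real programmes
# use far larger curated name lists with their own markers.
NAME_DICTIONARY = {
    "patel": ("Gujarati", "Hindu"),
    "singh": ("Punjabi", "Sikh"),
    "hussain": ("Urdu", "Muslim"),
}

STEM_LENGTH = 5  # the text says the stem is usually the first five characters


def build_stem_index(dictionary, stem_length=STEM_LENGTH):
    """Map each name stem to the markers of the first dictionary entry sharing it."""
    index = {}
    for name, markers in dictionary.items():
        index.setdefault(name[:stem_length], markers)
    return index


def classify_name(name, dictionary=NAME_DICTIONARY, stem_length=STEM_LENGTH):
    """Return (language, religion, match_type), or None if nothing matches."""
    key = name.strip().lower()
    # First try the complete name.
    if key in dictionary:
        return (*dictionary[key], "full")
    # Otherwise fall back to the name stem.
    markers = build_stem_index(dictionary, stem_length).get(key[:stem_length])
    if markers is not None:
        return (*markers, "stem")
    return None


if __name__ == "__main__":
    for candidate in ["Patel", "Hussaini", "Smith"]:
        print(candidate, classify_name(candidate))
```

A stem match is looser than a full match, which is one plausible reason a dictionary approach can achieve high sensitivity at the cost of a lower PPV.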
Further work
Self-assigned ethnicity data collected using OPCS coding is considered the benchmark, or gold standard, for ethnicity data. A study comparing Nam Pehchan and SANGRA against this gold standard is now complete. Appropriate databases were difficult to find, because each must contain a list of names together with a self-assigned ethnic tag (to act as the gold standard), and lists of names can be difficult to acquire because of confidentiality. A database collected by Coventry Primary Care Trust was used for the comparison, and other data sources are also being sought.
An abstract of the paper, 'Identifying Ethnicity', is available to download in Word format.
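For readers unfamiliar with the two measures quoted above, the following minimal sketch shows how sensitivity and PPV could be computed when a programme's output is compared against self-assigned ethnicity. The record layout (pairs of booleans) and the example counts are assumptions made for illustration; they are not data from the study.

```python
def sensitivity_and_ppv(records):
    """Compute (sensitivity, PPV) from (flagged_by_software, self_assigned) pairs.

    flagged_by_software: True if the programme classified the name as South Asian.
    self_assigned: True if the person self-identified as South Asian (gold standard).
    """
    true_pos = false_pos = false_neg = 0
    for flagged, self_assigned in records:
        if flagged and self_assigned:
            true_pos += 1
        elif flagged:
            false_pos += 1
        elif self_assigned:
            false_neg += 1
    # Sensitivity: share of self-identified South Asians that the software found.
    sensitivity = (
        true_pos / (true_pos + false_neg) if true_pos + false_neg else float("nan")
    )
    # PPV: share of software-flagged names that really are South Asian.
    ppv = true_pos / (true_pos + false_pos) if true_pos + false_pos else float("nan")
    return sensitivity, ppv


# Made-up example: 8 true positives, 2 false negatives,
# 2 false positives and 88 true negatives.
example = (
    [(True, True)] * 8
    + [(False, True)] * 2
    + [(True, False)] * 2
    + [(False, False)] * 88
)

if __name__ == "__main__":
    sens, ppv = sensitivity_and_ppv(example)
    print(f"sensitivity={sens:.3f}, PPV={ppv:.3f}")
```

Because PPV depends on how common South Asian names are in the population tested, the same programme can score differently in different regions.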
References
1. White A. Social Focus in Brief: Ethnicity 2002. Office for National Statistics. London. [Can be downloaded from an ONS page].
2. Chandola T. Ethnic and class differences in health in relation to British South Asians: using the new National Statistics Socio-Economic Classification. Social Science and Medicine 2001; 52: 1285-1296.
3. Cummins C, Winter H, Cheng KK, Maric R, Silcocks P, Varghese C. An assessment of the Nam Pehchan computer programme for the identification of names of south Asian ethnic origin. Journal of Public Health Medicine 1999; 21(4): 401-406.
4. Nanchahal K, Mangtani P, Alston M, Silva IDS. Development and validation of a computerized South Asian Names and Group Recognition Algorithm (SANGRA) for use in British health-related studies. Journal of Public Health Medicine 2001; 23(4): 278-285.


