Midterm update
Jul 20th, 2007 by sarp
As many of you know, we are working on a patient matching module for OpenMRS that will allow users to identify records that belong to the same patient across different data sources.

In the first part of SoC, I finished adding weight scaling to the existing record linkage framework. Matching records are identified by assigning a score to each possible record pair. Weight scaling improves the accuracy of patient matching because a field that agrees on a common value, for instance James for first name, is scaled down and contributes less to the overall score for that pair.

To introduce weight scaling, we first analyze the data sources that will be used in the linkage (each can be a database or a character-delimited file) for token frequencies. We store this data in a relational database and use it later when calculating scores for possible pairs. We can use different lookup tables for token frequencies: the top N most or least frequent tokens, the top N% most or least frequent tokens, and tokens with frequencies above or below N.

There are other possible improvements to scoring, so we are currently refactoring the framework to make it easier to adjust matching scores.
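
For readers who want a concrete picture of the weight scaling idea, here is a rough Python sketch. It is not the module's code: the function names, the in-memory counter (standing in for the relational database), and the scaling formula (multiply the agreement weight by one minus the token's relative frequency) are all assumptions made for illustration. It only shows the "top N most frequent tokens" lookup table.

```python
from collections import Counter


def token_frequencies(values):
    """Count how often each token (field value) occurs in a data source."""
    return Counter(v.strip().lower() for v in values if v and v.strip())


def top_n_most_frequent(freqs, n):
    """Build a lookup table holding only the N most frequent tokens."""
    return dict(freqs.most_common(n))


def scaled_weight(base_weight, token, lookup, total):
    """Scale an agreement weight down for tokens found in the lookup table.

    Assumed formula (hypothetical, for illustration only):
    weight * (1 - relative frequency). Tokens that are not in the
    lookup table keep their full weight.
    """
    count = lookup.get(token.strip().lower())
    if count is None:
        return base_weight
    return base_weight * (1.0 - count / total)


# Made-up first names standing in for one analyzed data source.
first_names = ["James", "James", "James", "Mary", "Sarp", "James", "Mary", "Ada"]
freqs = token_frequencies(first_names)
total = sum(freqs.values())
lookup = top_n_most_frequent(freqs, 2)

# A pair agreeing on a common name contributes less than one agreeing on a rare name.
common = scaled_weight(4.0, "James", lookup, total)
rare = scaled_weight(4.0, "Sarp", lookup, total)
```

With this toy data, a pair that agrees on James gets a smaller contribution to its score than a pair that agrees on Sarp, which is the effect described above. The other lookup tables (least frequent, top N%, above or below a threshold) would only change how `lookup` is built.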