De-Identify Sensitive Information

  • Thread starter Thread starter Twee
  • Start date Start date
T

Twee

Hi,

I have a table with a field of medical record numbers (MRN) that I need to
de-identify. There over 10,000 records and I'm so basic that I've brought
them into EXCEL and fill down with 1001, 1002, 1003, 1004..... This method,
however, does not account for different records with the same ID's. So
perhaps there is a query that I can write that can generate a random
surrogate that only I can trace back t othe orignal ID.

Any Ideas? I hope this question is not too vague. I could post an
'detailed' example, but I am only talking about an ordinary list of numbers
here with ocassional repeats, e.g. V459934, 323236, 345103, 088031,
V459934, etc....Please help!!!
 
Perhaps build a mapping table along the lines of:

1. Existing table, query on MRN, use query total (group by).

This produces a list of unique MRN's

2. Use query to make table, for example tblMRNMap

3. In tblMRNMap, add field: KeyID as autonumber, primary key, and save.

Put an index, unique, on MRN in tblMRNMap.

You now have a table where each MRN has been assinged a unique number (KeyID)

4. In original table, add a field, KeyID, long int, indexed non-unique.

5. Starting from tblMRNMap, build query. Add original table to query and
join on MRN from tblMRNMap to original table. Original table should have
non-unique index on MRN.

6. Update KeyID in original table with [tblMRNMap].[KeyID]

<>

Test your results with new query where tblMRNMap is joined to original table
on KeyID. Show MRN from both tables in query. You should get the same MRN
from each table, on the same row, for the same KeyID.
 
Back
Top