Confidentiality issues for medical data miners

Authors:

Abstract

The first task in any medical data mining effort is ensuring patient confidentiality. In the past, most data mining efforts ensured confidentiality by the dubious policy of withholding their raw data from colleagues and the public. A cursory review of the medical informatics literature of the past decade reveals that much of what we have “learned” consists of assertions derived from confidential datasets unavailable for anyone’s review. Without access to the original data, it is impossible to validate or improve upon a researcher’s conclusions; we are asked to accept findings as an act of faith rather than as a scientific conclusion.

This special issue of Artificial Intelligence in Medicine is devoted to medical data mining. The medical data miner has an obligation to conduct valid research in a way that protects human subjects. Today, data miners have the technical tools to merge large data collections and to distribute queries over disparate databases. To include patient-related data in shared databases, data miners will need methods to anonymize and deidentify data. This article reviews the human subject risks associated with medical data mining and describes some of the innovative computational remedies that will permit researchers to conduct research and share their data without risk to patient or institution.

Keywords: Data mining, Confidentiality, Security, Encryption, HIPAA, IRB

Article history: Received 5 March 2002, Accepted 11 March 2002, Available online 8 August 2002.

Official URL: https://doi.org/10.1016/S0933-3657(02)00050-7