Ho Bae1, Dahuin Jung2, Hyun-Soo Choi2, Sungroh Yoon1,2,3,4,5,*
1Interdisciplinary Program in Bioinformatics, Seoul National University
2Electrical and Computer Engineering, Seoul National University
3Biological Sciences, Seoul National University
4ASRI and INMC, Seoul National University
5Institute of Engineering Research, Seoul National University
*Corresponding author
Email: sryoon@snu.ac.kr
Pacific Symposium on Biocomputing 25:563-574(2020)
© 2020 World Scientific
Open Access chapter published by World Scientific Publishing Company and distributed under the terms of the Creative Commons Attribution (CC BY) 4.0 License.
Personal medical data typically contain sensitive information about individuals, so storing or sharing such data is often risky. For example, a short DNA sequence can provide information that identifies not only an individual but also his or her relatives. Nonetheless, most countries and researchers agree on the necessity of collecting personal medical data, because medical data, including genomic data, are an indispensable resource for further research and development in disease prevention and treatment. To prevent personal medical data from being misused, techniques that reliably protect sensitive information should be developed for real-world applications. In this paper, we propose a framework called anonymized generative adversarial networks (AnomiGAN) to preserve the privacy of personal medical data while maintaining high prediction performance. We compared our method to state-of-the-art techniques and observed that it preserves the same level of privacy as differential privacy (DP) while providing better prediction results. We also observed a trade-off between privacy and prediction results that depends on the degree of preservation of the original data. Here, we provide a mathematical overview of our proposed model and validate it using UCI machine learning repository datasets to highlight its utility in practice. The code is available at https://github.com/hobae/AnomiGAN/