A GAN-based data anonymization system that preserves data utility while removing sensitive information. It won 2nd place in the AI domain at a hackathon.
The project uses Generative Adversarial Networks (GANs) to anonymize sensitive data while maintaining its statistical properties and utility. A custom generator model creates synthetic data that preserves the performance characteristics of the original dataset while removing all personally identifiable information.
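The approach described above can be illustrated with a minimal PyTorch sketch. This is not the project's actual code: the layer sizes, the `utility_loss` moment-matching term, the `utility_weight` parameter, and the `train_step` helper are all assumptions chosen for illustration. It shows one adversarial training step in which the generator loss combines the usual GAN objective with a penalty that keeps per-feature means and standard deviations of the synthetic batch close to those of the real batch.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
NOISE_DIM, DATA_DIM, HIDDEN = 16, 8, 64


class Generator(nn.Module):
    """Maps random noise to a synthetic record with DATA_DIM features."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM, HIDDEN), nn.ReLU(),
            nn.Linear(HIDDEN, DATA_DIM),
        )

    def forward(self, z):
        return self.net(z)


class Discriminator(nn.Module):
    """Scores a record; the output is a raw logit (real vs. synthetic)."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(DATA_DIM, HIDDEN), nn.LeakyReLU(0.2),
            nn.Linear(HIDDEN, 1),
        )

    def forward(self, x):
        return self.net(x)


def utility_loss(real, fake):
    """Assumed utility term: squared gap in per-feature mean and std."""
    mean_gap = (real.mean(dim=0) - fake.mean(dim=0)).pow(2).mean()
    std_gap = (real.std(dim=0) - fake.std(dim=0)).pow(2).mean()
    return mean_gap + std_gap


def train_step(gen, disc, opt_g, opt_d, real, utility_weight=1.0):
    """Run one discriminator update followed by one generator update."""
    bce = nn.BCEWithLogitsLoss()
    batch = real.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Discriminator: push real records toward 1 and synthetic ones toward 0.
    fake = gen(torch.randn(batch, NOISE_DIM)).detach()
    d_loss = bce(disc(real), ones) + bce(disc(fake), zeros)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: fool the discriminator while matching batch statistics.
    fake = gen(torch.randn(batch, NOISE_DIM))
    g_loss = bce(disc(fake), ones) + utility_weight * utility_loss(real, fake)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()


# Usage with random data standing in for a batch of preprocessed records.
torch.manual_seed(0)
gen, disc = Generator(), Discriminator()
opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4, betas=(0.5, 0.999))
real_batch = torch.randn(32, DATA_DIM)
d_loss, g_loss = train_step(gen, disc, opt_g, opt_d, real_batch)
synthetic = gen(torch.randn(5, NOISE_DIM))  # five synthetic records
```

Raising `utility_weight` in this sketch pulls the synthetic statistics closer to the real ones, which is one simple way to expose the privacy versus utility trade-off as a tunable parameter.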
Key Features
GAN-based data generation for privacy preservation
Custom loss functions for utility preservation
Real-time data anonymization pipeline
Evaluation metrics for data utility assessment
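As a rough illustration of what utility assessment could look like, the sketch below compares basic statistics of a real and a synthetic table using only the Python standard library. The function names (`column_stats`, `pearson`, `correlation_matrix`, `utility_report`) and the choice of metrics are assumptions made for this example, not the project's actual evaluation suite. Smaller gaps mean the synthetic data tracks the original more closely; the report says nothing about privacy on its own.

```python
from statistics import mean, pstdev


def column_stats(rows):
    """Return a (mean, std) pair per numeric column of equal-length rows."""
    columns = list(zip(*rows))
    return [(mean(col), pstdev(col)) for col in columns]


def pearson(xs, ys):
    """Pearson correlation of two sequences; 0.0 if either is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def correlation_matrix(rows):
    """Pairwise Pearson correlations between all columns."""
    columns = list(zip(*rows))
    return [[pearson(a, b) for b in columns] for a in columns]


def utility_report(real_rows, synthetic_rows):
    """Largest absolute gap in column means, stds, and correlations."""
    real_stats = column_stats(real_rows)
    synth_stats = column_stats(synthetic_rows)
    real_corr = correlation_matrix(real_rows)
    synth_corr = correlation_matrix(synthetic_rows)
    return {
        "max_mean_gap": max(
            abs(r[0] - s[0]) for r, s in zip(real_stats, synth_stats)
        ),
        "max_std_gap": max(
            abs(r[1] - s[1]) for r, s in zip(real_stats, synth_stats)
        ),
        "max_corr_gap": max(
            abs(r - s)
            for real_row, synth_row in zip(real_corr, synth_corr)
            for r, s in zip(real_row, synth_row)
        ),
    }


# Usage with tiny hand-made tables (two numeric columns each).
real = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
synthetic = [[2.0, 2.0], [3.0, 4.0], [4.0, 6.0]]
report = utility_report(real, synthetic)
print(report)
```

In practice a check like this would usually be paired with a task-based comparison, such as training the same model on real and synthetic data and comparing its scores, since matching summary statistics does not guarantee matching downstream performance.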
Technology Stack
AI/ML
PyTorch, GAN, Custom Neural Networks
Tools
Docker, Git, Jupyter Notebooks
Challenges
Balancing privacy and data utility
Optimizing GAN training for stable results
Handling different types of sensitive data
Key Learnings
Advanced GAN architectures and training techniques