Once users have shared their data online, it is generally difficult for them to revoke access and ask for the data to be deleted. Machine learning (ML) exacerbates this problem because any model trained with said data may have memorized it, putting users at risk of a successful privacy attack exposing their information. Yet, having models unlearn is notoriously difficult. After a data point is removed from a training set, one often resorts to entirely retraining downstream models from scratch. We introduce SISA training, a framework that decreases the number of model parameters affected by an unlearning request and caches intermediate outputs of the training algorithm to limit the number of model updates that need to be computed to have these parameters unlearn. This framework reduces the computational overhead associated with unlearning, even in the worst-case setting where unlearning requests are made uniformly across the training set. In some cases, we may have a prior on the distribution of unlearning requests that will be issued by users. We may take this prior into account to partition and order data accordingly and further decrease the overhead from unlearning. We evaluate our approach on two datasets from different application domains, with corresponding motivations for unlearning. Under no distributional assumptions, we observe that SISA training improves unlearning for the Purchase dataset by 3.13x, and 1.658x for the SVHN dataset, over retraining from scratch. Knowledge of the unlearning distribution provides further improvements in retraining time, which we demonstrate by simulating a scenario where we model unlearning requests that come from users of a commercial product that is available in countries with varying privacy regulations. Our work contributes to practical data governance in machine learning.
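For contrast, a minimal sketch of this naive baseline, assuming a hypothetical `train_model` callable that stands in for the full training pipeline:

```python
# Naive unlearning: delete the point, then retrain the downstream model from scratch.
# `train_model` and `dataset` are hypothetical placeholders, not from the paper.
def naive_unlearn(dataset, point_to_forget, train_model):
    retained = [x for x in dataset if x is not point_to_forget]
    return train_model(retained)  # cost scales with the size of the whole dataset
```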
Our SISA training approach, short for Sharded, Isolated, Sliced, and Aggregated training, can be implemented with minimal modification to existing pipelines. First, we divide the training data into multiple shards such that a training point is included in a small number of shards only. Then, we train models in isolation on each of these shards, which limits the influence of a point to the models that were trained on the shard(s) containing the point. Finally, when a request to unlearn a training point arrives, we need to retrain only the affected models. Because the shards are smaller than the entire training set, this decreases the retraining time needed to achieve unlearning. In addition, rather than training each model on the entire shard directly, we can divide each shard into slices and present slices incrementally during training. We save the state of the model parameters before introducing each new slice, allowing us to start retraining the model from the last known parameter state that does not include the point to be unlearned, rather than from a random initialization. Slicing further contributes to decreasing the time to unlearn, at the expense of additional storage. At inference, we simply aggregate the predictions of the models trained on each shard.
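A minimal Python sketch of these Sharded, Isolated, Sliced, and Aggregated steps, assuming hypothetical user-supplied callables `train_model(data, init_params)` (which continues training from the given parameters and returns new parameters) and `predict(params, x)` (which classifies a single point):

```python
# Illustrative sketch of SISA-style training, not the authors' implementation.
from collections import Counter
import random

class SISA:
    def __init__(self, dataset, n_shards, n_slices, train_model, predict):
        self.train_model, self.predict = train_model, predict
        data = list(dataset)
        random.shuffle(data)
        # Sharding: each training point lands in exactly one shard.
        shards = [data[i::n_shards] for i in range(n_shards)]
        # Slicing: each shard is split into slices presented incrementally.
        self.slices = [[shard[j::n_slices] for j in range(n_slices)] for shard in shards]
        self.checkpoints = [[] for _ in range(n_shards)]  # parameters saved before each slice
        # Isolation: one constituent model per shard, trained independently.
        self.models = [self._train_shard(s, start_slice=0, init=None) for s in range(n_shards)]

    def _train_shard(self, s, start_slice, init):
        params = init
        seen = [pt for sl in range(start_slice) for pt in self.slices[s][sl]]  # data already learned
        del self.checkpoints[s][start_slice:]            # drop checkpoints that are now stale
        for sl in range(start_slice, len(self.slices[s])):
            self.checkpoints[s].append(params)           # state *before* this slice is introduced
            seen = seen + self.slices[s][sl]
            params = self.train_model(seen, params)      # keep training on all data presented so far
        return params

    def unlearn(self, point):
        # Only the shard (and slice) containing the point is affected.
        for s, shard_slices in enumerate(self.slices):
            for sl, slice_data in enumerate(shard_slices):
                if point in slice_data:
                    slice_data.remove(point)
                    # Resume from the last checkpoint that never saw the point,
                    # instead of retraining the constituent model from scratch.
                    self.models[s] = self._train_shard(s, sl, self.checkpoints[s][sl])
                    return

    def aggregate_predict(self, x):
        # Aggregation: majority vote over the per-shard models at inference time.
        votes = Counter(self.predict(m, x) for m in self.models)
        return votes.most_common(1)[0][0]
```

The important property in this sketch is that `unlearn` retrains only the one constituent model whose shard contained the point, restarting from the last checkpoint saved before the slice containing that point.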
To demonstrate that SISA training handles streams of unlearning requests effectively, we analytically show that it outperforms the naive approach of retraining from scratch, whether the service provider chooses to process unlearning requests sequentially (i.e., immediately upon a user revoking access to their data) or in batches (i.e., the service provider buffers a few unlearning requests before processing them together). Experimentally, we find that sharding the data into 20 shards provides a speed-up of 3.13x and 1.658x for the Purchase and SVHN datasets respectively, when the number of unlearning requests is 0.003% of the respective dataset sizes. This assumes that requests are made uniformly across the dataset. These gains can be achieved at no cost in terms of classification accuracy.
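As a rough illustration of where such speed-ups come from, the following is a deliberately simplified cost model (a toy calculation, not the analysis in the paper; the dataset size and request count below are hypothetical):

```python
# Toy cost model: count training points that must be re-processed to honour
# K uniform unlearning requests, retraining from scratch vs. sharding into S shards.

def retrain_cost_baseline(n_points, k_requests):
    # Each request retrains from scratch on the remaining dataset.
    return sum(n_points - i - 1 for i in range(k_requests))

def retrain_cost_sharded(n_points, k_requests, n_shards):
    # Each request retrains only the affected shard (roughly n_points / n_shards points).
    return k_requests * (n_points // n_shards)

if __name__ == "__main__":
    n = 250_000                     # hypothetical dataset size, for illustration only
    k = max(1, int(0.00003 * n))    # 0.003% of the dataset issued as unlearning requests
    base = retrain_cost_baseline(n, k)
    shard = retrain_cost_sharded(n, k, n_shards=20)
    print(f"requests={k}, toy speed-up ~ {base / shard:.2f}x")
```

Because it ignores slicing, batching, and the accuracy considerations that limit how many shards are practical, this toy model overestimates the gain compared to the measured 3.13x and 1.658x figures; it is only meant to show why retraining one small shard per request is cheaper than retraining on the full dataset.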