Happy to announce that our paper, “CryptoNAS: Private Inference on a ReLU Budget”, has been accepted to NeurIPS 2020! This is collaborative work with Prof. Brandon Reagen.
In this paper, we look at private inference for deep neural networks and argue that existing models are not well suited to this task. In private inference, non-linear operations dominate latency, while linear layers become effectively free. Based on this insight, we introduce the idea of a ReLU budget as a proxy for inference latency, and develop CryptoNAS to build models that maximize accuracy within a given budget.
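
To make the ReLU budget idea concrete, here is a minimal sketch of how one might count the ReLUs in a network and check the total against a budget. It is only an illustration, not the CryptoNAS search procedure: the `ConvLayer` class, the helper names, and the toy layer sizes are all hypothetical, and it assumes one ReLU per output activation of each layer that is followed by a ReLU.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConvLayer:
    """Hypothetical layer description: output shape plus whether a ReLU follows."""
    out_channels: int
    out_height: int
    out_width: int
    has_relu: bool = True


def relu_count(layers: List[ConvLayer]) -> int:
    """Total ReLU evaluations: one per output activation of every ReLU layer."""
    return sum(
        layer.out_channels * layer.out_height * layer.out_width
        for layer in layers
        if layer.has_relu
    )


def within_budget(layers: List[ConvLayer], budget: int) -> bool:
    """True if the network's ReLU count does not exceed the budget."""
    return relu_count(layers) <= budget


# Toy network (made-up sizes, for illustration only).
toy_net = [
    ConvLayer(16, 32, 32),                # 16 * 32 * 32 ReLUs
    ConvLayer(32, 16, 16),                # 32 * 16 * 16 ReLUs
    ConvLayer(64, 8, 8, has_relu=False),  # linear only, contributes no ReLUs
]

print(relu_count(toy_net))
print(within_budget(toy_net, budget=30_000))
```

Under this proxy, the third layer adds nothing to the count even though it has the most channels, which reflects the observation above that linear layers are effectively free in private inference.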