Deep Exponential-Family Auto-Encoders


Bahareh Tolooshams, Andrew H. Song, Simona Temereanca, and Demba Ba. Submitted. “Deep Exponential-Family Auto-Encoders.” In Advances in Neural Information Processing Systems 32. arXiv Version


We consider the problem of learning recurring convolutional patterns from data that are not necessarily real valued, such as binary or count-valued data. We cast the problem as one of learning a convolutional dictionary, subject to sparsity constraints, given observations drawn from a distribution that belongs to the canonical exponential family. We propose two general approaches to solving it. The first approach uses the ℓ0 pseudo-norm to enforce sparsity and is reminiscent of the alternating-minimization algorithm for classical convolutional dictionary learning (CDL). The second approach uses the ℓ1 norm to enforce sparsity and extends to the exponential family the recently shown connection, for Gaussian observations, between CDL and a class of ReLU auto-encoders. Each approach can be interpreted as an auto-encoder whose weights are in one-to-one correspondence with the parameters of the convolutional dictionary. Our key insight is that, unless the observations are Gaussian, the input fed into the encoder ought to be modified iteratively, and in a specific manner, using the parameters of the dictionary. Once trained, the forward pass through the ℓ1 encoder computes sparse codes orders of magnitude more efficiently than the ℓ0 approach. We apply the two approaches to the unsupervised learning of the stimulus effect from neural spiking data acquired in the barrel cortex of mice in response to periodic whisker deflections. We demonstrate that both are superior to generalized linear models, which rely on hand-crafted features.
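To make the key insight concrete, the following is a minimal sketch of an unrolled ℓ1 encoder for count-valued data. It is not the authors' implementation. It assumes a single 1-D filter, Poisson observations with a log link, non-negative codes, and a proximal-gradient update; the names `H`, `Ht`, `encode`, `inv_link`, `step`, and `lam` are all hypothetical. At every iteration the encoder is fed the residual `y - inv_link(H(x, h))` rather than `y` itself. When `inv_link` is the identity (the Gaussian case), the update reduces to the familiar ReLU-style iteration for CDL.

```python
import numpy as np

def H(x, h):
    """Synthesis operator: full convolution of the sparse code x with filter h."""
    return np.convolve(x, h, mode="full")

def Ht(r, h):
    """Adjoint of H: 'valid' correlation of a signal-length vector r with h."""
    return np.correlate(r, h, mode="valid")

def encode(y, h, inv_link, n_iter=200, step=0.02, lam=0.1):
    """Unrolled proximal-gradient encoder (illustrative assumption, not the paper's code).

    y        : observations (e.g. spike counts)
    h        : dictionary filter, shared by the encoder and the decoder
    inv_link : inverse link of the exponential family
               (np.exp for Poisson, identity for Gaussian)
    """
    x = np.zeros(len(y) - len(h) + 1)
    for _ in range(n_iter):
        # Modified input: the observations minus the current mean estimate.
        residual = y - inv_link(H(x, h))
        # Gradient step followed by a ReLU with bias step * lam
        # (the proximal operator of the l1 penalty on non-negative codes).
        x = np.maximum(x + step * Ht(residual, h) - step * lam, 0.0)
    return x

# Synthetic example: three events convolved with a smooth bump, then Poisson noise.
rng = np.random.default_rng(0)
h = np.hanning(7)[1:-1]          # 5 strictly positive taps
h /= np.linalg.norm(h)
x_true = np.zeros(60)
x_true[[10, 35, 50]] = 2.0
y = rng.poisson(np.exp(H(x_true, h))).astype(float)

x_hat = encode(y, h, inv_link=np.exp)               # Poisson encoder
x_gauss = encode(y, h, inv_link=lambda z: z)        # Gaussian special case
```

In this sketch the filter is held fixed. In a dictionary-learning setting it would be updated as well, for instance by back-propagating a likelihood-based loss through the unrolled iterations; that training step is omitted here.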