Annals of Mathematical Sciences and Applications
Volume 3 (2018)
Number 1
Special issue in honor of Professor David Mumford, dedicated to the memory of Jennifer Mumford
Guest Editors: Stuart Geman, David Gu, Stanley Osher, Chi-Wang Shu, Yang Wang, and Shing-Tung Yau
Sparse and deep generalizations of the FRAME model
Pages: 211 – 254
DOI: https://dx.doi.org/10.4310/AMSA.2018.v3.n1.a7
Authors
Abstract
In the pattern-theoretic framework developed by Grenander and advocated by Mumford for computer vision and pattern recognition, different patterns are represented by statistical generative models. The FRAME (Filters, Random fields, And Maximum Entropy) model is such a generative model for texture patterns. It is a Markov random field model (or a Gibbs distribution, or an energy-based model) of stationary spatial processes. The log probability density function of the model (or the energy function of the Gibbs distribution) is the sum of translation-invariant potential functions that are one-dimensional non-linear transformations of linear filter responses. In this paper, we review two generalizations of this model. One is a sparse FRAME model for non-stationary patterns such as objects, in which the potential functions are location-specific and are non-zero only at a selected collection of locations. The other generalization is a deep FRAME model in which the filters are defined by a convolutional neural network (CNN or ConvNet). This leads to a deep convolutional energy-based model. The local modes of the energy function satisfy an auto-encoder, which we call the Hopfield auto-encoder. The model can be learned by an “analysis by synthesis” algorithm that iterates a sampling step for synthesis and a learning step for analysis. The algorithm admits an adversarial interpretation in which the learning step and the sampling step play a minimax game based on a value function. We can recruit a generator model as a direct and approximate sampler of the deep energy-based model to speed up the sampling step, and the two models can be learned simultaneously by a cooperative learning algorithm.
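For reference, the FRAME density described verbally above is conventionally written in the following form, where $F_k$ denotes the $k$-th linear filter, $[F_k * I](x)$ its response at location $x$ of the image domain $D$, $\lambda_k$ the corresponding one-dimensional potential function, and $Z(\lambda)$ the normalizing constant:

$$ p(I; \lambda) = \frac{1}{Z(\lambda)} \exp\left( \sum_{k=1}^{K} \sum_{x \in D} \lambda_k\big([F_k * I](x)\big) \right). $$

The “analysis by synthesis” loop mentioned in the abstract can be illustrated, in highly simplified form, by the sketch below. It uses a toy quadratic energy on 2D points rather than a filter-based or CNN-based energy on images, and the step sizes, initial values, and variable names are illustrative assumptions, not the paper's implementation; the point is only the alternation between a Langevin sampling step (synthesis) and a parameter update that matches model statistics to data statistics (analysis).

```python
import numpy as np

# Toy sketch of "analysis by synthesis" for an energy-based model
# p(x; theta) proportional to exp(-U(x; theta)).
# Here U(x; theta) = 0.5 * theta * ||x||^2, so the model is a Gaussian
# with variance 1/theta per dimension; all constants are illustrative.

rng = np.random.default_rng(0)

def grad_U_x(x, theta):
    # Gradient of the energy with respect to x, used by Langevin sampling.
    return theta * x

def suff_stat(x):
    # Derivative of the energy with respect to theta, averaged over examples.
    return 0.5 * np.mean(np.sum(x ** 2, axis=-1))

data = rng.normal(size=(512, 2))      # observed examples (unit variance)
synth = rng.normal(size=(512, 2))     # persistent synthesized examples
theta = 0.2                           # initial model parameter
step, lr = 0.01, 0.05                 # Langevin step size and learning rate

for _ in range(300):
    # Sampling step (synthesis): a few Langevin updates on the synthesized examples.
    for _ in range(10):
        noise = rng.normal(size=synth.shape)
        synth = synth - step * grad_U_x(synth, theta) + np.sqrt(2.0 * step) * noise
    # Learning step (analysis): follow the log-likelihood gradient, i.e. the
    # difference between synthesized-example statistics and data statistics.
    theta += lr * (suff_stat(synth) - suff_stat(data))

print("learned theta (should be near 1):", theta)
```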
Keywords
adversarial interpretation, convolutional neural network, cooperative learning, energy-based model, generator model, Hopfield auto-encoder, sparse coding
This work was supported by NSF DMS 1310391, DARPA SIMPLEX N66001-15-C-4035, ONR MURI N00014-16-1-2007, and DARPA ARO W911NF-16-1-0579.
Received 26 June 2017
Published 27 March 2018