Our learning algorithm performs stochastic gradient ascent on a variational lower bound of the likelihood. Instead of introducing variational parameters for each data point, we compile the inference procedure, learning it jointly with the generative model. This idea was originally used in the wake-sleep algorithm for unsupervised learning (Hinton et al. 1995), and has since led to state-of-the-art results for unsupervised learning of deep generative models (Kingma and Welling 2014; Mnih and Gregor 2014; Rezende, Mohamed, and Wierstra 2014). Specifically, we introduce a new family of structured inference networks, parameterized by recurrent neural networks.
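To make the amortization concrete, the following is a minimal sketch of this setup, assuming a toy Gaussian latent-variable model and a simple MLP inference network standing in for the RNN-parameterized structured inference networks introduced here; all dimensions, network shapes, and data below are illustrative placeholders rather than the paper's architecture. A single shared inference network replaces per-data-point variational parameters, and stochastic gradient ascent on the variational lower bound updates the inference and generative parameters together.

```python
import torch
import torch.nn as nn

# Toy dimensions (assumptions for illustration only).
x_dim, z_dim, h_dim = 20, 2, 64

# Shared (amortized) inference network q(z | x) and generative network p(x | z).
inference_net = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh(),
                              nn.Linear(h_dim, 2 * z_dim))   # outputs mu, log sigma
generative_net = nn.Sequential(nn.Linear(z_dim, h_dim), nn.Tanh(),
                               nn.Linear(h_dim, x_dim))      # mean of p(x | z)

params = list(inference_net.parameters()) + list(generative_net.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

def elbo(x):
    # q(z | x): one forward pass yields variational parameters for every data
    # point, instead of optimizing separate parameters per data point.
    mu, log_sigma = inference_net(x).chunk(2, dim=-1)
    z = mu + log_sigma.exp() * torch.randn_like(mu)           # reparameterized sample
    # log p(x | z): unit-variance Gaussian likelihood (an assumption here).
    log_px_z = -0.5 * ((x - generative_net(z)) ** 2).sum(-1)
    # Analytic KL( q(z | x) || N(0, I) ) for a diagonal Gaussian q.
    kl = 0.5 * (mu ** 2 + (2 * log_sigma).exp() - 2 * log_sigma - 1).sum(-1)
    return (log_px_z - kl).mean()

for step in range(1000):
    x = torch.randn(128, x_dim)       # placeholder data batch
    loss = -elbo(x)                   # ascent on the lower bound = descent on its negation
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the variational parameters are produced by one network shared across the data set, inference at test time is a single forward pass, and the cost of inference does not grow with the number of training points.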