Paper Review: Vector-Quantized Variational Autoencoder (VQ-VAE)

Ryan S
3 min read · Apr 9, 2022

In this quick review, we’ll look at VQ-VAE, a novel autoencoder model that improves the reconstruction and compression quality of samples by using a discrete (vector-quantized) latent space together with a learned prior.

Autoencoders work by finding an information-maximizing compressed representation in a space of lower dimension than their original input. Photo by JJ Ying on Unsplash.

Overview

Tags: Compression, Deep Learning, Computer Vision, Data Augmentation

Year Published: 2017

Research Gap(s) Filled: Improves the generative and compressive capabilities of VAE models through (i) a discrete, vector-quantized latent space and (ii) a learned prior over the latent codes.

Links:

  1. arXiv
  2. GitHub (original)
  3. Tutorial (Keras/TensorFlow 2)
  4. A great blog post explaining VQ-VAE

Abridged Summary

The Vector-Quantized Variational Autoencoder (VQ-VAE) is an unsupervised machine learning model that builds upon Variational Autoencoders (VAEs) through the use of a vector-quantized, discrete latent space [1, 2]. This latent representation yields higher-quality input reconstructions than standard VAE models, particularly in the computer vision domain.
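To make the quantization step concrete, here is a minimal PyTorch sketch of the vector-quantization bottleneck. This is not the authors’ original code (their implementation is in TensorFlow); names such as VectorQuantizer, num_embeddings, and beta are illustrative assumptions. The sketch shows the nearest-neighbour codebook lookup, the codebook and commitment losses, and the straight-through gradient trick described in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Minimal sketch of the VQ bottleneck: map each encoder output vector
    to its nearest codebook entry and copy gradients straight through."""

    def __init__(self, num_embeddings=512, embedding_dim=64, beta=0.25):
        super().__init__()
        self.beta = beta  # weight on the commitment loss (hyperparameter)
        self.codebook = nn.Embedding(num_embeddings, embedding_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_embeddings, 1.0 / num_embeddings)

    def forward(self, z_e):
        # z_e: encoder outputs with trailing dimension embedding_dim, flattened to vectors
        flat = z_e.reshape(-1, z_e.shape[-1])

        # Euclidean distance to every codebook vector, then nearest-neighbour lookup
        dists = torch.cdist(flat, self.codebook.weight)
        indices = dists.argmin(dim=1)
        z_q = self.codebook(indices).reshape(z_e.shape)

        # Codebook loss pulls embeddings toward encoder outputs;
        # commitment loss keeps the encoder close to its chosen codes
        vq_loss = F.mse_loss(z_q, z_e.detach()) + self.beta * F.mse_loss(z_e, z_q.detach())

        # Straight-through estimator: gradients bypass the non-differentiable argmin
        z_q = z_e + (z_q - z_e).detach()
        return z_q, vq_loss, indices
```

In the full model, z_e comes from a convolutional encoder, the quantized z_q is passed to the decoder for reconstruction, and a separate autoregressive prior (a PixelCNN in the paper) is then trained over the discrete code indices to enable sampling.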


Ryan S

Image Scientist, MIT CSAIL Alum, Tutor, Dark Roast Coffee Fan, GitHub: https://github.com/rmsander/