factorised-image-generation

A collection of GANs demonstrating the efficiency of factorised transposed convolution (deconvolution) layers - performance is comparable to conventional model architectures, but with ~15-40% fewer parameters.

Context

The Inception architecture [1] and variants such as Xception [2] have been used in image classification tasks with great success. Successive versions Inception v2 and v3 improved on the naive implementation by factorising larger convolutions such as 5x5 into two stacked 3x3 convolutions, and by factorising operations of the form NxN into a 1xN followed by an Nx1 convolution, reducing the number of parameters needed whilst maintaining a similarly high ability to recognise patterns in input images.
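As an illustration of the idea (not code from this repository, and assuming a Keras-style API with hypothetical channel sizes), the two factorisations look roughly like this:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Naive 5x5 convolution: 5*5*C_in*C_out kernel weights.
naive_5x5 = layers.Conv2D(64, (5, 5), padding="same")

# Factorised equivalent: two stacked 3x3 convolutions cover the same 5x5
# receptive field with 2*3*3*C_in*C_out weights (assuming equal channel counts).
factorised_5x5 = tf.keras.Sequential([
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    layers.Conv2D(64, (3, 3), padding="same"),
])

# Spatial factorisation: an NxN convolution split into 1xN followed by Nx1,
# replacing the N*N weight term with 2*N (here N = 7).
factorised_7x7 = tf.keras.Sequential([
    layers.Conv2D(64, (1, 7), padding="same", activation="relu"),
    layers.Conv2D(64, (7, 1), padding="same"),
])
```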

In this repository, this concept of factorising convolutional layers is applied in reverse to transposed convolution layers; a copy of the model without factorised layers is also included in each folder so the two can be compared easily. The factorised models contain around 15-40% fewer parameters, but perform approximately as well (judged subjectively) as their naive counterparts.
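The exact layer shapes vary between the models in this repository, but a minimal sketch of the idea (assuming Keras, with hypothetical feature-map sizes) is:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical upsampling step: 7x7x128 feature map -> 14x14x64.

# Naive block: a single 5x5 transposed convolution.
naive_block = tf.keras.Sequential([
    tf.keras.Input(shape=(7, 7, 128)),
    layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding="same", use_bias=False),
])

# Factorised block: the 5x5 kernel is split into a 1x5 and a 5x1 transposed
# convolution, each handling the stride-2 upsampling along one axis.
factorised_block = tf.keras.Sequential([
    tf.keras.Input(shape=(7, 7, 128)),
    layers.Conv2DTranspose(64, (1, 5), strides=(1, 2), padding="same", use_bias=False),
    layers.Conv2DTranspose(64, (5, 1), strides=(2, 1), padding="same", use_bias=False),
])

print(naive_block.count_params())       # 5*5*128*64 = 204,800
print(factorised_block.count_params())  # 1*5*128*64 + 5*1*64*64 = 61,440
```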

Examples

CIFAR-10 (Frog)

Factorised model | Naive model


MNIST

Factorised model | Naive model

EMNIST Letters

Factorised model | Naive model

Improved Parameter Efficiency

The table below lists the reduction in parameters gained by factorising the transposed convolution layers in the generator component of the GAN for each dataset. Please note that the counts shown include all parameters, trainable and non-trainable, and cover only the generator of each GAN, not the discriminator - the convolutional discriminator layers have not been factorised, so their parameter counts are identical in the naive and factorised implementations.

| Dataset | Naive generator model parameters | Factorised generator model parameters | Parameter reduction |
| --- | --- | --- | --- |
| MNIST | 2,330,944 | 1,558,293 | 772,651 (33.15%) |
| CIFAR-10 | 6,264,579 | 3,971,587 | 2,292,992 (36.60%) |
| EMNIST Letters | 4,287,808 | 3,570,245 | 717,563 (16.73%) |
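Assuming the generators are Keras models like those in the guides listed in the Code section below, a comparison of this kind can be reproduced with something along these lines (`report_reduction` is a hypothetical helper, not part of the repository):

```python
def report_reduction(naive_generator, factorised_generator):
    # count_params() includes both trainable and non-trainable parameters,
    # matching the counts quoted in the table above.
    naive = naive_generator.count_params()
    factorised = factorised_generator.count_params()
    saved = naive - factorised
    print(f"Naive: {naive:,}  Factorised: {factorised:,}  "
          f"Reduction: {saved:,} ({saved / naive:.2%})")
```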

Code

The MNIST model code is from the TensorFlow Deep Convolutional Generative Adversarial Network guide.

The CIFAR-10 model code is from Deep Learning with Python, by François Chollet.

The other model code is also generally based on the above resources, with some modifications for data loading and pre-processing.

Visible flickering was initially present in the GIFs for the MNIST and EMNIST Letters datasets, indicating a certain amount of model instability - this was rectified by reducing the momentum term β1 from 0.9 to 0.5, as per the work of Radford et al. [3], and increasing the learning rate to 0.0002 to compensate.
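In TensorFlow/Keras terms (the framework the code above is based on), that change amounts to something like the following; whether the discriminator shares the same optimiser settings is not stated, so it is shown here only as an assumption:

```python
import tensorflow as tf

# Adam with beta_1 lowered from the default 0.9 to 0.5 (Radford et al. [3])
# and the learning rate raised to 2e-4 to compensate.
generator_optimizer = tf.keras.optimizers.Adam(learning_rate=2e-4, beta_1=0.5)
# Assumption: the discriminator uses the same settings.
discriminator_optimizer = tf.keras.optimizers.Adam(learning_rate=2e-4, beta_1=0.5)
```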

Datasets

The datasets used in this repository are:

  • MNIST database of handwritten digits
  • CIFAR-10
  • EMNIST Letters

Further Steps / Contributing

The bilinear additive upsampling method described by Wojna et al. [4] and the idea of layer branching from the Inception module architecture could be used to produce a novel GAN architecture, tentatively named InceptionGAN.
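As a rough, untested sketch of what that parameter-free upsampling step could look like (not part of this repository; TensorFlow assumed):

```python
import tensorflow as tf

def bilinear_additive_upsampling(x, channel_reduction=4):
    # Bilinear additive upsampling as described by Wojna et al. [4]:
    # upsample bilinearly (no learned weights), then sum every
    # `channel_reduction` consecutive channels to reduce depth.
    h, w, c = x.shape[1], x.shape[2], x.shape[3]
    upsampled = tf.image.resize(x, (h * 2, w * 2), method="bilinear")
    grouped = tf.reshape(
        upsampled, (-1, h * 2, w * 2, c // channel_reduction, channel_reduction)
    )
    return tf.reduce_sum(grouped, axis=-1)
```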

Contributions to this repository would ideally consist of testing 'factorised' models and their naive equivalents on other datasets, preferably using more objective metrics such as the Inception Score (IS) to make a more accurate comparison.

Citations

[1] Szegedy, C. et al., 2014. Going deeper with convolutions. arXiv.

[2] Chollet, F., 2017. Xception: Deep Learning with Depthwise Separable Convolutions. arXiv.

[3] Radford, A., Metz, L. & Chintala, S., 2016. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv.

[4] Wojna, Z. et al., 2019. The Devil is in the Decoder: Classification, Regression and GANs. arXiv.
