chapter4 DCGAN used tf.keras but could not produce same results #20

Open
Nevermetyou65 opened this issue Oct 22, 2021 · 4 comments

@Nevermetyou65

Hi, I am reading chapter 4 of this book and there seems to be a problem.
The code in the book is written with Keras, but I prefer to use tf.keras, which should not behave differently.
I implemented the chapter 4 code using tf.keras and got strange results: the discriminator and generator losses approached 0, the accuracy approached 1, and the image grid showed only noise. But when I removed the BatchNormalization layers from both the generator and the discriminator, I got fine fake digit images. Any idea why?

This is the Colab notebook containing the code:
https://colab.research.google.com/drive/1TF-nkPPkj0HAzKceb3UL_AzSdb-0DjKD?usp=sharing
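
The thread does not establish a cause. One documented difference that may be relevant: in TensorFlow 2.x, a `BatchNormalization` layer whose `trainable` attribute is `False` runs in inference mode and normalizes with its moving statistics rather than the statistics of the current batch. If the discriminator is frozen inside the combined GAN model, as is common in Keras GAN code, it can therefore see differently scaled activations than it does when trained on its own. The plain-Python sketch below only illustrates the two normalization modes; `batch_norm` and the sample values are hypothetical and are not taken from the book or from Keras.

```python
from statistics import fmean, pvariance


def batch_norm(values, mean, var, eps=1e-3):
    """Normalize values with the given mean and variance (gamma=1, beta=0)."""
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


# A hypothetical batch of pre-activations whose scale is far from the
# initial moving statistics (mean 0, variance 1).
batch = [10.0, 12.0, 14.0, 16.0]

# Training mode: normalize with the statistics of the current batch.
train_out = batch_norm(batch, fmean(batch), pvariance(batch))

# Inference mode: normalize with moving statistics. Here they are still at
# their initial values, as they could be early in training.
infer_out = batch_norm(batch, mean=0.0, var=1.0)

print([round(v, 2) for v in train_out])  # centered around 0
print([round(v, 2) for v in infer_out])  # still close to the raw values
```

The same inputs produce very different outputs in the two modes, which is why a mismatch between them can destabilize training.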

@mjzalewski

I also had the same problem. When I removed all the BatchNormalization layers from both the discriminator and generator, the problem was even worse (The loss for the generator was high, and the images were just blobs).

I had success when I removed the BatchNormalization from the discriminator only.

I suspect that the Discriminator is training too quickly relative to the generator. When you remove BatchNormalization from the discriminator, it trains more slowly, closer to the training rate of the generator.
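
If the discriminator is indeed outpacing the generator, one common heuristic (not something proposed in this thread) is to skip discriminator updates while its recent accuracy is very high. A minimal sketch, in which the helper `should_update_discriminator` and the 0.8 threshold are assumptions chosen purely for illustration:

```python
def should_update_discriminator(recent_acc, max_acc=0.8):
    """Return True while the mean of the recent accuracies is at most max_acc."""
    if not recent_acc:
        # No history yet, so train the discriminator as usual.
        return True
    return sum(recent_acc) / len(recent_acc) <= max_acc


# Hypothetical accuracy histories from the last few training iterations.
print(should_update_discriminator([0.55, 0.60, 0.65]))  # True: roughly balanced
print(should_update_discriminator([0.97, 0.99, 1.00]))  # False: discriminator dominates
```

In a training loop, the discriminator's `train_on_batch` call would run only when this helper returns `True`, while the generator is updated on every iteration.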

@bladebump

I also had the same problem. I changed the model to use max pooling and removed BatchNormalization. It works, but not well.

@marckolak

I had the same problem.
I removed the BatchNormalization layers and the tanh activation layer. I also added a Dropout layer to the discriminator to avoid overfitting.
Here are the modified models for reference:

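from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import (Conv2D, Conv2DTranspose, Dense, Dropout,
                                     Flatten, LeakyReLU, Reshape)


def build_generator(img_shape, z_dim):
    model = Sequential()
    # z_dim is assumed to be an integer (for example 100), so wrap it in a tuple
    model.add(Input(shape=(z_dim,)))

    # Project the noise vector and reshape it into a 7x7x256 tensor
    model.add(Dense(256*7*7))
    model.add(Reshape((7,7,256)))

    # Upsample 7x7 -> 14x14
    model.add(Conv2DTranspose(128, kernel_size=3, strides=2, padding='same'))
    model.add(LeakyReLU(alpha=0.01))

    # Keep the 14x14 resolution, reduce the channel count
    model.add(Conv2DTranspose(64, kernel_size=3, strides=1, padding='same'))
    model.add(LeakyReLU(alpha=0.01))

    # Upsample 14x14 -> 28x28 with one channel; no tanh, so the output is unbounded
    model.add(Conv2DTranspose(1, kernel_size=3, strides=2, padding='same'))

    return model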

def build_discriminator(img_shape):
    model = Sequential()
    model.add(Input(shape=img_shape))

    # Three stride-2 convolutions, each reducing the spatial resolution
    model.add(Conv2D(32, kernel_size=3, strides=2, padding='same'))
    model.add(LeakyReLU(alpha=0.01))

    model.add(Conv2D(64, kernel_size=3, strides=2, padding='same'))
    model.add(LeakyReLU(alpha=0.01))

    model.add(Conv2D(128, kernel_size=3, strides=2, padding='same'))
    model.add(LeakyReLU(alpha=0.01))
    # Dropout added to reduce overfitting in the discriminator
    model.add(Dropout(0.4))

    # Single sigmoid unit: probability that the input image is real
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))

    return model

The results are much better.
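
As a quick sanity check on the models above, the spatial sizes can be traced by hand. The sketch below assumes a 28x28 single-channel input (MNIST-sized, which the thread implies but does not state) and the usual `padding='same'` rules: a strided convolution yields `ceil(size / stride)` and a transposed convolution yields `size * stride`. The helper names are hypothetical.

```python
import math


def same_conv_out(size, stride):
    """Spatial size after a Conv2D layer with padding='same'."""
    return math.ceil(size / stride)


def same_deconv_out(size, stride):
    """Spatial size after a Conv2DTranspose layer with padding='same'."""
    return size * stride


# Generator: starts from a 7x7 feature map, then strides 2, 1, 2.
g = 7
for s in (2, 1, 2):
    g = same_deconv_out(g, s)
print(g)  # 28

# Discriminator: assumed 28x28 input, then three stride-2 convolutions.
d = 28
for s in (2, 2, 2):
    d = same_conv_out(d, s)
print(d, d * d * 128)  # 4 2048
```

Under these assumptions the generator output matches the discriminator input at 28x28, and the `Flatten` layer feeds 2048 features into the final `Dense(1)`.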

@Chiuchiyin

It kind of works after the changes marckolak suggested. I probably trained the model for too long and ended up with mode collapse.

image
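
One rough way to notice mode collapse during training, offered here as an illustration rather than something used in this thread, is to track how different the generated samples in a batch are from one another. The sketch below uses the mean pairwise L1 distance over flattened samples; the function name and the toy data are hypothetical.

```python
from itertools import combinations


def mean_pairwise_distance(samples):
    """Average L1 distance between every pair of flattened samples."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        return 0.0
    total = sum(sum(abs(a - b) for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


# Hypothetical flattened "images": a varied batch and a collapsed one.
varied = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.5, 0.5, 0.0]]
collapsed = [[0.2, 0.8, 0.1], [0.2, 0.8, 0.1], [0.2, 0.8, 0.1]]

print(mean_pairwise_distance(varied))     # 2.0
print(mean_pairwise_distance(collapsed))  # 0.0
```

A value that keeps shrinking toward zero across iterations suggests the generator is producing nearly identical outputs, which can be a cue to stop training or restore an earlier checkpoint.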
