Interpreting weights of MNIST Denoising Autoencoder

I have implemented a denoising autoencoder and have plotted the learnt weights, which look like this:

[image: grid of learned hidden-unit weight visualizations]
In many of these images I notice that the network is capturing curves at different positions. But some neurons look noisy, for example the one in the 2nd row, 3rd column. I can see a 3 in black in it, but my images are digits from the MNIST dataset, with a black background and the digit in white. So does this particular neuron actually recognize a 3, even though the 3 appears in black while my input digits are white? Or have neurons like this simply not learned anything yet?
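Roughly, the plot above comes from reshaping each hidden unit's 784-dimensional weight vector to 28x28 and showing it as an image (a minimal sketch; the function name, grid layout, and hidden-unit count here are illustrative, not my exact code):

```python
import matplotlib.pyplot as plt
import numpy as np

def plot_hidden_weights(W, n_rows=4, n_cols=8):
    """Render each hidden unit's 784 incoming weights as a 28x28 image."""
    fig, axes = plt.subplots(n_rows, n_cols, figsize=(n_cols, n_rows))
    for i, ax in enumerate(axes.flat):
        if i < W.shape[0]:
            ax.imshow(W[i].reshape(28, 28), cmap='gray')
        ax.axis('off')
    plt.tight_layout()
    plt.show()

# Random weights standing in for the learned (hidden_dim, 784) matrix:
plot_hidden_weights(np.random.randn(32, 784))
```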

We cannot always reason out exactly which features the learnt convolutional filters capture.
The best we can do is build some intuition by experimenting.

What you could do is visualize the output of each layer's channels after passing in a specific input, say an image of the digit 3, and then compare those activations with the corresponding filters to get some intuition about what each one is doing.
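For example, in PyTorch you could register a forward hook to capture a layer's output (a minimal sketch with a toy model; substitute your own model, layer, and input image):

```python
import torch
import torch.nn as nn

# Toy CNN (illustrative only; use your own model here).
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
)

activations = {}

def save_activation(name):
    def hook(module, inp, out):
        activations[name] = out.detach()
    return hook

# Hook the first conv layer so its output is stored on every forward pass.
handle = model[0].register_forward_hook(save_activation('conv1'))

x = torch.randn(1, 1, 28, 28)  # stand-in for an MNIST image of a 3
with torch.no_grad():
    model(x)
handle.remove()

# One activation map per learned filter, to compare against the filters.
print(activations['conv1'].shape)  # torch.Size([1, 8, 28, 28])
```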

The interpretation also depends on which convolutional layer you're visualizing: early layers tend to capture low-level features such as edges and curves, while deeper layers respond to more abstract patterns.

I am not using a CNN. I am using a simple feed-forward network with an input layer (784 units), a single hidden layer, and an output layer (784 units).
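In PyTorch terms, the setup looks roughly like this (a sketch; the hidden size and sigmoid activations are assumptions, not necessarily what I used):

```python
import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    def __init__(self, hidden=256):  # hidden size is an assumption
        super().__init__()
        self.encoder = nn.Linear(784, hidden)
        self.decoder = nn.Linear(hidden, 784)

    def forward(self, x):
        h = torch.sigmoid(self.encoder(x))     # hidden code
        return torch.sigmoid(self.decoder(h))  # reconstruction

model = DenoisingAE()
# Each row of encoder.weight has 784 entries; reshaping a row to
# 28x28 gives one of the weight images shown in the question.
w_img = model.encoder.weight[0].detach().reshape(28, 28)
print(w_img.shape)  # torch.Size([28, 28])
```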