Why it is important
- The RF of a pixel in the feature map of a certain layer is the region of the original input data that contributes to that pixel.
- A larger RF captures more global features, while a smaller one captures detailed, local features. It can be considered an indicator of the level of abstraction.
- It determines where, and over how large a region, the convs within the network are 'seeing' the original input.
Calculation
- RF_l = RF_{l-1} + (k_l - 1) * s_1 * s_2 * ... * s_{l-1}, with RF_0 = 1, where k_l is the kernel size and s_i the stride of layer i.
RF calculation for a CNN: the RF of a layer depends on the kernel sizes and strides of all previous layers (see the sketch below).
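A minimal sketch of this recurrence, assuming a hypothetical `receptive_field` helper that takes the layer stack as (kernel_size, stride) pairs:

```python
def receptive_field(layers):
    """Compute the RF of the last layer w.r.t. the input.

    layers: list of (kernel_size, stride) tuples, first layer first.
    """
    rf = 1     # RF_0: a single input pixel
    jump = 1   # product of strides of all previous layers
    for k, s in layers:
        rf += (k - 1) * jump  # RF_l = RF_{l-1} + (k_l - 1) * s_1 * ... * s_{l-1}
        jump *= s
    return rf

# e.g. three 3x3 convs, the second with stride 2:
print(receptive_field([(3, 1), (3, 2), (3, 1)]))  # -> 9
```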
RF for dilated conv
The above equation still holds; the only difference is that the kernel size must first be translated into the equivalent size of a non-dilated conv. For instance, a kernel size of 3 with a rate of 1 translates to 3; a kernel size of 3 with a rate of 2 translates to 5; a kernel size of 3 with a rate of 4 translates to 9.
How to translate?
(rate - 1) zeros are inserted between every two adjacent elements of the kernel, so the effective kernel size is k + (k - 1) * (rate - 1), as in the sketch below.
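A small sketch of this translation, with a hypothetical `effective_kernel` helper; the result plugs directly into the RF recurrence above in place of k:

```python
def effective_kernel(k, rate):
    # (rate - 1) zeros between adjacent kernel elements
    return k + (k - 1) * (rate - 1)

for rate in (1, 2, 4):
    print(rate, effective_kernel(3, rate))  # -> 3, 5, 9, matching the examples above
```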
Questions
- RF explains only a small part of the black box; it concerns the size of the region, not its location: how can we better control it to learn better features?
- Isn't it weird for the last layer to have an RF even larger than the input data itself? Or is redundancy the point of DL models?