About Align Corners
This is meant to be a brief but complete writeup explaining what `align_corners` actually is, how it impacts interpolation, and when we should or should not set it. Unless otherwise specified, `align_corners` refers to PyTorch code.
TL;DR: `align_corners=False` in PyTorch is proper and generally better.
History#
There’s an excellent writeup by @hollance that explains the history of `align_corners` in various frameworks. In summary:
- BAD: TensorFlow V1 implemented bilinear resizing incorrectly. When doubling resolution, this results in an image that shifts up one pixel and left one pixel. It also repeats the final row and column.
- So-so: TensorFlow attempted to fix this in a backwards-compatible way that takes effect when the caller sets `align_corners=True`. This implementation, unfortunately, is also incorrect: the distance between successive pixels depends on the size of the image, whereas intuitively it should be fixed. PyTorch uses this semi-bugged implementation when `align_corners=True`.
- GOOD: PyTorch’s `align_corners=False` is actually different from TensorFlow’s. This is the “proper” way to bilinearly resize an image.
Explanation#
The linked articles are in the context of resizing images, but I’m going to present a PyTorch example in the context of normalized coordinates to bring it all together.
Consider a 2 pixel wide, 1 pixel tall image.
The left pixel has intensity value 100, and the right pixel has intensity value 200.
The value of a pixel is treated as a point sample at its center, marked X.
align_corners=False
+-------------------+-------------------+
|                   |                   |
|                   |                   |
|                   |                   |
|                   |                   |
|         X         |         X         |
|    Value: 100     |    Value: 200     |
|                   |                   |
|                   |                   |
|                   |                   |
+-------------------+-------------------+
-1      -0.5        0        0.5        1   Norm. Coords.
align_corners=True
+-------------------+-------------------+
|                   |                   |
|                   |                   |
|                   |                   |
|                   |                   |
|         X         |         X         |
|    Value: 100     |    Value: 200     |
|                   |                   |
|                   |                   |
|                   |                   |
+-------------------+-------------------+
         -1         0         1             Norm. Coords.
- The `align_corners=False` coordinate system starts at the left edge of the left-most pixel and ends at the right edge of the right-most pixel.
- The `align_corners=True` coordinate system starts at the center of the left-most pixel and ends at the center of the right-most pixel.
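
The two coordinate systems can be written down as a mapping from a normalized coordinate to a fractional pixel index, where index 0 is the center of the left-most pixel. The following is a minimal sketch in plain Python; the helper name `unnormalize` is made up for this example, and the formulas are my understanding of what PyTorch's grid sampling does, so treat them as an assumption rather than a reference.

```python
def unnormalize(coord: float, size: int, align_corners: bool) -> float:
    """Map a normalized coordinate in [-1, 1] to a fractional pixel index.

    Index 0 is the center of the first pixel; index size - 1 is the
    center of the last pixel. (Hypothetical helper for illustration.)
    """
    if align_corners:
        # -1 and 1 land on the centers of the first and last pixels.
        return (coord + 1) / 2 * (size - 1)
    # -1 and 1 land on the outer edges of the first and last pixels.
    return ((coord + 1) * size - 1) / 2


width = 2  # the 2-pixel-wide image from the diagrams
for align_corners in (False, True):
    for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(f"align_corners={align_corners}  x={x:+.1f}  "
              f"pixel index={unnormalize(x, width, align_corners):+.2f}")
```

For the 2-pixel image, `align_corners=False` sends `-0.5` and `0.5` to indices 0 and 1 (the two pixel centers) and sends `-1` and `1` half a pixel outside them, while `align_corners=True` sends `-1` and `1` directly to indices 0 and 1.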
For `align_corners=False`, what border values is it interpolating between?#
PyTorch has a `padding_mode` argument (for example in `torch.nn.functional.grid_sample`) that determines which values are used outside the image when sampling at or near its border. By default, `padding_mode="zeros"`, meaning that there’s a border of pixels with value `0` around the image. For example, sampling this image at normalized coordinates `(-1, -1)` will sample the top-left corner of the left pixel. Due to bilinear interpolation with the surrounding zero-padding, the resulting value would be 1/4 of the pixel’s value, i.e. `25`.
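
To make the arithmetic concrete, here is a small pure-Python sketch of bilinear sampling with zero padding under `align_corners=False`. It is not PyTorch's implementation; `sample_zeros` is a hypothetical helper that only mimics the behaviour described above, with the image stored as a list of rows.

```python
import math


def sample_zeros(image, x, y):
    """Bilinearly sample `image` at normalized (x, y) with zero padding.

    Assumes the align_corners=False convention. Hypothetical helper,
    written for illustration only.
    """
    height, width = len(image), len(image[0])
    # Normalized coordinates -> fractional pixel indices.
    px = ((x + 1) * width - 1) / 2
    py = ((y + 1) * height - 1) / 2
    x0, y0 = math.floor(px), math.floor(py)
    wx, wy = px - x0, py - y0  # weights of the right / lower neighbors

    def pixel(row, col):
        # Anything outside the image counts as 0 ("zeros" padding).
        if 0 <= row < height and 0 <= col < width:
            return image[row][col]
        return 0.0

    return (
        pixel(y0, x0) * (1 - wx) * (1 - wy)
        + pixel(y0, x0 + 1) * wx * (1 - wy)
        + pixel(y0 + 1, x0) * (1 - wx) * wy
        + pixel(y0 + 1, x0 + 1) * wx * wy
    )


image = [[100.0, 200.0]]  # 1 pixel tall, 2 pixels wide
corner = sample_zeros(image, -1.0, -1.0)
center = sample_zeros(image, -0.5, 0.0)
print(corner, center)
```

At `(-1, -1)` the four neighbors are three padding zeros and the left pixel, each weighted 1/4, so `corner` comes out as `25.0`. Sampling the left pixel's center at `(-0.5, 0)` returns the full `100.0`.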
That sounds bad.#
It probably is; border handling is an actively researched topic.
In `torchsample` we change the default to `padding_mode="border"`, meaning that the padding has the same value as its neighboring valid pixel. With this padding mode, sampling at `(-1, -1)` would result in the full value `100`. The resulting sampled image or feature map looks how one would expect it to look.
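
The difference between the two padding modes is easiest to see on a sweep across the example image. The sketch below samples the 2-pixel row along its vertical center (`y = 0`), so only the horizontal padding matters and the problem reduces to 1-D linear interpolation. `sample_row` is a hypothetical helper, and modelling `"border"` as clamping the pixel coordinate to the valid range is my assumption about the behaviour, not code taken from PyTorch or `torchsample`.

```python
import math


def sample_row(row, x, padding_mode):
    """Linearly sample a 1-pixel-tall image at normalized x (y = 0).

    Assumes align_corners=False. `padding_mode` is "zeros" or "border".
    Hypothetical helper, written for illustration only.
    """
    width = len(row)
    px = ((x + 1) * width - 1) / 2  # fractional pixel index
    if padding_mode == "border":
        # Assumption: border padding behaves like clamping the
        # coordinate to the range between the outermost pixel centers.
        px = min(max(px, 0.0), width - 1.0)
    x0 = math.floor(px)
    wx = px - x0  # weight of the right-hand neighbor

    def pixel(col):
        # Out-of-range pixels are zero; never reached with weight > 0
        # in "border" mode because of the clamp above.
        return row[col] if 0 <= col < width else 0.0

    return pixel(x0) * (1 - wx) + pixel(x0 + 1) * wx


row = [100.0, 200.0]
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
zeros = [sample_row(row, x, "zeros") for x in xs]
border = [sample_row(row, x, "border") for x in xs]
print("zeros: ", zeros)
print("border:", border)
```

With `"zeros"` the sweep reads 50, 100, 150, 200, 100, so the values dip toward zero at both edges. With `"border"` it reads 100, 100, 150, 200, 200, so the edge values hold steady. The edge value here is 50 rather than the 25 quoted earlier because this sweep only crosses the left and right borders, not a corner.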