About Align Corners#

This serves to be a brief, but complete writeup on explaining what align_corners actually is, how it impacts interpolation, and when we should set it or not. Unless otherwise specified, align_corners refers to PyTorch code.

TL;DR: align_corners=False in pytorch is proper and generally better.

History#

There’s an excellent writeup by @hollance that explains the history of align_corners in various frameworks. In summary:

  1. BAD: TensorFlowV1 implemented bilinear resizing incorrect. When doubling resolution, this results in an image that shifts up one pixel and left one pixel. This also repeats the final row and column.

  2. So-so: TensorFlow attempted to fix this in a backwards compatible way if the caller sets align_corners=True. This implementation, unfortunately, is also incorrect. The distance between successive pixels depends on the size of the image. Intuitively, the distance between pixels should be fixed. PyTorch uses this semi-bugged implementation when align_corners=True.

  3. GOOD: PyTorch’s align_corners=False is actually different from TensorFlow’s. This is the “proper” way to bilinearly resize an image.

Explanation#

The linked articles are in the context of resizing images, but I’m going to present a pytorch example in the context of normalized coordinates to bring it all together.

Consider a 2 pixel wide, 1 pixel tall image. The left pixel has intensity value 100, and the right pixel has intensity value 200. The value of a pixel is considered as a point at it’s center, X.

            align_corners=False
 +-------------------+-------------------+
 |                   |                   |
 |                   |                   |
 |                   |                   |
 |                   |                   |
 |         X         |         X         |
 |    Value: 100     |    Value: 200     |
 |                   |                   |
 |                   |                   |
 |                   |                   |
 +-------------------+-------------------+
-1       -0.5        0        0.5        1   Norm. Coords.


            align_corners=True
 +-------------------+-------------------+
 |                   |                   |
 |                   |                   |
 |                   |                   |
 |                   |                   |
 |         X         |         X         |
 |    Value: 100     |    Value: 200     |
 |                   |                   |
 |                   |                   |
 |                   |                   |
 +-------------------+-------------------+
         -1          0         1             Norm. Coords.
  • align_corners=False coordinate system starts at the left side of the left-most pixel, and has it end at the right side of the right-most pixel.

  • align_corners=True coordinate system starts at the center of the left-most pixel, and has it end at the center of the right-most pixel.

For align_corners=False, what border values is it interpolating between?#

PyTorch has a padding_mode argument for querying border pixels. By default, padding_mode="zeros", meaning that there’s a border of pixels with value 0 around the image. For example, sampling this image at normalized coordinates (-1, -1) will sample the top left corner of the left pixel. Due to bilinear interpolation with the surrounding zero-padding, the resulting value would be 1/4 the pixel’s value; i.e. 25.

That sounds bad.#

Probably, it is an actively researched topic. In torchsample we change the default to padding_mode="border", meaning that the padding has the same value as it’s neighboring valid pixel value. With this padding mode, sampling at (-1, -1) would result in the full value 100. The resulting sampled image/featuremap looks how one would expect it to look.