
Effects of Spatial Transformer Location on Segmentation Performance of a Dense Transformer Network


Semantic segmentation is the task of labelling every pixel in
an image with its class label, and remains an important unsolved
problem. While significant work has gone into using deep learning
to solve this problem, almost all existing research uses methods
that do not modify the spatial context considered for the
pixel being labelled. Spatial information is an important cue in tasks
such as segmentation, so reusing the same spatial span for every pixel
and every label may not be the best approach. Spatial Transformer
Networks have shown promising results in improving the classification
performance of existing networks by allowing them to actively
manipulate their input data. Our work
shows the benefit of incorporating Spatial Transformer Networks
and their corresponding decoders into networks tailored to semantic
segmentation. Our experiments show an improvement in performance
over baseline networks when using networks augmented
with Spatial Transformers.