Tracking the hockey puck in broadcast video remains a difficult problem in sports analytics due to the puck’s small size, high velocity, frequent occlusions, and heavy compression artifacts. In this paper, we benchmark seven state-of-the-art trackers—Siamese-FC, ARTrack, DiMP, SuperDiMP, PrDiMP, ToMP, and TaMOs—on a large broadcast hockey dataset. We analyze how existing tracking paradigms break down when applied to fast, small objects in cluttered environments. Our experiments reveal that approaches that rely on localized search regions are highly susceptible to irreversible drift, stick-locking, and blur-induced feature loss. In contrast, global frame-level reasoning (TaMOs) enables superior re-detection and stability across occlusions. However, even the best-performing trackers struggle with severe motion blur and extreme spatial aliasing. These findings expose fundamental limitations in current tracking assumptions and highlight the need for architectures that preserve fine-grained detail, are domain-specific, and reason globally.