FlowIt: Global Matching via Hierarchical Transformers and Optimal Transport for Optical Flow


[Figure: FlowIt teaser]

We present FlowIt, a novel architecture for optical flow estimation that combines global matching with confidence- and occlusion-guided refinement. At its core, FlowIt uses a hierarchical transformer that captures extensive global context, enabling it to model long-range correspondences. To overcome the limitations of localized matching, we formulate flow initialization as an optimal transport problem. This formulation yields a robust initial flow field, along with explicitly derived occlusion and confidence maps. These cues are then integrated into a guided refinement stage, where the network propagates reliable motion estimates from high-confidence regions into ambiguous, low-confidence areas.


Method

FlowIt extracts multi-scale features from the input images using a CNN encoder followed by a Feature Pyramid Network (FPN). These features are processed with one or more Multi-Resolution Transformer (MRT) blocks. A 4D correlation volume is constructed from the $\frac{1}{4}$ resolution features, and optimal transport is applied to it to produce a 4D probability map. Initial flow, occlusion, and confidence maps are derived from this probability map. These predictions are then refined over three iterations to obtain the final outputs.
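
To make the matching step concrete, here is a minimal sketch of how a correlation volume can be turned into a probability map with entropic optimal transport, and how a flow and a confidence value can be read off it. It is an illustration under stated assumptions, not the FlowIt implementation: the page does not say which solver or read-out is used, so the Sinkhorn iterations with uniform marginals, the expected-coordinate flow, and the max-probability confidence are all assumptions, as are the function names `correlation_volume`, `sinkhorn`, and `flow_and_confidence`. The 4D volume is flattened to an N x N matrix (N = H * W), both frames are assumed to have the same resolution, and the occlusion map is left out because its derivation is not described here.

```python
import math

def correlation_volume(feat1, feat2):
    """Scaled dot-product correlation between all pixel pairs.
    feat1, feat2: lists of N feature vectors (a flattened H*W grid)."""
    c = len(feat1[0])
    return [[sum(a * b for a, b in zip(f1, f2)) / math.sqrt(c) for f2 in feat2]
            for f1 in feat1]

def sinkhorn(scores, n_iters=20, eps=1.0):
    """Entropic optimal transport with uniform marginals (assumed solver).
    Assumes a square score matrix. Returns a row-normalized matching matrix.
    A log-domain variant would be more robust to underflow on real features."""
    n, m = len(scores), len(scores[0])
    top = max(max(row) for row in scores)  # subtract the max before exponentiating
    kernel = [[math.exp((s - top) / eps) for s in row] for row in scores]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns towards unit sums.
        u = [1.0 / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [1.0 / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    # Final row normalization so each source pixel has a distribution over targets.
    return [[p / sum(row) for p in row] for row in plan]

def flow_and_confidence(prob, height, width):
    """Hypothetical read-out: flow is the expected target coordinate minus the
    source coordinate; confidence is the largest matching probability."""
    assert len(prob) == height * width
    flow, conf = [], []
    for i, row in enumerate(prob):
        y, x = divmod(i, width)
        exp_x = sum(p * (j % width) for j, p in enumerate(row))
        exp_y = sum(p * (j // width) for j, p in enumerate(row))
        flow.append((exp_x - x, exp_y - y))
        conf.append(max(row))
    return flow, conf

# Toy example: a 2x3 grid with one distinctive feature per pixel. Frame 2 is
# frame 1 shifted one pixel to the right (cyclically within each row).
H, W = 2, 3
N = H * W
feat1 = [[10.0 if k == i else 0.0 for k in range(N)] for i in range(N)]
feat2 = [None] * N
for i in range(N):
    y, x = divmod(i, W)
    feat2[y * W + (x + 1) % W] = feat1[i]

prob = sinkhorn(correlation_volume(feat1, feat2))
flow, conf = flow_and_confidence(prob, H, W)
```

In this toy case the features are unambiguous, so the probability map is close to a permutation matrix: the pixel at (x=0, y=0) gets a flow of about (1, 0) and every confidence is close to 1. With real features the rows are softer, and low-confidence pixels are the ones a guided refinement stage would need to fill in from more reliable neighbors.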


Fine-Tuned Results


Zero-Shot Results

We compare FlowSeek, SEA-RAFT, WAFT, and our method on several scenes in a zero-shot setting.

[Figure: zero-shot comparison on Sintel frame360, showing Frame 1, the Frame 1 → Frame 2 animation, and the flow outputs of WAFT and Ours.]

BibTeX

@article{safadoust2026flowit,
  title={FlowIt: Global Matching via Hierarchical Transformers and Optimal Transport for Optical Flow},
  author={Safadoust, Sadra and Tosi, Fabio and Poggi, Matteo and G{\"u}ney, Fatma},
  journal={arXiv preprint arXiv:2603.28759},
  year={2026}
}