Abstract: Referring video object segmentation aims to segment objects within a video corresponding to a given text description. Existing transformer-based temporal modeling approaches face challenges ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results