In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval
Work
Year: 2023
Type: preprint
Abstract: Large-scale noisy web image-text datasets have been proven to be efficient for learning robust vision-language models. However, when transferring them to the task of video retrieval, models still need...
Source: arXiv (Cornell University)
Related to: 10
Subfield: Computer Vision and Pattern Recognition
Field: Computer Science
Domain: Physical Sciences
Sustainable Development Goal: Quality education
Open Access status: green