X2-VLM: All-in-One Pre-Trained Model for Vision-Language Tasks
Work
Year: 2023
Type: article
Abstract: Vision language pre-training aims to learn alignments between vision and language from a large amount of data. Most existing methods only learn image-text alignments. Some others utilize pre-trained o...
Cites: 70
Cited by: 22
Related to: 10
FWCI: 5.122
Citation percentile (by year/subfield): 100
Subfield: Computer Vision and Pattern Recognition
Field: Computer Science
Domain: Physical Sciences
Open Access status: green