Papers

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning

Antoine Yang, Arsha Nagrani, Paul Hongsuck Seo, Antoine Miech, Jordi Pont-Tuset, Ivan Laptev, Josef Sivic, Cordelia Schmid

CVPR 2023

@inproceedings{ventura2023covr,
  title={CoVR: Learning Composed Video Retrieval from Web Video Captions},
  author={Lucas Ventura and Antoine Yang and Cordelia Schmid and Gul Varol},
  booktitle={arXiv},
  year={2023}
}