TV_VTT (TrecVid Video-To-Text) Dataset

Published by National Institute of Standards and Technology | Metadata Last Checked: June 27, 2025 | Last Modified: 2025-01-06 00:00:00

This dataset contains short videos (3 to 10 seconds long) from the TRECVID VTT task, covering 2016 to 2024. There are 73,893 videos with captions. Each video has between 2 and 5 captions, written by dedicated annotators hired by NIST.

Complete Metadata

bureauCode	[ "006:55" ]
identifier	ark:/88434/mds2-2545
issued	2022-02-14
landingPage	https://data.nist.gov/od/id/mds2-2545
language	[ "en" ]
programCode	[ "006:045" ]
theme	[ "Information Technology:Data and informatics" ]