
Classification of Consumer Video by Soundtrack

We have been investigating the automatic classification of consumer videos, i.e., the kind of raw, unedited video shot with small cameras. We downloaded a large number of videos from YouTube, filtered them to retain only true consumer-style videos, and manually labeled them according to 25 concepts. These concepts were chosen, based on a user study, as the kinds of terms users would like to use to retrieve videos.

As an aid to others wishing to work on this problem, we are making available our labeled data here. We provide references to the YouTube videos, along with our manual labels. There are 1,873 videos in this collection.


The file youtube-labels.txt contains 1873 lines, each describing one video. The first column is the row number, the second gives our internal name for the video, and the remaining 25 columns give the binary labels indicating whether the video is relevant to the 25 concepts we used, as defined in the list of concepts. The actual YouTube URLs corresponding to each internal name are given in youtube-refs.txt.
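For readers working outside Matlab, the layout of youtube-labels.txt described above can be read with a few lines of Python. This is only a minimal sketch, not part of the distributed code: it assumes the columns are separated by whitespace and that the binary labels are written as 0 and 1, neither of which is stated above, and the names VideoLabels, parse_label_lines and concept_counts are hypothetical.

```python
from typing import Iterable, List, NamedTuple

NUM_CONCEPTS = 25  # number of binary label columns described above


class VideoLabels(NamedTuple):
    row: int           # first column: row number
    name: str          # second column: internal name for the video
    labels: List[int]  # remaining 25 columns: one 0/1 flag per concept


def parse_label_lines(lines: Iterable[str]) -> List[VideoLabels]:
    """Parse lines laid out like youtube-labels.txt.

    Assumption: columns are whitespace-separated and labels are 0 or 1.
    Adjust the split and the check below if the real file differs.
    """
    records = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2 + NUM_CONCEPTS:
            raise ValueError(
                f"line {lineno}: expected {2 + NUM_CONCEPTS} columns, "
                f"got {len(fields)}"
            )
        labels = [int(value) for value in fields[2:]]
        if any(value not in (0, 1) for value in labels):
            raise ValueError(f"line {lineno}: labels must be 0 or 1")
        records.append(VideoLabels(int(fields[0]), fields[1], labels))
    return records


def concept_counts(records: List[VideoLabels]) -> List[int]:
    """Count how many videos are marked relevant to each concept."""
    counts = [0] * NUM_CONCEPTS
    for record in records:
        for index, value in enumerate(record.labels):
            counts[index] += value
    return counts
```

With the file downloaded, `parse_label_lines(open("youtube-labels.txt"))` should return one record per video, and `concept_counts` gives a quick check of how many positive examples each concept has.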

Since not all of these clips are still available, we also distribute an archive of the soundtracks in MP3 format: videoSndtrkClassData.tar.gz (1.4 GB) contains the 1873 soundtracks. (Note that there are a small number of repeats in this data, as noted in this errata list.)


We have also made available an example implementation of our baseline system. videoSndtrkClassCode.tar.gz contains the Matlab code, which can be run with the data above; here is its README.


If you make use of this data, please reference the following:

K. Lee and D. Ellis (2010)
Audio-Based Semantic Concept Classification for Consumer Video
IEEE Tr. Audio, Speech and Lang. Proc. vol. 18 no. 6 pp. 1406-1416, Aug. 2010.
DOI: 10.1109/TASL.2009.2034776


This material is based in part upon work supported by the National Science Foundation under grant no. IIS-0238301, by the National Geospatial Intelligence Agency under FY09 NURI HM1582-09-1-0036, and by Eastman Kodak Corporation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the sponsors.

Last updated: 2011/06/29 19:33:01
Graham Poliner