Abstract - reference
In this work, we address the problem of cross-dataset learning, where classification models are learned on well-known challenging image datasets (Caltech-101, Scene-15, etc.) and applied to videos from other challenges (PETS, ETISEO, etc.). We propose to study the robustness and effectiveness of several coding schemes for building Bag-of-Words features, toward concept detection in typical video-surveillance scenes. This analysis leads to practical results and tools for both robust classification of these scenes and efficient multi-concept localization, which are relevant to improving the accuracy of high-level video-surveillance algorithms, such as tracking and foreground object detection.
Dataset
This benchmark is made available for reproducible research purposes only.
The purpose of this benchmark is to detect four concepts in video-surveillance data without providing any learning database for this type of image. The approach we propose is to learn these concepts on available databases from the multimedia domain, although they are different in nature. One interest of our work is to study the robustness of different coding methods (see the paper for details). The concepts we are interested in are the following:
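As a point of reference for the coding methods studied in the paper, the simplest scheme is hard-assignment Bag-of-Words coding: each local descriptor votes for its nearest visual word in a pre-trained codebook, and the votes are accumulated into a normalized histogram. The sketch below illustrates this baseline only; function names, shapes, and the use of a k-means codebook are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Quantize local descriptors against a codebook and return an
    L1-normalized Bag-of-Words histogram.

    descriptors: (n, d) array of local features (e.g. SIFT)
    codebook:    (k, d) array of visual words (e.g. k-means centers)
    """
    # Squared Euclidean distance from every descriptor to every word
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # Hard assignment: each descriptor votes for its nearest word
    assignments = dists.argmin(axis=1)
    hist = np.bincount(assignments, minlength=len(codebook)).astype(float)
    return hist / hist.sum()  # L1 normalization
```

Soft-assignment and sparse-coding variants replace the `argmin` step with weighted or sparse votes over several nearby words, which is where the robustness differences studied in the paper arise.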
Learning
Since this benchmark deals with cross-domain learning, you can learn the four concepts on any database except those used to build the test database (see below). In practice, we used the following:
One can also consider using other databases:
If you participated in these evaluation campaigns, you may consider using:
Testing
Annotations and data will be made available after acceptance of the paper at ICIP (see references).
Test data were extracted from publicly available video-surveillance datasets, namely: