Generic TV Advertisement Detection Using Progressively Balanced Perceptron Trees

Title: Generic TV Advertisement Detection Using Progressively Balanced Perceptron Trees
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Kannao, Raghvendra; Guha, Prithwijit
Conference Name: Proceedings of the Tenth Indian Conference on Computer Vision, Graphics and Image Processing
Date Published: December 2016
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-4753-2
Keywords: artificial intelligence, Computer vision, Computer vision problems, computing methodologies, machine learning, machine learning approaches, Machine learning theory, Models of learning, Neural networks, pubcrawl, pubcrawl170201, science of security, Theory and algorithms for application domains, Theory of computation, Video segmentation

Automatic detection of TV advertisements is of paramount importance to media monitoring agencies. Existing work in this domain has mostly focused on news channels using news-specific features. Most commercial products use near-copy detection algorithms instead of generic advertisement classification. A generic detector needs to handle the inter-class and intra-class imbalances present in the data, which arise from variability in the content aired across channels and the frequent repetition of advertisements. Such imbalances bias classifiers towards one of the classes and thus require special treatment. We propose to use a tree of perceptrons to solve this problem. The training data available for each perceptron node is balanced using cluster-based over-sampling and Tomek link cleaning as we traverse the tree downwards. The trained perceptron node then passes the original unbalanced data to its children. This process is repeated recursively until we reach the leaf nodes. We call this new algorithm the "Progressively Balanced Perceptron Tree". We have also contributed a TV advertisement dataset consisting of 250 hours of video recorded from five non-news TV channels of different genres. Experiments on this dataset show that the proposed approach has comparatively superior and balanced performance with respect to six baseline methods. Our proposal generalizes well across channels and across varying training data sizes, and achieved a top F1-score of 97% in detecting advertisements.
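
The recursive procedure described in the abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the abstract does not specify how data is divided among children, the stopping rule, or the exact balancing details, so the split by perceptron output side, the depth and size limits, the small k-means used for clustering, the removal of both members of a Tomek link, and every function name and parameter below are assumptions. The toy two-class data is synthetic and stands in for real video features.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans_labels(X, k, iters=10):
    """Tiny k-means; returns a cluster index for every row of X."""
    k = min(k, len(X))
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(0)
    return assign

def cluster_oversample(X, y, k=2):
    """Resample every cluster of every class up to the largest cluster size."""
    groups = []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        assign = kmeans_labels(X[idx], k)
        groups += [idx[assign == j] for j in np.unique(assign)]
    target = max(len(g) for g in groups)
    picked = np.concatenate(
        [np.concatenate([g, rng.choice(g, size=target - len(g))]) for g in groups])
    return X[picked], y[picked]

def remove_tomek_links(X, y):
    """Drop both points of each Tomek link (mutual nearest neighbours, different labels)."""
    dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(dist, np.inf)
    nn = dist.argmin(1)
    in_link = (nn[nn] == np.arange(len(X))) & (y[nn] != y)
    return X[~in_link], y[~in_link]

def train_perceptron(X, y, epochs=20):
    """Classic perceptron rule on 0/1 labels; the bias is the last weight."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, t = np.zeros(Xb.shape[1]), 2 * y - 1
    for _ in range(epochs):
        for j in rng.permutation(len(Xb)):
            if t[j] * (Xb[j] @ w) <= 0:
                w += t[j] * Xb[j]
    return w

def side(w, X):
    """Which side of the perceptron hyperplane each row falls on (0 or 1)."""
    return (np.hstack([X, np.ones((len(X), 1))]) @ w > 0).astype(int)

def build(X, y, depth=0, max_depth=3, min_size=10):
    """Train each node on balanced data, then route the ORIGINAL data to its children."""
    leaf = {"label": int(np.bincount(y, minlength=2).argmax())}
    if depth == max_depth or len(y) < min_size or len(np.unique(y)) == 1:
        return leaf
    Xbal, ybal = remove_tomek_links(*cluster_oversample(X, y))
    w = train_perceptron(Xbal, ybal)
    s = side(w, X)  # original, unbalanced samples are passed down
    if s.min() == s.max():  # the perceptron did not split the node
        return leaf
    kids = [build(X[s == b], y[s == b], depth + 1, max_depth, min_size) for b in (0, 1)]
    return {"w": w, "children": kids}

def predict_one(node, x):
    """Follow perceptron decisions from the root down to a leaf label."""
    while "w" in node:
        node = node["children"][side(node["w"], x[None, :])[0]]
    return node["label"]

# Synthetic imbalanced data: 200 samples of class 0, 20 samples of class 1.
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(3, 1, (20, 2))])
y = np.array([0] * 200 + [1] * 20)
tree = build(X, y)
pred = np.array([predict_one(tree, x) for x in X])
```

In this sketch the balancing affects only the data each perceptron is trained on, while leaf labels are taken from the majority class of the original samples that reach the leaf. The pairwise distance matrices make it suitable only for small datasets.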

Citation Key: kannao_generic_2016