

In this paper, we propose a new framework for detecting the unauthorized dumping of garbage in real-world surveillance camera footage. Although several action and behavior recognition methods have been investigated, they are hardly applicable to real-world scenarios because they focus mainly on well-refined datasets. Because dumping actions in the real world take a variety of forms, building a new method to detect them is a better strategy than exploiting previous approaches. We detected the dumping action from the change in the relation between a person and the object that person holds. To find the person-held object, which has no definite form, we used a background subtraction algorithm and human joint estimation. The person-held object was then tracked, and a relation model between the joints and the object was built. Moreover, this paper describes the extensibility of the method to a new action contained in a video from a domain different from that of the dataset used. The method using object information achieves an F-measure of 90.27%.
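The idea of detecting dumping from a change in the person-object relation can be illustrated with a minimal sketch. It assumes that a hand-joint position and a tracked object centroid are already available for each frame, for example from a pose estimator and a background-subtraction tracker, and it stands in for the relation model with a plain distance rule. The function name `detect_release` and the thresholds `hold_dist`, `release_dist`, and `min_hold_frames` are hypothetical and are not taken from the paper.

```python
from math import hypot

def detect_release(hand_xy, object_xy, hold_dist=40.0,
                   release_dist=120.0, min_hold_frames=3):
    """Return the first frame index at which a held object is judged
    to have been released, or None if no release is observed.

    hand_xy, object_xy: per-frame (x, y) positions in pixels.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    held_frames = 0    # consecutive frames with the object close to the hand
    was_held = False   # becomes True once the object counts as "held"
    for i, ((hx, hy), (ox, oy)) in enumerate(zip(hand_xy, object_xy)):
        d = hypot(hx - ox, hy - oy)
        if d <= hold_dist:
            held_frames += 1
            if held_frames >= min_hold_frames:
                was_held = True
        else:
            held_frames = 0
            # A release is a previously held object moving far from the hand.
            if was_held and d >= release_dist:
                return i
    return None

# Usage: the object stays near the hand for three frames, then is left behind.
hand = [(0, 0)] * 6
obj = [(10, 0), (10, 0), (10, 0), (60, 0), (130, 0), (130, 0)]
release_frame = detect_release(hand, obj)
```

Using two thresholds rather than one gives a simple hysteresis, so small tracking jitter around a single cutoff does not trigger a detection in this sketch.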

For video understanding, namely analyzing who did what in a video, actions along with objects are primary elements. Most studies on actions have handled recognition problems for well-trimmed videos and focused on enhancing classification performance. However, action detection, which includes localization as well as recognition, is required because actions generally intersect in time and space. In addition, most studies have not considered extensibility for a newly added action that has been previously trained. Therefore, this paper proposes an extensible hierarchical method for detecting generic actions, which combine object movements and spatial relations between two objects, and inherited actions, which are determined by the related objects through an ontology- and rule-based methodology. The hierarchical design of the method enables it to detect any interactive action based on the spatial relations between two objects.
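The two-level structure of generic and inherited actions can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: the spatial relation is reduced to whether two axis-aligned bounding boxes overlap, object movement is ignored, and the ontology is replaced by a flat lookup table. The names `boxes_overlap`, `generic_action`, `inherited_action`, and `RULES`, as well as the action and class labels, are hypothetical.

```python
def boxes_overlap(a, b):
    """True if two boxes given as (x1, y1, x2, y2) intersect."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2

def generic_action(prev_a, prev_b, cur_a, cur_b):
    """Derive a class-agnostic action from the change in spatial relation
    between two objects across two frames."""
    before = boxes_overlap(prev_a, prev_b)
    after = boxes_overlap(cur_a, cur_b)
    if before and not after:
        return "separate"
    if not before and after:
        return "join"
    return None  # relation unchanged: no generic action in this sketch

# Hypothetical rule table: (generic action, class A, class B) -> inherited action.
RULES = {
    ("separate", "person", "bag"): "put_down",
    ("join", "person", "bag"): "pick_up",
}

def inherited_action(generic, class_a, class_b, rules=RULES):
    """Specialize a generic action by the classes of the related objects.
    Falls back to the generic action when no rule matches."""
    return rules.get((generic, class_a, class_b), generic)

# Usage: a bag overlapping a person in one frame is apart from them in the next.
person = (0, 0, 10, 20)
bag_near = (8, 10, 14, 16)
bag_far = (30, 10, 36, 16)
g = generic_action(person, bag_near, person, bag_far)
action = inherited_action(g, "person", "bag")
```

In this sketch, supporting a new inherited action amounts to adding one entry to `RULES`, which mirrors the extensibility argument in spirit while leaving the generic level unchanged.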
