ABSTRACT
In earlier study, the result of background reconstruction is strongly dependent on the order of data. Based on the assumption that background appears with large appearance frequency, a new background reconstruction algorithm based on modified basic sequential clustering is proposed in this research. First, pixel intensity in period of time are classified based on modified basic sequential clustering. Second, merging procedure is run to classified classes. Finally, pixel intensity classes, whose appearance frequencies are higher than a threshold, are selected as the background pixel intensity value, so the background model can represent the scene well. Compared with the background reconstruction method based on basic sequential clustering, the simulation results show that an assignment for the data is reached after the final cluster formation, at the same time those near classes are avoided at all and the effect of input order of data has been reduced greatly. And the background model can represent the scene well.
PDF Abstract XML References Citation
How to cite this article
DOI: 10.3923/itj.2008.1037.1042
URL: https://scialert.net/abstract/?doi=itj.2008.1037.1042
INTRODUCTION
Identifying moving objects from a video sequence is a fundamental and critical task in video surveillance, traffic monitoring and analysis. Several approaches are known to separate foreground from background. Background subtraction is a common approach to identifying the moving objects, especially, for a video sequence from a stationary camera. Pixels in the current frame that deviate significantly from the background are considered to be moving objects. Background reconstruction and foreground detection are two major steps in a background subtraction algorithm. Background reconstruction uses the new video frame to calculate and update a background model. Foreground detection then identifies pixels in the video frame that cannot be adequately explained by the background model and outputs them as a foreground mask. This research focuses on background reconstruction. A large number of background reconstruction methods for fixed cameras have been proposed in recent years. We classify background modeling techniques into approach based on background assumption (Long and Yang, 1990; Gloyer et al., 1995; Cucchiara et al., 2003; Kornprobst et al., 1999; Hou and Han, 2004; Xiao and Han, 2007), statistical model (Wren et al., 1997; Stauffer and Grimson, 1999; Javed et al., 2002; Lee et al., 2003; Zivkovic and Van, 2004; Elgammal et al., 2000) and prediction techniques (Toyama et al., 1995). They are described here:
An adaptive smoothness method (Long and Yang, 1990) based on the assumption that intensity of background is stable for a long period. Their method finds intervals of stable intensity and uses a heuristic which choose the longest, most stable interval as the one most likely to represent the background. Median filter is one of the most commonly-used background modeling techniques (Gloyer et al., 1995). The background estimate is defined to be the median at each pixel location of all the frames in the buffer. The assumption is that the pixel stays in the background for more than half of the frames in the buffer. An improved algorithm about Median filtering is presented in (Cucchiara et al., 2003) that has been extended to color by replacing the median with the medoid. Assumed that background would be defined as the most often observed part over the sequence, they (Kornprobst et al., 1999) presented a approach to deal with the background reconstruction and motion segmentation based on Partial Differential Equations (PDE), the result is good, but their method is complex and the parameters are difficult to choose. These methods can reconstruct background even though foreground objects are present in the sequence and can avoid blending. Pixel Intensity Classification (PIC) is presented in (Hou and Han, 2004). The method assume that background pixel intensity appears in image sequence with the maximum probability, then classified pixels intensity based on the calculated pixel intensity difference between inter-frame, finally, the intensity value with the maximum frequency is selected as the background pixel intensity value. In this study a multi background reconstruction based on online cluster has been proposed. Study algorithm adopted the assumption that background wouldn`t be the part appear a short time of the sequence. In this approach, first, pixel intensity is classified based on online cluster, then cluster center and appear probability of each class is calculated, finally, a single or multi pixel intensity, with the appear probability is higher than a threshold, are selected as the background pixel intensity value. Based on the assumption that background appears with large appearance frequency, a background subtraction algorithm based on Basis Sequential Clustering (BSC) is proposed in literature (Xiao and Han, 2007). First, pixel intensity in period of time are classified based on basic sequential clustering and pixel intensity classes, whose appearance frequency is higher than a threshold, are selected as the background pixel intensity value. Simulation results show that the algorithm can hand complex situations contains small motions and motion detection can be performed correctly. It is obvious that the results of BSC are strongly dependent on the order in which the data are presented to the algorithm. And it may happen that two of the formed clusters are very closely located.
Some of the commonly-used statistic model are Time Averaged Background Image (TABI), Gaussian model (Wren et al., 1997; Stauffer and Grimson, 1999) and Non-parametric Model (Elgammal et al., 2000). A simple method of background reconstruction is TABI, a background approximation is obtained by averaging a long time image sequences, the method is effective in empty scene, once situations with many moving objects, especially if they move slowly and the foreground objects always can be blending into the background image. A single Gaussian (Wren et al., 1997) is considered to model the statistical distribution of a background pixel, it does not cope with multimodal backgrounds. Extended background subtraction approach (Stauffer and Grimson, 1999) by using an adaptive multi-modal subtraction method that modeled the pixel color as a Mixture of Gaussians (MOG). This method could deal with slow changes in illumination, repeated motion from background clutter and long term scene changes. But The MOG method is computationally intensive and its parameters require careful tuning. Literature (Javed et al., 2002; Lee et al., 2003; Zivkovic and Van, 2004) put forward kinds of modify, but computationally intensive has not solved yet. Adapted a multi-model background estimate (Elgammal et al., 2000) at each pixel location, Gaussian was chosen to be the kernel estimator and the method uses the median of the absolute differences between successive frames as the width of the kernel. The model can handle situations where the dynamic background but contains small motion. However, the method is computationally intensive and it is not easy to choose its parameters.
Wiener filter (Toyama et al., 1999) for background reconstruction are based on prediction techniques. It is a linear predictor based on a recent history of values. Any pixel that different significantly from its predicted value are declared foreground.
THE STEPS OF THE BACKGROUND RECONSTRUCTION ALGORITHM
Literature (Hou and Han, 2004; Xiao and Han, 2007) reports the character of the pixel intensity in detail. We adopted the same assumption that was mentioned in literature (Xiao and Han, 2007), that background would not be the parts which appear in the sequence for a short time, a background reconstruction algorithm based on Modified Basic Sequential Clustering (MBSC) was proposed in this research. The steps of algorithm are described as follows:
Step 1 | : | Classify the pixel intensity based on MBSC (Sergios and Konstantinos, 2003). |
The basic idea of BSC is the following: Starts with the guess that the first input value is the initial class and the number of the initial class set to 1. As each new data is considered, it is either assigned to an existing cluster or assigned to a newly created cluster, depending on its distance from the already formed ones. The basic idea behind BSC is that each input data is assigned to an already created cluster or a new one is formed. Therefore, a decision for the data is reached prior to the final cluster formation, which is determined after all data have been presented. The following refinement of BSC, which will be called Modified Basic Sequential Clustering (MBSC), overcomes this drawback. The first phase involves the determination of the clusters, via the assignment of some of the data to them. During the second phase, the unassigned data are presented for a second time to the algorithm and are assigned to the appropriate cluster.
Let be N be the frames that selected from the sequences and marked as F = {f1, f2,..., fn}. Pixel (x, y) is used to illustrate the MBSC. The MBSC Procedure is stated as Table 1.
Where, ft (x, y) represents the intensity value of the pixel (x, y) of the t(t = 1, 2,..., N) frame. and represent the center and the number of the ith cluster separately. denote the distance (or dissimilarity) between a input date ft (x, y) and a cluster , we adopt the same distance in follow step. The following process is run to same pixel position (x, y). The omission of pixel position is necessary for concise writing.
Table 1: | The MBSC procedure |
Table 2: | The merging procedure |
Step 2 | : | Run merging procedure. |
In basic sequential clustering algorithm, it may happen that two of the formed clusters are very closely located and it may be desirable to merge them into a single one. Consider the data F = {1, 11, 5, 10, 6, 9, 8, 4, 70, 100}. If we present the F in the above order to the BSC and we set δ1 = 10, we obtain C1 = {4, 9.5, 70, 100} and m1 = {4, 4, 1, 1}. We can see and are very closely located. Such cases cannot be handled by the BSC. One way out of this problem is to run the simple merging procedure, after the termination of the MBSC.
If the number of clusters is n after the termination of the MBSC, is the center of ith cluster. Then the merging procedure is stated as Table 2.
Where, Θ2 is a user-defined parameter that quantifies the closeness of two clusters, and .
Step 3 | : | Calculate the appearance frequency of all clusters. |
Assume that p clusters are formed now. The appearance frequency of ith cluster:
(1) |
where, is the number of ith cluster that the reassignment procedure has created.
Step 4 | : | Select background of pixel. |
In real scene, multi-surfaces often appear in the view frustum of a particular pixel and the lighting conditions are often changed, so choosing a single image as background always results in large errors in moving object detection. Motivated by the work of (Stauffer and Grimson, 1999), we choose a single or multi-images as the background images. Suppose there are R(R≤q) clusters with a larger appearance frequency in the video sequences. After removal those intensity clusters with a small appearance probability, then the R(R≤q) clusters are chosen as the multi-background images. is stated as Table 3.
Table 3: | The background selection procedure |
Where, Θ3 is a user-defined parameter, the number of data of background, Bj (j = 1, 2,..., R) is the jth background that our algorithm has constructed.
Must readjust appearance frequency of backgrounds:
(2) |
SIMULATION RESULTS AND COMPARISONS
Two examples show the results of background reconstruction by using our algorithm. The results calculated by TABI, PIC (Hou and Han, 2004), MOG (Stauffer and Grimson, 1999) and BSC are given in detail in literature (Xiao and Han, 2007).
Fig. 1: | Result of background reconstruction and motion detection of highway sequence |
Fig. 2: | Result of background reconstruction and motion detection of hall sequence |
In the study, we only give the simulation results compare the algorithm of BSC and our method. In the simulation the parameters in BSC are chosen as: N = 100, δ = 13 and δc = 0.2. The parameters that are referred in our algorithm are chosen as: N = 100, Θ1 = 10, Θ2 = 10 and Θ3 = 0.2. The video shows the pure detection results without any morphological operations, noise filtering and tracking information of targets.
Figure 1 is Highway sequence, 100 frames are used to reconstruct the background, the vehicles moving quickly and the traffic flow is dense. The number of background is different in vary pixel, so we use black denote empty background. For example, suppose that pixel (1, 1) has three backgrounds B1(1, 1), B2(1, 1) and B3(1, 1). Pixel (2, 2) has one background B1(2, 2), which results in two empty backgrounds B2(2, 2) and B3(2, 2), so we can see black in B2(2, 2) and B3(2, 2). The number of backgrounds of MBSC is less than BSC. Motion detections (see the area that is masked with ellipse in Fig. 1(d1) and Fig. 1(d2), Fig. 1(e1) and Fig. 1(e2)) also show that false detection of our method is less greatly than BSC.
Figure 2 is an indoor sequence. Hundred frames are used to construct the background. From the background image of BSC, we can see part of foreground (see the area that is masked with ellipse in Fig. 2(a1)) is considered as background. It is clear that our method leads to more reasonable results than BSC. The motion detections (Fig. 1(d1) and Fig. 1(d2), Fig. 1(e1) and Fig. 1(e2)) show that our method can eliminate the false motion detection.
CONCLUSIONS
Background reconstruction is core of background subtraction. An improved background reconstruction based modified basic sequential clustering is proposed in this study. The merging procedure is used after the termination of the MBSC algorithm. Our method algorithm has there traits: 1) background reconstruction does not need priori knowledge of scene. 2) an assignment for the data is reached after the final cluster formation, which is determined after all data have been presented. 3) those near classes are avoided at all and the effect of input order of data has been reduced greatly.. However, we must state that the cost we pay for it is that the data of F have to be presented twice to the algorithm. So how to consume time of computation and save memory requirements is my future work.
REFERENCES
- Cucchiara, R., C. Grana, M. Piccardi and A. Prati, 2003. Detecting moving objects, ghosts and shadows in video streams. IEEE Trans. Pattern Anal. Mach. Intell., 25: 1337-1342.
CrossRefDirect Link - Elgammal, A., D. Harwood and L. Davis, 2000. Non-parametric model for background subtraction. Proceeding of the 6th European Conference on Computer Vision, June 26-July 1, 2000, Dublin, Ireland, pp: 246-252.
Direct Link - Gloyer, B., H.K. Aghajan, K.Y. Siu and T. Kailath, 1995. Video-based freeway monitoring system using recursive vehicle tracking. Proceedings of the SPIE, Image and Video Processing III, Volume 2421, March 23, 1995, San Jose, CA, USA., pp: 173-180.
CrossRefDirect Link - Hou, Z.Q. and C.Z. Han, 2004. A background reconstruction algorithm based on pixel intensity classification in remote video surveillance system. Proceedings of the 7th International Conference on Information Fusion, June 28-July 1, 2004, Fairborn, United States, pp: 754-759.
Direct Link - Javed, O., K. Shafique and M. Shah, 2002. A hierarchical approach to robust background subtraction using color and gradient information. Proceeding of the Workshop on Motion and Video Computing, December 5-6, 2002, Orlando, FL., USA., pp: 22-27.
CrossRefDirect Link - Kornprobst, P., R. Deriche and G. Aubert, 1999. Image sequence analysis via partial difference equations. J. Math. Imaging Vision, 11: 5-26.
CrossRefDirect Link - Lee, D.S., J.J. Hull and B. Erol, 2003. A bayesian framework for gaussian mixture background modeling. Proceedings of the International Conference on Image Processing, September 14-17, 2003, Barcelona, Spain, pp: 973-979.
CrossRefDirect Link - Long, W. and Y. Yang, 1990. Stationary background generation: An alternative to the difference of two images. Pattern Recog., 23: 1351-1359.
CrossRefDirect Link - Stauffer, C. and W.E.L. Grimson, 1999. Adaptive background mixture models for real-time tracking. Proceedings of the Computer Society Conference on Computer Vision and Pattern Recognition, June 23-25, 1999, Los Alamitos, CA., USA., pp: 246-252.
CrossRefDirect Link - Sergios, T. and K. Konstantinos, 2003. Pattern Recognition, 2nd Edn. Machine Press, China, pp: 278-289.
Direct Link - Toyama, K., J. Krumm, B. Brumitt and B. Meyers, 1999. Wallflower: Principles and practice of background maintenance. Proceedings of the 7th IEEE International Conference on Computer Vision. September 20-27, 1999, Piscataway, NJ., USA., pp: 255-261.
CrossRefDirect Link - Wren, C.R., A. Azarbayejani, T. Darrell and A.P. Pentland, 1997. Pfinder: Real-time tracking of the human body. IEEE Trans. Pattern Anal. Mach. Intel., 19: 780-785.
CrossRefDirect Link - Xiao, M. and C.Z. Han, 2007. Background subtraction algorithm based on online clustering. Pattern Recognition Artificial Intel., 20: 35-41.
Direct Link - Zivkovic, Z. and D.H.F. Van, 2004. Recursive unsupervised learning of finite mixture models. IEEE Trans. Pattern Anal. Mach. Intell., 26: 651-656.
Direct Link