CN101610412B - A visual tracking method based on multi-cue fusion - Google Patents
- Publication number: CN101610412B (application CN200910088878A)
- Authority: CN (China)
- Prior art keywords: probability distribution, distribution map
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Image Analysis (AREA)
Description
Technical Field
The present invention relates to visual tracking, and in particular to a visual tracking method that fuses multiple cues. It belongs to the field of information technology.
Background Art
With the rapid development of information technology and intelligence science, computer vision, which uses computers to realize human visual functions, has become one of the most active research directions in computing. Visual tracking is one of the core problems of computer vision: finding the position of a moving target of interest in every frame of an image sequence. Studying it is both necessary and urgent.
In 2007, Hong Liu et al. published the paper "Collaborative mean shift tracking based on multi-cue integration and auxiliary objects" in the Proceedings of the 14th IEEE International Conference on Image Processing (ICIP 2007). That work combines color, position, and prediction cues, dynamically updates the weight of each cue according to the background, and realizes visual tracking with the Mean Shift technique and auxiliary objects. However, it assumes that the background obeys a single-Gaussian model and must first be trained on a video sequence containing no moving objects to obtain an initial background model, which limits its applicability. Moreover, its cue evaluation function represents the region of interest with a rectangle slightly larger than the target and defines the area between that rectangle and the tracking window as the background region; the size of the background region directly affects a cue's reliability score, so the larger the tracking window, the smaller the reliability value, and the evaluation therefore lacks generality.
Summary of the Invention
The purpose of the present invention is to overcome the deficiencies of the prior art and to provide a visual tracking method that fuses multiple cues, usable in particular for visual tracking of human motion, so that a computer can automatically and visually track a target (such as a human body) while meeting accuracy and real-time requirements.
The present invention combines multiple cues of a video image (color features, position features, and motion-continuity features) and realizes visual tracking by means of the CAMSHIFT (Continuously Adaptive Mean Shift) method, as shown in Fig. 1. The color features preferably comprise hue-saturation, red-channel, green-channel, and blue-channel features, which give good robustness to occlusion and pose changes; the position feature is computed with a frame-difference technique; the motion-continuity feature exploits inter-frame continuity.
The present invention adopts a tracking window of fixed size. Although this limits how appearance changes and occlusions can be managed, it avoids treating background regions that resemble the target as part of the target, and it still achieves the tracking effect.
The present invention is realized through the following technical solution, comprising the following steps:
a) Determine a tracking window in the first frame of a video sequence. The tracking window comprises a target area and a background area, and the target area contains the tracked object. Preferably, the tracking window is a rectangle divided into three equal parts: the middle part is the target area and the two side parts are the background area, as shown in Fig. 2.
b) For each frame from the second frame on, obtain the color-feature probability distribution map, the position-feature probability distribution map, and the motion-continuity-feature probability distribution map based on the previous frame;
c) Add the three probability distribution maps together with weights to obtain a total probability distribution map;
d) In the total probability distribution map, obtain the center-point coordinates of the tracking window of the current frame through the CAMSHIFT algorithm. A sketch of this per-frame loop is given below.
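A minimal sketch of the per-frame loop of steps a)-d), in Python with OpenCV. The helpers color_map, position_map, and motion_map are hypothetical placeholders for the cue computations sketched in the sections below, and cv2.meanShift stands in for the centroid iteration because the window size is fixed; this illustrates the flow, not the patent's exact implementation.

```python
import cv2
import numpy as np

def track(frames, init_window, weights=(3/7, 2/7, 2/7)):
    """Per-frame driver for steps a)-d). The helpers are assumed to
    return float probability images of the frame size in [0, 1];
    the default cue weights are those of the embodiment below."""
    window = init_window                          # (x, y, w, h), size fixed
    prev = frames[0]
    for frame in frames[1:]:
        m1 = color_map(prev, window)              # color cue,    step b)
        m2 = position_map(prev, frame)            # position cue, step b)
        m3 = motion_map(window, frame.shape[:2])  # motion-continuity cue
        total = weights[0]*m1 + weights[1]*m2 + weights[2]*m3   # step c)
        # step d): mean-shift iteration over the fused map; meanShift
        # keeps the window size constant, matching the fixed window
        crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 15, 2)
        _, window = cv2.meanShift((total * 255).astype(np.uint8),
                                  window, crit)
        prev = frame
        yield window
```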
The cues involved in the present invention and their fusion are described in detail below.
Color Features
The color features preferably include the hue and saturation features of the image, the R (red) channel feature, the G (green) channel feature, and the B (blue) channel feature, which achieve good robustness to occlusion and pose changes.
Assume that the present invention uses a histogram of m bins and that the image has n pixels whose positions and corresponding histogram values are {x_i} (i = 1, ..., n) and {q_u} (u = 1, ..., m) for the R-, G-, and B-channel features, or {q_u(v)} (u = 1, ..., m; v = 1, ..., m) for the hue-saturation feature. Define the function b: R^2 → {1, ..., m}, which gives the discrete bin corresponding to the color information of each pixel. The value corresponding to the c-th color bin of the histogram can then be expressed as Eqs. (1) and (2) or Eqs. (1′) and (2′):

$$q_u = \sum_{i=1}^{n} \delta[b(x_i) - u] \quad (1) \qquad q'_u = \frac{255}{\max_u q_u}\, q_u \quad (2)$$

or, for the two-dimensional hue-saturation histogram,

$$q_{u(v)} = \sum_{i=1}^{n} \delta[b_H(x_i) - u]\,\delta[b_S(x_i) - v] \quad (1') \qquad q'_{u(v)} = \frac{255}{\max_{u,v} q_{u(v)}}\, q_{u(v)} \quad (2')$$

where δ[·] is the Kronecker delta and b_H, b_S map a pixel to its hue and saturation bins.
The color-feature probability distribution map can be built as follows:
First, extract the R, G, and B channels from the RGB image, then convert the RGB image into an HSV (hue, saturation, value) image and extract the hue channel and the saturation channel. Using histogram back-projection, compute the hue-saturation probability distribution, the red probability distribution, the green probability distribution, and the blue probability distribution of the pixels in the tracking window, as in Eq. (1) or Eq. (1′).
Second, re-scale the value range of each of these distributions using Eq. (2) or Eq. (2′), so that it is projected from [0, max(q_u(v))] or [0, max(q_u)] onto [0, 255].
Third, according to the selection rule described below, choose suitable features among the hue-saturation, red, green, and blue features as the color features of the visual tracking algorithm, forming the final color probability distribution map p(x, y). A sketch of the back-projection step follows.
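As an illustration, the first two steps can be realized with OpenCV's histogram back-projection; the sketch below shows the hue-saturation variant, and the function name and bin count are assumptions:

```python
import cv2

def hs_backprojection(frame, window, bins=16):
    """Hue-saturation back-projection for the tracking window,
    rescaled to [0, 255] in the spirit of Eqs. (1')/(2')."""
    x, y, w, h = window
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    roi = hsv[y:y+h, x:x+w]
    # 2-D hue-saturation histogram of the window, Eq. (1')
    hist = cv2.calcHist([roi], [0, 1], None, [bins, bins],
                        [0, 180, 0, 256])
    # rescale the value range to [0, 255], Eq. (2')
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    # back-project onto the whole frame; the result is an 8-bit
    # probability image (divide by 255 for a [0, 1] map)
    return cv2.calcBackProject([hsv], [0, 1], hist,
                               [0, 180, 0, 256], 1)
```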
Among the above four features, the method of the present invention preferably selects dynamically the one or more features that best reflect the difference between the target area and the background area, as follows:
For feature k, let i be a value of feature k. H1^k(i) denotes the histogram of feature values in the target region A and H2^k(i) the histogram of feature values in the background regions B and C; p^k(i) is the discrete probability distribution of the target region A and q^k(i) that of the background regions B and C. L_i^k is the log-likelihood ratio of feature k, as in Eq. (10), where δ is a small positive number that prevents the denominator from being 0 and the logarithm from being log 0:

$$L_i^k = \log\frac{\max\{p^k(i),\ \delta\}}{\max\{q^k(i),\ \delta\}} \quad (10)$$

var(L^k; p^k(i)) is the variance of L_i^k with respect to the target-class distribution p^k(i), as in Eq. (11); var(L^k; q^k(i)) is the variance with respect to the background-class distribution q^k(i), as in Eq. (12); and var(L^k; R^k(i)) is the variance with respect to the combined target-and-background distribution R^k(i) = [p^k(i) + q^k(i)]/2, as in Eq. (13):

$$\mathrm{var}(L^k; a) = \sum_i a(i)\,(L_i^k)^2 - \Big[\sum_i a(i)\,L_i^k\Big]^2, \qquad a \in \{p^k,\ q^k,\ R^k\} \quad (11)\text{-}(13)$$

V(L; p^k(i), q^k(i)) is the variance ratio of L_i^k, as in Eq. (14):

$$V(L^k; p^k, q^k) = \frac{\mathrm{var}(L^k; R^k)}{\mathrm{var}(L^k; p^k) + \mathrm{var}(L^k; q^k)} \quad (14)$$

V(L; p^k(i), q^k(i)) expresses the ability of feature k to separate the target from the background: the larger it is, the more easily feature k separates the target from the background, and the more reliable and suitable the feature is for tracking the target.
During video tracking, the reliability of the hue-saturation, R-channel, G-channel, and B-channel features is continually re-evaluated. When their reliability changes, it is recomputed according to Eq. (14), the features are re-ranked by reliability, and the W features with the largest V(L^k; p^k(i), q^k(i)) are taken as the color features of the tracked target. W is preferably 1, 2, or 3. A sketch of this selection follows.
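A minimal sketch of this feature selection, assuming each candidate feature is given as an image of its values inside the tracking window; the helper names and bin count are illustrative, and the formulas follow Eqs. (10)-(14) above:

```python
import numpy as np

def split_regions(window_values):
    """Split the window's feature values into target region A (middle
    third) and background regions B and C (outer thirds), cf. Fig. 2."""
    w = window_values.shape[1]
    a = window_values[:, w // 3: 2 * w // 3]
    bc = np.concatenate([window_values[:, : w // 3],
                         window_values[:, 2 * w // 3:]], axis=1)
    return a, bc

def variance_ratio(p, q, delta=1e-3):
    """Eqs. (10)-(14): variance ratio of the log-likelihood of target
    distribution p against background distribution q; larger values
    mean the feature separates target from background better."""
    L = np.log(np.maximum(p, delta) / np.maximum(q, delta))   # Eq. (10)
    var = lambda d: np.sum(d * L ** 2) - np.sum(d * L) ** 2   # Eqs. (11)-(13)
    R = (p + q) / 2.0
    return var(R) / (var(p) + var(q))                         # Eq. (14)

def select_color_features(window_maps, W=2, bins=32):
    """Rank the candidate color features (H-S, R, G, B) by variance
    ratio and keep the indices of the W most discriminative ones."""
    scores = []
    for m in window_maps:                  # feature values inside window
        a, bc = split_regions(m)
        h1, _ = np.histogram(a, bins=bins, range=(0, 256))    # H1^k(i)
        h2, _ = np.histogram(bc, bins=bins, range=(0, 256))   # H2^k(i)
        p = h1 / max(h1.sum(), 1)          # p^k(i), target region A
        q = h2 / max(h2.sum(), 1)          # q^k(i), background B and C
        scores.append(variance_ratio(p, q))
    return list(np.argsort(scores)[::-1][:W])
```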
Position Feature
For the position feature, the present invention uses the frame difference to compute the gray-level difference at every point between two consecutive frames and then decides which pixels are moving points by a threshold: pixels whose difference exceeds the threshold are moving points. Setting the frame-difference threshold purely by experience is blind and suits only particular scenes, so the present invention preferably uses the Otsu method to determine the frame-difference threshold F dynamically. The basic idea of the Otsu algorithm is to find the threshold F that minimizes the within-class scatter, which is equivalent to maximizing the between-class scatter: the threshold F splits the frame-difference image into two classes such that the variance between the two classes is maximal. The within-class scatter measures how sample points spread around their class means, and the between-class scatter measures the spread between the classes; a smaller within-class scatter means each class is internally more compact, and a larger between-class scatter means the classes are more separable.
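As an illustration only, assuming grayscale input frames, OpenCV's built-in Otsu thresholding can play the role of the dynamic threshold F; the function name is an assumption:

```python
import cv2
import numpy as np

def position_map(prev_gray, curr_gray):
    """Position cue: absolute frame difference thresholded with Otsu's
    method, so the frame-difference threshold F adapts to the scene
    instead of being set by hand."""
    diff = cv2.absdiff(prev_gray, curr_gray)
    # Otsu picks F so that the between-class scatter of the difference
    # image is maximal; pixels above F count as moving points
    _, mask = cv2.threshold(diff, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask.astype(np.float32) / 255.0   # probability-map form
```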
Motion-Continuity Feature
Regarding the motion-continuity feature, the present invention estimates the speed of the tracked target from the images of the preceding frames and then estimates the target center position at the current moment from the target positions tracked in those frames. Within a very short time (between video frames) the target's motion is strongly continuous and its speed can be regarded as constant, which is what makes this prediction possible.
X(t, row) = X(t-1, row) ± (X(t-1, row) - X(t-2, row))    (3)
X(t, col) = X(t-1, col) ± (X(t-1, col) - X(t-2, col))    (4)
Let X(t, row) denote the row coordinate of the current target center at time t, as in Eq. (3), and X(t, col) the column coordinate, as in Eq. (4), where rows is the number of image rows and cols the number of image columns. Taking the continuity of target motion into account, the current position is predicted with a linear predictor, so X(t, row), X(t-1, row), and X(t-2, row) satisfy Eq. (5), and X(t, col), X(t-1, col), and X(t-2, col) satisfy Eq. (6).
X(t,row)∈[max(X(t-1,row)-(X(t-1,row)-X(t-2,row)),1),min(X(t-1,row)+(X(t-1,row)-X(t-2,row)),rows)] (5)X(t, row) ∈ [max(X(t-1, row)-(X(t-1, row)-X(t-2, row)), 1), min(X(t-1, row)+(X(t-1, row)-X(t-2, row)), rows)] (5)
X(t,col)∈[max(X(t-1,col)-(X(t-1,col)-X(t-2,col)),1),min(X(t-1,col)+(X(t-1,col)-X(t-2,col)),cols)] (6)X(t, col) ∈ [max(X(t-1, col)-(X(t-1, col)-X(t-2, col)), 1), min(X(t-1, col)+(X(t-1, col)-X(t-2, col)), cols)] (6)
Let the tracking window have row extent width and column extent length. Then the row coordinate of the target at the current moment satisfies Eq. (7) and the column coordinate Eq. (8); that is, the target lies within this rectangular range.
Y(t,row)∈[max(X(t,row)-width,1),min(X(t,row)+width,rows)] (7)Y(t, row) ∈ [max(X(t, row)-width, 1), min(X(t, row)+width, rows)] (7)
Y(t,col)∈[max(X(t,col)-length,1),min(X(t,col)+length,cols)] (8)Y(t, col) ∈ [max(X(t, col)-length, 1), min(X(t, col)+length, cols)] (8)
Let B′(x, y, t) denote the probability distribution map of the motion-continuity feature, as in Eq. (9), where (x, y, t) denotes the pixel at coordinates (x, y) at time t, 1 marks a pixel of the tracked target, and 0 marks a background pixel:

$$B'(x, y, t) = \begin{cases} 1, & (x, y) \text{ lies in the rectangle given by Eqs. (7) and (8)} \\ 0, & \text{otherwise} \end{cases} \quad (9)$$

A sketch of this cue follows.
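A minimal sketch of this cue. The two previous tracked centers are taken as input, the forward (constant-velocity) branch of the ± prediction in Eqs. (3)-(4) is used, and all names and the clipping convention are illustrative:

```python
import numpy as np

def motion_continuity_map(c_prev, c_prev2, window_size, image_size):
    """Motion-continuity cue, Eqs. (3)-(9): predict the current target
    center from the two previous centers under a constant-velocity
    assumption, then mark a window-sized rectangle around it with 1
    (probable target) and everything else with 0 (background)."""
    rows, cols = image_size
    width, length = window_size            # row extent, column extent
    # linear prediction of the current center, Eqs. (3)-(6)
    pr = 2 * c_prev[0] - c_prev2[0]
    pc = 2 * c_prev[1] - c_prev2[1]
    pr = min(max(pr, 0), rows - 1)
    pc = min(max(pc, 0), cols - 1)
    # rectangle that should contain the target, Eqs. (7)-(8)
    r0, r1 = max(pr - width, 0), min(pr + width, rows)
    c0, c1 = max(pc - length, 0), min(pc + length, cols)
    B = np.zeros((rows, cols), dtype=np.float32)   # B'(x, y, t), Eq. (9)
    B[r0:r1, c0:c1] = 1.0
    return B
```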
Cue Fusion
Let P_k(row, colu, t) be the probability distribution obtained through feature k for pixel (row, colu) at time t; it gives the probability that the pixel belongs to the target area under feature k. P(row, colu, t) denotes the final probability distribution at time t after fusing the W+2 features (W color features, one predicted-target-position feature, and one motion-continuity feature); it gives the probability that each pixel (row, colu) belongs to the target area, as in Eq. (15). The W color features compete on the basis of their reliability in the previous frame: a feature with high reliability dominates the visual tracking system and supplies it with more information, while a feature with low reliability is down-weighted or ignored.

$$P(row, colu, t) = \sum_{k=1}^{W+2} r_k\, P_k(row, colu, t), \qquad \sum_{k=1}^{W+2} r_k = 1 \quad (15)$$

where r_k is the weight of feature k: r_1, r_2, ..., r_W are the weights of the selected color features, r_{W+1} is the weight of the predicted-target-position feature, and r_{W+2} is the weight of the motion-continuity feature.
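A minimal sketch of Eq. (15); the map ordering and the example weights (those of the embodiment below) are the only assumptions:

```python
import numpy as np

def fuse_cues(prob_maps, weights):
    """Eq. (15): weighted sum of the W+2 per-cue probability maps,
    ordered as W color cues, the position cue, then the
    motion-continuity cue; the weights r_k must sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-6
    total = np.zeros_like(prob_maps[0], dtype=np.float32)
    for r_k, P_k in zip(weights, prob_maps):
        total += r_k * P_k.astype(np.float32)
    return total

# Example with the embodiment's weights: R channel 2/7, B channel 1/7,
# position 2/7, motion continuity 2/7:
# total = fuse_cues([m_r, m_b, m_pos, m_motion], [2/7, 1/7, 2/7, 2/7])
```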
Compared with the prior art, the present invention is simple and effective: it needs no assumed background model and no prior training on video sequences without moving targets. Its key lies in fusing multiple cues, which suits different scenes and yields good tracking results, particularly for video sequences in which the color saturation of the target's environment is low or the target is partially occluded.
The visual tracking provided by the present invention can be used directly as a tracking result or as an intermediate result for subsequent visual understanding. The invention has broad application prospects in the information field, including human-robot interaction (HRI), intelligent visual surveillance, intelligent robots, virtual reality, model-based image coding, and content retrieval in streaming media. Taking intelligent visual surveillance as an example, it covers security, transportation, fire protection, the military industry, communications, and other fields; video surveillance systems are already applied to community security, fire monitoring, traffic violations, flow control, military and banking facilities, and safety in public places such as shopping malls, airports, and subways. Existing video surveillance systems usually only record video images for use as after-the-fact evidence and do not fully exploit real-time, active monitoring. Upgrading them into intelligent video surveillance systems would greatly strengthen monitoring capability, reduce safety risks, and save manpower, material resources, and investment. An intelligent video system solves two problems: it frees security operators from the tedious and monotonous task of staring at screens, letting machines do that work, and it quickly searches massive video data for the desired images, that is, it tracks a target. For example, Beijing Metro Line 13 uses video analysis to catch thieves, and Pudong Airport, Capital Airport, and several railway projects under construction are expected to use video analysis technology; the visual tracking method of the present invention is one of the core and key technologies of such video analysis.
Description of the Drawings
Fig. 1 is a schematic diagram of the cue fusion of the method of the present invention.
Fig. 2 is a schematic diagram of the tracking window of the present invention; A is the target area, B and C are the background areas.
Figs. 3-5 are visual tracking diagrams for frames 50, 100, and 120 of a video sequence with a resolution of 640*512, in which panel (a) shows the motion-continuity-feature probability distribution map, (b) the position-feature probability distribution map, (c) the total probability distribution map, and (d) the tracking result of the current frame.
Figs. 6a-d show visual tracking results for frames 50, 90, 120, and 164 of a video sequence with a resolution of 640*480.
Detailed Description
Embodiments of the present invention are described in detail below in conjunction with the accompanying drawings. The protection scope of the present invention is not limited to the following embodiments.
The visual tracking of this embodiment is carried out according to the following steps:
First, set the tracking window in frame 1 of the video sequence. The length and width of the tracking window are chosen by the operator according to the size of the tracked target and remain unchanged during tracking. Divide the tracking window into three parts: the middle part (A) is the target area, and the left and right parts (B and C) are the background areas, as shown in Fig. 2.
Second, from frame 2 onward, select the most reliable 2 (W = 2) color features according to the previous frame (for example, the R channel and the B channel) and compute the color-feature probability distribution map M1.
Third, compute the position-feature probability distribution map M2.
Fourth, compute the motion-continuity-feature probability distribution map M3.
Fifth, weight the three probability distribution maps (M1, M2, M3) obtained above by the corresponding r_k and add them to obtain the final probability distribution map M. In this embodiment the weights of M1-M3 are 3/7 (with the R channel and the B channel weighted 2/7 and 1/7 respectively), 2/7, and 2/7.
Sixth, in the probability distribution map M, obtain the center-point coordinates of the tracking window of the current frame through the CAMSHIFT algorithm. The core of the CAMSHIFT algorithm is to compute the zeroth-order moment of the tracking window (Eq. (16)) and its first-order moments (Eqs. (17) and (18)), and to iterate the (x, y) coordinates through Eqs. (19) and (20) until the coordinates show no obvious displacement (the changes in x and y are both less than 2) or the maximum of 15 iterations is reached; the coordinates at that point are the tracking-window center of the current frame.

$$M_{00} = \sum_x \sum_y M(x, y) \quad (16) \qquad M_{10} = \sum_x \sum_y x\, M(x, y) \quad (17) \qquad M_{01} = \sum_x \sum_y y\, M(x, y) \quad (18)$$

$$x = M_{10} / M_{00} \quad (19) \qquad y = M_{01} / M_{00} \quad (20)$$

where the sums run over the pixels of the tracking window. A sketch of this iteration follows.
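A minimal sketch of this centroid iteration on the fused map, with the window size held fixed as in the embodiment; the names are illustrative:

```python
import numpy as np

def camshift_center(M, window, max_iter=15, eps=2):
    """Centroid iteration of Eqs. (16)-(20) on the fused probability
    map M: move the fixed-size window to the centroid of the mass
    under it until the shift is below eps pixels or max_iter
    iterations are reached."""
    x, y, w, h = window
    for _ in range(max_iter):
        roi = M[y:y + h, x:x + w].astype(np.float64)
        m00 = roi.sum()                        # zeroth moment, Eq. (16)
        if m00 == 0:
            break
        ys, xs = np.mgrid[0:h, 0:w]
        xc = x + (xs * roi).sum() / m00        # Eqs. (17), (19)
        yc = y + (ys * roi).sum() / m00        # Eqs. (18), (20)
        nx = int(round(xc - w / 2))
        ny = int(round(yc - h / 2))
        nx = min(max(nx, 0), M.shape[1] - w)   # keep window inside image
        ny = min(max(ny, 0), M.shape[0] - h)
        if abs(nx - x) < eps and abs(ny - y) < eps:
            x, y = nx, ny
            break
        x, y = nx, ny
    return x + w // 2, y + h // 2              # window center
```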
Figs. 3-5 are visual tracking diagrams for frames 50, 100, and 120 of a video sequence with a resolution of 640*512.
Fig. 6 shows the visual tracking results for frames 50, 90, 120, and 164 of a video sequence with a resolution of 640*480. Although the saturation of this video sequence is low, the target is still tracked successfully because the reliability of the color features and the multi-cue fusion are taken into account.
Claims (6)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2009100888784A CN101610412B (en) | 2009-07-21 | 2009-07-21 | A visual tracking method based on multi-cue fusion |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN101610412A CN101610412A (en) | 2009-12-23 |
| CN101610412B true CN101610412B (en) | 2011-01-19 |
Family
ID=41483954
Legal Events

| Code | Title | Description |
|---|---|---|
| C06 / PB01 | Publication | |
| C10 / SE01 | Entry into substantive examination | |
| C14 / GR01 | Grant of patent or utility model | |
| CF01 / EXPY | Termination of patent right due to non-payment of annual fee | Granted publication date: 2011-01-19; termination date: 2014-07-21 |