
CN103218815B - Method for computing an image saliency map using natural scene statistics - Google Patents

Method for computing an image saliency map using natural scene statistics

Info

Publication number: CN103218815B (application CN201310135762.8A; application publication CN103218815A)
Authority: CN (China)
Prior art keywords: image, saliency, saliency map, wavelet coefficient, NSSSal
Legal status: Expired - Fee Related
Inventors: 黄虹 (Huang Hong), 张建秋 (Zhang Jianqiu)
Current assignee: Shanghai Jilian Network Technology Co., Ltd.
Original assignee: Fudan University
Application filed by Fudan University
Priority and filing date: 2013-04-19
Publication date of CN103218815A: 2013-07-24
Publication date of CN103218815B (grant): 2016-03-30
Other languages: Chinese (zh)

Landscapes

  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of image saliency map models, and specifically provides a method for computing image saliency maps from natural scene statistics. The invention uses the multiplier random variable of the Gaussian scale mixture statistical distribution of natural scenes to compute the image saliency map, thereby establishing an image saliency map model. Analysis shows that the proposed saliency map model is highly consistent with the visual attention selection mechanism: it suppresses recurring stimuli while highlighting highly salient visual stimuli, and thus better describes the distribution of an image's salience as a visual stimulus to the human eye.

Description

A Method for Computing an Image Saliency Map Using Natural Scene Statistics

Technical Field

The invention belongs to the technical field of image saliency map models, and in particular relates to a method for computing image saliency maps using the multiplier random variable of the Gaussian scale mixture statistical distribution of natural scenes.

Background Art

Visual attention (VA) is an important mechanism of the human visual system (HVS). In general, the human eye receives a large number of visual stimuli from the outside world at any instant, but because the processing resources of the HVS are limited, the different stimuli compete for those resources. Ultimately, the most informative stimulus wins the competition, while the others are suppressed [1-2]. Through this VA selection mechanism, the HVS allocates its limited resources to process the large number of visual stimuli, thereby reducing the complexity of scene analysis [3].

In neurology and biology, some experiments use dedicated equipment to record observers' eye fixation points as they view different image scenes, while in others the observers actively mark the regions of those scenes that interest them. Both kinds of experiment aim to study the VA mechanism of the HVS from the recorded results. The findings show that, on the one hand, regions of an image with large grayscale variation, including texture and edge information, attract more attention from the HVS. On the other hand, the HVS suppresses recurring, redundant information to some extent: novel information that does not repeat its surroundings attracts more attention. Accordingly, the fewer the occurrences of a visual stimulus's structural template, the greater its salience [3].

As research on the VA mechanism has deepened, some of its results have been applied to image processing problems with instructive outcomes. In real-time image processing, however, it is impractical to obtain the distribution of human fixation points for every image experimentally. A computable VA model that simulates the properties of the HVS is therefore needed to predict the salience of visual stimuli. Among such models, the most basic is the saliency map model, which uses a saliency map to describe the degree of attention the HVS pays to different positions in a scene [1].

The most basic result on visual attention is the feature integration theory of attention proposed by Treisman and Gelade in 1980 on an experimental basis [5]. In 1989, Wolfe et al. proposed the Guided Search model [6], which uses a saliency map to search for targets in a scene. In 1985, Koch and Ullman established a VA model framework based on neurological theory [7]. The classic Itti & Koch saliency map method [8] models the VA mechanism within the Koch & Ullman framework using multiple scales and channels, extracts the relevant features, and fuses them into a saliency map highly consistent with the HVS. The STB (Saliency ToolBox) method [9] improves on the Itti & Koch method by incorporating theories of visual cognition. In addition, [16] proposed a saliency map model based on the phase spectrum of the image's Fourier transform (PFT-based saliency map, PFTSal), which has been widely recognized [1][4][17]. Recently, [14] proposed the cosine-transform-based PCT (Pulse Discrete Cosine Transform) model; the descriptor underlying it was later named the image "signature" in [15], whose experiments showed that the saliency map model built on this descriptor (signature-based saliency map, signatureSal) is more consistent with the HVS than other existing saliency map models. Other work includes the information-theoretic Bruce method [10] and the Bayesian SUN [12] and Surprise [13] models.

The present invention proposes a natural scene statistical saliency map model. It uses the multiplier random variable of the Gaussian scale mixture (GSM) statistical distribution of natural scenes to compute the image saliency map. The proposed saliency map model is highly consistent with the VA mechanism.

Summary of the Invention

The object of the present invention is to provide a method for computing image saliency maps that is consistent with the attentional visual selection mechanism of the human visual system, thereby establishing an image saliency map model.

The present invention uses the multiplier random variable of the Gaussian scale mixture (GSM) statistical distribution of natural scenes to compute the image saliency map. The specific steps are as follows:

(1) Let the image be a grayscale image with R rows and C columns. Apply a wavelet transform to it to obtain multiple wavelet coefficient subbands.

(2) Within each wavelet coefficient subband, select a suitable neighborhood of wavelet coefficients for each coefficient c, and stack the neighborhood into a neighborhood vector c′ of length M, where M is the neighborhood size;

According to the statistical properties of the natural scene statistics (NSS) model, the wavelet coefficient neighborhood vector c′ of a natural image can be described by a Gaussian scale mixture (GSM) distribution, i.e. c′ = s·u′, where s is a random multiplier characterizing the variation of the neighborhood vector covariance and u′ is a zero-mean Gaussian random variable with covariance matrix C_u. The probability density function of the neighborhood vector c′ is therefore expressed as:

$p_{c'}(c') = \int p_{c'|s}(c' \mid s)\, p_s(s)\, ds$    (1)

where $p_s(s)$ is the probability density function of the random multiplier s;

Thus, conditioned on the random multiplier s, the coefficient neighborhood vector c′ follows a zero-mean Gaussian distribution with covariance matrix $s^2 C_u$, and the conditional probability density function is expressed as:

$p_{c'|s}(c' \mid s) = \dfrac{1}{(2\pi)^{M/2}\, s^{M}\, |C_u|^{1/2}} \exp\!\left(-\dfrac{c'^{T} C_u^{-1} c'}{2 s^{2}}\right)$    (6)
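For concreteness, the following is a minimal Python sketch of step (2). It builds spatial neighborhood vectors from one subband of a standard separable wavelet decomposition; PyWavelets is used here only as a stand-in for the overcomplete steerable pyramid of the patent, and the patent's full neighborhood also includes cross-orientation and parent-scale neighbors (see Fig. 1). All names and the window size are illustrative.

import numpy as np
import pywt

def neighborhood_vectors(subband, win=3):
    """Stack each coefficient's win x win spatial neighborhood into a column."""
    pad = win // 2
    padded = np.pad(subband, pad, mode='reflect')
    rows, cols = subband.shape
    vecs = np.empty((win * win, rows * cols))
    idx = 0
    for i in range(rows):
        for j in range(cols):
            vecs[:, idx] = padded[i:i + win, j:j + win].ravel()
            idx += 1
    return vecs  # shape (M, number of coefficients)

image = np.random.rand(128, 128)           # placeholder grayscale image
coeffs = pywt.wavedec2(image, 'db2', level=3)
subband = coeffs[1][0]                      # one detail subband
C = neighborhood_vectors(subband)           # columns are the vectors c'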

(3) Compute the estimate ŝ of the multiplier variable of the Gaussian scale mixture distribution.

When the selected wavelet coefficient neighborhood is small enough, i.e. when M is small enough, the multiplier s may be assumed constant within the neighborhood, so s can temporarily be treated as a deterministic quantity or constant. The multiplier ŝ corresponding to the neighborhood can then be obtained by maximum likelihood estimation from the conditional probability density $p_{c'|s}(c' \mid s)$, namely [18]:

$\hat{s} = \arg\max_{s}\, p_{c'|s}(c' \mid s)$    (7)

的特征值分解为,其中的特征矢量构成的矩阵,的特征值构成的矩阵。 remember The eigenvalues of are decomposed into ,in for The eigenvector of The matrix formed, for The eigenvalues of constituted matrix.

The maximum likelihood estimate of s is therefore:

$\hat{s} = \sqrt{\dfrac{1}{M} \sum_{m=1}^{M} \dfrac{v_m^{2}}{\lambda_m}} = \sqrt{\dfrac{c'^{T} C_u^{-1} c'}{M}}, \qquad v = Q^{T} c'$    (8)

Assume, without loss of generality, that E{s²} = 1; then the covariance matrix of c′ satisfies $C_c = E\{s^2\} C_u = C_u$. Therefore $C_u$ can be replaced by the covariance matrix $C_c$ of c′, giving:

$\hat{s} = \sqrt{\dfrac{c'^{T} C_c^{-1} c'}{M}}$    (9)

Here the superscript T denotes transposition. If the wavelet vector c′ is regarded as a feature sample, then the set of all c′ constitutes the feature space of the image. Since the wavelet coefficients have zero mean, $d(c') = \sqrt{c'^{T} C_c^{-1} c'}$ is the Mahalanobis distance [22] from c′ to the center of this feature space. By the nature of the Mahalanobis distance, which jointly accounts for the relationships among all dimensions of the feature vector, the larger the Mahalanobis distance from a sample to the center of the feature space, the lower the probability that the sample belongs to that feature space, that is, the higher the "salience" of the sample, and vice versa.

From the description in equation (9), ŝ is proportional to the Mahalanobis distance d(c′), i.e.:

$\hat{s} = \dfrac{1}{\sqrt{M}}\, d(c') \;\propto\; d(c')$    (10)

Therefore ŝ is an effective description of the salience of a sample in the feature space: the higher the salience of a feature sample, the larger the corresponding ŝ; conversely, the lower the salience, the smaller the corresponding ŝ.
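Continuing the sketch, step (3) under the reconstruction above reduces to a quadratic form in the inverse sample covariance. A minimal Python sketch, assuming the neighborhood vectors C from the previous snippet:

import numpy as np

def multiplier_map(vecs, shape, eps=1e-8):
    """vecs: (M, P) matrix whose columns are neighborhood vectors c'.
    Returns s_hat = sqrt(c'^T C_c^{-1} c' / M), eq. (9), reshaped to the subband shape."""
    M = vecs.shape[0]
    Cc = np.cov(vecs) + eps * np.eye(M)        # sample covariance of c'
    Cc_inv = np.linalg.inv(Cc)
    # quadratic form per column: the squared Mahalanobis distance of each c'
    quad = np.einsum('ip,ij,jp->p', vecs, Cc_inv, vecs)
    s_hat = np.sqrt(quad / M)
    return s_hat.reshape(shape)

# usage with the vectors from the previous sketch:
# S_sub = multiplier_map(C, subband.shape)

Since ŝ equals the Mahalanobis distance divided by the square root of M, the resulting map directly ranks each neighborhood by how atypical it is within the image's own feature space.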

By the nature of the wavelet decomposition of images, wavelet coefficients capture the discontinuity information of an image and, compared with the frequency domain, describe the distribution of HVS attention intensity over the scene. If, for each wavelet coefficient, we select all coefficients adjacent to it in space, scale and orientation to form a neighborhood, and express all coefficients in that neighborhood as a neighborhood vector c′, then c′ describes the feature vector of the visual stimuli within that neighborhood. Consequently, the feature space formed by the set of all c′ describes not only the spatial distribution of visual stimuli but also the correlations between visual stimuli adjacent in space, scale and orientation: adjacent coefficients within the same subband describe the correlation of spatially adjacent visual features, coefficients at the same scale in different orientations describe the correlation of orientation-adjacent features, and coefficients at different scales in the same orientation describe the correlation of scale-adjacent features. Equation (6) and the analysis above then show that the map formed by the ŝ values can comprehensively describe the salience distribution of a visual scene while jointly accounting for the spatial distribution of visual stimuli and the correlations between adjacent stimuli; such a description is quite consistent with the VA mechanism described in neurology and biology.

(4) Fuse the salience maps corresponding to all wavelet coefficient subbands to obtain the complete natural scene statistical saliency map (NSSSal) model:

$\mathrm{NSSSal} = g * \left( \bigoplus_{n=1}^{N} \sum_{k=1}^{K} \hat{S}_{n,k} \right)$    (11)

where $\hat{S}_{n,k}$ is the salience map of the subband at scale n and orientation k, $\sum_k$ denotes the superposition of the salience descriptions over the different orientations at scale n, $\bigoplus$ denotes across-scale addition [9], in which the salience descriptions at all scales are interpolated to R × C and then added, and g is a Gaussian blur kernel used to smooth the saliency map to a certain extent [15].
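A minimal Python sketch of the fusion of equation (11), with the rescaling of step (5) below folded in; per_scale_maps is assumed to already hold, for each scale, the sum over orientations of the per-subband multiplier maps, and the interpolation and blur parameters are illustrative:

import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.transform import resize

def fuse_nsssal(per_scale_maps, out_shape, sigma):
    """Across-scale addition followed by Gaussian blur and [0, 1] rescaling."""
    acc = np.zeros(out_shape)
    for m in per_scale_maps:
        # interpolate each scale's salience description to R x C, then add
        acc += resize(m, out_shape, order=1, anti_aliasing=False)
    sal = gaussian_filter(acc, sigma)   # the Gaussian blur kernel g
    sal -= sal.min()                    # step (5): rescale to [0, 1]
    return sal / (sal.max() + 1e-12)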

(5) Adjust the grayscale dynamic range of equation (11) to [0, 1]. Positions with values closer to 1 correspond to regions of higher salience in the image, which attract the attention of the human eye more readily than positions with smaller values.

(6) If the image is an RGB-modulated color image, it has a red channel r, a green channel g and a blue channel b. Compute the corresponding grayscale channel I, red-green opponent pair RG and yellow-blue opponent pair BY.

The grayscale channel I is:

$I = (r + g + b)/3$    (12)

According to the human eye's mechanism for processing color information, the four broadly tuned red (R), green (G), blue (B) and yellow (Y) channels are respectively:

$R = r - (g + b)/2$    (13)

$G = g - (r + b)/2$    (14)

$B = b - (r + g)/2$    (15)

$Y = (r + g)/2 - |r - g|/2 - b$    (16)

The red-green opponent pair RG and the yellow-blue opponent pair BY are then:

$RG = R - G$    (17)

$BY = B - Y$    (18)
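A minimal Python sketch of the channel computations of equations (12)-(18) as reconstructed above (the broadly tuned channels follow the standard formulation of [8]; this is an assumption, since the original formulas did not survive extraction):

import numpy as np

def opponent_channels(rgb):
    """rgb: float array of shape (rows, cols, 3) with values in [0, 1]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    I = (r + g + b) / 3.0                           # eq. (12)
    R = r - (g + b) / 2.0                           # eq. (13)
    G = g - (r + b) / 2.0                           # eq. (14)
    B = b - (r + g) / 2.0                           # eq. (15)
    Y = (r + g) / 2.0 - np.abs(r - g) / 2.0 - b     # eq. (16)
    return I, R - G, B - Y                          # eqs. (17)-(18): I, RG, BY

Each of I, RG and BY is then passed through steps (1)-(5), and equation (19) below averages the three maps; with equal weights this is simply (NSSSal_I + NSSSal_RG + NSSSal_BY)/3.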

(7) For the grayscale channel I, the red-green opponent channel RG and the yellow-blue opponent channel BY, compute saliency maps according to steps (1)-(5), denoted NSSSal_I, NSSSal_RG and NSSSal_BY respectively. The saliency map cNSSSal of the color image is then taken as their weighted average, namely:

$\mathrm{cNSSSal} = \omega_I\, \mathrm{NSSSal\_I} + \omega_{RG}\, \mathrm{NSSSal\_RG} + \omega_{BY}\, \mathrm{NSSSal\_BY}$    (19)

where $\omega_I$, $\omega_{RG}$ and $\omega_{BY}$ are the weights of the three channels, with $\omega_I + \omega_{RG} + \omega_{BY} = 1$.

According to the present invention, in the saliency map computed for an image, positions with higher pixel values correspond to higher image salience, and positions with lower pixel values correspond to lower image salience.

The saliency map model proposed by the present invention is highly consistent with the visual attention selection mechanism: it suppresses recurring stimuli while highlighting highly salient visual stimuli, and thus better describes the distribution of an image's salience as a visual stimulus to the human eye.

Brief Description of the Drawings

Figure 1: Steerable pyramid decomposition and wavelet coefficient neighborhood selection.

Detailed Description

The following examples compare the performance of the saliency map models NSSSal/cNSSSal of the present invention with other saliency map models in extracting saliency maps from natural images. Their AUC (Area Under the Curve, the area under the ROC curve) is also evaluated quantitatively on the public Bruce database [10] and ImgSal database [11].

In the experiments, an overcomplete steerable pyramid is used for the wavelet decomposition so as to preserve the orientation of the image, with the numbers of decomposition scales N and orientations K as illustrated in Fig. 1. Each wavelet coefficient (marked by a small black square in the figure), together with its adjacent coefficients in the same subband, the coefficients at the same position in subbands of the same scale but different orientations, and the coefficient at the same position in the parent scale of the same orientation (all marked by small gray squares), forms its neighborhood vector c′ of size M. The Gaussian kernel variance is set to 0.045 times the width of the saliency map.

For any natural image, denote by S the saliency map obtained by a given saliency map model. Select a threshold T and binarize S accordingly, denoting the binarized map $S_T$. With respect to the fixation density map F provided by the database (binarized in the same way), the true positive rate (TPR) is:

$\mathrm{TPR} = \dfrac{\| S_T \circ F \|_1}{\| F \|_1}$    (20)

where the symbol $\circ$ denotes pixel-wise multiplication and $\| \cdot \|_1$ denotes the 1-norm, i.e. the number of 1-valued entries of the matrix.

Similarly, the false positive rate (FPR) is:

$\mathrm{FPR} = \dfrac{\| S_T \circ (1 - F) \|_1}{\| 1 - F \|_1}$    (21)

For a given threshold T, the average of the TPR values obtained over all images in the database is taken as the true positive rate of the saliency map model at threshold T, and likewise the average of the FPR values is taken as its false positive rate at threshold T. By sweeping the threshold T and plotting the averaged TPR on the vertical axis against the averaged FPR on the horizontal axis, the ROC curve is drawn. The area under the ROC curve (AUC) provides a measure of the consistency between the saliency map model and the subjective attention distribution of the human eye: the closer the AUC is to 1, the more consistent the saliency map model is with the VA mechanism.
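A minimal Python sketch of this evaluation protocol for a single image, assuming sal is a saliency map rescaled to [0, 1] and fdm_binary a binarized fixation density map of the same size:

import numpy as np

def roc_auc(sal, fdm_binary, n_thresh=100):
    pos = fdm_binary.sum()           # number of fixated pixels
    neg = (1 - fdm_binary).sum()     # number of non-fixated pixels
    tpr, fpr = [], []
    for t in np.linspace(1.0, 0.0, n_thresh):
        s_t = (sal >= t).astype(float)                     # binarized map S_T
        tpr.append((s_t * fdm_binary).sum() / pos)         # eq. (20)
        fpr.append((s_t * (1 - fdm_binary)).sum() / neg)   # eq. (21)
    return np.trapz(tpr, fpr)        # AUC by trapezoidal integration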

(1) Performance evaluation of grayscale-image saliency map models

Table 1 lists the AUC values computed by the NSSSal, Itti&Koch, PFTSal and signatureSal models after the images in the Bruce and ImgSal databases were converted to grayscale; the closer the value is to 1, the more consistent the saliency map model is with the VA mechanism. On both databases, Itti&Koch yields the lowest AUC, PFTSal and signatureSal give statistically similar results, and the NSSSal model attains AUC values closer to 1 than either, showing the highest consistency with the VA mechanism.

Table 1: Comparison of AUC values for grayscale-image saliency maps.

(2) Performance evaluation of color-image saliency map models

Further, for color natural images, we compared the ROC (Receiver Operating Characteristic) curves of cNSSSal with those of Itti&Koch [8], PQFTSal [16] and signatureSal [15], together with the area under the ROC curve (AUC). In the experiments we set $\omega_I = \omega_{RG} = \omega_{BY} = 1/3$, i.e. the three channels are weighted equally.

Table 2 lists the AUC values computed by the cNSSSal, Itti&Koch, PQFTSal and signatureSal models on the Bruce and ImgSal databases. Once color information is taken into account, the AUC of each saliency map model increases accordingly, and the AUC of cNSSSal remains the highest among the models.

Table 2: Comparison of AUC values for color-image saliency maps.

Practical application effects of the present invention

Saliency map models are widely used in image processing problems such as adaptive image compression, video summarization, coding and progressive image transmission, image segmentation, image and video quality assessment, object recognition, and content-aware image scaling. The application of the proposed saliency map model to image quality assessment is taken below as an example to illustrate its effectiveness and superiority.

Image quality assessment measures simulate the overall mechanism of the human visual system, aiming to produce results consistent with the human eye's judgment of image quality. In many existing image quality assessment methods, however, the important attentional visual selection mechanism of the HVS is neglected. When this mechanism is ignored, the quality measure implicitly assumes that the human eye pays equal attention to all targets, including the natural scene content and the image distortions. According to the attentional visual selection mechanism, however, the human visual model does not treat an image as a simple high-dimensional spatial signal but is differently sensitive to different attributes of the image, such as luminance, contrast, object shape and texture, orientation, and smoothness. Since the human visual system is differently sensitive to different components of an image, these differing sensitivities must be taken into account when assessing the quality of a distorted image. Incorporating the attentional visual selection mechanism can therefore effectively improve the performance of image quality assessment algorithms.

In the experiments, the computed image saliency map S is used as a weight map to re-balance the contributions of the partial quality scores to the final result of the original image quality assessment methods, with the aim of making the assessment results more consistent with subjective human judgments. Three objective image quality assessment measures (PSNR, MSSIM [25], VIF [26]) are weighted with saliency maps.

(1) Saliency-based PSNR

Since PSNR is pixel-based, the saliency map can be used directly as a weighting matrix. Let $S_i$ be the gray value at the i-th pixel of the saliency map S. The saliency-based PSNR (SPSNR) is defined as:

$\mathrm{SPSNR} = 10 \log_{10} \dfrac{L^{2}}{\sum_{i=1}^{P} S_i (x_i - y_i)^{2} \,/\, \sum_{i=1}^{P} S_i}$    (22)

where P is the total number of pixels in the image, $x_i$ and $y_i$ are the gray values at the i-th pixel of the reference image X and the distorted image Y respectively, and L is the grayscale dynamic range of the image. Since the gray levels of the saliency maps obtained from different natural images differ, the weights are normalized by $\sum_i S_i$.
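A minimal Python sketch of equation (22), assuming ref and dist are grayscale images and sal the saliency map of the reference:

import numpy as np

def spsnr(ref, dist, sal, L=255.0):
    w = sal / sal.sum()   # saliency map as normalized per-pixel weights
    wmse = np.sum(w * (ref.astype(float) - dist.astype(float)) ** 2)
    return 10.0 * np.log10(L ** 2 / wmse)   # eq. (22)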

(2) Saliency-based MSSIM

Since MSSIM is based on image sub-blocks, the obtained saliency map S is first divided into sub-blocks of the same number and size as those of the reference image. Suppose S is divided into J sub-blocks, and let $s_j$ denote the sub-block of S corresponding to the image sub-blocks $x_j$ and $y_j$. Taking the normalized mean gray value of the sub-block $s_j$ as the weight $w_j$ of the corresponding sub-block, we have:

$w_j = \dfrac{\frac{1}{P_j}\sum_{p=1}^{P_j} s_j(p)}{\sum_{j'=1}^{J} \frac{1}{P_{j'}}\sum_{p=1}^{P_{j'}} s_{j'}(p)}$    (23)

where $P_j$ denotes the number of pixels in $s_j$ and $s_j(p)$ denotes the gray value of the p-th pixel of $s_j$. The saliency-based MSSIM (SMSSIM) can then be defined as:

$\mathrm{SMSSIM}(X, Y) = \sum_{j=1}^{J} w_j\, \mathrm{SSIM}(x_j, y_j)$    (24)

where $x_j$ and $y_j$ are the j-th sub-blocks of the reference image and the distorted image respectively, and J is the total number of image sub-blocks.
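A minimal Python sketch of equations (23)-(24); block_ssim and block_sal_mean are assumed to be precomputed arrays holding the per-block SSIM values and the mean saliency of each block:

import numpy as np

def smssim(block_ssim, block_sal_mean):
    w = np.asarray(block_sal_mean) / np.sum(block_sal_mean)   # eq. (23)
    return float(np.sum(w * np.asarray(block_ssim)))          # eq. (24)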

(3) Saliency-based VIF

Since spatial-domain VIF is multi-scale, the reference image is resized to the different scales and the saliency map $S_k$ at each scale is computed, where k is the scale index. Then, following the sub-block nature of VIF, at scale k the map $S_k$ is divided into $J_k$ sub-blocks, the j-th of which is denoted $s_{k,j}$. Taking the normalized mean gray value of the sub-block $s_{k,j}$ as the weight $w_{k,j}$ of the corresponding sub-block, we have:

$w_{k,j} = \dfrac{\frac{1}{P_{k,j}}\sum_{p=1}^{P_{k,j}} s_{k,j}(p)}{\sum_{j'=1}^{J_k} \frac{1}{P_{k,j'}}\sum_{p=1}^{P_{k,j'}} s_{k,j'}(p)}$    (25)

where $P_{k,j}$ denotes the number of pixels in $s_{k,j}$ and $s_{k,j}(p)$ denotes the gray value of the p-th pixel of $s_{k,j}$. The saliency-based VIF (SVIF) can then be defined as:

$\mathrm{SVIF} = \dfrac{\sum_{k=1}^{K} \sum_{j=1}^{J_k} w_{k,j}\, I\!\left(c_{k,j};\, f_{k,j} \mid s_{k,j}\right)}{\sum_{k=1}^{K} \sum_{j=1}^{J_k} w_{k,j}\, I\!\left(c_{k,j};\, e_{k,j} \mid s_{k,j}\right)}$    (26)

where K denotes the number of scales, k is the scale index, $J_k$ denotes the number of image sub-blocks at the k-th scale, and j is the sub-block index. $c_{k,j}$ is the j-th sub-block of the reference image at the k-th scale, $e_{k,j}$ and $f_{k,j}$ are the corresponding sub-blocks of the reference image and the distorted image as received by the human eye, $s_{k,j}$ are the corresponding model parameters, and $I(\cdot\,;\cdot \mid \cdot)$ denotes the mutual information.
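A minimal Python sketch of equations (25)-(26); the per-scale, per-block mutual-information terms I_ref and I_dist are assumed to come from an existing spatial-domain VIF implementation, and sal_mean[k] holds the mean saliency of each block at scale k:

import numpy as np

def svif(I_ref, I_dist, sal_mean):
    """I_ref/I_dist: per-scale lists of per-block mutual-information values;
    sal_mean: per-scale lists of per-block mean saliency."""
    num = den = 0.0
    for k in range(len(I_ref)):
        w = np.asarray(sal_mean[k]) / np.sum(sal_mean[k])   # eq. (25)
        num += float(np.sum(w * I_dist[k]))   # information in the distorted channel
        den += float(np.sum(w * I_ref[k]))    # information in the reference channel
    return num / den                          # eq. (26)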

In the experiments, PSNR, MSSIM and VIF are each weighted with four different saliency map models (Itti&Koch [8], PQFTSal [16], signatureSal [15], cNSSSal), so each measure yields five different implementations (IQA, Itti_IQA, PQFT_IQA, signature_IQA and cNSS_IQA).

The MATLAB implementations of MSSIM and spatial-domain VIF are the original authors' versions published online. When computing MSSIM and SMSSIM, the sub-block size is 8×8 with adjacent sub-blocks spaced 1 pixel apart; when computing spatial-domain VIF, 4 scales are used with non-overlapping sub-blocks of size 3×3. To ensure a fair comparison, the parameter settings used when extracting saliency maps with the different models and combining them with the quality measures are kept identical. The saliency maps are blurred with the default blur parameter of the version implemented by the authors of signatureSal (see [15] for details).

The performance of the present invention is verified on the LIVE image quality assessment database [28]. The LIVE database contains 982 images, of which 779 are distorted. These images are derived from 29 reference images under five types of distortion at different distortion levels: JPEG, JPEG2000, white noise, Gaussian blur and fast channel fading. The database also provides a subjective assessment score (DMOS) for each image, with range [0, 100]; DMOS = 0 indicates an undistorted image, and the DMOS value increases with the degree of distortion. The performance of an image quality assessment algorithm can then be evaluated by comparing its results with the DMOS.

After a nonlinear regression fit between the image quality estimates and the DMOS, the consistency of an objective quality measure with the subjective scores can be evaluated quantitatively by five objective indicators: 1) the linear correlation coefficient (LCC), which describes prediction accuracy (the closer to 1, the more accurate); 2) the mean absolute error (MAE), where a smaller value indicates a smaller absolute prediction error; 3) the root mean squared error (RMSE), where a smaller value indicates a smaller prediction RMS error; 4) the outlier ratio (OR), which describes prediction consistency (the smaller, the more consistent); and 5) Spearman's rank-ordered correlation coefficient (SROCC), which describes prediction monotonicity (the closer to 1, the better). The quantitative evaluation results are shown in Table 3.
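A minimal Python sketch of this protocol, assuming scores and dmos are arrays of objective scores and subjective DMOS values; the five-parameter logistic is the mapping commonly used with LIVE, and OR is omitted because it additionally requires the per-image DMOS confidence intervals:

import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr, spearmanr

def logistic(x, b1, b2, b3, b4, b5):
    # five-parameter logistic mapping from objective score to DMOS
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (x - b3)))) + b4 * x + b5

def evaluate(scores, dmos):
    scores, dmos = np.asarray(scores, float), np.asarray(dmos, float)
    p0 = [np.max(dmos), 1.0, np.mean(scores), 1.0, np.mean(dmos)]
    params, _ = curve_fit(logistic, scores, dmos, p0=p0, maxfev=20000)
    pred = logistic(scores, *params)
    lcc = pearsonr(pred, dmos)[0]               # prediction accuracy
    mae = np.abs(pred - dmos).mean()            # mean absolute error
    rmse = np.sqrt(((pred - dmos) ** 2).mean()) # root mean squared error
    srocc = spearmanr(scores, dmos)[0]          # prediction monotonicity
    return lcc, mae, rmse, srocc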

Table 3: Comparison of the performance improvement brought to image quality assessment measures by color-image saliency map models.

The bold values in Table 3 are the best results among the different implementations for each indicator. All four saliency map models (Itti&Koch, PQFTSal, signatureSal and cNSSSal) improve the performance of the quality assessment measures. At the same time, the improvement brought by the cNSSSal saliency map far exceeds that of the other saliency map models, demonstrating the superiority of the proposed saliency map model in image quality assessment.

References

[1] U. Engelke, H. Kaprykowski, H.-J. Zepernick, and P. Ndjiki-Nya. Visual attention in quality assessment. IEEE Signal Processing Magazine, 2011, 28(6): 50-59.

[2] E. Kowler. Eye movements: The past 25 years. Vision Research, 2011, 51(13): 1457-1483.

[3] M. Carrasco. Visual attention: The past 25 years. Vision Research, 2011, 51(13): 1484-1525.

[4] A. Toet. Computational versus psychophysical bottom-up image saliency: A comparative evaluation study. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(11): 2131-2146.

[5] A. M. Treisman and G. Gelade. A feature-integration theory of attention. Cognitive Psychology, 1980, 12(1): 97-136.

[6] J. M. Wolfe, K. R. Cave, and S. L. Franzel. Guided search: An alternative to the feature integration model for visual search. Journal of Experimental Psychology: Human Perception and Performance, 1989, 15(3): 419-433.

[7] C. Koch and S. Ullman. Shifts in selective visual attention: Towards the underlying neural circuitry. Human Neurobiology, 1985: 219-227.

[8] L. Itti, C. Koch, and E. Niebur. A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(11): 1254-1259.

[9] D. Walther and C. Koch. Modeling attention to salient proto-objects. Neural Networks, 2006, 19(9): 1395-1407.

[10] N. D. B. Bruce and J. K. Tsotsos. Saliency based on information maximization. In Proc. Advances in Neural Information Processing Systems, 2005: 155-162.

[11] J. Li, M. D. Levine, X. An, X. Xu, and H. He. Visual saliency based on scale-space analysis in the frequency domain. IEEE Transactions on Pattern Analysis and Machine Intelligence, PP(99), early access.

[12] C. Kanan, M. H. Tong, L. Zhang, and G. W. Cottrell. SUN: Top-down saliency using natural statistics. Visual Cognition, 2009, 17(6/7): 979-1003.

[13] L. Itti and P. Baldi. Bayesian surprise attracts human attention. In Proc. Advances in Neural Information Processing Systems, 2009: 547-554.

[14] Y. Ying, B. Wang, and L. M. Zhang. Pulse discrete cosine transform for saliency-based attention. In Proc. IEEE ICDL '09, 2009: 1-6.

[15] X. Hou, J. Harel, and C. Koch. Image signature: Highlighting sparse salient regions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(1): 194-201.

[16] M. Qi and L. M. Zhang. Saliency-based image quality assessment criterion. In Proc. Advanced Intelligent Computing Theories and Applications, 2008: 1124-1133.

[17] C. Guo and L. M. Zhang. Spatio-temporal saliency detection using phase spectrum of quaternion Fourier transform. In Proc. IEEE CVPR '08, 2008: 1-8.

[18] M. J. Wainwright and E. P. Simoncelli. Scale mixtures of Gaussians and the statistics of natural images. Advances in Neural Information Processing Systems, 2000, 12: 855-861.

[19] J. Portilla, V. Strela, M. J. Wainwright, and E. P. Simoncelli. Image denoising using scale mixtures of Gaussians in the wavelet domain. IEEE Transactions on Image Processing, 2003, 12(11): 1338-1351.

[20] Z. Wang and A. C. Bovik. Reduced- and no-reference image quality assessment. IEEE Signal Processing Magazine, 2011, 28(6): 29-40.

[21] A. K. Moorthy and A. C. Bovik. Statistics of natural image distortions. In Proc. IEEE ICASSP '10, 2010: 962-965.

[22] P. C. Mahalanobis. On the generalized distance in statistics. Proceedings of the National Institute of Sciences of India, 1936, 2(1): 49-55.

[23] E. P. Simoncelli, W. T. Freeman, E. H. Adelson, and D. J. Heeger. Shiftable multiscale transforms. IEEE Transactions on Information Theory, 1992, 38(2): 587-607.

[24] Z. Wang and A. C. Bovik. Mean squared error: Love it or leave it? A new look at signal fidelity measures. IEEE Signal Processing Magazine, 2009, 26(1): 98-117.

[25] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 2004, 13(4): 600-612.

[26] H. R. Sheikh and A. C. Bovik. Image information and visual quality. IEEE Transactions on Image Processing, 2006, 15(2): 430-444.

[27] H. R. Sheikh, M. F. Sabir, and A. C. Bovik. A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Transactions on Image Processing, 2006, 15(11): 3440-3451.

[28] H. R. Sheikh, Z. Wang, A. C. Bovik, and L. K. Cormack. Image and Video Quality Assessment Research at LIVE [Online]. Available: http://live.ece.utexas.edu/research/quality.

Claims (2)

1. A method for computing an image saliency map, characterized in that the multiplier random variable of the Gaussian scale mixture statistical distribution of natural scenes is used to compute the image saliency map, with the following specific steps:

(1) Let the image be a grayscale image, where R and C are respectively the numbers of rows and columns of the image; apply a wavelet transform to it to obtain multiple wavelet coefficient subbands;

(2) Within each wavelet coefficient subband, select a suitable neighborhood of wavelet coefficients for each coefficient c, and stack the neighborhood into a neighborhood vector c′, where M is the neighborhood size; according to the statistical properties of the natural scene statistics model, the wavelet coefficient neighborhood vector of a natural image is described by a Gaussian scale mixture distribution, i.e. c′ = s·u′, where s is a random multiplier characterizing the variation of the neighborhood vector covariance and u′ is a zero-mean Gaussian random variable with covariance matrix $C_u$;

(3) Compute the maximum likelihood estimate of the multiplier variable of the Gaussian scale mixture distribution:

$\hat{s} = \sqrt{c'^{T} C_u^{-1} c' / M}$    (1)

Assuming E{s²} = 1, replace $C_u$ with the covariance matrix $C_c$ of c′ to obtain:

$\hat{s} = \sqrt{c'^{T} C_c^{-1} c' / M}$    (2)

If the wavelet coefficient neighborhood vector c′ is regarded as a feature sample, then the set of all c′ constitutes the feature space of the image, and ŝ is proportional to the Mahalanobis distance $d(c') = \sqrt{c'^{T} C_c^{-1} c'}$ from c′ to the center of this feature space, i.e.:

$\hat{s} = d(c') / \sqrt{M} \;\propto\; d(c')$    (3)

ŝ is an effective description of the salience of the sample in the feature space: the higher the salience of a feature sample, the larger the corresponding ŝ; conversely, the lower the salience of a feature sample, the smaller the corresponding ŝ;

(4) Fuse the salience corresponding to all wavelet coefficient subbands to obtain the complete natural scene statistical saliency map NSSSal model:

$\mathrm{NSSSal} = g * \left( \bigoplus_{n=1}^{N} \sum_{k=1}^{K} \hat{S}_{n,k} \right)$    (4)

where $\sum_k$ denotes the superposition of the salience descriptions over the different orientations at a given scale, $\bigoplus$ denotes across-scale addition, in which the salience descriptions at all scales are interpolated to R × C and then added, and g is a Gaussian blur kernel used to smooth the saliency map;

(5) Adjust the grayscale dynamic range of equation (4) to [0, 1]; positions with values closer to 1 correspond to regions of higher salience in the image, which attract the attention of the human eye more readily than positions with smaller values;

(6) If the image is an RGB-modulated color image, compute for the grayscale channel I, the red-green opponent pair channel RG and the yellow-blue opponent pair channel BY, according to steps (1)-(5), the saliency maps denoted NSSSal_I, NSSSal_RG and NSSSal_BY respectively, and take their weighted average as the saliency map of the color image, namely:

$\mathrm{cNSSSal} = \omega_I\, \mathrm{NSSSal\_I} + \omega_{RG}\, \mathrm{NSSSal\_RG} + \omega_{BY}\, \mathrm{NSSSal\_BY}$    (5)

where $\omega_I$, $\omega_{RG}$ and $\omega_{BY}$ are the weights of the three channels, with $\omega_I + \omega_{RG} + \omega_{BY} = 1$.

2. The method according to claim 1, wherein in the saliency map computed for each image, positions with higher pixel values correspond to higher image salience, and positions with lower pixel values correspond to lower image salience.
CN201310135762.8A (priority date 2013-04-19, filing date 2013-04-19): Method for computing an image saliency map using natural scene statistics; granted as CN103218815B (en); status: Expired - Fee Related

Priority Applications (1)

CN201310135762.8A, priority date 2013-04-19, filing date 2013-04-19: CN103218815B (en), Method for computing an image saliency map using natural scene statistics


Publications (2)

CN103218815A (en): published 2013-07-24
CN103218815B (en): granted, published 2016-03-30

Family

ID=48816558


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110298341B * 2019-06-12 2023-09-19 Shanghai University Enhanced image significance prediction method based on direction selectivity
CN110503162A * 2019-08-29 2019-11-26 Guangdong University of Technology A kind of media information popularity prediction method, device and equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101103378A * 2005-01-10 2008-01-09 Thomson Licensing Apparatus and method for creating saliency map of image
CN102184557A * 2011-06-17 2011-09-14 University of Electronic Science and Technology of China Salient region detection method for complex scene
EP2461274A1 * 2010-09-16 2012-06-06 Thomson Licensing Method and device of determining a saliency map for an image
CN102754126A * 2010-02-12 2012-10-24 École de Technologie Supérieure Method and system for determining a quality measure for an image using multi-level decomposition of images


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Gaussian Mixture Modeling by Exploiting the Mahalanobis Distance; Dimitrios Ververidis et al.; IEEE Transactions on Signal Processing, July 2008, 56(7): 2797-2811 *
Information Content Weighting for Perceptual Image Quality Assessment; Zhou Wang et al.; IEEE Transactions on Image Processing, September 2010: 1185-1198 *
On Saliency, Affect and Focused Attention; Lori McCay-Peet et al.; Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, May 2012: 541-550 *
图像感兴趣区域检测技术研究 (Research on image region-of-interest detection techniques); 吕宝成 et al.; 中国科技信息 (China Science and Technology Information), July 2008(13): 44-45 *


Similar Documents

Publication Publication Date Title
Gu et al. No-reference quality assessment of screen content pictures
Li et al. No-reference video quality assessment with 3D shearlet transform and convolutional neural networks
Ye et al. Unsupervised feature learning framework for no-reference image quality assessment
CN103200421B (en) No-reference image quality evaluation method based on Curvelet transformation and phase coincidence
CN104318545B (en) A kind of quality evaluating method for greasy weather polarization image
Liu et al. No-reference image quality assessment method based on visual parameters
CN103325113B (en) Partial reference type image quality evaluating method and device
Ahmed et al. PIQI: perceptual image quality index based on ensemble of Gaussian process regression
CN103295241A (en) Frequency domain significance target detection method based on Gabor wavelet
Murray et al. Low-level spatiochromatic grouping for saliency estimation
CN105828064A (en) No-reference video quality evaluation method integrating local and global temporal and spatial characteristics
Cheng et al. Image quality assessment using natural image statistics in gradient domain
CN103258326B (en) A kind of information fidelity method of image quality blind evaluation
CN104021536A (en) Self-adaptation SAR image and multispectral image fusion method
CN103095996A (en) Multi-sensor video fusion method based on space-time conspicuousness detection
Ma et al. Efficient saliency analysis based on wavelet transform and entropy theory
CN106056523A (en) Digital image stitching tampering blind detection method
Ma et al. Blind image quality assessment in multiple bandpass and redundancy domains
CN103218815B (en) Utilize the method for natural scene statistical computation image saliency map
CN102930545A (en) Statistical measure method for image quality blind estimation
CN107194926A (en) The blind evaluation method of complementary colours wavelet field picture quality
Wu et al. Attended visual content degradation based reduced reference image quality assessment
CN104144339A (en) An objective evaluation method of degraded reference stereo image quality based on human eye perception
CN103996188A (en) Full-reference-type image quality evaluation method based on Gabor weighted characteristics
Joshi et al. Retina inspired no-reference image quality assessment for blur and noise

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20190704

Address after: Room 1103, Building 21, 39 Jibang Road, Zhongming Town, Shanghai 202163

Patentee after: SHANGHAI JILIAN NETWORK TECHNOLOGY Co.,Ltd.

Address before: 200433 No. 220, Handan Road, Shanghai, Yangpu District

Patentee before: Fudan University

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160330