[go: up one dir, main page]

CN104751459B - Multi-dimensional feature similarity measuring optimizing method and image matching method - Google Patents

Multi-dimensional feature similarity measuring optimizing method and image matching method Download PDF

Info

Publication number
CN104751459B
CN104751459B CN201510140372.9A CN201510140372A CN104751459B CN 104751459 B CN104751459 B CN 104751459B CN 201510140372 A CN201510140372 A CN 201510140372A CN 104751459 B CN104751459 B CN 104751459B
Authority
CN
China
Prior art keywords
feature
value
binary
original
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510140372.9A
Other languages
Chinese (zh)
Other versions
CN104751459A (en
Inventor
祁友杰
朱恩
王建新
彭金龙
胥陈彧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN201510140372.9A priority Critical patent/CN104751459B/en
Publication of CN104751459A publication Critical patent/CN104751459A/en
Application granted granted Critical
Publication of CN104751459B publication Critical patent/CN104751459B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Landscapes

  • Image Analysis (AREA)

Abstract

本发明公开了一种多维特征的相似性度量优化方法,属于模式识别技术与计算机技术相交叉的技术领域。该方法首先基于原始的多维特征描述子的自身数据变化特征对其进行二值编码,得到二值编码特征;然后利用二值编码特征间的汉明距离来度量原始的多维特征描述子间的相似性。本发明还公开了一种图像匹配方法。相比现有技术,本发明可加快特征相似性度量的速度,降低特征匹配算法硬件移植的难度和单点处理模块的功耗,减少算法的硬件资源占用率,有效地优化整个特征匹配模块的性能。

The invention discloses a method for optimizing similarity measurement of multi-dimensional features, which belongs to the technical field intersecting pattern recognition technology and computer technology. This method first performs binary coding on the original multidimensional feature descriptor based on its own data change characteristics to obtain binary coded features; then uses the Hamming distance between the binary coded features to measure the similarity between the original multidimensional feature descriptors sex. The invention also discloses an image matching method. Compared with the prior art, the present invention can accelerate the speed of feature similarity measurement, reduce the difficulty of feature matching algorithm hardware transplantation and the power consumption of single-point processing modules, reduce the hardware resource occupancy rate of the algorithm, and effectively optimize the performance of the entire feature matching module. performance.

Description

多维特征的相似性度量优化方法及图像匹配方法Multi-dimensional feature similarity measurement optimization method and image matching method

技术领域technical field

本发明涉及一种多维特征的相似性度量优化方法,属于模式识别技术与计算机技术相交叉的技术领域。The invention relates to a method for optimizing similarity measurement of multi-dimensional features, which belongs to the technical field intersecting pattern recognition technology and computer technology.

背景技术Background technique

随着计算机技术的发展,特征匹配已在信息检索、模式识别、图像处理等诸多领域得到广泛应用。特征匹配通常通过度量特征描述子之间的相似性来实现,以图像处理技术中的图像特征点匹配为例,其主要分为两类:一类是基于单点比较的线性扫描法;另一类是基于划分比较和建立数据索引进行快速匹配的方法。现有图像特征匹配均是先计算基准图和实时图的特征点及其特征描述符,然后计算基准图与实时图的关键点的最近邻匹配[David MARSHALL.Nearest Neighbor Searching In High Dimensional Metric Space[EB/OL].http://escience.anu.edu.Au/pro-ject/06S1/DavidMarshall.2006-06-21/2009-12],以此判断是否为匹配点对。传统技术方案中,对最近邻匹配的相似性度量均采用欧式距离,即先提取基准图像和实时图像的SIFT特征点,在分别建立特征描述子后,根据最近邻和次近邻比值的大小,判断基准图像中的特征点是否与实时图像中的特征点匹配。With the development of computer technology, feature matching has been widely used in information retrieval, pattern recognition, image processing and many other fields. Feature matching is usually achieved by measuring the similarity between feature descriptors. Taking image feature point matching in image processing technology as an example, it is mainly divided into two categories: one is the linear scanning method based on single-point comparison; the other is The class is a method for fast matching based on partition comparison and data indexing. The existing image feature matching is to first calculate the feature points and their feature descriptors of the reference image and the real-time image, and then calculate the nearest neighbor matching of the key points of the reference image and the real-time image [David MARSHALL.Nearest Neighbor Searching In High Dimensional Metric Space[ EB/OL].http://escience.anu.edu.Au/pro-ject/06S1/DavidMarshall.2006-06-21/2009-12] to judge whether it is a matching point pair. In the traditional technical solution, the Euclidean distance is used for the similarity measurement of the nearest neighbor matching, that is, the SIFT feature points of the reference image and the real-time image are extracted first, and after the feature descriptors are respectively established, according to the size of the nearest neighbor and the next nearest neighbor ratio, judge Whether the feature points in the reference image match those in the real-time image.

随着科技进步,各种特征描述方法被提出,这些特征往往具有极高的维数(例如,SIFT特征的维数为128),一方面,这些特征可以更好地对样本的某些属性进行更准确地描述;另一方面,其也为特征相似性度量的具体实现带来了一系列问题。具体而言,传统的基于欧式距离的匹配方案有如下的缺点:With the advancement of science and technology, various feature description methods have been proposed. These features often have extremely high dimensions (for example, the dimension of SIFT features is 128). On the one hand, these features can better perform certain attributes of samples. A more accurate description; on the other hand, it also brings a series of problems to the specific realization of feature similarity measurement. Specifically, the traditional matching scheme based on Euclidean distance has the following disadvantages:

1)资源占用率高。对于单点匹配而言,一般情况下,每个SIFT特征点有128个维度,每个维度用一个字节(8位)进行浮点数表示,则一个特征点就需要128个字节的存储空间;1) The resource usage rate is high. For single-point matching, in general, each SIFT feature point has 128 dimensions, and each dimension is represented by a byte (8 bits) as a floating point number, so a feature point requires 128 bytes of storage space ;

2)算法的硬件移植难道大。将基于欧式距离的匹配算法进行硬件移植过程中,主要涉及求和运算和乘积运算(除法也可以看成一种乘积运算),而这两种运算的复杂度(时间复杂度和空间复杂度)与输入数据的空间复杂度以及输入数据的多少成正比关系;2) Is the hardware transplantation of the algorithm too big? In the process of transplanting the matching algorithm based on Euclidean distance to hardware, it mainly involves summation and product operations (division can also be regarded as a product operation), and the complexity of these two operations (time complexity and space complexity) and The space complexity of the input data is directly proportional to the amount of input data;

3)运算速度低。对于单点匹配,运算的时钟频率受输入数据的维度大小影响较大。假设每个特征点有128个维度,每个维度用一个字节(8位)进行浮点数表示,在FPGA(型号为Zynq-7000 XC7Z020-1CLG484C)上进行算法移植后,其最高时钟频率仅为78MHz。3) The operation speed is low. For single-point matching, the clock frequency of the operation is greatly affected by the dimensionality of the input data. Assume that each feature point has 128 dimensions, and each dimension is represented by a byte (8 bits) as a floating point number. After the algorithm is transplanted on the FPGA (model Zynq-7000 XC7Z020-1CLG484C), the highest clock frequency is only 78MHz.

发明内容Contents of the invention

本发明所要解决的技术问题在于克服现有技术的不足,提供一种多维特征的相似性度量优化方法,可加快特征相似性度量的速度,降低特征匹配算法硬件移植的难度和单点处理模块的功耗,减少算法的硬件资源占用率,有效地优化整个特征匹配模块的性能。The technical problem to be solved by the present invention is to overcome the deficiencies in the prior art and provide a similarity measurement optimization method for multi-dimensional features, which can speed up the measurement of feature similarity and reduce the difficulty of feature matching algorithm hardware transplantation and single-point processing module. Reduce power consumption, reduce the hardware resource occupancy rate of the algorithm, and effectively optimize the performance of the entire feature matching module.

本发明具体采用以下技术方案解决上述技术问题:The present invention specifically adopts the following technical solutions to solve the above technical problems:

一种多维特征的相似性度量优化方法,首先基于原始的多维特征描述子的自身数据变化特征对其进行二值编码,得到二值编码特征;然后利用二值编码特征间的汉明距离来度量原始的多维特征描述子间的相似性。An optimization method for similarity measurement of multi-dimensional features. Firstly, the original multi-dimensional feature descriptors are binary coded based on their own data change characteristics to obtain binary coded features; then the Hamming distance between binary coded features is used to measure Similarity between original multidimensional feature descriptors.

优选方案之一、具体按照以下方法对原始的多维特征描述子进行二值编码:In one of the preferred schemes, the original multi-dimensional feature descriptor is binary coded according to the following method:

设原始的多维特征描述子N为特征维数,上标f表示其数据类型为浮点数;令上标2表示其数据类型为二值数,i=1,2,…,N;即得到原始的多维特征描述子的二值编码特征;其中,根据的大小关系确定,大于的值为1/0,否则,的值为0/1,m∈{m|m=1,2,…,N-1};根据的差值与一预设阈值之间的大小关系确定,的差值大于所述阈值,则的值为1/0,否则,的值为0/1,m∈{m|m=1,2,…,N-1}。Let the original multidimensional feature descriptor N is the feature dimension, and the superscript f indicates that its data type is a floating point number; let The superscript 2 indicates that its data type is a binary number, i=1, 2,..., N; that is, the binary encoding feature of the original multidimensional feature descriptor is obtained; among them, according to with The size relationship is determined, more than the but The value of 1/0, otherwise, The value of is 0/1, m∈{m|m=1,2,...,N-1}; according to with The magnitude relationship between the difference and a preset threshold value is determined, with The difference is greater than the threshold, then The value of 1/0, otherwise, The value of is 0/1, m∈{m|m=1,2,...,N-1}.

优选方案之二、具体按照以下方法对原始的多维特征描述子进行二值编码:The second preferred option is to perform binary encoding on the original multi-dimensional feature descriptor according to the following method:

设原始的多维特征描述子N为特征维数,上标f表示其数据类型为浮点数;令上标2表示其数据类型为二值数,i=1,2,…,N;即得到原始的多维特征描述子的二值编码特征;其中,根据的大小关系确定,大于的值为1/0,否则,的值为0/1,m∈{m|m=1,2,…,N}。Let the original multidimensional feature descriptor N is the feature dimension, and the superscript f indicates that its data type is a floating point number; let The superscript 2 indicates that its data type is a binary number, i=1, 2,..., N; that is, the binary encoding feature of the original multidimensional feature descriptor is obtained; among them, according to with The size relationship is determined, more than the but The value of 1/0, otherwise, The value of is 0/1, m∈{m|m=1,2,…,N}.

优选方案之三、具体按照以下方法对原始的多维特征描述子进行二值编码:The third preferred option is to perform binary encoding on the original multi-dimensional feature descriptor according to the following method:

设原始的多维特征描述子N为特征维数,上标f表示其数据类型为浮点数;令上标2表示其数据类型为二值数,i=1,2,…,N;即得到原始的多维特征描述子的二值编码特征;其中,根据的大小关系确定,大于的值为1/0,否则,的值为0/1;表示原始的多维特征描述子中各维特征分量的均值。Let the original multidimensional feature descriptor N is the feature dimension, and the superscript f indicates that its data type is a floating point number; let The superscript 2 indicates that its data type is a binary number, i=1, 2,..., N; that is, the binary encoding feature of the original multidimensional feature descriptor is obtained; among them, according to with The size relationship is determined, more than the but The value of 1/0, otherwise, The value of is 0/1; Indicates the mean value of each dimension feature component in the original multidimensional feature descriptor.

根据相同的发明思路还可以得到以下技术方案:According to the same inventive idea, the following technical solutions can also be obtained:

一种图像匹配方法,利用图像特征间的相似性进行图像匹配,其特征在于,所述图像特征间的相似性使用上任一技术方案所述相似性度量优化方法进行度量。An image matching method, which uses the similarity between image features to perform image matching, wherein the similarity between the image features is measured using the similarity measurement optimization method described in any one of the above technical solutions.

相比现有技术,本发明可大幅降低特征匹配算法的硬件移植难度,提高特征匹配的运算速度,并大大减少硬件资源占用率,具有广泛的应用前景。Compared with the prior art, the invention can greatly reduce the difficulty of hardware transplantation of the feature matching algorithm, increase the operation speed of the feature matching, greatly reduce the occupation rate of hardware resources, and has wide application prospects.

附图说明Description of drawings

图1为现有基于欧式距离的SIFT特征相似性度量方法的流程图;Fig. 1 is the flow chart of existing SIFT feature similarity measurement method based on Euclidean distance;

图2为本发明相似性度量优化方法的流程图;Fig. 2 is the flowchart of similarity measurement optimization method of the present invention;

图3为特征描述子进行二值化编码的原理示意图。FIG. 3 is a schematic diagram of the principle of binary encoding of feature descriptors.

图4为本发明相似性度量优化方法所使用的一种二值编码算法流程图;Fig. 4 is a kind of binary coding algorithm flow chart that similarity measurement optimization method of the present invention uses;

图5为本发明相似性度量优化方法所使用的一种计算汉明距离的算法流程图。FIG. 5 is a flowchart of an algorithm for calculating Hamming distance used in the similarity measure optimization method of the present invention.

具体实施方式detailed description

下面结合附图对本发明的技术方案进行详细说明:The technical scheme of the present invention is described in detail below in conjunction with accompanying drawing:

在图像处理、模式识别、信息检索等领域的特征匹配过程中,由于特征维数较高以及欧式距离的计算较复杂,导致特征相似性度量需要消耗大量的硬件资源(包括存储、计算资源),对硬件要求高,硬件移植的难度大。以基于SIFT特征的图像匹配算法为例,用于进行匹配的特征点一般使用所在尺度空间的128维向量信息进行表征。而SIFT算法的一个突出优点就是多量性,即:即使目标物体很少,也能生成较多的特征点,这就使得具有较大尺寸的基准图像往往具有多达数千的特征点,尺寸较小的实时图像拥有的特征点也多达数百。图像特征点所带信息过多,成为阻碍提高图像匹配速度、提升业务吞吐量的主要因素之一,其所需的庞大硬件资源也为算法的硬件移植带来了极大的困难。In the process of feature matching in the fields of image processing, pattern recognition, information retrieval, etc., due to the high feature dimension and the complicated calculation of Euclidean distance, the feature similarity measurement needs to consume a lot of hardware resources (including storage and computing resources). The hardware requirements are high, and the hardware transplantation is difficult. Taking the image matching algorithm based on SIFT features as an example, the feature points used for matching are generally represented by the 128-dimensional vector information in the scale space. One of the outstanding advantages of the SIFT algorithm is its multiplicity, that is, even if there are few target objects, more feature points can be generated, which makes the reference image with a larger size often have as many as thousands of feature points, and the size is smaller. Small real-time images also have hundreds of feature points. Too much information carried by image feature points has become one of the main factors hindering the improvement of image matching speed and business throughput. The huge hardware resources it requires also bring great difficulties to the hardware transplantation of algorithms.

为解决该问题,本发明提出了一种新的多维特征的相似性度量优化方法,其基本思想是首先基于原始的多维特征描述子的自身数据变化特征对其进行二值编码,得到二值编码特征;然后利用二值编码特征间的汉明距离来度量原始的多维特征描述子间的相似性。为了便于公众理解,下面仍以基于SIFT特征的图像匹配为例来对本发明技术方案进行详细说明。In order to solve this problem, the present invention proposes a new multi-dimensional feature similarity measurement optimization method, the basic idea of which is to first perform binary encoding on the original multi-dimensional feature descriptor based on its own data change characteristics, and obtain the binary encoding features; and then use the Hamming distance between binary coded features to measure the similarity between the original multidimensional feature descriptors. In order to facilitate the public's understanding, the technical solution of the present invention will be described in detail below still taking image matching based on SIFT features as an example.

图1显示了现有基于欧式距离的SIFT特征相似性度量方法的基本流程。对于一个128维的SIFT特征描述子(上标f表示当前的描述子的数据类型是浮点数,以区别下面二值表达的描述子),一般情况下,描述子的每一维均用浮点数进行表示。在算法进行硬件移植中,至少需要用8个比特位进行浮点数的表示,这样,一个特征点描述子需要128×8=1024个比特。Figure 1 shows the basic flow of the existing Euclidean distance-based SIFT feature similarity measurement method. For a 128-dimensional SIFT feature descriptor (The superscript f indicates that the data type of the current descriptor is a floating-point number, so as to distinguish it from the following binary-valued descriptors). In general, each dimension of the descriptor is represented by a floating-point number. When the algorithm is transplanted to hardware, at least 8 bits are required to represent the floating point number, so a feature point descriptor needs 128×8=1024 bits.

图2显示了本发明相似性度量优化方法的基本流程。如图2所示,首先读入待编码数据A序列和B序列,其中A序列为基准图像的单点数据,B序列为实时图像的单点数据。A、B序列分别为128个维度,每个维度用一个字节(8个比特位)表示;然后分别对A序列和B序列进行二值编码;最后对二值编码的A序列、B序列进行异或运算,计算出其汉明距离。其中的二值编码方法为本发明的核心部分,其是基于原始特征描述子(A序列、B序列)自身的数据变化特征进行编码,相当于以原始特征描述子为对象,再提取其自身数据变化特征。本发明优选根据当前维度特征分量和与其相隔一定距离的维度特征分量之间的变化关系进行二值编码,该方案的具体描述如下:Fig. 2 shows the basic flow of the similarity measure optimization method of the present invention. As shown in Figure 2, first read in the data to be encoded A sequence and B sequence, where A sequence is the single-point data of the reference image, and B sequence is the single-point data of the real-time image. The A and B sequences are respectively 128 dimensions, and each dimension is represented by a byte (8 bits); then the A sequence and the B sequence are respectively binary-coded; finally, the binary-coded A sequence and the B sequence are encoded XOR operation to calculate its Hamming distance. The binary encoding method is the core part of the present invention, which encodes based on the data change characteristics of the original feature descriptor (A sequence, B sequence) itself, which is equivalent to taking the original feature descriptor as the object and then extracting its own data change characteristics. The present invention preferably performs binary encoding according to the change relationship between the current dimension feature component and the dimension feature component separated from it by a certain distance. The specific description of the scheme is as follows:

设原始的多维特征描述子N为特征维数,上标f表示其数据类型为浮点数;令上标2表示其数据类型为二值数,i=1,2,…,N;即得到原始的多维特征描述子的二值编码特征;其中,根据的大小关系确定,大于的值为1/0,否则,的值为0/1,m∈{m|m=1,2,…,N-1};根据的差值与一预设阈值之间的大小关系确定,的差值大于所述阈值,则的值为1/0,否则,的值为0/1,m∈{m|m=1,2,…,N-1}。Let the original multidimensional feature descriptor N is the feature dimension, and the superscript f indicates that its data type is a floating point number; let The superscript 2 indicates that its data type is a binary number, i=1, 2,..., N; that is, the binary encoding feature of the original multidimensional feature descriptor is obtained; among them, according to with The size relationship is determined, more than the but The value of 1/0, otherwise, The value of is 0/1, m∈{m|m=1,2,...,N-1}; according to with The magnitude relationship between the difference and a preset threshold value is determined, with The difference is greater than the threshold, then The value of 1/0, otherwise, The value of is 0/1, m∈{m|m=1,2,...,N-1}.

例如,当m的取值为1的时候可采用以下的二值编码方法:For example, when the value of m is 1, the following binary encoding method can be used:

设原始的多维特征描述子N为特征维数,上标f表示其数据类型为浮点数;令上标2表示其数据类型为二值数,i=1,2,…,N;即得到原始的多维特征描述子的二值编码特征;其中,根据的大小关系确定,大于的值为1(或为0),否则,的值为0(为1);根据的差值与一预设阈值Threshold之间的大小关系确定,的差值大于所述阈值,则的值为1(或为0),否则,的值为0(为1)。Let the original multidimensional feature descriptor N is the feature dimension, and the superscript f indicates that its data type is a floating point number; let The superscript 2 indicates that its data type is a binary number, i=1, 2,..., N; that is, the binary encoding feature of the original multidimensional feature descriptor is obtained; among them, according to with The size relationship is determined, more than the but The value of is 1 (or 0), otherwise, The value of is 0 (is 1); according to with The size relationship between the difference and a preset threshold Threshold is determined, with The difference is greater than the threshold, then The value of is 1 (or 0), otherwise, has a value of 0 (was 1).

或者,设原始的多维特征描述子N为特征维数,上标f表示其数据类型为浮点数;令上标2表示其数据类型为二值数,i=1,2,…,N;即得到原始的多维特征描述子的二值编码特征;其中,根据的大小关系确定,大于的值为1(或为0),否则,的值为0(为1);根据的差值与一预设阈值Threshold之间的大小关系确定,的差值大于所述阈值,则的值为1(或为0),否则,的值为0(为1)。Alternatively, let the original multidimensional feature descriptor N is the feature dimension, and the superscript f indicates that its data type is a floating point number; let The superscript 2 indicates that its data type is a binary number, i=1, 2,..., N; that is, the binary encoding feature of the original multidimensional feature descriptor is obtained; among them, according to with The size relationship is determined, more than the but The value of is 1 (or 0), otherwise, The value of is 0 (is 1); according to with The size relationship between the difference and a preset threshold Threshold is determined, with The difference is greater than the threshold, then The value of is 1 (or 0), otherwise, has a value of 0 (was 1).

上述二值编码方法的原理如图3所示。采用该编码方案后,则原需要1024比特存储空间的SIFT特征描述子仅需要128×2=256个比特,大幅降低了所需存储空间。此外,上述二值编码方案中,di2 2的引入是为了提高数据的独特性,实际应用中也可以没有,即原始特征的每一维分量仅使用一个二值数进行编码,从而进一步降低存储空间。Threshold的取值越低,则得到的二值描述子的独特性越好,但同时会降低算法的查全率,Threshold的具体取值可根据所针对的原始特征通过试验确定。经试验验证,对于SIFT特征,其取值为0.05较佳。The principle of the above-mentioned binary encoding method is shown in FIG. 3 . After adopting this encoding scheme, the SIFT feature descriptor that originally required 1024 bits of storage space only needs 128×2=256 bits, which greatly reduces the required storage space. In addition, in the above binary coding scheme, the introduction of d i2 2 is to improve the uniqueness of the data, and it may not be used in practical applications, that is, each dimension component of the original feature is coded with only one binary number, thereby further reducing storage space. The lower the value of Threshold, the better the uniqueness of the obtained binary descriptor, but at the same time it will reduce the recall rate of the algorithm. The specific value of Threshold can be determined through experiments according to the original features targeted. It has been verified by experiments that for the SIFT feature, the value of 0.05 is better.

当然,具体的二值编码也可采用其它方式,例如,以原始特征某一特定维度的分量(例如第一维分量)作为基准,以各维特征分量与基准之间的大小关系确定相应的二值数,具体如下:Of course, specific binary encoding can also be done in other ways, for example, using a component of a specific dimension of the original feature (such as the first dimension component) as a reference, and using the size relationship between the feature components of each dimension and the reference to determine the corresponding binary code. value, as follows:

设原始的多维特征描述子N为特征维数,上标f表示其数据类型为浮点数;令上标2表示其数据类型为二值数,i=1,2,…,N;即得到原始的多维特征描述子的二值编码特征;其中,根据的大小关系确定,大于的值为1(或为0),否则,的值为0(为1),m∈{m|m=1,2,…,N}。Let the original multidimensional feature descriptor N is the feature dimension, and the superscript f indicates that its data type is a floating point number; let The superscript 2 indicates that its data type is a binary number, i=1, 2,..., N; that is, the binary encoding feature of the original multidimensional feature descriptor is obtained; among them, according to with The size relationship is determined, more than the but The value of is 1 (or 0), otherwise, The value of is 0 (is 1), m∈{m|m=1,2,...,N}.

又或者,以原始特征各维分量的均值作为基准,以各维特征分量与基准之间的大小关系确定相应的二值数,具体如下:Or, take the mean value of each dimensional component of the original feature as the benchmark, and determine the corresponding binary number based on the size relationship between the feature component of each dimension and the benchmark, as follows:

设原始的多维特征描述子N为特征维数,上标f表示其数据类型为浮点数;令上标2表示其数据类型为二值数,i=1,2,…,N;即得到原始的多维特征描述子的二值编码特征;其中,根据的大小关系确定,大于的值为1/0,否则,的值为0/1;表示原始的多维特征描述子中各维特征分量的均值。Let the original multidimensional feature descriptor N is the feature dimension, and the superscript f indicates that its data type is a floating point number; let The superscript 2 indicates that its data type is a binary number, i=1, 2,..., N; that is, the binary encoding feature of the original multidimensional feature descriptor is obtained; among them, according to with The size relationship is determined, more than the but The value of 1/0, otherwise, The value of is 0/1; Indicates the mean value of each dimension feature component in the original multidimensional feature descriptor.

上述两方案也可进一步如第一种优选方案一样引入di2 2,从而提高数据的独特性。The above two schemes can further introduce d i2 2 like the first preferred scheme, so as to improve the uniqueness of the data.

图4、图5显示了采用本发明方法进行SIFT特征相似性度量的一个算法实例,其中图4为二值编码的算法实现流程,图5为计算汉明距离的算法流程。Fig. 4 and Fig. 5 have shown an algorithm example that adopts the method of the present invention to carry out SIFT feature similarity measurement, wherein Fig. 4 is the algorithm implementation flow of binary encoding, Fig. 5 is the algorithm flow of calculating Hamming distance.

二值编码的具体步骤如图4所示,包括:The specific steps of binary encoding are shown in Figure 4, including:

步骤一:首先,将A序列的128个浮点数据依次读入至FIFO(先入先出)中,将此次FIFO命名为F_A,F_A由128个8位寄存器组成;然后,将B序列的128个浮点数据依次读入至FIFO(先入先出)中,将此次FIFO命名为F_B,F_B由128个8位寄存器组成。Step 1: First, read the 128 floating-point data of sequence A into FIFO (first-in, first-out) sequentially, and name this FIFO F_A, F_A is composed of 128 8-bit registers; then, read the 128 floating-point data of sequence B Floating-point data are read into FIFO (first-in-first-out) sequentially, and this FIFO is named as F_B, and F_B is composed of 128 8-bit registers.

步骤二:对A序列进行二值编码,编码后的数据存放于寄存器C中,C是256位寄存器。其中,C[0]与C[1]的编码由A[0]与A[127]决定:Step 2: Perform binary encoding on sequence A, and store the encoded data in register C, which is a 256-bit register. Among them, the encoding of C[0] and C[1] is determined by A[0] and A[127]:

若0<A[0]-A[127]<threshold,则C[0]=1,C[1]=0;If 0<A[0]-A[127]<threshold, then C[0]=1, C[1]=0;

若A[0]-A[127]>threshold,则C[0]=1,C[1]=1;If A[0]-A[127]>threshold, then C[0]=1, C[1]=1;

若0<A[127]-A[0]<threshold,则C[0]=0,C[1]=0;If 0<A[127]-A[0]<threshold, then C[0]=0, C[1]=0;

若A[0]-A[127]>threshold,则C[0]=0,C[1]=1。If A[0]-A[127]>threshold, then C[0]=0, C[1]=1.

而C[2×k]和C[2×k+1]的编码由A[k]与A[k-1]决定,其中1<k<127:The encoding of C[2×k] and C[2×k+1] is determined by A[k] and A[k-1], where 1<k<127:

若0<A[k]-A[k-1]<threshold,则C[2×k]=1,C[2×k+1]=0;If 0<A[k]-A[k-1]<threshold, then C[2×k]=1, C[2×k+1]=0;

若A[k]-A[k-1]>threshold,则C[2×k]=1,C[2×k+1]=1;If A[k]-A[k-1]>threshold, then C[2×k]=1, C[2×k+1]=1;

若0<A[k-1]-A[k]<threshold,则C[2×k]=0,C[2×k+1]=0;If 0<A[k-1]-A[k]<threshold, then C[2×k]=0, C[2×k+1]=0;

若A[k-1]-A[k]>threshold,则C[2×k]=0,C[2×k+1]=1。If A[k-1]-A[k]>threshold, then C[2×k]=0, C[2×k+1]=1.

步骤三:对B序列进行二值编码,编码后的数据存放于寄存器D中,D是256位寄存器。其中,D[0]与D[1]的编码由B[0]与B[127]决定:Step 3: Perform binary encoding on sequence B, and store the encoded data in register D, which is a 256-bit register. Among them, the encoding of D[0] and D[1] is determined by B[0] and B[127]:

若0<B[0]-B[127]<threshold,则D[0]=1,D[1]=0;If 0<B[0]-B[127]<threshold, then D[0]=1, D[1]=0;

若B[0]-B[127]>threshold,则D[0]=1,D[1]=1;If B[0]-B[127]>threshold, then D[0]=1, D[1]=1;

若0<B[127]-B[0]<threshold,则D[0]=0,D[1]=0;If 0<B[127]-B[0]<threshold, then D[0]=0, D[1]=0;

若B[0]-B[127]>threshold,则D[0]=0,D[1]=1。If B[0]-B[127]>threshold, then D[0]=0, D[1]=1.

而D[2×k]和D[2×k+1]的编码由B[k]与B[k-1]决定,其中1<k<127:The encoding of D[2×k] and D[2×k+1] is determined by B[k] and B[k-1], where 1<k<127:

若0<B[k]-B[k-1]<threshold,则D[2×k]=1,D[2×k+1]=0;If 0<B[k]-B[k-1]<threshold, then D[2×k]=1, D[2×k+1]=0;

若B[k]-B[k-1]>threshold,则D[2×k]=1,D[2×k+1]=1;If B[k]-B[k-1]>threshold, then D[2×k]=1, D[2×k+1]=1;

若0<B[k-1]-B[k]<threshold,则D[2×k]=0,D[2×k+1]=0;If 0<B[k-1]-B[k]<threshold, then D[2×k]=0, D[2×k+1]=0;

若B[k-1]-B[k]>threshold,则D[2×k]=0,D[2×k+1]=1。If B[k-1]-B[k]>threshold, then D[2×k]=0, D[2×k+1]=1.

汉明距离的计算步骤如图5所示,具体如下:The calculation steps of the Hamming distance are shown in Figure 5, and the details are as follows:

步骤一:将编码步骤二和步骤三得到的,分别存放于C寄存器和D寄存器的二值数据按位异或,并将结果保存在E寄存器中,其中E为256位寄存器。Step 1: Bitwise XOR the binary data obtained in the encoding steps 2 and 3 and stored in the C register and the D register respectively, and store the result in the E register, where E is a 256-bit register.

步骤二:求出寄存器E中二进制序列的汉明距离:首先,判断E[0]是否等于1,等于1,则计数器sum增1;否则不变。下一个时钟,E序列向右移1位,同时重复判断E[0]以决定sum是否增1,直至E序列256比特位全部判断完为止。则计数器sum的最终结果,即为序列A、B转换成二值序列后的汉明距离。Step 2: Find the Hamming distance of the binary sequence in the register E: First, judge whether E[0] is equal to 1, and if it is equal to 1, the counter sum will be incremented by 1; otherwise, it will remain unchanged. At the next clock, the E sequence is shifted to the right by 1 bit, and at the same time, the E[0] is repeatedly judged to determine whether the sum is increased by 1 until all 256 bits of the E sequence are judged. Then the final result of the counter sum is the Hamming distance after the sequences A and B are converted into binary sequences.

采用本发明特征相似性度量优化方法的SIFT特征点匹配方法,因为进行了二值编码,并采用汉明距离替代欧式距离做为特征匹配的度量方式,所以只需一步异或运算便可实现相似性的度量,极大降低了匹配算法的硬件移植难度,提高了图像匹配的运算速度,并大大减少了硬件资源占用率,具有广泛的应用前景。Using the SIFT feature point matching method of the feature similarity measurement optimization method of the present invention, because binary encoding is performed, and the Hamming distance is used instead of the Euclidean distance as the feature matching measurement method, it only needs one-step XOR operation to achieve similarity. The performance measurement greatly reduces the difficulty of hardware transplantation of the matching algorithm, improves the operation speed of image matching, and greatly reduces the occupation rate of hardware resources, which has a wide application prospect.

为了验证本发明的效果,在相同硬件平台(FPGA)上比较了采用传统基于欧式距离的相似性度量和采用本发明相似性度量优化方法情况下,SIFT特征匹配算法所占用的硬件资源以及所能达到最高速率。表1显示了比较结果,其中欧式距离表示采用传统基于欧式距离的相似性度量,汉明距离表示采用本发明相似性度量方法。In order to verify the effect of the present invention, compared on the same hardware platform (FPGA) using the traditional similarity measure based on Euclidean distance and adopting the similarity measure optimization method of the present invention, the hardware resources taken by the SIFT feature matching algorithm and the performance reach the highest rate. Table 1 shows the comparison results, wherein the Euclidean distance means that the traditional similarity measure based on Euclidean distance is used, and the Hamming distance means that the similarity measure method of the present invention is used.

表1 不同相似性度量方式下的硬件资源和匹配速度对比Table 1 Comparison of hardware resources and matching speed under different similarity measurement methods

从表1中可以看出,在相同平台下(FPGA),采用本发明相似性度量方法的匹配方式要比传统方法占用更少的资源(前者约为后者的27.4%),却拥有更快的处理速度(速度提升约304%)。As can be seen from Table 1, under the same platform (FPGA), the matching method using the similarity measurement method of the present invention takes up less resources than the traditional method (the former is about 27.4% of the latter), but has a faster processing speed (about 304% faster).

Claims (5)

1.一种多维特征的相似性度量优化方法,其特征在于,首先基于原始的多维特征描述子的自身数据变化特征对其进行二值编码,得到二值编码特征;然后利用二值编码特征间的汉明距离来度量原始的多维特征描述子间的相似性;具体按照以下方法对原始的多维特征描述子进行二值编码:1. A similarity measurement optimization method of multi-dimensional feature, it is characterized in that, at first based on the self data change characteristic of original multi-dimensional feature descriptor it carries out binary encoding, obtains binary encoding feature; Then utilizes binary encoding feature The Hamming distance is used to measure the similarity between the original multidimensional feature descriptors; specifically, the original multidimensional feature descriptors are binary coded according to the following method: 设原始的多维特征描述子N为特征维数,上标f表示其数据类型为浮点数;令上标2表示其数据类型为二值数,i=1,2,…,N;即得到原始的多维特征描述子的二值编码特征;其中,根据的大小关系确定,大于的值为1/0,否则,的值为0/1,m∈{m|m=1,2,…,N-1};根据的差值与一预设阈值之间的大小关系确定,的差值大于所述阈值,则的值为1/0,否则,的值为0/1,m∈{m|m=1,2,…,N-1};Let the original multidimensional feature descriptor N is the feature dimension, and the superscript f indicates that its data type is a floating point number; let The superscript 2 indicates that its data type is a binary number, i=1, 2,..., N; that is, the binary encoding feature of the original multidimensional feature descriptor is obtained; among them, according to with The size relationship is determined, more than the but The value of 1/0, otherwise, The value of is 0/1, m∈{m|m=1,2,...,N-1}; according to with The magnitude relationship between the difference and a preset threshold value is determined, with The difference is greater than the threshold, then The value of 1/0, otherwise, The value of is 0/1, m∈{m|m=1,2,...,N-1}; 或者,设原始的多维特征描述子N为特征维数,上标f表示其数据类型为浮点数;令上标2表示其数据类型为二值数,i=1,2,…,N;即得到原始的多维特征描述子的二值编码特征;其中,根据的大小关系确定,大于的值为1/0,否则,的值为0/1,m∈{m|m=1,2,…,N};Alternatively, let the original multidimensional feature descriptor N is the feature dimension, and the superscript f indicates that its data type is a floating point number; let The superscript 2 indicates that its data type is a binary number, i=1, 2,..., N; that is, the binary encoding feature of the original multidimensional feature descriptor is obtained; among them, according to with The size relationship is determined, more than the but The value of 1/0, otherwise, The value of is 0/1, m∈{m|m=1,2,...,N}; 或者,设原始的多维特征描述子N为特征维数,上标f表示其数据类型为浮点数;令上标2表示其数据类型为二值数,i=1,2,…,N;即得到原始的多维特征描述子的二值编码特征;其中,根据的大小关系确定,大于的值为1/0,否则,的值为0/1;表示原始的多维特征描述子中各维特征分量的均值。Alternatively, let the original multidimensional feature descriptor N is the feature dimension, and the superscript f indicates that its data type is a floating point number; let The superscript 2 indicates that its data type is a binary number, i=1, 2,..., N; that is, the binary encoding feature of the original multidimensional feature descriptor is obtained; among them, according to with The size relationship is determined, more than the but The value of 1/0, otherwise, The value of is 0/1; Indicates the mean value of each dimension feature component in the original multidimensional feature descriptor. 2.如权利要求1所述多维特征的相似性度量优化方法,其特征在于,m的值为1。2. the similarity measurement optimization method of multi-dimensional feature as claimed in claim 1, is characterized in that, the value of m is 1. 3.如权利要求1或2所述多维特征的相似性度量优化方法,其特征在于,所述原始的多维特征描述子为图像SIFT特征描述子。3. The similarity measure optimization method of multi-dimensional feature as claimed in claim 1 or 2, is characterized in that, described original multi-dimensional feature descriptor is image SIFT feature descriptor. 4.如权利要求3所述多维特征的相似性度量优化方法,其特征在于,所述阈值为0.05。4. The similarity measure optimization method of multi-dimensional feature as claimed in claim 3, is characterized in that, described threshold is 0.05. 5.一种图像匹配方法,利用图像特征间的相似性进行图像匹配,其特征在于,所述图像特征间的相似性使用权利要求1~4任一项所述相似性度量优化方法进行度量。5. An image matching method, which uses the similarity between image features to perform image matching, characterized in that the similarity between the image features is measured using the similarity measurement optimization method described in any one of claims 1-4.
CN201510140372.9A 2015-03-27 2015-03-27 Multi-dimensional feature similarity measuring optimizing method and image matching method Expired - Fee Related CN104751459B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510140372.9A CN104751459B (en) 2015-03-27 2015-03-27 Multi-dimensional feature similarity measuring optimizing method and image matching method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510140372.9A CN104751459B (en) 2015-03-27 2015-03-27 Multi-dimensional feature similarity measuring optimizing method and image matching method

Publications (2)

Publication Number Publication Date
CN104751459A CN104751459A (en) 2015-07-01
CN104751459B true CN104751459B (en) 2017-05-17

Family

ID=53591078

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510140372.9A Expired - Fee Related CN104751459B (en) 2015-03-27 2015-03-27 Multi-dimensional feature similarity measuring optimizing method and image matching method

Country Status (1)

Country Link
CN (1) CN104751459B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106127243A (en) * 2016-06-22 2016-11-16 上海师范大学 A kind of image matching method describing son based on binaryzation SIFT
CN106778771A (en) * 2016-11-22 2017-05-31 上海师范大学 A kind of new two-value SIFT descriptions and its image matching method
CN108615006B (en) * 2018-04-23 2020-04-17 百度在线网络技术(北京)有限公司 Method and apparatus for outputting information

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103778619A (en) * 2012-10-17 2014-05-07 华中科技大学 Image matching method based on Zernike matrix
CN104081435A (en) * 2014-04-29 2014-10-01 中国科学院自动化研究所 Image matching method based on cascading binary encoding
CN104166977A (en) * 2013-05-17 2014-11-26 中国航空工业集团公司洛阳电光设备研究所 Image matching similarity measuring method and image matching method thereof
CN104392431A (en) * 2014-10-27 2015-03-04 华东师范大学 Image matching method based on image variable length coding

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102375987B (en) * 2010-08-17 2014-04-02 国基电子(上海)有限公司 Image processing device and image feature vector extracting and image matching method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103778619A (en) * 2012-10-17 2014-05-07 华中科技大学 Image matching method based on Zernike matrix
CN104166977A (en) * 2013-05-17 2014-11-26 中国航空工业集团公司洛阳电光设备研究所 Image matching similarity measuring method and image matching method thereof
CN104081435A (en) * 2014-04-29 2014-10-01 中国科学院自动化研究所 Image matching method based on cascading binary encoding
CN104392431A (en) * 2014-10-27 2015-03-04 华东师范大学 Image matching method based on image variable length coding

Also Published As

Publication number Publication date
CN104751459A (en) 2015-07-01

Similar Documents

Publication Publication Date Title
WO2017092183A1 (en) Image retrieval method based on variable-length deep hash learning
US11551785B2 (en) Gene sequencing data compression preprocessing, compression and decompression method, system, and computer-readable medium
CN104142984A (en) A Video Fingerprint Retrieval Method Based on Coarse and Fine Granularity
CN105095435A (en) Similarity comparison method and device for high-dimensional image features
CN102750339B (en) Positioning method of repeated fragments based on video reconstruction
CN101710334A (en) Large-scale image library retrieving method based on image Hash
CN103425639A (en) Similar information identifying method based on information fingerprints
CN102508910A (en) Image retrieval method based on minimum projection errors of multiple hash tables
CN104484673A (en) Data complementation method for pattern recognition application of real-time data flow
CN104217222B (en) A kind of image matching method represented based on stochastical sampling Hash
CN106980656A (en) A kind of searching method based on two-value code dictionary tree
CN104123375A (en) Data search method and system
CN106778079A (en) A kind of DNA sequence dna k mer frequency statistics methods based on MapReduce
US20200294629A1 (en) Gene sequencing data compression method and decompression method, system and computer-readable medium
CN103226585A (en) Self-adaptation Hash rearrangement method for image retrieval
CN112069875A (en) Face image classification method, device, electronic device and storage medium
CN104751459B (en) Multi-dimensional feature similarity measuring optimizing method and image matching method
CN105183792B (en) Distributed fast text classification method based on locality sensitive hashing
CN111370064A (en) Rapid gene sequence classification method and system based on SIMD hash function
Li et al. Coverless Video Steganography Based on Frame Sequence Perceptual Distance Mapping.
CN106549675A (en) A kind of average dependent quadrature matching pursuit algorithm based on compressed sensing
CN103064887B (en) A kind of method and apparatus of recommendation information
CN105005464B (en) A kind of Burrows Wheeler mapping hardware processing units
CN110083743A (en) A kind of quick set of metadata of similar data detection method based on uniform sampling
CN114817651A (en) Data storage method, data query method, apparatus and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170517

Termination date: 20210327