[go: up one dir, main page]

CN115189675B - A high-order FIR digital filter based on reduced calculation - Google Patents

A high-order FIR digital filter based on reduced calculation

Info

Publication number
CN115189675B
CN115189675B CN202210873320.2A CN202210873320A CN115189675B CN 115189675 B CN115189675 B CN 115189675B CN 202210873320 A CN202210873320 A CN 202210873320A CN 115189675 B CN115189675 B CN 115189675B
Authority
CN
China
Prior art keywords
bit
node
module
lookup table
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210873320.2A
Other languages
Chinese (zh)
Other versions
CN115189675A (en
Inventor
宋宇鲲
刘文娟
蔡阳
曾树铭
张多利
倪伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202210873320.2A priority Critical patent/CN115189675B/en
Publication of CN115189675A publication Critical patent/CN115189675A/en
Application granted granted Critical
Publication of CN115189675B publication Critical patent/CN115189675B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03HIMPEDANCE NETWORKS, e.g. RESONANT CIRCUITS; RESONATORS
    • H03H17/00Networks using digital techniques
    • H03H17/02Frequency selective networks
    • H03H17/06Non-recursive filters
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03HIMPEDANCE NETWORKS, e.g. RESONANT CIRCUITS; RESONATORS
    • H03H17/00Networks using digital techniques
    • H03H2017/0072Theoretical filter design
    • H03H2017/0081Theoretical filter design of FIR filters

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Complex Calculations (AREA)

Abstract

本发明提出一种基于规约计算的高阶FIR数字滤波器,滤波器的抽头为n个一位的二值随机系数,每个采样周期下滤波器输入一采样点并计算此采样点与前n‑1个采样点组成的n个采样点和n个二值随机系数的乘累加结果。规定采样点与二值随机系数的相乘结果:系数为1则相乘结果为原采样点的值,否则相乘结果为采样点的相反数。本发明基于这种规约计算设计了一种基于数据广播结构的高阶FIR数字滤波器,将采样点与二值随机系数进行n项组合且利用组合后的部分和为有限项的规律,用一深度较小的查找表压缩了各加法节点处计算部分和所需要的大量重复加法,使滤波器各加法节点通过与查找表各表项的地址匹配获得相应的部分和,大幅压缩加法器的使用。

The present invention proposes a high-order FIR digital filter based on reduced calculation. The filter taps are n single-bit binary random coefficients. During each sampling period, the filter inputs a sampling point and calculates the product of the n sampling points and the previous n-1 sampling points, which are composed of n sampling points and n binary random coefficients. The multiplication result of the sampling point and the binary random coefficient is specified: if the coefficient is 1, the multiplication result is the value of the original sampling point; otherwise, the multiplication result is the inverse of the sampling point. Based on this reduced calculation, the present invention designs a high-order FIR digital filter based on a data broadcast structure. The sampling point and the binary random coefficient are combined into n items. The partial sum of the combined items is a finite number of items. A lookup table with a small depth is used to compress the large number of repeated additions required to calculate the partial sum at each addition node. Each addition node of the filter obtains the corresponding partial sum by matching the address of each table entry in the lookup table, which greatly reduces the use of adders.

Description

High-order FIR digital filter based on protocol calculation
Technical Field
The invention relates to the technical field of time domain capturing, in particular to a high-order FIR digital filter based on protocol calculation.
Background
At present, the FIR digital filter is widely applied in the fields of signal processing and the like, and can ensure strict phase characteristics when the amplitude characteristics are met in the design. Along with the increasing requirements of the system on speed and real-time performance, the research of the high-performance filter has important theoretical significance and engineering value.
The time domain expression of the FIR digital filter is:
h (N) is a coefficient of the filter, x (N-i) is an input sampling point, y (N) is a multiplication and accumulation result of the input sampling point and the coefficient of the filter, and N is an order of the filter. The FIR digital filter consists of a limited number of sampling points, and convolution sum operation in the left formula is simplified into multiplication and accumulation results of N sampling points and N coefficients in the right formula under each sampling period.
In practice, the larger the order of the FIR, the larger the coefficient length corresponding to h (n), and the higher the hardware complexity of the final implementation. The FPGA implementation of the prior high-order FIR digital filter mainly comprises an FIR digital filter based on a MAC structure, an FIR digital filter based on a distributed algorithm and an FIR digital filter based on FFT. The limitations of FPGA implementation of these three FIR digital filters were analyzed separately as follows:
First, for a FIR digital filter based on a distributed algorithm, the input N-bit data is disassembled into N-bit binary numbers, and a lookup table with a depth of 2 N is built based on all multiplication results of the N filter coefficients and the N-bit binary numbers. While this approach saves multiplier resource consumption and can be optimized by partitioning the look-up table to reduce the look-up table resource consumption and increasing the number of look-up tables to increase the computation speed, the look-up table resource consumption is still significant for higher order FIR digital filters.
Second, for the FIR digital filter based on FFT, the convolution operation of the sampling point and coefficient input by the filter is transformed into multiplication operation of frequency domain by FFT, and finally the multiplication result is restored into time domain sequence by IFFT. While the resources of the FPGA occupied are somewhat reduced, its system performance is very limited by its data operation delay.
Thirdly, for the FIR digital filter based on the multiplier structure, particularly for the high-order FIR digital filter based on the data broadcasting structure, firstly, the input data is broadcasted to each multiplier and then sent to each adding node for delay accumulation, so that the real-time performance and the high efficiency of data processing are well ensured. FPGAs present significant advantages in terms of processing speed. However, for the high-order FIR digital filter, the number of adders in the hardware implementation is huge in order to ensure the real-time performance and high efficiency of the processing speed. Therefore, it is necessary to save hardware resource consumption for implementing the higher order FIR digital filter by compressing the adder.
Disclosure of Invention
The invention provides a high-order FIR digital filter based on protocol calculation, which is characterized by comprising a combination module, a lookup table, a broadcasting module, a Gate node, a Pipeline module, a hard-wired module, an addition node and a buffer module, wherein the lookup table is used for storing the combination of the high-order FIR digital filter and the hard-wired module;
the combination module carries out multiple combinations on the sampling points;
the lookup table stores partial sums formed by the multiple combined sampling points and corresponding addresses through protocol calculation according to the order of small addresses and large addresses, and the addresses are binary numbers after multiple combination.
The broadcasting module is used for transmitting the part in the lookup table and the address to the corresponding Gate node;
The Gate node stores the partial sum and adds a flag bit to the partial sum;
The Pipeline module is connected with the Gate node, and outputs the data output by the Gate node in the period after synchronizing;
the hard-wired module is connected with the Pipeline module, and simultaneously outputs the part of the sum or the opposite number of the part of the sum to the addition node according to the zone bit for the value of each addition node which is input in a time-sharing way;
The addition node adopts a time-sharing multiplexing strategy and is used for accumulating the input data from the hard-wired module and the input data of the cache module and registering the accumulated result in the cache module;
The buffer module is arranged between the addition points, and the buffer module buffers the data with preset period and inputs the data into the adjacent addition nodes.
Furthermore, the combination module firstly combines n items of new sampling points in the current sampling period with n sampling points in the previous n-1 sampling periods, n items of combination are also needed for the binary random coefficients of the filter with the original length of m, and the partial sum formed by the sampling points and the binary random coefficients after the n items of combination through protocol calculation is calculated.
Further, the address of the lookup table is fully listed by 2 n-1 possible sequences of n-bit binary numbers with the highest address of 0, and each address corresponds to a partial sum formed by the n-bit binary numbers and the current n sampling points through reduction calculation.
Furthermore, the number of the Gate nodes is m/n, the Gate nodes are provided with node addresses, and the node addresses are determined by binary random coefficients corresponding to the positions of the Gate nodes after n combinations;
If the main working frequency of the system is k times of the sampling frequency, each Gate node of the broadcasting module compares the n-bit node address of the Gate node with the n-bit lookup table address of the current broadcasting in every 1/2 n-1 cycles if k is more than 2 n-1, otherwise, each Gate node of the broadcasting module compares the n-bit node address of the Gate node with the 2 n-1/k n-bit lookup table addresses of the current broadcasting in every 1/k cycles if k is less than 2 n-1;
If the n-bit Gate node address is the opposite number of the n-bit lookup table address currently broadcast, the Gate node saves the partial sum corresponding to the lookup table address and adds a bit flag bit 0 at the highest bit during output, and if the n-bit Gate node address is the opposite number of the n-bit lookup table address currently broadcast, the Gate node saves the partial sum corresponding to the lookup table address and adds a bit flag bit 1 at the highest bit during output.
Further, the hard-wired module comprises a shift register set, an inverting unit and a one-out-of-two multiplexer, wherein the shift register set is used for inputting values from each node of the Pipeline module; the inverting unit is used for generating the opposite numbers of the input parts and the data output by the shift register set are respectively and directly input into the one-out-of-two multiplexer and the one-out-of-two multiplexer after passing through the inverting unit;
The flag bit output by the shift register set is used as a flag signal of the alternative multiplexer, if the flag signal is 0, the alternative multiplexer output part and the opposite number of the alternative multiplexer output part and the entering addition node enter the addition node, and if the flag signal is 1.
Furthermore, the buffer module is arranged between two addition nodes and is composed of a C1-Cn+1 common n+1 level shift register group, and 1 sampling period is divided into k system working periods;
In the 1 st to the k-1 st system working periods, each Cn+1 system working period receives the result of the previous adder through the last register in the stage, and shifts upwards in the stage, the result is wholly shifted into Cn until the kth system working period, and the data input in the kth system working period directly enters the last register of Cn;
After 1 sampling period, C2-Cn moves data to the next stage register, and each system working period of C1 outputs the result to the adder through the first register of the stage and shifts upwards in the stage.
The beneficial effects of the invention are as follows:
The invention realizes a high-order FIR digital filter based on protocol calculation, n-term combination is carried out on sampling points and binary random coefficients, the rule that the combined partial sum is a limited term is utilized, a lookup table with smaller depth is used for compressing the calculated part at each addition node and a large number of repeated addition needed, so that each addition node of the filter is matched with the address of each table item of the lookup table to obtain the corresponding partial sum, and the adder is greatly compressed for use. For higher order FIR digital filters with larger orders, the present invention saves more significant advantages of the adder.
Drawings
FIG. 1 is a block diagram of a high-order FIR digital filter based on protocol computation according to an embodiment of the present invention;
Fig. 2 is a block diagram of a hard-wired module in a high-order FIR digital filter based on protocol computation according to an embodiment of the present invention.
Fig. 3 is a block diagram of a single summing node input/output within a higher order FIR digital filter based on protocol computation according to an embodiment of the present invention.
Detailed Description
The technical scheme of the present invention will be described in more detail with reference to the accompanying drawings, and the present invention includes, but is not limited to, the following examples.
The invention aims to provide a high-order FIR digital filter based on protocol calculation, which comprises a combination module, a lookup table, a broadcasting module, a Gate node, a pipeline module, a hard-wired module, an addition node and a buffer module.
The combination module combines the sampling point X i+n under the new sampling period with the sampling point X i+n-1,...,Xi+2,Xi+1 of the previous n-1 sampling periods by n items, the operation is equivalent to a shift register group with the length of n, the sampling point X i+n under the new sampling period enters from the lower side of the shift register group, the sampling point of the corresponding previous n-1 sampling periods slides upwards by 1 length, X 0 slides upwards out of the shift register group, and for the binary random coefficient of the filter, the binary coefficient of the filter with the original length of m is grouped into n items according to the sequence from left to right.
The lookup table combines n terms of the sampling point X i+n in the new sampling period with the sampling point X i+n-1,...,Xi+2,Xi+1 in the first n-1 sampling periods, and then calculates 2 n possible partial sums formed by n-bit binary numbers through reduction.
To further compress the resource, the sum of the parts corresponding to the address 0 and the address 2 n-1 -1 is the opposite number, and the sum of the parts corresponding to the address 1 and the address 2 n-1 -2 is also the opposite number, so that the lookup table entry to be established is only the sum of the parts with the highest address of 0, and the number of the corresponding lookup table entry can be reduced from 2 n to 2 n-1.
A lookup table with depth of 2 n-1 is built, wherein, the lookup table is fully enumerated by 2 n-1 possible sequences with n-bit binary numbers and the address most significant bit of 0, and each table entry address corresponds to the partial sum formed by the n-bit binary numbers and the current n sampling points through reduction calculation.
Specifically, according to the protocol calculation feature, if the binary coefficient is 0000, the sum of the binary coefficient and the sampling point ABCD is accumulated into a partial sum y1= - (ABCD), if the binary coefficient is 1111, the sum of the binary coefficient and the sampling point ABCD is accumulated into a partial sum y2=abcd, it can be seen that Y1 and Y2 are in an opposite number relationship, so if the broadcast address of the current period is 0000, the address of the gate node is 1111, the gate node stores the partial sum corresponding to the address 0000, and a flag bit is output at the highest bit, which indicates that the partial sum actually needs to be transferred by the gate node is the opposite number of the currently stored partial sum. In summary, only the partial sums corresponding to the 0xxx addresses need to be established.
When broadcasting, the broadcasting module matches the address corresponding to the table item with the address of each adding node of the Gate node, and if the matching is successful, the corresponding Gate node adding node obtains the corresponding partial sum.
The number of Gate nodes is m/n, and the node addresses of the Gate nodes are respectively determined by binary random coefficients corresponding to the positions of the Gate nodes after n combinations.
Specifically, assuming that the system operating main frequency is k times of the sampling frequency, if k is greater than 2 n-1, each Gate node of the broadcasting module compares the n-bit node address with the n-bit lookup table address currently broadcast in every 1/2 n-1 cycles, otherwise if k is less than 2 n-1, each Gate node of the broadcasting module compares the n-bit node address with the 2 n-1/k n-bit lookup table addresses currently broadcast in every 1/k cycles;
If the n-bit Gate node address is the opposite number of the currently broadcast n-bit lookup table address, the Gate node saves the corresponding partial sum of the lookup table address and adds a bit flag bit 0 to the highest bit during output, and if the n-bit Gate node address is the opposite number of the currently broadcast n-bit lookup table address, the Gate node saves the corresponding partial sum of the lookup table address and adds a bit flag bit 1 to the highest bit during output.
Up to this point, 2 n-1 parts and the parts received between the addition nodes are fully enumerated by a lookup table and broadcast to the addition nodes, a large number of repeated items are needed in the part and the parts received between the addition nodes, and the lookup table with the depth of 2 n-1 is used for compressing a large number of repeated addition after n items of combination of each addition node and a binary random coefficient are respectively carried out.
The number of nodes in the Pipeline module is the same as the number of Gate nodes, and the nodes are m/n and are in one-to-one correspondence. The Pipeline module is used for synchronizing the output value of each Gate node to the hard-wired module due to the time difference of the output part of each Gate node.
The hard-wired module comprises a shift register group, an inverting unit and an alternative multiplexer. If the system working main frequency is k times of the sampling frequency, a time-sharing multiplexing strategy is adopted for the addition nodes, wherein m/n nodes from the Pipeline module are divided into k groups, and the output of (m/n)/k nodes from the Pipeline module is calculated under 1/k system working periods, so that the total consumption of the number of the addition nodes is only (m/n)/k.
The shift register group is a first-in first-out memory used for acquiring the value output by the Pipeline module from the input end according to the time-sharing calculation sequence of the addition node, and the marker bit data of the output end is directly connected with the one-out-of-two multiplexer and is partially and respectively input into the one-out-of-two multiplexer and the inverting unit.
The inverting unit is used for generating the opposite number of the input part sum, and the opposite number is input into the alternative multiplexer.
The flag bit output by the shift register group is used as a flag signal of the alternative multiplexer, if the flag signal is 0, the alternative multiplexer output part and the opposite number of the alternative multiplexer output part and the entering addition node enter the addition node, and if the flag signal is 1.
Thus, by the device, the length of the original filter is changed from m to (m/n)/k, and the number of addition nodes of the corresponding filter is (m/n)/k.
The buffer module is arranged between the addition nodes, realizes the concatenation of the addition nodes, buffers the data of n sampling periods and inputs the data into the next addition node.
Specifically, n-term combination is performed on the sampled data and the binary random coefficient of the filter, and according to the register retiming principle, 1 sampling period delay between addition nodes in the original filter needs to be changed into n sampling period delay, so that C1-Cn total n-level register is needed, and because k time division multiplexing processing of system working periods is adopted on the addition nodes, cn+1-level register is needed to be additionally added. Thus, C1-Cn-1 respectively form the delay of one sampling period, and Cn and Cn+1 form the delay of one sampling period.
The 1 sampling period is now divided into k system operating periods, and the number of shift registers in each stage of registers corresponds to a single system operating period. Then in the 1 st to k-1 st system working period, C2-Cn is kept, and the whole shift is carried out to the next stage register in the k system working period, and meanwhile the data of the previous stage whole is received.
For Cn+1, in the 1 st to k-1 st system working periods, each Cn+1 system working period receives the result of the previous adder through the last register in the stage, and shifts upwards in the stage, the whole is moved into Cn until the k system working period, and the data input in the k system working period directly enter the last register of Cn.
After 1 sampling period, C2-Cn moves data to the next stage register, and each system working period of C1 outputs the result to the adder through the first register of the stage and shifts upwards in the stage.
It is noted that since Cn+1 requires one system duty cycle to write the result of the previous stage to the last register in the present stage, cn+1 only writes k-1 number during the kth cycle, and then Cn+1 is shifted to Cn as a whole during the kth cycle.
In one embodiment, the length of the original FIR digital filter is 4096, the system operating frequency is 96MHz, and each 12MHz sampling frequency inputs a sample data with a bit width of 16 bits, which requires a "multiplication" accumulation result of 4096 16 bit sampling points and 4096 1 bit binary random coefficients to be calculated at each 12MHz sampling frequency. The binary random coefficient corresponds to the coefficient of the FIR filter, the value is 0 or 1, and the input sampling point corresponds to the data input of the FIR filter. The result of the 'multiplication' of the 1 sampling point and the 1-bit binary coefficient is that the multiplication result is the value of the sampling point when the coefficient is 1, otherwise, the multiplication result is that the value of the sampling point takes the opposite number.
After new sampling data is input in each sampling period by the FIR filter based on the data broadcasting structure before combination, the new sampling data is broadcast to 4096 addition nodes, and the value received by each addition node is determined to be the value of the current sampling point or the value obtained by taking the opposite number from the current sampling point according to the single-bit binary random coefficient 0 or 1 at the corresponding position. At this time, the number of adders to be used in each sampling period is 4096, so that the consumption of adder resources is excessive in this case.
Thus, in the present embodiment, the operation of combining the sampling point and the binary random coefficient of the filter by 4 items respectively is selected, namely, the sampling point X i+4 under the new sampling period and the sampling point X i+3,Xi+2,Xi+1 of the first 3 sampling periods form 4 sampling point combinations, which is equivalent to a shift register group with the length of 4, the sampling point X i+4 under the new sampling period enters from the lower side of the shift register group, the sampling point of the corresponding first 3 periods slides upwards by 1 length, and X 0 slides upwards out of the shift register group, and meanwhile, the binary coefficient of the filter with the length of 4096 is grouped by 4 items in sequence from left to right. After the 4-item combination operation, the length of the filter is reduced to 1024 from 4096.
Second, all possible results of the reduction calculation using a combination of 4 sampling points per sampling period and a 4-bit binary random number have a rule of at most 16 possible results, so that the sum of the calculated parts between the addition nodes must have a large number of repeated terms. In order to further save adder resources, only the table entry of the corresponding parts of the addresses 0000 to 0111 is required to be established, and the corresponding parts of the addresses 1000 to 1111 can be directly obtained by directly reversing the corresponding address parts. In sum, 24 adders consumed by each table item in the lookup table with depth of 8 are used for further compressing the repeated addition after 4 items of combination of each addition node and the binary random coefficient respectively, so that each addition node of the filter obtains corresponding partial sum through matching with the address of each table item of the lookup table.
As shown in figure 1, in view of the 8-frequency multiplication relation between the sampling frequency 12MHz and the system operating frequency 96MHz, partial sums in the compressed lookup table entries can be generated in 8 working periods and are broadcast to 1024 Gate nodes in a time-sharing manner according to the sequence of addresses 0000 to 0111, and the broadcast content is divided into two parts, wherein one part is a 4-bit broadcast coefficient and corresponds to the address corresponding to each table entry of the lookup table, and the other part is an 18-bit partial sum corresponding to each table entry address of the lookup table. Specifically, when each table entry of the lookup table is divided into periods to broadcast the address and the corresponding partial sums, each Gate node compares the 4-bit address with the 4-bit broadcast coefficient in each period, if the two are equal, the Gate node outputs the corresponding partial sums, and adds a bit of flag bit 0 at the highest bit during output for marking the received weighted sums, and then the weighted sums can be sent to an addition node for direct accumulation, otherwise, if the 4-bit local coefficient is equal to the 4-bit broadcast coefficient and inverted according to the bit, the Gate node outputs the corresponding partial sums, and adds a bit of flag bit 1 at the highest bit during output for marking the received weighted sums to be sent to the addition node for accumulation after the opposite number is needed to be obtained. The data format output by each Gate node is {1 bit flag bit, 18 bit part and }. In view of the fact that the partial sums output by all Gate nodes have time differences, in order to ensure that all addition nodes can receive corresponding partial sums and sign signals at the same time, when all Gate nodes receive the sign signals and the corresponding partial sums and output, a level-one Pipeline module is added between the Gate nodes and the addition node level to perform level-one buffering, all Gate node output signals are synchronized, and the number of nodes in the Pipeline module is 1024 identical to that of Gate nodes and corresponds to that of Gate nodes one by one.
As shown in FIG. 2, the hardwired module comprises a shift register set, an inverting unit and an alternative multiplexer. Because the system working main frequency 96MHz is 8 times of the sampling frequency 12MHz, 1 sampling period is divided into 8 system working periods, a time-sharing multiplexing strategy is adopted for the adding nodes, 128 adding nodes calculate the output of 1024 nodes from the Pipeline module in 8 periods, the number of registers in the shift register set is 8, the values from all nodes of the Pipeline module enter the shift register set from the right in a time-sharing manner according to the sequence of the calculating periods, and the values entering from adjacent periods move leftwards. The inverting unit is used for generating the opposite number of the input part sum, wherein the flag bit output by the shift register set is used as a flag signal of the alternative multiplexer, if the flag signal is 0, the alternative multiplexer output part sum enters the adding node, and if the flag signal is 1, the opposite number of the alternative multiplexer output part sum enters the adding node.
As shown in fig. 3, the buffer module is disposed between the addition nodes, and realizes the concatenation of the addition nodes, and the buffer module buffers the data of 4 periods and inputs the data into the next addition node.
Specifically, because 4 combinations are respectively performed on the sampled data and the binary random coefficients of the filter, according to the register retiming principle, the delay of 1 sampling period between the addition nodes in the original filter needs to be changed into the delay of 4 sampling periods, so that four-stage registers of C1, C2, C3 and C4 are needed, and because 8-period time division multiplexing processing is adopted for the addition nodes, a C5-stage register is needed to be additionally added. Thus, C1, C2, C3 each constitute a delay of one sample period, and C4 and C5 constitute a delay of one sample period.
The 1 sampling period is now divided into 8 system operating periods, and the number of shift registers in each stage of registers corresponds to a single system operating period. Then C2, C3, C4 remain in the 1 st through 7 th system cycles, shifting to the next stage register global by the 8 th system cycle while receiving the data of the previous stage global.
For C5, during the 1 st to 7 th system cycles, each C5 system cycle receives the result of the previous adder through the last register in the stage and shifts up in the stage, shifting in its entirety into C4 by the 8 th system cycle.
For C1, each system duty cycle outputs the result to the adder through the first register of the stage and shifts up in the stage.
It is noted that since C5 requires a system duty cycle to write the result of the previous stage to the last register in this stage, C5 only writes 7 numbers during the 8 th cycle, and when the 8 th cycle shifts C5 overall to C4,
The present invention is not limited to the above embodiments, and those skilled in the art can implement the present invention in various other embodiments according to the examples and the disclosure of the drawings, so that the design of the present invention is simply changed or modified while adopting the design structure and concept of the present invention, and the present invention falls within the scope of protection.

Claims (6)

1.一种基于规约计算的高阶FIR数字滤波器,其特征在于,所述高阶FIR数字滤波器包括组合模块、查找表、广播模块、Gate节点、Pipeline模块、硬连线模块、加法节点和缓存模块;1. A high-order FIR digital filter based on reduced calculation, characterized in that the high-order FIR digital filter includes a combination module, a lookup table, a broadcast module, a gate node, a pipeline module, a hard-wired module, an addition node and a cache module; 所述组合模块将采样点进行多项组合;The combination module performs multiple combinations of sampling points; 所述查找表按地址由小及大的顺序存储所述多项组合采样点与对应地址经规约计算形成的部分和,所述地址为多项组合后的二进制数;The lookup table stores the partial sums formed by reducing the multiple combination sampling points and the corresponding addresses in order of addresses from small to large, wherein the addresses are binary numbers after the multiple combinations; 所述广播模块用于将所述查找表中所述部分和根据地址输送至对应的所述Gate节点;The broadcast module is used to transmit the portion in the lookup table and the address to the corresponding Gate node; 所述Gate节点保存所述部分和,并对所述部分和增加标志位;The Gate node saves the partial sum and adds a flag bit to the partial sum; 所述Pipeline模块与所述Gate节点连接,对周期内所述Gate节点输出的数据进行同步后输出;The Pipeline module is connected to the Gate node, and synchronizes and outputs the data output by the Gate node within a cycle; 所述硬连线模块与所述Pipeline模块连接,同时对分时输入各加法节点的值依据所述标志位输出所述部分和或所述部分和的相反数至所述加法节点;The hard-wired module is connected to the Pipeline module, and simultaneously outputs the partial sum or the inverse of the partial sum to the addition node according to the flag bit for the time-sharing input value of each addition node; 所述加法节点采取分时复用策略,用于对来自所述硬连线模块的输入数据以及所述缓存模块的输入数据进行累加,并将累加结果寄存在所述缓存模块;The adding node adopts a time-division multiplexing strategy to accumulate the input data from the hard-wired module and the input data from the cache module, and stores the accumulated result in the cache module; 所述缓存模块由设置在各加法加点之间,所述缓存模块缓存预设周期的数据后将其输入相邻的所述加法节点。The cache module is arranged between each addition node, and the cache module caches data of a preset period and then inputs the data into the adjacent addition node. 2.根据权利要求1所述高阶FIR数字滤波器,其特征在于,所述组合模块先将当前采样周期下新的采样点与其前n-1个采样周期共n个采样点进行n项组合;原长度为m的滤波器的二值随机系数也需进行n项组合,并计算n项组合后采样点与二值随机系数经规约计算形成的部分和。2. The high-order FIR digital filter according to claim 1 is characterized in that the combination module first performs n-item combinations on the new sampling points in the current sampling period and the n sampling points in the previous n-1 sampling periods, a total of n items; the binary random coefficients of the filter with an original length of m also need to be combined in n items, and the partial sum formed by the reduced calculation of the sampling points and the binary random coefficients after the n-item combination is calculated. 3.根据权利要求2所述高阶FIR数字滤波器,其特征在于,所述查找表的地址由n位二进制数且地址最高位为0的2n-1种可能排序全列举;每个所述地址对应该n位二进制数与当前n个采样点经规约计算形成的部分和。3. The high-order FIR digital filter according to claim 2, wherein the addresses of the lookup table are a complete list of 2n -1 possible orderings of n-bit binary numbers with the most significant bit of the address being 0; each of the addresses corresponds to a partial sum formed by the reduced calculation of the n-bit binary number and the current n sampling points. 4.根据权利要求3所述高阶FIR数字滤波器,其特征在于,所述Gate节点数量为m/n,所述Gate节点设有节点地址,所述节点地址由其所在位置对应的经n项组合后的二值随机系数确定;4. The high-order FIR digital filter according to claim 3, wherein the number of the gate nodes is m/n, the gate nodes are provided with node addresses, and the node addresses are determined by the binary random coefficients corresponding to the locations thereof after n combinations; 假设系统工作主频是采样频率的k倍,若k>2n-1,所述广播模块每1/2n-1个周期下各所述Gate节点将其n位节点地址与当前广播的n位查找表地址进行对比;否则若k<2n-1,所述广播模块每1/k个周期下各所述Gate节点将其n位节点地址与当前广播的2n-1/k个n位查找表地址进行对比;Assuming that the system operating main frequency is k times the sampling frequency, if k> 2n-1 , the broadcast module compares each Gate node's n-bit node address with the currently broadcast n-bit lookup table address every 1/ 2n-1 cycle; otherwise, if k<2n -1 , the broadcast module compares each Gate node's n-bit node address with the currently broadcast 2n -1 /k n-bit lookup table addresses every 1/k cycle; 若n位Gate节点地址与当前广播的n位查找表地址相等,则此Gate节点保存该查找表地址对应的部分和,并在输出时在最高位增加一位标志位0;若此n位Gate节点地址是当前广播的n位查找表地址的相反数,则此Gate节点保存该查找表地址对应的部分和,并在输出时在最高位增加一位标志位1。If the n-bit Gate node address is equal to the currently broadcast n-bit lookup table address, this Gate node saves the partial sum corresponding to the lookup table address and adds a flag bit 0 to the highest bit when outputting; if the n-bit Gate node address is the opposite of the currently broadcast n-bit lookup table address, this Gate node saves the partial sum corresponding to the lookup table address and adds a flag bit 1 to the highest bit when outputting. 5.根据权利要求4所述高阶FIR数字滤波器,其特征在于,所述硬连线模块包括移位寄存器组、取反单元和二选一多路选择器:移位寄存器组用于输入来自所述Pipeline模块各节点的值;取反单元用于生成输入部分和的相反数;所述移位寄存器组输出的部分和数据分别直接输入二选一多路选择器和通过取反单元后输入二选一多路选择器;5. The high-order FIR digital filter according to claim 4, wherein the hard-wired module comprises a shift register group, an inverting unit, and a two-to-one multiplexer: the shift register group is used to input values from each node of the pipeline module; the inverting unit is used to generate the inverse of the input partial sum; the partial sum data output by the shift register group are respectively input directly into the two-to-one multiplexer and then input into the two-to-one multiplexer after passing through the inverting unit; 所述移位寄存器组输出的标志位作为二选一多路选择器的标志信号,若标志信号为0,二选一多路选择器输出部分和进入加法节点;若标志信号为1,二选一多路选择器输出部分和的相反数进入加法节点。The flag bit output by the shift register group serves as a flag signal of a two-to-one multiplexer. If the flag signal is 0, the two-to-one multiplexer outputs a partial sum that enters the addition node; if the flag signal is 1, the opposite of the two-to-one multiplexer outputs a partial sum that enters the addition node. 6.根据权利要求4所述高阶FIR数字滤波器,其特征在于,所述缓存模块设置两个加法节点之间,由C1~Cn+1共n+1级移位寄存器组构成,1个采样周期分为k个系统工作周期;6. The high-order FIR digital filter according to claim 4, wherein the buffer module is arranged between two addition nodes and is composed of a shift register group of n+1 stages from C1 to Cn+1, and one sampling period is divided into k system working periods; 在第1至第k-1个系统工作周期内,Cn+1每个系统工作周期通过本级内最后一个寄存器接收前一个加法器的结果,并在级内向上移位,至第k个系统工作周期时整体移入Cn,且第k个系统工作周期时输入的数据直接进入Cn的最后一个寄存器;During the 1st to k-1th system working cycles, Cn+1 receives the result of the previous adder through the last register in this stage in each system working cycle, and shifts it upward within the stage. When it reaches the kth system working cycle, it is shifted into Cn as a whole, and the data input during the kth system working cycle directly enters the last register of Cn; 在经历1个采样周期后,C2~Cn向下一级寄存器中移动数据,C1每个系统工作周期通过本级第一个寄存器输出结果至加法器,并在级内向上移位。After one sampling cycle, C2~Cn move data to the next level register. C1 outputs the result to the adder through the first register of this level in each system working cycle and shifts it upward within the level.
CN202210873320.2A 2022-07-22 2022-07-22 A high-order FIR digital filter based on reduced calculation Active CN115189675B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210873320.2A CN115189675B (en) 2022-07-22 2022-07-22 A high-order FIR digital filter based on reduced calculation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210873320.2A CN115189675B (en) 2022-07-22 2022-07-22 A high-order FIR digital filter based on reduced calculation

Publications (2)

Publication Number Publication Date
CN115189675A CN115189675A (en) 2022-10-14
CN115189675B true CN115189675B (en) 2025-09-02

Family

ID=83520547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210873320.2A Active CN115189675B (en) 2022-07-22 2022-07-22 A high-order FIR digital filter based on reduced calculation

Country Status (1)

Country Link
CN (1) CN115189675B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009072197A1 (en) * 2007-12-05 2009-06-11 Mitsubishi Electric Corporation Digital filter, precoder, and transmission system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6724329B2 (en) * 2002-04-24 2004-04-20 Intel Corporation Decision feedback equalization employing a lookup table
US7386581B2 (en) * 2003-12-31 2008-06-10 Stmicroelectronics, Inc. Performance FIR filter
US20060036664A1 (en) * 2004-08-13 2006-02-16 Nokia Corporation Efficient FIR filter suitable for use with high order modulation radio frequency transmitters
CN114614795A (en) * 2022-03-07 2022-06-10 中科南京移动通信与计算创新研究院 A high-order polyphase digital filtering sampling method and system

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009072197A1 (en) * 2007-12-05 2009-06-11 Mitsubishi Electric Corporation Digital filter, precoder, and transmission system

Also Published As

Publication number Publication date
CN115189675A (en) 2022-10-14

Similar Documents

Publication Publication Date Title
KR20010034300A (en) Pipelined fast fourier transform processor
CN106484658A (en) The device and method of 65536 points of pulse compressions is realized based on FPGA
CN101937423A (en) A Pipeline FFT/IFFT Processing System
CN101697486A (en) Two-dimensional wavelet transform integrated circuit structure
CN115189675B (en) A high-order FIR digital filter based on reduced calculation
WO2007068438A2 (en) Circular fast fourier transform
CN103117730A (en) Multichannel comb filter and implement method thereof
CN110096672A (en) Inexpensive pipeline-type fft processor implementation method based on FPGA
CN110620566A (en) FIR filtering system based on combination of random calculation and remainder system
Kala et al. High throughput, low latency, memory optimized 64K point FFT architecture using novel radix-4 butterfly unit
CN103237219A (en) Two-dimensional discrete cosine transformation (DCT)/inverse DCT circuit and method
CN102751963B (en) Based on configurable wavelet transform circuit and its implementation of multiply-accumulator ring
CN109951173B (en) FIR filtering method and filter for multi-channel parallel input and parallel processing
Laguri et al. VLSI implementation of efficient split radix FFT based on distributed arithmetic
CN115242220A (en) Order Dynamically Reconfigurable Folding Circuit Structure Digital Shaping Filter and Design Method
Wu et al. Programmable wavelet packet transform processor
CN1323105A (en) Correlator
CN108616265A (en) A kind of circuit structure of the RNS DWT filter groups based on five mould remainder bases
KR100576520B1 (en) Variable Fast Fourier Transform Processor Using Iterative Arithmetic
Lofandri et al. Design and Optimization of a High-Speed VLSI Architecture for Integrated FIR Filters in Advanced Digital Signal Processing Applications
Ye et al. An efficient RNS scaler for moduli set
CN107193784A (en) Method and system for implementing sinc interpolation with high precision and low hardware complexity
CN114020240B (en) Device and method for implementing time-domain convolution calculation across clock domains based on FPGA
Suk et al. A 8192 complex point FFT/IFFT for COFDM modulation scheme in DVB-T system
Syed et al. A scalable architecture for discrete wavelet transform

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant