
CN115933457A - A Kind of FPGA Facilitating Timing Closure - Google Patents


Info

Publication number
CN115933457A
Authority
CN
China
Prior art keywords
clock
clock signal
phase
target resource
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211303085.1A
Other languages
Chinese (zh)
Other versions
CN115933457B (en
Inventor
单悦尔
徐彦峰
陈波寅
匡晨光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Zhongwei Yixin Co Ltd
Original Assignee
Wuxi Zhongwei Yixin Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Zhongwei Yixin Co Ltd
Priority to CN202211303085.1A
Publication of CN115933457A
Application granted
Publication of CN115933457B
Legal status: Active
Anticipated expiration

Classifications

    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

The application discloses an FPGA that facilitates timing closure and relates to the FPGA field. In this FPGA, the global clock signal is connected to the clock input ports of multiple target resource modules via a first global clock tree, and in addition a phase-shifted clock signal derived from the global clock signal is connected to the clock input ports of the target resource modules via a second clock tree. The second clock tree follows the same routing path as the first global clock tree, so the delay difference between the two trees is the same at every pair of corresponding positions, and that delay difference is adjustable. Each target resource module can select its module clock signal from multiple clock signals. Timing problems can therefore be debugged locally by adjusting the module clock signal of individual target resource modules, which makes timing closure easier to reach and helps shorten the design flow.

Description

An FPGA that facilitates timing closure

Technical field

The invention relates to the FPGA field, and in particular to an FPGA that facilitates timing closure.

Background

An FPGA (Field Programmable Gate Array) contains a large number of resource modules such as CLBs, BRAMs, DSPs and IOBs, which are usually arranged in rows and columns to form a two-dimensional array. The clock signals these resource modules need come from multiple preset global clock trees. A global clock tree is a clock network that connects to the clock input ports of different resource modules through a preset bus structure, so that it can supply clock signals to multiple resource modules inside the FPGA.

When implementing a user design, a global clock signal is generally used as the synchronization signal. That is, one global clock signal is connected to each resource module via a global clock tree, and its rising or falling edge is required to arrive at the clock input ports of all resource modules at the same time. However, a global clock tree covers a large area and has long wires, so the global clock signal incurs transmission delay as it propagates through the tree. As a result, the rising or falling edge can hardly reach the clock input ports of all resource modules at exactly the same time. The larger the FPGA, the more pronounced this loss of synchronization becomes. This effect is clock skew.
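As a minimal illustration of the clock skew described above, the following Python sketch takes the arrival time of one clock edge at each module and reports the largest pairwise difference. The module names and arrival times are hypothetical values chosen for illustration and do not come from the patent.

```python
def clock_skew(arrival_ns):
    """Return the largest difference between clock edge arrival times (ns)."""
    times = list(arrival_ns.values())
    return max(times) - min(times)


# Hypothetical arrival times of the same clock edge at three resource modules.
arrivals = {"M1": 1.20, "M2": 1.45, "M3": 1.32}
skew = clock_skew(arrivals)
print(f"clock skew = {skew:.2f} ns")
```

The skew here is taken as the difference between the latest and the earliest arrival, which bounds the skew between any two modules on the same tree.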

The maximum allowable clock skew across the whole chip is an important FPGA design parameter. When designing the FPGA circuit, the global clock tree could in theory be built with circuit techniques such as differential signaling or increased drive strength in order to better meet the clock skew requirement, but doing so increases circuit area or power consumption. To balance overall performance, the maximum allowable chip-wide clock skew therefore generally cannot be reduced at will. Consequently, clock skew takes up a relatively large share of the clock period and its impact becomes noticeable, especially in advanced FPGA devices, where user designs are large and complex and clock frequencies are high. After placement and routing, it is hard to meet all timing requirements in a single pass. Usually a few paths fail timing, and these local problems have to be debugged repeatedly, which lengthens the design cycle.

Summary of the invention

In view of the above problems and technical requirements, the inventors propose an FPGA that facilitates timing closure. The technical solution of the invention is as follows:

The application discloses an FPGA that facilitates timing closure, characterized in that, in the FPGA, a global clock signal is connected to the clock input ports of multiple target resource modules via a first global clock tree; a phase-shifted clock signal of the global clock signal is also connected to the clock input ports of the target resource modules via a second clock tree; the second clock tree follows the same routing path as the first global clock tree; and each target resource module takes either the global clock signal or the phase-shifted clock signal as its module clock signal;

when, after initial placement and routing of the FPGA with all target resource modules taking the corresponding global clock signal as their module clock signal, timing closure has not been reached, at least one target resource module is switched to take the phase-shifted clock signal as its module clock signal, and the phase difference of the phase-shifted clock signal relative to the global clock signal is adjusted, until timing closure is reached.

In a further technical solution, at least one target resource module is switched to take the phase-shifted clock signal as its module clock signal, and the phase difference of the phase-shifted clock signal relative to the global clock signal is adjusted, until the clock skew between the module clock signals of any two target resource modules does not exceed a predetermined skew threshold and the setup time and hold time of the transmission path formed between any two target resource modules both satisfy the corresponding timing constraints.
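The closure criterion above (bounded skew plus setup and hold constraints on every path) can be sketched as follows. This is an illustrative model only: it assumes the common textbook slack formulas with a single data delay per path, and the names `Path`, `setup_slack`, `hold_slack` and `timing_closed` are hypothetical, not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Path:
    src: str      # module that launches the data
    dst: str      # module that captures the data
    delay: float  # data path delay in ns (hypothetical value)


def setup_slack(p, arrival, period, t_setup):
    # Data must arrive t_setup before the next capturing clock edge.
    return arrival[p.dst] + period - (arrival[p.src] + p.delay) - t_setup


def hold_slack(p, arrival, t_hold):
    # Data must stay stable for t_hold after the current capturing clock edge.
    return arrival[p.src] + p.delay - arrival[p.dst] - t_hold


def timing_closed(paths, arrival, period, t_setup, t_hold, max_skew):
    """True if skew is within the threshold and every path meets setup and hold."""
    skew = max(arrival.values()) - min(arrival.values())
    if skew > max_skew:
        return False
    return all(
        setup_slack(p, arrival, period, t_setup) >= 0
        and hold_slack(p, arrival, t_hold) >= 0
        for p in paths
    )


# Module clock arrival times in ns; a module switched to the phase-shifted
# clock would simply have its arrival time moved by the phase difference.
arrival = {"M1": 0.0, "M2": 0.2}
print(timing_closed([Path("M1", "M2", 4.0)], arrival, 5.0, 0.3, 0.1, 0.5))
```

In this model, switching a module to the phase-shifted clock only changes its entry in `arrival`, so the same check can be rerun after each adjustment.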

In a further technical solution, the method for adjusting the module clock signal of at least one target resource module includes:

after completing the initial placement and routing of the FPGA, performing timing analysis and identifying the paths to be optimized, namely the paths between any two target resource modules that violate the setup time constraint or the hold time constraint;

traversing the paths to be optimized in turn and, for each path reached, switching either the target resource module at the input end of the path or the target resource module at the output end of the path to take the phase-shifted clock signal as its module clock signal, and adjusting the phase difference of the phase-shifted clock signal relative to the global clock signal, so that all paths to be optimized that have been processed so far satisfy both the setup time constraint and the hold time constraint.

In a further technical solution, when the module clock signal of a target resource module is adjusted, the transmission paths at the input end of that target resource module must satisfy the setup time and hold time constraints, and so must the transmission paths at its output end.

In a further technical solution, when traversing the paths to be optimized, the paths that violate the hold time constraint are processed first, followed by the paths that violate the setup time constraint.

In a further technical solution, when several paths to be optimized violate the hold time constraint, they are processed in ascending order of hold time margin (slack); when several paths to be optimized violate the setup time constraint, they are processed in ascending order of setup time margin.
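The traversal order described in the last two paragraphs can be sketched in a few lines of Python. The tuple layout `(name, kind, slack)` and the function name `processing_order` are assumptions made for this illustration; the slack values are hypothetical.

```python
def processing_order(violations):
    """Order failing paths: hold violations first, then setup violations,
    each group sorted by ascending timing margin (most negative slack first).

    violations: list of (name, kind, slack_ns) with kind in {"hold", "setup"}.
    """
    hold = sorted((v for v in violations if v[1] == "hold"), key=lambda v: v[2])
    setup = sorted((v for v in violations if v[1] == "setup"), key=lambda v: v[2])
    return hold + setup


violations = [
    ("P1", "setup", -0.30),
    ("P2", "hold", -0.05),
    ("P3", "setup", -0.10),
    ("P4", "hold", -0.20),
]
for name, kind, slack in processing_order(violations):
    print(name, kind, slack)
```

Each path in the returned list would then be handled in turn by switching the module at its input end or output end to the phase-shifted clock, as described in the text.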

In a further technical solution, a signal is tapped from a predetermined position of the first global clock tree and passed through a phase shifter to generate the phase-shifted clock signal, which is connected to the input end of the second clock tree; the phase shifter adjusts the phase difference between the phase-shifted clock signal and the global clock signal.

In a further technical solution, the first global clock tree includes a clock trunk line, multiple clock branch lines connected in sequence from the clock trunk line to form several levels, and clock end lines connected between the clock branch lines of the last level and the corresponding target resource modules;

the predetermined position of the first global clock tree is located on the clock trunk line, on a clock branch line, or on a clock end line.

In a further technical solution, a user signal passes through a phase-locked loop, which generates the global clock signal connected to the input end of the first global clock tree and also generates the phase-shifted clock signal connected to the input end of the second clock tree; the phase-shift adjustment function of the phase-locked loop adjusts the phase difference between the phase-shifted clock signal and the global clock signal.

In a further technical solution, multiple different phase-shifted clock signals of the global clock signal are connected to the clock input ports of the target resource modules via multiple different second clock trees respectively; at least one target resource module is connected to the first global clock tree and to multiple second clock trees at the same time, and takes the global clock signal or one of the phase-shifted clock signals as its module clock signal; the phase-shifted clock signals have different phase differences relative to the global clock signal.
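When a module can choose among the global clock and several phase-shifted clocks, the requirement that both its input-side and output-side paths stay legal narrows the choice. The sketch below is a simplified model under stated assumptions: neighbouring modules stay on the unshifted clock, each path has a single data delay, and the clock names, offsets and the function `usable_clocks` are hypothetical.

```python
def usable_clocks(clock_offsets, in_delays, out_delays, period, t_setup, t_hold):
    """Return the names of the clocks one module could select while keeping
    every adjacent path within its setup and hold constraints.

    clock_offsets: {clock name: offset in ns relative to the global clock}
    in_delays:  data delays (ns) of paths captured by this module
    out_delays: data delays (ns) of paths launched by this module
    Neighbouring modules are assumed to use the global clock (offset 0).
    """
    usable = []
    for name, off in clock_offsets.items():
        in_ok = all(
            off + period - d - t_setup >= 0 and d - off - t_hold >= 0
            for d in in_delays
        )
        out_ok = all(
            period - (off + d) - t_setup >= 0 and off + d - t_hold >= 0
            for d in out_delays
        )
        if in_ok and out_ok:
            usable.append(name)
    return usable


clocks = {"CLK1": 0.0, "CLK2_a": 0.5, "CLK2_b": 1.0, "CLK2_c": 2.0}
print(usable_clocks(clocks, in_delays=[5.0], out_delays=[3.0],
                    period=5.0, t_setup=0.3, t_hold=0.1))
```

With these numbers the unshifted clock fails setup on the slow input path, the largest shift fails setup on the output path, and the two intermediate shifts satisfy both sides.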

The beneficial technical effects of the invention are as follows:

The application discloses an FPGA that facilitates timing closure. A second clock tree that follows the same routing path as the first global clock tree is added to the FPGA as hardware support, so that each target resource module can select its module clock signal from multiple clock signals. Because the delay difference between the second clock tree and the first global clock tree is the same at corresponding positions and is easy to adjust, timing problems can be debugged locally by adjusting the module clock signal of individual target resource modules. Timing closure is therefore easier to reach, which helps shorten the design flow.

Description of the drawings

FIG. 1 is a schematic diagram of the architecture of the multiple first global clock trees contained in the FPGA of the present application.

FIG. 2 is a schematic diagram of one clock tree structure of a second clock tree added for a first global clock tree in one embodiment of the present application.

FIG. 3 is a schematic diagram of another clock tree structure of a second clock tree added for a first global clock tree in another embodiment of the present application.

FIG. 4 is a schematic diagram of another clock tree structure of a second clock tree added for a first global clock tree in another embodiment of the present application.

FIG. 5 is a schematic diagram of another clock tree structure of a second clock tree added for a first global clock tree in another embodiment of the present application.

FIG. 6 shows the transmission delay when two target resource modules M1 and M2 both use the global clock signal CLK1, and the transmission delay when M1 is switched to the phase-shifted clock signal CLK2.

Detailed description

Specific embodiments of the invention are further described below with reference to the accompanying drawings.

The application discloses an FPGA that facilitates timing closure. The FPGA is obtained by optimizing an existing FPGA architecture, so it likewise contains a number of resource modules and global clock trees. As shown in FIG. 1, resource modules such as CLBs, BRAMs, DSPs and IOBs are arranged in rows and columns inside the FPGA to form a two-dimensional array. The global clock signal CLK1 is connected to the clock input ports of multiple target resource modules via the first global clock tree. In this application, the target resource modules corresponding to a first global clock tree are the resource modules connected to that first global clock tree.

The first global clock tree includes a clock trunk line GC, multiple clock branch lines HC connected in sequence from the clock trunk line GC to form several levels, and clock end lines LC connected between the clock branch lines HC of the last level and the corresponding target resource modules. Each level generally contains multiple clock branch lines HC so as to cover different regions of the two-dimensional array, and there are likewise generally multiple clock end lines LC covering different regions. The clock branch lines HC of the first level connect to the clock trunk line GC, those of the second level connect to those of the first level, and so on, until the clock branch lines HC of the last level connect to the target resource modules through the clock end lines LC. The first global clock tree may adopt any existing clock tree structure. FIG. 1 shows an exemplary structure in which the clock trunk line GC is placed at the center of the two-dimensional array and runs vertically, covering the entire array in the vertical direction. The example in FIG. 1 contains only one level of clock branch lines HC. Each clock branch line HC is placed at one row of the array, runs horizontally and connects to the clock trunk line GC, thereby covering the columns of resource modules in that row. At each column of resource modules covered by a clock branch line HC, a clock end line LC connects the clock branch line HC to the clock input ports of the resource modules (the target resource modules) in that column. One clock end line LC can fan out upward and/or downward to connect multiple target resource modules in the same column, for example fanning out upward to cover 20 target resource modules. The global clock signal CLK1 passes through the clock trunk line GC, a clock branch line HC and a clock end line LC of the first global clock tree and is finally delivered to the clock input port of a target resource module. In other words, this application assumes that the global clock signal CLK1 enters at the most upstream input of the first global clock tree.

In practice, an FPGA contains multiple clock trunk lines. A common arrangement has 32 clock trunk lines with different routing paths, denoted GC1, GC2, ..., GC32, placed in parallel at the center of the two-dimensional array, as shown in FIG. 1. At each row of resource modules in the array, multiple clock branch lines with different routing paths are placed in parallel, for example 10 clock branch lines denoted HC1, HC2, ..., HC10. The clock branch line of each routing path is connected to the clock trunk lines of all routing paths. Each target resource module is connected through the clock end line LC1 to the clock branch lines of all routing paths in its row. The clock end line LC1 connected to a target resource module M1 can therefore select one of the clock branch lines to connect to, and each clock branch line can in turn select one of the clock trunk lines.

The FPGA can thus be regarded as containing multiple first global clock trees, each consisting of a clock trunk line, several levels of clock branch lines and clock end lines that together form one routing path. In two different first global clock trees, the clock branch lines and/or the clock trunk line follow different routing paths, so the two trees have different routing paths. For example, based on FIG. 1, one first global clock tree includes GC1, HC1 and LC1, another includes GC1, HC3 and LC1, and yet another includes GC5, HC1 and LC1.
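To make the notion of "multiple first global clock trees" concrete, the sketch below enumerates trunk/branch/end-line combinations for the example figures quoted above (32 trunk lines, 10 branch lines per row, one end line LC1). It assumes a single level of branch lines and that every trunk/branch pairing is selectable; both are simplifying assumptions for illustration, not statements about a specific device.

```python
from itertools import product

# Line names follow the example in the text: GC1..GC32 and HC1..HC10.
trunks = [f"GC{i}" for i in range(1, 33)]
branches = [f"HC{i}" for i in range(1, 11)]

# Each (trunk, branch, end line) triple is treated as one first global clock tree.
trees = [(gc, hc, "LC1") for gc, hc in product(trunks, branches)]

print(len(trees))
print(("GC1", "HC3", "LC1") in trees)
```

Under these assumptions, the three trees named in the text are three of the enumerated triples, and the design described in this application would be applied to each tree separately.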

Different first global clock trees have different routing paths and may connect to different resource modules, and the same resource module can be attached to several different first global clock trees to obtain different global clock signals CLK1. The goal of this application is to reach timing closure quickly, and the key to that is a synchronous design among the resource modules within the same clock domain. This application is therefore described in terms of one first global clock tree and the target resource modules connected to it. The target resource modules referred to here are the resource modules that are connected to a first global clock tree of the same routing path and thus receive the same global clock signal CLK1. These target resource modules may connect to the same clock end line and clock branch line, or to clock end lines and clock branch lines that lie in different regions but have the same routing path. For example, referring to FIG. 1 and taking the case of a single level of clock branch lines, each row of resource modules is covered by a clock branch line HC1. The clock branch line HC1 of the first row connects through LC1 to the resource modules in the first row, and the clock branch line HC1 of the second row connects through LC1 to the resource modules in the second row. Although resource module M1 in the first row and resource module M2 in the second row are physically connected to the LC1 and HC1 of different rows, the HC1 and LC1 they connect to have the same routing path, so M1 and M2 are still attached to the same first global clock tree. Those skilled in the art will understand that, for the multiple first global clock trees inside the FPGA, the design provided in this application can be implemented separately for each of them.

Referring to FIG. 2, for one first global clock tree and the target resource modules connected to it, a phase-shifted clock signal CLK2 of the global clock signal CLK1 carried by that tree is additionally connected to the clock input ports of the target resource modules via a second clock tree, so that each target resource module can take either the global clock signal CLK1 or the phase-shifted clock signal CLK2 as its own module clock signal. The phase-shifted clock signal CLK2 leads or lags the global clock signal CLK1 in time. The second clock tree follows the same routing path as the first global clock tree, so the delay difference between the two trees is the same at every pair of corresponding points. That delay difference is determined by the delay difference between CLK1 and CLK2 and is therefore adjustable.
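The claim that two trees with the same routing path differ by a constant, adjustable delay can be checked with a small model. The sketch assumes both trees see identical segment delays and that the phase shift is applied at the tree input; the segment delays, the 150 ps phase value and the function name `tree_arrivals_ps` are hypothetical. Integer picoseconds are used to avoid floating-point rounding.

```python
def tree_arrivals_ps(segment_delays_ps, source_offset_ps=0):
    """Cumulative clock arrival time (ps) at the end of each segment of one routing path."""
    arrivals, t = [], source_offset_ps
    for delay in segment_delays_ps:
        t += delay
        arrivals.append(t)
    return arrivals


segments = [120, 80, 45]   # trunk, branch, end line delays in ps (hypothetical)
phase_ps = 150             # CLK2 lags CLK1 by this amount (hypothetical)

clk1 = tree_arrivals_ps(segments)                             # first global clock tree
clk2 = tree_arrivals_ps(segments, source_offset_ps=phase_ps)  # second clock tree
diffs = [b - a for a, b in zip(clk1, clk2)]
print(diffs)
```

Because the per-segment delays cancel, the difference at every corresponding point equals the phase offset, so changing that one value moves every CLK2 arrival by the same amount.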

(1) In one embodiment, a signal is tapped from a predetermined position of the first global clock tree and passed through a phase shifter to generate the phase-shifted clock signal CLK2, which is connected to the input end of the second clock tree. The phase shifter adjusts the phase difference between the phase-shifted clock signal and the global clock signal CLK1. In one implementation, the phase shifter can be realized with the phase-shift adjustment function of a phase-locked loop.

In this embodiment, when a signal is tapped from the predetermined position of the first global clock tree and connected to the phase shifter, the predetermined position is located on the clock trunk line, on a clock branch line, or on a clock end line:

When the predetermined tap position on the first global clock tree is on the clock trunk line, the second clock tree has an architecture similar to that of the first global clock tree. That is, the second clock tree likewise contains a clock trunk line GCE2 and several levels of clock branch lines HCE2 connected in sequence from the clock trunk line GCE2. Each level contains multiple clock branch lines HCE2 covering different regions, and each clock branch line HCE2 connects through multiple clock end lines LCE2 to the clock input ports of multiple different target resource modules. In this case the phase-shifted clock signal CLK2 is connected to the clock branch line HCE2 at the position corresponding to the predetermined position on the first global clock tree, and the second clock tree can cover target resource modules in multiple rows and multiple columns, as shown in FIG. 2.

When the predetermined tap position on the first global clock tree is on one of the clock branch lines, the second clock tree is a local clock tree. That is, the second clock tree contains the clock branch line HCE2 at the corresponding row of the two-dimensional array, and this clock branch line HCE2 connects through multiple clock end lines LCE2 to the clock input ports of multiple different target resource modules. In this case the phase-shifted clock signal CLK2 is connected to the clock branch line HCE2 at the position corresponding to the predetermined position on the first global clock tree, and the second clock tree can connect target resource modules located in different columns within the rows it covers, as shown in FIG. 3.

When the predetermined tap position on the first global clock tree is on one of the clock end lines LC1, the second clock tree is a local clock tree. That is, the second clock tree contains a clock end line LCE2 connected to the clock input ports of multiple different target resource modules. In this case the phase-shifted clock signal CLK2 is connected to the clock end line LCE2 at the position corresponding to the predetermined position on the first global clock tree, and the second clock tree can connect target resource modules located in the same column within the rows it covers. FIG. 4 takes the signal tapped from one of the clock end lines LC1 as an example and does not show all of them.

It follows that the architecture of the second clock tree depends on the predetermined tap position on the first global clock tree, and so do the number and distribution of the target resource modules it can reach. The further upstream the tap position is, the larger the second clock tree and the wider and more numerous the target resource modules it can cover.
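The relationship between tap position and coverage can be illustrated on a small grid. This is a deliberately simplified sketch: a trunk tap is assumed to reach the whole array, a branch tap one full row, and an end-line tap a few rows of one column. The function `covered_modules`, its parameters and the grid size are hypothetical and only mirror the qualitative statement in the text.

```python
def covered_modules(tap_level, n_rows, n_cols, row=None, col=None, fanout_rows=0):
    """Return the set of (row, col) positions a second clock tree could reach.

    tap_level: "trunk", "branch" or "end" (where the first tree is tapped).
    row, col: position of the tapped branch line / end line (unused for "trunk").
    fanout_rows: rows above and below `row` reached by a tapped end line.
    """
    if tap_level == "trunk":
        return {(r, c) for r in range(n_rows) for c in range(n_cols)}
    if tap_level == "branch":
        return {(row, c) for c in range(n_cols)}
    if tap_level == "end":
        lo = max(0, row - fanout_rows)
        hi = min(n_rows - 1, row + fanout_rows)
        return {(r, col) for r in range(lo, hi + 1)}
    raise ValueError(f"unknown tap level: {tap_level}")


trunk = covered_modules("trunk", n_rows=4, n_cols=5)
branch = covered_modules("branch", n_rows=4, n_cols=5, row=1)
end = covered_modules("end", n_rows=4, n_cols=5, row=2, col=3, fanout_rows=1)
print(len(trunk), len(branch), len(end))
```

In this toy model the coverage shrinks as the tap moves downstream, which matches the trade-off described above between the size of the second clock tree and the number of modules it can serve.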

(2) In another embodiment, no additional phase shifter is needed. As shown in Figure 2, in a conventional FPGA the user signal CLK0 passes through a phase-locked loop (PLL) to generate the global clock signal CLK1, which is connected to the input of the first global clock tree, that is, to the clock trunk line GC1, as shown in Figure 5. This embodiment directly reuses the PLL that generates CLK1. Using the PLL's phase-shift adjustment function, the PLL generates not only CLK1 for the input of the first global clock tree but also the phase-shifted clock signal CLK2 for the input of the second clock tree. The same phase-shift adjustment function controls the phase difference between CLK2 and CLK1.

In this embodiment, because CLK2 is generated from the user signal CLK0 by the PLL, it is effectively tapped at the most upstream point of the first global clock tree. The architecture of the second clock tree and the target resource modules it can connect are therefore similar to the case in the embodiment above where the tap position lies on the clock trunk line, and are not described again here.

Regardless of how the phase-shifted clock signal CLK2 is generated and which clock tree architecture the second clock tree uses, the design flow for the FPGA of this application begins with initial placement and routing. This step is the same as for an existing FPGA, so every target resource module takes the corresponding global clock signal CLK1 as its module clock signal. If timing closure has been reached at this point, the design is complete. Because the internal structure of an FPGA is complex, however, timing closure is often not reached after initial placement and routing. In a conventional FPGA design flow, repeated local debugging is then required, which leads to the excessively long debugging time described in the background section. The FPGA of this application adds a second clock tree as hardware support: a target resource module is no longer restricted to CLK1 and can instead select CLK2. When timing closure is not reached, at least one target resource module is switched to take CLK2 as its module clock signal, and the phase difference of CLK2 relative to CLK1 is adjusted, so that timing closure can be reached comparatively quickly.

Timing failures among multiple target resource modules connected to the same first global clock tree fall into two main cases. For any two target resource modules M1 and M2 that take CLK1:

(1) The global clock signal CLK1 incurs a transmission delay as it travels through the first global clock tree, so the module clock signal that each target resource module actually receives is CLK1 after that delay. Let clk1 denote the module clock signal received by target resource module M1 and clk2 the module clock signal received by target resource module M2. Because M1 and M2 occupy different positions in the two-dimensional array, the transmission delays of CLK1 to M1 and to M2 differ. As a result, the clock skew between the rising edge of clk1 at M1 and the rising edge of clk2 at M2 can exceed the predetermined skew threshold, and the timing closure requirement is not met.

(2) When a signal is transmitted between target resource modules M1 and M2, the two modules must satisfy the timing constraints for setup time and hold time. Referring to Figure 6, signal S1 runs from the output of target resource module M1 to the input of target resource module M2. As above, let clk1 be the module clock signal received by M1 and clk2 the module clock signal received by M2. The following setup-time constraint must then be satisfied:

d1 + T(CK1→Q1) + T(p1) + Tsetup(M2) < d2 + Period(CLK1);

and the following hold-time constraint must also be satisfied:

d1 + T(CK1→Q1) + T(p1) > d2 + Thold(M2);

Here d1 is the transmission delay of the global clock signal CLK1 through the first global clock tree to target resource module M1; T(CK1→Q1) is the delay from the clock terminal CK1 of M1 to its output terminal Q; T(p1) is the transmission delay from the output of M1 to the input of M2; and Tsetup(M2) is the setup time of M2. d2 is the transmission delay of CLK1 through the first global clock tree to M2, Period(CLK1) is the clock period of CLK1, and Thold(M2) is the hold time of M2.
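The two constraints can be checked numerically. The following is a minimal Python sketch and not part of the application: the `PathTiming` class, its field names, the slack functions, and the example numbers are illustrative assumptions, with all values in one arbitrary time unit (for example picoseconds).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathTiming:
    """Delays for one M1 -> M2 path (hypothetical container, one time unit throughout)."""
    d1: int       # delay of CLK1 through the first global clock tree to M1
    d2: int       # delay of CLK1 through the first global clock tree to M2
    t_ck_q: int   # T(CK1->Q1): clock-to-output delay of M1
    t_p1: int     # T(p1): delay from M1's output to M2's input
    t_setup: int  # Tsetup(M2)
    t_hold: int   # Thold(M2)
    period: int   # Period(CLK1)


def setup_slack(p: PathTiming) -> int:
    # Positive when d1 + T(CK1->Q1) + T(p1) + Tsetup(M2) < d2 + Period(CLK1).
    return (p.d2 + p.period) - (p.d1 + p.t_ck_q + p.t_p1 + p.t_setup)


def hold_slack(p: PathTiming) -> int:
    # Positive when d1 + T(CK1->Q1) + T(p1) > d2 + Thold(M2).
    return (p.d1 + p.t_ck_q + p.t_p1) - (p.d2 + p.t_hold)


def meets_timing(p: PathTiming) -> bool:
    # Both inequalities are strict, so zero slack counts as a failure.
    return setup_slack(p) > 0 and hold_slack(p) > 0


# Made-up example: the clock reaches M2 much later than it reaches M1.
example = PathTiming(d1=300, d2=900, t_ck_q=100, t_p1=200,
                     t_setup=50, t_hold=40, period=2000)
```

With these made-up numbers the setup slack is 2250 and the hold slack is -340, so the path fails only the hold constraint: the clock reaches M2 too late relative to M1.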

Once initial placement and routing is complete, the placement of target resource modules M1 and M2 is fixed, which means that d1 and d2 are both fixed. If some target resource modules are placed unfavorably, it can therefore be hard to satisfy the setup-time and hold-time constraints locally. The conventional approach requires repeated debugging, which is time-consuming and may require re-placement.

To address both kinds of timing failure, this application switches at least one target resource module to take the phase-shifted clock signal CLK2 as its module clock signal and adjusts the phase difference of CLK2 relative to the global clock signal CLK1. The goal is to adjust until the clock skew between the module clock signals of any two target resource modules does not exceed the predetermined skew threshold, and until the setup time and hold time of the transmission path formed between any two target resource modules both satisfy the corresponding constraints. With the second clock tree added, when target resource module M2 is switched to take CLK2 as its module clock signal, the path in Figure 6 must satisfy the following setup-time constraint:

d1 + T(CK1→Q1) + T(p1) + Tsetup(M2) < Tps + d3 + Period(CLK1);

and the following hold-time constraint must also be satisfied:

d1 + T(CK1→Q1) + T(p1) > Tps + d3 + Thold(M2);

Here Tps is the phase difference between the global clock signal CLK1 and the phase-shifted clock signal CLK2, and may be positive or negative. d3 is the transmission delay of CLK2 through the second clock tree to target resource module M2. After initial placement and routing fixes the placement of M1 and M2, d1 and d3 are both fixed, but Tps remains easy to adjust, so the constraints above are easier to satisfy.
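Rearranging the two inequalities for Tps gives an open interval of admissible phase differences: the setup constraint gives the lower bound and the hold constraint gives the upper bound. The sketch below computes that interval. The function name `tps_window` and the sample numbers are illustrative assumptions, not part of the application.

```python
def tps_window(d1, d3, t_ck_q, t_p1, t_setup, t_hold, period):
    """Return (lower, upper) such that lower < Tps < upper satisfies both constraints.

    Setup: d1 + T(CK1->Q1) + T(p1) + Tsetup(M2) < Tps + d3 + Period(CLK1)
    Hold:  d1 + T(CK1->Q1) + T(p1)              > Tps + d3 + Thold(M2)
    """
    data_arrival = d1 + t_ck_q + t_p1
    lower = data_arrival + t_setup - d3 - period  # from the setup constraint
    upper = data_arrival - t_hold - d3            # from the hold constraint
    return lower, upper


# Made-up delays in one arbitrary time unit.
window = tps_window(d1=300, d3=900, t_ck_q=100, t_p1=200,
                    t_setup=50, t_hold=40, period=2000)
```

For these numbers the interval is (-2250, -340), so CLK2 must lead CLK1 by more than 340 units. The width of the interval is always Period(CLK1) - Tsetup(M2) - Thold(M2), here 1910, independent of the fixed delays d1 and d3. This is why a single path can be fixed by phase adjustment alone whenever the period exceeds the sum of the setup and hold times.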

To meet these setup-time and hold-time requirements, the method for adjusting the module clock signal of at least one target resource module includes: after initial placement and routing of the FPGA, performing timing analysis to identify the paths to be optimized, that is, the paths between any two target resource modules that violate the setup-time constraint or the hold-time constraint; and then traversing and processing each path to be optimized in turn.

In one embodiment, the paths to be optimized are traversed in the following order: paths that violate the hold-time constraint are processed first, followed by paths that violate the setup-time constraint. The hold-time constraint is harder to satisfy than the setup-time constraint, so hold-time violations are resolved first.

In another embodiment, when several paths to be optimized violate the hold-time constraint, they are traversed in order of increasing hold-time timing margin: the smaller the margin, the more severely the path violates the hold-time constraint, and the earlier it is processed. Likewise, when several paths to be optimized violate the setup-time constraint, they are traversed in order of increasing setup-time timing margin: the smaller the margin, the more severely the path violates the setup-time constraint, and the earlier it is processed.
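The two ordering rules can be combined into one traversal order. The sketch below is one possible reading, not the application's implementation: the tuple layout `(name, setup_margin, hold_margin)` is hypothetical, and it assumes that a non-positive margin means a violation and that a path violating both constraints is handled with the hold-time group.

```python
def traversal_order(paths):
    """Order violating paths: hold violations first, then setup violations,
    each group sorted by increasing margin (worst violation first).

    paths: iterable of (name, setup_margin, hold_margin) tuples (assumed layout).
    """
    hold_group = sorted((p for p in paths if p[2] <= 0), key=lambda p: p[2])
    setup_group = sorted((p for p in paths if p[2] > 0 and p[1] <= 0),
                         key=lambda p: p[1])
    return [name for name, _, _ in hold_group + setup_group]


# Made-up margins; path "e" meets both constraints and is not traversed.
order = traversal_order([
    ("a", -5, 10),
    ("b", 3, -2),
    ("c", -1, 4),
    ("d", 8, -7),
    ("e", 6, 6),
])
```

Here the result is `["d", "b", "a", "c"]`: the two hold violations come first, worst first, followed by the two setup violations.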

Whatever the traversal order, each path to be optimized is processed as follows. Either the target resource module at the input of the path is switched to take the phase-shifted clock signal CLK2 as its module clock signal, or the target resource module at the output of the path is switched to take CLK2 as its module clock signal. In Figure 6, for example, if the path of signal S1 is a path to be optimized, either M1 or M2 can be switched to take CLK2 as its module clock signal. The phase difference Tps of CLK2 relative to CLK1 is then adjusted so that all paths to be optimized that have already been traversed satisfy both the setup-time and the hold-time constraint. The traversal continues until every path to be optimized has been processed. At that point all paths to be optimized satisfy the setup-time and hold-time constraints, and the final phase difference Tps between CLK2 and CLK1 is determined.
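Because a single Tps is shared by every module switched to CLK2, each processed path narrows the range of Tps values that remain usable. The sketch below illustrates this under a deliberately simplified assumption that is not stated in the application: each processed path contributes one open interval `(lower, upper)` of admissible Tps values, and the traversal keeps the running intersection. The function name `choose_tps` and the midpoint choice are hypothetical.

```python
def choose_tps(windows):
    """Intersect per-path Tps intervals in traversal order.

    windows: list of (lower, upper) open intervals, one per processed path.
    Returns the midpoint of the common interval, or None if some path
    cannot be satisfied together with the paths processed before it.
    """
    lower, upper = float("-inf"), float("inf")
    for lo, hi in windows:
        lower, upper = max(lower, lo), min(upper, hi)
        if lower >= upper:
            return None  # no single Tps satisfies all paths seen so far
    return (lower + upper) / 2


# Made-up intervals for three paths to be optimized.
tps = choose_tps([(-2250, -340), (-900, 100), (-1500, -500)])
conflict = choose_tps([(0, 10), (20, 30)])
```

For the first list the common interval is (-900, -500) and the sketch returns -700.0. For the second list the intervals do not overlap and it returns `None`, the situation in which a different module on the path would have to be switched instead.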

In addition, both the input and the output of every target resource module carry signals. Consider a path to be optimized from M1 to M2 that must be adjusted to satisfy both the setup-time and hold-time constraints. By the constraints above, if M1's module clock signal clk1 arrives earlier relative to M2's module clock signal clk2, the setup-time constraint of the path becomes easier to satisfy and the hold-time constraint becomes harder to satisfy. If clk1 arrives later relative to clk2, the hold-time constraint becomes easier to satisfy and the setup-time constraint becomes harder to satisfy. Switching M1 to CLK2 and adjusting Tps can advance or delay clk1 so that the path from M1 to M2 satisfies its constraints, but advancing or delaying clk1 also affects the paths from other target resource modules into M1. The same applies if M2 is switched to CLK2 and Tps is adjusted.

Therefore, when the module clock signal of a target resource module is adjusted, it is not enough to consider the path to be optimized connected to its input (or output); the paths connected to its other side must be considered as well. The transmission paths at the input of the target resource module must satisfy the setup-time and hold-time constraints, and so must the transmission paths at its output.
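This two-sided check can be sketched as follows. It is an illustrative model, not the application's implementation. The incoming-path inequalities are the ones given for M2 on CLK2; the outgoing-path inequalities are an extrapolation obtained by placing Tps + d3 on the launch side instead of the capture side. The names `path_ok` and `switch_is_safe` and the tuple layout `(d_other, t_ck_q, t_p, t_setup, t_hold)` are assumptions.

```python
def path_ok(launch_clk, capture_clk, t_ck_q, t_p, t_setup, t_hold, period):
    """Setup and hold check for one path, given clock arrival times at both ends."""
    data = launch_clk + t_ck_q + t_p
    return data + t_setup < capture_clk + period and data > capture_clk + t_hold


def switch_is_safe(tps, d3, incoming, outgoing, period):
    """Check every path into and out of a module that has been moved to CLK2.

    incoming/outgoing: lists of (d_other, t_ck_q, t_p, t_setup, t_hold), where
    d_other is the CLK1 tree delay to the module at the far end of the path.
    """
    clk = tps + d3  # arrival time of CLK2 at the switched module
    ins = all(path_ok(d, clk, cq, tp, ts, th, period)
              for d, cq, tp, ts, th in incoming)
    outs = all(path_ok(clk, d, cq, tp, ts, th, period)
               for d, cq, tp, ts, th in outgoing)
    return ins and outs


# Made-up delays: one path into the switched module and one path out of it.
incoming = [(300, 100, 200, 50, 40)]
outgoing = [(500, 100, 300, 50, 40)]
```

With these numbers the incoming path alone allows -2250 < Tps < -340, while the outgoing path requires -760 < Tps < 1150, so only -760 < Tps < -340 is safe for both sides. A value such as Tps = -1000 fixes the incoming path but breaks the hold time of the outgoing one.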

During the traversal described above, there is a good chance that cases where the clock skew between some target resource modules exceeds the predetermined skew threshold are resolved at the same time. If the clock skew of some target resource modules still exceeds the threshold, one of those modules is switched to take the phase-shifted clock signal CLK2 as its module clock signal and the phase difference Tps between CLK2 and CLK1 is adjusted. This is generally enough to meet the clock skew requirement, which is easier to satisfy than the setup-time and hold-time constraints.
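A remaining skew check of this kind can be sketched as a pairwise comparison of clock arrival times. The function name, the dictionary layout, and the numbers are illustrative assumptions. The arrival time of a module on CLK1 is its tree delay, and for a module on CLK2 it is Tps plus its delay through the second clock tree.

```python
from itertools import combinations


def skew_violations(arrival, threshold):
    """Return module pairs whose clock arrival times differ by more than threshold.

    arrival: dict mapping module name -> clock arrival time at that module.
    """
    return [(a, b) for a, b in combinations(sorted(arrival), 2)
            if abs(arrival[a] - arrival[b]) > threshold]


# Made-up arrival times with every module on CLK1.
before = skew_violations({"M1": 300, "M2": 900, "M3": 350}, threshold=200)

# M2 switched to CLK2 with Tps = -550 and d3 = 900, so its clock arrives at 350.
after = skew_violations({"M1": 300, "M2": -550 + 900, "M3": 350}, threshold=200)
```

In this example M2 violates the threshold against both other modules while on CLK1, and no pair violates it after M2 is moved to CLK2.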

In another embodiment, multiple different phase-shifted clock signals CLK2 of the global clock signal CLK1 are connected to the clock input ports of the target resource modules through multiple different second clock trees, and at least one target resource module is connected to the first global clock tree and to several second clock trees at the same time. The phase-shifted clock signal of each second clock tree is either tapped from a predetermined position on the first global clock tree and generated by a phase shifter, or generated from the user signal CLK0 by the PLL. The relevant technical features are as described above and are not repeated here. The phase-shifted clock signals CLK2 of the different second clock trees may be tapped from different predetermined positions on the first global clock tree, and their phase differences relative to the global clock signal differ. A target resource module then takes either CLK1 or one of the phase-shifted clock signals CLK2 as its module clock signal, so timing closure is easier to reach with multiple second clock trees.
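With several candidate clocks available at one module, choosing a clock amounts to testing which candidate arrival times fall inside the interval allowed by the setup and hold constraints of the path ending at that module. The sketch below shows this for a capturing module. The function name `usable_clocks`, the clock labels, and all numbers are illustrative assumptions rather than anything specified in the application.

```python
def usable_clocks(data_arrival, t_setup, t_hold, period, clock_arrivals):
    """List the candidate clocks that satisfy setup and hold at a capturing module.

    data_arrival: launch clock delay + clock-to-output delay + path delay.
    clock_arrivals: dict of clock name -> arrival time at the capturing module
                    (tree delay for CLK1, Tps + tree delay for each CLK2).
    """
    lower = data_arrival + t_setup - period  # setup: clock must arrive after this
    upper = data_arrival - t_hold            # hold: clock must arrive before this
    return [name for name, t in clock_arrivals.items() if lower < t < upper]


# Made-up candidates: CLK1 plus two phase-shifted clocks with different Tps values.
candidates = {
    "CLK1": 900,
    "CLK2_a": -200 + 900,
    "CLK2_b": -500 + 900,
}
choice = usable_clocks(data_arrival=600, t_setup=50, t_hold=40,
                       period=2000, clock_arrivals=candidates)
```

Here the admissible arrival interval is (-1350, 560), so only the hypothetical `CLK2_b`, arriving at 400, satisfies both constraints.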

Claims (10)

1. An FPGA facilitating timing closure, characterized in that, in the FPGA, a global clock signal is connected to the clock input ports of multiple target resource modules via a first global clock tree; a phase-shifted clock signal of the global clock signal is also connected to the clock input port of each target resource module via a second clock tree, the second clock tree follows the same routing path as the first global clock tree, and each target resource module takes either the global clock signal or the phase-shifted clock signal as its module clock signal;

when initial placement and routing of the FPGA is complete and all target resource modules take the corresponding global clock signal as their module clock signal but timing closure is not reached, at least one target resource module is switched to take the phase-shifted clock signal as its module clock signal and the phase difference of the phase-shifted clock signal relative to the global clock signal is adjusted, until timing closure is reached.

2. The FPGA according to claim 1, characterized in that at least one target resource module is switched to take the phase-shifted clock signal as its module clock signal and the phase difference of the phase-shifted clock signal relative to the global clock signal is adjusted until the clock skew between the module clock signals of any two target resource modules does not exceed a predetermined skew threshold, and the setup time and hold time of the transmission path formed between any two target resource modules both satisfy the corresponding timing constraints.

3. The FPGA according to claim 2, characterized in that the method for adjusting the module clock signal of at least one target resource module comprises:

after initial placement and routing of the FPGA is complete, performing timing analysis and identifying the paths to be optimized between any two target resource modules that do not satisfy the setup-time constraint or do not satisfy the hold-time constraint;

traversing and processing each path to be optimized in turn, and for each path to be optimized reached in the traversal, switching the target resource module at the input of the path to take the phase-shifted clock signal as its module clock signal, or switching the target resource module at the output of the path to take the phase-shifted clock signal as its module clock signal, and adjusting the phase difference of the phase-shifted clock signal relative to the global clock signal so that all paths to be optimized that have already been traversed satisfy both the setup-time constraint and the hold-time constraint.

4. The FPGA according to claim 3, characterized in that, when the module clock signal of a target resource module is adjusted, the transmission paths at the input of the target resource module are ensured to satisfy the timing constraints for setup time and hold time, and the transmission paths at the output of the target resource module are ensured to satisfy the timing constraints for setup time and hold time.

5. The FPGA according to claim 3, characterized in that, when the paths to be optimized are traversed in turn, paths to be optimized that do not satisfy the hold-time constraint are processed first, followed by paths to be optimized that do not satisfy the setup-time constraint.

6. The FPGA according to claim 5, characterized in that, when there are multiple paths to be optimized that do not satisfy the hold-time constraint, they are traversed in order of increasing hold-time timing margin; and when there are multiple paths to be optimized that do not satisfy the setup-time constraint, they are traversed in order of increasing setup-time timing margin.

7. The FPGA according to claim 1, characterized in that a signal is tapped from a predetermined position on the first global clock tree and passed through a phase shifter to generate the phase-shifted clock signal connected to the input of the second clock tree, and the phase difference between the phase-shifted clock signal and the global clock signal is adjusted by the phase shifter.

8. The FPGA according to claim 2, characterized in that the first global clock tree comprises a clock trunk line, multiple clock branch lines connected in sequence from the clock trunk line to form several levels, and clock end lines connected between the clock branch lines of the last level and the corresponding target resource modules;

the predetermined position on the first global clock tree lies on the clock trunk line, on a clock branch line, or on a clock end line.

9. The FPGA according to claim 1, characterized in that a user signal passes through a phase-locked loop to generate the global clock signal connected to the input of the first global clock tree and to generate the phase-shifted clock signal connected to the input of the second clock tree, and the phase difference between the phase-shifted clock signal and the global clock signal is adjusted through the phase-shift adjustment function of the phase-locked loop.

10. The FPGA according to claim 1, characterized in that multiple different phase-shifted clock signals of the global clock signal are connected to the clock input ports of the target resource modules via multiple different second clock trees, at least one target resource module is connected to the first global clock tree and to multiple second clock trees at the same time, and that target resource module takes the global clock signal or one of the phase-shifted clock signals as its module clock signal; the multiple phase-shifted clock signals have different phase differences relative to the global clock signal.
CN202211303085.1A 2022-10-24 2022-10-24 FPGA convenient to realize timing sequence convergence Active CN115933457B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211303085.1A CN115933457B (en) 2022-10-24 2022-10-24 FPGA convenient to realize timing sequence convergence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211303085.1A CN115933457B (en) 2022-10-24 2022-10-24 FPGA convenient to realize timing sequence convergence

Publications (2)

Publication Number Publication Date
CN115933457A true CN115933457A (en) 2023-04-07
CN115933457B CN115933457B (en) 2024-10-29

Family

ID=86551418

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211303085.1A Active CN115933457B (en) 2022-10-24 2022-10-24 FPGA convenient to realize timing sequence convergence

Country Status (1)

Country Link
CN (1) CN115933457B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117313602A (en) * 2023-10-17 2023-12-29 北京市合芯数字科技有限公司 Module boundary time sequence constraint method and related equipment
CN117313602B (en) * 2023-10-17 2024-05-07 北京市合芯数字科技有限公司 Module boundary timing constraint method and related equipment

Also Published As

Publication number Publication date
CN115933457B (en) 2024-10-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant