
CN103383641A - Synchronous device for multi-assembly lines - Google Patents

Info

Publication number
CN103383641A
Authority
CN
China
Prior art keywords
pipeline
register
clustering
streamline
configuration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013101392797A
Other languages
Chinese (zh)
Inventor
尹磊祖
张星
刘子君
谢少林
王磊
杨勇勇
汪涛
王东琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN2013101392797A priority Critical patent/CN103383641A/en
Publication of CN103383641A publication Critical patent/CN103383641A/en
Pending legal-status Critical Current

Landscapes

  • Advance Control (AREA)

Abstract

The invention discloses a multi-pipeline synchronization device. In the device, a control status register transfer unit performs control status register transfers between pipelines, including the control register configuration operation of pipeline A on pipeline B and the status register read operation of pipeline A on pipeline B. A pipeline register transfer unit performs register transfers between the general registers in pipeline A and the pipeline configuration registers in pipeline B, so that pipeline B obtains the register information it needs to run. A synchronization logic unit receives scheduling instructions and blocking information and, according to the control status registers, generates a stall signal for pipeline A, and generates the scheduling enable and passes scheduling information to pipeline B.

Description

A Multi-Pipeline Synchronization Device

Technical Field

The invention relates to the technical field of pipelines, and in particular to a multi-pipeline synchronization device.

Background

Pipelining is a quasi-parallel processing technique in which the execution of multiple instructions overlaps while a program runs. As processor architectures have developed, many new pipelining techniques have emerged, such as dynamic pipelining and superpipelining, which have greatly improved processor performance.

In the information age, the demand for information processing keeps growing, and information processing systems are expected to be ever more capable. In particular, with the rapid development of the Internet, cloud computing, and the Internet of Things, large numbers of mobile devices and wireless sensors generate information continuously, and hundreds of millions of Internet services produce an enormous volume of information exchange. Large-scale computing therefore places new demands on processor development.

To meet these demands, processor designers keep innovating to raise processor performance, and multi-core processor technology is especially important. For example, NVIDIA's GT480 GPU has 15 streaming multiprocessors, each with 32 CUDA processors. Each CUDA core has its own registers and program counter, but has no instruction fetch or scheduling unit of its own to form a complete front end; these are provided by the streaming multiprocessor. As another example, general-purpose multi-core CPUs implement only inter-core synchronization and communication, and do not address the synchronization of multiple pipelines within a core.

Summary of the Invention

In view of this, the invention provides a multi-pipeline synchronization device that efficiently synchronizes the configuration of multiple pipelines, meeting the needs of processors built for large-scale computing. The device comprises an instruction storage unit, a configuration pipeline A, an operation pipeline B, and a pipeline synchronization device. The pipeline synchronization device comprises a control status register transfer unit, a pipeline register transfer unit, and a synchronization logic unit. Pipeline A further comprises general registers, and pipeline B further comprises control status registers and pipeline configuration registers. The control status register transfer unit implements control status register transfers between the pipelines, including the control register configuration operation of pipeline A on pipeline B and the status register read operation of pipeline A on pipeline B. The pipeline register transfer unit implements transfers between the general registers and the pipeline configuration registers. The synchronization logic unit implements the synchronization function between the pipelines. The control register controls which operating state pipeline B is in, the status register reflects the busy/idle state of pipeline B, and the instruction storage unit provides instructions to the pipeline register transfer unit.

Pipeline A is the configuration pipeline, used to schedule pipeline B; pipeline B is the operation pipeline, used to meet large-scale computing needs.

Pipeline B is divided into two clusters, X and Y, and has three working modes: unclustered, clustered with only X working, and clustered with only Y working.

The control register is denoted CCtrl and the status register is denoted CStat; both are 3-bit registers. A CCtrl value of xx0 means unclustered operation, 011 means clustered operation with only X working, and 101 means clustered operation with only Y working. CStat[0] indicates the busy/idle state of pipeline B in unclustered operation, where 1 means busy; CStat[1] indicates the busy/idle state of cluster X of pipeline B in clustered operation, where 1 means busy; CStat[2] indicates the busy/idle state of cluster Y of pipeline B in clustered operation.

The multi-pipeline synchronization device of the invention can configure the control register and read the status register. Following a scheduling instruction, it queries the status register and the control register to synchronize multiple pipelines flexibly. At the same time, according to the state of the control register, it completes the register configuration of a pipeline in any state.

Brief Description of the Drawings

Fig. 1 is a block diagram of the module structure of the multi-pipeline synchronization device of the invention;

Fig. 2 illustrates the control status registers of the invention;

Fig. 3 shows an example of configuration register encoding;

Fig. 4 is a structural diagram of the pipeline register transfer unit of the invention;

Fig. 5 is a block diagram of the synchronization logic unit of the invention;

Fig. 6 shows the code that generates the Called signals of the synchronization logic unit of the invention;

Fig. 7 is a circuit diagram of the scheduling enable generation in the synchronization logic unit of the invention;

Fig. 8 is a circuit diagram of the pipeline stall generation in the synchronization logic unit of the invention.

Detailed Description

To make the purpose, technical solution, and advantages of the invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.

Each core of the multi-core processor to which the invention relates contains two pipelines: one for large-scale computation, and one for small-scale computation and for scheduling the large-scale computation pipeline. This allows computation to be planned efficiently for the application at hand. Synchronizing the two pipelines requires a high-performance pipeline synchronization device, and the invention is based on these considerations.

Fig. 1 is a structural diagram of the multi-pipeline synchronization device of the invention. Referring to Fig. 1 and taking dual-pipeline synchronization as an example, the device mainly comprises an instruction storage unit 106, a configuration pipeline A (104), an operation pipeline B (105), and a pipeline synchronization device 100. The pipeline synchronization device comprises a control status register transfer unit 101, a pipeline register transfer unit 102, and a synchronization logic unit 103. Pipeline A (104) further comprises general registers 107, and pipeline B (105) further comprises control status registers 108 (a control register and a status register) and pipeline configuration registers 109. The control status register transfer unit 101 implements control status register transfers between the pipelines, including the control register configuration operation of pipeline A on pipeline B and the status register read operation of pipeline A on pipeline B. The pipeline register transfer unit 102 implements transfers between the general registers 107 and the pipeline configuration registers 109. The synchronization logic unit 103 is the core of the whole device and implements the synchronization function between the pipelines. The control register controls which operating state pipeline B is in, and the status register reflects the busy/idle state of pipeline B. The general registers provide input to the control status registers and the pipeline configuration registers, and the pipeline configuration registers provide the configuration information that pipeline B needs to run.

The control status register transfer unit 101 performs control status register transfers between the pipelines, including the control register configuration operation of pipeline A on pipeline B and the status register read operation of pipeline A on pipeline B. Pipeline A is the configuration pipeline, used to schedule pipeline B; pipeline B is the operation pipeline, used to meet large-scale computing needs. Pipeline B uses clustering and can be divided into two clusters, X and Y. It has three working modes: unclustered, clustered with only X working, and clustered with only Y working.

The control register is configured by pipeline A and controls which operating state pipeline B is in. The status register reflects the busy/idle state of pipeline B and is fed back by pipeline B. The control register is denoted CCtrl and the status register is denoted CStat; both are 3-bit registers, laid out as shown in Fig. 2. A CCtrl value of xx0 means unclustered operation, 011 means clustered operation with only X working, and 101 means clustered operation with only Y working. CStat[0] indicates the busy/idle state of pipeline B in unclustered operation, where 1 means busy; CStat[1] indicates the busy/idle state of cluster X of pipeline B in clustered operation, where 1 means busy; CStat[2] indicates the busy/idle state of cluster Y of pipeline B in clustered operation. In this way, every configuration and state of the pipeline described above can be indicated.
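
The CCtrl and CStat encodings described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `Mode`, `decode_cctrl`, and `is_busy` are hypothetical, bit 0 is taken to be the rightmost bit of the written patterns (so that 011 sets bits 1 and 0), CStat[2] is assumed to use 1 for busy like the other two bits, and the patterns 001 and 111, which the text does not define, are rejected.

```python
from enum import Enum


class Mode(Enum):
    """Working modes of pipeline B (hypothetical names)."""
    UNCLUSTERED = "unclustered"
    X_ONLY = "clustered, only X working"
    Y_ONLY = "clustered, only Y working"


def decode_cctrl(cctrl: int) -> Mode:
    """Decode a 3-bit CCtrl value into a working mode."""
    if not 0 <= cctrl <= 0b111:
        raise ValueError("CCtrl is a 3-bit register")
    if cctrl & 0b001 == 0:
        # Pattern xx0: unclustered, the two upper bits are don't-care.
        return Mode.UNCLUSTERED
    if cctrl == 0b011:
        return Mode.X_ONLY
    if cctrl == 0b101:
        return Mode.Y_ONLY
    # 001 and 111 are not defined in the text; rejecting them is an assumption.
    raise ValueError(f"undefined CCtrl encoding: {cctrl:03b}")


def is_busy(cstat: int, mode: Mode) -> bool:
    """Return the CStat bit that applies to the given mode (1 means busy)."""
    bit = {Mode.UNCLUSTERED: 0, Mode.X_ONLY: 1, Mode.Y_ONLY: 2}[mode]
    return bool((cstat >> bit) & 1)
```

For instance, `decode_cctrl(0b110)` yields the unclustered mode because only the lowest bit is examined for the xx0 pattern.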

Running operation pipeline B requires a large amount of configuration information, which pipeline A must supply through the pipeline synchronization device 100. The pipeline register transfer unit 102 is responsible for register transfers between the general registers 107 in pipeline A and the pipeline configuration registers 109 in pipeline B.

A pipeline that runs large-scale computation necessarily needs a large amount of configuration register information. The invention provides an instruction form for configuration, $KA_s = R_m$, $KB_s = R_m$, and so on, where $s$ is a subscript indicating which KA register is meant (there are several) and $R_m$ denotes the value of a general register. Here KA and KB are pipeline configuration registers and R is a general register. Because pipeline B contains many pipeline configuration registers 109 that share common characteristics, they can be divided into several groups, each containing several registers. Accordingly, the invention provides a flexible encoding, $m_{i-1}m_{i-2}m_{i-3}\dots m_2m_1m_0\,n_{j-1}n_{j-2}n_{j-3}\dots n_2n_1n_0$, in which the $i$ bits $m_{i-1}\dots m_0$ select the pipeline configuration register group and the $j$ bits $n_{j-1}\dots n_0$ select the register within that group. Given the number of register groups and the number of registers in each group, choosing suitable $i$ and $j$ completes the encoding of the configuration registers. Fig. 3 shows an encoding example: KA to KD can each represent up to 16 registers, while KE, KF, and KG adjust the usable code to their register counts, using only 3, 2, and 1 bits to represent up to 8, 4, and 2 registers respectively.
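
The group-plus-register encoding can be illustrated with a short Python sketch. It assumes a 3-bit group field and a 4-bit register field (enough for the seven groups KA to KG with at most 16 registers each), and it assumes that group codes are assigned in alphabetical order starting from KA = 000; the text confirms only that 000 selects KA. The function names and the `GROUP_SIZES` table are hypothetical.

```python
from typing import Tuple

GROUP_BITS = 3  # i: width of the group field (assumed)
REG_BITS = 4    # j: width of the register field (assumed)

# Per-group register limits as described for the encoding example:
# KA..KD up to 16 registers, KE up to 8, KF up to 4, KG up to 2.
GROUP_SIZES = {"KA": 16, "KB": 16, "KC": 16, "KD": 16, "KE": 8, "KF": 4, "KG": 2}
GROUP_NAMES = list(GROUP_SIZES)  # index in this list = assumed group code


def encode(group: str, reg: int) -> int:
    """Pack a group name and a register index into one code word."""
    if group not in GROUP_SIZES:
        raise ValueError(f"unknown register group: {group}")
    if not 0 <= reg < GROUP_SIZES[group]:
        raise ValueError(f"register index {reg} out of range for {group}")
    return (GROUP_NAMES.index(group) << REG_BITS) | reg


def decode(code: int) -> Tuple[str, int]:
    """Split a code word into (group name, register index)."""
    grp_id = code >> REG_BITS
    reg_id = code & ((1 << REG_BITS) - 1)
    if not 0 <= grp_id < len(GROUP_NAMES):
        raise ValueError(f"undefined group code: {grp_id}")
    group = GROUP_NAMES[grp_id]
    if reg_id >= GROUP_SIZES[group]:
        raise ValueError(f"register index {reg_id} out of range for {group}")
    return group, reg_id
```

Smaller groups such as KG simply leave the upper bits of the register field unused, which is one way to read the remark that KE, KF, and KG use only 3, 2, and 1 bits.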

Fig. 4 is a structural diagram of an implementation of the pipeline register transfer unit 102. The pipeline register transfer unit performs register transfers between the general registers in pipeline A and the pipeline configuration registers in pipeline B, so that pipeline B obtains the register information it needs to run. The instruction storage unit 106 provides instructions to the pipeline register transfer unit. Decoding logic a (404) decodes each instruction into an enable signal En, the ID signals GrpID and RegID, and a data signal Data. GrpID is the information carried by the bits $m_{i-1}m_{i-2}m_{i-3}\dots m_2m_1m_0$ of the encoding above; it identifies which register group the current transfer is destined for. En, RegID, and Data are then passed to the selected register group, for example register group 406, where decoding logic b (405) decodes RegID and selects the specified register, for example KA1, for reading or writing. For example, suppose there are 7 register groups, each with at most 16 registers. To write the register encoded as 0000001, the instruction is first fetched from the instruction storage unit; decoding logic a generates En and the ID signals, and Data is taken from general register 107. GrpID is identified as 000, meaning register group KA is to be written, so En, RegID, and Data are passed to decoding logic b, which identifies RegID as 0001, i.e. register KA0 is written. This completes one configuration register write.
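
The two-stage decode of a configuration register write can be modelled roughly as below. This is a behavioural sketch, not the circuit of Fig. 4: the class and method names are hypothetical, the instruction is assumed to be already reduced to a 7-bit register code plus the index of the source general register, and the register within a group is addressed by its raw RegID without assuming how RegID maps to a register name.

```python
from dataclasses import dataclass, field
from typing import List

REG_BITS = 4  # width of the RegID field (assumed: up to 16 registers per group)


@dataclass
class RegisterGroup:
    """One group of pipeline configuration registers in pipeline B."""
    regs: List[int] = field(default_factory=lambda: [0] * (1 << REG_BITS))

    def write(self, en: bool, reg_id: int, data: int) -> None:
        # Plays the role of decoding logic b: select the register by RegID.
        if en:
            self.regs[reg_id] = data


class PipelineRegisterTransferUnit:
    """Routes a write from a general register of A to a config register of B."""

    def __init__(self, num_groups: int = 7) -> None:
        self.groups = [RegisterGroup() for _ in range(num_groups)]

    def transfer(self, code: int, general_regs: List[int], rm: int) -> None:
        # Plays the role of decoding logic a: derive En, GrpID, RegID, Data.
        grp_id = code >> REG_BITS
        reg_id = code & ((1 << REG_BITS) - 1)
        data = general_regs[rm]
        en = True  # a decoded transfer instruction always enables the write here
        self.groups[grp_id].write(en, reg_id, data)
```

With the code 0000001 from the example, `transfer` selects group 0 (GrpID 000) and passes RegID 0001 to that group.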

The synchronization logic unit 103 is the core of the pipeline synchronization device. Fig. 5 is a block diagram of the synchronization logic unit 103. It receives scheduling instructions and blocking information from the instruction storage unit 106 and, according to the control status registers 108, generates a stall signal for configuration pipeline A (104), and generates the scheduling enable and passes scheduling information to operation pipeline B (105).

The main function of the synchronization logic unit 103 is to generate the scheduling enable and stall signals so that the pipelines operate correctly together. As described above, the control register CCtrl allows three configurations: unclustered, clustered with only X working, and clustered with only Y working. The scheduling enable and stall signals must be generated in coordination with the configuration and the running state of the pipeline. The signals of the synchronization logic unit 103 are listed below.

Signal    in/out  Description
Instr     in      Scheduling instruction fetched from the instruction storage unit
iCCtrl0   in      Pipeline B is configured to work unclustered
iCCtrl1   in      Pipeline B is configured clustered, with only cluster X working
iCCtrl2   in      Pipeline B is configured clustered, with only cluster Y working
iCStat0   in      Busy/idle state of pipeline B when unclustered
iCStat1   in      Busy/idle state of pipeline B with only cluster X working
iCStat2   in      Busy/idle state of pipeline B with only cluster Y working
CallEn    out     Scheduling enable for pipeline B
Stall     out     Stalls pipeline A

The synchronization logic unit 103 also generates a series of intermediate signals. CallEn0, CallEn1, and CallEn2 are the scheduling enables for the unclustered, clustered X-only, and clustered Y-only modes respectively. Called0, Called1, and Called2 indicate, for the three modes, that CallEn has been issued and pipeline B has not yet finished processing; once pipeline B finishes, Called is cleared. The generation of Called is shown in Fig. 6. Stall0, Stall1, and Stall2 are the pipeline stalls for the three modes. CallBlkEn indicates that blocking is enabled for a Call, and SYNCodeEn indicates that the Call instruction is valid; both are obtained from the instruction.
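
The behaviour described for the Called signals can be sketched as a one-bit state update. The code of Fig. 6 is not reproduced in this text, so the following is only a guess at its intent: the function name, the hypothetical `done` input (standing for "pipeline B has finished processing"), and the choice to let a new call take priority over a simultaneous completion are all assumptions.

```python
def next_called(called: bool, call_en: bool, done: bool) -> bool:
    """Next value of one Called bit (Called0, Called1, or Called2).

    called  -- current value of the bit
    call_en -- the matching CallEn was issued in this cycle
    done    -- hypothetical signal: pipeline B finished the outstanding call
    """
    if call_en:
        return True   # a call was issued and is now outstanding
    if done:
        return False  # pipeline B finished, so Called is cleared
    return called     # otherwise hold the previous value
```

A call issued at one step keeps the bit set through any number of idle steps until `done` is seen.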

The scheduling enable and pipeline stall signals can then be generated. Take CallEn0 and Stall0 as examples. CallEn0 is asserted when (iCCtrl0 && !iCStat0) && !Called0, that is, pipeline B is configured to work unclustered, pipeline B is idle, and the previous request to pipeline B has been fully processed. Stall0 is asserted when iCCtrl0 && iCStat0 || (!iCStat0 && CallBlkEn && !Called0). The first case is that pipeline B is configured to work unclustered and is busy; the second is that pipeline B is configured to work unclustered and is idle, but blocking is enabled. CallEn1, CallEn2, Stall1, and Stall2 are generated in the same way. Finally, the three cases are combined to produce the correct scheduling enable and pipeline stall. The circuit that generates CallEn is shown in Fig. 7 and the circuit that generates StallEn is shown in Fig. 8. The generation logic is fairly simple, so the device implements pipeline synchronization efficiently.
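
The two expressions can be written out as a small Python model for experimentation. Several points are assumptions rather than statements of the patent: iCCtrl0 to iCCtrl2 are treated as already-decoded mode signals, the three per-mode results are combined with a logical OR (the circuits of Figs. 7 and 8 are not reproduced in this text), gating by SYNCodeEn is left out, and all function names are hypothetical. The stall expression is evaluated exactly as printed, with && binding tighter than ||, even though the prose suggests that the iCCtrl condition applies to both cases.

```python
from typing import Sequence, Tuple


def call_en_k(cctrl_k: bool, cstat_k: bool, called_k: bool) -> bool:
    """(iCCtrlk && !iCStatk) && !Calledk"""
    return bool((cctrl_k and not cstat_k) and not called_k)


def stall_k(cctrl_k: bool, cstat_k: bool, called_k: bool, call_blk_en: bool) -> bool:
    """iCCtrlk && iCStatk || (!iCStatk && CallBlkEn && !Calledk), as printed."""
    return bool((cctrl_k and cstat_k) or (not cstat_k and call_blk_en and not called_k))


def sync_logic(
    cctrl: Sequence[bool],
    cstat: Sequence[bool],
    called: Sequence[bool],
    call_blk_en: bool,
) -> Tuple[bool, bool]:
    """Return (CallEn, Stall), OR-ing the three per-mode signals (assumed)."""
    call_en = any(call_en_k(cctrl[k], cstat[k], called[k]) for k in range(3))
    stall = any(stall_k(cctrl[k], cstat[k], called[k], call_blk_en) for k in range(3))
    return call_en, stall
```

For example, with pipeline B configured unclustered and idle, no outstanding call, and blocking disabled, the model asserts the scheduling enable and leaves pipeline A running.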

This achieves synchronization of the two pipelines: the configuration pipeline can safely schedule the operation pipeline to meet the needs of large-scale computation. The same synchronization idea applies to more than two pipelines, as long as a similar synchronization circuit is implemented.

The specific embodiments described above explain the purpose, technical solution, and beneficial effects of the invention in further detail. It should be understood that they are only specific embodiments of the invention and do not limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (4)

1. A multi-pipeline synchronization device, comprising: an instruction storage unit (106), a configuration pipeline A (104), an operation pipeline B (105), and a pipeline synchronization device (100), wherein the pipeline synchronization device (100) comprises a control status register transfer unit (101), a pipeline register transfer unit (102), and a synchronization logic unit (103); pipeline A (104) further comprises a general register (107); pipeline B (105) further comprises a control status register (108) and a pipeline configuration register (109); the control status register transfer unit (101) implements control status register transfers between the pipelines, including the control register configuration operation of pipeline A (104) on pipeline B (105) and the status register read operation of pipeline A (104) on pipeline B (105); the pipeline register transfer unit (102) implements transfers between the general register (107) and the pipeline configuration register (109); the synchronization logic unit (103) implements the synchronization function between the pipelines; the control register controls which operating state pipeline B (105) is in; the status register reflects the busy/idle state of pipeline B (105); the instruction storage unit (106) provides instructions to the pipeline register transfer unit (102); the general register (107) provides input to the control status register and the pipeline configuration register; and the pipeline configuration register (109) provides the configuration information required for pipeline B to run.
2. The device according to claim 1, characterized in that pipeline A is a configuration pipeline used to schedule pipeline B, and pipeline B is an operation pipeline used to meet large-scale computing needs.
3. The device according to claim 2, characterized in that pipeline B is divided into two clusters, X and Y, and has three working modes: unclustered, clustered with only X working, and clustered with only Y working.
4. The device according to claim 3, characterized in that the control register is denoted CCtrl and the status register is denoted CStat, both being 3-bit registers; a CCtrl value of xx0 means unclustered operation, 011 means clustered operation with only X working, and 101 means clustered operation with only Y working; CStat[0] indicates the busy/idle state of pipeline B in unclustered operation, where 1 means busy; CStat[1] indicates the busy/idle state of cluster X of pipeline B in clustered operation, where 1 means busy; and CStat[2] indicates the busy/idle state of cluster Y of pipeline B in clustered operation.
CN2013101392797A 2013-04-19 2013-04-19 Synchronous device for multi-assembly lines Pending CN103383641A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013101392797A CN103383641A (en) 2013-04-19 2013-04-19 Synchronous device for multi-assembly lines

Publications (1)

Publication Number Publication Date
CN103383641A true CN103383641A (en) 2013-11-06

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017088456A1 (en) * 2015-11-24 2017-06-01 中国科学院计算技术研究所 Pipeline data synchronization apparatus and method for multi-input multi-output processor

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5787489A (en) * 1995-02-21 1998-07-28 Micron Technology, Inc. Synchronous SRAM having pipelined enable
US20020126705A1 (en) * 2000-12-08 2002-09-12 Gentieu Paul R. Synchronous network traffic processor
CN1678988A (en) * 2002-09-04 2005-10-05 Arm有限公司 Synchronization between pipelines in data processing equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
何虎等: "面向寄存器的流水线处理器建模及验证方法", 《半导体学报》 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20131106