
CN102841828B - Fault detection and mitigation in logic circuits - Google Patents

Fault detection and mitigation in logic circuits

Info

Publication number
CN102841828B
Authority
CN
China
Prior art keywords
parallel
logical circuit
core
redundancy
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201110166907.1A
Other languages
Chinese (zh)
Other versions
CN102841828A (en)
Inventor
S·D·索伦森
S·索加尔德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Westinghouse Electric Corp
Original Assignee
Westinghouse Electric Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Westinghouse Electric Corp
Priority to CN201110166907.1A
Publication of CN102841828A
Application granted
Publication of CN102841828B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Hardware Redundancy (AREA)

Abstract

The present invention relates to fault detection and mitigation in logic circuits, and is directed to a method of monitoring logic circuits for faults. Specifically, the method builds parallel logic circuit cores in which faults are detected by a redundancy checker that compares equivalent parallel paths at critical locations. Any mismatch results in a predetermined fail-safe mode of operation. In addition, techniques are applied to exercise each parallel path periodically, so that the parallel cores are verified in a manner that does not interfere with any process being monitored or controlled. This feature is important in certain industries, such as the nuclear power industry, where safety-critical operations demand a very high level of reliability from logic circuit blocks that may be used only infrequently.

Description

Fault Detection and Mitigation in Logic Circuits

Cross-Reference to Related Applications

This application is a continuation-in-part of US Patent Application No. 12/026,703, filed February 6, 2008. The entire prior application is hereby incorporated by reference.

Technical Field

The present invention relates generally to a method for designing high-integrity logic circuits. It is particularly directed to safety-related control systems, including nuclear power plant reactor protection systems, where integrity and reliability are paramount. The invention is specifically directed to implementing these methods in a logic device such as a PAL, CPLD, FPGA, ASIC, or gate array, or in a combination of multiple logic devices. The logic device is typically mounted on a printed circuit board.

Background

Others have attempted to improve the reliability of mission-critical logic components in computerized systems. For example, US Patent 7,290,169 describes a core-level processor lockstep system in which two microprocessors operate in parallel and each provides external output signals that are compared. The microprocessors are meant to operate in lockstep, that is, in a tightly coordinated manner, so that their outputs match reliably. In practice, this approach has many problems for safety-critical systems. It is difficult to keep the microprocessors fully in lockstep, and faults may remain undiscovered in the system until it is actually used.

US Patent 7,237,144 presents a similar operating concept with similar difficulties, but provides off-chip lockstep checking to combat "soft errors". It has the same difficulties just described.

US Patent 6,233,702 describes a complex multiprocessor system that provides fault-tolerant data processing by combining hardware techniques (fail-functional, employing redundancy) with software techniques (fail-fast, for example software recovery aided by high-data-integrity hardware). Its error checking explicitly avoids using redundancy to compare critical data points between parallel processors, and instead compares only points that operate at slower rates, such as I/O points or main memory. The design is overly complex and has problems related to unannounced errors, which are discussed below. It is a software-based system, with problems that are also discussed below.

US Patent 7,134,104 describes a method of improving fault tolerance in an FPGA by creating at least three parallel copies of a logic function and then using a voting scheme to determine whether any particular copy is in error. Although this method generally improves fault tolerance, it is not a satisfactory solution for safety-critical environments, where it cannot be assumed that the majority vote is always the fault-free result.

US Patent 5,144,230 describes a self-testing circuit that uses a method called cycle stealing. Output signals from the "circuit under test" are tested by selectively applying test input signals whenever the output signals are not required to perform their normal function. Although this is one possible way of checking a processor, the testing provides no protection against faults that affect related systems. Where parallel redundancy is used, a voter scheme determines the fault-free result. These methods are unacceptable for safety-critical environments, where highly reliable systems are expected.

US Application 2007/0022348 describes parallel lockstep cores similar to those of US 7,290,169, except that intermediate values from the cores are compared along with the outputs. However, this system has all the problems of keeping two cores in lockstep. For example, when an error occurs, the cache must be loaded into system memory to ensure that lockstep is maintained. The cache must also be maintained and verified on an ongoing basis whenever there are system or programming changes. The system is also software based.

There is a need in the art for a highly reliable system that is not software based. In safety-critical systems such as nuclear power plant protection systems, for example, reliance on executable software is undesirable because of the nature of the potential errors. Software has inherent operational problems that are difficult to resolve, and even relatively simple systems require a significant amount of program code. In particular, software-based microprocessor systems are subject to common-mode failures, in which parallel redundant systems may fail simultaneously because of a single fault condition.

Regardless of the redundancy that may be built into a software-based microprocessor system, a fault may still happen to affect enough redundant functions that the fault-free result cannot be correctly selected, and the system then experiences a common-mode failure. A common-mode failure can be caused by a single fault or by several faults. Microprocessor-based systems are known to be susceptible to common-mode failure when redundant copies of the software fail under the same fault. Common-mode failure in particular makes software-based microprocessor systems undesirable in plant protection systems.

For the purposes of the present invention, the following definitions apply. A failure is the termination of the ability to perform a required function (see also task failure). A failure that is not announced and remains undetected until the next test is called an unannounced failure. A failure that is announced and detected at the instant it occurs, by any of a number of methods, is called an announced failure. A task is the single goal, job, or purpose of an item or system. A task failure is the inability to complete a specified task within specified limits. A critical function is a function in a logic circuit that the circuit requires in order to perform its task.

In a safety-related control system, a high-integrity system has two key characteristics:

1) It performs its task when called upon. The task is typically to actuate field devices when a predetermined set of input conditions occurs. For there to be a high degree of assurance that the task will be performed when called upon, no unannounced faults may exist in the system, because an unannounced fault can cause the system to fail at the very moment its task is demanded. This means that all faults must be detected and announced.

2) Spurious actuation of the control system due to logic circuit faults must be avoided. Such actuations cause the field devices to perform their safety functions, which is often costly. To achieve this, all faults must be isolated and contained before they reach the field devices.

A common method of increasing the reliability and availability of logic circuits used in critical applications is triple (or higher) modular redundancy (TMR). It is commonly used in nuclear, space, and military applications. With TMR logic circuits, fault tolerance is achieved by a majority voting scheme: if a majority of the redundant logic circuits are fault free, the system performs its function. Unfortunately, if it is the majority rather than the minority that is in error, the system will use the erroneous result in its function.
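The weakness of majority voting described above can be illustrated with a minimal Python sketch. The function name `majority_vote` and the example channel values are assumptions made for illustration and do not come from the patent.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value reported by a strict majority of redundant channels."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority among redundant outputs")
    return value


# One faulty channel is outvoted, so the fault is masked.
print(majority_vote([1, 1, 0]))

# Two channels that fail the same way outvote the single healthy channel,
# and the erroneous value is passed on as if it were correct.
print(majority_vote([0, 0, 1]))
```

In the second call the voter has no way of knowing that the minority channel was the healthy one, which is the situation the text identifies as unacceptable for safety-critical use.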

If faults are allowed to accumulate in a TMR system, the results can be catastrophic. In particular, in a safety-critical application the system may fail in its function of shutting the process down, or of taking appropriate corrective action to eliminate a problem before it becomes critical.

Faults in TMR logic circuits can be detected by comparing the outputs of the redundant logic circuits. This cannot, however, detect unannounced faults, that is, faults in the logic circuit that do not cause an output change. Unannounced faults in the system are not discovered until the specific logic function is exercised, that is, until the specific logic path is used.

Unannounced faults are especially problematic in nuclear safety systems, which are usually in a "waiting" state in which no inputs or outputs change state. A safety system may remain in this state for a long time, allowing unannounced faults to accumulate. Unannounced faults may go undetected for weeks, months, or even years.

Adding TMR to a system inherently adds complexity, which reduces overall reliability. The added auxiliary logic and programming also increase maintenance. Adding further redundant modules (four or more) improves protection against unannounced faults by lowering the probability that they accumulate and affect the voting logic, but at the cost of proportionally reduced reliability and increased complexity.

Summary of the Invention

The present invention is directed to methods of creating high-integrity logic circuits and monitoring them to verify their correct operation. Specifically, the method builds parallel logic circuit cores in which faults are detected by a redundancy checker that compares equivalent parallel paths at critical locations. Any mismatch results in a predetermined fail-safe mode of operation. In addition, methods have been developed to exercise each parallel path periodically, so that the logic circuit paths are exercised in a way that exposes unannounced faults without interfering with any process being monitored or controlled.

Brief Description of the Drawings

Figure 1 is a diagram of an implementation of two parallel cores using a redundancy checker.

Figures 2 and 3 are further diagrams of an implementation of two parallel cores using a redundancy checker.

Figure 4 shows important details of the built-in self-test of the present invention.

Detailed Description

The main object of the present invention is to provide a highly reliable logic circuit that is assured of performing its intended task when called upon.

Another object of the present invention is to provide a method for designing fail-safe logic circuits that are implemented within a single logic device such as a PAL, CPLD, ASIC, gate array, or FPGA. Alternatively and equivalently, the logic circuits are implemented in a combination of multiple logic devices on a single printed circuit board (PCB), or in a combination of multiple printed circuit boards each carrying one or more logic devices such as PALs, CPLDs, FPGAs, ASICs, or gate arrays.

The present invention can incorporate redundancy and/or fault tolerance at the application level by having multiple parallel systems capable of performing the task. One approach is to have two or more parallel systems capable of performing the task; if one of them fails and enters its fail-safe state, the others remain able to perform the task. Another way to improve integrity is to have three or more parallel logic circuit cores, of which two provide fail-safe operation while the third is offline in test mode. The cores are then rotated periodically, so that at least two cores are always online and one is always under test. Alternatively, a test schedule is established in which all cores are normally online and, periodically, one core is forced offline for testing.
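The rotation of cores between online operation and test mode can be sketched as a simple round-robin schedule. This is a minimal illustration only; the function name `rotation_schedule` and the core labels are hypothetical, and the patent does not prescribe any particular scheduling algorithm.

```python
from itertools import cycle


def rotation_schedule(core_ids, periods):
    """Return (online, under_test) for each test period, rotating the tested core."""
    if len(core_ids) < 3:
        raise ValueError("need at least three cores to keep two online")
    schedule = []
    tested = cycle(core_ids)
    for _ in range(periods):
        under_test = next(tested)
        online = [c for c in core_ids if c != under_test]
        schedule.append((online, under_test))
    return schedule


for online, under_test in rotation_schedule(["A", "B", "C"], 3):
    print(f"online={online} under_test={under_test}")
```

With three cores, every period leaves exactly two cores online for the fail-safe comparison while the third is exercised, and every core is tested once per three periods.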

The parallel logic cores are either exact replicas, or they are replicated similarly so as to perform the same task. In the latter case the cores are dissimilarly replicated cores, also called parallel dissimilar cores.

The present invention is applicable to industrial process monitoring and control. It is particularly directed to safety-critical control systems, including nuclear power plant reactor protection systems, where reliability and integrity are paramount.

Any logic circuit is susceptible to errors such as the following:

1. Single-event effects (SEE) caused by cosmic rays or high-energy protons: single-event upsets (SEU) that cause transient pulses in logic or bit flips in memory cells and registers, and single-event latch-up (SEL).

2. Electrostatic discharge (ESD) and electrical overstress (EOS).

3. Flash cell decay or failure caused by device faults, device design faults, or overheating.

4. Manufacturing faults and/or aging-related faults, such as oxide failures, metal layer failures, electromigration, bond wire corrosion, and contamination from moisture or from chemicals used in the process.

In safety-critical systems, such as those in nuclear power plants, these items are receiving increasing attention.

Common to all of the above faults is that they generally occur randomly in time and location, and typically affect only one transistor or a small number of transistors. These errors can nevertheless cause major problems.

The present invention describes a method for designing logic circuits in which faults are automatically detected and mitigated in such a way that other related systems are not adversely affected.

The present invention keeps added complexity to a minimum and increases overall reliability with minimal maintenance.

The present invention can be combined with fault-tolerance schemes.

One embodiment of the present invention is a combination of the following three techniques:

1. Using parallel redundant cores to ensure that all faults are immediately detected and isolated by a redundancy checker.

2. Using a built-in self-test engine to exercise the critical functions within the cores and so prevent unannounced faults. A fault is unannounced if it is not detected before the affected function is actually used.

3. Inherently protecting the interfaces through which the parallel redundant cores communicate with the outside, by means of:

a) Protection of serial or parallel interfaces by redundancy or by a cyclic redundancy check (CRC).

b) A "toggle test" for inputs. The toggle test is a method of ensuring that the input circuits and their connections are functioning. It typically involves disconnecting the input from its source device and applying a test input signal to the logic circuit. If the input reflects the test input, the input circuit can be determined to be functioning.

c) Independent readback of outputs. This is an independent method of verifying the state of an output by including feedback to an input. An example is verifying that a relay has in fact actuated when requested, by using a spare contact on the relay to drive an input. Various other analog and digital outputs can be wired to inputs, in series or in parallel, to be verified in this way.
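The input toggle test in item b) can be modeled in a few lines of Python. This is a behavioral sketch under stated assumptions, not the patented circuit: the class `InputChannel`, its `stuck_at` fault model, and the function `toggle_test` are all hypothetical names introduced here.

```python
class InputChannel:
    """Model of one discrete input with a test multiplexer in front of it."""

    def __init__(self, source, stuck_at=None):
        self.source = source        # callable returning the field value (0 or 1)
        self.stuck_at = stuck_at    # None = healthy, 0 or 1 = simulated stuck input
        self.test_value = None      # None = input is connected to the source

    def read(self):
        raw = self.source() if self.test_value is None else self.test_value
        return raw if self.stuck_at is None else self.stuck_at


def toggle_test(channel):
    """Disconnect the source, drive 0 then 1, and check that the input follows."""
    try:
        for level in (0, 1):
            channel.test_value = level
            if channel.read() != level:
                return False
        return True
    finally:
        channel.test_value = None   # always reconnect the source afterwards


healthy = InputChannel(source=lambda: 1)
stuck = InputChannel(source=lambda: 1, stuck_at=1)
print(toggle_test(healthy), toggle_test(stuck))
```

Note that the stuck input in this example reports the same value as its source, so it would look healthy in normal operation; only driving both levels exposes it, which is the kind of unannounced fault the toggle test is meant to catch.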

In a preferred embodiment of the present invention, a built-in self-test (BIST) structure is placed on the programmable logic device, and its functions are performed in a manner that does not affect the logic circuit outputs. The important feature of the BIST is that it exposes any unannounced faults in the parallel cores. The BIST has the following important functions:

1. The BIST engine tests the parallel cores by applying pseudo-random input stimuli.

2. The BIST engine tests the parallel cores by applying a planned or programmed sequence of input stimuli.

3. It tests all state transitions and output combinations.

4. It verifies the ability of the parallel cores to perform their task.

5. It performs any single one or any combination of the above.

In addition, in one embodiment, the BIST tests the parallel cores by:

1. Monitoring critical internal states from the cores.

2. Monitoring critical outputs from the cores.

3. Testing the two redundant cores against each other by comparing them at selected locations.

4. "Accumulating" the internal states from each parallel core into a checksum.

5. "Accumulating" the output responses from each parallel core into a checksum.

6. Performing any single one or any combination of the above.
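The combination of pseudo-random stimulus and checksum accumulation can be sketched as follows. The choices made here are assumptions for illustration: a 16-bit Galois LFSR as the stimulus generator, CRC-32 (via Python's `zlib.crc32`) as the accumulator, and a hypothetical limit-check function standing in for a core. The patent does not specify any of these.

```python
import zlib


def lfsr16(seed=0xACE1):
    """16-bit Galois LFSR (feedback mask 0xB400); yields the seed first."""
    state = seed
    while True:
        yield state
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= 0xB400


def signature(core, n_stimuli=64):
    """Apply pseudo-random stimuli to `core` and fold its outputs into a CRC-32."""
    sig = 0
    stimuli = lfsr16()
    for _ in range(n_stimuli):
        out = core(next(stimuli)) & 0xFF
        sig = zlib.crc32(bytes([out]), sig)
    return sig


def healthy_core(x):
    return 1 if x >= 0x8000 else 0   # hypothetical limit check


def stuck_core(x):
    return 0                         # same function with its output stuck at 0


reference = signature(healthy_core)
print("healthy matches reference:", signature(healthy_core) == reference)
print("stuck matches reference:", signature(stuck_core) == reference)
```

Because the stimulus sequence is deterministic, a fault-free core always produces the same signature, so a single stored or cross-compared value stands in for the whole response sequence. A compacted signature can in principle alias, which is one reason a real design might also compare selected states directly.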

In an important embodiment, the test method by which the BIST verifies the parallel cores is:

1. Place one of the parallel cores in test mode, so that it does not affect the state of any input or output.

2. Disable the redundancy checker for the core under test.

3. Apply a predetermined set of inputs, as described previously, to at least one input or internal state of the core under test.

4. Verify the core's response to the inputs by monitoring internal state changes and core outputs against a checksum or against a predetermined pattern.

5. Restore the core and the disabled redundancy checker to normal operation.
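The five steps above can be sketched as a small Python model. Everything here is illustrative: the `Core` class, its frozen-output behavior in test mode, the `checker_enabled` dictionary, and the test vectors are assumptions, not details taken from the patent.

```python
class Core:
    """Hypothetical limit-check core whose output is frozen during test."""

    def __init__(self, limit):
        self.limit = limit
        self.test_mode = False
        self.output = 0              # value seen by the rest of the system

    def evaluate(self, value):
        result = 1 if value > self.limit else 0
        if not self.test_mode:       # external output is not updated under test
            self.output = result
        return result


def run_bist(core, checker_enabled, core_id, vectors):
    """Isolate the core, stimulate it, verify the responses, then restore."""
    core.test_mode = True                       # step 1
    checker_enabled[core_id] = False            # step 2
    try:
        # steps 3 and 4: apply each input and compare with the expected pattern
        return all(core.evaluate(v) == expected for v, expected in vectors)
    finally:
        core.test_mode = False                  # step 5
        checker_enabled[core_id] = True


core = Core(limit=100)
enabled = {"A": True}
passed = run_bist(core, enabled, "A", [(0, 0), (100, 0), (101, 1)])
print(passed, core.output, enabled["A"])
```

The `try`/`finally` structure reflects the requirement that the core and its redundancy checker return to normal operation whatever the test outcome; the test vector `(101, 1)` exercises the trip path without the external output ever changing.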

In another embodiment, the test method by which the BIST verifies the parallel cores is:

1. Place the logic circuit in a test mode in which the state of any output is unaffected.

2. Apply the same set of predetermined inputs, as described previously, to all parallel cores.

3. Have the redundancy checker verify the responses of all parallel cores.

In a preferred embodiment, multiple barriers are in place to ensure that the logic circuit cannot continue operating after a redundancy error has occurred. In a plant protection environment, a fail-safe signal is sent to all affected parallel cores to halt all operation. All properly functioning cores will obey this signal and stop operating. One of the mismatched parallel cores that caused the condition, however, may be unable to obey the signal, for the same reason that caused the error. To address this:

1. Communication with other systems is structured so that the parallel cores must match for the communication to succeed. In this way, a failed logic circuit cannot pass erroneous data to unaffected or related systems. This is done by:

a) AND or OR gating of the communication data, so that a mismatch deliberately produces an invalid CRC checksum.

b) AND gating of the communication data output enable. This prevents the data from being transmitted.
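The effect of AND gating the two cores' communication data can be explored with a short Python sketch. The frame layout (payload followed by a big-endian CRC-32 from `zlib.crc32`) and all function names are assumptions chosen for illustration. Note also that bitwise AND does not mathematically guarantee an invalid CRC for every possible mismatch, so the sketch counts rejections over a set of mismatched payload pairs rather than asserting it for a single pair.

```python
import zlib


def build_frame(payload):
    """Payload followed by its CRC-32 (big-endian), as one core would send it."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def and_gate(frame_a, frame_b):
    """Bitwise AND of the two cores' frames, as a hardware AND gate would do."""
    return bytes(a & b for a, b in zip(frame_a, frame_b))


def frame_is_valid(frame):
    """Receiver-side check: recompute the CRC over the payload and compare."""
    payload, crc = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == crc


def gated_enable(enable_a, enable_b):
    """The output enable is asserted only when both cores assert it."""
    return enable_a and enable_b


# Matching cores: the gated frame equals either frame and is accepted.
same = build_frame(b"TRIP=0")
print(frame_is_valid(and_gate(same, same)))

# Mismatching cores: count how many gated frames the receiver rejects.
pairs = [(x, y) for x in range(16) for y in range(16) if x != y]
rejected = sum(
    not frame_is_valid(and_gate(build_frame(bytes([x])), build_frame(bytes([y]))))
    for x, y in pairs
)
print(f"{rejected} of {len(pairs)} mismatched frames rejected")
```

When the cores agree, gating is transparent. When they disagree, the payload and CRC fields are each corrupted independently, so the receiver's CRC check is very likely to fail; the separate output-enable gate in item b) covers the remaining cases by blocking transmission outright.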

The preferred embodiment of the present invention uses an FPGA to implement the basic control functions. Other embodiments use alternatives to the FPGA, including ASICs (application-specific integrated circuits), CPLDs (complex programmable logic devices), gate arrays, and PALs (programmable array logic). These devices are generally referred to as programmable logic devices, complex logic devices, or logic devices. With appropriate programming, all of these devices can operate without the use of executable software. Systems managed by these devices can be described as hardware-based systems.

The logic device is programmed with logic that can be customized to the requirements of a given application and may contain any type of digital building block, typically including AND gates, OR gates, XOR (exclusive-OR) gates, flip-flops (D, JK, SR), counters, timers, multipliers, and finite state machines (FSMs). When properly programmed, the logic device behaves in a highly predictable, largely deterministic manner.

In an important embodiment, the logic circuit is described at the register-transfer level, using a hardware description language such as Verilog or VHDL, or schematic capture. The entire logic circuit, or the critical functions of the logic circuit, is duplicated in redundant cores. The inputs to the cores are designed so that the inputs are assured of being transferred without error to the internal core registers.

The logic circuit receives external inputs. These inputs may include any of the following: serial interfaces protected by redundancy, discrete inputs, or digitized analog values. Critical inputs are assured by redundancy testing, XOR toggle testing, CRC, and/or external loopback testing. Any input test is implemented in a manner that does not affect the input data. Typical input circuits include bus communication circuits (serial or parallel), digital channels (serial or parallel), communication circuits (serial or parallel), digital circuits (serial or parallel), and digitized analog circuits.

The outputs from the parallel cores are designed so that the outputs are assured of functioning. This assurance comes from redundancy testing, XOR toggle testing, CRC, and/or external loopback testing. The external loopback test is an independent verification of an output signal, performed by routing the output signal back to an input. The output signal is then compared with the value actually measured. Typical output circuits include bus communication circuits (serial or parallel), digital channels (serial or parallel), communication circuits (serial or parallel), digital circuits (serial or parallel), and digitized analog circuits.

The I/O of the logic circuit typically includes the following important features:

1. Serial or parallel interfaces protected by redundancy.

2. Serial or parallel interfaces from the redundant cores that are ANDed or ORed together by the redundancy checker in the CRC, ensuring that when a fault occurs, the resulting CRC failure stops all communication to other systems.

3. Inputs from discrete inputs.

4. Discrete outputs, which can drive relays, solid-state relays, field components, or other system inputs.

5. Critical outputs tested by means such as:

a) redundancy,

b) XOR toggle testing,

c) CRC, and

d) external loopback testing.

Output testing is implemented in a manner that does not cause undesired field actuation.

In a preferred embodiment, the BIST is implemented as follows:

1. It is designed to exercise the critical functions, for example by traversing all states of a finite state machine, or only a specific set of states.

2. The critical functions of the logic circuit are identified and tested for satisfactory operation. This may include all functions of the circuit.

3. Test input signals are injected in such a way that the logic circuit can be shown to be free of stuck-at faults.

4. It is designed so that it does not affect the outputs. This can be done by:

a) freezing the outputs during the test, or

b) performing the test during periods in which the outputs are not updated.

5. The operation of the logic circuit is verified by:

a) Having the BIST engine verify functionality by monitoring the core outputs and the internal states, that is, the critical values or registered values in the core, also called critical states. A form of data compression can be used to simplify the core output or internal state conditions produced in response to the BIST input stimuli.

b) Having multiple BIST engines run synchronized routines across the redundant cores. In this case the BIST engines do not need to verify the outputs. That is done by the redundancy checker, which can compare the two cores at critical points, or compare the outputs of the two cores for a match.

6. On completion of the BIST, the logic circuit is restored to its proper state, that is, any parallel core under test is restored to normal operation.

在优选的实施例中,冗余校验器逻辑电路用于确定逻辑电路是否有故障,并且将逻辑电路置于故障自动防护状态。冗余校验器监视逻辑电路结构中的关键冗余校验点,也就是说,将来自具体电路的信号-该具体电路来自冗余逻辑核心中的每一个,连线到冗余校验器逻辑电路上。冗余校验器然后通过将来自用于准确匹配的冗余核心的每一个的两个信号相比较,寻找两个核心之间的差异。如果值不匹配,则检测到冗余故障(即差错)。另外,通过比较关键信号(即,关键数据)来实施冗余校验器,这些关键信号优选地包括关键内部状态和输出两者。In a preferred embodiment, redundancy checker logic is used to determine if the logic is faulty and place the logic in a fail-safe state. The redundancy checker monitors critical redundancy checkpoints in the logic circuit structure, that is, signals from specific circuits from each of the redundant logic cores are wired to the redundancy checker on the logic circuit. The redundancy checker then looks for differences between the two cores by comparing the two signals from each of the redundant cores for an exact match. If the values do not match, a redundancy failure (ie error) has been detected. Additionally, the redundancy checker is implemented by comparing critical signals (ie, critical data), which preferably include both critical internal states and outputs.

In a preferred embodiment, because the system is hardware based, there should be no mismatch between the parallel redundant cores. They receive the same inputs at exactly the same moment, and the cores operate in full synchronization.

By monitoring the internal states and outputs of each redundant core, the redundancy checker immediately detects a change of state in a critical function, such as an unexpected actuation signal generated by a core because of a fault. If the redundancy checker did not mitigate the fault and force the logic circuit into the fail-safe state, the fault could propagate to associated systems and cause undesired equipment transients.

In a preferred embodiment, the critical functions of the logic circuit monitored by the redundancy checker include logic decisions, limit checks, state machines, detection logic, and control logic.

In another important embodiment of the invention, the parallel cores are not exact replicas. That is, the parallel cores perform the same task or function but differ in design. Such cores are called parallel dissimilar cores. Dissimilarity can be established through how the program is physically placed within the FPGA, for example by changing how the interconnect resources are used, or through minor programming differences between programmers given the same task. The dissimilarity can be very large if different logic devices are used in the implementation, for example FPGAs from different vendors or a microprocessor that executes part of the logic.

Dissimilarity is a very important operational safety feature that ensures a programming error will not affect the overall safety of the operation. Two, three, or more cores can be programmed independently by two or more programmers. To enhance dissimilarity, different programmers are assigned different approaches, even for fairly straightforward programming tasks. Methods of ensuring dissimilar or different implementations include dissimilar state encodings, such as "one-hot" versus "Gray code", working with or without hierarchy optimization, working with or without flattening, and varying how the program is placed on a complex logic device.
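
As a small illustration of dissimilar state encodings, the sketch below encodes the same state index as a one-hot code and as a Gray code. The helper names are hypothetical. The text does not say how a checker would reconcile two encodings; comparing the decoded state indices, as done here, is an assumption made only for the example.

```python
def one_hot_encode(state: int, n_states: int) -> int:
    """Encode a state index as a word with exactly one bit set."""
    if not 0 <= state < n_states:
        raise ValueError("state out of range")
    return 1 << state


def one_hot_decode(code: int) -> int:
    """Recover the state index; reject words without exactly one bit set."""
    if code <= 0 or code & (code - 1):
        raise ValueError("invalid one-hot code")
    return code.bit_length() - 1


def gray_encode(state: int) -> int:
    """Binary-reflected Gray code of a state index."""
    return state ^ (state >> 1)


def gray_decode(code: int) -> int:
    """Invert gray_encode by XOR-ing together all right shifts of the code."""
    state = 0
    while code:
        state ^= code
        code >>= 1
    return state


# Core A stores its state one-hot, core B stores it as a Gray code.
# The raw registers differ, but the decoded state indices agree.
for s in range(8):
    reg_a = one_hot_encode(s, 8)
    reg_b = gray_encode(s)
    assert one_hot_decode(reg_a) == gray_decode(reg_b) == s
```

A one-hot register also offers a cheap validity check of its own, since any word with zero or several bits set is an illegal state.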

Where parallel dissimilar cores are used, the redundancy checker compares values from selected points within the cores, values from the output points, or both.

In one embodiment of the invention, dissimilarity can be extended to include a microprocessor with executable software operating in parallel with an FPGA-based system that uses no executable software. For example, one parallel core may be implemented in a logic device and another parallel core in a software-based processor device. The redundancy checker is then used to examine the outputs of the two cores and monitor for mismatches.

In the case of a software-based parallel core, the built-in self-test includes features that ensure correct operation and the detection of unannounced faults through a combination of monitors, run-time assertions, and self-tests. In a preferred embodiment, the software-based BIST is designed to test the processor using the techniques already described: exercising the critical functions, injecting test input signals, freezing the outputs during the test, performing the test during periods in which the outputs are not updated, verifying the operation of the processor, and verifying functionality by monitoring critical code values or registers. On completion of the BIST, the processor is restored to its proper state.

Figure 1 illustrates an implementation of two parallel cores with a redundancy checker. A first core, CORE A 101, and a second core, CORE B 102, are parallel, redundant representations of the logic circuit. The REDUNDANCY CHECKER circuit 103, already described, verifies the integrity of the cores' operation. BIST 104, 105 is shown as part of each core structure but could equally be shown separately. The entire logic circuit structure is within a single FPGA 106 or other logic device. Alternatively, the logic circuit may be placed on multiple logic devices. CORE A and CORE B receive the same inputs, and the REDUNDANCY CHECKER monitors their outputs for an exact match. The outputs of the two cores and the fail-safe signal from the REDUNDANCY CHECKER are output from the FPGA. An output fail-safe gate is used but is not shown in Figure 1; that feature is described in Figure 2.

Figure 2 shows an implementation of two parallel cores using another embodiment of the redundancy checker. Two parallel redundant cores 215, 225 implement the logic circuit. Additional details of the redundant cores are shown, including input registers 210, 220, output registers 211, 221, and built-in self-test (BIST) features 214, 224. The redundancy checker 205 provides reliability and error checking and activates the fail-safe mode 203. Part of each redundant core is a critical function 212, 222, which holds critical state 213, 223 variables or information. This information is used for error checking by the redundancy checker 205, as shown.

Inputs 201 flow into the parallel input registers 210, 220. The inputs are used by the logic circuit according to the system design, and the output registers 211, 221 are updated. The core outputs then flow from the output registers through the output fail-safe gate 204, where they are combined to become the outputs 202 for the system. This is the GateON communication data output enable, which prevents data transmission when the redundancy checker has detected a fault. When an error is detected, the redundancy checker 205 activates the output fail-safe 203 to alert the system. The fail-safe may be a relay contact closure, an alarm, or some form of communication. The entire logic circuit 200 resides on a single logic device, such as a PAL, CPLD, FPGA, ASIC, or gate array. Alternatively, the logic circuit may be placed on multiple logic devices.
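
The gating behaviour of the output fail-safe gate can be modelled roughly as below. The function name is hypothetical, and the way the two core outputs are "combined" is not specified in the text; forwarding the agreed value and blocking on disagreement is an assumption made for this sketch.

```python
from typing import Optional, Sequence


def output_fail_safe_gate(
    out_a: Sequence[int],
    out_b: Sequence[int],
    fault_detected: bool,
) -> Optional[tuple]:
    """Return the system output, or None when transmission is blocked."""
    # The gate is enabled only while the redundancy checker reports no fault.
    # Assumption: the gate also blocks if the two output registers disagree.
    gate_on = not fault_detected and tuple(out_a) == tuple(out_b)
    if not gate_on:
        return None  # data transmission is prevented
    return tuple(out_a)  # both cores agree, so either copy can be forwarded


print(output_fail_safe_gate((1, 0), (1, 0), fault_detected=False))
print(output_fail_safe_gate((1, 0), (1, 0), fault_detected=True))
```

In this model a blocked output is represented by `None`; in a real system the equivalent would be a de-asserted output enable rather than a special value.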

Figure 2 is another embodiment of the redundancy checker, similar to Figure 1. In Figure 2, the redundancy checker additionally uses the critical states (i.e., values) within each redundant core for the comparison. This additional information is used to reveal unannounced faults quickly.

The BIST in this case monitors the redundant core during the self-test.

Similarly, Figure 3 shows another embodiment of the redundancy checker. Two parallel redundant cores 315, 325 implement the logic circuit, which uses input registers 310, 320, output registers 311, 321, and built-in self-test (BIST) features 314, 324. The redundancy checker 305 provides reliability and error checking and activates the fail-safe mode 303. Part of each redundant core is a critical function 312, 322, which holds critical state 313, 323 variables or information. This information is used for error checking by the redundancy checker 305, as shown.

As before, inputs 301 flow into the parallel input registers 310, 320. The inputs are used by the logic circuit according to the system design, and the output registers 311, 321 are updated. The core outputs then flow from the output registers through the output fail-safe gate 304, where they are combined to become the outputs 302 for the system. When an error is detected, the redundancy checker 305 activates the output fail-safe 303 to alert the system. The entire logic circuit 300 resides on a single logic device. Alternatively, the logic circuit may be placed on multiple logic devices.

The BIST in this case additionally uses the critical states and the output registers in the self-test.

Figure 4 shows important details of a typical built-in self-test (BIST) 314; in this case it gives additional detail of Figure 3. The output register values from the redundant core 315 and the critical states 313 are input to an output verification routine 401, which feeds the BIST finite state machine (FSM) 402. The BIST is controlled by the FSM. When activated by an operator, a timer, or an event, the BIST generates input stimuli 403 to the input register 310, either as a random sequence or as a programmed sequence. The BIST monitors the redundant core, the redundant core outputs, and the critical states to verify correct operation. The verification includes comparison against a stored reference, comparison against another redundant core, or generating a checksum of the monitored outputs and checking it against a reference checksum.
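
The checksum variant of this verification can be sketched as follows. Everything here is illustrative: the 16-bit LFSR used as the stimulus generator, the use of CRC-32 from Python's `zlib` as the checksum, and the toy `healthy_core` and `faulty_core` functions are all assumptions, since the text names neither a generator nor a checksum algorithm.

```python
import zlib


def lfsr16(seed: int, count: int):
    """Yield `count` pseudo-random 16-bit words from a Fibonacci LFSR."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    for _ in range(count):
        # Feedback taps for x^16 + x^14 + x^13 + x^11 + 1.
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield state


def run_bist(core, seed: int = 0xACE1, count: int = 64) -> int:
    """Drive `core` with the stimulus sequence and compress its outputs."""
    checksum = 0
    for word in lfsr16(seed, count):
        out = core(word) & 0xFFFF
        # Running CRC-32 over the monitored outputs (the "compressed form").
        checksum = zlib.crc32(out.to_bytes(2, "big"), checksum)
    return checksum


def healthy_core(x: int) -> int:
    """Toy stand-in for the logic under test."""
    return (x * 3 + 1) & 0xFFFF


def faulty_core(x: int) -> int:
    """Same logic with output bit 2 stuck at 1."""
    return healthy_core(x) | 0x0004


reference = run_bist(healthy_core)          # stored reference checksum
core_ok = run_bist(healthy_core) == reference
fault_found = run_bist(faulty_core) != reference
```

Because the stimulus sequence is deterministic for a given seed, the reference checksum can be computed once and stored. A checksum can in principle collide, so a matching value indicates correct behaviour with very high probability rather than with certainty.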

A preferred embodiment of the invention runs the BIST during normal operation of the logic circuit. That is, the BIST is activated while the logic circuit is performing its task. This is done without affecting other systems or outputs by methods that include:

1. Freezing the outputs during the test.

2. Performing the test during periods in which the outputs are not updated.

3. Placing one of the parallel cores in a dedicated test mode; isolating it so that it does not affect the state of any input or output; and disabling the redundancy checker associated with the core under test.
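
The first method in the list, freezing the outputs during the test, might look like this in a simplified software model. The `OutputRegister` class and the `run_test_with_frozen_output` helper are hypothetical names introduced only for illustration.

```python
class OutputRegister:
    """Output register that can hold its last value while a test runs."""

    def __init__(self, initial=0):
        self.value = initial
        self.frozen = False

    def update(self, new_value):
        # Updates are ignored while frozen, so test activity stays invisible.
        if not self.frozen:
            self.value = new_value


def run_test_with_frozen_output(register, core, test_inputs, expected):
    """Run `core` on test inputs with the output frozen; return pass/fail."""
    register.frozen = True
    try:
        results = []
        for x in test_inputs:
            y = core(x)
            register.update(y)  # no effect: the output is frozen
            results.append(y)
        return results == list(expected)
    finally:
        register.frozen = False  # restore normal operation


reg = OutputRegister(initial=7)
passed = run_test_with_frozen_output(reg, lambda x: x + 1, [1, 2], [2, 3])
```

After the call, `reg.value` is still 7 and the register accepts updates again, which mirrors the requirement to restore the circuit to its proper state once the BIST completes.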

A typical task of the logic circuit is to provide a processing function between the inputs and the outputs according to the design. The design may be a standby state or a safety-related function, such as one in a power plant protection system. If the design is for process control, it may be more involved.

The task of the logic circuit may also include interfacing with control circuits. These control circuits include external logic, decision, detection, and control circuits, which are common in process control and in safety-related equipment decisions. They may be binary (on/off) circuits, or they may be associated control circuits that include sensors, switches, process controllers, and actuators. They may be part of relay-based systems and of interfaces to other computerized systems.

In another embodiment of the invention, the redundancy checker is not placed on the logic device on which the parallel cores are placed. It is placed independently on another logic device and is connected by a communication path to the outputs of the cores in order to provide the redundancy check. The redundancy checker then operates as described for Figures 1-3, providing the fail-safe signal and so on.

In a preferred embodiment, the invention is based on a hardware platform rather than a software-based microprocessor system. In a significant departure from software-based microprocessor architectures, the logic circuit is implemented in a logic device, which eliminates executable software and the problems associated with software-based microprocessor systems, such as software common-mode failures. The result is a highly reliable system suitable for safety-critical control systems, including reactor protection systems in nuclear power plants.

Although various embodiments of the invention have been described, those skilled in the art may modify and adapt the invention for various modes of operation. The invention is therefore not limited to the description and drawings presented here, and it includes all such embodiments, changes, and modifications encompassed by the scope of the claims.

Claims (3)

1. A high-integrity logic circuit (200), comprising:
a. a plurality of parallel cores (215, 225), wherein the parallel cores implement the critical functions of the logic circuit,
b. wherein the parallel cores (215, 225) are redundant or dissimilar,
c. a redundancy checker (205), wherein the redundancy checker is used to:
i. check whether a plurality of values (213) in the first parallel core (215), between the input (201) and the output (202) of the first parallel core (215), match a plurality of values (223) in the second parallel core (225), between the input (201) and the output (202) of the second parallel core (225), and
ii. activate the logic circuit (200) to a fail-safe state (203) according to a predetermined criterion,
d. wherein the logic circuit (200) is connected to a plurality of inputs (201) and a plurality of outputs (202),
e. wherein the logic circuit performs a task relating the inputs (201) to the outputs (202), and wherein the task is a safety-critical function,
f. wherein communication between the logic circuit (200) and the inputs (201) and the outputs (202) is protected by at least one selected from the group comprising:
i. redundancy,
ii. a cyclic redundancy check (CRC),
iii. toggle testing of the inputs, and
iv. read-back of the outputs,
g. a built-in self-test (BIST 214, BIST 224), wherein the built-in self-test is used to expose unannounced faults in any of the parallel cores,
h. wherein the built-in self-test (BIST 214, BIST 224) is performed periodically or continuously while the logic circuit (200) performs the task,
i. wherein the critical functions (212, 222) of the logic circuit (200) are implemented substantially in at least one logic device (215, 225), and
j. wherein the at least one logic device (215, 225) is implemented so as to avoid the use of executable software.
2. The high-integrity logic circuit (200) according to claim 1, wherein the redundancy checker (205) is placed on a logic device separate from the parallel cores (215, 225), the redundancy checker being connected to the parallel cores by a communication path to the outputs of the parallel cores, or the redundancy checker is placed on the same logic device on which at least one of the parallel cores resides.
3. The high-integrity logic circuit (200) according to claim 1, wherein the plurality of values (213, 223) in the first parallel core (215) and in the second parallel core (225) are state changes of the critical functions (212, 222) of the logic circuit.
CN201110166907.1A 2011-06-21 2011-06-21 Fault detect in logical circuit and alleviating Expired - Fee Related CN102841828B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110166907.1A CN102841828B (en) 2011-06-21 2011-06-21 Fault detect in logical circuit and alleviating

Publications (2)

Publication Number Publication Date
CN102841828A CN102841828A (en) 2012-12-26
CN102841828B true CN102841828B (en) 2016-01-20

Family

ID=47369223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110166907.1A Expired - Fee Related CN102841828B (en) 2011-06-21 2011-06-21 Fault detect in logical circuit and alleviating

Country Status (1)

Country Link
CN (1) CN102841828B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1134559A (en) * 1994-12-28 1996-10-30 株式会社日立制作所 Controller with fail-safe function and its automatic train controller and system
US7870299B1 (en) * 2008-02-06 2011-01-11 Westinghouse Electric Co Llc Advanced logic system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SELF-CHECKING AND FAIL-SAFE LSIS BY INTRA-CHIP REDUNDANCY;Nobuyasu Kanekawa等;《PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON FAULT-TOLERANT COMPUTING》;19960625;第26卷;第426页,第428页第1栏第28-30行,第2栏第29-39行,第429页第2栏第22-27行,第430页第1栏、图1,3,5,7,9,10 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160120
