CN109086179B - Processing method and device under program exception condition - Google Patents
- Publication number: CN109086179B
- Application number: CN201810949401.XA
- Authority: CN (China)
- Prior art keywords: pci, program, monitoring module, module, memory address
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
- G06F11/2205—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested
- G06F11/221—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested to test buses, lines or interfaces, e.g. stuck-at or open line faults

- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/42—Bus transfer protocol, e.g. handshake; Synchronisation
- G06F13/4282—Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus

Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Debugging And Monitoring (AREA)
Abstract
Description
Technical Field
The present application relates to the technical field of driver design for software systems, and in particular to a processing method and apparatus under a program exception condition.
Background
At present, a computer can use a PCI-e (Peripheral Component Interconnect Express) device to exchange data with other devices. Specifically, an application installed in the user space of a computer running the Linux operating system can exchange data with other devices while it runs.
With advances in technology, newer PCI-e devices can access user-space memory directly. Specifically, an application installed in user space, referred to here as the target application, pre-allocates read/write memory when it runs in order to store the data produced during its execution, and then registers the address of that memory with the PCI-e device. When the target application exchanges data with other devices, for example when another device writes data to user space through the target application, the PCI-e device can write the incoming data directly into the memory at the pre-registered address.
However, when a target application that is tightly coupled to PCI-e hardware in this way exits abnormally, it usually cannot deregister the memory address it registered with the PCI-e device in time. As a result, when other devices subsequently read or write data through the target application, the PCI-e device still reads from or writes to the original memory address, even though that memory may no longer be available or may have been reallocated by the operating system to another program. Reading or writing that memory at this point causes serious memory access errors or seriously affects other programs.
Therefore, a method that can reduce the harm caused by program exceptions is urgently needed.
Summary of the Invention
To solve the above technical problem, the present application provides a processing method under a program exception condition that can complete cleanup work promptly when a program exception occurs, reducing the harm caused by the exception.
The embodiments of the present application disclose the following technical solutions:
In a first aspect, an embodiment of the present application provides a processing method under a program exception condition, applied to a processing system that includes a parent process monitoring module and a PCI-e device. The method includes:
The parent process monitoring module determines, according to an exit signal sent by the operating system kernel, whether a child process has exited abnormally, where a PCI-e device data transfer program runs in the child process and the PCI-e device data transfer program is associated with the PCI-e device;
If the child process has exited abnormally, the parent process monitoring module deregisters the memory address registered in the PCI-e device, according to the device number of the PCI-e device and the memory address registered in the PCI-e device.
Optionally, the processing system further includes a PCI-e device data transfer program module;
in which case the method further includes:
After the PCI-e device has completed initialization, memory has been successfully allocated in user space, and the address of that memory has been registered with the PCI-e device, the PCI-e device data transfer program module informs the parent process monitoring module of the device number of the PCI-e device and the memory address registered in the PCI-e device.
Optionally, the method further includes:
The parent process monitoring module creates a communication pipe between itself and the child process, and creates the child process;
The parent process monitoring module loads the PCI-e device data transfer program in the child process.
Optionally, the method further includes:
If the child process exits abnormally, the parent process monitoring module records a log.
Optionally, the parent process monitoring module works in a loop.
In a second aspect, an embodiment of the present application provides a processing apparatus under a program exception condition, including:
a judging module, configured to determine, according to an exit signal sent by the operating system kernel, whether a child process has exited abnormally, where a PCI-e device data transfer program runs in the child process and the PCI-e device data transfer program is associated with the PCI-e device;
a processing module, configured to deregister the memory address registered in the PCI-e device, according to the device number of the PCI-e device and the memory address registered in the PCI-e device, if the child process has exited abnormally.
Optionally, the apparatus further includes:
a notification module, configured to inform the parent process monitoring module of the device number of the PCI-e device and the memory address registered in the PCI-e device after the PCI-e device has completed initialization, memory has been successfully allocated in user space, and the address of that memory has been registered with the PCI-e device.
Optionally, the apparatus further includes:
a creation module, configured to create a communication pipe between the parent process and the child process, and to create the child process;
a loading module, configured to load the PCI-e device data transfer program in the child process.
Optionally, the apparatus further includes:
a recording module, configured to record a log if the child process exits abnormally.
As can be seen from the above technical solutions, the processing method under a program exception condition provided by the embodiments of the present application is applied to a processing system that includes a parent process monitoring module and a PCI-e device. The parent process monitoring module can determine, according to an exit signal sent by the operating system kernel, whether a child process has exited abnormally. A PCI-e device data transfer program runs in the child process, so an abnormal exit of the child process indicates that the PCI-e device data transfer program running in it has failed. Accordingly, once the child process is determined to have exited abnormally, the parent process monitoring module deregisters the memory address registered in the PCI-e device associated with the data transfer program, according to that device's device number and the memory address registered in it. In this way, when the parent process monitoring module determines that the PCI-e device data transfer program has failed, it can promptly deregister the memory address registered in the associated PCI-e device. This avoids memory read/write errors caused by the PCI-e device data transfer program performing read or write operations on an invalid memory address, prevents effects on the operating system and other programs, and thereby reduces the harm caused by the program exception.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of the processing method under a program exception condition provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of the processing apparatus under a program exception condition provided by an embodiment of the present application.
Detailed Description
To help those skilled in the art better understand the solutions of the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
The terms "first", "second", "third", "fourth", and so on (if present) in the description, the claims, and the above drawings of the present application are used to distinguish similar objects and do not necessarily describe a particular order or sequence. Data used in this way may be interchanged where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have", and any variants of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
An embodiment of the present application provides a processing method under a program exception condition that can complete cleanup work promptly when a program exception occurs, reducing the harm caused by the exception.
The core technical idea of the processing method provided by the embodiments of the present application is introduced first:
The processing method under a program exception condition provided by the embodiments of the present application is applied to a processing system that includes a parent process monitoring module and a PCI-e device. The parent process monitoring module can determine, according to an exit signal sent by the operating system kernel, whether a child process has exited abnormally. A PCI-e device data transfer program runs in the child process, so an abnormal exit of the child process indicates that the PCI-e device data transfer program running in it has failed. Accordingly, once the child process is determined to have exited abnormally, the parent process monitoring module deregisters the memory address registered in the PCI-e device associated with the data transfer program, according to that device's device number and the memory address registered in it.
With this processing method, when the parent process monitoring module determines that the PCI-e device data transfer program has failed, it can promptly deregister the memory address registered in the associated PCI-e device. This avoids memory read/write errors caused by the PCI-e device data transfer program performing read or write operations on an invalid memory address, prevents effects on the operating system and other programs, and thereby reduces the harm caused by the program exception.
The processing method under a program exception condition provided by the present application is described below by way of an embodiment:
Referring to FIG. 1, FIG. 1 is a schematic flowchart of the processing method under a program exception condition provided by an embodiment of the present application. The method is applied to a processing system that includes a parent process monitoring module and a PCI-e device. As shown in FIG. 1, the method includes the following steps:
Step 101: The parent process monitoring module determines, according to an exit signal sent by the operating system kernel, whether a child process has exited abnormally, where a PCI-e device data transfer program runs in the child process and the PCI-e device data transfer program is associated with the PCI-e device.
After receiving the exit signal sent by the operating system kernel, the parent process monitoring module can determine from it why the child process exited. Specifically, when a child process exits, the operating system kernel generates an exit signal carrying an identifier that depends on the reason for the exit. After the kernel sends the exit signal to the parent process monitoring module, the module can therefore determine the reason for the exit from the identifier carried in the signal and, from that reason, determine whether the child process exited abnormally.
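As a rough illustration of the check in step 101: on Linux, a parent is typically notified with SIGCHLD and then reads the child's wait status, which distinguishes a normal exit from termination by a signal. The sketch below assumes that reading of "exit signal"; the patent does not name these calls. The function name `classify_exit` and the rule that any nonzero exit code counts as abnormal are illustrative assumptions.

```python
import os
from typing import Tuple


def classify_exit(status: int) -> Tuple[bool, str]:
    """Classify a wait status as returned by os.waitpid().

    Returns (abnormal, reason). Treating a nonzero exit code as
    abnormal is an assumption made for this sketch.
    """
    if os.WIFSIGNALED(status):
        # The child was terminated by a signal (for example SIGSEGV).
        return True, "terminated by signal %d" % os.WTERMSIG(status)
    if os.WIFEXITED(status):
        code = os.WEXITSTATUS(status)
        if code == 0:
            return False, "exited normally"
        return True, "exited with code %d" % code
    return True, "unrecognized wait status %d" % status
```

A parent would typically call it as `classify_exit(os.waitpid(pid, 0)[1])` once the child has terminated.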
It should be noted that a PCI-e device data transfer program usually runs in the child process. Through this program, other devices can write transferred data directly into the memory at the memory address registered in the PCI-e device, or read the data stored in that memory directly. An abnormal exit of the child process therefore indicates that the PCI-e device data transfer program has failed.
It should also be noted that the PCI-e device data transfer program is associated with the PCI-e device. That is, when another device writes data through the PCI-e device data transfer program, the program writes the data directly into the memory at the memory address registered in its associated PCI-e device; when another device reads data through the program, the program reads the data directly from the memory at the memory address registered in its associated PCI-e device.
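Before step 101 is performed, the parent process monitoring module first needs to create a communication pipe between itself and the child process, create the child process, and load the PCI-e device data transfer program in the child process. The parent process monitoring module can then communicate with the child process through the pipe and, accordingly, monitor the PCI-e device data transfer program running in the child process through it. When the child process exits, that is, when the PCI-e device data transfer program finishes running, the operating system kernel can send the exit signal to the parent process monitoring module through the communication pipe.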
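The setup described above (create a pipe, create the child, load the program in it) could look roughly like the following on a POSIX system. This is a minimal sketch, not the patent's implementation: the function name `spawn_transfer_child` and the environment variable `REPORT_FD`, used to tell the loaded program which descriptor to write to, are hypothetical.

```python
import os
from typing import List, Tuple


def spawn_transfer_child(argv: List[str]) -> Tuple[int, int]:
    """Create a pipe, fork, and exec argv in the child.

    Returns (child_pid, read_fd). The child learns the number of the
    pipe's write end from the hypothetical REPORT_FD variable.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: keep only the write end and hand it to the program.
        try:
            os.close(read_fd)
            os.set_inheritable(write_fd, True)  # keep it open across exec
            os.environ["REPORT_FD"] = str(write_fd)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if exec failed
    # Parent: keep only the read end.
    os.close(write_fd)
    return pid, read_fd
```

Closing the unused pipe end on each side means the parent sees end-of-file on `read_fd` once the child has exited, which gives it a second way to notice that the program is gone.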
Optionally, the processing system may further include a PCI-e device data transfer program module. After the PCI-e device has completed initialization, memory has been successfully allocated in user space, and the memory address has been registered with the PCI-e device, this module informs the parent process monitoring module of the device number of the PCI-e device associated with the data transfer program and the memory address registered in that device. The parent process monitoring module can later use this information to carry out the cleanup work if the PCI-e device data transfer program fails.
Step 102: If the child process has exited abnormally, the parent process monitoring module deregisters the memory address registered in the PCI-e device, according to the device number of the PCI-e device and the memory address registered in the PCI-e device.
When the parent process monitoring module determines from the exit signal raised on the child's exit that the child process has exited abnormally, that is, that the PCI-e device data transfer program running in the child process has terminated abnormally, it deregisters the memory address registered in the PCI-e device, according to the device number of the PCI-e device and the memory address registered in it.
It should be understood that, when it starts, the PCI-e device data transfer program module informs the parent process monitoring module of the device number of its associated PCI-e device and the memory address registered in that device. Therefore, once the parent process monitoring module determines that the child process has exited abnormally, it can first identify the PCI-e device data transfer program that was running in the child process, then use the device number and registered memory address associated with that program to locate the corresponding PCI-e device and deregister the memory address registered in it, completing the cleanup work.
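The cleanup in step 102 can be sketched as follows. The patent does not specify how deregistration is issued to the device, so `InMemoryDriver` is a hypothetical stand-in for whatever driver interface a real system exposes, and `clean_up` is an illustrative name.

```python
from typing import Iterable, Set, Tuple


class InMemoryDriver:
    """Hypothetical stand-in for a real PCI-e driver interface.

    A real system would issue a driver call here; the patent does not
    describe that interface.
    """

    def __init__(self) -> None:
        self.registered: Set[Tuple[int, int]] = set()

    def register(self, device_number: int, address: int) -> None:
        self.registered.add((device_number, address))

    def deregister(self, device_number: int, address: int) -> None:
        self.registered.discard((device_number, address))


def clean_up(abnormal: bool,
             reported: Iterable[Tuple[int, int]],
             driver: InMemoryDriver) -> int:
    """Deregister every reported (device_number, address) pair after an
    abnormal exit. Returns the number of pairs processed."""
    if not abnormal:
        # Assumption: a child that exits normally deregisters by itself.
        return 0
    count = 0
    for device_number, address in reported:
        driver.deregister(device_number, address)
        count += 1
    return count
```

The parent acts only on the pairs it was told about earlier, which is why the notification step has to happen before the child can fail.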
When the child process exits abnormally, the parent process monitoring module can also record a log of the abnormal exit. The log may include information such as the program that failed, the reason for the failure, and the time at which the child process exited abnormally, so that staff can later use it to maintain the PCI-e device data transfer program and analyze the cause of the failure.
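A log record with the three fields mentioned above (failed program, reason, exit time) might be produced as in this minimal sketch. The logger name, the message layout, and the function name `log_abnormal_exit` are assumptions, not part of the patent.

```python
import logging
import time
from typing import Optional

logger = logging.getLogger("pcie_monitor")  # hypothetical logger name


def log_abnormal_exit(program: str, reason: str,
                      exited_at: Optional[float] = None) -> str:
    """Record which program failed, why, and when (UTC).

    Returns the formatted message so callers can also store it elsewhere.
    """
    if exited_at is None:
        exited_at = time.time()
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(exited_at))
    message = "program=%s reason=%r exited_at=%s" % (program, reason, stamp)
    logger.error(message)
    return message
```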
It should be noted that the parent process monitoring module usually works in a loop. That is, after determining that the child process has exited abnormally and deregistering the memory address registered in the PCI-e device, the parent process monitoring module restarts: it recreates the communication pipe between itself and the child process, recreates the child process, loads the PCI-e device data transfer program in the new child process, and receives the device number of the PCI-e device associated with the PCI-e device data transfer program module together with the memory address registered in that device. When it again detects that the child process has exited abnormally, it deregisters the memory address registered in the PCI-e device and closes the child process, and the cycle repeats.
Because the conditions that trigger an abnormal termination of the PCI-e device data transfer program are random, the program may terminate abnormally for a specific reason during one run without doing so on every run. Therefore, by working in a loop, the parent process monitoring module can also ensure that the PCI-e device data transfer program is restarted once it terminates abnormally.
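The monitoring loop can be sketched as a small supervisor that restarts the program after each abnormal exit. The name `supervise`, the `on_abnormal` callback (where deregistration and logging would go), and the `max_restarts` bound are assumptions; the loop described in the text is not bounded, and the bound is added here only so the sketch terminates.

```python
import os
from typing import Callable, List


def supervise(argv: List[str],
              on_abnormal: Callable[[int], None],
              max_restarts: int) -> int:
    """Run argv in a child and restart it after each abnormal exit.

    Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        pid = os.fork()
        if pid == 0:
            try:
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # reached only if exec failed
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0:
            return restarts  # normal exit: nothing to clean up
        on_abnormal(status)  # e.g. deregister addresses, record a log
        if restarts == max_restarts:
            return restarts
        restarts += 1
```

Running the cleanup callback before the next child is started ensures that the stale registration is gone before a fresh one is made.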
With the above processing method, when the parent process monitoring module determines that the PCI-e device data transfer program has failed, it can promptly deregister the memory address registered in the associated PCI-e device. This avoids memory read/write errors caused by the PCI-e device data transfer program performing read or write operations on an invalid memory address, prevents effects on the operating system and other programs, and thereby reduces the harm caused by the program exception.
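In addition, the present application provides a processing apparatus under a program exception condition. Referring to FIG. 2, FIG. 2 is a schematic structural diagram of the processing apparatus 200 under a program exception condition. The processing apparatus 200 includes:
a judging module 201, configured to determine, according to an exit signal sent by the operating system kernel, whether a child process has exited abnormally, where a PCI-e device data transfer program runs in the child process and the PCI-e device data transfer program is associated with the PCI-e device;
a processing module 202, configured to deregister the memory address registered in the PCI-e device, according to the device number of the PCI-e device and the memory address registered in the PCI-e device, if the child process has exited abnormally.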
Optionally, the apparatus further includes:
a notification module, configured to inform the parent process monitoring module of the device number of the PCI-e device and the memory address registered in the PCI-e device after the PCI-e device has completed initialization, memory has been successfully allocated in user space, and the address of that memory has been registered with the PCI-e device.
Optionally, the apparatus further includes:
a creation module, configured to create a communication pipe between the parent process and the child process, and to create the child process;
a loading module, configured to load the PCI-e device data transfer program in the child process.
Optionally, the apparatus further includes:
a recording module, configured to record a log if the child process exits abnormally.
When the above processing apparatus determines that the PCI-e device data transfer program has failed, it can promptly deregister the memory address registered in the PCI-e device associated with that program. This avoids memory read/write errors caused by the PCI-e device data transfer program performing read or write operations on an invalid memory address, prevents effects on the operating system and other programs, and thereby reduces the harm caused by the program exception.
It should be noted that the embodiments in this specification are described progressively. For parts that are the same or similar across embodiments, reference may be made between them, and each embodiment focuses on its differences from the others. In particular, because the device and system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, refer to the description of the method embodiments. The device and system embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of an embodiment. A person of ordinary skill in the art can understand and implement the embodiments without creative effort.
The above is only one specific implementation of the present application, and the protection scope of the present application is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (7)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN201810949401.XA CN109086179B (en) | 2018-08-20 | 2018-08-20 | Processing method and device under program exception condition | 
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN201810949401.XA CN109086179B (en) | 2018-08-20 | 2018-08-20 | Processing method and device under program exception condition | 
Publications (2)
| Publication Number | Publication Date | 
|---|---|
| CN109086179A (en) | 2018-12-25 | 
| CN109086179B (en) | 2022-04-22 | 
Family
ID=64793810
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| CN201810949401.XA Active CN109086179B (en) | 2018-08-20 | 2018-08-20 | Processing method and device under program exception condition | 
Country Status (1)
| Country | Link | 
|---|---|
| CN (1) | CN109086179B (en) | 
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| CN113672445B (en) * | 2020-05-13 | 2024-09-24 | 华为技术有限公司 | Method for recording running state information of target program and electronic equipment | 
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| CN1480878A (en) * | 2002-09-02 | 2004-03-10 | 联想(北京)有限公司 | Method for obtaining information of linux operation system | 
| CN100543683C (en) * | 2006-12-26 | 2009-09-23 | 华为技术有限公司 | Method and system for monitoring a process | 
| CN103530167A (en) * | 2013-09-30 | 2014-01-22 | 华为技术有限公司 | Virtual machine memory data migration method and relevant device and cluster system | 
| CN104050091A (en) * | 2012-12-28 | 2014-09-17 | 华耀(中国)科技有限公司 | Network device and its setting method based on non-uniform memory access system | 
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| US7752355B2 (en) * | 2004-04-27 | 2010-07-06 | International Business Machines Corporation | Asynchronous packet based dual port link list header and data credit management structure | 
Also Published As
| Publication number | Publication date | 
|---|---|
| CN109086179A (en) | 2018-12-25 | 
Similar Documents
| Publication | Publication Date | Title | 
|---|---|---|
| US8046641B2 (en) | Managing paging I/O errors during hypervisor page fault processing | |
| TWI632462B (en) | Switching device and method for detecting i2c bus | |
| US9026865B2 (en) | Software handling of hardware error handling in hypervisor-based systems | |
| US20080115012A1 (en) | Method and infrastructure for detecting and/or servicing a failing/failed operating system instance | |
| US7613861B2 (en) | System and method of obtaining error data within an information handling system | |
| US10802847B1 (en) | System and method for reproducing and resolving application errors | |
| JP5495310B2 (en) | Information processing apparatus, failure analysis method, and failure analysis program | |
| US8683267B2 (en) | Virtual debugging sessions | |
| US20080140895A1 (en) | Systems and Arrangements for Interrupt Management in a Processing Environment | |
| US10275330B2 (en) | Computer readable non-transitory recording medium storing pseudo failure generation program, generation method, and generation apparatus | |
| JP2015529927A (en) | Notification of address range with uncorrectable errors | |
| WO2016127600A1 (en) | Exception handling method and apparatus | |
| US8122176B2 (en) | System and method for logging system management interrupts | |
| US10157005B2 (en) | Utilization of non-volatile random access memory for information storage in response to error conditions | |
| US20160196083A1 (en) | Method and device for monitoring data integrity in shared memory environment | |
| CN113553243A (en) | remote debug method | |
| CN107544879A (en) | Diagnostic method, device and the machinable medium of server | |
| US10635554B2 (en) | System and method for BIOS to ensure UCNA errors are available for correlation | |
| CN109086179B (en) | Processing method and device under program exception condition | |
| CN100375960C (en) | Method and apparatus for regulating input/output fault | |
| US8880957B2 (en) | Facilitating processing in a communications environment using stop signaling | |
| KR20220156355A (en) | Apparatus and Method for Detecting Non-volatile Memory Attack Vulnerability | |
| CN101311909A (en) | Method for diagnosing system abnormality | |
| CN119473801B (en) | Bus timeout monitoring method, device and system based on AXI protocol | |
| CN115686914A (en) | A fault recording method, computing device and storage medium | 
Legal Events
| Date | Code | Title | Description | 
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |