CN112035837B - Malicious PDF document detection system and method based on mimic defense

Info

Publication number: CN112035837B
Application number: CN202010755721.9A
Authority: CN
Inventors: 伊鹏; 胡涛; 陈祥; 韩伟涛; 张震; 王文博
Original assignee: PLA Information Engineering University
Current assignee: Information Engineering University Of Chinese People's Liberation Army Cyberspace Force
Priority date: 2020-07-31
Filing date: 2020-07-31
Publication date: 2023-06-20
Anticipated expiration: 2040-07-31
Also published as: CN112035837A

Abstract

The invention belongs to the technical field of information content security and relates in particular to a malicious PDF document detection system and method based on mimic defense. An input PDF document is processed simultaneously on multiple functionally equivalent heterogeneous host systems. The processing actions of the PDF reader and the effects of the PDF document on each host system are tracked separately, and a detection result is output by majority decision. If the internal behavior or external behavior is found to be inconsistent across the heterogeneous host systems, the PDF document is considered malicious. By introducing mimic defense technology into malicious PDF document detection, the invention can effectively guard against the known and unknown risks posed by PDF documents and improve detection accuracy and efficiency.

Description

Malicious PDF document detection system and method based on mimic defense

Technical field

The invention belongs to the technical field of information content security and relates in particular to a malicious PDF document detection system and method based on mimic defense.

Background

As users have become more aware of malicious email attachments and web links, attackers have turned to document-based attacks. Browsers, email agents, and antivirus products usually warn users about the dangers of executable files. Documents such as PDFs, however, attract little attention or suspicion because they give the impression of being static files that can do almost no harm.

In recent years, however, the PDF specification has changed. Added scripting capabilities let documents behave almost like executables, including the ability to connect to the Internet, run processes, and interact with other files and programs. The growth in content complexity gives attackers more vulnerabilities with which to launch powerful attacks and more flexibility to hide malicious payloads and evade detection. A malicious PDF document usually exploits one or more vulnerabilities in the PDF interpreter. Given the growing complexity of PDF readers and their extensive dependencies on libraries and system components, a large attack surface is easily exposed.

Taking the PDF reader Adobe Acrobat Reader as an example, 274 CVEs were discovered in 2019. The wide variety of PDF readers and the large attack surface they present make them a preferred target for attackers. Collected malware samples show that many Adobe components have been attacked, including element parsers and decoders, font managers, and the JavaScript engine. System-wide dependencies, such as graphics libraries, are also targeted.

With the continued development of PDF readers and the popularity of the PDF format, malicious PDF document detection has become an urgent problem. Existing detection schemes, however, oversimplify the PDF specification, which leads to incomplete extraction of malicious payloads and to detection failures, and they lack real-time tracking of the document while it is being processed. A new malicious PDF document detection system is therefore needed to guard against the known and unknown risks of PDF documents.

In recent years, researchers have proposed mimic defense technology to address the serious security problems caused by the shortcomings of traditional defense methods. Mimic defense is an active defense method whose core idea is dynamic heterogeneous redundancy: multiple redundant, heterogeneous, functionally equivalent components jointly process the same external request, and malicious attacks are discovered and masked through multi-mode adjudication. This compensates for the static, homogeneous, and single-instance weaknesses of traditional defense technologies.

Summary of the invention

To counter malicious attacks that exploit vulnerabilities and backdoors through PDF documents, and drawing on the idea of dynamic heterogeneous redundancy in mimic defense, the present invention provides a malicious PDF document detection system and method based on mimic defense. The detection system thereby gains endogenous security properties and can quickly and effectively detect the known and unknown risks posed by PDF documents.

To solve the above technical problems, the present invention adopts the following technical solutions:

The invention provides a malicious PDF document detection system based on mimic defense, comprising:

multiple functionally equivalent heterogeneous hosts, configured to process the same PDF document and to track the system behavior of the heterogeneous hosts while they process the PDF document;

an arbiter, configured to adjudicate the tracking results and determine whether the PDF document is malicious.

Further, a PDF reader of the same type and the same version is installed on the heterogeneous host system of each heterogeneous host.

Further, the tracked system behavior of the heterogeneous hosts while processing the PDF document includes internal behavior and external behavior. Tracking internal behavior means tracking the PDF document processing flow of the PDF reader on the heterogeneous host system; tracking external behavior means tracking the effect of PDF document processing on the heterogeneous host system.

Further, the arbiter includes an internal behavior arbiter and an external behavior arbiter. The internal behavior arbiter performs multi-mode adjudication on the internal behavior tracking results, and the external behavior arbiter performs multi-mode adjudication on the external behavior tracking results.

The present invention also provides a malicious PDF document detection method based on mimic defense, comprising the following steps:

performing mimic processing on the same PDF document;

tracking the system behavior of multiple heterogeneous hosts while they process the PDF document;

adjudicating the tracking results to determine whether the PDF document is malicious.

Further, performing mimic processing on the same PDF document includes:

distributing the PDF document to be detected, as the input stimulus, to multiple functionally equivalent heterogeneous hosts for simultaneous processing, where a PDF reader of the same type and the same version is installed on the heterogeneous host system of each heterogeneous host.

Further, the internal behavior includes COS object parsing, PD tree construction, script execution, and element rendering.

Further, the external behavior includes file system operations, network activity, and program loading.

Further, adjudicating the tracking results to determine whether the PDF document is malicious includes:

sending the internal behavior tracking results to the internal behavior arbiter, which determines whether the PDF document is malicious by comparing the processing actions and performing multi-mode adjudication; and sending the external behavior tracking results to the external behavior arbiter, which determines whether the PDF document is malicious by comparing the behavior of the heterogeneous host systems and performing multi-mode adjudication.

Compared with the prior art, the present invention has the following advantages:

To effectively improve malicious PDF document detection, the present invention provides a malicious PDF document detection system based on mimic defense. A normal PDF document behaves the same on different host systems, whereas a malicious PDF document that launches an attack behaves differently on different host systems. The differences range from the PDF document processing actions (internal behavior) to the effects of the PDF document on the host system (external behavior). The system therefore compares the behavior of a PDF document, both internal and external, across heterogeneous host systems (for example Windows, Linux, and Macintosh) and outputs a detection result based on multi-mode adjudication. Specifically, multiple functionally equivalent heterogeneous host systems process the input PDF document simultaneously, the processing actions of the PDF reader and the effects of the PDF document on the host system are tracked separately, and the detection result is output by majority decision. If the internal behavior or external behavior is found to be inconsistent across the heterogeneous host systems, the PDF document is considered malicious. By introducing mimic defense technology into malicious PDF document detection, the invention can effectively guard against the known and unknown risks posed by PDF documents and improve detection accuracy and efficiency.

Description of drawings

To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.

Figure 1 is an abstract model diagram of the mimic defense architecture;

Figure 2 is a flowchart of the processing actions of a PDF reader according to an embodiment of the present invention;

Figure 3 is an overall framework diagram of the malicious PDF document detection system based on mimic defense according to an embodiment of the present invention.

Detailed description

To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

Based on the principle of mimic defense, the present invention provides a malicious PDF document detection system. The core idea of mimic defense is to improve a system's ability to cope with unknown threats by introducing a dynamic heterogeneous redundancy architecture. The abstract model of the mimic defense architecture is shown in Figure 1. In this model, the input agent distributes the input sequence to the corresponding heterogeneous, functionally equivalent executors. Each executor in the set that receives the input stimulus produces an output vector satisfying the given semantics and syntax. The multi-mode arbiter, following an adjudication strategy generated from adjudication parameters or algorithms, judges the consistency of the output vectors and forms the output response sequence. Introducing mimic defense technology into malicious PDF document detection can therefore greatly improve detection efficiency and accuracy and effectively strengthen information security protection.
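The abstract model can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the function names `dispatch` and `adjudicate`, the three toy executors, and the strict-majority rule are all hypothetical choices made for this example.

```python
from collections import Counter

def dispatch(input_sequence, executors):
    """Input agent: hand the same input to every functionally equivalent executor."""
    return [executor(input_sequence) for executor in executors]

def adjudicate(output_vectors):
    """Multi-mode arbiter (assumed strategy: strict majority vote).

    Returns (majority output or None, whether all outputs agree,
    indices of executors that disagree with the majority).
    """
    counts = Counter(output_vectors)
    majority_output, votes = counts.most_common(1)[0]
    if votes * 2 <= len(output_vectors):
        majority_output = None  # no strict majority exists
    dissenters = [i for i, out in enumerate(output_vectors) if out != majority_output]
    return majority_output, len(counts) == 1, dissenters

# Three hypothetical executors that implement the same function in different ways.
def executor_a(xs):
    return sum(xs)

def executor_b(xs):
    total = 0
    for x in xs:
        total += x
    return total

def executor_c(xs):
    if 13 in xs:  # hypothetical flaw triggered only by a specific input
        return -1
    return sum(sorted(xs))

EXECUTORS = [executor_a, executor_b, executor_c]
print(adjudicate(dispatch([1, 2, 3], EXECUTORS)))
print(adjudicate(dispatch([13, 1], EXECUTORS)))
```

The second call shows the point of heterogeneity: an input that triggers a flaw in only one executor produces a visible disagreement, while the majority still yields a usable response.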

In general, a PDF reader begins processing a PDF document as soon as the document is opened. The basic flow is shown in Figure 2 and passes in turn through COS object parsing, PD tree construction, script execution, and element rendering. When a PDF document to be detected is opened, the PDF header is scanned to quickly locate the trailer and the cross-reference table (XRT). Once the XRT is located, the basic elements of the PDF document (called COS objects) are enumerated and parsed. A COS object is simply data with a type tag (for example integer, string, keyword, array, dictionary, or stream). Then, according to the PDF reader's interpretation of the PDF specification, one or more COS objects are assembled into PDF-specific components such as text, image, font, form, page, and JavaScript code. The hierarchical structure of the document (for example, which text appears on a particular page) is also built during this process. The resulting output is called the PD tree and is passed to the rendering engine for display. When the rendering engine performs a JavaScript action or draws a form with embedded JavaScript, the entire JavaScript code block is executed. After execution finishes, the PDF reader renders the corresponding document content.
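As a rough illustration of the first step, the sketch below locates the cross-reference table of a tiny hand-built PDF and lists its in-use objects. It relies on the conventional PDF file layout, in which a `startxref` entry near the end of the file records the byte offset of the cross-reference section. The helper names are hypothetical, and the parser assumes a classic (non-stream) table with a single subsection, so it is far simpler than what a real reader does.

```python
import re

def build_sample_pdf():
    """Build a minimal PDF-like byte string with one object and a classic xref table."""
    header = b"%PDF-1.4\n"
    obj1 = b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    xref_offset = len(header) + len(obj1)
    xref = (b"xref\n0 2\n"
            b"0000000000 65535 f \n"
            + b"%010d 00000 n \n" % len(header))
    trailer = (b"trailer\n<< /Size 2 /Root 1 0 R >>\n"
               b"startxref\n%d\n%%%%EOF\n" % xref_offset)
    return header + obj1 + xref + trailer

def locate_xref(pdf_bytes):
    """Return the byte offset recorded after the last 'startxref' keyword."""
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", pdf_bytes[-1024:])
    if not matches:
        raise ValueError("no startxref entry found")
    return int(matches[-1])

def parse_xref_table(pdf_bytes, offset):
    """Map object number -> byte offset for in-use entries (single subsection only)."""
    lines = pdf_bytes[offset:].splitlines()
    if lines[0].strip() != b"xref":
        raise ValueError("not a classic cross-reference table")
    first, count = (int(n) for n in lines[1].split())
    entries = {}
    for i, line in enumerate(lines[2:2 + count]):
        obj_offset, _generation, flag = line.split()
        if flag == b"n":  # 'n' marks an in-use object, 'f' a free entry
            entries[first + i] = int(obj_offset)
    return entries

pdf = build_sample_pdf()
xref_at = locate_xref(pdf)
objects = parse_xref_table(pdf, xref_at)
for number, position in objects.items():
    print(number, pdf[position:position + 7])
```

Each offset found this way points at the start of one COS object, which is where enumeration and parsing of the objects would begin.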

Because the PDF specification is cross-platform (Windows, Linux, Mac), a legitimate operation that affects the host system on one platform will be performed in the same way when the document is opened on another platform. For example, if a benign document connects to a remote host, it will do the same on other platforms. When a PDF document is opened on different host systems, the order and results of function execution are the same for a benign document. For a malicious document, the tracking results may differ in many places, both in the PDF document processing flow and in the effects of the PDF document on the host system.

Based on the abstract model of the mimic defense architecture, this embodiment applies a mimic transformation to the malicious PDF document detection system, yielding a malicious PDF document detection system based on mimic defense. The overall framework is shown in Figure 3 and includes multiple functionally equivalent heterogeneous hosts and an arbiter. A PDF reader of the same type and the same version is installed on the heterogeneous host system of each heterogeneous host, and the heterogeneous host systems may be Windows, Linux, Mac, and so on. The heterogeneous hosts process the same PDF document, and the system behavior of the heterogeneous hosts during processing is tracked. The system behavior includes internal behavior and external behavior: tracking internal behavior means tracking the PDF document processing flow of the PDF reader on the heterogeneous host system, and tracking external behavior means tracking the effect of PDF document processing on the heterogeneous host system. The arbiter adjudicates the tracking results, determines whether the PDF document is malicious, and outputs the detection result. The arbiter includes an internal behavior arbiter, which performs multi-mode adjudication on the internal behavior tracking results, and an external behavior arbiter, which performs multi-mode adjudication on the external behavior tracking results.

With the malicious PDF document detection system based on mimic defense, malicious PDF documents can be detected quickly and effectively.

Corresponding to the above malicious PDF document detection system based on mimic defense, this embodiment also provides a malicious PDF document detection method based on mimic defense. The same PDF document is processed simultaneously by multiple heterogeneous hosts. After the PDF document to be detected has been processed on the different host systems, the tracking results are sent to the arbiter, which outputs the adjudication result by majority decision. The method comprises the following steps:

Step S101: perform mimic processing on the same PDF document based on host diversity.

When a PDF document to be detected is input, it is distributed as the input stimulus to multiple functionally equivalent heterogeneous hosts. A PDF reader of the same type and the same version is installed on every heterogeneous host system, and the hosts process the input PDF document simultaneously.
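Step S101 can be sketched as a concurrent fan-out of the same bytes to several analysis workers. In the sketch below each "host" is only a local stand-in function, the names `make_host`, `HOSTS`, and `distribute` are hypothetical, and a thread pool is assumed as the dispatch mechanism; a real deployment would instead send the document to separate machines or virtual machines running the same reader version.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def make_host(os_name):
    """Create a stand-in analysis function for one heterogeneous host."""
    def analyze(pdf_bytes):
        # A real host would open the document in its PDF reader and collect traces.
        return {"os": os_name, "sha256": hashlib.sha256(pdf_bytes).hexdigest()}
    return analyze

HOSTS = {name: make_host(name) for name in ("windows", "linux", "mac")}

def distribute(pdf_bytes, hosts):
    """Send the same document to every host at once and collect each host's result."""
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        futures = {name: pool.submit(analyze, pdf_bytes)
                   for name, analyze in hosts.items()}
        return {name: future.result() for name, future in futures.items()}

results = distribute(b"%PDF-1.7 sample bytes", HOSTS)
for name, result in results.items():
    print(name, result["sha256"][:16])
```

Recording a digest of the input on every host is one simple way to confirm that all hosts really received identical bytes before their behavior is compared.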

Step S102: track the system behavior of the multiple heterogeneous hosts while they process the PDF document.

Tracking of the PDF document processing is divided into two parts. On the one hand, the PDF document processing flow of the PDF reader on the heterogeneous host system is tracked (the internal behavior). The internal behavior consists of four types of PDF reader processing actions: COS object parsing, PD tree construction, JavaScript execution, and PDF element rendering. On the other hand, the effect of PDF document processing on the heterogeneous host system is tracked (the external behavior). The external behavior can be summarized as a series of executable operations, including file system operations, network activity, and program loading. System calls are hooked, and their parameters and return values are recorded, in order to capture the effect on the host system of executing a malicious document.
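The idea of recording parameters and return values at a hook point can be shown with an in-process analogy. The sketch below wraps ordinary Python functions with a decorator; it does not hook real system calls, and the names `TraceLog`, `hook`, `open_file`, `connect`, and `load_program` are hypothetical stand-ins for the file system, network, and program-loading operations named above.

```python
import functools

class TraceLog:
    """Ordered record of hooked calls made while a document is processed."""
    def __init__(self):
        self.events = []

def hook(log, category):
    """Wrap a function so each call's parameters and return value are recorded."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            log.events.append({
                "category": category,
                "call": func.__name__,
                "args": args,
                "kwargs": kwargs,
                "return": result,
            })
            return result
        return wrapper
    return decorator

trace = TraceLog()

# Hypothetical stand-ins for operations a reader might trigger on the host.
@hook(trace, "filesystem")
def open_file(path, mode):
    return 3  # pretend file descriptor

@hook(trace, "network")
def connect(host, port):
    return 0  # pretend success code

@hook(trace, "program")
def load_program(path):
    return 4242  # pretend process id

open_file("/tmp/dropped.bin", "wb")
connect("203.0.113.7", 80)  # address from a documentation-only range
for event in trace.events:
    print(event["category"], event["call"], event["args"], event["return"])
```

Keeping the events in call order matters, because the later comparison looks at the sequence of operations on each host and not only at which operations occurred.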

Step S103: adjudicate the tracking results to determine whether the PDF document is malicious.

The four types of PDF reader processing actions described above are sent to the internal behavior arbiter, which determines whether the PDF document is malicious by comparing the processing actions and performing multi-mode adjudication.

The external behavior tracking results are sent to the external behavior arbiter, which determines whether the PDF document is malicious by comparing the behavior of the heterogeneous host systems and performing multi-mode adjudication.
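A minimal sketch of step S103 follows, assuming each host's trace has already been normalized into a tuple of strings so that traces from different platforms can be compared directly. The function names, the trace encoding, and the way deviating hosts are reported are hypothetical; only the decision rule (any inconsistency in internal or external behavior marks the document as malicious, with the majority used as the reference) follows the description.

```python
from collections import Counter

def arbitrate(traces):
    """Compare one kind of trace across hosts.

    traces maps host name -> hashable trace.
    Returns (consistent, hosts that deviate from the majority trace).
    """
    counts = Counter(traces.values())
    if len(counts) == 1:
        return True, []
    majority_trace, votes = counts.most_common(1)[0]
    if votes * 2 > len(traces):
        deviating = sorted(h for h, t in traces.items() if t != majority_trace)
    else:
        deviating = sorted(traces)  # no strict majority: every host is suspect
    return False, deviating

def detect(internal_traces, external_traces):
    """Combine the internal and external behavior arbiters into one verdict."""
    internal_ok, internal_dev = arbitrate(internal_traces)
    external_ok, external_dev = arbitrate(external_traces)
    return {
        "malicious": not (internal_ok and external_ok),
        "internal_deviating": internal_dev,
        "external_deviating": external_dev,
    }

steps = ("cos_parse", "pd_tree", "script_exec", "render")
benign = detect(
    {"windows": steps, "linux": steps, "mac": steps},
    {"windows": (), "linux": (), "mac": ()},
)
suspicious = detect(
    {"windows": steps[:3], "linux": steps, "mac": steps},  # reader stopped early
    {"windows": ("exec:cmd.exe",), "linux": (), "mac": ()},
)
print(benign)
print(suspicious)
```

In the second case the exploit takes effect on only one platform, so that host diverges in both its processing actions and its effect on the system, and either divergence alone is enough to flag the document.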

It should be noted that, in this document, the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device.

Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disks, and optical discs.

Finally, it should be noted that the above describes only preferred embodiments of the present invention, which serve to illustrate its technical solutions and not to limit its protection scope. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the protection scope of the present invention.

Claims

1. A malicious PDF document detection system based on mimic defense, characterized in that it comprises:

multiple functionally equivalent heterogeneous hosts, configured to process the same PDF document and to track the system behavior of the heterogeneous hosts while they process the PDF document, wherein the system behavior includes internal behavior and external behavior, tracking the internal behavior means tracking the PDF document processing flow of the PDF reader on the heterogeneous host system, and tracking the external behavior means tracking the effect of PDF document processing on the heterogeneous host system;

an arbiter, configured to adjudicate the tracking results and determine whether the PDF document is malicious, wherein the arbiter includes an internal behavior arbiter and an external behavior arbiter, the internal behavior arbiter is configured to perform multi-mode adjudication on the internal behavior tracking results, and the external behavior arbiter is configured to perform multi-mode adjudication on the external behavior tracking results.

2. The malicious PDF document detection system based on mimic defense according to claim 1, characterized in that a PDF reader of the same type and the same version is installed on the heterogeneous host system of each heterogeneous host.

3. A malicious PDF document detection method based on mimic defense, characterized in that it comprises the following steps:

performing mimic processing on the same PDF document;

tracking the system behavior of multiple heterogeneous hosts while they process the PDF document, wherein the system behavior includes internal behavior and external behavior, tracking the internal behavior means tracking the PDF document processing flow of the PDF reader on the heterogeneous host system, and tracking the external behavior means tracking the effect of PDF document processing on the heterogeneous host system;

adjudicating the tracking results to determine whether the PDF document is malicious, including:

sending the internal behavior tracking results to the internal behavior arbiter, which determines whether the PDF document is malicious by comparing the processing actions and performing multi-mode adjudication; and sending the external behavior tracking results to the external behavior arbiter, which determines whether the PDF document is malicious by comparing the behavior of the heterogeneous host systems and performing multi-mode adjudication.

4. The malicious PDF document detection method based on mimic defense according to claim 3, characterized in that performing mimic processing on the same PDF document comprises:

distributing the PDF document to be detected, as the input stimulus, to multiple functionally equivalent heterogeneous hosts for simultaneous processing, wherein a PDF reader of the same type and the same version is installed on the heterogeneous host system of each heterogeneous host.

5. The malicious PDF document detection method based on mimic defense according to claim 3, characterized in that the internal behavior includes COS object parsing, PD tree construction, script execution, and element rendering.

6. The malicious PDF document detection method based on mimic defense according to claim 3, characterized in that the external behavior includes file system operations, network activity, and program loading.