CN109902283B - An information output method and device
- Publication number
- CN109902283B (application CN201810415523.0A)
- Authority
- CN
- China
- Prior art keywords
- word
- semantic
- target data
- description text
- fault description
- Prior art date
- Legal status: Active
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
Description
Technical field
The present application relates to the field of communication technologies, and in particular to an information output method and device.
Background
When a network device fails, normal communication is affected, causing serious losses to people's work and daily life, so timely repair of network device faults is very important. At present, when a network device fails, front-line engineers collect data from the fault site to assist in analyzing the cause of the fault, for example, key performance indicators (KPIs), device alarms, device logs and other parameter data from a period of time before and after the fault occurred. The front-line engineers also describe the fault phenomenon to obtain a fault description text, and feed the collected KPIs and other data together with the fault description text back to the operation and maintenance department in the form of a fault work order. Based on the fault description text in the work order and their own professional knowledge, operation and maintenance engineers manually select some KPIs, device alarms, device logs and other parameter data from the data collected at the front line. Anomaly detection and mutual corroboration are then performed on the selected data to analyze the root cause of the fault and provide guidance for repairing the faulty network device. This fault detection method, in which parameter data related to the fault description text is manually selected from KPIs, device alarms, device logs and other parameter data for inspection and analysis, is slow and inefficient and cannot meet ever-increasing network demands.
In the prior art, a text having the same keywords as the fault description text is searched for, and the fault is inspected and analyzed according to the parameter data related to that text. However, a highly relevant text that could assist in analyzing the cause of the fault may not share any keywords with the fault description text. Therefore, the existing approach cannot accurately find the data associated with the fault description text that is needed to assist in analyzing the cause of the fault.
Summary of the invention
The present application provides an information output method and device, which can automatically and accurately find data related to a fault description text for assisting in analyzing the cause of a fault.
In a first aspect, the present application provides an information output method. The method includes: obtaining a fault description text, where the fault description text is used to describe a fault occurring in a network; generating a semantic vector of the fault description text through a semantic generation model; obtaining semantic vectors respectively corresponding to the related texts of multiple types of target data, where the target data is used to assist in analyzing the cause of the fault; calculating the correlation between the semantic vector of the fault description text and the semantic vector of the related text of each type of target data; and determining and outputting first data, where the first data is, for each type of target data, the target data whose semantic vector has the greatest correlation with the semantic vector of the fault description text, or the target data whose semantic vector has a correlation with the semantic vector of the fault description text that is greater than a preset threshold.
By comparing the correlation between the semantic vector of the fault description text and the semantic vectors of the related texts of the target data, the present application can accurately find the target data associated with the fault description text. For example, the fault is described as "industry users access the Internet slowly", and the name of the key performance indicator for fault analysis that the present application identifies as related to it is "downlink bandwidth control packet loss ratio". Literally, the two share no component that could be matched or associated; it is precisely because the present application learns, through semantic analysis and mining, domain knowledge such as "Internet access speed is related to the packet loss ratio" that the two can be associated. Therefore, by implementing the method described in the first aspect, data related to the fault description text for assisting in analyzing the cause of the fault can be found automatically and accurately.
In a possible implementation, before the fault description text is obtained, semantic vectors respectively corresponding to the related texts of the multiple types of target data may also be generated through the semantic generation model.
The semantic vectors respectively corresponding to the related texts of the multiple types of target data may also be saved. Correspondingly, obtaining these semantic vectors is implemented as retrieving the saved semantic vectors respectively corresponding to the related texts of the multiple types of target data.
By implementing this implementation, the semantic vectors corresponding to the related texts of the various target data can be generated and saved in advance. After the fault description text is received, the saved semantic vectors can be used directly in the correlation calculation with the semantic vector of the fault description text, so that they do not have to be generated on the fly after the fault description text is received. This is conducive to quickly calculating the correlation between the semantic vector of the fault description text and the semantic vector of the related text of each type of target data.
In a possible implementation, the semantic generation model is trained on a word vector matrix corresponding to historical fault description texts. The word vector matrix includes a word vector corresponding to each word in the historical fault description text, and a word vector is used to represent the semantics of a word.
The semantic generation model trained in this implementation can express the semantics of a text more accurately.
In a possible implementation, the multiple types of target data include at least two of key performance indicators, device alarms, and device logs. When the target data is a key performance indicator, the related text of the target data is the name of the key performance indicator; when the target data is a device alarm, the related text of the target data is the identifier of the device alarm; when the target data is a device log, the related text of the target data is a content fragment of the device log.
In a second aspect, the present application provides a method for training a semantic generation model. The method includes: obtaining a word vector set corresponding to a training text, where the word vectors included in the word vector set correspond one-to-one to the words in the training text, and a word vector is used to represent the semantics of a word; converting a historical fault description text into a word vector matrix composed of at least one word vector according to the word vector set; and training a semantic generation model according to the word vector matrix, where the semantic generation model is used to generate a semantic vector of a text.
Optionally, after the word vector set corresponding to the training text is obtained, the word vector set may be saved so that the word vectors in it can be used later.
The method described in the second aspect models semantics progressively from the word level to the sentence level to obtain the semantic generation model, a training approach that conforms to the basic principles of language generation. Therefore, the semantic generation model trained by the method described in the second aspect can express the semantics of a text more accurately.
In a possible implementation, converting the historical fault description text into a word vector matrix composed of at least one word vector according to the word vector set is implemented as follows: performing word segmentation on the historical fault description text to obtain a word sequence composed of at least one word corresponding to the historical fault description text; obtaining, from the word vector set, the word vectors corresponding to the words included in the word sequence; and forming a word vector matrix from the word vectors corresponding to the respective words included in the word sequence.
By implementing this implementation, the historical fault description text can be accurately converted into a word vector matrix composed of at least one word vector.
In a possible implementation, when the word vector set does not contain a word vector corresponding to a word included in the word sequence, a random vector is generated as the word vector corresponding to that word.
By implementing this implementation, the historical fault description text can likewise be accurately converted into a word vector matrix composed of at least one word vector.
In a possible implementation, training the semantic generation model according to the word vector matrix is implemented as follows: obtaining the faulty device type corresponding to the historical fault description text; training a classification model according to the word vector matrix and a category label, where the category label includes the faulty device type; and obtaining the semantic generation model according to the classification model.
The semantic generation model trained in this implementation can express the semantics of a text more accurately.
In a possible implementation, training the classification model according to the word vector matrix and the category label is implemented as follows: inputting the word vector matrix and the category label into a neural network for iterative training, and adjusting, in each training iteration, the word vectors in the input word vector matrix and the parameters of the neural network to generate the classification model. The semantic generation model trained in this way can express the semantics of a text more accurately.
Optionally, the word vectors in the word vector matrix input in the last training iteration may also be used to update the word vectors corresponding to the respective words in the word vector set. By implementing this implementation, the word vectors in the word vector set can be corrected according to the historical fault description text corpus carrying domain knowledge, so that they better express the semantic information of domain-specific words.
In a third aspect, an information output device is provided, which can perform the method in the first aspect or a possible implementation of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more units corresponding to the above function, and a unit may be software and/or hardware. Based on the same inventive concept, for the problem-solving principle and beneficial effects of the information output device, reference may be made to the first aspect or the possible implementations of the first aspect and their beneficial effects; repeated descriptions are omitted.
In a fourth aspect, a model training device is provided, which can perform the method in the second aspect or a possible implementation of the second aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more units corresponding to the above function, and a unit may be software and/or hardware. Based on the same inventive concept, for the problem-solving principle and beneficial effects of the model training device, reference may be made to the second aspect or the possible implementations of the second aspect and their beneficial effects; repeated descriptions are omitted.
In a fifth aspect, an information output device is provided, including a processor, a memory, and a communication interface, where the processor, the communication interface, and the memory are connected. The communication interface may be a transceiver and is used to communicate with other network elements. One or more programs are stored in the memory, and the processor invokes the programs stored in the memory to implement the solution in the first aspect or a possible implementation of the first aspect. For the implementations with which the information output device solves the problem and their beneficial effects, reference may be made to the first aspect or the possible implementations of the first aspect and their beneficial effects; repeated descriptions are omitted.
In a sixth aspect, a model training device is provided, including a processor, a memory, and a communication interface, where the processor, the communication interface, and the memory are connected. The communication interface may be a transceiver and is used to communicate with other network elements. One or more programs are stored in the memory, and the processor invokes the programs stored in the memory to implement the solution in the second aspect or a possible implementation of the second aspect. For the implementations with which the model training device solves the problem and their beneficial effects, reference may be made to the second aspect or the possible implementations of the second aspect and their beneficial effects; repeated descriptions are omitted.
In a seventh aspect, a computer program product is provided which, when run on a computer, causes the computer to perform the method in the first aspect, the second aspect, a possible implementation of the first aspect, or a possible implementation of the second aspect.
In an eighth aspect, a chip product of an information output device is provided, which performs the method in the first aspect or any possible implementation of the first aspect.
In a ninth aspect, a chip product of a model training device is provided, which performs the method in the second aspect or any possible implementation of the second aspect.
In a tenth aspect, a computer-readable storage medium is provided, in which instructions are stored; when the instructions are run on a computer, they cause the computer to perform the method in the first aspect or a possible implementation of the first aspect.
In an eleventh aspect, a computer-readable storage medium is provided, in which instructions are stored; when the instructions are run on a computer, they cause the computer to perform the method in the second aspect or a possible implementation of the second aspect.
Brief description of the drawings
FIG. 1 is a schematic flowchart of an information output method according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a method for training a semantic generation model according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a neural network used by the CBOW algorithm according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a neural network for training a classification model according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an information output device according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a model training device according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of another information output device according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of another model training device according to an embodiment of the present application.
Detailed description of embodiments
Specific embodiments of the present application are described in further detail below with reference to the accompanying drawings.
The embodiments of the present application provide an information output method and device, which can automatically determine and output data related to a fault description text for assisting in analyzing the cause of a fault.
The information output method and device provided by the present application are described in detail below.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of an information output method according to an embodiment of the present application. As shown in FIG. 1, the information output method includes the following parts 101 to 105.
101. The information output device obtains a fault description text.
The fault description text is a text describing a fault phenomenon, that is, it is used to describe a fault occurring in the network. For example, the fault description text may be "industry users access the Internet slowly" or "online charging system (OCS) communication interrupted". The fault description text may be sent to the information output device by another device. For example, a front-line engineer describes the fault phenomenon to obtain the fault description text, and sends the data collected to assist in analyzing the cause of the fault (such as key performance indicators) together with the fault description text, in the form of a fault work order, to the information output device of the operation and maintenance department.
102. The information output device generates a semantic vector of the fault description text through a semantic generation model.
In a possible implementation, the semantic generation model may be trained on a word vector matrix corresponding to historical fault description texts, where the word vector matrix includes a word vector corresponding to each word in the historical fault description text.
Optionally, for the training of the semantic generation model, reference may be made to the training method described in FIG. 2 below. That is, the semantic generation model used by the information output device may be the one trained by the model training device in FIG. 2. The information output device in FIG. 1 and the model training device in FIG. 2 may be deployed on the same device or on different devices. When they are deployed on different devices, the model training device may send the semantic generation model to the information output device after training, so that the information output device can generate the semantic vector of the fault description text through the received semantic generation model. When they are deployed on the same device, the information output device may obtain the semantic generation model from the model training device and then generate the semantic vector of the fault description text through it.
Of course, the semantic generation model may also be trained in ways other than that described in FIG. 2, which is not limited in the embodiments of the present application.
In a possible implementation, the information output device generates the semantic vector of the fault description text through the semantic generation model as follows:
The information output device converts the fault description text into a word vector matrix according to a word vector set, and then inputs the word vector matrix into the semantic generation model to generate the semantic vector of the fault description text, where the word vector set includes multiple word vectors. Optionally, the word vector set may be generated by the model training device in FIG. 2 below and sent to the information output device.
Optionally, the information output device converts the fault description text into a word vector matrix according to the word vector set as follows: the information output device performs word segmentation on the fault description text to obtain a word sequence composed of at least one word corresponding to the fault description text; obtains, from the word vector set, the word vectors corresponding to the words included in the word sequence; and forms the word vector matrix of the fault description text from the word vectors corresponding to the respective words included in the word sequence. When the word vector set does not contain a word vector corresponding to a word included in the word sequence, a random vector is generated as the word vector corresponding to that word.
For example, the fault description text includes four words, and word segmentation of the fault description text yields the word sequence "industry", "user", "Internet access", "slow". The information output device finds word vector 1 for "industry", word vector 2 for "user", and word vector 3 for "Internet access" in the word vector set, but does not find a word vector for "slow", so it generates a random vector, word vector 4, as the word vector for "slow". The information output device forms word vectors 1 to 4 into the word vector matrix of the fault description text, and then inputs the word vector matrix into the semantic generation model to generate the semantic vector of the fault description text.
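A minimal sketch of this conversion, assuming a jieba-style tokenizer and a plain dictionary as the word vector set; the function and variable names are illustrative only and not terms of the present application.

```python
import numpy as np
import jieba  # assumed Chinese word segmenter; any tokenizer can play this role

def text_to_matrix(text, word_vectors, dim=128):
    """Convert a fault description text into a word vector matrix.

    word_vectors: dict mapping word -> 1-D numpy array of length dim (the word vector set).
    Words missing from the set fall back to a random vector, as described above.
    """
    words = jieba.lcut(text)              # word segmentation -> word sequence
    rows = []
    for w in words:
        vec = word_vectors.get(w)
        if vec is None:                   # word not found in the word vector set
            vec = np.random.default_rng().standard_normal(dim)
            word_vectors[w] = vec         # keep the random vector so it is reused consistently
        rows.append(vec)
    return np.stack(rows)                 # shape: (number of words, dim)
```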
103. The information output device obtains semantic vectors respectively corresponding to the related texts of multiple types of target data.
The target data is used to assist in analyzing the cause of the fault. Parts 102 and 103 may be performed in either order: part 102 may be performed before part 103, or part 103 before part 102.
In a possible implementation, the multiple types of target data include at least two of key performance indicators (KPIs), device alarms, and device logs. When the target data is a key performance indicator, the related text of the target data is the name of the key performance indicator; when the target data is a device alarm, the related text of the target data is the identifier of the device alarm; when the target data is a device log, the related text of the target data is a content fragment of the device log. There are multiple items of target data of each type.
For example, the multiple types of target data include key performance indicators and device alarms: 100 different key performance indicators, namely key performance indicators 1 to 100, and 20 different device alarms, namely device alarms 1 to 20. The semantic vectors obtained by the information output device for the related texts of these target data are the semantic vectors corresponding to the names of key performance indicators 1 to 100 and the semantic vectors corresponding to the identifiers of device alarms 1 to 20. In other words, the information output device obtains 120 semantic vectors.
In a possible implementation, before receiving the fault description text, the information output device may generate, through the semantic generation model, the semantic vectors respectively corresponding to the related texts of the multiple types of target data.
Optionally, after generating the semantic vectors respectively corresponding to the related texts of the multiple types of target data, the information output device may save them. After receiving the fault description text, it can retrieve the saved semantic vectors for the correlation calculation with the semantic vector of the fault description text. In this way, the semantic vectors of the related texts of the various target data are generated and saved in advance, and after the fault description text is received they can be used directly in the correlation calculation, without having to be generated on the fly. This is conducive to quickly calculating the correlation between the semantic vector of the fault description text and the semantic vector of the related text of each type of target data.
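One possible way to precompute and cache these vectors is sketched below; encode stands in for the semantic generation model, and all names are assumptions rather than parts of the application.

```python
def build_vector_cache(related_texts, encode):
    """Precompute the semantic vector of every related text, grouped by target-data type.

    related_texts: {data_type: {item_name: related_text}}
    encode: text -> semantic vector (the semantic generation model)
    """
    cache = {}
    for data_type, items in related_texts.items():
        cache[data_type] = {name: encode(text) for name, text in items.items()}
    return cache

# Built once in advance (for example at start-up); after a fault description text arrives,
# only its own semantic vector still has to be generated before the correlation calculation.
```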
In a possible implementation, the principle by which the information output device generates the semantic vector corresponding to the related text of the target data through the semantic generation model is the same as the principle by which it generates the semantic vector of the fault description text, and is not repeated here.
104. The information output device calculates the correlation between the semantic vector of the fault description text and the semantic vector of the related text of each type of target data.
For example, there are two types of target data, namely 100 different key performance indicators (key performance indicators 1 to 100) and 20 different device alarms (device alarms 1 to 20). The information output device calculates the correlations between the semantic vector of the fault description text and the semantic vectors of the related texts of the 100 key performance indicators, and the correlations between the semantic vector of the fault description text and the semantic vectors of the related texts of the 20 device alarms. Therefore, 120 correlations are obtained.
In a possible implementation, the angle between the vectors may be used as the measure of correlation, and the correlation between the semantic vector of the fault description text and the semantic vector of the related text of the target data may be expressed as:

$$\cos(\theta)=\frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\,\sqrt{\sum_{i=1}^{n} y_i^{2}}}$$

where cos(θ) is the correlation between the semantic vector of the fault description text and the semantic vector of the related text of the target data, n is the number of dimensions of the two semantic vectors, x_i is the i-th dimension of the semantic vector of the fault description text, and y_i is the i-th dimension of the semantic vector of the related text of the target data.
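The same calculation in code form (a direct transcription of the formula above):

```python
import numpy as np

def cosine_correlation(x, y):
    """cos(theta) between the two semantic vectors x and y."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))
```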
105. The information output device determines and outputs first data.
After the information output device calculates the correlation between the semantic vector of the fault description text and the semantic vector of the related text of each item of each type of target data, the information output device determines and outputs the first data. The first data is, for each type of target data, the target data whose semantic vector has the greatest correlation with the semantic vector of the fault description text, or the target data whose semantic vector has a correlation with the semantic vector of the fault description text that is greater than a preset threshold.
For example, the two types of target data obtained are 100 different key performance indicators (key performance indicators 1 to 100) and 20 different device alarms (device alarms 1 to 20). The correlations between the semantic vector of the fault description text and the semantic vectors of the related texts of key performance indicators 1 to 100 are correlations 1 to 100, respectively. Correlation 1 is the largest, so the information output device outputs key performance indicator 1. The correlations between the semantic vector of the fault description text and the semantic vectors of the related texts of device alarms 1 to 20 are correlations 101 to 120, respectively. Correlation 120 is the largest, so the information output device outputs device alarm 20.
For another example, if correlation 1 and correlation 2 are the correlations greater than the preset threshold, the information output device outputs key performance indicators 1 and 2; if correlation 101 and correlation 102 are the correlations greater than the preset threshold, the information output device outputs device alarms 1 and 2.
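A sketch of this selection step, building on the cosine_correlation helper defined after the formula in part 104 and the cache from the sketch in part 103; the two branches mirror the two variants of the first data described above.

```python
def select_first_data(fault_vec, cache, threshold=None):
    """Per type of target data, pick the item(s) most correlated with the fault description text.

    fault_vec: semantic vector of the fault description text.
    cache: {data_type: {item_name: semantic vector}} as produced by build_vector_cache above.
    If threshold is None, the single item with the largest correlation is returned per type;
    otherwise every item whose correlation exceeds the threshold is returned.
    """
    first_data = {}
    for data_type, vectors in cache.items():
        scores = {name: cosine_correlation(fault_vec, vec) for name, vec in vectors.items()}
        if threshold is None:
            first_data[data_type] = [max(scores, key=scores.get)]
        else:
            first_data[data_type] = [name for name, s in scores.items() if s > threshold]
    return first_data
```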
The greater the correlation between the semantic vector of a target data item and the semantic vector of the fault description text, the more related that target data is to the fault description text, and the user may need to view that target data to analyze the cause of the fault. For example, the fault description text is "OCS communication interrupted" and the name of a key performance indicator is "number of OCS communication interruptions"; the semantic vector of the fault description text is highly correlated with the semantic vector of the indicator name, so the user may need to view this key performance indicator to analyze the cause of the fault. It can be seen that, by implementing the method described in FIG. 1, data related to the fault description text for assisting in analyzing the cause of the fault can be found automatically.
In the prior art, a text having the same keywords as the fault description text is searched for, and the fault is inspected and analyzed according to the parameter data related to that text. However, a highly relevant text that could assist in analyzing the cause of the fault may not share any keywords with the fault description text, so the existing approach cannot accurately find the data associated with the fault description text that is needed to assist in analyzing the cause of the fault. By comparing the correlation between the semantic vector of the fault description text and the semantic vectors of the related texts of the target data, the embodiments of the present application can accurately find the target data associated with the fault description text. For example, the fault is described as "industry users access the Internet slowly", and the name of the key performance indicator for fault analysis that the embodiments of the present application identify as related to it is "downlink bandwidth control packet loss ratio". Literally, the two share no component that could be matched or associated; it is precisely because the present application learns, through semantic analysis and mining, domain knowledge such as "Internet access speed is related to the packet loss ratio" that the two can be associated.
Therefore, by implementing the method described in FIG. 1, data related to the fault description text for assisting in analyzing the cause of the fault can be found automatically and accurately.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a method for training a semantic generation model according to an embodiment of the present application. As shown in FIG. 2, the training method includes the following parts 201 to 203.
201. The model training device obtains a word vector set corresponding to a training text.
The word vectors included in the word vector set correspond one-to-one to the words in the training text. For example, if the training text includes 10,000 words, the word vector set also includes 10,000 word vectors. A word vector is used to represent the semantics of a word. Optionally, after the word vector set corresponding to the training text is obtained, it may be saved so that the word vectors in it can be used later.
The training text is the corpus. In a possible implementation, the training text may be encyclopedia-type text. Word vectors learned from encyclopedia texts have good general-purpose semantics.
In a possible implementation, the model training device first preprocesses the training text, splits it into sentences, performs word segmentation on each sentence to obtain the segmented training text, and obtains the word vector set corresponding to the segmented training text through the word2vec tool or another tool.
For example, the training text is "Mathematics is a discipline that uses symbolic language to study concepts such as changes in quantity and structure and space. I like mathematics." The model training device splits the training text into two sentences, namely "Mathematics is a discipline that uses symbolic language to study concepts such as changes in quantity and structure and space" and "I like mathematics", and then performs word segmentation on each of the two sentences to obtain the segmented training text. The model training device uses the word2vec tool to traverse the segmented training text sentence by sentence; when the traversal is finished, the word vector corresponding to each word in the training text has been obtained. The model training device saves the word vector set composed of the word vectors corresponding to the words in the training text.
The model training device may obtain the word vector set corresponding to the segmented training text through the word2vec tool using the CBOW algorithm. The idea of the CBOW algorithm is to predict the current word from the given context words. The training objective of the CBOW algorithm is to maximize the probability of a word appearing given its context. After training, each word has obtained a corresponding word vector at the output layer. Although the modeling idea of the CBOW algorithm is a classification process, it produces word vectors as a by-product.
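As one concrete way to obtain such a word vector set, the sketch below uses gensim's Word2Vec in CBOW mode; gensim stands in for "the word2vec tool or another tool", the gensim 4.x parameter names are assumed, and the tiny corpus and parameter values are purely illustrative.

```python
from gensim.models import Word2Vec  # assumed gensim 4.x parameter names

# The segmented training text: one list of words per sentence.
sentences = [
    ["mathematics", "is", "a", "discipline", "using", "symbolic", "language"],
    ["i", "like", "mathematics"],
]

model = Word2Vec(
    sentences,
    vector_size=128,  # dimension of each word vector
    window=2,         # two words on each side, i.e. the n-1 = 4 context words of the example below
    min_count=1,
    sg=0,             # sg=0 selects CBOW (predict the current word from its context)
    hs=1,             # hierarchical softmax, matching the Huffman-tree description below
)

# The word vector set: one vector per word of the training text.
word_vectors = {w: model.wv[w] for w in model.wv.index_to_key}
```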
For example, FIG. 3 is a schematic diagram of the neural network used by the CBOW algorithm. As shown in FIG. 3, the neural network consists of three layers: an input layer, a mapping layer, and an output layer. The output layer includes an already constructed Huffman tree. A leaf node of the Huffman tree represents the word vector of a word in the training text, and the word vector of the word corresponding to each leaf node is randomly initialized. Each non-leaf node holds a weight vector whose dimension is the same as that of the word vectors at the input layer.
The input layer receives the word vectors of the n-1 words surrounding a word w(t), where n is the window size. For example, if n is 5, the n-1 words surrounding w(t) are the two words before and the two words after w(t), namely w(t-2), w(t-1), w(t+1), and w(t+2). Correspondingly, the word vectors of these n-1 words are denoted v(w(t-2)), v(w(t-1)), v(w(t+1)), and v(w(t+2)). The input layer passes these n-1 word vectors to the mapping layer, which adds them together dimension by dimension. For example, the result at the mapping layer is pro(t) = v(w(t-2)) + v(w(t-1)) + v(w(t+1)) + v(w(t+2)).
The mapping layer inputs the summed vector pro(t) into the root node of the Huffman tree. After pro(t) is input into the root node, the probability from the root node to each leaf node is calculated. The training process of the model aims to maximize the probability of reaching, from the root node, the leaf node corresponding to w(t). Since the same context appears many times in a massive training text, the weight vectors are continuously corrected while traversing the training text to achieve this effect. After all the words in the training text have been traversed, the word vectors corresponding to the leaf nodes of the Huffman tree are the word vectors corresponding to the words of the training text. Here "all the words in the training text" includes repeated words in the training text.
Each time an intermediate node is passed on the path from the root node to the leaf node corresponding to the word w(t), a binary classification is performed, and the classifier may be a softmax regression classifier. The probability of each such classification is:

$$P(\mathrm{context}(w(t)),\theta_i)=\frac{1}{1+e^{-\theta_i^{T}\,\mathrm{pro}(t)}}$$

where θ_i denotes the i-th weight vector, pro(t) is the sum of the word vectors of the context of w(t), and e is the natural constant.
Suppose the path from the root node to the leaf node corresponding to the word w(t) passes through L intermediate nodes, and the parameters on these nodes form the parameter vector [θ_1, θ_2, θ_3, ..., θ_L]. Then the probability from the root node to the leaf node corresponding to w(t) is the product of the probabilities of the successive binary classifications, that is:

$$P(w(t)\mid \mathrm{context}(w(t)))=\prod_{i=1}^{L}P(\mathrm{context}(w(t)),\theta_i)$$

where P(w(t)|context(w(t))) is the probability from the root node to the leaf node corresponding to the word w(t), and the product runs over i from 1 to L. The probability from the root node to any other leaf node is calculated in the same way and is not repeated here.
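A toy numpy rendering of one root-to-leaf probability under these formulas; the sigmoid form of the per-node classifier and all inputs here are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def path_probability(pro_t, thetas):
    """Product of the L binary-classification probabilities along the path to a leaf.

    pro_t: sum of the context word vectors, pro(t).
    thetas: weight vectors theta_1 ... theta_L on the intermediate nodes of the path.
    """
    return float(np.prod([sigmoid(theta @ pro_t) for theta in thetas]))
```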
202. The model training device converts the historical fault description text into a word vector matrix composed of at least one word vector according to the word vector set.
Specifically, the model training device may convert a large number of historical fault description texts into word vector matrices and train the semantic generation model on these word vector matrices. For example, given historical fault description texts 1 to 100, the model training device converts historical fault description texts 1 to 100 into word vector matrices respectively, obtaining 100 word vector matrices, and trains the semantic generation model on these 100 word vector matrices.
In a possible implementation, the model training device converts the historical fault description text into a word vector matrix composed of at least one word vector according to the word vector set as follows: the model training device performs word segmentation on the historical fault description text to obtain a word sequence composed of at least one word corresponding to the historical fault description text; obtains, from the word vector set, the word vectors corresponding to the words included in the word sequence; and forms the word vector matrix of the historical fault description text from the word vectors corresponding to the respective words included in the word sequence. When the word vector set does not contain a word vector corresponding to a word included in the word sequence, a random vector may be generated as the word vector corresponding to that word. It can be seen that, by implementing this implementation, the historical fault description text can be accurately converted into a word vector matrix composed of at least one word vector.
For example, historical fault description text 1 includes four words, and word segmentation of historical fault description text 1 yields the word sequence "industry", "user", "Internet access", "slow". The model training device finds word vector 1 for "industry", word vector 2 for "user", and word vector 3 for "Internet access" in the word vector set, but does not find a word vector for "slow", so it generates a random vector, word vector 4, as the word vector for "slow". The model training device forms word vectors 1 to 4 into word vector matrix 1 of historical fault description text 1. Historical fault description texts 2 to 100 are converted into word vector matrices according to the same principle as historical fault description text 1, which is not repeated here.
203. The model training device trains the semantic generation model according to the word vector matrix.
Specifically, after obtaining the word vector matrix, the model training device may input the word vector matrix into a neural network for training to obtain the semantic generation model. The semantic generation model is used to generate the semantic vector of a text, and the semantic vector is used to represent the semantics of the text.
It can be seen that the method described in FIG. 2 models semantics progressively from the word level to the sentence level to obtain the semantic generation model, a training approach that conforms to the basic principles of language generation. Therefore, the semantic generation model trained by implementing the method described in FIG. 2 can express the semantics of a text more accurately.
In a possible implementation, the model training device trains the semantic generation model according to the word vector matrix as follows: the model training device obtains the faulty device type corresponding to the historical fault description text; trains a classification model according to the word vector matrix and a category label, where the category label includes the faulty device type; and obtains the semantic generation model according to the classification model. The semantic generation model trained in this implementation can express the semantics of a text more accurately.
For example, the faulty device type corresponding to the historical fault description text may be a router, a wired device, a wireless device, or the like. For instance, if the fault described by the historical fault description text is a fault generated by a router, the faulty device type corresponding to the historical fault description text is router. A front-line engineer may collect the faulty device type corresponding to each fault description text, add the fault description text, the corresponding faulty device type, and the data used to assist in analyzing the cause of the fault to a work order, and send the work order to the operation and maintenance terminal for fault cause analysis. Therefore, the model training device may obtain the faulty device type corresponding to the historical fault description text from the work order.
The classification model obtained through training is a model used to generate the faulty device type corresponding to a fault description text. For example, if the word vector matrix corresponding to fault description text 1 is input into the classification model, the classification model can output the faulty device type corresponding to fault description text 1.
In a possible implementation, the model training device trains the classification model according to the word vector matrix and the category label as follows: the word vector matrix and the category label are input into a neural network for iterative training, and in each training iteration the word vectors in the input word vector matrix and the parameters of the neural network are adjusted to generate the classification model. By implementing this implementation, the trained classification model can accurately classify fault description texts.
Optionally, the model training device may also use the word vectors in the adjusted word vector matrix to update the word vectors corresponding to the respective words in the word vector set. By implementing this optional manner, the word vectors in the word vector set can be corrected according to the historical fault description text corpus carrying domain knowledge, so that they better express the semantic information of words in the fault domain.
For example, FIG. 4 is a schematic structural diagram of a neural network used to train the classification model. As shown in FIG. 4, the neural network includes a convolutional layer, a pooling layer, and a fully connected layer. Word vector matrix 1 of historical fault description text 1 includes the word vectors {w1, w2, w3, w4, w5, w6}, each of dimension 128. After obtaining word vector matrix 1, the model training device inputs it into the neural network. As shown in FIG. 4, the neural network has two convolution kernels. Of course, in practical applications there may also be more than two convolution kernels; this embodiment of the present application uses two convolution kernels as an example. Convolution kernel 1 on the left convolves the word vectors of word vector matrix 1 two at a time. For example, convolving w1 and w2 gives C1, w2 and w3 give C2, w3 and w4 give C3, w4 and w5 give C4, and w5 and w6 give C5. Convolution kernel 2 on the right convolves the word vectors of word vector matrix 1 three at a time. For example, convolving w1, w2, and w3 gives C6, w2, w3, and w4 give C7, w3, w4, and w5 give C8, and w4, w5, and w6 give C9. In practical applications, other numbers of word vectors may also be convolved together; this embodiment of the present application uses two-at-a-time and three-at-a-time convolutions as an example.
It can be seen that convolution kernel 1 generates a feature map C = [C1, C2, ..., C5], and convolution kernel 2 generates a feature map C = [C6, C7, C8, C9]. After obtaining the feature map generated by each convolution kernel, the model training device performs a max-pooling operation on each feature map, selecting the maximum value in each dimension as the text feature vector generated by the current convolution kernel. The model training device concatenates all the text feature vectors to obtain the final semantic vector of historical fault description text 1. That is, as shown in FIG. 4, the model training device selects the maximum value from the first dimension of C1 to C5, the maximum value from the second dimension of C1 to C5, the maximum value from the third dimension of C1 to C5, and so on, until the maximum value is selected from the 128th dimension of C1 to C5. The selected maximum values of the 128 dimensions form text feature vector 1 corresponding to convolution kernel 1. Similarly, the model training device obtains text feature vector 2 corresponding to convolution kernel 2. The model training device concatenates text feature vector 1 and text feature vector 2 to obtain the final semantic vector of historical fault description text 1.
The model training device inputs the obtained semantic vector of historical fault description text 1 into the fully connected layer, and also inputs the faulty device type corresponding to historical fault description text 1 (for example, router) into the fully connected layer as the category label. The model training device analyzes the semantic vector of historical fault description text 1 at the fully connected layer and finds that the faulty device type with the largest probability is switch. Since this faulty device type (switch) differs from the category label corresponding to historical fault description text 1 (router), the model training device records that the faulty device type with the largest probability obtained by analyzing the semantic vector of historical fault description text 1 is incorrect. Similarly, the model training device inputs the word vector matrix of historical fault description text 2 into the neural network for training according to the above procedure, obtains the semantic vector of historical fault description text 2, and inputs the faulty device type corresponding to historical fault description text 2 (for example, switch) at the fully connected layer as the category label. The model training device analyzes the semantic vector of historical fault description text 2 and finds that the faulty device type with the largest probability is firewall, so it records that this result is incorrect. Suppose there are 100 historical fault description texts; the remaining 98 are likewise input into the neural network in the same way as historical fault description text 1 to train the classification model. After the first round of training on historical fault description texts 1 to 100, suppose the faulty device types with the largest probability obtained from the semantic vectors of historical fault description texts 1 to 50 are incorrect; the model training device then adjusts the parameters of the neural network and the word vectors in the word vector matrices corresponding to historical fault description texts 1 to 50. After the adjustment, historical fault description texts 1 to 100 are trained again with the new word vector matrices and neural network parameters, until the faulty device types with the largest probability obtained from the semantic vectors of historical fault description texts 1 to 100 match the category labels, at which point the classification model is generated; that is, the classification model is generated by iteratively training the neural network.
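A minimal PyTorch sketch of such a classifier, following FIG. 4 (kernel widths 2 and 3, max pooling per dimension, concatenation, then a fully connected layer); the ReLU nonlinearity, the layer sizes beyond those named above, and the use of PyTorch itself are assumptions for illustration, not the required implementation.

```python
import torch
import torch.nn as nn

class TextCNNClassifier(nn.Module):
    """Convolution (widths 2 and 3) -> max pooling -> concatenation -> fully connected layer."""

    def __init__(self, dim=128, num_classes=3):
        super().__init__()
        # 128 output channels so every position yields a 128-dimensional C_i, as in FIG. 4.
        self.conv2 = nn.Conv1d(dim, dim, kernel_size=2)   # two-at-a-time convolution (C1..C5)
        self.conv3 = nn.Conv1d(dim, dim, kernel_size=3)   # three-at-a-time convolution (C6..C9)
        self.fc = nn.Linear(2 * dim, num_classes)         # fully connected classification layer

    def encode(self, matrix):
        # matrix: (batch, number of words, dim) word vector matrix
        x = matrix.transpose(1, 2)                        # -> (batch, dim, number of words)
        f2 = torch.relu(self.conv2(x)).max(dim=2).values  # max value per dimension -> text feature vector 1
        f3 = torch.relu(self.conv3(x)).max(dim=2).values  # text feature vector 2
        return torch.cat([f2, f3], dim=1)                 # concatenation = semantic vector of the text

    def forward(self, matrix):
        return self.fc(self.encode(matrix))               # scores over the faulty device types
```

During training, a cross-entropy loss between these scores and the category label (the faulty device type) would drive the adjustment of both the network parameters and, if kept trainable, the word vectors themselves.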
最后模型训练装置使用最后一轮迭代训练输入的词向量矩阵中的词向量更新词向量集合中的相应词对应的词向量。例如,历史故障描述文本1为“上网速度慢”最后一轮迭代训练之前对历史故障描述文本1对应词向量矩阵进行了调整,将“上网”对应的词向量调整为词向量1,则最后一次迭代训练完成后,使用词向量1替换词向量集合中的“上网”对应的词向量。历史故障描述文本2为“OCS通讯中断”最后一轮迭代训练之前对历史故障描述文本2对应词向量矩阵进行了调整,将“中断”对应的词向量调整为词向量2,则最后一轮迭代训练完成后,使用词向量2替换词向量集合中的“中断”对应的词向量。其他历史故障描述文本同理,在此不赘述。Finally, the model training device uses the word vectors in the word vector matrix input in the last round of iteration training to update the word vectors corresponding to the corresponding words in the word vector set. For example, the historical
在一种可能的实施方式中，模型训练装置根据分类模型得到语义生成模型的具体实施方式为：模型训练装置将分类模型中全连接层以上的部分作为语义生成模型。通过实施该实施方式生成的语义生成模型，可以准确地生成文本的语义向量。In a possible implementation, the specific way in which the model training device obtains the semantic generation model from the classification model is as follows: the model training device uses the part of the classification model above the fully connected layer as the semantic generation model. The semantic generation model generated by this implementation can accurately generate the semantic vector of a text.
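Continuing the TextCnnClassifier sketch above (its names are assumptions of that sketch, not of the application), taking everything before the fully connected layer as the semantic generation model might look like:

```python
import torch

def generate_semantic_vector(model, token_ids):
    """Use only the convolution and pooling part of the trained classifier;
    the fully connected layer is not applied."""
    with torch.no_grad():
        return model.semantic_vector(token_ids)
```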
本发明实施例可以根据上述方法示例对设备进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。需要说明的是,本发明实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。In the embodiment of the present invention, the device can be divided into functional modules according to the above method examples. For example, each functional module can be divided corresponding to each function, or two or more functions can be integrated into one module. The above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules. It should be noted that the division of modules in the embodiment of the present invention is schematic, and is only a logical function division, and there may be another division manner in actual implementation.
请参见图5，图5是本发明实施例提供的一种信息输出装置。该信息输出装置包括：获取模块501、生成模块502、计算模块503和输出模块504。其中：Please refer to FIG. 5, which is an information output device provided by an embodiment of the present invention. The information output device includes an obtaining module 501, a generating module 502, a computing module 503 and an output module 504, where:
获取模块501，用于获取故障描述文本；生成模块502，用于通过语义生成模型生成故障描述文本的语义向量，该故障描述文本用于描述网络中发生的故障；获取模块501，还用于获取多种类型的目标数据的相关文本分别对应的语义向量，该目标数据用于协助分析故障产生的原因；计算模块503，用于计算故障描述文本的语义向量与每种目标数据的相关文本的语义向量的相关性；输出模块504，用于确定并输出第一数据，第一数据为每种目标数据中语义向量与故障描述文本的语义向量的相关性最大的目标数据，或第一数据为每种目标数据中语义向量与故障描述文本的语义向量的相关性大于预设阈值的目标数据。The obtaining module 501 is configured to obtain a fault description text; the generating module 502 is configured to generate a semantic vector of the fault description text through a semantic generation model, where the fault description text is used to describe a fault occurring in the network; the obtaining module 501 is further configured to obtain semantic vectors respectively corresponding to related texts of multiple types of target data, where the target data is used to assist in analyzing the cause of the fault; the computing module 503 is configured to compute the correlation between the semantic vector of the fault description text and the semantic vector of the related text of each type of target data; and the output module 504 is configured to determine and output first data, where the first data is the target data, among each type of target data, whose semantic vector has the largest correlation with the semantic vector of the fault description text, or the first data is the target data, among each type of target data, whose semantic vector has a correlation with the semantic vector of the fault description text greater than a preset threshold.
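The embodiments above speak of "correlation" between semantic vectors without fixing a metric; cosine similarity is one common choice and is assumed in the sketch below, together with an illustrative dict-of-dicts layout for the target data vectors:

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def select_first_data(fault_vector, target_vectors, threshold=None):
    """target_vectors: {data type: {item name: semantic vector of its related text}}.
    Without a threshold, return the most correlated item of each data type;
    with a threshold, return every item whose correlation exceeds it."""
    first_data = {}
    for data_type, items in target_vectors.items():
        scores = {name: cosine_similarity(fault_vector, vec) for name, vec in items.items()}
        if threshold is None:
            first_data[data_type] = [max(scores, key=scores.get)]
        else:
            first_data[data_type] = [name for name, s in scores.items() if s > threshold]
    return first_data
```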
在一种可能的实施方式中，生成模块502，还用于在获取模块501获取故障描述文本之前，通过语义生成模型生成多种目标数据的相关文本分别对应的语义向量。In a possible implementation, the generating module 502 is further configured to, before the obtaining module 501 obtains the fault description text, generate, through the semantic generation model, the semantic vectors respectively corresponding to the related texts of the multiple types of target data.
在一种可能的实施方式中,语义生成模型是根据历史故障描述文本对应的词向量矩阵训练生成的,词向量矩阵包括历史故障描述文本中各个词对应的词向量,该词向量用于表示词的语义。In a possible implementation, the semantic generation model is trained and generated according to the word vector matrix corresponding to the historical fault description text, and the word vector matrix includes the word vector corresponding to each word in the historical fault description text, and the word vector is used to represent the word semantics.
在一种可能的实施方式中,该多种类型的目标数据包括关键性能指标、设备告警、设备日志中的至少两种;当目标数据为关键性能指标时,目标数据的相关文本为关键性能指标的名称;当目标数据为设备告警时,目标数据的相关文本为设备告警的标识;当目标数据为设备日志时,目标数据的相关文本为设备日志的内容片段。In a possible implementation, the multiple types of target data include at least two of key performance indicators, device alarms, and device logs; when the target data is a key performance indicator, the relevant text of the target data is a key performance indicator name; when the target data is a device alarm, the relevant text of the target data is the identification of the device alarm; when the target data is a device log, the relevant text of the target data is the content fragment of the device log.
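A small sketch of picking the related text per type of target data; the record field names are illustrative, since the embodiment does not fix a storage format:

```python
def related_text(data_type, record):
    if data_type == "kpi":
        return record["name"]              # name of the key performance indicator
    if data_type == "alarm":
        return record["alarm_id"]          # identifier of the device alarm
    if data_type == "log":
        return record["content_fragment"]  # content fragment of the device log
    raise ValueError(f"unknown target data type: {data_type}")
```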
请参见图6，图6是本发明实施例提供的一种模型训练装置。该模型训练装置包括获取模块601、转换模块602和训练模块603，其中：Please refer to FIG. 6, which is a model training device provided by an embodiment of the present invention. The model training device includes an obtaining module 601, a conversion module 602 and a training module 603, where:
获取模块601，用于获取训练文本对应的词向量集合，词向量集合中包括的词向量与训练文本中的词一一对应；转换模块602，用于根据词向量集合将历史故障描述文本转换为由至少一个词向量组成的词向量矩阵；训练模块603，用于根据词向量矩阵训练得到语义生成模型，语义生成模型用于生成文本的语义向量。The obtaining module 601 is configured to obtain a word vector set corresponding to a training text, where the word vectors included in the word vector set are in one-to-one correspondence with the words in the training text; the conversion module 602 is configured to convert a historical fault description text into a word vector matrix composed of at least one word vector according to the word vector set; and the training module 603 is configured to train a semantic generation model according to the word vector matrix, where the semantic generation model is used to generate a semantic vector of a text.
在一种可能的实施方式中，转换模块602具体用于：对历史故障描述文本进行分词处理，得到历史故障描述文本对应的由至少一个词组成的词序列；从词向量集合中获取词序列包括的词对应的词向量；将词序列包括的各个词对应的词向量组成词向量矩阵。In a possible implementation, the conversion module 602 is specifically configured to: perform word segmentation on the historical fault description text to obtain a word sequence composed of at least one word corresponding to the historical fault description text; obtain, from the word vector set, the word vectors corresponding to the words included in the word sequence; and form the word vector matrix from the word vectors corresponding to the respective words included in the word sequence.
在一种可能的实施方式中，转换模块602还具体用于：当词向量集合中不存在词序列包括的词对应的词向量时，生成随机向量作为词序列包括的词对应的词向量。In a possible implementation, the conversion module 602 is further specifically configured to: when a word vector corresponding to a word included in the word sequence does not exist in the word vector set, generate a random vector as the word vector corresponding to that word.
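A minimal sketch of the conversion described in the two implementations above, assuming the word sequence has already been produced by a segmenter (for example jieba) and falling back to a random vector for any word missing from the word vector set:

```python
import numpy as np

def to_word_vector_matrix(word_sequence, word_vector_set, dim=128, rng=None):
    """Look every word of the segmented fault text up in the word vector set;
    a word without an entry gets a random vector, which is also kept in the set."""
    rng = rng or np.random.default_rng()
    rows = []
    for word in word_sequence:
        if word not in word_vector_set:
            word_vector_set[word] = rng.normal(size=dim)   # random vector for an unseen word
        rows.append(word_vector_set[word])
    return np.stack(rows)                                  # (number of words, dim)

# Illustrative call with an empty word vector set.
matrix = to_word_vector_matrix(["OCS", "通讯", "中断"], {}, dim=128)
```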
在一种可能的实施方式中，训练模块603根据词向量矩阵训练得到语义生成模型的方式具体为：获取历史故障描述文本对应的故障设备类型；根据词向量矩阵和类别标签训练分类模型，该类别标签包括所述故障设备类型；根据分类模型得到语义生成模型。In a possible implementation, the manner in which the training module 603 trains the semantic generation model according to the word vector matrix is specifically: obtaining the faulty device type corresponding to the historical fault description text; training a classification model according to the word vector matrix and a category label, where the category label includes the faulty device type; and obtaining the semantic generation model according to the classification model.
在一种可能的实施方式中，训练模块603根据词向量矩阵和类别标签训练分类模型的方式具体为：将词向量矩阵和类别标签输入神经网络进行迭代训练，在每次迭代训练时对输入神经网络的词向量矩阵中的词向量和神经网络的参数进行调整，以生成分类模型。In a possible implementation, the manner in which the training module 603 trains the classification model according to the word vector matrix and the category label is specifically: inputting the word vector matrix and the category label into a neural network for iterative training, and adjusting, in each training iteration, the word vectors in the word vector matrix input into the neural network and the parameters of the neural network, so as to generate the classification model.
请参见图7，图7是本申请实施例公开的一种信息输出装置的结构示意图。如图7所示，该信息输出装置700包括处理器701、存储器702和通信接口703。其中，处理器701、存储器702和通信接口703相连。Please refer to FIG. 7, which is a schematic structural diagram of an information output device disclosed in an embodiment of the present application. As shown in FIG. 7, the information output device 700 includes a processor 701, a memory 702 and a communication interface 703, where the processor 701, the memory 702 and the communication interface 703 are connected.
其中，处理器701可以是中央处理器(central processing unit，CPU)，通用处理器，协处理器，数字信号处理器(digital signal processor，DSP)，专用集成电路(application-specific integrated circuit，ASIC)，现场可编程门阵列(field programmable gate array，FPGA)或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。该处理器701也可以是实现计算功能的组合，例如包含一个或多个微处理器组合，DSP和微处理器的组合等等。The processor 701 may be a central processing unit (CPU), a general-purpose processor, a coprocessor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or another programmable logic device, transistor logic device, hardware component, or any combination thereof. The processor 701 may also be a combination that implements a computing function, for example, a combination including one or more microprocessors, or a combination of a DSP and a microprocessor.
其中，通信接口703用于实现与其他网元之间的通信。The communication interface 703 is configured to implement communication with other network elements.
其中，处理器701调用存储器702中存储的程序代码，可执行上述方法实施例中信息输出装置所执行的步骤。The processor 701 invokes the program code stored in the memory 702 to perform the steps performed by the information output device in the foregoing method embodiments.
请参见图8，图8是本申请实施例公开的一种模型训练装置的结构示意图。如图8所示，该模型训练装置800包括处理器801、存储器802和通信接口803。其中，处理器801、存储器802和通信接口803相连。Please refer to FIG. 8, which is a schematic structural diagram of a model training device disclosed in an embodiment of the present application. As shown in FIG. 8, the model training device 800 includes a processor 801, a memory 802 and a communication interface 803, where the processor 801, the memory 802 and the communication interface 803 are connected.
其中，处理器801可以是中央处理器(central processing unit，CPU)，通用处理器，协处理器，数字信号处理器(digital signal processor，DSP)，专用集成电路(application-specific integrated circuit，ASIC)，现场可编程门阵列(field programmable gate array，FPGA)或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。该处理器801也可以是实现计算功能的组合，例如包含一个或多个微处理器组合，DSP和微处理器的组合等等。The processor 801 may be a central processing unit (CPU), a general-purpose processor, a coprocessor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or another programmable logic device, transistor logic device, hardware component, or any combination thereof. The processor 801 may also be a combination that implements a computing function, for example, a combination including one or more microprocessors, or a combination of a DSP and a microprocessor.
其中，通信接口803用于实现与其他网元之间的通信。The communication interface 803 is configured to implement communication with other network elements.
其中，处理器801调用存储器802中存储的程序代码，可执行上述方法实施例中模型训练装置所执行的步骤。The processor 801 invokes the program code stored in the memory 802 to perform the steps performed by the model training device in the foregoing method embodiments.
基于同一发明构思,本申请实施例中提供的各设备解决问题的原理与本申请方法实施例相似,因此各设备的实施可以参见方法的实施,为简洁描述,在这里不再赘述。Based on the same inventive concept, the problem-solving principle of each device provided in the embodiment of the present application is similar to that of the method embodiment of the present application. Therefore, the implementation of each device can refer to the implementation of the method. For a concise description, it is not repeated here.
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。In the foregoing embodiments, the descriptions of each embodiment have their own emphases, and for parts not described in detail in a certain embodiment, reference may be made to relevant descriptions of other embodiments.
最后应说明的是：以上各实施例仅用以说明本申请的技术方案，而非对其限制；尽管参照前述各实施例对本申请进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分或者全部技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本申请各实施例技术方案的范围。Finally, it should be noted that the foregoing embodiments are merely intended to illustrate the technical solutions of the present application rather than to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features thereof may be equivalently replaced; and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.
Claims (15)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810415523.0A CN109902283B (en) | 2018-05-03 | 2018-05-03 | An information output method and device |
| PCT/CN2019/084814 WO2019210820A1 (en) | 2018-05-03 | 2019-04-28 | Information output method and apparatus |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810415523.0A CN109902283B (en) | 2018-05-03 | 2018-05-03 | An information output method and device |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN109902283A CN109902283A (en) | 2019-06-18 |
| CN109902283B true CN109902283B (en) | 2023-06-06 |
Family
ID=66943185
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201810415523.0A Active CN109902283B (en) | 2018-05-03 | 2018-05-03 | An information output method and device |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN109902283B (en) |
| WO (1) | WO2019210820A1 (en) |
Families Citing this family (52)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110245233A (en) * | 2019-06-19 | 2019-09-17 | 北京航空航天大学 | A fault determination method and device |
| CN110378486B (en) * | 2019-07-15 | 2021-12-03 | 出门问问信息科技有限公司 | Network embedding method and device, electronic equipment and storage medium |
| CN110569330A (en) * | 2019-07-18 | 2019-12-13 | 华瑞新智科技(北京)有限公司 | text labeling system, device, equipment and medium based on intelligent word selection |
| CN110704231A (en) * | 2019-09-30 | 2020-01-17 | 深圳前海微众银行股份有限公司 | A fault handling method and device |
| CN112667805B (en) * | 2019-09-30 | 2024-04-09 | 北京沃东天骏信息技术有限公司 | Work order category determining method, device, equipment and medium |
| CN110909550B (en) * | 2019-11-13 | 2023-11-03 | 北京环境特性研究所 | Text processing method, text processing device, electronic equipment and readable storage medium |
| CN111078822A (en) * | 2019-11-29 | 2020-04-28 | 北京百卓网络技术有限公司 | Reader information extraction method and system based on Chinese novel text |
| CN112988921A (en) * | 2019-12-13 | 2021-06-18 | 北京四维图新科技股份有限公司 | Method and device for identifying map information change |
| CN111046674B (en) * | 2019-12-20 | 2024-05-31 | 科大讯飞股份有限公司 | Semantic understanding method and device, electronic equipment and storage medium |
| CN111124925B (en) * | 2019-12-25 | 2024-04-05 | 斑马网络技术有限公司 | Scene extraction method, device, equipment and storage medium based on big data |
| CN113313134B (en) * | 2020-02-26 | 2025-04-18 | 阿里巴巴集团控股有限公司 | Cluster fault repair method and model training method, device and server |
| CN111460798B (en) * | 2020-03-02 | 2024-10-18 | 平安科技(深圳)有限公司 | Method, device, electronic equipment and medium for pushing paraphrasing |
| CN111291564B (en) * | 2020-03-03 | 2023-10-31 | 腾讯科技(深圳)有限公司 | Model training method, device and storage medium for word vector acquisition |
| CN113495949B (en) * | 2020-03-18 | 2024-06-18 | 北京沃东天骏信息技术有限公司 | Text recognition method, system, computer system and medium |
| CN111429155A (en) * | 2020-03-25 | 2020-07-17 | 中国银行股份有限公司 | Bank card dispute processing method and device |
| CN111274366B (en) * | 2020-03-25 | 2024-12-20 | 联想(北京)有限公司 | Search recommendation method, device, equipment, and storage medium |
| CN111767721B (en) * | 2020-03-26 | 2025-01-10 | 北京沃东天骏信息技术有限公司 | Information processing method, device and equipment |
| CN111858725B (en) * | 2020-04-30 | 2024-11-12 | 北京嘀嘀无限科技发展有限公司 | Event attribute determining method and system |
| CN111651601B (en) * | 2020-06-02 | 2023-04-18 | 全球能源互联网研究院有限公司 | Training method and classification method for fault classification model of power information system |
| CN112749553B (en) * | 2020-06-05 | 2023-07-25 | 腾讯科技(深圳)有限公司 | Text information processing method and device for video file and server |
| CN113779975B (en) * | 2020-06-10 | 2024-03-01 | 北京猎户星空科技有限公司 | Semantic recognition method, device, equipment and medium |
| CN113822016B (en) * | 2020-06-19 | 2024-03-22 | 阿里巴巴集团控股有限公司 | Text data processing method and device, electronic equipment and readable storage medium |
| CN112069833B (en) * | 2020-09-01 | 2024-04-30 | 北京声智科技有限公司 | Log analysis method, log analysis device and electronic equipment |
| CN112183994B (en) * | 2020-09-23 | 2023-05-12 | 南方电网数字电网研究院有限公司 | Evaluation method and device for equipment state, computer equipment and storage medium |
| CN113761184B (en) * | 2020-09-29 | 2024-12-31 | 北京沃东天骏信息技术有限公司 | Text data classification method, device and storage medium |
| CN112383421B (en) * | 2020-11-03 | 2023-03-24 | 中国联合网络通信集团有限公司 | Fault positioning method and device |
| CN112433874A (en) * | 2020-11-05 | 2021-03-02 | 北京浪潮数据技术有限公司 | Fault positioning method, system, electronic equipment and storage medium |
| CN112507720B (en) * | 2020-11-12 | 2024-08-20 | 西安交通大学 | Causal semantic relation transfer-based graph convolution network root cause identification method |
| CN112463378B (en) * | 2020-11-27 | 2023-12-22 | 北京浪潮数据技术有限公司 | Server asset scanning method, system, electronic equipment and storage medium |
| CN112529104B (en) * | 2020-12-23 | 2024-06-18 | 东软睿驰汽车技术(沈阳)有限公司 | Vehicle fault prediction model generation method, fault prediction method and device |
| CN112711947B (en) * | 2021-01-09 | 2023-08-22 | 国网湖北省电力有限公司电力科学研究院 | Text vectorization-based fault power failure emergency repair handling reference method |
| CN112818008A (en) * | 2021-01-21 | 2021-05-18 | 中广核工程有限公司 | Intelligent diagnosis method, system, medium and electronic equipment for nuclear power debugging faults |
| CN113590983B (en) * | 2021-01-28 | 2024-12-03 | 腾讯科技(深圳)有限公司 | Description text generation method and device, text processing model training method |
| CN112925668B (en) * | 2021-02-25 | 2024-04-05 | 北京百度网讯科技有限公司 | Method, device, equipment and storage medium for evaluating server health |
| CN113821418B (en) * | 2021-06-24 | 2024-05-14 | 腾讯科技(深圳)有限公司 | Fault root cause analysis method and device, storage medium and electronic equipment |
| CN113610112B (en) * | 2021-07-09 | 2024-04-16 | 中国商用飞机有限责任公司上海飞机设计研究院 | Aircraft assembly quality defect auxiliary decision-making method |
| CN113657022B (en) * | 2021-07-15 | 2024-05-14 | 华为技术有限公司 | Chip fault recognition method and related equipment |
| CN113591477B (en) * | 2021-08-10 | 2023-09-15 | 平安银行股份有限公司 | Fault positioning method, device, equipment and storage medium based on associated data |
| CN113722494A (en) * | 2021-09-10 | 2021-11-30 | 中国航空工业集团公司西安飞行自动控制研究所 | Equipment fault positioning method based on natural language understanding |
| CN114036293B (en) * | 2021-11-03 | 2023-06-06 | 腾讯科技(深圳)有限公司 | Data processing method and device and electronic equipment |
| CN113961708B (en) * | 2021-11-10 | 2024-04-23 | 北京邮电大学 | A method for tracing power equipment faults based on multi-level graph convolutional networks |
| CN114265930B (en) * | 2021-11-19 | 2025-02-07 | 国电南京自动化股份有限公司 | A method for merging and processing low-voltage user fault reports based on event extraction |
| CN114218402B (en) * | 2021-12-17 | 2024-05-28 | 迈创企业管理服务股份有限公司 | Method for recommending computer hardware fault replacement parts |
| CN114625839A (en) * | 2022-03-18 | 2022-06-14 | 广东电网有限责任公司 | Text classification method, device, equipment and storage medium for power grid maintenance list |
| CN119358532A (en) * | 2022-04-12 | 2025-01-24 | 支付宝(杭州)信息技术有限公司 | Fault change location method, device, equipment, medium and program product |
| CN115687031A (en) * | 2022-11-15 | 2023-02-03 | 北京优特捷信息技术有限公司 | Method, device, equipment and medium for generating alarm description text |
| CN115994217B (en) * | 2022-11-29 | 2024-01-23 | 南京审计大学 | Financial report fraud detection method and system |
| CN116341542A (en) * | 2023-03-31 | 2023-06-27 | 上海市特种设备监督检验技术研究院 | A method, device, and storage medium for assisting decision-making on causes of special equipment accidents |
| CN116502058B (en) * | 2023-06-28 | 2023-09-26 | 长园深瑞能源技术有限公司 | AI fault detection analysis method and system applied to charging pile system and cloud platform |
| CN116738323B (en) * | 2023-08-08 | 2023-10-27 | 北京全路通信信号研究设计院集团有限公司 | Fault diagnosis method, device, equipment and medium for railway signal equipment |
| CN117493886B (en) * | 2023-11-16 | 2024-11-12 | 河南科测电力设备有限公司 | Training method and device for intelligent recognition model of transformer fault based on text |
| CN118940725B (en) * | 2024-07-01 | 2025-02-25 | 合肥霍因科技有限公司 | A standardized semantic mapping method and system based on artificial intelligence and big data |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2003173286A (en) * | 2001-12-05 | 2003-06-20 | Nippon Telegr & Teleph Corp <Ntt> | Acquisition method of semantic information in distributed network |
| CN101795210A (en) * | 2010-01-11 | 2010-08-04 | 浪潮通信信息系统有限公司 | Method for processing communication network failure |
| CN102650960A (en) * | 2012-03-31 | 2012-08-29 | 奇智软件(北京)有限公司 | Method and device for eliminating faults of terminal equipment |
| CN107171819A (en) * | 2016-03-07 | 2017-09-15 | 北京华为数字技术有限公司 | A kind of network fault diagnosis method and device |
Family Cites Families (16)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR101078751B1 (en) * | 2011-02-23 | 2011-11-02 | 한국과학기술정보연구원 | Method and apparatus for detecting error in lexical resource semantic network |
| US20120233112A1 (en) * | 2011-03-10 | 2012-09-13 | GM Global Technology Operations LLC | Developing fault model from unstructured text documents |
| US9256595B2 (en) * | 2011-10-28 | 2016-02-09 | Sap Se | Calculating term similarity using a meta-model semantic network |
| US20130339787A1 (en) * | 2012-06-15 | 2013-12-19 | International Business Machines Coporation | Systematic failure remediation |
| CN103617157B (en) * | 2013-12-10 | 2016-08-17 | 东北师范大学 | Based on semantic Text similarity computing method |
| CN103744905B (en) * | 2013-12-25 | 2018-03-30 | 新浪网技术(中国)有限公司 | Method for judging rubbish mail and device |
| CN104361026B (en) * | 2014-10-22 | 2017-09-19 | 北京航空航天大学 | A fault knowledge storage and push method in the process of FMEA analysis |
| CN106815252B (en) * | 2015-12-01 | 2020-08-25 | 阿里巴巴集团控股有限公司 | Searching method and device |
| CN106326346A (en) * | 2016-08-06 | 2017-01-11 | 上海高欣计算机系统有限公司 | Text classification method and terminal device |
| CN106941423B (en) * | 2017-04-13 | 2018-06-05 | 腾讯科技(深圳)有限公司 | Failure cause localization method and device |
| CN107248927B (en) * | 2017-05-02 | 2020-06-09 | 华为技术有限公司 | Fault location model generation method, fault location method and device |
| CN107291693B (en) * | 2017-06-15 | 2021-01-12 | 广州赫炎大数据科技有限公司 | Semantic calculation method for improved word vector model |
| CN107291699B (en) * | 2017-07-04 | 2020-11-24 | 湖南星汉数智科技有限公司 | Sentence semantic similarity calculation method |
| CN107340766B (en) * | 2017-07-10 | 2019-04-12 | 浙江大学 | Power scheduling alarm signal text based on similarity sorts out and method for diagnosing faults |
| CN107391727B (en) * | 2017-08-01 | 2020-03-06 | 北京航空航天大学 | Method and device for mining equipment failure sequence patterns |
| CN107704563B (en) * | 2017-09-29 | 2021-05-18 | 广州多益网络股份有限公司 | Question recommendation method and system |
Application timeline:
- 2018-05-03: CN application CN201810415523.0A (CN109902283B), status Active
- 2019-04-28: WO application PCT/CN2019/084814 (WO2019210820A1), status Ceased (not active)
Also Published As
| Publication number | Publication date |
|---|---|
| CN109902283A (en) | 2019-06-18 |
| WO2019210820A1 (en) | 2019-11-07 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN109902283B (en) | An information output method and device | |
| CN110609759B (en) | Fault root cause analysis method and device | |
| CN111079430B (en) | Power failure event extraction method combining deep learning and concept map | |
| CN112182219A (en) | An online service anomaly detection method based on log semantic analysis | |
| Redl et al. | Automatic SLA matching and provider selection in grid and cloud computing markets | |
| CN111027629B (en) | Power distribution network fault power failure rate prediction method and system based on improved random forest | |
| CN114785666B (en) | Network troubleshooting method and system | |
| CN105825269B (en) | A kind of feature learning method and system based on parallel automatic coding machine | |
| CN111104242A (en) | Method and device for processing abnormal logs of operating system based on deep learning | |
| CN114912460B (en) | Method and equipment for identifying transformer faults by refined fitting based on text mining | |
| CN113824575B (en) | Method and device for identifying fault node, computing equipment and computer storage medium | |
| CN111126820A (en) | Anti-stealing method and system | |
| CN115766518B (en) | Anomaly detection model training, anomaly detection method and system for cloud-edge systems | |
| CN116775497B (en) | Database test case generation demand description coding method | |
| CN118467229A (en) | A database alarm intelligent diagnosis method and system based on multimodal knowledge graph fusion and small sample learning | |
| CN116795977A (en) | Data processing methods, devices, equipment and computer-readable storage media | |
| CN113779882A (en) | Method, device, device and storage medium for predicting remaining service life of equipment | |
| CN117633518A (en) | An industrial chain construction method and system | |
| CN115934666B (en) | Feature-enhanced cloud container abnormal log classification method based on graph convolutional neural network | |
| CN116401372A (en) | Knowledge graph representation learning method and device, electronic equipment and readable storage medium | |
| CN115169458A (en) | Adaptive fault diagnosis method, device and related medium based on active learning | |
| CN119249221A (en) | Intelligent decision making system based on user prior information guidance | |
| CN114444496A (en) | Short text entity correlation identification method, system, electronic device and storage medium | |
| CN118890087A (en) | A communication fault location system based on machine learning technology and corresponding method | |
| CN118819933A (en) | A fault root cause location method and model training method |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |