CN111985677B - Causal link analysis method, causal link analysis equipment and computer readable storage medium
- Publication number
- CN111985677B, CN202010618648.0A
- Authority
- CN
- China
- Prior art keywords
- link
- directed
- causal
- variables
- correlation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/04—Manufacturing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Economics (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Biomedical Technology (AREA)
- Artificial Intelligence (AREA)
- Strategic Management (AREA)
- Life Sciences & Earth Sciences (AREA)
- Data Mining & Analysis (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Primary Health Care (AREA)
- Manufacturing & Machinery (AREA)
- Development Economics (AREA)
- Game Theory and Decision Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a causal link analysis method, a causal link analysis device and a computer readable storage medium, wherein the causal link analysis method comprises the following steps: recording and analyzing a production process of a specified object to obtain a plurality of specified variables corresponding to the production process; determining an undirected correlation link between the plurality of specified variables based on a correlation determination rule; predicting the undirected correlation link according to a causal probability prediction model to obtain a directed causal probability value corresponding to the undirected correlation link; and determining a directed causal link between the plurality of specified variables according to the directed causal probability value, the directed causal link being used to characterize a causal relationship of the production process of the specified object. In this way, the order in which the specified variables influence one another in the production process can be determined.
Description
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a causal link analysis method, a causal link analysis device, and a computer readable storage medium.
Background
In industrial production there are a large number of process parameters, and these parameters are strongly coupled: a change in one process parameter often causes changes in one or more other process parameters, thereby affecting the final product. When process parameters are adjusted empirically and in isolation, a "chain reaction" may result; for example, increasing the temperature at A may cause an excessive temperature at B. At present, when process parameters are modeled and analyzed, they are usually adjusted on the basis of correlation analysis. Correlation analysis, however, can only establish that process parameters are correlated; it cannot accurately predict what result an adjustment will cause. The process parameters therefore cannot be accurately controlled, and the results are unsatisfactory in areas such as improving product quality and intelligent early warning for devices.
Disclosure of Invention
The embodiment of the invention provides a causal link analysis method, causal link analysis equipment and a computer readable storage medium, which can know the sequence of mutual influence among all specified variables in the production process.
In one aspect, the present invention provides a causal link analysis method, the method comprising: recording and analyzing a production process of a specified object to obtain a plurality of specified variables corresponding to the production process; determining an undirected correlation link between the plurality of specified variables based on a correlation determination rule; predicting the undirected related link according to a causal probability prediction model to obtain a directed causal probability value corresponding to the undirected related link; and determining a directed causal link between the plurality of specified variables according to the directed causal probability value, wherein the directed causal link is used for representing the causal relationship of the specified object production process.
In one embodiment, the recording and analyzing the specified object production process to obtain a plurality of specified variables corresponding to the production process includes: recording the operation process of the specified object through a distributed control system to obtain an operation record; screening and supplementing the operation records to obtain associated information corresponding to the production process; wherein the production process is included in the operation process; and carrying out standardization processing on the associated information to obtain the specified variable.
In an embodiment, the determining the undirected correlation link between the specified variables based on the correlation determination rule includes: establishing a directed complete graph corresponding to the plurality of specified variables, wherein the directed complete graph comprises a plurality of bidirectional links used for connecting the plurality of specified variables, and any bidirectional link comprises two directed links with opposite directions; determining a correlation value corresponding to each directional link according to the specified variables; screening a correlation value meeting a first threshold value, and determining a directed link corresponding to the correlation value meeting the first threshold value as a first directed link; and determining a directed correlation graph according to the specified variables and the first directed link, wherein the directed correlation graph is used for representing the undirected correlation link among the specified variables.
In an embodiment, the causal probability prediction model is a graph neural network model; correspondingly, the predicting the undirected related link according to the causal probability prediction model to obtain a directed causal probability value corresponding to the undirected related link includes: determining a variable set and a link set according to the directed correlation graph; wherein the set of variables comprises a plurality of specified variables, and the set of links comprises a first directed link; and predicting the variable set and the link set through the graph neural network model to obtain a causal probability set corresponding to the link set, wherein the causal probability set comprises a directed causal probability value corresponding to the first directed link.
In one embodiment, determining the directed causal link between the plurality of specified variables based on the directed causal probability value comprises: screening directed cause and effect probability values meeting a second threshold, and determining a directed link corresponding to the directed cause and effect probability value meeting the second threshold as a second directed link; determining a directed cause and effect link graph from the plurality of designated variables and the second directed link, the directed cause and effect link graph being used to characterize the directed cause and effect link between the designated variables.
In an embodiment, before the determining the undirected correlation link between the plurality of specified variables based on the correlation determination rule, the method further includes: carrying out causal link labeling between at least two specified variables, and determining a known directed causal link; and performing semi-supervised learning on the causal probability prediction model based on the known directed causal link to obtain a causal probability prediction model.
Another aspect of the invention provides a causal link analysis apparatus, the apparatus comprising: the analysis module is used for carrying out record analysis on the production process of the appointed object to obtain a plurality of appointed variables; a determining module, configured to determine, based on a relevance determining rule, an undirected relevance link between the plurality of specified variables; the prediction module is used for predicting the undirected related link according to a causal probability prediction model to obtain a directed causal probability value corresponding to the undirected related link; the determining module is further configured to determine a directed causal link between the plurality of specified variables based on the directed causal probability value.
In one embodiment, the analysis module includes: the recording sub-module is used for recording the operation process of the appointed object through the distributed control system to obtain an operation record; the screening and supplementing sub-module is used for screening and supplementing the operation records to obtain the associated information corresponding to the production process; wherein the production process is included in the operation process; and the processing sub-module is used for carrying out standardized processing on the associated information to obtain a specified variable.
In an embodiment, the determining module includes: the establishing sub-module is used for establishing a directed complete graph corresponding to the specified variables, wherein the directed complete graph comprises a plurality of bidirectional links used for connecting the specified variables, and any bidirectional link comprises two directed links with opposite directions; a determining submodule, configured to determine a correlation value corresponding to each directional link according to the plurality of specified variables; a screening sub-module, configured to screen a correlation value that meets a first threshold, and determine a directional link corresponding to the correlation value that meets the first threshold as a first directional link; the determining submodule is further used for determining a directed correlation graph according to the specified variables and the first directed link, and the directed correlation graph is used for representing the undirected correlation link among the specified variables.
In an embodiment, the causal probability prediction model is a graph neural network model; correspondingly, the predicting the undirected related link according to the causal probability prediction model to obtain a directed causal probability value corresponding to the undirected related link includes: determining a variable set and a link set according to the directed correlation graph; wherein the set of variables comprises a plurality of specified variables, and the set of links comprises a first directed link; and predicting the variable set and the link set through the graph neural network model to obtain a causal probability set corresponding to the link set, wherein the causal probability set comprises a directed causal probability value corresponding to the first directed link.
In an embodiment, the screening submodule is further configured to screen the directed cause and effect probability value that meets a second threshold, and determine a directed link corresponding to the directed cause and effect probability value that meets the second threshold as the second directed link; the determination submodule is further configured to determine a directed causal link map from the plurality of specified variables and the second directed link, the directed causal link map being used to characterize the directed causal link between the specified variables.
In an embodiment, the apparatus further comprises: the labeling module is used for labeling the causal link between at least two specified variables and determining a known directed causal link; and the learning module is used for performing semi-supervised learning on the causal probability prediction model based on the known directed causal link so as to obtain the causal probability prediction model.
Another aspect of the invention provides a computer readable storage medium comprising a set of computer executable instructions for performing the causal link analysis method of any of the above, when said instructions are executed.
In the embodiment of the invention, the causal link analysis method provided by the embodiment of the invention is used for analyzing a plurality of appointed variables in the production process to determine causal relations among the plurality of variables, and the obtained causal relations are beneficial to determining the influence of any appointed variable in the production process on the whole production process, so that the stability of the production process can be ensured by accurately controlling each appointed variable, the quality of a product generated in the production process can be ensured, the yield of the product can be effectively improved, and the purposes of energy conservation and emission reduction can be achieved.
Drawings
The above, as well as additional purposes, features, and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description when read in conjunction with the accompanying drawings. Several embodiments of the present invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
in the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
FIG. 1 is a schematic diagram of an implementation flow of a causal link analysis method according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of an implementation of a causal link analysis method for analyzing a specified variable according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an implementation flow of relevance determination in a causal link analysis method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a causal link analysis method for predicting a causal probability value according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an implementation flow of a causal link analysis method for determining a causal link in accordance with an embodiment of the present invention;
FIG. 6 is a schematic view of a scenario in which a causal link analysis method according to an embodiment of the present invention creates a directed full graph;
FIG. 7 is a schematic diagram of a scenario for causal link analysis method model update according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a scenario in which a causal link map is obtained by a causal link analysis method according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of an implementation flow of a causal link analysis device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more comprehensible, the technical solutions in the embodiments of the present invention will be clearly described in the following description with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present invention and not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
FIG. 1 is a schematic diagram of an implementation flow of a causal link analysis method according to an embodiment of the present invention.
Referring to fig. 1, in one aspect, an embodiment of the present invention provides a causal link analysis method, including: operation 101, performing record analysis on a specified object production process to obtain a plurality of specified variables corresponding to the production process; operation 102, determining an undirected correlation link between a plurality of specified variables based on a correlation determination rule; an operation 103, predicting the undirected related link according to the causal probability prediction model to obtain a directed causal probability value corresponding to the undirected related link; operation 104 determines a directed causal link between a plurality of specified variables based on the directed causal probability values, the directed causal link to characterize a causal relationship of a specified object production process.
The causal link analysis method provided by the embodiment of the invention is used for analyzing a plurality of appointed variables in the production process to determine causal relations among the plurality of variables, and the obtained causal relations are favorable for determining the influence of any appointed variable in the production process on the whole production process, so that the stability of the production process can be ensured by accurately controlling all the appointed variables, the quality of a product generated in the production process can be ensured, the yield of the product can be effectively improved, and the purposes of energy conservation and emission reduction can be achieved.
The production process of the method can be a production process in various fields, such as the petrochemical field, the new materials field, the bioengineering field and the like. The method is particularly suitable for production processes that have a long production cycle, many specified variables, and strong coupling among the specified variables, that is, processes in which a change in one specified variable often causes changes in several other specified variables and thereby affects the final product. For example, when the method is applied to a catalytic cracking reaction, the causal link relation between a plurality of specified variables, namely a plurality of process parameters and the product yield, is studied, so that the process parameters can be accurately controlled and the yield can be effectively improved; meanwhile, intelligent detection and early warning of device faults can be realized, improving enterprise benefits. In the method, the specified variables include at least one of: variables for characterizing process parameters, and variables for characterizing the product formed. The process parameters include at least one of the following: variables for characterizing raw material parameters, variables for characterizing equipment parameters, and variables for characterizing environmental parameters.
Specifically, in the method, the production process of a specified object is first recorded and analyzed to obtain a plurality of specified variables corresponding to the production process. Here, the specified object refers to a production process within a specific range. The specific range may be divided based on the product produced, on the production cycle, or on a specific device. For example, in one case, the specified object production process may be a production process in which a specific product is obtained from a specific raw material, where the specific raw material and the specific product are determined according to actual conditions. In another case, the specified object production process may be the production process of a specific product during one particular time period. In yet another case, the specified object production process may be a production process corresponding to a specific device and/or system. The following description takes as an example a production process in which a specific product is obtained from a specific raw material. It is understood that the parameters corresponding to the specific raw material and the specific product are included in the plurality of specified variables.
After the specified object production process is determined to be a production process in which a specific product is obtained from a specific raw material, the specified variables are used to characterize parameter information that may itself change during the production process, as well as parameter information that may cause other variables to change. The specified variables may be determined by analyzing the production record corresponding to the production process. It will be appreciated that some parameters unrelated to the production process, such as maintenance parameters, may also change; because these parameters do not affect the production process, they do not belong to the specified variables.
After determining the plurality of specified variables, the method determines the undirected correlation links between the plurality of specified variables based on the correlation determination rule. The correlation determination rule is used for determining whether specified variables are correlated. Here, two specified variables are defined as correlated if, when one of them changes, the other also changes. For example, if the first variable and the third variable both change when the second variable changes, it is determined that the second variable is correlated with the first variable and that the second variable is correlated with the third variable. Whether a correlation exists is determined for all the specified variables according to the correlation determination rule, and the undirected correlation links are determined from the specified variables that are correlated. There are a plurality of undirected correlation links, and each one connects two correlated specified variables. That is, an undirected correlation link characterizes the correlation between the specified variables at its two ends.
After determining the undirected correlation links, the method predicts each undirected correlation link according to a causal probability prediction model to obtain a directed causal probability value corresponding to that link. The causal probability prediction model takes an undirected correlation link and the specified variables at its two ends as inputs and outputs a directed causal probability value as the prediction result. The directed causal probability value is used to predict whether the specified variables at the two ends of the undirected correlation link have a causal relationship. In this manner, every undirected correlation link and its corresponding specified variables can be predicted to determine the directed causal probability value of that link.
A directed causal link between a plurality of designated variables is determined based on the directed causal probability values, the directed causal link being used to characterize a causal relationship of a designated object production process.
The direction of a directed causal probability value can be represented by its sign. From the directed causal probability value, it is possible to determine whether a causal relationship exists between the specified variables at the two ends of an undirected correlation link and, if so, its direction. For example, when the directed causal probability value falls within a preset range, it may be determined that a causal relationship exists between the specified variables at the two ends of the undirected correlation link, and the sign then characterizes the direction of that causal relationship: when the preset range is satisfied and the value is positive, it is determined that the first variable causes a change in the second variable; when the preset range is satisfied and the value is negative, it is determined that the second variable causes a change in the first variable. Each directed causal probability value is evaluated in this manner to obtain the directed causal links between the plurality of specified variables. The causal relationships between the specified variables can be determined from the directed causal links, so that each specified variable can be accurately controlled based on the directed causal links and the stability of the production process is ensured.
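The sign convention described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `orient_links`, the variable names, and the reading of the "preset range" as an absolute value at or above a threshold of 0.5 are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def orient_links(
    signed_probs: Dict[Tuple[str, str], float],
    threshold: float = 0.5,  # assumed form of the "preset range": |value| >= threshold
) -> List[Tuple[str, str]]:
    """Turn signed causal probability values into (cause, effect) pairs."""
    directed: List[Tuple[str, str]] = []
    for (first, second), value in signed_probs.items():
        if abs(value) < threshold:
            continue  # outside the preset range: no causal link is kept
        if value > 0:
            directed.append((first, second))  # first variable drives the second
        else:
            directed.append((second, first))  # second variable drives the first
    return directed


# Hypothetical values for three correlated pairs of variables.
links = orient_links({
    ("temp_A", "temp_B"): 0.9,
    ("pressure", "temp_A"): -0.7,
    ("flow", "yield"): 0.2,
})
print(links)
```

With these hypothetical inputs, the pair with value 0.2 is dropped, and the negative value reverses the order of the pair it belongs to.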
FIG. 2 is a schematic flow chart of an implementation of a causal link analysis method for analyzing specified variables according to an embodiment of the present invention.
Referring to fig. 2, in one embodiment, operation 101 performs a record analysis on a specified object production process to obtain a plurality of specified variables corresponding to the production process, including: operation 1011, recording the operation process of the appointed object through the distributed control system to obtain an operation record; operation 1012, screening and supplementing the operation records to obtain the associated information corresponding to the production process; wherein the production process is contained within the operation process; in operation 1013, the related information is normalized to obtain a specified variable.
To record and analyze the production process of the specified object, a distributed control system (DCS) is deployed in the production process to collect and record all production information, yielding an operation record. It will be appreciated that the operation record includes both records related to the production process and records unrelated to it.
Therefore, screening and supplementing operation records are needed, and in particular, the method comprises a plurality of screening and supplementing methods. It will be appreciated that the present method applies at least one of the following screening complementary methods:
In a first screening and supplementing method, the method deletes records unrelated to the production process according to the daily operation report of the production process. For example, all data from periods of device damage and device repair are deleted according to the device's daily operation report, and the associated information corresponding to the production process is obtained from the remaining data.
In a second screening and supplementing method, the method counts the number of missing values of each variable and deletes every variable whose missing-value ratio exceeds a preset threshold; for example, with the preset threshold set to 10%, every variable whose missing-value ratio exceeds 10% is deleted. The missing values in the remaining variables are then filled using an upward filling method to obtain the associated information corresponding to the production process.
In a third screening and supplementing method, the method obtains the associated information corresponding to the production process by performing outlier processing on the variables, for example, by handling outliers with the 3σ method.
It will be appreciated that the above screening and supplementing methods may be combined according to actual needs. For example, in one implementation scenario, the method first collects and aggregates DCS data via the distributed control system; then deletes all data from periods of device damage and device maintenance according to the device's daily operation report to obtain screened variables; next counts the number of missing values of each screened variable, deletes every variable whose missing-value ratio exceeds 10%, and fills the remaining missing values using the upward filling method to obtain supplemented variables; and finally applies 3σ outlier processing to the supplemented variables and determines the processed variables as the associated information corresponding to the production process.
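The combined screening scenario can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the table layout (variable name mapped to a list of readings, with `None` for a missing value), the function names, and the row indices of the maintenance period are hypothetical; "upward filling" is read here as filling from the nearest earlier record; and 3σ outlier processing is assumed to mean clipping values to the range mean ± 3σ, which the text does not specify.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional, Set

Table = Dict[str, List[Optional[float]]]


def drop_rows(table: Table, excluded: Set[int]) -> Table:
    """Remove records logged during damage or maintenance periods."""
    return {name: [v for i, v in enumerate(col) if i not in excluded]
            for name, col in table.items()}


def drop_sparse_columns(table: Table, max_missing: float = 0.10) -> Table:
    """Delete variables whose missing-value ratio exceeds the threshold."""
    return {name: col for name, col in table.items()
            if col and sum(v is None for v in col) / len(col) <= max_missing}


def fill_from_previous(col: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each gap with the nearest earlier record (assumed 'upward filling')."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for v in col:
        if v is None:
            v = last  # a leading gap stays None: nothing earlier to copy
        filled.append(v)
        last = v
    return filled


def clip_3sigma(col: List[Optional[float]]) -> List[Optional[float]]:
    """Clip values to mean +/- 3 sigma (assumed form of 3-sigma processing)."""
    present = [v for v in col if v is not None]
    mu, sigma = mean(present), pstdev(present)
    low, high = mu - 3 * sigma, mu + 3 * sigma
    return [v if v is None else min(max(v, low), high) for v in col]


def screen_records(table: Table, excluded_rows: Set[int],
                   max_missing: float = 0.10) -> Table:
    """Apply the three screening steps in the order given in the scenario."""
    table = drop_rows(table, excluded_rows)
    table = drop_sparse_columns(table, max_missing)
    return {name: clip_3sigma(fill_from_previous(col))
            for name, col in table.items()}


# Hypothetical DCS export: rows 20 and 21 fall in a maintenance period.
raw: Table = {
    "temp": [10.0] * 9 + [None] + [10.0] * 9 + [1000.0] + [5.0, 5.0],
    "flow": [None] * 6 + [1.0] * 16,
}
cleaned = screen_records(raw, excluded_rows={20, 21})
print(sorted(cleaned), cleaned["temp"][9], cleaned["temp"][19] < 1000.0)
```

In this example `flow` is deleted because 30% of its remaining readings are missing, the single gap in `temp` is filled from the previous reading, and the 1000.0 spike is pulled back to the upper 3σ bound.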
The associated information is then standardized to obtain the specified variables. It is understood that the associated information includes the yield of the product in the production process. Specifically, in the standardization processing each variable is converted according to a preset conversion unit, so that all variables are characterized in the same conversion unit; the variables so characterized are determined as the specified variables.
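As a small illustration of the unit conversion described above, the sketch below converts pressure readings recorded in different units into one common unit. The conversion table `TO_KPA`, the function names and the choice of kPa as the common unit are hypothetical; the patent only states that a preset conversion unit is used.

```python
# Hypothetical conversion factors from each recorded unit to the common unit (kPa).
TO_KPA = {"kPa": 1.0, "MPa": 1000.0, "bar": 100.0}

def to_common_unit(values, unit, factors=TO_KPA):
    """Convert a list of readings from `unit` into the common unit."""
    factor = factors[unit]
    return [v * factor for v in values]

def standardize(variables, factors=TO_KPA):
    """variables maps name -> (unit, readings); returns name -> readings in the common unit."""
    return {name: to_common_unit(readings, unit, factors)
            for name, (unit, readings) in variables.items()}
```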
Fig. 3 is a schematic diagram of an implementation flow of relevance determination in a causal link analysis method according to an embodiment of the present invention.
Referring to fig. 3, in one embodiment, operation 102, determining an undirected correlation link between a plurality of specified variables based on a correlation determination rule, includes: operation 1021, establishing a directed complete graph corresponding to the plurality of specified variables, the directed complete graph including a plurality of bidirectional links for connecting the plurality of specified variables, any bidirectional link including two directed links of opposite directions; operation 1022, determining a correlation value corresponding to each of the directed links based on the plurality of specified variables; operation 1023, screening the correlation values satisfying the first threshold, and determining the directed link corresponding to a correlation value satisfying the first threshold as a first directed link; operation 1024, determining a directed correlation graph from the plurality of specified variables and the first directed link, the directed correlation graph being used to characterize the undirected correlation links between the specified variables.
When judging the degree of correlation, the method uses a graph to characterize the undirected correlation links among the specified variables; specifically, it uses a directed correlation graph. The method takes the specified variables as vertices and connects every pair of vertices through a bidirectional link to form a directed complete graph. On the directed complete graph, the degree of correlation between every two specified variables is calculated by methods such as the Pearson correlation coefficient or mutual information, and the resulting correlation value is assigned to the bidirectional link between those two specified variables. When a correlation value does not satisfy the first threshold, the corresponding bidirectional link is removed; when a correlation value satisfies the first threshold, the corresponding bidirectional link is retained. The retained bidirectional links and the vertices are determined as the directed correlation graph. That is, the directed correlation graph includes the vertices corresponding to the plurality of specified variables and the bidirectional links whose correlation values satisfy the first threshold, and each bidirectional link connects two correlated specified variables. The first threshold characterizes the condition under which a correlation is considered to exist, and is preset according to actual conditions. The directed correlation graph can thus characterize the undirected correlation links between the specified variables.
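The construction of the directed correlation graph can be sketched as below, using the Pearson correlation coefficient and the retention rule abs(corr) >= th1 given later in the implementation scenario. The function names and the dictionary-of-lists input format are assumptions for illustration; mutual information could be substituted for `pearson` without changing the rest.

```python
import math
from itertools import permutations

def pearson(x, y):
    """Pearson correlation coefficient of two equally long series.
    Assumes neither series is constant (otherwise the denominator is zero)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def directed_correlation_graph(series, th1):
    """series maps variable name -> time series.
    Returns (vertices, first_directed_links)."""
    vertices = list(series)
    # Directed complete graph: one directed link per ordered pair, so every
    # bidirectional link is represented by two opposite directed links.
    complete = permutations(vertices, 2)
    links = [(u, v) for u, v in complete
             if abs(pearson(series[u], series[v])) >= th1]
    return vertices, links
```

Because the correlation value is symmetric, both directions of a bidirectional link are kept or removed together, which is why the resulting directed graph still only expresses undirected correlation.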
FIG. 4 is a schematic diagram of a causal link analysis method for predicting a causal probability value according to an embodiment of the present invention.
Referring to FIG. 4, in one embodiment, the causal probability prediction model is a graph neural network model; accordingly, operation 103, predicting the undirected correlation link according to the causal probability prediction model to obtain a directed causal probability value corresponding to the undirected correlation link, includes: operation 1031, determining a variable set and a link set according to the directed correlation graph, wherein the variable set comprises the plurality of specified variables and the link set comprises the first directed link; operation 1032, predicting the variable set and the link set through the graph neural network model to obtain a causal probability set corresponding to the link set, the causal probability set comprising the directed causal probability value corresponding to the first directed link.
The causal probability prediction model of the method is a graph neural network (GNN) model. When the method involves a plurality of specified variables, predicting the production process through the graph neural network model reflects the framework and the prediction result more intuitively and clearly. When predicting with the causal probability prediction model, the method takes the variable set and the link set as inputs and the directed causal probability value corresponding to each first directed link as the output.
Specifically, the method establishes a graph model G = (V, E) corresponding to the set of specified variables and the set of undirected correlation links, wherein V is the set of all vertices corresponding to the specified variables in the directed correlation graph, namely the variable set; and E is the set of all first directed links, namely the link set, which connect the vertices in the directed correlation graph. The first directed links include a first forward link and a first reverse link. The graph neural network model predicts the variable set and the link set to obtain the directed causal probability values, and each directed causal probability value corresponds to one directed link.
FIG. 5 is a schematic diagram of an implementation flow of a causal link analysis method for determining a causal link in accordance with an embodiment of the present invention.
Referring to FIG. 5, in one embodiment, operation 104 determining a directed causal link between a plurality of specified variables based on directed causal probability values comprises: operation 1041, screening the directed causal probability values satisfying the second threshold, determining a directed link corresponding to the directed causal probability values satisfying the second threshold as a second directed link; operation 1042 determines a directed causal link map based on the plurality of designated variables and the second directed link, the directed causal link map being used to characterize the directed causal links between the designated variables.
The second threshold characterizes the threshold condition for a causal relationship. That is, when a directed causal probability value satisfies the second threshold, it can be determined that the corresponding directed link characterizes a causal relationship between specified variables, and that directed link is therefore determined as a second directed link. The second directed link characterizes both the causal relationship between the specified variables and its direction. The plurality of specified variables are taken as vertices and connected by the second directed links to form a directed causal link graph, from which the causal relationships among the plurality of specified variables can be determined. The directed causal links among all specified variables in the production process of the specified object are thereby determined.
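The screening by the second threshold amounts to a simple filter over the predicted probabilities, sketched below. The names `causal_link_graph`, `causal_prob` and `direct_causes` are hypothetical, and the strict "greater than" comparison follows the wording of the implementation scenario later in the description.

```python
def causal_link_graph(vertices, causal_prob, th2):
    """causal_prob maps a directed link (cause, effect) -> directed causal probability.
    Returns the vertices together with the second directed links."""
    second_links = [link for link, p in causal_prob.items() if p > th2]
    return vertices, second_links

def direct_causes(second_links, variable):
    """List the variables that have a retained directed link pointing at `variable`."""
    return [cause for cause, effect in second_links if effect == variable]
```

Unlike the correlation screening, the two directions of a link are scored separately here, so one direction can be retained while the opposite one is discarded.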
In one embodiment, before determining the undirected correlation link between the plurality of specified variables based on the correlation determination rule at operation 102, the method further comprises: firstly, carrying out causal link labeling between at least two specified variables, and determining a known directed causal link; then, semi-supervised learning of the causal probability prediction model is performed based on the known directed causal links to obtain the causal probability prediction model.
Specifically, a schematic diagram of the update of the graph neural network model designed by the method is shown in fig. 7: an update function is set to perform convolution and pooling operations, ρ^{v→e} is set to stack the vertex data at the two ends of a directed link, and an update function is set to a single-layer LSTM network whose output is a causal relationship probability value. Furthermore, the loss function of the model is set to cross-entropy loss, the number of iterations is set, and semi-supervised learning is performed. For the training samples of semi-supervised learning, an expert can be invited to label a small number of direct causal relationships and a small number of irrelevant relationships by using domain knowledge corresponding to the production process of the specified object, and these labeled relationships are then standardized. Designing the update scheme of the graph neural network model includes, but is not limited to, setting the functions or network structures for vertex updates and edge updates.
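One common way to realize semi-supervised training with a cross-entropy loss is to evaluate the loss only on the few expert-labelled links while the model still predicts a probability for every link. The sketch below shows that masked loss; treating the labels as 1 for a direct causal relationship and 0 for an irrelevant relationship, and the function name `masked_cross_entropy`, are assumptions of this example and not details stated in the patent.

```python
import math

def masked_cross_entropy(pred, labels, eps=1e-12):
    """Binary cross-entropy averaged over the labelled links only.

    pred   maps every directed link -> predicted causal probability.
    labels maps the expert-labelled links -> 1 (direct causal) or 0 (irrelevant).
    Unlabelled links contribute nothing to the loss.
    """
    total = 0.0
    for link, y in labels.items():
        # Clamp to avoid log(0) for saturated predictions.
        p = min(max(pred[link], eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)
```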
To facilitate understanding of the foregoing embodiments, a specific implementation scenario is provided below for illustration. In this scenario, the causal link analysis method provided by the embodiment of the present invention is applied to a causal link analysis device, which is used for analyzing production processes in the petrochemical field, in particular production processes that have complex process parameters strongly coupled with one another, such as a petroleum refining process.
In the analysis process, the device first collects and aggregates distributed control system data (DCS data) through the distributed control system. Then, according to the operation daily report of the device used in the production process, all data recorded while the device was damaged or under maintenance is deleted to obtain production data, and the production data is analyzed to determine production variables. The operation daily report is contained in the distributed control system data.
Next, the number of missing values in each production variable is counted, the variables whose missing-value ratio exceeds 10% are deleted, and the missing values in the remaining production variables are filled by using the upward filling method to obtain filled variables. The outliers of the filled variables are processed by using the 3σ method to obtain processed variables. The processed variables are standardized to determine the specified variables corresponding to the production process, the number of specified variables being more than one.
The device then takes the specified variables as vertices, connects them through bidirectional links, and builds a directed complete graph as shown in fig. 6, wherein the dots in fig. 6 represent the vertices, and a line segment with two opposite arrowheads connecting two dots represents a bidirectional link. The correlation corresponding to each bidirectional link is calculated by methods such as the Pearson correlation coefficient or mutual information, so as to characterize the degree of correlation between the specified variables. The correlation is evaluated against a threshold th1: when the correlation between two vertices does not satisfy a preset formula, the bidirectional link between the two vertices is deleted. The formula is abs(corr) >= th1, where corr represents the correlation between the two vertices, abs represents the absolute value, and th1 represents the set threshold. The graph obtained after deleting the bidirectional links whose correlation does not satisfy the preset formula is determined as the directed correlation graph.
Then, as shown in fig. 7, a graph model G = (V, E) is established, where V is the set of all vertices in the directed correlation graph, the initial value of each vertex being all time-series values of the variable corresponding to that vertex, and E is the set of all edges in the directed correlation graph, whose initial value is set to an all-ones matrix. An update scheme of the GNN model is designed by setting the functions or network structures for vertex updates and edge updates. An update function is set to perform convolution and pooling operations on the vertex set V and the edge set E. ρ^{v→e} is set to stack the vertex data at the two ends of an edge, that is, after the convolution and pooling operations the vertex data at the two ends of each edge in the edge set is stacked to obtain stacked data. An update function can be set to a single-layer LSTM network, which outputs the corresponding causal relationship probability value E' based on the input stacked data and the edge set. An expert can be invited to label a small number of direct causal relationships and a small number of irrelevant relationships by using domain knowledge; the labeled causal and irrelevant relationships are standardized and then used as training samples, the loss function is set to cross-entropy loss, the number of iterations is set, and semi-supervised learning is performed on the GNN model. The vertex set and the edge set corresponding to the directed correlation graph are then predicted through the GNN model to obtain the directed causal probability value corresponding to each edge.
Finally, a threshold th2 is set and, as shown in fig. 8, the edges in the directed correlation graph whose directed causal probability value is greater than the threshold th2 are retained, so that a causal link graph is obtained. The dots in fig. 8 and the dots in fig. 6 refer to the same specified variables, and a directed line segment between two dots in fig. 8 represents the causal relationship between the two specified variables: when the specified variable at the tail of the arrow changes, the specified variable at the head of the arrow changes accordingly.
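The data flow of one forward pass over G = (V, E) can be illustrated with the toy sketch below. It is deliberately simplified and is not the patented network: the convolution kernel and pooling size are fixed hypothetical values, the convolution and pooling are applied to the vertex time series only, and the single-layer LSTM is replaced by a plain logistic scoring function so that the example stays self-contained. Only the order of the steps (convolution and pooling, stacking the two endpoints of each edge, producing one probability per directed edge) follows the description.

```python
import math

def conv1d(series, kernel):
    """Valid 1-D convolution (no padding) of a time series with a fixed kernel."""
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]

def max_pool(series, size):
    """Non-overlapping max pooling."""
    return [max(series[i:i + size])
            for i in range(0, len(series) - size + 1, size)]

def vertex_update(series, kernel=(0.5, 0.5), pool=2):
    """Convolution followed by pooling on one vertex's time series."""
    return max_pool(conv1d(series, kernel), pool)

def stack_endpoints(features, edge):
    """Role of rho^{v->e}: stack source features, then target features.
    The order makes (u, v) and (v, u) produce different inputs."""
    src, dst = edge
    return features[src] + features[dst]

def edge_update(stacked, edge_value, weights, bias=0.0):
    """Stand-in for the single-layer LSTM: a logistic score in (0, 1)."""
    z = bias + edge_value * sum(w * x for w, x in zip(weights, stacked))
    return 1.0 / (1.0 + math.exp(-z))

def forward(V, E, weights):
    """V maps vertex -> time series; E maps directed edge -> initial value (1.0).
    Returns E', a causal probability for every directed edge."""
    features = {name: vertex_update(series) for name, series in V.items()}
    return {edge: edge_update(stack_endpoints(features, edge), value, weights)
            for edge, value in E.items()}
```

In a trainable version, `weights` (and the kernel) would be learned from the expert-labelled edges, and the logistic scorer would be replaced by the recurrent network named in the description.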
FIG. 9 is a schematic diagram of an implementation flow of a causal link analysis device according to an embodiment of the present invention.
Referring to fig. 9, another aspect of the present invention provides a causal link analysis apparatus, the apparatus comprising: the analysis module 601 is configured to perform record analysis on a production process of a specified object to obtain a plurality of specified variables; a determining module 602, configured to determine, based on the relevance determining rule, an undirected relevance link between the plurality of specified variables; the prediction module 603 is configured to predict the undirected relevant link according to a causal probability prediction model, and obtain a directed causal probability value corresponding to the undirected relevant link; the determining module 602 is further configured to determine a directed causal link between a plurality of specified variables based on the directed causal probability value.
In one embodiment, the analysis module 601 includes: a recording submodule 6011, configured to record an operation process of a specified object through a distributed control system, and obtain an operation record; screening and supplementing submodule 6012, configured to screen and supplement the operation record to obtain association information corresponding to the production process; wherein the production process is contained within the operation process; and the processing submodule 6013 is used for carrying out standardization processing on the associated information to obtain the specified variable.
In one embodiment, the determining module 602 includes: a building sub-module 6021, configured to build a directed complete graph corresponding to the plurality of specified variables, where the directed complete graph includes a plurality of bidirectional links for connecting the plurality of specified variables, and any bidirectional link includes two directed links with opposite directions; a determining submodule 6021 for determining a correlation value corresponding to each of the directed links according to a plurality of specified variables; a screening submodule 6022, configured to screen the correlation value that satisfies the first threshold, and determine the directional link corresponding to the correlation value that satisfies the first threshold as the first directional link; the determining submodule 6021 is further configured to determine a directed correlation graph from the plurality of specified variables and the first directed link, the directed correlation graph being used to characterize an undirected correlation link between the specified variables.
In one embodiment, the causal probability prediction model is a graph neural network model; accordingly, the prediction module 603 is configured to: determine a variable set and a link set according to the directed correlation graph, wherein the variable set comprises the plurality of specified variables and the link set comprises the first directed link; and predict the variable set and the link set through the graph neural network model to obtain a causal probability set corresponding to the link set, wherein the causal probability set comprises the directed causal probability value corresponding to the first directed link.
In an implementation manner, the screening submodule 6022 is further configured to screen the directed cause and effect probability value that meets the second threshold, and determine the directed link corresponding to the directed cause and effect probability value that meets the second threshold as the second directed link; the determination submodule 6021 is further configured to determine a directed causal link map from the plurality of specified variables and the second directed link, the directed causal link map being used to characterize the directed causal link between the specified variables.
In an embodiment, the apparatus further comprises: the labeling module 604 is configured to label a causal link between at least two specified variables, and determine a known directed causal link; the learning module 605 is configured to perform semi-supervised learning on the causal probability prediction model based on the known directed causal links to obtain the causal probability prediction model.
Another aspect of the invention provides a computer readable storage medium comprising a set of computer executable instructions for performing the causal link analysis method of any of the above, when the instructions are executed.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "examples," "particular examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the various embodiments or examples described in this specification and the features of the various embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
The foregoing is merely illustrative of embodiments of the present invention, but the scope of the present invention is not limited thereto; any variation or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall be covered by the scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.
Claims (8)
1. A causal link analysis method, the method comprising:
recording and analyzing a production process of a specified object to obtain a plurality of specified variables corresponding to the production process;
determining an undirected correlation link between the plurality of specified variables based on a correlation determination rule;
predicting the undirected correlation link according to a causal probability prediction model to obtain a directed causal probability value corresponding to the undirected correlation link;
determining a directed causal link between the plurality of specified variables according to the directed causal probability value, the directed causal link being used to characterize a causal relationship of the production process of the specified object;
wherein the determining, based on the correlation determination rule, an undirected correlation link between the plurality of specified variables comprises:
establishing a directed complete graph corresponding to the plurality of specified variables, wherein the directed complete graph comprises a plurality of bidirectional links used for connecting the plurality of specified variables, and any bidirectional link comprises two directed links with opposite directions;
determining a correlation value corresponding to each directed link according to the plurality of specified variables;
screening correlation values that satisfy a first threshold, and determining the directed link corresponding to a correlation value that satisfies the first threshold as a first directed link;
and determining a directed correlation graph according to the plurality of specified variables and the first directed link, wherein the directed correlation graph is used for representing the undirected correlation link among the specified variables.
2. The method of claim 1, wherein performing a record analysis on a specified object production process to obtain a plurality of specified variables corresponding to the production process comprises:
recording the operation process of the appointed object through a distributed control system to obtain an operation record;
screening and supplementing the operation record to obtain associated information corresponding to the production process, wherein the production process is included in the operation process;
and carrying out standardization processing on the associated information to obtain the specified variables.
3. The method of claim 1, wherein the causal probability prediction model is a graph neural network model;
correspondingly, the predicting the undirected correlation link according to the causal probability prediction model to obtain a directed causal probability value corresponding to the undirected correlation link includes:
determining a variable set and a link set according to the directed correlation graph, wherein the variable set comprises the plurality of specified variables, and the link set comprises the first directed link;
and predicting the variable set and the link set through the graph neural network model to obtain a causal probability set corresponding to the link set, wherein the causal probability set comprises a directed causal probability value corresponding to the first directed link.
4. A method according to claim 1 or 3, wherein determining a directed causal link between the plurality of specified variables from the directed causal probability value comprises:
screening directed causal probability values that satisfy a second threshold, and determining the directed link corresponding to a directed causal probability value that satisfies the second threshold as a second directed link;
determining a directed causal link map from the plurality of designated variables and the second directed link, the directed causal link map being used to characterize the directed causal link between the designated variables.
5. The method of claim 1, wherein prior to determining the undirected correlation link between the plurality of specified variables based on the correlation determination rule, the method further comprises:
carrying out causal link labeling between at least two specified variables, and determining a known directed causal link;
and performing semi-supervised learning on the causal probability prediction model based on the known directed causal link to obtain the causal probability prediction model.
6. A causal link analysis apparatus, said apparatus comprising:
an analysis module, configured to perform record analysis on a production process of a specified object to obtain a plurality of specified variables;
a determining module, configured to determine, based on a correlation determination rule, an undirected correlation link between the plurality of specified variables;
a prediction module, configured to predict the undirected correlation link according to a causal probability prediction model to obtain a directed causal probability value corresponding to the undirected correlation link;
the determining module being further configured to determine a directed causal link between the plurality of specified variables according to the directed causal probability value;
wherein the determining module includes:
an establishing submodule, configured to establish a directed complete graph corresponding to the plurality of specified variables, wherein the directed complete graph comprises a plurality of bidirectional links used for connecting the plurality of specified variables, and any bidirectional link comprises two directed links with opposite directions;
a determining submodule, configured to determine a correlation value corresponding to each directed link according to the plurality of specified variables;
a screening submodule, configured to screen correlation values that satisfy a first threshold, and determine the directed link corresponding to a correlation value that satisfies the first threshold as a first directed link;
the determining submodule being further configured to determine a directed correlation graph according to the plurality of specified variables and the first directed link, the directed correlation graph being used for representing the undirected correlation link among the specified variables.
7. The apparatus of claim 6, wherein the analysis module comprises:
a recording sub-module, configured to record the operation process of the specified object through a distributed control system to obtain an operation record;
a screening and supplementing sub-module, configured to screen and supplement the operation record to obtain the associated information corresponding to the production process, wherein the production process is included in the operation process;
and the processing sub-module is used for carrying out standardized processing on the associated information to obtain a specified variable.
8. A computer readable storage medium comprising a set of computer executable instructions for performing the causal link analysis method of any of claims 1-5 when executed.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010618648.0A CN111985677B (en) | 2020-06-30 | 2020-06-30 | Causal link analysis method, causal link analysis equipment and computer readable storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111985677A (en) | 2020-11-24 |
| CN111985677B (en) | 2024-06-21 |
Family
ID=73438474
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010618648.0A Active CN111985677B (en) | 2020-06-30 | 2020-06-30 | Causal link analysis method, causal link analysis equipment and computer readable storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111985677B (en) |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114691746A (en) * | 2020-12-30 | 2022-07-01 | 北京嘀嘀无限科技发展有限公司 | Method, device, equipment and medium for acquiring causal relationship between feature information and scene |
| JP7088427B1 (en) | 2022-01-20 | 2022-06-21 | 富士電機株式会社 | Driving support equipment, driving support methods and programs |
| CN120579620B (en) * | 2025-08-01 | 2025-10-03 | 深圳计算科学研究院 | Method, device, equipment and medium for detecting causal relationship of battery production indicators |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2008084039A (en) * | 2006-09-28 | 2008-04-10 | Hitachi Ltd | Manufacturing process analysis method |
| CN108983710A (en) * | 2017-06-02 | 2018-12-11 | 欧姆龙株式会社 | Procedure analysis device, procedure analysis method and storage medium |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP4239932B2 (en) * | 2004-08-27 | 2009-03-18 | 株式会社日立製作所 | production management system |
| US7283928B2 (en) * | 2004-09-30 | 2007-10-16 | John Antanies | Computerized method and software for data analysis |
| CN107563596A (en) * | 2017-08-03 | 2018-01-09 | 清华大学 | A kind of evaluation index equilibrium state analysis method based on Bayes's causal network |
| CN110555047B (en) * | 2018-03-29 | 2024-03-15 | 日本电气株式会社 | Data processing method and electronic equipment |
| CN109754158B (en) * | 2018-12-07 | 2022-08-23 | 国网江苏省电力有限公司南京供电分公司 | Method for generating big data causal model corresponding to power grid operation environment |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111985677A (en) | 2020-11-24 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |