CN118171715B - Data set sampling method, electronic device and computer readable storage medium - Google Patents
- Publication number
- CN118171715B (application CN202410509839.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- sampling
- features
- data set
- model
- Prior art date
- Legal status (assumed; not a legal conclusion): Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/092—Reinforcement learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0499—Feedforward networks
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C16/00—Erasable programmable read-only memories
- G11C16/02—Erasable programmable read-only memories electrically programmable
- G11C16/04—Erasable programmable read-only memories electrically programmable using variable threshold transistors, e.g. FAMOS
- G11C16/0483—Erasable programmable read-only memories electrically programmable using variable threshold transistors, e.g. FAMOS comprising cells having several storage transistors connected in series
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Microelectronics & Electronic Packaging (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a data set sampling method that samples a data set according to an optimized sampling trajectory strategy formulated by reinforcement learning combined with a causal machine learning model.
Description
Technical Field
The present invention relates to the field of computer technology, in particular to artificial intelligence technologies such as deep learning and machine learning, and more particularly to a method for sampling a data set for machine learning, an electronic device, and a computer readable storage medium.
Background
Machine learning is a shared research hotspot in the fields of artificial intelligence and pattern recognition, and its theory and methods have been widely applied to solve complex problems in engineering applications and scientific fields. Machine learning is the science of studying how to use a computer to simulate or realize human learning activities, and it is one of the most characteristically intelligent and most cutting-edge research areas of artificial intelligence. Machine learning has developed through several decades of twists and turns. Deep learning, which borrows from the brain's multi-layered structure, the connections and information exchange between neurons, and its layer-by-layer analysis and processing mechanism, has powerful adaptive and self-learning parallel information processing capability, and has therefore achieved breakthrough progress in many respects. With the ever-growing data analysis demands of various industries in the big-data era, acquiring knowledge efficiently through machine learning has gradually become a major driving force for the development of machine learning technology today.
When machine learning is classified by learning method, it can be broadly divided into the following three categories:
(1) Supervised learning (learning with a teacher): the input data carries teacher signals; a probability function, algebraic function, or artificial neural network is used as the basic function model, an iterative calculation method is adopted, and the learning result is a function.
(2) Unsupervised learning (learning without a teacher): the input data carries no teacher signal; a clustering method is adopted, and the learning result is a set of categories. Typical unsupervised learning includes discovery learning, clustering, and competitive learning.
(3) Reinforcement learning: a learning method that takes environmental feedback (reward/punishment signals) as input and is guided by statistics and dynamic programming techniques.
In the field of semiconductor chips, it is often necessary to test various properties and parameters of a chip. For example, various parameters of a semiconductor chip can be measured by a tester (test platform). As an example, one may determine what changes would occur in the parameters of a semiconductor chip if the user mode data were modified.
In this case, when the user mode parameters of the semiconductor chip are changed, the specifications of the semiconductor chip may change in various ways; multi-dimensional data input must be considered, and a balance among the various parameters must be found within a limited number of tests to satisfy the required specifications. Adjusting by manual testing alone often cannot meet the actual requirements, so an AI (Artificial Intelligence) method is introduced, implemented on the basis of machine learning (deep learning).
In practice, the specification change of the semiconductor chip can be generated and predicted by an AI model that takes the mode parameters required by the user as input, so that whether the customer's requirements are satisfied can be determined quickly.
Disclosure of Invention
Technical problems the invention aims to solve
However, in order for the AI model to function effectively, it first needs to be trained. At this time, reinforcement learning is introduced as machine learning.
Reinforcement learning (RL) addresses the problem of how an agent can maximize the rewards it obtains in a complex, uncertain environment. The agent is guided toward better actions by perceiving how the state of its environment responds to its actions, so as to obtain the greatest return; this is called learning in interaction.
During reinforcement learning, the agent interacts with the environment continuously. The agent observes a state of the environment and uses that state to output an action, i.e., a decision. The decision is then applied to the environment, which, based on the decision taken by the agent, outputs the next state and the reward earned by that decision. The goal of the agent is to obtain as much reward from the environment as possible.
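The interaction loop just described can be sketched in a few lines of Python; this is a hedged, self-contained illustration, and the `Environment` and `Agent` classes here are hypothetical toy stand-ins, not the patent's tester or agent implementation:

```python
# Minimal sketch of the agent-environment loop: observe state, output
# an action, receive next state and reward, accumulate reward.
class Environment:
    def __init__(self):
        self.state = 0

    def step(self, action):
        # The environment applies the decision and returns the next
        # state plus the reward that decision earned.
        self.state += action
        reward = 1.0 if self.state == 0 else -float(abs(self.state))
        return self.state, reward

class Agent:
    def act(self, state):
        # A trivial policy: push the state back toward zero.
        return -1 if state > 0 else 1

env, agent = Environment(), Agent()
state, total_reward = env.state, 0.0
for _ in range(10):
    action = agent.act(state)          # agent outputs an action from the state
    state, reward = env.step(action)   # environment returns next state and reward
    total_reward += reward             # the agent's goal: maximize cumulative reward
```

The real loop replaces the toy policy with a learned one; the shape of the interaction (state in, action out, reward back) is what the sketch shows.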
When training a reinforcement learning model, a data set that reflects the true data distribution as closely as possible is needed for model training to be effective and efficient. However, conventional manual testing alone cannot sample effectively within a limited number of sampling steps, and model training suffers when such a data set is lacking.
In particular, when the data dimension is small, experienced engineers typically choose the important features, collect data for them, and draw conclusions from that data. This is not practical in high-dimensional situations, for the following reasons: 1) the domain knowledge of an experienced engineer may not extend to high dimensions or to different products; 2) because of the large number of sampled data points, a great deal of manual work would be required, and grid search over a high-dimensional space is infeasible. There is therefore a need for an efficient high-dimensional data set sampling strategy.
The present invention has been made to solve the above-described problems, and an object of the present invention is to provide a data set sampling method capable of collecting a high-quality data set and greatly reducing test time even in the case of input data having a high dimension, and capable of using the data set for training various machine learning models.
According to an embodiment of the present invention, there is provided a method of sampling a data set based on an optimized sampling trajectory strategy formulated by reinforcement learning combined with a causal machine learning model, the optimized sampling trajectory strategy including a data sampling closed loop and model optimization, comprising the following steps: the agent receives domain knowledge through an application program interface and samples data points based on the formulated strategy, either following domain knowledge suggestions or via the Monte Carlo method; data is collected from the tester into a sample trajectory buffer, and this data also serves as the sample data used by the causal machine learning model in iterative training; the chip is tested by the tester to obtain a reference truth value; a predictive label/value is generated by the causal machine learning model with the collected data as input; a loss between the reference truth value and the predictive label/value is calculated, and if the loss is above a predetermined threshold, the causal machine learning model is retrained/fine-tuned, thereby performing training and iteration; the loss is input to the agent as a state, and is input to the agent as a reward when the sampled data set matches the actual data distribution and the output of the causal machine learning model for a given input matches the reference truth obtained after the chip is tested by the tester.
Further, in the case of application to the field of semiconductor chips, the data set is a specification of the semiconductor chip.
Further, in the case where the semiconductor chip is a NAND flash memory, the specification of the semiconductor chip is array characteristic data and temperature of the NAND flash memory.
Further, the array characteristic data includes read window budget, residual bit error rate, read/write interference, retention of data, and endurance.
Further, in the case where the data set has n features, where n is a natural number greater than 1, the optimized sampling trajectory strategy includes the following steps: sampling data based on user domain knowledge and a statistical test method; performing a one-dimensional scan over the topmost first feature, ranking the features according to domain knowledge or by calculating their gradients; adding a second feature on the basis of the one-dimensional scan and performing a two-dimensional scan, ranking the features according to domain knowledge or by calculating their gradients; adding a third feature on the basis of the two-dimensional scan and performing a three-dimensional scan, ranking the features according to domain knowledge or by calculating their gradients; ……; adding an nth feature on the basis of the (n-1)-dimensional scan and performing an n-dimensional scan, ranking the features according to domain knowledge or by calculating their gradients; in the actual environment, calling a function on the test bench to collect the data points sampled by the optimized sampling trajectory strategy; and ranking the importance of each feature by calculating gradients, increasing the sampling density for features of higher importance.
Further, in the early stages of the data sampling closed loop, larger losses are preferred; the final convergence stage is determined by keeping the loss over a series of long sampling trajectories below a prescribed threshold.
Further, the causal machine learning model used in the data sampling closed loop is selected by the following steps: based on the collected data set, judging whether it is image data (figure data); if so, using a CNN (Convolutional Neural Network) model, and if not, proceeding to the next step; judging whether it is temporal data; if so, using an LSTM (Long Short-Term Memory) model, and if not, proceeding to the next step; judging whether it is multi-modal (multi-modality); if so, using a Transformer model, and if not, proceeding to the next step; judging whether the IO (Input/Output) changes frequently; if so, using an MLP (Multi-Layer Perceptron) model, and if not, proceeding to the next step; otherwise, using a GNN (Graph Neural Network) model.
The present invention provides an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any one of the methods described above.
The present invention provides a computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of the above.
According to the data set sampling method of the present invention, even in the case of input data having a high dimension, a high quality data set can be collected and test time can be greatly reduced, and the data set can be used for training various machine learning models.
Drawings
The disclosure may be better understood by describing exemplary embodiments thereof in conjunction with the accompanying drawings, in which:
FIG. 1 is a diagram illustrating a causal oriented machine learning model.
Fig. 2 is a diagram illustrating a customer-oriented machine learning model.
FIG. 3 is a schematic diagram illustrating an optimization strategy sampling trajectory.
FIG. 4 is a schematic diagram illustrating data sampling closed loop and model optimization.
FIG. 5 is a diagram illustrating an example of selection of a causal oriented machine learning model.
Fig. 6 is a schematic block diagram illustrating an exemplary electronic device 1000 in which embodiments of the present invention can be implemented.
Detailed Description
In the following, specific embodiments of the present disclosure will be described. It should be noted that, for the sake of brevity, it is not possible in this specification to describe all features of an actual embodiment in detail. It should be appreciated that in the actual implementation of any embodiment, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that while such a development effort might be complex and lengthy, it would nevertheless be a routine undertaking of design, fabrication, or manufacture for those of ordinary skill having the benefit of this disclosure, and thus the omission of such details should not be construed as an insufficiency of the present disclosure.
In the present disclosure, unless otherwise stated, all embodiments and preferred embodiments mentioned herein may be combined with each other to form new technical solutions, and all technical features and preferred features mentioned herein may likewise be combined with each other to form new technical solutions. In the description of the embodiments of the present disclosure, the term "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A exists alone, A and B both exist, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.
A method for sampling a data set according to an embodiment of the present invention will be described in detail with reference to the accompanying drawings.
As a specific example, the present embodiment uses a NAND flash memory; when the user mode parameters are changed, the NAND specification data changes accordingly.
As the target NAND specification data, NAND array characteristic data is taken as the example here.
Viewed globally, the array characteristic data of NAND (hereinafter sometimes abbreviated as AC data) comprises a huge amount of multidimensional data. Specifically, NAND AC data may include the read window budget (RWB), residual bit error rate (RBER), read/write disturb (W/R disturb), data retention and endurance, and the like. For example, in determining the read window budget, the key quantity is the voltage threshold distribution.
In addition, the AC data of each NAND also varies with temperature. RWB may be further affected when the user load spans different temperatures. In that case, different user mode parameters at different temperatures need to be scanned (shmooed) continuously to collect a large amount of RWB data.
In order to study what kind of change will happen to the NAND specification data when the user mode parameter changes, usually, a chip tester is used to test the chip, so as to obtain a corresponding result. However, when there is high-dimensional data, testing becomes very difficult, and variations in the data of each dimension cannot be accurately reflected.
In this embodiment, instead of the usual manual sampling method, a machine learning (ML) model is used to optimize the AC data.
Fig. 1 shows a causal machine learning model, which is a common sequence model: it predicts the output given an input.
As shown in fig. 1, in the causal machine learning model, data of user mode parameters, temperature, cycle, etc. are first given, and then whether AC data meets specifications is predicted by the model. In this case, the result can be obtained immediately by supervised learning without performing manual tests each time. In addition, in the case where the input data is multi-dimensional, more time can be saved.
Fig. 2 shows a customer-oriented machine learning model, which is an inverse machine learning model: given a target specification, it predicts the user parameters that can achieve that specification.
Specifically, as shown in fig. 2, the customer-oriented machine learning model predicts the corresponding user mode data from array characteristic data, temperature, and so on that meet the specification. That is, such a customer-oriented machine learning model can solve the inverse problem. Compared with manual testing, a result can be obtained immediately, and even more time is saved when the input data is multidimensional.
More and more accurate results can be obtained by continuously training and iterating the machine learning model shown in fig. 1 and 2.
However, for ML (machine learning), training data of sufficiently high quality is needed for model training to be effective and efficient. The most common ML setting is the end-to-end case, which is widely used in CV, NLP, and various scientific discoveries; at the production level, however, the test time needed to collect a data set for each product is long, and it is difficult to provide a universal sampling trajectory strategy that balances every product. Handling each product case by case with extensive manual work and hard coding is obviously very inflexible.
As a conventional AC data sampling (collecting) method, there is the following method.
(1) Based on experience.
However, this method is limited by an individual's knowledge of a limited number of dimensions, and the sample size is generally small and cannot cover the entire data distribution.
(2) Based on mathematical methods.
However, as the total number of features increases, the amount of data required by probabilistic sampling methods (e.g., importance sampling, rejection sampling, Gibbs sampling, Metropolis-Hastings sampling) becomes enormous, resulting in a significant decrease in sampling efficiency.
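The collapse in efficiency is easy to demonstrate for rejection sampling. The sketch below (a stand-alone illustration, not part of the patent's method) estimates the fraction of uniform draws from the box [-1, 1]^d that land inside the unit ball; this fraction is the acceptance rate of a rejection sampler using the box as its proposal:

```python
import random

def acceptance_rate(dim, trials=20000, seed=0):
    """Fraction of uniform samples from [-1, 1]^dim inside the unit
    ball -- a proxy for rejection-sampling efficiency as the target's
    mass shrinks relative to the proposal box."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        point = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        if sum(x * x for x in point) <= 1.0:
            hits += 1
    return hits / trials

# Acceptance collapses as the dimension grows, so the number of raw
# draws needed per accepted sample explodes.
rates = {d: acceptance_rate(d) for d in (2, 5, 10)}
```

At d = 2 roughly three in four draws are accepted; by d = 10 fewer than one in a hundred are, which is the data blow-up the text describes.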
(3) Based on a conventional AC model.
However, even an expert has difficulty resolving the trade-off between multiple objectives, such as how to balance RBER, data retention, and endurance. Such a model is also highly dependent on human knowledge, difficult to quantify, and difficult to generalize into a flexible model. Furthermore, it is difficult to generate comprehensive data, and a person is often required to give recommended fine-tuning settings.
(4) Based on a conventional AC workflow.
However, the output cannot be obtained quickly when arbitrary values are input for the features, the detailed distribution within the domain knowledge cannot be known, and the data distribution outside the domain knowledge is difficult to explore.
Therefore, the present invention presents an artificial intelligence guidance system that integrates reinforcement learning, a causal model, software related to the tester API (Application Programming Interface), and hardware to collect chip data on a tester; it solves the high-dimensional data sampling problem with a reinforcement learning method combined with a causal machine learning model.
First, a strategy of a data set sampling method according to the present invention will be described.
The goal of this strategy is to make the sampled data set match the true data distribution by optimizing the sampling strategy. In CV (computer vision) and NLP (natural language processing), for example, the data sets are typically large enough to cover the real distribution. In addition, it is necessary to determine whether the specification labels appearing in a classification task are satisfied, and whether the output value exceeds a predefined threshold in a regression task.
FIG. 3 is a schematic diagram illustrating an optimization strategy sampling trajectory.
As shown in fig. 3, when collecting the data set, the basic strategy is first based on user domain knowledge and statistical experimentation (the Monte Carlo method).
Here, domain knowledge refers to knowledge of a specific, specialized discipline or field; unlike general knowledge or domain-independent knowledge, the term usually denotes a more specialized discipline, and persons possessing domain knowledge are generally considered experts or scholars in that field. When the sampling strategy is based on domain knowledge, typically with default values or ranges, or with a list of feature importances (or a list of features of interest), sampling can first draw on rich guidance and experience.
One example is given below.
For example, in the case of having multiple features, a dimension-by-dimension scan may be performed for each feature, and the importance of each feature may be ranked by gradient calculation.
Here, assuming that there are n features (n is a natural number greater than 1), the sampling strategy comprises the steps of:
First, a one-dimensional scan (1D shmoo) is performed over the topmost feature i (i.e., the first feature); the features may be ranked according to domain knowledge, or by calculating the gradient (directional derivative) of each feature.
Next, a two-dimensional scan (2D shmoo) is performed on the basis of the previous one-dimensional scan, with feature j (i.e., the second feature) added; the features may again be ranked according to domain knowledge or by calculating the gradient (directional derivative) of each feature.
Then, a three-dimensional scan (3D shmoo) is performed on the basis of the previous two-dimensional scan, with feature k (i.e., the third feature) added; the features may again be ranked according to domain knowledge or by calculating the gradient (directional derivative) of each feature.
……
An n-dimensional scan (nD shmoo) is performed on the basis of the (n-1)-dimensional scan, with feature n (i.e., the nth feature) added; the features may be ranked according to domain knowledge or by calculating the gradients (directional derivatives) of the features.
In the actual environment, an APF function is called on the test bench to collect the data points sampled by this strategy; these data points are real data obtained by testing the chip with a tester (hardware).
Finally, the importance of the features is ranked by calculating gradients, and the sampling density is increased for features of higher importance.
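The incremental 1D → 2D → … → nD scan with gradient-based ranking can be sketched as follows. This is an assumption-laden Python illustration: `measure` is a hypothetical stand-in for the tester call (the real APF function is not given in the text), and a finite difference stands in for the gradient (directional derivative) computation:

```python
import itertools

def shmoo_sample(baseline, measure, grid, max_dim=None):
    """Incremental 1D -> 2D -> ... shmoo: add one feature per round,
    ranking features by a finite-difference gradient proxy.

    `baseline` maps feature name -> default value; `grid` is the tuple
    of offsets scanned for each active feature."""
    max_dim = max_dim or len(baseline)
    base_out = measure(baseline)
    step = grid[1] - grid[0]

    def sensitivity(name):
        # Finite-difference proxy for the directional derivative.
        probe = dict(baseline)
        probe[name] += step
        return abs(measure(probe) - base_out)

    ranked = sorted(baseline, key=sensitivity, reverse=True)
    active, samples = [], []
    for name in ranked[:max_dim]:
        active.append(name)  # add the next most important feature
        for offsets in itertools.product(grid, repeat=len(active)):
            point = dict(baseline)
            for feat, off in zip(active, offsets):
                point[feat] += off
            samples.append((point, measure(point)))
    return ranked, samples

# Toy measurement in which feature "a" matters three times as much as "b".
ranked, samples = shmoo_sample(
    {"a": 0.0, "b": 0.0},
    lambda p: 3.0 * p["a"] + p["b"],
    grid=(-1.0, 0.0, 1.0),
)
```

With two features and a three-point grid this produces a 3-point 1D scan followed by a 9-point 2D scan, with the more sensitive feature scanned first, which is the ordering behavior the steps above describe.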
The concept of optimized-strategy sampling has been introduced above; the data sampling closed loop and model optimization are described below.
Here, the data sampling closed loop is implemented on the basis of reinforcement learning. Reinforcement learning is learning by an agent in a "trial and error" manner: by interacting with the environment, the agent's behavior is guided by rewards so as to maximize the reward it obtains. It differs from supervised learning in connectionist learning mainly in the reinforcement signal: the reinforcement signal provided by the environment evaluates how good the produced action is (typically a scalar signal), rather than telling the reinforcement learning system (RLS) how to produce the correct action. Since the external environment provides little information, the RLS must learn from its own experience. In this way, the RLS acquires knowledge in an action-evaluation environment and improves its action plan to suit the environment.
Specifically, if a certain behavior policy of the agent results in a positive reward from the environment (signal reinforcement), the agent's tendency to produce this behavior policy later is strengthened. The goal of the agent is to find, in each discrete state, the optimal policy that maximizes the expected sum of discounted rewards.
Reinforcement learning regards learning as a trial-and-evaluation process: the agent selects an action for the environment; the state of the environment changes upon receiving the action, and a reinforcement signal (reward or punishment) is generated and fed back to the agent; the agent then selects the next action according to the reinforcement signal and the current state of the environment, following the principle of increasing the probability of receiving positive reinforcement (reward). The selected action affects not only the immediate reinforcement value but also the environment's state at the next moment and the final reinforcement value.
FIG. 4 is a schematic diagram illustrating data sampling closed loop and model optimization.
The data sampling comprises the following steps:
(1) The agent receives domain knowledge through an API (Application Programming Interface) and samples data points based on the formulated policy, either following domain knowledge recommendations or via the Monte Carlo method.
(2) A large amount of data is collected from the tester into the sample trajectory buffer; this data also serves as the sample data used by the causal machine learning model in iterative training.
(3) And testing the chip by using a tester to obtain a reference true value.
(4) The predictive labels/values are generated by a causal machine learning model with the data as input.
(5) And calculating a loss between the reference true value and the predicted label/value, and training and iterating the causal machine learning model by retraining/fine-tuning the causal machine learning model when the loss is equal to or greater than a predetermined threshold.
(6) Losses are input as states to the agent, and as rewards in the case where the sampled data set nearly matches the actual data distribution and the output of the causal machine learning model at a given input nearly matches the benchmark truth obtained after the chip is tested by the tester.
It is noted that a greater loss is preferred in the early stages of the data sampling closed loop. The final convergence stage is determined by keeping the loss over a series of long sampling trajectories below a predetermined threshold.
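Steps (1) through (6) can be condensed into a short loop; in this hedged sketch every callable is a deliberately trivial stand-in (the real tester, agent policy, and causal model are not reproduced here), so only the control flow of the closed loop is illustrated:

```python
def closed_loop(sample_points, tester, model, retrain, threshold, rounds):
    """Toy version of the data sampling closed loop, steps (1)-(6)."""
    losses = []
    for _ in range(rounds):
        points = sample_points()                    # (1) agent samples data points
        buffer = [(p, tester(p)) for p in points]   # (2)+(3) buffer + reference truths
        loss = sum(abs(model(p) - truth)            # (4) predict, then compute loss
                   for p, truth in buffer) / len(buffer)
        if loss >= threshold:
            model = retrain(buffer)                 # (5) retrain/fine-tune on high loss
        losses.append(loss)                         # (6) loss fed back as state/reward
    return losses

# Toy run: the tester doubles its input; the initial model predicts 0;
# "retraining" simply returns a model matching the observed truths.
losses = closed_loop(
    sample_points=lambda: [1.0, 2.0, 3.0],
    tester=lambda p: 2.0 * p,
    model=lambda p: 0.0,
    retrain=lambda buffer: (lambda p: 2.0 * p),
    threshold=0.5,
    rounds=2,
)
```

The first round's large loss triggers retraining; the second round's loss then falls below the threshold, mirroring the preference for larger losses early and a below-threshold loss at convergence.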
At time zero, the reinforcement learning (RL) model should follow domain knowledge recommendations from the start of sampling, to ensure that the RL model can fully utilize domain knowledge.
By using reinforcement learning to optimize the sampling trajectories, the expert can be replaced. The causal relationship model may be published to users for accurate prediction. Sampling efficiency can be improved, potentially avoiding both over-sampling and under-sampling. In addition, a reasonable (or fair) data set is output, which may further facilitate causal ML model training (even MAE model training) and can be visualized by users to brainstorm conclusions directly.
FIG. 5 is a diagram illustrating an example of selection of a causal oriented machine learning model.
As shown in fig. 5, for different data types, different models may be selected and trained based on the collected data sets, specifically comprising the steps of:
Based on the collected data set, judge whether it is image data (figure data); if so, use a CNN (Convolutional Neural Network) model, and if not, proceed to the next step.
Judge whether it is temporal data; if so, use an LSTM (Long Short-Term Memory) model, and if not, proceed to the next step.
Judge whether it is multi-modal (multi-modality); if so, use a Transformer model, and if not, proceed to the next step.
Judge whether the IO (Input/Output) changes frequently; if so, use an MLP (Multi-Layer Perceptron) model, and if not, proceed to the next step.
Otherwise, use a GNN (Graph Neural Network) model.
Among them, LSTM is more suitable for temporal data. The transducer model is used for multi-modal datasets, which require a larger dataset to work on the transducer model. MLP is the simplest model, applicable to relatively small datasets.
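The decision chain of FIG. 5 can be summarized as a simple dispatch function. This is an illustrative sketch only; the boolean flags and the returned model-family names are placeholders introduced here, not an API defined by the patent.

```python
def select_model(is_graph_data: bool,
                 is_temporal: bool,
                 is_multimodal: bool,
                 io_changes_frequently: bool) -> str:
    """Walk the FIG. 5 decision chain and return a model family name."""
    if is_graph_data:
        return "CNN"          # graph/figure data
    if is_temporal:
        return "LSTM"         # temporal (time-series) data
    if is_multimodal:
        return "Transformer"  # multi-modal; needs a larger dataset
    if io_changes_frequently:
        return "MLP"          # simplest model, fits small datasets
    return "GNN"              # default: graph neural network
```

For example, `select_model(False, True, False, False)` returns `"LSTM"`, matching the second branch of the chain.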
In addition, the invention also provides an electronic device and a computer-readable storage medium.
Fig. 6 is a schematic block diagram illustrating an exemplary electronic device 1000 in which embodiments of the present invention can be implemented. Electronic devices are intended to represent various forms of digital computers. The electronic device may also represent various forms of mobile devices.
As shown in fig. 6, the electronic device 1000 may include a computing unit 1001 that may perform various appropriate actions and processes according to a computer program stored in a Read-Only Memory (ROM) 1002 or a computer program loaded from a storage unit 1008 into a Random Access Memory (RAM) 1003. The RAM 1003 may also store various programs and data required for the operation of the device 1000. The computing unit 1001, the ROM 1002, and the RAM 1003 are connected to each other by a bus 1004. An input/output (I/O) interface 1005 is also connected to the bus 1004.
Various components in the device 1000 are connected to the I/O interface 1005, including: an input unit 1006 such as a keyboard or a mouse; an output unit 1007 such as various types of displays and speakers; a storage unit 1008 such as a magnetic disk or an optical disk; and a communication unit 1009 such as a network card, a modem, or a wireless communication transceiver. The communication unit 1009 allows the device 1000 to exchange information/data with other devices via a computer network, such as the Internet, and/or various telecommunications networks.
The computing unit 1001 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 1001 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 1001 performs the respective methods and processes described above, such as a model training method. For example, in some embodiments, the model training method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1008. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 1000 via the ROM 1002 and/or the communication unit 1009. When the computer program is loaded into the RAM 1003 and executed by the computing unit 1001, one or more steps of the model training method described above may be performed.
Various implementations of the apparatus and techniques described above may be implemented in digital electronic circuitry, integrated circuits, Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
Program code for carrying out the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server. In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
The computer device may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the defects of difficult management and weak service scalability found in traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system, or a server combined with a blockchain.
It should be noted that artificial intelligence is the discipline of making computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), with technologies at both the hardware and software levels. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, and knowledge graph technologies.
It is to be understood that the above description is intended to be illustrative, and not restrictive. For example, the above-described embodiments (and/or aspects thereof) may be used in combination with one another. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the various embodiments of the invention without departing from the scope thereof. While the dimensions and types of materials described herein are intended to define the parameters of the various embodiments of the invention, the various embodiments are not meant to be limiting and are exemplary embodiments. Many other embodiments will be apparent to those of skill in the art upon reading the above description. The scope of the various embodiments of the invention should, therefore, be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
Claims (9)
1. A method for sampling a data set, characterized in that,
The data set is sampled based on an optimized sampling trajectory strategy established by reinforcement learning combined with a causal machine learning model, wherein, in the optimized sampling trajectory strategy, in the case of a plurality of features, the features are scanned dimension by dimension, the importance of the features is ranked by gradient calculation, and the sampling density is increased for features of higher importance,
The optimized sampling trajectory strategy comprises a data sampling closed loop and model optimization, and comprises the following steps:
the agent receives domain knowledge through an application program interface, follows the domain-knowledge suggestions or uses the Monte Carlo method, and samples data points based on the formulated strategy;
collecting data from the tester in a sample trace buffer, the data also serving as the samples used by the causal machine learning model in iterative training;
testing the chip by the tester to obtain a reference truth value;
generating, by the causal machine learning model, a predictive label/value with the collected data as input;
calculating a loss between the reference truth value and the predictive label/value, and, if the loss is above a predetermined threshold, retraining/fine-tuning the causal machine learning model, thereby performing training and iteration;
inputting the loss as a state to the agent, and inputting it as a reward to the agent in the case where the sampled data set matches the actual data distribution and, for a given input, the output of the causal machine learning model matches the reference truth obtained after the chip is tested by the tester.
2. The method of sampling a data set according to claim 1,
In the case of application in the field of semiconductor chips, the data set is a specification of the semiconductor chip.
3. A method of sampling a data set according to claim 2, wherein,
In the case where the semiconductor chip is a NAND flash memory, the specification of the semiconductor chip is array characteristic data and temperature of the NAND flash memory.
4. A method of sampling a data set according to claim 3,
The array characteristic data includes read window budget, residual bit error rate, read/write disturbance, data retention, and endurance.
5. The method of sampling a data set according to claim 1,
In the case where the dataset has n features, where n is a natural number greater than 1, the optimized sampling trajectory strategy comprises the steps of:
sampling data based on user domain knowledge and a statistical test method;
performing a one-dimensional scan on the first, topmost feature, and sorting the features according to domain knowledge or by calculating the gradients of the features;
adding a second feature based on the one-dimensional scan, performing a two-dimensional scan, and sorting the features according to domain knowledge or by calculating the gradients of the features;
adding a third feature based on the two-dimensional scan, performing a three-dimensional scan, and sorting the features according to domain knowledge or by calculating the gradients of the features;
and so on;
adding an n-th feature based on the (n-1)-dimensional scan, performing an n-dimensional scan, and sorting the features according to domain knowledge or by calculating the gradients of the features;
in the actual environment, calling a function on the tester to collect the data points sampled by the optimized sampling trajectory strategy;
sorting the importance of each feature by calculating gradients, and increasing the sampling density for features of higher importance.
6. The method of sampling a data set according to claim 1,
In the early stages of the data sampling closed loop, a larger loss is preferred, while the final convergence stage is determined by keeping the losses of a series of long sampling trajectories below a prescribed threshold.
7. The method of sampling a data set according to claim 1,
For the causal machine learning model used in the data sampling closed loop, the model is selected by the following steps:
based on the collected data set, judging whether the data is graph data; if so, using a CNN model; if not, proceeding to the next step;
judging whether the data is temporal data; if so, using an LSTM model; if not, proceeding to the next step;
judging whether the data is multi-modal; if so, using a Transformer model; if not, proceeding to the next step;
judging whether the input/output changes frequently; if so, using an MLP model; if not, proceeding to the next step;
otherwise, using a GNN model.
8. An electronic device, comprising:
at least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
9. A computer-readable storage medium storing computer instructions, characterized in that,
The computer instructions for causing a computer to perform the method of any one of claims 1-7.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410509839.1A CN118171715B (en) | 2024-04-26 | 2024-04-26 | Data set sampling method, electronic device and computer readable storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118171715A CN118171715A (en) | 2024-06-11 |
| CN118171715B true CN118171715B (en) | 2024-08-02 |
Family
ID=91353267
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410509839.1A Active CN118171715B (en) | 2024-04-26 | 2024-04-26 | Data set sampling method, electronic device and computer readable storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118171715B (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20190101043A (en) * | 2018-02-22 | 2019-08-30 | 한국과학기술원 | A joint learning framework for active feature acquisition and classification |
| CN117540938A (en) * | 2024-01-10 | 2024-02-09 | 杭州经纬信息技术股份有限公司 | Integrated building energy consumption prediction method and system based on TD3 reinforcement learning optimization |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12327194B2 (en) * | 2021-02-02 | 2025-06-10 | International Business Machines Corporation | Active learning using causal network feedback |
| CN114154612A (en) * | 2021-11-17 | 2022-03-08 | 中国航空工业集团公司沈阳飞机设计研究所 | Intelligent agent behavior model construction method based on causal relationship inference |
| CN116187174A (en) * | 2023-01-31 | 2023-05-30 | 周杰 | Causal rewarding-based multitasking self-supervised reinforcement learning |
| CN117829253A (en) * | 2023-12-27 | 2024-04-05 | 中船智海创新研究院有限公司 | Offline reinforcement learning method, device, equipment and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||