[go: up one dir, main page]

CN111931939B - Single-amplitude quantum computing simulation method - Google Patents

Single-amplitude quantum computing simulation method Download PDF

Info

Publication number
CN111931939B
CN111931939B CN201910394102.9A CN201910394102A CN111931939B CN 111931939 B CN111931939 B CN 111931939B CN 201910394102 A CN201910394102 A CN 201910394102A CN 111931939 B CN111931939 B CN 111931939B
Authority
CN
China
Prior art keywords
quantum
vertex
edge
tensor
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910394102.9A
Other languages
Chinese (zh)
Other versions
CN111931939A (en
Inventor
王晶
窦猛汉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Benyuan Quantum Computing Technology Hefei Co ltd
Original Assignee
Benyuan Quantum Computing Technology Hefei Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Benyuan Quantum Computing Technology Hefei Co ltd filed Critical Benyuan Quantum Computing Technology Hefei Co ltd
Priority to CN201910394102.9A priority Critical patent/CN111931939B/en
Publication of CN111931939A publication Critical patent/CN111931939A/en
Application granted granted Critical
Publication of CN111931939B publication Critical patent/CN111931939B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N10/00Quantum computing, i.e. information processing based on quantum-mechanical phenomena

Landscapes

  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Condensed Matter Physics & Semiconductors (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a single-amplitude quantum computing simulation method, which comprises the following steps: obtaining a target quantum program for each computing node of the distributed cluster; constructing an undirected graph corresponding to the target quantum program; wherein, the vertex of the undirected graph represents the quantum state of the operated quantum bit before or after the quantum logic gate is operated, and one edge of the undirected graph corresponds to one tensor; obtaining a quantum state corresponding to a target single amplitude to be measured, and calculating the sub-amplitude of the quantum state based on the quantum state and the undirected graph and by matching with the GPU corresponding to the calculation node; wherein the sub-amplitude is the amplitude corresponding to the undirected graph; and returning the sub-amplitudes to the main nodes of the distributed clusters, so that the main nodes reduce the sub-amplitudes to obtain the amplitudes of the quantum states, and the amplitudes are used as target single amplitudes. With the embodiments of the present invention, quantum computing simulations involving 50 or even more qubits can be achieved.

Description

Single-amplitude quantum computing simulation method
Technical Field
The invention belongs to the technical field of quantum computing, and particularly relates to a single-amplitude quantum computing simulation method.
Background
The quantum computer is a kind of physical device which performs high-speed mathematical and logical operation, stores and processes quantum information according to the law of quantum mechanics. When a device processes and calculates quantum information and operates on a quantum algorithm, the device is a quantum computer.
Quantum computers can perform a variety of tasks not possible with classical computers, such as quantum simulation and decomposition of large prime factors. On the way quantum computation advances, in order to realize "quantum override", a quantum computer having a quantum bit number of 50 or more and high fidelity is required to be realized first. However, before the real implementation, the quantum computing simulation can be performed through the related theory of quantum computing, so that the software and hardware decoupling of the quantum computer is realized, and a foundation is laid for the development of quantum programs and quantum applications.
The quantum computing simulation is a simulation computation which simulates and follows the law of quantum mechanics by means of numerical computation and computer science, and is taken as a simulation program, and the high-speed computing capability of a computer is utilized to characterize the space-time evolution of the quantum state according to the basic law of quantum bits of the quantum mechanics.
At present, full-amplitude simulation, namely, one-time simulation of all amplitudes of quantum bit end states is generally adopted for quantum computation simulation, but the full-amplitude simulation is calculated based on unitary transformation, and the memory overhead of the full-amplitude simulation increases exponentially with the number of quantum bits. When simulating quantum computation involving 30 qubits, the memory overhead is 16GByte (Ji Bytes); at 40 qubits, the memory overhead would need 16 TBytes (terabytes), i.e. 2 10 * (16 GByte); at 50 qubits, the memory overhead would be 16 PBytes (beat bytes), i.e. 2 10 * (16 TByte). This is not affordable for common cloud platforms and even super computing platforms that provide quantum computing simulation services, and full-amplitude simulators currently only simulate 49 quantum bits at maximum in academia, which is a simulation result based on the world-largest super computer, but does not provide cloud services to the outside, which is very unfavorable for research and development of quantum programs and quantum applications. In this case, a single amplitude simulation, i.e. a scheme that simulates only one amplitude at a time, is proposed, and this mode would require much less memory. Thus, it can be seen that the related research and implementation of quantum computation simulation of single quantum state component amplitude are particularly important for the development of quantum computation under the condition that the memory resources of the current platform are limited.
Disclosure of Invention
The invention aims to provide a single-amplitude quantum computing simulation method, which aims to solve the defects in the prior art and realize the computing simulation involving 50 or more quantum bits.
The technical scheme adopted by the invention is as follows:
a single amplitude quantum computing simulation method, the method comprising:
obtaining a target quantum program for each computing node of the distributed cluster;
constructing an undirected graph corresponding to the target quantum program; wherein, the vertex of the undirected graph represents the quantum state of the operated quantum bit before or after the quantum logic gate is operated, and one edge of the undirected graph corresponds to one tensor;
obtaining a quantum state corresponding to a target single amplitude to be measured, and calculating the sub-amplitude of the quantum state based on the quantum state and the undirected graph and by matching with the GPU corresponding to the calculation node; wherein the sub-amplitude is the amplitude corresponding to the undirected graph;
and returning the sub-amplitudes to the main nodes of the distributed clusters, so that the main nodes reduce the sub-amplitudes to obtain the amplitudes of the quantum states, and the amplitudes are used as target single amplitudes.
Optionally, the constructing an undirected graph corresponding to the target quantum program includes:
analyzing the target quantum program to obtain a linked list of recorded quantum program information;
traversing the linked list, and creating an edge with tensor order of 1 when the type of the quantum logic gate in the linked list is a first single quantum gate; the side is connected with the last vertex of the vertex chain corresponding to the quantum bit operated by the first single quantum gate, and the unitary matrix of the first single quantum gate is a diagonal matrix;
When the quantum logic gate type in the linked list is a second single quantum gate, creating an edge with tensor order of 2 and a vertex connected with the edge; the side is connected with the last vertex of the vertex chain corresponding to the quantum bit operated by the second single quantum gate, and the unitary matrix of the second single quantum gate is a non-diagonal matrix;
when the type of the quantum logic gate in the linked list is a first double quantum gate, creating an edge with tensor order of 2; the side is connected with the last vertex in the vertex chain corresponding to the two quantum bits operated by the first double-quantum gate, and the unitary matrix of the first double-quantum gate is a diagonal matrix;
when the type of the quantum logic gate in the linked list is a second double-quantum gate, creating an edge with the tensor order of 4 and two vertexes connected with the edge; the side is connected with the last vertex in the vertex chain corresponding to the two quantum bits operated by the second double-quantum gate, and the unitary matrix of the second double-quantum gate is a non-diagonal matrix;
and obtaining an undirected graph corresponding to the target quantum program.
Optionally, the calculating, based on the quantum state and the undirected graph and in cooperation with the GPU corresponding to the calculating node, the sub-amplitude of the quantum state includes:
Invoking GPU corresponding to the computing node, and respectively performing definite value reduction on tensors of edges connected with specific vertexes of the undirected graph; wherein the specific vertex is the first vertex and the last vertex of the vertex chain corresponding to each quantum bit;
deleting the specific vertex;
receiving the value of a target vertex distributed by the main node, splitting a current undirected graph based on the value of the target vertex, and calling the GPU to respectively determine value reduction for tensors of connecting edges of the target vertex aiming at each sub undirected graph obtained by splitting;
aiming at each vertex in the sub undirected graph, combining the GPU to fuse all the connecting edges of the vertex into a new edge, reducing the tensor of the new edge, and deleting the vertex;
integrating tensor values of all new edges after the order reduction to obtain a first sub-amplitude of the quantum state, which corresponds to the sub-undirected graph;
and summing the first sub-amplitudes of all sub-undirected graphs in the quantum state to obtain the sub-amplitudes of the quantum state.
Optionally, the performing a definite value reduction on tensors of the edges connected by the specific vertices of the undirected graph includes:
the GPU corresponding to the computing node sets the thread block number according to the tensor order after the order reduction and the thread number in each thread block in the GPU aiming at the edge connected with each specific vertex;
According to the thread block serial numbers, the thread number in each thread block and the thread serial numbers, calculating first element numbers of the reduced tensors, and calculating two second element numbers of the reduced tensors corresponding to the first element numbers; the number of bits of the element number corresponds to the number of vertex bits connected with the current edge one by one, and the value of each bit of the element number is the vertex value of the corresponding vertex position;
determining a second element number with a value on a number bit corresponding to a specific top point position as a preset determined value from the two second element numbers;
and acquiring a second element value corresponding to the determined second element number, and determining the second element value as a first element value corresponding to the first element number.
Optionally, the receiving the value of the target vertex allocated by the master node, splitting the current undirected graph based on the value of the target vertex, and calling the GPU to respectively determine value reduction for tensors of connecting edges of the target vertex for each sub undirected graph obtained by splitting, including:
receiving one or more values of target vertexes equally divided by a main node; wherein the target vertexes are the first m vertexes with the largest number of connected edges in the current undirected graph, and the m vertexes comprise 2 m Seed value, wherein the value of each vertex is 0 or 1, m is a positive integer, and the number of the calculation nodes is 2 n N is a positive integer, and n is greater than 0 and less than or equal to m;
splitting an undirected graph of the computing node into one or more sub undirected graphs aiming at each vertex value which is evenly divided;
and traversing the edges connected with the target vertexes aiming at each sub undirected graph, and calling the GPU to respectively perform definite value reduction on tensors of the edges connected with the target vertexes.
Optionally, the fusing all connection edges of the vertex with the GPU into a new edge includes:
determining a first edge and a second edge to be fused for all the connecting edges of the vertex;
according to the vertex which is not connected with the first edge in the second edge, calling the GPU to upgrade the first tensor of the first edge, and updating the first tensor by the tensor after upgrade;
deleting the second edge, and connecting the vertex which is not connected with the first edge in the second edge to the first edge to obtain a fused middle edge;
invoking the GPU to calculate tensor elements of the middle edge according to the corresponding relation between the recorded vertex numbers of the first edge and the second edge;
and returning to the execution step of determining the first edge and the second edge to be fused until the calculated tensor element is the tensor element of the last edge, and determining the last edge as a new fused edge.
Optionally, the step-up the first tensor of the first edge includes:
the GPU calculates the tensor order after the step-up according to the order of the first tensor and the step-up;
setting the number of thread blocks according to the tensor order after the ascending and the number of threads in each thread block in the GPU;
according to the thread block serial numbers, the number of threads in each thread block and the thread serial numbers, calculating the first element numbers of tensors after the step-up;
and calculating the element of the tensor after the step-up according to the first element number, the step-up number and the element of the first tensor.
Optionally, the calculating the tensor element of the middle edge includes:
the GPU sets the number of thread blocks according to the updated order of the first tensor and the number of threads in each thread block in the GPU;
according to the thread block serial numbers, the number of threads in each thread block and the thread serial numbers, calculating the first element numbers of tensors of the middle edges;
determining corresponding elements of each element in the first tensor in a second tensor of the second edge according to the corresponding relation;
each element in the first tensor is traversed, and the element is updated with the product of the element and its corresponding element in the second tensor.
Optionally, the reducing the tensor of the new edge includes:
the GPU sets the number of thread blocks according to the tensor order of the new edge after the order is reduced;
according to the thread block serial numbers, the thread number in each thread block and the thread serial numbers, calculating first element numbers of the reduced tensors, and calculating two second element numbers of the reduced tensors corresponding to the first element numbers; the number of bits of the element number corresponds to the number of vertex bits connected with the current edge one by one, and the value of each bit of the element number is the vertex value of the corresponding vertex position;
and obtaining two second element values corresponding to the two second element numbers one by one, summing the two second element values, and determining the sum as a first element value corresponding to the first element number.
Optionally, the calculation formula of the first element number is:
Idx=block_id*num+thread_id
wherein Idx is the first element number, block_id is the thread block number, num is the number of threads in each thread block, and thread_id is the thread number.
Compared with the prior art, the method can only calculate one target single amplitude of the concerned quantum bit at a time, specifically map the target quantum program onto the undirected graph, split the undirected graph onto a plurality of computing nodes by combining a path integration method, and calculate the corresponding sub undirected graph by each computing node and the subordinate GPU in a matching way. The whole calculation process is mainly based on simple operation of elements in tensors, compared with full-amplitude simulation based on unitary matrix in the prior art, the requirement on memory is greatly reduced, and the calculated amount does not rise along with the quantum bit index, so that quantum calculation simulation involving 50 or more quantum bits can be realized; the GPU has strong performance of executing a large amount of parallel computation, so that the quantum computation simulation efficiency is high. At present, the technical scheme provided by the embodiment of the invention can realize quantum computing simulation involving 196 quantum bits at maximum.
In addition, in practical application, only one or more of the full amplitudes of the qubits are sometimes needed, in which case, if the full amplitude mode in the prior art is adopted, that is, all the amplitudes are simulated at one time, the waste of resources such as a memory and time is undoubtedly caused; by applying the method provided by the embodiment of the invention, one or more times of simulation can be performed in a targeted manner, and one or more single amplitudes are needed to be simulated, so that resources and time are saved greatly.
Drawings
FIG. 1 is a specific example of splitting a quantum wire into different paths corresponding to a quantum program of the present invention;
FIG. 2 is a flow chart of a single-amplitude quantum computing simulation method provided by an embodiment of the invention;
fig. 3 is an undirected schematic illustration of different types of quantum logic gate construction in a single-amplitude quantum computing simulation method provided by an embodiment of the present invention.
Detailed Description
The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
In order to realize computational simulation involving 50 or even more qubits, the embodiment of the invention provides a single-amplitude quantum computational simulation method and device.
The following first describes a single-amplitude quantum computing simulation method provided by the embodiment of the invention.
It will be appreciated by those skilled in the art that each qubit may be at |0 at the same time>And |1>Is a quantum state of one qubitCan be expressed as a|0>+b|1>Wherein a and b are respectively |0>、|1>The amplitudes of (a) are complex. After measurement, the quantum state collapses to a fixed quantum state, where it collapses to |0>The probability of (a) is a 2 Collapse to |1>The probability of (b) is b 2 ,a 2 +b 2 =1. And the quantum state of n quantum bits is 2 n And a superposition of the individual quantum states. For example, a quantum state consisting of 3 qubits +.>Is 2 3 (i.e., 8) superimposed states of quantum states, wherein the 8 quantum states are each |000>、|001>、|010>、|011>、|100>、|101>、|110>Sum |111>At this time, a quantum state composed of 3 qubits +.>Can be expressed as
Wherein each quantum state of the 8 quantum states is called a quantum state component, and each quantum state component has a corresponding amplitude, i.e. c 0 To c 7 These complex numbers may be referred to as a single amplitude. Full amplitude simulation, i.e. 2 in which n qubits are simulated at one time n Amplitude of the individual quantum state components; whereas single amplitude simulation refers to one-time simulation 2 n Amplitude of any one quantum state component of the individual quantum states.
Currently, with respect to quantum computing simulation, full-amplitude mode is mostly adopted in the industry. However, for the full-amplitude mode, the memory occupied by the full-amplitude mode generally increases exponentially with the number of the simulated qubits, for example, only 16KB of memory is needed for simulating 10 qubits, 16MB of memory is needed for simulating 20 qubits, 16GB is needed for simulating 30 qubits, and up to 16PB of memory is needed for simulating 50 qubits, so that full-amplitude simulation of 50 or more qubits cannot be realized in all computer memories in the world.
In view of this, the embodiment of the present invention provides a single-amplitude quantum computing simulation method, which is applied to computing nodes of a distributed cluster, where all computing nodes correspond to a master node, each computing node is subordinate to (i.e. controls) one or more GPUs, and the number of GPUs controlled by one computing node may be positive integer power of 2, so that it is convenient to process quantum information, because the quantum state number of the quantum bits is an index of 2. The following description will take a GPU controlled by a computing node as an example. Preferably, the distributed clusters may be supercomputer clusters (e.g., optical supercomputer platforms of the Shenwei Taihu lake).
It should be noted that, the quantum program is a series of instruction sequences written by a quantum language and capable of running on a quantum computer, so as to realize the support of quantum logic gate operation and finally realize the simulation of quantum computation. Specifically, the quantum program is a series of instruction sequences for operating the quantum logic gate according to a certain time sequence.
Among these, quantum logic gates are a fundamental, quantum circuit that operates on a small number of qubits (i.e., qubits). It is the basis of quantum circuits, just like the relationship between a conventional logic gate and a typical digital circuit. Quantum logic gates include single quantum logic gates, double quantum logic gates, and multiple quantum logic gates.
While a quantum program may contain hundreds of quantum logic gate operations, or thousands of quantum logic gate operations. The execution process of the quantum program is a process of executing all quantum logic gates according to a certain time sequence. The timing is a time sequence in which each quantum logic gate is executed.
It should be noted that the quantum logic gate is generally represented by a unitary matrix, and the unitary matrix is not only a matrix, but also an operation and transformation. The effect of a general quantum logic gate on a quantum state is calculated by multiplying the unitary matrix by the right vector corresponding to the quantum state.
For example: quantum state |0>The corresponding right vector isAnd quantum state |1>The corresponding right vector is +.>
The quantum gates can be further divided into diagonal quantum gates and off-diagonal quantum gates according to unitary matrix types. The diagonal quantum logic gate refers to a quantum logic gate with unitary matrix being a diagonal matrix. As is well known, a diagonal matrix refers to a matrix in which all elements other than the main diagonal are 0, and the elements on the diagonal may be 0 or other values.
Such as an identity matrixIs typically a diagonal matrix.
In contrast, matrices having non-zero elements outside the main diagonal, i.e. non-diagonal matrices, e.g.The method comprises the steps of carrying out a first treatment on the surface of the And the unitary matrix is a quantum logic gate of an off-diagonal matrix, i.e., an off-diagonal quantum logic gate.
It will be appreciated by those skilled in the art that it is assumed that the quantum bit is divided in the target quantum programInitial quantum state |0 … 0>And the last state ex-co-involve M 1 Each quantum state, then, since the state of each qubit can be at |0>And |1>So for M 1 One of the quantum states is split into two quantum state components: i0>And |1>The initial quantum state to final state component X= |x can be obtained 0 …x n-1 >2 of (2) M1 The possible transformation paths are calculated, the amplitude of each path is calculated, and the sum is carried out to obtain the final state component, namely the amplitude of the target quantum state component. Wherein M is 1 Is a positive integer.
For example, the target quantum program involves 2 qubits, respectively: q 0 、q 1 Initial state s 0 =|00>The target quantum state component is |11>The quantum program contained 2H gates (Hadamard Gate, ada Ma Men): h 1 、H 2 1 CNOT Gate (Control-not Gate, control NOT Gate).
As shown in FIG. 1a, a simple illustration of the quantum program is given for a quantum wire, it can be seen that the quantum wire divides the initial quantum state |00 >And target quantum state component |11>In addition, 4 quantum states are involved: s is(s) 0 1 、s 0 2 、s 1 1 、s 1 2 Wherein each quantum state can be represented as |0>And |1>Is a superposition of (a) to (b). If s is to 1 1 Splitting into |0>And |1>Two parts, then, from the initial quantum state |00>To |11>Can be split into two paths as shown in figures (1 b) and (1 c) to obtain initial quantum state |00>Transformed into |11 via two paths>And then summing the sub-amplitudes to obtain an amplitude value corresponding to the target quantum program, thereby completing the simulation.
It will be appreciated that if M in the target quantum program is to be 1 Each quantum state is split into |0>And |1>In two parts, then, the initial quantum state to final state component is obtainedThe possible transformation paths are calculated, the amplitude of each path is calculated, and the sum is performedA target single amplitude of the quantum state can be obtained.
In a quantum program with only single quantum logic gates and diagonal double quantum logic gates, the initial quantum state of a given qubit is |0 … 0>When the last state component, namely the target quantum state component, is taken as x= |x 0 …x n-1 >When the amplitude is calculated, the formula can be expressed as:
(1)
equation (1) is a basic equation of the quantum mechanical path integration method.
In the formula (1) The functions are all complex functions related to Boolean variables, represent the contribution of quantum logic gates to quantum states, and for better explanation, only three types of +.>One of the functions is +.>The function is omitted; />The value is {0,1}, the corresponding qubit is +.>Is->The components of the quantum states after the action of the quantum logic gates. />The value of the function is mainly related to two factors, namely, the quantum state of the quantum bit operated by the quantum logic gate before and after the quantum logic gate is executed and the unitary matrix of the quantum logic gate.
In particular, the method comprises the steps of,is a variable +.>And->The value of which is determined by the values of two variables and unitary matrix corresponding to the diagonal two-quantum logic gate,/->And->Respectively corresponding to the quantum bit of +.>And->Is not +.>Components of the quantum state before the diagonal double-quantum logic gate acts; />Is a variable +.>The value of which is determined by the value of the variable and the unitary matrix of the corresponding diagonal single quantum logic gate,/v>The corresponding qubit is +.>Is not +.>Components of the quantum state before the diagonal single quantum logic gate acts; />Is a variable +.>And->The value of which is determined by the values of two variables and unitary matrix corresponding to the off-diagonal single quantum logic gate,/- >And->Respectively corresponding to the quantum bit of +.>Is at%>The components of the front and rear quantum states of the off-diagonal single quantum logic gate act. It will be appreciated that->、/>、/>、/>、/>、/>、/>、/>、/>、/>Are all non-negative integers.
The single-amplitude quantum computing simulation method provided by the embodiment of the invention is based on the formula (1), expands to a non-diagonal double-quantum logic gate and maps the non-diagonal double-quantum logic gate into an undirected graph. In particular, the method comprises the steps of,points corresponding to undirected graph->The function corresponds to the edge of the undirected graph, converting the solving equation (1) into a process for the undirected graph, it being understood that, according to +.>The undirected graph can be split; more specifically, firstly, constructing an undirected graph corresponding to a target quantum program on a plurality of computing nodes, then splitting the undirected graph according to different vertex values to obtain different sub undirected graphs, then obtaining the amplitudes of corresponding paths through computing the sub undirected graphs, and finally merging the amplitudes of all paths to obtain the corresponding amplitudes of the target quantum state components.
As shown in fig. 2, the single-amplitude quantum computing simulation method provided by the embodiment of the invention may include the following steps:
s201, aiming at each computing node of the distributed cluster, obtaining a target quantum program;
At the angle of the computing nodes, each computing node can obtain the same target quantum program sent by the main node, the program is preferably written by the existing qrenes language, and can also be written by other feasible quantum languages, the computing node is usually a CPU in a computer, the main node can also be the CPU, the first computing node can be used as the main node, and the user input quantum program can be taken, and the following description is given by taking the computing node as the CPU.
S202, constructing an undirected graph corresponding to the target quantum program; wherein, the vertex of the undirected graph represents the quantum state of the operated quantum bit before or after the quantum logic gate is operated, and one edge of the undirected graph corresponds to one tensor;
first, the target quantum program can be parsed to obtain a linked list of recorded quantum program information. And traversing the linked list, sequentially reading the types and unitary matrix forms of all quantum logic gates in the linked list, adding vertexes and edges, and constructing an undirected graph of the quantum program. One specific implementation is as follows:
when the type of the quantum logic gate in the linked list is a first single quantum gate, creating an edge with tensor order of 1; the side is connected with the last vertex of the vertex chain corresponding to the quantum bit operated by the first single quantum gate, and the unitary matrix of the first single quantum gate is a diagonal matrix;
When the quantum logic gate type in the linked list is a second single quantum gate, creating an edge with tensor order of 2 and a vertex connected with the edge; the side is connected with the last vertex of the vertex chain corresponding to the quantum bit operated by the second single quantum gate, and the unitary matrix of the second single quantum gate is a non-diagonal matrix;
when the type of the quantum logic gate in the linked list is a first double quantum gate, creating an edge with tensor order of 2; the side is connected with the last vertex in the vertex chain corresponding to the two quantum bits operated by the first double-quantum gate, and the unitary matrix of the first double-quantum gate is a diagonal matrix;
when the type of the quantum logic gate in the linked list is a second double-quantum gate, creating an edge with the tensor order of 4 and two vertexes connected with the edge; the side is connected with the last vertex in the vertex chain corresponding to the two quantum bits operated by the second double-quantum gate, and the unitary matrix of the second double-quantum gate is a non-diagonal matrix;
and finishing traversing, and finishing adding the vertexes and the edges to obtain the undirected graph corresponding to the target quantum program.
In the process of constructing the undirected graph, when a vertex is created, the vertex is recorded as to what vertex the current quantum logic gate operates on.
In the constructed undirected graph, each quantum bit corresponds to vertex chain information, and the vertex chain information comprises the values of the vertexes from the first vertex to the last vertex, the edge information of vertex connection and vertex identification. The vertex identification uniquely determines a vertex, and according to the identification, the quantum bit to which the corresponding vertex belongs and the value, the connecting side information and the like can be determined; after the value of the vertex is determined, the vertex value is 0 or 1; however, when the value of the vertex is uncertain, the vertex can be either null or any agreed numerical value or character conforming to the value type of the vertex, such as-1, for judging the value condition of the vertex in the implementation process. The value of the vertex can be represented by tensors, variables or other reasonable data types.
The undirected graph also includes tensor information of the edge, and may include a tensor array and an identification of a vertex to which the corresponding edge of the tensor is connected.
Wherein, the vertex of the undirected graph corresponds to the quantum state component of the operated quantum bit before or after the operation of the quantum logic gate, and the values are {0,1}, which corresponds to the variable in the formula (1)The method comprises the steps of carrying out a first treatment on the surface of the The operated quantum bit is the quantum bit corresponding to the quantum logic gate operation. The edges of the undirected graph correspond to quantum logic gates in the target quantum program, specifically, each edge corresponding to the quantum logic gate corresponds to a tensor, the elements in the tensor are jointly determined by unitary matrixes corresponding to the quantum logic gates and vertex values connected with the corresponding edges, and it can be understood that ++ - >A function.
Tensor (Tensor) is a quantity defined in several linear spaces at the same time, which is a generalization of the vector concept and the matrix concept. Each tensor may employ a subscript notation, e.g. tensor T 12 The number of subscripts isThe order of the quantity (rank) represents the dimension of the tensor. For example, the scalar is a 0 th order tensor, the vector is a 1 st order tensor, and the matrix is a 2 nd order tensor. The shape of the tensor refers to the number of elements in each dimension; the number of elements of the tensor is determined by its shape.
For example, tensor B is known 1234 Subscript "1234", tensor of order 4; wherein if 3 elements are in the dimension indicated by the subscript 1, 4 elements are in the dimension indicated by the subscript 2, 5 elements are in the dimension indicated by the subscript 3, and 4 elements are in the dimension indicated by the subscript 4, then tensor B 1234 Shape= [3,4,5,6 ]]The number of elements is as follows:
in the embodiment of the invention, the order of the tensor is equal to the number of the vertexes connected with the corresponding side of the tensor, and the subscript of the tensor is the number of the vertexes in the undirected graph. Since each vertex can only take a value of 0 or 1, there are only two possibilities, so for an n-order tensor, its shape shape= [2,2 … 2]The number of elements is 2 n
And the number of each tensor element can be expressed as a binary number, and each bit of the binary number can be expressed as the value of the corresponding vertex.
For example, edge E m The 4 vertices are connected, so that the corresponding tensor is a 4-order tensor, the elements of which share 2 4 =16, then the element number can be expressed as a binary number: (0000) 2 ~(1111) 2 . When edge E m The 4 connected vertexes are arranged according to a certain sequence to obtain a vertex sequence, when the values of all the four vertexes are O, the combination value of the vertex sequence is 0000, and the (0000) of the corresponding tensor is obtained 2 A bit element; when edge E m When the first vertex is 1, the second vertex is 0, the third vertex is 0, and the fourth vertex is 1, the combined value of the vertex sequence is 1001, and the corresponding tensor (1001) 2 Bit elements.
In practical application, the equivalent logic gate is single quantum logicWhen the gate or the double-quantum logic gate and the corresponding unitary matrix are diagonal matrixes, namely diagonal single-quantum logic gate or diagonal double-quantum logic gate, the quantum logic gate acts on the quantum bit, usually only changes in amplitude are brought, and the quantum state component corresponding to the quantum bit corresponds to the quantum state component in the formula (1)There is typically no change.
Single quantum logic gates of this type, typically Pauli-Z gates (Pauli-Z gates), the unitary matrix of which is
When a qubit is operated on with a bubble-Z gate, the basic state of the qubit, i.e., 0> is left unchanged, and i.1 > is converted to- |1>.
This type of double quantum logic gate, typically a CZ gate, has a unitary matrix of:
when two qubits Q 0 、Q 1 (Q 0 To control bits, Q 1 For the target bit) performs a CZ gate operation, when Q 0 Is |0 in quantum state>When Q is 1 Is unchanged; when Q is 0 Is of the quantum state |1>When Q is 1 Quantum state retention of |0>Unchanged, will be |1>Change to- |1>。
It can be seen that both the bubble-Z gate and the CZ gate bring about only a change in the amplitude of the quantum state component, which is unchanged. Therefore, when the undirected graph is constructed, one quantum logic gate corresponding edge is added to the undirected graph for the quantum logic gates, namely the single quantum logic gate or the double quantum logic gate with unitary matrix being diagonal matrix.
As shown in FIG. 3a, for a diagonal single quantum logic gate, only one edge E is added when constructing an undirected graph 1 The edge has one end and topV 0 Are connected. Wherein V is 0 The current last vertex of the corresponding quantum bit, namely the vertex corresponding to the quantum state directly acted by the quantum logic gate.
It will be appreciated that edge E 1 Only one vertex is connected, so its tensor Is tensor of 1 order, 2 1 =2 elements, the tensor corresponds to +.>And a function representing the contribution of the diagonal single quantum logic gate to the quantum state. Specifically, suppose that unitary matrix of the diagonal single-quantum logic gate +.>Then->Wherein the vertex V 0 When the value is 0, the corresponding +.>(0) 2 Bit element, i.e.)>,V 0 When the value is "1", the tensor (1) is corresponding to the value 2 Bit element, i.e.)>
For example, when edge E 1 Is the corresponding edge of the diagonal single-quantum logic gate Brillouin-Z gate, and the tensor is known according to the unitary matrix of the Brillouin-Z gate
As shown in fig. 3b, for the diagonal double-quantum logic gate, only one edge E is needed to be added when constructing an undirected graph 12 One end of the edge is connected with the vertex V 1 One end is connected with the vertex V 2 Are connected. Wherein V is 1 And V 2 The current last vertex corresponding to the two quantum bits of the diagonal two-quantum logic gate operation is respectively, namely the vertex corresponding to the quantum state directly acted by the diagonal two-quantum logic gate.
It will be appreciated that as shown in FIG. 3b, edge E 12 And two vertexes V 1 And V 2 Connected so that it tensorFor the tensor of order 2, 2 is taken as a total 2 =4 elements, the tensor corresponds to +.>And a function representing the contribution of the diagonal two-quantum logic gate to the quantum state. Specifically, suppose that unitary matrix of the diagonal two-quantum logic gate +. >Then. Wherein when V 1 V 2 When the value is '00', corresponding ++>(00) 2 A bit element; v (V) 1 V 2 When the value is "01", the corresponding ∈Reinforcement is given>(01) 2 A bit element; when V is 1 V 2 When the value is 10, the corresponding +.>(10) 2 A bit element; v (V) 1 V 2 When the value is 11, the corresponding +.>(11) 2 Bit elements. Of course, it is also possible to follow the vertex sequence "V 2 V 1 "takes on values corresponding to the elements in the tensor, if in this order, it is understood that +.>. It should be noted that, a value of the vertex sequence uniquely determines an element in the tensor, but a positional relationship between the value of the vertex sequence and a corresponding element in the tensor is not unique, and may be determined according to actual requirements.
For example, when edge E 12 When the corresponding edge of the diagonal double-quantum logic gate CZ gate is known according to unitary matrix
When the quantum logic gate is a single quantum logic gate and its corresponding unitary matrix is an off-diagonal matrix, i.e., an off-diagonal single quantum logic gate, such as an H gate, its unitary matrix isThrough the operation of the H gate, |0>Will become +.>(|0>+|1>),|1>Will become +.>(|0>-|1>) Both the amplitude and the quantum state components change. When the undirected graph is constructed, the vertex corresponding to the new quantum state and the corresponding edge of the quantum logic gate after the single quantum logic gate is operated are added to the undirected graph.
As shown in fig. 3c, when constructing an undirected graph for a non-diagonal single quantum logic gate, a vertex V is added 4 One edge E 34 One end of the edge is connected with the vertex V 3 One end is connected with the vertex V 4 Are connected. Wherein V is 3 The current last vertex corresponding to the quantum bit operated by the off-diagonal single quantum logic gate, namely the vertex corresponding to the quantum state directly acted by the off-diagonal single quantum logic gate; v (V) 4 And the vertex is a new vertex, and corresponds to a new quantum state after the off-diagonal single quantum logic gate is operated.
It will be appreciated that edge E 34 And two vertexes V 3 And V 4 Connected so that it tensorFor the tensor of order 2, 2 is taken as a total 2 =4 elements, the tensor corresponds to +.>And a function representing the contribution of the off-diagonal single quantum logic gate to the quantum state. Specifically, suppose that unitary matrix of the diagonal single-quantum logic gate +.>Then->. Wherein when V 3 V 4 When the value is '00', the formula is +.>(00) 2 A bit element; v (V) 3 V 4 When the value is "01", the corresponding ∈Reinforcement is given>(01) 2 A bit element; when V is 3 V 4 When the value is 10, the corresponding +.>(10) 2 A bit element; v (V) 3 V 4 When the value is 11, the corresponding +.>(11) 2 Bit elements.
For example, when edge E 34 When the corresponding edge of the non-diagonal single-quantum logic gate H gate is the corresponding edge, tensor can be known according to unitary matrix
When the quantum logic gate is a double-quantum logic gate and the corresponding unitary matrix is an off-diagonal matrix, i.e. an off-diagonal double-quantum logic gate, such as a CNOT gate, the unitary matrix isThrough the operation of the CNOT gate, the quantum state of the control bit is |0>When the quantum state of the controlled bit is unchanged, i.e. the quantum states of the controlled bit and the controlled bit are |00>And |01>After CNOT gate operation, it is still |00>And |01>The method comprises the steps of carrying out a first treatment on the surface of the Control the quantum state of bit to be |1>The quantum state of the controlled bit is then not operated, i.e., 0>Becomes |1>Will be |1>Becomes |0>I.e. the quantum states of the control bit and the controlled bit are |10>Sum |11>After CNOT gate operation, it becomes 11>And |10>. Since two qubits are usually at |00>、|01>、|10>Sum |11>After CNOT gating, it can be seen that both the amplitude and the quantum state components of the two qubits involved will change. For such quantum logic gates, when an undirected graph is constructed, for each quantum bit, one vertex corresponding to a new quantum state of the undirected graph after the operation of the double-quantum logic gate, namely two new vertices, needs to be added to the undirected graph, and a corresponding edge of the double-quantum logic gate needs to be added to the undirected graph.
As shown in the figure (3 d), for the non-diagonal double-quantum logic gate, an edge E is added when constructing an undirected graph 5678 For the sake of clarity, two sides are shown here, but it should be noted that the two sides in the diagram (3 d) are actually the same side E 5678 . Newly added edge one end and vertex V 5 、V 7 One end is connected with the vertex V 6 、V 8 Are connected. Wherein V is 5 And V 6 The current last vertex corresponding to the two quantum bits of the off-diagonal double-quantum logic gate operation is V 7 And V 8 And the peaks corresponding to the new quantum states of the two quantum bits after the off-diagonal double-quantum logic gate operation are respectively obtained. In particular, when the off-diagonal double quantum logic gate is a control logic gate, V 5 、V 7 Corresponding to control bits, V 6 、V 8 Corresponding to the controlled bits.
It will be appreciated that edge E, as shown in FIG. 3d 5678 And four vertexes V 5 、V 6 、V 7 And V 8 Connected so that it tensorFor 4 th order tensors, 2 is taken as a total 4 Let vertex sequence "V =16 elements 5 V 6 V 7 V 8 "values (" 0000 "to" 1111 ") correspond to +.>Middle (0000) 2 ~(1111) 2 Bit elements. Specifically, suppose the unitary matrix of the off-diagonal double-quantum logic gateWherein, when->When (I)>Not all 0, vertex sequence "V 5 V 6 V 7 V 8 Value and->The correspondence of each element in (a) is shown in table 1:
TABLE 1V 5 V 6 V 7 V 8 Value and U 4 Correspondence of elements in (a)
It is known that the number of the components,
for example, when edge E 34 When the corresponding edge of the non-diagonal single-quantum logic gate CNOT gate is, it is known that the unitary matrix is Thus, tensor +.>
It will be appreciated by those skilled in the art that any multiple quantum bit gate may be constructed with a single quantum logic gate plus any double quantum logic gate, in most cases the double quantum logic gate selecting the CNOT gate more. CNOT gates and single quantum logic gates are in a sense prototypes of all other gates. Therefore, the method provided by the application is also suitable for quantum programs with multiple quantum bit gates, and in practical application, the multiple quantum bit gates in the quantum programs can be converted into the combination of single quantum logic gates and double quantum logic gates, and then the single-amplitude quantum computing simulation method provided by the invention is applied.
S203, obtaining a quantum state corresponding to a target single amplitude to be measured, and calculating the sub-amplitude of the quantum state based on the quantum state and the undirected graph and by matching with the GPU corresponding to the calculation node; wherein the sub-amplitude is the amplitude corresponding to the undirected graph;
in practical application, when the simulation of the equivalent calculation involves a plurality of quantum bits, the Dirac sign is directly used>It would be very inconvenient to represent each quantum state in combination with a binary representation method. Thus, the decimal numbers corresponding to the binary representation method are commonly used for representing, such as |000 >I.e. zero state, |0100>I.e. 4 state. It will be appreciated that if the target quantum state component is a decimal number, the host node will convert it to a binary string and then send it to each compute node. Each bit in the binary string corresponds to a value of a quantum bit, and the low-order to the high-order quantum bits correspond to the low-order to the high-order quantum bits in sequence, and it is to be noted that, for a quantum computer, the low-order and the high-order arrangement are all arranged from the low-order to the high-order in the right-to-left order, similar to a classical computer. For example, assuming that the target quantum state component is 5-state, it is converted into binary form, i.e. "101", corresponding to 3 qubits (q in order from low order to high order 0 、q 1 、q 2 ) Then "101" corresponds to "q", respectively 2 q 1 q 0 ”。
Specifically, based on the quantum state and the undirected graph, and in cooperation with the GPU corresponding to the computation node, computing the sub-amplitude of the quantum state may include:
invoking GPU corresponding to the computing node, and respectively performing definite value reduction on tensors of edges connected with specific vertexes of the undirected graph; wherein the specific vertex is the first vertex and the last vertex of the vertex chain corresponding to each quantum bit; deleting the specific vertex; receiving the value of a target vertex distributed by a main node, splitting a current undirected graph based on the value of the target vertex, and calling the GPU to respectively determine value reduction for tensors of connecting edges of the target vertex aiming at each sub undirected graph obtained by splitting; aiming at each vertex in the sub undirected graph, combining the GPU to fuse all the connecting edges of the vertex into a new edge, reducing the tensor of the new edge, and deleting the vertex; integrating tensor values of all new edges after the order reduction to obtain a first sub-amplitude of the quantum state, which corresponds to the sub-undirected graph; and summing the first sub-amplitudes of all sub-undirected graphs in the computing node to obtain the sub-amplitudes of the quantum state.
Finally, all vertices in each sub undirected graph are deleted, leaving only edges with tensor order 0, with the 0-order tensor being a scalar. The tensor values of the edges are multiplied to obtain a first sub-amplitude of the quantum state in the compute node.
For example, for a sub undirected graph, all vertices and their connecting edges are S 12 、S 16 、S 25 、S 3 、S 4
For vertex 1, S 12 、S 16 Fused into a new edge S 126 For tensor A thereof 126 Reducing the order to A 26 The corresponding edge becomes S 26 Deleting the vertex 1;
for vertex 2, the current connecting edge S 25 、S 26 Fused into a new edge S 256 For tensor A thereof 256 Reduced to A 56 Corresponding edge S 56 Deleting the vertex 2;
for vertex 3, only edge S is connected 3 For tensor A thereof 3 Reduced to a=x 3 (scalar) the corresponding edge becomes tensorEdge s with order of 0 3 Deleting the vertex 3;
for vertex 4, only edge S is connected 4 For tensor A thereof 4 Reduced to A =x 4 (scalar) edge s for which the corresponding edge becomes tensor order 0 4 Deleting the vertex 4;
for vertex 5, only edge S is currently connected 56 Unchanged after fusion, for tensor A 56 Reducing the order to A 6 Corresponding edge S 6 Deleting the vertex 5;
for vertex 6, only edge S is currently connected 6 For tensor A thereof 6 Reduced to A ′′ =x 6 (scalar) edge s for which the corresponding edge becomes tensor order 0 6 Vertex 6 is deleted.
Calculating a first sub-amplitude x corresponding to the sub-undirected graph 3 *x 4 *x 6
For another example, for a sub undirected graph, all the current vertices and their connecting edges are S 123 、S 124 、S 15 、S 46
For vertex 1, S 123 、S 124 、S 15 Fused into a new edge S 12345 For tensor A thereof 12345 Reducing the order to A 2345 The corresponding edge becomes S 2345 Deleting the vertex 1;
for vertex 2, only edge S is currently connected 2345 For tensor A thereof 2345 Reduced to A 345 Corresponding edge S 345 Deleting the vertex 2;
for vertex 3, only edge S is connected 345 For tensor A thereof 345 Reduced to A 45 Corresponding edge S 45 Deleting the vertex 3;
for vertex 4, the current connecting edge S 45 、S 46 Fused into a new edge S 456 For tensor A thereof 456 Reduced to A 56 Corresponding edge S 56 Deleting the vertex 4;
for vertex 5, only edge S is currently connected 56 For tensor A thereof 56 Reducing the order to A 6 Corresponding edge S 6 Deleting the vertex 5;
for vertex 6, only edge S is currently connected 6 For tensor A thereof 6 Reduced to b=y 6 (scalar) edge s for which the corresponding edge becomes tensor order 0 6 Vertex 6 is deleted.
Calculating a first sub-amplitude corresponding to the sub-undirected graph as y 6
Specifically, in order to reduce the calculation amount of the subsequent undirected graph, the tensor of the edge connected with the specific vertex of the undirected graph is respectively subjected to definite value reduction, which may be:
calculating GPU corresponding to the node, and setting the number of thread blocks according to the tensor order after the order reduction and the number of threads in each thread block in the GPU aiming at the edge connected with each specific vertex; according to the thread block serial numbers, the thread number in each thread block and the thread serial numbers, calculating first element numbers of the reduced tensors, and calculating two second element numbers of the reduced tensors corresponding to the first element numbers; the number of bits of the element number corresponds to the number of vertex bits connected with the current edge one by one, and the value of each bit of the element number is the vertex value of the corresponding vertex position; determining a second element number with a value on a number bit corresponding to a specific top point position as a preset determined value from the two second element numbers; and acquiring a second element value corresponding to the determined second element number, and determining the second element value as a first element value corresponding to the first element number.
Wherein, the calculation formula (the following is the same) of the first element number is:
Idx=block_id*num+thread_id(2)
idx is the first element number, block_id is the thread block number, num is the number of threads in each thread block, and thread_id is the thread number.
The specific implementation mode of determining the value reduction is as follows: firstly, the tensor elements may be complex numbers, and since the GPU does not (the CPU has) have a representation form of complex number x+yi, a new real part tensor space and an imaginary part tensor space need to be applied, wherein the real part tensor space is specified to store x, and the imaginary part tensor space stores y, so as to characterize the tensor.
Obtain the obtainedThe reduced order n sent by the computing node (CPU) is judged to be 2 n If not, setting 1 thread block if not, otherwise, setting the number of GPU thread blocks to 2 n And/num, the num value satisfies the power of a non-negative integer of 2. Due to the nature of the computer itself, the thread block sequence number block_id and thread sequence number thread_id tend to start numbering from 0.
Taking GPU as GTX1080ti as an example, num=1024 of GTX1080ti, thread_id=0, 1, 2 … … 1023. For example, connecting edge E to vertex number 2 (vertex 2 for short) 13254 Tensor a of 5 th order of (2) 13254 Performing definite value reduction, namely, for A 13254 The 3 rd order (or 3 rd dimension, hereinafter the same) of the determined value is reduced, wherein the dimensions are numbered from right to left, e.g., vertex 1 is 5 th dimension and vertex 4 is 1 st dimension.
Assuming that the determined value is 0 (or 1, which may be set by itself as needed), it means that the vertex 2 takes a value of 0. A first order reduction operation is typically referred to as a first order reduction, so n=4, 2 n Less than 1024, the thread block is 1 block, blcok_id=0. Reduced 4 th order tensor A 1354 Only 16 elements are provided, one thread calculates an element number from the thread 0, and only the threads 0 to 15 are called, wherein the first element number calculation can be as follows: idx=0, 1, 2 … ….
For the first element number 10 (binary 1010), two second element numbers before their corresponding reduction are calculated as follows:
will 2 3-1 -1=3 (binary 11, reduced to the mth order, then calculate 2 m-1 The value of-1, with the aim of splitting the element number) and 10 by bit and, resulting in 2 (binary 10). The bit-wise inversion of 3 (binary 1111 … … 1100 after inversion, the number of binary bits being dependent on the data type, e.g. 64 bits, etc.), and the bit-wise AND operation of 10 results in 8 (1000), and the binary split of 10 is achieved, resulting in the first two and the second two bits of 10. The two second element numbers with the vertex determination values of 0 and 1 corresponding to the subscript 2 before order reduction are respectively: 8 (1000) is shifted left by one bit (10000) and 2 (10) to perform or operation, and 18 (10010) is obtained; the reduced order is 3 rd order, pairs 18 (10010) and 4 (equivalent At 1<<(3-1), i.e., the mth order is reduced, 1 is performed<<The left shift operation of (m-1), m is more than or equal to 1 and less than or equal to n+1) is performed or operated, and 22 (10110) is obtained.
Because the set determination value is 0, that is, the value on the 3 rd bit of the element number corresponding to the 3 rd specific vertex is 0, the bit of the binary element number and the value on the bit are uniformly and correspondingly same with the vertex bit and the vertex value on the vertex point, thereby determining 18 to be the second element number which is required by the first element number 10 to obtain A 13254 Element value p of element number 18 of (2) as reduced A 1354 Is the first element value of element number 10. Similarly, A can be obtained by calculation 1354 The first element values corresponding to the rest first element numbers finally obtain the tensor A after the reduction 1354 ={p 0 ,p 1 ,p 2 …p…p 14 ,p 15 }。
Then, the original reduced order pre-tensor A is released 13254 The occupied video memory.
Specifically, to reduce the computational complexity, one or more values of the target vertices equally divided by the master node may be received; wherein the target vertexes are the first m vertexes with the largest number of connected edges in the current undirected graph, and the m vertexes comprise 2 m Seed value, wherein the value of each vertex is 0 or 1, m is a positive integer, and the number of the calculation nodes is 2 n N is a positive integer, and n is greater than 0 and less than or equal to m; splitting an undirected graph of the computing node into one or more sub undirected graphs aiming at each vertex value which is evenly divided; and traversing the edges connected with the target vertexes aiming at each sub undirected graph, and calling the GPU to respectively perform definite value reduction on tensors of the edges connected with the target vertexes.
The number of the computing nodes is a preset value, namely n is a preset value, the number of the computing nodes is related to written quantum programs and available computing resources correspondingly configured, and the value m is preset according to requirements. If n=1 and m=1, for the 1 vertices with the largest number of connected edges, the vertex value is originally uncertain, and there are 2 types: 0 or 1. And distributing the 2 value conditions to 2 computing nodes for computing respectively, wherein the vertex value of the undirected graph in one computing node is determined to be 0, so as to obtain one sub undirected graph, the vertex in the undirected graph in the other computing node is determined to be 1, so as to obtain the other sub undirected graph, and splitting the undirected graph is realized.
For another example, n=1, m=2, and there are 4 cases: 00. 01, 10, 11, then 2 computing nodes equally divide 4 values of the top 2 vertices with the largest connecting edges: 00. 01, 10, 11, one computing node is divided into 00, 01, and the other computing node is divided into 10, 11. Equivalently, the undirected graph in one computing node is split into 2 sub undirected graphs, wherein the values of the 2 vertexes in one sub undirected graph are respectively 0 and 0, and the values in the other sub undirected graph are respectively 0 and 1. Similarly, the values of the two vertices in the undirected graph of the other computing node are respectively 1 and 0, and the values of the two vertices in the undirected graph of the other computing node are respectively 1 and 1.
The method is characterized in that the method is used for standing at the angles of all computing nodes, each computing node originally comprises the same undirected graph, so that the undirected graph of each computing node is split into one or more different undirected sub-graphs, each undirected sub-graph can be regarded as a path, first sub-amplitudes corresponding to each path in one computing node are calculated, and all first sub-amplitudes in the interior are summed to obtain sub-amplitudes of a quantum state corresponding to the current node, so that the idea of a path integration method (FeymannPath Integral) is reflected, and the method is not based on a unitary matrix transformation method, because the memory occupation of the sub-undirected graph is exponentially increased.
Taking n=1 and m=2 as an example, the most connected edges in the current undirected graph are vertex 3 and vertex 5. All edges connected to vertex 3 or 5 are assumed to be E 3241 、E 361 、E 32 、E 354 、E 125 、E 45 、E 56 Traversing the edges, calling the GPU subordinate to the computing node with 00 values, respectively carrying out definite value reduction on tensors of the edges in the undirected graph, wherein the principle is the same as that of respectively carrying out definite value reduction on tensors of the edges connected with specific vertexes of the undirected graph, and the method is exemplified as follows again:
for edge E 3241 Corresponding to 4 th order tensor T 3241 In the case of taking the value of 00 for the vertices 3 and 5, since the vertex 5 is not connected to the edge, only T is needed 3241 The 4 th order of the vertex 3 of (2) is subjected to definite value reduction. The GPU still applies for new real part tensor and imaginary part tensor to obtain reduced order n sent by the affiliated computing node (CPU), and judges 2 n If not, setting 1 thread block if not, otherwise, setting the number of GPU thread blocks to 2 n /num。
Taking GTX1080ti as an example, num=1024, thread_id=0, 1, 2 … … 1023 of GTX1080 ti. The vertex 3 takes a value of 0, i.e. the determined value is 0. A first order reduction operation is typically referred to as a first order reduction, so n=3, 2 n Less than 1024, the thread block is 1 block, blcok_id=0. Reduced 3-order tensor T 241 Only 8 elements are provided, one thread calculates an element number from the thread 0, and only threads 0 to 7 are called, wherein the first element number is calculated by using the formula (2): idx=0, 1, 2 … … 7.
For the first element number 5 (binary 101), two second element numbers before their corresponding reduction are calculated as follows:
for the 4 th order, 2 4-1 -1=7 (binary 111) and 5, yielding 5 (binary 101). And performing bit inversion on 7 (binary 1111 … … after inversion), performing AND operation on the sum 5 to obtain 0, and realizing binary splitting on the sum 5. The numbers of the two second elements with the determined values of 0 and 1 of the vertex 3 corresponding to the subscript 3 before order reduction are respectively as follows: 0 shift left by one bit and 5 to perform or operation to obtain 5 (0101); for 5 (0101) and 8 (1000) (corresponding to 1) <<(4-1), i.e., the mth order is reduced, 1 is performed<<The left shift operation of (m-1), where 1.ltoreq.m.ltoreq.n+1), gives 13 (1101).
Since each binary bit value of the element number represents the value of the corresponding vertex, 0101 represents T 3241 Edge E of (2) 3241 The values of the vertexes 3, 2, 4 and 1 are sequentially 0, 1, 0 and 1, so that the finally required second element number with the first element number 5 as 5 is determined, and T is obtained 3241 The second element value with element number of 5 is used as T after the reduction 241 First element value w of element number 5 in 5 . The same can be calculated to obtain the first element values corresponding to the rest of the first element numbers, and finally the first element values are obtainedReduced tensor T 243 ={w 0 ,w 1 ,w 2 ,w 3 ,w 4 ,w 5 ,w 6 ,w 7 }。
Then, the original reduced order pre-tensor T is released 3241 The occupied video memory.
In addition, for edge E 354 In the case of the vertex 3, 5 of (3) taking a value of 00, since each reduction is only one, the edge E can be opposite first 354 Tensor T of (1) 354 The 3 rd order of the falling vertex 3 is T 54 At T 54 The 2 nd order of the falling peak 5 is T 4
The reduced order calculation of the edges in the other undirected graph and the value condition is the same as the above, and the description is omitted.
Specifically, the fusing all the connection edges of the vertex with the GPU into a new edge includes:
determining a first edge and a second edge to be fused for all the connecting edges of the vertex; according to the vertex which is not connected with the first edge in the second edge, calling the GPU to upgrade the first tensor of the first edge, and updating the first tensor by the tensor after upgrade; deleting the second edge, and connecting the vertex which is not connected with the first edge in the second edge to the first edge to obtain a fused middle edge; invoking the GPU to calculate tensor elements of the middle edge according to the corresponding relation between the recorded vertex numbers of the first edge and the second edge; and returning to the execution step of determining the first edge and the second edge to be fused until the calculated tensor element is the tensor element of the last edge, and determining the last edge as a new fused edge.
In practical applications, the tensor of one edge is generally raised for the fusion of any two edges, so that the edge with the largest order in all the connected edges of the vertex is generally selected as the first edge.
It is understood that the second edge is an unfused edge of the remaining edges except the first edge.
Specifically, one side can be directly selected from the remaining sides as the second side; the connection edges of the vertices may be ordered according to the order of the vertices, but not limited to the order from small to large or from large to small, and then the edge with the largest order is determined as the first edge, and the edge with the largest order in the remaining unfused edges is determined as the second edge.
For example, vertex V n The total of 4 connected edges are respectively: e (E) n1 、E n2 、E n3 、E n4 The corresponding orders are respectively: 3. 2, 4 and 2. Firstly, ordering edges according to the order from big to small to obtain: e (E) n3 、E n1 、E n2 、E n4 (or E) n3 、E n1 、E n4 、E n2 ) It can be seen that E n3 The maximum order is determined as the first edge, and E is the three edges which are not fused n1 The order is the largest and is determined to be the second edge.
And comparing the vertexes of the first edge and the second edge, determining the vertexes which are connected with the second edge but are not connected with the first edge, then executing the ascending operation on the first tensor according to the determined vertexes, and updating the first tensor with the new tensor after ascending.
For example, assume that the vertex number is 2 and the corresponding first edge is E 12 Connecting vertices 1 and 2, the corresponding first tensor is A 12 = {1,2,3,4}; the second side is E 23 Connecting vertices 2 and 3, corresponding to a second tensor B 23 = {5,6,7,8}. Comparing the vertices connected by the two sides, it can be seen that vertex 3 is a vertex connected to the second side and disconnected from the first side. Accordingly, will A 12 Ascending order A 123 A is known from the following 123 = {1,2,3,4}, then the first tensor corresponding to the first edge is updated to a 123
Wherein the ascending computation of the first tensor for the first edge is performed by the GPU, the principle may be as follows:
the GPU performs ascending order on the first tensor of the first edge, and the tensor order t=r+s after ascending order can be calculated according to the order r and the ascending order s of the first tensor, wherein r and s are positive integers; according to the tensor order t after the rising orderAnd the number num of threads in each thread block in the GPU, setting the number of thread blocks, wherein if 2 t Setting 1 thread block smaller than num, otherwise setting the number of GPU thread blocks to 2 n /num; according to the thread block sequence number block_id, the thread number num and the thread sequence number thread_id in each thread block, calculating a first element number idx of the tensor after the step-up, wherein a calculation formula is the same as the formula (2); the element of the tensor after the ascending is calculated from the first element number idx, the ascending order s and the element of the first tensor.
Continuing taking GTX1080ti as an example, receiving from a computing node ascending order instruction information, a first tensor A before ascending order 12 = {1,2,3,4}, which corresponds to element numbers 0, 1,2,3 (corresponding to binary 00, 01, 10, 11), rising first order to a 123 R=2, s=1, t=3. Judgment 2 t Less than num (1024), configure 1 thread block, block_id=0. As in the previous case, thread_id=0, 1 … … 1023, due to a 123 Only 8 elements are needed, and only threads 0-7 are required to be called, and each thread is calculated by a formula (2) to obtain A 123 Idx of (a) is 0, 1,2 … … 7 in this order. Assigning a value to the element whose first element number is idx: dst [ idx ]]=src[idx/2 s ]A representing the first element number idx 123 Element value dst [ idx ]]Equal to element number idx/2 s Is a whole-up part of 12 Element value, rounding off refers to removing the integer portion remaining after the decimal point.
From A 123 Idx/2 is calculated by idx of (2) s =idx/2, 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, in order, obtainable:
dst[0]=src[0]=1,dst[1]=src[0.5]=src[0]=1;
dst[2]=src[1]=2,dst[3]=src[1.5]=src[1]=2;
dst[4]=src[2]=3,dst[5]=src[2.5]=src[2]=3;
dst[6]=src[3]=4,dst[7]=src[3.5]=src[3]=4。
thereby obtaining A 123 = {1,2,3, 4}. Similarly, if pair A 12 Raise 2 nd order to A 1234 Can obtain A 1234 ={1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4}。
Then the first tensor before the rising step is taken as a tensor A of 10 steps 12345678910 For example, = {1,2,3 … … 1024} corresponds to element numbers 0, 1,2 … … 1023 (corresponding to binary 0000000000, 0000000001 … … 1111111111), rising first order to a 1234567891011 R=10, s=1, t=11. Judgment 2 t Greater than num (1024), configuration 2 t The number of per num, namely 2 thread blocks, the block_id=0 and 1, the thread number thread_id in each thread block is 0 and 1 … … 1023, and the number of tensor elements after the step-up is 2 11 The idx obtained by calculating the number 0 thread block is sequentially 0, the idx obtained by calculating the number 1 thread block and the number 2 … … 1023,1 thread block is sequentially 1024 and 1025 … … 2047, namely A 1234567891011 Idx of (2) is 0 to 2047.
Calculation of idx/2 s =idx/2, 0, 0.5, 1, 1.5 … … 1023, 1023.5, in order, yields:
dst[0]=src[0]=1,dst[1]=src[0.5]=src[0]=1;
dst[2]=src[1]=2,dst[3]=src[1.5]=src[1]=2;
……
dst[2046]=src[1023]=1024,dst[2047]=src[1023.5]=src[1023]=1024。
thereby obtaining A 1234567891011 ={1,1,2,2……1024,1024}。
And after the tensor after the rising step is obtained, releasing the video memory occupied by the original first tensor.
Still take the first edge as E 12 The second side is E 23 For example, the first tensor is updated to A 123 After that, delete the second edge E 23 Connecting the vertex 3 which is not connected with the first edge in the second edge to the first edge to obtain a fused intermediate edge E 123 . Then according to the corresponding relation between the recorded vertex numbers of the first edge and the second edge, invoking the GPU to calculate the tensor C of the middle edge 123 Is an element of (a).
The corresponding relation can be obtained by a first tensor A 123 And a second tensor B 23 And obtaining the product. Wherein A is 123 The corresponding vertex sequence is "123", B 23 The corresponding vertex sequence is "23". It can be seen that vertex 2 is at A 123 Corresponding vertexIn the sequence is the 2 nd bit in B 23 The corresponding vertex sequence is 1 st bit, and vertex number 3 is A 123 The corresponding vertex sequence is the 3 rd bit, and the vertex sequence is B 23 The corresponding vertex sequence is the 2 nd bit, and the position corresponding relation is recorded. Specifically, the element structure of the mask array may be stored in an array, for example, in the mask array: struct { vertex at A 123 Positions in the corresponding vertex sequence, vertex at B 23 Position in the corresponding vertex sequence }.
Specifically, the steps of calculating tensor elements of the middle edge may be as follows:
the GPU firstly applies for a plurality of groups of arrays, copies the Maskarray in the CPU of the computing node to the GPU and stores the Maskarray in the array;
setting the number of thread blocks according to the updated order q of the first tensor and the number num of threads in each thread block in the GPU, if 2 q Setting 1 thread block less than num, otherwise setting 2 q A num thread block; according to the thread block sequence number block_id, the thread number num and the thread sequence number thread_id in each thread block, calculating a first element number idx of tensor of the middle edge, wherein a calculation formula is the same as the formula (2);
determining corresponding elements of each element in the first tensor in a second tensor of the second edge according to the corresponding relation stored in the array; each element in the first tensor is traversed, and the element is updated with the product of the element and its corresponding element in the second tensor.
With a first tensor A 123 = {1, 2,3, 4} and second tensor B 23 For example = {5,6,7,8}, the corresponding relationship indicates that the vertices 2 are respectively at a 123 And B 23 The 2 nd bit and the 1 st bit in the corresponding vertex sequence, the vertex numbers 3 are respectively A 123 And B 23 Bits 3 and 2 in the corresponding vertex sequence.
First, the position number of the first tensor element is expressed as a binary number (value corresponding to the vertex "123"): (000) 2 、(001) 2 、(010) 2 、(011) 2 、(100) 2 、(101) 2 、(110) 2 、(111) 2 The position numbers of the elements in the second tensor are expressed as binary numbers (values corresponding to the vertices "23"): (00) 2 、(01) 2 、(10) 2 、(11) 2 At tensor A according to vertices 2 and 3, respectively 123 And B 23 The corresponding position correspondence in the corresponding vertex sequence can determine A 123 Each element in B 23 The corresponding elements in (a) are shown in table 1:
table 1A 123 Each element in B 23 Corresponding element in (a)
Wherein, in the second row of Table 1, the binary numbers are underlined to clarify B 23 The middle vertex is at A 123 The values in the position numbers of each element are more clear for the convenience of explanation and are not meant in any limiting sense.
Each element in the first tensor is traversed, multiplied by the corresponding element in the second tensor, and the product is then used to update the element.
In A way 123 And B 23 For example, A is first 123 Each element of (a) and B 23 The multiplication of the corresponding elements in (a) can be obtained as the following products in turn: 5. 6, 14, 16, 15, 18, 28, 32, and then updating A with these products 123 Obtaining the middle edge E 123 Tensor C of (2) 123 ={5,6,14,16,15,18,28,32}。
The reduced order computation of tensors for the new edge is performed by the GPU, and the flow may be as follows:
firstly, the GPU sets the number of thread blocks according to the tensor order of the new edge after the order is reduced;
secondly, according to the serial numbers of the thread blocks, the number of threads in each thread block and the serial numbers of the threads, calculating the first element numbers of the reduced tensors, and calculating two second element numbers of the first element numbers corresponding to the reduced tensors; the number of bits of the element number corresponds to the number of vertex bits connected with the current edge one by one, and the value of each bit of the element number is the vertex value of the corresponding vertex position.
It should be noted that, the two steps are consistent with the calculation principle of the step corresponding to the step of determining the value reduction, and are not described herein.
And then, obtaining two second element values corresponding to the two second element numbers one by one, summing the two second element values, and determining the sum as a first element value corresponding to the first element number.
It can be seen that, compared with the calculation principle of determining value reduction, the difference is that determining value reduction requires finding out the second element number L1 with the value of the preset determined value on the number bit corresponding to a certain top point from the two second element numbers L1 and L2, so as to obtain the second element value corresponding to the second element number L1 as the first element value corresponding to the first element number, and the reduction calculation requires adding the two second element values corresponding to the two second element numbers L1 and L2 respectively, so as to obtain the first element value corresponding to the first element number.
Referring to the specific implementation manner and example of the above definite value reduction, assume a 5-order tensor A 13254 The rest conditions are consistent for the tensor of a new edge after the final fusion of the vertex 2. For the first element number 10 (binary 1010), two second element numbers 18 (10010), 22 (10110) before their corresponding reduction are calculated. Because the vertex 2 at the 3 rd order has an undetermined value (which can be 0 or 1), the two element numbers 18 and 22 are the corresponding element numbers of 10, and the element values p and p corresponding to the element numbers 18 and 22 are obtained Adding the obtained sums p ′′ I.e. the first element value of the first element number 10, and so on, to finally obtain A 13254 Reduced tensor A 1354
For another example, assuming that the vertex is vertex 2, a new edge E is obtained by merging all the connected edges 1234 The tensor is A 1234 = {5,5,6,6,14,14,16,16,15,15,18,18,28,28,32,32}. The corresponding vertex sequence of the new edge tensor is '1234', and a new vertex sequence '134' is obtained except the vertex 2, and the value of the new vertex sequence is equal to A 1234 Correspondence of elements in (3)The relationship is shown in Table 2:
table 2 values and A of the new vertex sequence "134 1234 Correspondence of elements in (3)
Thus, to the target edge E 1234 After the order is reduced based on the vertex 2, deleting the vertex 2 to obtain a new edge E after the order is reduced 134 Corresponding tensor A 134 ={19,19,22,22,43,43,50,50}。
And S204, returning the sub-amplitudes to the main nodes of the distributed clusters, so that the main nodes reduce the sub-amplitudes to obtain the amplitudes of the quantum states as target single amplitudes.
The reduction is data reduction, namely, the data quantity is reduced to the greatest extent on the premise of keeping the original appearance of the data as much as possible. The main node sums all the sub-amplitudes by reducing the sub-amplitudes calculated by each calculation node to obtain the target single amplitude of the measured quantum state.
The existing CPU microarchitecture is designed for high efficiency of instruction execution, and has strong logic processing (instruction execution) performance and high efficiency, but the number of GPU threads is more (hundreds or thousands), so that the CPU is focused on large-scale concurrent computation, and the numerical computation efficiency is generally about 5 to 10 times higher than that of the CPU. Based on this, in the embodiment of the present invention, the related main calculation tasks include determining the tensor element calculation of the intermediate edge of the value reduction, rising and fusion processes, etc., which are all distributed to the GPUs subordinate to each calculation node CPU for execution, and the CPU mainly executes the logic processing task, and the two tasks cooperate with each other, so that the single-amplitude quantum calculation simulation efficiency is at a higher level.
Therefore, the method can only calculate one target single amplitude of the concerned quantum bit at a time, specifically map the target quantum program onto the undirected graph, split the undirected graph onto a plurality of calculation nodes by combining a path integration method, and calculate the corresponding sub undirected graph by each calculation node and the subordinate GPU in a matching way. The whole calculation process is mainly based on simple operation of elements in tensors, compared with full-amplitude simulation based on unitary matrix in the prior art, the requirement on memory is greatly reduced, and the calculated amount does not rise along with the quantum bit index, so that quantum calculation simulation involving 50 or more quantum bits can be realized; the GPU has stronger performance of executing a large amount of parallel computation, so that the overall quantum computation simulation efficiency is higher. At present, the technical scheme provided by the embodiment of the invention can realize quantum computing simulation involving 196 quantum bits at maximum.
In addition, in practical application, only one or more of the full amplitudes of the qubits are sometimes needed, in which case, if the full amplitude mode in the prior art is adopted, that is, all the amplitudes are simulated at one time, the waste of resources such as a memory and time is undoubtedly caused; by applying the method provided by the embodiment of the invention, one or more times of simulation can be performed in a targeted manner, and one or more single amplitudes are needed to be simulated, so that resources and time are saved greatly.
While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.

Claims (10)

1. A single amplitude quantum computing simulation method, the method comprising:
obtaining a target quantum program for each computing node of the distributed cluster;
constructing an undirected graph corresponding to the target quantum program; the vertex of the undirected graph represents the quantum state of the operated quantum bit before or after the quantum logic gate in the target quantum program is operated, and one edge of the undirected graph correspondingly represents one tensor of the quantum logic gate in the target quantum program;
Obtaining a quantum state corresponding to a target single amplitude to be measured, carrying out definite value reduction on tensors of edges of the undirected graph based on the quantum state and the undirected graph and matching with a GPU corresponding to the computing node until all new edges with the tensor order of 0 after reduction are obtained, and calculating the sub-amplitude of the quantum state based on the tensors of the new edges; wherein the sub-amplitude is the amplitude corresponding to the undirected graph;
and returning the sub-amplitudes to the main nodes of the distributed clusters, so that the main nodes reduce the sub-amplitudes to obtain the amplitudes of the quantum states, and the amplitudes are used as target single amplitudes.
2. The single amplitude quantum computing simulation method of claim 1, wherein: the constructing the undirected graph corresponding to the target quantum program comprises the following steps:
analyzing the target quantum program to obtain a linked list of recorded quantum program information;
traversing the linked list, and creating an edge with tensor order of 1 when the type of the quantum logic gate in the linked list is a first single quantum gate; the side is connected with the last vertex of the vertex chain corresponding to the quantum bit operated by the first single quantum gate, and the unitary matrix of the first single quantum gate is a diagonal matrix;
When the quantum logic gate type in the linked list is a second single quantum gate, creating an edge with tensor order of 2 and a vertex connected with the edge; the side is connected with the last vertex of the vertex chain corresponding to the quantum bit operated by the second single quantum gate, and the unitary matrix of the second single quantum gate is a non-diagonal matrix;
when the type of the quantum logic gate in the linked list is a first double quantum gate, creating an edge with tensor order of 2; the side is connected with the last vertex in the vertex chain corresponding to the two quantum bits operated by the first double-quantum gate, and the unitary matrix of the first double-quantum gate is a diagonal matrix;
when the type of the quantum logic gate in the linked list is a second double-quantum gate, creating an edge with the tensor order of 4 and two vertexes connected with the edge; the side is connected with the last vertex in the vertex chain corresponding to the two quantum bits operated by the second double-quantum gate, and the unitary matrix of the second double-quantum gate is a non-diagonal matrix;
and obtaining an undirected graph corresponding to the target quantum program.
3. The single amplitude quantum computing simulation method of claim 2, wherein: determining value reduction is carried out on tensors of edges of the undirected graph on the basis of the quantum state and the undirected graph and in cooperation with the GPU corresponding to the computing node until all new edges with the tensor order of 0 after the reduction are obtained, and sub-amplitudes of the quantum state are calculated on the basis of tensors of the new edges, wherein the method comprises the following steps:
Invoking GPU corresponding to the computing node, and respectively performing definite value reduction on tensors of edges connected with specific vertexes of the undirected graph; wherein the specific vertex is the first vertex and the last vertex of the vertex chain corresponding to each quantum bit;
deleting the specific vertex;
receiving the value of a target vertex distributed by the main node, splitting a current undirected graph based on the value of the target vertex, and calling the GPU to respectively determine value reduction for tensors of connecting edges of the target vertex aiming at each sub undirected graph obtained by splitting;
aiming at each vertex in the sub undirected graph, combining the GPU to fuse all the connecting edges of the vertex into a new edge, reducing the tensor of the new edge, and deleting the vertex;
integrating tensor values of all new edges after the order reduction to obtain a first sub-amplitude of the quantum state, which corresponds to the sub-undirected graph;
and summing the first sub-amplitudes of all sub-undirected graphs in the quantum state to obtain the sub-amplitudes of the quantum state.
4. A single amplitude quantum computing simulation method according to claim 3, wherein: and respectively carrying out definite value reduction on tensors of edges connected with specific vertexes of the undirected graph, wherein the method comprises the following steps:
The GPU corresponding to the computing node sets the thread block number according to the tensor order after the order reduction and the thread number in each thread block in the GPU aiming at the edge connected with each specific vertex;
according to the thread block serial numbers, the thread number in each thread block and the thread serial numbers, calculating first element numbers of the reduced tensors, and calculating two second element numbers of the reduced tensors corresponding to the first element numbers; the number of bits of the element number corresponds to the number of vertex bits connected with the current edge one by one, and the value of each bit of the element number is the vertex value of the corresponding vertex position;
determining a second element number with a value on a number bit corresponding to a specific top point position as a preset determined value from the two second element numbers;
and acquiring a second element value corresponding to the determined second element number, and determining the second element value as a first element value corresponding to the first element number.
5. The single amplitude quantum computing simulation method of claim 4, wherein: the method includes the steps that the value of a target vertex distributed by a main node is received, a current undirected graph is split based on the value of the target vertex, and the GPU is called for each sub undirected graph obtained through splitting to respectively determine value reduction of tensors of connecting edges of the target vertex, wherein the steps include:
Receiving one or more values of target vertexes equally divided by a main node; wherein the target vertexes are the first m vertexes with the largest number of connected edges in the current undirected graph, and the m vertexes comprise 2 m Seed value, wherein the value of each vertex is 0 or 1, m is a positive integer, and the number of the calculation nodes is 2 n N is a positive integer, and n is greater than 0 and less than or equal to m;
splitting an undirected graph of the computing node into one or more sub undirected graphs aiming at each vertex value which is evenly divided;
and traversing the edges connected with the target vertexes aiming at each sub undirected graph, and calling the GPU to respectively perform definite value reduction on tensors of the edges connected with the target vertexes.
6. A single amplitude quantum computing simulation method according to claim 3, wherein: the step of fusing all the connection edges of the vertex into a new edge by matching with the GPU comprises the following steps:
determining a first edge and a second edge to be fused for all the connecting edges of the vertex;
according to the vertex which is not connected with the first edge in the second edge, calling the GPU to upgrade the first tensor of the first edge, and updating the first tensor by the tensor after upgrade;
deleting the second edge, and connecting the vertex which is not connected with the first edge in the second edge to the first edge to obtain a fused middle edge;
Invoking the GPU to calculate tensor elements of the middle edge according to the corresponding relation between the recorded vertex numbers of the first edge and the second edge;
and returning to the execution step of determining the first edge and the second edge to be fused until the calculated tensor element is the tensor element of the last edge, and determining the last edge as a new fused edge.
7. The single amplitude quantum computing simulation method of claim 6, wherein: the step-up the first tensor of the first edge includes:
the GPU calculates the tensor order after the step-up according to the order of the first tensor and the step-up;
setting the number of thread blocks according to the tensor order after the ascending and the number of threads in each thread block in the GPU;
according to the thread block serial numbers, the number of threads in each thread block and the thread serial numbers, calculating the first element numbers of tensors after the step-up;
and calculating the element of the tensor after the step-up according to the first element number, the step-up number and the element of the first tensor.
8. The single amplitude quantum computing simulation method of claim 6, wherein: the computing tensor elements of the intermediate edge includes:
The GPU sets the number of thread blocks according to the updated order of the first tensor and the number of threads in each thread block in the GPU;
according to the thread block serial numbers, the number of threads in each thread block and the thread serial numbers, calculating the first element numbers of tensors of the middle edges;
determining corresponding elements of each element in the first tensor in a second tensor of the second edge according to the corresponding relation;
each element in the first tensor is traversed, and the element is updated with the product of the element and its corresponding element in the second tensor.
9. The single amplitude quantum computing simulation method of claim 6, wherein: the reducing the tensor of the new edge includes:
the GPU sets the number of thread blocks according to the tensor order of the new edge after the order is reduced;
according to the thread block serial numbers, the thread number in each thread block and the thread serial numbers, calculating first element numbers of the reduced tensors, and calculating two second element numbers of the reduced tensors corresponding to the first element numbers; the number of bits of the element number corresponds to the number of vertex bits connected with the current edge one by one, and the value of each bit of the element number is the vertex value of the corresponding vertex position;
And obtaining two second element values corresponding to the two second element numbers one by one, summing the two second element values, and determining the sum as a first element value corresponding to the first element number.
10. The single amplitude quantum computing simulation method according to any one of claims 4-5 and 7-9, wherein: the calculation formula of the first element number is as follows:
Idx=block_id*num+thread_id
wherein Idx is the first element number, block_id is the thread block number, num is the number of threads in each thread block, and thread_id is the thread number.
CN201910394102.9A 2019-05-13 2019-05-13 Single-amplitude quantum computing simulation method Active CN111931939B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910394102.9A CN111931939B (en) 2019-05-13 2019-05-13 Single-amplitude quantum computing simulation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910394102.9A CN111931939B (en) 2019-05-13 2019-05-13 Single-amplitude quantum computing simulation method

Publications (2)

Publication Number Publication Date
CN111931939A CN111931939A (en) 2020-11-13
CN111931939B true CN111931939B (en) 2024-02-09

Family

ID=73282551

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910394102.9A Active CN111931939B (en) 2019-05-13 2019-05-13 Single-amplitude quantum computing simulation method

Country Status (1)

Country Link
CN (1) CN111931939B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114757357B (en) 2020-12-29 2023-06-02 合肥本源量子计算科技有限责任公司 Quantum circuit amplitude estimation method, device, storage medium and electronic device
CN114764549B (en) * 2020-12-31 2023-04-25 合肥本源量子计算科技有限责任公司 Quantum circuit simulation calculation method and device based on matrix product state
CN114692880B (en) * 2020-12-31 2023-09-05 本源量子计算科技(合肥)股份有限公司 Quantum state amplitude simulation method and device in quantum circuit
CN113569511A (en) * 2021-06-11 2021-10-29 清华大学 Quantum circuit simulation method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103038956A (en) * 2010-03-19 2013-04-10 多伦多大学董事局 Amplitude and phase modulation of a laser by modulation of an output coupler
CN105960651A (en) * 2013-12-05 2016-09-21 微软技术许可有限责任公司 A method and system for computing distance measures on a quantum computer
CN108833353A (en) * 2018-05-18 2018-11-16 中南大学 Quantum Byzantine agreement method based on tripartite participation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8811763B2 (en) * 2007-12-06 2014-08-19 The United States Of America As Represented By The Secretary Of The Army Method and system for producing image frames using quantum properties
US8849580B2 (en) * 2010-07-26 2014-09-30 The University Of Vermont Uses of systems with degrees of freedom poised between fully quantum and fully classical states

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103038956A (en) * 2010-03-19 2013-04-10 多伦多大学董事局 Amplitude and phase modulation of a laser by modulation of an output coupler
CN105960651A (en) * 2013-12-05 2016-09-21 微软技术许可有限责任公司 A method and system for computing distance measures on a quantum computer
CN108833353A (en) * 2018-05-18 2018-11-16 中南大学 Quantum Byzantine agreement method based on tripartite participation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Classical Simulation of Intermediate-Size Quantum Circuits;Jianxin Chen 等;《arXiv:1805.01450v2》;第1-12页 *
单量子比特量子计算中的全局相位;魏永鑫 等;《福建师范大学学报(自然科学报)》;第29卷(第3期);第42-46页 *

Also Published As

Publication number Publication date
CN111931939A (en) 2020-11-13

Similar Documents

Publication Publication Date Title
CN111931939B (en) Single-amplitude quantum computing simulation method
CN111914378B (en) Single-amplitude quantum computing simulation method and device
CN109800883B (en) Quantum machine learning framework construction method, device and quantum computer
JP7186797B2 (en) Method and system for quantum computing
JP7389268B2 (en) Quantum state preparation circuit generation method, device, chip, equipment and program
Gorlatch Systematic efficient parallelization of scan and other list homomorphisms
KR20200088475A (en) Simultaneous training of functional networks of neural networks
KR102108342B1 (en) A graph upscaling method for preserving graph properties and operating method thereof
CN111915011B (en) Single-amplitude quantum computing simulation method
JP7381723B2 (en) Quantum operation execution method and device, quantum operation control waveform generation method and device, quantum operation chip, computer device and program
CN115577787B (en) Quantum amplitude estimation method, device, apparatus and storage medium
CN115066589A (en) Distributed tensor network contraction scheme based on dynamic ordering segmentation
CN113222159B (en) A method and device for determining a quantum state
JP7372996B2 (en) Node grouping method, device and electronic equipment
CN115577789B (en) Quantum entanglement degree determination method, device, equipment and storage medium
WO2024007652A1 (en) Accelerated solving method for large sparse matrix, system, and storage medium
CN118922841A (en) Sparsity masking method for neural network training
CN117371546A (en) Matrix product state quantum Fourier transform simulation method and system based on DCU acceleration
CN115456184B (en) Quantum circuit processing method, quantum state preparation device, quantum state preparation equipment and quantum state preparation medium
CN117313879B (en) Quantum circuit processing method, device and electronic equipment
CN117313877B (en) Quantum circuit processing method and device and electronic equipment
WO2022211655A1 (en) A processor and a method for performing tensor network contraction in a quantum simulator
CN113128015A (en) Method and system for predicting resources required by single-amplitude analog quantum computation
Hoppe et al. A modular massively parallel computing environment for three-dimensional multiresolution simulations of compressible flows
CN117521829B (en) Quantum circuit simulation method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 230008 6th floor, building E2, phase II, venture industrial park, high tech Zone, Hefei City, Anhui Province

Applicant after: Benyuan Quantum Computing Technology (Hefei) Co.,Ltd.

Address before: 230008 6th floor, building E2, phase II, venture industrial park, high tech Zone, Hefei City, Anhui Province

Applicant before: ORIGIN QUANTUM COMPUTING COMPANY, LIMITED, HEFEI

CB02 Change of applicant information
GR01 Patent grant
GR01 Patent grant