[go: up one dir, main page]

CN119155037A - Parallel point adding method of elliptic curve based on AVX512IFMA - Google Patents

Parallel point adding method of elliptic curve based on AVX512IFMA Download PDF

Info

Publication number
CN119155037A
CN119155037A CN202410953512.3A CN202410953512A CN119155037A CN 119155037 A CN119155037 A CN 119155037A CN 202410953512 A CN202410953512 A CN 202410953512A CN 119155037 A CN119155037 A CN 119155037A
Authority
CN
China
Prior art keywords
point
data
point data
coordinate value
avx512ifma
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410953512.3A
Other languages
Chinese (zh)
Inventor
姜蕊
彭聪
冯琦
罗敏
何德彪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202410953512.3A priority Critical patent/CN119155037A/en
Publication of CN119155037A publication Critical patent/CN119155037A/en
Pending legal-status Critical Current

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/30Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy
    • H04L9/3066Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy involving algebraic varieties, e.g. elliptic or hyper-elliptic curves
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3887Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/12Details relating to cryptographic hardware or logic circuitry
    • H04L2209/125Parallelization or pipelining, e.g. for accelerating processing of cryptographic operations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Algebra (AREA)
  • Mathematical Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Complex Calculations (AREA)

Abstract

The application discloses a parallel point adding method of an elliptic curve based on AVX512IFMA, and belongs to the technical field of computers. The method comprises the steps of determining each coordinate of a plurality of pairs of point data participating in parallel point addition, determining a target number domain corresponding to the plurality of pairs of point data, determining a large number operation rule corresponding to the target number domain, determining a target point addition formula of the plurality of pairs of point data based on the relation between Z coordinate value Z 1 of the point data and Z coordinate value Z 2 of the added point data in each pair of point data, converting the coordinate value of the plurality of pairs of point data into the target data corresponding to a data structure of AVX512IFMA, and processing the target data based on the target point addition formula and the large number operation rule to obtain a point addition result of the plurality of pairs of point data. The method realizes 8X 1 paths of parallel acceleration of elliptic curve point addition, and can obtain great performance optimization and improvement.

Description

Parallel point adding method of elliptic curve based on AVX512IFMA
Technical Field
The application belongs to the technical field of computers, and particularly relates to a parallel point adding method of an elliptic curve based on AVX512 IFMA.
Background
In modern cryptography, elliptic curve cryptography (Elliptic Curve Cryptography, ECC) has been widely used in various fields of information security, such as digital signature, blockchain, SSL/TLS protocols, etc., due to its high security and low computational complexity. Elliptic curve Point Addition (Point Addition) and Point multiplication (Point Multiplication) are basic operations of ECC, whose efficiency directly affects the performance of the entire cryptosystem. Therefore, how to optimize the speed of these basic operations, especially the point-add operation, is a hot problem in research.
Conventional ECC point-and-addition implementations typically rely on large integer operations. Since the points on the elliptic curve are represented as large integer pairs, these operations involve a large number of modulo addition, modulo subtraction, modulo multiplication, etc. Although the computational power of modern processors has increased significantly, there is still a performance bottleneck facing large-scale elliptic curve point operations. In addition, the traditional method has low utilization rate in the aspect of parallel computing, and the advantages of the modern multi-core processor cannot be fully exerted.
AVX512IFMA (AVX-512 Integer Fused Multiply-Add) is a single instruction multiple data (SIMD, single Instruction Multiple Data) instruction set proposed by Intel corporation, which is part of the AVX-512 instruction set. It provides efficient vectorized integer arithmetic capability, in particular supporting Fused Multiply-Add (Fused Multiply-Add), i.e., simultaneous Multiply and Add operations. The operation can obviously reduce the instruction number and improve the operation speed. In addition, the AVX512IFMA supports 512-bit vector registers, so that each instruction can process more data, and the parallel computing efficiency is further improved.
Today, multiple cipher implementations work has begun to utilize SIMD to accelerate operations, such as in encryption, digital signature, TLS protocols, to increase computational efficiency and response speed. The BLS12-381 curve is an elliptic curve with high security intensity, and is widely applied to the field of cryptography, in particular to zero knowledge proof systems and public key cryptography. But there is currently no optimization scheme for the AVX512IFMA instruction set for this curve.
Therefore, there is no problem in the related art that the point addition calculation in the elliptic curve is optimized by using the AVX512IFMA instruction set, and no scheme for realizing the calculation acceleration of the point addition of the elliptic curve by using the parallel data processing advantage of the AVX512IFMA instruction set exists at present.
Disclosure of Invention
The present application is directed to solving at least one of the technical problems existing in the related art. Therefore, the application provides a parallel point adding method of elliptic curve based on AVX512IFMA, which utilizes the characteristic that AVX512IFMA single instruction multiple data instruction sets can process multiple data elements of the same type simultaneously in the same instruction period, realizes 8X 1 paths of parallel acceleration on elliptic curve point adding part, and can obtain great performance optimization and improvement.
In a first aspect, the present application provides a parallel point adding method for elliptic curve based on AVX512IFMA, the method comprising:
determining each coordinate of a plurality of pairs of point data participating in parallel point addition, wherein each pair of point data comprises point addition data and point data to be added;
Determining a target number domain corresponding to the multi-point data, wherein the target number domain is a finite domain of an elliptic curve or a secondary expansion domain of the finite domain;
determining a large number operation rule corresponding to the target number domain;
Determining a target point formulation of the multi-point data based on a relationship between a Z coordinate value Z 1 of the point data and a Z coordinate value Z 2 of the added point data in each of the multi-point data;
Converting coordinate values of the multi-point data into target data corresponding to a data structure of the AVX512 IFMA;
And processing the target data based on the target point addition formula and the large number operation rule to obtain a point addition result of the multi-point data.
According to the parallel point addition method of the elliptic curve based on the AVX512IFMA, the characteristics that a plurality of data elements of the same type can be processed simultaneously in the same instruction period by utilizing a single-instruction multi-data instruction set are utilized, the bottom 8X 1 path finite field operation and the upper 8X 1 path point addition operation are realized on the elliptic curve, meanwhile, as the bottom majority operation rule is determined based on a target number field, the point addition formulas are different along with the change of the coordinate relation of point data of the participating point addition calculation, a plurality of types of point addition operation interfaces can be provided, and the AVX512IFMA instruction set is used for realizing 8X 1 path parallel acceleration on the elliptic curve point addition part, so that the great performance optimization promotion can be obtained.
According to one embodiment of the present application, the relationship between the Z coordinate value Z 1 of the added point data and the Z coordinate value Z 2 of the added point data in each pair of the pairs of point data includes three types of data:
z 1=Z2=1、Z1≠Z2 = 1, and Z 1≠Z2.
According to one embodiment of the present application, converting coordinate values of multi-point data into target data corresponding to a data structure of AVX512IFMA includes:
converting each coordinate value of the coordinates of the plurality of pairs of point data into a coordinate value vector corresponding to a target operand base of AVX512 IFMA;
and forming a vector set by a plurality of coordinate value vectors corresponding to each coordinate value of each point data to obtain target data.
According to one embodiment of the present application, converting each coordinate value of coordinates of a plurality of pairs of point data into a coordinate value vector corresponding to a target operand base of AVX512IFMA includes:
determining the data type corresponding to the multi-point data;
if the data type is a standard element corresponding to the finite field where the elliptic curve is located, converting each coordinate value in the coordinates of the multiple pairs of point data into a coordinate value vector corresponding to a target operation base of AVX512IFMA based on a preset rule;
If the data type is a preset data type expressed on the Montgomery domain, each coordinate value in coordinates of the plurality of pairs of point data is converted into a coordinate value vector corresponding to a target operation radix of AVX512IFMA based on a Montgomery domain conversion rule corresponding to the target number domain.
According to one embodiment of the present application, if the data type is a standard element corresponding to a finite field where the elliptic curve is located, converting each coordinate value in coordinates of multiple pairs of point data into a coordinate value vector corresponding to a target operation base of AVX512IFMA based on a preset rule includes:
for any coordinate value F in the coordinate values of the plurality of pairs of point data, determining a coordinate value vector F corresponding to a target operation base of AVX512IFMA and corresponding to any coordinate value F by using the following formula, wherein the data type of any coordinate value F is a standard element corresponding to a finite field in which the elliptic curve is located:
f=f0+252f1+2104f2+2156f3+2208f4+2260f5+2312f6+2364f7
F=[f0,f1,f2,f3,f4,f5,f6,f7]
Wherein f i<252 is more than or equal to 0 and i 8,i is more than or equal to 0, and the target operation base number is 2 52.
According to an embodiment of the present application, forming a vector set of a plurality of coordinate value vectors corresponding to each coordinate value of each point data includes:
the vector set V is determined by the following formula:
The coordinate value vectors corresponding to any coordinate of any point data are F a,Fb,Fc,Fd,Fe,Ff,Fg and F h, any column of the vector set V corresponds to one coordinate value vector, and the i-th row of the vector set V corresponds to one register vector vi=[fi a,fi b,fi c,fi d,fi e,fi f,fi g,fi h],0≤i<8,i as an integer.
In a second aspect, the present application provides a parallel point adding device based on an elliptic curve of AVX512IFMA, the device comprising:
the first determining module is used for determining each coordinate in coordinates of a plurality of pairs of point data participating in parallel point addition, wherein each pair of point data comprises point data and point data to be added;
The second determining module is used for determining a target number domain corresponding to the multi-point data, wherein the target number domain is a finite domain of an elliptic curve or a secondary expansion domain of the finite domain;
A third determining module, configured to determine a big number operation rule corresponding to the target number domain;
A fourth determining module, configured to determine a target point formulation of the multi-point data based on a relationship between a Z coordinate value Z 1 of the point data and a Z coordinate value Z 2 of the added point data in each of the multi-point data;
The data arrangement module is used for converting coordinate values of the multi-point data into target data corresponding to a data structure of the AVX512 IFMA;
and the data processing module is used for processing the target data based on the target point addition formula and the big number operation rule to obtain a point addition result of the multi-point data.
According to the parallel point adding device of the elliptic curve based on the AVX512IFMA, the characteristics that a plurality of data elements of the same type can be processed simultaneously in the same instruction period by utilizing a single-instruction multi-data instruction set are utilized, the bottom 8X 1 path finite field operation and the upper 8X 1 path point adding operation are realized on the elliptic curve, meanwhile, as the bottom majority operation rule is determined based on a target number field, the point adding formulas are different along with the change of the coordinate relation of point data of the participating point adding calculation, a plurality of types of point adding operation interfaces can be provided, and the AVX512IFMA instruction set is used for realizing 8X 1 path parallel acceleration on the elliptic curve point adding part, so that the great performance optimization promotion can be obtained.
In a third aspect, the present application provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the parallel point addition method based on an elliptic curve of AVX512IFMA according to the first aspect as described above when the computer program is executed by the processor.
In a fourth aspect, the present application provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a parallel point-adding method for an AVX512 IFMA-based elliptic curve according to the first aspect described above.
In a fifth aspect, the present application provides a chip, the chip comprising a processor and a communication interface, the communication interface being coupled to the processor, the processor being configured to execute a program or instruction to implement a parallel point addition method based on an elliptic curve of AVX512IFMA as in the first aspect.
In a sixth aspect, the application provides a computer program product comprising a computer program which, when executed by a processor, implements the AVX512 IFMA-based elliptic curve parallel point-adding method of the first aspect as described above.
Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a schematic flow chart of a parallel point addition method of an elliptic curve based on AVX512IFMA according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a parallel point-adding device based on an elliptic curve of AVX512IFMA according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a parallel point-to-point computing engine based on an elliptic curve of AVX512IFMA according to an embodiment of the present application;
fig. 4 is a schematic hardware diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions of the embodiments of the present application will be clearly described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which are obtained by a person skilled in the art based on the embodiments of the present application, fall within the scope of protection of the present application.
The terms first, second and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged, as appropriate, such that embodiments of the present application may be implemented in sequences other than those illustrated or described herein, and that the objects identified by "first," "second," etc. are generally of a type, and are not limited to the number of objects, such as the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/", generally means that the associated object is an "or" relationship.
The parallel point adding method based on the elliptic curve of the AVX512IFMA, the parallel point adding device based on the elliptic curve of the AVX512IFMA, the electronic device and the readable storage medium provided by the embodiment of the application are described in detail below by specific embodiments and application scenes thereof with reference to the accompanying drawings.
The parallel point addition method based on the elliptic curve of the AVX512IFMA can be applied to the terminal, and can be specifically executed by hardware or software in the terminal.
The execution main body of the parallel point adding method of the elliptic curve based on the AVX512IFMA provided by the embodiment of the application can be the electronic equipment or the functional module or the functional entity of the parallel point adding method of the elliptic curve based on the AVX512IFMA, the electronic device in the embodiment of the application includes, but is not limited to, a mobile phone, a tablet computer, a camera, a wearable device and the like, and the parallel point adding method based on the elliptic curve of the AVX512IFMA provided in the embodiment of the application is described below by taking the electronic device as an execution main body as an example.
First, the following description is made of the coincidence and corresponding definitions that need to be used in the present application:
The finite field of the BLS12-381 elliptic curve, where p is the modulus of the BLS12-381 elliptic curve.
Finite field of BLS12-381 elliptic curveIs a secondary domain of the (c).
R: BLS12-381 elliptic curve.
P i coordinates of elliptic curve point i, expressed as (X, Y, Z), P i ε E where 1.ltoreq.i.ltoreq.n, n being the order of the elliptic curve.
V i a 512 bit vector.
V is vector set, which is composed of 8V i (0.ltoreq.i < 8) and can represent 8 big numbers.
A, R, C: three different sets of vectors.
Q is a vector set composed of modulus.
Elliptic curve point groups on the upper surface.
Elliptic curve point groups on the upper surface.
PADD * elliptic curve point-add, when Z 1=Z2=1,Z1 and Z 2 are the Z coordinate values of the two points participating in the point-add, respectively.
PADD, elliptic curve point-added, when Z 1≠Z2=1,Z1 and Z 2 are the Z coordinates of the two points, respectively.
PADD + elliptic curve point-added, when Z 1≠Z2,Z1 and Z 2 are the Z coordinates of the two points, respectively.
As shown in FIG. 1, the parallel point adding method of the elliptic curve based on the AVX512IFMA comprises the steps of 110, 120, 130, 140, 1 and 160.
Step 110, determining each coordinate of a plurality of pairs of point data participating in parallel point addition, wherein each pair of point data comprises point data and point data to be added.
It should be noted that the AVX512IFMA can operate on multiple sets of data in parallel by using one set of instructions, and the AVX512IFMA instruction set is similar in nature to a vector processor that can perform the same operation on one set of data (also referred to as a "data vector") simultaneously, thereby achieving spatial parallelism. Since the AVX-512IFMA instruction set supports multiplication and addition operations on integers in each 64-bit lane of two registers, 8×1-way data arrangement can be realized, that is, at most, 8-way point additions of the same type of point data can be processed simultaneously in the same instruction cycle, therefore, the pairs of the multi-point data on the elliptic curve participating in parallel point additions cannot exceed 8 at a time, and since the AVX-512IFMA instruction set realizes 8×1-way data arrangement, when the pairs of the point data participating in parallel point additions cannot reach 8, for example, only 5-way point data participate in parallel point additions, 0-setting processing can be adopted on the data of the remaining 3 ways, but in general, in order to realize efficient parallel point addition processing, 8-way point data participate in parallel point additions are arranged as much as possible in each instruction cycle. For example, the same type of dot data P1, Q1, P2, Q2, P3, Q3, P4, Q4, P5, Q5, P6, Q6, P7, Q7, P8, and Q8 on the elliptic curve, if dot data P1 and dot data Q1 are to be added, dot data P2 and dot data Q2 are to be added, dot data P3 and dot data Q3 are to be added, dot data P4 and dot data Q4 are to be added, dot data P5 and dot data Q5 are to be added, dot data P6 and dot data Q6 are to be added, dot data P7 and dot data Q7 are to be added, dot data P8 and dot data Q8 are to be added, the above 8 dot data calculation may be processed simultaneously in the same instruction period, and two dot data involved in dot calculation may be divided into two data types, one being the dot data is to be added, and the other is the dot data to be added, and the dot data Q6 is used for the dot data.
It should be noted that, in this embodiment of the present application, the elliptic curve is a specific curve BLS12-381 curve, so it may be determined that the coordinate values of the point data on the elliptic curve are in most cases standard binary large integers represented by 381 bits, and if the determined point data on the elliptic curve is from a special database and the data in the special database is the data type expressed on the montgomery domain, then the montgomery domain conversion is also required for the point data on the elliptic curve participating in point addition, which will be described in detail later.
And 120, determining a target number field corresponding to the plurality of pairs of point data, wherein the target number field is a finite field of an elliptic curve or a secondary expansion field of the finite field.
It should be noted that, the point-adding calculation is performed on the point data on the BLS12-381 curve, which can be divided into two kinds of point data in the target number domain, one is the finite domain of the elliptic curve BLS12-381Elliptic curve point group onThe other is the finite field of elliptic curve BLS12-381Is spread by the second order of (2)Elliptic curve point group onIs included in the data of the point in the image. For different target number domains, the corresponding large number operation rules of the bottom number domain operation layer are also different, namely for a finite fieldThe large number operations of the method comprise modulo addition, modulo subtraction, modulo multiplication addition, modulo multiplication and modulo square, and all have a set of operation rules corresponding to AVX512IFMA, and are applied to the secondary spread domainThe above large number operation also comprises the large number operation of modulo addition, modulo subtraction, modulo multiplication addition, modulo multiplication and modulo square, and is based on finite fieldAn additional set of operation rules extended on the basis of the rules of the large number operation.
Step 130, determining a big number operation rule corresponding to the target number domain.
It will be appreciated that since the target number field includes two types, the finite fields of elliptic curves BLS12-381, respectivelyAnd the finite field of elliptic curve BLS12-381Is spread by the second order of (2)The two number domains are respectively provided with a corresponding large number operation rule.
Step 140, determining a target point formulation of the multi-point data based on the relationship between the Z coordinate value Z 1 of the point data and the Z coordinate value Z 2 of the added point data in each of the multi-point data.
It should be noted that, when performing point addition calculation on point data on the BLS12-381 curve, different relationships between Z coordinate values of the added point data and the added point data may result in different point addition calculation formulas. For example, the coordinates of the added point data P are (X1, Y1, Z1), the coordinates of the added point data Q are (X2, Y2, Z2), and if Z 1=Z2 =1, the calculation formula of the coordinates (X3, Y3, Z3) of the point-added result point data R obtained by the point-adding calculation is as follows:
H=X2-X1
HH=H2
I=4*HH
J=H*I
r=2*(Y2-Y1)
V=X1*I
X3=r2-J-2*V
Y3=r*(V-X3)-2*Y1*J
Z3=2*H
In the case of Z 1≠Z2 =1, the calculation formula of the coordinates (X3, Y3, Z3) of the dot-added calculation result dot data R is as follows:
Z1Z1=Z12
U2=X2*Z1Z1
S2=Y2*Z1*Z1Z1
H=U2-X1
HH=H2
I=4*HH
J=H*I
r=2*(S2-Y1)
V=X1*I
X3=r2-J-2*V
Y3=r*(V-X3)-2*Y1*J
Z3=(Z1+H)2-Z1Z1-HH
in the case of Z 1≠Z2, the calculation formula of the coordinates (X3, Y3, Z3) of the dot-added calculation result dot data R is as follows:
Z1Z1=Z12
Z2Z2=Z22
U1=Xl*Z2Z2
U2=X2*Z1Z1
S1=Y1*Z2*Z2Z2
S2=Y2*Z1*Z1Z1
H=U2-U1
I=(2*H)2
J=H*I
r=2*(S2-S1)
V=U1*I
X3=r2-J-2*V
Y3=r*(V-X3)-2*S1*J
Z3=((Z1+Z2)2-Z1Z1-Z2Z2)*H
therefore, there is a need to determine the relationship between the Z coordinate values of the added point data and the added point data.
Step 150, converting the coordinate values of the multi-point data into target data corresponding to the data structure of the AVX512 IFMA.
It should be noted that, for the conversion of the data structure of the coordinate values of the multi-point data, it is necessary to perform the conversion according to the characteristics of the AVX512IFMA, and in view of the fact that the AVX-512IFMA instruction set supports multiplication and addition operations on the 52-bit unsigned integers in each 64-bit lane of the two registers, 2 52 can be naturally used as the operation base, each coordinate value in the coordinates of the multi-point data is converted into a coordinate value vector with the operation base of 2 52, and then a plurality of coordinate value vectors corresponding to each coordinate value of each point data form a vector set, so as to obtain the target data.
And 160, processing the target data based on the target point addition formula and the large number operation rule to obtain a point addition result of the multi-point data.
It will be appreciated that, no matter what the target point formula is, the target point formula needs to call the underlying large number operation rule to perform the basic operation, so after the target point formula is determined, and a set of underlying large number operation rules determined by the target number domain to which the point data participating in the point-to-point calculation belongs are determined, a corresponding set of underlying large number operation rules is called based on the target point formula to process the multi-point data.
According to the parallel point addition method of the elliptic curve based on the AVX512IFMA, the characteristics that a plurality of data elements of the same type can be processed simultaneously in the same instruction period by utilizing a single-instruction multi-data instruction set are utilized, the bottom 8X 1 path finite field operation and the upper 8X 1 path point addition operation are realized on the BLS121-381 curve, meanwhile, as the bottom most calculation rule is determined based on a target number field, and the point addition formulas are different according to the change of the coordinate relation of point data calculated along with the participation point addition, a plurality of types of point addition operation interfaces can be provided, and the 8X 1 path parallel acceleration can be realized on the elliptic curve point addition part by utilizing the AVX512IFMA instruction set, so that the great performance optimization improvement can be obtained.
In some embodiments, the relationship between the Z coordinate value Z 1 of the point data and the Z coordinate value Z 2 of the added point data in each of the plurality of pairs of point data includes three types of data:
z 1=Z2=1、Z1≠Z2 = 1, and Z 1≠Z2.
Specifically, the point data on the BLS121-381 curve is generally represented using elliptic curve points in the jacobian coordinate system, and thus, coordinates of the point data on the BLS121-381 curve include three coordinate values of X, Y and Z, and a relationship between the Z coordinate values of two point data of participating points on the BLS121-381 curve determines a corresponding point addition formula. For example, the coordinates of the added point data P are (X1, Y1, Z1), the coordinates of the added point data Q are (X2, Y2, Z2), and if Z 1=Z2 =1, the calculation formula of the coordinates (X3, Y3, Z3) of the point-added result point data R obtained by the point-adding calculation is as follows:
H=X2-X1
HH=H2
I=4*HH
J=H*I
r=2*(Y2-Y1)
V=X1*I
X3=r2-J-2*V
Y3=r*(V-X3)-2*Y1*J
Z3=2*H
In the case of Z 1≠Z2 =1, the calculation formula of the coordinates (X3, Y3, Z3) of the dot-added calculation result dot data R is as follows:
Z1Z1=Z12
U2=X2*Z1Z1
S2=Y2*Z1*Z1Z1
H=U2-X1
HH=H2
I=4*HH
J=H*I
r=2*(S2-Y1)
V=X1*I
X3=r2-J-2*V
Y3=r*(V-X3)-2*Y1*J
Z3=(Z1+H)2-Z1Z1-HH
in the case of Z 1≠Z2, the calculation formula of the coordinates (X3, Y3, Z3) of the dot-added calculation result dot data R is as follows:
Z1Z1=Z12
Z2Z2=Z22
U1=X1*Z2Z2
U2=X2*Z1Z1
S1=Y1*Z2*Z2Z2
S2=Y2*Z1*Z1Z1
H=U2-U1
I=(2*H)2
J=H*I
r=2*(S2-S1)
V=U1*I
X3=r2-J-2*V
Y3=r*(V-X3)2*S1*J
Z3=((Z1+Z2)2-Z1Z1-Z2Z2)*H
therefore, there is a need to determine the relationship between the Z coordinate values of the added point data and the added point data.
It will be appreciated that since there are two types of point data for the target number field, the finite fields of the elliptic curves BLS12-381, respectivelyElliptic curve point group onIs included in the elliptic curve BLS12-381, and is limited to the elliptic curve BLSIs spread by the second order of (2)
Elliptic curve point group onIn addition to 3 different point addition formulas corresponding to 3 different situations of the relationship between the two point Z coordinates of the participating point addition, 2×3=6 different point addition operations may actually be implemented, that is, a calculation engine (i.e., a device) corresponding to the parallel point addition method of the elliptic curve based on AVX512IFMA provided by the embodiment of the present invention may include point addition formulas of 3 points of two groups of elliptic curves BLS12-381, that is, 6 different point addition operation interfaces are provided.
In some embodiments, step 150 may include:
converting each coordinate value of the coordinates of the plurality of pairs of point data into a coordinate value vector corresponding to a target operand base of AVX512 IFMA;
and forming a vector set by a plurality of coordinate value vectors corresponding to each coordinate value of each point data to obtain target data.
It should be noted that, for each pair of point data, including two kinds of point data including point data and point data to be added, each kind of point data is represented by three coordinate values X, Y and Z, each coordinate value is a big number, and since the AVX-512IFMA instruction set supports multiplication and addition operations on 52-bit unsigned integers in each 64-bit channel of two registers, each coordinate value is first converted into a coordinate value vector with an operation base of 2 52.
And rearranging a plurality of coordinate value vectors corresponding to each coordinate value of each point data to construct a vector set, wherein the point data and the added point data are provided with three coordinate values of X, Y and Z, so that 6 vector sets are provided. For convenience of explanation of parallel processing of 8×1 data, the multi-pair point data participating in parallel point addition is set to 8-pair point data here and hereinafter. For example, the same type of point data P1, Q1, P2, Q2, P3, Q3, P4, Q4, P5, Q5, P6, Q6, P7, Q7 on an elliptic curve, P8 and Q8, if the point addition of the point addition data P1 (X P1,YP1,ZP1) and the point addition data Q1 (X Q1,YQ1,ZQ1), the point addition of the point addition data P2 (X P2,YP2,ZP2) and the point addition data Q2 (X Q2,YQ2,ZQ2), the point addition of the point addition data P3 (X P3,YP3,ZP3) and the point addition data Q3 (X Q3,YQ3,ZQ3), the point addition of the point addition data P4 (X P4,YP4,ZP4) and the point addition data Q4 (X Q4,YQ4,ZQ4), the point addition of the point addition data P5 (X P5,YP5,ZP5) and the point addition data Q5 (X Q5,YQ5,ZQ5), the point addition of the point addition data P6 (X P6,YP6,ZP6) and the point addition data Q6 (X Q6,YQ6,ZQ6), the point addition of the point addition data P7 (X P7,YP7,ZP7) and the point addition data Q7 (X Q7,YQ7,ZQ7), the point addition data P8 (X P8,YP8,ZP8) and the point addition data Q8 (X Q8,YQ8,ZQ8) are first converted from a total of XP1,YP1,ZP1,XQ1,YQ1,ZQ1,XP2,YP2,ZP2,XQ2,YQ2,ZQ2,XP3,YP3,ZP3,XQ3,YQ3,ZQ3,XP4,YP4,ZP4,XQ4,YQ4,ZQ4,XP5,YP5,ZP5,XQ5,YQ5,ZQ5,XP6,YP6,ZP6,XQ6,YQ6,ZQ6,XP7,YP7,ZP7,XQ7,YQ7,ZQ7,XP8,YP8,ZP8,XQ8,YQ8,ZQ8 X2=48 to a vector of coordinate values 52, respectively, for example, the converted coordinate value vectors are VXP1,VYP1,VZP1,VXQ1,VYQ1,VZQ1,VXP2,VYP2,VZP2,VXQ2,VYQ2,VZQ2,VXP3,VYP3,VZP3,VXQ3,VYQ3,VZQ3,VXP4,VYP4,VZP4,VXQ4,VYQ4,VZQ4,VXP5,VYP5,VZP5,VXQ5,VYQ5,VZQ5,VXP6,VYP6,VZP6,VXQ6,VYQ6,VZQ6,VXP7,VYP7,VZP7,VXQ7,VYQ7,VZQ7,VXP8,VYP8,VZP8,VXQ8,VYQ8,VZQ8; respectively, then the 8 coordinate value vectors corresponding to each coordinate value of each point data are rearranged to construct a vector set, which is equivalent to constructing a vector set for the 8 coordinate value vectors of the X coordinate value in the point data, constructing a vector set for the 8 coordinate value vectors of the Y coordinate value in the point data, constructing a vector set for the 8 coordinate value vectors of the Z coordinate value in the point data, constructing a vector set for the 8 coordinate value vectors of the X coordinate value in the point data, constructing a vector set for the 8 coordinate value vectors of the Y coordinate value in the point data, constructing a vector set for 8 coordinate value vectors of Z coordinate value in the added point data, 6 vector sets in total, followed by the above example, namely constructing 6 vector sets V1, V2, V3, V4, V5 and V6, wherein ,V1=<VXP1,VXP2,VXP3,VXP4,VXP5,VXP6,VXP7,VXP8>,V2=<VYP1,VYP2,VYP3,VYP4,VYP5,VYP6,VYP7,VYP8>,V3=<VZP1,VZP2,VZP3,VZP4,VZP5,VZP6,VZP7,VZP8>,V4=<VXQ1,VXQ2,VXQ3,VXQ4,VXQ5,VXQ6,VXQ7,VXQ8>,V5=<VYQ1,VYQ2,VYQ3,VYQ4,VYQ5,VYQ6,VYQ7,VYQ8>,V6=<VZQ1,VZQ2,VZQ3,VZQ4,VZQ5,VZQ6,VZQ7,VZQ8>.
In some embodiments, converting each coordinate value of the coordinates of the plurality of pairs of point data into a coordinate value vector corresponding to a target operand base of AVX512IFMA includes:
determining the data type corresponding to the multi-point data;
if the data type is a standard element corresponding to the finite field where the elliptic curve is located, converting each coordinate value in the coordinates of the multiple pairs of point data into a coordinate value vector corresponding to a target operation base of AVX512IFMA based on a preset rule;
If the data type is a preset data type expressed on the Montgomery domain, each coordinate value in coordinates of the plurality of pairs of point data is converted into a coordinate value vector corresponding to a target operation radix of AVX512IFMA based on a Montgomery domain conversion rule corresponding to the target number domain.
It should be noted that, since the obtained point data to be added may originate from databases of different data structures, it is also necessary to determine the data type of the point data to be added and calculated first, which generally corresponds to the elliptic curve BLS12-381, and the obtained point data to be added and calculated is 381-bit binary, then each coordinate value in the coordinate values of 8 pairs of point data may be converted into a coordinate value vector with an operation base of 2 52 directly based on the preset rule, but there are other cases where the point data to be added and calculated is from a database of a specific montgomery domain, and therefore, for such other cases, each coordinate value in the coordinate values of 8 pairs of point data is converted into a coordinate value vector with an operation base of 2 52 based on the montgomery domain conversion rule corresponding to the target number domain. For example, in most cases, modern computer architectures are 64-bit, and 2 64 is often used as the operand base in large-number operations, but in the present application, the operand base of AVX512IFMA is 2 52, so that when performing a point-add operation, a montgomery domain conversion may be required. The conversion of one domain is essentially a Montgomery multiplication process, e.g., a is a large number of 381 bits, R 1 is a first Montgomery domain, R 2 is a second Montgomery domain, and the conversion of aR 1 to aR 2 is performed only once on R 2 domain by performing aR 1 and aRIn the finite fieldMontgomery multiplication on, converting aR 2 to aR 1, only by performing one time on R 2 domain with aR 2 and R 1 in finite domainMontgomery multiplications thereon.
In some embodiments, if the data type is a standard element corresponding to the finite field where the elliptic curve is located, converting each coordinate value in coordinates of the multiple pairs of point data into a coordinate value vector corresponding to a target operation radix of AVX512IFMA based on a preset rule includes:
for any coordinate value F in the coordinate values of the plurality of pairs of point data, determining a coordinate value vector F corresponding to a target operation base of AVX512IFMA and corresponding to any coordinate value F by using the following formula, wherein the data type of any coordinate value F is a standard element corresponding to a finite field in which the elliptic curve is located:
f=f0+252f1+2104f2+2156f3+2208f4+2260f5+2312f6+2364f7
F=[f0,f1,f2,f3,f4,f5,f6,f7]
Wherein f i<252 is more than or equal to 0 and i 8,i is more than or equal to 0, and the target operation base number is 2 52.
Specifically, the 8×1 way data arrangement of the point data of the participating point addition is realized based on the AVX-512IFMA instruction set, and first, the point data of 381 bit integers is expressed by using an operation base 2 52. Since the AVX-512IFMA instruction set supports multiply and add operations on 52-bit unsigned integers within each 64-bit lane of two registers, 2 52 can naturally be used as the radix. For example, in the BLS12-381 curve, the 381-bit integer f represented using radix 2 52 may be represented as follows, where 0.ltoreq.f i<252 and 0.ltoreq.i <8.
f=f0+252f1+2104f2+2156f3+2208f4+2260f5+2312f6+2364f7.
In some embodiments, forming a vector set from a plurality of coordinate value vectors corresponding to each coordinate value of each point data includes:
the vector set V is determined by the following formula:
The coordinate value vectors corresponding to any coordinate of any point data are F a,Fb,Fc,Fd,Fe,Ff,Fg and F h, any column of the vector set V corresponds to one coordinate value vector, and the i-th row of the vector set V corresponds to one register vector vi=[fi a,fi b,fi c,fi d,fi e,fi f,fi g,fi h],0≤i<8,i as an integer.
Specifically, in the 8×1-way parallel computation of the present application, the main data structure is vector set V, which consists of 8 vectors V i (0+.i < 8). Each v i consists of eight elements with radix 2 52, which can be stored in a 512-bit register for computation. Here again, the logarithmic setting for the multi-point data involved in parallel point addition is exactly equal to 8, thus for eight integersThe vector set V is defined as follows, where each vector vi=[fi a,fi b,fi c,fi d,fi e,fi f,fi g,fi h].
Based on this data structure, the present application can implement basic finite field operation in 8×1 mode using AVX-512 IFMA. If the number of pairs of point data participating in parallel point addition is smaller than 8, for example, 5 pairs of point data are subjected to parallel point addition calculation, then F a,Fb,Fc,Fd and F e in the vector set V sequentially correspond to 5 coordinate value vectors, respectively, which are determined based on any coordinate value of any point data in the 5 pairs of point data, and the remaining vectors F f,Fg and F h in the vector set V are processed with 0.
Continuing with the description of the large number operations for the different target number fields, the finite field is described firstThe large number operation includes modular addition, modular subtraction, modular multiplication addition, modular multiplication and modular square.
For modulo addition, modulo subtraction, and modulo multiple addition operations: modulo addition, which can be denoted as C++A+B mod Q, includes three steps: first, a and B are added, and the result of their sum is stored in C; next, Q is subtracted from C, but some lanes in C may appear negative, thus creating a 512-bit mask vector in which each 64-bit element corresponds to 8 integers in C, the corresponding 64-bit element being set to all 1's if the integer in the lane is negative, and all 0's if it is non-negative, bitwise and-ing Q with the mask vector, and then adding Q to the negative integer in C instead of the negative integer plus 0 has no effect on the result. Based on a similar principle of the above steps, modulo subtraction C≡A-B mod Q involves only two steps, first subtracting Y from X, then processing the negative part with a mask. Modulo addition is the same as modulo addition, except that the X and Y addition portions of the first stage are shifted left.
Modular multiplication and modular square, where modular multiplication uses the method of Montgomery multiplication, is divided into two stages, integer multiplication and Montgomery reduction. Montgomery multiplications use a modified coarse-integration hybrid scanning (CIHS) method. The modular square is the same as the reduced part of the modular multiplication, and the calculation part is changed from multiplication to square.The above operation is based onIs calculated by finite field operation. For the followingThe 8 multiplied by 1 path can be accelerated by using Karatuba algorithm, and the result is thatThe number of multiplications is reduced from four to three, and for modulo squaring, one can reduceThe number of multiplications is reduced from three to two.
According to the parallel point adding method of the elliptic curve based on the AVX512IFMA provided by the embodiment of the application, the execution main body can be a parallel point adding device of the elliptic curve based on the AVX512 IFMA. In the embodiment of the application, the parallel point adding device based on the elliptic curve of the AVX512IFMA is taken as an example to execute the parallel point adding method based on the elliptic curve of the AVX512IFMA, and the parallel point adding device based on the elliptic curve of the AVX512IFMA provided by the embodiment of the application is described.
The embodiment of the application also provides a parallel point adding device of the elliptic curve based on the AVX512 IFMA.
As shown in fig. 2, the parallel point adding apparatus for an elliptic curve based on AVX512IFMA includes a first determining module 210, a second determining module 220, a third determining module 230, a fourth determining module 240, a data arranging module 250 and a data processing module 260, wherein,
A first determining module 210, configured to determine each coordinate of a plurality of pairs of point data participating in parallel point addition, where each pair of point data includes two kinds of point data, namely point data and point data to be added;
a second determining module 220, configured to determine a target number field corresponding to the plurality of pairs of point data, where the target number field is a finite field of an elliptic curve or a secondary expansion field of the finite field;
a third determining module 230, configured to determine a big number operation rule corresponding to the target number domain;
A fourth determining module 240, configured to determine a target point formulation of the multi-point data based on a relationship between a Z coordinate value Z 1 of the point data and a Z coordinate value Z 2 of the added point data in each of the multi-point data;
A data arrangement module 250 for converting coordinate values of the multi-point data into target data corresponding to a data structure of the AVX512 IFMA;
the data processing module 260 is configured to process the target data based on the target point addition formula and the big number operation rule, so as to obtain a point addition result of the multi-point data.
According to the parallel point adding device of the elliptic curve based on the AVX512IFMA, the characteristics that a plurality of data elements of the same type can be processed simultaneously in the same instruction period by utilizing a single-instruction multi-data instruction set are utilized, the bottom 8X 1 path finite field operation and the upper 8X 1 path point adding operation are realized on the BLS121-381 curve, meanwhile, as the bottom most calculation rule is determined based on a target number field, and the point adding formulas are different according to the change of the coordinate relation of point data calculated along with the participation point adding, a plurality of types of point adding operation interfaces can be provided, and the 8X 1 path parallel acceleration can be realized on the elliptic curve point adding part by utilizing the AVX512IFMA instruction set, so that the great performance optimization improvement can be obtained.
In some embodiments, the relationship between the Z coordinate value Z 1 of the point data and the Z coordinate value Z 2 of the added point data in each of the plurality of pairs of point data includes three types of data:
z 1=Z2=1、Z1≠Z2 = 1, and Z 1≠Z2.
Converting each coordinate value of the coordinates of the plurality of pairs of point data into a coordinate value vector corresponding to a target operand base of AVX512 IFMA;
and forming a vector set by a plurality of coordinate value vectors corresponding to each coordinate value of each point data to obtain target data.
In some embodiments, converting each coordinate value of the coordinates of the plurality of pairs of point data into a coordinate value vector corresponding to a target operand base of AVX512IFMA includes:
determining the data type corresponding to the multi-point data;
if the data type is a standard element corresponding to the finite field where the elliptic curve is located, converting each coordinate value in the coordinates of the multiple pairs of point data into a coordinate value vector corresponding to a target operation base of AVX512IFMA based on a preset rule;
If the data type is a preset data type expressed on the Montgomery domain, each coordinate value in coordinates of the plurality of pairs of point data is converted into a coordinate value vector corresponding to a target operation radix of AVX512IFMA based on a Montgomery domain conversion rule corresponding to the target number domain.
In some embodiments, if the data type is a standard element corresponding to the finite field where the elliptic curve is located, converting each coordinate value in coordinates of the multiple pairs of point data into a coordinate value vector corresponding to a target operation radix of AVX512IFMA based on a preset rule includes:
for any coordinate value F in the coordinate values of the plurality of pairs of point data, determining a coordinate value vector F corresponding to a target operation base of AVX512IFMA and corresponding to any coordinate value F by using the following formula, wherein the data type of any coordinate value F is a standard element corresponding to a finite field in which the elliptic curve is located:
f=f0+252f1+2104f2+2156f3+2208f4+2260f5+2312f6+2364f7
F=[f0,f1,f2,f3,f4,f5,f6,f7]
Wherein f i<252 is more than or equal to 0 and i 8,i is more than or equal to 0, and the target operation base number is 2 52.
In some embodiments, forming a vector set from a plurality of coordinate value vectors corresponding to each coordinate value of each point data includes:
the vector set V is determined by the following formula:
The coordinate value vectors corresponding to any coordinate of any point data are F a,Fb,Fc,Fd,Fe,Ff,Fg and F h, any column of the vector set V corresponds to one coordinate value vector, and the i-th row of the vector set V corresponds to one register vector vi=[fi a,fi b,fi e,fi d,fi e,fi f,fi g,fi h],0≤i<8,i as an integer.
The parallel point adding device based on the elliptic curve of the AVX 512 IFMA in the embodiment of the application can be electronic equipment, and can also be a component in the electronic equipment, such as an integrated circuit or a chip. The electronic device may be a terminal, or may be other devices than a terminal. The electronic device may be a Mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted electronic device, a Mobile internet appliance (Mobile INTERNET DEVICE, MID), an augmented reality (augmented reality, AR)/Virtual Reality (VR) device, a robot, a wearable device, an ultra-Mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), etc., and may also be a server, a network attached storage (Network Attached Storage, NAS), a personal computer (personal computer, PC), a Television (TV), a teller machine, a self-service machine, etc., which are not particularly limited in the embodiments of the present application.
The parallel point adding device based on the elliptic curve of the AVX5 12IFMA in the embodiment of the application can be a device with an operating system. The operating system may be a microsoft (Windows) operating system, an Android operating system, a 1OS operating system, or other possible operating systems, which is not particularly limited in the embodiments of the present application.
The parallel point adding device based on the elliptic curve of the AVX512IFMA provided by the embodiment of the present application can implement each process implemented by the method embodiment of fig. 1, and in order to avoid repetition, a detailed description is omitted herein.
Based on the above embodiment, the embodiment of the application also provides a method and a device for rapidly realizing points of elliptic curve based on the AVX512IFMA instruction set, which accelerate the realization of elliptic curve points on a CPU platform, and provide an engine computing interface, which can be used for digital signature, privacy protection, zero knowledge proof and other application scenes. In this example, the BLS12-381 curveGroup(s)The point addition of the group is divided into three cases according to the Z coordinate, and six calculation engines are realized.
In this embodiment, as shown in fig. 3, the scheme may be divided into three parts of 8×1 path data arrangement, finite field operation and point operation, which are described in detail as follows:
And part 1:8×1 paths of data are distributed.
Since the AVX-512IFMA instruction set supports multiply and add operations on 52-bit unsigned integers within each 64-bit lane of two registers, 2 52 can naturally be used as the radix. For example, in the BLS12-381 curve, the 381-bit integer f represented using radix 2 52 may be represented as follows, where 0.ltoreq.f i<252 and 0.ltoreq.i <8.
f=f0+252f1+2104f2+2156f3+2208f4+2260f5+2312f6+2364f7
In our 8 x 1-way parallel computation, the main data structure is vector set V, which consists of 8 vectors V i (0.ltoreq.i < 8). Each v i consists of eight elements with radix 2 52, which can be stored in a 512-bit register for computation. For eight integersThe vector set V is defined as follows, where each vector vi=[fi a,fi b,fi c,fi d,fi e,fi f,fi g,fi h].
Based on this data structure, the basic finite field operation in 8×1 mode can be implemented using AVX-512 IFMA.
Part 2, finite field operation.
The part can realizeDomain and its secondary domain expansionThe large number operation mainly comprises modulo addition, modulo subtraction, modulo multiplication, modulo square and Montgomery domain conversion.
For modulo addition, modulo subtraction, and modulo multiple addition operations: modulo addition, which can be denoted as C++A+B mod Q, includes three steps: first, a and B are added, and the result of their sum is stored in C; next, Q is subtracted from C, but some lanes in C may appear negative, thus creating a 512-bit mask vector in which each 64-bit element corresponds to 8 integers in C, the corresponding 64-bit element being set to all 1's if the integer in the lane is negative, and all 0's if it is non-negative, bitwise and-ing Q with the mask vector, and then adding Q to the negative integer in C instead of the negative integer plus 0 has no effect on the result. Based on a similar principle of the above steps, modulo subtraction C≡A-B mod Q involves only two steps, first subtracting Y from x, then processing the negative part with a mask. Modulo addition is the same as modulo addition, except that the X and Y addition portions of the first stage are shifted left.
Modular multiplication and modular square, where modular multiplication uses the method of Montgomery multiplication, is divided into two stages, integer multiplication and Montgomery reduction. Montgomery multiplications use a modified coarse-integration hybrid scanning (CIHS) method. The modular square is the same as the reduced part of the modular multiplication, and the calculation part is changed from multiplication to square.The above operation is based onIs calculated by finite field operation. For the followingThe 8 multiplied by 1 path can be accelerated by using Karatuba algorithm, and the result is thatThe number of multiplications is reduced from four to three, and for modulo squaring, one can reduceThe number of multiplications is reduced from three to two.
Montgomery domain conversion in most cases, modern computer architectures are 64-bit, often with 2 64 as the operand radix in large number operations, but in the present application, the operand radix of AVX512IFMA is 2 52, so Montgomery domain conversion is sometimes required in performing point addition operations. The conversion of one domain is essentially a Montgomery multiplication process, e.g., a is a large number of 381 bits, R 1 is a first Montgomery domain, R 2 is a second Montgomery domain, and the conversion of aR 1 to aR 2 is performed only once on R 2 domain by performing aR 1 and aRIn the finite fieldMontgomery multiplication on, converting aR 2 to aR 1, only by performing one time on R 2 domain with aR 2 and R 1 in finite domainMontgomery multiplications thereon.
And part 3, point operation.
The elliptic curve points under the jacobian coordinate system can be expressed as (X, Y, Z). For points P 1 and P 2, the point addition formula can be divided into three cases, PADD *, PADD, and PADD +, according to Z 1=Z2=1、Z1≠Z2 =1 and Z 1≠Z2. For the 8×1-way point operation part, the 8×1-way finite field operation of the bottom layer can be directly called, and the results are respectively calculated according to a point addition formula.
In some embodiments, as shown in fig. 4, an electronic device 400 is further provided in the embodiments of the present application, which includes a processor 401, a memory 402, and a computer program stored in the memory 402 and capable of running on the processor 401, where the program, when executed by the processor 401, implements each process of the above embodiment of the parallel point adding method based on the elliptic curve of the AVX512IFMA, and the same technical effects can be achieved, so that repetition is avoided and no further description is given here.
The electronic device in the embodiment of the application includes the mobile electronic device and the non-mobile electronic device.
The embodiment of the application also provides a non-transitory computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements each process of the above embodiment of the parallel point adding method based on the elliptic curve of AVX512IFMA, and can achieve the same technical effect, so that repetition is avoided and redundant description is omitted.
The processor is a processor in the electronic device in the above embodiment. Readable storage media include computer readable storage media such as computer readable memory ROM, random access memory RAM, magnetic or optical disks, and the like.
The embodiment of the application also provides a computer program product, which comprises a computer program, wherein the computer program realizes the parallel point adding method based on the elliptic curve of the AVX512IFMA when being executed by a processor.
The processor is a processor in the electronic device in the above embodiment. Readable storage media include computer readable storage media such as computer readable memory ROM, random access memory RAM, magnetic or optical disks, and the like.
The embodiment of the application further provides a chip, the chip comprises a processor and a communication interface, the communication interface is coupled with the processor, the processor is used for running programs or instructions, the parallel point adding method embodiment based on the elliptic curve of the AVX512IFMA is realized, the same technical effect can be achieved, and the repetition is avoided, and the description is omitted here.
It should be understood that the chips referred to in the embodiments of the present application may also be referred to as system-on-chip chips, chip systems, or system-on-chip chips, etc.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the related art in the form of a computer software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk), including several instructions for causing a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method of the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present application and the scope of the claims, which are to be protected by the present application.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "illustrative embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the spirit and scope of the application as defined by the appended claims and their equivalents.

Claims (10)

1.一种基于AVX512IFMA的椭圆曲线的并行点加方法,其特征在于,包括:1. A parallel point addition method for elliptic curve based on AVX512IFMA, characterized by comprising: 确定参与并行点加的所述多对点数据的坐标中的每个坐标,每对点数据包括加点数据和被加点数据两种点数据;Determine each coordinate of the plurality of pairs of point data participating in parallel point addition, each pair of point data includes two types of point data: adding point data and added point data; 确定所述多对点数据对应的目标数域,所述目标数域为所述椭圆曲线的有限域或者所述有限域的二次扩域;Determine a target number field corresponding to the plurality of pairs of point data, the target number field being a finite field of the elliptic curve or a quadratic extension field of the finite field; 确定对应于所述目标数域的大数运算规则;Determine a large number operation rule corresponding to the target number domain; 基于所述多对点数据中每对点数据中加点数据的Z坐标值Z1和被加点数据的Z坐标值Z2之间的关系,确定所述多对点数据的目标点加公式;Determine a target point adding formula for the plurality of pairs of point data based on a relationship between a Z coordinate value Z1 of the adding point data and a Z coordinate value Z2 of the added point data in each pair of point data in the plurality of pairs of point data; 将所述多对点数据的坐标值转换为对应于所述AVX512IFMA的数据结构的目标数据;Convert the coordinate values of the plurality of pairs of point data into target data corresponding to the data structure of the AVX512IFMA; 基于所述目标点加公式和所述大数运算规则对所述目标数据进行处理,得到所述多对点数据的点加结果。The target data is processed based on the target point addition formula and the large number operation rule to obtain the point addition results of the multiple pairs of point data. 2.根据权利要求1所述的基于AVX512IFMA的椭圆曲线的并行点加方法,其特征在于,所述多对点数据中每对点数据中加点数据的Z坐标值Z1和被加点数据的Z坐标值Z2之间的关系包括三种,分别为:2. The parallel point adding method of elliptic curve based on AVX512IFMA according to claim 1 is characterized in that the relationship between the Z coordinate value Z1 of the adding point data and the Z coordinate value Z2 of the added point data in each pair of point data in the multiple pairs of point data includes three types, namely: Z1=Z2=1、Z1≠Z2=1、和Z1≠Z2Z 1 =Z 2 =1, Z1≠Z 2 =1, and Z 1 ≠Z 2 . 3.根据权利要求2所述的基于AVX512IFMA的椭圆曲线的并行点加方法,其特征在于,所述将所述多对点数据的坐标值转换为对应于所述AVX512IFMA的数据结构的目标数据,包括:3. The parallel point addition method of elliptic curve based on AVX512IFMA according to claim 2, characterized in that the step of converting the coordinate values of the plurality of pairs of point data into target data corresponding to the data structure of the AVX512IFMA comprises: 将所述多对点数据的坐标中的每个坐标值均转换为对应于所述AVX512IFMA的目标运算基数的坐标值向量;Convert each coordinate value in the coordinates of the plurality of pairs of point data into a coordinate value vector corresponding to the target operation base of the AVX512IFMA; 将每种点数据的每个坐标值对应的多个坐标值向量构成一个向量集,得到目标数据。Multiple coordinate value vectors corresponding to each coordinate value of each point data form a vector set to obtain target data. 4.根据权利要求3所述的基于AVX512IFMA的椭圆曲线的并行点加方法,其特征在于,所述将所述多对点数据的坐标中的每个坐标值均转换为对应于所述AVX512IFMA的目标运算基数的坐标值向量,包括:4. The parallel point addition method of elliptic curve based on AVX512IFMA according to claim 3, characterized in that the step of converting each coordinate value in the coordinates of the plurality of pairs of point data into a coordinate value vector corresponding to a target operation base of the AVX512IFMA comprises: 确定所述多对点数据对应的数据类型;Determine the data types corresponding to the multiple pairs of point data; 若所述数据类型为标准的对应于所述椭圆曲线所在有限域上的元素,则基于预设规则将所述多对点数据的坐标中的每个坐标值均转换为对应于所述AVX512IFMA的目标运算基数的坐标值向量;If the data type is a standard element corresponding to the finite field where the elliptic curve is located, each coordinate value in the coordinates of the multiple pairs of point data is converted into a coordinate value vector corresponding to the target operation base of the AVX512IFMA based on a preset rule; 若所述数据类型为蒙哥马利域上的表达的预设数据类型,则基于对应于所述目标数域的蒙哥马利域转换规则将所述多对点数据的坐标中的每个坐标值均转换为对应于所述AVX512IFMA的目标运算基数的坐标值向量。If the data type is a preset data type expressed on a Montgomery domain, each coordinate value in the coordinates of the multiple pairs of point data is converted into a coordinate value vector corresponding to the target operation base of the AVX512IFMA based on the Montgomery domain conversion rule corresponding to the target number domain. 5.根据权利要求4所述的基于AVX512IFMA的椭圆曲线的并行点加方法,其特征在于,所述若所述数据类型为标准的对应于所述椭圆曲线所在有限域上的元素,则基于预设规则将所述多对点数据的坐标中的每个坐标值均转换为对应于所述AVX512IFMA的目标运算基数的坐标值向量,包括:5. The parallel point addition method of elliptic curve based on AVX512IFMA according to claim 4 is characterized in that if the data type is a standard element corresponding to the finite field where the elliptic curve is located, then each coordinate value in the coordinates of the multiple pairs of point data is converted into a coordinate value vector corresponding to the target operation base of the AVX512IFMA based on a preset rule, including: 对于所述多对点数据的坐标值中的任一坐标值f,所述任一坐标值f的数据类型为标准的对应于所述椭圆曲线所在有限域上的元素,通过如下公式确定所述任一坐标值f对应的对应于所述AVX512IFMA的目标运算基数的坐标值向量F:For any coordinate value f among the coordinate values of the multiple pairs of point data, the data type of the any coordinate value f is a standard element corresponding to the finite field where the elliptic curve is located, and the coordinate value vector F corresponding to the target operation base of the AVX512IFMA corresponding to the any coordinate value f is determined by the following formula: f=f0+252f1+2104f2+2156f3+2208f4+2260f5+2312f6+2364f7 f=f 0 +2 52 f 1 +2 104 f 2 +2 156 f 3 +2 208 f 4 +2 260 f 5 +2 312 f 6 +2 364 f 7 F=[f0,f1,f2,f3,f4,f5,f6,f7]F=[f 0 , f 1 , f 2 , f 3 , f 4 , f 5 , f 6 , f 7 ] 其中,0≤fi<252且0≤i<8,i为整数,所述目标运算基数为252Wherein, 0≤f i <2 52 and 0≤i<8, i is an integer, and the target operation base is 2 52 . 6.根据权利要求5所述的基于AVX512IFMA的椭圆曲线的并行点加方法,其特征在于,所述将每种点数据的每个坐标值对应的多个坐标值向量构成一个向量集,包括:6. The parallel point addition method of elliptic curve based on AVX512IFMA according to claim 5, characterized in that the multiple coordinate value vectors corresponding to each coordinate value of each point data constitute a vector set, comprising: 通过如下公式确定向量集V:The vector set V is determined by the following formula: 其中,任一种点数据的任一坐标对应的多个坐标值向量分别为Fa,Fb,Fc,Fd,Fe,Ff,Fg和Fh,所述向量集V的任一列对应一个坐标值向量,所述向量集V的第i行对应一个寄存器向量vi=[fi a,fi b,fi c,fi d,fi e,fi f,fi g,fi h],0≤i<8,i为整数。Among them, the multiple coordinate value vectors corresponding to any coordinate of any type of point data are Fa , Fb , Fc , Fd , Fe , Ff , Fg and Fh respectively, any column of the vector set V corresponds to a coordinate value vector, and the i-th row of the vector set V corresponds to a register vector vi = [ fia , fib ,fic , fid,fie , fif , fig,fih ] , 0≤i < 8 , i is an integer. 7.一种基于AVX512IFMA的椭圆曲线的并行点加装置,其特征在于,包括:7. A parallel point addition device for elliptic curve based on AVX512IFMA, characterized by comprising: 第一确定模块,用于确定参与并行点加的所述多对点数据的坐标中的每个坐标,每对点数据包括加点数据和被加点数据两种点数据;A first determination module is used to determine each coordinate of the coordinates of the plurality of pairs of point data participating in the parallel point addition, each pair of point data includes two types of point data: adding point data and added point data; 第二确定模块,用于确定所述多对点数据对应的目标数域,所述目标数域为所述椭圆曲线的有限域或者所述有限域的二次扩域;A second determination module is used to determine a target number field corresponding to the plurality of pairs of point data, wherein the target number field is a finite field of the elliptic curve or a quadratic extension field of the finite field; 第三确定模块,用于确定对应于所述目标数域的大数运算规则;A third determination module is used to determine a large number operation rule corresponding to the target number domain; 第四确定模块,用于基于所述多对点数据中每对点数据中加点数据的Z坐标值Z1和被加点数据的Z坐标值Z2之间的关系,确定所述多对点数据的目标点加公式;a fourth determination module, for determining a target point addition formula for the plurality of pairs of point data based on a relationship between a Z coordinate value Z1 of the added point data and a Z coordinate value Z2 of the added point data in each pair of point data in the plurality of pairs of point data; 数据排布模块,用于将所述多对点数据的坐标值转换为对应于所述AVX512IFMA的数据结构的目标数据;A data arrangement module, used for converting the coordinate values of the plurality of pairs of point data into target data corresponding to the data structure of the AVX512IFMA; 数据处理模块,用于基于所述目标点加公式和所述大数运算规则对所述目标数据进行处理,得到所述多对点数据的点加结果。The data processing module is used to process the target data based on the target point addition formula and the large number operation rules to obtain the point addition results of the multiple pairs of point data. 8.一种电子设备,包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,其特征在于,所述处理器执行所述程序时实现如权利要求1-6任一项所述的基于AVX512IFMA的椭圆曲线的并行点加方法。8. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the program, the parallel point addition method for elliptic curves based on AVX512IFMA as described in any one of claims 1 to 6 is implemented. 9.一种非暂态计算机可读存储介质,其上存储有计算机程序,其特征在于,该计算机程序被处理器执行时实现如权利要求1-6任一项所述的基于AVX512IFMA的椭圆曲线的并行点加方法。9. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein when the computer program is executed by a processor, the parallel point addition method for elliptic curves based on AVX512IFMA as described in any one of claims 1 to 6 is implemented. 10.一种计算机程序产品,包括计算机程序,其特征在于,所述计算机程序被处理器执行时实现如权利要求1-6任一项所述的基于AVX512IFMA的椭圆曲线的并行点加方法。10. A computer program product, comprising a computer program, characterized in that when the computer program is executed by a processor, the parallel point addition method of the elliptic curve based on AVX512IFMA as described in any one of claims 1 to 6 is implemented.
CN202410953512.3A 2024-07-16 2024-07-16 Parallel point adding method of elliptic curve based on AVX512IFMA Pending CN119155037A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410953512.3A CN119155037A (en) 2024-07-16 2024-07-16 Parallel point adding method of elliptic curve based on AVX512IFMA

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410953512.3A CN119155037A (en) 2024-07-16 2024-07-16 Parallel point adding method of elliptic curve based on AVX512IFMA

Publications (1)

Publication Number Publication Date
CN119155037A true CN119155037A (en) 2024-12-17

Family

ID=93802439

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410953512.3A Pending CN119155037A (en) 2024-07-16 2024-07-16 Parallel point adding method of elliptic curve based on AVX512IFMA

Country Status (1)

Country Link
CN (1) CN119155037A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060210068A1 (en) * 2005-03-15 2006-09-21 Microsoft Corporation Elliptic curve point octupling using single instruction multiple data processing
US20090310775A1 (en) * 2008-06-13 2009-12-17 Shay Gueron Using a single instruction multiple data (SIMD) instruction to speed up galois counter mode (GCM) computations
CN112134704A (en) * 2020-09-21 2020-12-25 中国电子科技网络信息安全有限公司 Sm2 performance optimization implementing method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060210068A1 (en) * 2005-03-15 2006-09-21 Microsoft Corporation Elliptic curve point octupling using single instruction multiple data processing
US20090310775A1 (en) * 2008-06-13 2009-12-17 Shay Gueron Using a single instruction multiple data (SIMD) instruction to speed up galois counter mode (GCM) computations
CN112134704A (en) * 2020-09-21 2020-12-25 中国电子科技网络信息安全有限公司 Sm2 performance optimization implementing method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
马占刚;李春雷;曹喜信;: "一种可扩展的区块链专用协处理器架构", 微纳电子与智能制造, no. 01, 15 March 2020 (2020-03-15) *

Similar Documents

Publication Publication Date Title
Wang et al. HE-Booster: an efficient polynomial arithmetic acceleration on GPUs for fully homomorphic encryption
CN110474769B (en) Quantum color image encryption method based on direction modification
Ochoa-Jiménez et al. Implementation of RSA signatures on GPU and CPU architectures
JP2024525754A (en) Cryptosystem for post-quantum cryptographic operations
Nath et al. Efficient 4-way vectorizations of the Montgomery ladder
Paksoy et al. Faster NTRU on ARM Cortex-M4 with TMVP-based multiplication
Zeng et al. The implementation of polynomial multiplication for lattice-based cryptography: A survey
CN115801244B (en) Post-quantum cryptographic algorithm implementation method and system for resource-constrained processors
CN117692126A (en) A Paillier homomorphic encryption method and system based on low-complexity modular multiplication algorithm
US20250106022A1 (en) Multi-lane cryptographic engine and operations thereof
KR102587719B1 (en) Circuits, computing chips, data processing devices and methods for performing hash algorithms
Lee et al. Area-efficient subquadratic space-complexity digit-serial multiplier for type-II optimal normal basis of $ GF (2^{m}) $ using symmetric TMVP and block recombination techniques
CN119155037A (en) Parallel point adding method of elliptic curve based on AVX512IFMA
Khan et al. FPGA implementation of elliptic-curve Diffie Hellman protocol
Lee et al. Area-Delay Efficient Digit-Serial Multiplier Based on $ k $-Partitioning Scheme Combined With TMVP Block Recombination Approach
Dong et al. TEGRAS: An efficient Tegra embedded GPU-based RSA acceleration server
da Silva Factoring semiprimes and possible implications for RSA
Pei et al. An area-efficient SM2 cryptographic engine for WBAN security enhancement
Baktır et al. Frequency domain finite field arithmetic for elliptic curve cryptography
Zhang et al. A lightweight FourQ primitive on ARM cortex-M0
Zhao et al. Exploring the speed limit of SM2
CN115113848B (en) Signature/signature verification circuit, device, equipment, method and coordinate restoration circuit
US20240370229A1 (en) Multi-lane cryptographic engines with systolic architecture and operations thereof
Mashayekhi et al. New Design of Efficient Reversible Quantum Saturation Adder
RU2080650C1 (en) Device for calculation of absolute value of m- dimensional vector

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination