
CN113239395B - Data query method, device, equipment, storage medium and program product - Google Patents

Data query method, device, equipment, storage medium and program product

Info

Publication number
CN113239395B
Authority
CN
China
Prior art keywords
query
queried
target
data
participant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110507063.6A
Other languages
Chinese (zh)
Other versions
CN113239395A (en
Inventor
黄安埠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN202110507063.6A
Publication of CN113239395A
Application granted
Publication of CN113239395B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • H04L63/0442 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload wherein the sending and receiving network entities apply asymmetric encryption, i.e. different keys for encryption and decryption
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Storage Device Security (AREA)

Abstract

The application provides a data query method, device, equipment, storage medium and program product. In the method, a coordinator generates a public key for encryption and a private key for decryption; determines at least two target participants according to at least two features to be queried, and determines the query request corresponding to each target participant, where each target participant stores the feature data of part of the features to be queried; sends the public key and the corresponding query request to each target participant; receives the ciphertext query results sent by the target participants, each ciphertext query result being determined by a target participant based on the public key and the query request it received; and determines target data according to the private key and the ciphertext query results sent by the target participants. A cross-table query is thus realized while the data of each participant never leaves its local site, so data privacy is protected, data leakage is prevented, and the availability of the queried target data is not affected.

Description

Data query method, device, equipment, storage medium and program product
Technical Field
The present application relates to the field of artificial intelligence technology, and relates to, but is not limited to, a data query method, apparatus, device, storage medium, and program product.
Background
In the information society, information resources need to be managed and used fully and effectively, and databases are applied in almost every industry. Database query is a common database technology. In general, current databases store data in the form of (key, value) pairs, and a database query means that a user uses a key to look up the corresponding value in the database. A multi-table query applies when the data is distributed over multiple database tables that must be combined (joined) to answer the query; common join types include inner join, left join, right join and full join.
When the tables are distributed among different institutions or clients, they cannot be shared directly, because leakage of private data must be prevented. In the related art, the data in the tables first undergoes privacy processing such as desensitization, anonymization, differential privacy or homomorphic encryption, and the tables are then gathered in one place and joined to realize the multi-table query. By encrypting or anonymizing the data, the related art can prevent data leakage, but it may reduce the availability of the data and violate the restrictions that privacy protection laws place on data leaving its domain. How to combine multi-party data to realize a cross-table join query on the premise that the data does not leave its local site is one of the problems to be solved.
Disclosure of Invention
The embodiment of the application provides a data query method, a device, equipment, a computer readable storage medium and a computer program product, which can realize a cross-table join query while ensuring that the data does not leave its local site and that data privacy is protected.
The technical scheme of the embodiment of the application is realized as follows:
The embodiment of the application provides a data query method, which is applied to a coordinator of federated learning and comprises the following steps:
generating a public key for encryption and a private key for decryption;
determining at least two target participants from a plurality of participants of federated learning according to at least two features to be queried, and determining the query request corresponding to each target participant, wherein each target participant stores the feature data of part of the features to be queried;
sending the public key and the query request corresponding to each target participant to the corresponding target participant;
receiving the ciphertext query results sent by the target participants, wherein each ciphertext query result is determined by a target participant based on the public key and the query request it received;
and determining target data according to the private key and the ciphertext query results sent by the target participants.
The embodiment of the application provides a data query method, which is applied to a target participant of federated learning, wherein the target participant is a participant storing part of the data to be queried, and the method comprises the following steps:
receiving a public key and a query request sent by a coordinator;
parsing the query request to obtain the query condition carried by the query request;
searching the storage space based on the query condition to obtain a plaintext query result;
encrypting the plaintext query result with the public key to obtain a ciphertext query result;
and sending the ciphertext query result to the coordinator.
The embodiment of the application provides a data query device, which is applied to a coordinator of federated learning and comprises the following modules:
a generation module, configured to generate a public key for encryption and a private key for decryption;
a first determining module, configured to determine at least two target participants from a plurality of participants of federated learning, wherein each target participant stores the feature data of part of the features to be queried;
a second determining module, configured to determine the query request corresponding to each target participant;
a first sending module, configured to send the public key and the query request corresponding to each target participant to the corresponding target participant;
a first receiving module, configured to receive the ciphertext query results sent by the target participants, wherein each ciphertext query result is determined by a target participant based on the public key and the query request it received;
and a third determining module, configured to determine target data according to the private key and the ciphertext query results sent by the target participants.
The embodiment of the application provides a data query device, which is applied to a target participant of federated learning, wherein the target participant is a participant storing part of the data to be queried, and the device comprises:
a second receiving module, configured to receive the public key and the query request sent by the coordinator;
a parsing module, configured to parse the query request to obtain the query condition carried by the query request;
a query module, configured to search the storage space based on the query condition to obtain a plaintext query result;
an encryption module, configured to encrypt the plaintext query result with the public key to obtain a ciphertext query result;
and a second sending module, configured to send the ciphertext query result to the coordinator.
The embodiment of the application provides data query equipment, which comprises the following components:
a memory for storing executable instructions;
and the processor is used for realizing the method provided by the embodiment of the application when executing the executable instructions stored in the memory.
The embodiment of the application provides a computer readable storage medium storing executable instructions which, when executed by a processor, implement the method provided by the embodiment of the application.
The embodiment of the application provides a computer program product, which comprises a computer program, wherein the computer program is used for realizing the method provided by the embodiment of the application when being executed by a processor.
The embodiment of the application has the following beneficial effects:
In the data query method provided by the embodiment of the application, a coordinator determines at least two target participants from a plurality of participants of federated learning according to at least two features to be queried, determines the query request corresponding to each target participant, and then sends a pre-generated public key and the corresponding query request to each target participant. After receiving its query request, each target participant queries its own data table to obtain a plaintext query result, and encrypts the plaintext query result with the public key to obtain a ciphertext query result. The coordinator receives the ciphertext query results sent by the target participants and determines target data according to the private key and these ciphertext query results. In this way, a cross-table query is realized while the data of each participant never leaves its local site, so data privacy is protected and data leakage is prevented; and because the data is not anonymized or otherwise processed, the availability of the queried target data is not affected.
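The flow summarized above can be sketched end to end in Python. This is only an illustration under stated assumptions: the toy Paillier scheme with tiny fixed primes, the function names (`keygen`, `encrypt`, `decrypt`, `participant_answer`, `coordinator_query`), the sample tables, and the choice of an inner join on ID at the coordinator are all hypothetical and are not taken from the application. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    """Toy Paillier key pair from tiny fixed primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is chosen as g = n + 1
    return n, (n, lam, mu)  # (public key, private key)


def encrypt(n, m):
    """Encrypt a non-negative integer m < n under the public key n."""
    n2 = n * n
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2


def decrypt(private_key, c):
    n, lam, mu = private_key
    return (pow(c, lam, n * n) - 1) // n * mu % n


def participant_answer(table, columns, public_key):
    """Participant side: query the local table, then encrypt every field of the result."""
    plaintext = [[row[col] for col in columns] for row in table]
    return [[encrypt(public_key, value) for value in row] for row in plaintext]


def coordinator_query(tables, requests):
    """Coordinator side: distribute the public key, collect ciphertexts, decrypt, join on ID."""
    public_key, private_key = keygen()
    results = []
    for name, columns in requests:
        cipher_rows = participant_answer(tables[name], columns, public_key)
        results.append({decrypt(private_key, r[0]): decrypt(private_key, r[1])
                        for r in cipher_rows})
    first, second = results
    # Assumption of this sketch: the target data is the inner join on ID.
    return sorted((uid, first[uid], second[uid]) for uid in first.keys() & second.keys())


# Hypothetical sample data: each participant holds part of the features.
TABLES = {
    "Table1": [{"ID": 1, "A": 30, "B": 5}, {"ID": 2, "A": 41, "B": 7}, {"ID": 3, "A": 25, "B": 9}],
    "Table2": [{"ID": 2, "C": 700, "D": 1}, {"ID": 3, "C": 650, "D": 0}, {"ID": 4, "C": 720, "D": 1}],
}
target_data = coordinator_query(TABLES, [("Table1", ["ID", "A"]), ("Table2", ["ID", "C"])])
```

In this sketch only ciphertexts cross the boundary between a participant and the coordinator; a real deployment would use a vetted cryptographic library and properly sized keys.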
Drawings
Fig. 1 is a schematic diagram of a network architecture of a data query method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a composition structure of a data query device according to an embodiment of the present application;
Fig. 3 is a schematic flow chart of an implementation of a data query method according to an embodiment of the present application;
Fig. 4 is a schematic flow chart of another implementation of the data query method according to an embodiment of the present application;
Fig. 5 is a schematic flow chart of still another implementation of the data query method according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a database table according to an embodiment of the present application;
Fig. 7 is a schematic flow chart of an implementation of a data query method that realizes an inner join under federated learning according to an embodiment of the present application;
Fig. 8 is a schematic flow chart of an implementation of a data query method that realizes a left join under federated learning according to an embodiment of the present application;
Fig. 9 is a schematic flow chart of an implementation of a data query method that realizes a right join under federated learning according to an embodiment of the present application.
Detailed Description
The present application will be further described in detail with reference to the accompanying drawings, for the purpose of making the objects, technical solutions and advantages of the present application more apparent, and the described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without making any inventive effort are within the scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict.
In the following description, the terms "first", "second", "third" and the like merely distinguish similar objects and do not denote a particular order. Where permitted, the specific order or sequence of "first", "second" and "third" may be interchanged, so that the embodiments of the application described herein can be practiced in an order other than that illustrated or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
Before the embodiments of the present application are described in further detail, the terms involved in the embodiments are explained; the following explanations apply to these terms.
1) Federated learning (Federated Learning): an emerging basic artificial intelligence technology, designed to carry out efficient machine learning among multiple parties or computing nodes while guaranteeing information security during big data exchange, protecting the privacy of terminal data and personal data, and guaranteeing legal compliance.
2) Join query (connection query): the principal kind of query in a relational database, mainly comprising inner joins, outer joins, cross joins and the like; queries over multiple tables can be realized through join operators.
3) Homomorphic encryption (Homomorphic Encryption), which is a cryptographic technique based on the theory of computational complexity of mathematical problems. The homomorphically encrypted data is processed to obtain an output, and the output is decrypted, the result of which is the same as the output result obtained by processing the unencrypted original data by the same method.
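The property described in item 3) can be written as a formula. The notation is introduced here for illustration and does not appear in the application: Enc and Dec denote encryption and decryption under a key pair (pk, sk), f is the processing applied to plaintexts, and f' is the corresponding processing applied to ciphertexts (in concrete schemes f' is generally a different operation from f). The second line gives the well-known additive case of the Paillier scheme with modulus n, where multiplying ciphertexts adds the underlying plaintexts.

```latex
\mathrm{Dec}_{sk}\bigl(f'(\mathrm{Enc}_{pk}(x_1), \dots, \mathrm{Enc}_{pk}(x_k))\bigr) = f(x_1, \dots, x_k)
\\[4pt]
\mathrm{Dec}_{sk}\bigl(\mathrm{Enc}_{pk}(a) \cdot \mathrm{Enc}_{pk}(b) \bmod n^2\bigr) = (a + b) \bmod n
```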
An exemplary application of the apparatus implementing the embodiment of the present application is described below, and the apparatus provided in the embodiment of the present application may be implemented as a terminal device. In the following, an exemplary application covering a terminal device when the apparatus is implemented as a terminal device will be described.
Fig. 1 is a schematic diagram of a network architecture of a data query method according to an embodiment of the present application. As shown in Fig. 1, the network architecture includes a coordinator 100, at least two target participants (denoted target participant 200-1 and target participant 200-2 to distinguish them; more target participants may be included in actual implementation), and a network 300. To support an exemplary application, the coordinator 100 may be a server side for federated learning, such as a server. The coordinator 100 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server based on cloud technology, and is configured to query, without revealing privacy, the feature data of users stored in the client terminals of banks or hospitals. The target participants (including target participant 200-1 and target participant 200-2) may be client terminals for federated learning, for example, devices of participants such as banks or hospitals that store feature data of users; a client terminal may be a notebook computer, a tablet computer or a desktop computer. The coordinator 100 is connected to target participant 200-1 and target participant 200-2 via the network 300. The network 300 may be a wide area network, a local area network, or a combination of both, and uses wireless or wired links for data transmission.
In the application scenario of this network architecture, the coordinator 100 first determines target participant 200-1 and target participant 200-2 from a plurality of participants according to the features to be queried, then determines the query requests corresponding to target participant 200-1 and target participant 200-2 according to the features to be queried, and then distributes the pre-generated public key and the query requests to the corresponding target participants. Target participant 200-1 and target participant 200-2 each determine a ciphertext query result based on the public key and the query request, and return it to the coordinator 100. After receiving the ciphertext query results, the coordinator 100 determines target data according to the pre-generated private key and the ciphertext query results. In this way, a cross-table query is realized while the data never leaves its local site, so data privacy is protected, data leakage is prevented, and the availability of the queried target data is not affected.
The apparatus provided in the embodiments of the present application may be implemented in hardware or a combination of hardware and software, and various exemplary implementations of the apparatus provided in the embodiments of the present application are described below.
Fig. 2 shows an exemplary structure of the data query device, taking the coordinator 100 as an example. Other exemplary structures of the data query device can be foreseen, so the structure described here should not be considered limiting; for example, some components described below may be omitted, or components not described below may be added to meet the specific requirements of some applications.
The data querying device 10 shown in Fig. 2 includes at least one processor 110, a memory 140, at least one network interface 120, and a user interface 130. The components of the data querying device 10 are coupled together by a bus system 150, which enables communication among them. In addition to a data bus, the bus system 150 includes a power bus, a control bus and a status signal bus; for clarity of illustration, however, the various buses are all labeled as bus system 150 in Fig. 2.
The user interface 130 may include a display, keyboard, mouse, touch pad, touch screen, and the like.
Memory 140 may be volatile memory or nonvolatile memory, or may include both. The nonvolatile memory may be a read-only memory (ROM, Read Only Memory). The volatile memory may be a random access memory (RAM, Random Access Memory). The memory 140 described in embodiments of the present application is intended to comprise any suitable type of memory.
The memory 140 in embodiments of the present application is capable of storing data to support the operation of the data querying device 10. Examples of such data include any computer programs, such as an operating system and application programs, for operation on the data querying device 10. The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, for implementing various basic services and processing hardware-based tasks. The application may comprise various applications.
As an example of implementation of the method provided by the embodiment of the present application by software, the method provided by the embodiment of the present application may be directly embodied as a combination of software modules executed by the processor 110, the software modules may be located in a storage medium, the storage medium is located in the memory 140, and the processor 110 reads executable instructions included in the software modules in the memory 140, and the method provided by the embodiment of the present application is completed by combining necessary hardware (including, for example, the processor 110 and other components connected to the bus 150).
By way of example, the processor 110 may be an integrated circuit chip with signal processing capability, such as a general-purpose processor (for example, a microprocessor or any conventional processor), a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The data query method provided by the embodiment of the application will be described in connection with the exemplary application and implementation of the device provided by the embodiment of the application.
Fig. 3 is a schematic flow chart of an implementation of the data query method according to an embodiment of the present application, which is applied to a coordinator of the network architecture shown in fig. 1, and will be described with reference to the steps shown in fig. 3.
Step S301, a public key for encryption and a private key for decryption are generated.
In the embodiment of the application, when the coordinator performs a data query under federated learning, it generates a public key for encryption and a private key for decryption. Because query results are encrypted with the public key, no participant needs to send the user feature data it stores in plaintext to another participant or to the coordinator, so the data privacy of each participant can be protected.
In one implementation, the public key generated by the coordinator may be a public key for homomorphic encryption, and the generated private key may be a private key for homomorphic decryption. Homomorphic encryption is a cryptographic technique based on the theory of computational complexity of mathematical problems. The homomorphically encrypted data is processed to obtain an output, and the output is decrypted, the result of which is the same as the output result obtained by processing the unencrypted original data by the same method. The generated public key may be used for any of the following kinds of homomorphic encryption: additive homomorphic encryption, multiplicative homomorphic encryption, mixed multiplicative homomorphic encryption, subtractive homomorphic encryption, division homomorphic encryption, algebraic homomorphic encryption (also called fully homomorphic encryption) and arithmetic homomorphic encryption. Fully homomorphic encryption here refers to an encryption function that is both additively and multiplicatively homomorphic.
In some embodiments, the public key generated by the coordinator may be used for additive homomorphic encryption and the private key for additive homomorphic decryption; or the public key may be used for multiplicative homomorphic encryption and the private key for multiplicative homomorphic decryption; or the public key may be used for fully homomorphic encryption and the private key for fully homomorphic decryption. Compared with fully homomorphic encryption and decryption, additive homomorphic encryption and decryption can improve operation efficiency.
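As one concrete instance of additive homomorphic encryption, the sketch below implements a toy version of the Paillier scheme. The application does not name a specific scheme, so the choice of Paillier, the tiny fixed primes (which make the keys insecure) and all function names are assumptions made for illustration. Python 3.8 or later is assumed for `pow(x, -1, n)`.

```python
import math
import secrets


def generate_keypair(p=1009, q=1013):
    """Step S301 in miniature: return (public_key, private_key).

    The primes are tiny and fixed, so this is for illustration only.
    """
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid for generator g = n + 1
    return n, (n, lam, mu)


def encrypt(public_key, m):
    """Encrypt an integer 0 <= m < n with fresh randomness r."""
    n = public_key
    n2 = n * n
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2


def decrypt(private_key, c):
    n, lam, mu = private_key
    return (pow(c, lam, n * n) - 1) // n * mu % n


def add_encrypted(public_key, c1, c2):
    """Additive homomorphism: multiplying ciphertexts adds the plaintexts."""
    return c1 * c2 % (public_key * public_key)


public_key, private_key = generate_keypair()
c_sum = add_encrypted(public_key, encrypt(public_key, 20), encrypt(public_key, 22))
total = decrypt(private_key, c_sum)  # total == 42
```

The sum is computed without decrypting either operand, which is the property that lets ciphertext results be combined before the private key is applied.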
Step S302, determining at least two target participants from a plurality of participants of federated learning according to at least two features to be queried.
Here, each target participant stores the feature data of part of the features to be queried.
In the application scenario of the embodiment of the present application, if the feature data of the features to be queried were stored in only one participant, no join query would be needed. Therefore there are at least two features to be queried, and their feature data is stored in different participants. Accordingly, before querying, it is necessary to determine, for each of the at least two features to be queried, which participant stores its feature data; that is, at least two target participants must be determined from the plurality of participants of federated learning, each target participant storing the feature data of part of the features to be queried.
For example, there are two features to be queried, feature A and feature C, and three federated learning participants: a first participant, a second participant and a third participant. The data table of the first participant (Table1) stores the feature data of feature A and feature B, the data table of the second participant (Table2) stores the feature data of feature C and feature D, and the data table of the third participant (Table3) stores the feature data of feature E and feature F. The feature data of each participant is kept secret from the other participants and from the coordinator, but the table attributes of the data tables need not be kept secret; that is, the coordinator can obtain from each participant the table attributes of its data table. Specifically, the table attributes obtained from the first participant are (Table1; ID, A, B), those obtained from the second participant are (Table2; ID, C, D), and those obtained from the third participant are (Table3; ID, E, F). In these table attributes, Table1, Table2 and Table3 are table names, ID is the identifier of a user (for example, an identification number or any other identifier that distinguishes different users), and A, B, C, D, E and F are features. From the features to be queried and the obtained table attributes, the coordinator can determine that the feature data of feature A is stored in Table1 of the first participant and that the feature data of feature C is stored in Table2 of the second participant, and thereby determines that the target participants are the first participant and the second participant.
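The lookup in this example can be sketched as follows. The table attributes mirror the example above; the function name `determine_target_participants`, the dictionary layout and the participant labels are hypothetical choices made for illustration.

```python
# Table attributes the coordinator obtains from each participant:
# participant -> (table name, column names). Only metadata, no feature data.
TABLE_ATTRIBUTES = {
    "first participant": ("Table1", ["ID", "A", "B"]),
    "second participant": ("Table2", ["ID", "C", "D"]),
    "third participant": ("Table3", ["ID", "E", "F"]),
}


def determine_target_participants(features, table_attributes):
    """Map each feature to be queried to the participant whose table holds it.

    Returns participant -> (table name, [features to query there]).
    """
    targets = {}
    for feature in features:
        owners = [name for name, (_, columns) in table_attributes.items()
                  if feature in columns]
        if not owners:
            raise KeyError(f"no participant stores feature {feature!r}")
        owner = owners[0]
        table_name = table_attributes[owner][0]
        targets.setdefault(owner, (table_name, []))[1].append(feature)
    return targets


targets = determine_target_participants(["A", "C"], TABLE_ATTRIBUTES)
```

For features A and C the result names the first and second participants only; the third participant receives no query request.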
Step S303, determining a query request corresponding to each target participant.
In step S302, the table names and features of the data tables corresponding to the features to be queried have been determined. Based on these, the query condition corresponding to each target participant can be determined, and the query request to be sent to each target participant is determined from its query condition.
Continuing the above example, because it must be established that the feature data of feature A and the feature data of feature C come from the same user, the user IDs corresponding to the feature data must be queried together with the feature data, and the IDs are then used to determine which feature data belongs to the same user.
In one implementation, the feature data of each target participant may be queried separately, and the join may be performed afterwards.
In the implementation mode, when the feature data of the feature A to be queried is determined to be stored in the Table1 of the first participant, query conditions are obtained, wherein the query conditions are that IDs of all users in the Table1 and the feature data of the feature A are recorded as a SELECT a.ID and a.A from Table 1a, and a is the name of Table 1. A first query request is generated for transmission to the first participant in accordance with the query condition.
Similarly, in the case that the feature data of the feature C to be queried is determined to be stored in the Table2 of the second participant, query conditions are obtained, wherein the query conditions are that IDs of all users in the Table2 and the feature data of the feature C are recorded as SELECT b.ID, b.C from Table2 b, and b is the name of Table 2. A second query request is generated for transmission to a second participant in accordance with the query condition.
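The two independent queries of this implementation can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: each participant's local store is modelled as a separate in-memory SQLite database, the helper make_store and the sample rows are hypothetical, encryption is omitted, and ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

def make_store(table, columns, rows):
    """Build an in-memory database standing in for one participant's local data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
    marks = ", ".join("?" for _ in columns)
    conn.executemany(f"INSERT INTO {table} VALUES ({marks})", rows)
    return conn

# Hypothetical sample rows for the first and second participants.
first = make_store("Table1", ["ID", "A", "B"],
                   [(1, "a1", "b1"), (2, "a2", "b2"), (3, "a3", "b3"), (4, "a4", "b4")])
second = make_store("Table2", ["ID", "C", "D"],
                    [(2, "c2", "d2"), (4, "c4", "d4"), (5, "c5", "d5"), (6, "c6", "d6")])

# Neither query depends on the other, so the two requests could be sent concurrently.
first_result = first.execute(
    "SELECT a.ID, a.A FROM Table1 a ORDER BY a.ID").fetchall()
second_result = second.execute(
    "SELECT b.ID, b.C FROM Table2 b ORDER BY b.ID").fetchall()

print(first_result)
print(second_result)
```

Each participant returns every user it holds for the requested feature, which is why this variant transmits the most data.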
In another implementation, all users of one target participant and their feature data for one feature are queried first, and the users to be queried in the next query are then determined from the user identifiers in the previous query result. That is, this implementation determines the current query condition based on the previous query result, and then generates the query request sent to each participant.
In this implementation, once the feature data of feature A to be queried is determined to be stored in Table1 of the first participant, the query condition is obtained: query the IDs of all users in Table1 together with the feature data of feature A, written as SELECT a.ID, a.A FROM Table1 a. A first query request to be sent to the first participant is generated according to this query condition.
The first query request is sent to the first participant, which queries its data table and returns a first query result to the coordinator. The set of user identifiers in the first query result is denoted S1. The query then continues: because the feature data of feature C to be queried is stored in Table2 of the second participant, the query condition is to query the IDs of the users in Table2 that belong to S1 together with the feature data of feature C, written as SELECT b.ID, b.C FROM Table2 b WHERE b.ID IN S1. A second query request to be sent to the second participant is generated according to this query condition.
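The dependency between the two queries can be sketched as follows. This is a minimal illustration, assuming in-memory SQLite databases as stand-ins for the participants, hypothetical sample rows, and plaintext identifiers (encryption is omitted here); the set S1 is expanded into bound parameters of an IN clause.

```python
import sqlite3

# Hypothetical local stores of the first and second participants.
first = sqlite3.connect(":memory:")
first.execute("CREATE TABLE Table1 (ID, A, B)")
first.executemany("INSERT INTO Table1 VALUES (?, ?, ?)",
                  [(1, "a1", "b1"), (2, "a2", "b2"), (3, "a3", "b3"), (4, "a4", "b4")])
second = sqlite3.connect(":memory:")
second.execute("CREATE TABLE Table2 (ID, C, D)")
second.executemany("INSERT INTO Table2 VALUES (?, ?, ?)",
                   [(2, "c2", "d2"), (4, "c4", "d4"), (5, "c5", "d5"), (6, "c6", "d6")])

# First query: all users of Table1 with feature A.
first_result = first.execute(
    "SELECT a.ID, a.A FROM Table1 a ORDER BY a.ID").fetchall()

# S1 is taken from the first result, so the second query cannot start earlier.
s1 = [row[0] for row in first_result]
marks = ", ".join("?" for _ in s1)
second_sql = f"SELECT b.ID, b.C FROM Table2 b WHERE b.ID IN ({marks}) ORDER BY b.ID"
second_result = second.execute(second_sql, s1).fetchall()

print(second_result)
```

Only the users already present in the first result are returned by the second participant, which reduces the transmitted data at the cost of sequential execution.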
In yet another implementation, the users of each target participant may be queried separately, a join query is performed on these users to determine the common users, and the feature data of the common users is then queried from each target participant.
In this implementation, once the feature data of feature A to be queried is determined to be stored in Table1 of the first participant, the query condition is obtained: query the IDs of all users in Table1, written as SELECT a.ID FROM Table1 a. Likewise, because the feature data of feature C to be queried is stored in Table2 of the second participant, the query condition is obtained: query the IDs of all users in Table2, written as SELECT b.ID FROM Table2 b. The common users are then determined from the results of the two queries, and the user identifier set of the common users is denoted S2 = {s1, s2}.
After the common users are determined, the query condition of the first participant is determined: query the feature data of feature A for the users in Table1 that belong to S2, written as SELECT a.ID, a.A FROM Table1 a WHERE a.ID IN S2. A first query request to be sent to the first participant is generated according to this query condition.
Similarly, after the common users are determined, the query condition of the second participant is determined: query the feature data of feature C for the users in Table2 that belong to S2, written as SELECT b.ID, b.C FROM Table2 b WHERE b.ID IN S2. A second query request to be sent to the second participant is generated according to this query condition.
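The two rounds of this implementation can be sketched as below. This is an illustration under assumptions: in-memory SQLite databases stand in for the participants, the helper query_features and the sample rows are hypothetical, identifiers are plaintext, and the common users are computed as a set intersection (the inner-join case).

```python
import sqlite3

# Hypothetical local stores of the first and second participants.
first = sqlite3.connect(":memory:")
first.execute("CREATE TABLE Table1 (ID, A, B)")
first.executemany("INSERT INTO Table1 VALUES (?, ?, ?)",
                  [(1, "a1", "b1"), (2, "a2", "b2"), (3, "a3", "b3"), (4, "a4", "b4")])
second = sqlite3.connect(":memory:")
second.execute("CREATE TABLE Table2 (ID, C, D)")
second.executemany("INSERT INTO Table2 VALUES (?, ?, ?)",
                   [(2, "c2", "d2"), (4, "c4", "d4"), (5, "c5", "d5"), (6, "c6", "d6")])

# Round 1: only user IDs are requested from each participant.
ids_a = {row[0] for row in first.execute("SELECT a.ID FROM Table1 a")}
ids_b = {row[0] for row in second.execute("SELECT b.ID FROM Table2 b")}
s2 = sorted(ids_a & ids_b)  # common users

def query_features(conn, table, alias, feature, ids):
    """Round 2: request one feature for the listed users only."""
    marks = ", ".join("?" for _ in ids)
    sql = (f"SELECT {alias}.ID, {alias}.{feature} FROM {table} {alias} "
           f"WHERE {alias}.ID IN ({marks}) ORDER BY {alias}.ID")
    return conn.execute(sql, ids).fetchall()

# The two round-2 requests are independent and could be sent concurrently.
first_result = query_features(first, "Table1", "a", "A", s2)
second_result = query_features(second, "Table2", "b", "C", s2)
print(first_result, second_result)
```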
In the first implementation, the coordinator needs to interact with each target participant only once. Its advantages are a small number of interactions and the mutual independence of the interactions between the coordinator and the target participants, which allows concurrent queries. Its disadvantage is that each interaction transmits the feature data of all users for the features to be queried held by the target participant, so a large amount of data is transmitted, transmission is slow and network resources are consumed. In the second implementation, the coordinator also needs to interact with each target participant only once. Its advantages are a small number of interactions and a moderate amount of transmitted data. Its disadvantage is that each query depends on the previous query result, so the queries cannot run concurrently and query speed is affected. In the third implementation, the coordinator needs to interact with each target participant twice. Its advantages are a small amount of transmitted data and the ability to query concurrently once the common users are determined. Its disadvantage is the larger number of interactions. In practical applications, one implementation can be selected according to actual conditions such as the amount of data to be queried and the join mode.
Step S304, the public key and the query request corresponding to each target participant are sent to the corresponding target participant.
Step S305, receiving ciphertext query results sent by each target participant.
Each ciphertext query result is determined by a target participant based on the public key and the query request that it received.
The coordinator sends the public key and the first query request to the first participant. In response to the first query request, the first participant queries the data table Table1 that it holds and obtains a first plaintext query result. To ensure that the data is not leaked, the first participant encrypts the first plaintext query result with the public key to obtain a first ciphertext query result, and returns the first ciphertext query result to the coordinator.
Similarly, the coordinator sends the public key and the second query request to the second participant. In response to the second query request, the second participant queries the data table Table2 that it holds and obtains a second plaintext query result. The second participant encrypts the second plaintext query result with the public key to obtain a second ciphertext query result, and returns the second ciphertext query result to the coordinator.
Step S306, determining target data according to the private key and the ciphertext query results sent by each target participant.
After receiving the first ciphertext query result from the first participant and the second ciphertext query result from the second participant, the coordinator performs a cross-table join query on the two ciphertext query results according to a predefined join mode to obtain a join query result, which is ciphertext data. The coordinator then decrypts the ciphertext data with the private key to obtain the target data. In this way, cross-table query of the data is achieved while the data never leaves each participant's local environment, so data privacy is protected, data leakage is prevented, and the usability of the queried target data is not affected.
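The encrypt, join and decrypt flow of steps S304 to S306 can be sketched end to end. The description does not fix an encryption scheme at this point, so the following is only an assumption-laden illustration: it uses unpadded textbook RSA with toy parameters, chosen solely because it is deterministic (equal identifiers give equal ciphertexts, so ciphertext IDs can be matched). It is not secure, and all names (encrypt, decrypt, participant_answer) and sample rows are hypothetical. It requires Python 3.8 or later for the modular inverse via pow.

```python
import math

# Toy textbook RSA: deterministic and unpadded. NOT secure; an assumption made
# only so that equal user IDs encrypt to equal ciphertexts and can be joined.
P, Q = 2**61 - 1, 2**89 - 1          # two known Mersenne primes
N, E = P * Q, 65537
PHI = (P - 1) * (Q - 1)
assert math.gcd(E, PHI) == 1
PUBLIC_KEY = (E, N)                   # sent to every target participant
PRIVATE_KEY = (pow(E, -1, PHI), N)    # held by the coordinator only

def encrypt(value, public_key):
    e, n = public_key
    m = int.from_bytes(str(value).encode(), "big")
    if m >= n:
        raise ValueError("value too long for this toy modulus")
    return pow(m, e, n)

def decrypt(cipher, private_key):
    d, n = private_key
    m = pow(cipher, d, n)
    return m.to_bytes((m.bit_length() + 7) // 8, "big").decode()

def participant_answer(rows, public_key):
    """Runs at a participant: encrypt every ID and feature value before returning."""
    return {encrypt(uid, public_key): encrypt(val, public_key) for uid, val in rows}

# Plaintext rows never leave the participants; only ciphertext is returned.
first_ct = participant_answer([(1, "a1"), (2, "a2"), (3, "a3"), (4, "a4")], PUBLIC_KEY)
second_ct = participant_answer([(2, "c2"), (4, "c4"), (5, "c5"), (6, "c6")], PUBLIC_KEY)

# Coordinator: inner join on ciphertext IDs, then decrypt the joined rows.
ciphertext_rows = [(cid, first_ct[cid], second_ct[cid])
                   for cid in first_ct if cid in second_ct]
target_data = sorted(tuple(decrypt(c, PRIVATE_KEY) for c in row)
                     for row in ciphertext_rows)
print(target_data)
```

In this sketch decrypted values come back as strings, because every cell is encoded as text before encryption.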
In the embodiment of the present application, the private key is held only by the coordinator. Even if another participant intercepts a ciphertext query result, it cannot obtain the private key and therefore cannot decrypt the ciphertext query result into the plaintext query result. The cross-table join query of user feature data is thus achieved while the private data of each participant is protected.
In the data query method provided by the embodiment of the present application, the coordinator generates a public key for encryption and a private key for decryption; determines at least two target participants from the multiple federated learning participants according to at least two features to be queried; determines the query request corresponding to each target participant, where each target participant stores the feature data of part of the features to be queried; sends the public key and the corresponding query request to each target participant; receives the ciphertext query results sent by the target participants, each of which is determined by a target participant based on the public key and the query request it received; and determines the target data according to the private key and the ciphertext query results. Because the data never leaves each participant's local environment, cross-table query of the data is achieved while data privacy is protected, data leakage is prevented, and the usability of the queried target data is not affected.
In some embodiments, step S302 "determining at least two target participants from the multiple federated learning participants according to at least two features to be queried" in the embodiment shown in Fig. 3 may be implemented by the following steps:
Step S3021, obtaining, from the multiple federated learning participants, the table attributes of the data table stored by each participant.
The method provided by the embodiment of the present application can be applied to any number of features to be queried that is two or more, provided that the feature data of at least two of the features to be queried is stored at two different participants. The following takes two features to be queried as an example, denoted feature A and feature C.
The coordinator obtains the table attributes of all federated learning participants from those participants in advance. The table attributes here include the table name and the features of the data table. For example, there are 3 federated learning participants: a first participant, a second participant and a third participant.
The data table of the first participant is shown in Table 1 below, the data table of the second participant is shown in Table 2 below, and the data table of the third participant is shown in Table 3 below.
Table 1: Data table Table1 of the first participant
ID A B
1 a1 b1
2 a2 b2
3 a3 b3
4 a4 b4
Table 2: Data table Table2 of the second participant
ID C D
2 c2 d2
4 c4 d4
5 c5 d5
6 c6 d6
Table 3: Data table Table3 of the third participant
ID E F
2 e2 f2
5 e5 f5
7 e7 f7
8 e8 f8
The feature data of each participant, such as (1, 2, 3, 4, a1, a2, a3, a4, b1, b2, b3, b4) of the first participant, is kept secret from the other participants and from the coordinator, but the table attributes of the data table, such as (Table1; ID, A, B) of the first participant, are not kept secret, so the coordinator can obtain the table attributes of each participant's data table from that participant. For example, the coordinator obtains the table attributes of the data tables stored by the 3 participants: the table attributes obtained from the first participant are (Table1; ID, A, B), those obtained from the second participant are (Table2; ID, C, D), and those obtained from the third participant are (Table3; ID, E, F), where Table1, Table2 and Table3 are table names, ID is the identifier that distinguishes different users (for example, an identification number), and A, B, C, D, E and F are features.
Step S3022, determining a data table corresponding to each feature to be queried according to at least two features to be queried and table attributes of the data tables stored by each participant.
The feature data of the at least two features to be queried is stored at different federated learning participants.
According to feature A to be queried, feature C to be queried and the obtained table attributes (Table1; ID, A, B), (Table2; ID, C, D) and (Table3; ID, E, F), the coordinator determines that the data table corresponding to feature A is Table1 and the data table corresponding to feature C is Table2.
In step S3023, the participants corresponding to the data tables are determined as target participants.
The participant corresponding to a data table is the participant that stores the data table. The participant corresponding to Table1 is the first participant and the participant corresponding to Table2 is the second participant, so the first participant and the second participant are determined to be the target participants.
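Steps S3021 to S3023 amount to a lookup from each feature to be queried to the participant whose table attributes list that feature. A minimal sketch, assuming the table attributes are held in a Python dictionary (the function name find_target_participants and the data layout are hypothetical):

```python
def find_target_participants(features, table_attributes):
    """Map each feature to be queried to (participant, table name)."""
    targets = {}
    for feature in features:
        for participant, (table, columns) in table_attributes.items():
            if feature in columns:
                targets[feature] = (participant, table)
                break
        else:
            # No participant advertises this feature in its table attributes.
            raise LookupError(f"no participant stores feature {feature!r}")
    return targets

# Non-secret table attributes collected by the coordinator (step S3021).
table_attributes = {
    "first participant": ("Table1", ["ID", "A", "B"]),
    "second participant": ("Table2", ["ID", "C", "D"]),
    "third participant": ("Table3", ["ID", "E", "F"]),
}

targets = find_target_participants(["A", "C"], table_attributes)
print(targets)
```

Only table names and feature names are consulted; no feature data is read by the coordinator in this step.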
In the embodiment of the present application, after receiving the query instruction, the coordinator obtains from each federated learning participant the table attributes that do not need to be kept secret. The table attributes characterize the features stored by each participant, so the target participants that store the feature data of the features to be queried can be determined from the features to be queried and the table attributes of the data tables stored by the participants. Obtaining the non-secret table attributes from each participant makes it possible to determine the target participant corresponding to each feature to be queried, and thus to perform a cross-table join query while protecting data privacy.
In some embodiments, step S303 "determine the query request corresponding to each target participant" in the embodiment shown in fig. 3, may be implemented by the following steps:
Step S3031, determining the query condition corresponding to each target participant according to each feature to be queried and the data table corresponding to each feature to be queried.
In determining the query conditions, this may be accomplished based on three different implementations:
In the first implementation, all the features to be queried are sorted to determine a query order; the feature to be queried in the current query is determined according to the query order; and the query condition of the target participant corresponding to the current query is determined according to that feature and its corresponding data table.
This implementation can be applied to scenarios in which the join mode is inner join, left join, right join or full join. The purpose of determining the query order is to ensure that every target participant is interacted with. In practice, as long as every target participant is interacted with, the queries need not follow the query order.
In this implementation, once the feature data of feature A to be queried is determined to be stored in Table1 of the first participant, the query condition is obtained: query the IDs of all users in Table1 together with the feature data of feature A, written as SELECT a.ID, a.A FROM Table1 a.
Similarly, once the feature data of feature C to be queried is determined to be stored in Table2 of the second participant, the query condition is obtained: query the IDs of all users in Table2 together with the feature data of feature C, written as SELECT b.ID, b.C FROM Table2 b.
In this implementation, the coordinator needs to interact with each target participant only once. Its advantages are a small number of interactions and the mutual independence of the interactions between the coordinator and the target participants, which allows concurrent queries. Its disadvantage is that each interaction transmits the feature data of all users for the features to be queried held by the target participant, so a large amount of data is transmitted, transmission is slow and network resources are consumed.
In the second implementation, all the features to be queried are sorted to determine a query order, and the feature to be queried in the current query is determined according to the query order. When the current query is the first query, the query condition of the target participant corresponding to the first query is determined according to the first feature to be queried and its corresponding data table. When the current query is not the first query, the ciphertext identifiers are extracted from the ciphertext query result obtained by the previous query, and the query condition of the target participant corresponding to the current query is determined based on those ciphertext identifiers, the current feature to be queried and its corresponding data table.
This implementation can be applied to scenarios in which the join mode is inner join, left join or right join. When the join mode is inner join, the purpose of determining the query order is to ensure that every target participant is interacted with. When the join mode is left join or right join, the purpose is also to ensure that every required result is queried. For a left join, the left table of the join must be queried before the other tables, so that the feature data of all users of the left table can be obtained. For a right join, the right table of the join must be queried before the other tables, so that the feature data of all users of the right table can be obtained. This implementation is not suitable for scenarios in which the join mode is full join.
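The ordering rule for the second implementation can be sketched as a small function. This is a hypothetical helper, assuming exactly two features and mode names "inner", "left", "right" and "full" chosen for illustration:

```python
def plan_query_order(left_feature, right_feature, mode):
    """Return the order in which the two features are queried (second implementation)."""
    if mode in ("inner", "left"):
        # For a left join the left table must come first; for an inner join
        # either order works, so the left-first order is kept.
        return [left_feature, right_feature]
    if mode == "right":
        # For a right join the right table must come first.
        return [right_feature, left_feature]
    if mode == "full":
        raise ValueError("the second implementation does not support full join")
    raise ValueError(f"unknown join mode: {mode!r}")

print(plan_query_order("A", "C", "left"))
print(plan_query_order("A", "C", "right"))
```

The table queried first is the one whose users must all appear in the result, because every later query is filtered by the identifiers returned by the earlier one.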
In this implementation, suppose the determined query order is to query feature A first and then feature C. Because the feature data of feature A to be queried is stored in Table1 of the first participant, the first query condition is obtained: query the IDs of all users in Table1 together with the feature data of feature A, written as SELECT a.ID, a.A FROM Table1 a.
To determine the query condition for feature C to be queried, the coordinator needs to obtain the first query result for feature A and determine the query condition for feature C based on it. The steps are as follows. The coordinator generates a first query request to be sent to the first participant according to the first query condition and sends it to the first participant. The first participant queries its data table to obtain a first plaintext query result, encrypts the first plaintext query result with the public key to obtain a first ciphertext query result, and returns the first ciphertext query result to the coordinator. The coordinator extracts a user identifier set S1 = {s1, s2, s3} from the first ciphertext query result, where s1, s2 and s3 are encrypted user identifiers. Because the feature data of feature C to be queried is stored in Table2 of the second participant, the second query condition is determined: query the IDs of the users in Table2 that belong to S1 together with the feature data of feature C, written as SELECT b.ID, b.C FROM Table2 b WHERE b.ID IN S1.
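Because S1 contains encrypted identifiers, the second participant has to compare them with its own plaintext IDs somehow. The description does not say how; one possible approach, shown here purely as an assumption, is for the participant to encrypt each local ID with the same public key and keep the rows whose ciphertext ID is in S1. This only works with a deterministic scheme, so the sketch uses unpadded textbook RSA with toy parameters (not secure). All names and sample rows are hypothetical.

```python
# Toy deterministic public key (textbook RSA, two Mersenne primes). NOT secure.
PUBLIC_KEY = (65537, (2**61 - 1) * (2**89 - 1))

def encrypt(value, public_key):
    e, n = public_key
    return pow(int.from_bytes(str(value).encode(), "big"), e, n)

def answer_query(rows, public_key, wanted_cipher_ids=None):
    """Runs at a participant: optionally filter by ciphertext ID, then encrypt."""
    result = []
    for user_id, value in rows:
        cipher_id = encrypt(user_id, public_key)
        if wanted_cipher_ids is None or cipher_id in wanted_cipher_ids:
            result.append((cipher_id, encrypt(value, public_key)))
    return result

table1 = [(1, "a1"), (2, "a2"), (3, "a3"), (4, "a4")]   # first participant
table2 = [(2, "c2"), (4, "c4"), (5, "c5"), (6, "c6")]   # second participant

# First query: unfiltered. The coordinator extracts S1 from the ciphertext result.
first_ct = answer_query(table1, PUBLIC_KEY)
s1 = {cipher_id for cipher_id, _ in first_ct}

# Second query: the second participant returns only users whose encrypted ID is in S1.
second_ct = answer_query(table2, PUBLIC_KEY, wanted_cipher_ids=s1)
print(len(first_ct), len(second_ct))
```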
In this implementation, the coordinator needs to interact with each target participant only once. Its advantages are a small number of interactions and a moderate amount of transmitted data. Its disadvantage is that each query depends on the previous query result, so the queries cannot run concurrently and query speed is affected.
In the third implementation, all the features to be queried are sorted to determine a query order; the target ciphertext identifiers of all target participants are obtained; the feature to be queried in the current query is determined according to the query order; and the query condition of the target participant corresponding to the current query is determined according to that feature, the target ciphertext identifiers and the data table corresponding to that feature.
This implementation can be applied to scenarios in which the join mode is inner join or full join. When the join mode is inner join, the target ciphertext identifiers are the intersection of the ciphertext identifiers of all target participants; when the join mode is full join, they are the union of the ciphertext identifiers of all target participants. The purpose of determining the query order is to ensure that every target participant is interacted with. In practice, as long as every target participant is interacted with, the queries need not follow the query order.
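The rule for the target ciphertext identifiers maps directly onto set operations. A minimal sketch, assuming ciphertext identifiers are represented by placeholder strings in the "[[ ]]" notation and that the function name and mode names are hypothetical:

```python
def target_identifiers(id_sets, mode):
    """Combine the ciphertext identifier sets of all target participants."""
    if mode == "inner":
        return set.intersection(*id_sets)   # users present at every participant
    if mode == "full":
        return set.union(*id_sets)          # users present at any participant
    raise ValueError("the third implementation covers inner and full joins only")

# Placeholder ciphertext identifiers returned by the two target participants.
sa = {"[[1]]", "[[2]]", "[[3]]", "[[4]]"}
sb = {"[[2]]", "[[4]]", "[[5]]", "[[6]]"}

print(sorted(target_identifiers([sa, sb], "inner")))
print(sorted(target_identifiers([sa, sb], "full")))
```

The same function works for more than two target participants, since intersection and union accept any number of sets.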
In this implementation, the users of each target participant are queried separately, and the users of the target participants are joined according to the join mode to determine the target users. Once the feature data of feature A to be queried is determined to be stored in Table1 of the first participant, the query condition is obtained: query the IDs of all users in Table1, written as SELECT a.ID FROM Table1 a. Likewise, because the feature data of feature C to be queried is stored in Table2 of the second participant, the query condition is obtained: query the IDs of all users in Table2, written as SELECT b.ID FROM Table2 b. The target users are then determined from the two ciphertext query results, and the user identifier set of the target users is denoted S2 = {s1, s2}, where s1 and s2 are encrypted user identifiers.
After the target users are determined, a query request is sent to each target participant to query the feature data of the target users. The first query condition, for the first participant, is to query the feature data of feature A for the users in Table1 that belong to S2, written as SELECT a.ID, a.A FROM Table1 a WHERE a.ID IN S2. Similarly, the second query condition, for the second participant, is to query the feature data of feature C for the users in Table2 that belong to S2, written as SELECT b.ID, b.C FROM Table2 b WHERE b.ID IN S2.
Step S3032, a query request corresponding to each target participant is generated based on the query conditions corresponding to each target participant.
In the method provided by the embodiment of the present application, the query condition corresponding to each target participant is determined according to each feature to be queried and its corresponding data table, the corresponding query request is generated from the query condition, and the query request is sent to the corresponding target participant, thereby querying the data of each target participant.
In some embodiments, step S306 "determining the target data according to the private key and the ciphertext query result sent by each target participant" in the embodiment shown in fig. 3 may be implemented by the following steps:
Step S3061, joining the ciphertext query results sent by all target participants according to a predefined join mode to obtain ciphertext data.
Here, the join modes include inner join, left join, right join and full join.
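The four join modes can be sketched over two ciphertext query results keyed by ciphertext ID. This is an illustration only: ciphertexts are represented by placeholder strings in the "[[ ]]" notation, Python None stands for NULL, and the function name join_results is hypothetical.

```python
def join_results(left, right, mode):
    """Join two {ciphertext_id: ciphertext_value} results on the ciphertext ID."""
    if mode == "inner":
        keys = [k for k in left if k in right]
    elif mode == "left":
        keys = list(left)
    elif mode == "right":
        keys = list(right)
    elif mode == "full":
        keys = list(left) + [k for k in right if k not in left]
    else:
        raise ValueError(f"unknown join mode: {mode!r}")
    # A missing value on either side becomes None (NULL).
    return [(k, left.get(k), right.get(k)) for k in keys]

# Placeholder ciphertext query results from the first and second participants.
first_ct = {"[[1]]": "[[a1]]", "[[2]]": "[[a2]]", "[[3]]": "[[a3]]", "[[4]]": "[[a4]]"}
second_ct = {"[[2]]": "[[c2]]", "[[4]]": "[[c4]]", "[[5]]": "[[c5]]", "[[6]]": "[[c6]]"}

for mode in ("inner", "left", "right", "full"):
    print(mode, join_results(first_ct, second_ct, mode))
```

The join compares ciphertext IDs for equality without decrypting them, which presupposes that the same plaintext ID always encrypts to the same ciphertext.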
In the embodiment of the present application, the case in which the join mode is inner join is illustrated using the data tables provided in step S3021, with the query conditions determined by the first implementation in step S3031.
The first query condition carried in the first query request sent by the coordinator to the first participant is SELECT a.ID, a.A FROM Table1 a. The first participant queries Table1 according to the first query condition, and the obtained first plaintext query result is shown in Table 4:
Table 4: First plaintext query result for the inner join
ID A
1 a1
2 a2
3 a3
4 a4
The first participant encrypts the first plaintext query result with the public key to obtain a first ciphertext query result. In the embodiment of the present application, "[[ ]]" denotes encryption. The obtained first ciphertext query result is shown in Table 5:
Table 5: First ciphertext query result for the inner join
ID A
[[1]] [[a1]]
[[2]] [[a2]]
[[3]] [[a3]]
[[4]] [[a4]]
The second query condition carried in the second query request sent by the coordinator to the second participant is SELECT b.ID, b.C FROM Table2 b. The second participant queries Table2 according to the second query condition, and the obtained second plaintext query result is shown in Table 6:
Table 6: Second plaintext query result for the inner join
ID C
2 c2
4 c4
5 c5
6 c6
The second participant encrypts the second plaintext query result with the public key, and the obtained second ciphertext query result is shown in Table 7:
Table 7: Second ciphertext query result for the inner join
ID C
[[2]] [[c2]]
[[4]] [[c4]]
[[5]] [[c5]]
[[6]] [[c6]]
The coordinator receives the first ciphertext query result shown in Table 5 from the first participant and the second ciphertext query result shown in Table 7 from the second participant, and performs an inner join on the two ciphertext query results. The obtained ciphertext data is shown in Table 8:
Table 8: Ciphertext data obtained by the inner join
ID A C
[[2]] [[a2]] [[c2]]
[[4]] [[a4]] [[c4]]
In the embodiment of the present application, the case in which the join mode is left join is illustrated using the data tables provided in step S3021, with the query conditions determined by the second implementation in step S3031. The query order is to query feature A first and then feature C.
The first query condition carried in the first query request sent by the coordinator to the first participant is SELECT a.ID, a.A FROM Table1 a. The first participant queries Table1 according to the first query condition, and the obtained first plaintext query result is shown in Table 9:
Table 9: First plaintext query result for the left join
ID A
1 a1
2 a2
3 a3
4 a4
The first participant encrypts the first plaintext query result with the public key, and the obtained first ciphertext query result is shown in Table 10:
Table 10: First ciphertext query result for the left join
ID A
[[1]] [[a1]]
[[2]] [[a2]]
[[3]] [[a3]]
[[4]] [[a4]]
The coordinator receives the first ciphertext query result shown in Table 10 from the first participant and extracts the ciphertext identifier set S1 = {[[1]], [[2]], [[3]], [[4]]} from it. The second query condition, corresponding to feature C to be queried, is determined as SELECT b.ID, b.C FROM Table2 b WHERE b.ID IN S1. The second participant queries Table2 according to the second query condition, and the obtained second plaintext query result is shown in Table 11:
Table 11: Second plaintext query result for the left join
ID C
2 c2
4 c4
The second participant encrypts the second plaintext query result with the public key, and the obtained second ciphertext query result is shown in Table 12:
Table 12: Second ciphertext query result for the left join
ID C
[[2]] [[c2]]
[[4]] [[c4]]
The coordinator receives the first ciphertext query result shown in Table 10 from the first participant and the second ciphertext query result shown in Table 12 from the second participant, and performs a left join on the two ciphertext query results. The obtained ciphertext data is shown in Table 13:
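Table 13: Ciphertext data obtained by the left join
ID A C
[[1]] [[a1]] NULL
[[2]] [[a2]] [[c2]]
[[3]] [[a3]] NULL
[[4]] [[a4]] [[c4]]
Here, the values of C for the IDs [[1]] and [[3]] are NULL because the second ciphertext query result contains no feature data for those IDs.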
In the embodiment of the present application, the case in which the join mode is full join is illustrated using the data tables provided in step S3021, with the query conditions determined by the third implementation in step S3031.
The coordinator first queries the users of each target participant separately, and joins the users of the target participants in the full-join mode to determine the target users.
The coordinator sends a query request carrying the query condition SELECT a.ID FROM Table1 a to the first participant and obtains the ciphertext identifier set Sa = {[[1]], [[2]], [[3]], [[4]]}. It sends a query request carrying the query condition SELECT b.ID FROM Table2 b to the second participant and obtains the ciphertext identifier set Sb = {[[2]], [[4]], [[5]], [[6]]}. A full join of Sa and Sb gives the target users S2 = {[[1]], [[2]], [[3]], [[4]], [[5]], [[6]]}.
The first query condition carried in the first query request sent by the coordinator to the first participant is SELECT a.ID, a.A FROM Table1 a WHERE a.ID IN S2. The first participant queries Table1 according to the first query condition, and the obtained first plaintext query result is shown in Table 14:
Table 14: First plaintext query result for the full join
ID A
[[1]] a1
[[2]] a2
[[3]] a3
[[4]] a4
[[5]] NULL
[[6]] NULL
The first participant encrypts the first plaintext query result with the public key, and the obtained first ciphertext query result is shown in Table 15:
Table 15: First ciphertext query result for the full join
ID A
[[1]] [[a1]]
[[2]] [[a2]]
[[3]] [[a3]]
[[4]] [[a4]]
[[5]] NULL
[[6]] NULL
The second query condition carried in the second query request sent by the coordinator to the second participant is SELECT b.ID, b.C FROM Table2 b WHERE b.ID IN S2. The second participant queries Table2 according to the second query condition, and the obtained second plaintext query result is shown in Table 16:
Table 16: Second plaintext query result for the full join
ID C
[[1]] NULL
[[2]] c2
[[3]] NULL
[[4]] c4
[[5]] c5
[[6]] c6
The second participant encrypts the second plaintext query result with the public key, and the obtained second ciphertext query result is shown in Table 17:
Table 17: Second ciphertext query result for the full join
ID C
[[1]] NULL
[[2]] [[c2]]
[[3]] NULL
[[4]] [[c4]]
[[5]] [[c5]]
[[6]] [[c6]]
The coordinator receives the first ciphertext query result shown in Table 15 from the first participant and the second ciphertext query result shown in Table 17 from the second participant, and performs a full join on the two ciphertext query results. The obtained ciphertext data is shown in Table 18:
Table 18: Ciphertext data obtained by the full join
ID A C
[[1]] [[a1]] NULL
[[2]] [[a2]] [[c2]]
[[3]] [[a3]] NULL
[[4]] [[a4]] [[c4]]
[[5]] NULL [[c5]]
[[6]] NULL [[c6]]
Step S3062, decrypting the ciphertext data with the private key to obtain the target data.
The coordinator decrypts the ciphertext data obtained in step S3061 with the pre-generated private key to obtain the target data. For example, decrypting the ciphertext data shown in Table 8, Table 13 and Table 18 in step S3061 gives the corresponding target data shown in Table 19, Table 20 and Table 21, respectively:
Table 19: Target data obtained by the inner join
ID A C
2 a2 c2
4 a4 c4
Table 20 target data obtained from the left connection
ID A C
1 a1 NULL
2 a2 c2
3 a3 NULL
4 a4 c4
Table 21 target data obtained from the full connection
ID A C
1 a1 NULL
2 a2 c2
3 a3 NULL
4 a4 c4
5 NULL c5
6 NULL c6
In the embodiment of the application, the coordinator can homomorphically decrypt the ciphertext data by using the pre-generated private key to obtain the target data, where the target data is the feature data of the features to be queried.
According to the method provided by the embodiment of the application, the coordinator connects the ciphertext query results sent by the target participants according to the predefined connection mode to obtain ciphertext data, and decrypts the ciphertext data by using the private key to obtain the target data, that is, the feature data of the features to be queried. Cross-table query of the data is thus realized while the data never leaves each participant's local environment, and the availability of the queried target data is not affected.
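As a rough illustration of connecting by a predefined connection mode and then decrypting, the following Python sketch handles the four modes in one function. The names `connect` and `decrypt` are hypothetical, and `decrypt` is only a placeholder that strips the [[...]] notation; it stands in for real private-key (for example homomorphic) decryption and provides no security.

```python
def connect(first, second, mode):
    """Join two {encrypted_id: encrypted_value} results under a connection mode."""
    if mode == "inner":
        ids = [i for i in first if i in second]
    elif mode == "left":
        ids = list(first)
    elif mode == "right":
        ids = list(second)
    elif mode == "full":
        ids = list(first) + [i for i in second if i not in first]
    else:
        raise ValueError(f"unknown connection mode: {mode}")
    return [(i, first.get(i), second.get(i)) for i in ids]


def decrypt(cipher):
    # Placeholder for private-key decryption (assumption): strips the [[...]] marker.
    return None if cipher is None else cipher[2:-2]


first = {"[[1]]": "[[a1]]", "[[2]]": "[[a2]]", "[[3]]": "[[a3]]", "[[4]]": "[[a4]]"}
second = {"[[2]]": "[[c2]]", "[[4]]": "[[c4]]", "[[5]]": "[[c5]]", "[[6]]": "[[c6]]"}

for mode in ("inner", "left", "right", "full"):
    target = [tuple(decrypt(c) for c in row) for row in connect(first, second, mode)]
    print(mode, target)
```

Missing values stay `None` through decryption, which corresponds to the NULL entries in the target data tables.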
Based on the foregoing embodiments, the embodiment of the present application further provides a data query method applied to a target participant in the network architecture shown in fig. 1. Fig. 4 is a schematic flow chart of another implementation of the data query method provided by the embodiment of the present application. As shown in fig. 4, the data query method includes the following steps:
Step S401, receiving a public key and a query request sent by a coordinator.
When the data cross-table query is carried out, the coordinator generates a public key for encryption and a private key for decryption, and then the public key is sent to each target participant, so that each target participant can carry out the data cross-table query by utilizing the public key on the premise of ensuring the respective data privacy.
Step S402, parsing the query request to obtain the query conditions carried by the query request.
After receiving the query request, the target participant parses the query request to obtain the query conditions carried by the query request. Continuing the above example, when the target participant is the first participant, the first query request is parsed to obtain the first query condition.
In the embodiment of the present application, the implementation manner of determining the query condition by the coordinator according to the feature to be queried and the data table corresponding to the feature to be queried may refer to the detailed description of step S3031.
Step S403, searching in the storage space based on the query condition to obtain a plaintext query result.
For example, the target participant is the first participant, and its data table Table1 is shown in Table 1. The first query condition carried in the first query request sent by the coordinator to the first participant is SELECT a.id, a.A from Table1 a. The first participant queries Table1 according to the first query condition, and the obtained first plaintext query result is shown in Table 4.
Step S404, encrypting the plaintext inquiry result by using the public key to obtain the ciphertext inquiry result.
The first participant encrypts the first plaintext query result using the public key, and the obtained first ciphertext query result is shown in Table 5.
Step S405, the ciphertext query result is sent to the coordinator.
The data query method is applied to a target participant of federal learning, where the target participant is a participant storing part of the data to be queried. The method includes: receiving the public key and the query request sent by the coordinator; parsing the query request to obtain the query conditions carried by the query request; searching in the storage space of the target participant based on the query conditions to obtain a plaintext query result; encrypting the plaintext query result by using the public key to obtain a ciphertext query result; and sending the ciphertext query result to the coordinator. Therefore, cross-table query of the data is realized while the data never leaves the participant's local environment, so that data privacy is protected, data leakage is prevented, and the availability of the queried target data is not affected.
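Steps S401 to S405 on the target participant side might look like the following Python sketch, which uses the standard-library sqlite3 module as the participant's local storage. Everything here is an assumption made for illustration: the request format (a dictionary with a "condition" key), the function names, the sample rows of Table1, and in particular `encrypt`, which only marks values with the [[...]] notation and is not real public-key encryption.

```python
import sqlite3


def encrypt(value, public_key):
    # Placeholder only (assumption): marks a value as ciphertext, no real cryptography.
    return None if value is None else f"[[{value}]]"


def handle_query_request(conn, public_key, query_request):
    condition = query_request["condition"]        # S402: parse the query request
    rows = conn.execute(condition).fetchall()     # S403: plaintext query result
    # S404: encrypt every field; the returned list is what S405 would send back.
    return [tuple(encrypt(v, public_key) for v in row) for row in rows]


# Local storage of the first participant (sample rows assumed for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (id INTEGER, A TEXT, B TEXT)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)",
                 [(1, "a1", "b1"), (2, "a2", "b2"), (3, "a3", "b3"), (4, "a4", "b4")])

request = {"condition": "SELECT a.id, a.A from Table1 a"}   # S401: received request
ciphertext_result = handle_query_request(conn, "public-key", request)
print(ciphertext_result)
```

Only the encrypted rows leave the participant in this sketch; the plaintext rows exist only inside `handle_query_request`.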
Based on the foregoing embodiments, the embodiment of the present application further provides a data query method applied to the network architecture shown in fig. 1. Fig. 5 is a schematic flow chart of still another implementation of the data query method provided by the embodiment of the present application. As shown in fig. 5, the data query method includes the following steps:
In step S501, the coordinator generates a public key for encryption and a private key for decryption.
In the embodiment of the application, when the coordinator performs data query under federal learning, the coordinator generates a public key for encryption and a private key for decryption. Through the public key, each participant does not need to send the user characteristic data stored by the participant to the counterpart or the coordinator, and the privacy of the data of each participant can be protected.
In one implementation, the public key generated by the coordinator may be a public key for homomorphic encryption and the private key generated may be a private key for homomorphic decryption.
In step S502, the coordinator acquires table attributes of the data tables stored by each participant from the multiple participants of the federal learning.
The coordinator obtains the table attributes of the data table stored by each participant. For example, the table attribute obtained from the first participant is (Table1; ID, A, B), that obtained from the second participant is (Table2; ID, C, D), and that obtained from the third participant is (Table3; ID, E, F), where Table1, Table2 and Table3 are table names, ID is an identifier that distinguishes different users (such as an identification card number), and A, B, C, D, E and F are features.
In step S503, the coordinator determines the data table corresponding to each feature to be queried according to at least two features to be queried and table attributes of the data tables stored by each participant.
According to the feature A to be queried, the feature C to be queried and the obtained table attributes (Table1; ID, A, B), (Table2; ID, C, D) and (Table3; ID, E, F), the coordinator determines that the data table corresponding to the feature A to be queried is Table1 and the data table corresponding to the feature C to be queried is Table2.
In step S504, the coordinator determines the participant corresponding to each data table as the target participant.
Here, each target participant stores feature data of a part of the feature to be queried.
The participant corresponding to a data table is the participant storing that data table. The participant corresponding to Table1 is the first participant, and the participant corresponding to Table2 is the second participant, so the first participant and the second participant are determined to be the target participants.
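Steps S503 and S504 amount to a lookup from features to tables and from tables to participants. A minimal Python sketch, assuming the table attributes are held in a dictionary keyed by participant name (the structure `table_attributes` and the function `locate_features` are hypothetical):

```python
# Table attributes reported by each participant: (table name, column list).
table_attributes = {
    "first participant": ("Table1", ["ID", "A", "B"]),
    "second participant": ("Table2", ["ID", "C", "D"]),
    "third participant": ("Table3", ["ID", "E", "F"]),
}


def locate_features(features, attributes):
    """Map each feature to its data table and collect the participants storing them."""
    feature_to_table = {}
    targets = []
    for feature in features:
        for participant, (table, columns) in attributes.items():
            if feature in columns:
                feature_to_table[feature] = table
                if participant not in targets:
                    targets.append(participant)
    return feature_to_table, targets


feature_to_table, target_participants = locate_features(["A", "C"], table_attributes)
print(feature_to_table)       # {'A': 'Table1', 'C': 'Table2'}
print(target_participants)    # ['first participant', 'second participant']
```

The third participant is not selected because neither feature A nor feature C appears in its table attributes.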
In step S505, the coordinator determines the query conditions corresponding to the target participants according to the features to be queried and the data table corresponding to the features to be queried.
In determining the query conditions, this may be accomplished based on three different implementations:
The first implementation mode: sort the features to be queried to determine a query sequence; determine the feature to be queried this time according to the query sequence; and determine the query condition of the target participant corresponding to this query according to the feature to be queried this time and its corresponding data table.
The second implementation mode: sort the features to be queried to determine a query sequence; determine the feature to be queried this time according to the query sequence; when this query is the first query, determine the query condition of the target participant corresponding to the 1st query according to the 1st feature to be queried and its corresponding data table; when this query is not the first query, extract the ciphertext identification of the last ciphertext query result from the ciphertext query result obtained by the last query, and determine the query condition of the target participant corresponding to this query based on that ciphertext identification, the feature to be queried this time and its corresponding data table.
The third implementation mode: sort the features to be queried to determine a query sequence; obtain the target ciphertext identifications of the target participants; determine the feature to be queried this time according to the query sequence; and determine the query condition of the target participant corresponding to this query according to the feature to be queried this time, the target ciphertext identifications and the data table corresponding to the feature to be queried this time.
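The difference between a first query and a later query that is restricted by ciphertext identifications can be sketched as simple string construction. The function `query_condition` and its parameters are hypothetical, and the set of ciphertext identifications is referred to by name (for example S1), as in the sql statements of this description, rather than expanded into a literal list.

```python
def query_condition(feature, table, alias, id_set_name=None):
    """Build the query condition for one feature and its data table.

    When id_set_name is given, the query is restricted to the ciphertext
    identifications collected from an earlier ciphertext query result.
    """
    sql = f"SELECT {alias}.id, {alias}.{feature} from {table} {alias}"
    if id_set_name is not None:
        sql += f" where {alias}.id in {id_set_name}"
    return sql


first_condition = query_condition("A", "Table1", "a")
second_condition = query_condition("C", "Table2", "b", id_set_name="S1")
print(first_condition)    # SELECT a.id, a.A from Table1 a
print(second_condition)   # SELECT b.id, b.C from Table2 b where b.id in S1
```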
In step S506, the coordinator generates a query request corresponding to each target participant based on the query conditions corresponding to each target participant.
In step S507, the coordinator sends the public key and the query request corresponding to each target participant to the corresponding target participant.
In step S508, the target participant parses the query request to obtain the query conditions carried by the query request.
In the embodiment of the present application, the target participants are, for example, the first participant and the second participant. After receiving the first query request, the first participant parses it to obtain the first query condition carried by the first query request; after receiving the second query request, the second participant parses it to obtain the second query condition carried by the second query request.
Step S509, the target participant searches in the storage space based on the query condition to obtain a plaintext query result.
The first participant queries the data table Table1 according to the first query condition to obtain a first plaintext query result, and the second participant queries the data table Table2 according to the second query condition to obtain a second plaintext query result.
In step S510, the target participant encrypts the plaintext query result by using the public key to obtain the ciphertext query result.
The first participant homomorphically encrypts the first plaintext query result by using the public key to obtain a first ciphertext query result, and the second participant homomorphically encrypts the second plaintext query result by using the public key to obtain a second ciphertext query result.
In step S511, the target participant sends the ciphertext query result to the coordinator.
The first participant sends the first ciphertext query result to the coordinator, and the second participant sends the second ciphertext query result to the coordinator.
In step S512, the coordinator connects the ciphertext query results sent by the target participants according to the predefined connection mode, so as to obtain ciphertext data.
Here, the connection means includes an internal connection, a left connection, a right connection, and a full connection.
In step S513, the coordinator decrypts the ciphertext data using the private key to obtain the target data.
In the embodiment of the application, the coordinator can homomorphically decrypt the ciphertext data by using the pre-generated private key to obtain the target data, where the target data is the feature data of the features to be queried.
According to the data query method provided by the embodiment of the application, the coordinator determines at least two target participants from a plurality of participants of federal learning according to at least two features to be queried, determines the query request corresponding to each target participant, and then sends the pre-generated public key and the corresponding query request to each target participant. After receiving its query request, each target participant queries its own data table to obtain a plaintext query result, and encrypts the plaintext query result by using the public key to obtain a ciphertext query result. The coordinator receives the ciphertext query results sent by the target participants and determines the target data according to the private key and these ciphertext query results. Therefore, cross-table query of the data is realized while the data of each participant never leaves its local environment, so that data privacy is protected and data leakage is prevented. Because the data is not anonymized or otherwise altered, the availability of the queried target data is not affected.
In the following, an exemplary application of the embodiment of the present application in a practical application scenario will be described.
Database query is a common database technique. In general, current databases store data in the form of (key, value) pairs, where the key is the above identifier and the value is the above feature data. A database query means that a user looks up, by key, the value corresponding to that key in the database.
A multi-table query means that the data is distributed over multiple database tables, which must be combined to perform the query; internal connection, left connection and right connection are three typical modes. When the tables are distributed among different institutions or clients, they cannot be shared directly because leakage of private data must be prevented. The question is therefore how to query the connection result effectively without sharing the data.
Current database privacy protection query technologies include data desensitization, anonymization, differential privacy and homomorphic encryption. However, these schemes either concentrate the tables together, which violates the restriction in privacy protection laws on data leaving its domain, or prevent data disclosure by encryption or anonymization at the cost of reduced data availability.
The embodiment of the application provides a method for realizing cross-table connection query by utilizing federal learning technology. When the database tables are distributed among different clients, the data is not allowed to leave the local client. Referring to fig. 6, fig. 6 is a schematic diagram of database table distribution provided in an embodiment of the present application, in which databases 1 to n are distributed in different clients 1 to n under a server 2, and a database k is distributed in a client k, where k is an integer between 1 and n.
For ease of explanation, a connection query of two tables is taken as an example, wherein the two tables are respectively shown below, and Table22 is distributed in client a and Table23 is distributed in client B.
Table 22 database Table22 in client A
ID A B
1 a1 b1
2 a2 b2
3 a3 b3
4 a4 b4
Table 23 database Table23 in client B
ID C D
2 c2 d2
4 c4 d4
5 c5 d5
6 c6 d6
Federal learning implements an internal connection:
Definition of the internal connection (inner join): the rows of the two tables whose join fields are equal are returned. The syntax is as follows:
SELECT a.ID, a.A, b.C from Table22 a JOIN Table23 b ON a.ID = b.ID;
Table22 herein refers to Table22 in client A, and Table23 refers to Table23 in client B. The returned result values are:
Table 24 internal connection results of Table22 of client A and Table23 of client B
ID A C
2 a2 c2
4 a4 c4
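For reference, the plaintext result in Table 24 can be reproduced with Python's standard-library sqlite3 module when both tables are placed in one database. This is exactly the centralized setting that the method of the application avoids, so the snippet only serves as a baseline to compare against; the explicit join condition on ID is written out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table22 (ID INTEGER, A TEXT, B TEXT)")
conn.execute("CREATE TABLE Table23 (ID INTEGER, C TEXT, D TEXT)")
conn.executemany("INSERT INTO Table22 VALUES (?, ?, ?)",
                 [(1, "a1", "b1"), (2, "a2", "b2"), (3, "a3", "b3"), (4, "a4", "b4")])
conn.executemany("INSERT INTO Table23 VALUES (?, ?, ?)",
                 [(2, "c2", "d2"), (4, "c4", "d4"), (5, "c5", "d5"), (6, "c6", "d6")])

# Inner join on the shared ID column, ordered for a stable result.
rows = conn.execute(
    "SELECT a.ID, a.A, b.C FROM Table22 a JOIN Table23 b ON a.ID = b.ID ORDER BY a.ID"
).fetchall()
print(rows)  # [(2, 'a2', 'c2'), (4, 'a4', 'c4')]
```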
Fig. 7 is a schematic flow chart of an implementation of a data query method in which federal learning implements the internal connection according to an embodiment of the present application. As shown in fig. 7, the data query method includes the following steps:
Step S701, the server sends the public key for encryption to the client A and the client B, respectively.
Step S702, the server sends a first sql statement to the client A.
The first sql statement is SELECT a.id from Table22 a.
Step S703, the client A receives the first sql statement, executes it locally to obtain all ID values, encrypts them with the public key, and returns them to the server.
Step S704, the server extracts all the encrypted IDs returned by the client A to obtain an encrypted ID set S1, and sends S1 and a second sql statement to the client B.
The second sql statement is SELECT b.id, b.C from Table23 b where b.id in S1, where the encrypted ID set S1 = {[[1]], [[2]], [[3]], [[4]]} is extracted from the search result sent by the client A.
Step S705, the client B encrypts the data table of the client B, executes a second sql statement to obtain a search result in an encrypted state, and sends the search result back to the server.
The client B encrypts its own data table with the public key, and in the encrypted state, executes the second sql statement, and the obtained search result is shown in table 25:
Table 25 search results from client B executing the second sql statement in the encrypted state
ID C
[[2]] [[c2]]
[[4]] [[c4]]
Step S706, the server extracts the ID value of the search result sent by the client B to obtain an encrypted ID set S2, and sends the S2 and a third sql statement to the client A.
The third sql statement is SELECT a.id, a.A from Table22 a where a.id in S2, where the encrypted ID set S2 = {[[2]], [[4]]} is extracted from the search result sent by the client B.
Step S707, the client A executes the third sql statement to obtain a search result, and returns the search result to the server.
The client a encrypts its own data table with the public key, and executes the third sql statement in the encrypted state, where the obtained search result is shown in table 26:
Table 26 search results from client A executing the third sql statement in the encrypted state
ID A
[[2]] [[a2]]
[[4]] [[a4]]
Step S708, the server combines the search result obtained by the client B executing the second sql statement in the encrypted state and the search result obtained by the client A executing the third sql statement in the encrypted state, performs the internal connection operation at the server to obtain ciphertext data, and decrypts the ciphertext data to obtain target data.
The sql statement for performing the internal connection operation at the server is SELECT a.id, a.A, b.C from Table26 a JOIN Table25 b ON a.id = b.id, resulting in the results shown in Table 27 below.
Table 27 ciphertext data obtained by performing the internal connection operation at the server
ID A C
[[2]] [[a2]] [[c2]]
[[4]] [[a4]] [[c4]]
After decrypting it, the final results are shown in Table 24.
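Steps S701 to S708 can be simulated in a single Python process as follows. This is only a sketch under strong assumptions: `enc` and `dec` are placeholders that write and strip the [[...]] notation instead of performing real public-key encryption and decryption, network transfer is replaced by plain function calls, and the helper `select_encrypted` is a hypothetical stand-in for a client executing an sql statement in the encrypted state. Matching encrypted IDs by equality also presupposes that the same ID always encrypts to the same ciphertext.

```python
def enc(value):
    # Placeholder for public-key encryption (assumption), in the [[...]] notation.
    return f"[[{value}]]"


def dec(cipher):
    # Placeholder for private-key decryption at the server (assumption).
    return cipher[2:-2]


table22 = {1: "a1", 2: "a2", 3: "a3", 4: "a4"}   # client A: ID -> A
table23 = {2: "c2", 4: "c4", 5: "c5", 6: "c6"}   # client B: ID -> C


def select_encrypted(table, id_set=None):
    """Client side: encrypt the local table, optionally keeping only IDs in id_set."""
    rows = {enc(i): enc(v) for i, v in table.items()}
    return rows if id_set is None else {i: v for i, v in rows.items() if i in id_set}


s1 = set(select_encrypted(table22))        # S702-S704: encrypted IDs of client A
table25 = select_encrypted(table23, s1)    # S705: client B filters by S1
s2 = set(table25)                          # S706: encrypted IDs that matched
table26 = select_encrypted(table22, s2)    # S707: client A filters by S2
table27 = [(i, table26[i], table25[i]) for i in table26]          # S708: connection
target = [(int(dec(i)), dec(a), dec(c)) for i, a, c in table27]   # S708: decryption
print(target)  # [(2, 'a2', 'c2'), (4, 'a4', 'c4')]
```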
Federal learning achieves left connection:
Definition of the left connection: when the left connection is executed, all data of the left table is retained, the data of the right table is retained where it matches, and fields that cannot be matched are set to NULL. Taking Table 22 and Table 23 as examples, Table 22 is the left table, Table 23 is the right table, and the sql statement syntax is as follows:
SELECT a.ID, a.A, b.C from Table22 a left JOIN Table23 b ON a.ID = b.ID;
The results of the execution are shown in table 28 below:
Table 28 left connection results of Table22 of client A and Table23 of client B
ID A C
1 a1 NULL
2 a2 c2
3 a3 NULL
4 a4 c4
Fig. 8 is a schematic flow chart of an implementation of a data query method in which federal learning implements the left connection according to an embodiment of the present application. As shown in fig. 8, the data query method includes the following steps:
Step S801, the server sends the public key for encryption to the client A and the client B.
Step S802, the server sends a first sql statement to the client A.
The first sql statement is SELECT a.id, a.A from Table22 a.
Step S803, the client A receives the first sql statement, executes the first sql statement locally to obtain a plaintext query result, encrypts the plaintext query result to obtain a search result, and sends the search result back to the server.
The client A encrypts the plaintext query result of the first sql statement by using the public key, and the obtained search result is shown in Table 29:
Table 29 search results obtained by client A encrypting the results obtained by executing the first sql statement
ID A
[[1]] [[a1]]
[[2]] [[a2]]
[[3]] [[a3]]
[[4]] [[a4]]
Step S804, the server extracts all the encryption IDs in the search result returned by the client A to obtain an encryption ID set S1, and sends the S1 and the second sql statement to the client B.
The second sql statement is SELECT b.id, b.C from Table23 b where b.id in S1, where the encrypted ID set S1 = {[[1]], [[2]], [[3]], [[4]]} is extracted from the search result sent by the client A.
Step S805, the client B encrypts the data table of the client B, executes a second sql statement to obtain a search result in an encrypted state, and sends the search result back to the server.
The client B encrypts its own data table with the public key, and in the encrypted state, executes the second sql statement, and the obtained search result is shown in table 30:
Table 30 search results from client B executing the second sql statement in the encrypted state
ID C
[[2]] [[c2]]
[[4]] [[c4]]
Step S806, the server combines the search result returned by the client A and the search result returned by the client B, performs left connection operation on the server to obtain ciphertext data, and decrypts the ciphertext data to obtain target data.
The sql statement for performing the left connection operation at the server is SELECT a.id, a.A, b.C from Table29 a left JOIN Table30 b ON a.id = b.id, resulting in the results shown in Table 31 below.
Table 31 ciphertext data obtained by performing the left connection operation at the server
ID A C
[[1]] [[a1]] NULL
[[2]] [[a2]] [[c2]]
[[3]] [[a3]] NULL
[[4]] [[a4]] [[c4]]
After decrypting it, the final results are shown in Table 28.
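The left connection flow of steps S801 to S806 differs from the internal connection flow in that the server keeps every row returned by the client A. A minimal Python sketch, with the same caveats as before: `enc` and `dec` are placeholders in the [[...]] notation rather than real cryptography, `encrypted_rows` is a hypothetical stand-in for a client executing an sql statement, and `None` stands for NULL.

```python
def enc(value):
    # Placeholder for public-key encryption (assumption), in the [[...]] notation.
    return f"[[{value}]]"


def dec(cipher):
    # Placeholder for private-key decryption (assumption); NULL stays NULL.
    return None if cipher is None else cipher[2:-2]


table22 = {1: "a1", 2: "a2", 3: "a3", 4: "a4"}   # client A: ID -> A
table23 = {2: "c2", 4: "c4", 5: "c5", 6: "c6"}   # client B: ID -> C


def encrypted_rows(table, id_set=None):
    """Client side: encrypt local rows, optionally restricted to encrypted IDs in id_set."""
    rows = {enc(i): enc(v) for i, v in table.items()}
    return rows if id_set is None else {i: v for i, v in rows.items() if i in id_set}


table29 = encrypted_rows(table22)            # S802-S803: all rows of client A
s1 = set(table29)                            # S804: encrypted ID set S1
table30 = encrypted_rows(table23, s1)        # S805: rows of client B whose ID is in S1
# S806: left connection keeps every row of Table 29; unmatched C becomes None.
table31 = [(i, a, table30.get(i)) for i, a in table29.items()]
target = [(int(dec(i)), dec(a), dec(c)) for i, a, c in table31]
print(target)
```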
Federal learning implements right connection:
Definition of the right connection, also called the right outer connection: when the right connection is executed, all data of the right table is retained, the data of the left table is retained where it matches, and fields that cannot be matched are set to NULL. Taking Table 22 and Table 23 as examples, Table 22 is the left table, Table 23 is the right table, and the sql statement syntax is as follows:
SELECT b.ID, a.A, b.C from Table22 a right JOIN Table23 b ON a.ID = b.ID;
The execution results are shown in the following Table 32:
Table 32 right connection results of Table22 of client A and Table23 of client B
ID A C
2 a2 c2
4 a4 c4
5 NULL c5
6 NULL c6
Fig. 9 is a schematic flow chart of an implementation of a data query method in which federal learning implements the right connection according to an embodiment of the present application. As shown in fig. 9, the data query method includes the following steps:
Step S901, the server sends public keys for encryption to the client a and the client B.
Step S902, the server sends a first sql statement to the client B.
The first sql statement is SELECT b.ID, b.C from Table23 b.
Step S903, the client B receives the first sql statement, executes the first sql statement locally to obtain a plaintext query result, encrypts the plaintext query result to obtain a search result, and sends the search result back to the server.
The client B encrypts the plaintext query result of the first sql statement using the public key, and the obtained search result is shown in table 33:
Table 33 search results obtained by client B encrypting results obtained by executing the first sql statement
ID C
[[2]] [[c2]]
[[4]] [[c4]]
[[5]] [[c5]]
[[6]] [[c6]]
Step S904, the server extracts all the encryption IDs in the search result returned by the client B to obtain an encryption ID set S1, and sends the S1 and the second sql statement to the client A.
The second sql statement is SELECT a.ID, a.A from Table22 a where a.ID in S1, where the encrypted ID set S1 = {[[2]], [[4]], [[5]], [[6]]} is extracted from the search result sent by the client B.
Step S905, the client A encrypts the data table of the client A, executes a second sql statement to obtain a search result in an encrypted state, and sends the search result back to the server.
The client A encrypts its own data table with the public key and executes the second sql statement in the encrypted state. The obtained search result is shown in Table 34:
Table 34 search results from client A executing the second sql statement in the encrypted state
ID A
[[2]] [[a2]]
[[4]] [[a4]]
Step S906, the server combines the search result returned by the client B and the search result returned by the client A, performs right connection operation on the server to obtain ciphertext data, and decrypts the ciphertext data to obtain target data.
The sql statement for performing the right connection operation at the server is SELECT a.id, b.A, a.C from Table33 a left JOIN Table34 b ON a.id = b.id, which places Table 33 from the client B on the left so that all of its rows are retained, resulting in the results shown in Table 35 below.
Table 35 ciphertext data obtained by performing the right connection operation at the server
ID A C
[[2]] [[a2]] [[c2]]
[[4]] [[a4]] [[c4]]
[[5]] NULL [[c5]]
[[6]] NULL [[c6]]
After decrypting it, the final results are shown in table 32.
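The server-side connection of step S906 can be tried out with Python's standard-library sqlite3 module by loading the two ciphertext search results as text columns. This is a sketch under assumptions: the ciphertexts are the literal [[...]] strings of Table 33 and Table 34, the join condition on id is written explicitly, and `dec` is a placeholder for private-key decryption. The right connection is expressed as a left join with Table33 on the left, which retains every row returned by the client B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table33 (id TEXT, C TEXT)")   # search result of client B
conn.execute("CREATE TABLE Table34 (id TEXT, A TEXT)")   # search result of client A
conn.executemany("INSERT INTO Table33 VALUES (?, ?)",
                 [("[[2]]", "[[c2]]"), ("[[4]]", "[[c4]]"),
                  ("[[5]]", "[[c5]]"), ("[[6]]", "[[c6]]")])
conn.executemany("INSERT INTO Table34 VALUES (?, ?)",
                 [("[[2]]", "[[a2]]"), ("[[4]]", "[[a4]]")])

# Keep all rows of Table33; A is NULL (None) where client A had no matching ID.
table35 = conn.execute(
    "SELECT a.id, b.A, a.C FROM Table33 a LEFT JOIN Table34 b "
    "ON a.id = b.id ORDER BY a.id"
).fetchall()


def dec(cipher):
    # Placeholder for private-key decryption (assumption); NULL stays NULL.
    return None if cipher is None else cipher[2:-2]


target = [tuple(dec(c) for c in row) for row in table35]
print(target)
```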
According to the method provided by the embodiment of the application, cross-table connection query of the data can be realized while the data never leaves the local client, so that data privacy is protected, data leakage is prevented, and the availability of the queried target data is not affected.
The following continues with an exemplary structure of the data query device provided by the embodiments of the present application when implemented as software modules. In some embodiments, as shown in fig. 2, a data query device 90 stored in a memory 140 is applied to a coordinator of federal learning, and the software modules in the data query device 90 may include:
A generation module 91 for generating a public key for encryption and a private key for decryption;
A first determining module 92, configured to determine at least two target participants from a plurality of participants of federal learning, where each target participant stores feature data of a portion of the features to be queried;
A second determining module 93, configured to determine a query request corresponding to each target participant;
a first sending module 94, configured to send the public key and the query request corresponding to each target participant to the corresponding target participant;
A first receiving module 95, configured to receive ciphertext query results sent by the target participants, where the ciphertext query results are determined by the target participants based on the public key and the received query requests;
A third determining module 96, configured to determine target data according to the private key and the ciphertext query results sent by the target participants.
In some embodiments, the first determining module 92 is further configured to:
obtaining table attributes of a data table stored by each participant from a plurality of participants of federal learning;
Determining a data table corresponding to each feature to be queried according to the at least two features to be queried and table attributes of the data tables stored by each participant, wherein feature data of the at least two features to be queried are stored in different participants of federal learning;
And determining the participants corresponding to the data tables as target participants.
In some embodiments, the second determining module 93 is further configured to:
Determining query conditions corresponding to each target participant according to each feature to be queried and the data table corresponding to each feature to be queried;
and generating a query request corresponding to each target participant based on the query conditions corresponding to each target participant.
In some embodiments, the second determining module 93 is further configured to:
sequencing the features to be queried to determine a query sequence;
Determining the characteristics to be queried at the present time according to the query sequence;
and determining the query conditions of the target participants corresponding to the query according to the features to be queried and the data table corresponding to the features to be queried.
In some embodiments, the second determining module 93 is further configured to:
sequencing the features to be queried to determine a query sequence;
Determining the characteristics to be queried at the present time according to the query sequence;
when the query is the first query, determining a query condition of a target participant corresponding to the 1 st query according to the 1 st feature to be queried and a data table corresponding to the 1 st feature to be queried;
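when the query is not the first query, extracting the ciphertext identification of the last ciphertext query result from the ciphertext query result obtained by the last query, and determining the query condition of the target participant corresponding to the query based on the ciphertext identification of the last ciphertext query result, the feature to be queried and the data table corresponding to the feature to be queried.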
In some embodiments, the third determining module 96 is further configured to:
connecting ciphertext query results sent by all target participants according to a predefined connection mode to obtain ciphertext data, wherein the connection mode comprises internal connection, left connection, right connection and full connection;
and decrypting the ciphertext data by using the private key to obtain target data.
Based on the foregoing embodiments, the embodiments of the present application further provide a data query device, which is applied to a target participant in federal learning, where the target participant is a participant storing a portion of data to be queried, and a software module in the data query device may include:
The second receiving module is used for receiving the public key and the query request sent by the coordinator;
The analysis module is used for analyzing the query request to obtain query conditions carried by the query request;
the query module is used for searching in the storage space based on the query condition to obtain a plaintext query result;
The encryption module is used for encrypting the plaintext inquiry result by utilizing the public key to obtain a ciphertext inquiry result;
And the second sending module is used for sending the ciphertext query result to the coordinator.
It should be noted that the description of the above data query device embodiments is similar to the description of the above method embodiments, and the device embodiments have the same beneficial effects as the method embodiments. For technical details not disclosed in the data query device embodiments of the present application, those skilled in the art may refer to the description of the method embodiments of the present application.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the computer device executes the data query method according to the embodiment of the present application.
Embodiments of the present application provide a storage medium having stored therein executable instructions which, when executed by a processor, cause the processor to perform a method provided by embodiments of the present application, for example, the method shown in FIG. 3.
In some embodiments, the storage medium may be FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM, or various devices including one or any combination of the above.
In some embodiments, the executable instructions may be in the form of programs, software modules, scripts, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and they may be deployed in any form, including as stand-alone programs or as modules, components, subroutines, or other units suitable for use in a computing environment.
As an example, the executable instructions may, but need not, correspond to files in a file system. They may be stored as part of a file that holds other programs or data, for example in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code).
As an example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices located at one site or distributed across multiple sites and interconnected by a communication network.
The foregoing are merely exemplary embodiments of the present application and are not intended to limit the scope of protection of the present application. Any modification, equivalent replacement, or improvement made within the spirit and scope of the present application shall fall within the scope of protection of the present application.

Claims (9)

1. A data query method, the method being applied to a coordinator of federated learning, the method comprising:
generating a public key for encryption and a private key for decryption;
determining, according to at least two features to be queried, at least two target participants from a plurality of participants of federated learning, and determining a query request corresponding to each target participant, wherein each target participant stores feature data of part of the features to be queried;
sending the public key and the query request corresponding to each target participant to the corresponding target participant;
receiving ciphertext query results sent by all target participants, wherein the ciphertext query results are determined by all target participants based on the public key and the query requests received by all target participants;
determining target data according to the private key and ciphertext query results sent by all target participants;
wherein the determining, according to the at least two features to be queried, at least two target participants from the plurality of participants of federated learning comprises:
obtaining, from the plurality of participants of federated learning, table attributes of the data table stored by each participant, wherein the table attributes comprise the table name and the features of the data table;
determining the data table corresponding to each feature to be queried according to the at least two features to be queried and the table attributes of the data table stored by each participant, wherein the feature data of the at least two features to be queried are stored in different participants of federated learning;
determining the participant corresponding to each data table as a target participant;
the determining a query request corresponding to each target participant comprises:
determining the query condition corresponding to each target participant according to each feature to be queried and the data table corresponding to each feature to be queried;
generating the query request corresponding to each target participant based on the query condition corresponding to each target participant;
the determining the query condition corresponding to each target participant according to each feature to be queried and the data table corresponding to each feature to be queried comprises:
ordering the features to be queried to determine a query order;
determining the feature to be queried in the current query according to the query order;
when the current query is the first query, determining the query condition of the target participant corresponding to the first query according to the first feature to be queried and the data table corresponding to the first feature to be queried;
and, when the current query is not the first query, determining the query condition of the target participant corresponding to the current query based on the ciphertext identifier of the previous ciphertext query result, the feature to be queried in the current query, and the data table corresponding to that feature.
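As an illustration of how target participants may be determined from table attributes, the following minimal sketch maps each feature to be queried to the first data table that lists it. The nested dictionary layout of `table_attributes`, the participant and table names, and the first-match rule are assumptions introduced for the example.

```python
from typing import Dict, List, Tuple

# participant -> {table name -> features of that table} (hypothetical layout)
TableAttributes = Dict[str, Dict[str, List[str]]]


def find_table(feature: str, table_attributes: TableAttributes) -> Tuple[str, str]:
    """Return (participant, table name) of the first table holding the feature."""
    for participant, tables in table_attributes.items():
        for table_name, table_features in tables.items():
            if feature in table_features:
                return participant, table_name
    raise LookupError(f"no participant stores feature {feature!r}")


def determine_targets(features: List[str], table_attributes: TableAttributes):
    """Map each feature to its data table and collect the target participants."""
    feature_to_table = {f: find_table(f, table_attributes) for f in features}
    targets = sorted({participant for participant, _ in feature_to_table.values()})
    return feature_to_table, targets


table_attributes = {
    "bank": {"accounts": ["id", "balance"]},
    "insurer": {"policies": ["id", "premium"]},
    "retailer": {"orders": ["id", "spend"]},
}
feature_to_table, targets = determine_targets(["balance", "premium"], table_attributes)
```

Here the retailer is not selected because none of the features to be queried appears in its table attributes.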
2. The method according to claim 1, wherein the determining the query condition corresponding to each target participant according to each feature to be queried and the data table corresponding to each feature to be queried comprises:
ordering the features to be queried to determine a query order;
determining the feature to be queried in the current query according to the query order;
and determining the query condition of the target participant corresponding to the current query according to the feature to be queried in the current query and the data table corresponding to that feature.
3. The method of claim 1, wherein the determining the target data based on the private key and the ciphertext query results sent by the target participants comprises:
joining the ciphertext query results sent by the target participants according to a predefined join mode to obtain ciphertext data, wherein the join mode comprises inner join, left join, right join and full join;
and decrypting the ciphertext data by using the private key to obtain target data.
4. A data query method, wherein the method is applied to a target participant of federated learning, the target participant is a participant storing part of the data to be queried, the target participant is determined according to at least two features to be queried and table attributes of the data table stored by each participant, and the table attributes comprise the table name and the features of the data table;
the method comprising:
receiving a public key and a query request sent by a coordinator;
parsing the query request to obtain the query condition carried by the query request;
searching the storage space based on the query condition to obtain a plaintext query result;
encrypting the plaintext query result with the public key to obtain a ciphertext query result;
sending the ciphertext query result to the coordinator;
wherein the query request of the target participant is generated by determining the query condition corresponding to each target participant according to each feature to be queried and the data table corresponding to each feature to be queried, and generating the query request based on the query condition corresponding to each target participant;
the determining the query condition corresponding to each target participant according to each feature to be queried and the data table corresponding to each feature to be queried comprises:
ordering the features to be queried to determine a query order;
determining the feature to be queried in the current query according to the query order;
when the current query is the first query, determining the query condition of the target participant corresponding to the first query according to the first feature to be queried and the data table corresponding to the first feature to be queried;
and, when the current query is not the first query, determining the query condition of the target participant corresponding to the current query based on the ciphertext identifier of the previous ciphertext query result, the feature to be queried in the current query, and the data table corresponding to that feature.
5. A data query device, applied to a coordinator of federated learning, the device comprising:
a generation module, configured to generate a public key for encryption and a private key for decryption;
a first determining module, configured to determine at least two target participants from a plurality of participants of federated learning, wherein each target participant stores feature data of part of the features to be queried;
a second determining module, configured to determine a query request corresponding to each target participant;
a first sending module, configured to send the public key and the query request corresponding to each target participant to the corresponding target participant;
a first receiving module, configured to receive the ciphertext query results sent by the target participants, wherein the ciphertext query results are determined by the target participants based on the public key and the query requests received by the target participants;
a third determining module, configured to determine target data according to the private key and the ciphertext query results sent by the target participants;
the first determining module is further configured to:
obtain, from the plurality of participants of federated learning, table attributes of the data table stored by each participant, wherein the table attributes comprise the table name and the features of the data table;
determine the data table corresponding to each feature to be queried according to the at least two features to be queried and the table attributes of the data table stored by each participant, wherein the feature data of the at least two features to be queried are stored in different participants of federated learning;
determine the participant corresponding to each data table as a target participant;
the second determining module is further configured to:
determine the query condition corresponding to each target participant according to each feature to be queried and the data table corresponding to each feature to be queried;
generate the query request corresponding to each target participant based on the query condition corresponding to each target participant;
the second determining module is further configured to:
order the features to be queried to determine a query order;
determine the feature to be queried in the current query according to the query order;
when the current query is the first query, determine the query condition of the target participant corresponding to the first query according to the first feature to be queried and the data table corresponding to the first feature to be queried;
and, when the current query is not the first query, determine the query condition of the target participant corresponding to the current query based on the ciphertext identifier of the previous ciphertext query result, the feature to be queried in the current query, and the data table corresponding to that feature.
6. A data query device, wherein the device is applied to a target participant of federated learning, the target participant is a participant storing part of the data to be queried, the target participant is determined according to at least two features to be queried and table attributes of the data table stored by each participant, and the table attributes comprise the table name and the features of the data table;
the device comprising:
a second receiving module, configured to receive the public key and the query request sent by the coordinator;
a parsing module, configured to parse the query request to obtain the query condition carried by the query request, wherein the query request of the target participant is generated by determining the query condition corresponding to each target participant according to each feature to be queried and the data table corresponding to each feature to be queried, and generating the query request based on the query condition corresponding to each target participant;
the determining the query condition corresponding to each target participant according to each feature to be queried and the data table corresponding to each feature to be queried comprises:
ordering the features to be queried to determine a query order;
determining the feature to be queried in the current query according to the query order;
when the current query is the first query, determining the query condition of the target participant corresponding to the first query according to the first feature to be queried and the data table corresponding to the first feature to be queried;
when the current query is not the first query, extracting the ciphertext identifier of the previous ciphertext query result from the ciphertext query result obtained by the previous query, and determining the query condition of the target participant corresponding to the current query based on the ciphertext identifier of the previous ciphertext query result, the feature to be queried in the current query, and the data table corresponding to that feature;
a query module, configured to search the storage space based on the query condition to obtain a plaintext query result;
an encryption module, configured to encrypt the plaintext query result with the public key to obtain a ciphertext query result;
and a second sending module, configured to send the ciphertext query result to the coordinator.
7. A data querying device, the device comprising:
a memory for storing executable instructions;
a processor for implementing the method of any one of claims 1 to 3 or claim 4 when executing executable instructions stored in said memory.
8. A computer readable storage medium having stored thereon executable instructions for causing a processor to perform the method of any one of claims 1 to 3 or claim 4.
9. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 3 or claim 4.
CN202110507063.6A 2021-05-10 2021-05-10 Data query method, device, equipment, storage medium and program product Active CN113239395B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110507063.6A CN113239395B (en) 2021-05-10 2021-05-10 Data query method, device, equipment, storage medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110507063.6A CN113239395B (en) 2021-05-10 2021-05-10 Data query method, device, equipment, storage medium and program product

Publications (2)

Publication Number Publication Date
CN113239395A CN113239395A (en) 2021-08-10
CN113239395B true CN113239395B (en) 2025-04-01

Family

ID=77132987

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110507063.6A Active CN113239395B (en) 2021-05-10 2021-05-10 Data query method, device, equipment, storage medium and program product

Country Status (1)

Country Link
CN (1) CN113239395B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722754B (en) * 2021-08-25 2024-06-14 上海阵方科技有限公司 Method, device and server for generating privacy executable file
CN114116637A (en) * 2021-11-22 2022-03-01 中国银联股份有限公司 Data sharing method, device, equipment and storage medium
CN114925392B (en) * 2022-05-10 2025-09-02 杭州博盾习言科技有限公司 A decentralized multi-party privacy intersection method, device, equipment and medium
CN115086037B (en) * 2022-06-16 2024-04-05 京东城市(北京)数字科技有限公司 Data processing method and device, storage medium and electronic equipment
CN116346310A (en) * 2023-04-10 2023-06-27 杭州安恒信息技术股份有限公司 Method and device for inquiring trace based on homomorphic encryption and computer equipment
CN117610079B (en) * 2024-01-23 2024-04-09 中汽智联技术有限公司 Data security processing method, device and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111222034A (en) * 2019-12-31 2020-06-02 湖南华菱涟源钢铁有限公司 Data mobile display method and device and cloud server
CN111858826A (en) * 2020-07-30 2020-10-30 深圳前海微众银行股份有限公司 Retrieval method, system, terminal device and storage medium for spatiotemporal trajectory

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629924A (en) * 2012-03-30 2012-08-08 上海交通大学 Private information retrieval method in environment of a plurality of servers
CN104158827B (en) * 2014-09-04 2018-07-31 中电长城网际系统应用有限公司 Ciphertext data sharing method, device, inquiry server and upload data client
CN105471826B (en) * 2014-09-04 2019-08-20 中电长城网际系统应用有限公司 Ciphertext data query method, apparatus and cryptogram search server
US10691754B1 (en) * 2015-07-17 2020-06-23 Hrl Laboratories, Llc STAGS: secure, tunable, and accountable generic search in databases
CN105468986B (en) * 2015-12-02 2018-11-13 深圳大学 A kind of confidential information search method and system
CN109241753A (en) * 2018-08-09 2019-01-18 南京简诺特智能科技有限公司 A kind of data sharing method and system based on block chain
FR3097353B1 (en) * 2019-06-12 2021-07-02 Commissariat Energie Atomique COLLABORATIVE LEARNING METHOD OF AN ARTIFICIAL NEURON NETWORK WITHOUT DISCLOSURE OF LEARNING DATA
WO2020257123A1 (en) * 2019-06-16 2020-12-24 Planaria Corp. Systems and methods for blockchain-based authentication
CN110601814B (en) * 2019-09-24 2021-08-27 深圳前海微众银行股份有限公司 Federal learning data encryption method, device, equipment and readable storage medium
CN111723385B (en) * 2020-06-01 2024-02-09 清华大学 Data information processing method, device, electronic equipment and storage medium
CN112580821A (en) * 2020-12-10 2021-03-30 深圳前海微众银行股份有限公司 Method, device and equipment for federated learning and storage medium

Also Published As

Publication number Publication date
CN113239395A (en) 2021-08-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant