CN112685389B - Data management method, data management device, electronic device, and storage medium - Google Patents

Data management method, data management device, electronic device, and storage medium

Info

Publication number
CN112685389B
Authority
CN
China
Prior art keywords
data
data set
acquisition request
searched
data acquisition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110293618.1A
Other languages
Chinese (zh)
Other versions
CN112685389A (en
Inventor
Inventor name not published
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Real AI Technology Co Ltd
Original Assignee
Beijing Real AI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Real AI Technology Co Ltd
Priority to CN202110293618.1A
Publication of CN112685389A
Application granted
Publication of CN112685389B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to the field of data management technologies, and in particular to a data management method, a data management apparatus, an electronic device, and a storage medium. The data management method comprises: receiving a data acquisition request and searching for data according to the data acquisition request; creating a data set in response to a search result indicating that the data was not found; and generating identification information of the data set based on the data acquisition request, and storing the created data set together with the identification information corresponding to the data set. The data management method, data management apparatus, electronic device, and storage medium address the low transmission efficiency, difficult management, and difficult reuse of data in existing data management processes, so that data can be managed efficiently and called again when needed.

Description

Data management method, data management device, electronic device, and storage medium
Technical Field
The present invention relates to the field of data management technologies, and in particular, to a data management method, a data management apparatus, an electronic device, and a storage medium.
Background
With the development of computer technology, integrated technologies built around artificial intelligence continue to advance. Data is the basis for training and testing computer algorithms and models, and as these algorithms and models are applied in more and more fields, the demand for different types of data in each field grows accordingly.
As the amount of data increases, problems of data management gradually emerge. In the existing data management process, a data provider collects the relevant data and then delivers it to the data demander by uploading it to a network disk or by mailing a hard disk. After the data has been delivered, the data provider may keep it on a hard disk or a server.
However, transmitting data by network disk upload or by mailing a hard disk is inefficient and makes the data hard to manage. Moreover, once delivered, the collected data is stored on a hard disk or server without any organization and is difficult to call again.
Disclosure of Invention
The invention provides a data management method, a data management apparatus, an electronic device, and a storage medium that address the low data transmission efficiency, difficult data management, and difficult reuse of data in the existing data management process. They allow data to be transmitted and managed efficiently and make it easier to call the data again.
According to a first aspect of the invention, a data management method is provided. The data management method comprises: receiving a data acquisition request and searching for data according to the data acquisition request; creating a data set in response to a search result indicating that the data was not found; and generating identification information of the data set based on the data acquisition request, and storing the created data set and the identification information corresponding to the data set.
Optionally, the data management method may further include: generating data task information based on the data acquisition request; sending the data task information, wherein the data task information carries an access interface of the data set; and performing permission verification in response to the access interface being triggered.
Optionally, the step of creating a data set may comprise: receiving data from a data source and verifying the received data; in response to a verification result indicating that the data passes verification, storing the verified data as a data set; and in response to a verification result indicating that the data fails verification, sending a verification-failure prompt to the data source.
Optionally, the step of storing the verified data as a data set may include: generating data information based on the verified data; and updating the identification information of the data set based on the data information.
Optionally, searching for data according to the data acquisition request may include: determining a keyword corresponding to the data acquisition request; and searching, based on the keyword, for a data set whose identification information includes the keyword, and determining the data set found as the data.
Optionally, the data management method may further include: in response to a search result indicating that the data was found, sending an access interface for accessing the found data; and performing permission verification in response to the access interface being triggered.
According to a second aspect of the present invention, there is provided a data management apparatus. The data management apparatus includes: a search unit configured to receive a data acquisition request and search for data according to the data acquisition request; a creating unit configured to create a data set in response to a search result indicating that the data was not found; and an identification storage unit configured to generate identification information of the data set based on the data acquisition request and to store the created data set and the identification information corresponding to the data set.
Optionally, the creating unit may further perform the following operations: receiving data from a data source and verifying the received data; in response to the verification result indicating that the data passes the verification, storing the data passing the verification as a data set; generating data information based on the verified data; updating the identification information of the data set based on the data information.
According to a third aspect of the invention, an electronic device is provided. The electronic device includes: a processor; a memory storing a computer program which, when executed by the processor, implements the data management method as described above in the first aspect.
According to a fourth aspect of the present invention, there is provided a computer-readable storage medium storing a computer program. The computer program, when executed by a processor, implements a data management method as described above in the first aspect.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and should therefore not be regarded as limiting its scope; those skilled in the art can derive other related drawings from them without inventive effort.
Fig. 1 shows a schematic flow diagram of a data management method according to an embodiment of the present invention;
Fig. 2 shows a schematic block diagram of a data management apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is to be noted that the term "comprising" will be used in the embodiments of the invention to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features.
An aspect of the present invention provides a data management method that can improve data transmission efficiency and efficiently manage data to facilitate calling of collected data.
Prior to the present invention, data transmission relied mainly on third-party network disks or on mailing hard disks. Such transmission is not only inefficient but also carries considerable security risks, and data is easily lost in transit. Furthermore, the collected data piles up on a hard disk or a server without organization and is difficult to call again efficiently.
Fig. 1 shows a schematic flow diagram of a data management method according to an embodiment of the present invention. The data management method shown in Fig. 1 may be executed on a database server of a data provider. As shown in Fig. 1, the data management method according to an embodiment of the present invention includes the following steps:
S1, receiving a data acquisition request, and searching for data according to the data acquisition request.
In this step, the data acquisition request may be a data requirement raised by a data demander, and it may carry basic information about the required data, such as the data type, data attributes, and the amount of data required.
The data provider can receive a data acquisition request from the data demander and search data according to the data acquisition request. Here, the data may be stored in a database of the data provider, and the data provider may search the database for data corresponding to the data acquisition request.
In an example, a database can include a data set and identification information corresponding to the data set.
As an example, the step of searching for data according to the data acquisition request may include: determining a keyword corresponding to the data acquisition request according to the data acquisition request; based on the keyword, a data set having identification information including the keyword is searched, and the data set is determined as the searched data. For example, the identification information with the keyword may be detected from the identification information of the database, and the data set corresponding to the detected identification information with the keyword may be determined as the searched data set.
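The keyword lookup described above can be sketched as follows. This is a minimal illustration, assuming that identification information is held as a set of strings per data set; the `DataSet` class, the `search_data_sets` function, and the sample entries are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    """A stored data set plus the identification information used to retrieve it."""
    name: str
    identification: set = field(default_factory=set)


def search_data_sets(database: list, keywords: set) -> list:
    """Return every data set whose identification information contains a keyword."""
    wanted = {k.lower() for k in keywords}
    return [ds for ds in database if wanted & {i.lower() for i in ds.identification}]


# Hypothetical database contents, for illustration only.
database = [
    DataSet("faces_2020", {"human face", "man", "woman"}),
    DataSet("traffic_signs", {"road sign", "image"}),
]
found = search_data_sets(database, {"woman"})
print([ds.name for ds in found])  # ['faces_2020']
```

An empty result list corresponds to the "data not found" outcome that triggers creation of a new data set.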
In particular, the data may be stored in the database in the form of data sets. For example, data used for deep learning in the fields of artificial intelligence and neural networks may be classified and stored as data sets, with the data in each data set sharing at least one common feature. Here, a feature may refer to a criterion for screening data, for example "male", "female", "age greater than 30 years", or "height over 170 cm". Data may be assigned to data sets by feature: for example, data of persons having the "male" feature may be stored in one data set, or data of persons having both the "male" feature and the "height over 170 cm" feature may be stored in one data set.
In an example, corpus association analysis may be performed on the data acquisition request to determine keywords. Specifically, corpus items such as words and phrases may be extracted from the data acquisition request, associated corpus items may then be looked up, and the extracted items together with the associated items may be used as the keywords corresponding to the data acquisition request. As an example, a corpus or a corpus association table may be stored in advance, or a corpus topology may be established in advance, so that the associated items can be found from the items extracted from the data acquisition request. For example, when face data is needed, the keyword "human face" may be extracted from the data acquisition request, related words such as "man" and "woman" may then be found in the corpus, and "man", "woman", and "human face" may together be used as the keywords corresponding to the data acquisition request.
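One simple way to picture the pre-stored association table is a dictionary that maps an extracted term to its associated terms. The sketch below assumes such a table; `ASSOCIATIONS` and `expand_keywords` are hypothetical names, and the single table entry mirrors the face example above.

```python
# Hypothetical association table; a real system might derive it from a corpus topology.
ASSOCIATIONS = {
    "human face": ["man", "woman"],
}


def expand_keywords(extracted: list) -> list:
    """Return the extracted terms followed by their associated terms, without duplicates."""
    keywords = []
    for term in extracted:
        for candidate in [term, *ASSOCIATIONS.get(term, [])]:
            if candidate not in keywords:
                keywords.append(candidate)
    return keywords


print(expand_keywords(["human face"]))  # ['human face', 'man', 'woman']
```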
In another example, the data acquisition request may be parsed and keywords extracted from the parsed request. This may include: splitting the data acquisition request into fields and denoising the split fields; and recombining the denoised fields to obtain keywords.
Specifically, the data acquisition request may first be split into fields according to semantic rules. Since some of the split fields may carry no substantive meaning, such fields may be removed. In addition, since splitting may break one continuous field that expresses a complete meaning into several fields, the split fields may be recombined so that keywords are extracted more accurately.
For example, the data acquisition request may be "3000 groups of internet usage time data for women older than 30 are needed", which may be split into "needed", "3000 groups", "age", "greater than", "30 years", "women", "internet usage time", and "data". Here "needed" carries no substantive meaning and "data" does not help retrieval, so both may be removed by field denoising; "age", "greater than", and "30 years" together express one complete meaning and may therefore be recombined into "age greater than 30 years". Through the above process, the keywords extracted from the data acquisition request are: "3000 groups", "age greater than 30 years", "women", and "internet usage time".
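The denoising and recombination steps can be sketched as below. The semantic splitting itself is assumed to have been done already, so the input is a list of fields; the noise list, the merge rule, and the English field strings are illustrative assumptions paraphrasing the example above, not rules given in the patent.

```python
# Hypothetical noise list: fields with no substantive meaning or no retrieval value.
NOISE_FIELDS = {"needed", "data"}
# Hypothetical recombination rule: a run of adjacent fields mapped to one keyword.
MERGE_RULES = {("age", "greater than", "30 years"): "age greater than 30 years"}


def denoise(fields: list) -> list:
    """Drop fields that appear in the noise list."""
    return [f for f in fields if f not in NOISE_FIELDS]


def recombine(fields: list) -> list:
    """Merge adjacent fields that match a rule; keep all other fields unchanged."""
    result, i = [], 0
    while i < len(fields):
        for pattern, merged in MERGE_RULES.items():
            if tuple(fields[i:i + len(pattern)]) == pattern:
                result.append(merged)
                i += len(pattern)
                break
        else:
            result.append(fields[i])
            i += 1
    return result


split_fields = ["needed", "3000 groups", "age", "greater than", "30 years",
                "women", "internet usage time", "data"]
keywords = recombine(denoise(split_fields))
print(keywords)
# ['3000 groups', 'age greater than 30 years', 'women', 'internet usage time']
```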
S2, creating a data set in response to a search result indicating that the data was not found.
Specifically, as described above, the data provider may look up data in the database according to the data acquisition request of the data demander. In this regard, when data corresponding to the data acquisition request is not found in the database, a data set may be created in the database.
For example, in response to a search result indicating that the data was not found, a blank data set, e.g., a data set storage folder, may be created on the database server. Accordingly, an access interface to the blank data set may be obtained, which may be, for example, an address link on the database server pointing to the blank data set.
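As a rough sketch of this step, a blank data set can be modeled as an empty folder and its access interface as a link to that folder. The function name `create_blank_data_set`, the folder name, and the use of a local `file://` URI in place of a server address link are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def create_blank_data_set(root: Path, name: str) -> str:
    """Create an empty data set folder and return an address link pointing to it."""
    folder = root / name
    folder.mkdir(parents=True, exist_ok=False)  # fail if the data set already exists
    return folder.as_uri()  # a file:// URI stands in for the server address link


root = Path(tempfile.mkdtemp())  # temporary directory standing in for server storage
link = create_blank_data_set(root, "women_over_30_internet_time")
print(link)
```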
As an example, the step of creating a data set may comprise: S21, generating data task information based on the data acquisition request; S22, sending the data task information, wherein the data task information carries an access interface of the data set; and S23, performing permission verification in response to the access interface being triggered.
Specifically, in step S21, the data task information may refer to information that characterizes the data requirement and guides data collection. In one example, the data acquisition request may be used directly as the data task information; in the example above, the data task information may be "3000 groups of internet usage time data for women older than 30 are needed". In another example, the data acquisition request may be processed to generate the data task information. For instance, additional task information may be added to the data acquisition request, such as the deadline for completing data collection, format requirements for the collected data, and restrictions on the data source.
In step S22, the database server may send the data task information to a terminal used by data staff or to a data collection server. An access interface may be added to the data task information, and the data staff or the data collection server may reach the blank data set described above through the access interface in order to store data in the data set.
For example, a data management system or an administrator of the system may create a blank data set according to a data acquisition request and obtain a corresponding access interface. The access interface may be an address link to the data set, in particular the address of the folder in which the data set is stored, or the address of the database server on which the data set folder is stored. The data management system or its administrator may send the access interface and the data acquisition request to the data staff as data task information. The data staff may collect and label data according to the data requirement indicated in the data task information, and may then upload the collected or labeled data from a local machine to the address link in the data task information. For example, the page of the address link may be opened and the "upload" button displayed on the page clicked, which opens the folder of the data set so that the data can be stored in it, or opens the database server so that the corresponding data set folder can be selected and the data stored there.
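The data task information of steps S21 and S22 can be pictured as a small record that bundles the request, the access interface, and any additional task information. The `DataTask` class, the `build_task` helper, the field names `deadline` and `data_format`, and the example link are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataTask:
    """Data task information sent to data staff or a data collection server."""
    request: str       # the data acquisition request (S21)
    access_link: str   # access interface of the blank data set (S22)
    extra: dict = field(default_factory=dict)  # optional additional task information


def build_task(request: str, access_link: str, **extra) -> DataTask:
    """Combine the request, the access interface, and any extra task information."""
    return DataTask(request=request, access_link=access_link, extra=dict(extra))


task = build_task(
    "3000 groups of internet usage time data for women older than 30 are needed",
    "https://db.example.com/datasets/42",  # placeholder address link
    deadline="2021-04-30",                 # hypothetical additional fields
    data_format="csv",
)
print(task.extra)  # {'deadline': '2021-04-30', 'data_format': 'csv'}
```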
Further, in step S23, for security of data management, when the access interface is triggered, authority verification may be performed on the visitor. As an example, the rights may be verified by:
In one case, the identity information of the visitor may be verified. Specifically, when the access interface is triggered, an identity verification window may be displayed that requires the data staff to provide identity information such as a job number and a name. The entered identity information may then be compared with the identities of authorized persons stored in an identity database. When the comparison shows that the entered identity is allowed to access the database, the data set folder corresponding to the access interface may be opened; when it shows that the entered identity is not allowed to access the database, a prompt indicating that the visitor has no access permission is displayed in the identity verification window.
In another case, the network address of the terminal used by the visitor may be verified. In particular, the network address of the terminal may be, for example, an IP address. A trusted network address may be set, for example, the network address of a particular segment may be authorized, and when the visitor accesses the link using the network address belonging to the particular segment, the authentication may be automatically passed without providing any authentication information.
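The two verification cases above can be combined in a short sketch: a request from a trusted network segment passes automatically, and any other request must present an authorized identity. The allowed identity, the segment `10.0.0.0/24`, and the function name `verify_access` are assumptions for illustration; the standard-library `ipaddress` module performs the segment check.

```python
import ipaddress
from typing import Optional

# Hypothetical identity database: (job number, name) pairs allowed to access.
ALLOWED_IDENTITIES = {("1001", "Zhang San")}
# Hypothetical trusted network segment.
TRUSTED_SEGMENT = ipaddress.ip_network("10.0.0.0/24")


def verify_access(ip: str, job_number: Optional[str] = None,
                  name: Optional[str] = None) -> bool:
    """Allow access from the trusted segment, or for an authorized identity."""
    if ipaddress.ip_address(ip) in TRUSTED_SEGMENT:
        return True  # trusted address: no identity information required
    return (job_number, name) in ALLOWED_IDENTITIES


print(verify_access("10.0.0.5"))                            # True
print(verify_access("192.168.1.9", "1001", "Zhang San"))    # True
print(verify_access("192.168.1.9", "9999", "Nobody"))       # False
```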
Further, the step of creating a data set may further comprise: S24, receiving data from a data source and verifying the received data; S25, in response to a verification result indicating that the data passes verification, storing the verified data as a data set; and S26, in response to a verification result indicating that the data fails verification, sending a verification-failure prompt to the data source.
Specifically, in step S24, the data source may be a terminal that provides data to the database server, such as a computer used by data staff, or a data collection server. For example, as described above, data staff may upload collected or labeled data to the database server through the access interface. In the embodiment of the present invention, the data received from the data source may be verified; for example, the data system or its administrator may verify whether the data uploaded by the data staff matches the data task information, or whether it meets the requirements of the data acquisition request.
In step S25, if the data passes verification, it may be saved to the data set folder. In step S26, if the data fails verification, a verification-failure message may be fed back to the data source, and the data staff may modify or re-collect the data according to the feedback. The verification-failure message may carry the reasons for the failure, such as a data type mismatch or an insufficient amount of data.
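Steps S24 to S26 can be sketched as a check that returns the reasons for failure, followed by either storing the data or reporting those reasons. The `Requirement` class, the record layout, and the two checks (data type and minimum amount) are illustrative assumptions based on the failure reasons mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    """Hypothetical requirements taken from the data task information."""
    data_type: str
    min_count: int


def check_upload(records: list, req: Requirement) -> list:
    """S24: return the reasons an upload fails; an empty list means it passes."""
    reasons = []
    if any(r.get("type") != req.data_type for r in records):
        reasons.append("data type mismatch")
    if len(records) < req.min_count:
        reasons.append("data amount below requirement")
    return reasons


def handle_upload(records: list, req: Requirement, data_set: list) -> str:
    """S25/S26: store verified data, or return a verification-failure message."""
    reasons = check_upload(records, req)
    if reasons:
        return "verification failed: " + "; ".join(reasons)
    data_set.extend(records)
    return "stored"


req = Requirement(data_type="image", min_count=2)
data_set = []
print(handle_upload([{"type": "text"}], req, data_set))
# verification failed: data type mismatch; data amount below requirement
print(handle_upload([{"type": "image"}, {"type": "image"}], req, data_set))  # stored
```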
Referring back to fig. 1, in step S3, identification information of the data set is generated based on the data acquisition request, and the created data set and identification information corresponding to the data set are stored.
At this point, the created data set and its identification information may be stored in the database of the data provider. Thus, each time a data acquisition request is received, the database is searched first, which avoids collecting the same data repeatedly.
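Taken together, steps S1 to S3 amount to a search-first, create-on-miss flow. The sketch below assumes an in-memory dictionary as the database and keyword sets as identification information; `handle_request` and the generated data set names are hypothetical.

```python
def handle_request(database: dict, request: str, keywords: frozenset) -> tuple:
    """Return (data set name, created flag) for a data acquisition request."""
    # S1: look for a data set whose identification information shares a keyword.
    for name, ident in database.items():
        if keywords & ident["keywords"]:
            return name, False
    # S2: nothing found, so create a blank data set.
    name = f"data_set_{len(database) + 1}"
    # S3: store it with identification information derived from the request.
    database[name] = {"keywords": set(keywords), "request": request, "records": []}
    return name, True


db = {}
first = handle_request(db, "face data needed", frozenset({"human face"}))
second = handle_request(db, "more face data", frozenset({"human face", "man"}))
print(first, second)  # ('data_set_1', True) ('data_set_1', False)
```

The second request reuses the data set created for the first one, which is the repeated-collection saving described above.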
In this step, the identification information may be information for characterizing the data set, and may include, for example, attributes of the data set, a type of the data set, a data format in the data set, a total amount of data in the data set, and the like.
As an example, identification information may be set for each data set. The identification information is derived from the data acquisition request, for example from the keywords obtained by parsing the data acquisition request as described above. Besides information characterizing the data set, the identification information may also include the data acquisition request itself. The identification information serves as retrieval information for the data set and helps in searching the whole database when data is later reused. Here, reuse of data means that the data is called or used again.
As data sets are created, more and more of them are stored on the database server. When a data demander raises a new requirement, a data acquisition request may be entered through a search engine of the database server, and the identification information of the data sets in the database may be queried according to that request. It can thus be determined quickly whether a data set meeting the requirement exists, and related data sets can be found efficiently.
Further, with respect to step S25 described above, the step of storing the data that passes the verification as a data set may include: generating data information based on the data passing the verification; based on the data information, the identification information of the data set is updated.
Specifically, if the data passes verification, data information may be extracted from it. The data information is information about the collected data and may include, for example, the total amount of data, the data type, and the data format. The data information may also serve as identification information of the data set: for example, it may be added to the identification information so that the data information and the data acquisition request together act as retrieval information, which facilitates retrieval when the data set is later reused.
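A minimal sketch of this update, assuming each verified record carries a `type` and a `format` field and that identification information is a dictionary; the function names and record layout are hypothetical.

```python
def extract_data_info(records: list) -> dict:
    """Summarize verified records: total amount, data types, and data formats."""
    return {
        "total": len(records),
        "types": sorted({r["type"] for r in records}),
        "formats": sorted({r["format"] for r in records}),
    }


def update_identification(identification: dict, data_info: dict) -> dict:
    """Return identification information extended with the data information."""
    return {**identification, "data_info": data_info}


records = [
    {"type": "image", "format": "jpg"},
    {"type": "image", "format": "png"},
]
ident = {"request": "face data needed", "keywords": ["human face"]}
ident = update_identification(ident, extract_data_info(records))
print(ident["data_info"])
# {'total': 2, 'types': ['image'], 'formats': ['jpg', 'png']}
```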
In addition, the data management method of the embodiment of the present invention may further include: S4, in response to a search result indicating that the data was found, sending an access interface for accessing the found data; and performing permission verification in response to the access interface being triggered.
Specifically, when a data set is found, the access interface of the data set may be sent to the data demander, who may view the identification information of the data set through the access interface. For example, the identification information may be recorded on a detail page of the data set and may include the original data acquisition request from when the data set was created and the data information obtained when the data source provided the data. The search result may include several data sets; the data demander may view the identification information of each through the access interface and select a suitable one. As an example, the detail page displaying the identification information may show a "download" button, and clicking it may download the corresponding data set folder to a local machine. The detail page may also show a "share" button, and clicking it may generate an access address link or download link for the data set. The link can be given to the relevant personnel, who can download the data set through it and organize the data locally after downloading.
In addition, the data management method according to the embodiment of the present invention may further include: in response to a search result indicating that the data was found, adjusting the found data and using the adjusted data as the data set corresponding to the data acquisition request.
Specifically, after the data is found, the original requirement of the data set containing the found data may be compared with the data acquisition request. If the two are consistent, the data set is determined to be the data set requested by the data acquisition request; if they are inconsistent, the data set may be adjusted, and the adjusted data set is determined to be the data set requested by the data acquisition request.
Specifically, data collected for a data acquisition request A may later be reused for a data acquisition request B. For example, request A requires data of men and women, whereas request B requires face data. In this case, "man", "woman", "person", or other relevant keywords may be entered in the search input box to look for reusable data; for instance, the data associated with request A may be called and then simply processed and labeled for request B. This improves the efficiency of data acquisition and reduces the cost of repeated collection. The identification information of the data set plays an important role in this process and greatly reduces the difficulty of data set retrieval.
In an example, the found data may be adjusted as follows: the data set containing the found data is labeled according to the received data acquisition request, the identification information of that data set stored in the database is updated with the labeling information, and the labeled data set is used as the data set corresponding to the data acquisition request. In this example, the updated identification information includes both the original data acquisition request of the found data set and the received data acquisition request.
In another example, the found data may be adjusted as follows: the found data set is labeled according to the received data acquisition request, the labeled data set is stored as a new data set with the labeling information as its identification information, and the new data set is used as the data set corresponding to the received data acquisition request. In this example, the identification information of the new data set includes the received data acquisition request.
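The two adjustment strategies differ in whether the found data set is modified or copied. The sketch below assumes a data set is a dictionary holding records, labels, and identification information; the function names and the dictionary layout are hypothetical.

```python
import copy


def adjust_in_place(data_set: dict, request: str, labels: dict) -> dict:
    """First strategy: label the found data set and extend its identification information."""
    data_set["labels"].update(labels)
    data_set["identification"]["requests"].append(request)
    return data_set


def adjust_as_new(data_set: dict, request: str, labels: dict) -> dict:
    """Second strategy: store a labeled copy as a new data set with its own identification."""
    new_set = copy.deepcopy(data_set)
    new_set["labels"].update(labels)
    new_set["identification"] = {"requests": [request]}
    return new_set


found = {"records": ["img_001"], "labels": {},
         "identification": {"requests": ["request A"]}}
new_set = adjust_as_new(found, "request B", {"img_001": "face"})
print(found["identification"]["requests"])  # ['request A']
adjust_in_place(found, "request B", {"img_001": "face"})
print(found["identification"]["requests"])  # ['request A', 'request B']
```

The copy-based strategy leaves the original data set untouched at the cost of extra storage, while the in-place strategy keeps a single data set that is retrievable under both requests.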
Furthermore, for data security considerations, a visitor to the data set in the database server may be authenticated for permission. For example, different permission levels can be set for a user accessing the database server, and when an access interface is triggered by an accessor, the permission level of the accessor can be judged, so that a corresponding resource space is opened according to the permission level for the accessor to access.
As examples, user permissions may include senior administrator permissions, general administrator permissions, and general user permissions.
A senior administrator may upload, access, and manage all data and may share a data access interface (e.g., an address link) with relevant personnel so that they can upload, download, or access the relevant data. Senior administrator permissions may be limited to the relevant supervisors and specific technical staff.
A general administrator may upload and manage the data related to that administrator and may access all data. A general administrator may share the access interface of a data set that he or she created with relevant personnel for uploading, downloading, or accessing the relevant data, but has no permission to share data sets created by others. General administrator permissions may be held only by personnel inside the data management system.
A general user may upload, access, or download the relevant data within a certain time period through a data access interface shared by a senior administrator or a general administrator; once the access period of the access interface has expired, the data can no longer be accessed. A general user may be a person inside or outside the data management system. To ensure data security, external persons may hold only general user permissions.
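The permission levels above can be sketched with an enumeration plus two checks: who may share a data set, and whether a shared access interface is still within its access period. The names `Role`, `can_share`, and `link_is_valid`, and the seven-day period, are assumptions for illustration.

```python
from datetime import datetime, timedelta
from enum import Enum


class Role(Enum):
    SENIOR_ADMIN = "senior administrator"
    GENERAL_ADMIN = "general administrator"
    GENERAL_USER = "general user"


def can_share(role: Role, user: str, creator: str) -> bool:
    """Senior admins share any data set; general admins only the ones they created."""
    if role is Role.SENIOR_ADMIN:
        return True
    return role is Role.GENERAL_ADMIN and user == creator


def link_is_valid(issued_at: datetime, lifetime: timedelta, now: datetime) -> bool:
    """A shared access interface works only within its access period."""
    return issued_at <= now <= issued_at + lifetime


issued = datetime(2021, 3, 1, 9, 0)
print(can_share(Role.GENERAL_ADMIN, "alice", "bob"))                    # False
print(link_is_valid(issued, timedelta(days=7), datetime(2021, 3, 5)))   # True
print(link_is_valid(issued, timedelta(days=7), datetime(2021, 3, 9)))   # False
```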
According to the data management method provided by the embodiment of the invention, offline hard disk storage can be upgraded to online server storage. All collected data can be uploaded to the database server, and related information such as the original data acquisition request and the total amount of data is recorded at upload time so that the data set can be reused. Later, when a similar requirement arises, the data set can be found through the platform at any time and downloaded for use again.
According to the data management method provided by the embodiment of the invention, data transmission efficiency can be improved, the risk of data loss is reduced, existing data can be managed and searched efficiently, and the characteristics of each data set are fully utilized. Where reusable data already exists, the cost of collecting and organizing data is saved.
In addition, compared with the method using a mobile hard disk as a storage space, the data management method according to the embodiment of the invention can store the data set through the database server, so that on one hand, the storage pressure caused by large data volume can be solved, and on the other hand, the problem of data loss caused by unstable and damaged hard disks can be avoided.
Another aspect of the invention relates to a data management apparatus. Fig. 2 shows a schematic block diagram of a data management apparatus according to an exemplary embodiment of the present invention.
As shown in fig. 2, the data management apparatus of the embodiment of the present invention includes a search unit 100, a creation unit 200, and an identification storage unit 300.
The search unit 100 may receive the data acquisition request and search for data according to the data acquisition request.
The creating unit 200 may create the data set in response to a search result indicating that the search unit 100 has not found the data.
The identification storage unit 300 may generate identification information of a data set based on the data acquisition request, and store the created data set and the identification information corresponding to the data set.
The creating unit 200 may also perform the following operations: receiving data from a data source and verifying the received data; in response to a verification result indicating that the data passes the verification, storing the verified data as the data set; generating data information based on the verified data; and updating the identification information of the data set based on the data information.
The search unit 100, the creating unit 200, and the identification storage unit 300 may execute the corresponding steps of the data management method in the method embodiment shown in fig. 1, for example by means of machine-readable instructions executable by these units. For specific implementations of the search unit 100, the creating unit 200, and the identification storage unit 300, reference may be made to the method embodiment described above; they are not described here again.
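The division of work among the three units can be sketched as follows. This is a minimal sketch under stated assumptions, not the apparatus itself: the class and function names are hypothetical, the store is an in-memory list rather than a database server, keyword matching is reduced to word overlap, and verification is assumed to be a caller-supplied predicate applied to the whole batch.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class DataSet:
    identification: dict = field(default_factory=dict)
    records: list = field(default_factory=list)


class SearchUnit:
    """Unit 100: searches stored data sets for a data acquisition request."""

    def __init__(self, store: list):
        self.store = store

    def search(self, request: str) -> Optional[DataSet]:
        words = set(request.lower().split())
        for dataset in self.store:
            # Assumption: a data set matches if any request word is a keyword.
            if words & set(dataset.identification.get("keywords", [])):
                return dataset
        return None


class IdentificationStorageUnit:
    """Unit 300: generates identification info from the request and stores it."""

    def __init__(self, store: list):
        self.store = store

    def register(self, dataset: DataSet, request: str) -> None:
        dataset.identification = {
            "request": request,  # the data acquisition request itself
            "keywords": request.lower().split(),
            "total": 0,          # total amount of data, updated on ingest
        }
        self.store.append(dataset)


class CreatingUnit:
    """Unit 200: creates a data set and fills it with verified data."""

    def create(self) -> DataSet:
        return DataSet()

    def ingest(self, dataset: DataSet, incoming: list,
               is_valid: Callable[[object], bool]) -> bool:
        # Store the batch only if every record passes verification.
        if not all(is_valid(record) for record in incoming):
            return False  # the caller would notify the data source
        dataset.records.extend(incoming)
        dataset.identification["total"] = len(dataset.records)
        return True


def handle_request(request: str, store: list) -> DataSet:
    """Search first; create and register a new data set only on a miss."""
    found = SearchUnit(store).search(request)
    if found is not None:
        return found
    dataset = CreatingUnit().create()
    IdentificationStorageUnit(store).register(dataset, request)
    return dataset
```

A second request that shares a keyword with a stored data set returns the existing data set instead of creating a new one, which is the reuse behavior the description aims for.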
The embodiment of the invention also provides an electronic device, which comprises a processor and a memory. The memory stores a computer program. When the computer program is executed by the processor, the electronic device may perform the corresponding steps of the data management method in the method embodiment shown in fig. 1, for example by means of machine-readable instructions executable by the electronic device. For specific implementations of the electronic device, reference may be made to the method embodiment described above; they are not described here again.
An embodiment of the present invention further provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program performs the steps of the data management method in the method embodiment shown in fig. 1. For specific implementations, reference may be made to the method embodiment; they are not described here again.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment scheme of the invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
According to the data management method, the data management device, the electronic equipment and the storage medium, a data set is created and the identification information corresponding to the data set is stored, so that the data can be efficiently transmitted and managed and can be retrieved again later.
In addition, according to the data management method, the data management device, the electronic equipment and the storage medium provided by the embodiment of the invention, data task information can be generated according to the data acquisition request so as to guide data personnel to carry out data acquisition work.
In addition, according to the data management method, the data management device, the electronic equipment and the storage medium, the data received from the data source can be verified so as to ensure the accuracy of the data.
In addition, according to the data management method, the data management device, the electronic equipment and the storage medium, data information can be acquired from received data, and the data information can be used as identification information, so that more accurate retrieval information is provided for multiplexing of data sets.
In addition, according to the data management method, the data management device, the electronic equipment and the storage medium provided by the embodiment of the invention, the data acquisition request can be analyzed and the corpus correlation analysis can be carried out, and the keyword corresponding to the data acquisition request can be determined, so that the data set matched with the data acquisition request can be searched more accurately.
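The keyword determination just described (extract corpora from the request, look up associated corpora, and use both as keywords) can be sketched as follows. This is an illustrative sketch only: the text does not specify how corpora are extracted or associated, so the stop-word list, the `ASSOCIATED` table, and the function names here are hypothetical placeholders.

```python
# Hypothetical association table; a real system might use a thesaurus
# or a learned similarity model instead.
ASSOCIATED = {
    "cat": {"kitten", "feline"},
    "image": {"photo", "picture"},
}

# Hypothetical stop words removed during extraction.
STOP_WORDS = {"a", "an", "the", "of", "for", "with", "need"}


def determine_keywords(request: str) -> set:
    """Return the extracted terms together with their associated terms."""
    extracted = {token.strip(".,!?").lower() for token in request.split()}
    extracted -= STOP_WORDS
    extracted.discard("")
    keywords = set(extracted)
    for term in extracted:
        keywords |= ASSOCIATED.get(term, set())
    return keywords


def find_data_sets(request: str, catalog: dict) -> list:
    """Return ids of data sets whose identification info shares a keyword.

    `catalog` maps a data set id to the set of terms in its
    identification information.
    """
    keywords = determine_keywords(request)
    return [ds_id for ds_id, terms in catalog.items() if keywords & terms]
```

With this expansion, a request mentioning "cat" can match a data set whose identification information only mentions "kitten", which a search on the extracted terms alone would miss.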
In addition, according to the data management method, the data management device, the electronic equipment and the storage medium, the authority of the visitor of the data can be verified so as to ensure the safety of the data.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention. They are used to illustrate the technical solutions of the present invention, not to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art can, within the technical scope disclosed by the present invention, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes, or substitutions do not depart from the spirit and scope of the embodiments of the present invention and should be construed as falling within its protection scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A data management method, characterized in that the data management method comprises:
S1, receiving a data acquisition request, and searching for data according to the data acquisition request;
in step S1, the step of searching for data according to the data acquisition request includes: determining a keyword corresponding to the data acquisition request; based on the keyword, finding a data set having identification information including the keyword,
wherein the step of determining the keyword corresponding to the data acquisition request comprises: extracting corpora from the data acquisition request, searching for associated corpora related to the extracted corpora, determining the extracted corpora and the searched associated corpora together as keywords corresponding to the data acquisition request,
S2, in response to a search result indicating that the data is not found, creating a data set;
generating identification information of the created data set based on the data acquisition request, and storing the created data set and identification information corresponding to the created data set, wherein the identification information includes information for characterizing the data set, including attributes of the data set, a type of the data set, a data format in the data set, and a total amount of data in the data set, and the data acquisition request itself,
the data management method further comprises:
S3, in response to a search result indicating that the data is found, matching the original requirement of the data set where the found data is located with the data acquisition request; if the original requirement is consistent with the data acquisition request, determining the data set where the found data is located as the data set requested by the data acquisition request; if the original requirement is inconsistent with the data acquisition request, adjusting the data set where the found data is located and determining the adjusted data set as the data set requested by the data acquisition request,
in step S3, the step of adjusting the data set where the found data is located and determining the adjusted data set as the data set requested by the data acquisition request includes:
according to the data acquisition request, performing data annotation on a data set where the searched data is located, updating identification information corresponding to the searched data set stored in a database by using the data annotated information, and taking the data set after the data annotation as the data set corresponding to the data acquisition request, wherein the updated identification information comprises an original data acquisition request corresponding to the searched data set and the data acquisition request; or
according to the data acquisition request, performing data annotation on the data set where the searched data is located, storing the data set after data annotation as a new data set, using the information after data annotation as identification information corresponding to the new data set, and using the new data set as the data set corresponding to the data acquisition request, wherein the identification information of the new data set comprises the data acquisition request.
2. The data management method of claim 1, further comprising:
generating data task information based on the data acquisition request;
sending the data task information, wherein the data task information carries an access interface of the created data set;
and performing permission verification in response to the access interface being triggered.
3. The data management method of claim 1, wherein the step of creating a data set comprises:
receiving data from a data source and verifying the received data;
in response to the verification result indicating that the data passes the verification, storing the data passing the verification as the created data set;
and responding to the verification result that the data is not verified, and sending prompt information of verification failure to the data source.
4. The data management method of claim 3, wherein the step of storing the verified data as the created data set comprises:
generating data information based on the verified data;
based on the data information, the identification information of the created data set is updated.
5. The data management method of claim 1, further comprising:
in response to a search result indicating that the data is found, sending an access interface for accessing the found data;
and performing permission verification in response to the access interface being triggered.
6. A data management apparatus, characterized in that the data management apparatus comprises:
a search unit that receives a data acquisition request and searches for data according to the data acquisition request;
a creating unit that creates a data set in response to a search result indicating that the data is not found;
an identification storage unit that generates identification information of the created data set based on the data acquisition request, and stores the created data set and identification information corresponding to the created data set, wherein the identification information includes information for characterizing the characteristics of the data set including an attribute of the data set, a type of the data set, a data format in the data set, and a total amount of data in the data set, and the data acquisition request itself,
wherein, according to the data acquisition request, the operation of searching data comprises: determining a keyword corresponding to the data acquisition request; based on the keyword, finding a data set having identification information including the keyword,
wherein the operation of determining the keyword corresponding to the data acquisition request comprises: extracting corpora from the data acquisition request, searching for associated corpora related to the extracted corpora, determining the extracted corpora and the searched associated corpora together as keywords corresponding to the data acquisition request,
wherein the creating unit, in response to a search result indicating that the data is found, matches the original requirement of the data set where the found data is located with the data acquisition request; if the original requirement is consistent with the data acquisition request, determines the data set where the found data is located as the data set requested by the data acquisition request; and if the original requirement is inconsistent with the data acquisition request, adjusts the data set where the found data is located and determines the adjusted data set as the data set requested by the data acquisition request,
the operation of adjusting the data set where the found data is located and determining the adjusted data set as the data set requested by the data acquisition request includes:
according to the data acquisition request, performing data annotation on a data set where the searched data is located, updating identification information corresponding to the searched data set stored in a database by using the data annotated information, and taking the data set after the data annotation as the data set corresponding to the data acquisition request, wherein the updated identification information comprises an original data acquisition request corresponding to the searched data set and the data acquisition request; or
according to the data acquisition request, performing data annotation on the data set where the searched data is located, storing the data set after data annotation as a new data set, using the information after data annotation as identification information corresponding to the new data set, and using the new data set as the data set corresponding to the data acquisition request, wherein the identification information of the new data set comprises the data acquisition request.
7. The data management apparatus according to claim 6, wherein the creation unit further performs the following operations:
receiving data from a data source and verifying the received data;
in response to the verification result indicating that the data passes the verification, storing the data passing the verification as the created data set;
generating data information based on the verified data;
based on the data information, the identification information of the created data set is updated.
8. An electronic device, characterized in that the electronic device comprises:
a processor;
a memory storing a computer program which, when executed by the processor, implements the data management method according to any one of claims 1 to 5.
9. A computer-readable storage medium storing a computer program, characterized in that the computer program realizes the data management method according to any one of claims 1 to 5 when executed by a processor.
CN202110293618.1A 2021-03-19 2021-03-19 Data management method, data management device, electronic device, and storage medium Active CN112685389B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110293618.1A CN112685389B (en) 2021-03-19 2021-03-19 Data management method, data management device, electronic device, and storage medium

Publications (2)

Publication Number Publication Date
CN112685389A CN112685389A (en) 2021-04-20
CN112685389B true CN112685389B (en) 2021-06-29

Family

ID=75455708

Country Status (1)

Country Link
CN (1) CN112685389B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant