CN119829730A - Data query method and device, storage medium and electronic equipment - Google Patents
Data query method and device, storage medium and electronic equipment
- Publication number
- CN119829730A (application number CN202411763921.3A)
- Authority
- CN
- China
- Prior art keywords
- vector
- data
- target
- query
- vectors
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the application provide a data query method and device, a storage medium, and electronic equipment. The method comprises: obtaining a query request; in response to the query request, converting the query request into a vector representation to obtain a query vector; searching a target vector library for vectors whose matching degree with the query vector is greater than a preset threshold to obtain a first vector set; performing a sorting operation on the vectors in the first vector set to obtain a target vector set; and searching a target database for target data matching the query request based on the target vector set. This addresses the problem of low data query accuracy in the related art and improves data query accuracy.
Description
Technical Field
The embodiment of the application relates to the field of computers, in particular to a data query method and device, a storage medium and electronic equipment.
Background
In the age of information explosion, knowledge search technology is an important tool for acquiring, understanding, and using information. Traditional knowledge search methods, such as full-text search based on keyword matching, can quickly retrieve documents containing specified keywords, but they have clear limitations in accuracy and efficiency: the returned results are often inaccurate and only weakly relevant to the query.
Disclosure of Invention
The embodiment of the application provides a data query method and device, a storage medium and electronic equipment, which are used for at least solving the problem of low data query accuracy in the related technology.
According to one embodiment of the application, a data query method is provided, comprising: obtaining a query request; in response to the query request, converting the query request into a vector representation to obtain a query vector, wherein the query vector represents semantic information of the query request in N semantic dimensions, N being a natural number greater than or equal to 1; searching a target vector library for vectors whose matching degree with the query vector is greater than a preset threshold to obtain a first vector set, wherein the target vector library includes vector representations, in M semantic dimensions, of the data in a target database, M being a natural number greater than or equal to 1; performing a sorting operation on the vectors in the first vector set to obtain a target vector set; and searching the target database for target data matching the query request based on the target vector set.
In an exemplary embodiment, before searching the target vector library for vectors whose matching degree with the query vector is greater than the preset threshold to obtain the first vector set, the method further comprises: when the target database includes a plurality of data, determining a vector representation of each data in the target database to obtain the target vector library, wherein the vector representation of each data is determined by: extracting keywords and key phrases of the data; converting the keywords and key phrases into vector representations to obtain a plurality of key vectors; and constructing the vector representation of the data based on the plurality of key vectors.
In an exemplary embodiment, constructing the vector representation of the data based on the plurality of key vectors includes: determining the M semantic dimensions according to the data type and data source of the data in the target database; dividing the plurality of key vectors of the data into M groups of sub-vectors according to the M semantic dimensions; and determining the M groups of sub-vectors as the vector representation of the data.
In an exemplary embodiment, searching the target vector library for vectors whose matching degree with the query vector is greater than the preset threshold to obtain the first vector set includes: searching the target vector library for vectors that match the N semantic dimensions to obtain an initial vector set; and determining, from the initial vector set, the vectors whose matching degree with the query vector is greater than the preset threshold as the first vector set.
In an exemplary embodiment, performing the sorting operation on the vectors in the first vector set to obtain the target vector set includes: sorting the vectors in the first vector set according to their matching degree with the query vector to obtain a second vector set, wherein the sorting operation includes selection sort or insertion sort; and performing a filtering operation on the vectors in the second vector set to obtain the target vector set, wherein the filtering condition of the filtering operation includes at least one of: vector dimension, vector size, vector direction, vector attribute, and association between vectors.
In an exemplary embodiment, searching the target database for target data matching the query request based on the target vector set includes: converting the plurality of target vectors in the target vector set into data in a target format to obtain a plurality of first data, wherein the target format is the same as the format of the data to be queried in the query request; searching the target database for data matching the plurality of first data to obtain a data list; and performing the sorting operation on the data list based on the degree of association between the data in the data list to obtain the target data.
In an exemplary embodiment, after converting the plurality of target vectors included in the set of target vectors into the data in the target format to obtain the plurality of first data, the method further includes generating structural data according to an association relationship between the plurality of first data, where the structural data includes keywords of the plurality of first data and associations between the keywords of the plurality of first data, and displaying the structural data through a target client.
According to another embodiment of the application, a data query device is provided, comprising: an acquisition module, configured to obtain a query request, wherein the query request is used to request data from a target database; a conversion module, configured to convert the query request into a vector representation in response to the query request to obtain a query vector, wherein the query vector represents semantic information of the query request in N semantic dimensions, N being a natural number greater than or equal to 1; a first search module, configured to search a target vector library for vectors whose matching degree with the query vector is greater than a preset threshold to obtain a first vector set, wherein the target vector library includes vector representations, in M semantic dimensions, of the data in the target database, M being a natural number greater than or equal to 1; a sorting module, configured to perform a sorting operation on the vectors in the first vector set to obtain a target vector set; and a second search module, configured to search the target database for target data matching the query request based on the target vector set.
According to a further embodiment of the application, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements the steps of any of the method embodiments described above.
According to a further embodiment of the present application, there is also provided a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
According to a further embodiment of the application, there is also provided an electronic device comprising a memory having stored therein a computer program, and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
According to the application, the acquired query request is converted into a vector representation, namely a query vector. Vectors matching the N semantic dimensions of the query vector are then searched from the target vector library, and those whose matching degree with the query vector is higher than a preset threshold are screened out to obtain a first vector set. The first vector set is sorted by matching degree to obtain a second vector set, and the vectors in the second vector set are filtered to determine the target vector set. Finally, the vectors in the target vector set are converted into data in a target format that matches the format of the query request, forming a plurality of first data, and the content corresponding to the first data is searched in the target database to obtain the target data matching the query request. By introducing vector similarity calculation, the problem of low data query accuracy in the related art can be solved and data query accuracy improved.
Drawings
FIG. 1 is a schematic diagram of a hardware environment of a data query method according to an embodiment of the present application;
FIG. 2 is a flow chart of a data query method according to an embodiment of the application;
FIG. 3 is a schematic diagram of a process for constructing a target vector library according to an embodiment of the present application;
FIG. 4 is a flow chart of a data query method according to an embodiment of the application;
FIG. 5 is a block diagram of a data query device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the embodiments of the present application may be executed in a server device or a similar computing device. Taking execution on a server device as an example, FIG. 1 is a schematic diagram of a hardware environment of a data query method according to an embodiment of the present application. As shown in FIG. 1, the server device may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data. The server device may further include a transmission device 106 for communication functions and an input-output device 108. Those of ordinary skill in the art will appreciate that the architecture shown in FIG. 1 is merely illustrative and does not limit the architecture of the server device described above. For example, the server device may include more or fewer components than shown in FIG. 1, or have a different configuration from that shown in FIG. 1.
The memory 104 may be used to store computer programs, for example software programs and modules of application software, such as the computer program corresponding to the data query method in the embodiments of the present application. The processor 102 executes the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the method described above. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the server device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the server device. In one example, the transmission device 106 includes a network interface controller (NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a radio frequency (RF) module, which is configured to communicate with the internet wirelessly.
In this embodiment, a data query method is provided. FIG. 2 is a flowchart of a data query method according to an embodiment of the present application. As shown in FIG. 2, the flow includes the following steps:
Step S202, acquiring a query request, wherein the query request is used for requesting to query data from a target database;
Optionally, the query request in this embodiment refers to a search command that the user sends to the system in natural language or in a specific query language. It expresses the specific information or question that the user wants answered from the target database, and may include, but is not limited to, keywords, phrases, questions, or complex query logic.
Step S204, responding to the query request, converting the query request into vector representation to obtain a query vector, wherein the query vector is used for representing semantic information of the query request from N semantic dimensions, and N is a natural number greater than or equal to 1;
Optionally, the conversion of the query request into a vector representation in the present embodiment involves natural language processing (Natural Language Processing, abbreviated as NLP) techniques, including word embedding, sentence vector, or semantic vector generation.
Optionally, the semantic dimension in the embodiment includes, but is not limited to, technical field, research direction and application scene.
Optionally, the query vector in this embodiment captures the semantic information contained in the query request, including but not limited to the meaning of keywords, the context, and the user intent.
Step S206, searching a target vector library for vectors whose matching degree with the query vector is greater than a preset threshold to obtain a first vector set, wherein the target vector library includes vector representations, in M semantic dimensions, of the data in the target database, and M is a natural number greater than or equal to 1;
Optionally, the matching degree in this embodiment is a similarity measure between the query vector and the vectors in the target vector library, and may be obtained by cosine similarity, Euclidean distance, or another vector similarity calculation.
Optionally, the preset threshold in this embodiment is a set similarity criterion, and the matching is considered successful only if the similarity between the query vector and the vectors in the library is higher than this threshold.
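As a minimal sketch of the matching described for step S206, the Python below computes cosine similarity and Euclidean distance and keeps the library vectors whose cosine similarity with the query exceeds a threshold. The function names, the toy three-dimensional vectors, and the threshold value 0.5 are illustrative assumptions and do not come from the application.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors (smaller means closer)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_above_threshold(query, library, threshold):
    """Return (vector_id, score) pairs whose cosine similarity with the query exceeds the threshold."""
    hits = []
    for vector_id, vector in library.items():
        score = cosine_similarity(query, vector)
        if score > threshold:
            hits.append((vector_id, score))
    return hits


# Hypothetical target vector library with three entries.
library = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.6, 0.8, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
first_vector_set = search_above_threshold([1.0, 0.0, 0.0], library, threshold=0.5)
```

Euclidean distance is a distance rather than a similarity, so an implementation using it would presumably keep vectors whose distance falls below a bound; the application does not spell this out.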
Step S208, executing a sorting operation on vectors included in the first vector set to obtain a target vector set;
Optionally, the sorting operation in this embodiment is a process of ordering the vectors in the first vector set by their matching degree with the query vector, with higher-matching vectors ranked first; it includes, but is not limited to, selection sort and insertion sort.
Step S210, searching target data matched with the query request from the target database based on the target vector set.
Optionally, the target data in this embodiment refers to a document or information item that is found in the target database according to the target vector set and that matches the query request. Obtaining it may involve converting the vector representation back into text content, or directly using the metadata of the vector to locate and retrieve the specific data in the target database.
Through the steps, the obtained query request is converted into vector representation to obtain a query vector, then a vector with the matching degree larger than a threshold value is searched from a target vector library to obtain a first vector set, the first vector set is subjected to sorting operation to obtain a target vector set, and finally target data is obtained based on the target vector set. The semantic information of the query request is captured through vector representation for retrieval, so that the problem of low data query accuracy in the related technology is solved, and the effect of improving the data query accuracy is achieved.
In an exemplary embodiment, before searching the target vector library for vectors whose matching degree with the query vector is greater than the preset threshold to obtain the first vector set, the method further comprises: when the target database includes a plurality of data, determining a vector representation of each data in the target database to obtain the target vector library, wherein the vector representation of each data is determined by: extracting keywords and key phrases of the data; converting the keywords and key phrases into vector representations to obtain a plurality of key vectors; and constructing the vector representation of the data based on the plurality of key vectors.
Optionally, the target database in this embodiment is used to store data including, but not limited to, academic papers, patent documents, technical reports, news articles, social media posts, and the like.
Optionally, the keywords and key phrases in this embodiment are words or phrases of significant importance that are identified and extracted from the text content of each data by natural language processing techniques, including but not limited to the core topics, concepts, or entities that reflect the data. For example, for a research paper on machine learning, keywords may include "algorithm", "dataset", "model", "deep learning", etc., while key phrases may include "neural network optimization", "overfitting problem", etc.
Optionally, in this embodiment, keywords and key phrases may be converted into vector representations by encoding them with a pre-trained deep learning model (e.g., BERT, RoBERTa, etc.), which converts them into low-dimensional dense vectors.
Optionally, the target vector library in this embodiment may be stored in forms including, but not limited to, a tree structure, a graph structure, or a table.
Through the steps, for each data, the keywords and key phrases thereof are extracted, the keywords and key phrases are converted into key vectors, and then vector representations of the data are constructed based on the key vectors. The construction process of the key vector ensures that the vector representation can reflect key information of data, and a target vector library is constructed based on the key vector representation, so that a solid foundation is provided for subsequent high-precision knowledge search, and the accuracy of subsequent retrieval is improved.
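The construction of key vectors can be sketched as follows. This is only an illustration under loud assumptions: `toy_embed` is a deterministic hash-based stand-in for a pre-trained encoder such as BERT and carries no semantic meaning, `extract_key_terms` is a hypothetical vocabulary lookup rather than a real keyword extractor, and the dimensionality of 8 is arbitrary.

```python
import hashlib

DIM = 8  # toy dimensionality (assumption); a real encoder would produce far more dimensions


def toy_embed(text, dim=DIM):
    """Deterministic stand-in for a pre-trained encoder. NOT a semantic embedding."""
    digest = hashlib.sha256(text.lower().encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def extract_key_terms(text, vocabulary):
    """Hypothetical extractor: keep the vocabulary terms that occur in the text."""
    lowered = text.lower()
    return [term for term in vocabulary if term in lowered]


def build_vector_library(database, vocabulary):
    """Map each data id to its key vectors, one vector per extracted keyword or key phrase."""
    library = {}
    for data_id, text in database.items():
        terms = extract_key_terms(text, vocabulary)
        library[data_id] = {term: toy_embed(term) for term in terms}
    return library


database = {"p1": "Deep learning models and the overfitting problem"}
vocabulary = ["deep learning", "overfitting problem", "dataset"]
target_vector_library = build_vector_library(database, vocabulary)
```

In a real system the stand-in encoder would be replaced by the chosen language model, and the library would be persisted in whichever tree, graph, or table structure the deployment uses.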
In an exemplary embodiment, constructing the vector representation of the data based on the plurality of key vectors includes: determining the M semantic dimensions according to the data type and data source of the data in the target database; dividing the plurality of key vectors of the data into M groups of sub-vectors according to the M semantic dimensions; and determining the M groups of sub-vectors as the vector representation of the data.
Optionally, the data type in this embodiment represents a specific form or category of data in the target database, including but not limited to academic papers, patent documents, news articles, book chapters, and the like.
Optionally, the data sources in this embodiment are the different publishers, institutions, or platforms from which the data entries come, including but not limited to academic papers from specific universities, patent literature from authoritative patent offices, reports from well-known news media, and the like.
Optionally, determining M semantic dimensions in the present embodiment includes, but is not limited to, being identified by way of cluster analysis, topic model, and the like.
For example, FIG. 3 is a schematic diagram of a process for constructing a target vector library. As shown in FIG. 3, the process includes the following steps:
Step S302, extracting keywords and key phrases of knowledge information in a target database, wherein the target database comprises a plurality of knowledge information, such as papers and articles, and the keywords and the key phrases can be extracted by natural language processing, keyword extraction and other technologies;
Step S304, converting the key words and key phrases into item vectors (corresponding to the key vectors) through a preset language model, wherein the preset language model can be BERT or RoBERTa;
Step S306, determining semantic dimensions from the data types and data sources of the data in the target database, wherein the semantic dimensions can be determined by using cluster analysis and a topic model;
Step S308, dividing the item vectors into M groups of sub-vectors according to the semantic dimensions;
Step S310, storing the sub-vectors in index nodes to obtain a target vector library stored in the form of a tree structure.
Through the steps, M semantic dimensions are determined, key vectors are divided into sub-vectors according to the dimensions, and then the sub-vectors are determined to be vector representations of data. Through dimension division, multiple semantic features of the data can be identified and processed, and subsequent data query is facilitated.
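One way to picture the division into sub-vector groups is a nearest-centroid assignment, sketched below. The application only says the semantic dimensions may be found by cluster analysis or a topic model; the centroid vectors, the dimension names, and the function `split_into_sub_vector_groups` are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split_into_sub_vector_groups(key_vectors, dimension_centroids):
    """Assign each key vector to the semantic dimension whose centroid is closest."""
    groups = {name: [] for name in dimension_centroids}
    for vector in key_vectors:
        nearest = min(
            dimension_centroids,
            key=lambda name: squared_distance(vector, dimension_centroids[name]),
        )
        groups[nearest].append(vector)
    return groups


# Hypothetical centroids for M = 2 semantic dimensions.
centroids = {
    "technical_field": [1.0, 0.0],
    "application_scenario": [0.0, 1.0],
}
key_vectors = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
representation = split_into_sub_vector_groups(key_vectors, centroids)
```

The resulting mapping has one group per semantic dimension, which corresponds to the M groups of sub-vectors that make up the vector representation of a piece of data.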
In an exemplary embodiment, searching the target vector library for vectors whose matching degree with the query vector is greater than the preset threshold to obtain the first vector set includes: searching the target vector library for vectors that match the N semantic dimensions to obtain an initial vector set; and determining, from the initial vector set, the vectors whose matching degree with the query vector is greater than the preset threshold as the first vector set.
Optionally, the matching degree in this embodiment is a similarity measure between the query vector and the vectors in the target vector library, and may be obtained by cosine similarity, Euclidean distance, or another vector similarity calculation.
Optionally, the preset threshold in this embodiment is a set similarity criterion, and the matching is considered successful only if the similarity between the query vector and the vectors in the library is higher than this threshold.
Through the steps, the vectors matched with the N semantic dimensions are searched to obtain an initial vector set, and then the vectors with the matching degree larger than the threshold value are further screened to be the first vector set, so that irrelevant results can be effectively filtered, and the retrieval efficiency and accuracy are improved.
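The two-stage search can be sketched as below. Two points are assumptions, since the application does not define them: an entry is taken to "match the N semantic dimensions" when it has a sub-vector for every dimension present in the query, and the overall matching degree is taken to be the mean cosine similarity over those dimensions. All names and values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_stage_search(query, library, threshold):
    """query and each library entry map a semantic dimension name to a vector."""
    # Stage 1: keep entries that cover every semantic dimension of the query.
    initial = {key: entry for key, entry in library.items() if set(query) <= set(entry)}
    # Stage 2: keep entries whose mean per-dimension similarity exceeds the threshold.
    first = {}
    for key, entry in initial.items():
        score = sum(cosine(query[d], entry[d]) for d in query) / len(query)
        if score > threshold:
            first[key] = score
    return initial, first


query = {"field": [1.0, 0.0], "scenario": [0.0, 1.0]}
library = {
    "a": {"field": [1.0, 0.0], "scenario": [0.0, 1.0]},
    "b": {"field": [0.0, 1.0], "scenario": [1.0, 0.0]},
    "c": {"field": [1.0, 0.0]},  # lacks the "scenario" dimension
}
initial_vector_set, first_vector_set = two_stage_search(query, library, threshold=0.8)
```

Here entry "c" is dropped in the first stage and entry "b" in the second, so only entry "a" reaches the first vector set.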
In an exemplary embodiment, performing the sorting operation on the vectors in the first vector set to obtain the target vector set includes: sorting the vectors in the first vector set according to their matching degree with the query vector to obtain a second vector set, wherein the sorting operation includes selection sort or insertion sort; and performing a filtering operation on the vectors in the second vector set to obtain the target vector set, wherein the filtering condition of the filtering operation includes at least one of: vector dimension, vector size, vector direction, vector attribute, and association between vectors.
Optionally, the sorting operation in this embodiment refers to a process of rearranging the vectors in the first vector set according to their matching degree with the query vector. Sorting may be performed with different algorithms, including but not limited to selection sort, insertion sort, quicksort, and merge sort.
Optionally, the filtering operation in this embodiment is to further screen a vector set that better meets the user requirement according to a certain filtering condition on the basis of the second vector set.
Optionally, the filtering conditions in this embodiment may be set according to specific application scenarios and user requirements, including but not limited to vector dimensions, vector sizes, vector directions, vector attributes, associations between vectors, and the like.
Optionally, the vector attributes in this embodiment include, but are not limited to, source, type, time, etc. For example, data vectors from unreliable sources or that do not match the user's desired time frame are filtered out.
Through the steps, the sorting operation is executed on the first vector set based on the matching degree, then the filtering operation is executed to obtain the target vector set, the sorting and filtering operation further optimizes the retrieval result, and the fact that the finally output data set has higher matching degree with the query request is ensured.
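The sort-then-filter sequence can be sketched as follows, using insertion sort (one of the algorithms named above) and two of the vector attributes mentioned, time and source. The attribute layout, the year range, and the set of trusted sources are hypothetical values chosen for illustration.

```python
def insertion_sort_by_score(scored):
    """Return a new list of (vector_id, score) pairs in descending score order (stable)."""
    result = []
    for item in scored:
        position = len(result)
        # Walk left past every entry with a strictly lower score.
        while position > 0 and result[position - 1][1] < item[1]:
            position -= 1
        result.insert(position, item)
    return result


def filter_by_attributes(sorted_items, attributes, year_range, trusted_sources):
    """Keep vectors whose year lies in the range and whose source is trusted."""
    first_year, last_year = year_range
    kept = []
    for vector_id, score in sorted_items:
        attr = attributes[vector_id]
        if first_year <= attr["year"] <= last_year and attr["source"] in trusted_sources:
            kept.append((vector_id, score))
    return kept


first_vector_set = [("v1", 0.72), ("v2", 0.91), ("v3", 0.85), ("v4", 0.91)]
attributes = {
    "v1": {"year": 2015, "source": "journal"},
    "v2": {"year": 2008, "source": "journal"},
    "v3": {"year": 2018, "source": "blog"},
    "v4": {"year": 2019, "source": "journal"},
}
second_vector_set = insertion_sort_by_score(first_vector_set)
target_vector_set = filter_by_attributes(
    second_vector_set, attributes, year_range=(2010, 2020), trusted_sources={"journal"}
)
```

Because filtering preserves the sorted order, the target vector set stays ranked by matching degree without a second sort.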
In an exemplary embodiment, searching the target database for target data matching the query request based on the target vector set includes: converting the plurality of target vectors in the target vector set into data in a target format to obtain a plurality of first data, wherein the target format is the same as the format of the data to be queried in the query request; searching the target database for data matching the plurality of first data to obtain a data list; and performing the sorting operation on the data list based on the degree of association between the data in the data list to obtain the target data.
Optionally, the target format in this embodiment refers to the original format of the data to be queried in the query request, including but not limited to text, HTML, PDF, word documents, etc.
Optionally, in this embodiment, converting the vectors in the target vector set into data in the target format refers to initially recovering or reconstructing the original data information or content corresponding to the vectors.
Optionally, in this embodiment, searching the target database for data matching the plurality of first data means finding, in the database, the original-format content that corresponds to the first data.
Optionally, the association degree in this embodiment is calculated by combining keyword matching between vectors with semantic association.
Through the above steps, the target vector set is converted into data in the target format, and the data matching it is then searched from the database to obtain the target data. In this way the original data corresponding to the target vector set can be recovered by working backward from the vectors, and the target data corresponding to the query request can then be obtained.
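A minimal sketch of this lookup step is given below. It assumes each target vector carries a `data_id` metadata field that locates its record in the target database, and it approximates the association degree by counting query keywords in the record text; both are illustrative simplifications, not the method defined by the application.

```python
def resolve_target_data(target_vectors, database, query_keywords):
    """Look up the record behind each target vector, then rank records by keyword hits."""
    # Use vector metadata to locate records; vectors with unknown ids are skipped.
    data_list = [
        database[vector["data_id"]]
        for vector in target_vectors
        if vector["data_id"] in database
    ]

    def keyword_hits(record):
        text = record["text"].lower()
        return sum(1 for keyword in query_keywords if keyword.lower() in text)

    # sorted() is stable, so records with equal hit counts keep their incoming order.
    return sorted(data_list, key=keyword_hits, reverse=True)


database = {
    "d1": {"title": "A", "text": "Deep learning for natural language processing"},
    "d2": {"title": "B", "text": "Deep learning for vision"},
}
target_vectors = [{"data_id": "d2"}, {"data_id": "d1"}, {"data_id": "missing"}]
target_data = resolve_target_data(
    target_vectors, database, ["deep learning", "natural language processing"]
)
```

A fuller implementation would blend this keyword score with a semantic score, as the description of the association degree suggests.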
In an exemplary embodiment, after converting the plurality of target vectors included in the set of target vectors into the data in the target format to obtain the plurality of first data, the method further includes generating structural data according to an association relationship between the plurality of first data, where the structural data includes keywords of the plurality of first data and associations between the keywords of the plurality of first data, and displaying the structural data through a target client.
Optionally, the association relationship in this embodiment refers to a relationship between the plurality of first data in terms of semantics, theme, time, geographic location, and the like. For example, when a plurality of first data all relate to a topic of "artificial intelligence," there is a topic association between them, and when they refer to the same research effort or data source, there is a reference relationship association.
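The structural data could, for instance, take the form of a keyword co-occurrence graph, as sketched below. The node and edge layout and the use of co-occurrence counts as the association are assumptions for illustration; the application does not prescribe a concrete format.

```python
from collections import Counter
from itertools import combinations


def build_keyword_graph(first_data):
    """first_data maps a data id to its list of keywords.

    Returns nodes (all keywords) and weighted edges (keyword pairs that
    appear together in the same data item, with the number of co-occurrences).
    """
    nodes = sorted({keyword for keywords in first_data.values() for keyword in keywords})
    edge_counts = Counter()
    for keywords in first_data.values():
        for a, b in combinations(sorted(set(keywords)), 2):
            edge_counts[(a, b)] += 1
    edges = [
        {"source": a, "target": b, "weight": weight}
        for (a, b), weight in sorted(edge_counts.items())
    ]
    return {"nodes": nodes, "edges": edges}


first_data = {
    "d1": ["artificial intelligence", "deep learning"],
    "d2": ["artificial intelligence", "deep learning", "robotics"],
}
structural_data = build_keyword_graph(first_data)
```

A dictionary of this shape serializes directly to JSON, which is one plausible way for a client to receive and render it.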
The above method is described below through a specific example, taking high-precision knowledge search over an academic paper database as the scenario. FIG. 4 is a schematic flow chart of a data query method. As shown in FIG. 4, the flow includes the following steps:
Step S402, converting the query request input by the user into a vector representation through natural language processing to obtain a query vector, for example, the query request input by the user is "deep learning application in natural language processing in 2010-2020", the query request is converted into a query vector, the query vector reflects semantic information of the query request on N semantic dimensions, and N can be 5 (for example, technical field, research method, application scene, time range, author information);
Step S404, using a preset model to search the target vector library for item vectors whose matching degree with the query vector is greater than a preset threshold to obtain the first vector set, wherein the preset model may be based on an approximate nearest neighbor (ANN) search algorithm or on a BERT model; if a BERT model is used, the query vector may be a 768-dimensional or 1024-dimensional dense vector;
Step S406, calculating the matching degree between the vectors in the first vector set and the query vector, for example by cosine similarity or Euclidean distance, and sorting the vectors in the first vector set according to the matching degree to obtain a second vector set;
Step S408, performing a filtering operation on the second vector set to obtain the target vector set, and re-sorting the vectors in the target vector set by matching degree; for example, vectors whose publication time falls outside 2010-2020 are filtered out, and the remaining vectors are sorted again by their matching degree with the query vector;
Step S410: convert the vectors in the target vector set into the text data corresponding to those vectors, and search the target database for data corresponding to the text data to obtain a data list;
Step S412: sort the entries in the data list again according to their matching degree with the keywords "deep learning" and "natural language processing", to obtain and display the target data.
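Steps S404 through S408 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the in-memory library, the `year` field, the threshold value, and all function names are assumptions made for the example, and a brute-force scan with cosine similarity stands in for an ANN index.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def query_vectors(query_vec, library, threshold, year_range):
    """Illustrative S404-S408: threshold search, sort, filter, re-sort."""
    # S404: keep entries whose matching degree exceeds the preset threshold.
    first_set = []
    for entry in library:
        score = cosine_similarity(query_vec, entry["vector"])
        if score > threshold:
            first_set.append((score, entry))
    # S406: sort by matching degree, highest first.
    second_set = sorted(first_set, key=lambda pair: pair[0], reverse=True)
    # S408: drop entries outside the requested publication years.
    low, high = year_range
    target_set = [(s, e) for s, e in second_set if low <= e["year"] <= high]
    # Filtering preserves order; the explicit re-sort mirrors the step text.
    target_set.sort(key=lambda pair: pair[0], reverse=True)
    return [e["id"] for _, e in target_set]

# Hypothetical toy library: three-dimensional vectors with a publication year.
library = [
    {"id": "paper-a", "vector": [1.0, 0.0, 0.0], "year": 2015},
    {"id": "paper-b", "vector": [0.9, 0.1, 0.0], "year": 2008},
    {"id": "paper-c", "vector": [0.7, 0.7, 0.0], "year": 2019},
    {"id": "paper-d", "vector": [0.0, 0.0, 1.0], "year": 2012},
]

result = query_vectors([1.0, 0.0, 0.0], library, 0.5, (2010, 2020))
```

In this toy data, `paper-b` passes the threshold but is removed by the year filter, and `paper-d` never passes the threshold, so only `paper-a` and `paper-c` remain.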
It should be noted that, from the description of the above embodiments, those skilled in the art will clearly understand that the method according to the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, and may of course also be implemented by hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The software product is stored in a storage medium (e.g., ROM/RAM, magnetic disk, or optical disk) and comprises instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the method according to the embodiments of the present application.
This embodiment also provides a data query device for implementing the foregoing embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may refer to a combination of software and/or hardware that implements a predetermined function. Although the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
FIG. 5 is a block diagram of a data query device according to an embodiment of the present application. As shown in FIG. 5, the device includes:
An obtaining module 502, configured to obtain a query request, where the query request is used to request to query data from a target database;
A conversion module 504, configured to respond to the query request, convert the query request into a vector representation, and obtain a query vector, where the query vector is used to represent semantic information of the query request from N semantic dimensions, where N is a natural number greater than or equal to 1;
A first searching module 506, configured to search a target vector library for a vector having a matching degree with the query vector greater than a preset threshold value, to obtain a first vector set, where the target vector library includes vector representations of data in the target database in M semantic dimensions, where M is a natural number greater than or equal to 1;
A sorting module 508, configured to perform a sorting operation on vectors included in the first vector set to obtain a target vector set;
A second searching module 510, configured to search, based on the set of target vectors, target data matching the query request from the target database.
In an exemplary embodiment, the first searching module 506 is further configured to determine, for each of the plurality of data included in the target database, a vector representation of the data by means of: a first extraction unit configured to extract keywords and key phrases of the data; a first conversion unit configured to convert each of the keywords and key phrases into a vector representation to obtain a plurality of key vectors; and a first construction unit configured to construct the vector representation of the data based on the plurality of key vectors.
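The extract, convert, and construct sequence above can be pictured with the following sketch. Everything in it is an assumption for illustration: the frequency-based extractor, the character-bucket `embed_term` function (a toy stand-in for a real embedding model), and the choice of averaging the key vectors, which the embodiment does not specify. Key phrases would pass through the same conversion as keywords.

```python
import re
from collections import Counter

# Hypothetical stopword list and toy embedding size, chosen for the example.
STOPWORDS = {"a", "an", "the", "of", "in", "for", "and", "to", "is", "on"}
DIM = 8

def extract_keywords(text, top_k=3):
    """Toy extractor: the most frequent non-stopword tokens."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    counts = Counter(tokens)
    # Sort by descending count, then alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]

def embed_term(term):
    """Toy stand-in for an embedding model: counts characters per bucket."""
    vec = [0.0] * DIM
    for ch in term:
        vec[ord(ch) % DIM] += 1.0
    return vec

def build_data_vector(text):
    """Construct one vector for a data item by averaging its key vectors."""
    key_vectors = [embed_term(k) for k in extract_keywords(text)]
    if not key_vectors:
        return [0.0] * DIM
    return [sum(column) / len(key_vectors) for column in zip(*key_vectors)]

doc = "Deep learning for language. Deep learning in vision."
keywords = extract_keywords(doc)
doc_vector = build_data_vector(doc)
```

Averaging is only one way to combine key vectors; concatenation or weighted pooling would fit the same description.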
In an exemplary embodiment, the first searching module 506 further includes: a first determining unit configured to determine the M semantic dimensions from the data type and data source of the data in the target database; a first dividing unit configured to divide the plurality of key vectors of the data into M groups of sub-vectors according to the M semantic dimensions; and a second determining unit configured to determine the M groups of sub-vectors as the vector representation of the data.
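A minimal sketch of dividing key vectors into M groups follows. The rule table in `derive_dimensions`, the dimension names, and the assumption that each key vector already carries a dimension label are all hypothetical; the embodiment does not state how a key vector is assigned to a dimension.

```python
def derive_dimensions(data_type, data_source):
    """Hypothetical rule table mapping (data type, data source) to dimensions."""
    rules = {
        ("paper", "academic_db"): ["technical_field", "research_method", "time_range"],
    }
    return rules.get((data_type, data_source), ["general"])

def split_by_dimension(labeled_key_vectors, dimensions):
    """Group (dimension_label, vector) pairs into one sub-vector group per dimension."""
    groups = {dim: [] for dim in dimensions}
    for label, vec in labeled_key_vectors:
        # Key vectors with a label outside the M dimensions are ignored here.
        if label in groups:
            groups[label].append(vec)
    return groups

dims = derive_dimensions("paper", "academic_db")
groups = split_by_dimension(
    [
        ("technical_field", [1.0, 0.0]),
        ("time_range", [0.0, 1.0]),
        ("technical_field", [0.5, 0.5]),
        ("unknown", [9.0, 9.0]),
    ],
    dims,
)
```

Here M is 3, and the resulting dictionary of three groups plays the role of the vector representation of the data item.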
In an exemplary embodiment, the first searching module 506 further includes: a first searching unit configured to search the target vector library for vectors matching the N semantic dimensions to obtain an initial vector set; and a third determining unit configured to determine, as the first vector set, the set of vectors in the initial vector set whose matching degree with the query vector is greater than the preset threshold.
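The two-stage search can be sketched as below. The data layout (one sub-vector per dimension name), the interpretation of "matching the N semantic dimensions" as covering every dimension present in the query, and the use of mean per-dimension cosine similarity as the matching degree are assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def two_stage_search(query, library, threshold):
    """query: dict dimension -> vector; library: list of {"id", "dims"} entries."""
    if not query:
        return []
    # Stage 1: keep entries that cover every semantic dimension of the query.
    initial_set = [e for e in library if all(d in e["dims"] for d in query)]
    # Stage 2: keep entries whose mean per-dimension similarity exceeds the threshold.
    first_set = []
    for entry in initial_set:
        score = sum(cosine(query[d], entry["dims"][d]) for d in query) / len(query)
        if score > threshold:
            first_set.append((entry["id"], score))
    return first_set

query = {"field": [1.0, 0.0], "time": [0.0, 1.0]}
library = [
    {"id": "e1", "dims": {"field": [1.0, 0.0], "time": [0.0, 1.0]}},
    {"id": "e2", "dims": {"field": [1.0, 0.0]}},
    {"id": "e3", "dims": {"field": [0.0, 1.0], "time": [1.0, 0.0]}},
]
first_set = two_stage_search(query, library, 0.5)
```

Entry `e2` is dropped in the first stage because it lacks the `time` dimension, and `e3` is dropped in the second stage because its similarity is below the threshold.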
In an exemplary embodiment, the first searching module 506 further includes: a first sorting unit configured to perform a sorting operation on the vectors included in the first vector set according to the matching degree between those vectors and the query vector to obtain a second vector set, where the sorting operation includes a selection sorting operation or an interpolation sorting operation; and a first filtering unit configured to perform a filtering operation on the vectors in the second vector set to obtain the target vector set, where the filtering condition of the filtering operation includes at least one of a vector dimension, a vector size, a vector direction, a vector attribute, and an association between vectors.
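As an illustration of the sort-then-filter sequence, the sketch below uses a selection sort on precomputed matching degrees and a filter on a vector attribute. The `(vector_id, score)` pairs, the attribute table, and the year predicate are hypothetical; a production system would normally call a built-in sort instead.

```python
def selection_sort_by_score(scored):
    """Sort (vector_id, score) pairs by descending score using selection sort."""
    items = list(scored)
    for i in range(len(items)):
        best = i
        # Find the highest remaining score and move it to position i.
        for j in range(i + 1, len(items)):
            if items[j][1] > items[best][1]:
                best = j
        items[i], items[best] = items[best], items[i]
    return items

def filter_by_attribute(scored, attributes, predicate):
    """Keep pairs whose vector attribute satisfies the predicate, preserving order."""
    return [(vid, score) for vid, score in scored if predicate(attributes[vid])]

first_set = [("v1", 0.6), ("v2", 0.9), ("v3", 0.75)]
years = {"v1": 2015, "v2": 2005, "v3": 2018}

second_set = selection_sort_by_score(first_set)
target_set = filter_by_attribute(second_set, years, lambda y: 2010 <= y <= 2020)
```

Because the filter preserves order, the target vector set stays sorted by matching degree.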
In an exemplary embodiment, the second searching module 510 further includes: a second conversion unit configured to convert the plurality of target vectors included in the target vector set into data in a target format to obtain a plurality of first data, where the target format is the same as the format of the data to be queried in the query request; and a second searching unit configured to search the target database for data matching the plurality of first data to obtain a data list, and to perform the sorting operation on the data list based on the degree of association between the data included in the data list to obtain the target data.
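The conversion, lookup, and association-based sort can be sketched as follows. The identifier-to-title mapping, the dictionary used as the target database, and the definition of association degree as the number of keywords a record shares with the other records in the list are assumptions made only for this example.

```python
def lookup_and_rank(target_vector_ids, id_to_title, database):
    """Convert target vectors to titles, look them up, and sort by association."""
    # Conversion: map each target vector to text in the target format (a title).
    first_data = [id_to_title[v] for v in target_vector_ids if v in id_to_title]
    # Lookup: fetch the matching records from the target database.
    data_list = [database[t] for t in first_data if t in database]

    def association(record):
        # Count the keywords this record shares with any other record in the list.
        others = set()
        for other in data_list:
            if other is not record:
                others.update(other["keywords"])
        return len(set(record["keywords"]) & others)

    # sorted() is stable, so ties keep their original relative order.
    return sorted(data_list, key=association, reverse=True)

id_to_title = {"v1": "T1", "v2": "T2", "v3": "T3"}
database = {
    "T1": {"title": "T1", "keywords": ["deep learning", "nlp"]},
    "T2": {"title": "T2", "keywords": ["nlp", "transformers"]},
    "T3": {"title": "T3", "keywords": ["databases"]},
}
ranked = lookup_and_rank(["v3", "v1", "v2"], id_to_title, database)
ranked_titles = [r["title"] for r in ranked]
```

Records `T1` and `T2` share the keyword `nlp` and move ahead of `T3`, which shares nothing with the others.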
In an exemplary embodiment, the second searching module 510 further includes: a first generating unit configured to generate structured data according to the association relationships among the plurality of first data, where the structured data includes keywords of the plurality of first data and the associations among those keywords; and a first display unit configured to display the structured data through a target client.
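The keyword-and-association structure described above can be pictured as a small co-occurrence graph. In the sketch below, the record layout and the choice to count how often two keywords appear in the same record are assumptions for illustration; the embodiment does not define how associations are measured.

```python
from itertools import combinations

def build_structure(first_data):
    """Build keyword nodes and weighted co-occurrence edges from the records."""
    nodes = sorted({k for record in first_data for k in record["keywords"]})
    edges = {}
    for record in first_data:
        # Each unordered keyword pair within a record counts as one association.
        for a, b in combinations(sorted(set(record["keywords"])), 2):
            edges[(a, b)] = edges.get((a, b), 0) + 1
    return {"keywords": nodes, "associations": edges}

first_data = [
    {"title": "A", "keywords": ["deep learning", "nlp"]},
    {"title": "B", "keywords": ["nlp", "deep learning", "transformers"]},
]
structure = build_structure(first_data)
```

A client could render `structure["keywords"]` as graph nodes and `structure["associations"]` as weighted edges between them.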
It should be noted that each of the above modules may be implemented by software or hardware. In the latter case, this may be achieved, without limitation, by locating all of the above modules in the same processor, or by locating the modules in different processors in any combination.
Embodiments of the present application also provide a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing a computer program.
An embodiment of the application also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
In an exemplary embodiment, the electronic device may further include a transmission device connected to the processor, and an input/output device connected to the processor.
Embodiments of the application also provide a computer program product comprising a computer program which, when executed by a processor, implements the steps of any of the method embodiments described above.
Embodiments of the present application also provide another computer program product comprising a non-volatile computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the method embodiments described above.
Embodiments of the present application also provide a computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the steps of any of the method embodiments described above.
For specific examples of this embodiment, reference may be made to the examples described in the foregoing embodiments and exemplary implementations; they are not repeated here.
It will be appreciated by those skilled in the art that the modules or steps of the present application described above may be implemented on general-purpose computing devices, and may be concentrated on a single computing device or distributed across a network of multiple computing devices. They may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that described herein; alternatively, the modules or steps may each be fabricated as a separate integrated circuit module, or several of them may be fabricated as a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.
The above description covers only the preferred embodiments of the present application and is not intended to limit the present application; those skilled in the art may make various modifications and variations to the present application. Any modification, equivalent replacement, or improvement made within the principles of the present application shall fall within its scope of protection.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411763921.3A CN119829730A (en) | 2024-12-03 | 2024-12-03 | Data query method and device, storage medium and electronic equipment |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN119829730A (en) | 2025-04-15 |
Family
ID=95307114
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411763921.3A (Pending) CN119829730A (en) | 2024-12-03 | 2024-12-03 | Data query method and device, storage medium and electronic equipment |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN119829730A (en) |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||