CN120578746A - Industry question-answer enhancement method and system based on multi-dimensional label automatic extraction - Google Patents

Industry question-answer enhancement method and system based on multi-dimensional label automatic extraction

Info

Publication number
CN120578746A
Authority
CN
China
Prior art keywords
answer
question
label
hierarchical
group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202511075834.3A
Other languages
Chinese (zh)
Other versions
CN120578746B (en
Inventor
于海军
严晨
陈家树
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Anshuo Enterprise Credit Reporting Service Co ltd
Original Assignee
Shanghai Anshuo Enterprise Credit Reporting Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Anshuo Enterprise Credit Reporting Service Co ltd filed Critical Shanghai Anshuo Enterprise Credit Reporting Service Co ltd
Priority to CN202511075834.3A priority Critical patent/CN120578746B/en
Priority claimed from CN202511075834.3A external-priority patent/CN120578746B/en
Publication of CN120578746A publication Critical patent/CN120578746A/en
Application granted granted Critical
Publication of CN120578746B publication Critical patent/CN120578746B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to the technical field of natural language processing and provides an industry question-answer enhancement method and system based on automatic extraction of multi-dimensional labels. The method comprises: labeling a user question and performing multi-scale expansion of the demand labels to obtain a hierarchical demand label group; structuring the label group into a question-answer demand label map according to the cross-level association relations of its labels, the map comprising K layers of expanded demand labels; performing hierarchical label extraction on a question-answer information base along the K layers of labels in the map, based on a preset retrieval convergence vector, to output a minimum matching domain answer set; extracting a K-1 layer backtracking answer set along the convergence vector and outputting an association expansion answer set through association expansion aggregation; and assembling the two answer sets into an enhanced industry question-answer response. The application addresses the low answer matching precision and narrow coverage that result when a traditional industry question-answering system understands user demands at only a single level and associates labels insufficiently. Through hierarchical expansion and association expansion of multi-dimensional labels, it improves question-answering accuracy, richness and response speed.

Description

Industry question-answer enhancement method and system based on multi-dimensional label automatic extraction
Technical Field
The application relates to the technical field of natural language processing, in particular to an industry question-answer enhancement method and system based on automatic extraction of multidimensional labels.
Background
In industry applications where digital and intelligent technologies are deeply integrated, traditional industry question-answering systems based on keyword matching or shallow semantic analysis are increasingly exposing technical bottlenecks. As the knowledge graphs of vertical fields grow exponentially, user queries show multi-dimensional association characteristics: they include both explicit fact retrieval and implicit cross-level knowledge inference. Prior art systems generally have three defects. First, a flat label system can hardly describe the hierarchical structure of industry knowledge, so demand is understood along a single dimension. Second, a static label association mechanism cannot adapt to the changing context of dynamic query scenarios, which creates a conflict between answer matching precision and recall. Third, a single retrieval path lacks the ability to expand and aggregate associated knowledge and can hardly meet the knowledge service needs of complex decision scenarios. A scheme that dynamically parses multi-scale demands and extracts answers hierarchically is therefore needed to overcome the bottlenecks of traditional question-answering systems in deep understanding of industry knowledge, cross-domain association reasoning and dynamic knowledge fusion.
Disclosure of Invention
The application provides an industry question-answering enhancement method and system based on multi-dimensional label automatic extraction, and aims to solve the technical problems of low answer matching precision and narrow coverage range caused by single understanding level of a traditional industry question-answering system on user requirements and insufficient label relevance.
In a first aspect, the application provides an industry question-answer enhancement method based on multi-dimensional label automatic extraction, the method comprising: after a user question is labeled, performing multi-scale expansion of the demand labels to obtain a hierarchical demand label group; structuring the hierarchical demand label group into a question-answer demand label map according to the cross-level association relations of its labels, the question-answer demand label map comprising K layers of expanded demand labels; performing, based on a preset retrieval convergence vector, industry question-answer hierarchical label extraction on a user question-answer information base along the K layers of expanded demand labels of the map, and outputting a minimum matching domain answer set; extracting a K-1 layer backtracking answer set along the retrieval convergence vector, and outputting an association expansion answer set through association expansion aggregation; and assembling the minimum matching domain answer set and the association expansion answer set into an enhanced industry question-answer response for output.
In another aspect, the application provides an industry question-answer enhancement system based on multi-dimensional label automatic extraction, comprising a multi-scale expansion module, a label structuring module, a label extraction module, an expansion aggregation module and a response output module. The multi-scale expansion module is used for performing multi-scale expansion of the demand labels to obtain a hierarchical demand label group after a user question is labeled. The label structuring module is used for structuring the hierarchical demand label group into a question-answer demand label map according to the cross-level association relations of its labels, the question-answer demand label map comprising K layers of expanded demand labels. The label extraction module is used for performing, based on a preset retrieval convergence vector, industry question-answer hierarchical label extraction on a user question-answer information base along the K layers of expanded demand labels of the map, and outputting a minimum matching domain answer set. The expansion aggregation module is used for extracting a K-1 layer backtracking answer set along the retrieval convergence vector and outputting an association expansion answer set through association expansion aggregation. The response output module is used for assembling the minimum matching domain answer set and the association expansion answer set into an enhanced industry question-answer response for output.
One or more technical schemes provided by the application have at least the following technical effects or advantages:
According to the industry question-answer enhancement method based on multi-dimensional label automatic extraction, a user question is first decomposed into basic labels, and the labels are expanded over multiple rounds along different dimensions to form a multi-dimensional label system. The labels are then organized, according to the hierarchical association rules among them, into a tree-like map with K layers, each layer representing a knowledge dimension of a different granularity. In the retrieval stage, the search descends through the map along a preset retrieval convergence vector, precisely locates the most relevant core label layer, and screens a directly matching answer set out of the knowledge base. While backtracking along the path, the related answers of the upper layer are extracted automatically and further peripheral information is added through the knowledge association network. Finally, the precise answers and the associated knowledge are combined into an enhanced answer containing both a core conclusion and background information, which improves the breadth and depth of knowledge coverage.
The foregoing is only an overview of the technical solution of the application. Specific embodiments are set forth below so that the technical means of the application can be understood more clearly and implemented according to the specification, and so that the above and other objects, features and advantages of the application become more apparent.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow diagram of an industry question-answer enhancement method based on multi-dimensional label automatic extraction in one embodiment.
FIG. 2 is an architecture diagram of an industry question-answer enhancement system based on multi-dimensional label automatic extraction in one embodiment.
Reference numerals: multi-scale expansion module 11, label structuring module 12, label extraction module 13, expansion aggregation module 14, response output module 15.
Detailed Description
The embodiment of the application solves the technical problems of low answer matching precision and narrow coverage range caused by single understanding level of the traditional industry question-answering system on the user demand and insufficient label relevance by providing the industry question-answering enhancement method and system based on multi-dimensional label automatic extraction.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be noted that the terms "comprises" and "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus, but may include other steps or modules not expressly listed or inherent to such process, method, article, or apparatus.
In a first embodiment, as shown in fig. 1, the present application provides an industry question-answer enhancement method based on automatic extraction of multidimensional labels, the method comprising:
After the user question is labeled, a hierarchical demand label group is obtained through multi-scale expansion of the demand labels.
In the embodiment of the application, when a user question is processed, the user input is first standardized (for example, by removing redundant information and extracting industry keywords) to generate a basic label group. The basic labels are then expanded layer by layer through preset multi-level expansion containers (such as a semantic expansion container, a hierarchy expansion container and a scene expansion container). Each expansion level is given a different weight according to the label source (for example, core words receive the highest weight and cross-domain associated words receive progressively lower weights), forming a hierarchical label group comprising a plurality of subsets. This hierarchical structure preserves the core intent of the original question, covers potential associated demands through multi-scale expansion, and provides a structural basis for the subsequent construction of the label map.
Further, the application provides that, after the user question is labeled, the hierarchical demand label group is obtained through multi-scale expansion of the demand labels, the method comprising:
performing domain-adaptive text processing on the user question to generate a standardized query text; dynamically extracting the core semantic units of the standardized query text through a preset industry keyword library to generate a question-answer demand label group; and performing multi-scale expansion on the question-answer demand label group to obtain the hierarchical demand label group.
Preferably, basic language cleaning is first performed on the question text input by the user to remove redundant symbols, punctuation, stop words and the like. A dictionary or term library specific to the industry or domain of the question is then matched and used to convert the common words and expressions in the user question into standard terms of that domain, forming the standardized query text. For example, in the medical domain "pain" can be mapped to "pain sense" or "pain score", and in the financial domain "investment risk" can be mapped to "market fluctuation risk". Next, key semantic units are dynamically extracted from the standardized query text based on the preset industry keyword library. Key semantic units are the parts of the question that carry core information, such as proper nouns, important concepts or industry-related technical terms. This extraction captures the most significant parts of the user question and organizes them into a question-answer demand label group; the labels reflect the main focus of the user question and give the question-answering system a clear direction for subsequent processing. The generated question-answer demand label group is then expanded at multiple scales using the semantic expansion container, hierarchy expansion container, scene expansion container and cross-domain expansion container in the hierarchical expansion container group, producing a hierarchical demand label group that can be expanded progressively from broad industry-level labels to more specific sub-domain, technology or product labels.
The resulting hierarchical label group covers the different dimensions of the user question comprehensively, so that the question-answering system can match the user's demands more accurately in its response.
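The standardization and label extraction steps described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the term map, stop-word list and keyword library are hypothetical stand-ins for the industry resources the text mentions, and term mapping is applied before stop-word removal (a simplification) so that multi-word expressions stay contiguous.

```python
import re

# Hypothetical domain resources; the application does not list their contents.
TERM_MAP = {"pain": "pain score", "investment risk": "market fluctuation risk"}
STOP_WORDS = {"the", "a", "an", "is", "of", "what", "how"}
KEYWORD_LIBRARY = ["market fluctuation risk", "pain score", "bond fund"]

# One alternation, longest expression first, so each span is mapped only once.
_TERM_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(TERM_MAP, key=len, reverse=True))
    + r")\b"
)


def normalize(question: str) -> str:
    """Clean the question and map colloquial wording to standard domain terms."""
    # Drop punctuation and symbols, lowercase, and collapse whitespace.
    cleaned = " ".join(re.sub(r"[^\w\s]", " ", question.lower()).split())
    # Replace colloquial expressions with the domain's standard terms.
    mapped = _TERM_PATTERN.sub(lambda m: TERM_MAP[m.group(1)], cleaned)
    # Remove stop words from what remains.
    return " ".join(tok for tok in mapped.split() if tok not in STOP_WORDS)


def extract_labels(standardized: str) -> list[str]:
    """Return the keyword-library entries that occur in the standardized text."""
    return [
        kw for kw in KEYWORD_LIBRARY
        if re.search(rf"\b{re.escape(kw)}\b", standardized)
    ]


query = normalize("What is the investment risk of a bond fund?")
labels = extract_labels(query)
```

Here `labels` plays the role of the question-answer demand label group that is passed on to the expansion containers.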
Further, the application provides that multi-scale expansion is performed on the question-answer demand label group to obtain the hierarchical demand label group, the method comprising:
pre-constructing a hierarchical expansion container group comprising a semantic expansion container, a hierarchy expansion container, a scene expansion container and a cross-domain expansion container; loading the N question-answer demand labels of the question-answer demand label group into the hierarchical expansion container group and performing multi-scale label expansion through the containers; assigning differentiated weights according to the container source of each expansion result; and outputting N hierarchical demand label subsets, the N hierarchical demand label subsets forming the hierarchical demand label group.
Optionally, to expand the question-answer demand label group at multiple scales, a hierarchical expansion container group is constructed in advance. The group comprises a semantic expansion container, a hierarchy expansion container, a scene expansion container and a cross-domain expansion container: the semantic expansion container expands synonyms and related words, the hierarchy expansion container differentiates upper-level and lower-level labels, the scene expansion container associates horizontal scenes, and the cross-domain expansion container extends labels from one domain to other related domains. The N question-answer demand labels of the question-answer demand label group are then loaded into the hierarchical expansion container group and pass through the containers in sequence. For example, the semantic expansion container first generates synonymous or near-synonymous labels for each question-answer demand label; the hierarchy expansion container differentiates the semantic expansion results into upper and lower levels; the scene expansion container associates actual application scene labels with the differentiated results; and the cross-domain expansion container binds cross-domain knowledge labels to the scene association results.
After expansion, the expanded labels are assigned differentiated weights according to their source containers. Labels produced by different containers receive different weight values that represent their importance or applicability. For example, semantically expanded labels may receive a higher weight because they directly affect semantic accuracy, whereas cross-domain expanded labels may receive a lower weight depending on the actual situation. Finally, the expansion results of each question-answer demand label are gathered to form N hierarchical demand label subsets, which together form the hierarchical demand label group. These subsets supply more precise, multi-dimensional label information for subsequent question-answer matching and thereby help ensure the accuracy of the core answers.
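The weight differentiation by container source can be illustrated with a short sketch. The weight values below are assumptions chosen only to reflect the ordering suggested in the text (semantic highest, cross-domain lowest); the application does not specify numbers, and keeping the highest weight for a label produced by several containers is likewise an assumed policy.

```python
from collections import defaultdict

# Assumed weights per source container; only the ordering follows the text.
CONTAINER_WEIGHTS = {
    "semantic": 1.0,
    "hierarchy": 0.8,
    "scene": 0.6,
    "cross_domain": 0.4,
}


def build_subsets(expansions):
    """Group expanded labels by root label and attach a source-based weight.

    `expansions` is an iterable of (root_label, expanded_label, container).
    If several containers produce the same label, the highest weight is kept.
    """
    subsets = defaultdict(dict)
    for root, label, container in expansions:
        weight = CONTAINER_WEIGHTS[container]
        subsets[root][label] = max(weight, subsets[root].get(label, 0.0))
    return dict(subsets)


subsets = build_subsets([
    ("tool wear", "cutting edge dulling", "semantic"),
    ("tool wear", "flank wear", "hierarchy"),
    ("tool wear", "tool life prediction", "cross_domain"),
    ("tool wear", "flank wear", "scene"),  # lower weight, ignored
])
```

Each key of `subsets` corresponds to one of the N question-answer demand labels, and its value to that label's hierarchical demand label subset.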
Further, the method further comprises:
loading a first question-answer demand label into the hierarchical expansion container group; performing synonymous term derivation expansion on the first question-answer demand label through the semantic expansion container to obtain a first equivalent expansion label group; loading the first equivalent expansion label group into the hierarchy expansion container for upper-level and lower-level differentiation expansion to obtain a first upper expansion label group and a first lower expansion label group; loading the first upper expansion label group and the first lower expansion label group into the scene expansion container for multi-dimensional associated scene expansion to obtain a first upper multi-dimensional label group and a first lower multi-dimensional label group; identifying a first knowledge association node of the first question-answer demand label and, with the first knowledge association node as a domain constraint, performing cross-domain label binding on the first upper multi-dimensional label group and the first lower multi-dimensional label group in the cross-domain expansion container to obtain a first upper cross-domain label group and a first lower cross-domain label group; and assigning differentiated weights to the first equivalent expansion label group, the first upper expansion label group, the first lower expansion label group, the first upper multi-dimensional label group, the first lower multi-dimensional label group, the first upper cross-domain label group and the first lower cross-domain label group according to label container source, to generate a first hierarchical demand label subset.
Optionally, one question-answer demand label is first randomly selected from the N question-answer demand labels as the first question-answer demand label and loaded into the hierarchical expansion container group. After the container group receives the first question-answer demand label, the semantic expansion container performs synonymous term derivation expansion on it. Specifically, the synonym library, semantic dictionary or domain-specific knowledge base bound to the semantic expansion container is queried, and words with the same meaning are obtained by direct matching. In addition, a pre-trained industry word vector model is loaded and words with similar meaning are found by computing cosine similarity. The co-occurrence probability of each similar word with the first question-answer demand label in the industry corpus is then counted, and similar words whose co-occurrence probability is less than or equal to the industry minimum co-occurrence frequency are removed. The remaining words and the directly matched words are combined into the first equivalent expansion label group, which enriches the semantic range of the label.
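The semantic expansion step combines direct synonym lookup with vector similarity and a co-occurrence filter. The sketch below assumes toy inputs: the synonym table, word vectors, co-occurrence probabilities and both thresholds are hypothetical, and including the original label in the output group is an assumption.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def equivalent_labels(label, synonyms, vectors, cooccur, sim_threshold, min_cooccur):
    """Build an equivalent expansion label group for one demand label.

    Directly matched synonyms are always kept. Similar words found through the
    word vectors are kept only if their co-occurrence probability with the
    label is strictly greater than `min_cooccur`.
    """
    direct = list(synonyms.get(label, []))
    near = [
        word
        for word, vec in vectors.items()
        if word != label
        and word not in direct
        and cosine(vectors[label], vec) >= sim_threshold
        and cooccur.get((label, word), 0.0) > min_cooccur
    ]
    return [label] + direct + near


# Toy data, for illustration only.
synonyms = {"tool wear": ["cutter wear"]}
vectors = {
    "tool wear": [1.0, 0.0],
    "edge dulling": [0.9, 0.1],
    "tool failure": [0.8, 0.2],
    "coolant": [0.0, 1.0],
}
cooccur = {("tool wear", "edge dulling"): 0.05, ("tool wear", "tool failure"): 0.001}

group = equivalent_labels("tool wear", synonyms, vectors, cooccur, 0.9, 0.01)
```

In this toy data "tool failure" is similar enough by cosine similarity but is dropped because it co-occurs with the label too rarely.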
The first equivalent expansion label group is then loaded into the hierarchy expansion container for upper-level and lower-level differentiation expansion. The hierarchy expansion container defines the hierarchical structure of the labels through an industry ontology library (for example, the industry standards, technical documents and dictionaries for wind power generation), forming a tree-shaped upper-lower structure. Each label in the first equivalent expansion label group is matched against this structure, so that the group is divided into a first upper expansion label group and a first lower expansion label group; the upper group represents broad or high-level classifications of the labels, and the lower group represents more specific labels or subclasses. The first upper expansion label group and the first lower expansion label group are then loaded into the scene expansion container, where a pre-trained scene classification model (such as TextCNN) analyzes them and matches typical business scenes. This refines the labels further and generates the first upper multi-dimensional label group and the first lower multi-dimensional label group, bringing the labels closer to the needs of specific scenes and improving their applicability.
Next, the first knowledge association node of the first question-answer demand label is identified and used as a domain constraint. In the cross-domain expansion container, cross-domain label binding is performed on the first upper multi-dimensional label group and the first lower multi-dimensional label group, extending the labels from the current domain to other related domains and generating the first upper cross-domain label group and the first lower cross-domain label group. Weights are then assigned to each label group (the first equivalent expansion label group, first upper expansion label group, first lower expansion label group, first upper multi-dimensional label group, first lower multi-dimensional label group, first upper cross-domain label group and first lower cross-domain label group) according to the source container of its labels. The weights are differentiated by label source and importance; for example, semantically expanded labels may have a higher weight because they directly affect semantic accuracy, while cross-domain expanded labels may have a lower weight. Finally, the expanded labels are combined into a complete first hierarchical demand label subset, which forms part of the hierarchical demand label group and provides a structured input for subsequent map construction and answer extraction.
Taking the label "tool wear" as an example: the first equivalent expansion label group generated by the semantic expansion container is [tool wear, cutting edge dulling, tool failure]. The first upper expansion label group generated by the hierarchy expansion container is [machine tool cutter fault], and the first lower expansion label group is [flank wear, rake face crater wear, edge chipping]. The first upper multi-dimensional label group generated by the scene expansion container is [batch machining tool management, intelligent manufacturing system tool monitoring], and the first lower multi-dimensional label group is [superalloy milling condition, high-speed cutting chatter scene, interrupted machining of composite materials]. The first upper cross-domain label group generated by the cross-domain expansion container is [tool life prediction, spare parts inventory optimization], and the first lower cross-domain label group is [nickel-based alloy phase transition temperature parameter, cutting vibration spectrum characteristic, carbon fiber delamination damage threshold].
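The cascade through the four containers can be sketched with simple lookup tables. This is only an illustration under strong assumptions: each container is reduced to a hypothetical dictionary, whereas the text describes ontology matching, a scene classification model and rule-based binding. The entries are a subset of the tool-wear example.

```python
# Hypothetical lookup tables standing in for the four expansion containers.
SYNONYMS = {"tool wear": ["cutting edge dulling", "tool failure"]}
UPPER = {"tool wear": ["machine tool cutter fault"]}
LOWER = {"tool wear": ["flank wear", "rake face crater wear", "edge chipping"]}
SCENE = {
    "machine tool cutter fault": ["batch machining tool management"],
    "flank wear": ["superalloy milling condition"],
}
CROSS_DOMAIN = {
    "batch machining tool management": ["tool life prediction"],
    "superalloy milling condition": [
        "nickel-based alloy phase transition temperature parameter"
    ],
}


def expand(labels, table):
    """Map every label through `table`, keeping first-seen order, no duplicates."""
    out = []
    for label in labels:
        for item in table.get(label, []):
            if item not in out:
                out.append(item)
    return out


def cascade(label):
    """Run one demand label through the semantic, hierarchy, scene and
    cross-domain stages in order and return the seven resulting label groups."""
    equivalent = [label] + SYNONYMS.get(label, [])
    upper = expand(equivalent, UPPER)
    lower = expand(equivalent, LOWER)
    upper_scene = expand(upper, SCENE)
    lower_scene = expand(lower, SCENE)
    return {
        "equivalent": equivalent,
        "upper": upper,
        "lower": lower,
        "upper_multi_dimensional": upper_scene,
        "lower_multi_dimensional": lower_scene,
        "upper_cross_domain": expand(upper_scene, CROSS_DOMAIN),
        "lower_cross_domain": expand(lower_scene, CROSS_DOMAIN),
    }


subset = cascade("tool wear")
```

Each stage consumes only the output of the previous one, which mirrors the cascade order of the containers described in the text.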
Further, the application provides that, with the first knowledge association node as a domain constraint, cross-domain label binding is performed on the first upper multi-dimensional label group and the first lower multi-dimensional label group in the cross-domain expansion container to obtain the first upper cross-domain label group and the first lower cross-domain label group, the method further comprising:
extracting the industry attribute features of the first knowledge association node; matching a preset cross-domain binding rule library based on the industry attribute features; performing macro cross-domain binding on the first upper multi-dimensional label group using the cross-domain binding rule library and outputting the first upper cross-domain label group; and performing micro cross-domain binding on the first lower multi-dimensional label group using the cross-domain binding rule library and outputting the first lower cross-domain label group.
Optionally, the industry attribute features of the first knowledge association node are extracted first. Industry attribute features are the industry-specific information related to the node, such as the technical field, application scenario and standard requirements; they help to understand and delimit the industry background of the node, so that subsequent label expansion and binding conform more closely to actual industry needs. A preset cross-domain binding rule library is then matched according to the extracted features and the most suitable cross-domain rules are selected. The cross-domain binding rule library contains label association rules between different industries and defines how labels of one domain are bound to labels of other related domains. Macro cross-domain binding is then performed on the first upper multi-dimensional label group using the matched rule library. Macro binding extends labels from the current domain (such as a specific industry) to broader related domains, generally across industries or over a wider range of fields; it maps the labels of the first upper multi-dimensional label group to labels of other industry domains and generates the first upper cross-domain label group. The same rule library is then used to perform micro cross-domain binding on the first lower multi-dimensional label group. Micro binding extends labels from the current domain to related domains at a more specific, finer-grained level; unlike macro binding, it focuses on exact matching between label details and actual application scenarios. This step outputs the first lower cross-domain label group.
The resulting first upper cross-domain label group and first lower cross-domain label group span different domains and provide broader and more diverse label information, which improves the adaptability and accuracy of the question-answering system.
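A rule-library lookup of the kind described can be sketched as follows. The library structure, the `industry` key and all rule entries are hypothetical; the application states only that rules are matched by industry attribute features and applied at macro scale to the upper group and at micro scale to the lower group.

```python
# Hypothetical rule library keyed by an industry attribute feature.
RULE_LIBRARY = {
    "machining": {
        "macro": {
            "batch machining tool management": [
                "tool life prediction",
                "spare parts inventory optimization",
            ],
        },
        "micro": {
            "superalloy milling condition": [
                "nickel-based alloy phase transition temperature parameter",
            ],
        },
    },
}


def bind_cross_domain(industry, upper_labels, lower_labels):
    """Apply macro rules to the upper group and micro rules to the lower group.

    Returns (upper_cross_domain, lower_cross_domain). Labels without a
    matching rule contribute nothing; an unknown industry yields empty groups.
    """
    rules = RULE_LIBRARY.get(industry, {})

    def apply(labels, scale):
        table = rules.get(scale, {})
        return [target for label in labels for target in table.get(label, [])]

    return apply(upper_labels, "macro"), apply(lower_labels, "micro")


upper_cd, lower_cd = bind_cross_domain(
    "machining",
    ["batch machining tool management"],
    ["superalloy milling condition", "high-speed cutting chatter scene"],
)
```

Keeping macro and micro rules in separate tables makes it explicit that the two label groups are bound at different granularities even though they share one rule library.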
The hierarchical demand label group is structured into a question-answer demand label map according to the cross-level association relations of its labels, the question-answer demand label map comprising K layers of expanded demand labels.
In one embodiment, after the hierarchical demand label group is obtained, its labels are structured into a question-answer demand label map according to the cross-level associations between them. Specifically, the question-answer demand labels are first taken as root nodes and the labels are organized, according to the relations among labels of different levels, into a set of demand label trees. Each tree expands layer by layer from its root node and shows the hierarchical relations among the labels. The semantic relevance matrix corresponding to each question-answer demand label is then called locally; these matrices represent the semantic similarity or relevance among labels. On this basis, a cross-tree connector links the demand label trees, using their semantic relevance and hierarchical relations for matching and connection, and outputs a structured question-answer demand label map. The map comprises K layers of expanded demand labels, each layer representing label expansion at a different level. This allows the system to locate the hierarchical labels that correspond to a user question accurately, which improves answer matching and retrieval as well as the accuracy and adaptability of the answers.
Further, the present application provides structuring the hierarchical demand label group into a question-answer demand label graph according to a cross-hierarchy association relationship of labels of the hierarchical demand label group, the question-answer demand label graph including K-layer expanded demand labels, the method comprising:
The N question-answer demand labels are used as root nodes, the N level demand label subsets are structured according to the multi-container cascade execution sequence in the level expansion container group to obtain N question-answer demand label trees, N initial semantic relevance matrixes of the N question-answer demand labels are called locally, a cross-tree connector is driven to carry out relevant connection of the N question-answer demand label trees according to the N initial semantic relevance matrixes, and the question-answer demand label map is output.
Optionally, N question-answer demand labels are selected as root nodes; these labels represent the core demands of the user question. The hierarchical demand label subsets corresponding to the N root nodes are then structured according to the multi-container cascade execution sequence defined in the hierarchical expansion container group, namely the semantic expansion container, the hierarchical expansion container, the scene expansion container and the cross-domain expansion container, so that each subset is expanded layer by layer in the order of the container group and a tree structure with K layers of expanded demand labels is generated. For example, the root node "cutter wear" is divided by the hierarchical expansion container into an upper label "machine tool cutter fault" and a lower label "rear cutter wear" (second layer); the upper and lower labels are associated through the scene expansion container with "batch processing cutter management" and "superalloy milling" (third layer); and finally they are bound through the cross-domain expansion container to "cutter life prediction" (fourth layer), forming a question-answer demand label tree of depth K=4. After the N question-answer demand label trees are structured, the initial semantic association matrix corresponding to each question-answer demand label is invoked locally. Each matrix measures the semantic relatedness among labels and reflects their similarity and connection at the semantic level; it is usually calculated by a natural language processing algorithm and represents the semantic distance between a label and the other labels.
Then, based on the initial semantic association matrices, the cross-tree connector is driven to connect the question-answer demand label trees: it identifies related labels in different trees and links them, establishing associations between the trees. The semantic similarity information in the association matrices guides the connection so that the links are accurate and valid. The result is a tightly connected, multi-layer, structured question-answer demand label map that covers the multi-dimensional demands of the user question, helps capture the user's actual intention, and supports the corresponding industry question-answer results. In this way, simple labels are mapped to complex demands, which supports the efficiency of the question-answering system.
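The cross-tree connection described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the application: the second tree ("surface roughness"), the association scores, the 0.6 threshold, the rule that identical labels score 1.0, and the function names are all assumptions, and the per-label association matrices are simplified to a single lookup table.

```python
from itertools import combinations

# Hypothetical label trees: root label -> layers below the root (layer 2..K).
TREES = {
    "cutter wear": [
        ["machine tool cutter fault", "rear cutter wear"],
        ["batch processing cutter management", "superalloy milling"],
        ["cutter life prediction"],
    ],
    "surface roughness": [
        ["machining quality fault", "chatter marks"],
        ["finish milling"],
        ["cutter life prediction"],
    ],
}

# Hypothetical semantic association scores between labels of different trees.
ASSOCIATION = {
    ("rear cutter wear", "chatter marks"): 0.82,
    ("superalloy milling", "finish milling"): 0.41,
}


def association(a, b):
    """Symmetric lookup; identical labels are assumed fully associated."""
    if a == b:
        return 1.0
    return ASSOCIATION.get((a, b), ASSOCIATION.get((b, a), 0.0))


def flatten(layers):
    """List every label of a tree below its root, layer by layer."""
    return [label for layer in layers for label in layer]


def connect_trees(trees, threshold=0.6):
    """Return cross-tree edges (label_a, label_b, score) at or above threshold."""
    edges = []
    for (_, layers_a), (_, layers_b) in combinations(trees.items(), 2):
        for a in flatten(layers_a):
            for b in flatten(layers_b):
                score = association(a, b)
                if score >= threshold:
                    edges.append((a, b, score))
    return edges


cross_edges = connect_trees(TREES)
```

With these assumed scores, only the pair "rear cutter wear" / "chatter marks" and the shared label "cutter life prediction" are linked; the 0.41 pair falls below the threshold and stays unconnected.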
Further, the method further comprises:
According to the N initial semantic association matrices, the cross-tree connector is driven to connect the N question-answer demand label trees to obtain an initial association map; topology optimization rules are preset, wherein the topology optimization rules are a plurality of differentiated optimization strategies for a plurality of association levels; and, with the plurality of association levels as hierarchical topology optimization trigger conditions, hierarchical network node optimization of the initial association map is performed according to the plurality of differentiated optimization strategies, and the question-answer demand label map is output.
Optionally, the cross-tree connector is first driven, according to the N initial semantic association matrices, to connect the N question-answer demand label trees. Each semantic association matrix reflects the semantic similarity among labels; highly related label combinations are identified from the matrices, and the corresponding nodes in different label trees are connected to form an initial association map. The initial association map contains multiple layers of labels and the semantic relationships among them, so that each demand dimension of the user question is reflected in the map. Then, to optimize the initial association map, the preset topology optimization rules are read locally. The rules define a series of differentiated optimization strategies that recombine, merge or split nodes according to the hierarchical structure of the map, improving the reachability and accuracy of its nodes. Because the rules differ by level, the map can be adjusted flexibly in complex question scenarios and the association relationships of each level are optimized appropriately. The plurality of association levels then serve as hierarchical topology optimization trigger conditions that start the optimization of the map.
In this stage, the connections between nodes are adjusted with the differentiated optimization strategies according to the semantic and structural characteristics of each level. For example, nodes with high similarity are merged to reduce redundancy, which keeps the map compact and improves query efficiency; for nodes that are accessed frequently in queries, the connection weights between them are strengthened so that relevant answers are found more quickly during retrieval; nodes whose out-degree and in-degree are both 0 are marked as isolated nodes and removed from the map; and branches whose support is below a preset support threshold are pruned, where support is an index of label association strength that evaluates how often two or more labels co-occur in historical query data. Through this series of optimizations the question-answer demand label map is obtained. The map reflects the user's demands, improves question-answer efficiency in practical applications, and supports fast and accurate industry question-answer service for complex user questions.
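Two of the optimization strategies above, pruning low-support branches and removing isolated nodes, can be sketched as follows. This is a minimal sketch under stated assumptions: the edge dictionary layout, the function name `optimize_graph`, the default threshold of 0.2 and the order of the two steps (prune first, then drop isolated nodes) are illustrative choices, not details given in the text.

```python
def optimize_graph(nodes, edges, min_support=0.2):
    """Prune low-support edges, then drop nodes left without any edge.

    nodes: list of label names.
    edges: dict mapping (source, target) -> support, i.e. an assumed
           co-occurrence frequency of the two labels in historical queries.
    """
    # Strategy 1: prune branches whose support is below the preset threshold.
    kept_edges = {
        edge: support for edge, support in edges.items() if support >= min_support
    }
    # Strategy 2: a node with in-degree 0 and out-degree 0 is isolated.
    connected = {node for edge in kept_edges for node in edge}
    kept_nodes = [node for node in nodes if node in connected]
    return kept_nodes, kept_edges


nodes = ["cutter wear", "rear cutter wear", "superalloy milling", "spindle noise"]
edges = {
    ("cutter wear", "rear cutter wear"): 0.5,
    ("rear cutter wear", "superalloy milling"): 0.1,
}
optimized_nodes, optimized_edges = optimize_graph(nodes, edges)
```

In this example "spindle noise" is isolated from the start, while "superalloy milling" becomes isolated only after its single low-support edge is pruned, which is why the sketch prunes before checking degrees.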
Based on a preset retrieval convergence vector, hierarchical label extraction of industry questions and answers is performed on the user question-answer information base along the K layers of expanded demand labels in the question-answer demand label map, and a minimum matching domain answer set is output.
In one embodiment,
Further, in the present application, each node of the scientific research information double-spiral path comprises an index module, wherein the index module comprises a sequential index pointing to the next node, a reverse index pointing to the previous node, and a cross index jumping to the other spiral path; at the spiral entry node, parallel spiral retrieval from multiple entrances is performed through the index module.
Optionally, in the scientific research information double-spiral path, each node includes an index module that provides navigation for the node and guides the direction of retrieval. The index module holds three types of index: a sequential index, a reverse index and a cross index. The sequential index points to the next node in the path and is typically used for in-order retrieval. The reverse index points to the previous node in the path, so that retrieval can backtrack and revisit earlier nodes. The cross index points to a node on the other spiral path; it connects the scientific research content path and the scientific research result path and so allows information to flow between them. Once the spiral entry node has been located, the index module supports multi-directional parallel retrieval: the next node can be searched forward (sequential retrieval), the previous node can be traced backward (reverse retrieval), and a related node on the other spiral path can be reached through the cross index, so that related information is searched in several directions at the same time. Guided by the spiral entry node, retrieval can thus switch between scientific research content and scientific research results according to the user's intention, and the parallel retrieval broadens and deepens the information gathered, from which the returned retrieval result is determined.
The index module thereby improves retrieval efficiency, covers the scientific research information from multiple angles, and helps the user find the most relevant scientific research content and results quickly.
Further, in the present application, performing parallel spiral retrieval from multiple entrances through the index module at the spiral entry node comprises:
Candidate retrieval points of the spiral entry node are obtained through the index module; similarity is calculated between the user intention vector and each candidate retrieval point; and the index module determines the next retrieval direction from the candidate retrieval points according to the similarity calculation results.
Optionally, starting from the spiral entry node, the candidate retrieval points of the node are first obtained through the index module. The candidate retrieval points comprise a next node, a previous node and a cross node: the next node follows the current node in the path and is used to continue retrieval forward; the previous node precedes the current node and is used to backtrack along the path; and the cross node lies on the other spiral path, so that retrieval can move between the scientific research content path and the scientific research result path and the retrieval range is extended. For each candidate retrieval point, cosine similarity is calculated between the user intention vector and the semantic vector encoding of the node to measure how well they match; the higher the similarity value, the more relevant the node is to the user's retrieval demand. Then, according to the similarity results of all candidate retrieval points, the index module selects the node with the highest similarity value as the next retrieval direction. The next retrieval direction and the spiral entry node together form the retrieval path that guides the subsequent retrieval steps, which improves retrieval accuracy and efficiency and helps locate the most relevant scientific research information.
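The selection of the next retrieval direction by cosine similarity can be sketched as follows. This is a minimal sketch: the two-dimensional vectors, the candidate names and the function names are illustrative assumptions, and a zero vector is assumed to have similarity 0.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def next_direction(intent_vector, candidates):
    """Pick the candidate retrieval point most similar to the user intention.

    candidates: dict mapping a candidate name ("next", "previous", "cross")
                to the semantic vector encoding of that node.
    """
    return max(candidates, key=lambda name: cosine(intent_vector, candidates[name]))


# Hypothetical vectors for the three candidate retrieval points of one node.
candidates = {
    "next": [0.9, 0.1],
    "previous": [0.0, 1.0],
    "cross": [0.5, 0.5],
}
chosen = next_direction([1.0, 0.0], candidates)
```

Here the intention vector is closest in direction to the "next" candidate, so retrieval would continue forward along the current path.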
In summary, the embodiment of the application has at least the following technical effects:
After the user question is labeled, multi-scale expansion of the demand labels is performed to obtain a hierarchical demand label group. The hierarchical demand label group is then structured into a question-answer demand label map according to the cross-level association relationships of its labels, the map comprising K layers of expanded demand labels. Next, based on a preset retrieval convergence vector, hierarchical label extraction is performed on the user question-answer information base along the K layers of expanded demand labels, and a minimum matching domain answer set is output. K-1 layers of backtracking extraction answer sets are then extracted along the retrieval convergence vector, and an associated expanded answer set is output through associated expansion aggregation. Finally, the minimum matching domain answer set and the associated expanded answer set are assembled into an enhanced industry question-answer response output. Together, these steps address the technical problem that traditional industry question-answering systems understand user demands at a single level and associate labels insufficiently, which leads to low answer matching precision and narrow coverage; through hierarchical expansion and associated expansion of multi-dimensional labels, they improve the accuracy, richness and response speed of question answering.
In a second embodiment, based on the same inventive concept as the industry question-answer enhancement method based on multi-dimensional label automatic extraction in the previous embodiment, and as shown in fig. 2, the application provides an industry question-answer enhancement system based on multi-dimensional label automatic extraction. The system comprises a multi-scale expansion module 11, a label structuring module 12, a label extraction module 13, an expansion aggregation module 14 and a response output module 15. The multi-scale expansion module 11 is used for obtaining a hierarchical demand label group by performing multi-scale expansion of demand labels after the user question is labeled. The label structuring module 12 is used for structuring the hierarchical demand label group into a question-answer demand label map according to the cross-level association relationships of its labels, the question-answer demand label map comprising K layers of expanded demand labels. The label extraction module 13 is used for performing hierarchical label extraction of industry questions and answers on the user question-answer information base along the K layers of expanded demand labels, based on a preset retrieval convergence vector, and outputting a minimum matching domain answer set. The expansion aggregation module 14 is used for extracting K-1 layers of backtracking extraction answer sets along the retrieval convergence vector and then outputting an associated expanded answer set through associated expansion aggregation. The response output module 15 is used for assembling the minimum matching domain answer set and the associated expanded answer set into an enhanced industry question-answer response output.
Further, the multi-scale expansion module 11 is further configured to perform the following method:
Domain-adaptive text processing is performed on the user question to generate a standardized query text; core semantic units of the standardized query text are dynamically extracted using a preset industry keyword library to generate a question-answer demand label group; and multi-scale expansion is performed on the question-answer demand label group to obtain the hierarchical demand label group.
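The first two steps above can be illustrated with a deliberately simplified sketch. The normalization shown (trimming, lower-casing and collapsing whitespace) is only a stand-in for domain-adaptive text processing, and substring matching is only a stand-in for dynamic extraction of core semantic units; the keyword library contents and the function names are assumptions.

```python
import re

# Hypothetical preset industry keyword library.
KEYWORD_LIBRARY = ["cutter wear", "superalloy milling", "spindle noise"]


def normalize(question):
    """Produce a standardized query text (assumed: trim, lowercase, squeeze spaces)."""
    return re.sub(r"\s+", " ", question.strip().lower())


def extract_demand_labels(question, keyword_library=KEYWORD_LIBRARY):
    """Return the library keywords found in the standardized query text."""
    text = normalize(question)
    return [keyword for keyword in keyword_library if keyword in text]


labels = extract_demand_labels(
    "  Why does Cutter  Wear increase during superalloy milling? "
)
```

The resulting list plays the role of the question-answer demand label group that is passed on to multi-scale expansion.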
Further, the multi-scale expansion module 11 is further configured to perform the following method:
A hierarchical expansion container group is pre-constructed; the N question-answer demand labels in the question-answer demand label group are loaded into the hierarchical expansion container group; multi-scale label expansion is performed through the semantic expansion container, the scene expansion container and the cross-domain expansion container; weights are assigned differentially according to the container from which each expansion result originates; and N hierarchical demand label subsets are output, wherein the N hierarchical demand label subsets form the hierarchical demand label group.
Further, the multi-scale expansion module 11 is further configured to perform the following method:
After a first question-answer demand label is loaded into the hierarchical expansion container group, synonymous term derivation and expansion are performed on the first question-answer demand label through the semantic expansion container to obtain a first synonymous expansion label group; the first synonymous expansion label group is loaded into the hierarchical expansion container for upper and lower level differentiation expansion to obtain a first upper expansion label group and a first lower expansion label group; the first upper expansion label group and the first lower expansion label group are loaded into the scene expansion container for multi-dimensional associated scene expansion to obtain a first upper multi-dimensional label group and a first lower multi-dimensional label group; after a first knowledge association node of the first question-answer demand label is identified, cross-domain label binding is performed on the first upper multi-dimensional label group and the first lower multi-dimensional label group in the cross-domain expansion container, with the first knowledge association node as the domain constraint, to obtain a first upper cross-domain label group and a first lower cross-domain label group; and weights are assigned to the first synonymous expansion label group, the first upper expansion label group, the first lower expansion label group, the first upper multi-dimensional label group, the first lower multi-dimensional label group, the first upper cross-domain label group and the first lower cross-domain label group according to their container sources, and a first hierarchical demand label subset is output.
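The container cascade and source-based weighting described above can be sketched as follows. This is a sketch under stated assumptions, not the application's implementation: each container is reduced to a lookup table, the table contents and the weights per container are invented for illustration, the original label is assumed to travel with its synonyms into the hierarchical expansion container, and the domain constraint of the knowledge association node is omitted.

```python
# Hypothetical lookup tables standing in for the four cascaded containers.
SYNONYMS = {"cutter wear": ["tool wear"]}
UPPER = {"cutter wear": ["machine tool cutter fault"]}
LOWER = {"cutter wear": ["rear cutter wear"]}
SCENES = {
    "machine tool cutter fault": ["batch processing cutter management"],
    "rear cutter wear": ["superalloy milling"],
}
CROSS_DOMAIN = {
    "batch processing cutter management": ["cutter life prediction"],
    "superalloy milling": ["cutter life prediction"],
}
# Assumed weights: results from earlier containers count more.
WEIGHTS = {"semantic": 1.0, "hierarchy": 0.8, "scene": 0.6, "cross_domain": 0.4}


def lookup(table, labels):
    """Collect the expansions of every label, without duplicates, in order."""
    found = []
    for label in labels:
        for item in table.get(label, []):
            if item not in found:
                found.append(item)
    return found


def expand(label):
    """Run one label through the cascade; return {label: (source, weight)}."""
    synonyms = lookup(SYNONYMS, [label])
    seeds = [label] + synonyms
    upper, lower = lookup(UPPER, seeds), lookup(LOWER, seeds)
    upper_scene, lower_scene = lookup(SCENES, upper), lookup(SCENES, lower)
    upper_cross = lookup(CROSS_DOMAIN, upper_scene)
    lower_cross = lookup(CROSS_DOMAIN, lower_scene)
    stages = [
        ("semantic", synonyms),
        ("hierarchy", upper + lower),
        ("scene", upper_scene + lower_scene),
        ("cross_domain", upper_cross + lower_cross),
    ]
    subset = {}
    for source, labels in stages:
        for item in labels:
            # The first container that produced a label determines its weight.
            subset.setdefault(item, (source, WEIGHTS[source]))
    return subset


subset = expand("cutter wear")
```

Running one root label through `expand` yields one hierarchical demand label subset; repeating this for all N labels would give the N subsets that form the hierarchical demand label group.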
Further, the multi-scale expansion module 11 is further configured to perform the following method:
Industry attribute features of the first knowledge association node are extracted, and a preset cross-domain binding rule base is matched based on the industry attribute features; the cross-domain binding rule base is used to perform macroscopic cross-domain binding of the first upper multi-dimensional label group, and the first upper cross-domain label group is output; and the cross-domain binding rule base is used to perform microscopic cross-domain binding of the first lower cross-domain label group, and the first lower cross-domain label group is output.
Further, the tag structuring module 12 is further configured to perform the following method:
The N question-answer demand labels are used as root nodes, the N level demand label subsets are structured according to the multi-container cascade execution sequence in the level expansion container group to obtain N question-answer demand label trees, N initial semantic relevance matrixes of the N question-answer demand labels are called locally, a cross-tree connector is driven to carry out relevant connection of the N question-answer demand label trees according to the N initial semantic relevance matrixes, and the question-answer demand label map is output.
Further, the tag structuring module 12 is further configured to perform the following method:
According to the N initial semantic association matrices, the cross-tree connector is driven to connect the N question-answer demand label trees to obtain an initial association map; topology optimization rules are preset, wherein the topology optimization rules are a plurality of differentiated optimization strategies for a plurality of association levels; and, with the plurality of association levels as hierarchical topology optimization trigger conditions, hierarchical network node optimization of the initial association map is performed according to the plurality of differentiated optimization strategies, and the question-answer demand label map is output.
Further, the tag extraction module 13 is further configured to perform the following method:
Core dimension labels are extracted from a newly added user answer entry to obtain initial multi-dimensional labels; standardized multi-scale expansion is performed on the initial multi-dimensional labels using the hierarchical expansion container group, and the resulting label group is stored in discretized form to obtain a discrete matching label group; and, after a bidirectional index association between the discrete matching label group and the newly added user answer is established, the newly added user answer is added to the user question-answer information base.
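The bidirectional index association can be pictured as two mappings kept in step, one from answers to labels and one from labels to answers. The sketch below assumes this reading; the class name `AnswerStore`, its attribute names and the use of a frozen set as the discretized storage form are illustrative assumptions.

```python
from collections import defaultdict


class AnswerStore:
    """Toy user question-answer information base with a two-way label index."""

    def __init__(self):
        self.labels_by_answer = {}                # answer id -> frozenset of labels
        self.answers_by_label = defaultdict(set)  # label -> set of answer ids

    def add(self, answer_id, labels):
        """Store the discrete matching label group and index it in both directions."""
        label_group = frozenset(labels)
        self.labels_by_answer[answer_id] = label_group
        for label in label_group:
            self.answers_by_label[label].add(answer_id)


store = AnswerStore()
store.add("A1", ["cutter wear", "superalloy milling"])
store.add("A2", ["cutter wear", "cutter life prediction"])
```

The forward mapping supports the per-answer similarity traversal described for retrieval, while the reverse mapping allows answers to be looked up directly from a label.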
Further, the tag extraction module 13 is further configured to perform the following method:
A first layer of expanded demand labels is extracted from the question-answer demand label map along the retrieval convergence vector; the first layer of expanded demand labels is used to traverse the discrete matching label group of each historical user question-answer in the user question-answer information base and perform weighted similarity calculation, so as to screen out an initial matching domain answer set that meets a predefined similarity threshold; and hierarchical label extraction is then performed iteratively on the initial matching domain answer set along the remaining K-1 layers of expanded demand labels in the question-answer demand label map until the minimum matching domain answer set is output.
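The layer-by-layer narrowing described above can be sketched as follows. This is a sketch under assumptions: the weighted similarity is taken to be the matched share of a layer's total label weight, the threshold of 0.5 is arbitrary, and stopping when a layer would empty the candidate set is an added design choice; none of these details are specified in the text.

```python
def weighted_similarity(layer_labels, answer_labels):
    """Share of the layer's total label weight that the answer's labels cover."""
    total = sum(layer_labels.values())
    if total == 0:
        return 0.0
    matched = sum(w for label, w in layer_labels.items() if label in answer_labels)
    return matched / total


def extract_minimum_matching_set(layers, answers, threshold=0.5):
    """Filter answers layer by layer; keep the last non-empty candidate set.

    layers:  list of dicts (label -> weight), ordered from layer 1 to layer K.
    answers: dict mapping answer id -> set of discrete matching labels.
    """
    candidates = list(answers)
    for layer in layers:
        filtered = [
            answer_id
            for answer_id in candidates
            if weighted_similarity(layer, answers[answer_id]) >= threshold
        ]
        if not filtered:
            break  # assumed stop rule: never narrow down to an empty set
        candidates = filtered
    return candidates


# Hypothetical expanded demand labels per layer and historical answers.
layers = [
    {"cutter wear": 1.0, "tool wear": 1.0},
    {"rear cutter wear": 1.0},
    {"superalloy milling": 1.0},
]
answers = {
    "A1": {"cutter wear", "rear cutter wear", "superalloy milling"},
    "A2": {"tool wear", "rear cutter wear"},
    "A3": {"spindle noise"},
}
minimum_set = extract_minimum_matching_set(layers, answers)
```

In this example the first layer keeps A1 and A2, the second layer keeps both, and the third layer leaves only A1 as the minimum matching domain answer set.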
It should be noted that the order of the embodiments of the present application is for description only and does not indicate the relative merits of the embodiments. The foregoing description is directed to specific embodiments of this specification. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or a sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The foregoing description of the preferred embodiments of the application is not intended to limit the application to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the application are intended to be included within the scope of the application.
The specification and figures are merely exemplary illustrations of the present application and are considered to cover any and all modifications, variations, combinations, or equivalents that fall within the scope of the application. It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the scope of the application. Thus, the present application is intended to include such modifications and alterations insofar as they come within the scope of the application or the equivalents thereof.

Claims (10)

1.基于多维度标签自动萃取的行业问答增强方法,其特征在于,所述方法包括:1. An industry question-answering enhancement method based on automatic extraction of multi-dimensional tags, characterized in that the method comprises: 在标签化处理用户问题后,通过进行需求标签多尺度扩展,得到层级化需求标签组;After labeling user questions, a hierarchical demand label group is obtained by multi-scale expansion of demand labels; 根据所述层级化需求标签组的标签跨层级关联关系,将所述层级化需求标签组结构化为问答需求标签图谱,所述问答需求标签图谱包括K层扩展需求标签;According to the cross-level association relationship of the labels of the hierarchical demand label group, the hierarchical demand label group is structured into a question-answering demand label graph, wherein the question-answering demand label graph includes K layers of extended demand labels; 基于预设检索收敛向量,在所述问答需求标签图谱沿所述K层扩展需求标签对用户问答信息库进行行业问答的层级标签萃取,输出最小匹配域答案集;Based on the preset retrieval convergence vector, extract the hierarchical labels of industry questions and answers from the user question and answer information database along the K-layer extended demand labels in the question and answer demand label graph, and output the minimum matching domain answer set; 沿所述检索收敛向量提取K-1层回溯萃取答案集后,通过关联扩展聚合输出关联扩展答案集;After extracting the K-1 layer backtracking extraction answer set along the retrieval convergence vector, the associated extended answer set is output through associated extension aggregation; 将所述最小匹配域答案集和关联扩展答案集组装为增强式行业问答响应输出。The minimum matching domain answer set and the associated expanded answer set are assembled into an enhanced industry question and answer response output. 2.如权利要求1所述的基于多维度标签自动萃取的行业问答增强方法,其特征在于,在标签化处理用户问题后,通过进行需求标签多尺度扩展,得到层级化需求标签组,所述方法包括:2. 
The industry question-answering enhancement method based on automatic extraction of multi-dimensional tags according to claim 1 is characterized in that after labeling the user question, a hierarchical demand tag group is obtained by multi-scale expansion of the demand tags, and the method includes: 对所述用户问题进行领域自适应文本处理,生成标准化查询文本;Performing domain-adaptive text processing on the user question to generate standardized query text; 通过预设行业关键词库动态提取所述标准化查询文本的核心语义单元,生成问答需求标签组;Dynamically extract the core semantic units of the standardized query text through a preset industry keyword library to generate a question and answer requirement tag group; 对所述问答需求标签组进行多尺度扩展,得到所述层级化需求标签组。The question-and-answer requirement label group is expanded at multiple scales to obtain the hierarchical requirement label group. 3.如权利要求2所述的基于多维度标签自动萃取的行业问答增强方法,其特征在于,对所述问答需求标签组进行多尺度扩展,得到层级化需求标签组,所述方法包括:3. The industry question and answer enhancement method based on automatic extraction of multi-dimensional tags according to claim 2, characterized in that the question and answer requirement tag group is multi-scale expanded to obtain a hierarchical requirement tag group, and the method includes: 预构建层级扩展容器组,其中,所述层级扩展容器组包括级联的语义扩展容器、层级扩展容器、场景扩展容器和跨域扩展容器;Pre-constructing a hierarchical extension container group, wherein the hierarchical extension container group includes a cascaded semantic extension container, a hierarchical extension container, a scenario extension container, and a cross-domain extension container; 通过将所述问答需求标签组中N个问答需求标签加载至所述层级扩展容器组,经由所述语义扩展容器、场景扩展容器和跨域扩展容器执行多尺度的标签扩展后,根据扩展结果的容器来源进行权重差异化赋值,输出N个层级需求标签子集;By loading N question-and-answer requirement labels in the question-and-answer requirement label group into the hierarchical extension container group, performing multi-scale label expansion through the semantic extension container, scenario extension container, and cross-domain extension container, weighting is differentiated according to the container source of the expansion result, and 
N hierarchical requirement label subsets are output; 其中,所述N个层级需求标签子集构成所述层级化需求标签组。The N hierarchical requirement label subsets constitute the hierarchical requirement label group. 4.如权利要求3所述的基于多维度标签自动萃取的行业问答增强方法,其特征在于,根据所述层级化需求标签组的标签跨层级关联关系,将所述层级化需求标签组结构化为问答需求标签图谱,所述问答需求标签图谱包括K层扩展需求标签,所述方法包括:4. The industry question-and-answer enhancement method based on automatic extraction of multi-dimensional tags according to claim 3 is characterized in that, based on the cross-hierarchical association relationship of the tags in the hierarchical demand tag group, the hierarchical demand tag group is structured into a question-and-answer demand tag graph, the question-and-answer demand tag graph includes K layers of extended demand tags, and the method comprises: 以所述N个问答需求标签为根节点,根据所述层级扩展容器组中多容器级联执行顺序,结构化处理所述N个层级需求标签子集,得到N个问答需求标签树;Taking the N question-and-answer requirement tags as root nodes, and according to the multi-container cascade execution order in the hierarchical extension container group, structurally processing the N hierarchical requirement tag subsets to obtain N question-and-answer requirement tag trees; 本地调用所述N个问答需求标签的N个初始语义关联度矩阵;Locally calling N initial semantic relevance matrices of the N question-answering requirement labels; 依据所述N个初始语义关联度矩阵,驱动跨树连接器进行所述N个问答需求标签树的关联连接,输出所述问答需求标签图谱。Based on the N initial semantic association matrices, drive the cross-tree connector to perform associative connection of the N question and answer demand label trees, and output the question and answer demand label graph. 5.如权利要求4所述的基于多维度标签自动萃取的行业问答增强方法,其特征在于,所述方法还包括:5. 
The industry question-answering enhancement method based on automatic extraction of multi-dimensional tags according to claim 4, characterized in that the method further comprises: 依据所述N个初始语义关联度矩阵,驱动跨树连接器进行所述N个问答需求标签树的关联连接,得到初始关联图谱;According to the N initial semantic relevance matrices, drive the cross-tree connector to perform associative connection of the N question-answering requirement tag trees to obtain an initial association graph; 预设拓扑优化规则,其中,所述拓扑优化规则为多个关联层级的多个差异化优化策略;Presetting a topology optimization rule, wherein the topology optimization rule is a plurality of differentiated optimization strategies at a plurality of associated levels; 以所述多个关联层级为层级拓扑优化触发条件,依据所述多个差异化优化策略进行所述初始关联图谱的层级网络节点优化,输出所述问答需求标签图谱。Taking the multiple association levels as trigger conditions for hierarchical topology optimization, the hierarchical network nodes of the initial association graph are optimized according to the multiple differentiated optimization strategies, and the question-and-answer demand label graph is output. 6.如权利要求3所述的基于多维度标签自动萃取的行业问答增强方法,其特征在于,所述方法还包括:6. The industry question-answering enhancement method based on automatic extraction of multi-dimensional tags according to claim 3, characterized in that the method further comprises: 对新增用户回答条进行核心维度标签萃取,得到初始多维度标签;Extract core dimension labels from newly added user answers to obtain initial multi-dimensional labels; 采用所述层级扩展容器组对所述初始多维度标签执行标准化多尺度扩展后,进行所获标签组的离散化存储,得到离散匹配标签组;After performing standardized multi-scale expansion on the initial multi-dimensional labels using the hierarchical expansion container group, the obtained label groups are discretized and stored to obtain discrete matching label groups; 在建立所述离散匹配标签组与新增用户回答的双向索引关联后,将所述新增用户回答添加至所述用户问答信息库。After establishing a bidirectional index association between the discrete matching tag group and the newly added user answer, the newly added user answer is added to the user question and answer information library. 
7. The industry question-answering enhancement method based on automatic extraction of multi-dimensional tags according to claim 6, characterized in that performing, based on the preset retrieval convergence vector, hierarchical tag extraction of industry questions and answers on the user question-answer information base along the K layers of extended requirement tags in the question-answer requirement tag graph, and outputting the minimum matching domain answer set, further comprises:
extracting the first layer of extended requirement tags from the question-answer requirement tag graph along the retrieval convergence vector;
using the first layer of extended requirement tags to traverse the discrete matching tag group of each historical user question-answer in the user question-answer information base and performing a weighted similarity calculation, so as to screen out an initial matching domain answer set that satisfies a predefined similarity threshold;
and so on, iteratively performing hierarchical tag extraction on the initial matching domain answer set along the remaining K-1 layers of extended requirement tags in the question-answer requirement tag graph, until the minimum matching domain answer set is output.

8. The industry question-answering enhancement method based on automatic extraction of multi-dimensional tags according to claim 3, characterized in that the method further comprises:
after loading the first question-answer requirement tag into the hierarchical extension container group, performing synonym derivation and expansion on the first question-answer requirement tag via the semantic extension container to obtain a first synonym extension tag group;
loading the first synonym extension tag group into the hierarchical extension container for upper-level and lower-level differentiation expansion to obtain a first upper extension tag group and a first lower extension tag group;
loading the first upper extension tag group and the first lower extension tag group into the scenario extension container for multi-dimensional associated-scenario expansion to obtain a first upper multi-dimensional tag group and a first lower multi-dimensional tag group;
after identifying the first knowledge association node of the first question-answer requirement tag, performing, with the first knowledge association node as a domain constraint, cross-domain tag binding on the first upper multi-dimensional tag group and the first lower multi-dimensional tag group in the cross-domain extension container to obtain a first upper cross-domain tag group and a first lower cross-domain tag group;
assigning weights, according to the source tag container, to the first synonym extension tag group, the first upper extension tag group, the first lower extension tag group, the first upper multi-dimensional tag group, the first lower multi-dimensional tag group, the first upper cross-domain tag group, and the first lower cross-domain tag group, and outputting a first-level requirement tag subset.

9. The industry question-answering enhancement method based on automatic extraction of multi-dimensional tags according to claim 8, characterized in that performing, with the first knowledge association node as a domain constraint, cross-domain tag binding on the first upper multi-dimensional tag group and the first lower multi-dimensional tag group in the cross-domain extension container to obtain the first upper cross-domain tag group and the first lower cross-domain tag group, further comprises:
extracting industry attribute features of the first knowledge association node, and matching a preset cross-domain binding rule base according to the industry attribute features;
using the cross-domain binding rule base to perform macro cross-domain binding on the first upper multi-dimensional tag group, and outputting the first upper cross-domain tag group;
using the cross-domain binding rule base to perform micro cross-domain binding on the first lower cross-domain tag group, and outputting the first lower cross-domain tag group.

10. An industry question-answering enhancement system based on automatic extraction of multi-dimensional tags, characterized in that the system is used to execute the industry question-answering enhancement method based on automatic extraction of multi-dimensional tags according to any one of claims 1 to 9, and the system comprises:
a multi-scale expansion module: after tagging the user question, performs multi-scale expansion of the requirement tags to obtain a hierarchical requirement tag group;
a tag structuring module: structures the hierarchical requirement tag group into a question-answer requirement tag graph according to the cross-level association relationships among the tags in the hierarchical requirement tag group, wherein the question-answer requirement tag graph includes K layers of extended requirement tags;
a tag extraction module: based on the preset retrieval convergence vector, performs hierarchical tag extraction of industry questions and answers on the user question-answer information base along the K layers of extended requirement tags in the question-answer requirement tag graph, and outputs the minimum matching domain answer set;
an extension aggregation module: after extracting the K-1 layers of backtracking extraction answer sets along the retrieval convergence vector, outputs an associated extended answer set through associated extension aggregation;
a response output module: assembles the minimum matching domain answer set and the associated extended answer set into an enhanced industry question-answering response for output.
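
The layer-by-layer screening recited in claim 7, together with the per-container weighting of claim 8, can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `Tag`, `CONTAINER_WEIGHTS`, `weighted_similarity` and `extract_minimum_answer_set`, the weight values, and the overlap-ratio similarity formula are all assumptions made for illustration; the claims do not specify a similarity formula or concrete weights. The layers are assumed to be supplied already ordered along the retrieval convergence vector.

```python
from dataclasses import dataclass

# Hypothetical weights per source container. Claim 8 says weights are assigned
# by container source but gives no values, so these numbers are assumptions.
CONTAINER_WEIGHTS = {
    "synonym": 1.0,
    "hierarchy": 0.8,
    "scenario": 0.6,
    "cross_domain": 0.4,
}


@dataclass(frozen=True)
class Tag:
    text: str
    source: str  # name of the extension container that produced the tag

    @property
    def weight(self) -> float:
        return CONTAINER_WEIGHTS[self.source]


def weighted_similarity(layer_tags, answer_tags):
    """Share of the layer's total tag weight that the answer's tags cover.

    This overlap ratio is an assumed stand-in for the claimed
    "weighted similarity calculation".
    """
    total = sum(t.weight for t in layer_tags)
    if total == 0:
        return 0.0
    matched = sum(t.weight for t in layer_tags if t.text in answer_tags)
    return matched / total


def extract_minimum_answer_set(layers, answers, threshold):
    """Narrow the candidate answers one layer at a time.

    layers:  list of K lists of Tag, assumed ordered along the convergence vector
    answers: dict mapping answer id -> set of tag strings (discrete matching tags)
    Returns the final candidate set and the candidate set kept after each layer.
    """
    candidates = set(answers)
    per_layer = []
    for layer in layers:
        # Keep only answers that still meet the threshold for this layer.
        candidates = {
            a for a in candidates
            if weighted_similarity(layer, answers[a]) >= threshold
        }
        per_layer.append(set(candidates))
    return candidates, per_layer


# Illustrative data only; none of it comes from the patent.
layers = [
    [Tag("loan", "synonym"), Tag("credit", "synonym")],
    [Tag("small business loan", "hierarchy"), Tag("collateral", "scenario")],
]
answers = {
    "a1": {"loan", "credit", "small business loan", "collateral"},
    "a2": {"loan", "credit"},
    "a3": {"insurance"},
}
minimum_set, per_layer = extract_minimum_answer_set(layers, answers, threshold=0.5)
print(minimum_set)
print(per_layer)
```

In this toy run the first layer keeps `a1` and `a2`, and the second layer narrows the set to `a1`. The per-layer sets are retained only as a convenience of the sketch; whether they correspond to the backtracking extraction answer sets of claim 10 is not stated in the source.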
CN202511075834.3A 2025-08-01 Industry question-answering enhancement method and system based on automatic extraction of multi-dimensional tags Active CN120578746B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202511075834.3A CN120578746B (en) 2025-08-01 Industry question-answering enhancement method and system based on automatic extraction of multi-dimensional tags

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202511075834.3A CN120578746B (en) 2025-08-01 Industry question-answering enhancement method and system based on automatic extraction of multi-dimensional tags

Publications (2)

Publication Number Publication Date
CN120578746A (en) 2025-09-02
CN120578746B (en) 2025-10-14

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7669147B1 (en) * 2009-01-02 2010-02-23 International Business Machines Corporation Reorienting navigation trees based on semantic grouping of repeating tree nodes
CN111737400A (en) * 2020-06-15 2020-10-02 上海理想信息产业(集团)有限公司 Knowledge reasoning-based big data service tag expansion method and system
CN118779438A (en) * 2024-09-12 2024-10-15 北京北龙青云软件有限公司 Data intelligent question answering method and system integrating domain knowledge
CN119474328A (en) * 2025-01-16 2025-02-18 中铁建设集团有限公司 An intelligent question-answering method and system in the field of building construction
CN119669415A (en) * 2024-11-29 2025-03-21 中国农业银行股份有限公司 Data question and answer processing method, device, electronic device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XI ZHANG et al.: "A Multi-level and Multi-label Annotation Strategy for User Questions in ICT Customer Service", 2020 IEEE 4TH INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC), 14 June 2020 (2020-06-14) *
WANG Ruoni; ZHAO Huiling: "大数据技术发展趋势及灯塔大数据行业应用平台" [Development trends of big data technology and the Lighthouse big data industry application platform], ZTE Technology Journal (中兴通讯技术), no. 03, 4 March 2016 (2016-03-04) *

Similar Documents

Publication Publication Date Title
Hao et al. Integrating and navigating engineering design decision-related knowledge using decision knowledge graph
Wang et al. Q2semantic: A lightweight keyword interface to semantic search
CN118780398B (en) Large model training method and data query method based on large model
CN118967113A (en) A wind turbine intelligent decision-making method and system based on retrieval enhancement to generate large models
US20120130999A1 (en) Method and Apparatus for Searching Electronic Documents
CN115879441B (en) Text novelty detection method and device, electronic equipment and readable storage medium
CN119903234B (en) A law and regulation recommendation system and method based on multi-channel fusion
KR102096328B1 (en) Platform for providing high value-added intelligent research information based on prescriptive analysis and a method thereof
CN118964584B (en) A knowledge point search method and system based on binary tree structure
Wei et al. A data-driven human–machine collaborative product design system toward intelligent manufacturing
Ding et al. An automatic patent literature retrieval system based on LLM-RAG
CN119761376A (en) Project research content duplicate checking method and device based on semantic alignment and electronic equipment
Angermann et al. Taxonomy matching using background knowledge
Ghanadi Nezhad et al. Forecasting the subject trend of international library and information science research by 2030 using the deep learning approach
CN118503304A (en) Data table recall method, device, equipment and storage medium
CN120578746B (en) Industry question-answering enhancement method and system based on automatic extraction of multi-dimensional tags
CN120578746A (en) Industry question-answer enhancement method and system based on multi-dimensional label automatic extraction
CN117786042A (en) Policy interpretation method, device and storage medium
CN114186079B (en) Log field name generation method, system and electronic device based on knowledge graph
CN112668836B (en) Risk spectrum-oriented associated risk evidence efficient mining and monitoring method and apparatus
Vyas et al. Extraction of professional details from Web-URLs using DeepDive
Ordoñez et al. Business Process Models Clustering Based on Multimodal Search, K-means, and Cumulative and No-Continuous N-Grams
CN120508607B (en) Text retrieval optimization method and system for electric power scientific research information
Echarte et al. Self-adaptation of ontologies to folksonomies in semantic web
Roostaee Language-independent Profile-based Tag Recommendation for Community Question Answering Systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant