CN119474380B - A conflict and dispute event early warning method, system, program product and storage medium - Google Patents

A conflict and dispute event early warning method, system, program product and storage medium

Info

Publication number
CN119474380B
Authority
CN
China
Prior art keywords
data
information
event
personnel
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202510024792.4A
Other languages
Chinese (zh)
Other versions
CN119474380A (en)
Inventor
罗军
梁建波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Jieyun Software Co ltd
Original Assignee
Fujian Jieyun Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Jieyun Software Co ltd
Priority to CN202510024792.4A
Publication of CN119474380A
Application granted
Publication of CN119474380B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/322 Trees
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a conflict and dispute event early warning method, system, program product and storage medium, and relates to the technical field of information and communication technology specially adapted for administrative, commercial, financial, managerial or supervisory purposes. First feature data related to the topic information is extracted from the event structure tree using a feature extraction rule, so that the key features of the event are presented clearly. For personnel data, a personnel structure tree is constructed through word segmentation and part-of-speech tagging, sentiment analysis is performed on the extracted words, and negative-sentiment words are marked for the subsequent determination of conflict and dispute events. The topic information and the personnel are then clustered according to the first feature data and the second feature data, with the personnel carrying their marked words during clustering, so that each event is closely linked to the personnel involved and the data is deeply integrated. This resolves the problem of disordered and mismatched data, allows the actual situation and development trend of conflict and dispute events to be judged, and improves early warning accuracy.

Description

Conflict and dispute event early warning method, system, program product and storage medium
Technical Field
The application relates to the technical field of information and communication technology specially adapted for administrative, commercial, financial, managerial or supervisory purposes, and in particular to a conflict and dispute event early warning method, system, program product and storage medium.
Background
In the related art, data sources are numerous and varied, and the acquisition method, storage format and data quality differ from one source to another. Even when a large amount of data related to conflicts and disputes exists, the data is disordered and mismatched, so key information cannot be extracted effectively. It is therefore difficult to judge the actual situation and development trend of a conflict or dispute accurately, and early warnings are inaccurate.
Disclosure of Invention
The application provides a conflict and dispute event early warning method, system, program product and storage medium, which are used to resolve the problem of disordered and mismatched data and to improve early warning accuracy.
In a first aspect, the application provides a conflict and dispute event early warning method. Data information is extracted from data sources and divided into different types according to a preset data storage mode and a business data relationship, the types including event data, whose type is event, and personnel data, whose type is personnel. The data sources include a demand collection source, a grid worker reporting source, a technical patrol and alarm collection source, a higher-level and departmental data source, and an offline data source. Word segmentation is performed on the event data and on the personnel data to obtain words, part-of-speech tagging is performed on the words, and an event structure tree and a personnel structure tree are constructed from the respective tagging results. Topic information, that is, information reflecting the content or theme of the event, is extracted from the event structure tree, and first feature data related to the topic information is extracted from the event structure tree using a feature extraction rule. Words are extracted from the personnel structure tree and subjected to sentiment analysis, the words judged to carry negative sentiment are marked, and second feature data related to the personnel is extracted from the personnel structure tree using the feature extraction rule. The topic information and the personnel are clustered according to the first feature data and the second feature data, with the personnel carrying their marked words during clustering. Finally, the threshold for the topic information is read from a threshold configuration file; when the marked words exceed the corresponding threshold, the event to which the topic information belongs is determined to be a conflict and dispute event, and the corresponding clustered personnel are output.
With this technical solution, data information is extracted from multiple types of data sources and divided into event data and personnel data, which gives a preliminary classification and prevents the data from becoming disordered. The event data is then segmented into words and tagged with parts of speech, an event structure tree is constructed, and topic information is extracted from it, which greatly reduces interference from disordered data and locates the core content of the event accurately. First feature data related to the topic information is then extracted from the event structure tree using a feature extraction rule, so that the key features of the event are presented clearly. For personnel data, a personnel structure tree is constructed through word segmentation and part-of-speech tagging, sentiment analysis is performed on the extracted words, and negative-sentiment words are marked for the subsequent determination of conflict and dispute events. Second feature data related to the personnel is extracted using the feature extraction rule, which mines the key information in the personnel data and establishes an initial link with the event. The topic information and the personnel are clustered according to the first feature data and the second feature data, with the personnel carrying their marked words during clustering, so that each event is closely linked to the personnel involved and the data is deeply integrated. Finally, the threshold for the topic information is read from the threshold configuration file; when the marked words exceed the corresponding threshold, the event to which the topic information belongs is determined to be a conflict and dispute event and the corresponding clustered personnel are output. This resolves the problem of disordered and mismatched data, allows the actual situation and development trend of the conflict and dispute event to be judged, and improves early warning accuracy.
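The final decision step can be illustrated with a minimal sketch. It assumes that the threshold configuration file is JSON mapping each topic to an integer, that "the marked words exceed the threshold" means the total count of marked words in the cluster, and that a cluster is a mapping from person identifiers to their marked words. The file format, the sample topics and the names `load_thresholds` and `warn_for_cluster` are hypothetical; the application does not specify them.

```python
import json


def load_thresholds(config_text):
    """Parse a threshold configuration (assumed JSON: topic -> integer threshold)."""
    return json.loads(config_text)


def warn_for_cluster(topic, marked_words_by_person, thresholds):
    """Return the clustered persons when the marked-word count exceeds the topic threshold.

    marked_words_by_person maps a person identifier to the negative-sentiment
    words marked for that person. An empty list means no warning is raised.
    """
    threshold = thresholds[topic]
    marked_total = sum(len(words) for words in marked_words_by_person.values())
    if marked_total > threshold:
        return sorted(marked_words_by_person)
    return []


# Hypothetical configuration and cluster, for illustration only.
thresholds = load_thresholds('{"property fee dispute": 2, "noise complaint": 5}')
cluster = {"resident_A": ["angry", "threaten"], "resident_B": ["refuse"]}

print(warn_for_cluster("property fee dispute", cluster, thresholds))
print(warn_for_cluster("noise complaint", cluster, thresholds))
```

Keeping thresholds per topic in a separate configuration lets the sensitivity of each topic be tuned without changing the clustering logic.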
In combination with some embodiments of the first aspect, in some embodiments, the step of dividing the data information into different types according to a preset data storage mode and a business data relationship specifically includes: extracting keywords from the current data information and dividing the data information into different types according to the types of the keywords, the current data information being any piece of data information; after classification, adding a source identification field to the current data information, the source identification field being an identification value assigned according to the data source from which the current data information originates; constructing, based on the source identification field, the business data relationship with the current data information as the primary node and the data information that has a circulation or association relationship with it as secondary nodes; adding the classification of the current data information to the data information that has a circulation or association relationship with it; and storing the data information of different types in the corresponding databases.
With this technical solution, keywords are extracted from the data information, the data information is classified according to the types of the keywords, a source identification field is added, and the business data relationship is then constructed. In this process the classification of the primary node is also assigned to its secondary nodes, so that data that was previously isolated, or whose relationships were unclear, becomes closely connected. The data gains more dimensions and the correlation between data items is strengthened. This avoids the one-sided view that results from missing data associations, gives the early warning a more accurate basis for judgment, and improves the accuracy of conflict and dispute event early warning.
In combination with some embodiments of the first aspect, in some embodiments, the step of storing the data information of different types in the corresponding databases specifically includes: for data information that has multiple types, randomly selecting one of the corresponding databases for storage, and adding to the storage record index identifiers that trace to the other corresponding databases.
With this technical solution, data information that has multiple types is stored in a randomly selected database and index identifiers are added. Random storage can balance the load across the databases and prevents excessive pressure on a single database from degrading overall performance. The index identifiers provide a convenient path for tracing data: when a comprehensive analysis is needed, related data in the other databases can be found quickly through the index even though the data is stored in a scattered manner.
In combination with some embodiments of the first aspect, in some embodiments, the step of extracting topic information from the event structure tree specifically includes extracting the topic information from the event structure tree using a topic information extraction model. The training process of the topic information extraction model includes: presetting training topic information and a training corpus set, the training corpus set including training corpora of categories related to the training topic information and training corpora of unrelated categories; traversing the training corpus set once, identifying each word by syntactic analysis and generating a corresponding extraction rule for the word; traversing the training corpus set again and generating statistical data for each extraction rule; scoring each extraction rule according to the statistical data, the score being positively correlated with the number of times the rule occurs in the training corpora of categories related to the training topic information and negatively correlated with the number of times it occurs in the training corpora of unrelated categories; and retaining the extraction rules whose scores exceed a scoring threshold.
With this technical solution, training topic information and a training corpus set containing related and unrelated categories are preset, the corpus set is traversed to generate extraction rules, and the rules are scored and screened according to how often they occur in the corpora of each category. Rules that occur frequently in related categories and rarely in unrelated categories are retained, so that topic extraction focuses more precisely on content related to conflict and dispute events and interference from irrelevant information is reduced. The core topic of an event can be extracted accurately, which improves the accuracy and effectiveness of event information processing throughout the early warning process.
In combination with some embodiments of the first aspect, in some embodiments, the step of scoring the extraction rule according to the statistical data specifically includes: the score of an extraction rule is the number of times it occurs in the training corpora of categories related to the training topic information divided by the number of times it occurs in the training corpora of unrelated categories; after all scores are obtained, the scores are binarized.
With this technical solution, the ratio of a rule's occurrences in related-category training corpora to its occurrences in unrelated-category training corpora is used as the extraction rule score, which directly reflects how strongly the rule correlates with the topic information. Binarizing the scores simplifies the result further and makes clear whether each rule is usable.
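The ratio score and the binarization can be sketched as follows. The application divides the related count by the unrelated count directly; the additive `smoothing` term used here to avoid division by zero, the choice to binarize against the scoring threshold, and the rule names and counts are all assumptions made for illustration.

```python
def rule_score(related_count, unrelated_count, smoothing=1.0):
    """Ratio of related to unrelated occurrences.

    The smoothing term is an assumption: it keeps the ratio defined
    when a rule never occurs in the unrelated corpora.
    """
    return related_count / (unrelated_count + smoothing)


def binarize_scores(stats, score_threshold, smoothing=1.0):
    """Map each rule to 1 (retain) or 0 (discard).

    stats maps a rule name to (related_count, unrelated_count).
    """
    return {
        rule: 1 if rule_score(related, unrelated, smoothing) > score_threshold else 0
        for rule, (related, unrelated) in stats.items()
    }


# Hypothetical rule statistics gathered on the second corpus traversal.
stats = {
    "subject+VERB(argue)": (40, 1),   # 40 / (1 + 1) = 20.0
    "NOUN(weather)": (3, 30),         # 3 / (30 + 1), well below 1
    "VERB(demand)+object": (25, 4),   # 25 / (4 + 1) = 5.0
}

retained = binarize_scores(stats, score_threshold=2.0)
print(retained)
```

A rule marked 1 would be kept in the topic information extraction model, and a rule marked 0 would be dropped.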
In combination with some embodiments of the first aspect, in some embodiments, after the step of extracting the data information from the data sources and dividing it into different types according to the preset data storage mode and the business data relationship, the method further includes storing data information whose time is below a time threshold in an in-memory database and storing data information whose time is not below the time threshold in a distributed file system.
With this technical solution, data information whose time is below the threshold is stored in the in-memory database. Because the in-memory database reads and writes quickly, recently generated data, which is likely to be closely related to current conflict and dispute events, can be retrieved and analysed quickly, which speeds up data processing. Data whose time is not below the threshold is stored in the distributed file system, which preserves it for the long term without occupying excessive memory, and the distributed architecture is well suited to managing data at scale.
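A minimal sketch of this routing is given below. It assumes that "time" means the age of a record (current time minus creation time) and uses plain dictionaries as stand-ins for the in-memory database and the distributed file system; the class `TieredStore` and its parameters are hypothetical.

```python
import time


class TieredStore:
    """Route records to a fast tier or a long-term tier by age."""

    def __init__(self, age_threshold_seconds):
        self.age_threshold_seconds = age_threshold_seconds
        self.memory_db = {}    # stand-in for an in-memory database
        self.file_system = {}  # stand-in for a distributed file system

    def store(self, record_id, record, created_at, now=None):
        """Store the record and return the name of the tier that received it."""
        now = time.time() if now is None else now
        age = now - created_at
        if age < self.age_threshold_seconds:
            self.memory_db[record_id] = record
            return "memory"
        # "Not below the threshold" includes an age equal to the threshold.
        self.file_system[record_id] = record
        return "dfs"


store = TieredStore(age_threshold_seconds=3600)
print(store.store("e1", {"text": "noise complaint"}, created_at=1000, now=1500))
print(store.store("e2", {"text": "old report"}, created_at=0, now=3600))
```

In a deployment the threshold would presumably be re-evaluated over time so that ageing records migrate from the fast tier to the long-term tier; that migration is outside this sketch.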
In combination with some embodiments of the first aspect, in some embodiments, after the step of storing data information below the time threshold in the in-memory database and storing data information not below the time threshold in the distributed file system, the method further includes establishing a data index for the data information.
With this technical solution, a data index is built for the data information. A query no longer needs to traverse the whole database: the required data, whether event data or personnel data, can be located through the index and retrieved quickly.
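The benefit of an index can be illustrated with a small sketch that maps each value of a chosen field to the identifiers of the records holding it. The application does not specify the index structure; the field names, the sample records and the functions `build_index` and `lookup` are hypothetical.

```python
from collections import defaultdict


def build_index(records, field_name):
    """Map each value of field_name to the ids of the records that hold it."""
    index = defaultdict(list)
    for record_id, record in records.items():
        index[record[field_name]].append(record_id)
    return dict(index)


def lookup(index, value):
    """Return matching record ids without scanning every record."""
    return index.get(value, [])


records = {
    "r1": {"type": "event", "area": "district_1"},
    "r2": {"type": "person", "area": "district_1"},
    "r3": {"type": "event", "area": "district_2"},
}

type_index = build_index(records, "type")
print(lookup(type_index, "event"))
```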
In a second aspect, the application provides a conflict and dispute event early warning system comprising one or more processors and a memory coupled to the one or more processors. The memory stores computer program code comprising computer instructions, and the one or more processors invoke the computer instructions to cause the conflict and dispute event early warning system to perform the method described in the first aspect or in any possible implementation of the first aspect.
In a third aspect, the application provides a computer program product comprising instructions which, when run on a conflict and dispute event early warning system, cause the system to perform the method described in the first aspect or in any possible implementation of the first aspect.
In a fourth aspect, the application provides a computer-readable storage medium comprising instructions which, when run on a conflict and dispute event early warning system, cause the system to perform the method described in the first aspect or in any possible implementation of the first aspect.
One or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:
1. Data information is extracted from multiple types of data sources and divided into event data and personnel data, which gives a preliminary classification and prevents the data from becoming disordered. Structure trees built through word segmentation and part-of-speech tagging allow the topic information, the first feature data and the second feature data to be extracted with far less interference from unrelated data, and negative-sentiment words are marked in the personnel data. Clustering the topic information and the personnel, with the personnel carrying their marked words, links each event closely to the personnel involved. The threshold for the topic information is then read from the threshold configuration file; when the marked words exceed the corresponding threshold, the event is determined to be a conflict and dispute event and the corresponding clustered personnel are output. This resolves the problem of disordered and mismatched data, allows the actual situation and development trend of the event to be judged, and improves early warning accuracy.
2. Keywords are extracted from the data information, the data information is classified according to the types of the keywords, a source identification field is added, and the business data relationship is constructed. The classification of the primary node is also assigned to its secondary nodes, so that data that was previously isolated, or whose relationships were unclear, becomes closely connected. This avoids the one-sided view that results from missing data associations and gives the early warning a more accurate basis for judgment.
3. Training topic information and a training corpus set containing related and unrelated categories are preset, extraction rules are generated by traversing the corpus set, and the rules are scored and screened according to how often they occur in the corpora of each category. Rules that occur frequently in related categories and rarely in unrelated categories are retained, so that topic extraction focuses on content related to conflict and dispute events and the core topic of each event is extracted accurately.
Drawings
FIG. 1 is a schematic flow chart of a conflict and dispute event early warning method in an embodiment of the present application;
FIG. 2 is a schematic flow chart of step S101 in an embodiment of the present application;
FIG. 3 is a schematic flow chart of step S103 in an embodiment of the present application;
FIG. 4 is a schematic diagram of an exemplary hardware architecture of a conflict and dispute event early warning system in an embodiment of the present application.
Detailed Description
The terminology used in the following embodiments of the application is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in this application refers to and encompasses any and all possible combinations of one or more of the listed items.
The terms "first," "second," and the like are used below for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more such features. In the description of embodiments of the application, unless otherwise indicated, "a plurality" means two or more.
Referring to FIG. 1, FIG. 1 is a schematic flow chart of a conflict and dispute event early warning method according to an embodiment of the present application.
S101, extracting data information from data sources, and dividing the data information into different types according to a preset data storage mode and a business data relationship, the different types including event data, whose type is event, and personnel data, whose type is personnel;
A data source is a channel through which data is acquired. The business data relationship is the logical thread formed as data is generated, circulated and associated in the course of business, and the origin and evolution of the data can be traced clearly through this relationship.
In some embodiments, after data information has been collected from the various data sources, its storage path and format are determined according to the preset data storage mode, based on the type of the data, its generation time, the area it belongs to, and so on. At the same time, the internal relationships between data items are analysed according to the business data relationship. For example, if reported data is found to be associated with demand data, the relevant data information is divided into different types on the basis of that association: information about a specific occurrence is classified as event data, and information about the people involved is classified as personnel data.
It should be noted that some data, such as video data, is not text and must first be converted to text so that it can be integrated into the overall data processing flow. For the parts of a video that contain speech, such as recorded conversations of on-site personnel, speech recognition is used to convert the speech into text. During conversion, characteristics of the speech such as tone, intonation and speaking rate are analysed to improve recognition accuracy and to reduce errors caused by factors such as environmental noise and accent.
For the parts of a video that contain images but no speech, image recognition is combined with natural language generation.
Once the video has been converted to text, data that originally existed in video form can be classified and integrated together with the other text data according to the established data storage mode and business data relationship. It can then take part in the subsequent analysis operations, such as construction of the structure trees and extraction of feature data, so that all kinds of data contribute to the early warning of conflict and dispute events.
In some specific embodiments, step S101 specifically includes:
Referring to FIG. 2, FIG. 2 is a schematic flow chart of step S101 in an embodiment of the present application.
S1011, extracting keywords in the current data information, and dividing the data information into different types according to the types of the keywords;
Here, keywords are words that represent the core content, key theme or important characteristics of the data information. The current data information is any individual piece of data content collected from the data sources.
In some embodiments, a keyword extraction algorithm from natural language processing, for example one based on word frequency statistics or TF-IDF, is first used to analyse the current data information and extract its keywords. The data information is then divided into different categories according to the types of these keywords, for example types relating to people, events, places or times.
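As one possible reading of the TF-IDF option, the sketch below ranks the words of one document by term frequency multiplied by inverse document frequency. It assumes the text has already been segmented into tokens and uses the plain `log(N / df)` form of IDF; the sample documents and the function `tfidf_keywords` are hypothetical, and the application does not state which TF-IDF variant it uses.

```python
import math
from collections import Counter


def tfidf_keywords(documents, doc_index, top_k=3):
    """Return the top_k tokens of documents[doc_index] ranked by TF-IDF.

    documents is a list of token lists (already word-segmented).
    Ties are broken alphabetically so the result is deterministic.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each token once per document

    tokens = documents[doc_index]
    term_freq = Counter(tokens)
    scores = {
        term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
        for term, count in term_freq.items()
    }
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_k]]


documents = [
    ["neighbor", "noise", "dispute", "noise", "night"],
    ["community", "meeting", "notice", "night"],
    ["community", "garbage", "notice"],
]

print(tfidf_keywords(documents, doc_index=0, top_k=2))
```

Words that appear in many documents, such as "night" here, receive a low weight, so the words that distinguish the current data information rise to the top.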
In some specific embodiments, a custom keyword dictionary is built that covers the common keywords for each preset topic. The words in the current data information are matched against the dictionary one by one, the number of keyword occurrences is counted for each topic, and the type of the data information is determined from these frequencies.
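The dictionary-matching approach can be sketched as follows. The dictionary contents are invented for illustration, and the tie rule, under which a record receives every type that shares the highest count, is an assumption that fits the later handling of data information with multiple types.

```python
from collections import Counter

# Hypothetical custom keyword dictionary: type -> keywords for that type.
KEYWORD_DICTIONARY = {
    "event": {"dispute", "quarrel", "complaint", "incident"},
    "person": {"resident", "worker", "owner", "tenant"},
}


def classify(tokens, dictionary=KEYWORD_DICTIONARY):
    """Return the type(s) whose keywords occur most often in tokens.

    Returns an empty list when no dictionary keyword is matched.
    """
    counts = Counter()
    for token in tokens:
        for data_type, keywords in dictionary.items():
            if token in keywords:
                counts[data_type] += 1
    if not counts:
        return []
    best = max(counts.values())
    return sorted(t for t, c in counts.items() if c == best)


print(classify(["resident", "complaint", "noise", "complaint"]))
print(classify(["resident", "dispute"]))
```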
S1012, after classification, adding a source identification field to the current data information, the source identification field being an identification value assigned according to the data source from which the current data information originates;
The source identification field is a marker that specifies the data source from which the data information was acquired; different sources are represented by different identification values.
In some embodiments, this step is performed after the preliminary classification so that the origin of the data can be traced later, which makes it easier to assess the reliability and accuracy of the data and to control data quality. A preset identification value correspondence table is looked up according to the data source from which the classified current data information was actually obtained, and the corresponding source identification field is added to the current data information.
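The table lookup can be sketched as below. The five sources mirror those named in the method, but the key names, the identification values such as "SRC01" and the field name `source_id` are hypothetical.

```python
# Hypothetical identification value correspondence table.
SOURCE_IDS = {
    "demand_collection": "SRC01",
    "grid_worker_report": "SRC02",
    "technical_patrol_alarm": "SRC03",
    "higher_level_department": "SRC04",
    "offline": "SRC05",
}


def add_source_id(record, source_name, table=SOURCE_IDS):
    """Return a copy of the record with its source identification field set."""
    if source_name not in table:
        raise KeyError(f"unknown data source: {source_name}")
    tagged = dict(record)  # leave the caller's record unchanged
    tagged["source_id"] = table[source_name]
    return tagged


record = {"id": "e1", "type": "event", "text": "noise complaint in unit 3"}
print(add_source_id(record, "grid_worker_report"))
```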
S1013, constructing, based on the source identification field, a business data relationship with the current data information as the primary node and the data information that has a circulation or association relationship with it as secondary nodes;
The primary node is the data information currently being processed and is the starting point for constructing the business data relationship. The secondary nodes are the data information that has a circulation or association relationship with the current data information.
In some embodiments, the current data information, with its source identifier added, is taken as the primary node, and data information found to be related to it by a data association analysis algorithm is taken as secondary nodes. The algorithm may be, for example, rule-based association analysis (with association rules such as temporal order or identity of the subject) or an association rule mining algorithm from machine learning (such as the Apriori algorithm). The business data relationship is then constructed using a suitable data structure, such as a tree or a graph.
S1014, adding classification of current data information for the data information with circulation and association relation between the data information;
Here, the classification of the current data information is the category determined for that piece of data information in step S1011, such as the event category or the personnel category. The aim is for the secondary nodes associated with the primary node to carry the classification attribute of the primary node as well, which strengthens the association between data items and keeps their classification consistent.
In some embodiments, the data information that serves as secondary nodes, having a circulation or association relationship with the primary node, is located along the business data relationship already constructed. The classification of the primary node is then checked, and a matching classification mark is added to the corresponding attribute of each associated secondary node.
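Steps S1013 and S1014 can be sketched together with a simple rule-based association. The sketch assumes that two records are associated when they share a subject and their timestamps fall within a fixed window, which is only one of the rule types mentioned above; the `Node` fields, the `max_gap` parameter and the sample records are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    record_id: str
    source_id: str      # source identification field added in S1012
    subject: str
    timestamp: int
    categories: set = field(default_factory=set)
    secondary: list = field(default_factory=list)


def is_associated(primary, other, max_gap):
    """Rule-based association: same subject, timestamps within max_gap seconds."""
    return (
        other.record_id != primary.record_id
        and other.subject == primary.subject
        and abs(other.timestamp - primary.timestamp) <= max_gap
    )


def build_relationship(primary, candidates, max_gap=86400):
    """Attach associated records as secondary nodes and pass on the categories."""
    for other in candidates:
        if is_associated(primary, other, max_gap):
            primary.secondary.append(other)
            other.categories |= primary.categories  # S1014: inherit classification
    return primary


primary = Node("e1", "SRC02", "unit 3 noise", 1000, {"event"})
related = Node("p1", "SRC01", "unit 3 noise", 2000, {"person"})
unrelated = Node("x1", "SRC05", "parking", 1500)

build_relationship(primary, [related, unrelated])
print([n.record_id for n in primary.secondary], related.categories)
```

After the call, the related record carries both its own category and the primary node's category, which is what later allows one record to have multiple types.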
S1015, storing the data information of different types into corresponding databases.
In some embodiments, after a series of operations such as previous data classification, adding source identification, building service data relationship, adding classification for secondary nodes are completed, in order to make data storage more standard and orderly, improve data storage efficiency and security, and facilitate the subsequent quick and accurate acquisition of data of a required type according to different requirements, this step is performed.
It can be seen that keywords are extracted from the data information, the data information is classified by type and given a source identification field, and the business data relationship is then constructed. In this process the classification of the primary node is also assigned to the secondary nodes. Data that was otherwise isolated, or whose relationships were ambiguous, becomes closely connected. The data dimensions are therefore richer and the correlation between the data is markedly stronger. This avoids the one-sidedness of information caused by missing data associations, makes the basis for early-warning judgment more accurate, and improves the accuracy of early warning for contradictory dispute events.
In some embodiments, S1015 is specifically implemented by, for data information having multiple types, randomly selecting a corresponding database to store, and adding index identifiers traced back to other corresponding databases in the storage record.
In some embodiments, in an actual data storage scenario, since the data volume is often huge and the types are various, if all types of data are stored into the corresponding unique databases strictly according to the fixed classification rules, the situation that part of the databases are overloaded and other databases are idle in resources may occur, which affects the performance and efficiency of the whole data storage system. For data information having a plurality of types, a mode of randomly selecting a corresponding database to store is adopted. For example, there is a piece of data containing both community environmental event information (belonging to the event class) and worker information (belonging to the person class) related to the environmental event, and the whole is randomly selected and stored in one of the event database or the person database. And when the data is stored, adding index identifiers traced to other corresponding databases in the storage record of the data. For example, if the data is finally stored in the event database, an index mark pointing to the staff information related to the event in the staff database is added in the storage record, so that when the community environment event needs to be comprehensively analyzed later and the staff situation related to the community environment event is related to the query, the related data in the staff database can be quickly positioned by means of the index mark, the cross-database associated query of the data is realized, and powerful support is provided for deeper data analysis, comprehensive judgment of contradiction dispute events and other works.
It can be seen that data information having multiple types is stored in a randomly selected corresponding database and that an index identifier is added to its storage record. Random storage balances the load across the databases and avoids degrading overall performance through excessive pressure on a single database. The index identifier provides a convenient path for tracing data: when the data needs to be analyzed comprehensively, related data in other databases can be quickly associated through the index even though the data is stored in a scattered manner.
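The random placement with trace-back index identifiers can be sketched as below. The in-memory dictionaries standing in for databases and the `"type:record_id"` identifier format are hypothetical; the source does not specify either.

```python
import random


def store_multi_type(record_id, payload, types, databases, rng=random):
    """Store a multi-type record in one randomly chosen matching database.

    The stored record carries index identifiers pointing at the other
    databases that match its types (identifier format is an assumption).
    """
    chosen = rng.choice(sorted(types))
    others = sorted(t for t in types if t != chosen)
    databases[chosen][record_id] = {
        "payload": payload,
        "index_ids": [f"{t}:{record_id}" for t in others],
    }
    return chosen
```

Passing a seeded `random.Random` instance as `rng` makes the placement reproducible in tests.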
S102, respectively segmenting words aiming at event data and personnel data to obtain words, marking the parts of speech of the words, and respectively constructing an event structure tree and a personnel structure tree according to part of speech marking results;
The word segmentation refers to a process of segmenting continuous text data information into individual words according to semantic and grammar rules. Part of speech notations are grammatical category labels, such as nouns, verbs, adjectives, etc., that are used to represent the segmented words. The event structure tree is a tree structure constructed according to the grammatical relation among words based on the words in the event data, and can intuitively display the semantic architecture of the event.
In some embodiments, the text of the event data is segmented into words using a dedicated word segmentation tool, such as the word segmentation module of a natural language processing library, and each word is then tagged by a part-of-speech tagging algorithm. For example, for event data stating that xx got into an argument because of the shortage of parking spaces in the residential community, word segmentation yields words such as "xx", "due to", "community", "parking space", "shortage", "occurred" and "argument", with part-of-speech tags such as "noun", "preposition", "noun", "adjective", "verb" and "noun". According to the tagging results, the event subject, such as "xx", is taken as the root of the tree, the event structure tree is constructed according to semantic logic, and related words are connected by appropriate grammatical relations. The personnel data is segmented and tagged in the same way, and the personnel structure tree is constructed.
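As a rough illustration of building a structure tree from tagged words, the sketch below assumes the words have already been segmented and tagged by some NLP tool. The attachment heuristic (first noun as root, verbs under the root, other words under the most recent verb or the root) is a deliberately simple stand-in for the "semantic logic" mentioned above, not the method of the application.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    word: str
    pos: str
    children: list = field(default_factory=list)


def build_structure_tree(tagged_words):
    """Build a toy structure tree from (word, part_of_speech) pairs.

    Assumed heuristic: the first noun is the event subject and becomes the
    root; each verb hangs off the root; every other word hangs off the most
    recent verb, or off the root if no verb has been seen yet.
    """
    root_index = next(i for i, (_, pos) in enumerate(tagged_words) if pos == "noun")
    root = TreeNode(*tagged_words[root_index])
    current = root
    for i, (word, pos) in enumerate(tagged_words):
        if i == root_index:
            continue
        node = TreeNode(word, pos)
        if pos == "verb":
            root.children.append(node)
            current = node
        else:
            current.children.append(node)
    return root
```

Input with no noun raises `StopIteration`; a production system would need an explicit fallback for that case.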
S103, extracting topic information from the event structure tree, and extracting first characteristic data related to the topic information from the event structure tree by using a characteristic extraction rule, wherein the topic information is related expression reflecting event content or topics;
The topic information refers to related expressions capable of summarizing the core content or main topics of the event, and is key information mined from the event structure tree.
In some embodiments, this step is performed after the event structure tree is built in order to extract key features and topics of the event for subsequent analysis and judgment. The topic information is extracted from the event structure tree by using a pre-trained topic information extraction model, for example, for an event structure tree related to community environmental disputes, the extracted topic information may be a "community environmental pollution problem". Then, a feature extraction rule, which may be a rule set according to a specific vocabulary, a grammar structure, a semantic relation, or the like, is used to extract first feature data related to the subject information, such as data information related to time, place, or the like, from the event structure tree.
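A feature extraction rule set of the vocabulary-based kind could be sketched as follows. The two regular expressions (an ISO-style date for time, a few English place suffixes for location) are purely illustrative assumptions; real rules would be written for the actual language and data.

```python
import re

# Hypothetical rules: feature name -> pattern applied to each word.
FEATURE_RULES = {
    "time": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "place": re.compile(r"(community|building|street)$"),
}


def extract_features(words, rules=FEATURE_RULES):
    """Collect the words of a structure tree that match each feature rule."""
    features = {}
    for word in words:
        for name, pattern in rules.items():
            if pattern.search(word):
                features.setdefault(name, []).append(word)
    return features
```

Because the same rule set can be applied to words from either tree, the features extracted from event data and personnel data share the same keys, which is what later makes them comparable.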
In some specific embodiments, step S103 specifically includes:
Referring to fig. 3, fig. 3 is a schematic diagram showing a specific flow of step S103 in the embodiment of the application;
S1031, extracting topic information from an event structure tree by using a topic information extraction model;
The topic information extraction model is constructed based on a specific algorithm and architecture, and is used for automatically extracting related expression information capable of summarizing event core content, key topics and the like from a data representation form with a certain grammar and semantic structure, such as an event structure tree.
S1032, the training process of the topic information extraction model comprises the steps of presetting training topic information and a training corpus, wherein the training corpus comprises training corpuses of related categories and training corpuses of unrelated categories of the training topic information;
The preset training topic information refers to the approximate content and range of topic information which can be extracted from the event structure tree by a manually preset expected model according to service requirements and actual application scenes before training the topic information extraction model, and is equivalent to a guiding direction, so that the model knows what target to learn. The training corpus is a large amount of text data set for training the topic information extraction model, the text data cover various contents, wherein the training corpus of the category related to the training topic information is the text data containing the contents related to the preset training topic information, and the training corpus of the non-related category is the data which are not related to the preset training topic information, so that the model can better distinguish which are the topic information related contents really required to be extracted through comparison and learning.
S1033, traversing the training corpus once, identifying each word by means of syntactic analysis, and generating a corresponding extraction rule for each word;
The extraction rule refers to criteria and methods formulated for each word, based on the result of the syntactic analysis, for judging whether the word is related to the preset training topic information and how to extract the related topic information from the text. For example, if a word sits in a particular syntactic structure (e.g., in the object position after the verb of a "because of ... there arose ..." construction) and often appears in text related to the preset training topic information, a corresponding extraction rule may be formulated to focus on and extract topic-related content containing that word.
In some embodiments, a program or algorithm processes each text in turn, starting from the first text of the training corpus. For each text, a syntactic analysis tool (such as open-source software or a deep learning model based on dependency parsing) analyzes the sentences one by one to determine the grammatical relations between the words in each sentence. For example, in a sentence stating that xx got into an argument with the property management because of the community noise problem, "xx" is the subject, "occurred" is the predicate, "argument" is the object, "community noise problem" is the object of the causal phrase, and so on. Then, for each word, a corresponding extraction rule is formulated according to its grammatical position in the sentence, its collocation with other words, its frequency of occurrence in the related texts of the whole training corpus, and other factors. For example, if the word "quarrel" is found to occur frequently in similar grammatical structures across many texts related to contradictory disputes, the rule may be formulated as: when the word "quarrel" occurs in the predicate position and its subject and object relate to persons or groups, the text containing the word is very likely related to the preset training topic information, and the surrounding words can be examined to extract topic information.
S1034, traversing the training corpus continuously to generate statistical data of each extraction rule;
Continuing to traverse the training corpus means that, after the extraction rules have been generated by the first traversal, all text data in the training corpus is checked and processed again in a given order. The aim is to count how each previously generated extraction rule performs across the whole training corpus, collecting more comprehensive data for the subsequent evaluation and screening of the extraction rules.
In some embodiments, the traversal process is restarted, starting with the first text of the corpus, and each text is checked again in turn. In this process, for each extraction rule that has been generated before, it is searched for whether the rule appears in each text, for example, if there is an extraction rule that "when a word 'dispute' appears in a text and words representing a place appear before and after it", whether this condition is satisfied is searched for in each text, and once satisfied, the appearance of the rule in that text is recorded. By traversing the whole training corpus again, the statistics of how many times each extraction rule appears in the training corpus of the category related to the training topic information, how many times each extraction rule appears in the training corpus of the non-related category, the distribution conditions of the extraction rules in different texts and other statistical data are counted, and a basis is provided for scoring and screening of the follow-up rules.
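The second traversal amounts to counting, per rule, how many related and unrelated texts satisfy it. In the sketch below each rule is modeled as a Python predicate over a text, which is an assumption; the source does not specify how rules are represented.

```python
def count_rule_hits(rules, related_texts, unrelated_texts):
    """Count, for each extraction rule, the texts it matches in each category.

    rules: rule name -> predicate taking a text and returning bool (assumed)
    """
    stats = {}
    for name, matches in rules.items():
        stats[name] = {
            "related": sum(1 for text in related_texts if matches(text)),
            "unrelated": sum(1 for text in unrelated_texts if matches(text)),
        }
    return stats
```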
S1035, scoring the extraction rules according to the statistical data, wherein the score of an extraction rule is positively correlated with the number of times it appears in the training corpus of the category related to the training topic information and negatively correlated with the number of times it appears in the training corpus of the unrelated category;
In some embodiments, the score is calculated by a preset scoring formula from the number of times the extraction rule appears in the related-category training corpus and the number of times it appears in the unrelated-category training corpus. For example, the former divided by the latter is used as the scoring basis: if the result is greater than 1, the larger the value, the more representative the rule is of related text and the higher its score; if the result is less than 1, the rule is probably more common in unrelated text and its score is lower. Scoring in this way gives a clear quantitative evaluation of each extraction rule, so that high-quality rules can later be screened out for the topic information extraction model.
In some embodiments, step S1035 specifically includes:
S10351, taking, as the score of the extraction rule, the number of times the extraction rule appears in the training corpus of the category related to the training topic information divided by the number of times it appears in the training corpus of the unrelated category;
In some embodiments, the score is calculated by dividing the number of occurrences of the extraction rule in the related-category training corpus by its number of occurrences in the unrelated-category training corpus. For example, for an extraction rule that appears 30 times in the related-category corpus and 10 times in the unrelated-category corpus, the score is 30 ÷ 10 = 3. The calculation is intuitive: a score greater than 1 indicates that the rule appears more frequently in the related-category training corpus and is more likely to be an effective rule, closely tied to the topic information, that helps extract it accurately; a score less than 1 indicates that the rule appears more frequently in unrelated text, is probably of little use in distinguishing topic information, and has poor accuracy. This score calculation therefore allows a quick preliminary judgment of the value of each extraction rule.
S10352, after all scoring results are obtained, binarizing the scoring results.
Data binarization here refers to converting the extraction-rule scores obtained from the preceding calculation into values between 0 and 1 according to a given standard and rule, which reduces the differences between scores while preserving their characteristics.
Therefore, the ratio of the occurrence times in the training corpus of the related and unrelated categories of the training topic information is taken as the extraction rule score, and the calculation mode intuitively reflects the correlation strength of the rule and the topic information. After the data is binarized, the score result is further simplified, and whether the rule is available or not is highlighted.
S1036, reserving extraction rules of scoring results exceeding the scoring threshold.
In some embodiments, all that remains is an extraction rule that performs better in the training corpus and has a higher relevance to the preset training topic information.
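Scoring, mapping the scores into the 0 to 1 range, and threshold filtering can be sketched together as below. Two choices here are assumptions rather than anything stated in the source: a denominator of zero is treated as 1 to avoid division by zero, and the mapping s / (1 + s) is used as one possible way to bring ratios into the interval between 0 and 1 (a ratio of 1 maps to 0.5).

```python
def score_rule(related_hits, unrelated_hits):
    """Ratio of related to unrelated occurrences.

    Assumption: an unrelated count of 0 is treated as 1 to avoid dividing
    by zero; the source only gives the plain ratio.
    """
    return related_hits / max(unrelated_hits, 1)


def squash(score):
    """Map a non-negative ratio into [0, 1); one possible choice, assumed."""
    return score / (1.0 + score)


def keep_rules(stats, threshold=0.5):
    """Keep only the rules whose squashed score exceeds the scoring threshold."""
    kept = {}
    for name, s in stats.items():
        value = squash(score_rule(s["related"], s["unrelated"]))
        if value > threshold:
            kept[name] = value
    return kept
```

With the default threshold of 0.5, a rule survives exactly when it appears more often in related texts than in unrelated ones, which matches the "greater than 1" reading of the raw ratio.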
It can be seen that, through presetting training topic information and training corpus sets containing relevant and irrelevant categories, extraction rules are generated by traversing the corpus sets, and scoring and screening are performed according to the occurrence times in the corpora of different categories. Rules frequently appearing in related categories and rarely appearing in unrelated categories are reserved, so that when subject information is extracted, the content related to contradictory dispute events is focused more accurately, and irrelevant information interference is effectively eliminated. Event core topics can be accurately extracted, and the accuracy and effectiveness of event information processing in the whole early warning process are improved.
S104, carrying out emotion tendency analysis on the words extracted from the personnel structure tree, marking the words judged to have negative emotion, and extracting second characteristic data related to personnel from the personnel structure tree by applying a characteristic extraction rule;
The emotion tendency analysis refers to the process of judging and classifying emotion attitudes contained in words extracted from a staff structure tree, such as positive, negative, neutral and the like. Negative emotion words refer to words that are determined to have negative emotion colors in emotion tendencies analysis.
It should be noted that the feature extraction rules applied to the personnel structure tree and to the event structure tree are identical, even though the data of the two trees originate from different source channels. As long as a person and an event are associated, the extracted first and second feature data coincide on certain key elements. For example, in the time dimension, both the time at which an event occurred and the time at which a person took part in the event can be accurately extracted and kept consistent through the feature extraction rule; in the place dimension, the place where the event occurred and the place information associated with the person can be accurately identified and matched to each other.
In some embodiments, emotion tendency analysis is performed on the words extracted from the personnel structure tree by an emotion analysis algorithm, such as a dictionary-based emotion analysis method, which matches the words against the words in an emotion dictionary to judge their emotion tendency. Words determined to have negative emotion are marked. For example, in the personnel data "the resident expressed dissatisfaction with the service of the community property management", the word "dissatisfaction" is marked as a negative emotion word.
In some specific embodiments, a custom emotion dictionary is built in which common emotion words are classified as positive, negative or neutral and assigned different emotion weights. The words of the personnel structure tree are traversed and matched against the emotion dictionary, an emotion tendency score is calculated from the matching results and weights, the emotion tendency is judged, and negative emotion words are marked. According to the structural characteristics of the personnel structure tree, a path-based feature extraction method may be adopted, for example extracting the path information from a personnel node to a specific emotion word node as the second feature data. The method is not limited herein.
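A weighted, dictionary-based tendency score of the kind described above can be sketched as follows. The dictionary entries and weights are invented for illustration only.

```python
# Hypothetical emotion dictionary: word -> (polarity, weight).
EMOTION_DICT = {
    "dissatisfaction": ("negative", 0.8),
    "angry": ("negative", 1.0),
    "thanks": ("positive", 0.6),
    "satisfied": ("positive", 0.7),
}


def analyse_sentiment(words, lexicon=EMOTION_DICT):
    """Return (overall tendency, list of marked negative words).

    Words missing from the dictionary are treated as neutral with weight 0.
    """
    score = 0.0
    negative_words = []
    for word in words:
        polarity, weight = lexicon.get(word, ("neutral", 0.0))
        if polarity == "negative":
            score -= weight
            negative_words.append(word)
        elif polarity == "positive":
            score += weight
    if score < 0:
        tendency = "negative"
    elif score > 0:
        tendency = "positive"
    else:
        tendency = "neutral"
    return tendency, negative_words
```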
S105, clustering the subject information and the personnel according to the first characteristic data and the second characteristic data, and carrying marked words by the personnel in the clustering process;
In some embodiments, after the first feature data, the second feature data and the marked words are obtained, in order to effectively associate and integrate the event with the related personnel, the subject information and the personnel are clustered according to the information contained in the first feature data and the second feature data. In the clustering process, people can carry the marked negative emotion words, so that the clustering result can better reflect the association of the event and the people and the emotion factors of the people in the event.
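The source does not name a clustering algorithm. As a stand-in, the sketch below groups persons with a topic when their time and place features match exactly, and carries each person's marked negative words along; the dictionary keys are assumed names.

```python
def cluster_by_features(events, persons):
    """Group persons under the topic whose (time, place) features they share.

    events: dicts with 'topic', 'time', 'place' (first feature data, assumed keys)
    persons: dicts with 'name', 'time', 'place', 'negative_words' (assumed keys)
    Exact matching is a simplification of whatever clustering is actually used.
    """
    clusters = {}
    for event in events:
        clusters[(event["time"], event["place"])] = {"topic": event["topic"], "persons": []}
    for person in persons:
        key = (person["time"], person["place"])
        if key in clusters:
            clusters[key]["persons"].append(
                {"name": person["name"], "negative_words": list(person["negative_words"])}
            )
    return list(clusters.values())
```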
S106, reading a threshold value of the subject information preset in the threshold value configuration file, judging that the event to which the subject information belongs is a contradictory dispute event when the word carrying the mark exceeds the corresponding threshold value, and outputting the clustered corresponding personnel together.
In some embodiments, the threshold corresponding to the current topic information is read from the threshold configuration file. For example, for community dispute topic information, the threshold may be that the number of persons carrying negative emotion words reaches a certain proportion, or that the strength of the negative emotion words reaches a certain degree. When the marked words carried after the clustering process exceed the corresponding threshold, the event to which the topic information belongs is judged to be a contradictory dispute event, and the corresponding clustered personnel are output together so that the event can be handled subsequently, for example by arranging for relevant personnel to mediate or by taking other management measures.
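The threshold check can be sketched as below, assuming the configuration file is JSON that maps each topic to a proportion of persons carrying negative emotion words. Both the file format and the proportion-based criterion are assumptions; the text also allows a strength-based threshold.

```python
import json


def detect_dispute(cluster, threshold_config_text):
    """Decide whether a clustered topic is a contradictory dispute event.

    cluster: {'topic': str, 'persons': [{'name': str, 'negative_words': list}]}
    threshold_config_text: JSON text mapping topic -> proportion (assumed format)
    Returns (is_dispute, names of the clustered persons to output).
    """
    thresholds = json.loads(threshold_config_text)
    threshold = thresholds[cluster["topic"]]
    persons = cluster["persons"]
    if not persons:
        return False, []
    flagged = [p for p in persons if p["negative_words"]]
    ratio = len(flagged) / len(persons)
    if ratio > threshold:
        return True, [p["name"] for p in persons]
    return False, []
```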
It can be seen that data information is extracted from multiple kinds of data sources and divided into event data and personnel data, which achieves a preliminary classification and arrangement of the data and avoids disorder. The event data is then segmented into words and tagged with parts of speech, an event structure tree is constructed, and topic information is extracted from the tree, which greatly reduces interference from unrelated data and accurately locates the core content of the event. First feature data related to the topic information is then extracted from the event structure tree using the feature extraction rule, so that the key features of the event are clearly presented. For the personnel data, after the personnel structure tree is constructed through word segmentation and part-of-speech tagging, emotion tendency analysis is performed on the extracted words and negative emotion words are marked, which facilitates the subsequent judgment of contradictory dispute events; second feature data related to the personnel is extracted using the feature extraction rule, so that key information in the personnel data is mined in depth and a preliminary connection with the event is established. The topic information and the personnel are clustered according to the first feature data and the second feature data, with the personnel carrying their marked words during clustering, so that the event is closely tied to the related personnel and the data is deeply integrated.
Finally, the threshold of the topic information is read from the threshold configuration file. When the marked words carried exceed the corresponding threshold, the event to which the topic information belongs is judged to be a contradictory dispute event and the corresponding clustered personnel are output. This addresses the problem of data confusion and mismatch, allows the real situation and development trend of the contradictory dispute event to be judged, and improves early-warning accuracy.
In some embodiments, after step S101, further comprising:
S107, storing the data information with time lower than the time threshold in an internal memory database, and storing the data information with time not lower than the time threshold in a distributed file system;
In some embodiments, in actual operation, the system acquires the time attribute of each piece of data information (which may be the time the data was generated, the time of its last update, etc., depending on the specific service definition) and compares it with a preset time threshold. Data information whose time attribute is lower than the time threshold is stored in the memory database through a dedicated data transmission interface or database operation statement, so that it can be fetched quickly when later operations, such as analysis and judgment of contradictory disputes, need fast access to it. Data information whose time is not lower than the time threshold is stored on the storage nodes of the distributed file system through a corresponding transmission mechanism. This ensures long-term storage of the data, exploits the large capacity of the distributed file system, avoids occupying excessive memory resources, and provides an efficient and reasonable storage architecture for the whole system.
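The routing decision can be sketched as below. The sketch reads "time lower than the time threshold" as the age of the data being below a threshold, which is an interpretation consistent with recent data going to the memory database; the 30-day default and the store names are hypothetical.

```python
from datetime import datetime, timedelta


def route_storage(record_time, now, age_threshold=timedelta(days=30)):
    """Pick a storage tier from the age of a record (assumed interpretation).

    Records younger than the threshold go to the memory database; all
    others go to the distributed file system.
    """
    age = now - record_time
    return "memory_db" if age < age_threshold else "distributed_fs"
```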
S108, establishing a data index for the data information.
In some embodiments, the system determines key dimensions needed to be indexed according to the characteristics of the data information and the service application scene, such as the type of the event, the identity of the related personnel, the time range of the event, and the like. And then, adopting a corresponding index creation algorithm and technology, and creating an index for the data information according to the selected key dimension by utilizing the built-in index creation function aiming at the data in the memory database. For data in the distributed file system, index information is built on each storage node according to set key dimensions through a distributed index building tool, and the index information is effectively integrated and associated, so that when data query is carried out, no matter where the data is stored in a memory database or the distributed file system, the data can be quickly positioned through indexes, high-efficiency data retrieval service is realized, and the data utilization efficiency of the whole system is improved.
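A unified index over both storage tiers can be pictured as an inverted index from each key dimension's value to the locations of matching records. The sketch below is a simplified in-process illustration; the dimension names and the `store` field are assumptions, and a real deployment would rely on the index facilities of the chosen database and file system.

```python
from collections import defaultdict


def build_index(records, dimensions=("event_type", "person_id")):
    """Build an inverted index: dimension -> value -> [(store, record id)].

    records: dicts with 'id', 'store', and optionally each dimension (assumed keys)
    """
    index = {dimension: defaultdict(list) for dimension in dimensions}
    for record in records:
        for dimension in dimensions:
            if dimension in record:
                index[dimension][record[dimension]].append((record["store"], record["id"]))
    return index
```

A lookup such as `index["event_type"]["noise"]` then returns record locations in either tier without scanning the stored data.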
Therefore, the data information with time lower than the threshold value is stored in the memory database, so that the data which is generated recently and is possibly closely related to the current contradictory dispute event can be quickly called and analyzed due to the high reading and writing speed, and the data processing rhythm is quickened. And the data with time not lower than the threshold value is stored in the distributed file system, so that the data is ensured to be stored for a long time, and meanwhile, excessive memory resources are not occupied, and the distributed architecture is beneficial to large-scale data management.
It can be seen that a data index is established for the data information. When the data is queried, the whole database is not required to be traversed, the required data can be rapidly positioned through the index, and the required data, whether event data or personnel data, can be rapidly retrieved and invoked.
An exemplary conflict-dispute event early warning system 400 provided by embodiments of the present application is described below. Fig. 4 is a schematic diagram of an exemplary hardware structure of a dispute early warning system 400 according to an embodiment of the present application.
In some embodiments, the dispute event early warning system 400 is a computer device or the dispute event early warning system 400 includes a computer device. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is for storing data. The network interface of the computer device is used for communicating with other terminals or servers outside through network connection. In some embodiments, the network interface may be a wired network interface, and in some embodiments, the network interface may also be a wireless network interface. The computer program is executed by a processor to implement the method in the embodiment of the application.
It will be appreciated by persons skilled in the art that the architecture shown in fig. 4 is merely a block diagram of some of the architecture relevant to the present inventive arrangements and is not limiting as to the computer device to which the present inventive arrangements are applicable, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
While the application has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that the foregoing embodiments may be modified or equivalents may be substituted for some of the features thereof, and that the modifications or substitutions do not depart from the spirit of the embodiments.
As used in the above embodiments, the term "when..." may be interpreted to mean "if...", "after...", "in response to determining..." or "in response to detecting...", depending on the context. Similarly, the phrase "when it is determined..." or "if (a stated condition or event) is detected" may be interpreted to mean "if it is determined...", "in response to determining...", "when (a stated condition or event) is detected" or "in response to detecting (a stated condition or event)", depending on the context.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk), etc.
Those of ordinary skill in the art will appreciate that all or part of the flows of the above method embodiments may be accomplished by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, may carry out the flows of the above method embodiments. The storage medium includes a ROM, a random access memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.

Claims (9)

1. A conflict and dispute event early warning method, characterized by comprising:
extracting data information from data sources, extracting keywords from current data information, and dividing the data information into different types according to the types of the keywords; the current data information being any item of the data information; the data information including event data of type event and personnel data of type personnel; the data sources including a demand collection source, a grid member reporting source, a technical patrol and alarm collection source, a superior and department data source, and an offline data source;
after classification, adding a source identification field to the current data information, the source identification field being assigned an identification value corresponding to the data source from which the data information came;
based on the source identification field, constructing a business data relationship with the current data information as the primary node and the data information having a flow or association relationship with it as secondary nodes;
adding the classification of the current data information to the data information having the flow or association relationship;
storing the different types of data information in corresponding databases;
performing word segmentation on the event data and the personnel data respectively to obtain words, performing part-of-speech tagging on the words, and constructing an event structure tree and a personnel structure tree respectively according to the part-of-speech tagging results;
extracting topic information from the event structure tree, and applying feature extraction rules to extract, from the event structure tree, first feature data related to the topic information; the topic information being an expression that reflects the content or topic of an event;
performing sentiment orientation analysis on words extracted from the personnel structure tree; marking the words determined to carry negative sentiment, and applying the feature extraction rules to extract, from the personnel structure tree, second feature data related to the personnel; the first feature data and the second feature data including the same elements;
clustering the topic information with the personnel according to the first feature data and the second feature data, the personnel carrying the marked words during the clustering process;
reading a threshold for the topic information preset in a threshold configuration file, and, when the marked words carried exceed the corresponding threshold, determining that the event to which the topic information belongs is a conflict and dispute event, and outputting the corresponding clustered personnel together with it.

2. The method according to claim 1, characterized in that the step of storing the different types of data information in corresponding databases specifically comprises:
for data information having multiple types, randomly selecting one of the corresponding databases for storage, and adding to the storage record index identifiers that trace back to the other corresponding databases.

3. The method according to claim 1, characterized in that the step of extracting topic information from the event structure tree specifically comprises:
extracting the topic information from the event structure tree using a topic information extraction model;
the training process of the topic information extraction model being as follows:
presetting training topic information and a training corpus set, the training corpus set including training corpus of categories related to the training topic information and training corpus of categories unrelated to the training topic information;
traversing the training corpus set once, identifying each word by means of syntactic analysis, and generating a corresponding extraction rule for it;
continuing to traverse the training corpus set to generate statistical data for each extraction rule;
scoring the extraction rules according to the statistical data, wherein the score of an extraction rule is positively correlated with the number of times the rule occurs in the training corpus of the related categories and negatively correlated with the number of times the rule occurs in the training corpus of the unrelated categories;
retaining the extraction rules whose scores exceed a scoring threshold.

4. The method according to claim 3, characterized in that the step of scoring the extraction rules according to the statistical data specifically comprises:
the score of an extraction rule being the number of times the rule occurs in the training corpus of the related categories divided by the number of times the rule occurs in the training corpus of the unrelated categories;
after all scoring results are obtained, binarizing the scoring results.

5. The method according to claim 1, characterized in that, after the step of extracting data information from the data sources and dividing the data information into different types according to a preset data storage method and business data relationship, the method further comprises:
storing the data information whose time is below a time threshold in an in-memory database, and storing the data information whose time is not below the time threshold in a distributed file system.

6. The method according to claim 5, characterized in that, after the step of storing the data information whose time is below the time threshold in the in-memory database and storing the data information whose time is not below the time threshold in the distributed file system, the method further comprises:
building a data index for the data information.

7. A conflict and dispute event early warning system, characterized in that the conflict and dispute event early warning system comprises one or more processors and a memory; the memory is coupled to the one or more processors and is used to store computer program code, the computer program code comprising computer instructions; and the one or more processors invoke the computer instructions to cause the conflict and dispute event early warning system to execute the method according to any one of claims 1 to 6.

8. A computer program product comprising instructions, characterized in that, when the computer program product is run on a conflict and dispute event early warning system, the conflict and dispute event early warning system is caused to execute the method according to any one of claims 1 to 6.

9. A computer-readable storage medium comprising instructions, characterized in that, when the instructions are run on a conflict and dispute event early warning system, the conflict and dispute event early warning system is caused to execute the method according to any one of claims 1 to 6.
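
The rule scoring recited in claims 3 and 4 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes, purely for illustration, that an extraction rule is a plain substring and that a rule "occurs" in a training text when the text contains it; the function names `count_rule_hits`, `score_rules`, `binarize` and `retained_rules` are hypothetical. The claims do not say how a zero count in the unrelated categories is handled, so the sketch adds a smoothing constant to the denominator as a further assumption.

```python
from collections import Counter
from typing import Dict, Iterable, List


def count_rule_hits(rules: Iterable[str], corpus: Iterable[str]) -> Counter:
    """Count, for each rule, how many texts in the corpus contain it.

    Modelling a rule as a substring is an illustrative assumption only.
    """
    docs = list(corpus)
    hits: Counter = Counter()
    for rule in rules:
        hits[rule] = sum(1 for doc in docs if rule in doc)
    return hits


def score_rules(relevant_hits: Dict[str, int],
                irrelevant_hits: Dict[str, int],
                smoothing: float = 1.0) -> Dict[str, float]:
    """Score = occurrences in related texts / occurrences in unrelated texts.

    `smoothing` is added to the denominator to avoid division by zero;
    this is an assumption, not something stated in the claims.
    """
    return {
        rule: rel / (irrelevant_hits.get(rule, 0) + smoothing)
        for rule, rel in relevant_hits.items()
    }


def binarize(scores: Dict[str, float], score_threshold: float) -> Dict[str, int]:
    """Map each score to 1 if it exceeds the threshold, otherwise 0."""
    return {rule: 1 if s > score_threshold else 0 for rule, s in scores.items()}


def retained_rules(scores: Dict[str, float], score_threshold: float) -> List[str]:
    """Keep only the rules whose score exceeds the scoring threshold."""
    return sorted(rule for rule, s in scores.items() if s > score_threshold)


# Toy corpora and candidate rules, invented for this example.
relevant = ["neighbor dispute over wall", "wage dispute at factory", "dispute about noise"]
irrelevant = ["road repair notice", "noise from road repair", "wage payment notice"]
rules = ["dispute", "noise", "notice"]

scores = score_rules(count_rule_hits(rules, relevant),
                     count_rule_hits(rules, irrelevant))
print(binarize(scores, score_threshold=1.0))
print(retained_rules(scores, score_threshold=1.0))
```

With the toy data, the rule that occurs often in the related texts and never in the unrelated ones gets the highest score and is the only one retained, which matches the positive and negative correlation described in claim 3.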
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510024792.4A CN119474380B (en) 2025-01-08 2025-01-08 A conflict and dispute event early warning method, system, program product and storage medium

Publications (2)

Publication Number Publication Date
CN119474380A CN119474380A (en) 2025-02-18
CN119474380B true CN119474380B (en) 2025-04-29

Family

ID=94595222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510024792.4A Active CN119474380B (en) 2025-01-08 2025-01-08 A conflict and dispute event early warning method, system, program product and storage medium

Country Status (1)

Country Link
CN (1) CN119474380B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740339A (en) * 2016-01-25 2016-07-06 河北中科恒运软件科技股份有限公司 Civil administration big data fusion and management system
CN114328907A (en) * 2021-10-22 2022-04-12 浙江嘉兴数字城市实验室有限公司 Natural language processing method for early warning risk upgrade event

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108959383A (en) * 2018-05-31 2018-12-07 平安科技(深圳)有限公司 Analysis method, device and the computer readable storage medium of network public-opinion
CN116108955A (en) * 2022-11-18 2023-05-12 中国电信股份有限公司 Method, device, equipment and storage medium for upgrading and early warning of social contradiction disputes
CN117093460A (en) * 2023-08-23 2023-11-21 腾讯科技(深圳)有限公司 Evaluation method, evaluation device, electronic equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN119474380A (en) 2025-02-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant