CN110688843A - Method for distinguishing text information - Google Patents
Method for distinguishing text information
- Publication number
- CN110688843A (application CN201910973135.9A)
- Authority
- CN
- China
- Prior art keywords
- text
- text information
- model
- written
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a method for distinguishing text information, which comprises the following steps. S1: receive text information. S2: identify the text information as written-text information or spoken-text information, and send written-text information to a corresponding text model or spoken-text information to a corresponding language model. S3: judge the recognition results produced by the text model and the language model in step S2, and decide from those results whether re-recognition is needed. The method recognizes the nuances between spoken language (speech) and written language (text) and identifies each correctly and completely, so that enterprises can understand and reply to both forms of communication more accurately, truly restoring the effectiveness of face-to-face communication.
Description
Technical Field
The invention belongs to the technical field of artificial-intelligence natural language processing, and in particular relates to a method for distinguishing text information.
Background
Speech and writing are the two most important means of communication, especially in commercial settings. Traditional natural language processing performs the same extraction and analysis on any given text, regardless of its source. Yet each communication modality carries many subtle cues to meaning. People often use filler words or sounds in speech ("uh", "hmm", "mm-hm", throat-clearing, and so on) that are never written down, and some vocal cues convey degrees of agreement that writing lacks. Conversely, written text frequently contains symbols (punctuation, emoticons, and so on) that are never spoken but can be important to meaning. People also tend to respond verbally with more complete sentences and in writing with shorter phrases. The problem, then, is how to recognize these nuances of spoken (speech) and written (text) language, identify each correctly and completely, and truly restore the effectiveness of face-to-face communication.
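As an illustration of the modality cues described above, the following minimal Python sketch (the word lists and regular expressions are hypothetical examples, not part of the invention) counts speech-only fillers and writing-only symbols in a phrase:

```python
import re

# Hypothetical marker lists: filler words mostly occur in transcribed speech,
# while emoticons and punctuation clusters mostly occur in written text.
SPOKEN_FILLERS = {"uh", "um", "hmm", "mm-hm", "er"}
WRITTEN_SYMBOLS = re.compile(r"[:;]-?[)(DP]|\.{3}|!{2,}|\?!")

def modality_cues(phrase: str) -> dict:
    """Count surface cues hinting whether a phrase was spoken or written."""
    tokens = re.findall(r"[a-z'-]+", phrase.lower())
    return {
        "filler_count": sum(t in SPOKEN_FILLERS for t in tokens),
        "symbol_count": len(WRITTEN_SYMBOLS.findall(phrase)),
    }

print(modality_cues("uh, hmm, I guess so"))         # fillers dominate
print(modality_cues("Sounds great!!! :) see you"))  # symbols dominate
```

Counts like these are exactly the kind of signal that differs between the two modalities and that a single undifferentiated NLP model would ignore.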
Disclosure of Invention
To address the defects of the prior art, an embodiment of the invention provides a method for distinguishing written text from spoken text. The method recognizes the nuances between spoken language (speech) and written language (text) and identifies each correctly and completely, so that enterprises can understand and reply to both forms of communication more accurately, truly restoring the effectiveness of face-to-face communication.
In view of the above technical problems, an embodiment of the present invention provides a method for distinguishing text information, the method comprising:
S1: receiving text information.
S2: identifying the text information as written-text information or spoken-text information, and sending written-text information to a corresponding text model or sending spoken-text information to a corresponding language model.
S3: judging the recognition results produced by the text model and the language model in step S2, and deciding from those results whether to perform re-recognition.
According to an embodiment of the present invention, identifying the text information as written-text information or spoken-text information in S2 comprises: recognizing the text information using an external engine, which determines whether the text information received in S1 is written-text information or spoken-text information.
According to an embodiment of the present invention, judging the recognition results of the text model and the language model in S3 comprises: judging whether the result recognized by the text model is correct, and judging whether the result recognized by the language model is correct.
According to an embodiment of the present invention, deciding whether to perform re-recognition in S3 comprises: if the result recognized by the text model is incorrect, re-recognizing the text information using the language model; and if the result recognized by the language model is incorrect, re-recognizing the text information using the text model.
According to an embodiment of the present invention, the method further comprises S4: updating the parameters of the text model and the language model according to their recognition results.
The invention achieves the following technical effects. The invention is a method for classifying text as originating from speech or from writing and analyzing it with a dedicated model. Using a large number of written-text and speech samples, two independent models are trained using machine learning and Python text classification; the speech samples are first converted to text using speech-to-text technology. In application, an input phrase is received by an external engine, located outside the trained models, which identifies the input's source and assigns it to the appropriate model for NLP analysis. This assignment is not fixed, however; it is only a first preference: if the analysis from the assigned model proves inaccurate, the input is automatically transferred to the other model for analysis. As in other machine-learning systems, each input further improves the models' accuracy. The machine can thus determine the meaning of an input phrase and use it in artificial-intelligence applications. Because the method is trained specifically on these nuances, enterprises can understand and reply to both forms of communication more accurately, avoiding the communication failures caused by inaccurate machine translation, while greatly improving collaborative effectiveness in the course of work.
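The paragraph above mentions training two independent models with machine learning and Python text classification, but names no algorithm. The sketch below substitutes a deliberately tiny pure-Python bag-of-words classifier (an assumption for illustration only; a real system would train on large corpora with a proper learning library):

```python
from collections import Counter

def train(samples, labels):
    """Build per-label word counts: a toy stand-in for a trained classifier."""
    model = {}
    for text, label in zip(samples, labels):
        model.setdefault(label, Counter()).update(text.lower().split())
    return model

def predict(model, phrase):
    """Pick the label whose training vocabulary overlaps the phrase most."""
    words = set(phrase.lower().split())
    return max(model, key=lambda lbl: len(words & set(model[lbl])))

# Two independent models, one per modality, each trained on its own corpus.
text_model = train(
    ["please confirm receipt of the attached invoice",
     "we regret to inform you of a delay"],
    ["confirm", "delay"])
language_model = train(
    ["uh yeah can you send that again",
     "hmm so when is it gonna ship"],
    ["resend", "ship_date"])

print(predict(text_model, "please confirm the invoice"))  # → confirm
print(predict(language_model, "when will it ship"))       # → ship_date
```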
Drawings
To more clearly illustrate the embodiments of the present invention and the technical solutions of the prior art, the drawings needed in describing them are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flow chart of a method of an embodiment of the present invention;
FIG. 2 is a flow diagram of yet another method of an embodiment of the present invention;
FIG. 3 is a flow chart of yet another method of an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without any creative effort belong to the protection scope of the embodiments of the present invention.
The embodiment of the invention provides a method for distinguishing written text from spoken text. The method recognizes the nuances between spoken language (speech) and written language (text) and identifies each correctly and completely, so that enterprises can understand and reply to both forms of communication more accurately, truly restoring the effectiveness of face-to-face communication.
An embodiment of the present invention provides a method for distinguishing text information. As shown in FIG. 1, the method comprises:
S1: receiving text information.
S2: identifying the text information as written-text information or spoken-text information, and sending written-text information to a corresponding text model or sending spoken-text information to a corresponding language model.
S3: judging the recognition results produced by the text model and the language model in step S2, and deciding from those results whether to perform re-recognition.
According to an embodiment of the present invention, identifying the text information as written-text information or spoken-text information in S2 comprises: recognizing the text information using an external engine, which determines whether the text information received in S1 is written-text information or spoken-text information.
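The external engine could, for example, score surface features of the input. The heuristic below is an illustrative assumption only; the patent leaves the engine's internals open, and a production engine could itself be a trained classifier:

```python
import re

def external_engine(text: str) -> str:
    """Guess whether `text` originated as speech or writing (toy heuristic)."""
    # Spoken fillers are almost never typed.
    fillers = len(re.findall(r"\b(?:uh|um|hmm|er|mm-hm)\b", text.lower()))
    # Punctuation and emoticons are typed but never pronounced.
    symbols = len(re.findall(r"[:;]-?[)(]|[.!?,;]", text))
    return "spoken" if fillers > 0 or symbols == 0 else "written"

print(external_engine("uh hmm let me think"))       # → spoken
print(external_engine("Dear team, see attached."))  # → written
```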
According to an embodiment of the present invention, judging the recognition results of the text model and the language model in S3 comprises: judging whether the result recognized by the text model is correct, and judging whether the result recognized by the language model is correct.
According to an embodiment of the present invention, deciding whether to perform re-recognition in S3 comprises: if the result recognized by the text model is incorrect, re-recognizing the text information using the language model; and if the result recognized by the language model is incorrect, re-recognizing the text information using the text model.
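One way to realize this mutual fallback in code (the confidence-based correctness test below is an assumption; the patent only states that the result is judged correct or incorrect):

```python
def recognize_with_fallback(text, primary, secondary, is_correct):
    """Try the assigned model first; if its result is judged incorrect,
    re-recognize the text with the other model (step S3)."""
    result = primary(text)
    if not is_correct(result):
        result = secondary(text)  # cross-over re-recognition
    return result

# Toy models returning (meaning, confidence); "correct" means confidence >= 0.6.
text_model = lambda t: ("literal-request", 0.4)
language_model = lambda t: ("casual-request", 0.8)
is_correct = lambda r: r[1] >= 0.6

print(recognize_with_fallback("gonna need that asap",
                              text_model, language_model, is_correct))
```

Here the text model's low-confidence result is rejected, so the language model re-recognizes the phrase.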
According to an embodiment of the invention, as shown in FIG. 2, the method further comprises S4: updating the parameters of the text model and the language model according to their recognition results.
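Step S4's parameter update could be realized as incremental learning. The word-count update below is a hypothetical illustration; the patent does not specify the update rule:

```python
from collections import Counter

def update(model, text, label):
    """S4: fold a newly judged sample back into per-label word counts so
    that each input further refines the model (illustrative only; a real
    system might instead schedule periodic retraining)."""
    model.setdefault(label, Counter()).update(text.lower().split())
    return model

model = {"greeting": Counter({"hello": 2, "hi": 1})}
update(model, "hello there", "greeting")
print(model["greeting"]["hello"], model["greeting"]["there"])  # → 3 1
```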
As shown in FIG. 3, a flowchart of another method disclosed in an embodiment of the present invention comprises:
1.0: A given phrase (speech or written text) is input.
2.0: The engine determines the source of the input phrase (speech or written text) and assigns it to the appropriate model.
3.0: The model analyzes the text, based on its machine-learning training, to determine the text's meaning.
4.0: If the analysis is inaccurate, the other model is used to analyze the input.
5.0: Each new sample is used to refine the models.
Those of ordinary skill in the art will understand that all or part of the steps for implementing the above method embodiments may be performed by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes any medium that can store program code, such as ROM, RAM, or magnetic or optical disks.
The above-described embodiments are merely illustrative: units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement this without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the embodiments of the present invention and are not limiting. Although the embodiments of the present invention have been described in detail with reference to the foregoing, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced, and such modifications or substitutions do not depart from the spirit and scope of the respective technical solutions of the embodiments of the present invention.
Claims (5)
1. A method for distinguishing text information, the method comprising:
S1: receiving text information;
S2: identifying the text information as written-text information or spoken-text information, and sending written-text information to a corresponding text model or sending spoken-text information to a corresponding language model;
S3: judging the recognition results produced by the text model and the language model in step S2, and deciding from those results whether to perform re-recognition.
2. The method according to claim 1, wherein identifying the text information as written-text information or spoken-text information in S2 comprises: recognizing the text information using an external engine, which determines whether the text information received in S1 is written-text information or spoken-text information.
3. The method according to claim 1, wherein judging the recognition results of the text model and the language model in S3 comprises: judging whether the result recognized by the text model is correct, and judging whether the result recognized by the language model is correct.
4. The method according to claim 3, wherein deciding whether to perform re-recognition in S3 comprises: if the result recognized by the text model is incorrect, re-recognizing the text information using the language model; and if the result recognized by the language model is incorrect, re-recognizing the text information using the text model.
5. The method of claim 1, further comprising:
s4: and updating parameters related to the text model and the language model according to the recognition results of the text model and the language model.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910973135.9A | 2019-10-14 | 2019-10-14 | Method for distinguishing text information |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN110688843A | 2020-01-14 |
Family
ID=69112424
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201910973135.9A | Method for distinguishing text information | 2019-10-14 | 2019-10-14 |
Country Status (1)
| Country | Link |
|---|---|
| CN | CN110688843A |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20150220513A1 (en) * | 2014-01-31 | 2015-08-06 | Vivint, Inc. | Systems and methods for personifying communications |
| CN106354716A (en) * | 2015-07-17 | 2017-01-25 | 华为技术有限公司 | Method and device for converting text |
| CN110287461A (en) * | 2019-05-24 | 2019-09-27 | 北京百度网讯科技有限公司 | Text conversion method, device and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| 2020-01-14 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20200114 |