CN110688843A - Method for distinguishing text information - Google Patents

Info

Publication number
CN110688843A
Authority
CN
China
Prior art keywords
text
text information
model
written
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910973135.9A
Other languages
Chinese (zh)
Inventor
周继敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baikelu (beijing) Technology Co Ltd
Original Assignee
Baikelu (beijing) Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baikelu (beijing) Technology Co Ltd filed Critical Baikelu (beijing) Technology Co Ltd
Priority to CN201910973135.9A priority Critical patent/CN110688843A/en
Publication of CN110688843A publication Critical patent/CN110688843A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method for distinguishing text information, comprising the following steps. S1: receive the text information. S2: identify the text information as written text information or spoken text information, and send written text information to the corresponding text model or spoken text information to the corresponding language model. S3: judge the recognition results of the text model and the language model from step S2, and decide from those results whether to perform re-recognition. The method recognizes the nuances of oral (speech) and written (text) communication and recognizes both correctly and completely, so that enterprises can understand and reply to the two forms of communication more accurately, restoring the effectiveness of face-to-face communication.

Description

Method for distinguishing text information
Technical Field
The invention belongs to the technical field of artificial intelligence and natural language processing, and particularly relates to a method for distinguishing text information.
Background
Oral (speech) and written (text) communication are the two most important means of communication, especially in a commercial setting. Traditional natural language processing simply extracts and analyzes a given text, regardless of where the text came from. However, many subtle differences affect the meaning carried by each form of communication. For example, in verbal communication people often use filler words or make sounds ("e", "hmm", "en", clearing the throat, etc.) that are never written down. Some sounds also carry distinct meanings that are never written down ("yes" signalling agreement, versus a more hesitant sound signalling weaker agreement). Conversely, many symbols (punctuation, special symbols, emoticons, etc.) appear often in written text; they are never spoken, but they may be important for understanding the meaning. People also tend to respond verbally with more complete sentences, and in writing with shorter phrases. The problem is therefore how to recognize the nuances of oral (speech) and written (text) communication, and to recognize both correctly and completely, so as to restore the effectiveness of face-to-face communication.
Disclosure of Invention
To address the defects in the prior art, an embodiment of the invention provides a method for distinguishing written text from spoken text. The method recognizes the nuances of oral (speech) and written (text) communication and recognizes both correctly and completely, so that enterprises can understand and reply to the two forms of communication more accurately, restoring the effectiveness of face-to-face communication.
In view of the above technical problems, an embodiment of the present invention provides a method for distinguishing text information, where the method includes:
S1: Receive the text information.
S2: Identify the text information as written text information or spoken text information, and send written text information to the text model corresponding to written text, or spoken text information to the language model corresponding to spoken text.
S3: Judge the recognition results of the text model and the language model from step S2, and decide from those results whether to perform re-recognition.
According to an embodiment of the present invention, identifying the text information as written text information or spoken text information in S2 includes: recognizing the text information using an external engine, and determining whether the text information received in S1 is written text information or spoken text information.
According to an embodiment of the present invention, judging in S3 the recognition results of the text model and the language model from S2 includes: judging whether the result recognized by the text model is correct, and judging whether the result recognized by the language model is correct.
According to an embodiment of the present invention, deciding in S3 whether to perform re-recognition according to the recognition results includes: if the result recognized by the text model is incorrect, recognizing the text information using the language model; and if the result recognized by the language model is incorrect, recognizing the text information using the text model.
According to an embodiment of the present invention, the method further includes S4: updating the parameters of the text model and the language model according to the recognition results of the text model and the language model.
The invention achieves the following technical effects. The invention is a method of classifying text as originating from speech or from writing and analyzing it with a separate model for each. Using a large number of written text samples and speech samples, two independent models are trained with machine learning and Python text classification; the speech samples are first converted to text using speech-to-text techniques. In application, input phrases are received by an external engine located outside the trained models, which recognizes the source of the input and assigns it to the appropriate model for NLP analysis. This assignment is not fixed, however; it is only a first preference. If the analysis from the assigned model turns out to be inaccurate, the input is automatically transferred to the other model for analysis. As in machine learning systems generally, each input further improves the accuracy of the models. The machine can thus determine the meaning of the input phrase and use it in artificial intelligence applications. Because training targets the nuances of each form, enterprises can understand and reply to the two forms of communication more accurately, communication failures caused by inaccurate machine translation are avoided, and the effectiveness of collaboration at work can be greatly improved.
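As an illustration of training two independent models on separate sample sets, the following is a minimal sketch only. The patent does not specify the classifier, so a small multinomial Naive Bayes model is assumed here; the class name `NaiveBayesTextModel`, the sample phrases, and the labels "cancel" and "refund" are all hypothetical, and the spoken samples are assumed to have already been converted by a speech-to-text step.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextModel:
    """Tiny multinomial Naive Bayes classifier with add-one smoothing (assumed technique)."""

    def __init__(self):
        self.label_counts = Counter()            # number of training samples per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocabulary = set()

    def fit(self, samples):
        """Train on an iterable of (text, label) pairs and return self."""
        for text, label in samples:
            tokens = text.lower().split()
            self.label_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        """Return the label with the highest smoothed log-probability."""
        tokens = text.lower().split()
        total_samples = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, sample_count in self.label_counts.items():
            score = math.log(sample_count / total_samples)
            label_total = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical written samples: short phrases with punctuation and emoticons.
written_samples = [
    ("pls cancel my order :(", "cancel"),
    ("refund??", "refund"),
    ("cancel order asap", "cancel"),
    ("need refund for item #123", "refund"),
]
# Hypothetical speech-to-text samples: fuller sentences with filler words.
spoken_samples = [
    ("um yeah i would like to cancel my order please", "cancel"),
    ("hmm so can i get my money back", "refund"),
    ("uh i want to cancel the thing i ordered", "cancel"),
    ("well i was hoping you could give me a refund", "refund"),
]

# Two independent models, each trained only on its own form of communication.
text_model = NaiveBayesTextModel().fit(written_samples)
language_model = NaiveBayesTextModel().fit(spoken_samples)

print(text_model.predict("cancel order"))
print(language_model.predict("um can i get my money back"))
```

Keeping the two models separate means each vocabulary reflects only one form of communication, which is the property the description relies on; a production system would presumably use far larger corpora and a stronger classifier.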
Drawings
To illustrate the embodiments of the present invention and the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flow chart of a method of an embodiment of the present invention;
FIG. 2 is a flow chart of another method of an embodiment of the present invention;
FIG. 3 is a flow chart of yet another method of an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without any creative effort belong to the protection scope of the embodiments of the present invention.
An embodiment of the invention provides a method for distinguishing written text from spoken text. The method recognizes the nuances of oral (speech) and written (text) communication and recognizes both correctly and completely, so that enterprises can understand and reply to the two forms of communication more accurately, restoring the effectiveness of face-to-face communication.
An embodiment of the present invention provides a method for distinguishing text information. As shown in FIG. 1, the method includes:
S1: Receive the text information.
S2: Identify the text information as written text information or spoken text information, and send written text information to the text model corresponding to written text, or spoken text information to the language model corresponding to spoken text.
S3: Judge the recognition results of the text model and the language model from step S2, and decide from those results whether to perform re-recognition.
According to an embodiment of the present invention, identifying the text information as written text information or spoken text information in S2 includes: recognizing the text information using an external engine, and determining whether the text information received in S1 is written text information or spoken text information.
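The patent does not describe how the external engine decides the source. One possible approach, sketched below purely as an assumption, is to score the cues mentioned in the Background: filler words suggest speech, while punctuation and emoticon characters suggest writing. The function name `detect_source`, the cue lists, and the eight-token rule are hypothetical.

```python
import re

# Hypothetical cue list: filler words that speech-to-text output tends to contain.
FILLER_WORDS = {"um", "uh", "er", "erm", "hmm", "mm"}
# Hypothetical cue pattern: punctuation and symbol characters that are written, not spoken.
WRITTEN_CUE_PATTERN = re.compile(r"[!?.,;:()#@*]")


def detect_source(text: str) -> str:
    """Return "spoken" or "written" from simple surface cues (heuristic sketch)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    spoken_score = sum(1 for token in tokens if token in FILLER_WORDS)
    written_score = len(WRITTEN_CUE_PATTERN.findall(text))
    # Assumption: a long utterance with no punctuation at all looks like a transcript.
    if written_score == 0 and len(tokens) >= 8:
        spoken_score += 1
    # Ties default to "written"; the choice of default is arbitrary in this sketch.
    return "spoken" if spoken_score > written_score else "written"


print(detect_source("um yeah so i want to cancel my order"))
print(detect_source("Cancel my order!! :("))
```

Because the assignment is described as only a first preference, a detector of this kind does not need to be perfect: a wrong guess is meant to be corrected by the re-recognition step.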
According to an embodiment of the present invention, judging in S3 the recognition results of the text model and the language model from S2 includes: judging whether the result recognized by the text model is correct, and judging whether the result recognized by the language model is correct.
According to an embodiment of the present invention, deciding in S3 whether to perform re-recognition according to the recognition results includes: if the result recognized by the text model is incorrect, recognizing the text information using the language model; and if the result recognized by the language model is incorrect, recognizing the text information using the text model.
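The judging and re-recognition logic of S3 could be sketched as follows. The patent does not say how a result is judged correct, so this sketch assumes each model returns a confidence value and treats a result below a threshold as incorrect. The function `analyze_with_fallback`, the threshold, and both toy models are hypothetical.

```python
from typing import Callable, Optional, Tuple

# Assumed model interface: text -> (meaning or None, confidence between 0 and 1).
Model = Callable[[str], Tuple[Optional[str], float]]


def analyze_with_fallback(
    text: str,
    source: str,
    text_model: Model,
    language_model: Model,
    threshold: float = 0.5,
) -> Tuple[Optional[str], str]:
    """Try the assigned model first; if its result is judged incorrect, use the other."""
    if source == "written":
        first, second = ("text_model", text_model), ("language_model", language_model)
    else:
        first, second = ("language_model", language_model), ("text_model", text_model)

    first_name, first_model = first
    meaning, confidence = first_model(text)
    # Assumed correctness test: a meaning was found with enough confidence.
    if meaning is not None and confidence >= threshold:
        return meaning, first_name

    # Re-recognition: hand the same input to the other model.
    second_name, second_model = second
    meaning, _ = second_model(text)
    return meaning, second_name


def toy_text_model(text: str) -> Tuple[Optional[str], float]:
    """Toy stand-in that only understands terse, keyword-first input."""
    if text.lower().startswith("cancel"):
        return "cancel_order", 0.9
    return None, 0.0


def toy_language_model(text: str) -> Tuple[Optional[str], float]:
    """Toy stand-in that finds the keyword anywhere in a longer utterance."""
    if "cancel" in text.lower():
        return "cancel_order", 0.8
    return None, 0.0


# A spoken-style phrase mislabelled as written is recovered by the fallback.
result = analyze_with_fallback(
    "um i want to cancel", "written", toy_text_model, toy_language_model
)
print(result)
```

The fallback is symmetric, matching the description: whichever model is tried first, the other one serves as the second opinion.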
According to an embodiment of the invention, as shown in FIG. 2, the method further comprises S4: updating the parameters of the text model and the language model according to the recognition results of the text model and the language model.
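The update rule of S4 is not specified in the patent. As one hedged possibility, the sketch below treats per-meaning word counts as the model "parameters" and folds each confirmed result back into the model that produced it. The class `CountModel`, the function `update_models`, and the `accepted_by` argument are hypothetical names.

```python
from collections import Counter, defaultdict


class CountModel:
    """Minimal model whose parameters are per-meaning word counts (assumed form)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)

    def update(self, text, meaning):
        """Incrementally add one confirmed (text, meaning) sample."""
        self.word_counts[meaning].update(text.lower().split())

    def predict(self, text):
        """Return the meaning whose counted words overlap the input most, or None."""
        tokens = text.lower().split()
        scores = {
            meaning: sum(counts[token] for token in tokens)
            for meaning, counts in self.word_counts.items()
        }
        if not scores or max(scores.values()) == 0:
            return None
        return max(scores, key=scores.get)


def update_models(text, meaning, accepted_by, text_model, language_model):
    """S4 sketch: update the parameters of the model whose result was accepted."""
    target = text_model if accepted_by == "text_model" else language_model
    target.update(text, meaning)


text_model = CountModel()
language_model = CountModel()

before = language_model.predict("uh cancel it")
update_models(
    "uh please cancel my order", "cancel_order", "language_model",
    text_model, language_model,
)
after = language_model.predict("uh cancel it")
print(before, after)
```

Updating only the accepting model keeps the two models specialized; other policies, such as also updating the model that failed, are equally compatible with the wording of S4.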
FIG. 3 shows a flow chart of another method disclosed in the embodiment of the present invention, which includes:
1.0: A given phrase (speech or written text) is input.
2.0: The engine determines the source of the input phrase (speech or written text) and assigns it to the appropriate model.
3.0: The model analyzes the text according to its machine learning training to determine the meaning of the text.
4.0: If the analysis is inaccurate, the other model is used to analyze the input.
5.0: Each new sample is used to refine the models.
The invention achieves the following technical effects. The invention is a method of classifying text as originating from speech or from writing and analyzing it with a separate model for each. Using a large number of written text samples and speech samples, two independent models are trained with machine learning and Python text classification; the speech samples are first converted to text using speech-to-text techniques. In application, input phrases are received by an external engine located outside the trained models, which recognizes the source of the input and assigns it to the appropriate model for NLP analysis. This assignment is not fixed, however; it is only a first preference. If the analysis from the assigned model turns out to be inaccurate, the input is automatically transferred to the other model for analysis. As in machine learning systems generally, each input further improves the accuracy of the models. The machine can thus determine the meaning of the input phrase and use it in artificial intelligence applications. Because training targets the nuances of each form, enterprises can understand and reply to the two forms of communication more accurately, communication failures caused by inaccurate machine translation are avoided, and the effectiveness of collaboration at work can be greatly improved.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
The above-described embodiments of the electronic device and the like are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may also be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the embodiments of the present invention, and are not limited thereto; although embodiments of the present invention have been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the respective technical solutions of the embodiments of the present invention.

Claims (5)

1. A method of distinguishing textual information, the method comprising:
S1: receiving text information;
S2: identifying the text information as written text related information or spoken text related information, and sending the written text related information to a text model corresponding to the written text related information or sending the spoken text related information to a language model corresponding to the spoken text related information;
S3: judging the recognition results of the text model and the language model in step S2, and determining whether to perform re-recognition according to the recognition results.
2. The method according to claim 1, wherein identifying the text information as written text related information or spoken text related information in S2 comprises: recognizing the text information using an external engine, and determining whether the text information received in S1 is written text information or spoken text information.
3. The method according to claim 1, wherein judging in S3 the recognition results of the text model and the language model in S2 comprises: judging whether the text information result recognized by the text model is correct, and judging whether the text information result recognized by the language model is correct.
4. The method according to claim 3, wherein determining in S3 whether to perform re-recognition according to the recognition results comprises: if the text information result recognized by the text model is incorrect, recognizing the text information using the language model; and if the text information result recognized by the language model is incorrect, recognizing the text information using the text model.
5. The method of claim 1, further comprising:
S4: updating parameters related to the text model and the language model according to the recognition results of the text model and the language model.
CN201910973135.9A 2019-10-14 2019-10-14 Method for distinguishing text information Pending CN110688843A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910973135.9A CN110688843A (en) 2019-10-14 2019-10-14 Method for distinguishing text information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910973135.9A CN110688843A (en) 2019-10-14 2019-10-14 Method for distinguishing text information

Publications (1)

Publication Number Publication Date
CN110688843A true CN110688843A (en) 2020-01-14

Family

ID=69112424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910973135.9A Pending CN110688843A (en) 2019-10-14 2019-10-14 Method for distinguishing text information

Country Status (1)

Country Link
CN (1) CN110688843A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150220513A1 (en) * 2014-01-31 2015-08-06 Vivint, Inc. Systems and methods for personifying communications
CN106354716A (en) * 2015-07-17 2017-01-25 华为技术有限公司 Method and device for converting text
CN110287461A (en) * 2019-05-24 2019-09-27 北京百度网讯科技有限公司 Text conversion method, device and storage medium

Similar Documents

Publication Publication Date Title
CN109255113B (en) Intelligent proofreading system
CN105845134B (en) Spoken language evaluation method and system for freely reading question types
CN108536654B (en) Method and device for displaying identification text
WO2021042904A1 (en) Conversation intention recognition method, apparatus, computer device, and storage medium
CN111241357A (en) Dialogue training method, device, system and storage medium
CN101650886B (en) Method for automatically detecting reading errors of language learners
CN109886270B (en) A Case Element Recognition Method Oriented to Electronic File Transcripts
CN112992125B (en) Voice recognition method and device, electronic equipment and readable storage medium
CN113837594A (en) Quality evaluation method, system, device and medium for customer service in multiple scenes
CN112927679A (en) Method for adding punctuation marks in voice recognition and voice recognition device
CN103761975A (en) Method and device for oral evaluation
CN108563638A (en) A kind of microblog emotional analysis method based on topic identification and integrated study
CN113626573B (en) Sales session objection and response extraction method and system
CN115410560A (en) Voice recognition method, device, storage medium and equipment
CN114330318A (en) Method and device for recognizing Chinese fine-grained entities in financial field
JP2020064370A (en) Text symbol insertion device and method
US20110224985A1 (en) Model adaptation device, method thereof, and program thereof
CN120632013A (en) Intelligent dialogue scene analysis method based on AI large model
CN112015921B (en) Natural language processing method based on learning auxiliary knowledge graph
CN114186041A (en) Answer output method
CN111427996B (en) Method and device for extracting date and time from man-machine interaction text
CN113053358A (en) Voice recognition customer service system for regional dialects
CN111599234A (en) Automatic English spoken language scoring system based on voice recognition
CN110858268B (en) Method and system for detecting unsmooth phenomenon in voice translation system
CN110688843A (en) Method for distinguishing text information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20200114)