CN108172242B - Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method - Google Patents

Info

Publication number
CN108172242B
Authority
CN
China
Prior art keywords
data analysis
analysis processing
processing software
software app
intelligent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810014999.3A
Other languages
Chinese (zh)
Other versions
CN108172242A (en
Inventor
鲁霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Xinzhongxin Technology Co Ltd
Original Assignee
Shenzhen Xinzhongxin Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Xinzhongxin Technology Co Ltd
Priority to CN201810014999.3A
Publication of CN108172242A
Application granted
Publication of CN108172242B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04BTRANSMISSION
    • H04B5/00Near-field transmission systems, e.g. inductive or capacitive transmission systems
    • H04B5/70Near-field transmission systems, e.g. inductive or capacitive transmission systems specially adapted for specific purposes
    • H04B5/72Near-field transmission systems, e.g. inductive or capacitive transmission systems specially adapted for specific purposes for local intradevice communication
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals
    • G10L2025/783Detection of presence or absence of voice signals based on threshold decision
    • G10L2025/786Adaptive threshold

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention relates to an improved voice interaction endpoint detection method for a Bluetooth intelligent cloud sound box, involving an intelligent cloud sound box, an intelligent device, a data analysis processing software APP and a Bluetooth module. The intelligent device is a mobile phone, a tablet computer or the like, and comprises the Bluetooth module and the data analysis processing software APP, which is installed on the intelligent device. The intelligent cloud sound box comprises a cloud server. The Bluetooth module is connected to the Bluetooth intelligent cloud sound box through an audio channel. Through the Bluetooth module, the data analysis processing software APP of the intelligent device also establishes a control instruction connection with the Bluetooth intelligent cloud sound box, so that control data can be exchanged between the two. The beneficial effects of the invention are that the poor recognition rate, endpoint misjudgment and similar problems caused by environmental differences in the prior art are overcome, and the efficiency and user experience of man-machine voice interaction are improved.

Description

Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method
Technical Field
The invention relates to the application of Bluetooth Low Energy technology, and in particular to an improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method.
Background
In the field of man-machine interaction, Voice Activity Detection (VAD) is critical work: the quality of the VAD algorithm to a large degree determines the success or failure of the whole voice interaction system. The final effect of a complete voice interaction system depends not only on the recognition algorithm; many related factors also directly influence the success or failure of the application. The purpose of endpoint detection is to distinguish voice signals from non-voice signals in a signal stream under a complex application environment and to determine where the voice signal begins and ends. A good endpoint detection method addresses problems such as unsatisfactory detection and low recognition rates in voice recognition software, and accurate endpoint detection ensures that the input is a valid and complete voice signal, so that recognition is more accurate and faster.
The traditional endpoint detection method uses double-threshold detection based on short-time energy and zero-crossing rate. A first, coarse judgment is made on the short-time energy of the audio using a high threshold; a second judgment is then made using the average zero-crossing rate. Although double-threshold endpoint detection requires little computation and achieves a good recognition rate in a quiet environment, it has several drawbacks. For example, the thresholds must be set empirically and are fixed parameters, and in real-time voice interaction, scenes involving contextual pauses are easily misjudged, so the human-computer interaction effect is not ideal.
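To make the traditional scheme concrete, the following is a minimal Java sketch of a fixed double-threshold frame check. It is a simplification for illustration, not code from the patent: the class name, the threshold constants `ENERGY_HIGH` and `ZCR_MIN`, and the way the two tests are combined are all assumptions.

```java
/** Illustrative fixed double-threshold frame check (hypothetical names and values). */
final class DoubleThresholdVad {
    // Both thresholds are fixed, empirically chosen numbers; this is exactly
    // the weakness the background section describes.
    static final long ENERGY_HIGH = 1_000_000L;
    static final double ZCR_MIN = 0.3;

    /** Short-time energy: sum of squared samples in one frame. */
    static long shortTimeEnergy(short[] frame) {
        long sum = 0;
        for (short s : frame) {
            sum += (long) s * s;
        }
        return sum;
    }

    /** Fraction of adjacent sample pairs whose signs differ. */
    static double zeroCrossingRate(short[] frame) {
        if (frame.length < 2) {
            return 0.0;
        }
        int crossings = 0;
        for (int i = 1; i < frame.length; i++) {
            if ((frame[i - 1] >= 0) != (frame[i] >= 0)) {
                crossings++;
            }
        }
        return (double) crossings / (frame.length - 1);
    }

    /** First test: high energy threshold. Second test: zero-crossing rate. */
    static boolean isSpeech(short[] frame) {
        long energy = shortTimeEnergy(frame);
        if (energy >= ENERGY_HIGH) {
            return true;
        }
        return energy > 0 && zeroCrossingRate(frame) >= ZCR_MIN;
    }
}
```

Because both constants are fixed, a frame that counts as speech in a quiet room may be misclassified in a noisy one, which motivates the adaptive threshold introduced later in the description.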
Therefore, wherever man-machine interaction is involved in daily life, accurately detecting the endpoint positions of an audio signal is a problem that technical staff urgently need to solve.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method that overcomes the poor recognition rate, endpoint misjudgment and similar problems caused by environmental differences in the prior art, and that improves the efficiency and experience of human-computer voice interaction.
In order to solve the technical problem, the invention provides an improved method for detecting a voice interaction endpoint of a Bluetooth intelligent cloud sound box. The intelligent device is a mobile phone, a tablet computer or the like; the intelligent device comprises a Bluetooth module and data analysis processing software APP; the intelligent cloud sound box comprises a cloud server;
the data analysis processing software APP is installed on the intelligent device;
the Bluetooth module is connected to the Bluetooth intelligent cloud sound box through an audio channel;
As a further refinement, the data analysis processing software APP of the intelligent device establishes a control instruction connection with the Bluetooth intelligent cloud sound box through the Bluetooth module, realizing control data interaction between the data analysis processing software APP and the Bluetooth intelligent cloud sound box;
As a further refinement, the data analysis processing software APP is normally in the standby state. When the intelligent device wakes up voice interaction, the data analysis processing software APP starts the Bluetooth module connection, begins recording and collecting the audio signal, and at the same time establishes a data transmission channel with the cloud server of the Bluetooth intelligent cloud sound box.
As a further refinement, the data analysis processing software APP sets a mute protection time, the duration of which is agreed between the data analysis processing software APP and the cloud server. After voice interaction is woken up, audio is collected for 3 seconds even if the user does not speak, which prevents the whole system from deciding to stop before the user has had time to speak. In addition, operating the synchronous connection-oriented (SCO) link of the Bluetooth module too frequently within a very short time can cause system-level anomalies, and the mute protection time prevents the SCO link from being operated that frequently.
As a further refinement, the data analysis processing software APP of the intelligent device extracts each frame of the audio signal in real time; the data analysis processing software APP sets the duration of each audio frame to 10 ms.
As a further refinement, the data analysis processing software APP of the smart phone calculates the short-time energy of each frame of the audio signal. The calculation formula of the short-time energy signal is as follows:
$$E_i = \sum_{m=1}^{N} x_i^2(m)$$
where $x_i(m)$ is the value of the m-th sampling point in the i-th frame and $N$ is the number of sampling points in the frame.
As a further refinement, the data analysis processing software APP of the intelligent device dynamically judges whether each frame of the audio signal is a voice frame. Short-time energy directly reflects the energy and amplitude of the voice signal, so voiced and unvoiced segments can be distinguished by it. The data analysis processing software APP dynamically searches for the maximum energy value over the current frame and the preceding audio frames, and the threshold is lowered dynamically as long as a following audio frame is below the threshold of the maximum-energy frame (M). When the current short-time energy is small, that is, when the volume has attenuated too much, the frame is defined as a non-voice frame and non-voice counting starts. When 200 consecutive non-voice frames have been counted, equivalent to a pause of 2 seconds, the speech is deemed to have ended. If a voice frame occurs in between, the counter is reset and counting starts again.
The formula of the adaptive threshold value is as follows:
[Formula image: adaptive threshold value; not reproduced in this text]
As a further refinement, the data analysis processing software APP of the intelligent device performs valid endpoint determination;
As a further refinement, the data analysis processing software APP of the intelligent device notifies the cloud server that acquisition is complete, and voice recognition starts. According to the result of the completed voice acquisition, the data analysis processing software APP stops recording and sends an acquisition-complete instruction to the cloud server, whereupon voice recognition begins. In a large number of voice interaction tests on the Bluetooth intelligent cloud sound box, the voice endpoints were judged accurately.
As a further refinement, the working steps of the improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method are as follows:
a. the data analysis processing software APP of the intelligent device connects to the Bluetooth intelligent cloud sound box;
b. the intelligent device wakes up voice interaction;
c. the data analysis processing software APP of the intelligent device starts the mute protection time counter;
d. the data analysis processing software APP of the intelligent device extracts each frame of the audio signal in real time;
e. the data analysis processing software APP of the intelligent device calculates the short-time energy of each frame of the audio signal;
f. the data analysis processing software APP of the intelligent device dynamically judges whether each frame of the audio signal is a voice frame;
h. the data analysis processing software APP of the intelligent device performs valid endpoint determination;
i. the data analysis processing software APP of the intelligent device notifies the cloud server that acquisition is complete, and voice recognition starts.
With the above technical scheme, the invention has the following beneficial effects:
compared with the prior art, the improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method overcomes the poor recognition rate, endpoint misjudgment and similar problems caused by environmental differences, and improves the efficiency and user experience of man-machine voice interaction.
Drawings
FIG. 1 is a working block diagram of an improved Bluetooth intelligent cloud speaker voice interaction endpoint detection method
FIG. 2 is a flow chart of an improved Bluetooth intelligent cloud speaker voice interaction endpoint detection method
Detailed Description
The invention will be described in detail below with reference to fig. 1 to 2 and specific examples, but the invention is not limited thereto.
As shown in fig. 1 to 2, an improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method involves an intelligent cloud sound box, an intelligent device, data analysis processing software APP and a Bluetooth module. The intelligent device is a mobile phone, a tablet computer or the like; the intelligent device comprises the Bluetooth module and the data analysis processing software APP; the intelligent cloud sound box comprises a cloud server; the data analysis processing software APP is installed on the intelligent device; the Bluetooth module is connected to the Bluetooth intelligent cloud sound box through an audio channel. The data analysis processing software APP of the intelligent device establishes a control instruction connection with the Bluetooth intelligent cloud sound box through the Bluetooth module, realizing control data interaction between the two. The data analysis processing software APP is normally in the standby state. When the intelligent device wakes up voice interaction, the data analysis processing software APP starts the Bluetooth module connection, begins recording and collecting the audio signal, and at the same time establishes a data transmission channel with the cloud server of the Bluetooth intelligent cloud sound box.
The data analysis processing software APP sets a mute protection time, the duration of which is agreed between the data analysis processing software APP and the cloud server. After voice interaction is woken up, audio is collected for 3 seconds even if the user does not speak, which prevents the whole system from deciding to stop before the user has had time to speak. In addition, operating the synchronous connection-oriented (SCO) link of the Bluetooth module too frequently within a very short time can cause system-level anomalies, and the mute protection time prevents the SCO link from being operated that frequently. The data analysis processing software APP of the intelligent device extracts each frame of the audio signal in real time and sets the duration of each audio frame to 10 ms. The data analysis processing software APP of the smart phone calculates the short-time energy of each frame of the audio signal, and the calculation formula of the short-time energy signal is as follows:
$$E_i = \sum_{m=1}^{N} x_i^2(m)$$
where $x_i(m)$ is the value of the m-th sampling point in the i-th frame and $N$ is the number of sampling points in the frame.
The data analysis processing software APP of the intelligent device dynamically judges whether each frame of the audio signal is a voice frame. Short-time energy directly reflects the energy and amplitude of the voice signal, so voiced and unvoiced segments can be distinguished by it. The data analysis processing software APP dynamically searches for the maximum energy value over the current frame and the preceding audio frames, and the threshold is lowered dynamically as long as a following audio frame is below the threshold of the maximum-energy frame (M). When the current short-time energy is small, that is, when the volume has attenuated too much, the frame is defined as a non-voice frame and non-voice counting starts. When 200 consecutive non-voice frames have been counted, equivalent to a pause of 2 seconds, the speech is deemed to have ended. If a voice frame occurs in between, the counter is reset and counting starts again.
The formula of the adaptive threshold value is as follows:
[Formula image: adaptive threshold value; not reproduced in this text]
The data analysis processing software APP of the intelligent device performs valid endpoint determination. The data analysis processing software APP of the intelligent device then notifies the cloud server that acquisition is complete, and voice recognition starts: according to the result of the completed voice acquisition, the data analysis processing software APP stops recording and sends an acquisition-complete instruction to the cloud server, whereupon voice recognition begins. In a large number of voice interaction tests on the Bluetooth intelligent cloud sound box, the voice endpoints were judged accurately.
The working steps of the improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method are as follows:
a. the data analysis processing software APP of the intelligent device connects to the Bluetooth intelligent cloud sound box;
b. the intelligent device wakes up voice interaction;
c. the data analysis processing software APP of the intelligent device starts the mute protection time counter;
d. the data analysis processing software APP of the intelligent device extracts each frame of the audio signal in real time;
e. the data analysis processing software APP of the intelligent device calculates the short-time energy of each frame of the audio signal;
f. the data analysis processing software APP of the intelligent device dynamically judges whether each frame of the audio signal is a voice frame;
h. the data analysis processing software APP of the intelligent device performs valid endpoint determination;
i. the data analysis processing software APP of the intelligent device notifies the cloud server that acquisition is complete, and voice recognition starts.
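The working steps above can be read as a small session state machine. The sketch below is one possible reading, not the patent's implementation: the class, the state names and the return to standby after recognition are all assumptions made for illustration.

```java
/** Illustrative session states for steps a to i; all names are hypothetical. */
final class VoiceSession {
    enum State { STANDBY, RECORDING, RECOGNIZING }

    private State state = State.STANDBY;

    State state() {
        return state;
    }

    /** Steps b and c: wake-up starts recording and the mute protection timer. */
    void onWakeUp() {
        if (state == State.STANDBY) {
            state = State.RECORDING;
        }
    }

    /** Steps h and i: a valid endpoint stops recording and hands over to the cloud. */
    void onEndpointDetected() {
        if (state == State.RECORDING) {
            state = State.RECOGNIZING;
        }
    }

    /** Assumption: once recognition finishes, the APP returns to standby. */
    void onRecognitionDone() {
        if (state == State.RECOGNIZING) {
            state = State.STANDBY;
        }
    }
}
```

Events that arrive in the wrong state are ignored, so a stray endpoint signal while in standby cannot trigger recognition.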
In the embodiment of the invention:
S101, the data analysis processing software APP of the intelligent device connects to the Bluetooth intelligent cloud sound box device;
First, an audio channel connection to the Bluetooth intelligent cloud sound box is established through the Bluetooth module in the mobile phone system. Then, a control instruction connection is established between the data analysis processing software APP of the intelligent device and the Bluetooth intelligent cloud sound box. To ensure good compatibility, the Android version establishes an SPP channel connection with the device and the iOS version establishes a BLE channel connection, so that control data can be exchanged between the data analysis processing software APP and the Bluetooth intelligent cloud sound box device.
S102, the intelligent device wakes up voice interaction;
The data analysis processing software APP is normally in the standby state. Only when the device end wakes up voice interaction does it start the Bluetooth SCO connection, begin recording and collecting the audio signal, and at the same time establish a data transmission channel with the cloud server.
S103, the data analysis processing software APP of the intelligent device starts a mute protection time counter;
The data analysis processing software APP of the intelligent device starts a mute protection time counter. For a better user experience and for system stability, a mute protection time is set, the specific duration of which is agreed with the cloud server. After voice interaction is woken up, there are 3 seconds of acquisition time even if the user does not speak, which prevents the whole system from deciding to stop when the user has not had time to speak. In addition, operating the Bluetooth SCO link too frequently within a very short time can cause system-level anomalies, which the mute protection time also avoids.
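The mute protection time can be pictured as a simple guard timer. The following is a minimal sketch assuming the 3-second value from the text; the class and method names are hypothetical, and timestamps are passed in explicitly so the logic can be checked without a real clock.

```java
/** Illustrative mute protection timer (hypothetical names; 3 s taken from the text). */
final class SilenceGuard {
    static final long GUARD_MILLIS = 3_000L;

    private final long wakeUpMillis;

    SilenceGuard(long wakeUpMillis) {
        this.wakeUpMillis = wakeUpMillis;
    }

    /** True once the protection window since wake-up has fully elapsed. */
    boolean hasElapsed(long nowMillis) {
        return nowMillis - wakeUpMillis >= GUARD_MILLIS;
    }
}
```

While `hasElapsed` is false, a detected pause would not be allowed to end the recording, which also keeps the SCO link from being opened and closed in rapid succession.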
S104, the data analysis processing software APP of the intelligent device extracts each frame of the audio signal in real time;
The audio signal is a non-stationary, time-varying signal. To obtain more accurate calculation results, it is treated as stationary and time-invariant within a "short time" range; here, the data analysis processing software APP generally sets the duration of each audio frame to 10 ms.
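As a worked example of the 10 ms framing, the sketch below computes the frame size in samples and splits a PCM buffer into whole frames. The 8 kHz sample rate used in the usage note is an assumption (the patent does not state the rate), and the class and method names are hypothetical.

```java
import java.util.Arrays;

/** Illustrative 10 ms framing helper (hypothetical names). */
final class Framer {
    /** Number of samples in one frame of the given duration. */
    static int samplesPerFrame(int sampleRateHz, int frameMillis) {
        return sampleRateHz * frameMillis / 1000;
    }

    /** Splits PCM samples into full frames; a trailing partial frame is dropped. */
    static short[][] split(short[] pcm, int frameSize) {
        int count = pcm.length / frameSize;
        short[][] frames = new short[count][];
        for (int i = 0; i < count; i++) {
            frames[i] = Arrays.copyOfRange(pcm, i * frameSize, (i + 1) * frameSize);
        }
        return frames;
    }
}
```

At an assumed 8 kHz, `Framer.samplesPerFrame(8000, 10)` gives 80 samples per frame, so the 200 consecutive non-voice frames mentioned in the text span 200 × 10 ms = 2 s.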
S105, the data analysis processing software APP of the intelligent device calculates the short-time energy of each frame of the audio signal;
The calculation formula of the short-time energy signal is as follows:
$$E_i = \sum_{m=1}^{N} x_i^2(m)$$
where $x_i(m)$ is the value of the m-th sampling point in the i-th frame and $N$ is the number of sampling points in the frame.
Example APP code for the short-time energy calculation is as follows:
private long getRms(int end, int span) {
    // Start of the analysis window, clamped to the start of the recording.
    int begin = end - span;
    if (begin < 0) {
        begin = 0;
    }
    // Align to an even byte offset: each 16-bit sample occupies two bytes.
    if (begin % 2 != 0) {
        begin++;
    }
    // Despite the method name, the result is the sum of squared samples.
    long sum = 0;
    for (int i = begin; i + 1 < end; i += 2) {
        short curSample = getShort(this.mRecording[i], this.mRecording[i + 1]);
        sum += (long) curSample * curSample;
    }
    return sum;
}
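The example above relies on a `getShort` helper that the patent does not show. A minimal sketch of what it might look like follows, assuming the recording buffer holds 16-bit little-endian PCM with the low byte first; the byte order, the class name and the `sumOfSquares` wrapper are assumptions made for illustration.

```java
/** Illustrative decoding of 16-bit PCM bytes (assumes little-endian byte order). */
final class PcmBytes {
    /** Combines a low and a high byte into one signed 16-bit sample. */
    static short getShort(byte low, byte high) {
        return (short) ((low & 0xFF) | (high << 8));
    }

    /** Sum of squared samples over the byte range [begin, end). */
    static long sumOfSquares(byte[] recording, int begin, int end) {
        long sum = 0;
        for (int i = begin; i + 1 < end; i += 2) {
            short s = getShort(recording[i], recording[i + 1]);
            sum += (long) s * s;
        }
        return sum;
    }
}
```

Masking the low byte with `0xFF` matters: without it, a low byte of `0xFC` would be sign-extended and corrupt the high half of the sample.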
S106, the data analysis processing software APP of the intelligent device dynamically judges whether each frame of the audio signal is a voice frame;
Short-time energy directly reflects the energy and amplitude of the voice signal, so voiced and unvoiced segments can be distinguished by it. The data analysis processing software APP dynamically searches for the maximum energy value over the current frame and the preceding audio frames, and the threshold is lowered dynamically as long as a following audio frame is below the threshold of the maximum-energy frame (M). When the current short-time energy is small, that is, when the volume has attenuated too much, the frame is defined as a non-voice frame and non-voice counting starts. When the count of consecutive non-voice frames reaches 200, equivalent to a pause of 2 seconds, the speech is deemed to have ended. If a voice frame occurs in between, the counter is reset and counting starts again.
Adaptive threshold value:
[Formula image: adaptive threshold value; not reproduced in this text]
APP example code is as follows:
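private static final int RMS_COUNT_MAX = 200; // 2s
public boolean isPausing() {
    long rms = getRms(this.mRecordedLength, this.mOneSec);
    if (rms > this.highestRMS) {
        // New energy peak: remember it and restart the non-voice count.
        this.highestRMS = rms;
        this.rmsCount = 0;
        return false;
    } else if (((double) rms) < M * ((double) this.highestRMS)) {
        // Energy is below M times the peak: count one more non-voice check.
        if (this.rmsCount < RMS_COUNT_MAX) {
            this.rmsCount++;
            return false;
        } else {
            // Enough consecutive non-voice checks: report the pause.
            this.rmsCount = 0;
            return true;
        }
    } else {
        // Voice data in between: reset the counter.
        this.rmsCount = 0;
        return false;
    }
}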
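For readers who want to experiment with the counting logic in isolation, here is a self-contained sketch of the same idea. It is a variant, not the patent's code: the threshold ratio (the role played by `M` above) and the frame limit are constructor parameters with no value taken from the patent, the class name is hypothetical, and the detector reports the pause on the N-th consecutive quiet frame rather than on the check after it.

```java
/** Illustrative peak-relative pause detector (hypothetical names and parameters). */
final class PauseDetector {
    private final double ratio;        // fraction of the peak energy used as threshold
    private final int maxQuietFrames;  // consecutive quiet frames that end the utterance
    private long peakEnergy = 0;
    private int quietFrames = 0;

    PauseDetector(double ratio, int maxQuietFrames) {
        this.ratio = ratio;
        this.maxQuietFrames = maxQuietFrames;
    }

    /** Feeds one frame energy; returns true when the pause is long enough. */
    boolean onFrame(long energy) {
        if (energy > peakEnergy) {
            // New peak: the threshold rises with it and the count restarts.
            peakEnergy = energy;
            quietFrames = 0;
            return false;
        }
        if (energy < ratio * peakEnergy) {
            quietFrames++;
            if (quietFrames >= maxQuietFrames) {
                quietFrames = 0;
                return true;
            }
            return false;
        }
        // Voice frame in between: reset the counter.
        quietFrames = 0;
        return false;
    }
}
```

With 10 ms frames, `new PauseDetector(ratio, 200)` would correspond to the 2-second pause described in S106, for whatever ratio is chosen.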
S107, the data analysis processing software APP of the intelligent device performs valid endpoint determination;
Voice endpoint determination in man-machine interaction is constrained by several conditions: the 3-second mute protection time, the locally improved short-time-energy voice endpoint detection, and the stop-acquisition instruction issued by the cloud.
APP example code is as follows:
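while (recorder != null && recorder.getState() == AudioRecorder.State.RECORDING) {
    boolean pausing = recorder.isPausing();
    // Stop only once a pause is detected and the minimum recording duration is reached.
    if (pausing && mRecordDurationReached) {
        if (mBtDeviceSpeechType == BT_DEVICE_SPEECH_RECOGNITION) {
            mBtDeviceSpeechType = BT_DEVICE_SPEECH_RECOGNITION_NONE;
            stopBluetoothSCO();
        }
        stopListening(true);
        break;
    }
    try {
        // Poll every 10 ms, matching the frame duration.
        Thread.sleep(10);
    } catch (InterruptedException e) {
        // Restore the interrupt flag and leave the polling loop.
        Thread.currentThread().interrupt();
        break;
    }
}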
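The three conditions named in S107 can be condensed into a single predicate. The sketch below is one plausible combination, stated as an assumption: the patent's loop only shows the pause and duration flags, so treating the cloud stop instruction as an independent override is a guess, and all names are hypothetical.

```java
/** Illustrative combination of the three S107 conditions (hypothetical names). */
final class EndpointDecision {
    /**
     * @param guardElapsed whether the mute protection time has passed
     * @param localPause   whether the local energy detector reports a pause
     * @param cloudStop    whether the cloud has issued a stop-acquisition instruction
     */
    static boolean shouldStop(boolean guardElapsed, boolean localPause, boolean cloudStop) {
        // Assumption: a cloud stop instruction ends acquisition on its own,
        // while a local pause only counts after the protection time.
        return cloudStop || (guardElapsed && localPause);
    }
}
```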
S108, the data analysis processing software APP of the intelligent device sends the collected data to the cloud, and voice recognition starts;
According to the result of the completed voice acquisition, the data analysis processing software APP stops recording and sends an acquisition-complete instruction to the cloud, after which voice recognition can begin. In a large number of voice interaction tests on the Bluetooth intelligent cloud sound box, the voice endpoints were, for the most part, judged accurately. The transmission and processing of non-voice frames are greatly reduced, which improves efficiency and the user experience.
It will be appreciated by those skilled in the art that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The embodiments disclosed above are therefore to be considered in all respects as illustrative and not restrictive. All changes which come within the scope of or equivalence to the invention are intended to be embraced therein.

Claims (6)

1. An improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method, comprising an intelligent cloud sound box, an intelligent device, data analysis processing software APP and a Bluetooth module, characterized in that: the intelligent device comprises the Bluetooth module and the data analysis processing software APP; the intelligent cloud sound box comprises a cloud server; the data analysis processing software APP is installed on the intelligent device; the Bluetooth module is connected to the Bluetooth intelligent cloud sound box through an audio channel; the data analysis processing software APP of the intelligent device establishes a control instruction connection with the Bluetooth intelligent cloud sound box through the Bluetooth module, realizing control data interaction between the data analysis processing software APP and the Bluetooth intelligent cloud sound box;
the working steps of the improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method are as follows:
a. the data analysis processing software APP of the intelligent device connects to the Bluetooth intelligent cloud sound box;
b. the intelligent device wakes up voice interaction;
c. the data analysis processing software APP of the intelligent device starts the mute protection time counter;
d. the data analysis processing software APP of the intelligent device extracts each frame of the audio signal in real time;
e. the data analysis processing software APP of the intelligent device calculates the short-time energy of each frame of the audio signal;
f. the data analysis processing software APP of the intelligent device dynamically judges whether each frame of the audio signal is a voice frame;
h. the data analysis processing software APP of the intelligent device performs valid endpoint determination;
i. the data analysis processing software APP of the intelligent device sends the collected data to the cloud server, voice recognition starts, and the voice endpoints are judged accurately in a large number of voice interaction tests on the Bluetooth intelligent cloud sound box;
the data analysis processing software APP sets a mute protection time, the duration of which is agreed between the data analysis processing software APP and the cloud server; after voice interaction is woken up, audio is collected for 3 seconds even if the user does not speak, which prevents the whole system from deciding to stop before the user has had time to speak; in addition, operating the synchronous connection-oriented (SCO) link of the Bluetooth module too frequently within a very short time can cause system-level anomalies, and the mute protection time prevents the SCO link of the Bluetooth module from being operated that frequently.
2. The improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method according to claim 1, wherein: the data analysis processing software APP is normally in the standby state; when the intelligent device wakes up voice interaction, the data analysis processing software APP starts the Bluetooth module connection, begins recording and collecting the audio signal, and at the same time establishes a data transmission channel with the cloud server of the Bluetooth intelligent cloud sound box.
3. The improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method according to claim 1, wherein: the data analysis processing software APP of the intelligent device extracts each frame of the audio signal in real time; the data analysis processing software APP sets the duration of each audio frame to 10 ms.
4. The improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method according to claim 1, wherein: the data analysis processing software APP of the smart phone calculates the short-time energy of each frame of the audio signal, and the calculation formula of the short-time energy signal is as follows:
$$E_i = \sum_{m=1}^{N} x_i^2(m)$$
where $x_i(m)$ is the value of the m-th sampling point in the i-th frame and $N$ is the number of sampling points in the frame.
5. The improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method according to claim 1, wherein: the data analysis processing software APP of the intelligent device performs valid endpoint determination.
6. The improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method according to claim 1, wherein: the data analysis processing software APP of the intelligent device performs valid endpoint determination; the data analysis processing software APP of the intelligent device notifies the cloud server that acquisition is complete, and voice recognition starts; according to the result of the completed voice acquisition, the data analysis processing software APP stops recording and sends an acquisition-complete instruction to the cloud server, whereupon voice recognition begins; in a large number of voice interaction tests on the Bluetooth intelligent cloud sound box, the voice endpoints are judged accurately.
CN201810014999.3A 2018-01-08 2018-01-08 Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method Active CN108172242B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810014999.3A CN108172242B (en) 2018-01-08 2018-01-08 Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810014999.3A CN108172242B (en) 2018-01-08 2018-01-08 Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method

Publications (2)

Publication Number Publication Date
CN108172242A (en) 2018-06-15
CN108172242B (en) 2021-06-01

Family

ID=62517740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810014999.3A Active CN108172242B (en) 2018-01-08 2018-01-08 Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method

Country Status (1)

Country Link
CN (1) CN108172242B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110958348B (en) * 2018-09-25 2022-07-01 阿里巴巴集团控股有限公司 Voice processing method and device, user equipment and intelligent sound box
CN110971744B (en) * 2018-09-28 2022-09-23 深圳市冠旭电子股份有限公司 Method and device for controlling voice playing of Bluetooth sound box
CN111083678B (en) * 2018-10-22 2021-08-06 深圳市冠旭电子股份有限公司 Bluetooth speaker playback control method, system and smart device
CN110097884B (en) * 2019-06-11 2022-05-17 大众问问(北京)信息科技有限公司 A voice interaction method and device
CN112449050A (en) * 2019-08-29 2021-03-05 阿里巴巴集团控股有限公司 Voice interaction method, voice interaction device, computing device and storage medium
CN111554287B (en) * 2020-04-27 2023-09-05 佛山市顺德区美的洗涤电器制造有限公司 Voice processing method and device, household appliance and readable storage medium
CN111968680B (en) * 2020-08-14 2024-10-01 北京小米松果电子有限公司 Voice processing method, device and storage medium
CN112420079B (en) * 2020-11-18 2022-12-06 青岛海尔科技有限公司 Voice endpoint detection method and device, storage medium and electronic equipment
CN112863542B (en) * 2021-01-29 2022-10-28 青岛海尔科技有限公司 Voice detection method and device, storage medium and electronic equipment

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN2745116Y (en) * 2004-11-12 2005-12-07 联想(北京)有限公司 Computer I/O peripheral equipment having wireless connecting function
CN101107824A (en) * 2004-12-31 2008-01-16 英国电讯有限公司 Connection-oriented communication scheme for connectionless communication traffic
CN101984725A (en) * 2010-11-17 2011-03-09 广州杰赛科技股份有限公司 Wireless access device and method
CN202679358U (en) * 2012-05-09 2013-01-16 深圳市芯中芯科技有限公司 Stereo Bluetooth audio module
CN102891408A (en) * 2012-10-12 2013-01-23 歌尔声学股份有限公司 Bluetooth controlled power socket and implementation method for Bluetooth controlled power socket
CN103369677A (en) * 2012-04-02 2013-10-23 英特尔移动通信有限责任公司 Radio communication device and method for operating a radio communication device
CN104184496A (en) * 2013-05-24 2014-12-03 凌通科技股份有限公司 Bluetooth data/control information transmission module, interactive system and method thereof
CN204517806U (en) * 2015-01-09 2015-07-29 深圳市芯中芯科技有限公司 Audio transmitting and receiving system based on the 5.8 GHz frequency band
CN105338645A (en) * 2012-05-30 2016-02-17 英特尔移动通信有限责任公司 Radio communication device

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE4422545A1 (en) * 1994-06-28 1996-01-04 Sel Alcatel Ag Start / end point detection for word recognition
CN1141696C (en) * 2000-03-31 2004-03-10 清华大学 Non-particular human speech recognition and prompt method based on special speech recognition chip
CN100456356C (en) * 2004-11-12 2009-01-28 中国科学院声学研究所 A Speech Endpoint Detection Method Applied to Speech Recognition System
KR20080048175A (en) * 2006-11-28 2008-06-02 삼성전자주식회사 Sound source playback system and playback method of mobile terminal
US8578247B2 (en) * 2008-05-08 2013-11-05 Broadcom Corporation Bit error management methods for wireless audio communication channels
CN101625857B (en) * 2008-07-10 2012-05-09 新奥特(北京)视频技术有限公司 Self-adaptive voice endpoint detection method
CN103065629A (en) * 2012-11-20 2013-04-24 广东工业大学 Speech recognition system of humanoid robot
CN103871401B (en) * 2012-12-10 2016-12-28 联想(北京)有限公司 Speech recognition method and electronic device
CN106653021B (en) * 2016-12-27 2020-06-02 上海智臻智能网络科技股份有限公司 Voice wake-up control method and device and terminal
CN107277272A (en) * 2017-07-25 2017-10-20 深圳市芯中芯科技有限公司 Bluetooth device voice interaction method and system based on software APP

Also Published As

Publication number Publication date
CN108172242A (en) 2018-06-15

Similar Documents

Publication Publication Date Title
CN108172242B (en) Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method
US11830479B2 (en) Voice recognition method and apparatus, and air conditioner
JP6171617B2 (en) Response target speech determination apparatus, response target speech determination method, and response target speech determination program
Li et al. Robust endpoint detection and energy normalization for real-time speech and speaker recognition
CN103325386B (en) The method and system controlled for signal transmission
CN103578468B (en) The method of adjustment and electronic equipment of a kind of confidence coefficient threshold of voice recognition
CN110335593B (en) Voice endpoint detection method, device, equipment and storage medium
CN101494049B (en) Method for extracting audio characteristic parameter of audio monitoring system
CN110268470A (en) The modification of audio frequency apparatus filter
CN103745723A (en) Method and device for identifying audio signal
CN110047470A (en) A kind of sound end detecting method
CN105139858A (en) Information processing method and electronic equipment
CN111179927A (en) Financial equipment voice interaction method and system
CN1763844B (en) End-point detecting method, apparatus and speech recognition system based on sliding window
CN109215634A (en) Method and system for multi-word voice control on-off device
CN110364178B (en) Voice processing method and device, storage medium and electronic equipment
CN111768800A (en) Voice signal processing method, apparatus and storage medium
CN103543814A (en) Signal processing device and signal processing method
CN109994129B (en) Speech processing system, method and device
CN119854414A (en) AI-based telephone answering system
CN112543972A (en) Audio processing method and device
CN111326159A (en) Voice recognition method, device and system
US8315865B2 (en) Method and apparatus for adaptive conversation detection employing minimal computation
CN102693721A (en) Simple and easy voice and gender detection device and method
CN113270118A (en) Voice activity detection method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant