KR100233532B1 - Audio Codec of Voice Communication System - Google Patents
- Publication number
- KR100233532B1
- Authority
- KR
- South Korea
- Prior art keywords
- voice
- text data
- communication system
- audio codec
- voice communication
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Telephonic Communication Services (AREA)
Abstract
Disclosed is an audio codec (CODEC) for a voice communication system. The audio codec includes an encoder that recognizes an input voice signal, converts it into text data, and outputs the text data, and a decoder that receives the text data transmitted from the encoder and generates and outputs an artificial voice signal with a preset timbre from the received text data. Such an audio codec enables voice communication over a low-bandwidth communication line.
Description
The present invention relates to an audio codec (CODEC) for a voice communication system and, more particularly, to an apparatus that uses speech recognition and artificial speech generation to enable voice communication even over a low-bandwidth communication channel.
In recent years, multimedia products based on video communication have appeared that let users converse while seeing the face of a person who is far away. For video communication systems connected over a single line, standards for encoding and decoding the video and audio signals and for multiplexing those signals have been recommended for each type of line.
Consider a user of a video communication system who says, for example, "안녕하십니까?" ("Hello?") over two seconds. If the user's voice signal is sampled at 8 kHz and digitized with 2 bytes per sample, the amount of data is 32,000 bytes. Transmitting this much data requires a high-bandwidth communication line. Recently, speech compression coding standards such as G.723, G.728, and G.729 have been recommended so that voice signals can be transmitted over low-bandwidth communication lines. Of these, G.723, proposed for digital communication over the public switched telephone network (PSTN), is known to perform best. In its low-rate mode, G.723 compresses data to 5.3 kbps. Even when the voice signal above is compressed this way, the amount of data is still about 1,333 bytes.
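The figures in the paragraph above can be checked with a short calculation. The sketch below is an illustration only and not part of the patent: the function names are hypothetical, and the 20-byte frame every 30 ms used for the low-rate mode is an assumption taken from the G.723.1 recommendation, since the patent text states only the 5.3 kbps rate.

```python
# Illustrative arithmetic only; names and the frame layout are assumptions.

def pcm_bytes(duration_s: float, sample_rate_hz: int = 8000,
              bytes_per_sample: int = 2) -> int:
    """Size of uncompressed PCM audio: duration x sample rate x sample width."""
    return int(duration_s * sample_rate_hz * bytes_per_sample)


def g723_low_rate_bytes(duration_s: float, frame_ms: int = 30,
                        frame_bytes: int = 20) -> float:
    """Approximate size at the low rate, assuming one 20-byte frame per 30 ms."""
    frames = duration_s * 1000 / frame_ms
    return frames * frame_bytes


raw = pcm_bytes(2)                          # the 2-second utterance, uncompressed
compressed = round(g723_low_rate_bytes(2))  # the same utterance at the low rate
print(raw, compressed, round(raw / compressed))
```

Under these assumptions the compressed utterance is roughly 24 times smaller than the raw samples, yet still more than a thousand bytes.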
If the participants in the communication clearly know who is speaking, and it is therefore enough to know only what was said regardless of the speaker's timbre (tone), there is no need to use the compression coding methods described above. In other words, if only the content "안녕하십니까?" is recognized and converted to text, regardless of the speaker's timbre, that content can be represented in just 12 bytes.
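The 12-byte figure is consistent with a 2-byte-per-syllable Korean character set applied to the six Hangul syllables of the phrase, leaving out the question mark. The sketch below assumes EUC-KR as that character set; the patent does not name an encoding, and the helper name is hypothetical.

```python
# Illustrative only: the patent does not say which character encoding is used.

def text_bytes(text: str, charset: str) -> int:
    """Number of bytes needed to carry `text` in the given character set."""
    return len(text.encode(charset))


phrase = "안녕하십니까"  # six Hangul syllables, question mark omitted

for charset in ("euc_kr", "utf-8"):
    print(charset, text_bytes(phrase, charset))
```

With a 3-byte-per-syllable encoding such as UTF-8 the same phrase takes 18 bytes, which is still far below the size of any compressed audio form of the utterance.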
An object of the present invention is to provide an audio codec for a voice communication system that recognizes the speaker's voice, converts it into text data with a small data amount, transmits that text data, and converts the transmitted text data into artificial speech for output, thereby enabling voice communication over a low-bandwidth communication channel.
FIG. 1 is a block diagram of a voice communication system according to the present invention.
<Description of reference numerals for the main parts of the drawing>
11: microphone 12: speech recognizer
13: speech generator 14: speaker
To achieve this object, the audio codec according to the present invention, in a voice communication system, includes an encoder that recognizes an input voice signal, converts it into text data, and outputs the text data, and a decoder that receives the text data transmitted from the encoder and generates and outputs an artificial voice signal with a preset timbre from the received text data.
The present invention is described in detail below with reference to the accompanying drawing.
FIG. 1 shows the configuration of a voice communication system according to the present invention.
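In FIG. 1, the microphone (11) receives a voice signal from the user. The speech recognizer (12) receives the voice signal from the microphone (11) and outputs text data. The text data is transmitted over the public switched telephone network (PSTN). On the receiving side, the speech generator (13) generates a voice signal from the received text data and outputs the generated voice signal through the speaker (14).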
A user of the system of FIG. 1 speaks into the microphone (11) during a video conference. The user's voice signal input through the microphone (11) is applied to the speech recognizer (12). The speech recognizer (12) converts the input signal into codes that the system can recognize. The voice data converted into codes, that is, the voice data converted to text, is transmitted over the PSTN to the other party's video terminal. Because the amount of text data is far smaller than the compressed form of the voice signal before text conversion, it can be transmitted over a low-bandwidth communication channel.
Meanwhile, the decoder receives the text data transmitted over the PSTN and restores the received data to a voice signal. That is, the speech generator (13) converts the received text data into an artificial voice signal with a preset timbre and outputs it. The voice signal produced by the speech generator (13) is output through the speaker (14). The timbre input through the microphone (11) and the timbre output through the speaker (14) differ, but what the speaker said is conveyed to the other users unchanged.
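The encoder and decoder roles described above can be sketched as two small components joined by a byte payload. This is a minimal illustration under stated assumptions, not the patented implementation: the class names, the stand-in recognizer and synthesizer, and the EUC-KR payload format are all hypothetical, and a real system would substitute actual speech recognition and speech synthesis components.

```python
from dataclasses import dataclass
from typing import Callable

Recognizer = Callable[[bytes], str]   # PCM audio -> recognized text
Synthesizer = Callable[[str], bytes]  # text -> PCM audio in a preset voice


@dataclass
class TextEncoder:
    """Encoder side: speech recognizer (12) followed by text serialization."""
    recognize: Recognizer
    charset: str = "euc_kr"  # assumed payload encoding

    def encode(self, pcm: bytes) -> bytes:
        return self.recognize(pcm).encode(self.charset)


@dataclass
class TextDecoder:
    """Decoder side: text deserialization followed by speech generator (13)."""
    synthesize: Synthesizer
    charset: str = "euc_kr"  # must match the encoder

    def decode(self, payload: bytes) -> bytes:
        return self.synthesize(payload.decode(self.charset))


def fake_recognizer(pcm: bytes) -> str:
    # Stand-in for a real recognizer: always "hears" the example phrase.
    return "안녕하십니까"


def fake_synthesizer(text: str) -> bytes:
    # Stand-in for a real synthesizer: 0.1 s of 16-bit, 8 kHz silence per character.
    return bytes(1600 * len(text))


captured = bytes(32000)  # 2 s of 16-bit, 8 kHz audio from the microphone (11)
payload = TextEncoder(fake_recognizer).encode(captured)  # what crosses the PSTN
speech = TextDecoder(fake_synthesizer).decode(payload)   # fed to the speaker (14)
print(len(captured), len(payload), len(speech))
```

Only `payload` travels over the line, so the channel carries the recognized text rather than the waveform; the output audio depends entirely on the synthesizer's preset voice, which is why the original timbre is not preserved.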
The speech recognizer (12) on the encoding side and the speech generator (13) on the decoding side can be implemented by building into the system speech recognition ICs and speech synthesis ICs, whose development has advanced considerably in recent years.
The audio codec according to the present invention thus enables voice communication over a low-bandwidth communication line.
Claims (3)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1019970005111A KR100233532B1 (en) | 1997-02-20 | 1997-02-20 | Audio Codec of Voice Communication System |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1019970005111A KR100233532B1 (en) | 1997-02-20 | 1997-02-20 | Audio Codec of Voice Communication System |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| KR19980068496A (en) | 1998-10-26 |
| KR100233532B1 (en) | 1999-12-01 |
Family
ID=19497504
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| KR1019970005111A Expired - Fee Related KR100233532B1 (en) | 1997-02-20 | 1997-02-20 | Audio Codec of Voice Communication System |
Country Status (1)
| Country | Link |
|---|---|
| KR (1) | KR100233532B1 (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119207421B (en) * | 2024-09-23 | 2025-09-12 | 东南大学 | A method and system for speech recognition and cloning semantic speech transmission |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR930024395A (en) * | 1992-05-27 | 1993-12-22 | 정용문 | Automated response system using phoneme conversion method |
-
1997
- 1997-02-20 KR KR1019970005111A patent/KR100233532B1/en not_active Expired - Fee Related
Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR930024395A (en) * | 1992-05-27 | 1993-12-22 | 정용문 | Automated response system using phoneme conversion method |
Also Published As
| Publication number | Publication date |
|---|---|
| KR19980068496A (en) | 1998-10-26 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US8340959B2 (en) | Method and apparatus for transmitting wideband speech signals | |
| US8259629B2 (en) | System and method for transmitting and receiving wideband speech signals with a synthesized signal | |
| KR19980701696A (en) | Method and apparatus for detection and bypass of tandem vocoding | |
| WO2007140724A1 (en) | A method and apparatus for transmitting and receiving background noise and a silence compressing system | |
| US7177801B2 (en) | Speech transfer over packet networks using very low digital data bandwidths | |
| WO2010059342A1 (en) | Apparatus and method for encoding at least one parameter associated with a signal source | |
| KR100233532B1 (en) | Audio Codec of Voice Communication System | |
| JPH1049199A (en) | Silence compressed voice coding and decoding device | |
| Decina et al. | CCITT standards on digital speech processing | |
| Kitawaki et al. | Speech coding technology for ATM networks | |
| CN215868635U (en) | A Narrowband Channel Oriented Voice Communication System | |
| JP4333005B2 (en) | Speech encoding / decoding device, speech encoding device, and encoding method | |
| JPH06216779A (en) | Communication device | |
| JPS60107933A (en) | Adpcm encoding device | |
| JPH06216860A (en) | Voice communication device | |
| JPH10145764A (en) | Speaker detection method and multipoint video conference device | |
| KR960003626B1 (en) | Decoding method of deaf-coded audio signal | |
| EP1220202A1 (en) | System and method for coding and decoding speaker-independent and speaker-dependent speech information | |
| KR100400720B1 (en) | Method for transfering data through internet | |
| JP2000244949A (en) | Dtmf signal transmitting device | |
| JPH08307366A (en) | Speech coding device | |
| Ma et al. | A solution to mix low bit-rate speech signal in decentralized multipoint conference | |
| JPH0431457B2 (en) | ||
| JPH0434339B2 (en) | ||
| Nakatsui et al. | Dual adaptive delta modulation for mobile voice channel and its DSP implementation |
Legal Events
| Code | Title | Description |
|---|---|---|
| A201 | Request for examination | |
| PA0109 | Patent application | St.27 status event code: A-0-1-A10-A12-nap-PA0109 |
| PA0201 | Request for examination | St.27 status event code: A-1-2-D10-D11-exm-PA0201 |
| R17-X000 | Change to representative recorded | St.27 status event code: A-3-3-R10-R17-oth-X000 |
| PG1501 | Laying open of application | St.27 status event code: A-1-1-Q10-Q12-nap-PG1501 |
| R18-X000 | Changes to party contact information recorded | St.27 status event code: A-3-3-R10-R18-oth-X000 |
| PN2301 | Change of applicant | St.27 status event code: A-3-3-R10-R13-asn-PN2301; St.27 status event code: A-3-3-R10-R11-asn-PN2301 |
| E902 | Notification of reason for refusal | |
| PE0902 | Notice of grounds for rejection | St.27 status event code: A-1-2-D10-D21-exm-PE0902 |
| P11-X000 | Amendment of application requested | St.27 status event code: A-2-2-P10-P11-nap-X000 |
| P13-X000 | Application amended | St.27 status event code: A-2-2-P10-P13-nap-X000 |
| E701 | Decision to grant or registration of patent right | |
| PE0701 | Decision of registration | St.27 status event code: A-1-2-D10-D22-exm-PE0701 |
| GRNT | Written decision to grant | |
| PR0701 | Registration of establishment | St.27 status event code: A-2-4-F10-F11-exm-PR0701 |
| PR1002 | Payment of registration fee | St.27 status event code: A-2-2-U10-U11-oth-PR1002; Fee payment year number: 1 |
| PN2301 | Change of applicant | St.27 status event code: A-5-5-R10-R13-asn-PN2301; St.27 status event code: A-5-5-R10-R11-asn-PN2301 |
| PG1601 | Publication of registration | St.27 status event code: A-4-4-Q10-Q13-nap-PG1601 |
| R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| PN2301 | Change of applicant | St.27 status event code: A-5-5-R10-R13-asn-PN2301; St.27 status event code: A-5-5-R10-R11-asn-PN2301 |
| PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 4 |
| R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 5 |
| R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 6 |
| PN2301 | Change of applicant | St.27 status event code: A-5-5-R10-R13-asn-PN2301; St.27 status event code: A-5-5-R10-R11-asn-PN2301 |
| PN2301 | Change of applicant | St.27 status event code: A-5-5-R10-R13-asn-PN2301; St.27 status event code: A-5-5-R10-R11-asn-PN2301 |
| PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 7 |
| PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 8 |
| PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 9 |
| PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 10 |
| PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 11 |
| PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 12 |
| FPAY | Annual fee payment | Payment date: 20110830; Year of fee payment: 13 |
| PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 13 |
| R18-X000 | Changes to party contact information recorded | St.27 status event code: A-5-5-R10-R18-oth-X000 |
| FPAY | Annual fee payment | Payment date: 20120830; Year of fee payment: 14 |
| PR1001 | Payment of annual fee | St.27 status event code: A-4-4-U10-U11-oth-PR1001; Fee payment year number: 14 |
| P22-X000 | Classification modified | St.27 status event code: A-4-4-P10-P22-nap-X000 |
| LAPS | Lapse due to unpaid annual fee | |
| PC1903 | Unpaid annual fee | St.27 status event code: A-4-4-U10-U13-oth-PC1903; Not in force date: 20130914; Payment event data comment text: Termination Category : DEFAULT_OF_REGISTRATION_FEE |
| PC1903 | Unpaid annual fee | St.27 status event code: N-4-6-H10-H13-oth-PC1903; Ip right cessation event data comment text: Termination Category : DEFAULT_OF_REGISTRATION_FEE; Not in force date: 20130914 |
| P22-X000 | Classification modified | St.27 status event code: A-4-4-P10-P22-nap-X000 |