KR100233532B1 - Audio Codec of Voice Communication System - Google Patents

Info

Publication number
KR100233532B1
Authority
KR
South Korea
Prior art keywords
voice
text data
communication system
audio codec
voice communication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
KR1019970005111A
Other languages
Korean (ko)
Other versions
KR19980068496A (en)
Inventor
김남시
Original Assignee
윤종용
삼성전자주식회사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 윤종용, 삼성전자주식회사
Priority to KR1019970005111A
Publication of KR19980068496A
Application granted
Publication of KR100233532B1
Anticipated expiration
Expired - Fee Related

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/28Constructional details of speech recognition systems
    • G10L15/30Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Telephonic Communication Services (AREA)

Abstract

Disclosed is an audio codec (CODEC) for a voice communication system. The codec includes an encoder that recognizes an input voice signal, converts it into text data, and outputs the text data, and a decoder that receives the text data transmitted from the encoder and generates from it an artificial voice signal having a preset tone. Such an audio codec makes voice communication possible over a communication line with a small bandwidth.

Description

Audio Codec of Voice Communication System

The present invention relates to an audio codec (CODEC) of a voice communication system and, more particularly, to an apparatus that uses speech recognition and artificial speech generation to enable voice communication even over a communication channel with a small bandwidth.

In recent years, multimedia products based on video communication have appeared that let people converse while seeing the face of someone geographically far away. For video communication systems connected over a single line, standards for encoding and decoding the video and audio signals, and for multiplexing those signals, have been recommended for each type of line.

Consider a case in which a user communicating over a video communication system says, for example, "안녕하십니까?" ("Hello?") over two seconds. If the user's voice signal is sampled at 8 kHz and digitized with 2 bytes per sample, the amount of data is 32,000 bytes. Transmitting this much data requires a communication line with a large bandwidth. Recently, speech compression coding standards such as G.723, G.728, and G.729 have been recommended so that voice signals can be transmitted over communication lines with a small bandwidth. Among these, G.723, proposed for digital communication over the public switched telephone network (PSTN), is known to perform best. At its low rate, G.723 compresses data to 5.3 kbps. Even when the voice signal above is compressed this way, the amount of data is still about 1,333 bytes.
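The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below is illustrative only. The variable names are hypothetical, and the 20-byte, 30 ms frame layout is an assumption that is not stated in the patent text; it is included because it appears to account for the quoted figure of about 1,333 bytes, whereas the nominal 5.3 kbps rate gives 1,325 bytes.

```python
# Worked arithmetic for the two-second utterance described in the text.
SAMPLE_RATE_HZ = 8_000   # sampling frequency given in the text
BYTES_PER_SAMPLE = 2     # 2 bytes allocated per sample
DURATION_S = 2           # length of the utterance in seconds

# Uncompressed PCM size.
pcm_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * DURATION_S

# Size at the nominal low rate quoted in the text (5.3 kbps).
nominal_bytes = 5_300 * DURATION_S / 8

# Assumption (not in the patent text): the low rate is carried as one
# 20-byte frame per 30 ms of speech, which is roughly 5.33 kbps.
FRAME_BYTES = 20
FRAME_DURATION_S = 0.030
framed_bytes = DURATION_S / FRAME_DURATION_S * FRAME_BYTES

print(pcm_bytes)            # uncompressed size in bytes
print(nominal_bytes)        # size at exactly 5.3 kbps
print(round(framed_bytes))  # size under the assumed frame layout
```

Under either reading, the compressed speech is still on the order of a thousand bytes for two seconds of audio.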

Here, if the users taking part in the communication can clearly tell who is speaking, and it is therefore enough to know only what was said regardless of the speaker's tone, there is no need to use the compression coding methods described above. In other words, if only the content "안녕하십니까?" is recognized and converted to text, regardless of the speaker's tone, that content can be represented using only 12 bytes.
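The 12-byte figure is consistent with a two-byte-per-syllable Korean character encoding applied to the six syllables of the greeting. The sketch below assumes EUC-KR, which the patent does not name, and shows the UTF-8 size for comparison. The variable names are hypothetical.

```python
# The six-syllable greeting from the text, without the question mark.
greeting = "안녕하십니까"

# Assumption: a 2-byte-per-syllable encoding such as EUC-KR.
euc_kr_len = len(greeting.encode("euc-kr"))

# Including the ASCII question mark adds one more byte.
euc_kr_len_with_mark = len((greeting + "?").encode("euc-kr"))

# For comparison, UTF-8 spends 3 bytes on each Hangul syllable.
utf8_len = len(greeting.encode("utf-8"))

print(euc_kr_len, euc_kr_len_with_mark, utf8_len)
```

Even in the larger UTF-8 form, the text is roughly two orders of magnitude smaller than the compressed speech figure of about 1,333 bytes.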

An object of the present invention is to provide an audio codec of a voice communication system that recognizes a speaker's voice, converts it into text data with a small data amount, transmits the text data, and converts the transmitted text data into artificial speech for output, thereby enabling voice communication over a communication channel with a small bandwidth.

FIG. 1 is a block diagram of a voice communication system according to the present invention.

<Description of reference numerals for the main parts of the drawing>

11: microphone    12: voice recognizer

13: voice generator    14: speaker

To achieve this object, an audio codec according to the present invention, in a voice communication system, includes an encoder that recognizes an input voice signal, converts it into text data, and outputs the text data, and a decoder that receives the text data transmitted from the encoder and generates and outputs, from the received text data, an artificial voice signal having a preset tone.

Hereinafter, the present invention is described in detail with reference to the accompanying drawing.

FIG. 1 shows the configuration of a voice communication system according to the present invention.

In FIG. 1, the microphone 11 receives a voice signal from the user. The voice recognizer 12 receives the voice signal from the microphone 11 and outputs text data. The text data is transmitted over the PSTN, the ordinary telephone network. The voice generator 13 on the receiving side generates a voice signal from the received text data and outputs the generated voice signal through the speaker 14.

A user of the system of FIG. 1, configured as described above, speaks into the microphone 11 during a video conference. The user's voice signal input through the microphone 11 is applied to the voice recognizer 12. The voice recognizer 12 converts the input signal into codes that the system can recognize. The voice data converted into codes, that is, the voice data in text form, is transmitted over the PSTN to the other party's video terminal. Because the amount of text data is far smaller than the amount produced by compressing the voice signal before text conversion, it can be transmitted over a communication channel with a small bandwidth.

Meanwhile, the decoder receives the text data transmitted over the PSTN and restores the received data to a voice signal. That is, the voice generator 13 converts the received text data into an artificial voice signal having a preset tone and outputs it. The voice signal produced by the voice generator 13 is output through the speaker 14. The tone input through the microphone 11 differs from the tone output through the speaker 14, but what the speaker said is conveyed intact to the other users.
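The encoder and decoder roles described above can be outlined in software as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the patent describes dedicated speech recognition and speech synthesis ICs, so `fake_recognizer` and `fake_synthesizer` are hypothetical stand-ins, the class names are invented for illustration, and EUC-KR is assumed as the text encoding.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins for the voice recognizer (12) and voice generator (13).
Recognizer = Callable[[bytes], str]
Synthesizer = Callable[[str], bytes]


@dataclass
class TextEncoder:
    """Encoder side: recognize speech and emit text data for transmission."""
    recognize: Recognizer
    encoding: str = "euc-kr"  # assumed text encoding, not named in the patent

    def encode(self, pcm: bytes) -> bytes:
        return self.recognize(pcm).encode(self.encoding)


@dataclass
class TextDecoder:
    """Decoder side: turn received text data into artificial speech."""
    synthesize: Synthesizer
    encoding: str = "euc-kr"

    def decode(self, payload: bytes) -> bytes:
        return self.synthesize(payload.decode(self.encoding))


def fake_recognizer(pcm: bytes) -> str:
    # Placeholder: a real recognizer would analyze the audio samples.
    return "안녕하십니까"


def fake_synthesizer(text: str) -> bytes:
    # Placeholder: 2,400 silent 16-bit samples per character, standing in
    # for speech generated in a preset tone.
    return b"\x00\x00" * 2_400 * len(text)


# Two seconds of 8 kHz, 16-bit audio (silence here) from the microphone (11).
mic_input = bytes(32_000)
payload = TextEncoder(fake_recognizer).encode(mic_input)       # sent over the network
speaker_output = TextDecoder(fake_synthesizer).decode(payload)  # to the speaker (14)

print(len(mic_input), len(payload), len(speaker_output))
```

Only `payload` crosses the network, so the link carries a handful of bytes per utterance while the audio on each side stays local. The output audio is unrelated to the input waveform, which mirrors the point in the text that the tone changes while the content is preserved.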

The voice recognizer 12 on the encoding side and the voice generator 13 on the decoding side can be implemented by building into the system a speech recognition IC and a speech synthesis IC, both of which have seen considerable development progress in recent years.

The audio codec according to the present invention thus has the effect of enabling voice communication over a communication line with a small bandwidth.

Claims (3)

1. An audio codec in a voice communication system, comprising: an encoder that recognizes an input voice signal, converts it into text data, and outputs the text data; a decoder that receives the text data transmitted from the encoder and generates and outputs, from the received text data, an artificial voice signal having a preset tone; and a network for establishing a communication connection between the encoder and the decoder.

2. The audio codec according to claim 1, wherein the encoder comprises a speech recognition IC.

3. The audio codec according to claim 1, wherein the decoder comprises a speech synthesis IC.
KR1019970005111A 1997-02-20 1997-02-20 Audio Codec of Voice Communication System Expired - Fee Related KR100233532B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1019970005111A KR100233532B1 (en) 1997-02-20 1997-02-20 Audio Codec of Voice Communication System

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1019970005111A KR100233532B1 (en) 1997-02-20 1997-02-20 Audio Codec of Voice Communication System

Publications (2)

Publication Number Publication Date
KR19980068496A KR19980068496A (en) 1998-10-26
KR100233532B1 true KR100233532B1 (en) 1999-12-01

Family

ID=19497504

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1019970005111A Expired - Fee Related KR100233532B1 (en) 1997-02-20 1997-02-20 Audio Codec of Voice Communication System

Country Status (1)

Country Link
KR (1) KR100233532B1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119207421B (en) * 2024-09-23 2025-09-12 东南大学 A method and system for speech recognition and cloning semantic speech transmission

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR930024395A (en) * 1992-05-27 1993-12-22 정용문 Automated response system using phoneme conversion method

Also Published As

Publication number Publication date
KR19980068496A (en) 1998-10-26

Similar Documents

Publication Publication Date Title
US8340959B2 (en) Method and apparatus for transmitting wideband speech signals
US8259629B2 (en) System and method for transmitting and receiving wideband speech signals with a synthesized signal
KR19980701696A (en) Method and apparatus for detection and bypass of tandem vocoding
WO2007140724A1 (en) A method and apparatus for transmitting and receiving background noise and a silence compressing system
US7177801B2 (en) Speech transfer over packet networks using very low digital data bandwidths
WO2010059342A1 (en) Apparatus and method for encoding at least one parameter associated with a signal source
KR100233532B1 (en) Audio Codec of Voice Communication System
JPH1049199A (en) Silence compressed voice coding and decoding device
Decina et al. CCITT standards on digital speech processing
Kitawaki et al. Speech coding technology for ATM networks
CN215868635U (en) A Narrowband Channel Oriented Voice Communication System
JP4333005B2 (en) Speech encoding / decoding device, speech encoding device, and encoding method
JPH06216779A (en) Communication device
JPS60107933A (en) Adpcm encoding device
JPH06216860A (en) Voice communication device
JPH10145764A (en) Speaker detection method and multipoint video conference device
KR960003626B1 (en) Decoding method of deaf-coded audio signal
EP1220202A1 (en) System and method for coding and decoding speaker-independent and speaker-dependent speech information
KR100400720B1 (en) Method for transfering data through internet
JP2000244949A (en) Dtmf signal transmitting device
JPH08307366A (en) Speech coding device
Ma et al. A solution to mix low bit-rate speech signal in decentralized multipoint conference
JPH0431457B2 (en)
JPH0434339B2 (en)
Nakatsui et al. Dual adaptive delta modulation for mobile voice channel and its DSP implementation

Legal Events

Date Code Title Description
A201 Request for examination
PA0109 Patent application

St.27 status event code: A-0-1-A10-A12-nap-PA0109

PA0201 Request for examination

St.27 status event code: A-1-2-D10-D11-exm-PA0201

R17-X000 Change to representative recorded

St.27 status event code: A-3-3-R10-R17-oth-X000

PG1501 Laying open of application

St.27 status event code: A-1-1-Q10-Q12-nap-PG1501

R18-X000 Changes to party contact information recorded

St.27 status event code: A-3-3-R10-R18-oth-X000

PN2301 Change of applicant

St.27 status event code: A-3-3-R10-R13-asn-PN2301

St.27 status event code: A-3-3-R10-R11-asn-PN2301

E902 Notification of reason for refusal
PE0902 Notice of grounds for rejection

St.27 status event code: A-1-2-D10-D21-exm-PE0902

P11-X000 Amendment of application requested

St.27 status event code: A-2-2-P10-P11-nap-X000

P13-X000 Application amended

St.27 status event code: A-2-2-P10-P13-nap-X000

E701 Decision to grant or registration of patent right
PE0701 Decision of registration

St.27 status event code: A-1-2-D10-D22-exm-PE0701

GRNT Written decision to grant
PR0701 Registration of establishment

St.27 status event code: A-2-4-F10-F11-exm-PR0701

PR1002 Payment of registration fee

St.27 status event code: A-2-2-U10-U11-oth-PR1002

Fee payment year number: 1

PN2301 Change of applicant

St.27 status event code: A-5-5-R10-R13-asn-PN2301

St.27 status event code: A-5-5-R10-R11-asn-PN2301

PG1601 Publication of registration

St.27 status event code: A-4-4-Q10-Q13-nap-PG1601

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

PN2301 Change of applicant

St.27 status event code: A-5-5-R10-R13-asn-PN2301

St.27 status event code: A-5-5-R10-R11-asn-PN2301

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 4

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 5

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 6

PN2301 Change of applicant

St.27 status event code: A-5-5-R10-R13-asn-PN2301

St.27 status event code: A-5-5-R10-R11-asn-PN2301

PN2301 Change of applicant

St.27 status event code: A-5-5-R10-R13-asn-PN2301

St.27 status event code: A-5-5-R10-R11-asn-PN2301

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 7

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 8

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 9

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 10

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 11

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 12

FPAY Annual fee payment

Payment date: 20110830

Year of fee payment: 13

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 13

R18-X000 Changes to party contact information recorded

St.27 status event code: A-5-5-R10-R18-oth-X000

FPAY Annual fee payment

Payment date: 20120830

Year of fee payment: 14

PR1001 Payment of annual fee

St.27 status event code: A-4-4-U10-U11-oth-PR1001

Fee payment year number: 14

P22-X000 Classification modified

St.27 status event code: A-4-4-P10-P22-nap-X000

LAPS Lapse due to unpaid annual fee
PC1903 Unpaid annual fee

St.27 status event code: A-4-4-U10-U13-oth-PC1903

Not in force date: 20130914

Payment event data comment text: Termination Category : DEFAULT_OF_REGISTRATION_FEE

PC1903 Unpaid annual fee

St.27 status event code: N-4-6-H10-H13-oth-PC1903

Ip right cessation event data comment text: Termination Category : DEFAULT_OF_REGISTRATION_FEE

Not in force date: 20130914

P22-X000 Classification modified

St.27 status event code: A-4-4-P10-P22-nap-X000