JPH07175808A - Natural language processor - Google Patents
Natural language processor
Info
- Publication number
- JPH07175808A
- Authority
- JP
- Japan
- Prior art keywords
- semantic analysis
- verb
- sentence
- analysis result
- information
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
(57) [Abstract]
[Purpose] To complete omitted noun phrases on the basis of information from the whole of the preceding text.
[Constitution] The device has a group of dictionaries consisting of a word dictionary 106, a thesaurus dictionary 107 that includes a noun thesaurus dictionary 109 and a verb thesaurus dictionary 108, and an attribute dictionary 110 that stores attribute information of words. A semantic analysis unit 102 analyzes the plural sentences input from an input means 101 one sentence at a time on the basis of the dictionary group, producing a semantic analysis expressed as case relations between a core verb and 5W1H-equivalent phrases, and stores the result in a semantic analysis result storage unit 103. An omitted-information completion processing unit 104 detects, for each such case-relation expression, the case elements whose noun phrases are omitted. On the basis of the time-series relations of verbs, it extracts from the semantic analysis result storage unit 103 a verb that stands in a time-series relation to the target verb, and it completes the omitted words in the semantic analysis result on the basis of that verb's case elements.
Description
[0001]
[Field of Industrial Application] The present invention relates to a natural language processing device, and in particular to a natural language processing device that completes omitted expressions and can be used in information retrieval devices, machine translation devices, document summarization support devices, and the like.
[0002]
[Prior Art] Techniques for processing text composed of multiple sentences are generally called context processing techniques. They include techniques for identifying the antecedent of a demonstrative pronoun and techniques for handling words omitted because of context. Japanese in particular tends to omit nouns that are recoverable from context, which has been an obstacle to information retrieval, machine translation, document summarization, and similar tasks.
[0003] The following methods have been proposed as conventional techniques for handling context-dependent omission of words.
(1) A "context processing method" (Japanese Unexamined Patent Publication No. 4-220767) which, in natural-language information retrieval, analyzes an input sentence and completes an omitted noun phrase by detecting, in the input sentence, a noun phrase that is a sibling of a noun phrase in the preceding context.
(2) A "conversation system" (Japanese Unexamined Patent Publication No. 4-306769) which, when there are multiple objects that could be referred to, converts the natural-language input into a case frame representation, stores the relative distances between the objects, and identifies the referent from those relative distances.
[0004]
[Problems to Be Solved by the Invention] However, the context processing method described in (1) completes noun phrases by using a functional expression table, which is data focused on sibling relations between words, so the range of what can be completed is limited. Consequently, when an input noun has no sibling relation to a noun phrase in the preceding context, the omitted noun phrase cannot be completed for the displayed input character string.
[0005] The method described in (2) identifies the referent by using the relative distances between objects, but because it targets conversational text, the case frames it consults are limited to the most recent case frame representation in the conversation history. In real text, however, an omission may depend not only on the immediately preceding sentence but on the whole of the preceding text. Consequently, when the input sentence refers to an object in a case frame earlier than the most recent one, the omitted word cannot be completed.
[0006] In view of the above problems, an object of the present invention is to provide a natural language processing device that, for text consisting of multiple sentences on a given topic, can complete omitted noun phrases on the basis of information from the whole of the preceding text, without being limited to sibling relations between nouns or to the immediately preceding case frame.
[0007]
[Means for Solving the Problems] To solve the above problems, the present invention has the following configuration. That is, the present invention is a natural language processing device comprising: input means for inputting sentences; word information storage means for storing information about words; thesaurus information storage means that includes a noun-related word dictionary and a verb-related word dictionary and stores information about relations between words; attribute information storage means for storing attribute information of words; semantic analysis means for analyzing the plural sentences input from the input means one sentence at a time by referring to the word information, thesaurus information, and attribute information, and extracting the semantic analysis results; semantic analysis result storage means for storing the semantic analysis results produced by the semantic analysis means; omitted-expression completion processing means for completing omitted words in the semantic analysis results on the basis of the thesaurus information; and omission result output means for displaying the results of completion by the omitted-expression completion processing means.
[0008] In the present invention, the semantic analysis means may be one that analyzes meaning expressed as case relations between a core verb and 5W1H-equivalent phrases, namely WHO (subject), WHAT (object), WHEN (time), WHERE (place), WHY (reason), and HOW (method).
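The case-relation representation described here can be pictured as a small record: one core verb plus six optional slots. The sketch below is illustrative only; the class name `CaseFrame`, the lowercase slot keys, and the use of `None` for an omitted phrase are assumptions, not the patent's data format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# The six 5W1H slots named in the text. The lowercase keys are an assumption.
SLOTS = ("who", "what", "when", "where", "why", "how")


@dataclass
class CaseFrame:
    """One core verb and the 5W1H-equivalent phrases attached to it."""
    verb: str
    # None marks a slot whose noun phrase does not appear in the sentence.
    cases: Dict[str, Optional[str]] = field(
        default_factory=lambda: {slot: None for slot in SLOTS}
    )

    def omitted(self) -> List[str]:
        """Return the slots that have no noun phrase yet."""
        return [slot for slot in SLOTS if self.cases[slot] is None]


# Hand-built frame for "Company A developed a personal computer ...".
frame = CaseFrame("develop")
frame.cases["who"] = "Company A"
frame.cases["what"] = "personal computer"
```

Calling `frame.omitted()` lists the slots that the example sentence leaves empty, which is the information a completion step would start from.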
[0009] In the present invention, the verb-related word dictionary included in the thesaurus information storage means may be a dictionary that stores time-series relations between verbs, and the omitted-expression completion processing means may operate as follows. It reads the semantic analysis results to be processed one sentence at a time. For each case-relation expression between a core verb and 5W1H-equivalent phrases, it detects the case elements whose noun phrases are omitted. On the basis of the time-series relations of verbs read from the verb-related word dictionary, it extracts, from the semantic analysis results of preceding sentences, a verb that has a time-series relation to the verb of the sentence being processed. It identifies the omitted noun phrase on the basis of the case elements of the extracted verb, and it completes the omitted expression with that noun phrase.
[0010]
[Operation] In the natural language processing device of the present invention, the word information storage means stores information about the words used in semantic analysis, and the thesaurus information storage means stores information about the semantic relations between the words stored in the word information storage means. The thesaurus information storage means consists of a noun thesaurus dictionary and a verb thesaurus dictionary: the noun thesaurus dictionary stores information about hypernym-hyponym relations between nouns, and the verb thesaurus dictionary stores information about time-series relations between verbs. The attribute information storage means stores information about the attribute relations of the words stored in the word information storage means.
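One way to picture the two thesauri is as plain lookup tables. This is a minimal sketch under stated assumptions: the table names, the sample entries (including the intermediate node `company`), and the helper `is_a` are all hypothetical and are not taken from the patent's dictionaries.

```python
from typing import Dict, List

# Hypothetical noun thesaurus: each noun maps to its direct hypernym.
NOUN_HYPERNYM: Dict[str, str] = {
    "Company A": "company",
    "company": "organization",
    "personal computer": "product",
}

# Hypothetical verb thesaurus: each verb maps to the verbs that come
# immediately before it in time.
VERB_PRECEDED_BY: Dict[str, List[str]] = {
    "release": ["commercialize"],
    "commercialize": ["develop"],
}


def is_a(noun: str, category: str) -> bool:
    """True if `category` is `noun` itself or one of its hypernyms."""
    seen = set()
    current = noun
    while current not in seen:          # stop if the table contains a cycle
        if current == category:
            return True
        seen.add(current)
        if current not in NOUN_HYPERNYM:
            return False                # reached the top without a match
        current = NOUN_HYPERNYM[current]
    return False
```

With these toy entries, `is_a("Company A", "organization")` holds because the walk passes through `company`, whereas `personal computer` never reaches `organization`.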
[0011] The input means inputs the plural sentences one sentence at a time. The semantic analysis means first performs morphological analysis on each input sentence using the grammatical information in the word information storage means. It then performs semantic analysis using the predicate pattern information in the word information storage means, the thesaurus information storage means, and the attribute information storage means, and stores in the semantic analysis result storage means a result expressed as case relations between a core verb and 5W1H-equivalent phrases.
[0012] The omitted-information completion processing means reads the semantic analysis results to be processed one sentence at a time. It then detects, among the case relations between the core verb and the 5W1H-equivalent phrases, those in which the noun phrase is omitted. On the basis of the time-series relations of verbs in the verb thesaurus dictionary, it extracts a verb that has a time-series relation to the target verb from the semantic analysis results of the sentences preceding the target sentence. It then identifies the omitted noun phrase on the basis of that verb's case relations with its 5W1H (WHO, WHAT, WHEN, WHERE, WHY, HOW) equivalent phrases, and fills the omitted noun phrase into the target semantic analysis result. Finally, the completion result is output from the omitted-information completion result output means.
[0013] In this way, for text composed of multiple sentences, using a verb thesaurus dictionary that describes the time-series relations between verbs makes it possible to consult all preceding sentences when completing an omitted noun phrase, and thus to extract the structure of each sentence correctly.
[0014]
[Embodiment] An embodiment of the natural language processing device of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of one embodiment of the natural language processing device of the present invention. The natural language processing device of FIG. 1 consists of:
(1) an input unit 101 for inputting sentences;
(2) a semantic analysis unit 102 connected to the input unit;
(3) a word information storage unit 106, connected to the semantic analysis unit 102, that stores the information about words needed to analyze sentences;
(4) a thesaurus dictionary 107, connected to the semantic analysis unit 102 and to the omitted-expression completion processing unit 104 described below, that stores information about relations between words and consists of the noun thesaurus dictionary 108 and the verb thesaurus dictionary 109 described below;
(5) the noun thesaurus dictionary 108, part of the thesaurus dictionary 107, that stores information about relations between nouns;
(6) the verb thesaurus dictionary 109, likewise part of the thesaurus dictionary 107, that stores information about time-series relations between verbs;
(7) an attribute dictionary 110, connected to the semantic analysis unit 102, that stores predetermined attribute information;
(8) a semantic analysis result storage unit 103 connected to the semantic analysis unit 102;
(9) an omitted-expression completion processing unit 104, connected to the semantic analysis result storage unit 103, that completes omitted words in the semantic content representation on the basis of the thesaurus dictionary information; and
(10) an omitted-information completion result output unit 105 that displays the results of completion by the omitted-expression completion processing unit 104.
[0015] The operation of this embodiment is as follows. The input unit 101 reads one sentence of a multi-sentence text. The semantic analysis unit 102 performs morphological analysis using the grammatical information in the word information storage unit 106, then performs syntactic and semantic analysis using the semantic information in the word dictionary 106 and the thesaurus dictionary 107, and stores the result in the semantic analysis result storage unit 103. The omitted-information completion processing unit 104 reads the semantic analysis result from the semantic analysis result storage unit 103. Then, for each predicate in the result, as shown in the flowchart of FIG. 2, it uses the time-series relations in the verb thesaurus dictionary 109 to fill in, case element by case element, the omitted noun phrases from the semantic analysis results of preceding sentences. The omitted-information completion result output unit 105 outputs the result of the completion performed by the omitted-information completion processing unit 104. This processing is applied to every sentence.
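The per-sentence flow can be sketched as a loop over already analyzed sentences. Morphological and semantic analysis are out of scope here, so the input is a list of hand-written frames. `run_pipeline`, the `Frame` alias, and the toy completer `copy_who_from_first` are hypothetical names, and the completer is a placeholder that ignores the verb thesaurus. Whether the store keeps completed or uncompleted results is not stated in the text; this sketch keeps the uncompleted ones.

```python
from typing import Callable, Dict, List, Optional

Frame = Dict[str, Optional[str]]  # hypothetical shape: {"verb": ..., "who": ...}


def run_pipeline(
    analyzed: List[Frame],
    complete: Callable[[Frame, List[Frame]], Frame],
) -> List[Frame]:
    """Store each sentence's result, then complete it from earlier results."""
    store: List[Frame] = []      # plays the role of result storage unit 103
    outputs: List[Frame] = []    # what output unit 105 would show
    for frame in analyzed:       # one sentence at a time
        preceding = list(store)  # everything analyzed before this sentence
        store.append(frame)
        outputs.append(complete(frame, preceding))
    return outputs


def copy_who_from_first(frame: Frame, preceding: List[Frame]) -> Frame:
    """Toy completer (an assumption): borrow `who` from the first sentence."""
    filled = dict(frame)
    if filled.get("who") is None and preceding:
        filled["who"] = preceding[0].get("who")
    return filled


result = run_pipeline(
    [
        {"verb": "develop", "who": "Company A"},
        {"verb": "release", "who": None},
    ],
    copy_who_from_first,
)
```

The point of the sketch is the shape of the loop: the completer for sentence n sees every result stored for sentences before n, not only the most recent one.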
[0016] Next, the operation of the natural language processing device of FIG. 1 is described with reference to the flowchart of FIG. 2. First, the analysis result of the target sentence is read by the target semantic analysis result reading process (step S1), and it is determined whether any sentence precedes the target sentence (step S2). If one does, then for each verb in the target sentence, the cases whose 5W1H-equivalent phrases, linked to the verb by case relations, are undetermined are detected (step S3). Next, the verb thesaurus reference process looks up the time-series relations of each verb (step S4), the verbs in the semantic analysis results of the sentences preceding the target sentence are consulted in turn (step S5), and the two are compared (step S6). If they match, the case elements of the semantic analysis result containing that verb are consulted (step S7).
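Steps S4 to S7 amount to a backward search over earlier results. A minimal sketch, assuming each stored result is a dict with a `verb` key and that the thesaurus relates `release` directly to `develop` (a simplification; `VERB_RELATED` and `find_related_frame` are hypothetical names):

```python
from typing import Dict, List, Optional

# Hypothetical verb thesaurus: verb -> verbs in a time-series relation with it.
VERB_RELATED: Dict[str, List[str]] = {"release": ["develop"]}


def find_related_frame(
    target_verb: str, preceding: List[dict]
) -> Optional[dict]:
    """Scan earlier results, nearest sentence first, for a related verb."""
    related = VERB_RELATED.get(target_verb, [])   # S4: look up relations
    for frame in reversed(preceding):             # S5: consult earlier verbs
        if frame["verb"] in related:              # S6: compare
            return frame                          # S7: use its case elements
    return None


earlier = [
    {"verb": "develop", "who": "Company A"},
    {"verb": "announce", "who": "Company B"},
]
match = find_related_frame("release", earlier)
```

In this toy input the nearest sentence (`announce`) is skipped and the match comes from the sentence before it, which is the case the prior-art method limited to the latest case frame could not handle.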
[0017] Those case elements are then compared with the undetermined case elements in the semantic analysis result of the target sentence (step S8). If the case elements match, the noun constraints of both elements are consulted (step S9), and the noun thesaurus dictionary is used to determine whether the two noun phrases stand in a hypernym-hyponym relation (step S10). If they do, the content is written into the information field of the undetermined case element (step S11). If the comparison of step S7 above finds no match, the verb reference process is applied to a still earlier sentence. The verb reference process consults the sentences preceding the target sentence in order, starting from the immediately preceding one, so that the whole of the preceding text is searched.
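Steps S8 to S11 can be sketched as a guarded copy from the earlier frame into the target frame. The names `HYPERNYM`, `satisfies`, `fill_case`, and the constraint table `limits` are assumptions for illustration, and the thesaurus holds a single made-up entry:

```python
from typing import Dict

# Hypothetical noun thesaurus: noun -> direct hypernym.
HYPERNYM: Dict[str, str] = {"Company A": "organization"}


def satisfies(noun: str, constraint: str) -> bool:
    """S10: walk up the hypernym chain looking for the constraint."""
    current = noun
    for _ in range(len(HYPERNYM) + 1):   # bounded walk, safe against cycles
        if current == constraint:
            return True
        if current not in HYPERNYM:
            return False
        current = HYPERNYM[current]
    return False


def fill_case(
    target: dict, source: dict, slot: str, constraints: Dict[str, str]
) -> bool:
    """Copy `slot` from source into target if the noun constraint holds."""
    candidate = source.get(slot)
    if target.get(slot) is not None or candidate is None:   # S8
        return False
    if not satisfies(candidate, constraints[slot]):         # S9, S10
        return False
    target[slot] = candidate                                # S11
    return True


release = {"verb": "release", "who": None, "what": None}
develop = {"verb": "develop", "who": "Company A", "what": "personal computer"}
limits = {"who": "organization", "what": "product"}
filled_who = fill_case(release, develop, "who", limits)
filled_what = fill_case(release, develop, "what", limits)
```

Here the `who` slot is filled because `Company A` reaches `organization` in the toy thesaurus, while `what` stays empty because no entry links `personal computer` to `product`.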
[0018] Next, the omitted-information completion process of the natural language processing device of the present invention is described with reference to FIGS. 3, 4, and 5. As shown in FIG. 3, example sentence 1 is "Company A developed a personal computer equipped with the i486." (hereinafter, sentence 31), and example sentence 2 is "Will release in Japan in March 1992." (hereinafter, sentence 32). Because the preceding sentence 31 states explicitly that Company A developed a personal computer, sentence 32 omits both the "who" (WHO) noun phrase and the "what" (WHAT) noun phrase. From context, however, the time-series relation is clear: a thing is "developed" and then that thing is "released". The noun phrases omitted from sentence 32 can therefore be completed.
[0019] The omission completion process of this embodiment is now described concretely. In this embodiment, the semantic analysis unit processes sentence 31 using the word dictionary and the thesaurus dictionary, and the semantic analysis result 33 is stored in the semantic analysis result storage unit. FIG. 4 shows an example of the predicate pattern information in the word dictionary used for semantic analysis. A predicate pattern represents the case relations that a verb such as "develop" 35 can take, together with the constraints on the nouns that fill them. The shaded case elements in the semantic analysis result 33 of sentence 31 in FIG. 4 are not extracted by the semantic analysis of sentence 31; they represent case relations that the verb "develop" can take. For case elements like these, for which the sentence supplies no information, a noun phrase is stored provisionally on the basis of the predicate pattern information in the word dictionary. Sentence 32 is processed in the same way, and its result 34 is stored.
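The role of the predicate pattern can be sketched as a template that supplies a provisional value for every case the sentence does not fill. The pattern contents, the `confirmed` flag, and the function name `build_frame` are assumptions; FIG. 4 is not reproduced in this text, so the actual pattern may differ.

```python
from typing import Dict

# Hypothetical predicate pattern: for each verb, the cases it can take,
# paired with the constraint on the noun that may fill each case.
PATTERNS: Dict[str, Dict[str, str]] = {
    "develop": {"who": "organization", "what": "product", "when": "time"},
}


def build_frame(verb: str, extracted: Dict[str, str]) -> Dict[str, dict]:
    """Use extracted phrases where present, else a provisional constraint."""
    frame = {}
    for case, constraint in PATTERNS[verb].items():
        if case in extracted:
            frame[case] = {"value": extracted[case], "confirmed": True}
        else:
            # Nothing in the sentence fills this case: keep the constraint
            # as a placeholder and mark the case as undetermined.
            frame[case] = {"value": constraint, "confirmed": False}
    return frame


frame_31 = build_frame(
    "develop", {"who": "Company A", "what": "personal computer"}
)
```

The unconfirmed entries correspond to the shaded case elements mentioned in the text: they are known to be possible for the verb but were not found in the sentence.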
[0020] Next, the processing result 33 of sentence 31 is read by the target result reading process, and it is checked whether any sentence precedes the target sentence. If there is no preceding sentence, there is nothing to consult, so no completion is performed. The analysis result 34 of sentence 32 is read in the same way. If there is a preceding sentence, the case elements that are undetermined are detected among the case relations between the core verb and the 5W1H-equivalent phrases. For example, in the semantic analysis result 34, the undetermined "who" case is detected among the case relations of the core verb "release" 36. Next, the verb thesaurus dictionary reference process looks up the time-series relations of the verb "release" 36 and extracts "commercialize", which has a time-series relation to "release".
[0021] FIG. 5 shows part of the verb thesaurus information consulted in this embodiment. Next, the verb of the semantic analysis result 33 of the preceding sentence is consulted. Because that verb is not "commercialize", the time-series relations in the verb thesaurus are consulted again, "develop", which has a time-series relation to "commercialize", is retrieved, and the verb comparison is repeated. If the verb of the semantic analysis result 33 is the same as "develop", the case elements of the semantic analysis result 33 containing the verb "develop" 35 are consulted on the basis of the case pattern information of the verb thesaurus. For example, the "who" case of "develop" 35 is the same case as the undetermined "who" case in the semantic analysis result 34, so the noun-phrase constraints of the two case elements are consulted next. Although the details are not shown in this embodiment, consulting the noun thesaurus information shows that the noun phrase "Company A" in the semantic analysis result 33 and the noun constraint "organization" in the semantic analysis result 34 stand in a hypernym-hyponym relation, so "Company A" is written into the information field of the "who" case of "release" 36. In the same way, the omitted information of the undetermined case elements is completed for every verb, and the processing result 37 is output from the omitted-information completion result output unit.
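The two-hop lookup in this walkthrough (release, then commercialize, then develop) can be sketched as following a chain until a verb from an earlier sentence is found. `PRECEDED_BY` and `resolve` are hypothetical names, and the single-predecessor table is a simplification of whatever FIG. 5 contains.

```python
from typing import Dict, List, Optional

# Hypothetical excerpt of the verb thesaurus: verb -> verb that precedes it.
PRECEDED_BY: Dict[str, str] = {
    "release": "commercialize",
    "commercialize": "develop",
}


def resolve(target_verb: str, preceding_verbs: List[str]) -> Optional[str]:
    """Follow the time-series chain until a verb seen earlier is reached."""
    seen = set()
    verb = PRECEDED_BY.get(target_verb)
    while verb is not None and verb not in seen:
        if verb in preceding_verbs:
            return verb              # this verb's frame supplies the cases
        seen.add(verb)               # guard against cycles in the table
        verb = PRECEDED_BY.get(verb)
    return None
```

For the example text, `resolve("release", ["develop"])` first tries `commercialize`, finds no such verb among the earlier sentences, and then reaches `develop`.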
[0022]
[Effects of the Invention] As described above, the natural language processing device of the present invention has a verb thesaurus dictionary describing the time-series relations between verbs, a noun thesaurus dictionary describing the relations between nouns, and processing means that completes omitted information by consulting the results of linguistic analysis and both thesauri. Within a text made up of multiple sentences, omitted information can therefore be completed using sentences from the whole context, not only the immediately preceding sentence. This makes it possible to extract the structure of each sentence correctly and, on that basis, to extract correctly the structure of a text composed of multiple sentences.
[FIG. 1] A block diagram showing the configuration of one embodiment of the natural language processing device according to the present invention.
[FIG. 2] A flowchart explaining the omitted-information completion processing performed by the natural language processing device of FIG. 1.
[FIG. 3] An explanatory diagram showing a concrete example of the omitted-information completion processing.
[FIG. 4] An explanatory diagram showing an example of the predicate pattern information in the word dictionary.
[FIG. 5] An explanatory diagram showing part of the contents of the verb thesaurus dictionary.
[Explanation of Reference Signs]
101 Input unit
102 Semantic analysis unit
103 Semantic analysis result storage unit
104 Omitted-information completion processing unit
105 Omitted-information completion result output unit
106 Word dictionary
107 Thesaurus dictionary
108 Verb thesaurus dictionary
109 Noun thesaurus dictionary
110 Attribute dictionary
Claims (3)
1. A natural language processing device comprising: input means for inputting sentences; word information storage means for storing information about words; thesaurus information storage means that includes a noun-related word dictionary and a verb-related word dictionary and stores information about relations between words; attribute information storage means for storing attribute information of words; semantic analysis means for analyzing the plural sentences input from the input means one sentence at a time by referring to the word information, thesaurus information, and attribute information, and extracting the semantic analysis results; semantic analysis result storage means for storing the semantic analysis results produced by the semantic analysis means; omitted-expression completion processing means for completing omitted words in the semantic analysis results on the basis of the thesaurus information; and omission result output means for displaying the results of completion by the omitted-expression completion processing means.
2. The natural language processing device according to claim 1, wherein the semantic analysis means analyzes meaning expressed as case relations between a core verb and 5W1H-equivalent phrases consisting of WHO (subject), WHAT (object), WHEN (time), WHERE (place), WHY (reason), and HOW (method).
3. The natural language processing device according to claim 1 or 2, wherein the verb-related word dictionary included in the thesaurus information storage means is a verb-related word dictionary that stores time-series relations between verbs, and the omitted-expression completion processing means reads the semantic analysis results to be processed one sentence at a time, detects, for each case-relation expression between a core verb and 5W1H-equivalent phrases, the case elements whose noun phrases are omitted, extracts from the semantic analysis results of preceding sentences, on the basis of the time-series relations of verbs read from the verb-related word dictionary, a verb that has a time-series relation to the verb of the sentence being processed, identifies the omitted noun phrase on the basis of the case elements of the extracted verb, and completes the omitted expression with that noun phrase.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP31851093A JP3300142B2 (en) | 1993-12-17 | 1993-12-17 | Natural language processor |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP31851093A JP3300142B2 (en) | 1993-12-17 | 1993-12-17 | Natural language processor |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| JPH07175808A (en) | 1995-07-14 |
| JP3300142B2 (en) | 2002-07-08 |
Family
ID=18099924
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP31851093A Expired - Fee Related JP3300142B2 (en) | 1993-12-17 | 1993-12-17 | Natural language processor |
Country Status (1)
| Country | Link |
|---|---|
| JP (1) | JP3300142B2 (en) |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH09179875A (en) * | 1995-12-25 | 1997-07-11 | Sharp Corp | Information retrieval device |
| US6338034B2 (en) | 1997-04-17 | 2002-01-08 | Nec Corporation | Method, apparatus, and computer program product for generating a summary of a document based on common expressions appearing in the document |
| JP2003519841A (en) * | 1999-12-22 | 2003-06-24 | キム,チュンテ | Information modeling method and method of performing search using database constructed by information modeling |
| US7325010B1 (en) | 1999-12-22 | 2008-01-29 | Chungtae Kim | Information modeling method and database searching method using the information modeling method |
| JP2013232098A (en) * | 2012-04-27 | 2013-11-14 | Nippon Hoso Kyokai <Nhk> | Information processing device and program |
| WO2018066258A1 (en) * | 2016-10-06 | 2018-04-12 | シャープ株式会社 | Dialog device, control method of dialog device, and control program |
| CN109791766A (en) * | 2016-10-06 | 2019-05-21 | 夏普株式会社 | Interface, the control method of Interface and control program |
| JPWO2018066258A1 (en) * | 2016-10-06 | 2019-09-05 | シャープ株式会社 | Interactive device, interactive device control method, and control program |
Also Published As
| Publication number | Publication date |
|---|---|
| JP3300142B2 (en) | 2002-07-08 |
Similar Documents
| Publication | Title |
|---|---|
| US7243305B2 (en) | Spelling and grammar checking system |
| US10296584B2 (en) | Semantic textual analysis |
| EP1899835B1 (en) | Processing collocation mistakes in documents |
| US5890103A (en) | Method and apparatus for improved tokenization of natural language text |
| US5020021A (en) | System for automatic language translation using several dictionary storage areas and a noun table |
| KR100515641B1 (en) | Method for sentence structure analysis based on mobile configuration concept and method for natural language search using of it |
| EP0886226A1 (en) | Linguistic search system |
| US20120035905A1 (en) | System and method for handling multiple languages in text |
| JP2002215617A (en) | Method for attaching part of speech tag |
| WO1997004405A9 (en) | Method and apparatus for automated search and retrieval processing |
| JPH0644296A (en) | Machine translation device |
| JPH0242572A (en) | How to generate and maintain a co-occurrence relationship dictionary |
| JPH083815B2 (en) | Natural language co-occurrence relation dictionary maintenance method |
| US8489384B2 (en) | Automatic translation method |
| JP3300142B2 (en) | Natural language processor |
| US7440890B2 (en) | Systems and methods for normalization of linguisitic structures |
| JPS59140582A (en) | Natural language translation assisting system |
| JPH05298349A (en) | Co-occurrence relation knowledge learning method, its system, and co-occurrence relation dictionary and its use method |
| Neumann et al. | Shallow natural language technology and text mining |
| JP3222173B2 (en) | Japanese parsing system |
| JP2840258B2 (en) | Method of creating bilingual dictionary and co-occurrence dictionary for machine translation system |
| Sinha | Translating News Headings from English to Hindi |
| Branco et al. | EtiFac: A facilitating tool for manual tagging |
| JPH1115846A (en) | Information retrieval device and recording medium |
| JPH0973454A (en) | Document creating apparatus and document creating method |
Legal Events
| Code | Title | Description |
|---|---|---|
| FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20080419; Year of fee payment: 6 |
| FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20090419; Year of fee payment: 7 |
| FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20100419; Year of fee payment: 8 |
| FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20110419; Year of fee payment: 9 |
| FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20120419; Year of fee payment: 10 |
| FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20130419; Year of fee payment: 11 |
| LAPS | Cancellation because of no payment of annual fees | |