[go: up one dir, main page]

CN104156353B - A kind of method and apparatus of computer based natural language syntactic structure parsing - Google Patents

A kind of method and apparatus of computer based natural language syntactic structure parsing Download PDF

Info

Publication number
CN104156353B
CN104156353B CN201410419634.0A CN201410419634A CN104156353B CN 104156353 B CN104156353 B CN 104156353B CN 201410419634 A CN201410419634 A CN 201410419634A CN 104156353 B CN104156353 B CN 104156353B
Authority
CN
China
Prior art keywords
vector
syntax
unit
predicate verb
predicate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410419634.0A
Other languages
Chinese (zh)
Other versions
CN104156353A (en
Inventor
秦男
秦一男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN201410419634.0A priority Critical patent/CN104156353B/en
Publication of CN104156353A publication Critical patent/CN104156353A/en
Priority to PCT/CN2015/083760 priority patent/WO2016026359A1/en
Application granted granted Critical
Publication of CN104156353B publication Critical patent/CN104156353B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

Disclose a kind of method and apparatus of computer based natural language syntactic structure parsing.The present invention is according to the mathematical principle of subject and corresponding computer technologies such as Abstract Algebra, set theory, Combinational Mathematics, computability theory and computational linguistics, with the mathematical thought of compound function, natural language syntactic structure parsing is carried out by setting up matrix model and linear model, construction recursive function;Meanwhile, the method such as integrated use mathematical induction is proved important conclusion.The present invention establishes a set of brand-new mathematical modeling for the sentence of natural language, compared with conventional conventional method, there is fundamental difference in thinking.The present invention creatively proposes unilateral order-preserving in the same direction and the one side overall plug hole method of two kinds of not order-preserving in the same direction, and creatively with the method for generating processing syntactic constituent arranged side by side of set race.The present invention takes full advantage of mathematics and Computer Subject rule, and methods described accuracy is higher, and operand is very big, there is certain technical difficulty.

Description

A kind of method and apparatus of computer based natural language syntactic structure parsing
Technical field
The present invention relates to field of computer data processing, and in particular to a kind of computer based natural language syntactic structure The method and apparatus of parsing.
Background technology
Natural language processing is an important directions in computer science and artificial intelligence field.It is studied can be real The existing various theoretical and methods for carrying out efficient communication between people and computer using natural language.
Syntactic structure parsing is an importance of natural language processing, and it is by computer to natural language sentence Sentence element carries out automatic division to aid in the further processing for sentence.In existing syntactic structure analytic technique, lead to Frequently with probability context without bounding algorithm (Probabilistic Context Free Grammars, PCFG), it is based on certainly The characteristics of right language has complicated nesting, computing statement and the rule match probability of syntactic structure analysis result, choose probability Maximum syntax analysis result is used as final syntactic structure.
But, this method complexity is high, moreover, the parsing accuracy for combined type sentence structure is also urgently further carried It is high.
The content of the invention
In view of this, the invention provides a kind of method of computer based natural language syntactic structure parsing and dress Put, design is unique, method is ingenious, demonstration is full and accurate, takes full advantage of the rule of mathematics and Computer Subject, methods described accuracy Higher, operand is very big, there is certain technical difficulty.
The present invention provides a kind of method of computer based natural language syntactic structure parsing, including:
In S1, reading pretreated phrase data structure to be resolved, the pretreated phrase data structure only Conjunctive word unit arranged side by side, subordinate conjunctive word unit, predicate verb unit, noun pronoun unit including sentence, and each word unit It is numbered according to the order in the pretreated sentence, and marking types;
S2, to each predicate verb unit, generate corresponding leading question element, subject element, predicate element and object member Element;
Arranged side by side conjunctive word of the possibility value of the leading question element for numbering less than corresponding predicate verb element number One of unit or subordinate conjunctive word unit, or the conjunctive word list arranged side by side by a numbering less than corresponding predicate verb element number Member and an adjacent thereto and numbering are less than corresponding predicate verb element number and numbering is more than conjunctive word unit volume arranged side by side Number one of the conjunctive word mix vector that constitutes of subordinate conjunctive word unit, or dummy cell;
Noun pronoun unit of the possibility value of the subject element for numbering less than corresponding predicate verb element number One of, or most major term unit numbering less than corresponding predicate verb element number all coordinate noun pronoun mix vector races Included in one of coordinate noun pronoun mix vector, or in one of corresponding syntax vector of predicate element of preceding appearance, or Dummy cell;
The predicate element is the corresponding predicate verb unit;
The possibility value numbering of the object element is more than corresponding predicate verb element number and less than adjacent rear One of noun pronoun unit of the predicate verb element number of appearance, or the numbering of minimum word unit are more than corresponding predicate verb Element number and less than the adjacent predicate verb element number in rear appearance all coordinate noun pronoun mix vector races in Comprising one of coordinate noun pronoun mix vector, or in one of corresponding syntax vector of predicate element of rear appearance, or empty Unit;
S3, the possibility value according to the leading question element, subject element, predicate element and object element, are obtained each Predicate verb unit corresponding syntax vector is possible to value, the syntax vector including leading question element, subject element, Predicate element and object element;
S4, according to all syntaxes vector be possible to value, generate at least one syntactic structure possibility matrix solution, it is described The possible matrix solution of syntactic structure according to the tactic syntax vector of predicate verb element number by constituting;
S5, checking according to syntactic structure may the obtained sentence of matrix solution whether with the pretreated complete phase of sentence Together, if identical, using the syntactic structure may be in matrix solution each syntax vector as syntactic structure analysis result it One;
Wherein, S5 includes performing following operation successively in order, and excluding ineligible syntactic structure may solve:
S5.1, if there is the sequence valve that may do not occur in the syntactic structure in matrix solution, then exclude the syntactic structure Possible matrix solution;
If S5.2, identical sequence valve occur in different syntax vectors or identical syntax vector occur, arrange Except the syntactic structure may matrix solution;
S5.3, in each possible matrix solution, will other syntaxes vector between exist mutually substitute into relation syntax Vector all carries out equivalent substitution, if occurring the intersection contradiction of two syntax vectors after equivalent substitution, excludes the sentence Method structure may matrix solution;
S5.4, in each possible matrix solution, will other syntaxes vector between exist mutually substitute into relation syntax Vector all carries out equivalent substitution, if occurring the converse sequence valve in two positions after equivalent substitution, excludes the syntax Structure may matrix solution;
S5.5, in any one possible matrix solution, if there is other syntaxes vector between without mutually substitute into close The syntax vector of system, then perform plug hole operation to obtain the possibility syntax analytic structure corresponding to all possible matrix solutions, and Whether the sentence that checking is obtained according to the possible syntax analytic structure is identical with the pretreated sentence, and it enters one Step includes:
S5.5.1, first to exist each other in the possible matrix solution syntax of substitution relation vector carry out equivalent substitution, So as to which the possible matrix solution is converted into one group of syntax vector each other in the absence of substitution relationWill Syntax vector in possible matrix solution is referred to as first kind syntax vector, the syntax vector that conversion is obtainedClaim For Equations of The Second Kind syntax vector;
S5.5.2, appoint and take Equations of The Second Kind syntax vectorMark one by one in the predetermined directionIn each sentence The sequence valve of method element;After the sequence valve for marking syntax elements, appoint and takeIn i-th of syntax elements, only the syntax member The unique room of the first side structure of element;Make after sky, appoint and take a syntax vectorEquations of The Second Kind syntax vector in additionWith The mode of overall plug hole is vectorial by syntaxThe constructed room of insertion, and then a new syntax vector is generated, this is new Syntax vector is designated asAnd by syntax vector obtained from overall plug hole, it is referred to as the 3rd class syntax vector;
S5.5.3, to the 3rd class syntax vectorIn the predetermined direction to from vectorIn The syntax elements of the first side first start to vectorIn the vector that includesThe syntax member of the second side first Each syntax elements untill element, all mark sequence valve;Positioned at vectorIn the vector that includesFirst side Element, do not mark sequence valve;By vectorFirst syntax elements of the second side be designated asWill in the manner previously described To vectorThe syntax vector portion of mark, is designated as whipping syntax vectorAfter mark sequence valve, Appoint j-th of the syntax elements taken in a foregoing whipping vector, only in the unique room of the side structure of element first;Make sky Afterwards, appoint and take an original Equations of The Second Kind syntax vectorBy syntax vector in the way of overall plug holeInsertion is constructed Room, and then generate a new syntax vector, be then designated as newly-generated syntax vectorOr
3rd class syntax vectorAccording to predetermined direction, distich normal vectorIn each Syntax elements all mark sequence valve;After the sequence valve for marking syntax elements, appoint and take oneIn t-th Method element, in the unique room of the first side structure of the syntax elements;After making sky, appoint take one to be not used Equations of The Second Kind sentence Normal vectorBy the vector in the way of overall plug holeThe room that insertion is above constructed, and then generate a new vector, then should New vector is designated as
S5.5.4, S5.5.3 is repeated, when the last time, which makes empty and plug hole step, to be terminated, to by last Make empty and the 3rd class syntax vector obtained from plug hole step carry out next time make the operation of empty and plug hole, until by all second Class syntax vectorWhole plug holes are finished, and finally obtain the 3rd class syntax vector of single file, will it is described it is last must The 3rd class syntax vector arrived is referred to as final single file vector;
If having two in the corresponding all final single files vectors of S5.5.5, a possible syntax analytic structure The converse sequence valve in position, then exclude the possible syntax analytic structure;
S5.5.6, repeat S5.5.2 to S5.5.5 until all possible syntax analytic structures be traversed.
Further, S2 includes generation coordinate noun pronoun mix vector race:
S2.1 chooses unduplicated two nouns pronoun unit:
If A, between the two noun pronoun units there is no other word units, by the two noun pronoun units As a coordinate noun pronoun mix vector, and retain the coordinate noun pronoun mix vector;
If B, between the two noun pronoun units exist other word units, check between the two noun generations Each word unit between word unit:If any one word unit between the two noun pronoun units, all It is noun pronoun unit or conjunctive word unit arranged side by side, then by two selected noun pronoun units and between the two noun generations All word units between word unit as a coordinate noun pronoun mix vector, and retain the coordinate noun pronoun combine to Amount;Otherwise, coordinate noun pronoun mix vector is not generated;
S2.2 performs S2.1 again until the combination of all noun pronoun units is traversed, and it is all that generation is obtained Coordinate noun pronoun mix vector;
If there is coordinate noun pronoun mix vector in the S2.3 possible syntax analytic structures, to all coordinate nouns Pronoun mix vector is divided, so as to form several coordinate noun pronoun mix vector races so that:In each name arranged side by side In word pronoun mix vector race, each coordinate noun pronoun included in the coordinate noun pronoun mix vector race combine to Amount all contains two common noun pronoun units.
S2.4 chooses the volume included in all noun pronoun mix vectors in each noun pronoun mix vector race Number maximum word unit, as the most major term unit of the noun pronoun mix vector race, in case being used when being subsequently generated subject;Choosing The word unit for taking the numbering included in all noun pronoun mix vectors minimum, as the noun pronoun mix vector race most Small word unit, in case being used when being subsequently generated object.
Further, generating corresponding subject element includes:
When corresponding predicate verb element number is minimum predicate verb element number, the possibility of the subject element Value is small less than one of noun pronoun unit of corresponding predicate verb element number, or the numbering of its most major term unit for numbering Coordinate noun pronoun group included in all coordinate noun pronoun mix vector races of corresponding predicate verb element number One of resultant vector, or dummy cell.
When corresponding predicate verb element number is not minimum predicate verb element number, the subject element can Energy value is small less than one of noun pronoun unit of corresponding predicate verb element number, or the numbering of most major term unit for numbering Coordinate noun pronoun group included in all coordinate noun pronoun mix vector races of corresponding predicate verb element number One of resultant vector, or in one of corresponding syntax vector of predicate verb unit of preceding appearance, or dummy cell.
Further, generating corresponding object element includes:
When corresponding predicate verb element number is maximum predicate verb element number, the possibility of the object element Value is big more than one of noun pronoun unit of corresponding predicate verb element number, or the numbering of its minimum word unit for numbering Coordinate noun pronoun group included in all coordinate noun pronoun mix vector races of corresponding predicate verb element number One of resultant vector, or dummy cell.
When corresponding predicate verb element number is not maximum predicate verb element number, the object element can Energy value is numbering more than corresponding predicate verb element number and less than the adjacent predicate verb element number in rear appearance One of noun pronoun unit, or its minimum word unit numbering more than corresponding predicate verb element number and less than adjacent Coordinate noun pronoun included in all coordinate noun pronoun mix vector races of the predicate verb element number of rear appearance One of mix vector, or in one of corresponding syntax vector of predicate verb unit of rear appearance, or dummy cell.
Further, in two steps of S4 and S5, the syntax is substituted using with the possible linear representation solution of syntactic structure Structure may matrix solution;
The syntactic structure may linear representation solution and the possible matrix solution equivalence of the syntactic structure;
The possible linear representation solution of the syntactic structure is included by according to the tactic syntax of predicate verb element number Vector expression is constituted;Each syntax vector expression is the leading question element, subject element, meaning of corresponding syntax vector The expression formula that language element, object element are added up partially item by item in sequence.
Further, methods described also includes:
Each syntax in syntactic structure analysis result is vectorial and corresponding syntax structural relationship tree is in people Shown in machine interactive interface.
The present invention also provides a kind of device of computer based natural language syntactic structure parsing, including:
Read part, the pretreated phrase data structure to be resolved for reading, the pretreated sentence number According to only including conjunctive word unit arranged side by side, subordinate conjunctive word unit, predicate verb unit, the noun pronoun unit of sentence in structure, And each word unit is numbered according to the order in the pretreated sentence, and marking types;
Element generation part, for each predicate verb unit, generating corresponding leading question element, subject element, meaning Language element and object element;
Wherein, arranged side by side pass of the possibility value of the leading question element for numbering less than corresponding predicate verb element number Join one of word unit or subordinate conjunctive word unit, or the association arranged side by side by a numbering less than corresponding predicate verb element number Word unit and one it is adjacent thereto and numbering less than corresponding predicate verb element number and numbering be more than the conjunctive word list arranged side by side One of conjunctive word mix vector that the subordinate conjunctive word unit of member numbering is constituted, or dummy cell;
Noun pronoun unit of the possibility value of the subject element for numbering less than corresponding predicate verb element number One of, or most major term unit numbering less than corresponding predicate verb element number all coordinate noun pronoun mix vector races Included in one of coordinate noun pronoun mix vector, or in one of corresponding syntax vector of predicate element of preceding appearance, or Dummy cell;
The predicate element is the corresponding predicate verb unit;
The possibility value numbering of the object element is more than corresponding predicate verb element number and less than adjacent rear One of noun pronoun unit of the predicate verb element number of appearance, or the numbering of minimum word unit are more than corresponding predicate verb Element number and less than the adjacent predicate verb element number in rear appearance all coordinate noun pronoun mix vector races in Comprising one of coordinate noun pronoun mix vector, or in one of corresponding syntax vector of predicate element of rear appearance, or empty Unit;
Vectorial generating unit, for the possibility according to the leading question element, subject element, predicate element and object element Value, obtains the value that is possible to of the corresponding syntax vector of each predicate verb unit, and the syntax vector includes leading question Element, subject element, predicate element and object element;
Matrix generation component, for being possible to value according to all syntaxes vector, generates at least one syntactic structure Possible matrix solution, the possible matrix solution of the syntactic structure is by according to the tactic syntax Vector Groups of predicate verb element number Into;
Decider, for verify according to syntactic structure may the obtained sentence of matrix solution whether with it is described pretreated Sentence is identical, if identical, regard each syntax vector in the possible matrix solution of the syntactic structure as syntactic structure One of analysis result;
Wherein, the decider may be solved by the syntactic structure for operating exclusion ineligible with lower module:
First row removes module, and if there is the sequence valve not occurred in the possible matrix solution of the syntactic structure, then excluding should Syntactic structure may matrix solution;
Second row removes module, if there is identical sequence valve in different syntax vectors or occur identical syntax to Amount, then excluding the syntactic structure may matrix solution;
3rd excludes module, in each possible matrix solution, will exist mutually to substitute between other syntaxes vector and close The syntax vector of system all carries out equivalent substitution, if occurring the intersection contradiction of two syntax vectors after equivalent substitution, Excluding the syntactic structure may matrix solution;
4th excludes module, in each possible matrix solution, will exist mutually to substitute between other syntaxes vector and close The syntax vector of system all carries out equivalent substitution, if occurring the converse sequence valve in two positions after equivalent substitution, arranges Except the syntactic structure may matrix solution;
5th excludes module, in any one possible matrix solution, does not have phase if there is between other syntaxes vector The syntax vector of relation is mutually substituted into, then performs plug hole operation and is parsed with obtaining the possibility syntax corresponding to all possible matrix solutions Structure, and verify the sentence that is obtained according to the possible syntax analytic structure whether with the pretreated complete phase of sentence Together, it further comprises:
First submodule, first carries out equivalent generation to there is the syntax of substitution relation vector in the possible matrix solution each other Change, so that the possible matrix solution is converted into one group of syntax vector each other in the absence of substitution relation Syntax vector that will likely be in matrix solution is referred to as first kind syntax vector, will convert obtained syntax vector Referred to as Equations of The Second Kind syntax is vectorial;
Second submodule, appoint and take Equations of The Second Kind syntax vectorMark one by one in the predetermined directionIn it is each The sequence valve of individual syntax elements;After the sequence valve for marking syntax elements, appoint and takeIn i-th of syntax elements, only in the sentence The unique room of the first side structure of method element;Make after sky, appoint and take a syntax vectorEquations of The Second Kind syntax vector in additionBy syntax vector in the way of overall plug holeThe constructed room of insertion, and then a new syntax vector is generated, by this Individual new syntax vector is designated asAnd by syntax vector obtained from overall plug hole, it is referred to as the 3rd class syntax vector;
3rd submodule, to the 3rd class syntax vectorIn the predetermined direction to from vectorIn the syntax elements of the first side first start to vectorIn the vector that includesThe second side Each syntax elements untill first syntax elements, all mark sequence valve;Positioned at vectorIn include VectorThe element of first side, does not mark sequence valve;By vectorFirst syntax elements of the second side be designated asWill be by According to aforementioned manner to vectorThe syntax vector portion of mark, is designated as whipping syntax vectorMark order After value, appoint j-th of the syntax elements taken in a foregoing whipping vector, it is only uniquely empty in the side structure of element first Position;Make after sky, appoint and take an original Equations of The Second Kind syntax vectorBy syntax vector in the way of overall plug holeInsert Enter constructed room, and then generate a new syntax vector, be then designated as newly-generated syntax vector Or
3rd class syntax vectorAccording to predetermined direction, distich normal vectorIn each Syntax elements all mark sequence valve;After the sequence valve for marking syntax elements, appoint and take oneIn t-th Method element, in the unique room of the first side structure of the syntax elements;After making sky, appoint take one to be not used Equations of The Second Kind sentence Normal vectorBy the vector in the way of overall plug holeThe room that insertion is above constructed, and then generate a new vector, then The new vector is designated as
4th submodule, repeats the operation of the 3rd submodule, right when the last time, which makes empty and plug hole step, to be terminated Make sky and the plug hole of sky with the 3rd class syntax vector progress obtained from plug hole step next time are made by the last time to operate, until By all Equations of The Second Kind syntaxes vectorWhole plug holes are finished, and finally obtain the 3rd class syntax vector of a single file, will The 3rd class syntax vector finally obtained is referred to as final single file vector;
5th submodule, if existed in the corresponding all final single files vectors of a possible syntax analytic structure The converse sequence valve in two positions, then exclude the possible syntax analytic structure;
6th submodule, repeats to call the operation of the second submodule to the 5th submodule until all possible syntax parsing knots Structure is traversed.
Further, in addition to:
As a result display unit, each syntax in syntactic structure analysis result is vectorial and corresponding syntax structural relationship is used Tree shown on human-computer interaction interface.
Brief description of the drawings
By description referring to the drawings to the embodiment of the present invention, above-mentioned and other purposes of the invention, feature and Advantage will be apparent from, in the accompanying drawings:
Fig. 1 is the flow chart of the method for the computer based natural language syntactic structure parsing of the embodiment of the present invention;
Fig. 2 is the schematic diagram of the device of the computer based natural language syntactic structure parsing of the embodiment of the present invention.
Embodiment
Below based on preferred embodiment, present invention is described, but the present invention is not restricted to these embodiments. It is detailed to describe some specific detail sections below in the detailed description of the present invention.For a person skilled in the art The description of part can also understand the present invention completely without these details.In order to avoid obscuring the essence of the present invention, known side Method, flow, element and circuit do not have detailed narration.
Partial ordering relation and inclined add operation on part A semigroup
A1 parts natural language is the open-shop problem that vocabulary and punctuation mark collection close
According to Abstract Algebra and the theory of computational linguistics, natural language is free one that vocabulary and punctuation mark collection close Semigroup.Below in English exemplified by illustrate, but it should be readily apparent to one skilled in the art that the present invention method be also applied for it His natural language.
The symbol string given on set an A, A is abutted by the element in A, can be repeated when adjacent, is formed One time-limited linear array.For example:From set { a, b, c }, symbol string acbaab can be formed.This symbol string includes a Three times appearance, b appearance twice, c once appearance, it be different from symbol string acaabb.Although each symbol goes out occurrence Number is identical, but their order is different.It can be seen that, what symbol string was ordered into.Especially, length is 0 symbol string for 0 symbol string, It is designated as e.It is exactly one from natural manifold N to A accordingly, for the symbol string that the given limited upper length of assemble of symbol A, A is n Individual mapping:f:N→A.
From two symbol strings, we can constitute new symbol string with they adjacent method.For example, in symbol string Abac right-hand member adjoining symbol string bbac, just forms new symbol string abacbbac.
The computing of this adjacent symbol string is referred to as:Adjoin computing, referred to as adjoin.
The symbol string ψ that the symbol string φ and length that given length is n are m, wherein:
φ={ (1, x1), (2, x2), (3, x3) ... ..., (n-1, xn-1), (n, xn)};
ψ={ (1, y1), (2, y2), (3, y3) ... ..., (m-1, ym-1), (m, ym)};
Adjoining for φ and ψ is designated as:φ^ψ.It is length for n+m and by set { (1, x1), (2, x2), (3, x3) ... ..., (n-1, xn-1), (n, xn), (n+1, y1), (n+2, y2) ... ..., (n+m, ym) symbol string that provides.So, it is definition to adjoin A kind of binary operation in symbol string, the result of computing is to obtain a new symbol string.
φ's and ψ adjoins, and can also omit and adjoin mark ^, and simplification is designated as:φψ.
Then have:φ ^ ψ=φ ψ.
It is combinative to adjoin computing, because for any symbol string φ, ψ, ω, having:
φ ^ (ψ ^ ω)=(φ ^ ψ) ^ ω
Existing each English word and english punctuation mark are defined as a symbol, then all words and mark in S Set A={ a of point symbol1, a2, a3..., an(n ∈ N) be exactly a glossary of symbols.
Appoint one the given time-limited symbol string b being made up of English word and english punctuation mark1b2......bk(k∈ N), referred to as word unit or continuous word string.For appointing the word unit a=b given1b2......bm(m ∈ N), it is first in A to claim a The word unit of element composition, and if only if, b1, b2..., bm∈A。
Length is referred to as dummy cell for 0 unique word unit, is designated as e.
Note collection for all word units (continuous word string) that element is constituted in A is combined into AsIf, sentence S= a1a2a3......an, wherein, anTo constitute the word unit of sentence.Algebra system (As, ^ e) is English word and punctuation mark collection Close the open-shop problem on A.
Each word unit is arranged in order according to its order in sentence, and serial number is designated as under it, and note T (α) is word unit α Numbering in sentence S.
The condition for constructing syntactic constituent Sequential Mapping a ω, ω is as follows:
(1)ω:{a1, a2, a3... ..., an} → N, N are nature manifold;
(2) to any one ai, ai∈ S, have:ω(ai)=T (ai)。
Obviously, ω is a single mapping.
A2 parts define a kind of partial ordering relation
Simultaneously for algebra system (As, ^, e), definition binary crelation <
For AsIn arbitrary word unit α, β ∈ As, claim α <β, and if only if α, β code T (α), T (β) satisfactions:T (α) < T (β).
According to definition, binary crelation <Meet following condition:
(1) appoint and give a ∈ As, have a ≮a;
(2) for any a, b, c ∈ AsIf, a <B, then b ≮a;
(3) for any a, b, c ∈ AsIf, a <B and b <C, then a <c。
The then definition according to strict partial ordering relation, binary crelation <It is strict partial ordering relation.
A3 parts define inclined add operation and syntax sequence valve
Meanwhile, in algebra system (As, ^, e) on, define a new binary operation+<.+ < is called to be defined on AsIn Strict partial ordering relation <On inclined add operation, referred to as partially plus, it meets following characteristic:For any a, b ∈ AsIf, a <B, then have a+ < b=a^b=ab.
We can determine whether:For any a, b ∈ AsIf, a <B, then have inclined add operation+< and adjoin computing ^ etc. Valency.Inclined add operation+<, can be regarded as being limited in strict partial ordering relation <On adjoin computing.
The sentence S of any natural language can be seen as by each word unit according to strict partial ordering relation <It is formed by connecting Word string formula, i.e.,:S=a1+ < a2+ < a3+ < ...+< an.This feature, it is highly beneficial for expansion Mathematical treatment.
In former sentence S, according to order from left to right, disposably from beginning of the sentence to sentence tail, be adjacent in full sentence n even Continuous word string α1, α2..., αnMark serial number:1,2 ... .., n.
In mark once determining, as described above, the serial number for remembering an any given continuous word string α is τ (α), then τ (α) is called α sequence valve from left to right.That is, appoint to the syntax elements γ in a former sentence S, syntax elements γ is existed Syntax sequence valve in former sentence S is designated as τ (γ).
Part B technical scheme is described in detail
Preliminary classification of the B1 parts to language message
In the present invention, the word unit a of sentence will be constitutediRegard as constant.Word unit aiWith its linguistic property.Constitute The word unit of kernel sentence structure can be divided into conjunctive word unit arranged side by side, subordinate conjunctive word unit, predicate verb unit, noun pronoun The type of unit 4.Each word unit includes at least one natural language vocabulary, and it can be word, the phrase of specific structure or many Individual same attribute word it is arranged side by side.
For conjunctive word unit arranged side by side, it can be coordinating conjunction and, the but for connecting compound sentence and syntactic constituent arranged side by side, Or, so, yet etc..
For subordinate conjunctive word unit, it can be the company of the conjunctional pronoun or conjunctive adverbs and guiding subordinate clause that guide subordinate clause Phrase is connect, is listed below for typical introducer:That, what, which, who, whom, wherever, whenever, Whose, where, when, why, how, whoever, whichever, while, whether, because, before, After, whatever, whomever, as, if, once, until, though, unless, although, no matter What, no matter who, no matter whom, no matter which, in that, in order that, as Though, as if, even though, even if, so that etc..It mainly includes:The pass of guiding subordinate clause is served as by word Join word unit, the conjunctive word unit of the conjunctive word unit of guiding subordinate clause, connection compound sentence and compound sentence is served as by phrase.
For predicate verb unit, it can also be verb or verb phrase, for example, can do, do.Predicate is defined as Main actions language in English in a natural sentences.Generally it is made up of in structure two parts:Aid in verb+notional verb (main Except copular construction).Predicate has the call format of tense and voice, is defined as follows with philological formula is calculated:
For noun pronoun unit, Ke Yishi:Pure noun phrase (being not included in the noun phrase in guest's Jie phrase), Verb phrase (the verb phrase definition of noun of noun:It is with noun property, subject or object this class name can be served as The verb phrase of part of speech syntactic constituent, including:Infinitive phrase and the major class of gerund phrase two), the pronoun that can be used alone. Noun pronoun unit is exemplified below:Food, wolf, the men, me, it, this, to do etc..
The verb phrase of noun has call format, is defined as follows with philological formula is calculated:
1 To+VB 7 RB+To+VB
2 To+VB+VBN 8 RB+To+VB+VBN
3 To+VB+VBN+VBN 9 RB+To+VB+VBN+VBN
4 VBG 10 RB+VBG
5 VBG+VBN 11 RB+VBG+VBN
6 VBG+VBN+VBN 12 RB+VBG+VBN+VBN
{ explanation of important sign }
r Predicate verb unit
k The ordinal number for the predicate verb unit being presently processing
Lead Subordinate conjunctive word unit
NPI Pure noun unit
Conj Conjunctive word unit arranged side by side
VNP The verb unit of noun property
NOMP Nominative pronoun unit
OBJP Objective case pronoun unit
NP The general designation of noun pronoun unit
In above-mentioned word unit list, there is following relation between the set of word unit:
{ NP }={ NPI } ∪ { VNP } ∪ { NOMP } ∪ { OBJP }.
B2 parts define important concept
Explanation:The subordinate sentence of natural language sentence is defined as follows:Subordinate sentence is exactly simple sentence, i.e. the most basic sentence of natural language Formula.One subordinate sentence, is exactly a set of subject-predicate matching structure.The class word unit of the above three constitutes the trunk of natural language sentence subordinate sentence, its In, predicate verb unit serves as predicate, and noun pronoun unit serves as subject or object.
In the present invention, defined variable is x, y, z, and wherein x is leading question element, and y is subject element, and z is object element, Meanwhile, note r is predicate element, then the subject-predicate matching structure in each sentence can be expressed as:
F (x, y, r, z)=x+ < Λ+< y+ < σ+< r+ < ρ+< z+ < μ
Λ, σ, ρ, μ represent any composition or the punctuation mark outside x, y, r, z, referred to as impurity respectively, by existing Some sentence preconditioning techniques can remove impurity.It can will remove function f (x, y, r, z)=x+ < y+ < r+ after impurity < z.Represented with the mode of vectorial (x, y, r, z).
Leading question element x is a composition of simple sentence:When simple sentence is subordinate clause, leading question element is the company of guiding subordinate clause Connect pronoun or conjunctive adverbs, the conjunctive phrase for guiding subordinate clause;Simple sentence be compound sentence when, leading question element be by the compound sentence with The coordinating conjunction of preceding other compound sentences connection.That is, in a simple sentence, leading question element x is by conjunctive word unit structure Into, syntactic constituent for guiding follow-up simple sentence.
If a function f being presently processing in S, it is f to remember this current function fk;Note is presently processing Predicate verb unit ordinal number be k.(k ∈ N, N are nature manifold, k≤n)
The crucial set of B3 parts generation three:{xk, { yk, { zk}
B3.1 parts generation { xk}
[B3.1.1] preparation work:It is defined as follows subclass:
1)Leadk=Lead | Lead <rk};
2)conjk=conj | conj <rk};
3)(conjkοLeadk)=
{Rk|Rk=conj+ < Lead, conj <rk, Lead <rk, τ (Lead)=τ (conj)+1 };
[B3.1.2]{xkGenerating algorithm:
{xk}=Leadk∪conjk∪(conjkοLeadk)∪{e}。
B3.2 parts on syntactic constituent arranged side by side being specifically described of generation method (using subject arranged side by side and object arranged side by side as Example)
[B3.2.1] directviewing description
Explanation:In following narration, for the ease of expression, by the formula Ф of continuous word stringtOrIn comprising syntax member ElementIt is designated as
1st step
The phrase of all noun property in former sentence is taken, the phrase of all noun property in former sentence is compiled as a collection Close, be designated as set Ψ={ α1..., αm-1, αm, m ∈ N, m are the numbers of the element in set Ψ.
2nd step
According toMode, take set Ψ={ α1..., αm-1, αmIn any two element whole combinations,If set
3rd step
To appointing one givenAccording to elementSyntax sequence valve in former sentence S from small To longer spread.It might as well then setOrdered pair can then be obtainedSet up the formula of a continuous word stringWhereinBe in former sentence S fromArriveOne group it is adjacent Continuous word string or empty word string.Ordered pair as limit and continuous word string formula.
4th step
To formula ФtChecked, if for formula ФtIn appoint give between elementWithBetween element γ, has:The phrase of γ either noun properties, or conjunction, or empty string side by side, then by ФtMark be changed toReferred to as ФtGenerationIf setThen
5th step
Appoint and take setIf setExist correspondingThen define one Individual set race, the set race is by including setAll set constitute.The set race is designated as that form is expressed as below:
6th step
If setThere is corresponding set raceBy any one set raceIn Each syntax elements for having under its command of set all take out, be classified as set
7th step
Set is taken out respectivelyIn in former sentence S The minimum and maximum element of syntax sequence valve.
Note:This method can also be used for the other kinds of composition arranged side by side of generation, for example, generate adjective phrase arranged side by side.As long as By all NPI in this method, entirety VNP, entirety NOMP phrases, by all NPI in former sentence, entirety VNP, entirety NOMP words Group is substituted for corresponding syntactic constituent, you can obtain.
[B3.2.2] formal definitions
Definition:All NPI phrases, entirety VNP phrases in the former sentence S of function of a single variable A (S), A (S) expression taking-up, entirety NOMP phrases, meanwhile, all NPI phrases in former sentence, entirety VNP phrases, entirety NOMP phrases are classified as a set, by this Set is designated as Ψ={ α1..., αm-1, αm, m ∈ N, m are the numbers of the element in set Ψ.Then A (S)=Ψ={ α1..., αm-1, αm}。
Definition:Function of a single variable B (Ψ), B (Ψ) represent according toMode take set Ψ={ α1..., αm-1, αmIn appoint The whole of two elements that anticipate combine,If set Then Will Appoint one givenIt is designated asThenThen
Definition:Binary function K (α, β), K (α, β) represent the result to function of a single variable B (Ψ), i.e., to appointing one givenAccording to elementThe arrangement from small to large of syntax sequence valve in former sentence S.It might as well then setIt then can obtain ordered pairIf set And then set up a continuous word string formulaWhereinBe in former sentence S fromArriveOne group of adjacent continuous word string or empty word string, and Then
Definition:Function of a single variable H (Фt), H (Фt) represent what binary function K (α, β) was generated Checked:If to appointing the element γ ∈ Ф givent, andAndHave:γ=NPI or γ =VNP or γ=NOMP or γ=CONJ or γ=e, then by ФtMark be changed toReferred to as ФtGenerationIf setThen
Definition:Binary function M (α, β), M (α, β) are represented for appointing one taken set If setExist correspondingA set race is then defined, the set race is by including set's Entirety set is constituted, and the set race is designated asThen
Definition:Binary function N (α, β), N (α, β) represent the result to binary function M (α, β)I.e. pair Set is taken in appointingIf setThere is corresponding set raceThen Construct a new set as followsThen
Definition:Function of a single variable U (α), U (α) represent the result to binary function N (α, β)TakeIfTo appointing the element γ given,There is τ (γ)≤τ (δ).Then
Definition:Function of a single variable V (β), V (β) represent the result to binary function N (α, β)TakeIfTo appointing the element γ given,There is τ (δ)≤τ (γ).Then
The generating algorithm of [B3.2.3] subject arranged side by side:
The generating algorithm of [B3.2.4] object arranged side by side:
The illustration of [B3.2.4] to subject arranged side by side and the generating algorithm of object arranged side by side
Illustrate:Word order list is:
Former sentence phrase Phrase type Serial number
After Subordinate conjunctive word unit 1
Jack Noun pronoun unit 2
Mary Noun pronoun unit 3
and Conjunctive word unit arranged side by side 4
Linda Noun pronoun unit 5
left Predicate verb unit 6
I Noun pronoun unit 7
gave Predicate verb unit 8
my son Noun pronoun unit 9
a book Noun pronoun unit 10
In generation subject element set { y1}、{y2During, run subject generating algorithm arranged side by side as follows:
1. A (S) takes out all NPI phrases, entirety VNP phrases, the entirety NOMP phrases in former sentence, and will be complete in former sentence Body NPI phrases, entirety VNP phrases, entirety NOMP phrases are classified as a set, by the set be designated as Ψ=Jack, Mary, Linda, I, my son, a book }={ 2,3,5,7,9,10 }.
2. B (Ψ) represent according toMode take whole groups of any two element in set Ψ={ 2,3,5,7,9,10 } Close, if set Then B (Ψ)={ 2,3 }, { 2,5 }, { 2,7 }, { 2,9 }, { 2,10 }, { 3,5 }, { 3,7 }, { 3,9 }, { 3,10 }, { 5,7 }, { 5,9 }, { 5,10 }, { 7,9 }, { 10,7 }, 10, 9}}。
3. K (α, β) is to function of a single variable B (Ψ) result, i.e., to appointing one givenAccording to elementThe arrangement from small to large of syntax sequence valve in former sentence S.It might as well then setIt then can obtain ordered pairThenThe ordered pair of generation is:
{<2,3>,<2,5>,<2,7>,<2,9>,<2,10>,<3,5>,<3,7>,<3,9>,<3,10>,<5,7>,<5,9>, <5,10>,<7,9>,<7,10>,<9,10>};
If setAnd then set up a continuous word string FormulaWhereinBe in former sentence S fromArriveOne group it is adjacent Continuous word string or empty word string, andThen Then Φ1=2+ < e+ < 3, Φ2=2+ < 3+ < 4+ < 5, Φ3=2+ < 3+ < 4+ < 5+ < 6+ < 7, Φ4= 2+ < 3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9, Φ5=2+ < 3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9+ < 10, Φ6=3+ < 4+ < 5, Φ7=3+ < 4+ < 5+ < 6+ < 7, Φ8=3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9, Φ9=3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9+ < 10, Φ10=5+ < 6+ < 7, Φ11=5+ < 6+ < 7+ < 8+ < 9, Φ12=5+ < 6+ < 7+ < 8+ < 9+ < 10, Φ13=7+ < 8+ < 9, Φ14=7+ < 8+ < 9+ < 10, Φ15=9+ < e+ < 10.
④H(Φt) binary function K (α, β) is generatedChecked:If right Appoint the element γ ∈ Φ givent, andAndHave:γ=NPI or γ=VNP or γ=NOMP or γ=CONJ or γ=e, then by ΦtMark be changed toReferred to as ΦtGenerationIf setThen gatherThen
5. M (α, β) is represented for appointing one taken setIf setDeposit CorrespondingA set race is then defined, the set race is by including setAll set constitute, the set Race is designated asThen Then M (α, β)={ I1({ 2,3 }), I2({ 3,5 }), I3({ 9,10 }) }.
6. results of the N (α, β) to binary function M (α, β)Set is taken for appointing If setThere is corresponding set raceThen construct a new set as followsP [I can then be obtained1({ 2,3 })]={ 2,3,4,5 }, P [I2 ({ 3,5 })]={ 2,3,4,5 }, P [I3({ 9,10 })]={ 9,10 }.
7. results of the U (α) to binary function N (α, β)TakeIfTo appointing the element γ given,There is τ (γ) ≤τ(δ).Then Pmax[I1({ 2,3 })]=5, Pmax[I2({ 3,5 })]=5, Pmax[I3({ 9,10 })]=10.
In generation object element set { z1}、{z2During, run object generating algorithm arranged side by side as follows:
1. A (S) takes out all NPI phrases, entirety VNP phrases, the entirety NOMP phrases in former sentence, and will be complete in former sentence Body NPI phrases, entirety VNP phrases, entirety NOMP phrases are classified as a set, by the set be designated as Ψ=Jack, Mary, Linda, I, my son, a book }={ 2,3,5,7,9,10 }.
2. B (Ψ) represent according toMode take whole groups of any two element in set Ψ={ 2,3,5,7,9,10 } Close, if set Then B (Ψ)={ 2,3 }, { 2,5 }, { 2,7 }, { 2,9 }, { 2,10 }, { 3,5 }, { 3,7 }, { 3,9 }, { 3,10 }, { 5,7 }, { 5,9 }, { 5,10 }, { 7,9 }, { 10,7 }, 10, 9}}。
3. K (α, β) is to function of a single variable B (Ψ) result, i.e., to appointing one givenAccording to elementThe arrangement from small to large of syntax sequence valve in former sentence S.It might as well then setIt then can obtain ordered pairThenThe ordered pair of generation is:
{<2,3>,<2,5>,<2,7>,<2,9>,<2,10>,<3,5>,<3,7>,<3,9>,<3,10>,<5,7>,<5,9>, <5,10>,<7,9>,<7,10>,<9,10>};
If setAnd then set up a continuous word string FormulaWhereinBe in former sentence S fromArriveOne group it is adjacent Continuous word string or empty word string, andThen Then Φ1=2+ < e+ < 3, Φ2=2+ < 3+ < 4+ < 5, Φ3=2+ < 3+ < 4+ < 5+ < 6+ < 7, Φ4= 2+ < 3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9, Φ5=2+ < 3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9+ < 10, Φ6=3+ < 4+ < 5, Φ7=3+ < 4+ < 5+ < 6+ < 7, Φ8=3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9, Φ9=3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9+ < 10, Φ10=5+ < 6+ < 7, Φ11=5+ < 6+ < 7+ < 8+ < 9, Φ12=5+ < 6+ < 7+ < 8+ < 9+ < 10, Φ13=7+ < 8+ < 9, Φ14=7+ < 8+ < 9+ < 10, Φ15=9+ < e+ < 10.
④H(Φt) binary function K (α, β) is generatedChecked:If right Appoint the element γ ∈ Φ givent, andAndHave:γ=NPI or γ=VNP or γ=NOMP or γ=CONJ or γ=e, then by ΦtMark be changed toReferred to as ΦtGenerationIf setThen gatherThen
5. M (α, β) is represented for appointing one taken setIf setDeposit CorrespondingA set race is then defined, the set race is by including setAll set constitute, the set Race is designated asThen Then M (α, β)={ I1({ 2,3 }), I2({ 3,5 }), I3({ 9,10 }) }.
6. results of the N (α, β) to binary function M (α, β)Set is taken for appointing If setThere is corresponding set raceThen construct a new set as followsP [I can then be obtained1({ 2,3 })]={ 2,3,4,5 }, P [I2 ({ 3,5 })]={ 2,3,4,5 }, P [I3({ 9,10 })]={ 9,10 }.
7. V (β) represents the result to binary function N (α, β)TakeIfTo appointing the element γ given,There is τ (δ)≤τ (γ).Then Pmin[I1({ 2,3 })]=2, Pmin[I2({ 3,5 })]=2, Pmin[I3({ 9,10 })]=9.
B3.3 parts { ykGeneration method
[B3.3.1] preparation work:It is defined as follows subclass:
1)NPIyk=NPI | NPI <rk}。
2)VNPyk=VNP | VNP <rk}。
3)NOMPk=NOMP | NOMP <rk}。
4)
Wherein:
5)ryk={ rα| α < k, α ∈ N }.(N is nature manifold)
6)fyk={ fα| α < k, α ∈ N }.(N is nature manifold)
[B3.3.2]{ykGenerating algorithm
2. it is converted into:When there is rk-1When:{yk}=NPIyk∪VNPyk∪NOMPk∪Gk∪fyk∪ { e }, then above formula conversion For:
B3.4 parts { zkGeneration method
[B3.4.1] preparation work:It is defined as follows subclass:
1)
2)
3)
4)
Wherein:
5)rzk={ rα| k < α, α ∈ N }.(N is nature manifold)
6)fzk={ fα| k < α, α ∈ N }.(N is nature manifold)
[B3.4.2]{zkGenerating algorithm
2. it is converted into:When there is rk+1When:{zk}=NPIzk∪VNPzk∪OBJPk∪Hk∪fzk∪ { e }, then above formula conversion For:
B3.5 part matrixs expression formula and linear representation
[B3.5.1] matrix expression
And then, sentence S can be expressed with matrix form, i.e.,:
As a function fjServe as another function fkSubject element or object element when, for example:Work as fk=x+ < y+ < R+ < fjOr fk=x+ < fjDuring+< r+ < y, claim fkIt is to be obtained by compound operation.Compound operation is designated as f in the present invention (f)。
It is also word unit because function f is seen on the whole, so adding computing to be applied to function partially.If function fi、fjMeet fifj, and another function fkIt can be expressed as fiAnd fjInclined plus i.e. fk=fi+ < fj, claim fkObtained by adding computing partially 's.
Each English sentence S for not omitting predicate verb can be regarded as by n function f1... ..., fn(n is equal to meaning Language verb element number) by the compound of limited number of time and partially plus obtained from computing.Accordingly, any one can not omitted to meaning The English sentence S of language is designated as:
That is, any one do not omit the English sentence of predicate by including leading question element, subject element, predicate element or The vector of object element is obtained through compound or inclined plus computing.Next, just facing a kind of rationally expression is chosen for English natural sentences S The problem of formula.This expression formula, it has to be possible to rightly show all compound and inclined plus computings included in S.Matrix Form possesses such condition just, and it can embody the compound operation of function with the position of element in a certain row vector, example Such as:fk(fj)=fk(xk, fj, rk, zk), indicate that fkWith fjTherebetween compound operation relation;Meanwhile, and do not destroy member Inclined plus relation between element:fk=xk+ < fj+ < rk+ < zk.To sum up, in order to it is accurate, directly perceived, clearly express English natural sentences S, in order to preferably disclose natural sentences S inherent mathematical and physical structure, we are using primary expression formula of the matrix as natural sentences S.
[B3.5.2] linear representation
At the same time it can also express sentence S using linear forms, i.e.,:
Especially emphasize:
1. each English natural sentences S for not omitting predicate linear representation contain limited number of time inclined plus computing and Compound operation.Herein using supplement expression formula of the linear representation as natural sentences S.
2. it is equivalence relation between matrix expression of the invention and linear representation.
3. English natural sentences S linear representation, while being also natively one with function f1... ..., fn(n is equal to Predicate verb element number) be unknown quantity system of linear equations, therefore, ensuing syntactic structure solution is tried to achieve with substitution method herein The process of result is analysed, is also considered as being to solve for this naturally with function f1... ..., fn(n is equal to predicate verb element number) For the process of the system of linear equations of unknown quantity.
The substitution solver of B3.6 part matrixs
1st step
If there is the sequence valve not occurred in the possible matrix solution of the syntactic structure, then excluding the syntactic structure may square Battle array solution;
For example for following possibility matrix solution
The word unit that numbering is 4 does not occur, and excludes.
2nd step
If identical sequence valve occur in different syntax vectors or identical syntax vector occur, the sentence is excluded Method structure may matrix solution;
For example for following possibility matrix solution
The word unit that numbering is 5 is occurred in that twice, is excluded.
3rd step
It is in each possible matrix solution, the syntax vector that there is mutually substitution relation between other syntaxes vector is complete Equivalent substitution is all carried out, if occurring the intersection contradiction of two syntax vectors after equivalent substitution, the syntactic structure is excluded Possible matrix solution;
For example for following possibility matrix solution
Above-mentioned matrix is substituted into, f2And f3Occur in that the substitution of function intersects contradiction.Substitution is obtained:f2=3+ < e+ < 6+ < (4+ < f2+ < 7+ < e).Equation left and right ends occur in that f simultaneously2, this has occurred as soon as () logical contradiction.Exclude.
4th step
It is in each possible matrix solution, the syntax vector that there is mutually substitution relation between other syntaxes vector is complete Equivalent substitution is all carried out, if occurring the converse sequence valve in two positions after equivalent substitution, excluding the syntactic structure can Can matrix solution;This is the fundamental requirement of both Mathematical treatments, is also defined in strict partial ordering relation <On inclined add operation sheet Matter requirement.
For example for following possibility matrix solution
It is substituted into, f2=4+ < 5+ < 6+ < 3+ < e+ < 7+ < e, obtain order for (4,5,6,3, e, 7, e), There is the converse sequence valve in position, exclude.
5th step
In any one possible matrix solution, do not substituted into mutually if there is if there is between other syntaxes vector The syntax vector of relation, then perform plug hole operation to obtain the possibility syntax analytic structure corresponding to all possible matrix solutions, And whether the sentence that checking is obtained according to the possible syntax analytic structure is identical with the pretreated sentence, it enters One step includes:
5.5.1 equivalent substitution first, is carried out to there is the syntax of substitution relation vector in the possible matrix solution each other, from And the possible matrix solution is converted into one group of syntax vector each other in the absence of substitution relationCan Syntax vector in energy matrix solution is referred to as first kind syntax vector, and the syntax that conversion is obtained is vectorialReferred to as Equations of The Second Kind syntax vector;
5.5.2, appoint and take an Equations of The Second Kind syntax vectorMark one by one in the predetermined directionIn each syntax The sequence valve of element;After the sequence valve for marking syntax elements, appoint and takeIn i-th of syntax elements, only in the syntax elements The unique room of the first side structure;Make after sky, appoint and take a syntax vectorEquations of The Second Kind syntax vector in additionWith whole The mode of body plug hole is vectorial by syntaxThe constructed room of insertion, and then a new syntax vector is generated, by this new sentence Normal vector is designated asAnd by syntax vector obtained from overall plug hole, it is referred to as the 3rd class syntax vector;
5.5.3, to the 3rd class syntax vectorIn the predetermined direction to from vectorIn First syntax elements of side first start to vectorIn the vector that includesThe syntax elements of the second side first Untill each syntax elements, all mark sequence valve;Positioned at vectorIn the vector that includesFirst side Element, does not mark sequence valve;By vectorFirst syntax elements of the second side be designated asWill be right in the manner previously described VectorThe syntax vector portion of mark, is designated as whipping syntax vectorMark after sequence valve, appoint J-th of syntax elements in a foregoing whipping vector are taken, only in the unique room of the side structure of element first;Make sky Afterwards, appoint and take an original Equations of The Second Kind syntax vectorBy syntax vector in the way of overall plug holeInsertion is constructed Room, and then generate a new syntax vector, be then designated as newly-generated syntax vectorOr
3rd class syntax vectorAccording to predetermined direction, distich normal vectorIn each Syntax elements all mark sequence valve;After the sequence valve for marking syntax elements, appoint and take oneIn t-th Method element, in the unique room of the first side structure of the syntax elements;After making sky, appoint take one to be not used Equations of The Second Kind sentence Normal vectorBy the vector in the way of overall plug holeThe room that insertion is above constructed, and then generate a new vector, then The new vector is designated as
5.5.4 5.5.3, is repeated, when the last time, which makes empty and plug hole step, to be terminated, to being made by the last time Empty and the 3rd class syntax vector obtained from plug hole step carry out next time make the operation of empty and plug hole, until by all Equations of The Second Kind Syntax vectorWhole plug holes are finished, and finally obtain the 3rd class syntax vector of single file, will it is described it is last must The 3rd class syntax vector arrived is referred to as final single file vector;
If 5.5.5, have two in the corresponding all final single files vectors of a possible syntax analytic structure The converse sequence valve in position, then exclude the possible syntax analytic structure;
5.5.6 5.5.2 to 5.5.5, is repeated until all possible syntax analytic structures are traversed.
B3.7 part matrix revision programs
If necessary, revision program is transferred to, to be modified to more than two syntactic structure analysis results, is specifically included Following operation:
(1) noun pronoun unit serves as the inspection again and choice of subject and object.
(2) syntactic structure is checked with language rule.Citing:
1. according to English syntactic structure rule, the introducer of subject clause can not be omitted.
That of guiding subject clause can not be omitted;
2. according to English syntactic structure rule, subject will be consistent in person and quantity with predicate;
3. according to verb and thing and non-transitivity matter, judge whether connect object thereafter.
(3) reexamining for structural ambiguity is examined and excluded.
(4) upside-down mounting, omission, there be are treated as special case.
(5) composition of extraction is put back to.
(6) generate and export last solution.
By amendment the nonstandard problem of division statement structure can be overcome to improve the parsing degree of accuracy.
Preferably, can according to analysis result by syntactic structure formation syntax data tree structure.
B3.8 parts are specifically described to two kinds of plug hole methods
The common principle of [B3.8.1] two kinds of different plug hole methods:
The sequence valve ordered series of numbers 1,2 of former sentence ..., k can be regarded as by finding clear and definite position in possible matrix solution It is limited between the syntax that can not find clear and definite position vector in the equivalent substitution and the possible matrix solution of process of the syntax vector put Obtained from secondary overall plug hole.That is, initial syntax vector corresponding with former sentenceIt can be regarded as elder generation By the equivalent substitution of the syntax for the finding clear and definite position vector in possible matrix solution, then by looking for not in possible matrix solution To obtained from the overall plug hole of limited number of time between the syntax vector of clear and definite position.The various plug hole situations differed, substantially It is exactly the arrangement and combination in Combinational Mathematics.
[B3.8.2] the 1st kind of plug hole method:
In any one possible matrix solution, all do not substituted into clearly if there is between other any syntax vectors The syntax vector of relation, then first to there is the syntax vector of substitution relation in the possible matrix solution between other syntaxes vector All carry out equivalent substitution, with season the possible matrix solution in other syntaxes vector between be not present substitution relation syntax to Amount all keeps constant, in terms of comprehensive both of the aforesaid, and the possible matrix solution is converted into one group is all not present generation each other Enter the syntax vector of relationWill likely original syntax vector f in matrix solution1, f2..., fδIt is referred to as One class syntax vector;After foregoing equivalent substitution, one group that is converted to come in the manner previously described is not present each other The syntax vector of substitution relationIt is referred to as Equations of The Second Kind syntax vector;Emphasize, Equations of The Second Kind syntax vector is all Each other in the absence of the syntax vector of substitution relation.As θ >=2, overall plug hole is meaningful;The discussion below all preset θ >= 2。
Next, carrying out the unilateral overall plug hole of order-preserving in the same direction, the overall plug hole of unilateral forward order-preserving is referred to as:Appoint and take one Individual Equations of The Second Kind syntax vectorMark syntax vector one by one according to direction (can also be from left to right) from right to leftIn it is every The sequence valve of one syntax elements.After the sequence valve for marking syntax elements, appoint and take oneIn syntax elements, this might as well be set Syntax elements areIn several i-th of the element in the right, it is only unique in right side (can also be only in left side) construction of the syntax elements Room;Make after sky, appoint and take one except syntax vectorEquations of The Second Kind syntax vector in additionIn the way of overall plug hole By syntax vectorThe room that insertion is above constructed, and then a new syntax vector is generated, this new vector is designated asIt is every vectorial by syntax obtained from overall plug hole, the 3rd class syntax vector is referred to as, then It is the 3rd class syntax vector;For appointing the two syntax vector α and β given, if vectorial β is inserted into sentence in the way of overall plug hole Room corresponding to normal vector α several i-th of the syntax elements in the right, has obtained new a 3rd class syntax vector, then by this Individual the 3rd class syntax vector newly obtained is designated as [α]i+ < β;Emphasize, the 3rd class syntax vector is all to be not present each other The syntax vector of substitution relation.Empty and plug hole step is made for 1st to finish.
It is transferred to the 2nd and makes empty and plug hole step.3rd class syntax obtained to making empty and plug hole step by the 1st to AmountAccording to from right to left direction (also can be from left to right but identical with the direction that last mark sequence is chosen, I.e. with last time mark sequence in the same side), to from vectorIn first syntax elements of the right number start to vectorIn the vector that includesFirst syntax elements of left side number untill each syntax elements, all mark suitable Sequence value;Positioned at vectorIn the vector that includesThe syntax elements in left side, do not mark sequence valve;By vector's Number first syntax elements in the left side are designated asWill be in the manner previously described to vectorThe syntax of middle mark order Vector portion, is designated as:Syntax vectorThe syntax vector is called:Whipping vector.Mark after sequence valve, appoint and take Element in one foregoing whipping vector, it is whipping vector that might as well set the elementIn j-th yuan of the right number Element, and only on the right side of the element (also can be but identical with the last direction that make empty selection only in left side, i.e., make sky with the last time In the same side), construct unique room;After making sky, appoint and take one to remove to make used sentence in empty and plug hole step at the 1st Normal vectorWithEquations of The Second Kind syntax vector in additionBy syntax vector in the way of overall plug holeWhat insertion was above constructed Room, and then generate a new syntax vector, then newly-generated syntax vector is designated asFor appointing Vectorial β first syntax elements of left side number are designated as λ (β) by the syntax vector α and β given, vectorial [α] if there is syntaxi+ < β, then in the manner previously described to vectorial [α]i+ < β are labeled, and by the syntax vector portion marked in the manner previously described, It is designated as:Vector [αk\λ(βk-1)], the vector is called:Whipping vector.Empty and plug hole step is made for 2nd to finish.
According to preceding method, the 3rd class syntax vector chooses whipping obtained to making empty and plug hole step by the last time Vector, and as the method previously described to selected whipping vector mark sequence valve, but to mark the direction that sequence is chosen with last It is identical, i.e., with last time mark sequence in the same side;Mark after sequence valve, appoint the syntax elements taken in a whipping vector, press According to preceding method construction uniquely unilateral room, but identical with the last direction that make empty selection, i.e., sky is made with the last time and is existed The same side;Make after sky, appoint take one in addition to used syntax is vectorial in empty and plug hole step in previous making second Class syntax vector, the room for above being constructed Equations of The Second Kind syntax vector insertion in the way of overall plug hole, and then generate one New syntax vector;Repeat foregoing operation:When the last time, which makes empty and plug hole step, to be terminated, all according to foregoing Method, the 3rd class syntax vector obtained to making empty and plug hole step by the last time carry out next time make empty and plug hole Operation, until Equations of The Second Kind syntax is vectorialWhole plug holes are finished, finally obtain the 3rd class syntax of a single file to Amount;The 3rd class syntax finally obtained vector is referred to as final single file vector.
Will be foregoing vectorial from Equations of The Second Kind syntax is chosen firstMake to an entire flow for generating final single file vector For a concrete scheme, i.e. so that the foregoing step made each time in empty and plug hole step concrete scheme.
By each exhaustive step all may situation, the whole schemes of exhaustion.Check each exhaustive generated most Whole single file vector:Delete the final single file vector for the converse sequence valve in two positions occur.
If the converse sequence valve in two positions occurs in each final single file vector of exhaustion generation, nature is violated Rule, excludes all final single file vectors, and then exclude the possible matrix solution of the syntactic structure.
It is every all to meet the natural law without there is the final single file vector of the converse sequence valve in two positions, all it is to close The final single file vector of reason;Retain rational final single file vector as one of correct result, and it is possible to retain the syntactic structure Matrix solution is as one of correct result, in case generation syntax tree is used.
[B3.8.3] the 2nd kind of plug hole method:
In any one possible matrix solution, all do not substituted into clearly if there is between other any syntax vectors The syntax vector of relation, then first to there is the syntax vector of substitution relation in the possible matrix solution between other syntaxes vector All carry out equivalent substitution, with season the possible matrix solution in other syntaxes vector between be not present substitution relation syntax to Amount all keeps constant, in terms of comprehensive both of the aforesaid, and the possible matrix solution is converted into one group is all not present substitution each other The syntax vector of relationBy those original syntax vector fs in the possible matrix solution1, f2..., fδIt is referred to as For first kind syntax vector;After foregoing equivalent substitution, next one group will be converted in the manner previously described each other not In the presence of the syntax vector of the relation of substitutionIt is referred to as Equations of The Second Kind syntax vector;Emphasize, Equations of The Second Kind syntax vector is complete All it is the syntax vector each other in the absence of substitution relation.As θ >=2, overall plug hole is meaningful;The discussion below is all preset θ≥2。
Next, carrying out the unilateral overall plug hole of not order-preserving in the same direction, the overall plug hole of unilateral forward not order-preserving is referred to as:Appoint Take an Equations of The Second Kind syntax vectorAccording to direction (also can be from left to right) distich normal vector from right to leftIn each Syntax elements mark sequence valve one by one.After the sequence valve for marking syntax elements, appoint and take oneIn syntax elements, Bu Fangshe The syntax elements areIn several m-th of the element in the right, only the syntax elements right side (can also be only in left side) construction only One room;Make after sky, appoint and take one except syntax vectorEquations of The Second Kind syntax vector in additionIn the way of overall plug hole By vectorThe room that insertion is above constructed, and then generate a new syntax vector, then newly-generated syntax vector is designated asIt is every vectorial by syntax obtained from overall plug hole, it is referred to as the 3rd class syntax vector;For appointing give two Individual syntax vector α and β, if several m-th of syntax elements institutes in the right that vectorial β is inserted into vectorial α in the way of overall plug hole are right The room answered, has obtained new a 3rd class syntax vector, has then been designated as the 3rd class syntax vector of this new acquisition (α)m+ < β.Empty and plug hole step is made for 1st to finish.
It is transferred to the 2nd and makes empty and plug hole step.For making empty and the 3rd class syntax obtained from plug hole step by the 1st Vector(also from left to right, but the direction phase that sequence is chosen can be marked with last according to direction from right to left Together, i.e., with last time mark sequence in the same side), distich normal vectorIn each syntax elements all mark order Value;After the sequence valve for marking syntax elements, appoint and take oneIn syntax elements, might as well set the element isIn several t-th of the syntax elements in the right, only on the right side of the syntax elements (also can be only in left side, but will be with upper one The secondary direction for making empty selection is identical, i.e., sky is made with the last time in the same side), construct unique room;Make after sky, appoint and take one It is empty vectorial with used syntax in plug hole step except being made at the 1stWithEquations of The Second Kind syntax vector in additionWith entirety The mode of plug hole is by the vectorThe room that insertion is above constructed, and then generate a new vector, then the new vector is designated asEmpty and plug hole step is made for 2nd to finish.
As the method previously described, the 3rd class syntax vector mark is suitable obtained from making empty and plug hole step to the process last time Sequence value, but it is identical with the direction that last mark sequence is chosen, i.e., with last time mark sequence in the same side;Mark after sequence valve, appoint Take the syntax elements in the 3rd class syntax vector, construct unique unilateral room as the method previously described, but will with it is upper The direction for once making empty selection is identical, i.e., sky is made with the last time in the same side;After making sky, appoint and take one except making sky in previous It is vectorial with Equations of The Second Kind syntax of the used syntax in plug hole step beyond vectorial, by Equations of The Second Kind sentence in the way of overall plug hole The room that normal vector insertion is above constructed, and then generate a new syntax vector;Repeat foregoing operation:Whenever upper one It is secondary when make empty and plug hole step and terminate, all as the method previously described, obtained to making empty and plug hole step by the last time The 3rd class syntax vector carry out next time make the operation of empty and plug hole, until by Equations of The Second Kind syntax vectorAll Plug hole is finished, and finally obtains the 3rd class syntax vector of a single file;The 3rd class syntax finally obtained vector is referred to as final Single file vector.
Will be foregoing vectorial from Equations of The Second Kind syntax is chosen firstMake to an entire flow for generating final single file vector For a concrete scheme, i.e. so that the foregoing step made each time in empty and plug hole step concrete scheme.
By each exhaustive step all may situation, the whole schemes of exhaustion.Check each exhaustive generated most Whole single file vector:Under the premise of e on the diverse location in distinguishing possible matrix solution, by two or more complete phases Same final single file vector retains one, deletes unnecessary identical final single file vector, then deletes again and two positions occur The final single file vector of converse sequence valve.
After unnecessary identical final single file vector is deleted, if there are two positions in each final single file vector Converse sequence valve is put, then is violated the rules of nature, all final single file vectors are excluded, and then exclude the possible matrix of the syntactic structure Solution.
After unnecessary identical final single file vector is deleted, every sequence valve converse without two positions of appearance Final single file vector, all meets the natural law, is all rational final single file vector;Retain the rational vectorial conduct of final single file One of correct result, and retain the possible matrix solution of the syntactic structure as one of correct result, in case generation syntax tree is used.
The characteristics of [B3.8.4] is to two methods are summarized and compared:
It can be proved by mathematical method:Whole schemes and Overall Steps in both of the aforesaid method are all limited, solid It is fixed, provable, and the number calculation formula of whole concrete schemes and Overall Steps can be provided, it can also be given by The number calculation formula of all final single file vector of exhaustion generation.Both of the aforesaid method all constructs corresponding mapping fully intermeshing Set and corresponding recursive function are used as its mathematical modeling.Each link in both of the aforesaid method has strict Mathematics Proof With tight mathematic(al) argument.Both of the aforesaid method is the method for complying fully with the natural law.
What two foregoing methods were deployed both for following principles, be all the specific embodiment of following principles:
The sequence valve ordered series of numbers 1,2 of former sentence ..., k can be regarded as by finding clear and definite position in possible matrix solution It is limited between the syntax that can not find clear and definite position vector in the equivalent substitution and the possible matrix solution of process of the syntax vector put Obtained from secondary overall plug hole.That is, initial syntax vector corresponding with former sentenceIt can be regarded as elder generation By the equivalent substitution of the syntax for the finding clear and definite position vector in possible matrix solution, then by looking for not in possible matrix solution To obtained from the overall plug hole of limited number of time between the syntax vector of clear and definite position.The various plug hole situations differed, substantially It is exactly the arrangement and combination in Combinational Mathematics.
Two foregoing methods all meet the requirement of mentioned above principle, and the final result of two methods is completely the same 's.As can be seen here, two foregoing methods are equivalent methods.
In terms of the syntax elements for being used for making sky are chosen:Selection of the 1st kind of foregoing method for making sky element is limited System, actually require to keep the sequencing of plug hole syntax vector;The 2nd kind of foregoing method is for making the selection of sky element It is not have conditional, is actually the sequencing for not requiring to keep plug hole syntax vector.
In terms of final single file vector:Distinguish may be in matrix solution diverse location on e premise under, foregoing the The final single file vector of a kind of method generation is all syntax vector different between two, and that the 2nd kind of foregoing method is generated is final Single file vector is it is possible that duplicate, therefore to delete unnecessary identical final single file vector.
Below, foregoing two methods will be illustrated respectively herein.
[B3.8.5] is illustrated to the 1st kind of plug hole method
[B3.8.5.1] constructs the mapping as scheme mode
Note:Coordinate noun pronoun mix vector and conjunctive word mix vector all regard an entirety as, it is impossible to by other syntaxes Vectorial entirety plug hole.
The sequence valve ordered series of numbers 1,2 of former sentence ..., k can be regarded as by finding clear and definite position in possible matrix solution It is limited between the syntax that can not find clear and definite position vector in the equivalent substitution and the possible matrix solution of process of the syntax vector put Obtained from secondary overall plug hole.That is, initial syntax vector corresponding with former sentenceIt can be regarded as elder generation By the equivalent substitution of the syntax for the finding clear and definite position vector in possible matrix solution, then by looking for not in possible matrix solution To obtained from the overall plug hole of limited number of time between the syntax vector of clear and definite position.The various plug hole situations differed, substantially It is exactly the arrangement and combination in Combinational Mathematics.
The overall plug hole method of foregoing unilateral order-preserving is described in detail below.This method can be depicted accurately may matrix Each situation of the overall plug hole of limited number of time between the syntax that can not find clear and definite position vector in solution.
Former possible matrix solution is converted into the θ new syntaxes that can not find clear and definite position vectorsNote For:This θ syntax vector is subjected to fully intermeshing, according to the relative theory of Combinational Mathematics, this The fully intermeshing result of sample isIt is individual;I.e. by such fully intermeshing, θ is obtained!The individual orderly group of θ members.Will be by such The θ that fully intermeshing is obtained!The individual θ members set that group is constituted in order is designated as(θ≥2)
θ member mapping ρ j, j the ∈ N, 1≤j that construction j is different from two-by-two≤θ!It is from set to make each θ members mapping ρ j {t1, t2..., tθArrive setMapping;Gather { t1, t2..., tθOnly it is used for representing mapping ρ j domain of definition, Jin Erfu Demarcation set Φ θ member fully intermeshings are helped, without others physical meaning.Construction mapping is as follows: J ∈ N, 1≤j≤θ!;To appointing the j given1And j2, j1∈ N, j2∈ N, 1≤j1≤θ!, 1≤j2 ≤θ!If, j1≠j2, then ρ j1≠ρj2。(θ≥2)
For θ member mapping ρ j, there is following conclusion to set up: Obviously, to any one ρ j (tk), all exist1≤δ≤θ so thatIt is i.e. any one Individual ρ j (tk) all demarcate setIn a syntax vector(θ≥2)
Understood according to foregoing construction:By θ!Finite aggregate Ω={ ρ 1, ρ 2, ρ 3, ρ 4, ρ 5, ρ that individual θ members mapping ρ j are constituted 6 ..., ρ θ!Just feature finite aggregate setθ member fully intermeshings.(θ≥2)
Set π={ ρ 1, ρ 2, ρ 3, ρ 4, ρ 5, ρ 6 ..., ρ θ!Tabulate:(θ≥2)
Define 1.1:Will be foregoing from choosing Equations of The Second Kind syntax vector at oneTo the complete of the final single file vector of generation Any one order of the whole θ Equations of The Second Kind syntaxes vector used in operating process is arranged, and is used as a scheme mode.Lift one Example is described in detail:IfIt is used as scheme mode, then it represents that the 1st step is to choose vectorWithAnd by vectorVector is inserted in the way of the overall plug hole of unilateral order-preserving2nd step is to chooseAnd willWith list The mode of the overall plug hole of side order-preserving is inserted in the new syntax vector of the 1st step generation ... ..., and the θ step is to choose vectorAnd willIn the new syntax vector that (θ -1) individual step generation is inserted in the way of the overall plug hole of unilateral order-preserving.Obviously may be used See, any one θ member mappings ρ j are a scheme modes, and all θ members mapping ρ j setIt is exactly whole scheme moulds The set of formula, then the sum of whole scheme modes is θ!It is individual.(θ≥2)
Define 1.2:By the operation of any plug hole carried out according to any one scheme mode and the new vector of generation, It is used as a step of program pattern.To any one scheme mode ρ j, it will be designated as according to ρ j k-th of the step performed[nk] represent a total of n of k-th of stepkPlant alternative situation.
Define 1.3:Any one specific feelings will be chosen according to any one scheme mode ρ j each step performed Condition, then each step is all joined together, it is used as a concrete scheme.Any one concrete scheme is designated asρ j represent the scheme mode of the concrete scheme institute foundation, ikRepresent that k-th of step is chosen to be somebody's turn to do I-th in stepkThe situation of kind.
[B3.8.5.2] constructs plug hole recursive function
Next, to construct a plug hole recursive algorithm according to any one scheme mode ρ j herein Pass through the recursive algorithm, it becomes possible to portray the specific operation process of the foregoing overall plug hole of order-preserving unilateral each time.Inserted in construction Before empty recursive algorithm, following 5 definition are provided first, pre-knowledge is used as:
The plug hole recursive algorithm to be constructed belowIt is exactly that foregoing foundation scheme mode ρ j are performed K-th of step.K therein represents plug hole recursive algorithmThe number of times of operation, that is, perform foregoing one side The number of times of order-preserving entirety plug hole operation.
Define 1.4:Appoint and represent to take out simultaneously mark-up syntax vector α to syntax vector a α, function of a single variable W.W (α)=αkTable Show taking-up syntax vector α, and syntax vector α is labeled as αk, claim αkFor:Input vector.
Define 1.5:Appoint and represent to take out simultaneously mark-up syntax vector β to syntax vector a β, function of a single variable Q.Q (β)=βkTable Show taking-up syntax vector β, and syntax vector β is labeled as βk, claim βkFor:Plug hole vector.In operation plug hole recursive algorithmDuring, by syntax vector βkInsert syntax vector αkIn.
Define 1.6:Binary function Z represents distich normal vector αkSequence valve is marked, by syntax vector αkIn from it is right it is several 1 syntax elements marks sequence valve 1, then marks sequence valve 2,3 successively from right to left ..., until mark arrives vector αk In the vectorial β that includesk-1In from it is left it is several the 1st syntax elements untill.By vectorial βk-1In from it is left it is several the 1st sentence Method element is designated as λ (βk-1), by λ (βk-1) sequence valve be designated as nk, then nkIt is the maximum sequence valve of foregoing mark.The process is portrayed For:If syntax vector αk=b...... λ (βk-1)......b2b1, element λ (βk-1) represent vector βk-1In from it is left it is several 1 element.In k-th of step, to αkRun binary function Zk(α, β):λ(βk-1)<nk>......b2<2>b1<1>, by this Individual result is designated as:Vector [αk\λ(βk-1)], the vector is called:Whipping vector.MarkRepresent:To whipping to Measure [αk\λ(βk-1)] mark maximum sequence valve be nk.Operation binary function Z is obtained
Define 1.7:Binary function T is represented to carry out whipping vector overall plug hole operation, is run in binary function Z and finish it Afterwards, i-th from the right number is chosen on whipping vectorkIndividual syntax elements, and i-thkThe right side construction of individual syntax elements is unique Room, then by vectorial βkThe room is inserted in the way of overall plug hole.The process portray for:If syntax vector αk=b...... λ(βk-1)......b2b1, element λ (βk-1) represent vector βk-1In from it is left it is several the 1st element.Then whipping vector is [αk\λ (βk-1)]=λ (βk-1)<nk>......b2<2>b1<1>.In k-th of step, to vector [αk\λ(βk-1)] and vector βkOperation Binary function Tk(α, β), by βk[α is inserted in the way of overall plug holek\λ(βk-1)] the right number i-thkCorresponding to individual syntax elements Room obtain:By by new vector obtained from foregoing plug hole It is designated asClaim vectorFor:Output vector.
Define 1.8:Appoint to a syntax vector α, the number of the syntax elements included in α is designated as σ [α].If syntax Comprising n syntax elements in vectorial α, n ∈ N then obviously have:N=σ [α].
Note:In the definition of following plug hole recursive algorithm, such equation can be appreciated that:
α and β therein are the implications of independent variable, are abstract marks, can be with wide in range value.Therefore, above-mentioned notation is not Contradiction.
Below, according to mapping ρ j described previously herein and 4 functions, plug hole recursive algorithm is definedSuch as Under:
Note:Appoint and take a scheme mode ρ j;Set
Especially emphasize:As k=1, plug hole recursive algorithmPrimary condition It is:
Recursive algorithmThe characteristics of be:The overall plug hole operation of foregoing unilateral order-preserving is resolved into Four processes:1. take and make blank vector;2. plug hole vector is taken;3. sequence valve is marked to making specific syntax elements in blank vector, And intercept whipping vector;4. any syntax elements chosen in whipping vector, and unique room is constructed on the right side of it, so The room for the front construction that plug hole vector is inserted in the way of overall plug hole afterwards.
[B3.8.5.3] plug hole recursive algorithmOperation citing
(distinguishing the e on diverse location)
Order setθ=4, θ!=24, ρ j=ρ 2.
OrderThen
Make nεIt is the number of the syntax elements in the whipping vector that the ε step is chosen, iεIt is the sky of the ε step construction The corresponding order of elements number in position.
The pattern that carries into execution a plan ρ 2, then by plug hole recursive algorithmOperation 3 times.Choose one specifically Scheme
Operation is as follows:(n1∈ N, i1∈ N, 1≤i1≤n1)
Be easy to get n1=4;Take i1=3, obtain:T1(α, β)=(7eC(eA5 6eB)8eD)
Operation is as follows:(n2∈ N, i2∈ N, 1≤i2≤n2)
Be easy to get n2=6;Take i2=4, obtain:
T2(α, β)=(7eC(eA 5 6(1 2 3 4)eB)8eD)
Operation is as follows:(n3∈ N, i3∈ N, 1≤i3≤n3)
Be easy to get n3=7;Take i3=2, obtain:
T3(α, β)=(7eC(eA 5 6(1 2 3 4)eB)8(eE9 10eF)eD)
[B3.8.5.4] is to the method exhaustive and the explanation of heterogeneite
Conclusion 1.1:The fully intermeshing set of foregoing θ member mappingsWith foregoing plug hole recursive algorithm The overall plug hole of limited number of time that is exhausted between the foregoing syntax vector that can not find clear and definite position all may.
Conclusion 1.2:According to foregoing plug hole recursive algorithmDefinition understand, in context of methods In any two concrete scheme, mutually different two steps are all respectively present, i.e. any two concrete scheme is different from.Cause This, under the premise of the e on diverse location is distinguished, all final single file vector of whole concrete schemes generation of context of methods All it is syntax vector different between two.
[B3.8.5.5] first group of calculation formula and related proof
Note:Coordinate noun pronoun mix vector and conjunctive word mix vector all regard an entirety as, it is impossible to by other syntaxes Vectorial entirety plug hole.
Note:Following lemma and theorem, are all any one scheme mode ρ j and plug hole in given context of methods Recursive algorithmOn the premise of discuss;All it is the e on the diverse location in distinguishing possible matrix solution Premise under discuss.
Define 1.9:Plug hole recurrence is run to calculateIt is located at of the whipping vector intercepted in k-th of step Number is n, and it is α that might as well be located at the n whipping vector intercepted in k-th of step1, α2..., αn, then it is right in k-th of step Each in these vectors carries out overall plug hole, and the number sum of these vectorial plug hole situations is designated as into τ [∑ (k)].
Lemma 1.1:(σ definition, referring to definition 1.8)
Card:It is as follows:
(1) during k >=2, understood according to foregoing definition 1.6, the whipping vector of k-th of step Syntax elements thereinRepresent vector αkIn include Element,<βk-1>Represent vector βk-1In vectorial αkIn elementOn corresponding room.According to whipping vector [αk\λ(βk-1)] Expression formula, the syntax vector in, the number of element is clearly:σ[βk]+(ik-1).Conclusion to be proved is set up.
(2) during k=1, according to foregoing plug hole recursive algorithmDefinition directly obtain.
Card is finished
Lemma 1.2:Appoint to a k, k ∈ N and k >=1, to the syntax of the whipping vector mark intercepted in k-th of step The maximum sequence valve of element is:
Card:According to algorithmIn function ZkThe recursive definition of (α, β), to being cut in k-th of step Whipping vector [the α takenk\λ(βk-1)] the maximum sequence valve of syntax elements of mark is:
According to lemma 1.1, the whipping vector [α of k-th of stepk\λ(βk-1)] number of syntax elements that includes is:
It is to the maximum sequence valve of the syntax elements of the whipping vector mark intercepted in k-th of step then:
Card is finished
Lemma 1.3:Plug hole recursive algorithmThe concrete scheme of generation Number be:(σ definition, referring to definition 1.8) (k ∈ N, k >=1)
Card:According to lemma 1.2, appoint to a k, k ∈ N and k >=1, the whipping vector intercepted in k-th of step is marked The maximum sequence valves of syntax elements be:
According to algorithmIn plug hole function TkThe recursive definition of (α, β), i1Span be:1 ≤i1≤n1, i.e. 1≤i1≤σ[α1]。
According to algorithmIn plug hole function TkThe i when recursive definition of (α, β), k >=3k-1Value Scope is:1≤ik-1≤nk-1, i.e. 1≤ik-1≤σ[βk-2]+(ik-2-1)。
It might as well setβ1=λ (β1)......b2b1, then calculated according to foregoing result and plug hole recurrence MethodDefinition, the plug hole situation sum of the 1st step is σ [α1], i.e. β1There are σ [α1] individual insertion α1Insert Air situation condition.
β might as well be set2=λ (β2)......c2c1,Then foundation Foregoing result and algorithmIn plug hole function TkThe recursive definition of (α, β), to any given one i1, 1≤i1≤σ[α1], the number of the plug hole situation of the 2nd step is all σ [β1]+(i1-1);Due to i1Obtaining value method be time Go through real number interval [1, σ [α1]] in all natural numbers, then for i1Whole values, by sum ∑ in the way of counted, It can calculate:2nd step plug hole situation sum beThat is β2Have The plug hole situation of individual insertion whipping vector.
β might as well be set3=λ (β3)...d2d1,Or Then according to foregoing result and algorithmIn Plug hole function TkThe recursive definition of (α, β), to an any given i2, 1≤i2≤σ[β1]+(i1- 1), the 3rd step Plug hole situation number is all σ [β2]+(i2-1);Due to i2Obtaining value method be traversal real number interval [1, σ [β2]+(i2- 1) in] All natural numbers, i1Obtaining value method be traversal real number interval [1, σ [α1]] in all natural numbers, then for i2Whole take Value, carries out counting cumulative in the way of ∑ of summing, can calculate:3rd step plug hole situation sum beThat is β3HaveIt is individual insertion whipping to The plug hole situation of amount.
Proved below with mathematical induction:In k-th of step, overall plug hole is carried out to all whippings vector, acquisition The total τ [∑ (k)] of plug hole situation is:(k≥2)
Assuming that:In k-th of step, overall plug hole, the total τ of the plug hole situation of acquisition are carried out to all whippings vector [∑ (k)] is:(k≥2)
According to algorithmIn plug hole function TkThe i when recursive definition of (α, β), k >=2kValue model Enclosing is:1≤ik≤nk, i.e. 1≤ik≤σ[βk-1]+(ik-1-1)。
According to algorithmIn plug hole function TkThe recursive definition of (α, β), to any given one ik, 1≤ik≤σ[βk-1]+(ik-1- 1), the plug hole situation number of (k+1) individual step is all σ [βk]+(ik-1);Due to ikTake Value method is traversal real number interval [1, σ [βk-1]+(ik-1- 1) all natural numbers in], then for an any given σ [βk-1]+(ik-1- 1) value, carries out counting cumulative in the way of ∑ of summing, travels through ikWhole values, easily calculate: Sum by the plug hole situation of cumulative (k+1) the individual step obtained of number is I.e. for appointing the σ [β givenk-1]+(ik-1- 1) value, βk+1HaveIndividual insertion The plug hole situation of whipping vector.
Further, inductive assumption has been provided for formula σ [βk-1]+(ik-1- 1) expression formula, i.e. pass through inductive assumption σ [β can be determinedk-1]+(ik-1- 1) whole values.So, carry out counting cumulative in the way of ∑ of summing, from formulaSet out, according to inductive assumption formula Travel through ik-1..., i2, i1Whole values, so as to eliminate parameter ik-1..., i2, i1, then directly calculate:The plug hole situation sum of (k+1) individual step is:
That is βk+1It is a total ofIndividual insertion whipping vector Plug hole situation.Conclusion to be proved is set up, and mathematical induction proves to finish.
The result of summary, the total τ [∑ (k)] of the plug hole situation of k-th of step is as follows:(k≥1)
According to plug hole recursive algorithmDefinition, the total τ [∑s of the plug hole situation of k-th of step (k)], that is, algorithmLast step plug hole situation total τ [∑ (k)], with algorithmThe concrete scheme of generation<i1, i2..., ik>Number be equal, the knot of summary Really, plug hole recursive algorithmThe concrete scheme of generation<i1, i2..., ik>Number be:(σ Definition, referring to definition 1.8) (k ∈ N, k >=1)
Card is finished
Theorem 1.1:By the concrete scheme of any one scheme mode ρ j generations in context of methods< i1, i2..., ik>Number be designated as Ω [ρ j], then have:Ω [ρ j]=(formula is as follows) (θ >=2)
Card:According to plug hole recursive algorithmDefinition, apply mechanically lemma 1.3 in conjunction with this definition, then treat Card conclusion is obviously set up.Card is finished
Theorem 1.2:The number of the final single file vector of any one scheme mode ρ j generations in context of methods is designated as Ω [ρ j], then have:Ω [ρ j]=(formula is as follows) (θ >=2)
Card:According to the definition of final single file vector sum concrete scheme, the number of final single file vector and of concrete scheme Number is equal.It can be obtained according to theorem 1.1 again.Card is finished
The operation of [B3.8.5.6] to the 1st kind of plug hole method is illustrated
Take setθ=4, θ!=24, ρ j=ρ 2.
TakeThen
Then
Number Ω [ρ 2]=(formula is as follows) for the final single file vector that scheme mode ρ 2 is generated:
Work as i1When=1:
Work as i1When=2:
Work as i1When=3:
Work as i1When=4:
Scheme mode ρ 2 generate final single file vector number be:Ω [ρ 2]=140.
Context of methods whole scheme modes generation final single file vector number be:
[B3.8.5.7] second group of calculation formula and related proof
Former possible matrix solution is converted into the θ new syntaxes that can not find clear and definite position vectorsNote For:By the newly-generated syntax that can not find clear and definite position vectorSystem Referred to as Equations of The Second Kind syntax is vectorial.Any one syntax vector being eliminated during foregoing equivalent substitution is referred to as predecessor's sentence Normal vector.For any one newly-generated Equations of The Second Kind syntax vectorBy during foregoing equivalent substitution QuiltThe number of the predecessor's syntax vector f replaced is designated as u ε.ThenIt is to be obtained by ε equivalent substitution of u.
For example:By vector f2=e+ < f3+ < 7+ < f5And f3=3+ < e+ < 4+ < e and f5=8+ < e+ < 9+ < 10 are passed through Cross equivalent substitution and generate an Equations of The Second Kind vector Then It will be apparent that u1=2, i.e.,Obtained by 2 equivalent substitutions.
Theorem 1.3:Appoint to an Equations of The Second Kind syntax vectorWillIn the individual number scale of syntax elements that includes ForBy syntax vectorThe number of predecessor's syntax vector f of cancellation is designated as uε, thenMeet recurrence formula:
Card:Prove as follows with mathematical induction:
(1), if uε=0, then syntax is vectorialThe number of predecessor's syntax vector f of cancellation is 0, i.e.,It is former possible square It is vectorial without the syntax by equivalent substitution in battle array solution, syntax vectorIn the syntax elements number that includes be 4, it is clear that now FormulaSet up.
(2), it is assumed that work as uεDuring=kSet up, nowThe number of predecessor's syntax vector f of cancellation is k,In the element number that includes be 3k+4;Work as uεDuring=k+1, it is considered asK predecessor's syntax vector f is first eliminated, then herein On basis,Introduce predecessor's syntax vector f again while itself syntax elements is subtracted, i.e.,Subtracting certainly 4 elements are introduced while 1 syntax elements of body again.Then nowIn the element number that includes be 3k+4+3, then formulaSet up.Comprehensive (1), (2), it is known that conclusion to be proved is set up.
The summary of [B3.8.5.8] on the number calculation formula of context of methods
Conclusion 1.3:Under the premise of e on diverse location is distinguished, by any one scheme mode ρ in context of methods The number of the corresponding concrete schemes of j is identical with the number for the final single file vector that scheme mode ρ j are generated, and is all designated as Ω [ρ j], then Have:Ω [ρ j]=(formula is as follows) (σ definition, referring to definition 1.8) (θ >=2)
Conclusion 1.4:D might as well be definedθ=σ [ρ j (tθ)], then the formula of conclusion 1.3 is converted into:Ω [ρ j]=(as follows) be (σ's Definition, referring to definition 1.8) (θ >=2)
Conclusion 1.5:According to theorem 1.3, then have:Ω [ρ j]=(as follows) (σ definition, referring to definition 1.8) (θ >=1)
Conclusion 1.6:Define gθ=3uθ+ 4, then Ω [ρ j] formula of conclusion 1.5 be converted into:(σ's determines Ω [ρ j]=(as follows) Justice, referring to definition 1.8) (θ >=2)
Conclusion 1.7:Because producing θ altogether in context of methods implementation process!Individual scheme mode, then distinguishing diverse location On e premise under, the number and the whole of generation of whole concrete schemes of whole scheme modes in context of methods generation The number of final single file vector is identical, and number formula is:(θ≥2)
Conclusion 1.8:Comprehensive each foregoing conclusion, generated in context of methods implementation process limited concrete scheme and Limited final single file vector.What the number of concrete scheme and the number of final single file vector were to determine, there is definite calculating public Formula and corresponding proof, meet the natural law.
Comprehensive demonstration of [B3.8.5.9] the 1st kind of method
Illustrate:Take as follows may matrix solution, and will first be found in the possible matrix solution syntax of clear and definite position to Amount carries out equivalent substitution.If:Possible matrix solution, it is as follows:
Former possible matrix solution, is converted into by equivalent substitution:
SetIt is listed as follows:(setθ=3, θ!=6)
Construction mapping ρ j are as follows:(θ=3,)
J ∈ N, 1≤j≤6
Set π={ ρ 1, ρ 2, ρ 3, ρ 4, ρ 5, ρ 6 } is listed as follows:
The pattern that carries into execution a plan ρ 1, by recursive algorithmOperation 2 times.
Operation is as follows:(n1∈ N, i1∈ N, 1≤i1≤n1)
Operation is as follows:(n2∈ N, i2∈ N, 1≤i2≤n2)
According to scheme mode ρ 1,List:
According to scheme mode ρ 1,List:
The pattern that carries into execution a plan ρ 2, by recursive algorithmOperation 2 times.
Operation is as follows:(n1∈ N, i1∈ N, 1≤i1≤n1)
Operation is as follows:(n2∈ N, i2∈ N, 1≤i2≤n2)
According to scheme mode ρ 2,List:
According to scheme mode ρ 2,List:
The pattern that carries into execution a plan ρ 2, by recursive algorithmOperation 2 times.
Operation is as follows:(n1∈ N, i1∈ N, 1≤i1≤n1)
Operation is as follows:(n2∈ N, i2∈ N, 1≤i2≤n2)
According to scheme mode ρ 3,List:
According to scheme mode ρ 3,List:
The pattern that carries into execution a plan ρ 4-- ρ 6 process, slightly.
By plug hole recursive algorithmImportant information list be summarized as follows:
Above-mentioned formula can be also converted into:The form represented with the number of predecessor's syntax vector.
The number list of element in the vectorial number of predecessor's syntax and Equations of The Second Kind syntax vector:
The important information of plug hole recursive algorithm is expressed with the number of predecessor's syntax vector, is listed as follows:
Check whether each final single file vector two converse sequence valves in position occurs, slightly.
Demonstration is finished example comprehensively.
[B3.8.6] is illustrated to the 2nd kind of plug hole method
The foregoing overall plug hole method of unilateral not order-preserving is described in detail below.This method can be depicted accurately may square Each situation of the overall plug hole of limited number of time between the syntax that can not find clear and definite position vector in battle array solution.
[B3.8.6.1] constructs plug hole recursive function
Scheme mode, concrete scheme, the definition of step, with reference to the 1st kind of method.Here is the 2nd kind of method and the 1st kind of method Difference.Construct a plug hole recursive algorithmIt is foregoing every with regard to that can portray by the recursive algorithm The once detailed process of the overall plug hole of unilateral not order-preserving.Before constitution step recursive algorithm, following 5 definition are provided first, It is used as pre-knowledge:
The plug hole recursive algorithm to be constructed belowIt is k-th performed according to scheme mode ρ j Step.K therein represents plug hole recursive algorithmThe number of times of operation, that is, perform foregoing unilateral not order-preserving The number of times of overall plug hole operation.
Define 2.1:Appoint and represent to take out simultaneously mark-up syntax vector α to syntax vector a α, function of a single variable W.W (α)=αkTable Show taking-up syntax vector α, and syntax vector α is labeled as αk
Define 2.2:Appoint and represent to take out simultaneously mark-up syntax vector β to syntax vector a β, function of a single variable Q.Q (β)=βkTable Show taking-up syntax vector β, and syntax vector β is labeled as βk.In operation plug hole recursive algorithmProcess In, by syntax vector βkInsert syntax vector αkIn.
Define 2.3:Function of a single variable Z represents distich normal vector αkSequence valve is marked, by syntax vector αkIn from it is right it is several 1 syntax elements mark sequence valve 1, then mark sequence valve 2,3 successively from right to left ..., until by syntax vector αk In syntax elements all mark finish.The maximum sequence valve of mark is designated as nk.Operation function of a single variable Z is obtained:
Define 2.4:Binary function T represents using function of a single variable Z distich normal vectors αkAfter mark sequence, in vectorial αkUpper choosing Take right several mkIndividual element, and in mkThe right side of individual element constructs unique room, then by syntax vector βkWith overall plug hole Mode insert the room.The new vector obtained after plug hole is designated as:Operation binary function T is obtained:
Define 2.5:Appoint to a syntax vector α, the number of the syntax elements included in α is designated as σ [α].If syntax Comprising n syntax elements in vectorial α, n ∈ N then obviously have:N=σ [α].
Note:In the definition of following plug hole recursive algorithm, such equation can be appreciated that:
Wk(α)=Tk-1(α, β);
α and β therein are the implications of independent variable, are abstract marks, can be with wide in range value.Therefore, above-mentioned notation is not Contradiction.
Below, according to mapping ρ j described previously herein and 4 functions, plug hole recursive algorithm is definedSuch as Under:(set)
The number calculation formula of [B3.8.6.2] concrete scheme and final single file vector
Lemma 2.1:
Card:According to foregoing definition, in operation recursive algorithmDuring, make blank vector αkWith Plug hole vector βkIn syntax elements all do not increase or decrease, i.e., for any one syntax elements b:
If 1. b ∈ αkOr b ∈ αk, then
If 2.AndThen
According to 1. and 2., it is clear that have:
Card is finished
Lemma 2.2:If:(k ∈ N, k >=1)
Syntax vector ∑ ΨkExpression is by ρ j (t1), ρ j (t2) ..., ρ j (tk) sequentially pass through unilateral not order-preserving and integrally insert Syntax vector obtained from sky.m1, m2..., mk-1Any one room ordinal number of corresponding vector is represented respectively.Then just like Draw a conclusion establishment:
(note:)
Card:Prove as follows with mathematical induction:
(1), if k=1,Conclusion is set up.
(2), it is assumed that set up, then have as k=h
As k=h+1,Then have:
According to lemma 2.1, then it can obtain:
It can be obtained according to inductive assumption:
So as to:
It can obtain again:
Card is finished
Theorem 2.1:The number of the concrete scheme of any one scheme mode ρ j generations in context of methods is designated as Ω [ρ J], then have:(θ≥2)
Card:According to plug hole recursive algorithmDefinition, for arbitrary scheme mode ρ j, kth The number of the plug hole situation of individual step makes blank vector α with k-th stepkSyntax elements number it is identical.According to foregoing Definition, it is clear thatThen according to lemma 2.2, scheme mode ρ j k-th of step hasIndividual situation.Due to Any one scheme mode ρ j have (θ -1) individual step under its command, and the multiplicative principle according to Combinational Mathematics is understood:Context of methods it is any One scheme mode ρ j is corresponded toIndividual concrete scheme.
Card is finished
Theorem 2.2:The number of the final single file vector of any one scheme mode ρ j generations in context of methods is designated as Ω [ρ j], then have:(θ≥2)
Card:According to the definition of final single file vector sum concrete scheme, the number of final single file vector and of concrete scheme Number is equal;Again according to theorem 2.1, it can obtain.
Card is finished
Conclusion 2.1:A total of θ of context of methods!Individual scheme mode, according to theorem 2.1, and it is former according to the addition of Combinational Mathematics Reason, it is known that the sum of concrete scheme is:
Conclusion 2.2:A total of θ of context of methods!Individual scheme mode, according to theorem 2.2, and it is former according to the addition of Combinational Mathematics Reason, it is known that finally the sum of single file vector is:
The presented example of [B3.8.6.3] the 2nd kind of method
Illustrate:Take as follows may matrix solution, and will first be found in the possible matrix solution syntax of clear and definite position to Amount carries out equivalent substitution.If:Possible matrix solution, it is as follows:
Former possible matrix solution, is converted into by equivalent substitution:
SetIt is listed as follows:Setθ=3, θ!=6
Construction mapping ρ j are as follows:(θ=3,)
Set π={ ρ 1, ρ 2, ρ 3, ρ 4, ρ 5, ρ 6 } is listed as follows:
The pattern that carries into execution a plan ρ 1, by step recursive algorithmOperation 2 times.
Operation is as follows:
Operation is as follows:
According to scheme mode ρ 1, plug hole function is runList:
According to scheme mode ρ 1, plug hole function is runList:
The pattern that carries into execution a plan ρ 2-- ρ 6 process, slightly.
By plug hole recursive algorithmImportant information list be summarized as follows:
Two or more identical final single file vectors are retained one, unnecessary identical final list is deleted Row vector, finally obtains 210 final single file vectors different between two, is fitted like a glove with the result of method 1.
Check whether each final single file vector two converse sequence valves in position occurs, slightly.
Presented example is finished.
C portion applicating example
C1 parts example 1
Example 1:By pretreatment, the impurity in sentence can be removed, and mark and the word element number in identification sentence and Type.For example, for english statement S=" I can completely understand what what you just said Really meant ", it removes sentence S=" the I can understand what what you said obtained after impurity Meant ", after the identification of word unit and word cell type mark and numbering are carried out to it, can obtain the number matched with following table According to structure.
Sentence Word cell type Numbering
I Noun pronoun unit 1
can understand Predicate verb unit 2
what A Subordinate conjunctive word unit 3
what B Subordinate conjunctive word unit 4
you Noun pronoun unit 5
said Predicate verb unit 6
meant Predicate verb unit 7
The present invention to the pretreated sentence represented by data above structure based on syntactic analysis is carried out, to obtain each word Composition relation of the unit in sentence.
Fig. 1 is the flow chart of the method for the computer based natural language syntactic structure parsing of the embodiment of the present invention.Such as Shown in Fig. 1, methods described includes:
Step 110, reading pretreated phrase data structure to be resolved, the pretreated phrase data structure In only include conjunctive word unit, predicate verb unit and the noun pronoun unit of sentence, and each word unit according to described through pre- Order in the sentence of processing is numbered and marking types.
Step 120, to each predicate verb unit, generate corresponding leading question element, subject element, predicate element and guest Language element;Arranged side by side conjunctive word list of the possibility value of the leading question element for numbering less than corresponding predicate verb element number One of member or subordinate conjunctive word unit, or the conjunctive word unit arranged side by side by a numbering less than corresponding predicate verb element number With one it is adjacent thereto and numbering less than corresponding predicate verb element number and numbering be more than the conjunctive word element number arranged side by side One of the conjunctive word mix vector that constitutes of subordinate conjunctive word unit, or dummy cell;
Noun pronoun unit of the possibility value of the subject element for numbering less than corresponding predicate verb element number One of, or most major term unit numbering less than corresponding predicate verb element number all coordinate noun pronoun mix vector races Included in one of coordinate noun pronoun mix vector, or in one of corresponding syntax vector of predicate element of preceding appearance, or Dummy cell;
The predicate element is the corresponding predicate verb unit;
The possibility value numbering of the object element is more than corresponding predicate verb element number and less than adjacent rear One of noun pronoun unit of the predicate verb element number of appearance, or the numbering of minimum word unit are more than corresponding predicate verb Element number and less than the adjacent predicate verb element number in rear appearance all coordinate noun pronoun mix vector races in Comprising one of coordinate noun pronoun mix vector, or in one of corresponding syntax vector of predicate element of rear appearance, or empty Unit.
Specifically, for pretreated sentence, if its predicate verb unit total quantity is n, due to predicate verb unit It is only capable of as predicate, therefore, each predicate verb unit corresponds to a predicate element, it is r to remember each predicate verb unitk, k =1 ..., n.
After predicate element is obtained, Position Number generation corresponding leading question element of the continuation based on each predicate element, Subject element, object element.
I, leading question element
Remember each predicate verb unit rkCorresponding conjunctive word unit set is:
{xk}=Leadk∪conjk∪(conjkοLeadk)∪{e}
Remember predicate verb unit rkCorresponding leading question element is xk, its possible value collection is combined into { xk}.Generate predicate verb Unit rkCorresponding leading question element is xkPossibility value set (preferably) include:
Arranged side by side conjunctive word of the possibility value of the leading question element for numbering less than corresponding predicate verb element number One of unit or subordinate conjunctive word unit, or the conjunctive word list arranged side by side by a numbering less than corresponding predicate verb element number Member and an adjacent thereto and numbering are less than corresponding predicate verb element number and numbering is more than conjunctive word unit volume arranged side by side Number one of the conjunctive word mix vector that constitutes of subordinate conjunctive word unit, or dummy cell.
That is, xk∈Leadk∪conjk∪(conjkοLeadk)∪{e}。
In above-mentioned formula, set LeadkRepresent that numbering is associated less than the subordinate of corresponding predicate verb element number Word unit set;conjkRepresent arranged side by side conjunctive word unit set of the numbering less than corresponding predicate verb element number;(conjkο Leadk) represent adjacent thereto less than the conjunctive word unit arranged side by side of corresponding predicate verb element number and one by a numbering And numbering is more than the subordinate conjunctive word list of the conjunctive word element number arranged side by side less than corresponding predicate verb element number and numbering The set for the conjunctive word mix vector that member is constituted, e represents dummy cell.
For example, for pretreated sentence S=" the I can understand what what shown in above-mentioned table 1 You said meant ", have:
r1=" can understand ", for r1There is { x1}={ e }, that is, and r1The value of corresponding leading question element For dummy cell.
r2=" said ", for r2There is { x2}={ what A, what B, e }, with r2Corresponding leading question element it is desirable It is worth for first what or second what in sentence, i.e. " what A " and " one of what B ", or dummy cell.
r3=" meant ", for r3There is { x3}={ what A, what B, e }, with r3Corresponding leading question element it is desirable It is worth for first what or second what in sentence, i.e. " what A " and " one of what B ", or dummy cell.
II, subject element
Remember each predicate verb unit rkCorresponding subject noun pronoun unit set is { yk}=NPIyk∪VNPyk∪ NOMPk∪Gk∪ { e } or { yk}=NPIyk∪VNPyk∪NOMPk∪Gk∪fyk∪{e}。
Remember predicate verb unit rkCorresponding subject element is yk, its possible value collection is combined into { yk}。
Generate corresponding subject element ykPreferably include:
(1) when corresponding predicate verb element number is minimum predicate verb element number, the subject element Volume of the possible value for numbering less than one of noun pronoun unit of corresponding predicate verb element number, or its most major term unit Number be less than corresponding predicate verb element number all coordinate noun pronoun mix vector races included in coordinate noun generation One of phrase resultant vector, or dummy cell.
That is, when in the absence of rk-1When:{yk}=NPIyk∪VNPyk∪NOMPk∪Gk∪{e};
So as to yk∈NPIyk∪VNPyk∪NOMPk∪Gk∪{e}。
In above-mentioned formula, set NPIykRepresent pure noun list of the numbering less than corresponding predicate verb element number Member set;VNPykRepresent verb unit set of the numbering less than the noun property of corresponding predicate verb element number;NOMPkCompile Number be less than corresponding predicate verb element number nominative pronoun unit set;GkRepresent by most major term unit numbering be less than pair The predicate verb element number answered all coordinate noun pronoun mix vector races composition and gather;E represents dummy cell.
(2) when corresponding predicate verb element number is not minimum predicate verb element number, the subject element One of noun pronoun unit for numbering less than corresponding predicate verb element number of possibility value, or the most volume of major term unit Number be less than corresponding predicate verb element number all coordinate noun pronoun mix vector races included in coordinate noun generation One of phrase resultant vector, or in one of corresponding syntax vector of predicate verb unit of preceding appearance, or dummy cell.
There is r that is, working ask-1When:{yk}=NPIyk∪VNPyk∪NOMPk∪Gk∪fyk∪{e}。
So as to yk∈NPIyk∪VNPyk∪NOMPk∪Gk∪fyk∪{e}。
In above-mentioned formula, set NPIykRepresent pure noun list of the numbering less than corresponding predicate verb element number Member set;VNPykRepresent verb unit set of the numbering less than the noun property of corresponding predicate verb element number;NOMPkCompile Number be less than corresponding predicate verb element number nominative pronoun unit set;GkRepresent by most major term unit numbering be less than pair The predicate verb element number answered all coordinate noun pronoun mix vector races composition and gather;fykRepresent occur preceding The vector set of predicate verb unit corresponding syntax;E represents dummy cell.
For example, for pretreated sentence S=" the I can understand what what shown in above-mentioned table 1 You said meant ", have:
r1=" can understand ", for r1It is the minimum predicate verb unit of numbering, therefore, { y to have it1}= NOMP1∪ { e }={ I, e }.
r2=" said ", for r2There is its predicate verb unit for not numbering minimum, in r1And r2Between noun pronoun Unit only has " you ", and the function numbered less than 2 is f1, therefore, { y2}=NOMP2∪fy2∪ { e }={ I, you } ∪ { f1}∪ {e}。
r3=" meant ", for r3It not numbers the predicate verb unit of minimum, in r2And r3Between there is no noun generation Word unit, and the function numbered less than 3 is f1And f2, therefore, have:{y3}=NOMP3∪fy3∪ { e }={ I, you } ∪ { f1, f2}∪{e}。
III, object element
Remember each predicate verb unit rkCorresponding object noun pronoun unit set is { zk}=NPIzk∪VNPzk∪ OBJPk∪Hk∪ { e } or { zk}=NPIzk∪VNPzk∪OBJPk∪Hk∪fzk∪{e}。
Meanwhile, note predicate verb unit rkCorresponding leading question element is zk, its possible value collection is combined into { zk}。
Generate corresponding object element { zkPreferably include:
(1) when corresponding predicate verb element number is maximum predicate verb element number, the object element Volume of the possible value for numbering more than one of noun pronoun unit of corresponding predicate verb element number, or its minimum word unit Number be more than corresponding predicate verb element number all coordinate noun pronoun mix vector races included in coordinate noun generation One of phrase resultant vector, or dummy cell.
That is, when in the absence of rk+1When:{zk}=NPIzk∪VNPzk∪OBJPk∪Hk∪{e}。
In above-mentioned formula, set NPIzkRepresent pure noun list of the numbering more than corresponding predicate verb element number Member set;VNPzkRepresent verb unit set of the numbering more than the noun property of corresponding predicate verb element number;OBJPkTable Show objective case pronoun unit set of the numbering more than the noun property of corresponding predicate verb element number;HkRepresent by minimum word list The numbering of member is all coordinate noun pronoun mix vector races composition more than corresponding predicate verb element number and gathers;e Represent dummy cell.
(2) when corresponding predicate verb element number is not maximum predicate verb element number, the object element Possibility value be numbering more than corresponding predicate verb element number and less than the adjacent predicate verb unit in rear appearance One of noun pronoun unit of numbering, or the numbering of its minimum word unit are more than corresponding predicate verb element number and less than phase Coordinate noun included in all coordinate noun pronoun mix vector races of the adjacent predicate verb element number in rear appearance One of pronoun mix vector, or in one of corresponding syntax vector of predicate verb unit of rear appearance, or dummy cell.
There is r that is, working ask+1When:{zk}=NPIzk∪VNPzk∪OBJPk∪Hk∪fzk∪{e}。
In above-mentioned formula, set NPIzkRepresent numbering more than corresponding predicate verb element number and less than adjacent In the pure noun unit set of the predicate verb element number of rear appearance;VNPzkRepresent that numbering is more than corresponding predicate verb Element number and less than the adjacent predicate verb element number in rear appearance noun property verb unit set;NOMPk Represent more than corresponding predicate verb element number and less than the objective case generation of the adjacent predicate verb element number in rear appearance Word unit set;HkRepresent the numbering by minimum word unit more than corresponding predicate verb element number and less than adjacent rear The predicate verb element number of appearance all coordinate noun pronoun mix vector races composition and gather;fzkExpression goes out after The corresponding syntax vector set of existing predicate verb unit;E represents dummy cell.
For example, for pretreated sentence S=" the I can understand what what shown in above-mentioned table 1 You said meant ", have:
r1=" can understand ", for r1It is not the largest number of predicate verb unit, in r1And r2Between deposit In noun pronoun unit " you ", without coordinate noun pronoun mix vector, and the function numbered more than 1 is f2, f3, therefore, {z1}=OBJP1∪fz1∪ { e }={ you } ∪ { f2, f3}∪{e}。
r2=" said ", for r2Its not the largest number of predicate verb unit, in r1And r2Between there is no noun pronoun Unit, and the function numbered more than 2 is f3, also without coordinate noun pronoun mix vector, therefore, have:{z2}=fz2∪ { e }= {f3}∪{e}。
r3=" meant ", for r3It is the largest number of predicate verb unit, in r3Afterwards without noun pronoun unit, Also without coordinate noun pronoun mix vector, and the function numbered more than 3 is also not present, therefore, { z3}={ e }.
Thus, handled via step 120, for above-mentioned example, the value set for obtaining each element can be generated.
Step 130, according to the leading question element, subject element, predicate element, object element possibility value, obtain The corresponding syntax vector of each predicate verb unit is possible to value, and the syntax vector includes leading question element, subject Element, predicate element, object element.
As it was previously stated, each subject-predicate matching structure can be represented with the mode of syntax vector.According to the fortune of step 120 Row result, for pretreated sentence S=" the I can understand what what you said shown in above-mentioned table 1 Meant ", has:
{r1}={ can understand }
{x1}={ e }
{y1}={ I, e }
{z1}={ you, f2, f3, e }
With the multiplicative principle in Combinational Mathematics:f1(x1, y1, r1, z1)=(sees below list)
Sequence number Row matrix f1 Sequence number Row matrix f1
(1-1) f1=(e, I, r1, you) (1-5) f1=(e, e, r1, you)
(1-2) f1=(e, I, r1, f2) (1-6) f1=(e, e, f1, f2)
(1-3) f1=(e, I, r1, f3) (1-7) f1=(e, e, r1, f3)
(1-4) f1=(e, I, r1, e) (1-8) f1=(e, e, r1, e)
Constant is replaced with sequence valve, is obtained:f1(x1, y1, r1, z1)=(sees below list)
Sequence number Row matrix f1 Sequence number Row matrix f1
(1-1) f1=(e, 1,2,5) (1-5) f1=(e, e, 2,5)
(1-2) f1=(e, 1,2, f2) (1-6) f1=(e, e, 2, f2)
(1-3) f1=(e, 1,2, f3) (1-7) f1=(e, e, 2, f3)
(1-4) f1=(e, 1,2, e) (1-8) f1=(e, e, 2, e)
{r2}={ said }
{x2}={ what A, what B, e }
{y2}={ I, you, f1, e }
{z2}={ f3, e }
With the multiplicative principle in Combinational Mathematics:f2(x2, y2, r2, z2)=(sees below list)
Sequence number Row matrix f2 Sequence number Row matrix f2
(2-1) f2=(what A, I, r2, f3) (2-13) f2=(what B, f1, r2, f3)
(2-2) f2=(what A, I, r2, e) (2-14) f2=(what B, f1, r2, e)
(2-3) f2=(what A, you, r2, f3) (2-15) f2=(what B, e, r2, f3)
(2-4) f2=(what A, you, r2, e) (2-16) f2=(what B, e, r2, e)
(2-5) f2=(what A, f1, r2, f3) (2-17) f2=(e, I, r2, f3)
(2-6) f2=(what A, f1, r2, e) (2-18) f2=(e, I, r2, e)
(2-7) f2=(what A, e, r2, f3) (2-19) f2=(e, you, r2, f3)
(2-8) f2=(what A, e, r2, e) (2-20) f2=(e, you, r2, e)
(2-9) f2=(what B, I, r2, f3) (2-21) f2=(e, f1, r2, f3)
(2-10) f2=(what B, I, r2, e) (2-22) f2=(e, f1, r2, e)
(2-11) f2=(what B, you, r2, f3) (2-23) f2=(e, e, r2, f3)
(2-12) f2=(what B, you, r2, e) (2-24) f2=(e, e, r2, e)
Constant is replaced with sequence valve, is obtained:f2(x2, y2, r2, z2)=(sees below list)
Sequence number Row matrix f2 Sequence number Row matrix f2
(2-1) f2=(3,1,6, f3) (2-13) f2=(4, f1, 6, f3)
(2-2) f2=(3,1,6, e) (2-14) f2=(4, f1, 6, e)
(2-3) f2=(3,5,6, f3) (2-15) f2=(4, e, 6, f3)
(2-4) f2=(3,5,6, e) (2-16) f2=(4, e, 6, e)
(2-5) f2=(3, f1, 6, f3) (2-17) f2=(e, 1,6, f3)
(2-6) f2=(3, f1, 6, e) (2-18) f2=(e, 1,6, e)
(2-7) f2=(3, e, 6, f3) (2-19) f2=(e, 5,6, f3)
(2-8) f2=(3, e, 6, e) (2-20) f2=(e, 5,6, e)
(2-9) f2=(4,1,6, f3) (2-21) f2=(e, f1, 6, f3)
(2-10) f2=(4,1,6, e) (2-22) f2=(e, f1, 6, e)
(2-11) f2=(4,5,6, f3) (2-23) f2=(e, e, 6, f3)
(2-12) f2=(4,5,6, e) (2-24) f2=(e, e, 6, e)
{r3}={ meant }
{x3}={ what A, what B, e }
{y3}={ I, you, f1, f2, e }
{z3}={ e }
With the multiplicative principle in Combinational Mathematics:f3(x3, y3, r3, z3)=(sees below list)
Sequence number Row matrix f3 (3-8) f3=(what B, f1, r3, e)
(3-1) f3=(what A, I, r3, e) (3-9) f3=(what B, f2, r3, e)
(3-2) f3=(what A, you, r3, e) (3-10) f3=(what B, e, r3, e)
(3-3) f3=(what A, f1, r3, e) (3-11) f3=(e, I, r3, e)
(3-4) f3=(what A, f2, r3, e) (3-12) f3=(e, you, r3, e)
(3-5) f3=(what A, e, r3, e) (3-13) f3=(e, f1, r3, e)
(3-6) f3=(what B, I, r3, e) (3-14) f3=(e, f2, r3, e)
(3-7) f3=(what B, you, r3, e) (3-15) f3=(e, e, r3, e)
Constant is replaced with sequence valve, is obtained:f3(x3, y3, r3, z3)=(sees below list)
Sequence number Row matrix f3 (3-8) f3=(4, f1, 7, e)
(3-1) f3=(3,1,7, e) (3-9) f3=(4, f2, 7, e)
(3-2) f3=(3,5,7, e) (3-10) f3=(4, e, 7, e)
(3-3) f3=(3, f1, 7, e) (3-11) f3=(e, 1,7, e)
(3-4) f3=(3, f2, 7, e) (3-12) f3=(e, 5,7, e)
(3-5) f3=(3, e, 7, e) (3-13) f3=(e, f1, 7, e)
(3-6) f3=(4,1,7, e) (3-14) f3=(e, f2, 7, e)
(3-7) f3=(4,5,7, e) (3-15) f3=(e, e, 7, e)
With the multiplicative principle in Combinational Mathematics:
| S |=| f1|×|f2|×|f3|=8 × 24 × 15=2880
Then collectively generate 2880 possible matrix solutions.
Step 140, according to all syntaxes vector be possible to value generate at least one syntactic structure possibility matrix solution, The possible matrix solution of the syntactic structure according to the tactic syntax vector of predicate verb element number by constituting.
For pretreated sentence S=" the I can understand what what you shown in above-mentioned table 1 Said meant " are based on f1, f2And f3Possibility value, multiple possible matrix solutions can be obtained.
Step 150, checking according to syntactic structure may the obtained sentence of matrix solution whether with the pretreated sentence It is identical, if identical, each syntax vector in the possible matrix solution of the syntactic structure is exported, and be used as syntax knot One of structure analysis result.Preferably, substitute word unit using word element number and carry out equivalent substitution, overall plug hole, inclined add operation, Whether the statement sequence for being then based on obtaining is that Serial No. sequentially judges whether and the complete phase of pretreated sentence Together.
Step 150 may include steps of:
Step 151, if there is the sequence valve that may do not occur in the syntactic structure in matrix solution, then exclude the syntax knot Structure may matrix solution;For example for following possibility matrix solution:
The word unit that numbering is 4 does not occur, and excludes.
If step 152, there is identical sequence valve in different syntax vectors or identical syntax vector occur, Excluding the syntactic structure may matrix solution;For example for following possibility matrix solution:
The word unit that numbering is 5 is occurred in that twice, is excluded.
Step 153, in each possible matrix solution, will other syntaxes vector between exist mutually substitute into relation sentence Normal vector all carries out equivalent substitution, if occurring the intersection contradiction of two syntax vectors after equivalent substitution, excluding should Syntactic structure may matrix solution;
For example for following possibility matrix solution:
Above-mentioned matrix is substituted into, f2And f3Occur in that the substitution of function intersects contradiction.Substitution is obtained:f2=3+ < e+ < 6+ < (4+ < f2+ < 7+ < e).Equation left and right ends occur in that f simultaneously2, this has occurred as soon as () logical contradiction.Exclude.
Step 154, in each possible matrix solution, will other syntaxes vector between exist mutually substitute into relation sentence Normal vector all carries out equivalent substitution, if occurring the converse sequence valve in two positions after equivalent substitution, excludes the sentence Method structure may matrix solution;
For example for following possibility matrix solution:
It is substituted into, f2=4+ < 5+ < 6+ < 3+ < e+ < 7+ < e, obtain order for (4,5,6,3, e, 7, e), There is the converse sequence valve in position, exclude.
Step 155, in any one possible matrix solution, if there is other syntaxes vector between do not substitute into mutually The syntax vector of relation, then perform plug hole operation to obtain the possibility syntax analytic structure corresponding to all possible matrix solutions, And whether the sentence that checking is obtained according to the possible syntax analytic structure is identical with the pretreated sentence, it enters One step includes:
Step 155.1, first equivalent generation is carried out to there is the syntax of substitution relation vector in the possible matrix solution each other Change, so that the possible matrix solution is converted into one group of syntax vector each other in the absence of substitution relationSyntax vector that will likely be in matrix solution is referred to as first kind syntax vector, will convert obtained syntax to AmountReferred to as Equations of The Second Kind syntax is vectorial;
Step 155.2, appoint and take an Equations of The Second Kind syntax vectorMark one by one in the predetermined directionIn each The sequence valve of syntax elements;After the sequence valve for marking syntax elements, appoint and takeIn i-th of syntax elements, only in the syntax The unique room of the first side structure of element;Make after sky, appoint and take a syntax vectorEquations of The Second Kind syntax vector in additionBy syntax vector in the way of overall plug holeThe constructed room of insertion, and then a new syntax vector is generated, will This new syntax vector is designated asAnd by obtained from overall plug hole syntax vector, be referred to as the 3rd class syntax to Amount;
Step 155.3, to the 3rd class syntax vectorIn the predetermined direction to from vectorIn the syntax elements of the first side first start to vectorIn the vector that includesThe second side Each syntax elements untill first syntax elements, all mark sequence valve;Positioned at vectorIn include VectorThe element of first side, does not mark sequence valve;By vectorFirst syntax elements of the second side be designated asWill In the manner previously described to vectorThe syntax vector portion of mark, is designated as whipping syntax vectorMark Note after sequence valve, appoint j-th of the syntax elements taken in a foregoing whipping vector, it is only unique in the side structure of element first Room;Make after sky, appoint and take an original Equations of The Second Kind syntax vectorIn the way of overall plug hole by syntax to AmountThe constructed room of insertion, and then generate a new syntax vector, then newly-generated syntax vector is designated asOr
3rd class syntax vectorAccording to predetermined direction, distich normal vectorIn each Syntax elements all mark sequence valve;After the sequence valve for marking syntax elements, appoint and take oneIn t-th Syntax elements, in the unique room of the first side structure of the syntax elements;Make after sky, appoint and take an Equations of The Second Kind was not used to obtain Syntax vectorBy the vector in the way of overall plug holeThe room that insertion is above constructed, and then the new vector of generation one, Then the new vector is designated as
Step 155.4, step 155.3 is repeated, when the last time, which makes empty and plug hole step, to be terminated, to passing through Last time make empty and the 3rd class syntax vector obtained from plug hole step carry out next time make the operation of empty and plug hole, until by institute There is Equations of The Second Kind syntax vectorWhole plug holes are finished, and finally obtain the 3rd class syntax vector of a single file, will be described The 3rd class syntax vector finally obtained is referred to as final single file vector;
Step 155.5, if existed in the corresponding all final single files vectors of a possible syntax analytic structure The converse sequence valve in two positions, then exclude the possible syntax analytic structure;
Step 155.6, step 155.2 is repeated to step 155.5 until all possible syntax analytic structures are traversed.
For example, for example as described above, the possible matrix solution of a syntactic structure is:
Above-mentioned matrix is converted into linear representation is:
Operated by foregoing plug hole, there is the converse sequence valve in two positions, row in each final single file vector Remove.
For example as described above, the possible matrix solution of a syntactic structure is:
Can be linear representation by matrix conversion:
Carry out equivalent substitution operation and obtain sentence:
α=e+ < 1+ < 2+ < (3+ < (4+ < 5+ < 6+ < e)+< 7+ < e)
Remove dummy cell e, obtain:
α=1+ < 2+ < (3+ < (4+ < 5+ < 6)+< 7)
It is identical with pretreated sentence, and the nested structure is one of syntactic structure analysis result.
Word unit constant is substituted into above-mentioned matrix, then syntactic structure matrix solution can be expressed as:
The S corresponding with this matrix expression linear representation is as follows:
Accordingly, " its syntactic structure of I can understand what what you said meant " is parsing sentence: I is as the subject of main clause, and can understand are as the predicate of main clause, and " what what you said meant " make subordinate clause For the object clause of main clause, in the subordinate clause, first what is subordinate clause introducer, and " what you said " are the master of subordinate clause Language, meant is the predicate of object clause, and object clause is in itself without object;For " what you said " subordinates clause, it is served as Nested subject clause in object clause, what is introducer, you is subject, and said is predicate.
Further, methods described can also include step display, by syntactic structure analysis result each syntax vector with And corresponding syntax structural relationship is shown with tree in human-computer interaction interface.
C2 parts example 2
Example 2:As another example, illustrate the method for the present embodiment for for example below:“That men who were Appointed didn ' t bother the liberals wash ' t remarked upon by the press. " are so Labyrinth sentence resolving.
Above-mentioned sentence by pretreatment remove impurity and number after word order list be:
Former sentence phrase Phrase type Serial number
That Subordinate conjunctive word unit 1
men Noun pronoun unit 2
who Subordinate conjunctive word unit 3
were appointed Predicate verb unit 4
didn’t bother Predicate verb unit 5
the liberals Noun pronoun unit 6
wasn’t remarked Predicate verb unit 7
The sentence has three predicate verb units, and r is designated as respectively1、r2And r3
For r1Have, { r1}={ were appointed }
{x1}={ That, who, e } (e is null character string)
{y1}={ men, e }
{z1}={ f2, f3, e }
With the multiplicative principle in Combinational Mathematics:f1(x1, y1, r1, z1)=(sees below list)
Sequence number Row matrix f1 Sequence number Row matrix f1
(1-1) f1=(That, men, r1, f2) (1-10) f1=(That, e, r1, f3)
(1-2) f1=(who, men, r1, f2) (1-11) f1=(who, e, r1, f3)
(1-3) f1=(e, men, r1, f2) (1-12) f1=(e, e, r1, f3)
(1-4) f1=(That, e, r1, f2) (1-13) f1=(That, men, r1, e)
(1-5) f1=(who, e, r1, f2) (1-14) f1=(who, men, r1, e)
(1-6) f1=(e, e, r1, f2) (1-15) f1=(e, men, r1, e)
(1-7) f1=(That, men, r1, f3) (1-16) f1=(That, e, r1, e)
(1-8) f1=(who, men, r1, f3) (1-17) f1=(who, e, r1, e)
(1-9) f1=(e, men, r1, f3) (1-18) f1=(e, e, r1, e)
Constant is replaced with sequence valve, is obtained:f1(x1, y1, r1, z1)=(sees below list)
Sequence number Row matrix f1 Sequence number Row matrix f1
(1-1) f1=(1,2,4, f2) (1-10) f1=(1, e, 4, f3)
(1-2) f1=(3,2,4, f2) (1-11) f1=(3, e, 4, f3)
(1-3) f1=(e, 2,4, f2) (1-12) f1=(e, e, 4, f3)
(1-4) f1=(1, e, 4, f2) (1-13) f1=(1,2,4, e)
(1-5) f1=(3, e, 4, f2) (1-14) f1=(3,2,4, e)
(1-6) f1=(e, e, 4, f2) (1-15) f1=(e, 2,4, e)
(1-7) f1=(1,2,4, f3) (1-16) f1=(1, e, 4, e)
(1-8) f1=(3,2,4, f3) (1-17) f1=(3, e, 4, e)
(1-9) f1=(e, 2,4, f3) (1-18) f1=(e, e, 4, e)
For r2Have, { r2}={ didn ' t bother }
{x2}={ That, who, e } (e is null character string)
{y2}={ men, f1, e }
{z2}={ the liberals, f3, e }
With the multiplicative principle in Combinational Mathematics:f2(x2, y2, r2, z2)=(sees below list)
Sequence number Row matrix f2 (2-14) f2=(who, f1, r2, f3)
(2-1) f2=(That, men, r2, the liberals) (2-15) f2=(e, f1, r2, f3)
(2-2) f2=(who, men, r2, the liberals) (2-16) f2=(That, e, r2, f3)
(2-3) f2=(e, men, r2, the liberals) (2-17) f2=(who, e, r2, f3)
(2-4) f2=(That, f1, r2, the liberals) (2-18) f2=(e, e, r2, f3)
(2-5) f2=(who, f1, r2, the liberals) (2-19) f2=(That, men, r2, e)
(2-6) f2=(e, f1, r2, the liberals) (2-20) f2=(who, men, r2, e)
(2-7) f2=(That, e, r2, the liberals) (2-21) f2=(e, men, r2, e)
(2-8) f2=(who, e, r2, the liberals) (2-22) f2=(That, f1, r2, e)
(2-9) f2=(e, e, r2, the liberals) (2-23) f2=(who, f1, r2, e)
(2-10) f2=(That, men, r2, f3) (2-24) f2=(e, f1, r2, e)
(2-11) f2=(who, men, r2, f3) (2-25) f2=(That, e, r2, e)
(2-12) f2=(e, men, r2, f3) (2-26) f2=(who, e, r2, e)
(2-13) f2=(That, f1, r2, f3) (2-27) f2=(e, e, r2, e)
Constant is replaced with sequence valve, is obtained:f2(x2, y2, r2, z2)=(sees below list)
Sequence number Row matrix f2 (2-14) f2=(3, f1, 5, f3)
(2-1) f2=(1,2,5,6) (2-15) f2=(e, f1, 5, f3)
(2-2) f2=(3,2,5,6) (2-16) f2=(1, e, 5, f3)
(2-3) f2=(e, 2,5,6) (2-17) f2=(3, e, 5, f3)
(2-4) f2=(1, f1, 5,6) (2-18) f2=(e, e, 5, f3)
(2-5) f2=(3, f1, 5,6) (2-19) f2=(1,2,5, e)
(2-6) f2=(e, f1, 5,6) (2-20) f2=(3,2,5, e)
(2-7) f2=(1, e, 5,6) (2-21) f2=(e, 2,5, e)
(2-8) f2=(3, e, 5,6) (2-22) f2=(1, f1, 5, e)
(2-9) f2=(e, e, 5,6) (2-23) f2=(3, f1, 5, e)
(2-10) f2=(1,2,5, f3) (2-24) f2=(e, f1, 5, e)
(2-11) f2=(3,2,5, f3) (2-25) f2=(1, e, 5, e)
(2-12) f2=(e, 2,5, f3) (2-26) f2=(3, e, 5, e)
(2-13) f2=(1, f1, 5, f3) (2-27) f2=(e, e, 5, e)
For r3Have:{r3}={ wasn ' t remarked }
{x3}={ That, who, e }
{y3}={ men, the liberals, f1, f2, e }
{z3}={ e }
With the multiplicative principle in Combinational Mathematics:f3(x3, y3, r3, z3)=(sees below list)
Sequence number Row matrix f3 (3-8) f3=(who, f1, r3, e)
(3-1) f3=(That, men, r3, e) (3-9) f3=(e, f1, r3, e)
(3-2) f3=(who, men, r3, e) (3-10) f3=(That, f2, r3, e)
(3-3) f3=(e, men, r3, e) (3-11) f3=(who, f2, r3, e)
(3-4) f3=(That, the liberals, r3, e) (3-12) f3=(e, f2, r3, e)
(3-5) f3=(who, the liberals, r3, e) (3-13) f3=(That, e, r3, e)
(3-6) f3=(e, the liberals, r3, e) (3-14) f3=(who, e, r3, e)
(3-7) f3=(That, f1, r3, e) (3-15) f3=(e, e, r3, e)
Constant, f are replaced with sequence valve3(x3, y3, r3, z3)=(sees below list)
Sequence number Row matrix f3 (3-8) f3=(3, f1, 7, e)
(3-1) f3=(1,2,7, e) (3-9) f3=(e, f1, 7, e)
(3-2) f3=(3,2,7, e) (3-10) f3=(1, f2, 7, e)
(3-3) f3=(e, 2,7, e) (3-11) f3=(3, f2, 7, e)
(3-4) f3=(1,6,7, e) (3-12) f3=(e, f2, 7, e)
(3-5) f3=(3,6,7, e) (3-13) f3=(1, e, 7, e)
(3-6) f3=(e, 6,7, e) (3-14) f3=(3, e, 7, e)
(3-7) f3=(1, f1, 7, e) (3-15) f3=(e, e, 7, e)
With the multiplicative principle in Combinational Mathematics:
| S |=| f1|×|f2|×|f3|=18 × 27 × 15=7290
Then collectively generate 7290 possible matrix solutions.
To the possible matrix solution of all syntactic structures, operation matrix substitutes into solver, structural modifications program, can made The possibility matrix solution of final result is parsed for syntactic structure:
The example sentence is this paper typical example sentence of overall plug hole method be successfully processed one.Inserted by entirety described previously herein Vacancy is managed, and one of above-mentioned overall plug hole result of possibility matrix solution is the final single file vector of following one:E+ < (1+ < 2+ < (3+ < e+ < 4+ < e)+< 5+ < 6)+< 7+ < e.There is not permutation number in this final single file vector, be it is rational most Whole single file vector.This final single file vector, it is identical with former sentence.The possible matrix solution also exactly correct sentence of this example sentence Method structure elucidation result.
The numbering of above-mentioned possible matrix solution is reduced to word unit, following form is obtained:
This matrix is converted into linear representation:
Remove e to obtain:
Thus, the correct parsing for above-mentioned sentence example is obtained, i.e.,:f3It is main clause, that is, kernel sentence;f2It is f3's Subject, i.e. subject clause;f1It is attributive clause, modifies men.
This example can preferably show the superiority of this method.For above-mentioned sentence, what current computer industry was generally acknowledged Two kinds in the world FA natural language syntactic structure resolver-Berkeley resolver (Berkeley Parser) and Stamford resolver (Stanford Parser), when being submitted to the application, what is provided is still the analysis result of mistake.This The result that two kinds of devices are provided is identical.Its result is as follows:
①That men didn’t bother;
②who were appointed;
③the liberals wasn’t remarked upon by the press.
1. it is main clause, that is, kernel sentence;3. it is object 1., i.e. object clause;2. it is attributive clause, modifies men; That is determiner, modifies men.
Among English, if subject clause is located at full sentence beginning of the sentence, and guided by that, then that cannot be omitted, Even if spoken language is also such.In the method for the invention, due to sentence is processed as into syntax vector, therefore it is just subject clause This part of That men didn ' t bother the liberals, sufficient space has been reserved during parsing, has been filled Protect the possibility that it is generated as a complete subordinate sentence with dividing.
Parsing for that subject clauses guided is often malfunctioned this important technical leak, and it is submitted to the application When, above two natural language syntactic structure resolver advanced in the world still could not make up.
C3 parts example 3
Example 3:As another example, illustrate the method for the present embodiment for for example below:“Jack who has a The resolving of the sentence of labyrinth as beautiful car is a businessman. ".Above-mentioned sentence is by pre- Handling the word order list after removing impurity and numbering is:
Former sentence phrase Phrase type Serial number
Jack Noun pronoun unit 1
who Subordinate conjunctive word unit 2
has Predicate verb unit 3
a car Noun pronoun unit 4
is Predicate verb unit 5
a businessman Noun pronoun unit 6
The sentence has two predicate verb units, and r is designated as respectively1And r2
For r1Have, { r1}={ has }
{x1}={ who, e } (e is null character string)
{y1}={ Jack, e }
{z1}={ a car, f2, e }
With the multiplicative principle in Combinational Mathematics:f1(x1, y1, r1, z1)=(sees below list)
Sequence number Row matrix f1 Sequence number Row matrix f1
(1-1) f1=(who, Jack, r1, a car) (1-7) f1=(who, e, r1, f2)
(1-2) f1=(e, Jack, r1, a car) (1-8) f1=(e, e, r1, f2)
(1-3) f1=(who, e, r1, a car) (1-9) f1=(who, Jack, r1, e)
(1-4) f1=(e, e, r1, a car) (1-10) f1=(e, Jack, r1, e)
(1-5) f1=(who, Jack, r1, f2) (1-11) f1=(who, e, r1, e)
(1-6) f1=(e, Jack, r1, f2) (1-12) f1=(e, e, r1, e)
Constant, f are replaced with sequence valve1(x1, y1, r1, z1)=(sees below list)
Sequence number Row matrix f1 Sequence number Row matrix f1
(1-1) f1=(2,1,3,4) (1-7) f1=(2, e, 3, f2)
(1-2) f1=(e, 1,3,4) (1-8) f1=(e, e, 3, f2)
(1-3) f1=(2, e, 3,4) (1-9) f1=(2,1,3, e)
(1-4) f1=(e, e, 3,4) (1-10) f1=(e, 1,3, e)
(1-5) f1=(2,1,3, f2) (1-11) f1=(2, e, 3, e)
(1-6) f1=(e, 1,3, f2) (1-12) f1=(e, e, 3, e)
For r2Have, { r2}={ is }
{x2}={ who, e } (e is null character string)
{y2}={ Jack, a car, f1, e }
{z2}={ a businessman, e }
With the multiplicative principle in Combinational Mathematics:f2(x2, y2, r2, z2)=(sees below list)
Sequence number Row matrix f2 Sequence number Row matrix f2
(2-1) f2=(who, Jack, r2, a businessman) (2-9) f2=(who, Jack, r2, e)
(2-2) f2=(e, Jack, r2, a businessman) (2-10) f2=(e, Jack, r2, e)
(2-3) f2=(who, a car, r2, a businessman) (2-11) f2=(who, a car, r2, e)
(2-4) f2=(e, a car, r2, a businessman) (2-12) f2=(e, a car, r2, e)
(2-5) f2=(who, f1, r2, a businessman) (2-13) f2=(who, f1, r2, e)
(2-6) f2=(e, f1, r2, a businessman) (2-14) f2=(e, f1, r2, e)
(2-7) f2=(who, e, r2, a businessman) (2-15) f2=(who, e, r2, e)
(2-8) f2=(e, e, r2, a businessman) (2-16) f2=(e, e, r2, e)
Constant, f are replaced with sequence valve2(x2, y2, r2, z2)=(sees below list)
Sequence number Row matrix f2 Sequence number Row matrix f2
(2-1) f2=(2,1,5,6) (2-9) f2=(2,1,5, e)
(2-2) f2=(e, 1,5,6) (2-10) f2=(e, 1,5, e)
(2-3) f2=(2,4,5,6) (2-11) f2=(2,4,5, e)
(2-4) f2=(e, 4,5,6) (2-12) f2=(e, 4,5, e)
(2-5) f2=(2, f1, 5,6) (2-13) f2=(2, f1, 5, e)
(2-6) f2=(e, f1, 5,6) (2-14) f2=(e, f1, 5, e)
(2-7) f2=(2, e, 5,6) (2-15) f2=(2, e, 5, e)
(2-8) f2=(e, e, 5,6) (2-16) f2=(e, e, 5, e)
With the multiplicative principle in Combinational Mathematics:
| S |=| f1|×|f2|=12 × 16=192
Then collectively generate 192 possible matrix solutions.
The possibility matrix solution of final result can be obtained parsing as syntactic structure:
The example sentence is this paper typical example sentence of overall plug hole method be successfully processed one.Inserted by entirety described previously herein Vacancy is managed, and above-mentioned possibility matrix solution has obtained the unique one final single file for permutation number do not occur vector:E+ < 1+ < (2+ < e+ < 3+ < 4)+< 5+ < 6.This final single file vector is rational final single file vector.This final single file vector Syntax sequence valve numbering, it is identical with former sentence.The possible matrix solution also exactly correct syntactic structure parsing of this example sentence As a result.
The numbering of above-mentioned possible matrix solution is reduced to word unit, following form is obtained:
This matrix is converted into linear representation:
Remove e to obtain:
C4 parts example 4
Example 4:As another example, illustrate the method for the present embodiment for for example below:" After Jack, Mary and The resolving of the sentence of parallel construction as Linda left, I gave my son a new book. ".
Above-mentioned sentence by pretreatment remove impurity and number after word order list be:
Former sentence phrase Phrase type Serial number
After Subordinate conjunctive word unit 1
Jack Noun pronoun unit 2
Mary Noun pronoun unit 3
and Conjunctive word unit arranged side by side 4
Linda Noun pronoun unit 5
left Predicate verb unit 6
I Noun pronoun unit 7
gave Predicate verb unit 8
my son Noun pronoun unit 9
a book Noun pronoun unit 10
Include generation coordinate noun pronoun mix vector race by following steps:
S2.1 chooses unduplicated two nouns pronoun unit:
If A, between the two noun pronoun units there is no other word units, by the two noun pronoun units As a coordinate noun pronoun mix vector, and retain the coordinate noun pronoun mix vector;
If B, between the two noun pronoun units exist other word units, check between the two noun generations Each word unit between word unit:If any one word unit between the two noun pronoun units, all It is noun pronoun unit or conjunctive word unit arranged side by side, then by two selected noun pronoun units and between the two noun generations All word units between word unit as a coordinate noun pronoun mix vector, and retain the coordinate noun pronoun combine to Amount;Otherwise, coordinate noun pronoun mix vector is not generated;
S2.2 performs S2.1 again until the combination of all noun pronoun units is traversed, and it is all that generation is obtained Coordinate noun pronoun mix vector;
If there is coordinate noun pronoun mix vector in the S2.3 possible syntax analytic structures, to all coordinate nouns Pronoun mix vector is divided, so as to form several coordinate noun pronoun mix vector races so that:In each name arranged side by side In word pronoun mix vector race, each coordinate noun pronoun included in the coordinate noun pronoun mix vector race combine to Amount all contains two common noun pronoun units.
S2.4 chooses the volume included in all noun pronoun mix vectors in each noun pronoun mix vector race Number maximum word unit, as the most major term unit of the noun pronoun mix vector race, in case being used when being subsequently generated subject;Choosing The word unit for taking the numbering included in all noun pronoun mix vectors minimum, as the noun pronoun mix vector race most Small word unit, in case being used when being subsequently generated object.
In generation subject element set { y1}、{y2During, run subject generating algorithm arranged side by side as follows:
1. A (S) takes out all NPI phrases, entirety VNP phrases, the entirety NOMP phrases in former sentence, and will be complete in former sentence Body NPI phrases, entirety VNP phrases, entirety NOMP phrases are classified as a set, by the set be designated as Ψ=Jack, Mary, Linda, I, my son, a book }={ 2,3,5,7,9,10 }.
2. B (Ψ) represent according toMode take whole groups of any two element in set Ψ={ 2,3,5,7,9,10 } Close, if set Then B (Ψ)={ 2,3 }, { 2,5 }, { 2,7 }, { 2,9 }, { 2,10 }, { 3,5 }, { 3,7 }, { 3,9 }, { 3,10 }, { 5,7 }, { 5,9 }, { 5,10 }, { 7,9 }, { 10,7 }, 10, 9}}。
3. K (α, β) is to function of a single variable B (Ψ) result, i.e., to appointing one givenAccording to elementThe arrangement from small to large of syntax sequence valve in former sentence S.It might as well then setIt then can obtain ordered pairThenThe ordered pair of generation is:
{<2,3>,<2,5>,<2,7>,<2,9>,<2,10>,<3,5>,<3,7>,<3,9>,<3,10>,<5,7>,<5,9>, <5,10>,<7,9>,<7,10>,<9,10>};
If setAnd then set up a continuous word String formulaWhereinBe in former sentence S fromArriveOne The adjacent continuous word string of group or empty word string, andThen Then Φ1=2+ < e+ < 3, Φ2=2+ < 3+ < 4+ < 5, Φ3=2+ < 3+ < 4+ < 5+ < 6+ < 7, Φ4 =2+ < 3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9, Φ5=2+ < 3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9+ < 10, Φ6=3 + < 4+ < 5, Φ7=3+ < 4+ < 5+ < 6+ < 7, Φ8=3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9, Φ9=3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9+ < 10, Φ10=5+ < 6+ < 7, Φ11=5+ < 6+ < 7+ < 8+ < 9, Φ12=5+ < 6+ < 7+ < 8+ < 9+ < 10, Φ13=7+ < 8+ < 9, Φ14=7+ < 8+ < 9+ < 10, Φ15=9+ < e+ < 10.
④H(Φt) binary function K (α, β) is generatedChecked:If To appointing the element γ ∈ Φ givent, andAndHave:γ=NPI or γ=VNP or γ=NOMP or γ= CONJ or γ=e, then by ΦtMark be changed toReferred to as ΦtGenerationIf set Then gatherThen
5. M (α, β) is represented for appointing one taken setIf setDeposit CorrespondingA set race is then defined, the set race is by including setAll set constitute, the set Race is designated asThen Then M (α, β)={ I1({ 2,3 }), I2({ 3,5 }), I3({ 9,10 }) }.
6. results of the N (α, β) to binary function M (α, β)Set is taken for appointing If setThere is corresponding set raceThen construct a new set as followsP [I can then be obtained1({ 2,3 })]={ 2,3,4,5 }, P [I2({ 3,5 })]={ 2,3,4,5 }, P [I3({ 9,10 })]={ 9,10 }.
7. results of the U (α) to binary function N (α, β)TakeIfTo appointing the element γ given,There is τ (γ) ≤τ(δ).Then Pmax[I1({ 2,3 })]=5, Pmax[I2({ 3,5 })]=5, Pmax[I3({ 9,10 })]=10.
For r1There is { r1}={ left }, numbering is 6.Then the choosing method of corresponding subject is:When in the absence of rk-1When: {yk}=NPIyk∪VNPyk∪NOMPk∪Gk∪{e};
Wherein:
Wherein:
In above-mentioned formula, GkRepresent entirety arranged side by side name of the maximum numbering less than corresponding predicate verb element number The set of word pronoun race and is gathered.
Then have:
Then have: Then r1The set of corresponding subject element is:
In generation object element set { z1}、{z2During, run object generating algorithm arranged side by side as follows:
1. A (S) takes out all NPI phrases, entirety VNP phrases, the entirety NOMP phrases in former sentence, and will be complete in former sentence Body NPI phrases, entirety VNP phrases, entirety NOMP phrases are classified as a set, by the set be designated as Ψ=Jack, Mary, Linda, I, my son, a book }={ 2,3,5,7,9,10 }.
2. B (Ψ) represent according toMode take whole groups of any two element in set Ψ={ 2,3,5,7,9,10 } Close, if set Then B (Ψ)={ 2,3 }, { 2,5 }, { 2,7 }, { 2,9 }, { 2,10 }, { 3,5 }, { 3,7 }, { 3,9 }, { 3,10 }, { 5,7 }, { 5,9 }, { 5,10 }, { 7,9 }, { 10,7 }, 10, 9}}。
3. K (α, β) is to function of a single variable B (Ψ) result, i.e., to appointing one givenAccording to elementThe arrangement from small to large of syntax sequence valve in former sentence S.It might as well then setIt then can obtain ordered pairThenThe ordered pair of generation is:
{<2,3>,<2,5>,<2,7>,<2,9>,<2,10>,<3,5>,<3,7>,<3,9>,<3,10>,<5,7>,<5,9>, <5,10>,<7,9>,<7,10>,<9,10>};
If setAnd then set up a continuous word String formulaWhereinBe in former sentence S fromArriveOne The adjacent continuous word string of group or empty word string, andThen Then Φ1=2+ < e+ < 3, Φ2=2+ < 3+ < 4+ < 5, Φ3=2+ < 3+ < 4+ < 5+ < 6+ < 7, Φ4 =2+ < 3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9, Φ5=2+ < 3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9+ < 10, Φ6=3 + < 4+ < 5, Φ7=3+ < 4+ < 5+ < 6+ < 7, Φ8=3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9, Φ9=3+ < 4+ < 5+ < 6+ < 7+ < 8+ < 9+ < 10, Φ10=5+ < 6+ < 7, Φ11=5+ < 6+ < 7+ < 8+ < 9, Φ12=5+ < 6+ < 7+ < 8+ < 9+ < 10, Φ13=7+ < 8+ < 9, Φ14=7+ < 8+ < 9+ < 10, Φ15=9+ < e+ < 10.
④H(Φt) binary function K (α, β) is generatedChecked:If To appointing the element γ ∈ Φ givent, andAndHave:γ=NPI or γ=VNP or γ=NOMP or γ= CONJ or γ=e, then by ΦtMark be changed toReferred to as ΦtGenerationIf set Then gatherThen
5. M (α, β) is represented for appointing one taken setIf setDeposit CorrespondingA set race is then defined, the set race is by including setAll set constitute, the set Race is designated asThen Then M (α, β)={ I1({ 2,3 }), I2({ 3,5 }), I3({ 9,10 }) }.
6. results of the N (α, β) to binary function M (α, β)Set is taken for appointing If setThere is corresponding set raceThen construct one New set is as followsP [I can then be obtained1({ 2,3 })]= { 2,3,4,5 }, P [I2({ 3,5 })]={ 2,3,4,5 }, P [I3({ 9,10 })]={ 9,10 }.
7. V (β) represents the result to binary function N (α, β)TakeIfTo appointing the element γ given,There is τ (δ) ≤τ(γ).Then Pmin[I1({ 2,3 })]=2, Pmin[I2({ 3,5 })]=2, Pmin[I3({ 9,10 })]=9.
For r2There is { r2}={ gave }, numbering is 8.Then the choosing method of corresponding object is:When in the absence of rk+1When: {zk}=NPIzk∪VNPzk∪OBJPk∪Hk∪{e};
Wherein:
Wherein:When in the absence of rk+1When:
In above-mentioned formula, HkRepresent all coordinate noun generation of the minimum value numbering more than corresponding predicate verb unit Set of words set race and is gathered.
Then have:
Then have:
Then r2The set of corresponding object element is:
Note:During processing, coordinate noun pronoun mix vector is handled as an entirety;Coordinate noun pronoun Combine to can not be by the vectorial plug hole of other syntaxes;In checks sequence value, directly coordinate noun pronoun is combined to being included Syntax sequence valve is substituted into.
For r1There is { r1}={ left }
{x1}={ After, and, e } (e is null character string)
{z1}={ f2, e }
With the multiplicative principle in Combinational Mathematics:f1(x1, y1, r1, z1)=(sees below list)
For r2There is { r2}={ gave }
{x2}={ After, and, e } (e is null character string)
With the multiplicative principle in Combinational Mathematics:f2(x2, y2, r2, z2)=(sees below list)
Constant is replaced with sequence valve, slightly.
With the multiplicative principle in Combinational Mathematics:
| S |=| f1|×|f2|=42 × 108=4536
Then collectively generate 4536 possible matrix solutions.
The possibility matrix solution of final result can be obtained parsing as syntactic structure:
Above-mentioned possible matrix solution is further reduced, following form is obtained:
Note:The result is obtained by overall plug hole method.
This matrix is converted into linear representation:
Remove e to obtain:
C5 parts example 5
Example 5:As another example, illustrate the method for the present embodiment for for example below:" Linda was singing, The resolving of the sentence of parallel construction as and Mary was dancing. ".
Above-mentioned sentence by pretreatment remove impurity and number after word order list be:
Former sentence phrase Phrase type Serial number
Linda Noun pronoun unit 1
was singing Predicate verb unit 2
and Conjunctive word unit arranged side by side 3
Mary Noun pronoun unit 4
was dancing Predicate verb unit 5
For r1There is { r1}={ was singing }
{x1}={ e } (e is null character string)
{y1}={ Linda, e }
{z1}={ Mary, f2, e }
With the multiplicative principle in Combinational Mathematics:f1(x1, y1, r1, z1)=(sees below list)
Sequence number Row matrix f1
(1-1) f1=(e, Linda, r1, Mary)
(1-2) f1=(e, e, r1, Mary)
(1-3) f1=(e, Linda, r1, f2)
(1-4) f1=(e, e, r1, f2)
(1-5) f1=(e, Linda, r1, e)
(1-6) f1=(e, e, r1, e)
For r2There is { r2}={ was dancing }
{x2}={ and, e } (e is null character string)
{y2}={ Linda, Mary, f1, e }
{z2}={ e }
With the multiplicative principle in Combinational Mathematics:f2(x2, y2, r2, z2)=(sees below list)
Sequence number Row matrix f2
(2-1) f2=(and, Linda, r2, e)
(2-2) f2=(and, Mary, r2, e)
(2-3) f2=(and, f1, r2, e)
(2-4) f2=(and, e, r2, e)
(2-5) f2=(e, Linda, r2, e)
(2-6) f2=(e, Mary, r2, e)
(2-7) f2=(e, f1, r2, e)
(2-8) f2=(e, e, r2, e)
Constant is replaced with sequence valve, slightly.
With the multiplicative principle in Combinational Mathematics:
| S |=| f1|×|f2|=6 × 8=48
Then collectively generate 48 possible matrix solutions.
The numbering of above-mentioned possible matrix solution is reduced to word unit, following form is obtained:
This matrix is converted into linear representation:
Remove e to obtain:
C6 parts example 6
Example 6:As another example, illustrate the method for the present embodiment for for example below:“I know that you have The resolving of the sentence of parallel construction as a car and that he has a bike. ".
Above-mentioned sentence by pretreatment remove impurity and number after word order list be:
Former sentence phrase Phrase type Serial number
I Noun pronoun unit 1
know Predicate verb unit 2
that A Subordinate conjunctive word unit 3
you Noun pronoun unit 4
have Predicate verb unit 5
a car Noun pronoun unit 6
and Conjunctive word unit arranged side by side 7
that B Subordinate conjunctive word unit 8
he Noun pronoun unit 9
has Predicate verb unit 10
a bike Noun pronoun unit 11
Remember each predicate verb unit rkThe collection of corresponding leading question element is combined into:
{xk}=Leadk∪conjk∪(conjkοLeadk)∪{e}
Remember predicate verb unit rkCorresponding leading question element is xk, its possible value collection is combined into { xk}.Generate predicate verb Unit rkCorresponding leading question element is xkPossibility value set:
{x1}={ Lead1∪ { e }={ that A, e };
{x2}=Lead2∪conj2∪(conj2οLead2) ∪ { e }={ that A, and, that B, Ψ, e }.
Above-mentioned two formula derives from the generating algorithm of leading question element:{xk}=Leadk∪conjk∪(conjkο Leadk)∪{e}.Wherein, (conjkοLeadk)={ Rk|Rk=conj+ < Lead, conj <rk, Lead <rk, τ (Lead) =τ (conj)+1 };(conjkοLeadk) represent the conjunctive word arranged side by side less than corresponding predicate verb element number by a numbering Unit and one it is adjacent thereto and numbering less than corresponding predicate verb element number and numbering be more than the conjunctive word unit arranged side by side The set for the conjunctive word mix vector that the subordinate conjunctive word unit of numbering is constituted.
For r1There is { r1}={ know }
{x1}={ e } (e is null character string)
{y1}={ I, e }
{z1}={ you, f2, f3, e }
With the multiplicative principle in Combinational Mathematics:f1(x1, y1, r1, z1)=(sees below list)
Sequence number Row matrix f1 Sequence number Row matrix f1
(1-1) f1=(e, I, r1, you) (1-5) f1=(e, I, r1, f3)
(1-2) f1=(e, e, r1, you) (1-6) f1=(e, e, r1, f3)
(1-3) f1=(e, I, r1, f2) (1-7) f1=(e, I, r1, e)
(1-4) f1=(e, e, r1, f2) (1-8) f1=(e, e, r1, e)
For r2There is { r2}={ have }
{x2}={ that A, e } (e is null character string)
{y2}={ you, I, f1, e }
{z2}={ a car, f3, e }
With the multiplicative principle in Combinational Mathematics:f2(x2, y2, r2, z2)=(sees below list)
Sequence number Row matrix f2 Sequence number Row matrix f2
(2-1) f2=(that A, you, r2, a car) (2-13) f2=(that A, f1, r2, f3)
(2-2) f2=(e, you, r2, a car) (2-14) f2=(e, f1, r2, f3)
(2-3) f2=(that A, I, r2, a car) (2-15) f2=(that A, e, r2, f3)
(2-4) f2=(e, I, r2, a car) (2-16) f2=(e, e, r2, f3)
(2-5) f2=(that A, f1, r2, a car) (2-17) f2=(that A, you, r2, e)
(2-6) f2=(e, f1, r2, a car) (2-18) f2=(e, you, r2, e)
(2-7) f2=(that A, e, r2, a car) (2-19) f2=(that A, I, r2, e)
(2-8) f2=(e, e, r2, a car) (2-20) f2=(e, I, r2, e)
(2-9) f2=(that A, you, r2, f3) (2-21) f2=(that A, f1, r2, e)
(2-10) f2=(e, you, r2, f3) (2-22) f2=(e, f1, r2, e)
(2-11) f2=(that A, I, r2, f3) (2-23) f2=(that A, e, r2, e)
(2-12) f2=(e, I, r2, f3) (2-24) f2=(e, e, r2, e)
For r3There is { r3}={ has }
{x3}={ that A, that B, and, Ψ, e } (e is null character string)
{y3}={ you, I, a car, he, f1, f2, e }
{z3}={ a bike, e }
With the multiplicative principle in Combinational Mathematics:f3(x3, y3, r3, z3)=(sees below list)
Sequence number Row matrix f3 Sequence number Row matrix f3
(3-1) f3=(that A, you, r3, a bike) (3-36) f3=(that A, you, r3, e)
(3-2) f3=(that B, you, r3, a bike) (3-37) f3=(that B, you, r3, e)
(3-3) f3=(and, you, r3, a bike) (3-38) f3=(and, you, r3, e)
(3-4) f3=(Ψ, you, r3, a bike) (3-39) f3=(Ψ, you, r3, e)
(3-5) f3=(e, you, r3, a bike) (3-40) f3=(e, you, r3, e)
(3-6) f3=(that A, I, r3, a bike) (3-41) f3=(that A, I, r3, e)
(3-7) f3=(that B, I, r3, a bike) (3-42) f3=(that B, I, r3, e)
(3-8) f3=(and, I, r3, a bike) (3-43) f3=(and, I, r3, e)
(3-9) f3=(Ψ, I, r3, a bike) (3-44) f3=(Ψ, I, r3, e)
(3-10) f3=(e, I, r3, a bike) (3-45) f3=(e, I, r3, e)
(3-11) f3=(that A, a car, r3, a bike) (3-46) f3=(that A, a car, r3, e)
(3-12) f3=(that B, a car, r3, a bike) (3-47) f3=(that B, a car, r3, e)
(3-13) f3=(and, a car, r3, a bike) (3-48) f3=(and, a car, r3, e)
(3-14) f3=(Ψ, a car, r3, a bike) (3-49) f3=(Ψ, a car, r3, e)
(3-15) f3=(e, a car, r3, a bike) (3-50) f3=(e, a car, r3, e)
(3-16) f3=(that A, he, r3, a bike) (3-51) f3=(that A, he, r3, e)
(3-17) f3=(that B, he, r3, a bike) (3-52) f3=(that B, he, r3, e)
(3-18) f3=(and, he, r3, a bike) (3-53) f3=(and, he, r3, e)
(3-19) f3=(Ψ, he, r3, a bike) (3-54) f3=(Ψ, he, r3, e)
(3-20) f3=(e, he, r3, a bike) (3-55) f3=(e, he, r3, e)
(3-21) f3=(that A, f1, r3, a bike) (3-56) f3=(that A, f1, r3, e)
(3-22) f3=(that B, f1, r3, a bike) (3-57) f3=(that B, f1, r3, e)
(3-23) f3=(and, f1, r3, a bike) (3-58) f3=(and, f1, r3, e)
(3-24) f3=(Ψ, f1, r3, a bike) (3-59) f3=(Ψ, f1, r3, e)
(3-25) f3=(e, f1, r3, a bike) (3-60) f3=(e, f1, r3, e)
(3-26) f3=(that A, f2, r3, a bike) (3-61) f3=(that A, f2, r3, e)
(3-27) f3=(that B, f2, r3, a bike) (3-62) f3=(that B, f2, r3, e)
(3-28) f3=(and, f2, r3, a bike) (3-63) f3=(and, f2, r3, e)
(3-29) f3=(Ψ, f2, r3, a bike) (3-64) f3=(Ψ, f2, r3, e)
(3-30) f3=(e, f2, r3, a bike) (3-65) f3=(e, f2, r3, e)
(3-31) f3=(that A, e, r3, a bike) (3-66) f3=(that A, e, r3, e)
(3-32) f3=(that B, e, r3, a bike) (3-67) f3=(that B, e, r3, e)
(3-33) f3=(and, e, r3, a bike) (3-68) f3=(and, e, r3, e)
(3-34) f3=(Ψ, e, r3, a bike) (3-69) f3=(Ψ, e, r3, e)
(3-35) f3=(e, e, r3, a bike) (3-70) f3=(e, e, r3, e)
Constant is replaced with sequence valve, slightly.
With the multiplicative principle in Combinational Mathematics:
| S |=| f1|×|f2|×|f3|=8 × 24 × 70=13440
Then collectively generate 13440 possible matrix solutions.
The possibility matrix solution of final result can be obtained parsing as syntactic structure:
The S corresponding with this matrix expression linear representation is as follows:
The correct structure of the example sentence is:I know are used as main clause;That Ayou have a car are the predicates of main clause First object clause that know has under its command;And that B he has a bike are second arranged side by side with first object clause Individual object clause;Subordinate conjunctive word unit that A and that B guide two object clauses respectively;Between two object clauses By conjunctive word unit and connections arranged side by side;During processing, conjunctive word mix vector Ψ=and that B are whole as one Body is handled;Conjunctive word mix vector Ψ can not be by the vectorial plug hole of other syntaxes;In checks sequence value, phrase will be directly associated Two syntax sequence valves that resultant vector is included are substituted into.Final result, is to regard second object clause with entirety as The mode of plug hole, plug hole is at the end of first object clause.
Fig. 2 is a kind of schematic diagram of the device of computer based natural language syntactic structure parsing of the present invention, shown Device includes:
Read part 21, the pretreated phrase data structure to be resolved for reading, the pretreated sentence Only include conjunctive word unit arranged side by side, subordinate conjunctive word unit, predicate verb unit, the noun pronoun list of sentence in data structure Member, and each word unit is numbered according to the order in the pretreated sentence, and marking types;
Element generation part 22, for each predicate verb unit, generate corresponding leading question element, subject element, Predicate element and object element;
Wherein, arranged side by side pass of the possibility value of the leading question element for numbering less than corresponding predicate verb element number Join one of word unit or subordinate conjunctive word unit, or the association arranged side by side by a numbering less than corresponding predicate verb element number Word unit and one it is adjacent thereto and numbering less than corresponding predicate verb element number and numbering be more than the conjunctive word list arranged side by side One of conjunctive word mix vector that the subordinate conjunctive word unit of member numbering is constituted, or dummy cell;
Noun pronoun unit of the possibility value of the subject element for numbering less than corresponding predicate verb element number One of, or most major term unit numbering less than corresponding predicate verb element number all coordinate noun pronoun mix vector races Included in one of coordinate noun pronoun mix vector, or in one of corresponding syntax vector of predicate element of preceding appearance, or Dummy cell;
The predicate element is the corresponding predicate verb unit;
The possibility value numbering of the object element is more than corresponding predicate verb element number and less than adjacent rear One of noun pronoun unit of the predicate verb element number of appearance, or the numbering of minimum word unit are more than corresponding predicate verb Element number and less than the adjacent predicate verb element number in rear appearance all coordinate noun pronoun mix vector races in Comprising one of coordinate noun pronoun mix vector, or in one of corresponding syntax vector of predicate element of rear appearance, or empty Unit;
Vectorial generating unit 23, for according to the leading question element, subject element, predicate element and object element can Energy value, obtains the value that is possible to of the corresponding syntax vector of each predicate verb unit, and the syntax vector includes guiding Language element, subject element, predicate element and object element;
Matrix generation component 24, for being possible to value according to all syntaxes vector, generates at least one syntax knot The possible matrix solution of structure, the possible matrix solution of the syntactic structure is by according to the tactic syntax Vector Groups of predicate verb element number Into;
Decider 25, for verify according to syntactic structure may the obtained sentence of matrix solution whether with it is described preprocessed Sentence it is identical, if identical, each syntax vector that may be in matrix solution using the syntactic structure is used as syntax knot One of structure analysis result;
Wherein, the decider 25 may be solved by the syntactic structure for operating exclusion ineligible with lower module:
First row removes module, and if there is the sequence valve not occurred in the possible matrix solution of the syntactic structure, then excluding should Syntactic structure may matrix solution;
Second row removes module, if there is identical sequence valve in different syntax vectors or occur identical syntax to Amount, then excluding the syntactic structure may matrix solution;
3rd excludes module, in each possible matrix solution, will exist mutually to substitute between other syntaxes vector and close The syntax vector of system all carries out equivalent substitution, if occurring the intersection contradiction of two syntax vectors after equivalent substitution, Excluding the syntactic structure may matrix solution;
4th excludes module, in each possible matrix solution, will exist mutually to substitute between other syntaxes vector and close The syntax vector of system all carries out equivalent substitution, if occurring the converse sequence valve in two positions after equivalent substitution, arranges Except the syntactic structure may matrix solution;
5th excludes module, in any one possible matrix solution, does not have phase if there is between other syntaxes vector The syntax vector of relation is mutually substituted into, then performs plug hole operation and is parsed with obtaining the possibility syntax corresponding to all possible matrix solutions Structure, and verify the sentence that is obtained according to the possible syntax analytic structure whether with the pretreated complete phase of sentence Together, it further comprises:
First submodule, first carries out equivalent generation to there is the syntax of substitution relation vector in the possible matrix solution each other Change, so that the possible matrix solution is converted into one group of syntax vector each other in the absence of substitution relationSyntax vector that will likely be in matrix solution is referred to as first kind syntax vector, will convert obtained syntax to AmountReferred to as Equations of The Second Kind syntax is vectorial;
Second submodule, appoint and take Equations of The Second Kind syntax vectorMark one by one in the predetermined directionIn it is each The sequence valve of individual syntax elements;After the sequence valve for marking syntax elements, appoint and takeIn i-th of syntax elements, only in the sentence The unique room of the first side structure of method element;Make after sky, appoint and take a syntax vectorEquations of The Second Kind syntax vector in additionBy syntax vector in the way of overall plug holeThe constructed room of insertion, and then a new syntax vector is generated, will This new syntax vector is designated asAnd by obtained from overall plug hole syntax vector, be referred to as the 3rd class syntax to Amount;
3rd submodule, to the 3rd class syntax vectorIn the predetermined direction to from vectorIn the syntax elements of the first side first start to vectorIn the vector that includesThe second side Each syntax elements untill first syntax elements, all mark sequence valve;Positioned at vectorIn include VectorThe element of first side, does not mark sequence valve;By vectorFirst syntax elements of the second side be designated asWill In the manner previously described to vectorThe syntax vector portion of mark, is designated as whipping syntax vectorMark Note after sequence valve, appoint j-th of the syntax elements taken in a foregoing whipping vector, it is only unique in the side structure of element first Room;Make after sky, appoint and take an original Equations of The Second Kind syntax vectorIn the way of overall plug hole by syntax to AmountThe constructed room of insertion, and then generate a new syntax vector, then newly-generated syntax vector is designated asOr
3rd class syntax vectorAccording to predetermined direction, distich normal vectorIn each Syntax elements all mark sequence valve;After the sequence valve for marking syntax elements, appoint and take oneIn t-th Syntax elements, in the unique room of the first side structure of the syntax elements;Make after sky, appoint and take an Equations of The Second Kind was not used to obtain Syntax vectorBy the vector in the way of overall plug holeThe room that insertion is above constructed, and then generate a new vector, then The new vector is designated as
4th submodule, repeats the operation of the 3rd submodule, right when the last time, which makes empty and plug hole step, to be terminated Make sky and the plug hole of sky with the 3rd class syntax vector progress obtained from plug hole step next time are made by the last time to operate, until By all Equations of The Second Kind syntaxes vectorWhole plug holes are finished, and finally obtain the 3rd class syntax vector of a single file, will The 3rd class syntax vector finally obtained is referred to as final single file vector;
5th submodule, if existed in the corresponding all final single files vectors of a possible syntax analytic structure The converse sequence valve in two positions, then exclude the possible syntax analytic structure;
6th submodule, repeats to call the operation of the second submodule to the 5th submodule until all possible syntax parsing knots Structure is traversed.
Further, described device can also include:
As a result display unit, each syntax in syntactic structure analysis result is vectorial and corresponding syntax structural relationship is used Tree shown on human-computer interaction interface.
The present invention lays particular emphasis on the accurate parsing of the combined type sentence structure solved the problems, such as in natural language.The maximum of the present invention Feature is:1. the property of compound function is taken full advantage of;2. syntax formula is described using matrix model and linear model;3. transport With the relative theory generator matrix model and linear model of Combinational Mathematics.With the present invention, natural language syntax knot can be improved The accuracy rate of structure parsing.
From the point of view of mathematics, natural language carries discreteness feature, and this difficulty exactly in syntactic structure dissection process Point.The present invention does not both destroy the integrality of sentence structure by the way that syntax vector is effectively combined with matrix form, and not The relation between inherent composition and words and phrases among each sentence of obstruction analysis.The present invention is portrayed using matrix model and linear model Sentence formula, this had both met the discreteness feature of natural language, and effectively disclosed the information association on syntactic structure.
From the point of view of computer technology, the present invention uses matrix model and linear model, by the natural language language of single file Sentence is converted into the nested form of stacked linear, so as to largely avoid former sentence mark of the computer directly to natural language Form point and partition structure and the entanglement that occurs, and then make the program task of computer clearer, succinct.The present invention is used Matrix model and linear model, draw a plurality of parallel runway equivalent to the sentence for natural language, make the language of natural language Sentence is simultaneously off on a plurality of parallel runway, and correct result is then therefrom screened again;Also correspond to the sentence for natural language Multiple planes are provided, the sentence of natural language is handled simultaneously in multiple planes, correct result is then therefrom screened again.
During generator matrix, the present invention has used the relative theory of Combinational Mathematics to generate all matrix, Ran Houzai Exclude one by one, finally obtain at least one possible correct syntactic structure analysis result.In this course, it is only necessary to use Mathematical principle and information coding, it is only necessary to handle the numerical value of real number, each step, which is finally implemented to, checks syntax vector Whether numerical value is ascending order arrangement, that is, compare the size of real number, without regard to the language message of English in itself.
Meanwhile, the present invention needs to carry out substantial amounts of mathematical operation, it is therefore necessary to by the computing capability of computer, can just have Effect is realized.
To sum up, the present invention is learned according to Abstract Algebra, set theory, Combinational Mathematics, computability theory and computational linguistics etc. The mathematical principle of section and corresponding computer technology, with the mathematical thought of compound function, by setting up matrix model and linear Model, construction recursive function carry out natural language syntactic structure parsing;Meanwhile, the method counterweight such as integrated use mathematical induction The conclusion wanted is proved.
Present inventive concept is unique, method is ingenious, demonstration is full and accurate, takes full advantage of the rule of mathematics and Computer Subject, institute State that method accuracy is higher, there is certain technical difficulty.

Claims (8)

1. a kind of method of computer based natural language syntactic structure parsing, including:
Only include in S1, reading pretreated phrase data structure to be resolved, the pretreated phrase data structure Conjunctive word unit arranged side by side, subordinate conjunctive word unit, predicate verb unit, the noun pronoun unit of sentence, and each word unit according to Order in the pretreated sentence is numbered, and marking types;
S2, to each predicate verb unit, generate corresponding leading question element, subject element, predicate element and object element;
Arranged side by side conjunctive word unit of the possibility value of the leading question element for numbering less than corresponding predicate verb element number Or one of subordinate conjunctive word unit, or by a numbering less than the conjunctive word unit arranged side by side of corresponding predicate verb element number and One adjacent thereto and numbering is less than corresponding predicate verb element number and numbering is more than the conjunctive word element number arranged side by side One of conjunctive word mix vector that subordinate conjunctive word unit is constituted, or dummy cell;
The possibility value of the subject element is less than one of noun pronoun unit of corresponding predicate verb element number for numbering, Or the numbering of most major term unit is less than institute in all coordinate noun pronoun mix vector races of corresponding predicate verb element number Comprising one of coordinate noun pronoun mix vector, or in one of corresponding syntax vector of predicate element of preceding appearance, or empty single Member;
The predicate element is the corresponding predicate verb unit;
The possibility value numbering of the object element rear occurs more than corresponding predicate verb element number and less than adjacent One of the noun pronoun unit of predicate verb element number, or the numbering of minimum word unit is more than corresponding predicate verb unit Number and be less than in all coordinate noun pronoun mix vector races of the adjacent predicate verb element number in rear appearance and wrapped One of coordinate noun pronoun mix vector contained, or in one of corresponding syntax vector of predicate element of rear appearance, or dummy cell;
S3, the possibility value according to the leading question element, subject element, predicate element and object element, obtain each predicate Verb unit corresponding syntax vector is possible to value, and the syntax vector includes leading question element, subject element, predicate Element and object element;
S4, according to all syntaxes vector be possible to value, generate at least one syntactic structure possibility matrix solution, the syntax The possible matrix solution of structure according to the tactic syntax vector of predicate verb element number by constituting;
Whether the sentence that S5, checking are obtained according to the possible matrix solution of syntactic structure is identical with the pretreated sentence, If identical, each syntax vector in the possible matrix solution of the syntactic structure is regard as one of syntactic structure analysis result;
Wherein, S5 includes performing following operation successively in order, and excluding ineligible syntactic structure may solve:
S5.1, if there is the sequence valve that may do not occur in the syntactic structure in matrix solution, then excluding the syntactic structure may Matrix solution;
If S5.2, identical sequence valve occur in different syntax vectors or identical syntax vector occur, excluding should Syntactic structure may matrix solution;
S5.3, in each possible matrix solution, will other syntaxes vector between exist mutually substitute into relation syntax vector Equivalent substitution is all carried out, if occurring the intersection contradiction of two syntax vectors after equivalent substitution, the syntax knot is excluded Structure may matrix solution;
S5.4, in each possible matrix solution, will other syntaxes vector between exist mutually substitute into relation syntax vector Equivalent substitution is all carried out, if occurring the converse sequence valve in two positions after equivalent substitution, the syntactic structure is excluded Possible matrix solution;
S5.5, in any one possible matrix solution, if there is not having mutually to substitute into relation between other syntaxes vector Syntax vector, then perform plug hole operation to obtain the possibility syntax analytic structure corresponding to all possible matrix solutions, and verify Whether identical with the pretreated sentence according to the sentence that the possible syntax analytic structure is obtained, it is further wrapped Include:
S5.5.1, first to exist each other in the possible matrix solution syntax of substitution relation vector carry out equivalent substitution so that The possible matrix solution is converted into one group of syntax vector each other in the absence of substitution relationWill likely Syntax vector in matrix solution is referred to as first kind syntax vector, the syntax vector that conversion is obtainedReferred to as Two class syntaxes vector;
S5.5.2, appoint and take Equations of The Second Kind syntax vectorMark one by one in the predetermined directionIn each syntax member The sequence valve of element;After the sequence valve for marking syntax elements, appoint and takeIn i-th of syntax elements, only in the syntax elements The unique room of first side structure;Make after sky, appoint and take a syntax vectorEquations of The Second Kind syntax vector in additionWith entirety The mode of plug hole is vectorial by syntaxThe constructed room of insertion, and then a new syntax vector is generated, by this new syntax Vector is designated asAnd by syntax vector obtained from overall plug hole, it is referred to as the 3rd class syntax vector;
S5.5.3, to the 3rd class syntax vectorIn the predetermined direction to from vectorIn The syntax elements of side first start to vectorIn the vector that includesThe syntax elements of the second side first be Each syntax elements only, all mark sequence valve;Positioned at vectorIn the vector that includesThe member of first side Element, does not mark sequence valve;By vectorFirst syntax elements of the second side be designated asWill in the manner previously described to AmountThe syntax vector portion of mark, is designated as whipping syntax vectorMark after sequence valve, appoint and take J-th of syntax elements in one foregoing whipping vector, only in the unique room of the side structure of element first;After making sky, Appoint and take an original Equations of The Second Kind syntax vectorBy syntax vector in the way of overall plug holeThe constructed sky of insertion Position, and then generate a new syntax vector, then newly-generated syntax vector is designated asOr
3rd class syntax vectorAccording to predetermined direction, distich normal vectorIn each syntax Element all marks sequence valve;After the sequence valve for marking syntax elements, appoint and take oneIn t-th of syntax member Element, in the unique room of the first side structure of the syntax elements;After making sky, appoint take one to be not used Equations of The Second Kind syntax to AmountBy the vector in the way of overall plug holeThe room that insertion is above constructed, and then generate a new vector, then this it is new to Amount is designated as
S5.5.4, S5.5.3 is repeated, when the last time, which makes empty and plug hole step, to be terminated, to making sky by the last time With the 3rd class syntax vector obtained from plug hole step carry out next time make the operation of empty and plug hole, until by all Equations of The Second Kind sentence Normal vectorWhole plug holes are finished, finally obtain the 3rd class syntax vector of single file, are finally obtained described 3rd class syntax vector is referred to as final single file vector;
If there are two positions in the corresponding all final single files vectors of S5.5.5, a possible syntax analytic structure Converse sequence valve, then exclude the possible syntax analytic structure;
S5.5.6, repeat S5.5.2 to S5.5.5 until all possible syntax analytic structures be traversed.
2. the method for computer based natural language syntactic structure parsing according to claim 1, it is characterised in that S2 Including generation coordinate noun pronoun mix vector race:
S2.1 chooses unduplicated two nouns pronoun unit:
If A, between the two noun pronoun units there is no other word units, using the two noun pronoun units as One coordinate noun pronoun mix vector, and retain the coordinate noun pronoun mix vector;
If B, between the two noun pronoun units exist other word units, check between the two noun pronoun lists Each word unit between member:If any one word unit between the two noun pronoun units, is all name Word pronoun unit or conjunctive word unit arranged side by side, then by two selected noun pronoun units and between the two noun pronoun lists All word units between member retain the coordinate noun pronoun mix vector as a coordinate noun pronoun mix vector; Otherwise, coordinate noun pronoun mix vector is not generated;
S2.2 performs S2.1 again until the combination of all noun pronoun unit is traversed, and it is all arranged side by side that generation is obtained Noun pronoun mix vector;
If there is coordinate noun pronoun mix vector in the S2.3 possible syntax analytic structures, to all coordinate noun pronouns Mix vector is divided, so as to form several coordinate noun pronoun mix vector races so that:In each coordinate noun generation In phrase resultant vector race, each coordinate noun pronoun mix vector included in the coordinate noun pronoun mix vector race is complete All contain two common noun pronoun units;
S2.4 is in each noun pronoun mix vector race, and the numbering included in all noun pronoun mix vectors of selection is most Big word unit, as the most major term unit of the noun pronoun mix vector race, in case being used when being subsequently generated subject;Choose institute There is the minimum word unit of the numbering included in noun pronoun mix vector, be used as the minimum word of the noun pronoun mix vector race Unit, in case being used when being subsequently generated object.
3. computer based natural language syntactic structure analytic method according to claim 1, it is characterised in that generation Corresponding subject element includes:
When corresponding predicate verb element number is minimum predicate verb element number, the possibility value of the subject element To number one of noun pronoun unit less than corresponding predicate verb element number, or the numbering of its most major term unit is less than pair Coordinate noun pronoun included in all coordinate noun pronoun mix vector races for the predicate verb element number answered combine to One of amount, or dummy cell;
When corresponding predicate verb element number is not minimum predicate verb element number, the possibility of the subject element takes It is worth to number one of noun pronoun unit less than corresponding predicate verb element number, or the most numbering of major term unit is less than pair Coordinate noun pronoun included in all coordinate noun pronoun mix vector races for the predicate verb element number answered combine to One of amount, or in one of corresponding syntax vector of predicate verb unit of preceding appearance, or dummy cell.
4. the method for computer based natural language syntactic structure parsing according to claim 1, it is characterised in that raw Include into corresponding object element:
When corresponding predicate verb element number is maximum predicate verb element number, the possibility value of the object element To number one of noun pronoun unit more than corresponding predicate verb element number, or the numbering of its minimum word unit is more than pair Coordinate noun pronoun included in all coordinate noun pronoun mix vector races for the predicate verb element number answered combine to One of amount, or dummy cell;
When corresponding predicate verb element number is not maximum predicate verb element number, the possibility of the object element takes It is worth to number more than corresponding predicate verb element number and less than the name of the adjacent predicate verb element number in rear appearance One of word pronoun unit, or the numbering of its minimum word unit are more than corresponding predicate verb element number and less than adjacent rear Coordinate noun pronoun combination included in all coordinate noun pronoun mix vector races of the predicate verb element number of appearance One of vector, or in one of corresponding syntax vector of predicate verb unit of rear appearance, or dummy cell.
5. the method for computer based natural language syntactic structure parsing according to claim 1, it is characterised in that , may matrix solution using the syntactic structure is substituted with the possible linear representation solution of syntactic structure in two steps of S4 and S5;
The syntactic structure may linear representation solution and the possible matrix solution equivalence of the syntactic structure;
The possible linear representation solution of the syntactic structure is included by according to the tactic syntax vector of predicate verb element number Expression formula is constituted;Each syntax vector expression is leading question element, subject element, the predicate member of corresponding syntax vector The expression formula that element, object element are added up partially item by item in sequence.
6. the method for computer based natural language syntactic structure parsing according to claim 1, it is characterised in that institute Stating method also includes:
Each syntax in syntactic structure analysis result is vectorial and corresponding syntax structural relationship tree is in man-machine friendship Shown in mutual interface.
7. a kind of device of computer based natural language syntactic structure parsing, including:
Read part, the pretreated phrase data structure to be resolved for reading, the pretreated phrase data knot Only include conjunctive word unit arranged side by side, subordinate conjunctive word unit, predicate verb unit, the noun pronoun unit of sentence in structure, and respectively Word unit is numbered according to the order in the pretreated sentence, and marking types;
Element generation part, for each predicate verb unit, generating corresponding leading question element, subject element, predicate member Element and object element;
Wherein, arranged side by side conjunctive word of the possibility value of the leading question element for numbering less than corresponding predicate verb element number One of unit or subordinate conjunctive word unit, or the conjunctive word list arranged side by side by a numbering less than corresponding predicate verb element number Member and an adjacent thereto and numbering are less than corresponding predicate verb element number and numbering is more than conjunctive word unit volume arranged side by side Number one of the conjunctive word mix vector that constitutes of subordinate conjunctive word unit, or dummy cell;
The possibility value of the subject element is less than one of noun pronoun unit of corresponding predicate verb element number for numbering, Or the numbering of most major term unit is less than institute in all coordinate noun pronoun mix vector races of corresponding predicate verb element number Comprising one of coordinate noun pronoun mix vector, or in one of corresponding syntax vector of predicate element of preceding appearance, or empty single Member;
The predicate element is the corresponding predicate verb unit;
The possibility value numbering of the object element rear occurs more than corresponding predicate verb element number and less than adjacent One of the noun pronoun unit of predicate verb element number, or the numbering of minimum word unit is more than corresponding predicate verb unit Number and be less than in all coordinate noun pronoun mix vector races of the adjacent predicate verb element number in rear appearance and wrapped One of coordinate noun pronoun mix vector contained, or in one of corresponding syntax vector of predicate element of rear appearance, or dummy cell;
Vectorial generating unit, for being taken according to the possibility of the leading question element, subject element, predicate element and object element Value, obtains the value that is possible to of the corresponding syntax vector of each predicate verb unit, and the syntax vector includes leading question member Element, subject element, predicate element and object element;
Matrix generation component, for being possible to value according to all syntaxes vector, generating at least one syntactic structure may Matrix solution, the possible matrix solution of the syntactic structure according to the tactic syntax vector of predicate verb element number by constituting;
Decider, for verify according to syntactic structure may the obtained sentence of matrix solution whether with the pretreated sentence It is identical, if identical, each syntax vector in the possible matrix solution of the syntactic structure is parsed as syntactic structure One of as a result;
Wherein, the decider may be solved by the syntactic structure for operating exclusion ineligible with lower module:
First row removes module, if there is the sequence valve not occurred in the possible matrix solution of the syntactic structure, then excludes the syntax Structure may matrix solution;
Second row removes module, if identical sequence valve occur in different syntax vectors or identical syntax vector occur, Then excluding the syntactic structure may matrix solution;
3rd excludes module, in each possible matrix solution, and relation is mutually substituted into by existing between other syntaxes vector Syntax vector all carries out equivalent substitution, if occurring the intersection contradiction of two syntax vectors after equivalent substitution, excludes The syntactic structure may matrix solution;
4th excludes module, in each possible matrix solution, and relation is mutually substituted into by existing between other syntaxes vector Syntax vector all carries out equivalent substitution, if occurring the converse sequence valve in two positions after equivalent substitution, excluding should Syntactic structure may matrix solution;
5th excludes module, in any one possible matrix solution, does not have mutual generation if there is between other syntaxes vector Enter the syntax vector of relation, then perform plug hole operation to obtain the possibility syntax corresponding to all possible matrix solutions and parse knot Structure, and whether the sentence that checking is obtained according to the possible syntax analytic structure is identical with the pretreated sentence, It further comprises:
First submodule, first carries out equivalent substitution to there is the syntax of substitution relation vector in the possible matrix solution each other, So as to which the possible matrix solution is converted into one group of syntax vector each other in the absence of substitution relationWill Syntax vector in possible matrix solution is referred to as first kind syntax vector, the syntax vector that conversion is obtainedClaim For Equations of The Second Kind syntax vector;
Second submodule, appoint and take Equations of The Second Kind syntax vectorMark one by one in the predetermined directionIn each sentence The sequence valve of method element;After the sequence valve for marking syntax elements, appoint and takeIn i-th of syntax elements, only the syntax member The unique room of the first side structure of element;Make after sky, appoint and take a syntax vectorEquations of The Second Kind syntax vector in additionWith The mode of overall plug hole is vectorial by syntaxThe constructed room of insertion, and then a new syntax vector is generated, this is new Syntax vector is designated asAnd by syntax vector obtained from overall plug hole, it is referred to as the 3rd class syntax vector;
3rd submodule, to the 3rd class syntax vectorIn the predetermined direction to from vectorIn The syntax elements of the first side first start to vectorIn the vector that includesThe syntax member of the second side first Each syntax elements untill element, all mark sequence valve;Positioned at vectorIn the vector that includesFirst side Element, do not mark sequence valve;By vectorFirst syntax elements of the second side be designated asWill in the manner previously described To vectorThe syntax vector portion of mark, is designated as whipping syntax vectorAfter mark sequence valve, Appoint j-th of the syntax elements taken in a foregoing whipping vector, only in the unique room of the side structure of element first;Make sky Afterwards, appoint and take an original Equations of The Second Kind syntax vectorBy syntax vector in the way of overall plug holeInsertion is constructed Room, and then generate a new syntax vector, be then designated as newly-generated syntax vectorOr
3rd class syntax vectorAccording to predetermined direction, distich normal vectorIn each syntax Element all marks sequence valve;After the sequence valve for marking syntax elements, appoint and take oneIn t-th of syntax member Element, in the unique room of the first side structure of the syntax elements;After making sky, appoint take one to be not used Equations of The Second Kind syntax to AmountBy the vector in the way of overall plug holeThe room that insertion is above constructed, and then generate a new vector, then this it is new to Amount is designated as
4th submodule, repeats the operation of the 3rd submodule, when the last time, which makes empty and plug hole step, to be terminated, to passing through Last time make empty and the 3rd class syntax vector obtained from plug hole step carry out next time make the operation of empty and plug hole, until by institute There is Equations of The Second Kind syntax vectorWhole plug holes are finished, and finally obtain the 3rd class syntax vector of a single file, will be described The 3rd class syntax vector finally obtained is referred to as final single file vector;
5th submodule, if having two in the corresponding all final single files vectors of a possible syntax analytic structure The converse sequence valve in position, then exclude the possible syntax analytic structure;
6th submodule, repeats to call the operation of the second submodule to the 5th submodule until all possible syntax analytic structure quilts Traversal.
8. the device of computer based natural language syntactic structure parsing according to claim 7, it is characterised in that also Including:
As a result display unit, by each syntax in syntactic structure analysis result is vectorial and corresponding syntax structural relationship is with tree-shaped Structure shown on human-computer interaction interface.
CN201410419634.0A 2014-08-22 2014-08-22 A kind of method and apparatus of computer based natural language syntactic structure parsing Active CN104156353B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410419634.0A CN104156353B (en) 2014-08-22 2014-08-22 A kind of method and apparatus of computer based natural language syntactic structure parsing
PCT/CN2015/083760 WO2016026359A1 (en) 2014-08-22 2015-07-10 Computer-based method and device for parsing natural language syntactic structures

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410419634.0A CN104156353B (en) 2014-08-22 2014-08-22 A kind of method and apparatus of computer based natural language syntactic structure parsing

Publications (2)

Publication Number Publication Date
CN104156353A CN104156353A (en) 2014-11-19
CN104156353B true CN104156353B (en) 2017-10-31

Family

ID=51881858

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410419634.0A Active CN104156353B (en) 2014-08-22 2014-08-22 A kind of method and apparatus of computer based natural language syntactic structure parsing

Country Status (2)

Country Link
CN (1) CN104156353B (en)
WO (1) WO2016026359A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104156353B (en) * 2014-08-22 2017-10-31 秦一男 A kind of method and apparatus of computer based natural language syntactic structure parsing
CN107422691B (en) * 2017-08-11 2020-05-12 山东省计算中心(国家超级计算济南中心) A Construction Method of Collaborative PLC Programming Language
CN108009234B (en) * 2017-11-29 2022-02-11 苏州大学 Extraction method, device and equipment of non-entity type argument
CN111666405B (en) * 2019-03-06 2023-07-07 百度在线网络技术(北京)有限公司 Method and device for identifying text implication relationship
CN110020434B (en) * 2019-03-22 2021-02-12 北京语自成科技有限公司 Natural language syntactic analysis method
CN112069798B (en) * 2020-09-14 2024-09-27 深圳前海微众银行股份有限公司 Dependency syntax-based compound sentence recognition method, apparatus and readable storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101075929A (en) * 2007-03-02 2007-11-21 腾讯科技(深圳)有限公司 Method, system and server for inquiring information
CN103927298A (en) * 2014-04-25 2014-07-16 秦一男 Natural language syntactic structure analyzing method and device based on computer

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003071393A2 (en) * 2002-02-15 2003-08-28 Mathsoft Engineering And Education, Inc. Linguistic support for a regognizer of mathematical expressions
CN102945230B (en) * 2012-10-17 2015-03-25 刘运通 Natural language knowledge acquisition method based on semantic matching driving
CN104156353B (en) * 2014-08-22 2017-10-31 秦一男 A kind of method and apparatus of computer based natural language syntactic structure parsing

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101075929A (en) * 2007-03-02 2007-11-21 腾讯科技(深圳)有限公司 Method, system and server for inquiring information
CN103927298A (en) * 2014-04-25 2014-07-16 秦一男 Natural language syntactic structure analyzing method and device based on computer

Also Published As

Publication number Publication date
WO2016026359A1 (en) 2016-02-25
CN104156353A (en) 2014-11-19

Similar Documents

Publication Publication Date Title
CN104156353B (en) A kind of method and apparatus of computer based natural language syntactic structure parsing
Padawitz Computing in Horn clause theories
Shynkarenko et al. Constructive-synthesizing structures and their grammatical interpretations. I. Generalized formal constructive-synthesizing structure
US9043265B2 (en) Methods and systems for constructing intelligent glossaries from distinction-based reasoning
Shih et al. Smoothing structured decomposable circuits
CN104391837A (en) Intelligent grammatical analysis method based on case semantics
CN103927298B (en) A kind of computer based natural language syntactic structure analysis method and device
CN110020434A (en) A kind of method of natural language syntactic analysis
Sakuma et al. Translating regular expression matching into transducers
CN107589936B (en) Product Line Variability Configuration Optimization Method Based on Requirement Text and Variability Model Tracking Relationship
CN108829666A (en) A kind of reading understanding topic method for solving solved based on semantic parsing and SMT
Clark An introduction to multiple context free grammars for linguists
De Kok et al. Natural language processing for the working programmer
CN115935943A (en) An Analysis Framework Supporting Computation of Natural Language Structures
CN114169343A (en) Automatic translation method, device, equipment and medium based on expert dictionary
Okhotin Describing the syntax of programming languages using conjunctive and Boolean grammars
Ogihara An Introduction to Theory of Computation: An Algorithmic Approach
Hafiz et al. Lazy combinators for executable specifications of general attribute grammars
Dodds et al. From hyperedge replacement to separation logic and back
Sakharov One-Counter Automata for Parsing and Language Approximation
Krawchuk et al. Efficient Generation of Parameterised Quantum Circuits from Large Texts
Smit et al. A mathematical approach towards hardware design
Lechtchinsky et al. Costing nested array codes
Delpeuch et al. From natural language to RDF graphs with pregroups
Clark Inference of inversion transduction grammars

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant