CN113253853B

CN113253853B - Chinese character input method for computer and mobile phone

Info

Publication number: CN113253853B
Application number: CN202110337156.9A
Authority: CN
Inventors: 周长河
Original assignee: Individual
Current assignee: Individual
Priority date: 2021-03-29
Filing date: 2021-03-29
Publication date: 2023-01-10
Anticipated expiration: 2041-03-29
Also published as: CN113253853A

Abstract

A Chinese character input method for computers and mobile phones in which characters and words are entered essentially without candidate selection, which increases input speed. The main scheme is as follows: the codes used are limited to the pinyin initial letter of the character, the pinyin initial letter that a character-forming component has when it stands alone as a character, and key-name letters that denote stroke types; the codes of the stroke types are set to key-name letters that differ from the pinyin initial consonants; the codes of the roots (etymons) within a character are taken in a regular, dispersed manner according to stroke order; the full codes of characters and of words contain different numbers of codes; every full code that can be converted into a brief (brevity) code is so converted; brief codes with fewer codes are assigned to the more common characters and words; characters and words that have been given a brief code keep only the brief code in the system code table; and characters or words that still share a code are entered by striking the last code key one or two more times.

Description

Chinese character input method for computer and mobile phone

1. Field of the invention

The invention relates to a Chinese character input method for computers and mobile phones. It encodes Chinese characters and words using the pronunciation of the character, the shape of its roots, and the pronunciation associated with those roots. The codes are typed on the English keyboard of a computer or mobile phone to input Chinese characters.

2. Background of the invention

The pinyin input method and the Wubi (five-stroke) input method are at present the most commonly used, and the most effective, methods for entering Chinese characters on the English keyboard of a computer or mobile phone. The pinyin input method is simple and easy to learn. Its defect is that a high proportion of characters and words share the same code, so the user must frequently select candidates on the screen, which slows input. The Wubi input method has a low same-code rate, so characters rarely need to be selected on the screen. Its defect is that the codes of a large number of roots must be memorized, which makes it hard to learn and quick to forget. For this reason most current Wubi users are a small number of professional typists, while students almost all use pinyin. Because the codes of the pinyin input method, namely pinyin letters, are detached from the roots of the characters, this has the harmful consequence that students forget how to write characters or write them incorrectly.

The greatest defect of the Wubi (five-stroke) input method is that it is poorly suited to word input, that is, to entering Chinese text in units of words. Entering text mainly by words can double input speed. With the pinyin input method, words must frequently be selected on the screen. With the Wubi input method, word input is practically impossible: if somewhat more words are loaded into the system lexicon, words with the same code pile up endlessly during input, and if too few are loaded, the user does not know which words can be entered as words. Wubi users therefore have to give up word input and type character by character. There are three reasons for this (note that the technical scheme of the present input method addresses exactly these three reasons).

First, the full code of a character in the Wubi input method (the code containing all of the character's codes) is four codes, and the full code of a word is also four codes. Because the two contain the same number of codes, their code structures differ little, and the full codes of characters and words easily coincide.

Second, the full code of a word is only four codes, so the coding space is insufficient and the full codes of different words easily coincide, since words far outnumber characters.

Third, apart from the one-code characters, the brief codes of the Wubi input method (a brief code is what remains after one or more codes of the full code are omitted) are generated automatically by sorting the full codes in the system code table alphabetically. The characters and words that can be entered with a brief code are simply those sorted first among the characters and words whose full codes share the same first code, the same first several codes, or all codes. Although these characters and words can be entered with a brief code, they remain in the system code table in full-code form, so the number of full codes is not reduced, and neither is the same-code rate. Moreover, because the brief codes are generated by alphabetical sorting, several brief codes may belong to the same character or word, which limits how many characters and words have brief codes: brief codes are few, full codes are many, and many characters and words share a code. Automatic generation by alphabetical sorting has a further consequence: among characters and words that share the first code, the first several codes, or all codes of the full code, some uncommon ones can be entered with a unique code, while the common ones must instead be selected on the screen.

These defects exist in all shape-code input methods such as Wubi and in all input methods that combine phonetic codes with shape codes.

Countless Chinese character input methods have appeared in China. Because people are aware of the defects of the pinyin and Wubi input methods, they keep trying to invent better ones. So far, however, no input method has combined the advantages of both while avoiding the defects of both, so none has gained a following, and most have faded away on their own.

Beyond their individual defects, all input methods currently in use, including pinyin and Wubi, share one greatest defect: the user cannot enter Chinese text smoothly by typing mainly words. Words here include words and phrases of two or more characters. Their number is enormous, and phrases in particular are in theory inexhaustible. For a user to enter text smoothly by typing mainly words and phrases, the lexicon of the input method must contain more than one hundred thousand words and phrases, most of their codes must be unique, and words must essentially never need to be selected on the screen. Achieving this would be a major advance in the history of Chinese keyboard input.

What is needed is an input method that has the advantages of both the pinyin and the Wubi input methods without the defects of either, that essentially does not require selecting characters or words on the screen, and that lets users enter Chinese text smoothly by typing mainly words.

3. Summary of the invention

(I) Technical problem to be solved

First, to combine in one input method the advantages of the pinyin and Wubi input methods while avoiding the defects of both, namely: characters are entered with essentially no same codes, so characters need not be frequently selected on the screen, and most common characters can be entered with brief codes; root codes need not be memorized, and the input rules are simple and clear, so that a user can apply them after one reading of the instructions and does not easily forget them. Second, to give words and phrases essentially no same codes, so that they need not be frequently selected on the screen and most common words and phrases can be entered with brief codes. This doubles input speed and solves the problem that no Chinese character input method has so far solved: letting the user enter Chinese text smoothly by typing mainly words and phrases.

(II) Technical scheme for solving the technical problem, and beneficial effects

1. Technical scheme for solving technical problem

These technical problems cannot be solved from one or two aspects alone; they must be addressed from several aspects. The input method therefore adopts the following technical scheme:

First part of the scheme:

Roots (etymons) are divided into strokes and character-forming stroke components (stroke components within a character that can stand alone as a character, commonly called "characters within characters", hereinafter "character-forming components"). The codes used are limited to the pinyin initial letter of the character, the pinyin initial letter that a character-forming component has when it stands alone as a character, and the key-name English letters assigned as codes to the preset stroke types of Chinese characters: dot, horizontal (一), vertical, left-falling (ノ), right-falling, and turning (representing all strokes with a turn or hook). The code for each stroke type is chosen, as far as possible, to be a key-name English letter whose shape shares a characteristic feature with the stroke, to aid memorization.

Analysis of the first part of the scheme: setting the codes in this way gives the input method the pinyin input method's advantage of requiring almost no memorization, and avoids the defect of the way the Wubi input method sets its codes. Besides strokes and character-forming components, the roots used by the Wubi input method also include non-character-forming components (radicals and other non-character-forming components split artificially from characters). The code of each root is assigned artificially and has no connection with the root's features or attributes, so the root codes must be learned by rote, unlike the present input method, in which codes follow from the shape of the root or its associated pronunciation. The Wubi input method assigns codes in this way in order to avoid same-code characters: it scatters the key-name letters across all roots as codes, and this does achieve the aim of dispersing the codes to a certain extent.

Second part of the scheme:

With the codes no longer requiring rote memorization, the next question is how to reduce the same-code rate of characters and words to a minimum, so that characters and words essentially need not be selected during input. This is the focus of the invention. The difficulty is that the root codes of this input method are inherent in the characters, unlike those of the Wubi input method, which are assigned artificially with the aim of avoiding same codes between characters. The invention reduces the same-code rate of characters and words to a minimum as follows:

1. The codes of the stroke types (dot, horizontal, vertical, left-falling, right-falling, and turning) must be set to key-name English letters that differ from the initial consonants of Hanyu Pinyin, so that root codes are not concentrated on the key-name letters that coincide with pinyin initial consonants but are spread over all twenty-six key-name letters, which creates the conditions for reducing same codes.

2. A rule is formulated, based on the structural features of Chinese characters, for taking the codes of the roots within a character in a regular, dispersed manner according to stroke order, rather than taking them one after another by the position of the roots in the character. Only when codes are taken dispersedly do the codes vary enough for same-code characters to be few. (Note: apart from the pinyin and Wubi input methods, the methods that still have some following are mainly the stroke input method and the radical input method. Neither takes root codes dispersedly within the character, so both require candidate selection more often than the pinyin input method and are even less able than the Wubi input method to support input mainly by words.)

3. Every character full code that can be converted into a brief code is so converted, and a character that has been given a brief code keeps only the brief code in the system code table, so that the user enters it only by its brief code. This eliminates a large number of same codes and lets the user form the habit of typing brief codes. When full codes are converted, among characters whose full codes share the same first code, the same first several codes, or all codes, the brief code with the fewest codes is assigned to the most frequently used character, to increase input speed.

4. For the few remaining same-code characters (all of which are full-code characters), one or two codes identical to the last code of the full code are appended to the full code to distinguish the character from the others with the same code. During input the user only needs to strike the last-pressed letter key once or twice more, which is convenient and quick and requires no thought.

5. The full code of a character is four codes and the full code of a word is six codes, so a character's full code and a word's full code can never coincide, and the structural difference between the full codes of characters and words is increased. (Note: although the full code of a word has two more codes than that of a character, the way codes are taken for a word's full code differs from that for a character (see below), and this difference carries over into the brief codes of words, so entering a word does not necessarily take more codes than entering a character.) At the same time, because the full code of a word has two more codes than that of a character, same codes among words are greatly reduced.

6. Every word full code that can be converted into a brief code is so converted; the brief codes of words are kept distinct from the codes of characters; and a word that has been given a brief code keeps only the brief code in the system code table, so that the user enters it only by its brief code. This eliminates a large number of repeated word codes, shortens the codes of most words, and lets the user form the habit of entering words by brief code. When full codes are converted, among words whose full codes share the same first code, the same first several codes, or all codes, the brief code with the fewest codes is assigned to the most common word.
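Points 3 and 4 of the second part of the scheme can be illustrated with a minimal Python sketch. It is not the inventor's implementation: the function name `build_code_table`, the entry names, the full codes, and the frequencies are all hypothetical, and the greedy "most frequent entry chooses first" order is an assumption made here to realize the stated goal that the shortest brief code goes to the most common character. Brief codes are taken to be the first one, two, or three codes of the four-code full code.

```python
def build_code_table(entries):
    """Assign one unique typed code to each (name, full_code, frequency) entry.

    Assumed procedure (illustrative only):
      1. Entries are processed from most to least frequent.
      2. Each entry receives the shortest free prefix of its full code
         (a brief code), so it is stored under that brief code only.
      3. If no prefix is free it keeps the full code; if that is taken too,
         the last code is repeated once, then twice.
    """
    table = {}
    for name, full_code, _freq in sorted(entries, key=lambda e: e[2], reverse=True):
        # Brief codes: the first 1 .. len-1 codes of the full code.
        candidates = [full_code[:n] for n in range(1, len(full_code))]
        # Then the full code, then the last code struck once or twice more.
        last = full_code[-1]
        candidates += [full_code, full_code + last, full_code + last * 2]
        for code in candidates:
            if code not in table:
                table[code] = name
                break
        else:
            raise ValueError(f"no free code left for {name!r}")
    return table


# Hypothetical entries: (name, four-code full code, usage frequency).
entries = [
    ("char1", "YAUU", 900),
    ("char2", "YAZY", 500),
    ("char3", "YAUU", 100),
    ("char4", "YAUU", 50),
    ("char5", "YAUU", 10),
    ("char6", "YAUU", 5),
]
table = build_code_table(entries)
for code, name in table.items():
    print(code, name)
```

In this toy table every entry ends up with exactly one code, so nothing has to be selected on the screen; the two least frequent entries are reached by striking the last key one or two more times.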

2. Advantageous effects of solving the technical problems

(1) The input method can enter all Chinese characters in the Table of General Standard Chinese Characters except those that current publicly available computers cannot input and display. Of these 8959 characters, which include characters with the same form but different pronunciations, all have unique codes except 360 characters (mostly rare characters) that share a code with another character (a system code table of the 8959 characters is attached at the end of this document for verification). In addition, the input method ensures that the sixteen thousand words in the system have essentially no same codes (for reasons of space a printed word code table cannot be provided; an electronic version can be supplied at any time if needed).

(2) Software built by the inventor has been tried privately, not publicly, and the results are very satisfactory. Within one input method it overcomes the defects of the pinyin and Wubi input methods and improves on their advantages. The rules are simple, roots need not be memorized, and the method is easy to use and hard to forget. Students, especially primary school pupils, can consolidate their handwriting of characters once they have mastered the input method. Most common characters and phrases can be entered with brief codes. More importantly, the method reaches a level that no previous input method has reached: not only is character input almost free of character selection, but word input is also almost free of word selection, so that the user can enter Chinese text smoothly by typing mainly words, greatly exceeding existing input speeds.

4. Detailed description of the preferred embodiments

(I) Setting root codes for the roots of Chinese characters

The code assigned to a root (etymon) is called its "root code". The roots of Chinese characters fall into two main categories: strokes and character-forming components.

1. Setting root codes for strokes

Stroke types are divided into five major classes, and each class is assigned one key-name letter as its root code. The requirements are as follows:

(1) As far as possible, the key-name letter chosen as the root code of a stroke should have a feature resembling the shape of the stroke. Reason: this aids memorization.

(2) The root codes of strokes must be key-name letters that differ from the initial consonants of Hanyu Pinyin. Reason: as described above, the pinyin initial letter of a character takes part in the character's coding as one of its codes, and the code of a character-forming component is likewise the pinyin initial letter it has when it stands alone as a character. Root codes are therefore concentrated on the key-name letters that coincide with pinyin initial consonants. In Chinese, far more characters have pinyin beginning with any given initial consonant than have pinyin beginning with a given final (vowel). Among the twenty-six key-name letters, only the three vowels "A", "E", and "O" serve as first letters of pinyin without being initial consonants, and the three letters "I", "V", and "U" never serve as first letters. Choosing letters that differ from the pinyin initial consonants as the root codes of strokes spreads the root codes over all twenty-six key-name letters instead of concentrating them on the letters that coincide with pinyin initial consonants, which creates the conditions for reducing same-code characters.
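The letter count in requirement (2) can be checked with a short Python sketch. The list of initials below is the standard set of Hanyu Pinyin initial consonants plus "y" and "w", which are treated as initials here by assumption (as is common in school teaching); the function name `letters_free_for_strokes` is hypothetical.

```python
import string

# Hanyu Pinyin initial consonants; zh, ch, sh contribute the letters Z, C, S.
# Treating y and w as initials is an assumption made for this illustration.
PINYIN_INITIALS = [
    "b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
    "j", "q", "x", "zh", "ch", "sh", "r", "z", "c", "s", "y", "w",
]


def letters_free_for_strokes():
    """Return the key-name letters that are not the first letter of any initial."""
    used = {initial[0].upper() for initial in PINYIN_INITIALS}
    return sorted(set(string.ascii_uppercase) - used)


free_letters = letters_free_for_strokes()
print(free_letters)
```

The six letters left over are exactly "A", "E", "O", "I", "V", and "U", and the stroke codes that appear in the worked examples of this description ("A", "U", and "V") all come from this set.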

The list of settings for the five stroke type radical codes is as follows:

2. Setting root codes for character-forming components

The code of every character-forming component is the pinyin initial letter it has when it stands alone as a character. For example, the code of 石 ("stone") in the character 码 ("code") is "S", the pinyin initial letter of 石, and the code of 马 ("horse") in 码 is "M", the pinyin initial letter of 马.

(II) Full-code coding of Chinese characters

1. Character code and coding systems

The full code of each character is four codes. Three are root codes, that is, codes of strokes and of character-forming components; the fourth is the pinyin initial letter of the whole character and is called the "character code". For example, the character code of 码 ("code") is "M", the pinyin initial letter of 码. The input method offers two coding systems for the user to choose between. In the first, the character code is placed before the root codes: if "X" denotes the character code and "Y" a root code, the four codes of the full code are arranged as "XYYY". In the second, the character code is placed after the root codes, and the four codes are arranged as "YYYX". Each coding system has its advantages (see below), and the user may choose either.
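The two arrangements can be written down as a small Python helper. This is only an illustrative sketch: the function name `arrange_full_code` and its parameters are hypothetical, and the sample values are the full codes "YAUU" and "AUYY" that the description gives later for the character glossed as "art".

```python
def arrange_full_code(char_code, root_codes, char_code_first=True):
    """Combine one character code (X) and three root codes (YYY).

    char_code_first=True  gives the "XYYY" coding system;
    char_code_first=False gives the "YYYX" coding system.
    """
    if len(char_code) != 1 or len(root_codes) != 3:
        raise ValueError("expected one character code and three root codes")
    if char_code_first:
        return char_code + root_codes
    return root_codes + char_code


# Values quoted in the description for the "art" character.
print(arrange_full_code("Y", "AUU"))                          # XYYY system
print(arrange_full_code("Y", "AUY", char_code_first=False))   # YYYX system
```

Note that the three root codes themselves can differ between the two systems (here "AUU" versus "AUY"), for the reason the description explains when it discusses single-stroke roots.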

2. Obtaining the three root codes of the full code

(1) Taking the first of the three root codes. For clarity, the first root code is called the "first basic code of the root codes", or "first basic code" for short. The first basic code is the code of the first root written. For example, the first basic code of 技 ("skill") is "A", the code of its first root, the horizontal stroke 一. The first basic code of 疑 ("doubt") is "B", the code of its first root 匕 ("spoon"); it cannot be the code of the stroke "ノ", because the root taken must be maximal, and 匕 contains "ノ" and is larger than "ノ". The first basic code of 又 ("again") is "U", the code of its first root "フ". The first basic code of 一 ("one") is "A", the code of its first root 一 (note: when 一 is taken as a root, it is treated as the stroke 一 (horizontal), not as the character-forming component 一 (yī)). Rule for the first basic code: the root taken must be maximal, and it is the first root written in the character.

(2) Taking the second of the three root codes. The second root code is called the "second basic code of the root codes", or "second basic code" for short. The second basic code is the code of the last root written. For example, the second basic code of 技 ("skill") is "Z", the code of its last root 支 ("branch"); it cannot be the code of 又 ("again"), because the root taken must be maximal, and 支 contains 又 and is larger than 又. The second basic code of 疑 ("doubt") is "R", the code of its last root 人 ("person"); it cannot be the code of the final right-falling stroke, because the root taken must be maximal, and 人 contains that stroke and is larger than it. The second basic code of 又 is "V", the code of its last root, the right-falling stroke. For the character 一 ("one"), in the coding system with the character code before the root codes, the second basic code is supplied by the first basic code "A" of the horizontal stroke, so the two basic codes of 一 are arranged as: first basic code, second basic code: AA [一 (horizontal), 一 (horizontal)]. In the coding system with the character code after the root codes, the second basic code of 一 is supplied by "Y", the pinyin initial letter of 一 (for the reason, see below), so the two basic codes of 一 are arranged as: first basic code, second basic code: AY [一 (horizontal), 一 (yī)]; the pinyin initial letter of 一 (yī), when used in this way, counts as a root code. Rule for the second basic code: the root taken must be maximal, and it is the last root written in the character; if the character is a single-stroke character, the second basic code is supplied by the first basic code in the coding system with the character code before the root codes, and by the pinyin initial letter of the character in the coding system with the character code after the root codes.

(3) Taking the third of the three root codes. The third root code is called the "supplementary code of the root codes", or "supplementary code" for short. The supplementary code is the code of the last root written within the root from which the second basic code was taken. For example, the second basic code of 技 ("skill") is taken from the root 支 ("branch"), so the supplementary code of 技 is "Y", the code of 又 ("again"), the last root of 支; it cannot be the code of a smaller root within 又, because the root taken must be maximal. The three root codes of 技 are arranged as: first basic code, second basic code, supplementary code: AZY. The second basic code of 疑 ("doubt") is taken from the root 人 ("person"), so the supplementary code of 疑 is "V", the code of the last root of 人, the right-falling stroke. The three root codes of 疑 are arranged as: first basic code, second basic code, supplementary code: BRV. The second basic code of 又 is taken from the right-falling stroke, which is a single stroke and cannot be split to yield a supplementary code. In the coding system with the character code before the root codes, the supplementary code of 又 is supplied by its second basic code "V", so the three root codes of 又 are arranged as: first basic code, second basic code, supplementary code: UVV. In the coding system with the character code after the root codes, the supplementary code of 又 is supplied by "Y", the pinyin initial letter of 又 (for the reason, see below), so the three root codes of 又 are arranged as: first basic code, second basic code, supplementary code: UVY. Similarly, the character 一 ("one") has no root from which a supplementary code can be taken. In the coding system with the character code before the root codes, the supplementary code of 一 is supplied by its second basic code (which is also its first basic code), the code "A" of the horizontal stroke, so the three root codes of 一 are arranged as: first basic code, second basic code, supplementary code: AAA [一 (horizontal), 一 (horizontal), 一 (horizontal)]. In the coding system with the character code after the root codes, the supplementary code of 一 is supplied by "Y", the pinyin initial letter of 一 (which is also its second basic code; for the reason, see below), so the three root codes of 一 are arranged as: first basic code, second basic code, supplementary code: AYY [一 (horizontal), 一 (yī), 一 (yī)]. Rule for the supplementary code: the root taken must be maximal, and it is the last root written within the root from which the second basic code was taken; if that root is a single stroke, or the whole character is a single-stroke character, the supplementary code is supplied by the second basic code in the coding system with the character code before the root codes, and by the pinyin initial letter of the character in the coding system with the character code after the root codes.
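The three code-taking rules can be summarized in a minimal Python sketch, checked against the worked examples above (the characters glossed as "skill", "doubt", "again", "one", and, later in the description, "art"). Several things here are assumptions for illustration: the names `Root`, `root_codes`, and `full_code` are hypothetical; the decomposition of each character into maximal roots is supplied by hand rather than computed; and the case of a whole character that is itself a single character-forming component is not covered, because the description does not address it in this passage.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Root:
    code: str  # key-name letter of this root
    # The root's own maximal roots in stroke order; empty for a single stroke.
    parts: List["Root"] = field(default_factory=list)


def root_codes(roots, pinyin_initial, char_code_first=True):
    """Return first basic code + second basic code + supplementary code.

    `roots` lists the character's maximal roots in stroke order; only the
    first and last entries are consulted by the rules.
    """
    first_root, last_root = roots[0], roots[-1]
    first = first_root.code

    # Second basic code: the last root written, unless the whole character
    # is a single stroke.
    single_stroke_char = len(roots) == 1 and not first_root.parts
    if single_stroke_char:
        second = first if char_code_first else pinyin_initial
    else:
        second = last_root.code

    # Supplementary code: the last root inside the root used for the second
    # basic code, unless that root is a single stroke.
    if last_root.parts:
        supplement = last_root.parts[-1].code
    else:
        supplement = second if char_code_first else pinyin_initial
    return first + second + supplement


def full_code(roots, pinyin_initial, char_code_first=True):
    """Arrange the character code and the three root codes (XYYY or YYYX)."""
    codes = root_codes(roots, pinyin_initial, char_code_first)
    return pinyin_initial + codes if char_code_first else codes + pinyin_initial


# Hand-written decompositions; intermediate roots are omitted.
ji = [Root("A"), Root("Z", [Root("Y")])]    # 技: 一 ... 支 (ending in 又)
yi2 = [Root("B"), Root("R", [Root("V")])]   # 疑: 匕 ... 人 (ending in a right-falling stroke)
you = [Root("U"), Root("V")]                # 又: フ, right-falling stroke
one = [Root("A")]                           # 一: a single horizontal stroke
art = [Root("A"), Root("U")]                # 艺: 一 ... turning stroke

print(root_codes(ji, "J"), root_codes(yi2, "Y"))
print(root_codes(you, "Y"), root_codes(you, "Y", char_code_first=False))
print(root_codes(one, "Y"), root_codes(one, "Y", char_code_first=False))
print(full_code(art, "Y"), full_code(art, "Y", char_code_first=False))
```

The sketch only looks at the first root, the last root, and the last root inside the last root, which is what makes the code taking "dispersed" rather than sequential.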

Why, in the coding system with the character code before the root codes, are the second basic code of a single-stroke character, the supplementary code of a single-stroke character, and the supplementary code of a character whose second basic code is taken from a single stroke all supplied by the preceding root code, whereas in the coding system with the character code after the root codes they are all supplied by the pinyin initial letter of the character? Take 艺 ("art") as an example. Its character code is "Y", the pinyin initial letter of 艺; its first basic code is "A", the code of the horizontal stroke; and its second basic code is "U", the code of the stroke 乙 (horizontal bending hook).

In the coding system with the character code before the root codes, the supplementary code of 艺 is supplied by the second basic code "U". This is convenient, since the user simply presses the previous key again. More importantly, the full code of 艺 is then "YAUU", whose first three codes are three different codes and include the character code. This makes it less likely that the first three codes of 艺 coincide with the first three codes of other characters, and more likely that 艺 can be entered with a brief code. The input of two-character and multi-character words depends directly on the first three codes of their characters (see below): if the first three codes of a character do not repeat those of other characters, the full code of a two-character word containing it is less likely to coincide with the full codes of other two-character words, and that word is more likely to be enterable with a brief code.

If, however, the supplementary code of 艺 were still the second basic code "U" in the coding system with the character code after the root codes, the full code of 艺 would be "AUUY": the character code would not be among the first three codes, and the second and third codes would both be "U". Like 艺, many characters take their second basic code from a stroke with a turn or hook, so the probability that the full code of 艺 coincides with that of other characters would rise sharply. If instead the supplementary code of 艺 in this coding system is the pinyin initial letter of 艺, the full code is "AUYY": the first three codes differ from one another and the character code is among them, so the probability that the full code of 艺 coincides with that of other characters is greatly reduced and the probability that 艺 can be entered with a brief code is greatly increased. Correspondingly, a two-character word containing 艺 is more likely to have a full code different from those of other two-character words and more likely to be enterable with a brief code.

The three root codes are taken from regularly dispersed positions in a character, according to the position of the root from which each code is taken, and this greatly reduces the rate at which the full codes of characters coincide. If the codes were not dispersed, the same-code rate of characters would be very high. In a stroke input method, for example, the roots of the first three codes of all characters beginning with "+" are, in order, 'one', 'I' and 'I', and those of all characters beginning with "Ji" are, in order, 'one', 'Equ' and 'I', and so on. Likewise, in a radical input method that takes codes in units of dictionary radicals, the roots of the first three codes of all characters beginning with the component "Mo", such as 'curtain', 'mu', 'tomb', 'night', 'twitch' and 'mimic', are in every case the same (among them 'large'), and the roots of the three codes of characters ending with the same component, such as 'Taiqu', 'Zhang', 'Zheng', 'Zhang' and 'Ji', are likewise the same and in the same order, so their codes naturally coincide.

3. Advantages of the two coding schemes

Using the pinyin initial of a character as its character code makes it one of the codes of the full code; this too disperses the codes and lowers the same-code rate of characters.

The advantages of the coding system with the character code before the root codes are as follows. Because the brevity code of a character is just the first code, the first two codes, or the first three codes of its full code, placing the character code before the root codes makes brevity-code input more convenient. More importantly, the input of multi-character words depends mainly on the first code of each character (see below), so a leading character code makes word input more convenient.

The advantages of the coding system with the character code after the root codes are as follows. Because the pinyin initial of the character is placed at the end of the full code, some characters that the user does not know how to pronounce can be typed directly by their brevity codes, some can be found among the on-screen candidates after the first three codes have been entered, and some can be typed as part of a word, because the input of a word depends only on the brevity codes of its characters.

The coding system with the character code before the root codes is suitable for ordinary users, and the coding system with the character code after the root codes is suitable for professional typists.

(III) Rules followed in compiling full codes

1. Codes are taken in the order of the writing strokes. A character-forming component whose strokes are not completed consecutively during writing cannot be coded as a character-forming component (this also helps students consolidate the stroke order of characters). In the character 'fu', for example, the first basic code of the root codes cannot be taken from 'ten' and is taken from 'one', because the two strokes of 'ten' are not written consecutively in the stroke order. Similarly, in the character "in the word", the first basic code of the root codes cannot be taken from "Wu" and should be taken from "factory", because the strokes of "Wu" are not written consecutively in the stroke order.

2. The root from which a code is taken must be maximized (see the character examples above).

3. When "the first word" and "the second word" are used as roots, both are uniformly coded as strokes and not as character-forming components.

4. So that users with a low level of literacy, especially primary-school pupils, can input Chinese characters smoothly, an additional code is provided for any character containing a character-forming component that is a rarely used character when it stands alone; the additional code is taken without treating that component as a character-forming component. For example, the character-forming component in the character "packet" is a rather uncommon character when it stands alone, so an additional code is provided for "packet". When the additional code is taken, the component is not treated as a character-forming component, i.e. the second basic code of the root codes of "packet" is taken from "" and not from the character-forming component.

5. Two horizontal strokes whose relative lengths differ from those of the character "two" cannot be coded as the character-forming component "two"; three horizontal strokes whose relative lengths differ from those of the character "three" cannot be coded as the character-forming component "three". For example, the two horizontal strokes in the characters "when" and "day" cannot be coded as the component "two", and the three horizontal strokes in "aim" and the two groups of three horizontal strokes in "not" cannot be coded as the component "three".

(IV) Compiling the brevity codes of characters from their full codes

A code containing all four codes is called the full code of a character; a code obtained by omitting the last one, two, or three codes of the full code is called a brevity code of the character.

Definition of the brevity code of a character: the code obtained by omitting the last one, two, or three codes of the full code and keeping only the leading codes, such that it is distinguished from all other codes with the fewest codes. That is, brevity codes are all unique codes.

The input method adapts into brevity codes the full codes of all characters that can be so adapted (note: the sorting, counting, code deletion, and adaptation are carried out in an EXCEL table). A character adapted to a brevity code retains only the brevity code in the system coding table, so that the user inputs it only by the brevity code.

Whether the full code of a character can be adapted into a brevity code also depends on whether other characters have a greater need for that brevity code.

The principles and benefits of adapting the full codes of characters into brevity codes are:

1. Duplicate codes are eliminated and the same-code rate is reduced; the number of codes per character is reduced and the input speed is improved. Suppose that, among all character codes, three characters have the same full code "LMOS". If two of them can be adapted to the brevity codes "LM" and "LMO", and the full codes of those two are deleted so that only the brevity codes remain, none of the three characters shares a code with another, and once the codes are sorted alphabetically in the system coding table the user never has to select a character on the screen when inputting these three characters. Conversely, if none of the three is adapted to a brevity code, every one of them except the first listed in the system coding table has to be selected on the screen.

2. Among characters whose full codes share the same first code, the same first two codes, or the same first three codes, the brevity code with the fewest codes is allocated to the most common character, to increase the input speed. For example, in the coding system with the character code before the root codes, among characters whose full codes begin with "W", the shortest brevity code "W" is allocated to the most common of them, "I"; among characters whose full codes begin with "WO", the brevity code "WO" is allocated to the most common of them, "WO"; and among characters whose full codes begin with "WOW", the brevity code "WOW" is allocated to the most common of them, "error".
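The two principles above can be sketched in Python. The specification describes the adaptation as being done by sorting and counting in an EXCEL table; the greedy procedure below is only one assumed way of automating it, and the function name `allocate_brevity_codes`, the `taken` parameter, the placeholder character names, and the frequency values are all hypothetical.

```python
def allocate_brevity_codes(entries, taken=()):
    """Give each character the shortest free prefix of its full code.

    entries : list of (character, full_code, frequency) tuples
    taken   : codes already allocated elsewhere (hypothetical parameter)
    """
    used = set(taken)
    allocation = {}
    # Most common characters choose first, so they receive the shortest codes.
    for char, full, _freq in sorted(entries, key=lambda e: e[2], reverse=True):
        # Candidate brevity codes are the first one, two, or three codes.
        prefixes = (full[:n] for n in range(1, len(full)))
        # Fall back to the full code when every prefix is already in use.
        code = next((p for p in prefixes if p not in used), full)
        used.add(code)
        allocation[char] = code
    return allocation


# Three placeholder characters sharing the full code "LMOS"; "L" is assumed
# to belong already to some more common character.
table = allocate_brevity_codes(
    [("char1", "LMOS", 90), ("char2", "LMOS", 80), ("char3", "LMOS", 70)],
    taken={"L"},
)
print(table)  # {'char1': 'LM', 'char2': 'LMO', 'char3': 'LMOS'}
```

Under these assumptions the result reproduces the "LMOS" example: two characters receive "LM" and "LMO", the third keeps the full code, and no two of them share a code.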

(V) Handling of same-code characters

After the full codes that can be adapted have been adapted into brevity codes, a small number of the remaining full-coded characters (mostly rare characters) still share one full code among several characters. A character that shares a full code with another character is called a same-code character.

In this input method, most same-code characters are pairs sharing one full code; a small number are groups of three sharing one full code; groups of four or more sharing one full code are very rare.

The input method handles same-code characters as follows:

1. When two characters share a full code, a code identical to the last code is appended to the full code of the less common character, so that the codes of the two characters differ. For example, in the coding system with the character code before the root codes, "locust" and "" are same-code characters whose full code is "ZCZA"; a code "A", identical to the last code, is appended to the full code of the less common character "", changing its code to "ZCZAA" so that it differs from the code of "locust".

2. When three characters share a full code, the full code of the most common character is kept unchanged, one code identical to the last code is appended to the full code of the next most common character, and two codes identical to the last code are appended to the full code of the least common character, so that the codes of the three characters differ. For example, in the coding system with the character code before the root codes, "rash", "chilblain" and "" are same-code characters whose full code is "ZGVV". The full code of the most common character, "rash", is kept as it is; one code "V", identical to the last code, is appended to the full code of the next most common character, "chilblain", changing its code to "ZGVVV"; and two codes "V" are appended to the full code of the least common character, "", changing its code to "ZGVVVV", so that the codes of the three characters differ.

3. When four or more characters share a full code, the full code of the most common character is kept unchanged, one code identical to the last code is appended to the full code of the next most common character, and two codes identical to the last code are appended to the full codes of all the remaining characters. The codes of the most common and the next most common characters are thereby unique.

The advantage of handling same-code characters in this way is that the user does not have to work out which code to add after the full code: striking the last key once or twice more is enough, and most same-code characters need not be selected on the screen, so the operation is convenient and fast.
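The three rules above reduce to a single operation: append zero, one, or two copies of the last code according to how common the character is. The following Python sketch illustrates this; the function name `separate_same_code_characters` and the placeholder labels used for characters are hypothetical.

```python
def separate_same_code_characters(full_code, chars_by_frequency):
    """Return {character: code} for characters sharing one full code.

    chars_by_frequency lists the characters from most to least common.
    """
    last = full_code[-1]
    codes = {}
    for rank, char in enumerate(chars_by_frequency):
        # 0 extra codes for the most common character, 1 for the next,
        # and 2 for every remaining character.
        extra = min(rank, 2)
        codes[char] = full_code + last * extra
    return codes


# Two characters sharing "ZCZA" (the rarer one is a placeholder label).
print(separate_same_code_characters("ZCZA", ["locust", "rarer"]))
# {'locust': 'ZCZA', 'rarer': 'ZCZAA'}
```

With four or more characters, every character after the second receives the same lengthened code, so those characters still have to be selected on the screen, which matches rule 3.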

(VI) Compiling the full codes of words from the full codes of characters

The full code of every word consists of six codes, taken as follows:

1. A two-character word takes the first, second, and third codes of the full code of each character, in order. For example, in the coding system with the character code before the root codes, for the word "then", whose first character has the full code "yeuu" and whose second character has the full code "sdrv", the full code of the word is the first, second, and third codes y, e, u of the first character followed by the first, second, and third codes s, d, r of the second character, i.e. "yeusdr".

2. A three-character word takes the first and second codes of the full code of each character, in order.

3. A four-character word takes the first code of the full code of each of the first two characters and the first and second codes of the full code of each of the last two characters, in order.

4. A five-character word takes the first code of the full code of each of the first four characters and the first and second codes of the full code of the last character, in order.

5. A six-character word takes the first code of the full code of each character, in order.

6. A word of more than six characters takes the first code of the full code of each of its first six characters, in order.

It follows that, because the full codes of characters and of words are formed differently, a character may have to be typed with all four codes of its full code, while a word beginning with that character may be typed with a brevity code of three codes or even fewer.
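The six code-taking rules of this section can be summarized as a lookup from word length to the number of leading codes taken from each character. The Python sketch below is illustrative only; the names `CODES_PER_CHARACTER` and `word_full_code` are hypothetical, and every full code other than "yeuu" and "sdrv" is a made-up placeholder.

```python
# Number of leading codes taken from each character, keyed by word length.
CODES_PER_CHARACTER = {
    2: (3, 3),
    3: (2, 2, 2),
    4: (1, 1, 2, 2),
    5: (1, 1, 1, 1, 2),
    6: (1, 1, 1, 1, 1, 1),
}


def word_full_code(char_full_codes):
    """Build the six-code full code of a word from its characters' full codes."""
    length = len(char_full_codes)
    if length < 2:
        raise ValueError("a word has at least two characters")
    # Words longer than six characters use only their first six characters;
    # zip() stops after the six counts, which drops the remaining characters.
    counts = CODES_PER_CHARACTER[min(length, 6)]
    return "".join(code[:n] for code, n in zip(char_full_codes, counts))


# The two-character example from rule 1.
print(word_full_code(["yeuu", "sdrv"]))  # yeusdr
```

Every branch yields exactly six codes, provided each character's full code has at least as many codes as are taken from it.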

(VII) Compiling the brevity codes of words from their full codes

The input method adapts into brevity codes the full codes of all words that can be so adapted, and a word adapted to a brevity code retains only the brevity code in the system coding table, so that the user inputs it only by the brevity code.

The brevity code of a word must be distinguished not only from the codes of other words but also from the codes of characters; that is, the brevity code of a word never coincides with the code of a character.

Whether the full code of a word can be adapted into a brevity code also depends on whether a character or another word has a greater need for that brevity code.

When the brevity code of a word conflicts with the code of a character, the code with fewer codes is generally given to the character; only when the character is extremely rare and the word is fairly common is the code with fewer codes given to the word (note: the full codes of words and the codes of characters, including both the brevity codes and the full codes of characters, are put together in an EXCEL table, sorted, and counted, codes are deleted or added, and the adaptation is carried out).

The principles and benefits of adapting the full codes of words into brevity codes are:

1. Duplicate codes are eliminated and the same-code rate is reduced; the number of codes per word is reduced and the input speed is improved. Suppose that, among all word codes, five words have the same full code "YYOYYO". If four of them can be adapted to the brevity codes "YY", "YYO", "YYOY" and "YYOYY", and the full codes of those four are deleted so that only the brevity codes remain, none of the five words shares a code with another, and once the codes are sorted alphabetically in the system coding table the user never has to select a word on the screen when inputting these five words. Conversely, if none of the five is adapted to a brevity code, every one of them except the first listed in the system coding table has to be selected on the screen.

2. Among words whose full codes share the same leading codes, the brevity code with the fewest codes is allocated to the most common word, to improve the input speed. For example, in the coding system with the character code after the root codes, among words whose full codes begin with the three codes "KIZ", the brevity code "KIZ" is allocated to the most common of them, "china"; among words whose full codes begin with the four codes "KIZI", the brevity code "KIZI" is allocated to the most common of them, "center"; and among words whose full codes begin with the five codes "KIZIS", the brevity code "KIZIS" is allocated to the most common of them, "Chinese meal".
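The requirement that the brevity code of a word differ from every character code and every other word code can be checked mechanically. The specification performs this check by sorting and counting in an EXCEL table; the Python sketch below is an assumed equivalent. The function name `find_shared_codes`, the placeholder entries `char_a` and `char_b`, and the conflict itself are hypothetical and serve only to show what the check reports.

```python
from collections import defaultdict


def find_shared_codes(char_table, word_table):
    """Return {code: [owners]} for every code claimed by more than one entry.

    char_table : {character: code}
    word_table : {word: code}
    """
    owners = defaultdict(list)
    for char, code in char_table.items():
        owners[code].append(("character", char))
    for word, code in word_table.items():
        owners[code].append(("word", word))
    # Sorting by code mirrors the alphabetical sort of the coding table.
    return {code: who for code, who in sorted(owners.items()) if len(who) > 1}


# Hypothetical tables: the placeholder character "char_a" has been given the
# same code as the word "china", which the rules above do not allow.
chars = {"char_a": "KIZ", "char_b": "W"}
words = {"china": "KIZ", "center": "KIZI", "Chinese meal": "KIZIS"}
print(find_shared_codes(chars, words))
# {'KIZ': [('character', 'char_a'), ('word', 'china')]}
```

Each reported conflict would then be resolved by hand under the priority rule of this section: the shorter code normally goes to the character, unless the character is extremely rare and the word is fairly common.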

5. Chinese character system coding table (coding system with the character code arranged in front of the root codes)

[Note: 1. Entries are sorted alphabetically and arranged in columns, and can be consulted when checking this specification; 2. 8959 Chinese characters are included, of which only 360 (mostly rarely used characters) have non-unique codes; 3. For characters with more than one code, only the primary code is listed, for reasons of space.]

Claims

1. A Chinese character input method for computers and mobile phones, which encodes Chinese characters and words using the pronunciation of characters, the shape of character roots, and the association between root shape and pronunciation, and inputs the codes on the English keyboard of a computer or mobile phone in order to input Chinese characters, characterized in that:

the roots of Chinese characters are divided into two categories, strokes and character-forming components, and a root code is set for each root, wherein the code of a stroke is a key-name English letter different from the pinyin initials, and the code of a character-forming component is the pinyin initial of that component when it stands as an independent character;

of the four codes of the full code of a character, three are root codes and one is the character code, for which the pinyin initial of the character is used; according to the position of the character code in the full code there are two coding systems, of which the user selects one, namely a coding system in which the character code is arranged before the three root codes and a coding system in which the character code is arranged after the three root codes;

the first of the three root codes is taken, under the rule that the root from which a code is taken must be maximized, as the code of the first root written when the character is written;

the second of the three root codes is taken, under the same rule, as the code of the last root written when the character is written; if the character is a stroke character, the second root code is served by the first root code in the coding system in which the character code is arranged before the three root codes, and by the pinyin initial of the character in the coding system in which the character code is arranged after the three root codes;

the third of the three root codes is taken, under the same rule, as the code of the last root written when the root from which the second root code was taken is written; if the character is a stroke character, or the root from which the second root code was taken is a stroke, the third root code is served by the second root code in the coding system in which the character code is arranged before the three root codes, and by the pinyin initial of the character in the coding system in which the character code is arranged after the three root codes;

the full codes of characters that can be adapted into brevity codes are adapted into brevity codes, and a character so adapted retains only its brevity code in the system coding table, so that the user inputs it only by the brevity code; the brevity code of a character is defined as the code obtained by omitting the last one, two, or three codes of the full code and keeping only the first one, two, or three codes, which distinguishes the character from all other codes with the fewest codes, so that the brevity codes of characters are all unique codes;

same-code characters, which share a full code with other characters, are processed as follows: when two characters share one full code, one code identical to the last code is appended to the full code of one of them; when three characters share one full code, the full code of one character is kept unchanged, one code identical to the last code is appended to the full code of another character, and two codes identical to the last code are appended to the full code of the third character; when four or more characters share one full code, the full code of one character is kept unchanged, one code identical to the last code is appended to the full code of another character, and two codes identical to the last code are appended to the full codes of the remaining characters;

the full code of a word has six codes: for a two-character word, the first, second, and third codes of the full code of the first character are taken, then the first, second, and third codes of the full code of the second character, and these are joined in order; for a three-character word, the first and second codes of the full code of the first character are taken, then the first and second codes of the full code of the second character, and finally the first and second codes of the full code of the third character, and these are joined in order; for a four-character word, the first code of the full code of the first character is taken, then the first code of the full code of the second character, then the first and second codes of the full code of the third character, and finally the first and second codes of the full code of the fourth character, and these are joined in order; for a five-character word, the first code of the full code of each of the first four characters is taken, then the first and second codes of the full code of the last character, and these are joined in order; for a six-character word, the first code of the full code of each character is taken in the order of the characters and joined in order; for a word of more than six characters, the first code of the full code of each of the first six characters is taken in the order of the characters and joined in order;

the full codes of words that can be adapted into brevity codes are adapted into brevity codes, the brevity code of a word being distinguished not only from the codes of other words but also from the codes of characters, and a word so adapted retains only its brevity code in the system coding table, so that the user inputs it only by the brevity code; the brevity code of a word is defined as the code obtained by omitting the last one, two, three, four, or five codes of the full code of the word and keeping only the first one, two, three, four, or five codes, which distinguishes the word from all other codes with the fewest codes, so that the brevity codes of words are all unique codes.