WO1999041680A3 - Word segmentation in Chinese text - Google Patents
Word segmentation in Chinese text
- Publication number
- WO1999041680A3 (application PCT/IB1999/000320)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- words
- characters
- combination
- character
- facility
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/268—Morphological analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/53—Processing of non-Latin text
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Probability & Statistics with Applications (AREA)
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2000531795A JP4573432B2 (ja) | 1998-02-13 | 1999-01-13 | Word segmentation method for Chinese-character text |
EP99902779A EP1055182A2 (fr) | 1998-02-13 | 1999-01-13 | Word segmentation in Chinese text |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US2358698A | 1998-02-13 | 1998-02-13 | |
US09/023,586 | 1998-02-13 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO1999041680A2 (fr) | 1999-08-19 |
WO1999041680A3 (fr) | 1999-11-25 |
Family
ID=21816034
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IB1999/000320 WO1999041680A2 (fr) | 1998-02-13 | 1999-01-13 | Word segmentation in Chinese text |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP1055182A2 (fr) |
JP (2) | JP4573432B2 (fr) |
CN (1) | CN1114165C (fr) |
WO (1) | WO1999041680A2 (fr) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6810375B1 (en) * | 2000-05-31 | 2004-10-26 | Hapax Limited | Method for segmentation of text |
CN1545665A (zh) * | 2001-06-29 | 2004-11-10 | 英特尔公司 | Predictive cascading algorithm for a multi-parser architecture |
FR2880708A1 (fr) * | 2005-01-11 | 2006-07-14 | Vision Objects Sa | Method for searching in ink by dynamic query conversion |
CN100424685C (zh) * | 2005-09-08 | 2008-10-08 | 中国科学院自动化研究所 | Hierarchical syntactic parsing method and device for long Chinese sentences based on punctuation processing |
US8310461B2 (en) | 2010-05-13 | 2012-11-13 | Nuance Communications Inc. | Method and apparatus for on-top writing |
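CN103177089A (zh) * | 2013-03-08 | 2013-06-26 | 北京理工大学 | Hierarchical recognition method for sentence-meaning component relations based on central chunks |
CN107748744B (zh) * | 2017-10-31 | 2021-01-26 | 广东小天才科技有限公司 | Method and device for establishing an outline-box knowledge base |
CN110955748B (zh) * | 2018-09-26 | 2022-10-28 | 华硕电脑股份有限公司 | Semantic processing method, electronic device, and non-transitory computer-readable recording medium |
CN109670123B (zh) * | 2018-12-28 | 2021-02-26 | 杭州迪普科技股份有限公司 | Data processing method and device |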
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5806021A (en) * | 1995-10-30 | 1998-09-08 | International Business Machines Corporation | Automatic segmentation of continuous text using statistical approaches |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2798931B2 (ja) * | 1988-04-26 | 1998-09-17 | 健 楠井 | Chinese speech-sound segmentation method and speech-sound-to-kanji conversion method |
US5448474A (en) * | 1993-03-03 | 1995-09-05 | International Business Machines Corporation | Method for isolation of Chinese words from connected Chinese text |
JPH08339383A (ja) * | 1995-04-11 | 1996-12-24 | Ricoh Co Ltd | Document retrieval device and dictionary creation device |
1999
- 1999-01-13 JP JP2000531795A (JP4573432B2): not active, Expired - Fee Related
- 1999-01-13 CN CN99802944A (CN1114165C): not active, Expired - Fee Related
- 1999-01-13 EP EP99902779A (EP1055182A2): not active, Withdrawn
- 1999-01-13 WO PCT/IB1999/000320 (WO1999041680A2): not active, Application Discontinuation
2010
- 2010-02-23 JP JP2010037953A (JP5100770B2): not active, Expired - Fee Related
Non-Patent Citations (2)
Title |
---|
CHARNG-KANG FAN ET AL.: "Automatic Word Identification in Chinese Sentences by the Relaxation Technique", COMPUTER PROCESSING OF CHINESE & ORIENTAL LANGUAGES, vol. 4, no. 1, November 1988 (1988-11-01), pages 33 - 56, XP002114839 * |
XIAOHONG HUANG ET AL: "A Quick Method for Chinese Word Segmentation", IEEE CONF. ON INTELLIGENT PROCESSING SYSTEMS, 28 October 1997 (1997-10-28) - 31 October 1997 (1997-10-31), pages 1773 - 1776, XP002114838 * |
Also Published As
Publication number | Publication date |
---|---|
JP2002503849A (ja) | 2002-02-05 |
JP4573432B2 (ja) | 2010-11-04 |
CN1114165C (zh) | 2003-07-09 |
JP2010157260A (ja) | 2010-07-15 |
JP5100770B2 (ja) | 2012-12-19 |
CN1290371A (zh) | 2001-04-04 |
EP1055182A2 (fr) | 2000-11-29 |
WO1999041680A2 (fr) | 1999-08-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2001290464A1 (en) | Method for normalizing case | |
WO1999062000A3 (fr) | Spelling and grammar checking system | |
CY2579B1 (en) | Text processor | |
WO2006052858A8 (fr) | Apparatus and method for providing a visual indication of character ambiguity during text entry | |
WO2006039398A3 (fr) | Methods and systems for selecting a language for text segmentation | |
HK1046786A1 (zh) | Keyboard system with automatic correction | |
WO1999008390A3 (fr) | Method for entering Japanese text using a keyboard with only basic kana characters | |
TW428137B (en) | Sentence processing apparatus and method thereof | |
EP1178408A3 (fr) | Segmenter for a natural language processing system | |
JP2002517039A5 (fr) | ||
WO1999041680A3 (fr) | Word segmentation in Chinese text | |
DE60045283D1 (fr) | ||
US5619563A (en) | Mnemonic number dialing plan | |
UA24036C2 (uk) | Dictionary of an alphabetic foreign language | |
EP1359515A3 (fr) | System and method for filtering Far East languages | |
EP1248183A3 (fr) | Reduced keyboard disambiguation system | |
US20200273370A1 (en) | Interlinear targum | |
CN1200332C (zh) | Computer input method for Chinese characters | |
CN101957663B (zh) | Wubi (five-stroke) Chinese character input method | |
WO2009116835A3 (fr) | Device and method for inputting Japanese characters | |
Tumasonis | Encoding of Lithuanian accented letters | |
EP1113413A3 (fr) | Workstation with a font cache | |
Cartlidge | A Book of Middle English. | |
Siu-Pong et al. | 3Cantonese Romanization | |
Vasilev | The Numerals from 200 to 900 in Serbo-Croatian |
Legal Events
Code | Title | Description |
---|---|---|
WWE | WIPO information: entry into national phase | Ref document number: 99802944.0; country of ref document: CN |
AK | Designated states | Kind code of ref document: A2; designated state(s): CA CN JP KR |
AL | Designated countries for regional patents | Kind code of ref document: A2; designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
AK | Designated states | Kind code of ref document: A3; designated state(s): CA CN JP KR |
AL | Designated countries for regional patents | Kind code of ref document: A3; designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
NENP | Non-entry into the national phase | Ref country code: KR |
WWE | WIPO information: entry into national phase | Ref document number: 1999902779; country of ref document: EP |
WWP | WIPO information: published in national office | Ref document number: 1999902779; country of ref document: EP |
WWW | WIPO information: withdrawn in national office | Ref document number: 1999902779; country of ref document: EP |