CN111460809B - Arabic place name proper name transliteration method, device, translation equipment and storage medium - Google Patents

Publication number
CN111460809B
Authority
CN
China
Prior art keywords
arabic
transliterated
transliteration
romanized
place name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN202010234562.8A
Other languages
Chinese (zh)
Other versions
CN111460809A (en)
Inventor
毛曦
马维军
王继周
岳振华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chinese Academy of Surveying and Mapping
Original Assignee
Chinese Academy of Surveying and Mapping
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chinese Academy of Surveying and Mapping
Priority to CN202010234562.8A
Publication of CN111460809A
Priority to AU2021100730A (AU2021100730A4)
Application granted
Publication of CN111460809B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29: Geographical information databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Remote Sensing (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a method, a device, translation equipment and a storage medium for transliterating the proper names of Arabic place names. The method comprises the following steps: romanizing the Arabic place name to be transliterated to obtain a romanized Arabic place name; preprocessing a standard transliteration table to obtain a target transliteration table; and matching the romanized Arabic place name against the target transliteration table to obtain a transliteration result. Compared with manual translation, this improves the efficiency of translating the proper names of Arabic place names, reduces labor cost, and makes translation errors easy to check.

Description

Arabic place name proper name transliteration method, device, translation equipment and storage medium

Technical Field

The invention relates to the technical field of place name translation in geographic information, and in particular to a method, device, translation equipment and storage medium for transliterating the proper names of Arabic place names.

Background

Place name translation means converting the expression of a geographical entity in one language into its expression in another language. The proper name of a place name is the specific term that designates a particular geographical entity and distinguishes it from other entities of the same kind; it is one of the two constituent parts of a place name. The standard 《外语地名汉字译写导则》 (guidelines for transcribing foreign-language place names into Chinese characters, GB/T17693.3-2009) specifies the general transliteration of proper names.

The proper names of Arabic place names are translated in one of two ways: directly from the original Arabic into Chinese, or by way of the romanized Arabic place name. At present, Arabic proper names are mostly transliterated by hand, with each name matched mechanically against the transliteration table. In large-scale work this approach is inefficient, labor-intensive, and hard to check for errors. Translation must also follow the rules prescribed for different regions, languages and categories of place names. As a result, existing machine translation methods cannot solve the transliteration of Arabic place names independently and efficiently.

Summary of the Invention

In view of this, a method, device, translation equipment and storage medium for transliterating the proper names of Arabic place names are provided, to address the low efficiency, high cost and difficulty of error checking that affect manual translation of such names in the prior art.

The invention adopts the following technical solutions:

In a first aspect, an embodiment of the present application provides a method for transliterating the proper names of Arabic place names, the method comprising:

romanizing the Arabic place name to be transliterated, to obtain a romanized Arabic place name to be transliterated;

preprocessing a standard transliteration table to obtain a target transliteration table;

matching the romanized Arabic place name to be transliterated against the target transliteration table to obtain a transliteration result.

In a second aspect, an embodiment of the present application provides a device for transliterating the proper names of Arabic place names, the device comprising:

a romanization module, configured to romanize the Arabic place name to be transliterated, to obtain a romanized Arabic place name to be transliterated;

a transliteration table preprocessing module, configured to preprocess a standard transliteration table to obtain a target transliteration table;

a transliteration module, configured to match the romanized Arabic place name to be transliterated against the target transliteration table to obtain a transliteration result.

In a third aspect, an embodiment of the present application provides translation equipment, the translation equipment comprising:

a processor, and a memory connected to the processor;

the memory is configured to store a computer program, the computer program being used at least to execute the method for transliterating the proper names of Arabic place names described in the first aspect of the embodiments of the present application;

the processor is configured to call and execute the computer program in the memory.

In a fourth aspect, an embodiment of the present application provides a storage medium storing a computer program which, when executed by a processor, implements the steps of the method for transliterating the proper names of Arabic place names described in the first aspect.

With the above technical solutions, the invention romanizes the Arabic place name to be transliterated to obtain a romanized Arabic place name; preprocesses the standard transliteration table to obtain a target transliteration table; and matches the romanized Arabic place name against the target transliteration table to obtain a transliteration result. Names are translated according to the specific pronunciation characteristics of Arabic proper names, so the automatic translation of these names is handled independently and efficiently, labor cost is reduced, and errors are easy to check.

Brief Description of the Drawings

To explain the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed for describing them are introduced briefly below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flow chart of a method for transliterating the proper names of Arabic place names provided by an embodiment of the invention;

Fig. 2 is a flow chart of another method for transliterating the proper names of Arabic place names provided by an embodiment of the invention;

Fig. 3 is a diagram of a transliteration example applicable to an embodiment of the invention;

Fig. 4 is a schematic structural diagram of a device for transliterating the proper names of Arabic place names provided by an embodiment of the invention;

Fig. 5 is a schematic structural diagram of translation equipment provided by an embodiment of the invention.

Detailed Description

To make the purpose, technical solutions and advantages of the invention clearer, the technical solutions are described in detail below. The described embodiments are only some of the embodiments of the invention, not all of them. All other implementations that a person of ordinary skill in the art obtains from these embodiments without creative effort fall within the scope of protection of the invention.

Embodiments

Fig. 1 is a flow chart of a method for transliterating the proper names of Arabic place names provided by an embodiment of the invention. The method can be executed by the transliteration device provided by an embodiment of the invention, and the device can be implemented in software and/or hardware. Referring to Fig. 1, the method may include the following steps:

S101. Romanize the Arabic place name to be transliterated to obtain a romanized Arabic place name to be transliterated.

Romanization, also called Latinization, is a linguistic term for converting a phonetic writing system that does not use the Latin (Roman) alphabet into the Latin writing system. Following the rules and conversion table of the chosen scheme, each non-Latin character of the source system (including diacritics and two-character sequences that represent a single phoneme) is mapped faithfully, one for one, to Latin characters of the target system. In addition, for ease of description, the Arabic language (阿拉伯语) is abbreviated as 阿语 in the embodiments of the present application.

In the embodiments of the present application, the Arabic place name to be transliterated is first romanized to obtain the romanized Arabic place name to be transliterated. The Arabic place names here are mainly the proper names of place names.
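As a rough illustration of step S101, the sketch below romanizes a fully vocalized Arabic string character by character. The patent does not name a romanization system, so the tiny mapping (four consonants and four marks) and the function name `romanize` are assumptions made only for this example; a real system would also have to cover the remaining letters, long vowels, gemination and the definite article.

```python
# Hypothetical, deliberately incomplete mapping for illustration only.
CONSONANTS = {
    "\u0628": "b",   # ARABIC LETTER BEH
    "\u0634": "sh",  # ARABIC LETTER SHEEN
    "\u0645": "m",   # ARABIC LETTER MEEM
    "\u0646": "n",   # ARABIC LETTER NOON
}
VOWEL_MARKS = {
    "\u064e": "a",  # FATHA
    "\u064f": "u",  # DAMMA
    "\u0650": "i",  # KASRA
    "\u0652": "",   # SUKUN: no vowel follows the consonant
}


def romanize(text: str) -> str:
    """Map each Arabic character to Latin letters, failing loudly on gaps."""
    out = []
    for ch in text:
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
        elif ch in VOWEL_MARKS:
            out.append(VOWEL_MARKS[ch])
        else:
            raise ValueError(f"no romanization rule for U+{ord(ch):04X}")
    return "".join(out)


# noon + fatha, sheen + kasra, beh + fatha
print(romanize("\u0646\u064e\u0634\u0650\u0628\u064e"))  # nashiba
```

Raising an error on an unmapped character, rather than passing it through silently, keeps gaps in the mapping visible, which fits the stated goal of making errors easy to check.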

S102. Preprocess the standard transliteration table to obtain a target transliteration table.

The standard transliteration table is the table consulted in the prior art when transliterating the proper names of Arabic place names, for example the Arabic-Chinese transliteration part of 《外语地名汉字译写导则阿拉伯语》 (the Arabic volume of the guidelines for transcribing foreign-language place names into Chinese characters). Specifically, the vowel letters, the consonant letter combinations and the corresponding Chinese characters in that table are persisted, which yields the target transliteration table used in the embodiments of the present application. Compared with the original table, the target transliteration table is better suited to fast and accurate lookup during machine translation.

S103. Match the romanized Arabic place name to be transliterated against the target transliteration table to obtain a transliteration result.

Specifically, a forward maximum matching algorithm is applied: the syllables of the romanized Arabic place name are matched against the vowel units and consonant units of the table, longest match first, to obtain the corresponding Chinese characters, which together form the transliteration result.

With the above technical solutions, the invention romanizes the Arabic place name to be transliterated to obtain a romanized Arabic place name; preprocesses the standard transliteration table to obtain a target transliteration table; and matches the romanized Arabic place name against the target transliteration table to obtain a transliteration result. Names are translated according to the specific pronunciation characteristics of Arabic proper names, so the automatic translation of these names is handled independently and efficiently, labor cost is reduced, and errors are easy to check.

Fig. 2 is a flow chart of a method for transliterating the proper names of Arabic place names provided by another embodiment of the invention, which builds on the embodiment above. Referring to Fig. 2, the method may include the following steps:

S201. Romanize the Arabic place name to be transliterated to obtain a romanized Arabic place name to be transliterated.

S202. Extract the vowel letters and consonant letters from the headers of the standard transliteration table.

The 《外语地名汉字译写导则阿拉伯语》 compiled by the National Technical Committee for the Standardization of Geographical Names and others, together with its Arabic-Chinese transliteration table, guides the transcription of Arabic place names into Chinese characters in China. That Arabic-Chinese transliteration table is the standard transliteration table of the embodiments of the present application. In the standard table, the horizontal header lists the Arabic vowel and consonant letters together with the corresponding romanized consonant letters, the vertical header lists the Arabic vowel signs together with the corresponding romanized vowel letters, and the cell where a row and a column cross gives the Chinese character for that combination of vowel and consonant. When a consonant carries the sign that marks the absence of a vowel, so that only a single consonant letter remains after romanization, it is transliterated with the Chinese character in the first row. Specifically, the vowel letters and consonant letters in the headers of the standard table are extracted here so that they can be stored.

S203. Obtain the corresponding Chinese characters according to the row-column correspondence of the standard transliteration table.

S204. Enter the vowel letters, the consonant letters and the corresponding Chinese characters into a preset table to obtain the target transliteration table.

The preset table may be an empty Excel sheet: the header vowel letters and consonant letters of the standard table are entered into the sheet, and the corresponding Chinese characters are then entered according to the row-column correspondence, which gives the target transliteration table.
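To make steps S202 to S204 concrete, here is a minimal sketch that flattens a header-plus-grid table into a lookup dictionary. It is not the patent's implementation: the source stores the table in an Excel sheet, whereas this example uses an in-memory dict, and the function name `build_target_table`, the use of an empty string for the vowel-less first row, and all sample characters are placeholders assumed for illustration rather than entries copied from the standard table.

```python
def build_target_table(consonants, vowels, grid):
    """Flatten a header-plus-grid table into a {romanized unit: hanzi} dict.

    grid[r][c] holds the character for consonants[c] + vowels[r].
    """
    table = {}
    for r, vowel in enumerate(vowels):
        for c, consonant in enumerate(consonants):
            hanzi = grid[r][c]
            if hanzi:  # skip empty cells
                table[consonant + vowel] = hanzi
    return table


# Placeholder data, NOT copied from the standard table.
consonants = ["b", "n"]
vowels = ["", "a", "i"]  # "" models the first row: a consonant with no vowel
grid = [
    ["卜", "恩"],
    ["巴", "纳"],
    ["比", "尼"],
]
target = build_target_table(consonants, vowels, grid)
print(target["ba"], target["ni"], target["n"])  # 巴 尼 恩
```

Keying the dictionary by the romanized unit gives constant-time lookup for each candidate string during matching, which is one plausible reading of the "fast and accurate lookup" the text attributes to the target table.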

S205. Traverse the entire letter sequence of the romanized Arabic place name to be transliterated and find the position of each vowel letter.

Before syllable division is described, the Arabic alphabet is introduced briefly. The Arabic alphabet itself has no vowel letters; it indicates pronunciation with vowel signs (pronunciation marks) written on the consonants. Excluding the gemination sign, there are 12 such marks in total, mainly vowel signs. The short vowel signs are the open-mouth sign (开口符, fatḥah; inline image BDA0002430550250000051), the even-teeth sign (齐齿符, kasrah; inline image BDA0002430550250000052) and the closed-mouth sign (合口符, ḍammah; inline image BDA0002430550250000053). Combined with different consonant letters, they represent different Arabic syllables. However, a vowel sign only marks the vowel of an Arabic letter; it is not a phonetic symbol. For example, the three long vowel signs do not always produce the corresponding long vowel when combined with a consonant: the sign shown in inline image BDA0002430550250000061 is pronounced a when combined with the consonant letter shown in inline image BDA0002430550250000062, and ā when combined with other consonants. In addition, every Arabic syllable begins with a consonant combined with a vowel, so the basic "one consonant, one vowel" structure is fixed.

In a specific example, syllables usually take one of the following four forms: (1) short open syllable, that is, consonant + short vowel; all three syllables of na-shi-ba are short open syllables; (2) long open syllable, that is, consonant + long vowel, such as shū in na-shū-ba; (3) short closed syllable, that is, consonant + short vowel + consonant, such as man in man-sha-'a; (4) long closed syllable, that is, consonant + long vowel + consonant, such as wān in nash-wān.
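The four syllable forms above can be told apart with two checks: whether the syllable contains a long vowel, and whether it ends in a vowel. The sketch below is only an illustration; the vowel inventory (a, i, u, ā, ī, ū) and the function name `classify_syllable` are assumptions, since the source does not list the romanized vowel letters explicitly.

```python
import unicodedata

SHORT_VOWELS = set("aiu")
LONG_VOWELS = set("\u0101\u012b\u016b")  # ā, ī, ū


def classify_syllable(syllable: str) -> str:
    """Return one of the four syllable types for a romanized syllable."""
    # Normalize so that a letter typed as "a" + combining macron still matches.
    s = unicodedata.normalize("NFC", syllable)
    length = "long" if any(ch in LONG_VOWELS for ch in s) else "short"
    shape = "open" if s[-1] in SHORT_VOWELS | LONG_VOWELS else "closed"
    return f"{length} {shape}"


# The four example syllables from the text: na, shū, man, wān.
for syl in ["na", "sh\u016b", "man", "w\u0101n"]:
    print(syl, "->", classify_syllable(syl))
```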

Specifically, in the embodiments of the present application, the entire letter sequence of the romanized Arabic place name to be transliterated is traversed to find the position of each vowel letter.

S206. Starting from the position of each vowel letter, traverse from right to left and determine the consonant letters paired with the current vowel letter.

Specifically, once the position of each vowel letter is known, a right-to-left traversal starts from each of these positions to determine the consonant letters paired with each vowel letter.

Optionally, the consonant letters may be determined as follows: starting from the position of each vowel letter, traverse from right to left; if one consonant letter is found, combine that consonant letter and the vowel letter directly into one syllable; if two consonant letters are found, assign the left consonant to the left syllable and the right consonant to the right syllable; if three or more consonant letters are found, assign each consonant to a syllable according to the proximity principle; in this way, determine every consonant letter paired with the current vowel letter.

That is, the traversal starts from the position of each vowel letter and runs from right to left. If one consonant letter is found, it is combined directly with the current vowel letter into one syllable. If two consonant letters are found, the syllable boundary falls between them: the left consonant goes to the left syllable and the right consonant to the right syllable. If three or more consonants are found, each is assigned to a syllable according to the proximity principle, under which a consonant letter belongs to the vowel letter nearest to it.
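The division rules above can be sketched as follows. This is an interpretation rather than the patent's code: it computes each syllable boundary from the number of consonants between two neighboring vowels instead of literally scanning right to left, it treats a short assumed list of two-letter consonants (`DIGRAPHS`) as single units, and it sends the middle consonant of an odd-sized cluster to the right-hand syllable. All names are hypothetical.

```python
import unicodedata

VOWELS = set("aiu\u0101\u012b\u016b")      # a, i, u, ā, ī, ū (assumed inventory)
DIGRAPHS = ("sh", "kh", "gh", "th", "dh")  # assumed two-letter consonants


def tokenize(name: str) -> list[str]:
    """Split a romanized name into units, keeping digraphs together."""
    name = unicodedata.normalize("NFC", name.lower())
    units, i = [], 0
    while i < len(name):
        step = 2 if name[i:i + 2] in DIGRAPHS else 1
        units.append(name[i:i + step])
        i += step
    return units


def syllabify(name: str) -> list[str]:
    """Divide a romanized name into syllables around its vowels."""
    units = tokenize(name)
    vowel_pos = [i for i, u in enumerate(units) if u in VOWELS]
    if not vowel_pos:
        return ["".join(units)] if units else []
    starts = [0]  # consonants before the first vowel join the first syllable
    for left, right in zip(vowel_pos, vowel_pos[1:]):
        gap = right - left - 1  # consonant units between two vowels
        # gap 1: the consonant opens the right syllable.
        # gap 2: left consonant closes the left syllable, right one opens the next.
        # gap 3+: split by proximity; a middle consonant goes right (assumed tie-break).
        starts.append(left + 1 + gap // 2)
    ends = starts[1:] + [len(units)]
    return ["".join(units[s:e]) for s, e in zip(starts, ends)]


print(syllabify("nashiba"))   # ['na', 'shi', 'ba']
print(syllabify("mansha'a"))  # ['man', 'sha', "'a"]
```

One known limitation of this sketch: a name in which s and h belong to different syllables would be tokenized incorrectly, so a production version would need a less naive digraph rule.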

S207. Complete the traversal to obtain the syllable division result of the romanized Arabic place name to be transliterated.

In this way, traversing from right to left in turn yields all of the syllable divisions.

S208. According to the syllable division result, match the romanized Arabic place name to be transliterated against the target transliteration table syllable by syllable to obtain a transliteration result.

The basic idea of the forward maximum matching algorithm is to take the whole text to be segmented, or a substring of it, from left to right and match it against the target transliteration table. If the match succeeds, the current string is split off; otherwise the candidate is shortened by one character and matching continues, or the syllable containing the string that failed to match is skipped and matching continues.

Optionally, the matching process may be implemented as follows: divide each syllable into several strings; going from left to right through the syllables, match each string of the current syllable against the target transliteration table to obtain the corresponding Chinese characters; continue until every string in every syllable has been matched.

Maximum matching algorithms mainly include forward maximum matching, reverse maximum matching and bidirectional matching. They share one principle: split off a candidate string and compare it with the lexicon; if it is an entry, record it; otherwise lengthen or shorten the candidate by one character and compare again, stopping when only a single character remains. If that single character cannot be matched, it is treated as an unregistered item.

In a specific example, the syllables that have already been divided are transliterated one by one with the maximum matching algorithm. First, the syllables are taken one at a time from left to right. Assuming the current syllable has length n, it is matched against the elements of the syllable set formed from the vowel-consonant combinations. If the match succeeds, the Chinese character corresponding to the current syllable is retrieved and stored, the current syllable is removed from the entry, and the next syllable is taken and matched, until all syllables of the entry have been processed. If the match fails, n-1 characters of the current syllable are taken from left to right and matched against the syllable set again. Finally, the transliteration results of all syllables are spliced together and output. By way of example, n is a positive integer greater than 1.
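The per-syllable matching described above can be sketched as below. This is a minimal illustration under assumptions, not the patent's code: the function names, the bracket notation used to flag an unregistered character, and every entry of `demo_table` are hypothetical placeholders rather than values from the standard table.

```python
def match_syllable(syllable: str, table: dict[str, str]) -> str:
    """Forward maximum matching inside one syllable."""
    out = []
    rest = syllable
    while rest:
        # Try the longest prefix first (length n), then n-1, and so on.
        for end in range(len(rest), 0, -1):
            hanzi = table.get(rest[:end])
            if hanzi is not None:
                out.append(hanzi)
                rest = rest[end:]
                break
        else:
            # No prefix matched: flag the character as unregistered.
            out.append(f"[{rest[0]}]")
            rest = rest[1:]
    return "".join(out)


def transliterate(syllables: list[str], table: dict[str, str]) -> str:
    """Match syllables left to right and splice the per-syllable results."""
    return "".join(match_syllable(s, table) for s in syllables)


# Placeholder entries, NOT taken from the standard table.
demo_table = {"na": "纳", "shi": "希", "ba": "巴", "ma": "马", "n": "恩"}
print(transliterate(["na", "shi", "ba"], demo_table))  # 纳希巴
print(transliterate(["man"], demo_table))              # 马恩
```

In the second call the whole syllable `man` is not in the table, so the prefix is shortened to `ma`, which matches, and the remaining `n` is then matched on its own.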

Fig. 3 shows a transliteration example, in which the name shown in inline image BDA0002430550250000071 is the romanized Arabic place name to be transliterated. In Fig. 3, the name is first divided into syllables and then matched by maximum matching, which gives the transliteration result "海萨耶赫".

It should be noted that S202-S204 is the process of obtaining the target transliteration table and S205-S207 is the process of dividing syllables. The two processes have no required order; Fig. 2 shows only one example.

In the embodiments of the present application, the vowel letters, consonant letters and corresponding Chinese characters of the standard transliteration table are first entered separately to obtain the target transliteration table, so that syllables can be matched against it. The position of each vowel letter is then traversed to determine the consonant letters paired with it, which divides the entire letter sequence of the romanized Arabic place name into syllables. Finally, the romanized Arabic place name is matched against the target transliteration table syllable by syllable to obtain the transliteration result. Applying the maximum matching algorithm improves the efficiency and accuracy of matching. Compared with manual transliteration in the prior art, this saves cost and improves transliteration efficiency and accuracy.

Fig. 4 is a schematic structural diagram of a device for transliterating the proper names of Arabic place names provided by an embodiment of the invention. The device is suitable for executing the transliteration method provided by the embodiments of the invention. As shown in Fig. 4, the device may include a romanization module 401, a transliteration table preprocessing module 402 and a transliteration module 403.

The romanization module 401 is configured to romanize the Arabic place name to be transliterated, to obtain a romanized Arabic place name to be transliterated; the transliteration table preprocessing module 402 is configured to preprocess the standard transliteration table to obtain a target transliteration table; the transliteration module 403 is configured to match the romanized Arabic place name to be transliterated against the target transliteration table to obtain a transliteration result.
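One way to picture the three modules working together is as three pluggable callables behind a single entry point. The sketch below is an assumption about how such a device could be wired in software; the class name `TransliterationDevice`, its field names and the stand-in lambdas are all hypothetical, and the source does not prescribe this structure.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TransliterationDevice:
    """Three cooperating modules, mirroring modules 401 to 403."""
    romanize: Callable[[str], str]                   # module 401
    load_target_table: Callable[[], dict[str, str]]  # module 402
    match: Callable[[str, dict[str, str]], str]      # module 403

    def run(self, arabic_name: str) -> str:
        romanized = self.romanize(arabic_name)
        table = self.load_target_table()
        return self.match(romanized, table)


# Stand-in modules so the sketch runs; each would be replaced by a real one.
device = TransliterationDevice(
    romanize=lambda name: {"\u0628\u064e": "ba"}.get(name, name),
    load_target_table=lambda: {"ba": "巴"},  # placeholder entry
    match=lambda text, table: table.get(text, "?"),
)
print(device.run("\u0628\u064e"))  # 巴
```

Keeping the modules as injected callables means each one can be swapped or tested separately, for example by replacing only the table loader.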

With the above technical solutions, the invention romanizes the Arabic place name to be transliterated to obtain a romanized Arabic place name; preprocesses the standard transliteration table to obtain a target transliteration table; and matches the romanized Arabic place name against the target transliteration table to obtain a transliteration result. Names are translated according to the specific pronunciation characteristics of Arabic proper names, so the automatic translation of these names is handled independently and efficiently, labor cost is reduced, and errors are easy to check.

Optionally, the transliteration table preprocessing module 402 is specifically configured to:

extract the vowel letters and consonant letters from the headers of the standard transliteration table;

obtain the corresponding Chinese characters according to the row-column correspondence of the standard transliteration table;

enter the vowel letters, the consonant letters and the corresponding Chinese characters into a preset table to obtain the target transliteration table.

Optionally, the device further includes a syllable division module configured to, before the romanized Arabic place name to be transliterated is matched against the target transliteration table:

traverse the entire letter sequence of the romanized Arabic place name to be transliterated and find the position of each vowel letter;

starting from the position of each vowel letter, traverse from right to left and determine the consonant letters paired with the current vowel letter;

complete the traversal to obtain the syllable division result of the romanized Arabic place name to be transliterated.

Correspondingly, the transliteration module is specifically configured to:

match the romanized Arabic place name to be transliterated against the target transliteration table syllable by syllable, according to the syllable division result.

Optionally, the syllable division module is specifically configured to:

start from the position of each vowel letter and traverse from right to left;

if the number of consonant letters is 1, combine that consonant letter and the vowel letter directly into one syllable;

if the number of consonant letters is 2, assign the current left consonant to the left syllable and the current right consonant to the right syllable;

if the number of consonant letters is 3 or more, assign each consonant to the corresponding syllable according to the proximity principle;

determine every consonant letter paired with the current vowel letter.
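The consonant-assignment rules above can be sketched as follows. Several details are assumptions that the text does not specify: the vowel set is taken to be the plain letters `a e i o u`, a run of adjacent vowels is treated as one nucleus, the input is a single word containing only letters, and in a cluster of three or more consonants a consonant equally close to both vowels goes to the right-hand syllable. The function name `split_syllables` is hypothetical.

```python
import re
from typing import List

VOWELS = "aeiou"  # assumed vowel inventory of the romanization


def split_syllables(name: str) -> List[str]:
    """Split a romanized word into syllables using the 1 / 2 / 3+ consonant rules."""
    # Locate every vowel nucleus (a run of one or more vowel letters).
    nuclei = [(m.start(), m.end())
              for m in re.finditer(f"[{VOWELS}]+", name, flags=re.IGNORECASE)]
    if not nuclei:
        return [name] if name else []

    starts = [0]  # leading consonants belong to the first syllable
    for (_, left_end), (right_start, _) in zip(nuclei, nuclei[1:]):
        cluster = right_start - left_end  # consonants between two nuclei
        if cluster <= 1:
            cut = left_end                 # single consonant joins the vowel on its right
        elif cluster == 2:
            cut = left_end + 1             # left consonant -> left syllable, right -> right
        else:
            cut = left_end + cluster // 2  # proximity: each consonant joins the nearer vowel
        starts.append(cut)

    ends = starts[1:] + [len(name)]        # trailing consonants stay with the last syllable
    return [name[s:e] for s, e in zip(starts, ends)]
```

With these assumptions, `split_syllables("madinat")` gives `["ma", "di", "nat"]` and `split_syllables("basra")` gives `["bas", "ra"]`. A romanization scheme that uses diacritic vowels would need a larger `VOWELS` set.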

Optionally, the transliteration module is specifically configured to:

split each syllable into multiple substrings;

proceeding through the syllables from left to right, input each substring of the current syllable into the target transliteration table for matching to obtain the corresponding Chinese characters;

continue until all substrings of all syllables have been matched.
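The matching loop above can be sketched as a table lookup per substring. The text does not say how a syllable is cut into substrings, so this sketch assumes a greedy longest-prefix match; the function names `match_syllable` and `transliterate_syllables` are hypothetical, and the table entries in the usage note are illustrative placeholders rather than values from the standard table.

```python
from typing import Dict, List


def match_syllable(syllable: str, table: Dict[str, str]) -> str:
    """Match one syllable left to right, taking the longest table entry at each step."""
    out: List[str] = []
    i = 0
    while i < len(syllable):
        # Try the longest remaining substring first, then shorter ones.
        for j in range(len(syllable), i, -1):
            piece = syllable[i:j]
            if piece in table:
                out.append(table[piece])
                i = j
                break
        else:
            # Fail loudly so that unmatched input is easy to check.
            raise KeyError(f"no table entry matches {syllable[i:]!r}")
    return "".join(out)


def transliterate_syllables(syllables: List[str], table: Dict[str, str]) -> str:
    """Process the syllables from left to right and concatenate the results."""
    return "".join(match_syllable(s, table) for s in syllables)
```

With the placeholder table `{"ma": "马", "di": "迪", "na": "纳", "t": "特"}`, the syllables `["ma", "di", "nat"]` are matched as `ma`, `di`, `na`, `t`. Raising an error on an unmatched substring, instead of skipping it, keeps gaps in the table visible.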

The Arabic place-name proper-name transliteration device provided by the embodiments of the present invention can execute the Arabic place-name proper-name transliteration method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to executing that method.

An embodiment of the present invention further provides a translation device. Referring to FIG. 5, which is a schematic structural diagram of a translation device, the translation device includes a processor 510 and a memory 520 connected to the processor 510. The memory 520 is used to store a computer program, and the computer program is at least used to execute the Arabic place-name proper-name transliteration method of the embodiments of the present invention. The processor 510 is used to call and execute the computer program in the memory. The method includes at least the following steps: romanizing the Arabic place name to be transliterated to obtain the romanized place name; preprocessing the standard transliteration table to obtain the target transliteration table; and inputting the romanized place name into the target transliteration table for matching to obtain the transliteration result.

An embodiment of the present invention further provides a storage medium. The storage medium stores a computer program which, when executed by a processor, implements the steps of the Arabic place-name proper-name transliteration method of the embodiments of the present invention: romanizing the Arabic place name to be transliterated to obtain the romanized place name; preprocessing the standard transliteration table to obtain the target transliteration table; and inputting the romanized place name into the target transliteration table for matching to obtain the transliteration result.

It can be understood that the same or similar parts of the above embodiments may be referred to each other, and content not described in detail in one embodiment may be found in the same or similar content of other embodiments.

It should be noted that, in the description of the present invention, the terms "first", "second", and the like are used for descriptive purposes only and should not be understood as indicating or implying relative importance. In addition, unless otherwise specified, "plurality" means at least two.

Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that comprises one or more executable instructions for implementing a specific logical function or step of the process. The scope of the preferred embodiments of the present invention includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention pertain.

It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.

Those of ordinary skill in the art will understand that all or part of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.

In this specification, a description that refers to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that the specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Although embodiments of the present invention have been shown and described above, it can be understood that these embodiments are exemplary and should not be construed as limiting the present invention. Those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.

Claims (8)

1. A method for transliterating Arabic place-name proper names, characterized by comprising:
romanizing an Arabic place name to be transliterated to obtain a romanized Arabic place name to be transliterated;
preprocessing a standard transliteration table to obtain a target transliteration table;
inputting the romanized Arabic place name to be transliterated into the target transliteration table for matching to obtain a transliteration result;
wherein, in the standard transliteration table, the horizontal header comprises the Arabic consonant letters and the corresponding romanized consonant letters, the vertical header comprises the Arabic vowel signs and the corresponding romanized vowel letters, and the intersection of each row and column is the Chinese character corresponding to the combination of that vowel and consonant;
wherein preprocessing the standard transliteration table to obtain the target transliteration table comprises:
extracting the romanized vowel letters and consonant letters from the headers of the standard transliteration table;
obtaining the corresponding Chinese characters according to the row-column correspondence of the standard transliteration table;
entering the romanized vowel letters, the consonant letters, and the corresponding Chinese characters into a preset table to obtain the target transliteration table;
wherein, before the romanized Arabic place name to be transliterated is input into the target transliteration table for matching, the method further comprises:
traversing the entire letter sequence of the romanized Arabic place name to be transliterated and finding the position of each vowel letter;
starting from the position of each vowel letter, traversing from right to left to determine the consonant letters paired with the current vowel letter;
completing the traversal to obtain a syllable division result of the romanized Arabic place name to be transliterated;
correspondingly, inputting the romanized Arabic place name to be transliterated into the target transliteration table for matching comprises:
according to the syllable division result of the romanized Arabic place name to be transliterated, inputting the romanized Arabic place name to be transliterated into the target transliteration table for matching, syllable by syllable.

2. The method according to claim 1, characterized in that starting from the position of each vowel letter and traversing from right to left to determine the consonant letters paired with the current vowel letter comprises:
starting from the position of each vowel letter and traversing from right to left;
if the number of consonant letters is 1, combining that consonant letter and the vowel letter directly into one syllable;
if the number of consonant letters is 2, assigning the current left consonant to the left syllable and the current right consonant to the right syllable;
if the number of consonant letters is 3 or more, assigning each consonant to the corresponding syllable according to the proximity principle;
determining every consonant letter paired with the current vowel letter.

3. The method according to claim 1, characterized in that inputting the romanized Arabic place name to be transliterated into the target transliteration table for matching, syllable by syllable, comprises:
splitting each syllable into multiple substrings;
proceeding through the syllables from left to right, inputting each substring of the current syllable into the target transliteration table for matching to obtain the corresponding Chinese characters;
continuing until all substrings of all syllables have been matched.

4. A device for transliterating Arabic place-name proper names, characterized by comprising:
a romanization module, configured to romanize an Arabic place name to be transliterated to obtain a romanized Arabic place name to be transliterated;
a transliteration table preprocessing module, configured to preprocess a standard transliteration table to obtain a target transliteration table;
a transliteration module, configured to input the romanized Arabic place name to be transliterated into the target transliteration table for matching to obtain a transliteration result;
wherein, in the standard transliteration table, the horizontal header comprises the Arabic consonant letters and the corresponding romanized consonant letters, the vertical header comprises the Arabic vowel signs and the corresponding romanized vowel letters, and the intersection of each row and column is the Chinese character corresponding to the combination of that vowel and consonant;
wherein preprocessing the standard transliteration table to obtain the target transliteration table comprises:
extracting the romanized vowel letters and consonant letters from the headers of the standard transliteration table;
obtaining the corresponding Chinese characters according to the row-column correspondence of the standard transliteration table;
entering the romanized vowel letters, the consonant letters, and the corresponding Chinese characters into a preset table to obtain the target transliteration table;
wherein, before the romanized Arabic place name to be transliterated is input into the target transliteration table for matching, the following is further included:
traversing the entire letter sequence of the romanized Arabic place name to be transliterated and finding the position of each vowel letter;
starting from the position of each vowel letter, traversing from right to left to determine the consonant letters paired with the current vowel letter;
completing the traversal to obtain a syllable division result of the romanized Arabic place name to be transliterated;
correspondingly, inputting the romanized Arabic place name to be transliterated into the target transliteration table for matching comprises:
according to the syllable division result of the romanized Arabic place name to be transliterated, inputting the romanized Arabic place name to be transliterated into the target transliteration table for matching, syllable by syllable.

5. The device according to claim 4, characterized in that the transliteration table preprocessing module is specifically configured to:
extract the vowel letters and consonant letters from the headers of the standard transliteration table;
obtain the corresponding Chinese characters according to the row-column correspondence of the standard transliteration table;
enter the vowel letters, the consonant letters, and the corresponding Chinese characters into a preset table to obtain the target transliteration table.

6. The device according to claim 4, characterized by further comprising a syllable division module configured to perform the following before the romanized Arabic place name to be transliterated is input into the target transliteration table for matching:
traversing the entire letter sequence of the romanized Arabic place name to be transliterated and finding the position of each vowel letter;
starting from the position of each vowel letter, traversing from right to left to determine the consonant letters paired with the current vowel letter;
completing the traversal to obtain a syllable division result of the romanized Arabic place name to be transliterated;
correspondingly, the transliteration module is specifically configured to:
according to the syllable division result of the romanized Arabic place name to be transliterated, input the romanized Arabic place name to be transliterated into the target transliteration table for matching, syllable by syllable.

7. A translation device, characterized by comprising:
a processor, and a memory connected to the processor;
the memory being configured to store a computer program, the computer program being at least used to execute the method for transliterating Arabic place-name proper names according to any one of claims 1-3;
the processor being configured to call and execute the computer program in the memory.

8. A storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the steps of the method for transliterating Arabic place-name proper names according to any one of claims 1-3.

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010234562.8A CN111460809B (en) 2020-03-30 2020-03-30 Arabic place name proper name transliteration method, device, translation equipment and storage medium
AU2021100730A AU2021100730A4 (en) 2020-03-30 2021-02-05 Method and apparatus for transliterating special term of arabic geographical name, translation device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010234562.8A CN111460809B (en) 2020-03-30 2020-03-30 Arabic place name proper name transliteration method, device, translation equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111460809A CN111460809A (en) 2020-07-28
CN111460809B true CN111460809B (en) 2023-03-10

Family

ID=71684984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010234562.8A Expired - Fee Related CN111460809B (en) 2020-03-30 2020-03-30 Arabic place name proper name transliteration method, device, translation equipment and storage medium

Country Status (2)

Country Link
CN (1) CN111460809B (en)
AU (1) AU2021100730A4 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113011135B (en) * 2021-03-03 2024-08-23 科大讯飞股份有限公司 Arabic vowel recovery method, apparatus, device and storage medium
CN113361288B (en) * 2021-06-30 2024-03-12 民政部地名研究所 Automatic foreign language place name Chinese character translation writing method based on word group
CN118070819B (en) * 2024-04-19 2024-07-09 南京师范大学 An LSTM-based automatic Chinese translation model and method for Arabic place names
CN120197626B (en) * 2025-05-27 2025-08-12 陕西天润科技股份有限公司 Intelligent place name and address translation method based on multilingual syllable segmentation

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102393793A (en) * 2004-06-04 2012-03-28 B·F·加萨比安 Systems for Enhanced Data Entry in Mobile and Stationary Environments
US20060080082A1 (en) * 2004-08-23 2006-04-13 Geneva Software Technologies Limited System and method for product migration in multiple languages
EG25474A (en) * 2007-05-21 2012-01-11 Sherikat Link Letatweer Elbarmaguey At Sae Method for translitering and suggesting arabic replacement for a given user input
CN108628846A (en) * 2017-03-22 2018-10-09 湖南本来文化发展有限公司 Based on ES Expert System Models to the interpretation method of Sichuan accent and Arabic

Also Published As

Publication number Publication date
CN111460809A (en) 2020-07-28
AU2021100730A4 (en) 2021-04-22

Similar Documents

Publication Publication Date Title
CN111460809B (en) Arabic place name proper name transliteration method, device, translation equipment and storage medium
US9582489B2 (en) Orthographic error correction using phonetic transcription
KR101083540B1 (en) System and method for transforming vernacular pronunciation with respect to hanja using statistical method
US9852728B2 (en) Process for improving pronunciation of proper nouns foreign to a target language text-to-speech system
Chakravarthi et al. A survey of orthographic information in machine translation
Chea et al. Khmer word segmentation using conditional random fields
CN104239289B (en) Syllabification method and syllabification equipment
CN102135956B (en) A kind of Tibetan language segmenting method based on lexeme mark
Younes et al. Romanized tunisian dialect transliteration using sequence labelling techniques
Li et al. Improving text normalization using character-blocks based models and system combination
US20190286702A1 (en) Display control apparatus, display control method, and computer-readable recording medium
Mon et al. SymSpell4Burmese: Symmetric delete spelling correction algorithm (SymSpell) for burmese spelling checking
CN104331400B (en) A kind of Mongolian code conversion method and device
US9384191B2 (en) Written language learning using an enhanced input method editor (IME)
Jayalatharachchi et al. Data-driven spell checking: the synergy of two algorithms for spelling error detection and correction
Kaur et al. Hybrid approach for spell checker and grammar checker for Punjabi
CN107229611B (en) A Word Alignment-Based Word Segmentation Method for Historical Books
Sunitha et al. A phoneme based model for english to malayalam transliteration
JPS59165179A (en) Dictionary reference method
CN103984420B (en) A kind of Tibetan language intelligent input method based on phonetic
Yang et al. Automatic error detection and correction of text: The state of the art
QasemiZadeh et al. Challenges in persian electronic text analysis
Lehal Conversion between scripts of Punjabi: Beyond simple transliteration
CN107870905B (en) Method for identifying specific vocabulary
Lu et al. Language model for Mongolian polyphone proofreading

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20230310