Om: One tool for many (Indian) languages

Source: Journal of Zhejiang University Science A (Science in Engineer
Many different languages are spoken in India, each the mother tongue of tens of millions of people. While the languages and scripts are distinct from each other, the grammar and the alphabet are similar to a large extent. One common feature is that all the Indian languages are phonetic in nature. In this paper we describe the development of a transliteration scheme, Om, which exploits this phonetic nature of the alphabet. Om uses ASCII characters to represent Indian language alphabets, and thus can be read directly in English by the large number of users who cannot read the scripts of Indian languages other than their mother tongue. It is also useful in computer applications, such as email and chat, where local language tools are not yet available.

Another significant contribution presented in this paper is the development of a text editor for Indian languages that integrates Om input for many Indian languages into a word processor such as Microsoft WinWord. The text editor has also been developed on the Java platform, so it can run on Unix machines as well. We propose this transliteration scheme as a possible standard for Indian language transliteration and keyboard entry.
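To illustrate the idea of one phonetic ASCII spelling serving several scripts, here is a minimal sketch in Python. It is not the Om mapping table, which the abstract does not give: the ASCII keys (`k`, `kh`, `aa`, ...), the function name `transliterate`, and the small subset of letters are all assumptions made for illustration. The sketch relies on the fact that the Unicode blocks for Devanagari, Telugu, and Kannada place corresponding letters at the same offsets, so one table of offsets can drive all three.

```python
# Hypothetical ASCII keys; NOT the actual Om scheme, only an illustration.
# Start of each script's Unicode block; corresponding letters share offsets.
BLOCK_BASE = {"devanagari": 0x0900, "telugu": 0x0C00, "kannada": 0x0C80}

# Consonant offsets within a block (each carries an inherent "a" sound).
CONSONANTS = {
    "k": 0x15, "kh": 0x16, "g": 0x17, "gh": 0x18, "c": 0x1A, "j": 0x1C,
    "t": 0x24, "th": 0x25, "d": 0x26, "dh": 0x27, "n": 0x28,
    "p": 0x2A, "b": 0x2C, "m": 0x2E, "y": 0x2F, "r": 0x30,
    "l": 0x32, "v": 0x35, "s": 0x38, "h": 0x39,
}

# Vowel -> (independent letter offset, dependent sign offset).
# "a" has no sign because it is inherent in every consonant.
VOWELS = {
    "a": (0x05, None), "aa": (0x06, 0x3E), "i": (0x07, 0x3F),
    "ii": (0x08, 0x40), "u": (0x09, 0x41), "uu": (0x0A, 0x42),
}

VIRAMA = 0x4D  # suppresses the inherent vowel of the preceding consonant


def transliterate(word, script):
    """Convert a phonetic ASCII word into the given script."""
    base = BLOCK_BASE[script]
    out = []
    pending = False  # True while the last consonant has no vowel yet
    i = 0
    while i < len(word):
        # Greedy longest match: try two-character keys before one-character keys.
        for size in (2, 1):
            tok = word[i:i + size]
            if len(tok) == size and (tok in CONSONANTS or tok in VOWELS):
                break
        else:
            raise ValueError(f"unrecognised input at {word[i:]!r}")
        i += size
        if tok in CONSONANTS:
            if pending:
                # Two consonants in a row: join them with a virama.
                out.append(chr(base + VIRAMA))
            out.append(chr(base + CONSONANTS[tok]))
            pending = True
        else:
            independent, sign = VOWELS[tok]
            if pending:
                if sign is not None:
                    out.append(chr(base + sign))
            else:
                out.append(chr(base + independent))
            pending = False
    if pending:
        # Word ends in a bare consonant.
        out.append(chr(base + VIRAMA))
    return "".join(out)


for script in BLOCK_BASE:
    print(script, transliterate("amma", script), transliterate("raama", script))
```

The same ASCII input yields the corresponding word in each script, which is the property that lets a reader of one script follow text typed for another. A real scheme would need a full letter inventory and script-specific exceptions that this sketch leaves out.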