              International Journal of Application or Innovation in Engineering & Management (IJAIEM) 
                                      Web Site: www.ijaiem.org Email: editor@ijaiem.org  
             Volume 3, Issue 8, August 2014                                                        ISSN 2319 - 4847 
                   Transliteration Generator for Devnagari using Mangal Font

                        Arti P. Khadtare¹, Dr. Suhas Raut², M. S. Otari³

                 ¹, ² & ³ N.K. Orchid College of Engineering and Technology, Solapur
                  
              
                                                             ABSTRACT 
             Transliteration generation is a part of language processing. A transliteration generator is a program that generates
             the transliteration of an input word given in Marathi (Devnagari). In current techniques, the transliteration generated
             for an input word depends on the font used for typing the text. In the proposed system, we provide a transliteration
             generator for the Mangal font. The transliteration of Marathi (Devnagari) words is used for further text processing
             such as translation (i.e. translating text from one language into another), comprehension, and other NLP (Natural
             Language Processing) applications.
             Keywords: Transliteration, NLP, Devnagari fonts. 
             1. INTRODUCTION 
             Nowadays the use of regional languages in computing has increased. In Maharashtra, Marathi is the language used by
             many people. To write in Marathi (Devnagari), a user has to install a Devnagari font. The fonts available for typing
             in Devnagari include Aakar, Aakriti, Akshar, Shivaji, Mangal, Kruti, Ganga, Gauri, and Kiran. A user can use the local
             language setting and the on-screen keyboard feature of the operating system to write in Marathi (Devnagari). However,
             the key mapping of a standard keyboard for Marathi differs from font to font, and the user may not always have this
             keyboard mapping readily available. It is also difficult for a Marathi user to spell Marathi words in English letters
             without sufficient knowledge of English. For fonts like Shivaji, typing a Marathi word requires typing it in equivalent
             English letters, i.e. its transliteration. For example, to type the letter ‘ज’, the user has to type two letters, ‘j’
             and ‘a’. The user must therefore know exactly which English letter represents which Marathi letter, i.e. the keyboard
             mapping for the Shivaji font; without it, typing is troublesome. With the Mangal font, in contrast, the user can type
             Marathi words directly, because the keyboard mapping required for Mangal is available on the on-screen keyboard: once
             the local language is set to Marathi, the user types in Devnagari and does not need to know the transliterated
             spelling. However, the internal representation of text in such a font is also in Marathi (Devnagari), unlike Shivaji,
             where it is in the form of a transliteration. So that a user can enter words in Marathi without worrying about keyboard
             mapping and still obtain a transliteration, we have designed a transliteration generator. It converts a word entered
             in Marathi (Devnagari) to its equivalent transliteration.
             2. RELATED WORK 
             From an information-theoretical point of view, systematic transliteration is a mapping from one system of writing into
             another, word by word, or ideally letter by letter. Most transliteration systems are one-to-one, so a reader who knows
             the system can reconstruct the original spelling. Transliteration of single words is often an informal, non-systematic
             process; many variants of the same word are often used [8]. The International Alphabet of Sanskrit Transliteration
             (IAST) is a subset of the ISO 15919 standard, used for the transliteration of Sanskrit and Pali into Roman script with
             diacritics. IAST is a widely used standard. The Hunterian system is the "national system of romanization in India" and
             the one officially adopted by the Government of India. Compared to IAST, Harvard-Kyoto looks much simpler. It does not
             contain any of the diacritic marks that IAST contains; instead of diacritics, Harvard-Kyoto uses capital letters. The
             use of capital letters makes typing in Harvard-Kyoto much easier than in IAST but produces words with capital letters
             inside them. ITRANS is an extension of Harvard-Kyoto [9]. Many software packages convert a word typed as a
             transliteration into Marathi text, for example Baraha Keyboard, where a user types the transliteration of a Marathi
             word using English letters and the output is Marathi text. In Shivaji font, for example, to write “पूजा” the user has
             to enter “pUjaa”. Here, to represent “ज”, the user has to enter two letters, ‘j’ and ‘a’, to complete the letter “ज”,
             as it is a consonant. This can be a difficult task both for a common user who knows Marathi but is not proficient in
             English and for a user who knows English but is not proficient in Marathi. Moreover, none of these tools provides the
             reverse process, i.e. entering Marathi (Devnagari) words that are converted to their respective English letters.
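The relationship between IAST and Harvard-Kyoto described above (diacritic letters replaced by capital letters) can be illustrated with a simple character substitution. The following is a minimal sketch, not part of the original work: the function name `iast_to_harvard_kyoto` is hypothetical, and the mapping is limited to the diacritic letters that appear in Figures 2.a and 2.b.

```python
import unicodedata

# IAST diacritic letters and their Harvard-Kyoto equivalents, restricted to
# the letters listed in Figures 2.a and 2.b (not a complete scheme).
IAST_TO_HK = {
    "\u0101": "A",  # ā
    "\u012B": "I",  # ī
    "\u016B": "U",  # ū
    "\u1E43": "M",  # ṃ
    "\u1E25": "H",  # ḥ
    "\u1E45": "G",  # ṅ
    "\u1E6D": "T",  # ṭ
    "\u1E0D": "D",  # ḍ
    "\u1E47": "N",  # ṇ
    "\u015B": "z",  # ś
    "\u1E63": "S",  # ṣ
}


def iast_to_harvard_kyoto(text: str) -> str:
    """Replace each IAST diacritic letter with its Harvard-Kyoto letter."""
    # Normalise to NFC so a base letter followed by a combining mark
    # compares equal to the precomposed letter used as a key above.
    text = unicodedata.normalize("NFC", text)
    # Letters without an entry (plain ASCII) are copied unchanged.
    return "".join(IAST_TO_HK.get(ch, ch) for ch in text)


print(iast_to_harvard_kyoto("p\u016Bj\u0101"))  # IAST "pūjā"
```

Because the substitution is one-to-one for these letters, the result contains no diacritics but does contain capital letters inside words, which is the trade-off noted above.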
              3. TRANSLITERATION GENERATOR FOR MANGAL FONT 
              3.1 WORKING OF PROPOSED SYSTEM  
              In the proposed system, each letter of the entered Marathi word is replaced with its equivalent English letters. For
              this we create a list in which each entry holds a Marathi letter and the English letters that represent it. As soon as
              the user enters a Marathi word from the keyboard, the program looks up each letter in the list and replaces it in the
              string with its English equivalent. To perform exact matches, we compare the Unicode representation of the letters.
              Figure 1 shows the general processing of the proposed system. We use a representation similar to Harvard-Kyoto, i.e.
              we use capital letters to distinguish letters, e.g. ‘ढ’ - Dh, ‘ड’ - D. However, we eliminate the ‘a’ at the end of a
              consonant to keep the transliteration simple, which helps further processing of the text in different applications
              [1] - [7]. We use the Shivaji font as a comparison to show how the proposed system for the Mangal font reduces the
              overhead on the user. Figures 2.a and 2.b show the different transliteration schemes along with the transliteration
              generated by the proposed system.
               
                                            Figure 1: Processing of proposed transliteration generator   
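The lookup-and-replace step of Section 3.1 can be sketched as a table keyed by Unicode code point. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `LOOKUP` and `transliterate` are hypothetical, the table is only a short excerpt of the "Proposed System" columns of Figures 2.a and 2.b, and passing unknown characters through unchanged is a design choice of this sketch.

```python
# Hypothetical excerpt of the lookup list: Devanagari code point -> English
# letters, taken from the "Proposed System" columns of Figures 2.a and 2.b.
LOOKUP = {
    0x0905: "a",  # अ
    0x0907: "i",  # इ
    0x0909: "u",  # उ
    0x0917: "g",  # ग
    0x091C: "j",  # ज
    0x0928: "n",  # न
    0x092E: "m",  # म
    0x0930: "r",  # र
}


def transliterate(word: str) -> str:
    """Replace every character that has an entry in LOOKUP."""
    out = []
    for ch in word:
        # ord() gives the Unicode code point, so the match is exact and
        # does not depend on how the font draws the character.
        out.append(LOOKUP.get(ord(ch), ch))  # assumption: unknowns pass through
    return "".join(out)


print(transliterate("\u0928\u0917\u0930"))  # नगर
```

Note that the consonants come out without a trailing ‘a’, as described for the proposed system.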
                  
                                           Devanagari   IAST   Harvard-Kyoto   ITRANS     Proposed System
                                           अ            a      a               a          a
                                           आ            ā      A               A/aa       aa/A
                                           इ            i      i               i          i
                                           ई            ī      I               I/ii       ii/I
                                           उ            u      u               u          u
                                           ऊ            ū      U               U/uu       uu/U
                                           ए            e      e               e          e
                                           ऐ            ai     ai              ai         ai
                                           ओ            o      o               o          o
                                           औ            au     au              au         au
                                           अं           ṃ      M               M/.n/.m    M
                                           अः           ḥ      H               H          H
                                                   Figure 2.a: Transliteration of Vowels [2]
                 
                                           Devanagari   IAST   Harvard-Kyoto   ITRANS     Proposed System
                                           क            ka     ka              ka         K
                                           ख            kha    kha             kha        kh
                                           ग            ga     ga              ga         g
                                           घ            gha    gha             gha        gh
                                           ङ            ṅa     Ga              ~Na        Da
                                           च            ca     ca              cha        c
                                           छ            cha    cha             Cha        Ch
                                           ज            ja     ja              ja         j
                                           झ            jha    jha             jha        jh
                                           ट            ṭa     Ta              Ta         T
                                           ठ            ṭha    Tha             Tha        Th
                                           ड            ḍa     Da              Da         D
                                           ढ            ḍha    Dha             Dha        Dh
                                           ण            ṇa     Na              Na         N
                                           त            ta     ta              ta         t
                                           थ            tha    tha             tha        th
                                           द            da     da              da         d
                                           ध            dha    dha             dha        dh
                                           न            na     na              na         n
                                           प            pa     pa              pa         p
                                           फ            pha    pha             pha        ph
                                           ब            ba     ba              ba         b
                                           भ            bha    bha             bha        bh
                                           म            ma     ma              ma         m
                                           य            ya     ya              ya         y
                                           र            ra     ra              ra         r
                                           ल            la     la              la         l
                                           व            va     va              va/wa      v
                                           श            śa     za              sha        sh
                                           ष            ṣa     Sa              Sha        Sh
                                           स            sa     sa              sa         s
                                           ह            ha     ha              ha         h

                                                   Figure 2.b: Transliteration of Consonants [2]
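The two figures can be combined into one table that transliterates a whole word. The sketch below is illustrative only and rests on assumptions that are not in the source: the names `CONSONANTS`, `VOWELS`, `SIGNS`, and `transliterate_word` are hypothetical; where Figure 2.a lists two spellings (e.g. aa/A) the first is chosen arbitrarily; and the paper does not show how dependent vowel signs (matras), anusvara, visarga, or the halant are handled, so their treatment here is a guess.

```python
# Proposed-system values copied as printed in Figure 2.b, in the same order
# as the Devanagari consonants in the string below.
CONSONANTS = dict(zip(
    "कखगघङचछजझटठडढणतथदधनपफबभमयरलवशषसह",
    ["K", "kh", "g", "gh", "Da", "c", "Ch", "j", "jh", "T", "Th", "D", "Dh",
     "N", "t", "th", "d", "dh", "n", "p", "ph", "b", "bh", "m", "y", "r",
     "l", "v", "sh", "Sh", "s", "h"],
))

# Independent vowels from Figure 2.a (first listed spelling only).
VOWELS = {
    "अ": "a", "आ": "aa", "इ": "i", "ई": "ii", "उ": "u",
    "ऊ": "uu", "ए": "e", "ऐ": "ai", "ओ": "o", "औ": "au",
}

# ASSUMPTION (not in the source): dependent signs reuse the vowel values,
# anusvara/visarga map to M/H on their own, and the halant emits nothing.
SIGNS = {
    "\u093E": "aa", "\u093F": "i", "\u0940": "ii", "\u0941": "u",
    "\u0942": "uu", "\u0947": "e", "\u0948": "ai", "\u094B": "o",
    "\u094C": "au", "\u0902": "M", "\u0903": "H", "\u094D": "",
}

TABLE = {**CONSONANTS, **VOWELS, **SIGNS}


def transliterate_word(word: str) -> str:
    """Look up each character; characters without an entry pass through."""
    return "".join(TABLE.get(ch, ch) for ch in word)


print(transliterate_word("\u092A\u0942\u091C\u093E"))  # पूजा
```

Since consonants carry no trailing ‘a’, a word written only with consonants is reduced to its consonant letters.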
                                                                                                                           3.2 WORKING WITH SHIVAJI FONT 
In the Shivaji font, the user must type five letters, ‘javaL’, to spell the word ‘जवळ’: because ‘j’ and ‘v’ are consonants,
the vowel ‘a’ has to be added to complete each of them, but this is not the case for the letter ‘L’. Although the user types
English letters, the text displayed on screen is in Devnagari. The user must therefore know how to spell Marathi
(Devnagari) words using the English letters on the keyboard. The problem is that this transliteration may vary with each
user's understanding. A user may try to type ‘Ka’ or ‘kha’ to represent ‘ख’, but here the single capital letter ‘K’ is
enough to represent ‘ख’.
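
The keystroke behaviour described above can be illustrated with a minimal sketch. It assumes a hypothetical four-entry key table built only from the keys named in this section (‘ja’, ‘va’, ‘L’, ‘K’); the actual Shivaji layout is much larger and is not reproduced here. The names SHIVAJI_KEYS and render_keystrokes are illustrative.

```python
# Hypothetical fragment of a Shivaji-style key table, limited to the keys
# named in the text; the real font layout has many more entries.
SHIVAJI_KEYS = {
    "ja": "ज",  # consonant key 'j' completed by the vowel key 'a'
    "va": "व",  # consonant key 'v' completed by the vowel key 'a'
    "L": "ळ",   # a single key is enough
    "K": "ख",   # a single capital key is enough
}


def render_keystrokes(keys: str) -> str:
    """Return the Devnagari text displayed for a run of English keystrokes."""
    out = []
    i = 0
    while i < len(keys):
        # Try the longer key sequence first (two keys, then one).
        for size in (2, 1):
            chunk = keys[i:i + size]
            if chunk in SHIVAJI_KEYS:
                out.append(SHIVAJI_KEYS[chunk])
                i += size
                break
        else:
            # No entry matches: the user guessed a spelling the table lacks.
            raise ValueError(f"no mapping for keystroke {keys[i]!r}")
    return "".join(out)
```

With this table, ‘javaL’ renders as ‘जवळ’, while a guessed spelling such as ‘kha’ is rejected. This is the user-dependent variation that the section identifies as the problem.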
                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                          Figure 3. keyboard layout for Shivaji font [10]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                                           3.3  WORKING WITH MANGAL FONT 
For the Mangal font, as shown in figure 3.2, the user has to type only three letters, ज, व, ळ, to type the word ‘जवळ’ on the
Devnagari keyboard layout. The user does not have to think about which English letter represents which Devnagari letter,
because the key for each Devnagari letter can be pressed directly on the keyboard. Here the user types in Devnagari
letters, and the text displayed on screen is also in Devnagari. The proposed system then converts the word to its
equivalent transliteration ‘jvL’ by comparing Unicode values.
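
The conversion by Unicode value comparison can be sketched as a table lookup. This is an illustrative sketch, not the authors' implementation. The table holds only the three letters of the ‘जवळ’ example, and the names TRANSLIT_BY_CODEPOINT and transliterate are assumptions.

```python
# Hypothetical code-point table covering only the letters of the 'jvL'
# example; a full table would list every Devnagari letter, matra and sign.
TRANSLIT_BY_CODEPOINT = {
    0x091C: "j",  # ज
    0x0935: "v",  # व
    0x0933: "L",  # ळ
}


def transliterate(word: str) -> str:
    """Map each Devnagari character to its transliteration via its code point."""
    parts = []
    for ch in word:
        code = ord(ch)  # Unicode value of the typed character
        if code not in TRANSLIT_BY_CODEPOINT:
            raise KeyError(f"no transliteration for U+{code:04X}")
        parts.append(TRANSLIT_BY_CODEPOINT[code])
    return "".join(parts)
```

Calling transliterate("जवळ") yields ‘jvL’, matching the example in the text. Because the scheme lives entirely in the table, a different transliteration convention would only require a different set of entries.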
                
                
                                           Figure 4. on-screen keyboard layout for Devnagari Mangal font                    
               4. CONCLUSION 
In this paper we have studied Devnagari transliteration, which some fonts use for typing in Devnagari. We have
proposed a transliteration generator for the Devnagari font Mangal, in which the transliteration is generated for a word
entered in Devnagari. This will help to reduce user overhead. Similarly, transliteration can be generated for other
Devnagari fonts by altering the key mapping.
               References 
[1]  A Paradigm-Based Finite State Morphological Analyzer for Marathi. Mugdha Bapat, Harshada Gune, Pushpak
     Bhattacharyya, Proceedings of the 1st Workshop on South and Southeast Asian Natural Language Processing
     (WSSANLP), pages 26–34, the 23rd International Conference on Computational Linguistics (COLING), Beijing,
     August 2010.
[2]  Morphological Analyzer for Marathi using NLP. Pratiksha Gawade, Deepika Madhavi, Jayshree Gaikwad, Sharvari
     Jadhav, Rahul Ambekar, International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622,
     Vol. 3, Issue 2, March-April 2013.
[3]  Natural Language Processing: A Paninian Perspective. Bharati Akshar, Vineet Chaitanya, Rajeev Sanghal,
     Department of Computer Science and Engineering, Indian Institute of Technology Kanpur, 1995. Publication
     Prentice-Hall of India, New Delhi.
[4]  Plural Problems in the Nominal Morphology of Marathi. Shalmalee Pitale, Vaijayanthi Sarma, 25th Pacific Asia
     Conference on Language, Information and Computation, pages 178–185, December 2011.
[5]  Processing of Kridanta (Participle) in Marathi. Ganesh Bhosale, Subodh Kembhavi, Archana Amberkar, Supriya
     Mhatre, Lata Popale, Pushpak Bhattacharyya, Proceedings of ICON-2011: 9th International Conference on Natural
     Language Processing, Macmillan Publishers, India.
[6]  Two-level Morphology: a general computational model for word-form recognition and production. Koskenniemi
     Kimmo, University of Helsinki, Department of General Linguistic, Hallituskatu 11-13, SF-00100, HELSINKI 10,
     FINLAND, PUBLICATIONS, No. 11, 1983.
[7]  Verbs are where all the action lies: Experiences of Shallow Parsing of a Morphologically Rich Language.
     Harshada Gune, Mugdha Bapat, Mitesh M. Khapra, and Pushpak Bhattacharyya, Proceedings of COLING '10,
     Proceedings of the 23rd International Conference on Computational Linguistics, pages 347-355, Association for
     Computational Linguistics, Stroudsburg, PA, USA, 2010.
[8]  Transliteration – http://en.wikipedia.org/wiki/Transliteration
[9]  Devnagari Transliteration – http://en.wikipedia.org/wiki/Devanagari_transliteration
[10] http://www.mmd.gov.in/shusha_files/shushakeyboard.gif
                
                
                
                
                