Wrong unicode character display for Sinhala Language



  • I am using Notepad++ for my research, collecting a large set of Sinhala sentences to build a Sinhala language model. I want to type the Sinhala characters ඍ (Unicode 0D8D) and ඎ (Unicode 0D8E) in the editor and copy and paste sentences containing them. The editor does not render these characters properly and displays them as පෘ and පෲ. When I copy them from Notepad++ into Windows Notepad, they display correctly. Any help in solving this rendering issue would be much appreciated.



  • Hello Malinda Punchimudiyanse,

    From the official web site of the Unicode Consortium, below :

    http://www.unicode.org/charts/

    We can get a PDF list of all the Sinhala characters, in the range [\x{0D80}-\x{0DFF}], from the link, below :

    http://www.unicode.org/charts/PDF/U0D80.pdf

    It’s easy to verify that :

    • The Unicode value 0D8D corresponds to the SINHALA LETTER IRUYANNA ඍ ( = sinhala letter vocalic r )

    • The Unicode value 0D8E corresponds to the SINHALA LETTER IRUUYANNA ඎ ( = sinhala letter vocalic rr )

    which are, both, independent vowels

    And, you’ll also notice that :

    • The Unicode value 0DB4 corresponds to the SINHALA LETTER ALPAPRAANA PAYANNA ප ( = sinhala letter pa )

    which is a consonant

    Finally, you’ll remark that :

    • The Unicode value 0DD8 corresponds to the SINHALA VOWEL SIGN GAETTA-PILLA ෘ ( = sinhala vowel sign vocalic r )

    • The Unicode value 0DF2 corresponds to the SINHALA VOWEL SIGN DIGA GAETTA-PILLA ෲ ( = sinhala vowel sign vocalic rr )

    which are of a different type from the characters above, as they are, both, dependent vowel signs
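
    The names and types listed above can be cross-checked outside the editor. Here is a minimal sketch, assuming a Python 3 interpreter is available; the helper name describe is only illustrative, and the names come from Python's bundled Unicode database, not from Notepad++ :

```python
import unicodedata

# Code points discussed above: two independent vowels, the consonant pa,
# and the two dependent vowel signs.
CODE_POINTS = [0x0D8D, 0x0D8E, 0x0DB4, 0x0DD8, 0x0DF2]


def describe(cp):
    """Return (U+XXXX label, Unicode name, general category) for a code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))


for cp in CODE_POINTS:
    print(*describe(cp))
```

    Letters should report a general category starting with L, and the dependent vowel signs one starting with M ( combining mark ).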


    Unfortunately, as I’m French and am not at all used to Asian languages, I cannot deduce anything valuable from these facts :-(( I hope that these tiny hints will make sense to you !!

    Note that you can search for any character of your alphabet with the syntax \x{0Dhh} ( where the lowercase h stands for a hexadecimal digit )
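
    The same kind of range search can be done on text pasted out of the editor, to see which code points are really stored. The sketch below is only an illustration in Python 3, with a hypothetical helper name sinhala_code_points; it is not a Notepad++ feature. It tells the single letter ඍ ( 0D8D ) apart from the two-character sequence ප + ෘ ( 0DB4 followed by 0DD8 ) :

```python
import re

# Character class covering the Sinhala block, U+0D80 to U+0DFF.
SINHALA = re.compile(r"[\u0D80-\u0DFF]")


def sinhala_code_points(text):
    """List the code points of every Sinhala-block character found in text."""
    return [f"U+{ord(ch):04X}" for ch in SINHALA.findall(text)]


# A single independent vowel versus a consonant followed by a vowel sign.
print(sinhala_code_points("\u0D8D"))
print(sinhala_code_points("\u0DB4\u0DD8"))
```

    If the text copied from the editor yields only 0D8D, the stored data is correct and the problem lies in the rendering ( font or shaping ), not in the file contents.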

    BTW, which encoding do you use ? Look at the right part of the bottom status bar

    Also, what is your current font name ?

    • Open the menu option Settings - Style Configurator…
    • Select the Global Styles language
    • Select the Default Style style
    • See the current font name, on the right, in the Font style zone

    Best Regards,

    guy038

    P.S. :

    Beware ! The glyphs of the two additional dependent vowel signs, (\x{0DF2} and \x{0DF3} ) in the PDF file, are reversed :-((

