
    Wrong unicode character display for Sinhala Language

    • Malinda Punchimudiyanse

      I am using Notepad++ for my research, collecting a large set of Sinhala sentences to build a language model of Sinhala. I need to type the Sinhala characters ඍ (Unicode 0D8D) and ඎ (Unicode 0D8E) in the editor, and to copy and paste sentences containing those characters. The editor does not render them properly and displays them as පෘ and පෲ. When I copy them from the Notepad++ editor and paste them into Windows Notepad, the characters are displayed correctly. Any help in solving this rendering issue would be much appreciated.

      • guy038

        Hello Malinda Punchimudiyanse,

        The official web site of the Unicode Consortium hosts the code charts at:

        http://www.unicode.org/charts/

        From there, we can get a PDF list of all the Sinhala characters, in the range [\x{0D80}-\x{0DFF}], at the link below:

        http://www.unicode.org/charts/PDF/U0D80.pdf

        It’s easy to verify that:

        • The Unicode value 0D8D corresponds to the SINHALA LETTER IRUYANNA ඍ ( = sinhala letter vocalic r )

        • The Unicode value 0D8E corresponds to the SINHALA LETTER IRUUYANNA ඎ ( = sinhala letter vocalic rr )

        both of which are independent vowels.

        And, you’ll also notice that :

        • The Unicode value 0DB4 corresponds to the SINHALA LETTER ALPAPRAANA PAYANNA ප ( = sinhala letter pa )

        which is a consonant

        Finally, you’ll notice that:

        • The Unicode value 0DD8 corresponds to the SINHALA VOWEL SIGN GAETTA-PILLA ෘ ( = sinhala vowel sign vocalic r )

        • The Unicode value 0DF2 corresponds to the SINHALA VOWEL SIGN DIGA GAETTA-PILLA ෲ ( = sinhala vowel sign vocalic rr )

        which are of a different type from the characters above, as both are dependent vowel signs.
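The facts above can also be checked outside the editor. Here is a minimal sketch in Python, using only the standard `unicodedata` module; the helper names `describe` and `code_points` are made up for this illustration. It prints the name and general category of each code point listed above, then dumps the code points of the two-character sequence that Notepad++ seems to be displaying. Letters should report a category starting with `L` and dependent signs a category starting with `M`, although the exact values depend on the Unicode database shipped with your Python version.

```python
import unicodedata

# Code points discussed above, all from the Sinhala block U+0D80..U+0DFF.
CODE_POINTS = [0x0D8D, 0x0D8E, 0x0DB4, 0x0DD8, 0x0DF2]


def describe(cp):
    """Return (U+XXXX label, Unicode name, general category) for a code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))


def code_points(text):
    """List the code points of a string, to see what it really contains."""
    return [f"U+{ord(ch):04X}" for ch in text]


rows = [describe(cp) for cp in CODE_POINTS]
for label, name, category in rows:
    print(label, category, name)

# The consonant PA followed by the dependent vowel sign GAETTA-PILLA:
# two code points, not the single independent vowel U+0D8D.
print(code_points("\u0DB4\u0DD8"))
```

Running `code_points` on text copied from the editor would show whether the file really contains the single code point 0D8D, or the two-character sequence 0DB4 0DD8.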


        Unfortunately, as I’m French and absolutely not used to Asian languages, I cannot deduce anything valuable from these facts :-(( I hope that these tiny hints will make sense to you!

        Note that you can search for any character of your alphabet with the syntax \x{0Dhh} (where each lowercase h stands for a hexadecimal digit).
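The same kind of search can be sketched outside Notepad++. This is only an illustration in Python, whose `re` module writes a code point as `\u0D8D` rather than `\x{0D8D}`; the function name `find_sinhala` and the sample string are invented for the example. It reports the position and code point of every character from the Sinhala range [\x{0D80}-\x{0DFF}] found in a string.

```python
import re

# Character class covering the whole Sinhala block, U+0D80..U+0DFF.
SINHALA = re.compile("[\u0D80-\u0DFF]")


def find_sinhala(text):
    """Return (index, U+XXXX label) for each Sinhala character in text."""
    return [(m.start(), f"U+{ord(m.group()):04X}") for m in SINHALA.finditer(text)]


# Hypothetical sample: one independent vowel, then consonant + vowel sign.
sample = "abc \u0D8D x \u0DB4\u0DD8"
hits = find_sinhala(sample)
for index, label in hits:
    print(index, label)
```

Writing the sample with escapes rather than the glyphs themselves keeps the example independent of how any editor or font renders them.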

        BTW, which encoding do you use? Look at the right-hand side of the status bar, at the bottom of the window.

        Also, what is your current font name?

        • Open the menu option Settings - Style Configurator…
        • Select the Global Styles language
        • Select the Default Style style
        • See the current font name, on the right, in the Font style zone

        Best Regards,

        guy038

        P.S. :

        Beware! The glyphs of the two additional dependent vowel signs (\x{0DF2} and \x{0DF3}) are reversed in the PDF file :-((
