
    Wrong unicode character display for Sinhala Language

    • Malinda Punchimudiyanse

      I am using Notepad++ for my research, collecting a large set of Sinhala sentences to build a Sinhala language model. I want to type the Sinhala characters ඍ (Unicode 0D8D) and ඎ (Unicode 0D8E) in the editor, and to copy and paste sentences containing them. The editor does not render these characters properly and displays them as පෘ and පෲ. When I copy them from Notepad++ and paste them into Windows Notepad, they are displayed correctly there. Any help in solving this rendering issue would be much appreciated.

      • guy038

        Hello Malinda Punchimudiyanse,

        From the official web site of the Unicode Consortium, below :

        http://www.unicode.org/charts/

        We can get a PDF chart of all the Sinhala characters, in the range [\x{0D80}-\x{0DFF}], from the link below :

        http://www.unicode.org/charts/PDF/U0D80.pdf

        It’s easy to verify that :

        • The Unicode value 0D8D corresponds to the SINHALA LETTER IRUYANNA ඍ ( = sinhala letter vocalic r )

        • The Unicode value 0D8E corresponds to the SINHALA LETTER IRUUYANNA ඎ ( = sinhala letter vocalic rr )

        which are, both, independent vowels

        And, you’ll also notice that :

        • The Unicode value 0DB4 corresponds to the SINHALA LETTER ALPAPRAANA PAYANNA ප ( = sinhala letter pa )

        which is a consonant

        Finally, you’ll remark that :

        • The Unicode value 0DD8 corresponds to the SINHALA VOWEL SIGN GAETTA-PILLA ෘ ( = sinhala vowel sign vocalic r )

        • The Unicode value 0DF2 corresponds to the SINHALA VOWEL SIGN DIGA GAETTA-PILLA ෲ ( = sinhala vowel sign vocalic rr )

        which are of a different type from the characters above, as they are both dependent vowel signs
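One way to check whether a pasted sentence really contains the independent vowel (one code point) or a consonant followed by a dependent vowel sign (two code points) is to list its code points outside the editor. The snippet below is only a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` and the sample strings are my own assumptions, not anything taken from Notepad++.

```python
import unicodedata


def describe(text):
    """Return (code point, general category, Unicode name) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Hypothetical samples: the independent vowel, and the pa + vowel sign sequence
independent = "\u0D8D"
composed = "\u0DB4\u0DD8"

for label, sample in (("independent vowel", independent),
                      ("consonant + vowel sign", composed)):
    print(label)
    for row in describe(sample):
        print("   ", *row)
```

If the text copied from the editor lists a single U+0D8D, the stored data is fine and only the display is affected, which would be consistent with Windows Notepad showing it correctly.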


        Unfortunately, as I’m French and not at all used to Asian languages, I cannot deduce anything valuable from these facts :-(( I hope that these small hints will make sense to you !!

        Note that you can search for any character of your alphabet, in Regular expression search mode, with the syntax \x{0Dhh} ( where each lowercase h stands for a hexadecimal digit )
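The same kind of search can be tried outside the editor. This is a rough sketch with Python's `re` module, where the `\uXXXX` escape plays the role of `\x{XXXX}`; the names `SINHALA` and `find_sinhala` are hypothetical, chosen only for this illustration.

```python
import re

# Character class covering the Sinhala block, U+0D80 to U+0DFF
SINHALA = re.compile(r"[\u0D80-\u0DFF]")


def find_sinhala(text):
    """Return (offset, code point) for every Sinhala character found in text."""
    return [(m.start(), f"U+{ord(m.group()):04X}") for m in SINHALA.finditer(text)]


# Hypothetical sample line mixing Latin text with the two characters in question
sample = "abc \u0D8D def \u0D8E"
for offset, code_point in find_sinhala(sample):
    print(offset, code_point)
```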

        BTW, which encoding do you use ? It is shown in the right part of the status bar, at the bottom of the window

        Also, what is your current font name ?

        • Open the menu option Settings - Style Configurator…
        • Select the Global Styles language
        • Select the Default Style style
        • See the current font name, on the right, in the Font style zone

        Best Regards,

        guy038

        P.S. :

        Beware ! The glyphs of the two additional dependent vowel signs, (\x{0DF2} and \x{0DF3} ) in the PDF file, are reversed :-((
