    Extended ASCII ALT+xxx char Display issue

    • David Tcheki

      Hello,

      I am running into some issues when using extended ASCII codes (127-255) in a UTF-8 text file.
      Some characters are displayed as a “void square”.

      1st case :

      • With NotePAD++ Installer v7.x : KO
      • After updating to last v8 (v8.1.3) : KO
      • After removing all extensions : KO

      2nd case :

      • By using NotePAD++ Portable v8.1.4 : OK

      3rd case :

      • By using Microsoft NotePAD : OK

      Rem. : As you can see, the ASCII Code panel displays ALT+0xxx char values (BAD Extended ASCII Chars) and not ALT+000 char values.

      P.S. : I don’t know how to attach pictures to illustrate the issue.

      • David Tcheki @David Tcheki

        @David-Tcheki Update:
        The issue concerns the following 17 characters:

        • 176-178 (x3)
        • 185-188 (x4)
        • 200-206 (x7)
        • 219-220 (x2)
        • 223 (x1)
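
For reference, here is a minimal sketch of what those 17 positions decode to. It assumes the ALT+xxx numbers are positions in code page 850 (the Western European OEM code page); CP437, the US OEM code page, has the same characters at these particular positions. Only Python's built-in codecs are used, and `RANGES` is simply the list above written out.

```python
# Inclusive ranges of the affected ALT+xxx codes, copied from the list above.
RANGES = [(176, 178), (185, 188), (200, 206), (219, 220), (223, 223)]

# Assumption: the codes are positions in OEM code page 850.
codes = [c for lo, hi in RANGES for c in range(lo, hi + 1)]
chars = bytes(codes).decode("cp850")

for code, ch in zip(codes, chars):
    print(f"ALT+{code}: {ch}  U+{ord(ch):04X}")
```

All 17 fall in the Unicode Box Drawing and Block Elements blocks (U+2500 to U+259F), which would be consistent with a font that lacks those glyphs.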
        • PeterJones @David Tcheki

          @David-Tcheki said in Extended ASCII ALT+xxx char Display issue:

          First, responding to some side points:

          1st case :

          • With NotePAD++ Installer v7.x : KO
          • After updating to last v8 (v8.1.3) : KO
          • After removing all extensions : KO

          By “KO” I assume you mean it doesn’t work, for some definition of “work”.

          • By using NotePAD++ Portable v8.1.4 : OK

          By “OK” I assume you mean it does work.

          3rd case :

          • By using Microsoft NotePAD : OK

          FYI: neither product capitalizes the PAD – Microsoft Notepad and Don Ho’s Notepad++.

          Rem. : As you can see,

          Sorry, I cannot see that from your post.

          P.S. : I don’t know how to attach pictures to illustrate the issue.

          Copy the image (Windows standard feature: old-fashioned Alt+PrintScreen, or the new Snip & Sketch tool), then paste it in your post here.

          —
          Now back to the meat of your question.

          I am running into some issues when using extended ASCII codes (127-255) in a UTF-8 text file.
          Some characters are displayed as a “void square”.

          Could you give a more-specific example? Screenshots would have helped. Or copy/paste the actual text you are trying to display, so we know which characters you mean.

          Do you mean you’re really getting the unknown-glyph symbol 𖡄 or 𖡄 ? (Sometimes that glyph is rendered with a ? inside and sometimes not)
          867a2429-fab3-4da4-8b24-bf89b1a05992-image.png d90c1cf0-c46e-476a-be0a-fdfbc6d1770a-image.png

          … That symbol means your chosen font doesn’t have a glyph for that character. Though I would doubt that any of the codepoints in the 127-255 range would give you that in a modern Windows environment. You can see/change your chosen font in Settings > Style Configurator > Global Styles > Default Style:
          0128e383-d1ac-4479-a08f-61aa89d21482-image.png

          Sometimes toggling the setting of Settings > Preferences > MISC > Use DirectWrite will help Windows/Notepad++ find glyphs for all of your characters, or display those glyphs better. But sometimes it makes it harder for some people to read. You will have to choose whichever toggle on that setting works best for you.

          But given all the mentions of ALT+xxx and the specific characters you mentioned, I wonder if you meant the old box-drawing shaded boxes like ░ ▒ ▓ 9561ede9-1b65-4630-a7c2-e75180a09e16-image.png
          I am going to go with that for the rest of the post.

          the ASCII Code panel displays ALT+0xxx char values (BAD Extended ASCII Chars) and not ALT+000 char values.

          903e4094-3cb9-42d6-8828-6273e023dfcc-image.png
          – Microsoft’s documentation

          When Microsoft documentation tells you to use ALT+0xxx to get a particular character, it is the right way of doing it, not the “BAD” way.

          The OLD, 1980s-technology way of entering characters from the OEM-US character set is to use ALT+xxx, where xxx is the character’s decimal position (up to 255) in that 256-character set. The correct way in modern Windows is to prefix the number with a 0 (ALT+0xxx), which selects the character from the Windows code page instead.

          Here’s an external resource that shows the ways of typing the degree symbol:
          6214e7ce-f452-49a3-8be8-798d55305b53-image.png

          And Wikipedia’s Alt Codes entry concurs.

          Note that in Notepad++, if you use Edit > Character Panel to show the ASCII codes insertion panel, you can see that 176 is the degree symbol when the encoding is UTF-8:
          76068162-637b-4b8c-ad49-39f9ea9b46bf-image.png

          OTOH, if you create a file, and use Encoding > Character Sets > Western European > OEM-US to get the old so-called “extended ASCII” or “box drawing” character set, then codepoint 176 is a box-drawing character ░.
          b9f663ee-c061-4952-84c4-e1423c109476-image.png

          But it doesn’t matter which encoding you are in in Notepad++, if you type ALT+176 it will do the box-drawing character; if you do ALT+0176 it will do the degree symbol. The same is true in MS Notepad as well:
          e5f24d12-b5f1-42f6-86ad-0e21134e6256-image.png
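
The difference between the two ALT forms can be sketched with Python's built-in codecs. This assumes the OEM code page is CP437 (OEM-US) and the Windows code page is Windows-1252; machines in other locales use other pairs.

```python
# The same number, 176, looked up in two different code pages.
code = 176

# ALT+176: position 176 in the OEM code page (assumed CP437 here).
oem_char = bytes([code]).decode("cp437")

# ALT+0176: position 176 in the Windows code page (assumed Windows-1252).
win_char = bytes([code]).decode("cp1252")

print(f"ALT+{code}  -> {oem_char} (U+{ord(oem_char):04X})")
print(f"ALT+0{code} -> {win_char} (U+{ord(win_char):04X})")
```

The number is the same in both cases; only the table it is looked up in changes.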

            • David Tcheki @PeterJones

            @PeterJones
            First a big Thanks for your complete answer.

            I don’t know why, but I only have 3 minutes in which to edit my post, so I can’t make any updates now (for example, using “Notepad” instead of “NotePAD”).

            UPDATE :

            • OK means “No Issue”
            • KO means “Issue appears”

            About the screenshots: I had everything I needed ready.
            But I expected that when I selected the “image” icon, I could choose a file from my computer and upload it to the message.
            Instead, it asked for a link, so I didn’t know what to do.
            (Habits…)

            Finally :
            The issue comes from the font used :

            1. It seems there is no Default Style defined :
              79d2b33b-e01f-44a5-8837-cfb999501655-image.png

            2. I can’t see DejaVu Sans Mono font in the list, but as far as I remember Consolas is also a monospaced font.
              930e26d7-fd48-4216-a4bb-2303f2a3c582-image.png

            3. In the Portable version of Notepad++ (v8.1.4), there is indeed a Default Style defined.
              So I don’t know what happened with the Installer version (even after the update, there still seems to be no Default Style defined).
              0050c95b-a596-45be-929d-0ada136d38df-image.png

            • David Tcheki @PeterJones

              @PeterJones

              UPDATE

              Please find below the initial screenshots I wanted to share:

              1. With Notepad++ Installer v7.x and also after updating to the latest v8 (v8.1.3) -> KO
                1dcc9fc9-2b3e-4886-a986-48fc3b15e9c3-image.png

              2. By using Notepad++ Portable v8.1.4 -> OK
                5a904c4f-32e0-47e7-b72b-528d96bd76ff-image.png

              3. By using Microsoft Notepad -> OK
                037b8111-1f1c-44e4-9caf-2710955fd819-image.png

              • David Tcheki @David Tcheki

                @David-Tcheki

                SOLUTION :
                It seems the issue comes from the Obsidian theme which has no Default Style Font defined.

                  • David Tcheki @PeterJones

                  @PeterJones

                  About Notepad++ char setting - OEM-850 vs UTF-8

                  In all cases, if I want to display an extended ASCII char, I have to type the ALT+xxx sequence (not ALT+0xxx), in either OEM-850 or UTF-8 encoding.

                  e9a22c41-7b0c-4d23-8339-0967fe40d4dd-image.png

                  a8824240-01d2-4542-a628-319436ab5059-image.png

                  It is still not clear enough to me:

                  • What is the binary “char” code of an extended ASCII char (128-255) in UTF-8 format? (I mean, do extended ASCII chars have their own codes in Unicode?)
                  • Where are extended ASCII chars encoded in the Windows code page (ALT+0xxx “new” format)?
                  • PeterJones @David Tcheki

                    @David-Tcheki said,

                    I can’t see DejaVu Sans Mono font in the list

                    No, I had to install that myself. I prefer it to the default Courier New; it has a lot more of the technical Unicode glyphs that I use, and I like the look of the font. Consolas is a reasonable choice (it comes default with modern Windows, and has more glyphs than Courier New), though I don’t like its “look” quite as much as DejaVu Sans Mono (personal preference).

                    SOLUTION : It seems the issue comes from the Obsidian theme which has no Default Style Font defined.

                    Oi. I’m surprised. But yes, if there is no font defined, then Windows probably goes through its fallback choices, which can get confusing. The DirectWrite option I mentioned might have made it pick a better font… but it’s better to define a font.

                    Re: “surprised”: Ah, the GitHub “blame” on the Obsidian theme shows that the Default Style was fixed to include a font name in commit 6dacca9, which has been in effect since v7.3.2. So apparently your upgrade path started from v7.3.1 or earlier. (Notepad++ doesn’t overwrite theme files when you update, because people would complain that styles they had customized were lost: for example, if they had tweaked Obsidian to use DejaVu Sans Mono instead of nothing or Courier New.)
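
To check whether a theme file defines a font for the Default Style, something like the following rough sketch could be used. It assumes the theme stores its global styles as `WidgetStyle` elements carrying `name` and `fontName` attributes, which is my understanding of the format; `SAMPLE_THEME` is an illustrative fragment, not a copy of the real Obsidian file, and `default_style_font` is a hypothetical helper. Verify against your own theme file before relying on it.

```python
import xml.etree.ElementTree as ET

# Illustrative theme fragment (an assumption, not copied from a real file).
SAMPLE_THEME = """\
<NotepadPlus>
  <GlobalStyles>
    <WidgetStyle name="Default Style" styleID="32" fgColor="E0E2E4"
                 bgColor="293134" fontName="" fontStyle="0" fontSize="10" />
  </GlobalStyles>
</NotepadPlus>
"""

def default_style_font(theme_xml: str):
    """Return the Default Style font name, or None if missing or empty."""
    root = ET.fromstring(theme_xml)
    for style in root.iter("WidgetStyle"):
        if style.get("name") == "Default Style":
            return style.get("fontName") or None
    return None

font = default_style_font(SAMPLE_THEME)
print(font if font else "No font defined for Default Style")
```

To inspect a real theme, read the file's text and pass it to `default_style_font` in place of the sample.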

                    In all cases, if I want to display an extended ASCII char, I have to type the ALT+xxx sequence (not ALT+0xxx), in either OEM-850 or UTF-8 encoding.

                    Yes, as I said (emphasis added): “it doesn’t matter which encoding you are in in Notepad++, if you type ALT+176 it will do the box-drawing character; if you do ALT+0176 it will do the degree symbol.”

                    What is the binary “char” code of an extended ASCII char (128-255) in UTF-8 format? (I mean, do extended ASCII chars have their own codes in Unicode?)

                    Easy enough to look up: the “extended ASCII” OEM-850 is well documented… for example, Wikipedia’s Code page 850 entry shows the upper 128 characters; the position in CP850 is noted by the rows and columns; the 4 hex-digit Unicode codepoint is listed in each box:
                    3c0a3a85-0171-4a77-8025-f168c644228a-image.png
                    … so, for example, ALT+176 = CP850#176 is U+2591 (so it’s at Unicode codepoint 0x2591 = decimal 9617).
                    And that screenshot, or the Wiki article, can be used to look up any of the others.
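
The same lookup can be done programmatically. This is a small sketch using Python's built-in `cp850` codec as a stand-in for the table above; it also shows the bytes that end up in a UTF-8 file, which is what the “binary code in UTF-8 format” question was asking about.

```python
# ALT+176 with CP850 as the OEM code page: a single byte in CP850.
cp850_byte = bytes([176])
char = cp850_byte.decode("cp850")

codepoint = ord(char)              # Unicode code point as an integer
utf8_bytes = char.encode("utf-8")  # bytes stored in a UTF-8 file

print(f"CP850 byte : 0x{cp850_byte.hex().upper()}")
print(f"Character  : {char}")
print(f"Code point : U+{codepoint:04X} (decimal {codepoint})")
print(f"UTF-8 bytes: {utf8_bytes.hex(' ').upper()}")
```

So the character is one byte (B0) in a CP850 file but three bytes (E2 96 91) in a UTF-8 file; the Unicode code point U+2591 is the same either way.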

                    Where are extended ASCII chars encoded in the Windows code page (ALT+0xxx “new” format)?

                    The original “external resource” I linked has a link, “How to type in Microsoft Windows”, which explains this; even the screenshot I included above showed three ways of typing the degree symbol. That site has a page for each Unicode character and will show the ALT sequences for each.

                    Further, Wikipedia’s Alt Codes entry, which I also linked, says how to type the ALT sequences for any Unicode hex codepoint (that is, the #### from the U+#### notation, or the #### at the bottom of each cell in the table shown above).

                    • PeterJones @PeterJones

                      @David-Tcheki wrote,

                      OEM-850

                      BTW: OEM-850 / CP850 was the default codepage in Western Europe. US computers default to CP437 (OEM-US), so their table of ALT+### is a bit different:

                      a4eebea5-5263-452d-b9b6-9d9424215403-image.png

                      As the Wiki: Alt Codes page points out,

                      The familiar Alt+number combinations produced codes from the OEM code page (for example, CP437 in the United States)[c], matching the results from MS-DOS. But prefixing a leading zero (0) to the number (usually meaning 4 digits) produced the character specified by the newer Windows code page, allowing them to be typed as well.

                      So future readers on a US machine would want to use this table as their map, not the CP850 table shown previously.
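
To see exactly where the two OEM tables disagree, a short sketch with Python's built-in `cp437` and `cp850` codecs can list every position in 128-255 that maps to a different character. It assumes those codecs match the tables shown in the screenshots.

```python
# Compare the upper half (128-255) of the two OEM code pages.
upper = bytes(range(128, 256))
cp437 = upper.decode("cp437")
cp850 = upper.decode("cp850")

# Collect (code, CP437 char, CP850 char) wherever the two tables disagree.
differences = [
    (128 + i, a, b) for i, (a, b) in enumerate(zip(cp437, cp850)) if a != b
]

for code, a, b in differences:
    print(f"ALT+{code}: CP437 {a}  CP850 {b}")
```

Positions that do not appear in the output, such as 176, give the same character on both kinds of machine.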
