Broken emoji

  • H
    Hosein GSD
    last edited by Jan 24, 2025, 1:33 PM

    I have text full of broken emoji like 💀 displayed as strange symbols. Is there any way to fix it (inside or outside Notepad++)?
    Notepad++ does not break the emoji; they were broken in the first place. I want to fix them and I don’t know how.
    I read this but it didn’t help to fix my problem.

    • P
      PeterJones @Hosein GSD
      last edited by Jan 24, 2025, 2:24 PM

      @Hosein-GSD ,

      When you open with Notepad++, does the lower-right (between “Windows (CR LF)” and “INS”) say “UTF-8” or “ANSI” or something else?

      Because if it says ANSI or something other than UTF-8, then you might be able to trick Notepad++ into re-reading the bytes as UTF-8 by using Encoding > UTF-8 (not Convert to UTF-8).
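
To illustrate the difference between Encoding > UTF-8 (re-read the same bytes) and Convert to UTF-8 (keep the displayed characters and write new bytes), here is a minimal Python sketch. It assumes that "ANSI" means Windows-1252 (`cp1252`), which depends on the system code page, and the sample string is made up for the illustration.

```python
# Bytes on disk: valid UTF-8 for "café" plus a skull emoji
# (sample text, not taken from the original poster's file).
data = "caf\u00e9 \U0001F480".encode("utf-8")

# What an editor shows when it wrongly assumes ANSI
# (assumed here to be Windows-1252).
shown_as_ansi = data.decode("cp1252")

# "Encoding > UTF-8": the same bytes, decoded with the right codec.
reinterpreted = data.decode("utf-8")

# "Convert to UTF-8": the wrong characters are kept and re-encoded,
# which bakes the mojibake into the file.
converted = shown_as_ansi.encode("utf-8")

print(ascii(shown_as_ansi))   # the mis-decoded characters, as escapes
print(ascii(reinterpreted))   # the intended text, as escapes
print(converted == data)      # False: converting changed the bytes
```

Re-interpreting leaves the bytes alone and only changes how they are read, which is why it can undo a wrong encoding guess while converting cannot.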

      Other than that, you weren’t very clear whether your text was a pure text file, or whether it was HTML like in the SO link you provided. And whether when you look at it in Notepad++ it’s wrong, or if it’s only wrong if you look using your browser but it’s right in Notepad++.

      Click the </> button in your reply, and paste in some example text from your file into the code-block that the forum button created (make sure none of what you paste is confidential/proprietary) – and maybe also include a screenshot of what it looks like in your Notepad++ (and also in your browser, if it really is HTML like the one from SO).

      • P
        PeterJones @PeterJones
        last edited by Jan 24, 2025, 3:13 PM

        @Hosein-GSD ,

        ðŸ’€ could represent the bytes with hex values F0 9F 92 80, and those, in turn, are the UTF-8 bytes of U+1F480 💀
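
As a worked check of that byte arithmetic, the following Python sketch decodes the four bytes by hand and then shows what they look like when each byte is read as a separate character. Reading them as Windows-1252 (`cp1252`) is an assumption about what "ANSI" means on the machine that produced the file.

```python
raw = bytes.fromhex("F0 9F 92 80")

# A 4-byte UTF-8 sequence has the bit layout
# 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx; the x bits form the code point.
code_point = (
    ((raw[0] & 0x07) << 18)
    | ((raw[1] & 0x3F) << 12)
    | ((raw[2] & 0x3F) << 6)
    | (raw[3] & 0x3F)
)
print(f"U+{code_point:04X}")  # U+1F480

# The same bytes read one at a time as Windows-1252 (assumed "ANSI")
# give the four "weird" characters instead of one emoji.
mis_read = raw.decode("cp1252")
print([f"U+{ord(ch):04X}" for ch in mis_read])
# ['U+00F0', 'U+0178', 'U+2019', 'U+20AC']
```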

        Unfortunately, there is no easy way to use a “regular expression”-mode search-and-replace to generically convert those individual bytes into the actual character.

        However, you still might be able to use Notepad++ to help:

        Copy that sequence of four weird characters, then File > New, set Encoding > ANSI, paste (it will still look like those four characters), and save to a temporary file. Then choose Encoding > UTF-8, which re-interprets the four bytes as the single real character; you should then be able to copy that character back into your original UTF-8 file. Before you save your final file, I would recommend putting in the BOM character so that future editors (and/or browsers) will be able to see that you want UTF-8: Encoding > Convert to UTF-8-BOM, then save. (The lower right will now say UTF-8-BOM to indicate that Notepad++ knows the UTF-8 file has the BOM sequence at the beginning.)
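
For reference, the BOM mentioned above is the three-byte sequence EF BB BF at the start of the file. A small Python sketch using the standard `utf-8-sig` codec shows the difference between a file saved with and without it; the sample text is made up.

```python
import codecs

text = "skull: \U0001F480"

plain = text.encode("utf-8")         # UTF-8 without a signature
with_bom = text.encode("utf-8-sig")  # UTF-8 with the BOM prepended

print(with_bom[:3].hex(" "))                 # ef bb bf
print(with_bom == codecs.BOM_UTF8 + plain)   # True

# Reading with "utf-8-sig" strips the BOM again if it is present.
print(with_bom.decode("utf-8-sig") == text)  # True
```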

        If all special characters in your file are the mis-encoded emoji, like you showed, then you might be able to get away with:

        • Open File
        • Confirm it says UTF-8 in the corner, but has the bad characters like the ones you showed
        • Copy
        • Create new file, set Encoding > ANSI
        • Paste
        • Re-interpret using Encoding > UTF-8
        • Then you can copy from there and paste back in your original. Save.
        • Highly recommended: Encoding > Convert to UTF-8-BOM, Save.

        You can test this procedure by starting with the following as your “bad UTF-8 source”:

        Here is a file that claims to be UTF-8 (see the lower corner),
        but is showing the emoji's as bad characters, like ðŸ’€
        I will include a couple more: ðŸŒ€ ðŸŒ‚
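
The same re-interpretation can also be done outside Notepad++. The following is a minimal Python sketch, not the method from this thread: it assumes the broken text came from UTF-8 bytes that were mis-read as Windows-1252, and the names `to_byte`, `repair_run` and `fix_mojibake` are hypothetical helpers made up for the illustration.

```python
import re

# Runs of consecutive non-ASCII characters are the repair candidates.
NON_ASCII_RUN = re.compile(r"[^\x00-\x7f]+")


def to_byte(ch):
    """Map one character back to the single byte it was mis-read from."""
    try:
        return ch.encode("cp1252")
    except UnicodeEncodeError:
        # cp1252 leaves 0x81, 0x8D, 0x8F, 0x90 and 0x9D undefined; such
        # bytes usually survive as C1 control characters, which latin-1
        # maps one-to-one. Anything else raises and is handled below.
        return ch.encode("latin-1")


def repair_run(match):
    run = match.group(0)
    try:
        return b"".join(to_byte(ch) for ch in run).decode("utf-8")
    except UnicodeError:
        # Not mojibake (for example a real emoji or a lone accented
        # letter), so leave the run untouched.
        return run


def fix_mojibake(text):
    """Repair UTF-8-read-as-cp1252 runs and keep everything else."""
    return NON_ASCII_RUN.sub(repair_run, text)


broken = "bad characters, like \u00f0\u0178\u2019\u20ac"
print(ascii(fix_mojibake(broken)))  # 'bad characters, like \U0001f480'
```

Working run by run means text that is already correct is left alone, but a run that mixes genuine non-ASCII characters with mojibake and no ASCII character in between will not be repaired by this sketch.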
        
        • G
          guy038
          posted Jan 24, 2025, 9:02 PM, last edited by guy038 Jan 24, 2025, 9:07 PM

          Hello, @hosein-GSD, @Peterjones and All,

          I think that the clever @peterjones’s method can be recorded as a Notepad++ macro!

          So, first, insert this new macro in your active shortcuts.xml file, right before the closing </Macros> tag:

                  <Macro name="Test_Emoji" Ctrl="no" Alt="yes" Shift="no" Key="0">
                      <Action type="2" message="0" wParam="41001" lParam="0" sParam="IDM_FILE_NEW" />
                      <Action type="2" message="0" wParam="45009" lParam="0" sParam="IDM_FORMAT_CONV2_ANSI" />
                      <Action type="0" message="2179" wParam="0" lParam="0" sParam="SCINTILLAMESSAGE.SCI_PASTE" />
                      <Action type="2" message="0" wParam="45008" lParam="0" sParam="IDM_FORMAT_AS_UTF_8" />
                      <Action type="2" message="0" wParam="45011" lParam="0" sParam="IDM_FORMAT_CONV2_UTF_8" />
                      <Action type="2" message="0" wParam="41006" lParam="0" sParam="IDM_FILE_SAVE" />
                  </Macro>
          
          • Exit Notepad++

          • Restart N++

          • Open your original file in Notepad++

          • Select all your text ( Ctrl + A )

          • Copy its contents in the clipboard ( Ctrl + C)

          • Run the Macro > Test_Emoji macro

          • Save the newly created file, under its default name or any other name

          => You should see all your emoji characters displayed correctly!


          To test my method, simply copy the two lines below to the clipboard and run the Macro > Test_Emoji macro:

          I have a file which is showing all emoji's as bad characters, like ðŸ’€
          Here are some other examples of bad syntaxes : ðŸŒ€ and ðŸŒ‚
          

          Of course, you can easily change the macro’s name and/or assign a shortcut to this new macro.

          Best Regards,

          guy038

          • H
            Hosein GSD @Hosein GSD
            last edited by Feb 13, 2025, 10:20 PM

            @PeterJones
            thank you so much for your detailed answer 🙏

            Actually, the problem was with the output files of this Chrome extension.
            I wrote a comment about it, and today I used the extension again and noticed that the problem is solved.

            (Before that, I tried to solve many problems manually using regex, which was useful for things like & but not for things like 💀.)

            @guy038 said in Broken emoji:

            I think that the clever @peterjones’s method can be recorded as a N++ macro !

            absolutely!
