UnicodeEncodeError: 'gbk' codec can't encode character '\u2649' in position 4: illegal multibyte sequence

  • M
    mabangde0
    last edited by Aug 17, 2024, 3:34 PM

    As shown in the screenshots, when I run my Python code from Notepad++, I get the error “UnicodeEncodeError: ‘gbk’ codec can’t encode character ‘\u2649’ in position 4: illegal multibyte sequence”.

    The same code runs without error in VSCode. What is the cause, and how should Notepad++ be configured? Thank you for your answers.

    屏幕截图 2024-08-17 190206.png
    屏幕截图 2024-08-17 190143.png

    • P
      PeterJones @mabangde0
      last edited by Aug 17, 2024, 4:58 PM

      @mabangde0 ,

      it’s python.exe, not Notepad++, which is giving you that error. However, since Notepad++'s use of encoding might be the culprit, it’s still on-topic here.

      Possibly, the file is not saved with the encoding that you think it is in Notepad++. You will want to look at Notepad++'s Encoding menu’s selection, and/or the status bar (which you didn’t show in the screenshot):

      007b5bac-2f23-48f0-a3ed-6d0142ff078c-image.png

      Also, look to see what encoding VSCode claims it is.

      If I remember correctly, without an # encoding=... line or similar, Python 3 assumes the source file is UTF-8. I know that the example I showed gives an error when I use UTF-8, but I only ran OCR on your screenshot to get similar characters; I have no idea how to enter the exact ones. (It would have been nice if you’d clicked on </> in the forum, then put your example code, with the right characters, in between the ``` lines, so we could copy/paste.) So it might be that by pasting from OCR, I got characters that won’t encode correctly.

      When looking at the error I get when I use UTF-8 in Notepad++, I thought Notepad++ might be using surrogate pairs (U+D800–U+DFFF) to encode the U+1xxxx characters, and then encoding those surrogates into UTF-8 in the file. I don’t think UTF-8 normally needs or uses surrogate pairs, so that might be confusing the Python interpreter. (I think “modified UTF-8” allows it, but maybe not normal UTF-8.) And since the Python 3.12 codecs documentation says that it doesn’t support surrogates even for UTF-16 (which is the encoding for which surrogate pairs are defined), it wouldn’t surprise me if Python doesn’t accept surrogates for UTF-8 either. Someone who knows more about Python’s encoding rules would really need to chime in on that. However, while re-reading before posting, I noticed that the error wasn’t actually listing anything in the surrogate-pair range, so this paragraph was probably wrong, at least for the characters I used; maybe your characters do, I’m not sure.
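A quick way to test the surrogate idea, offered as a sketch rather than a definitive answer: the character named in the error message, U+2649, can be checked against the surrogate range directly, and its UTF-8 bytes can be inspected. The variable names below are only illustrative.

```python
# Character taken from the error message in this thread.
ch = "\u2649"
code_point = ord(ch)

# Surrogates occupy U+D800 through U+DFFF.
is_surrogate = 0xD800 <= code_point <= 0xDFFF

# A non-surrogate character in the Basic Multilingual Plane encodes
# to at most three bytes in UTF-8.
utf8_bytes = ch.encode("utf-8")
print(hex(code_point), is_surrogate, utf8_bytes)

# For comparison: Python's strict UTF-8 codec refuses a lone surrogate.
try:
    "\ud800".encode("utf-8")
    lone_surrogate_ok = True
except UnicodeEncodeError:
    lone_surrogate_ok = False
print("lone surrogate encodable:", lone_surrogate_ok)
```

If this prints False for the surrogate check, the character in the error message is an ordinary code point, which would suggest surrogates are not what is going wrong here.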

      Since your error message included “gbk”, I tried setting # encoding=gbk (which is listed in Python’s codecs), but then couldn’t find a Notepad++ encoding (at least not under Encoding > Character Set > Chinese) that matched GBK. I then tried # encoding=gb2312, but using that encoding in Notepad++ gives me SyntaxError: encoding problem: gb2312 instead…

      (But I’m not an expert on Chinese encodings – everything that seems right in anything I said above specifically about those encodings was only because google apparently gave me good answers; anything that’s wrong is probably my fault for not understanding/interpreting correctly.)

      I guess I might just be wasting everyone’s time, other than suggesting that you double-check encodings on both applications to see if there’s a difference. Sorry for the rambling. Maybe there’s someone else who has more experience with Chinese characters and python and Notepad++ all working together, who can come give you a real answer.
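Since the error comes from python.exe, one thing worth comparing between the two applications is the encoding Python uses for its output, not only the encoding of the saved file. The following is a minimal sketch; the function name `describe_output_encoding` is made up for illustration, and the values it reports depend entirely on how each application launches Python.

```python
import locale
import sys


def describe_output_encoding(stream=sys.stdout):
    """Report the encodings Python will use when writing text."""
    return {
        # Encoding attached to the stream that print() writes to.
        "stream_encoding": getattr(stream, "encoding", None),
        # Encoding the operating system locale suggests for text.
        "locale_encoding": locale.getpreferredencoding(False),
    }


print(describe_output_encoding())
```

Running this once from Notepad++ and once from VSCode should show whether one of them ends up with gbk and the other with utf-8.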

      • M
        mabangde0 @PeterJones
        last edited by Aug 18, 2024, 10:47 AM

        @PeterJones Thank you for your answer, here is the original code:

        n = eval(input("请输入一个数字:"))  # prompt means "Please enter a number:"
        print("{:+^11}".format(chr(n-1)+chr(n)+chr(n+1)))

        After running, enter 9802 in the console to reproduce.
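The failure can be reproduced without any console input, which may help separate the file encoding from the output encoding. This is a sketch under the assumption that the output stream is using the gbk codec, as the error message suggests; the helper name `centered` is hypothetical.

```python
def centered(n):
    """Build the same 11-character string as the original code."""
    return "{:+^11}".format(chr(n - 1) + chr(n) + chr(n + 1))


text = centered(9802)

# chr(9801) is U+2649; after four '+' padding characters it sits at index 4.
print(ascii(text))

# Encoding to gbk by hand mimics what print() does when stdout uses gbk.
try:
    text.encode("gbk")
    failure = None
except UnicodeEncodeError as exc:
    failure = (exc.encoding, exc.start, ascii(exc.object[exc.start]))
print(failure)
```

If that is indeed the cause, commonly used workarounds are to set the environment variable PYTHONIOENCODING=utf-8 before starting Python, or to call sys.stdout.reconfigure(encoding="utf-8") at the top of the script (Python 3.7 and later). Whether the window that displays the output in Notepad++ then shows the characters correctly is a separate matter that this thread does not confirm.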

        • E
          Ekopalypse @mabangde0
          last edited by Aug 18, 2024, 11:34 AM

          @mabangde0

          see here for a solution to your problem
