    UnicodeEncodeError: 'gbk' codec can't encode character '\u2649' in position 4: illegal multibyte sequence

      mabangde0

      As shown in the figure, when I run my Python code from Notepad++, I get the error "UnicodeEncodeError: 'gbk' codec can't encode character '\u2649' in position 4: illegal multibyte sequence".

      The same code runs without error in VS Code. What is the reason? How should Notepad++ be configured? Thank you for your answers.

      [Screenshot: 屏幕截图 2024-08-17 190206.png]
      [Screenshot: 屏幕截图 2024-08-17 190143.png]

        PeterJones @mabangde0

        @mabangde0 ,

        It’s python.exe, not Notepad++, that is giving you that error. However, since Notepad++'s handling of encoding might be the culprit, it’s still on-topic here.

        Possibly, the file is not saved in Notepad++ with the encoding that you think it is. You will want to look at the selection in Notepad++'s Encoding menu, and/or the status bar (which you didn’t show in the screenshot):

        007b5bac-2f23-48f0-a3ed-6d0142ff078c-image.png

        Also, look to see what encoding VSCode claims it is.

        If I remember correctly, without an # encoding=... comment or similar, Python 3 assumes the source file is UTF-8. The file I showed gives an error when I save it as UTF-8, but I only ran OCR on your screenshot to get similar characters; I have no idea how to enter the exact ones. (It would have been nice if you’d clicked on </> in the forum and put your example code, with the right characters, between the ``` lines, so we could copy/paste.) So it might be that by pasting from OCR, I got characters that won’t encode correctly.

        When looking at the error I get with the file saved as UTF-8 in Notepad++, I thought Notepad++ might be using surrogate pairs (U+D800–U+DFFF) to represent the U+1xxxx characters and then encoding those surrogates as UTF-8 in the file. I don’t think UTF-8 normally needs or uses surrogate pairs, so that might confuse the Python interpreter. (I think “modified UTF-8” allows it, but maybe not normal UTF-8.) And since the Python 3.12 codecs documentation says that surrogates are not supported even for UTF-16 (the encoding for which surrogate pairs are defined), it wouldn’t surprise me if Python doesn’t accept surrogates in UTF-8 either. Someone who knows more about Python’s encoding rules would really need to chime in on that. However, while re-reading before posting, I noticed that the error wasn’t actually listing anything in the surrogate-pair range, so this paragraph is probably wrong, at least for the characters I used; maybe your characters do involve surrogates, I’m not sure.
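
The surrogate question above can be checked directly. The following is a minimal sketch, not something from the thread: it uses U+1F600 as an arbitrary example of a character above U+FFFF (the characters in the screenshot are unknown) and compares it with U+2649, the character named in the error message.

```python
# U+2649 (from the error message) lies inside the Basic Multilingual Plane,
# so no surrogate pair is needed to represent it in any UTF encoding.
taurus = "\u2649"
in_bmp = ord(taurus) < 0x10000

# U+1F600 is an arbitrary example of a character above U+FFFF.
astral = "\U0001F600"
utf8_bytes = astral.encode("utf-8")       # a single 4-byte sequence, no surrogates
utf16_bytes = astral.encode("utf-16-le")  # the surrogate pair D83D DE00, little-endian

# A lone surrogate code point is rejected by Python's strict UTF-8 codec.
try:
    "\ud83d".encode("utf-8")
    lone_surrogate_rejected = False
except UnicodeEncodeError:
    lone_surrogate_rejected = True

print(in_bmp, utf8_bytes.hex(), utf16_bytes.hex(), lone_surrogate_rejected)
```

If this runs as expected, surrogates only come into play for UTF-16 and for characters above U+FFFF, which would support the conclusion that they are not the cause of the reported error.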

        Since your error message mentions “gbk”, I tried setting # encoding=gbk (which is listed in Python’s codecs), but then couldn’t find a Notepad++ encoding (at least not under Encoding > Character Set > Chinese) that matched GBK. I then tried # encoding=gb2312 with the matching encoding in Notepad++, but that gives me SyntaxError: encoding problem: gb2312 instead…
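
As a side check on the codec names mentioned above, this sketch tests which of Python's Chinese codecs can represent U+2649, the character from the error message. The helper name `can_encode` is made up for illustration, and gb18030 and utf-8 are added only for comparison; neither was tried in the thread.

```python
taurus = "\u2649"  # the character named in the error message


def can_encode(text, codec_name):
    """Return True if `text` can be encoded with the given codec."""
    try:
        text.encode(codec_name)
        return True
    except UnicodeEncodeError:
        return False


# gb2312 and gbk appear in the thread; gb18030 and utf-8 are comparison points.
results = {name: can_encode(taurus, name)
           for name in ("gb2312", "gbk", "gb18030", "utf-8")}

for name, ok in results.items():
    print(name, "can encode U+2649" if ok else "cannot encode U+2649")
```

The point of the comparison is that the failure depends on the codec used for encoding, not on the character being invalid.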

        (But I’m not an expert on Chinese encodings. Anything above that is right about those encodings is only because Google apparently gave me good answers; anything that’s wrong is probably my fault for not understanding or interpreting them correctly.)

        I guess I might just be wasting everyone’s time, other than suggesting that you double-check encodings on both applications to see if there’s a difference. Sorry for the rambling. Maybe there’s someone else who has more experience with Chinese characters and python and Notepad++ all working together, who can come give you a real answer.
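
One way to do the suggested comparison from the Python side is to run the same small script from both Notepad++ and VS Code and compare what it reports. This is a sketch only; the function name `describe_encodings` is invented here, and how each editor launches Python is not shown in the thread.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings Python is using in the current process."""
    return {
        # Encoding that print() uses when writing to standard output.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding that input() uses when reading from standard input.
        "stdin": getattr(sys.stdin, "encoding", None),
        # Encoding derived from the operating system's locale settings.
        "locale": locale.getpreferredencoding(False),
        # Default encoding for str.encode() and bytes.decode().
        "default": sys.getdefaultencoding(),
    }


for name, value in describe_encodings().items():
    print(f"{name}: {value}")
```

If the "stdout" line reads gbk (or cp936) in one environment and utf-8 in the other, that difference, rather than the encoding of the saved file, would be the first thing to look at.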

          mabangde0 @PeterJones

          @PeterJones Thank you for your answer, here is the original code:

          # The prompt string means "Please enter a number:"
          n = eval(input("请输入一个数字:"))
          print("{:+^11}".format(chr(n-1)+chr(n)+chr(n+1)))

          After running, enter 9802 in the console to reproduce.
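
The failure can be reproduced without any console by encoding the same string by hand. This is a minimal sketch that hard-codes the input 9802 from the post; it suggests that the problem lies in encoding the output text as GBK, since the characters are created at run time by chr() and do not appear in the source file.

```python
n = 9802  # the input value given in the post
text = "{:+^11}".format(chr(n - 1) + chr(n) + chr(n + 1))

# Encoding as GBK is expected to fail on the first character GBK lacks.
try:
    text.encode("gbk")
    error = None
except UnicodeEncodeError as exc:
    error = exc

# Encoding the same text as UTF-8 works.
utf8_bytes = text.encode("utf-8")

print(ascii(text))
if error is not None:
    print(error.encoding, error.start, ascii(text[error.start]))
```

The string is four plus signs, the three characters U+2649 to U+264B, and four more plus signs, so the first problematic character sits at index 4, which matches "position 4" in the reported message.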

            Ekopalypse @mabangde0

            @mabangde0

            see here for a solution to your problem
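
The target of the link above is not preserved in this copy of the thread, so the following is not the linked solution. It is a separate sketch of one commonly used workaround, assuming Python 3.7 or later and assuming that the script's standard output is being encoded as GBK: switch the output stream to UTF-8 before printing. The helper name `force_utf8` is hypothetical, and the GBK stream here is simulated in memory.

```python
import io


def force_utf8(stream):
    """Switch a text stream to UTF-8 if it supports reconfigure() (Python 3.7+)."""
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is not None:
        reconfigure(encoding="utf-8")
    return stream


# Simulate a standard output stream that encodes text as GBK.
fake_stdout = io.TextIOWrapper(io.BytesIO(), encoding="gbk", newline="\n")

# After switching to UTF-8, printing the characters from the post succeeds.
force_utf8(fake_stdout)
print("++++\u2649\u264a\u264b++++", file=fake_stdout)
fake_stdout.flush()

written = fake_stdout.buffer.getvalue()
print(written)
```

In a real script the call would be force_utf8(sys.stdout) before the first print. Setting the environment variable PYTHONIOENCODING=utf-8 or starting Python with -X utf8 has a similar effect without changing the code. Whether the window that displays the output then renders the UTF-8 bytes correctly depends on how the script is launched from Notepad++, which the thread does not show.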
