Incorrect display of UTF-8 in the console.
-
Hello. I want to execute Python code so that the execution output is in the console. Configured as shown in the YouTube video “Setting up Notepad ++ for Python”. However, Russian words are garbled in the console when I run the file. The file itself is encoded as UTF-8.
Notepad++ v7.9.1 (64-bit)
Build time : Nov 2 2020 - 01:07:46
Path : C:\soft\Notepad++\notepad++.exe
Admin mode : OFF
Local Conf mode : ON
OS Name : Windows 10 Home (64-bit)
OS Version : 2009
OS Build : 19042.1052
Current ANSI codepage : 1251
Plugins : mimeTools.dll NppConverter.dll NppExec.dll NppExport.dll NppRegExTractorPlugin.dll NppScripts.dll PreviewHTML.dll PythonScript.dll RegexTrainer.dll Remove Duplicate Lines.dll
-
Could you show an example script that you think should output Russian text in the PythonScript console, and show a screenshot of what actually did get output?
-
@Александр-К-ш said in Incorrect display of UTF-8 in the console.:
Configured as shown in the YouTube video “Setting up Notepad ++ for Python”
Nobody’s going to watch a video so that they can answer your question here.
-
For example, if I use the PythonScript v1.54 plugin, and use the PythonScript console’s immediate command to run
console.write("Здравствуйте")
I get the correct display.
(I snagged that Russian text from your “Russian Translation”-topic post of the same problem)
If I paste the same text in a new editor window in Notepad++ (UTF-8), and use
console.write(editor.getText())
to print the contents of the active file to the console window, I also get the correct display.
Thus, if used properly, Notepad++ and PythonScript v1.54 have no difficulty writing Russian text.
Show us your code that produces the problem in your environment, and one of the Python/PythonScript experts here will probably quickly see how you are mishandling encoding in Python.
-
Correction: I meant PythonScript version 1.5.4 – forgot the second dot.
Addendum: The experiment above was in Notepad++ v8.1. However, I unzipped a fresh Notepad++ v7.9.1-64bit portable and installed PythonScript 1.5.4, and it behaved identically.
-
@Александр-К-ш said in Incorrect display of UTF-8 in the console.:
I want to execute Python code so that the execution output is in the console.
I’d read this as wanting to run an external python via NppExec and having the output in the NppExec console window, but that’s for the OP to elaborate on.
Of course, maybe someone watched the video…and it is indeed confirmed to be PythonScript.
-
@Alan-Kilborn said in Incorrect display of UTF-8 in the console.:
maybe someone watched the video
Since the OP didn’t link to the video, but just expected our search to come up with exactly the same video he found…
At this point, the questions that you need to answer / things you need to understand:
- What exactly are you trying to do?
- Which plugin are you using to try to do it?
- We are not going to watch a video to know how you think you’ve set up your system
- If you are using PythonScript,
  - try the experiments I showed, and show your results
  - share the PythonScript code you are using, and a screenshot of results, showing that it doesn’t work
- If you are using NppExec
  - Show an example of the code you ran, with a screenshot of the “bad” results
  - Show the output of npe_console, which will tell us what input and output encodings you are using
  - If you are using NppExec out-of-the-box, without changing your encoding settings, it might not be properly set up for unicode output.
- If you are using something other than PythonScript or NppExec, you will have to show us what you’re really doing, with example code/scripts and screenshots.
-
If you are using NppExec…
I didn’t have a python executable installed on this machine (pythonscript’s DLL is sufficient for my python needs), so when I had a couple spare minutes, I downloaded the “Windows embeddable package” (the minimalistic Python3 executable, which can be embedded, or is otherwise “portable”).
#!python3
# encoding=utf-8
print("Hello World")
print("Здравствуйте")
When I ran it from the command line, I got the expected output, showing that the code was good.
When I tried to run from NppExec’s console, I got a UnicodeEncodeError. When I changed my python command line to invoke the -X utf8 option, then it worked inside NppExec’s console as well. Both these runs are shown in the screen cap below:
So, if I use the NppExec script
cd "$(CURRENT_DIRECTORY)"
C:\usr\local\apps\python-3.9.6-embed-amd64\python.exe -X utf8 "$(FULL_CURRENT_PATH)"
… then it properly outputs UTF8 text for me.
When I first got this error, I hopped over to my normal cmd.exe window, and verified that the code works under normal circumstances (as shown above); then I ran
python -h
to find command line options, and looked through until I found:
python -h
...
-X utf8: enable UTF-8 mode for operating system interfaces, overriding the default locale-aware mode. -X utf8=0 explicitly disables UTF-8 mode (even when it would otherwise activate automatically)
That is how I found the option which enabled it to work.
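If changing the command line is inconvenient, a similar effect can probably be had from inside the script itself. The following is only a minimal sketch, assuming Python 3.7 or later (where text streams offer a reconfigure() method); force_utf8 is a hypothetical helper name I made up for illustration. It only changes what Python emits: the NppExec console still has to be set to interpret the output as UTF-8.

```python
#!python3
# encoding=utf-8
import sys


def force_utf8(stream):
    """Switch a text stream to UTF-8 output, if the stream supports it.

    Hypothetical helper: relies on reconfigure(), available on
    io.TextIOWrapper since Python 3.7. Streams without that method
    (for example io.StringIO) are returned unchanged.
    """
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is not None:
        reconfigure(encoding="utf-8")
    return stream


if __name__ == "__main__":
    force_utf8(sys.stdout)
    print("Hello World")
    print("Здравствуйте")
```

With this at the top of the script, the plain python.exe "$(FULL_CURRENT_PATH)" command line should behave like the -X utf8 variant as far as stdout is concerned, though I have only reasoned about it, not tested it inside NppExec.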
I am sharing this detailed process with you, because when you are dealing with command lines, and embedding one process inside another, encoding can get confusing, even if none of the individual pieces are technically acting wrong. You sometimes have to be willing to run experiments and make guesses, trying to troubleshoot it yourself, before you can claim that a specific piece is causing the problem. It appears that running from NppExec provides a different locale than the standard cmd.exe environment, at least with my python.exe v3.9.6.
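One way to run that kind of experiment is to have Python report what it thinks its encodings are, once from cmd.exe and once from NppExec, and compare the two. This is a rough diagnostic sketch, assuming Python 3.7+ (for sys.flags.utf8_mode); the function names encoding_report and can_encode are my own, purely illustrative.

```python
import locale
import sys


def encoding_report():
    """Collect the settings that influence how print() encodes text."""
    return {
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        "preferred_encoding": locale.getpreferredencoding(False),
        "filesystem_encoding": sys.getfilesystemencoding(),
        "utf8_mode": sys.flags.utf8_mode,  # 1 when run with -X utf8
    }


def can_encode(text, encoding):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    # Only ASCII is printed here, so the report itself cannot fail to encode.
    for name, value in encoding_report().items():
        print(f"{name}: {value}")
    out_enc = getattr(sys.stdout, "encoding", None) or "ascii"
    print("stdout can encode Russian text:", can_encode("Здравствуйте", out_enc))
```

If the two environments report different stdout_encoding values, that difference is the first place I would look to explain a UnicodeEncodeError in one and not the other.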
-
I have the same problem
You must use Unicode strings and convert the files.
My solution is below:
# -*- coding: utf-8 -*-
import os
import sys
from Npp import notepad  # import it first!

filePathSrc = ur"c:\stash\test"  # Path to the folder with files to convert
for root, dirs, files in os.walk(filePathSrc):
    for fn in files:
        if fn[-4:] == u'.txt':  # Specify type of the files
            notepad.open((root + u"\\" + fn).encode('utf8'))
            notepad.runMenuCommand("Encoding", "Convert to UTF-8")
            notepad.saveAs(u"{}{}".format((root + u"\\" + fn), u'.utf8_txt').encode('utf8'))
            notepad.close()
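For comparison, roughly the same batch conversion can be sketched in standalone Python 3 without going through Notepad++ at all. This is an assumption-laden sketch, not a drop-in replacement: it assumes the source files are in cp1251 (the ANSI codepage shown in the debug info at the top of the thread), whereas the Notepad++ menu command detects the current encoding itself. convert_tree is a hypothetical name.

```python
from pathlib import Path


def convert_tree(folder, source_encoding="cp1251", suffix=".txt"):
    """Write a UTF-8 copy next to each matching file and return the new paths.

    Assumption: every matching file really is in source_encoding.
    Bytes are decoded and re-encoded directly so line endings are preserved.
    """
    written = []
    # sorted() materializes the listing before any new files are created.
    for path in sorted(Path(folder).rglob("*" + suffix)):
        text = path.read_bytes().decode(source_encoding)
        target = path.with_name(path.name + ".utf8_txt")
        target.write_bytes(text.encode("utf-8"))
        written.append(target)
    return written


if __name__ == "__main__":
    # Same folder as in the script above; adjust to your own path.
    for new_file in convert_tree(r"c:\stash\test"):
        print(new_file)
```

As in the PythonScript version, the originals are left untouched and each copy gets the .utf8_txt extension appended.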
-
I’m not sure you or any of the previous posters have answered the question posed by the OP.
But, since the OP hasn’t returned to post and clarify, we don’t know (and it doesn’t matter since OP was the one asking for help).