pythonscript: saying that encoding is not defined.

V S Rawat

I have the following line in my python script
#coding=utf8

the script runs but there is a message at console that
SyntaxError: Non-ASCII character '\xe0' in file C:\Users\ilLUSIon\AppData\Roaming\Notepad++\plugins\Config\PythonScript\scripts\Repl_nuqta.py on line 4, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

this script was running well in w8-1, 64 bit on 32 bit npp,

recently I switched to w10-64 bit, and installed 64 bit npp,
now it is giving this error.

I had

Please resolve.
Thanks.

V S Rawat

Edit: I had installed python script from within npp plugin manager, so I guess that 1.3.0 version is 64 bit, matching with npp bit.

PeterJones

Which version of Notepad++ are you using? Not just 32-bit vs 64-bit, but the version number. My examples below are in v7.5.8 (with 32-bit or 64-bit indicated).

Assuming that the unicode character (☺ in my example) is inside a string, you may need to use unicode strings notation (u'☺' ) rather than normal string notation ('☺').

PythonScript currently uses Python 2.7. I don’t know enough about the intricacies of Python 2.7 to know whether #coding=utf8 was always sufficient to be able to not use them; ~~my minimal understanding, all unicodish strings in 2.7 (and thus in PythonScript) should use u'' for unicode strings.~~ I ~~don’t~~ didn’t know whether the PythonScript python2.7.dll was enabled with that option or not.

Ok, I was wrong. Some experimenting (though this is 32-bit NPP 7.5.8 with PythonScript 1.3.0.0):

from Npp import *

def forum_post16899_FunctionName():
    console.show()
    console.clear()
    console.write(u'SMILE: ☺\n')

if __name__ == '__main__': forum_post16899_FunctionName()

will give me the error.

# encoding=utf-8

from Npp import *

def forum_post16899_FunctionName():
    console.show()
    console.clear()
    console.write(u'SMILE: ☺\n')

if __name__ == '__main__': forum_post16899_FunctionName()

does not give me the error. So the u'SMILE: ☺\n' notation is not sufficient, and the # encoding=utf-8 line does work.

Let’s see if I can get PythonScript working in my 7.5.8 64-bit portable. Yes, those two scripts have the same behavior in both 32-bit and 64-bit NPP v7.5.8.

Some more experiments, since the PEP 263 doesn’t show any utf-8 examples:

# encoding=utf-8
# worked

# encoding=utf8
# no hyphen: worked

#encoding=utf8
# no space before `encoding`: worked

# encoding= utf8
# space after equal, not before: worked

# encoding = utf8
# space before and after equal: gave your error message

# encoding =utf8
# space before equal, but not after equal: gave your error message

So it appears you cannot have a space between the “encoding” and the “equal”.

Is the #coding=utf8 line that you showed an exact quote, or was it modified by the forum?

Rendering help below

-----

You can get it to render exactly in this forum by surrounding it by the ` mark, like `#coding=utf8`, or by putting it on a line by itself, prefixed with four spaces, so

    #coding=utf8

becomes

#coding=utf8

or you can use ```z on a line before, and ``` on a line after (with blank lines surrounding) like:

 
```z
#coding=utf8
```

to render like

#coding=utf8

This help-with-markdown post gives more details on how to mark up for this forum to successfully communicate.

Alan Kilborn

The thing that strikes me in this thread is that sometimes encoding is used and sometimes coding is used…are these supposed to be interchangeable?

PeterJones

Whoops. PEP 263 said coding, but I had encoding.

# coding=utf8
# worked

# coding= utf8
# worked

# coding =utf8
# error

# coding = utf8
# error

But I get the same results for coding or encoding: the space between the g and the = is the critical part.

Alan Kilborn

Ah…it needs to match this regex: ^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)

Since encoding and coding both will match that, both are acceptable.
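To make the experiments above easy to repeat outside Notepad++, here is a minimal sketch that runs the PEP 263 regex quoted above against the variations tried in this thread. The helper name `coding_name` and the `CANDIDATES` list are only for illustration; they are not part of PythonScript or PEP 263.

```python
import re

# The pattern quoted above from PEP 263.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def coding_name(line):
    """Return the encoding name declared on `line`, or None if it does not match."""
    match = CODING_RE.match(line)
    return match.group(1) if match else None


# Variations tried in this thread (hypothetical list, for illustration only).
CANDIDATES = [
    "# coding=utf8",
    "# coding= utf8",
    "# coding =utf8",
    "# coding = utf8",
    "# encoding=utf-8",
    "#coding: utf8",
    "#coding : utf8",
    "coding=utf8",
]

for line in CANDIDATES:
    print("%-20r -> %r" % (line, coding_name(line)))
```

The lines with a space between `coding` and the `=` or `:` come back as `None`, which lines up with the errors reported above. The last line also fails, because the pattern requires a leading `#`.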

PeterJones

Ah, yeah, I hadn’t read down far enough to notice the regular expression. That was nice of them to include. :-)

So, now it just remains for @V-S-Rawat to confirm whether his line actually matches that regex, and/or paste it in the forum without the forum mangling it. :-)

V S Rawat

Which version of Notepad++ are you using?

I am on npp 7.6.2-64 bit on w10-64 bit and w8.1-64 bit (multiboot, I work in both os with the same npp)

thanks.

V S Rawat

This is my entire script

editor.beginUndoAction()
#coding=utf8

This replaces separate-nuqta + letter with nuqta-containing letters

editor.replace(u"क़", u"क़")
editor.replace(u"ख़", u"ख़")
editor.replace(u"ग़", u"ग़")
editor.replace(u"ज़", u"ज़")
editor.replace(u"ड़", u"ड़")
editor.replace(u"ढ़", u"ढ़")
editor.replace(u"फ़", u"फ़")

#removes ZERO WIDTH SPACE
editor.replace(unichr(8203),“”)
#removes ZERO WIDTH NON JOINER
editor.replace(unichr(8204),“”)
#removes ZERO WIDTH JOINER
editor.replace(unichr(8205),“”)

#trim leading trailing space
notepad.menuCommand(42043)
#remove empty lines (containing blank characters)
notepad.menuCommand(42056)
editor.endUndoAction()
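The two arguments in each replace call above look identical on screen, so here is a rough sketch of the same clean-up in plain Python with the code points spelled out. It assumes, going by the comment in the script, that the first argument is "base letter + nuqta (U+093C)" and the second is the single precomposed letter (U+0958 to U+095E). The names `PAIRS`, `ZERO_WIDTH`, and `clean` are made up for this illustration and are not part of PythonScript.

```python
# -*- coding: utf-8 -*-
import unicodedata

# Precomposed nuqta letters for the bases used in the script above
# (assumption: qa, khha, ghha, za, dddha, rha, fa).
PRECOMPOSED = [u"\u0958", u"\u0959", u"\u095a", u"\u095b",
               u"\u095c", u"\u095d", u"\u095e"]

# Map "base letter + nuqta" to the precomposed letter.
# NFD splits each precomposed letter into base + U+093C.
PAIRS = {unicodedata.normalize("NFD", ch): ch for ch in PRECOMPOSED}

# unichr(8203), unichr(8204), unichr(8205) from the script above.
ZERO_WIDTH = [u"\u200b", u"\u200c", u"\u200d"]


def clean(text):
    """Compose nuqta letters and drop zero-width characters."""
    for decomposed, composed in PAIRS.items():
        text = text.replace(decomposed, composed)
    for ch in ZERO_WIDTH:
        text = text.replace(ch, u"")
    return text
```

Because every non-ASCII character is written as a `\u` escape, this version has no raw non-ASCII bytes in the source file.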

you may need to use unicode strings notation (u’☺’ )

I am using double quotes, not single. like editor.replace(u"क़", u"क़")

this script was working ever since in npp 7.6.1 32-bit I guess,
then I switched to npp 7.6.2 64 bit and I noticed that this unicode chars replacements have stopped working.

thanks.

V S Rawat

it doesn’t even give a notice in the npp window while running. It only shows on the console, which is not always open. So I had processed several files wrongly while this was not working; then I noticed the problem in one file and checked the console to find the message.

If script has some problem or error, is there any method to stop script right there, instead of it going ahead and skipping processing of the incorrect part, without the user getting to know?

Thanks.

V S Rawat

#coding=utf8 is not working

coding=utf8 is not working

#encoding=utf8 is not working

encoding=utf8 is not working

you had mentioned (u’☺’ ) with single quote so I tried that also,
but single quote as well as double quotes are not working.

thanks.

V S Rawat

searching on net, I found https://stackoverflow.com/questions/18078851/syntaxerror-of-non-ascii-character
so I used

-- coding: utf-8 --

but this is also not working.
thanks.

Alan Kilborn

@V-S-Rawat said:

If script has some problem or error, is there any method to stop script right there, instead of it going ahead and skipping processing of the incorrect part, without the user getting to know?

If there is a Python-level error, scripts stop dead and the reason is reported in the Pythonscript console. Of course, to see it you have to have the console opened. It would be nice, and I think it has been requested in the past, if in such a case the console would be opened (if not open) or made-visible to the user (if not the active tab on a multitabbed docked window) when such a thing occurs.
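One possible workaround for run-time errors, as a rough sketch only: wrap the work in a function and hand any traceback to a reporting callback. The names `run_guarded`, `failing_step`, and `messages` are hypothetical. Inside PythonScript, the callback could be something that calls `console.show()` and then writes the text to the console, but that part is an assumption and is not shown or tested here.

```python
import traceback


def run_guarded(action, report):
    """Run action(); on any exception, pass the traceback text to report()."""
    try:
        action()
    except Exception:
        report(traceback.format_exc())
        return False
    return True


def failing_step():
    # Stand-in for a script body that hits a run-time error.
    raise ValueError("replacement failed")


messages = []
ok = run_guarded(failing_step, messages.append)
print(ok)
print(messages[0])
```

Note that this only helps with errors raised while the script is running. A SyntaxError like the one in this thread is raised when the file is compiled, before any `try` block in that same file can run, so it would still only show up in the console.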

BTW, please put your code in proper markdown form. Those huge lines were really jarring.

or should I say jarring?

PeterJones

@V-S-Rawat,

Please, if you want our help, format your posts so that code looks like code, rather than getting interpreted by the forum. As we’ve already pointed out, the help can be found by clicking that ? in the COMPOSE window, or by following the link I posted above which gives an excellent summary of how to use markdown in the forum.

I am using double quotes, not single. like editor.replace(u"क़", u"क़")

Python does not distinguish between single and double quotes, unlike some other languages, so that’s irrelevant (as you discovered later).

this script was working ever since in npp 7.6.1 32-bit I guess,
then I switched to npp 7.6.2 64 bit and I noticed that this unicode chars replacements have stopped working.

Ah, this is useful information: you not only changed between 32-bit and 64-bit, you also changed version. This is likely the culprit. I will have to find time to download portable editions of those, install pythonscript, and see if I can reproduce your problem.

In the meantime, your example script (even if the source wasn’t clobbered by the forum) is way longer than it needs to be in order to debug the problem. The issue at hand is only trying to set the encoding, so we just need a minimal script that shows the issue.

In 7.5.8 32-bit, this exact text (copy/paste from the box into a new pythonscript file, then run the python script), will run successfully:

# encoding=utf-8
from Npp import console
console.show()
console.clear()
console.write( u'SMILE: ☺\n' )
console.write( u"क़" )

Where the output is

SMILE: ☺
क़

(That was an even-more-simplified version of the script that I had shown earlier, which removes the namespace-protecting function names – but then I used your double-quoted u-string)

And this version of the script (the exact text shown) will give the error

# encoding =utf-8
from Npp import console
console.show()
console.clear()
console.write( u'SMILE: ☺\n' )
console.write( u"क़" )

where the error I see is

File "C:\Users\peter.jones\AppData\Roaming\Notepad++\plugins\Config\PythonScript\scripts\NppForumPythonScripts\16899-encoding-sscce.py", line 5
SyntaxError: Non-ASCII character '\xe2' in file C:\Users\peter.jones\AppData\Roaming\Notepad++\plugins\Config\PythonScript\scripts\NppForumPythonScripts\16899-encoding-sscce.py on line 5, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

Please try those two exact scripts in your installation(s) of Notepad++ and describe your results.

PeterJones

@V-S-Rawat,

I downloaded portable editions of 7.6.1-32, 7.6.1-64, 7.6.2-32, and 7.6.2-64. I manually installed PythonScript 1.3.0.0 into all four portable installations.

I ran the two scripts I just showed in all four instances. In all four, the version with # encoding=utf-8 worked, and the version with # encoding =utf-8 failed.

In 7.6.2-64, I then edited the two scripts to use coding instead of encoding:

# coding =utf-8
from Npp import console
console.show()
console.clear()
console.write( u'SMILE: ☺\n' )
console.write( u"क़" )

This version failed with the same error.

And the “correct” version:

# coding=utf-8
from Npp import console
console.show()
console.clear()
console.write( u'SMILE: ☺\n' )
console.write( u"क़" )

passed, as it did with encoding.

Using the encoding lines that are coming through your forum-markdown badly formatted, I cannot reproduce your problem. The only ways I can reproduce your error message are to put a space between coding and =, or by not having the encoding line.

It is not a problem with PythonScript 1.3.0.0. It is not a problem with my portable versions of 7.6.2 for either 32bit or 64bit.

Either the text you are quoting is getting mangled – in which case, you will have to correctly use markdown to avoid it getting mangled – or you are doing something else wrong, or there is something else unique about your setup that I cannot reproduce in my portable setup.

-----
Complete ? > Debug Info for 7.6.2 64bit

Notepad++ v7.6.2   (64-bit)
Build time : Jan  1 2019 - 00:02:38
Path : C:\usr\local\apps\npp64.7.6.2\notepad++.exe
Admin mode : OFF
Local Conf mode : ON
OS : Windows 10 (64-bit)
Plugins : DSpellCheck.dll mimeTools.dll NppConverter.dll PythonScript.dll

PeterJones

I see that you were also trying without the hyphen in utf8 and with a colon instead of an equal:

# coding:utf8
from Npp import console
console.show()
console.clear()
console.write( u'SMILE: ☺\n' )
console.write( u"क़" )

that one with # coding:utf8 (no space before the colon) passed

# coding :utf8
from Npp import console
console.show()
console.clear()
console.write( u'SMILE: ☺\n' )
console.write( u"क़" )

The one with # coding :utf8 (with the space before the colon) failed.

Try again with no space between # and coding, and a space after the colon:

#coding : utf8
from Npp import console
console.show()
console.clear()
console.write( u'SMILE: ☺\n' )
console.write( u"क़" )

with space colon space, it fails

#coding: utf8
from Npp import console
console.show()
console.clear()
console.write( u'SMILE: ☺\n' )
console.write( u"क़" )

With nospace colon space, it passes.

I cannot get the error with the lines you say you are trying. Sorry.

[these four attempts were still with 7.6.2 64bit portable, as above]

V S Rawat

your last script gave this output on console.
SMILE: ☺
क़

it is correct. so it seems that encoding is not the problem.

thanks for putting in so much time and effort.

V S Rawat

it worked.

I had put
editor.beginUndoAction()
#coding:utf8

in my file. meaning coding was not on the first line.

now I put coding in the first line.

#coding:utf8
editor.beginUndoAction()

the error stopped and it did the required change in my text file.

I can still say that the previous version was working ever since, but stopped working after I switched to 64 bit and new version.

maybe that switch changed some python version or something, and that had been causing the error.

Thanks a lot for guiding me step by step to solution.

PeterJones

I’m glad you found the problem.
Per PEP 263, “To define a source code encoding, a magic comment must be placed into the source files either as first or second line” (emphasis added). PEP 263 dates from 2001 and has applied since Python 2.3 (released in 2003), so it wasn’t a recent change in the Python library. (Besides, since I started using PythonScript a few years ago, they haven’t changed from Python 2.7.) I am not sure how it ever would have worked on the third line for you. But the important thing is that you now know it needs to go on the first or second line of your file.
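For anyone who wants to check a script before running it, here is a minimal sketch of the first-or-second-line rule, using the PEP 263 regex quoted earlier in this thread. One extra rule in it is an assumption: the second line is only consulted when the first line is blank or a comment. That is how Python 3's `tokenize.detect_encoding` behaves; whether the Python 2.7 build inside PythonScript does the same is not confirmed in this thread, although it would be consistent with the failure reported when `editor.beginUndoAction()` came first. The function name `declared_encoding` is hypothetical.

```python
import re

# PEP 263 pattern for the magic comment.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")
# A line holding only whitespace and/or a comment.
BLANK_OR_COMMENT_RE = re.compile(r"^[ \t\f]*(?:#.*)?$")


def declared_encoding(source):
    """Return the encoding declared in the first two lines, or None."""
    for line in source.split("\n")[:2]:
        line = line.rstrip("\r")  # tolerate Windows line endings
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
        if not BLANK_OR_COMMENT_RE.match(line):
            # Assumption: code on the first line ends the search.
            return None
    return None


print(declared_encoding("#coding:utf8\neditor.beginUndoAction()\n"))
print(declared_encoding("editor.beginUndoAction()\n#coding:utf8\n"))
```

The first call finds `utf8`; the second returns `None` under the assumption stated above.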

Alan Kilborn

@V-S-Rawat said:

This is my entire script

editor.beginUndoAction()
#coding=utf8

The real problem likely could have been found in like 3 seconds if you had ever learned how to present code via correct markdown on this forum.

Scrolling back quickly thru all of the postings shows that only Peter’s code replies use the black-box markdown.