bug in encoding (greek) in 7.6.3

  • I tried to save an xml which includes greek characters,
    it saves it with unrecognized characters,
    this issue doesn’t exist in 7.6.2

  • hi @patrickdrd

    unfortunately the current auto detection for character encodings is still broken in notepad++.
    maybe that’s also the reason why your xml got saved with wrong characters, if you have it enabled.

    for example, if your xml was utf-8 in its original form and notepad++ detected it as another (wrong) encoding (example: encoding > character sets > vietnamese > windows-1258, as happens many times for french and spanish),
    if you now edit such a file and save it, the characters can get messed up.

    currently, if you have it enabled, it’s recommended to disable “autodetect character encoding” in settings > preferences > misc, as seen in the screenshot below.
    then restart notepad++ and retry with your xml file (i hope you still have an old backup of it, with correct greek characters).

    [screenshot: settings - auto detect character encoding]

    hope it helps you a bit.
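
To illustrate the mix-up described above, here is a minimal Python sketch. It assumes the file was saved as utf-8 and then wrongly read as windows-1258 (Python's `cp1258` codec). The sample text `αβγ` is made up for the example, and this is only a model of the effect, not how notepad++ itself is implemented.

```python
# Sample Greek text (an assumption for illustration, not from the original xml).
original = "αβγ"

# What is really on disk: the UTF-8 bytes.
raw = original.encode("utf-8")

# What an editor shows if it wrongly guesses windows-1258.
mangled = raw.decode("cp1258")

# Saving that view back as UTF-8 writes the mangled characters for real.
resaved = mangled.encode("utf-8")

print(repr(original), "->", repr(mangled))
print(len(raw), "bytes before,", len(resaved), "bytes after the bad save")

# As long as no byte was dropped or replaced, reversing the steps repairs it.
repaired = resaved.decode("utf-8").encode("cp1258").decode("utf-8")
print(repaired == original)
```

This is also why a backup matters: the repair only works if every byte survived the wrong save.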

  • I’ve got that one disabled a long time ago (it was suggested here to me)

    as I said and I’ll repeat myself, the bug is definitely in 7.6.3,
    I replaced the executable and it worked fine, just fine

  • @patrickdrd

    I’ve got that one disabled a long time ago (it was suggested here to me)

    yes, my apologies, i forgot.

    as I said and I’ll repeat myself, the bug is definitely in 7.6.3,
    I replaced the executable and it worked fine, just fine

    i will test that too, both 7.6.2 and 7.6.3 exe, with my files and hope it did not get worse in 7.6.3, or at least that we find a workaround.
    good idea to use the old exe if it works for you.

  • strange, I’m trying to reproduce it and it works now; something happened though that broke my greek characters when I copied the content between the browser and npp

  • encoding is a mess anyway,
    I tried to view that same file on my mobile and
    while my default encoding is greek-iso,
    I’ve had to switch to Unicode in order for these (greek) characters to be recognized,
    it’s very awkward having to switch from greek-iso to utf-8 and vice versa to read a file
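
The reason one file needs "Unicode" and another needs greek-iso is that the same letters are stored as different bytes. A small Python sketch of one common fallback strategy, assuming the only two candidates are utf-8 and greek-iso (ISO-8859-7); the function name `decode_greek` is hypothetical, and this is not what any particular app does:

```python
from typing import Tuple


def decode_greek(raw: bytes) -> Tuple[str, str]:
    """Try strict UTF-8 first, then fall back to ISO-8859-7 (greek-iso)."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Greek text stored as ISO-8859-7 is almost never valid UTF-8,
        # so a strict UTF-8 failure is a reasonable hint to fall back.
        return raw.decode("iso8859_7"), "iso8859_7"


word = "ελληνικά"
as_utf8 = word.encode("utf-8")       # two bytes per Greek letter
as_iso = word.encode("iso8859_7")    # one byte per Greek letter

print(len(as_utf8), len(as_iso))
print(decode_greek(as_utf8))
print(decode_greek(as_iso))
```

The fallback order matters: ISO-8859-7 accepts almost any byte sequence, so trying it first would happily "decode" utf-8 files into garbage.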

  • and that android app has auto-detection too and it doesn’t work properly either

  • @patrickdrd

    is it an xml file you can share, or does it contain private data ?
    can you find out which greek letters will trigger this, by making an empty xml with just some greek words ?

    in french it is triggered, for example, by a single word Mosaïque because of the ï.
    @guy038 also found out that if you combine the two words mosaïque était in a new file it will work correctly, but était alone in a file will not work.
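
A quick way to build such minimal test files is sketched below in Python. The word list, the `<test>` element, and the `repro_N.xml` file names are all assumptions for illustration; each file is written as utf-8 without a bom so it can be opened in notepad++ one at a time.

```python
import tempfile
from pathlib import Path

# Sample words (assumptions); swap in the Greek words from the real xml.
words = ["ελληνικά", "Mosaïque", "était"]


def write_repro_files(words, folder):
    """Write one tiny UTF-8 (no BOM) xml file per word and return the paths."""
    paths = []
    for i, word in enumerate(words, start=1):
        xml = f'<?xml version="1.0" encoding="UTF-8"?>\n<test>{word}</test>\n'
        path = Path(folder) / f"repro_{i}.xml"
        path.write_bytes(xml.encode("utf-8"))
        paths.append(path)
    return paths


out_dir = tempfile.mkdtemp(prefix="npp_repro_")
repro_paths = write_repro_files(words, out_dir)
for p in repro_paths:
    print(p)
```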

    if it’s easy to reproduce, you could file an issue at github: https://github.com/notepad-plus-plus/notepad-plus-plus/issues.
    (and then hope it gets looked at by the developers, due to over 2400 open issues at the moment)

    i for myself use utf-8 file encoding only, and convert all files to utf-8 if they are not.

    something that’s quick to try out if it works for you:
    a user told us, that he converts all problematic documents with encoding problems to utf-8-bom (utf-8 with a byte order mark header), because the bom header will explicitly state which format this file has, and utf-8-bom seems to be compatible with all his applications and web services.

    it would be interesting if your encoding issue gets better using bom, and what happens on android with a bom file.
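
For reference, the bom is just three bytes (EF BB BF) at the very start of the file. A minimal Python sketch of the difference, using Python's `utf-8-sig` codec; the sample xml string is made up for the example:

```python
import codecs

# Sample content (an assumption for illustration).
text = '<?xml version="1.0" encoding="UTF-8"?><word>ελληνικά</word>'

with_bom = text.encode("utf-8-sig")   # utf-8 with the BOM prepended
without_bom = text.encode("utf-8")    # plain utf-8

# The only difference is the 3-byte marker at the start.
print(with_bom[:3] == codecs.BOM_UTF8)
print(len(with_bom) - len(without_bom))

# "utf-8-sig" strips the BOM when reading and also accepts files without one.
print(with_bom.decode("utf-8-sig") == text)
print(without_bom.decode("utf-8-sig") == text)

# Decoding a BOM file as plain utf-8 leaves an invisible U+FEFF in front.
print(with_bom.decode("utf-8").startswith("\ufeff"))
```

The bom only helps a program guess the encoding when opening the file; it cannot protect text that is pasted through a browser or another app along the way.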

  • thanks for the suggestions,
    I’m having similar problems on android with an app I use to log my sms messages,
    some text is unicode and some is not I guess,
    because if I select unicode some part is recognized, but not all of it,
    and the same if I reverse it. I contacted the app developer and he said that fixing it is not one of his priorities right now, so there is an issue in general

    as for the xml file, the problem started from the fact that I couldn’t send the file as an email from my work desktop to my mobile (it was rejected by the exchange server, the only mail I have access to, and I still haven’t figured out why… anyway…), so I opened it, copied the content, pasted it on github, and then opened my mobile browser and got the text from github. somewhere in that whole process the greek characters broke

    I opened the file from the disk now and it is utf-8-bom, but it didn’t matter I guess
