Encoding
-
Hello,
Since version 7.6 of Notepad++ there seems to be a problem with the detection of file encodings!
Example: Notepad++ tells me that this file (revision 113136) https://zone.spip.net/trac/spip-zone/browser/spip-zone/_plugins/mosaique/trunk/paquet.xml uses CRLF line endings and the Windows-1258 encoding,
whereas Geany reports CRLF but UTF-8, as did the versions of Notepad++ before 7.6.
My PC runs Windows 10 (1803) in French.
Franck -
https://zone.spip.net/trac/spip-zone/browser/spip-zone/plugins/mosaique/trunk/paquet.xml
You need to add an underscore (_) before and after the word plugins in the link; otherwise the link does not work -
Interestingly, a commit was made with 7.6 which should improve the char detection area.
Not sure if this commit has side effects on the other side.
Did you try to disable the auto detection at all? (Settings->Preferences->MISC.->Autodetect character encoding) -
i can confirm the wrong detection as windows-1258 (vietnamese) instead of utf-8 in notepad++ 7.6.2
it seems to be triggered by the ï in Mosaïque
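To illustrate why the ï can be ambiguous, here is a minimal Python sketch (not Notepad++'s detection code, only an illustration) showing that the UTF-8 bytes of "Mosaïque" also decode without error under Windows-1258, exposed in Python as the `cp1258` codec. The variable names are made up for the example.

```python
# Illustration only: the same bytes read under two different encodings.
text = "Mosaïque"
raw = text.encode("utf-8")          # ï becomes the two bytes C3 AF

as_utf8 = raw.decode("utf-8")       # round-trips to the original text
as_cp1258 = raw.decode("cp1258")    # also succeeds, but ï turns into two wrong characters

print(raw.hex(" "))                 # the raw bytes as hex pairs (Python 3.8+)
print(as_utf8)
print(as_cp1258)
```

Since both decodings succeed, a detector that relies on statistics rather than on strict validation has to guess, and a very short text gives it little to work with.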
here are the direct links to your paquet.xml if any one else likes to test it:
-
Thanks for your help!!! :-)
Yes, the problem comes from the auto-detection (Settings > Preferences > MISC.)
When "Autodetect character encoding" is checked, the problem occurs!
Another example:
The paquet.xml file at revision 111406 from https://zone.spip.net/trac/spip-zone/browser/spip-zone/plugins/reservation_communication/trunk/paquet.xml?rev=111406
Geany reports UTF-8!
Notepad++ 7.2 with auto-detection checked reports Windows-1258
Notepad++ 7.2 with auto-detection unchecked reports UTF-8
https://zone.spip.net/trac/spip-zone/export/111406/spip-zone/plugins/reservation_communication/trunk/paquet.xml
So this is the same problem as in this topic :-(
https://notepad-plus-plus.org/community/topic/16828/encoding
Franck -
Encoding is a difficult area, there is not really a safe way to ensure
that the correct encoding is detected always.
Personally, I don’t use the automatic encoding at all. -
i can confirm all
So this is the same problem as in this topic :-(
https://notepad-plus-plus.org/community/topic/16828/encoding
yes, but thanks to you, the devs have real life example files now. 👍
the topic you mentioned did not provide us with any of the file(s) i asked for that could be tested.
i’ve opened a new issue #5202 at github:
Auto Detect UTF-8 Encoding for French is broken in Notepad++ 7.6.x -
If needed, I can provide more examples; I have had the problem on about 15 to 20 files!
-
Hello, @meta-chuh, @franckybleu, @eko-palypse and All,
@meta-chuh, I read your issue #5202 at: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/5202
And I agree that, for instance, the phrase “Cette mosaïque est jolie” (in English: This mosaic is pretty nice), in a new UTF-8 encoded file, is wrongly detected as the Windows-1258 encoding :-((
Let me add 3 remarks:
- If we simply change this phrase to “Cette mosaïque était jolie” (This mosaic was pretty nice), the UTF-8 encoding is, this time, preserved! Quite logical, as the auto-detection needs some “material” in order to work correctly!
- You may, of course, just disable the auto-detection of encodings, in Settings > Preferences... > MISC.
- You may also convert your file to the UTF-8-BOM encoding (Encoding > Convert to UTF-8 BOM) before saving it, and you will no longer have any encoding problem for that file, thanks to the 3 invisible bytes of the BOM ;-))
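As a small illustration of the BOM remark, here is a Python sketch (an example only, unrelated to how Notepad++ writes files) showing the three bytes that the `utf-8-sig` codec puts in front of the otherwise identical UTF-8 content. The variable names are chosen for the example.

```python
import codecs

text = "Cette mosaïque est jolie"

plain = text.encode("utf-8")         # UTF-8 without BOM
with_bom = text.encode("utf-8-sig")  # same bytes, prefixed with the BOM

print(codecs.BOM_UTF8)                        # the BOM itself
print(len(with_bom) - len(plain))             # size difference in bytes
print(with_bom.startswith(codecs.BOM_UTF8))   # what a reader can check first
print(with_bom.decode("utf-8-sig") == text)   # the BOM is dropped on decoding
```

A reader that sees these leading bytes does not need to guess the encoding from the text itself.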
Best Regards
guy038
-
-
there’s more than that broken with the current auto detection !
same vietnamese detection happens if you save the word “Réservation” to a new utf-8 file and reopen,
so only the combination of ï and é in the same document is detected correctly, é only will detect it as vietnamese
this did not happen prior to this commit that @Eko-palypse mentioned.
You may, also, convert your file to the UTF-8-BOM encoding
this is not possible in many cases, because those files like in this case are in an open source repository.
same with spanish, except if i have an ñ in my documents, so i didn’t notice it before, as most of them have an ñ.
german characters work ok so far.
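For files that cannot be converted because they live in a shared repository, one option is to check them from outside the editor. The following is a minimal sketch, assuming strict UTF-8 decoding is an acceptable test; `looks_like_utf8` is a hypothetical helper, not a Notepad++ or library function, and this is not how Notepad++ detects encodings.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Sample byte strings built in memory; real use would read a file in binary mode.
samples = {
    "Réservation saved as UTF-8": "Réservation".encode("utf-8"),
    "Réservation saved as Windows-1252": "Réservation".encode("cp1252"),
}

for label, data in samples.items():
    print(label, looks_like_utf8(data))
```

Strict validation rejects the single-byte é because the byte that follows it is not a UTF-8 continuation byte, so even a one-word file can be told apart this way. Pure ASCII files also pass the check, which is harmless since ASCII is a subset of UTF-8.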