How to adjust encoding detection algorithm: ANSI vs UTF-8 w/o BOM?
-
Assume I have a text file which contains (beginning at line 23) German umlauts, e.g. the hexadecimal bytes 0xC3 0xB6 for "ö". When I load this text file into NP++, garbage characters are shown at that position instead of the umlauts. The left side of the status bar shows "ANSI".
However, when I delete some lines at the top, then save and reload the file into NP++, the umlauts are displayed correctly.
The status bar now shows "UTF-8 w/o BOM".
So my guess is that NP++ inspects only the first (e.g. 200) bytes to decide whether the text is ANSI or UTF-8 w/o BOM.
Can someone confirm this?
Is there a way to increase this number (or even to tell NP++ to check the WHOLE text file before deciding what the encoding should be)?
Peter
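The suspected behaviour can be illustrated with a minimal sketch. This is NOT Notepad++'s actual detection code; the function names (`looks_like_utf8`, `detect`) and the 200-byte window are assumptions chosen to mirror the symptom described above: a prefix-only scan that sees nothing but ASCII falls back to "ANSI", while scanning the whole file would find the valid UTF-8 sequence.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if `data` decodes as valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def detect(data: bytes, window=None) -> str:
    """Hypothetical detector: classify as UTF-8 w/o BOM only if the
    scanned window contains a non-ASCII byte AND decodes as UTF-8;
    otherwise fall back to ANSI. `window=None` scans the whole file."""
    sample = data if window is None else data[:window]
    if any(b >= 0x80 for b in sample) and looks_like_utf8(sample):
        return "UTF-8 w/o BOM"
    return "ANSI"

# A file whose first umlaut ("ö" = 0xC3 0xB6) appears after byte 200:
content = b"x" * 300 + "ö".encode("utf-8") + b"\n"

print(detect(content, window=200))  # prefix is pure ASCII -> "ANSI"
print(detect(content))              # whole-file scan -> "UTF-8 w/o BOM"
```

Deleting lines at the top of the real file shifts the 0xC3 0xB6 bytes into the scanned prefix, which would explain why the file is then recognized as UTF-8 w/o BOM after a save and reload.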