How to adjust encoding detection algorithm: ANSI vs UTF-8 w/o BOM?
-
Assume I have a text file which contains (beginning at line 23) German umlauts encoded as UTF-8, e.g. the
hexadecimal bytes xC3 xB6 for “ö”. When I load this text file into NP++, garbage characters are always shown at these positions and NOT the umlauts. At the left side of the status bar, “ANSI” is shown.
However, when I delete some lines at the top, save, and reload the file into NP++, the umlauts are displayed correctly.
The status bar now shows UTF-8 w/o BOM.
So my guess is that NP++ examines only the first N bytes (e.g. 200) to decide whether the text is ANSI or UTF-8 w/o BOM.
Can someone confirm this?
Is there a way to increase this number (or even to tell NP++ to check the WHOLE text file before deciding what the encoding should be)?
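
To illustrate what I suspect is happening, here is a minimal Python sketch. It is NOT the actual NP++ code; the function names, the "sample_size" parameter and the 200-byte limit are only my assumptions for demonstration. A detector that looks only at the first bytes sees pure ASCII and reports ANSI, while a detector that reads the whole file finds the valid UTF-8 sequence for “ö”.

```python
def looks_like_utf8(data: bytes) -> bool:
    """True if data is valid UTF-8 AND contains at least one non-ASCII byte."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # Pure ASCII gives no evidence for UTF-8, so treat it as "not UTF-8".
    return any(b >= 0x80 for b in data)


def guess_encoding(data: bytes, sample_size=None) -> str:
    """Hypothetical detector: inspect only the first sample_size bytes,
    or the whole buffer when sample_size is None."""
    sample = data if sample_size is None else data[:sample_size]
    return "UTF-8 w/o BOM" if looks_like_utf8(sample) else "ANSI"


# 22 plain ASCII lines (374 bytes), then an umlaut on line 23.
text = ("plain ascii line\n" * 22 + "sch\u00f6n\n").encode("utf-8")

print(guess_encoding(text, sample_size=200))  # limited look-ahead
print(guess_encoding(text))                   # whole file
```

With the limited look-ahead the sketch answers "ANSI", and with the whole file it answers "UTF-8 w/o BOM", which matches what I see in the status bar before and after deleting the top lines. Note that cutting the sample at a fixed byte count can also split a multi-byte character, which this simple sketch would then wrongly count as invalid UTF-8.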
Peter