How to adjust encoding detection algorithm: ANSI vs UTF-8 w/o BOM?



  • Assume I have a text file which contains (beginning at line 23) German umlauts, e.g. the
    hexadecimal bytes 0xC3 0xB6 for “ö”.

    When I load this text file into NP++, garbage characters are always shown at this character position and NOT the umlauts. At the left side of the status bar, “ANSI” is shown.

    However, when I delete some lines at the top, then save and reload the file into NP++, the umlauts are displayed correctly.

    The status bar now shows “UTF-8 w/o BOM”.

    So my guess is that NP++ inspects only the first N bytes (e.g. 200) to find out whether the text is ANSI or UTF-8 w/o BOM.

    Can someone confirm this?

    Is there a way to increase this number (or even to tell NP++ to check the WHOLE text file before deciding what the encoding should be)?

    Peter

