
    How to adjust encoding detection algorithm: ANSI vs UTF-8 w/o BOM?

    Peter Ssteiner
      last edited by

      Assume I have a text file which contains (beginning at line 23) German umlauts, e.g.
      the hexadecimal bytes 0xC3 0xB6 for “ö”.
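The two readings of those bytes can be checked directly. A quick Python illustration (independent of Notepad++ itself): decoded as UTF-8 they form one character, while decoded as Windows-1252 (a common "ANSI" code page; assumed here for illustration) they fall apart into two unrelated characters.

```python
data = b"\xC3\xB6"

# UTF-8: the two bytes are one multi-byte sequence for "ö".
print(data.decode("utf-8"))   # ö

# Windows-1252 ("ANSI"): each byte is its own character.
print(data.decode("cp1252"))  # Ã¶  (the garbage characters described below)
```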

      When I load this text file into NP++, garbage characters are always shown at that position instead of the umlauts. On the left side of the status bar, “ANSI” is shown.

      However, when I delete some lines at the top, then save and reload the file into NP++, the umlauts are displayed correctly.

      The status bar now shows “UTF-8 w/o BOM”.

      So my guess is that NP++ inspects only the first e.g. 200 bytes to decide whether the text is ANSI or UTF-8 w/o BOM.

      Can someone confirm this?

      Is there a way to increase this number (or even to tell NP++ to check the WHOLE text file before deciding what the encoding should be)?
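The behaviour I am seeing would be explained by a prefix-limited heuristic along these lines. This is only a sketch of my guess, not Notepad++'s actual code, and the 200-byte window is my assumption, not a documented value:

```python
def guess_encoding(data: bytes, limit: int = 200) -> str:
    """Sketch of a prefix-limited detection heuristic (assumed, not NP++'s real one).

    Only the first `limit` bytes are examined. A pure-ASCII prefix gives no
    evidence of UTF-8, so the file falls back to ANSI even if valid UTF-8
    multi-byte sequences appear later in the file.
    """
    prefix = data[:limit]
    if all(b < 0x80 for b in prefix):
        return "ANSI"  # nothing non-ASCII in the window -> default
    try:
        prefix.decode("utf-8")     # valid multi-byte sequences present
        return "UTF-8 w/o BOM"
    except UnicodeDecodeError:
        return "ANSI"

# An "ö" beyond the window is invisible to the heuristic:
print(guess_encoding(b"A" * 300 + "\u00f6".encode("utf-8")))  # ANSI
# The same "ö" inside the window flips the result:
print(guess_encoding("\u00f6".encode("utf-8") + b"A" * 300))  # UTF-8 w/o BOM
```

Deleting lines at the top moves the umlauts into the window, which would match the reload behaviour described above.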

      Peter

      The Community of users of the Notepad++ text editor.