    How to adjust encoding detection algorithm: ANSI vs UTF-8 w/o BOM?

      Peter Ssteiner

      Assume I have a text file which contains (beginning at line 23) German umlauts, encoded as UTF-8,
      for example the bytes 0xC3 0xB6 for “ö”.

      When I load this text file into Notepad++, garbage characters are always shown at these positions and NOT the umlauts. The left side of the status bar shows “ANSI”.

      However, when I delete some lines at the top, save, and reload the file into Notepad++, the umlauts are displayed correctly.

      The status bar now shows “UTF-8 w/o BOM”.

      So my guess is that Notepad++ inspects only the first part of the file (e.g. the first 200 bytes) to decide whether the text is ANSI or UTF-8 w/o BOM.

      Can someone confirm this?

      Is there a way to increase this number (or even to tell Notepad++ to check the WHOLE text file before deciding what the encoding should be)?
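
To make the guess concrete, here is a minimal Python sketch of the behaviour I suspect. It is NOT Notepad++'s actual code: the function name `guess_encoding`, the `limit` parameter and the 200-byte figure are my own assumptions, chosen only to illustrate how a detector that looks at a prefix can miss umlauts further down the file.

```python
import codecs


def guess_encoding(data: bytes, limit=None) -> str:
    """Hypothetical detector: inspect the first `limit` bytes (or all if None)."""
    sample = data if limit is None else data[:limit]

    # Pure ASCII gives no evidence for UTF-8, so fall back to "ANSI".
    if sample.isascii():
        return "ANSI"

    # If the sample was cut short, a multi-byte sequence may be split at the
    # end; final=False tells the decoder to tolerate that.
    whole_file = limit is None or limit >= len(data)
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=whole_file)
    except UnicodeDecodeError:
        return "ANSI"  # non-ASCII bytes that are not valid UTF-8
    return "UTF-8 w/o BOM"


# 30 plain ASCII lines (510 bytes), then one line with an umlaut.
text = ("plain ascii line\n" * 30 + "schön\n").encode("utf-8")

print(guess_encoding(text, limit=200))  # only the ASCII prefix is inspected
print(guess_encoding(text))             # the whole file is inspected
```

With the prefix limit the sample is pure ASCII, so the sketch reports ANSI. Checking the whole file finds the 0xC3 0xB6 sequence and reports UTF-8 w/o BOM, which matches what I see after deleting lines at the top.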

      Peter
