ascii nfo sh problem dos2unix is required to fix the bash files!

Daniel B. 0

I am a big fan of Notepad++, but in the latest versions I have noticed that it has problems with ASCII files and sometimes also with *.sh scripts.

When I edit existing *.sh scripts with Notepad++, they are usually unusable afterwards. I then have to convert them back with dos2unix before they work again. That wasn’t the case before.
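For readers unfamiliar with dos2unix: it turns Windows CRLF line endings into Unix LF. A carriage return left at the end of the `#!` line is a common reason a shell script stops working. Below is a minimal Python sketch of that conversion. The function name `crlf_to_lf` is made up for illustration, and this is not how dos2unix itself is implemented.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF; lone CR or LF bytes are left alone."""
    return data.replace(b"\r\n", b"\n")


# A tiny script as it might look after being saved with Windows line endings.
script = b"#!/bin/bash\r\necho hello\r\n"
fixed = crlf_to_lf(script)
print(fixed)
```

Working on bytes rather than decoded text means the sketch does not care which character encoding the script uses.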

When I open and edit a *.nfo file and save it, it ends up with Chinese characters, completely destroying the layout. I have tested other programs, including UltraEdit and EmEditor, and this does not happen with them; it only happens with Notepad++. I hope that someone else has the same problem and can perhaps give me a workaround.

And if we’re being honest, there’s nothing better than Notepad++ on the freeware market.

Alan Kilborn

@Daniel-B-0 said:

problems with ASCII files

What exactly do you think is an ASCII file?

That wasn’t the case before.

What’s “before”?

I think you need to keep a close eye on the righthand side of the status bar when you load a file, in order to see what encoding Notepad++ is using. At the very least, you would then be armed with the knowledge to put in a posting about what is happening. Also, posting your Debug Info here is not a bad idea.

Daniel B. 0

I have now switched from the 32-bit version to the current 64-bit version. The file now opens, but it is still detected incorrectly as Unix (LF) EUC-KR. This time, though, I can change it to UTF-8 myself, and lo and behold, it looks fine again!

In the 32-bit version, it kept switching back to EUC-KR automatically as soon as I selected something else.

I need to check whether there are also fixed settings for file extensions such as nfo and sh that allow you to open them so that they always use UNIX LF and UTF8.

Thanks in advance for your quick response!

file -s test.nfo
test.h: ISO-8859 text, with very long lines (622)

The AutoCodepage plugin was the solution.

PeterJones

@Daniel-B-0 said in ascii nfo sh problem dos2unix is required to fix the bash files!:

When I open and edit a *.nfo file and save it, it ends up with Chinese characters

Notepad++ has tried to guess the encoding, and failed. Turn off Settings > MISC > Autodetect character encoding. (See also the Encoding Autodetection section of the User Manual)

But it’s weird, because if *.nfo is still properly associated with Language > M > MS-DOS Style (which is DOS Style in the Style Configurator), I thought it should always use CP437 for those files. Hmm, apparently that rule was changed at some point, because v8.8.9 does not do that. (Looking at the Change History, my guess is that it was v8.7.4 that changed that behavior.)

So I created a new file, set the encoding to Encoding > Character Sets > Western European > OEM-US (which is the CP437 box-drawing encoding), then pasted in the following:

┌────────────────────────────────────────┐
│  .NFO Style Box Drawing Example        │
├────────────────────────────────────────┤
│  ┌─ Single Line Box                    │
│  │                                     │
│  └─                                    │
│                                        │
│  ╔═ Double Line Box                  ═╗│
│  ║                                    ║│
│  ╚════════════════════════════════════╝│
│                                        │
│  Mixed lines and shading:              │
│  ░ Light shade                         │
│  ▒ Medium shade                        │
│  ▓ Dark shade                          │
│  █ Full block                          │
│                                        │
└────────────────────────────────────────┘

and saved it as something.nfo.
When I closed the file and re-opened it (with the default encoding autodetection on), it opened as EUC-KR. I closed it without saving. I then turned off the autodetection, and when I loaded the file again, it showed up as ANSI. Since my computer’s default “ANSI” codepage is 1252, the something.nfo file was interpreted as

ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³  .NFO Style Box Drawing Example        ³
ÃÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ´
³  ÚÄ Single Line Box                    ³
³  ³                                     ³
³  ÀÄ                                    ³
³                                        ³
³  ÉÍ Double Line Box                  Í»³
³  º                                    º³
³  ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ¼³
³                                        ³
³  Mixed lines and shading:              ³
³  ° Light shade                         ³
³  ± Medium shade                        ³
³  ² Dark shade                          ³
³  Û Full block                          ³
³                                        ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ

But at this point, I can just do Encoding > Character Sets > Western European > OEM-US and it will properly re-interpret those same bytes as the box-drawing characters.
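The point that nothing on disk changes, only the interpretation, can be reproduced outside the editor. This minimal Python sketch uses the standard `cp437`, `cp1252`, and `euc_kr` codecs; the variable names are just for illustration. The EUC-KR line is only there to show that the very same bytes can also be read as a double-byte East Asian character.

```python
# Top-left corner plus one horizontal line, as stored in a CP437 .nfo file.
raw = "┌─".encode("cp437")

as_oem = raw.decode("cp437")    # Encoding > Character Sets > ... > OEM-US
as_ansi = raw.decode("cp1252")  # Windows-1252, the "ANSI" codepage in the post
as_korean = raw.decode("euc_kr", errors="replace")  # what a wrong guess yields

print(raw)      # b'\xda\xc4'
print(as_oem)   # ┌─
print(as_ansi)  # ÚÄ
print(as_korean)
```

The two bytes 0xDA 0xC4 stay the same throughout; only the codepage used to read them differs.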

I need to check whether there are also fixed settings for file extensions such as nfo and sh that allow you to open them so that they always use UNIX LF and UTF8.

If you want UTF-8, it’s even easier(*): if I save my original copy/paste as UTF-8 from the start (or force OEM-US, then use Encoding > Convert to UTF-8 and save), then close and re-open the file, it opens as UTF-8, not “ANSI”.

*: my settings for New Document are Encoding = UTF-8 + Apply to ANSI (which is the default nowadays). And that, combined with turning off the encoding autodetection on the MISC settings page, is enough for me to reliably get files to be interpreted as UTF-8.
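For converting existing CP437 files in bulk, the same "interpret as OEM-US, then convert to UTF-8" step can be sketched in Python. The helper name `cp437_to_utf8` is hypothetical, and the sketch assumes the input really is CP437; it cannot verify that.

```python
def cp437_to_utf8(data: bytes) -> bytes:
    """Read bytes as CP437 (OEM-US) and re-encode the same characters as UTF-8."""
    return data.decode("cp437").encode("utf-8")


original = "╔═╗".encode("cp437")  # three single-byte box-drawing characters
converted = cp437_to_utf8(original)

print(len(original), len(converted))  # 3 9
print(converted.decode("utf-8"))      # ╔═╗
```

Each box-drawing character grows from one byte to three, so the converted file is larger, but it no longer depends on the reader guessing the right codepage.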

always use UNIX LF

Notepad++ will use the first newline sequence from your file to determine the line-ending mode. If your first line ends with just LF, then even if every other line in the file has CRLF, it will treat it as UNIX LF.
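The "first newline wins" rule described above can be illustrated with a short Python sketch. This only models the rule as stated in this post; it is not Notepad++'s actual code, and the function name `detect_eol_from_first_newline` is invented for the example.

```python
def detect_eol_from_first_newline(data: bytes) -> str:
    """Return 'CRLF', 'LF', 'CR', or 'none' based only on the first line break."""
    for i, byte in enumerate(data):
        if byte == 0x0D:  # carriage return
            # CR immediately followed by LF is a Windows line ending.
            return "CRLF" if data[i + 1 : i + 2] == b"\n" else "CR"
        if byte == 0x0A:  # line feed
            return "LF"
    return "none"


mixed = b"first line\nsecond line\r\nthird line\r\n"
print(detect_eol_from_first_newline(mixed))  # LF
```

Under this rule, a file with mixed line endings is classified by its first line alone, so checking that first line is a quick way to predict which mode the status bar will show.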

update: you added a new line to your post while I was typing up my reply:

The AutoCodepage plugin was the solution.

Yep, I thought it would be. Good job figuring it out while I was still writing up my reply.

PeterJones

@PeterJones said in ascii nfo sh problem dos2unix is required to fix the bash files!:

my guess is that it was v8.7.4 that changed that behavior.

Yep. If I open cp437.nfo in a fresh (default options) v8.7.3, it automatically opens as OEM-US = CP437, and it looks right; if I open it in a fresh v8.7.4, it autodetects as EUC-KR. Similarly, v8.7.4-and-newer will interpret utf8.nfo as UTF-8 bytes, so it looks right… but if I open utf8.nfo in v8.7.3, it will open it as CP437, so it looks like junk.
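Outside the editor, one simple way to tell two such files apart is to try a strict UTF-8 decode and fall back to CP437 when it fails. The Python sketch below is an assumption-laden heuristic, not what any Notepad++ version does: the function name `guess_nfo_encoding` is hypothetical, and a CP437 file could in principle happen to be valid UTF-8.

```python
def guess_nfo_encoding(data: bytes) -> str:
    """Guess 'utf-8' if the bytes decode strictly as UTF-8, otherwise 'cp437'."""
    try:
        data.decode("utf-8")  # strict by default: raises on invalid sequences
    except UnicodeDecodeError:
        return "cp437"
    return "utf-8"


box = "┌─┐"
print(guess_nfo_encoding(box.encode("cp437")))  # cp437
print(guess_nfo_encoding(box.encode("utf-8")))  # utf-8
```

Pure-ASCII content is reported as UTF-8 here, which is harmless because ASCII bytes mean the same thing in both encodings.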

Daniel B. 0

Thank you for your explanation! I’m glad I wasn’t alone with this problem. It’s very well written and easy for me to understand. Thank you!