Debug info dialog, what does Current ANSI codepage mean?
-
In Notepad++, in the menu under
? -> Debug info... -> Copy debug info into clipboard
, what does the line with “Current ANSI codepage” mean?
As far as I can tell it’s not related to the current file, so is it a general environment or Windows setting? Is it possible to change this setting somehow?
The reason I ask is that someone reported that my CSV Lint plugin crashes on a certain file with diacritic characters in it, for example
noël
. I have the file but can’t reproduce the error; I do have the Debug info, though.
The only noticeable difference is
Current ANSI codepage : 65001
on the PC where it crashes, and
Current ANSI codepage : 1252
where the plugin doesn’t crash.
So on this machine it crashes on a certain file:
Notepad++ v8.2.1 (64-bit)
Build time : Jan 19 2022 - 18:43:05
Path : C:\Program Files\Notepad++\notepad++.exe
Command Line :
Admin mode : ON
Local Conf mode : OFF
Cloud Config : OFF
OS Name : Windows 11 (64-bit)
OS Version : 2009
OS Build : 22000.493
Current ANSI codepage : 65001
Plugins : CSVLint.dll mimeTools.dll NppConverter.dll NppExport.dll
And on this machine the plugin works correctly:
Notepad++ v8.2.1 (64-bit)
Build time : Jan 19 2022 - 18:43:05
Path : C:\Program Files\Notepad++\notepad++.exe
Command Line :
Admin mode : OFF
Local Conf mode : OFF
Cloud Config : OFF
OS Name : Windows 11 (64-bit)
OS Version : 2009
OS Build : 22000.493
Current ANSI codepage : 1252
Plugins : CSVLint.dll mimeTools.dll NppConverter.dll NppExport.dll
-
@bas-de-reuver said in Debug info dialog, what does Current ANSI codepage mean?:
so is it a general environment or Windows setting
I haven’t tracked it back in the N++ source, but would assume it leads to this:
https://docs.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-getacp
65001
That seems like UTF-8.
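For anyone who wants to check the value outside Notepad++, here is a minimal sketch that calls the Win32 GetACP function through Python's ctypes. It assumes, as guessed above, that the Debug Info line reflects GetACP; the helper names get_acp and codec_name and the small lookup table are made up for illustration.

```python
import ctypes
import sys

# A few well-known Windows code page numbers and the matching Python codec names.
# (Illustrative subset only, not a complete table.)
KNOWN_CODEPAGES = {1250: "cp1250", 1252: "cp1252", 65001: "utf-8"}


def get_acp():
    """Return the active ANSI code page reported by GetACP, or None off Windows."""
    if sys.platform != "win32":
        return None
    return ctypes.windll.kernel32.GetACP()


def codec_name(codepage):
    """Map a code page number to a Python codec name (falls back to 'cpNNNN')."""
    return KNOWN_CODEPAGES.get(codepage, f"cp{codepage}")


if __name__ == "__main__":
    acp = get_acp()
    if acp is None:
        print("Not running on Windows, GetACP is unavailable")
    else:
        print(f"Current ANSI codepage : {acp} ({codec_name(acp)})")
```

On the two machines from the first post this would presumably print 65001 and 1252 respectively, but that is an expectation, not something verified here.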
-
@alan-kilborn said in Debug info dialog, what does Current ANSI codepage mean?:
I haven’t tracked it back in the N++ source, but would assume it leads to this:
https://docs.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-getacp
That appears to be correct.
Cheers.
-
@bas-de-reuver said in Debug info dialog, what does Current ANSI codepage mean?:
what does the line with “Current ANSI codepage” mean?
For Notepad++, it means that when a file is opened with ANSI as the encoding (see the status bar), the codepage indicated in the Debug Info will be used to determine the meaning/display of bytes with a value above 127.
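To make the "bytes above 127" point concrete, here is a small sketch using the word noël from the first post. It only shows how the same bytes read differently under code page 1252 and UTF-8 (65001); it is not a claim about what the CSV Lint plugin actually does internally.

```python
text = "noël"

# The same word stored in the two encodings discussed in this thread.
utf8_bytes = text.encode("utf-8")      # ë becomes two bytes: 0xC3 0xAB
cp1252_bytes = text.encode("cp1252")   # ë becomes one byte: 0xEB

# UTF-8 bytes interpreted through code page 1252 give mojibake.
misread = utf8_bytes.decode("cp1252")

# Code page 1252 bytes interpreted as strict UTF-8 are invalid input.
try:
    cp1252_bytes.decode("utf-8")
    strict_utf8_ok = True
except UnicodeDecodeError:
    strict_utf8_ok = False

print(utf8_bytes, cp1252_bytes)
print(misread)
print("cp1252 bytes valid as UTF-8:", strict_utf8_ok)
```

The takeaway is that code which converts bytes using the system ANSI codepage can behave differently on a 65001 machine than on a 1252 machine, even for the same file.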
Is it possible to change this setting somehow?
It must be, but I don’t have expertise because I’ve never had to do it.
Obviously in your situation there’s a need…because you want to debug the problem you are seeing.
It would be interesting if you come back here and post your efforts as you work on and hopefully solve your problem.
-
The top two answers (for me) for https://www.google.com/search?q=windows+10+how+to+change+default+codepage show two different ways of doing it. The one at SU is one I had seen before, but has caveats in some of the responses to that “solution”; I have never tried the one on the other site, nor seen any comments as to its effectiveness.
- https://knowledgebase.progress.com/articles/Article/4677
- https://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8
If you find practical results from either of those methods, it would be good to share them here.
-
Hello, @bas-de-reuver, @peterjones, @alan-kilborn, @michael-vincent and All,
Globally, I would say that the ANSI codepage (or ACP) is the default encoding used by all the non-Unicode programs of your machine(s).
To know which is the exact Windows encoding hidden behind the ACP acronym, simply ask the registry:
reg query HKLM\SYSTEM\CurrentControlSet\Control\Nls\CodePage /v *CP /t reg_SZ
On my recent French Win 10 laptop, I get:
HKEY_LOCAL_MACHINE\SYSTEM\currentControlSet\Control\Nls\CodePage
    ACP    REG_SZ    1252
    OEMCP    REG_SZ    850
    MACCP    REG_SZ    10000
Fin de la recherche : 3 correspondance(s) trouvée(s).
(The last line is the French for "End of search: 3 match(es) found.")
This means that my system:
- Uses the Win-1252 encoding when saving files with the ANSI encoding (single-byte character encoding of the Latin alphabet in Western languages)
- Uses the OEM-850 encoding, which is used when opening a console prompt window (single-byte character encoding of the Latin alphabet used under DOS in Western Europe)
Moreover, Windows defines the MACCP - 10000 encoding, called Mac OS Roman (Apple single-byte character encoding of the Latin alphabet in Western languages).
Instead of using the parameters of the reg program, you may simply filter the results of the CodePage key with the findstr built-in DOS command:
reg query HKLM\SYSTEM\CurrentControlSet\Control\Nls\CodePage | findstr CP.*REG_SZ
You’ll obtain the same output:
    ACP    REG_SZ    1252
    OEMCP    REG_SZ    850
    MACCP    REG_SZ    10000
For instance, a Polish user could see:
    ACP    REG_SZ    1250
    OEMCP    REG_SZ    852
    MACCP    REG_SZ    10029
because:
- Windows-1250 is a single-byte encoding to represent Central or Eastern European languages that use the Latin script
- OEM-852 is a single-byte encoding, used under DOS to write Central European languages that use the Latin script
- 10029 is an Apple single-byte encoding, called Mac OS Central European, to represent text in Central or Southeastern European languages that use the Latin script
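If you need to collect these values from several users (for bug reports, say), the reg query output shown above can be picked apart with a few lines of Python. This is only a sketch: the function name parse_codepages is made up, and the sample text is the output pasted earlier in this post, not something read from a live registry.

```python
import re

# Sample output copied from the French Win 10 laptop earlier in this post.
SAMPLE = r"""HKEY_LOCAL_MACHINE\SYSTEM\currentControlSet\Control\Nls\CodePage
    ACP    REG_SZ    1252
    OEMCP    REG_SZ    850
    MACCP    REG_SZ    10000
"""

# Matches lines such as "    ACP    REG_SZ    1252".
VALUE_LINE = re.compile(r"^\s*(\w*CP)\s+REG_SZ\s+(\d+)\s*$")


def parse_codepages(output):
    """Return a dict like {'ACP': 1252, ...} from reg query text output."""
    result = {}
    for line in output.splitlines():
        match = VALUE_LINE.match(line)
        if match:
            result[match.group(1)] = int(match.group(2))
    return result


if __name__ == "__main__":
    print(parse_codepages(SAMPLE))
```

On a real machine the text could be obtained by running the reg query command above and passing its output to the same function.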
Best Regards,
guy038
-
Everybody has given you good answers, but as always, a good search will get you more in-depth answers. This goes way back, as guy alludes to. Here is the Wikipedia link that can give you more information and links to other references.
-
Hi, All,
Forgot to mention that I got the main information from this link :
https://stackoverflow.com/questions/53691278/how-to-see-which-ansi-code-page-is-used-in-windows
BR
guy038
-