How can I get the encoding of current document?
-
Hello, I want to get the encoding of the current document, such as UTF-8, ANSI, or Chinese GB2312.
I use the code below:
int32_t buffer_id = ::SendMessage(nppData._nppHandle, NPPM_GETCURRENTBUFFERID, 0, 0);
int32_t encoding = ::SendMessage(nppData._nppHandle, NPPM_GETBUFFERENCODING, buffer_id, 0);
The encoding returned is 0 when the document’s encoding is ANSI (as shown in the Encoding menu of Notepad++), but it is 4 both when the document’s encoding is UTF-8 and when it is Chinese GB2312.
So I can’t distinguish UTF-8 from GB2312. What’s wrong? Thanks a lot!
-
First, is this relevant?
Getting encoding in x64 plugin
Second, make sure you distinguish encoding from codepage.
Perhaps you also need the Scintilla SCI_GETCODEPAGE message.
-
@xb-zhou said in How can I get the encoding of current document?:
So, I can’t distinguish UTF8 and GB2312. What’s wrong?
Per documentation, NPPM_GETBUFFERENCODING returns a UniMode enum, which is defined here.
Notepad++ doesn’t handle character sets internally the way it appears to a user. For editing, everything is either in the user default code page (“ANSI”) or in UTF-8. Whenever you’re not using the default code page, Notepad++ uses UTF-8 (so it is possible to enter and see characters that aren’t in the code page). Translation to other code pages is done when reading and writing the file. I don’t know the details beyond that; if no one else responds with a deeper explanation, you’ll probably need to read the code to figure out what will work for your particular use case.
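To make the two values reported in the question easier to read, here is a minimal sketch that maps a UniMode value to a description. The enum is copied from memory of the Notepad++ source and is an assumption; check it against the header linked above before relying on it. describeUniMode and checkReportedValues are hypothetical helper names, not part of the plugin API.

```cpp
#include <cassert>
#include <string>

// Assumption: mirrors the UniMode enum in the Notepad++ source.
// Verify the names and values against the header in your own checkout.
enum UniMode {
    uni8Bit = 0,        // "ANSI": user default code page
    uniUTF8 = 1,        // UTF-8 with BOM
    uni16BE = 2,        // UTF-16 big endian with BOM
    uni16LE = 3,        // UTF-16 little endian with BOM
    uniCookie = 4,      // UTF-8 without BOM
    uni7Bit = 5,        // plain 7-bit ASCII
    uni16BE_NoBOM = 6,  // UTF-16 big endian without BOM
    uni16LE_NoBOM = 7,  // UTF-16 little endian without BOM
    uniEnd
};

// Hypothetical helper: turn the value returned by NPPM_GETBUFFERENCODING
// into a human-readable description.
std::string describeUniMode(int value) {
    switch (value) {
        case uni8Bit:       return "ANSI (user default code page)";
        case uniUTF8:       return "UTF-8 with BOM";
        case uni16BE:       return "UTF-16 BE with BOM";
        case uni16LE:       return "UTF-16 LE with BOM";
        case uniCookie:     return "UTF-8 without BOM";
        case uni7Bit:       return "7-bit ASCII";
        case uni16BE_NoBOM: return "UTF-16 BE without BOM";
        case uni16LE_NoBOM: return "UTF-16 LE without BOM";
        default:            return "unknown";
    }
}

// The two values reported in the question: 0 for ANSI, 4 for both
// the UTF-8 document and the GB2312 document.
void checkReportedValues() {
    assert(describeUniMode(0) == "ANSI (user default code page)");
    assert(describeUniMode(4) == "UTF-8 without BOM");
}
```

Under that assumption, a 4 for the GB2312 file would be consistent with the explanation above: the document is held as UTF-8 in memory, and the GB2312 translation happens only on read and write, so the UniMode value alone cannot tell the two apart. In a real plugin the argument would come from NPPM_GETBUFFERENCODING; as far as I know the buffer ID is pointer-sized, so it is probably safer to keep it in an LRESULT or UINT_PTR rather than int32_t on x64.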
-
THIS THREAD got off into a tangent about encoding issues; might want to have a look there to round out some understanding about how it all works.
-
@gstavi Yes, the SCI_GETCODEPAGE message is what I was looking for. Thank you very much.
-
@Coises Thank you for your answer! It was very valuable.
-
@Alan-Kilborn Thank you!
-