How to auto-convert text (Umlaute) when changing file encoding from ANSI to UTF-8 BOM?
-
Assume I open a new, empty file with ANSI file encoding.
I type some German text that contains umlauts such as äöü.
Later I decide to switch the file encoding to UTF-8 by clicking the menu item
Encoding -> UTF-8 BOM
Yes, the file encoding is now UTF-8 BOM.
But the umlauts äöü now appear as
xE4 xF6 xFC (with black background)
How can I tell NP++ to automatically convert all umlauts to the corresponding UTF-8 bytes when switching the file encoding from ANSI to UTF-8?
If this is not possible automatically:
How can I mark the text and do it manually?
-
Use the bottom section of the Encoding menu, e.g., Convert to UTF-8-BOM, when you want to convert.
There are usually only two times you should use the top section:
-
when you have a completely empty new file and you want to change from the default encoding to something else before you start adding text;
-
when you have just opened a file and the encoding Notepad++ determined it to be is wrong, so you want to change it and have Notepad++ reread the file as a different encoding.
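The difference between the two menu sections can be illustrated outside Notepad++. The following is only a sketch in Python, not Notepad++'s actual implementation: it assumes "ANSI" means Windows-1252, and the function names `reinterpret_as_utf8` and `convert_to_utf8_bom` are made up for this illustration. Reinterpreting keeps the bytes and changes how they are read; converting rewrites the bytes so the text stays the same.

```python
# Bytes an ANSI document (assumed here to be Windows-1252) holds for "äöü".
ansi_bytes = "äöü".encode("cp1252")


def reinterpret_as_utf8(raw: bytes) -> str:
    """Like the top of the Encoding menu: same bytes, read as UTF-8.

    Bytes that are not valid UTF-8 are rendered as \\xNN escapes, similar
    to the xE4 xF6 xFC markers described in the question.
    """
    return raw.decode("utf-8", errors="backslashreplace")


def convert_to_utf8_bom(raw: bytes, source: str = "cp1252") -> bytes:
    """Like "Convert to UTF-8-BOM": decode from the old encoding, re-encode."""
    return raw.decode(source).encode("utf-8-sig")


print(ansi_bytes.hex(" "))                        # e4 f6 fc
print(reinterpret_as_utf8(ansi_bytes))            # \xe4\xf6\xfc
print(convert_to_utf8_bom(ansi_bytes).hex(" "))   # ef bb bf c3 a4 c3 b6 c3 bc
```

After the conversion each umlaut occupies two bytes (for example C3 A4 for ä), preceded by the three-byte BOM EF BB BF.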
-
-
Works. Thanks.
-
It works only partially.
Assume I started with an empty file and ANSI encoding.
I write some text.
Then (later) I copy some UTF-8 encoded text from a browser webpage or from another document into this ANSI file.
Now this file contains two types of text:
One part is ANSI encoded, the other UTF-8 encoded.
No matter whether I switch the file encoding or convert the text,
a part of the file content does not match the encoding.
What I need is a smarter convert feature:
If I select a part of the text and click “Convert to UTF-8 BOM”, then NP++ should…
…check whether some text is marked. If yes, then only the marked text should be converted; otherwise the full text.
Can this be implemented in the next release?
-
@Claudia-Svenson said:
Assume I started with an empty file and ANSI encoding.
I write some text.
Then (later) I copied some UTF-8 encoded text from browser webpage or from other document into this ANSI file.
Now this file contains two types of text:
One part is ANSI encoded the other UTF-8 encoded.
No matter if I switch the file encoding or if I convert the text
a part of the file content does not match the encoding.
Have you actually tried this? Can you show a minimal demonstration? I can’t reproduce it.
When you paste text from the Windows clipboard into a document, the text should be converted right then to match the current encoding Scintilla (the control used to display documents in Notepad++) is using. (That encoding is not always the same as the file encoding that will be saved; it will always be either ANSI, if the file encoding is ANSI, or else UTF-8; anything else is converted when reading or writing the file.) There cannot be two different encodings in the same document window in Notepad++.
Does the text appear wrong in Notepad++ when you paste it? Or are you saying that it looks good when you paste it, but when you reload the file the text you pasted is corrupted?
If the text appears wrong when you paste, it is probably a problem with the application from which you are copying the text. If it is a common application that some of us might have, please tell us and give an example of how to reproduce the problem; but I suspect it will be out of Notepad++’s control.
If the text appears good when you paste it but is corrupt when you reload, then you are probably pasting characters that are not in the codepage you are using. That can happen if you are using a named legacy codepage (not ANSI, but something like ISO-8859-15), because internally Notepad++ uses UTF-8 when you have anything other than ANSI. The pasted characters look fine, because they exist in UTF-8, but they can’t be converted to the codepage when you save if they aren’t in the codepage.
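To check in advance whether pasted text would survive a save in a legacy codepage, something like the following Python sketch can help. It is not part of Notepad++; the helper name `unencodable` and the sample string are made up for this illustration, and ISO-8859-15 is used only because it is the codepage named above.

```python
def unencodable(text: str, codepage: str) -> list[str]:
    """Return the distinct characters of `text` that `codepage` cannot represent."""
    missing: list[str] = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            # Character has no byte value in this codepage.
            if ch not in missing:
                missing.append(ch)
    return missing


pasted = "Preis: 5 € → günstig"  # sample text, e.g. copied from a web page

print(unencodable(pasted, "iso8859_15"))  # ['→']
print(unencodable(pasted, "iso8859_1"))   # ['€', '→']
print(unencodable(pasted, "utf-8"))       # []
```

An empty result means every character exists in the target codepage. Any character that is listed displays fine while editing, because it exists in UTF-8, but cannot be written back in that codepage.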