@Nick-Boescht said in The Vietnamese language doesn't show up correctly after saving:
Everything went great!
I assume this means the problem is solved for you. At least for now.
However, I am going to give some advice for the future, in case others have similar difficulties or you run into more problems. As such, I will make one more comment on an earlier statement, then outline what I understand so far. If I have stuff wrong, you’ll have to correct me, either now or whenever you need our help again.
This is what I see
But it’s not all I asked for. You did not show the status bar, which still leaves us guessing, or assuming that you have correctly understood what we’ve been asking.
If you give a detailed, step by step explanation of exactly what you are doing, you can save a lot of the back-and-forth we’ve had to go through to get to this point.
The process I understand you are doing:
1. Use some extractor tool to extract XML from a game’s binary db file.
2. Open the XML in Notepad++.
3. The file automatically opens in encoding _____, but characters are not showing up right. (I think it’s either opening as Win-1252 or some other western-european encoding instead of Win-1258… but I’m not sure, because you’ve been too vague. Or maybe, if you’ve really set everything like @Ekopalypse showed, it was opening as UTF-8.)
4. You manually selected Encoding > ____ (I assume Win-1258), and it appeared correct. So you saved.
5. When you exit and reload, it once again comes up in the same encoding as in step 3.
If this is not your process, you’ll need to correct it before we can give you more help.
Back to the problem at hand: I don’t actually do a lot of encoding-based text editing (except when I help others in this forum), as my own needs are simple. But from what I’ve picked up over the years: if at all possible, it’s best to use UTF-8 or another Unicode encoding. That’s been the right way of doing things since the 90s when Unicode was invented, and it should have been more encouraged since the turn of the century, once UTF-8 started gaining in popularity. The fact that any modern tool (game, what have you) is still using old 256-codepoint Win-#### encodings shows a complete lack of understanding on the part of those developers.
Unfortunately, when a file is not in a Unicode encoding, Notepad++ is left with two options: guess the encoding based on the frequency of particular bytes with values between 128 and 255, or use a default setting. The guessing is often wrong, and using a default setting can make things even worse (especially if you’re dealing with a mix of encodings). This is because Windows (based on 1980s pre-Unicode DOS) didn’t store encoding information in the directory table or in any other file meta-data location, so it was up to each application to decide what to do with any particular sequence of bytes found in a disk file.
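To make the guessing problem concrete, here is a small Python sketch (not anything Notepad++ itself runs). It encodes the sample word "đơn" as Win-1258, which Python calls cp1258, and then reads the same bytes back under three different assumptions. The sample word and the helper name try_decode are just my choices for illustration.

```python
# The same bytes on disk, interpreted under three different encodings.
# "đơn" is an arbitrary sample word whose letters map directly to cp1258.
raw = "đơn".encode("cp1258")  # two high bytes (0xF0, 0xF5) plus ASCII "n"


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a marker string if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return "<not valid " + encoding + ">"


for enc in ("cp1258", "cp1252", "utf-8"):
    # cp1258 gives the intended text, cp1252 gives different letters,
    # and utf-8 rejects the byte sequence outright.
    print(enc, "->", try_decode(raw, enc))
```

Nothing in the bytes says which reading is the right one, which is why an editor has to either guess or fall back to a default.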
In this situation, I would err on the side of use-the-default-setting, then when it’s wrong, manually change the encoding – if you are frequently going to use Win-1258, then use Settings > Shortcut Mapper, set filter: 1258, and assign a keyboard shortcut to Win-1258, so from then on, you can just hit that keyboard shortcut to set that encoding.
But actually, I might try the experiment of changing the XML declaration line to <?xml version="1.0" encoding="utf-8"?>, and doing an Encoding > Convert to UTF-8 on the file. That way, when Notepad++ applies the reasonable default of UTF-8 to 8-bit files, it will Do What You Mean. After saving it as UTF-8, it should maintain the right encoding from then on in Notepad++… and you should try to see if the game will accept an XML config file encoded in UTF-8. If not, complain to the game company that they don’t care about non-western languages, and see if you can convince them to accept UTF-8 and other Unicode-based encodings.
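If you would rather script that experiment than do it by hand, the following is a minimal Python sketch of the same two steps: re-encode the bytes and update the declaration. It assumes the extracted file really is Win-1258 (cp1258 in Python), which is still my guess, and the function name to_utf8_xml and the sample content are made up for illustration.

```python
import re

# Matches encoding="..." (or '...') inside an XML declaration at the very start.
DECL_ENCODING = re.compile(r"""^(<\?xml[^>]*?encoding=)(["'])[^"']*\2""")


def to_utf8_xml(data: bytes, source_encoding: str = "cp1258") -> bytes:
    """Decode from the assumed source encoding and re-encode as UTF-8,
    rewriting the declared encoding if a declaration is present."""
    text = data.decode(source_encoding)
    text = DECL_ENCODING.sub(r"\g<1>\g<2>utf-8\g<2>", text, count=1)
    return text.encode("utf-8")


# Hypothetical sample standing in for the extracted game file.
sample = '<?xml version="1.0" encoding="windows-1258"?>\n<name>đơn</name>\n'.encode("cp1258")
converted = to_utf8_xml(sample)
print(converted.decode("utf-8"))
```

For a real file you would read it with open(path, "rb"), pass the bytes through to_utf8_xml, and write the result to a new file so the original stays as a backup. Whether the game then accepts the UTF-8 version is something only a test will tell.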