Encoding of files with ASCII only
-
Hello,
when I open a text file that contains only ASCII characters, Notepad++ shows the encoding as UTF8 w/o BOM.
If I add a non-ASCII character like §, it shows the encoding as ANSI, which is what I actually defined for this file. Is this normal, or is my file really in some sort of UTF8 encoding?
-
ASCII is a subset of many encodings such as UTF-8, ANSI, etc., so there is no way to figure out from the bytes alone which encoding was intended.
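To illustrate the point, here is a minimal Python sketch. The sample string and the list of encodings are assumptions chosen for the example; it only shows that ASCII-only text produces the same bytes in each of them.

```python
# Hypothetical sample text; any string of pure ASCII characters behaves the same.
text = "Hello, Notepad++!"

# A few encodings that all contain ASCII as a subset.
encodings = ["ascii", "utf-8", "cp1252", "latin-1"]
encoded = {name: text.encode(name) for name in encodings}

# For ASCII-only text, every encoding yields byte-for-byte identical output,
# so the bytes on disk carry no hint about which one was intended.
all_identical = len(set(encoded.values())) == 1
print(all_identical)
```

Since the stored bytes are identical, an editor can only guess, which is presumably why a default such as UTF-8 gets displayed.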
-
All you need is a hex viewer (*1). “ASCII” is often used loosely for any single-byte encoding (strictly speaking, ASCII defines only the 7-bit range 0x00-0x7F), so expect to see a 1:1 correspondence between characters and bytes.
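If no hex viewer is at hand, the same check can be approximated in Python. This is only a rough sketch: the helper name hex_dump and the sample text are made up for illustration and have nothing to do with the HexEdit plugin.

```python
def hex_dump(data: bytes) -> str:
    """Return one line per byte: offset, hex value, and the character if printable ASCII."""
    lines = []
    for offset, byte in enumerate(data):
        # Show printable ASCII as-is and everything else as a dot, like most hex viewers.
        char = chr(byte) if 0x20 <= byte < 0x7F else "."
        lines.append(f"{offset:04x}  {byte:02x}  {char}")
    return "\n".join(lines)

# Hypothetical sample: three ASCII characters give exactly three bytes.
sample = "abc".encode("cp1252")
print(hex_dump(sample))
```

For a real file, the bytes could be read with open(path, "rb").read() instead of encoding a sample string.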

§ is included in many single-byte encodings, such as the default ANSI code page on many Windows PCs. Go to ? > Debug Info... and check the Current ANSI codepage. If the number is 1252, then § is a valid “ASCII” character. Or just type this into a Python REPL:
print('§'.encode('cp1252'))
The output will be the single byte b'\xa7'.
If the file is truly UTF-8, then § (and only §) will occupy multiple bytes, which is visible in the hex view or at the Python REPL:
print('§'.encode('utf8')) # => b'\xc2\xa7'
(*1) I used the HexEdit plugin.
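Putting the two checks together, a rough classifier for raw bytes might look like the sketch below. It is an assumption about how such a guess could be made, not Notepad++'s actual detection logic, and the function name classify is hypothetical.

```python
def classify(data: bytes) -> str:
    """Roughly guess what kind of text encoding a byte string could be."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "UTF-8 with BOM"
    if all(byte < 0x80 for byte in data):
        # Pure 7-bit content is valid in UTF-8 and in every ANSI code page alike.
        return "ASCII only (encoding undetermined)"
    try:
        data.decode("utf-8")
        return "valid UTF-8"
    except UnicodeDecodeError:
        return "not UTF-8 (likely a single-byte code page)"

print(classify(b"plain text"))
print(classify("§".encode("cp1252")))
print(classify("§".encode("utf-8")))
```

The first case is the one from the original question: with only ASCII bytes present, nothing in the file distinguishes ANSI from UTF-8 without BOM.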
-
If I understood the question correctly, OP implicitly asked if there is a way to report the encoded file as, in his case, ANSI if it contains only ASCII characters. Based on my previous statement, this is not possible. Even if I use a hex editor, there is no way to tell if I wanted to use the file as ANSI or as some other encoding with ASCII characters as a subset. If I misunderstood the question, sorry.