Option to display all zero-width characters?
-
As outlined on Krebs on Security and elsewhere, we’re now starting to see Unicode control characters used to obfuscate text, so that what appears on screen differs from what is actually in the file. This can be used with malicious intent, and without a tool to check for the presence of these zero-width characters, we as developers can unwittingly include malicious code in the libraries we use.
For example, if you have the text:
class M{public static void main(String[]a){System.out.print(new char[] {'H','e','l','l','o',' ','W','o','r','l','d','!'});}}
And you paste it into an empty Notepad++ document, it recognizes it as UTF-8, No Language (Normal Text). If you change the language to C#, it changes the output to be:
This highlights the complexity of the problem. Someone reading that will see very reasonable text. A compiler, skipping over the zero-width characters, will see something entirely different, and nobody is the wiser as long as the stripped text is still valid code.
I’m looking for a feature in Notepad++ that will display ALL zero-width characters as their Unicode code point (as in “[U+202e]”), permitting at least a cursory review of code to ensure that all zero-width characters are correct in their placement and usage.
-
Natively, that feature is not exposed. However, there are API messages which will allow you to set the representation for specific characters, which can be called from Plugins.
Rather than writing a custom plugin just for that, you can use one of the scripting plugins to send those messages and thus change the representation of the characters. @Alan-Kilborn shared such a script in “Invisible Characters Unwanted”. You would install the PythonScript plugin, create a new script, and paste in the code from that discussion; then, when you run that script, it will change the representation of those characters to little black boxes like the ones used for CR and LF. (Make sure you read down the thread to see whether additional characters were added later that you want to include in your copy of that script.)
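If you just want a quick check outside of Notepad++, the same idea can be sketched in plain Python. This is not the script from that discussion, only a minimal illustration: the function name `reveal_invisibles` is made up for this example, and it assumes that "zero-width character" can be approximated by Unicode general category Cf (format characters), which covers the bidirectional overrides, zero-width spaces and joiners, and the byte-order mark, but may not match exactly the set you care about.

```python
import unicodedata


def reveal_invisibles(text):
    """Return text with each format (Cf) character shown as [U+XXXX].

    Assumption: category Cf is a reasonable stand-in for "zero-width";
    adjust the test if you need a different set of characters.
    """
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Cf":
            # Replace the invisible character with its code point label.
            out.append("[U+{:04X}]".format(ord(ch)))
        else:
            out.append(ch)
    return "".join(out)


# Example: a right-to-left override and a zero-width space hidden in a line.
sample = "access\u202e = \u200b1"
print(reveal_invisibles(sample))  # prints: access[U+202E] = [U+200B]1
```

Running it over the text of a suspicious file at least shows where such characters sit, so you can decide whether each one belongs there.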
-
Wow, cool popup preview if you hover over the blue “Invisible Characters Unwanted” text … apparently another aspect of the recent NodeBB update!