New N++ feature to show/hide Non-Printing characters
- 
Hi All,

Do you remember the "Invisible characters unwanted" discussion and my last post there, linked below, about the main invisible characters which need a visual representation?

https://community.notepad-plus-plus.org/post/62169

Following that post, and given the new N++ feature in the v8.5 release to show non-printing characters, I think it would be interesting to take a fresh look at this topic!
Firstly, in the last N++ release, the invisible characters located outside the BMP (Basic Multilingual Plane) are not taken into account. I think that this position is acceptable because:

- The two Kaithi format characters relate to the historical Kaithi script, which has been largely ignored since the 1970s
- The nine Egyptian format characters refer to the ancient Egyptian hieroglyphs
- The four Shorthand format characters cannot be considered true characters, as the shorthand may encode many European languages simultaneously
- The 233 characters of the Musical Symbols Unicode block cannot strictly be considered characters; rather, they represent a modern musical notation system
- Finally, all the format characters of the Tags block are strongly discouraged by the Unicode Consortium
 
Now, if we consider all the non-printing characters shown when the View > Show Symbol > Show Non-Printing characters option is set, we get a list of 42 lines:

    •-------•---------•---------------------------------------------•----•------------------•-------
    | Code  | Abbrev. | Character Name                              | Cg | N++ Regex        | Char
    •-------•---------•---------------------------------------------•----•------------------•-------
    | 00A0  | NBSP    | NO-BREAK SPACE                              | Zs | \x{00A0}         |
    |       |         |                                             |    |                  |
    | 061C  | ALM     | ARABIC LETTER MARK                          | Cf | \x{061C}         |
    |       |         |                                             |    |                  |
    | 1680  | OSPM    | OGHAM SPACE MARK                            | Zs | \x{1680}         |
    |       |         |                                             |    |                  |
    | 180E  | MVS     | MONGOLIAN VOWEL SEPARATOR                   | Cf | \x{180E}         |
    |       |         |                                             |    |                  |
    | 2000  | NQSP    | EN QUAD                                     | Zs | \x{2000}         |
    | 2001  | MQSP    | EM QUAD                                     | Zs | \x{2001}         |
    | 2002  | ENSP    | EN SPACE                                    | Zs | \x{2002}         |
    | 2003  | EMSP    | EM SPACE                                    | Zs | \x{2003}         |
    | 2004  | 3/MSP   | THREE-PER-EM SPACE                          | Zs | \x{2004}         |
    | 2005  | 4/MSP   | FOUR-PER-EM SPACE                           | Zs | \x{2005}         |
    | 2006  | 6/MSP   | SIX-PER-EM SPACE                            | Zs | \x{2006}         |
    | 2007  | FSP     | FIGURE SPACE                                | Zs | \x{2007}         |
    | 2008  | PSP     | PUNCTUATION SPACE                           | Zs | \x{2008}         |
    | 2009  | THSP    | THIN SPACE                                  | Zs | \x{2009}         |
    | 200A  | HSP     | HAIR SPACE                                  | Zs | \x{200A}         |
    |       |         |                                             |    |                  |
    | 200B  | ZWSP    | ZERO WIDTH SPACE                            | Cf | \x{200B}         |
    | 200C  | ZWNJ    | ZERO WIDTH NON-JOINER                       | Cf | \x{200C}         |
    | 200D  | ZWJ     | ZERO WIDTH JOINER                           | Cf | \x{200D}         |
    | 200E  | LRM     | LEFT-TO-RIGHT MARK                          | Cf | \x{200E}         |
    | 200F  | RLM     | RIGHT-TO-LEFT MARK                          | Cf | \x{200F}         |
    |       |         |                                             |    |                  |
    | 202A  | LRE     | LEFT-TO-RIGHT EMBEDDING                     | Cf | \x{202A}         |
    | 202B  | RLE     | RIGHT-TO-LEFT EMBEDDING                     | Cf | \x{202B}         |
    | 202C  | PDF     | POP DIRECTIONAL FORMATTING                  | Cf | \x{202C}         |
    | 202D  | LRO     | LEFT-TO-RIGHT OVERRIDE                      | Cf | \x{202D}         |
    | 202E  | RLO     | RIGHT-TO-LEFT OVERRIDE                      | Cf | \x{202E}         |
    |       |         |                                             |    |                  |
    | 2028  | LS      | LINE SEPARATOR                              | Zl | \x{2028}         |
    | 2029  | PS      | PARAGRAPH SEPARATOR                         | Zp | \x{2029}         |
    |       |         |                                             |    |                  |
    | 202F  | NNBSP   | NARROW NO-BREAK SPACE                       | Zs | \x{202F}         |
    |       |         |                                             |    |                  |
    | 205F  | MMSP    | MEDIUM MATHEMATICAL SPACE                   | Zs | \x{205F}         |
    |       |         |                                             |    |                  |
    | 2060  | WJ      | WORD JOINER                                 | Cf | \x{2060}         |
    |       |         |                                             |    |                  |
    | 2066  | LRI     | LEFT-TO-RIGHT ISOLATE                       | Cf | \x{2066}         |
    | 2067  | RLI     | RIGHT-TO-LEFT ISOLATE                       | Cf | \x{2067}         |
    | 2068  | FSI     | FIRST STRONG ISOLATE                        | Cf | \x{2068}         |
    | 2069  | PDI     | POP DIRECTIONAL ISOLATE                     | Cf | \x{2069}         |
    | 206A  | ISS     | INHIBIT SYMMETRIC SWAPPING                  | Cf | \x{206A}         |
    | 206B  | ASS     | ACTIVATE SYMMETRIC SWAPPING                 | Cf | \x{206B}         |
    | 206C  | IAFS    | INHIBIT ARABIC FORM SHAPING                 | Cf | \x{206C}         |
    | 206D  | AAFS    | ACTIVATE ARABIC FORM SHAPING                | Cf | \x{206D}         |
    | 206E  | NADS    | NATIONAL DIGIT SHAPES                       | Cf | \x{206E}         |
    | 206F  | NOSP    | NOMINAL DIGIT SHAPES                        | Cf | \x{206F}         |
    |       |         |                                             |    |                  |
    | 3000  | IDSP    | IDEOGRAPHIC SPACE                           | Zs | \x{3000}         |
    |       |         |                                             |    |                  |
    | FEFF  | ZWNBSP  | ZERO WIDTH NO-BREAK SPACE / BYTE ORDER MARK | Cf | \x{FEFF}         |
    •-------•---------•---------------------------------------------•----•------------------•-------

I would like to make three points about this list:

- I don't see the purpose of showing the Line Separator \x{2028} and Paragraph Separator \x{2029} characters. Indeed, they are not format characters and, moreover, they already have a "black on white" representation, LS and PS. Simply check the Show Non-Printing characters option, then uncheck it, and observe the changes to these two chars in the above table!
Conversely, I think that we are missing two individual characters and two sets of invisible characters, which do belong to the Unicode BMP:

- The Soft Hyphen character, \x{00AD}
- The Syriac Abbreviation Mark, \x{070F}
- The four invisible operators of the General Punctuation block, from \x{2061} to \x{2064}. Refer to https://www.w3.org/TR/2010/REC-MathML3-20101021/chapter3.html#presm.invisibleops for further information
- The three Interlinear Annotation characters of the Specials block, from \x{FFF9} to \x{FFFB}. Refer to:
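
The General Category values quoted in this discussion can be checked with a short Python sketch. It is only an illustration, not part of N++: it relies on Python's standard `unicodedata` module, whose answers depend on the Unicode database bundled with the interpreter, and the helper name `category` is made up for the example.

```python
import unicodedata

# Code points taken from the discussion above.
separators = [0x2028, 0x2029]  # LINE SEPARATOR, PARAGRAPH SEPARATOR
proposed = [0x00AD, 0x070F,                   # SOFT HYPHEN, SYRIAC ABBREVIATION MARK
            0x2061, 0x2062, 0x2063, 0x2064,   # invisible math operators
            0xFFF9, 0xFFFA, 0xFFFB]           # interlinear annotation characters


def category(cp: int) -> str:
    """Return the Unicode General Category (e.g. 'Cf', 'Zl') of a code point."""
    return unicodedata.category(chr(cp))


# Print one line per code point: code, category, official name.
for cp in separators + proposed:
    name = unicodedata.name(chr(cp), "<unnamed>")
    print(f"U+{cp:04X}  {category(cp)}  {name}")
```

With a reasonably recent Python, the two separators should report `Zl` and `Zp` rather than `Cf`, while the nine proposed additions should all report `Cf`.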
 
So, if we take into account the above remarks about the items to add to or remove from the list of existing non-printing chars, we get the updated list below:

    •-------•---------•---------------------------------------------•----•------------------•-------
    | Code  | Abbrev. | Character Name                              | Cg | N++ Regex        | Char
    •-------•---------•---------------------------------------------•----•------------------•-------
    | 00A0  | NBSP    | NO-BREAK SPACE                              | Zs | \x{00A0}         |
    |       |         |                                             |    |                  |
    | 00AD  | SHY     | SOFT HYPHEN                                 | Cf | \x{00AD}         |
    |       |         |                                             |    |                  |
    | 061C  | ALM     | ARABIC LETTER MARK                          | Cf | \x{061C}         |
    |       |         |                                             |    |                  |
    | 070F  | SAM     | SYRIAC ABBREVIATION MARK                    | Cf | \x{070F}         |
    |       |         |                                             |    |                  |
    | 1680  | OSPM    | OGHAM SPACE MARK                            | Zs | \x{1680}         |
    |       |         |                                             |    |                  |
    | 180E  | MVS     | MONGOLIAN VOWEL SEPARATOR                   | Cf | \x{180E}         |
    |       |         |                                             |    |                  |
    | 2000  | NQSP    | EN QUAD                                     | Zs | \x{2000}         |
    | 2001  | MQSP    | EM QUAD                                     | Zs | \x{2001}         |
    | 2002  | ENSP    | EN SPACE                                    | Zs | \x{2002}         |
    | 2003  | EMSP    | EM SPACE                                    | Zs | \x{2003}         |
    | 2004  | 3/MSP   | THREE-PER-EM SPACE                          | Zs | \x{2004}         |
    | 2005  | 4/MSP   | FOUR-PER-EM SPACE                           | Zs | \x{2005}         |
    | 2006  | 6/MSP   | SIX-PER-EM SPACE                            | Zs | \x{2006}         |
    | 2007  | FSP     | FIGURE SPACE                                | Zs | \x{2007}         |
    | 2008  | PSP     | PUNCTUATION SPACE                           | Zs | \x{2008}         |
    | 2009  | THSP    | THIN SPACE                                  | Zs | \x{2009}         |
    | 200A  | HSP     | HAIR SPACE                                  | Zs | \x{200A}         |
    |       |         |                                             |    |                  |
    | 200B  | ZWSP    | ZERO WIDTH SPACE                            | Cf | \x{200B}         |
    | 200C  | ZWNJ    | ZERO WIDTH NON-JOINER                       | Cf | \x{200C}         |
    | 200D  | ZWJ     | ZERO WIDTH JOINER                           | Cf | \x{200D}         |
    | 200E  | LRM     | LEFT-TO-RIGHT MARK                          | Cf | \x{200E}         |
    | 200F  | RLM     | RIGHT-TO-LEFT MARK                          | Cf | \x{200F}         |
    |       |         |                                             |    |                  |
    | 202A  | LRE     | LEFT-TO-RIGHT EMBEDDING                     | Cf | \x{202A}         |
    | 202B  | RLE     | RIGHT-TO-LEFT EMBEDDING                     | Cf | \x{202B}         |
    | 202C  | PDF     | POP DIRECTIONAL FORMATTING                  | Cf | \x{202C}         |
    | 202D  | LRO     | LEFT-TO-RIGHT OVERRIDE                      | Cf | \x{202D}         |
    | 202E  | RLO     | RIGHT-TO-LEFT OVERRIDE                      | Cf | \x{202E}         |
    |       |         |                                             |    |                  |
    | 202F  | NNBSP   | NARROW NO-BREAK SPACE                       | Zs | \x{202F}         |
    |       |         |                                             |    |                  |
    | 205F  | MMSP    | MEDIUM MATHEMATICAL SPACE                   | Zs | \x{205F}         |
    |       |         |                                             |    |                  |
    | 2060  | WJ      | WORD JOINER                                 | Cf | \x{2060}         |
    |       |         |                                             |    |                  |
    | 2061  | (FA)    | FUNCTION APPLICATION                        | Cf | \x{2061}         |
    | 2062  | (IT)    | INVISIBLE TIMES                             | Cf | \x{2062}         |
    | 2063  | (IS)    | INVISIBLE SEPARATOR                         | Cf | \x{2063}         |
    | 2064  | (IP)    | INVISIBLE PLUS                              | Cf | \x{2064}         |
    |       |         |                                             |    |                  |
    | 2066  | LRI     | LEFT-TO-RIGHT ISOLATE                       | Cf | \x{2066}         |
    | 2067  | RLI     | RIGHT-TO-LEFT ISOLATE                       | Cf | \x{2067}         |
    | 2068  | FSI     | FIRST STRONG ISOLATE                        | Cf | \x{2068}         |
    | 2069  | PDI     | POP DIRECTIONAL ISOLATE                     | Cf | \x{2069}         |
    | 206A  | ISS     | INHIBIT SYMMETRIC SWAPPING                  | Cf | \x{206A}         |
    | 206B  | ASS     | ACTIVATE SYMMETRIC SWAPPING                 | Cf | \x{206B}         |
    | 206C  | IAFS    | INHIBIT ARABIC FORM SHAPING                 | Cf | \x{206C}         |
    | 206D  | AAFS    | ACTIVATE ARABIC FORM SHAPING                | Cf | \x{206D}         |
    | 206E  | NADS    | NATIONAL DIGIT SHAPES                       | Cf | \x{206E}         |
    | 206F  | NOSP    | NOMINAL DIGIT SHAPES                        | Cf | \x{206F}         |
    |       |         |                                             |    |                  |
    | 3000  | IDSP    | IDEOGRAPHIC SPACE                           | Zs | \x{3000}         |
    |       |         |                                             |    |                  |
    | FEFF  | ZWNBSP  | ZERO WIDTH NO-BREAK SPACE / BYTE ORDER MARK | Cf | \x{FEFF}         |
    |       |         |                                             |    |                  |
    | FFF9  | IAA     | INTERLINEAR ANNOTATION ANCHOR               | Cf | \x{FFF9}         |
    | FFFA  | IAS     | INTERLINEAR ANNOTATION SEPARATOR            | Cf | \x{FFFA}         |
    | FFFB  | IAT     | INTERLINEAR ANNOTATION TERMINATOR           | Cf | \x{FFFB}         |
    •-------•---------•---------------------------------------------•----•------------------•-------

Of course, if my remarks seem pertinent enough to most people, I'll create a GitHub issue!
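
The counts for the two lists (42 entries today, 49 after the proposed changes) can be double-checked with plain set arithmetic. This is just a bookkeeping sketch in Python; the helper `expand` is a hypothetical name introduced here, and the code points are copied from the tables in this post.

```python
def expand(*items):
    """Expand single code points and inclusive (first, last) ranges into a set."""
    out = set()
    for item in items:
        if isinstance(item, tuple):
            first, last = item
            out.update(range(first, last + 1))
        else:
            out.add(item)
    return out


# Characters currently shown by "Show Non-Printing characters" (first table).
current = expand(0x00A0, 0x061C, 0x1680, 0x180E,
                 (0x2000, 0x200F), (0x2028, 0x202F),
                 0x205F, 0x2060, (0x2066, 0x206F),
                 0x3000, 0xFEFF)

# Proposed changes: drop LS and PS, add SHY, SAM, the four invisible
# operators and the three interlinear annotation characters.
removed = {0x2028, 0x2029}
added = expand(0x00AD, 0x070F, (0x2061, 0x2064), (0xFFF9, 0xFFFB))

proposed = (current - removed) | added

print(len(current), len(proposed))
```

The two numbers printed should agree with the 42 and 49 entries quoted in this post.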
Notes:

- Regarding the wording to use for naming this new N++ feature, I propose this one: View > Show Symbol > Show BMP Format chars / Non Regular Spaces. Do you like it?
- Regarding the last table, the equivalent regex to mark these 49 special characters becomes:

MARK [\x{00A0}\x{00AD}\x{061C}\x{070F}\x{1680}\x{180E}\x{2000}-\x{200A}\x{200B}-\x{200F}\x{202A}-\x{202E}\x{202F}\x{205F}-\x{206F}\x{3000}\x{FEFF}\x{FFF9}\x{FFFA}\x{FFFB}]

Depending on the character marked, you should get either a red/orange space for a space character or a thin red line for all the other characters!

Best Regards,

guy038

P.S.: For an overview of all these strange Unicode characters, read this exhaustive article: https://en.wikipedia.org/wiki/Universal_Character_Set_characters
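
For anyone wanting to test the Mark regex outside N++, here is a minimal Python sketch. It assumes nothing about N++ internals: since Python's `re` module does not accept the `\x{HHHH}` escape, each escape is first replaced by the literal character it denotes, and the names `NPP_CLASS`, `py_class` and `matched` are only illustrative.

```python
import re

# The Mark regex from this post, in Notepad++ syntax.
NPP_CLASS = (
    r"[\x{00A0}\x{00AD}\x{061C}\x{070F}\x{1680}\x{180E}"
    r"\x{2000}-\x{200A}\x{200B}-\x{200F}\x{202A}-\x{202E}\x{202F}"
    r"\x{205F}-\x{206F}\x{3000}\x{FEFF}\x{FFF9}\x{FFFA}\x{FFFB}]"
)

# Replace every \x{HHHH} escape by the literal character; the '-' range
# operators between two escapes are left untouched.
py_class = re.sub(
    r"\\x\{([0-9A-Fa-f]+)\}",
    lambda m: chr(int(m.group(1), 16)),
    NPP_CLASS,
)
pattern = re.compile(py_class)

# Collect every BMP code point matched by the character class.
matched = [cp for cp in range(0x10000) if pattern.fullmatch(chr(cp))]

print(len(matched))                        # 50
print([f"U+{cp:04X}" for cp in matched])
```

The class matches 50 code points rather than 49 because the range `\x{205F}-\x{206F}` also spans U+2065, which is not in the table above and is currently an unassigned code point, so this should make no practical difference when marking real text.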
- 
- 
Hi, All,

Although my post did not get any positive feedback, I still created an issue on GitHub:

https://github.com/notepad-plus-plus/notepad-plus-plus/issues/13408

Best Regards,

guy038
- 
 @guy038 I’m wondering why IDSP (“Ideographic Space”) has been included in this list. It is not a zero width character, so it is a printable (albeit whitespace) character. 
