• Visualization for zero-width characters

    Locked
    3
    1 Votes
    3 Posts
    13k Views
    guy038G

    Hello, @petyo-vodenicharov, @peterjones and All,

    In addition to the two valuable posts by both Peters above, here is my contribution on these strange characters ;-))

    Below is a list of all the Unicode characters with the General Category property = Cf ( Format Character ) which both have a code value < FFFF and do NOT strictly depend on a specific language !

    •--------•--------•-------------------------------------------•------•---------•
    |  Code  |  Abbr. |  Complete Name                            | Cat. |  >Car<  |
    •--------•--------•-------------------------------------------•------•---------•
    |  00AD  |  SHY   |  SOFT HYPHEN                              |  Cf  |   >­<    |
    •--------•--------•-------------------------------------------•------•---------•
    |  200B  |  ZWSP  |  ZERO WIDTH SPACE                         |  Cf  |   >​<    |
    |  200C  |  ZWNJ  |  ZERO WIDTH NON-JOINER                    |  Cf  |   >‌<    |
    |  200D  |  ZWJ   |  ZERO WIDTH JOINER                        |  Cf  |   >‍<    |
    |  200E  |  LRM   |  LEFT-TO-RIGHT MARK                       |  Cf  |   >‎<    |
    |  200F  |  RLM   |  RIGHT-TO-LEFT MARK                       |  Cf  |   >‏<    |
    •--------•--------•-------------------------------------------•------•---------•
    |  202A  |  LRE   |  LEFT-TO-RIGHT EMBEDDING                  |  Cf  |   >‪<    |
    |  202B  |  RLE   |  RIGHT-TO-LEFT EMBEDDING                  |  Cf  |   >‫<    |
    |  202C  |  PDF   |  POP DIRECTIONAL FORMATTING               |  Cf  |   >‬<    |
    |  202D  |  LRO   |  LEFT-TO-RIGHT OVERRIDE                   |  Cf  |   >‭<    |
    |  202E  |  RLO   |  RIGHT-TO-LEFT OVERRIDE                   |  Cf  |   >‮<    |
    •--------•--------•-------------------------------------------•------•---------•
    |  2060  |  WJ    |  WORD JOINER                              |  Cf  |   >⁠<    |
    |  2061  |  ƒ()   |  FUNCTION APPLICATION                     |  Cf  |   >⁡<    |
    |  2062  |  ×     |  INVISIBLE TIMES                          |  Cf  |   >⁢<    |
    |  2063  |  ,     |  INVISIBLE SEPARATOR                      |  Cf  |   >⁣<    |
    |  2064  |  +     |  INVISIBLE PLUS                           |  Cf  |   >⁤<    |
    |  2066  |  LRI   |  LEFT-TO-RIGHT ISOLATE                    |  Cf  |   >⁦<    |
    |  2067  |  RLI   |  RIGHT-TO-LEFT ISOLATE                    |  Cf  |   >⁧<    |
    |  2068  |  FSI   |  FIRST STRONG ISOLATE                     |  Cf  |   >⁨<    |
    |  2069  |  PDI   |  POP DIRECTIONAL ISOLATE                  |  Cf  |   >⁩<    |
    |  206A  |  ISS   |  INHIBIT SYMMETRIC SWAPPING               |  Cf  |   >⁪<    |
    |  206B  |  ASS   |  ACTIVATE SYMMETRIC SWAPPING              |  Cf  |   >⁫<    |
    |  206C  |  IAFS  |  INHIBIT ARABIC FORM SHAPING              |  Cf  |   >⁬<    |
    |  206D  |  AAFS  |  ACTIVATE ARABIC FORM SHAPING             |  Cf  |   >⁭<    |
    |  206E  |  NADS  |  NATIONAL DIGIT SHAPES                    |  Cf  |   >⁮<    |
    |  206F  |  NODS  |  NOMINAL DIGIT SHAPES                     |  Cf  |   >⁯<    |
    •--------•--------•-------------------------------------------•------•---------•
    |  FEFF  |  ZWNBSP|  ZERO WIDTH NO-BREAK SPACE                |  Cf  |   >﻿<    |
    •--------•--------•-------------------------------------------•------•---------•
    |  FFF9  |  IAA   |  INTERLINEAR ANNOTATION ANCHOR            |  Cf  |   >￹<    |
    |  FFFA  |  IAS   |  INTERLINEAR ANNOTATION SEPARATOR         |  Cf  |   >￺<    |
    |  FFFB  |  IAT   |  INTERLINEAR ANNOTATION TERMINATOR        |  Cf  |   >￻<    |
    •--------•--------•-------------------------------------------•------•---------•

    Now, depending on the current font used in N++, the glyph of each of these characters may :

    Be invisible ( A true Zero Width character )

    Display a square or a thin rectangular box ( Character not handled by current font )

    Display a specific character ( case of the Soft Hyphen )

    Of course, with the simple regular expression \x{####}, you can match the character of Unicode value = ####. But it would be better to find a regex that matches any of these format characters !

    I noticed that the POSIX character class [[:cntrl:]] matches most of these characters :

    The 4 characters, from \x{200C} to \x{200F}

    The 5 characters, from \x{202A} to \x{202E}

    The 6 characters, from \x{206A} to \x{206F}

    The character \x{FEFF}

    The 3 characters, from \x{FFF9} to \x{FFFB}

    Unfortunately, the [[:cntrl:]] regex also matches the Control characters :

    The 32 C0 characters, from \x{0000} to \x{001F}

    The 32 C1 characters, from \x{0080} to \x{009F}

    Moreover, the [[:cntrl:]] regex misses some characters :

    The Soft Hyphen \x{00AD}

    The Zero Width Space \x{200B}

    The 9 characters, from \x{2060} to \x{2069}

    So, a correct regex to match all the format characters above, in a Unicode-encoded file, could be :

    (?=[[:unicode:]])[[:cntrl:]\x{200B}\x{2060}-\x{2069}]|\xAD
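    Outside Notepad++, the same set of characters can be written as an explicit character class. The following is only a minimal JavaScript sketch under that assumption: the names FORMAT_CHARS and findFormatChars are made up for illustration, and the class simply lists the code points of the table above ( U+2065 is unassigned, so it is left out ).

```javascript
// Format characters (General Category Cf) from the table above, written as an
// explicit JavaScript character class. This is an illustrative equivalent,
// not the Boost regex used by Notepad++.
const FORMAT_CHARS =
  /[\u00AD\u200B-\u200F\u202A-\u202E\u2060-\u2064\u2066-\u206F\uFEFF\uFFF9-\uFFFB]/g;

// Return the position and code point of every format character in `text`.
function findFormatChars(text) {
  const hits = [];
  for (const m of text.matchAll(FORMAT_CHARS)) {
    const hex = m[0].codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
    hits.push({ index: m.index, code: "U+" + hex });
  }
  return hits;
}

// The sample is built with escapes so the invisible characters show in the source.
const sample = "te\u200Bst\u00AD \uFEFFend";
console.log(findFormatChars(sample));
```

    In engines that support Unicode property escapes, /\p{Cf}/gu matches the whole Cf category instead, which also covers the language-specific format characters and those above FFFF that the table deliberately leaves out.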

    Now, how do you visualize a zero-width character ? If you just hit the Find Next button, you can see that a specific line is reached, but you do not know the exact location of the zero-width char(s) :-((

    Two solutions are possible :

    (?-s).((?=[[:unicode:]])[[:cntrl:]\x{200B}\x{2060}-\x{2069}]|\xAD)+.

    Which matches two standard characters separated by one or several consecutive format character(s)

    ((?=[[:unicode:]])[[:cntrl:]\x{200B}\x{2060}-\x{2069}]|\xAD)+

    Which marks all these format chars when you click on the Mark All button ( the best solution, to my mind ! )
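    When the text is processed by a script rather than in the Find dialog, another way to visualize these characters is to replace each one with a visible tag. This is a rough JavaScript sketch, not a Notepad++ feature: the function name revealFormatChars and the <U+XXXX> notation are arbitrary choices for illustration.

```javascript
// Same character set as the table above, repeated here so the block is self-contained.
const FORMAT_CHAR =
  /[\u00AD\u200B-\u200F\u202A-\u202E\u2060-\u2064\u2066-\u206F\uFEFF\uFFF9-\uFFFB]/g;

// Replace every format character with a visible tag such as <U+200B>.
function revealFormatChars(text) {
  return text.replace(FORMAT_CHAR, (ch) => {
    const hex = ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
    return "<U+" + hex + ">";
  });
}

// Two consecutive zero-width spaces between the letters "l" of "tell".
console.log(revealFormatChars("tel\u200B\u200Bl me"));
```

    Because each character gets its own tag, consecutive format characters stay distinguishable in the output.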

    So, trying the simple regex \x{200B} against the sentence below, with the Mark option, will convince you that this sentence does contain some Zero Width Space characters !

    F​or exam​ple, I’ve ins​erted 10 ze​ro-width spa​ces in​to thi​s sentence, c​an you tel​​l me where ?

    Note that, between the two letters l of the verb tell, there are two consecutive \x{200B} chars !

    You can find a description of these format characters at the following links :

    http://www.unicode.org/charts/PDF/U2000.pdf

    http://www.unicode.org/charts/PDF/UFE70.pdf

    http://www.unicode.org/charts/PDF/UFFF0.pdf

    Refer also to this post :

    https://notepad-plus-plus.org/community/topic/14812/how-to-search-for-unknown-3-digit-characters-with-black-background/2

    Best Regards,

    guy038

    P.S. :

    Simply, copy/paste the list and the sentence, above, in inverse video, in a new tab and enjoy !

  • Text file overwritten, it is still recoverable?

    9
    0 Votes
    9 Posts
    5k Views
    lantzauL

    @Li-Kasilter said:

    @lantzau said:

    it is on a Windows 10 machine

    Lol… It doesn’t matter. This will be easier to solve on here; I have experience with this problem.

    Thanks. I hope this finds the lost file. I had almost given up.

  • Create a Msg box in Notepad++ Macros

    Locked
    3
    1 Votes
    3 Posts
    4k Views
    Дмитрий Трошин194Д

    For text processing, I use the jN plug-in for Notepad++.
    https://github.com/sieukrem/jn-npp-plugin
    It allows you to process text in the editor using JavaScript.
    For example, to delete or truncate lines longer than 300 characters:

    // Ask the user for a number through a VBScript InputBox (Windows ActiveX only)
    function InputBox(psTxt, psCapt, psVal) {
        var so = new ActiveXObject("MSScriptControl.ScriptControl");
        so.Language = 'VBScript';
        var vCode =
            ' Function getInputNumber() \n' +
            '   val = InputBox("' + psTxt + '","' + psCapt + '","' + psVal + '") \n' +
            '   getInputNumber = val \n' +
            'End Function \n';
        so.AddCode(vCode);
        // parseInt yields NaN when the dialog is cancelled or the input is not a number
        return parseInt(so.Run("getInputNumber"), 10);
    }

    // Remove (psOper == 1) or truncate (psOper == 2) lines longer than n characters
    function rowsOverLengthRemote(psOper) {
        var vOLen = 300;
        if (!psOper) { psOper = 1; }
        vOLen = InputBox('Input length', "For very long rows", vOLen);
        // Stop on a cancelled or non-numeric input, or on a limit of 40 or less
        if (isNaN(vOLen) || vOLen <= 40) {
            return;
        }
        var vArr = Editor.currentView.text.split('\n');
        var vKept = [];
        var vLine = '';
        for (var i = 0; i < vArr.length; i++) {
            vLine = vArr[i];
            if (vLine.length <= vOLen) {
                vKept.push(vLine);
            } else if (psOper == 2) {
                message("Cut string N: " + i + ' length: ' + vLine.length + ' >> ' + vLine.substring(0, 500));
                vKept.push(vLine.substring(0, vOLen));
            } else if (psOper == 1) {
                message("Kill string N: " + i + ' length: ' + vLine.length + ' >> ' + vLine.substring(0, 500));
            }
        }
        // join() puts '\n' only between lines, so no leading newline is added
        Editor.currentView.text = vKept.join('\n');
    }

    var myKillVeryLengthRows = {
        text: "Delete lines longer than N \tCtrl+Shift+K",
        ctrl: true, shift: true, alt: false,
        key: 0x4B, // "K" key
        cmd: rowsOverLengthRemote
    };

    addHotKey(myKillVeryLengthRows);
    scriptsMenu.addItem(myKillVeryLengthRows);

    function rowsOverLengthCut() { rowsOverLengthRemote(2); }
    var myCutVeryLengthRows = {
        text: "Truncate lines longer than N ",
        cmd: rowsOverLengthCut
    };
    scriptsMenu.addItem(myCutVeryLengthRows);
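    The line-filtering step of the script above can also be tried outside the editor. This is only a sketch of that one step: filterLongLines is a hypothetical helper, not part of jN, and it takes plain strings instead of Editor.currentView.text.

```javascript
// Hypothetical stand-alone helper (not part of the jN plug-in).
// mode 1 drops lines longer than maxLen, mode 2 truncates them to maxLen.
function filterLongLines(text, maxLen, mode) {
  var lines = text.split("\n");
  var kept = [];
  for (var i = 0; i < lines.length; i++) {
    if (lines[i].length <= maxLen) {
      kept.push(lines[i]);
    } else if (mode === 2) {
      kept.push(lines[i].substring(0, maxLen));
    }
    // in mode 1 an over-long line is simply skipped
  }
  return kept.join("\n");
}

console.log(filterLongLines("ab\nabcdef\nc", 3, 1));
console.log(filterLongLines("ab\nabcdef\nc", 3, 2));
```

    Keeping this step free of editor calls makes it easy to check both modes on a small sample before running the script on a real document.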

    It can also display messages:

    alert('Message');
    status('Message'); - in the N++ status bar
    message("Message"); - output to a custom message box (my own development):
    http://prntscr.com/j6s9j9
  • Finding Matching Brackets

    4
    0 Votes
    4 Posts
    29k Views
    Scott SumnerS

    @linpengcheng said:

    real-time Matching Current Brackets block

    I think the OP is talking about a different use-case (“long file”, “whole file”) than what the script you pointed to provides. The script is for where the opening and closing delimiter are viewable simultaneously in an editing tab.

  • How to automatically add space on both sides of the operator

    13
    0 Votes
    13 Posts
    7k Views
    Fahim AnwerF

    You could try the “find and replace” feature in regular expression mode, and replace each character x that has no space after it with x followed by a space, within the current selection.

    Save it as a macro and apply it to other parts of the code.
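    The same idea can be sketched in JavaScript. This is a deliberately naive illustration: the function name spaceOperators and the operator list are assumptions, and it does not understand strings, comments, unary minus, or operators such as ++ and ->, so it should only be applied to a reviewed selection.

```javascript
// Naive sketch: put exactly one space on each side of common operators.
// Two-character operators are listed first so "==" is not split into "= =".
function spaceOperators(code) {
  return code.replace(/\s*(==|!=|<=|>=|\+=|-=|\*=|\/=|[=+\-*\/<>])\s*/g, " $1 ");
}

console.log(spaceOperators("a=b+c"));
console.log(spaceOperators("x+=1"));
```

    Existing whitespace around an operator is swallowed by the \s* on each side, so running the replacement twice does not add extra spaces.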

  • Create a login to notepad++ like in google chrome

    2
    0 Votes
    2 Posts
    2k Views
    chcgC

    At least the settings can already be loaded from the local sync folder of a cloud storage service such as Dropbox, OneDrive, … See Settings -> Preferences -> Cloud

  • Collapsed lines become visible when printed?

    Locked
    1
    0 Votes
    1 Posts
    836 Views
    No one has replied
  • Find function window

    Locked
    1
    1 Votes
    1 Posts
    696 Views
    No one has replied
  • How To Set-up Command Pr in Notepad++

    Locked
    1
    0 Votes
    1 Posts
    635 Views
    No one has replied
  • FIXED: Drag and drop files into Notepad++

    Locked
    1
    0 Votes
    1 Posts
    9k Views
    No one has replied
  • False positive warning analysis ?

    2
    0 Votes
    2 Posts
    1k Views
    ChillerThonC

    image details about the file

  • File splitter Enhancement : to view big files

    Locked
    1
    0 Votes
    1 Posts
    780 Views
    No one has replied
  • A silly question...

    3
    0 Votes
    3 Posts
    2k Views
    chcgC

    Could you please add the N++ debug information? What kind of project do you compile, and how? With the help of an N++ plugin?

  • Use of tab for white space

    Locked
    3
    0 Votes
    3 Posts
    2k Views
    Daniel RosenbergD

    Thanks so much. I never thought to look in the language settings, I was always looking in the edit settings.

    Happy Day!

  • Search for text in my tabs.

    Locked
    2
    0 Votes
    2 Posts
    1k Views
    Scott SumnerS

    @Bekno

    How about using the Find All in All Opened Documents button on the Find tab of the Find window??

    Of course, maybe your request makes more sense (as a new feature) if you are talking about limiting the search to open tabs that match a certain filespec…

  • Change style for PHP

    Locked
    1
    0 Votes
    1 Posts
    708 Views
    No one has replied
  • "Search Private" option on "Open" window disappeared.

    Locked
    1
    0 Votes
    1 Posts
    612 Views
    No one has replied