Poor-man's "regex favorites" in Notepad++



  • People often ask for a way to save regular expression find/replace pairs in Notepad++ for easy recall later.
    Usually they also want to give each pair a descriptive name.
    I’m one of these people, although I’ve been silent – never asked for it in a forum.

    I’ve found somewhat of a workaround for this that I’ll share.

    First, a disclaimer: Is it a lame technique? Yes, I’ll admit it. But, I’ve found it handy.

    Here’s the technique:

    • Record a macro that does a Find All in Current Document for the find part of your find/replace pair
    • Save your macro, giving it a descriptive name; of course verify that it finds the correct matches as well
    • Exit Notepad++ so that your macro gets flushed to shortcuts.xml
    • Restart Notepad++
    • Edit shortcuts.xml and go to where your macro is defined
    • Change your find expression text to include the replace expression after the find expression, but, and here’s the important part: inside a regex “comment” syntax
    • Save and close shortcuts.xml; best to restart N++ at this point as well
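
    The shortcuts.xml edit in the steps above could also be scripted. This is only a minimal Python sketch, assuming the recorded macro keeps its Find what text in the sParam attribute of an Action element with message="1601". That layout, the sample XML, the macro name, the REPL: label and the helper name add_replace_note are all assumptions for illustration, so compare with your own file first:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of one recorded macro. The element and attribute names
# are an assumption about the shortcuts.xml layout; other Action lines that a
# real recording contains are left out here.
SAMPLE = r'''<NotepadPlus>
  <Macros>
    <Macro name="Keep doc_id values" Ctrl="no" Alt="no" Shift="no" Key="0">
      <Action type="3" message="1601" wParam="0" lParam="0" sParam="(?s).*?&lt;doc_id&gt;(\d+)&lt;/doc_id&gt;|(?s).*\z" />
    </Macro>
  </Macros>
</NotepadPlus>'''

FIND_WHAT = "1601"  # assumed message id that carries the "Find what" text


def add_replace_note(xml_text, macro_name, replacement):
    """Append the replacement, wrapped in a regex comment, to a macro's find text."""
    if ")" in replacement:
        # A literal ) would close the (?#...) comment early.
        raise ValueError("replacement must not contain ')'")
    root = ET.fromstring(xml_text)
    for macro in root.iter("Macro"):
        if macro.get("name") != macro_name:
            continue
        for action in macro.iter("Action"):
            if action.get("message") == FIND_WHAT:
                find = action.get("sParam")
                action.set("sParam", f"(?x) {find} (?# REPL:{replacement} )")
    return ET.tostring(root, encoding="unicode")


updated = add_replace_note(SAMPLE, "Keep doc_id values", r"${1}\r\n")
print(updated)
```

    The sketch works on a string rather than on the real file, so nothing is overwritten by accident; writing the result back (with Notepad++ closed) is left to the reader.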

    So when you need to recall the find/replace pair for real use, do the following:

    • Run the macro (find it by its descriptive name on the Macro menu)
    • The search will appear in the “Find result” window on a “Search” line; you will see both the find expression and the replace expression
    • Copy+paste each expression to the Replace window’s Find what and Replace with boxes
    • Make any adjustments for your specific situation of the moment (this is why it can’t simply be a macro you just run to do the replacement)
    • Run your replacement

    An example helps.
    I often need to write a replacement such that I only keep the text between two delimiters.
    Say between “doc_id” tags in a text file there are values, and I only need a list of the values and not any of the surrounding text.

    Thus I record my macro to do a Find All in Current Document on this expression:

    (?s).*?<doc_id>(\d+)</doc_id>|(?s).*\z

    When editing shortcuts.xml I change it to the following:

    (?x) (?s).*?<doc_id>(\d+)</doc_id>|(?s).*\z (?# REPL:(?1${1}\r\n) )

    Note from this that I can clearly see my “find” and my “replace” expressions.
    I’ve used the (?x) regex construct so that I can add whitespacing to make it clearer.

    When I need to recall it at some later time, I run the macro, and this appears in the “Find result” window:

    Search "(?x) (?s).*?<doc_id>(\d+)</doc_id>|(?s).*\z (?# REPL:(?1${1}\r\n⟯ )" (2 hits in 1 file of 1 searched)

    Even if I get no hits in the current file, I’ve recalled the patterns I need (for copying).
    And I can adjust accordingly for a related replacement.
    At least I’ve got the cryptic bits of regex text needed without having to go search in a text file of notes.
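
    To check that the added comment and the extra spacing do not change what gets matched, here is a small Python sketch. Python's re module is only a stand-in for the Boost engine in Notepad++: the flags are grouped at the start, \Z is used in place of \z, and the conditional replacement is emulated with a function. The sample text and the name keep_ids are made up for illustration. The annotated pattern is taken from the Search line above.

```python
import re

# Python stand-in for the Notepad++ (Boost) pattern: flags are grouped at the
# start and \Z replaces \z, since that is what Python's re module accepts.
PLAIN = r"(?s).*?<doc_id>(\d+)</doc_id>|.*\Z"
ANNOTATED = r"(?xs) .*?<doc_id>(\d+)</doc_id> | .*\Z (?# REPL:(?1${1}\r\n⟯ )"


def keep_ids(pattern, text):
    # Emulates the replacement (?1${1}\r\n): emit group 1 plus CRLF when
    # group 1 took part in the match, otherwise emit nothing.
    return re.sub(pattern, lambda m: m.group(1) + "\r\n" if m.group(1) else "", text)


text = "junk <doc_id>12</doc_id> more <doc_id>345</doc_id> tail"
print(repr(keep_ids(PLAIN, text)))      # '12\r\n345\r\n'
print(repr(keep_ids(ANNOTATED, text)))  # '12\r\n345\r\n'
```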

    Maybe someone asks a question on the Community about how to keep only the text between /.* and .*/ pairs. It COULD HAPPEN.

    Then, to quickly offer help, I run my macro that puts the technique handily on my screen, and I can copy+paste from it over to the Replace window, make adjustments, and it’s off-to-the-races.

    One caveat (which has its roots in the info HERE):

    I lied about the “replace” expression above; it’s actually this instead:

    (?x) (?s).*?<doc_id>(\d+)</doc_id>|(?s).*\z (?# REPL:(?1${1}\r\n⟯ )

    See the difference?
    I had to use a “mock” right paren (⟯) right after the \n near the end.

    Because my real replacement expression uses a real ), I can’t put one where I’ve indicated, because that will close the (?#...) regex comment structure. I suppose it is okay to do that if you have only ONE right paren in your replace expression and it is the last character (so that it doubles as the comment’s closing paren), and it would then look like this:

    (?x) (?s).*?<doc_id>(\d+)</doc_id>|(?s).*\z (?# REPL:(?1${1}\r\n)
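
    The closing-paren problem can be reproduced outside Notepad++. A minimal sketch using Python's re module as a stand-in (it also ends a (?#...) comment at the first right paren; flags are grouped at the start and \Z replaces \z to suit Python). The variable names and the helper compiles are made up for illustration:

```python
import re

BASE = r"(?xs) .*?<doc_id>(\d+)</doc_id> | .*\Z "


def compiles(pattern):
    """Return True if the pattern is accepted by the regex engine."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Real ) inside the comment plus a separate closing ): the comment ends at the
# first ), leaving a stray ) behind, so the pattern is rejected.
real_then_close = BASE + r"(?# REPL:(?1${1}\r\n) )"
# Mock paren inside the comment: only the final ) closes it.
mock_paren = BASE + r"(?# REPL:(?1${1}\r\n⟯ )"
# A single real ) as the last character doubles as the comment closer.
single_real = BASE + r"(?# REPL:(?1${1}\r\n)"

print(compiles(real_then_close))  # False
print(compiles(mock_paren))       # True
print(compiles(single_real))      # True
```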

    So anyway, long explanation, but I’ve found the technique described useful for several replacements that I just can’t memorize the syntax for.
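
    The copy+paste step could also be scripted. A rough Python sketch that takes a “Search” line as shown earlier and pulls the two expressions back out; it assumes the mock-paren convention with a separate closing paren for the comment (the single-real-paren variant would lose its last character here), and the name parse_search_line is made up:

```python
MOCK_PAREN = "⟯"
LABEL = "(?# REPL:"


def parse_search_line(line):
    """Split a Find result 'Search' line into (find, replace) expressions."""
    # The stored expression sits between the first and the last double quote.
    stored = line[line.index('"') + 1 : line.rindex('"')]
    find, sep, tail = stored.partition(LABEL)
    if not sep:
        raise ValueError("no REPL: comment found")
    tail = tail.rstrip()
    if tail.endswith(")"):
        tail = tail[:-1]  # drop the paren that closes the comment
    # Stripping also removes any intended leading/trailing spaces in the
    # replacement; good enough for a sketch.
    replace = tail.strip().replace(MOCK_PAREN, ")")
    return find.strip(), replace


line = r'Search "(?x) (?s).*?<doc_id>(\d+)</doc_id>|(?s).*\z (?# REPL:(?1${1}\r\n⟯ )" (2 hits in 1 file of 1 searched)'
find, replace = parse_search_line(line)
print(find)     # (?x) (?s).*?<doc_id>(\d+)</doc_id>|(?s).*\z
print(replace)  # (?1${1}\r\n)
```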



  • Hmmm, my extra whitespacing didn’t appear like I wanted; let me try to present it differently (and hopefully more effectively):

    Search "(?x)    (?s).*?<doc_id>(\d+)</doc_id>|(?s).*\z    (?#  REPL:(?1${1}\r\n⟯  )" (2 hits in 1 file of 1 searched)
    

    Yes, that’s better.

    I guess this syntax converts multiple interior spaces into a single space.
    I thought it was a “verbatim” type construct, but it appears NOT.
    :-(
    Something to remember for the future.



  • Hello, @alan-kilborn and All,

    Ah nice, Alan. Clever use of some enhanced regex features!

    The problem when using the free-spacing mode (?x) is that you have to take care of:

    • Any space character in the search part, which must be written as [ ] or \x20

    • Any # character, which needs to be escaped as \#
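
    Both pitfalls are easy to see in a quick test. A small Python sketch (Python's re has a comparable free-spacing mode, so it serves as a stand-in for the Notepad++ engine here; the helper first_match and the sample strings are made up):

```python
import re


def first_match(pattern, text):
    """Return the text of the first match, or None if there is none."""
    m = re.search(pattern, text)
    return m.group() if m else None


# A bare space is ignored in free-spacing mode, so the pattern means "foobar".
print(first_match(r"(?x)foo bar", "foo bar"))     # None
# Written as [ ] or \x20, the space is matched again.
print(first_match(r"(?x)foo[ ]bar", "foo bar"))   # foo bar
print(first_match(r"(?x)foo\x20bar", "foo bar"))  # foo bar
# A bare # starts a comment that runs to the end of the line.
print(first_match(r"(?x)a#b", "a#b"))             # a
# Escaped as \#, it is a literal character.
print(first_match(r"(?x)a\#b", "a#b"))            # a#b
```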


    I thought about these limitations and, remembering the backtracking control verbs introduced with Boost v1.70, I found a very interesting application of the (*FAIL) backtracking control verb, also written (*F)! Refer to:

    https://community.notepad-plus-plus.org/post/55467

    Indeed, the magical syntax is:

    Your search regex|(*F)\Q REPL:Your replacement regex

    The only thing to do is add a new alternative to your overall pattern, one which looks for the string REPL: followed by your replacement regex, taken literally because of the \Q syntax. But, because of the (*F) control verb, this additional alternative always fails and will never match anything!

    In brief, during execution it’s just as if this regex were reduced to your initial search regex ;-)) In addition, whatever the replacement contains, no more worries, as the \Q ensures that everything after it is treated as literal characters! Anyway, the \Q is mandatory, as some replacement regexes may be invalid when used as search regexes!


    For instance, given the text:

    AAA BBB CCC
    

    The regex (AAA)|(BBB)|(CCC)|(*F)\Q REPL:--(?1XXX)(?2YYY)(?3ZZZ)-- does find the strings AAA, BBB and CCC

    And when using the regex S/R:

    SEARCH (AAA)|(BBB)|(CCC)|(*F)\Q REPL:--(?1XXX)(?2YYY)(?3ZZZ)--

    REPLACE --(?1XXX)(?2YYY)(?3ZZZ)--

    We get, as expected, the string:

    --XXX-- --YYY-- --ZZZ--
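
    The expected result can be cross-checked with a short Python sketch. Python's re module has neither (*F) nor \Q nor conditional replacements, so this is only an emulation: the always-failing alternative is left out (it never matches anyway) and the replacement --(?1XXX)(?2YYY)(?3ZZZ)-- is rebuilt by a function. The name conditional_replace is made up:

```python
import re


def conditional_replace(text):
    # Emulates --(?1XXX)(?2YYY)(?3ZZZ)--: each (?nTEXT) contributes TEXT only
    # when group n took part in the match.
    def repl(m):
        parts = [
            word
            for n, word in ((1, "XXX"), (2, "YYY"), (3, "ZZZ"))
            if m.group(n) is not None
        ]
        return "--" + "".join(parts) + "--"

    return re.sub(r"(AAA)|(BBB)|(CCC)", repl, text)


print(conditional_replace("AAA BBB CCC"))  # --XXX-- --YYY-- --ZZZ--
```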
    

    It’s worth pointing out this easy method:

    • Select the text (AAA)|(BBB)|(CCC)|(*F)\Q REPL:--(?1XXX)(?2YYY)(?3ZZZ)--

    • Open the Replace dialog, with Ctrl + H

    • Click within the Find what: zone, right after the string REPL:

    • Select the rest of the Find what: zone, i.e. the text --(?1XXX)(?2YYY)(?3ZZZ)--

    • Copy this text to the clipboard, with Ctrl + C

    • Paste it in the Replace with: zone, with Ctrl + V

    Voilà!
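
    For anyone keeping such combined strings in a notes file, the select/copy/paste steps above amount to splitting the string at the marker. A minimal Python sketch of that split; the function name split_favorite is made up, and the marker is the one from this post:

```python
MARKER = r"|(*F)\Q REPL:"


def split_favorite(stored):
    """Split 'search|(*F)\\Q REPL:replacement' into (search, replacement)."""
    find, sep, replace = stored.partition(MARKER)
    if not sep:
        raise ValueError("no REPL: marker found")
    return find, replace


find, replace = split_favorite(
    r"(AAA)|(BBB)|(CCC)|(*F)\Q REPL:--(?1XXX)(?2YYY)(?3ZZZ)--"
)
print(find)     # (AAA)|(BBB)|(CCC)
print(replace)  # --(?1XXX)(?2YYY)(?3ZZZ)--
```

    Since the extra alternative never matches, the whole stored string can stay in Find what; only the replacement part really needs to be extracted.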

    Best Regards,

    guy038



  • @guy038 said in Poor-man's "regex favorites" in Notepad++:

    Your search regex|(*F)\Q REPL:Your replacement regex

    Nice one, Guy!
    I’d like the original (?#...) one better, because it is more readable (once one knows that it is a “comment”), except for the problem I noted with the closing parens.
    Because of that limitation, |(*F)\Q is definitely better here.

