Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation



  • Hello. I just ran a regex such as this:

    Search: \s\S*(?:<p class="best">|\G)(?:(?!</p>).)*?\s\K\s+|(?<=<p class="best">)\s+|\s+(?=</p>)
    Replace by: (Leave Empty)

    This regex deletes all the extra spaces inside the tag <p class="best">. But it does not finish the job on the first attempt, so I must run this regex many times.

    I’m thinking of doing something like a "loop" or "batch command", in order to run the operation multiple times, until all the replacements are finally done.

    How can I do this?
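
    The "run it again until nothing changes" idea can be sketched outside Notepad++ as a small Python loop. This is only an illustration under assumptions: Python's standard re module has no \G or \K, so the pattern below is a simpler stand-in (not the regex above) that removes one run of excess blanks per zone on each pass; [ \t] approximates a blank character; replace_until_stable and max_passes are hypothetical names.

```python
import re

# Assumption: a stand-in pattern, NOT the regex from the post. It keeps the
# text up to and including one blank (group 1) and matches the blanks after it,
# so each pass fixes only the first run of excess blanks in a zone.
ONE_PASS = re.compile(r'(<p class="best">(?:(?!</p>).)*?[ \t])[ \t]+')


def replace_until_stable(text, max_passes=100):
    """Repeat the same substitution until a pass replaces nothing."""
    passes = 0
    while passes < max_passes:
        new_text, count = ONE_PASS.subn(r"\1", text)
        if count == 0:  # nothing left to replace: stop looping
            break
        text = new_text
        passes += 1
    return text, passes


line = '<p class="best">I go    home with my     father.</p>'
result, passes = replace_until_stable(line)
print(result)  # <p class="best">I go home with my father.</p>
print(passes)  # 2
```

    On this sample two passes change the text and a third finds nothing left to replace, which is the stopping condition a manual "click Replace All again" loop relies on.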



  • Hello @vasile-caraus and All,

    If we assume that :

    • Any zone <p class="best">........</p> is a single line area of chars

    • Any zone <p class="best">........</p> is a non-nested zone

    I mean that the case ........<p class="best">.........<p class="best">...... ..</p>... ..</p>....... never happens

    Then, the regex, below, will get rid of all excess horizontal blank characters, between any starting tag <p class="best">, with that exact case, and its ending tag </p>.

    Note that text, in current line, before the starting tag <p class="best"> and after the ending tag </p> is not concerned by this S/R

    SEARCH (?-si)(<p class="best">|\G)((?!</p>).)*?\h\K\h+

    REPLACE Leave Empty

    Simply click, once, on the Replace All button

    Remark : I did not use non-capturing groups, as the first group represents either a 16-char string or the 0-char assertion \G, and the second group is a 1-char string, anyway !
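
    For readers who want to reproduce the effect of this S/R in a script, here is a rough Python sketch. It is not the Notepad++ regex: Python's standard re module lacks \G, \K and \h, so the sketch swaps in a different technique, a callback applied to each zone, and approximates \h with [ \t]. The names ZONE, EXCESS and squeeze_blanks are hypothetical.

```python
import re

# One single-line, non-nested zone: opening tag, inner text, closing tag.
ZONE = re.compile(r'(<p class="best">)((?:(?!</p>).)*)(</p>)')
# A blank followed by more blanks; only the first blank is kept.
EXCESS = re.compile(r'([ \t])[ \t]+')


def squeeze_zone(match):
    opening, inner, closing = match.groups()
    return opening + EXCESS.sub(r'\1', inner) + closing


def squeeze_blanks(text):
    """Delete excess blanks inside each zone; text outside is untouched."""
    return ZONE.sub(squeeze_zone, text)


sample = 'a   b <p class="best"> I go    home   now </p> c   d'
print(squeeze_blanks(sample))
# a   b <p class="best"> I go home now </p> c   d
```

    As with the S/R above, a single pass is enough, and a lone blank right after the starting tag or right before the ending tag is left in place.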

    Best Regards,

    guy038



  • @guy038 said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:

    (?-si)(<p class="best">|\G)((?!</p>).)*?\h\K\h+

    Not very good. My regex seems to be better, even if it is too long. Also, your regex needs a lot of "Replace All" clicks.

    Please see this example: after using your regex (and mine), there are still spaces at the beginning and at the end of the paragraph text.

    <p class="best"> I go    home with my mother </em> and my     father is watching tv. </p>
    

    should become:

    <p class="best">I go home with my mother </em> and my father is watching tv.</p>
    


  • Hi, @vasile-caraus and All,

    Oooupps ! You’re quite right about it ! I didn’t notice the space character, right after <p class="best"> and right before </p>, once the S/R is done :-((

    Here is one possible solution ( the shortest one that I could find, so far ! ) :

    SEARCH (?-si)((<p class="best">)|\G)((?!</p>).)*?\K\h+(?=(</p>)|)

    REPLACE ?2:(?4:\x20)

    Notes : In replacement, the conditional syntaxes lead to the following logic :

    • If group 2 exists ( the starting tag = <p class="best"> ), we do nothing, so the blank chars matched \h+ are deleted

      • Else, if group 4 exists ( the ending tag = </p> ), in the same way, the blank chars matched \h+ are deleted

        • Else ( case where blank character(s) matched ( \h+ ) are, both, not preceded with <p class="best"> and not followed with </p> ), a single space char replaces the overall range of blank characters \h+, whatever they are !

    So, for instance, the text :

    abc   def   <p class="best">   I go    home with my mother </em> and my     father is watching tv.   </p>   abc   def
    

    will be changed into :

    abc   def   <p class="best">I go home with my mother </em> and my father is watching tv.</p>   abc   def
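
    The three-way logic of the replacement can be mirrored in a short Python sketch. It is an approximation under assumptions, not the S/R itself: Python's standard re module has no \G, \K, \h or conditional replacement syntax, so a callback decides what each run of blanks becomes, and [ \t] stands in for \h. The names tidy_inner and tidy_zones are hypothetical.

```python
import re

ZONE = re.compile(r'(<p class="best">)((?:(?!</p>).)*)(</p>)')
BLANKS = re.compile(r'[ \t]+')


def tidy_inner(inner):
    def replace_run(run):
        touches_open = run.start() == 0            # right after the starting tag
        touches_close = run.end() == len(inner)    # right before the ending tag
        if touches_open or touches_close:
            return ''                              # delete the run
        return ' '                                 # otherwise: one space
    return BLANKS.sub(replace_run, inner)


def tidy_zones(text):
    return ZONE.sub(
        lambda m: m.group(1) + tidy_inner(m.group(2)) + m.group(3), text
    )


before = ('abc   def   <p class="best">   I go    home with my mother </em>'
          ' and my     father is watching tv.   </p>   abc   def')
after = tidy_zones(before)
print(after)
```

    Run on the sample line above, this prints the same changed line, with the text outside the zone left as it was.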
    

    Cheers,

    guy038

    P.S. :

    Within the positive look-ahead, at the end of the regex, we may drop the alternation symbol | and use, instead, the (?=(</p>)?) syntax, with the optional group </p> ! The replacement regex is identical



  • @guy038 said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:

    SEARCH (?-si)((<p class="best">)|\G)((?!</p>).)*?\K\h+(?=(</p>)|)

    REPLACE ?2:(?4:\x20)

    Your regex is GREAT for the first option, thank you.

    But it seems that I didn’t mention this one: in my html pages, I have both kinds of lines. Some lines have tags that contain 2-3 SPACES between words, and some have tags with only one space between words (those are good).

    So, I need to replace just those lines that have more than one space between words (like your regex does, very good), but leave alone those that don’t have two or more spaces between words (such as the second line).



  • Hello, @vasile-caraus and All,

    Ok ! Here is another solution, slightly longer, which looks for :

    • All horizontal blank characters right after the string <p class="best">

    • All horizontal blank characters right before the string </p>

    • The excess horizontal blank characters, only, if they are not adjacent to the starting and/or ending tag

    In the last case, this means that it skips all ranges of 1-space long, not concerned by the S/R

    SEARCH (?-si)(?<=<p class="best">)\K\h+|\G((?!</p>).)*?\K(\h+(?=</p>)|\h\K\h+)

    REPLACE Leave EMPTY

    Best Regards,

    guy038

    P.S. :

    This S/R does work, also, in the two particular cases, below :

    abc   def   <p class="best">   Test    </p>   abc   def
    
    abc   def   <p class="best">           </p>   abc   def
    

    giving the results :

    abc   def   <p class="best">Test</p>   abc   def
    
    abc   def   <p class="best"></p>   abc   def
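
    To check that ranges of a single space really are skipped, one can count the replacements. The Python sketch below is an approximation under assumptions: it does not use the regex above (Python's standard re module has no \G, \K or \h), [ \t] stands in for \h, and clean_line is a hypothetical helper that returns the cleaned line together with the number of replacements made.

```python
import re

ZONE = re.compile(r'(<p class="best">)((?:(?!</p>).)*)(</p>)')
# Inside a zone: leading blanks | trailing blanks | blanks after a kept blank.
TARGET = re.compile(r'^[ \t]+|[ \t]+$|(?<=[ \t])[ \t]+')


def clean_line(line):
    """Return (cleaned line, number of blank runs deleted inside zones)."""
    total = 0

    def fix(m):
        nonlocal total
        cleaned, n = TARGET.subn('', m.group(2))
        total += n
        return m.group(1) + cleaned + m.group(3)

    return ZONE.sub(fix, line), total


print(clean_line('abc   def   <p class="best">   Test    </p>   abc   def'))
print(clean_line('abc   def   <p class="best">           </p>   abc   def'))
print(clean_line('<p class="best">I go home.</p>'))  # already clean: 0 replacements
```

    The third call reports zero replacements, which is the point of this version: lines whose words are separated by one space are never touched.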
    


  • @guy038 said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:

    (?-si)(?<=<p class="best">)\K\h+|\G((?!</p>).)*?\K(\h+(?=</p>)|\h\K\h+)

    Almost :) Please see this case (the last 2 lines don’t change after using the regex). Those have 2 tabs before the starting <p class... Please copy the text to see. Must change the regex a little bit. :)

     <p class="best">  WORKS FINE        </p>
    <p class="best">  WORKS FINE        </p>
        <p class="best">Why are you so beauty?        </p>
        <p class="best">I go   home. </p>


  • @Vasile-Caraus said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:

    Almost :)

    This is the third time you’ve changed the requirements after the original request. If what Guy has given you is close, then using the examples and details that Guy has given you, plus the documentation linked in the Community FAQ or directly in the official Notepad++ Documentation set, you should be able to give it a try, and attempt to make your requested fixes yourself.

    After you’ve tried, if it works, great! If not, show us what you tried, why you thought it would work, and give examples of how it didn’t work right. We’re here to help you learn how to use the tool, not to just supply all your regexes without any effort from you.

    -----

    Please Read And Understand This

    FYI: I often add this to my response in regex threads, unless I am sure the original poster has seen it before. Here is some helpful information for finding out more about regular expressions, and for formatting posts in this forum (especially quoting data) so that we can fully understand what you’re trying to ask:

    This forum is formatted using Markdown. Fortunately, it has a formatting toolbar above the edit window, and a preview window to the right; make use of those. The </> button formats text as “code”, so that the text you format with that button will come through literally; use that formatting for example text that you want to make sure comes through literally, no matter what characters you use in the text (otherwise, the forum might interpret your example text as Markdown, with unexpected-for-you results, giving us a bad indication of what your data really is). Images can be pasted directly into your post, or you can hit the image button. (For more about how to manually use Markdown in this forum, please see @Scott-Sumner’s post in the “how to markdown code on this forum” topic, and my updates near the end.) Please use the preview window on the right to confirm that your text looks right before hitting SUBMIT. If you want to clearly communicate your text data to us, you need to properly format it.

    If you have further search-and-replace (“matching”, “marking”, “bookmarking”, regular expression, “regex”) needs, study the official Notepad++ searching using regular-expressions docs, as well as this forum’s FAQ and the documentation it points to. Before asking a new regex question, understand that for future requests, many of us will expect you to show what data you have (exactly), what data you want (exactly), what regex you already tried (to show that you’re showing effort), why you thought that regex would work (to prove it wasn’t just something randomly typed), and what data you’re getting with an explanation of why that result is wrong. When you show that effort, you’ll see us bend over backward to get things working for you. If you need help formatting, see the paragraph above.

    Please note that for all regex and related queries, it is best if you are explicit about what needs to match, and what shouldn’t match, and have multiple examples of both in your example dataset. Often, what shouldn’t match helps define the regular expression as much or more than what should match.



  • @PeterJones said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:

    This is the third time you’ve changed the requirements

    @Vasile-Caraus gets spanked for this kind of thing a lot here. He goes away for a while (licking his wounds) but then returns with the same approach. :-(



  • people, calm down. I came here for help. I am not a scientist, not a programmer, just know how to use regex. And if someone helps me with a solution, I try to make it the best solution. That’s all.

    I am a fan of Notepad ++, it helps me modify my .html files, because I have tried hard to make a website and I have a lot of bugs.

    No one is bound to help me. But maybe one day, this topic will come to the aid of someone else. And @guy038 is always here when need it.



  • @Vasile-Caraus said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:

    people, calm down.

    We’re calm. We’re just trying to help you learn.

    I came here for help.

    Help us help you. Show effort. Give us a reason to want to continue to help you. Right now, it feels like you’re asking us to do your homework (or worse, the job you’re being paid to do) for you, for free.

    I am not a scientist, not a programmer, just know how to use regex.

    You don’t have to be a scientist nor a programmer to be able to follow advice and read documentation and try to understand what’s already been explained and given to you, and try to modify that to fit your actual needs.

    No one is bound to help me.

    Definitely true. But using phrases like “Must change” makes it sound rather demanding. (I understand that English might not be your native language.)

    But part of my definition of “help” is “help the person learn”, not “just give them the answer”. If we help you to learn how to do this yourself, you could be much more efficient in writing your HTML website (waiting hours or days for one of us to write a regex for you is rather inefficient). By encouraging you to learn regex yourself, rather than just relying on us to write the regex for you, we are trying to help you.

    But maybe one day, this topic will come to the aid of someone else.

    Indeed

    And @guy038 is always here when need it.

    Not always. There have been long stretches when he’s not around. And who knows, he might go the same way as others of the long-time contributors to the forum. Or the forum may be killed off when Don gets tired of it. In the long term, it’s better for you to learn.

    One of the best ways to learn regex is to study what’s been given, try to make changes that you think will work the way you want it to, and then ask specific questions if it doesn’t work as you expected.

    And one of the best ways to get the regex you want on the first time you ask the question, rather than having to do 3+ iterations, is to give truly-representative data sets. Make sure the example data you post includes both lines that you want to be changed and ones you don’t; make sure they include a reasonable variety of spacing variations, to make it clear when you want space to be important and when you don’t.

    This is all advice to help you learn, and to help you ask better questions in the future.



  • Hi, @vasile-caraus and All,

    As always, regexes should be tested against real user text ! Vasile, this issue has nothing to do with leading tab characters ! It’s, simply, because, in your 3rd and 4th lines, the starting tag is not followed with any space char !

    So :

    WRONG (?-si)(?<=<p class="best">)\K\h+|\G((?!</p>).)*?\K(\h+(?=</p>)|\h\K\h+)

    RIGHT    (?-si)(?<=<p class="best">)\K\h*|\G((?!</p>).)*?\K(\h+(?=</p>)|\h\K\h+)

    Indeed, because of the \G syntax and the fact that the dot . does not match EOL chars, once the regex engine has processed the possible blank characters before </p>, the only way for the S/R process to go on and skip to the next line is to match, first, the first alternative of the regex, i.e. the new beginning part (?<=<p class="best">)\K\h*, which allows for a possible lack of blank chars ;-))

    Then, due to the \G feature, further blank characters on this next line can be deleted !

    So, from the text :

     <p class="best">  WORKS FINE        </p>
    <p class="best">  WORKS FINE        </p>
    	<p class="best">Why are you so beauty?        </p>
    	<p class="best">I go   home. </p>
    

    This time, we get, as expected :

     <p class="best">WORKS FINE</p>
    <p class="best">WORKS FINE</p>
    	<p class="best">Why are you so beauty?</p>
    	<p class="best">I go home.</p>
    

    BR

    guy038
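
    Since the fix only showed up when testing against real text, a small table of before/after lines is a convenient way to re-check any candidate solution. The Python sketch below is an illustration under assumptions: clean is a callback-based stand-in for the S/R (Python's standard re module has no \G, \K or \h, and [ \t] approximates \h), the leading whitespace of the four lines is written as one space or one tab, and the last two cases are hypothetical "must not change" lines that are not taken from the thread.

```python
import re

ZONE = re.compile(r'(<p class="best">)((?:(?!</p>).)*)(</p>)')


def clean(line):
    def fix(m):
        inner = m.group(2).strip(' \t')                  # blanks next to the tags
        inner = re.sub(r'([ \t])[ \t]+', r'\1', inner)   # excess blanks elsewhere
        return m.group(1) + inner + m.group(3)
    return ZONE.sub(fix, line)


# (input line, expected line)
CASES = [
    (' <p class="best">  WORKS FINE        </p>', ' <p class="best">WORKS FINE</p>'),
    ('<p class="best">  WORKS FINE        </p>', '<p class="best">WORKS FINE</p>'),
    ('\t<p class="best">Why are you so beauty?        </p>',
     '\t<p class="best">Why are you so beauty?</p>'),
    ('\t<p class="best">I go   home. </p>', '\t<p class="best">I go home.</p>'),
    # Hypothetical lines that must stay untouched:
    ('\t<p class="best">I go home.</p>', '\t<p class="best">I go home.</p>'),
    ('<p>two   spaces  outside  the  zone</p>', '<p>two   spaces  outside  the  zone</p>'),
]


def failures(transform, cases):
    """Return (input, expected, actual) for every case that does not match."""
    return [(src, want, transform(src))
            for src, want in cases if transform(src) != want]


print(failures(clean, CASES))  # an empty list means every case passed
```

    Listing lines that must not change alongside lines that must change makes it harder for a later tweak to fix one case while silently breaking another.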



  • @guy038 said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:

    (?-si)(?<=<p class="best">)\K\h*|\G((?!</p>).)*?\K(\h+(?=</p>)|\h\K\h+)

    Now, this is the best solution ! Probably, if I hadn’t tested other cases, it would not have been complete.

    thank you very much @guy038 .

