Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation
-
Hi, @vasile-caraus and All,
Oops! You’re quite right about it! I didn’t notice the space character right after <p class="best"> and right before </p>, once the S/R is done :-((
Here is one possible solution (the shortest one that I could find so far):
SEARCH
(?-si)((<p class="best">)|\G)((?!</p>).)*?\K\h+(?=(</p>)|)
REPLACE
?2:(?4:\x20)
Notes : In the replacement, the conditional syntax leads to the following logic :
- If group 2 exists ( the starting tag = <p class="best"> ), we do nothing, so the blank chars matched by \h+ are deleted
- Else, if group 4 exists ( the ending tag = </p> ), in the same way, the blank chars matched by \h+ are deleted
- Else ( the case where the blank characters matched by \h+ are neither preceded by <p class="best"> nor followed by </p> ), a single space char replaces the whole range of blank characters \h+, whatever they are !
So, for instance, the text :
abc def <p class="best"> I go home with my mother </em> and my father is watching tv. </p> abc def
will be changed into :
abc def <p class="best">I go home with my mother </em> and my father is watching tv.</p> abc def
Cheers,
guy038
P.S. :
Within the positive look-ahead, at the end of the regex, we may drop the alternation symbol ( | ) and use, instead, the (?=(</p>)?) syntax, with the optional group (</p>) ! The replacement regex is identical.
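For readers who want to try this clean-up outside Notepad++, here is a minimal Python sketch of the intent of the S/R above. It is not a port of the regex: Python's standard re module has no \K, \G or \h, so the sketch swaps in a different technique, a re.sub callback applied to each <p class="best">...</p> span found on a line. The name tidy_best_paragraphs and the use of [ \t] as a rough stand-in for \h are assumptions made for illustration only.

```python
import re

# One <p class="best">...</p> element on a single line.
# The dot does not match newlines by default, mirroring (?-s) in the S/R.
BEST_P = re.compile(r'(<p class="best">)(.*?)(</p>)')


def tidy_best_paragraphs(text: str) -> str:
    """Trim blanks next to the tags and collapse inner blank runs to one space."""

    def fix(m: re.Match) -> str:
        inner = m.group(2).strip(" \t")        # blanks after <p ...> and before </p>
        inner = re.sub(r"[ \t]+", " ", inner)  # any other blank run -> one space
        return m.group(1) + inner + m.group(3)

    return BEST_P.sub(fix, text)


sample = 'abc  def <p class="best">   I go   home </p> abc  def'
print(tidy_best_paragraphs(sample))
# abc  def <p class="best">I go home</p> abc  def
```

Text outside the tags is left untouched, which is the stated goal of the thread; the Notepad++ regex reaches the same result in one pass by chaining matches with \G.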
-
@guy038 said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:
SEARCH (?-si)((<p class="best">)|\G)((?!</p>).)*?\K\h+(?=(</p>)|)
REPLACE ?2:(?4:\x20)
Your regex is GREAT for the first option, thank you.
But it seems that I didn’t mention this one. In my html pages, I have both kinds of lines: some lines with tags that contain 2-3 SPACES between words, and some with tags that have only one space between words (those are good).
So, I need to replace just those lines that have more than one space between words (like your regex does, very good), but leave alone those that don’t have two or more spaces between words (such as the second line).
-
Hello, @vasile-caraus and All,
OK ! Here is another solution, slightly longer, which looks for :
- All horizontal blank characters right after the string <p class="best">
- All horizontal blank characters right before the string </p>
- The excess horizontal blank characters, only, if they are not adjacent to the starting and/or ending tag
In the last case, this means that it skips every blank range that is only one character long, as such ranges are not concerned by the S/R.
SEARCH
(?-si)(?<=<p class="best">)\K\h+|\G((?!</p>).)*?\K(\h+(?=</p>)|\h\K\h+)
REPLACE
Leave EMPTY
Best Regards,
guy038
P.S. :
This S/R does work, also, in the two particular cases, below :
abc def <p class="best"> Test </p> abc def
abc def <p class="best"> </p> abc def
giving the results :
abc def <p class="best">Test</p> abc def
abc def <p class="best"></p> abc def
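As a cross-check of what this second S/R is meant to do, here is a small Python sketch. It is an emulation under stated assumptions, not the Boost regex itself: Python's re module lacks \K and \G, so a fixed-width look-behind plays the role of \h\K\h+ (keep the first blank of a run, delete the rest), and [ \t] roughly stands in for \h. The function name drop_excess_blanks is hypothetical.

```python
import re

# One <p class="best">...</p> element on a single line.
BEST_P = re.compile(r'(<p class="best">)(.*?)(</p>)')


def drop_excess_blanks(text: str) -> str:
    """Delete blanks touching the tags; inside, keep only the first blank of each run."""

    def fix(m: re.Match) -> str:
        inner = m.group(2).strip(" \t")
        # A blank preceded by another blank is "excess": delete it.
        # Runs that are one character long never match, so they are skipped.
        inner = re.sub(r"(?<=[ \t])[ \t]+", "", inner)
        return m.group(1) + inner + m.group(3)

    return BEST_P.sub(fix, text)


print(drop_excess_blanks('abc def <p class="best"> Test </p> abc def'))
# abc def <p class="best">Test</p> abc def
print(drop_excess_blanks('abc def <p class="best"> </p> abc def'))
# abc def <p class="best"></p> abc def
```

Because nothing is substituted, a run such as two tabs shrinks to its first tab rather than to a space, which matches the empty replacement field of the S/R.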
-
@guy038 said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:
(?-si)(?<=<p class="best">)\K\h+|\G((?!</p>).)*?\K(\h+(?=</p>)|\h\K\h+)
Almost :) Please see this case (the last 2 lines don’t change after using the regex). Those have 2 tabs before the starting <p class.. tag. Please copy the text to see. Must change a little bit the regex. :)
<p class="best"> WORKS FINE </p>
<p class="best"> WORKS FINE </p>
<p class="best">Why are you so beauty? </p>
<p class="best">I go home. </p>
-
@Vasile-Caraus said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:
Almost :)
This is the third time you’ve changed the requirements after the original request. If what Guy has given you is close, then using the examples and details that Guy has given you, plus the documentation linked in the Community FAQ or directly in the official Notepad++ Documentation set, you should be able to give it a try, and attempt to make your requested fixes yourself.
After you’ve tried, if it works, great! If not, show us what you tried, why you thought it would work, and give examples of how it didn’t work right. We’re here to help you learn how to use the tool, not to just supply all your regexes without any effort from you.
-----
Please Read And Understand This
FYI: I often add this to my response in regex threads, unless I am sure the original poster has seen it before. Here is some helpful information for finding out more about regular expressions, and for formatting posts in this forum (especially quoting data) so that we can fully understand what you’re trying to ask:
This forum is formatted using Markdown. Fortunately, it has a formatting toolbar above the edit window, and a preview window to the right; make use of those. The </> button formats text as “code”, so that the text you format with that button will come through literally; use that formatting for example text that you want to make sure comes through literally, no matter what characters you use in the text (otherwise, the forum might interpret your example text as Markdown, with unexpected-for-you results, giving us a bad indication of what your data really is). Images can be pasted directly into your post, or you can hit the image button. (For more about how to manually use Markdown in this forum, please see @Scott-Sumner’s post in the “how to markdown code on this forum” topic, and my updates near the end.) Please use the preview window on the right to confirm that your text looks right before hitting SUBMIT. If you want to clearly communicate your text data to us, you need to properly format it.
If you have further search-and-replace (“matching”, “marking”, “bookmarking”, regular expression, “regex”) needs, study the official Notepad++ searching using regular-expressions docs, as well as this forum’s FAQ and the documentation it points to. Before asking a new regex question, understand that for future requests, many of us will expect you to show what data you have (exactly), what data you want (exactly), what regex you already tried (to show that you’re showing effort), why you thought that regex would work (to prove it wasn’t just something randomly typed), and what data you’re getting with an explanation of why that result is wrong. When you show that effort, you’ll see us bend over backward to get things working for you. If you need help formatting, see the paragraph above.
Please note that for all regex and related queries, it is best if you are explicit about what needs to match, and what shouldn’t match, and have multiple examples of both in your example dataset. Often, what shouldn’t match helps define the regular expression as much or more than what should match.
-
@PeterJones said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:
This is the third time you’ve changed the requirements
@Vasile-Caraus gets spanked for this kind of thing a lot here. He goes away for a while (licking his wounds) but then returns with the same approach. :-(
-
people, calm down. I came here for help. I am not a scientist, not a programmer, just know how to use regex. And if someone helps me with a solution, I try to make it the best solution. That’s all.
I am a fan of Notepad ++, it helps me modify my .html files, because I have tried hard to make a website and I have a lot of bugs.
No one is bound to help me. But maybe one day, this topic will come to the aid of someone else. And @guy038 is always here when need it.
-
@Vasile-Caraus said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:
people, calm down.
We’re calm. We’re just trying to help you learn.
I came here for help.
Help us help you. Show effort. Give us a reason to want to continue to help you. Right now, it feels like you’re asking us to do your homework (or worse, the job you’re being paid to do) for you, for free.
I am not a scientist, not a programmer, just know how to use regex.
You don’t have to be a scientist nor a programmer to be able to follow advice and read documentation and try to understand what’s already been explained and given to you, and try to modify that to fit your actual needs.
No one is bound to help me.
Definitely true. But using phrases like “Must change” makes it sound rather demanding. (I understand that English might not be your native language.)
But part of my definition of “help” is “help the person learn”, not “just give them the answer”. If we help you to learn how to do this yourself, you could be much more efficient in writing your HTML website (waiting hours or days for one of us to write a regex for you is rather inefficient). By encouraging you to learn regex yourself, rather than just relying on us to write the regex for you, we are trying to help you.
But maybe one day, this topic will come to the aid of someone else.
Indeed
And @guy038 is always here when need it.
Not always. There have been long stretches when he’s not around. And who knows, he might go the same way as others of the long-time contributors to the forum. Or the forum may be killed off when Don gets tired of it. In the long term, it’s better for you to learn.
One of the best ways to learn regex is to study what’s been given, try to make changes that you think will work the way you want it to, and then ask specific questions if it doesn’t work as you expected.
And one of the best ways to get the regex you want on the first time you ask the question, rather than having to do 3+ iterations, is to give truly-representative data sets. Make sure the example data you post includes both lines that you want to be changed and ones you don’t; make sure they include a reasonable variety of spacing variations, to make it clear when you want space to be important and when you don’t.
This is all advice to help you learn, and to help you ask better questions in the future.
-
Hi, @vasile-caraus and All,
As always, regexes should be tested against real user text ! Vasile, this issue has nothing to do with leading tab characters ! It’s simply because, in your 3rd and 4th lines, the starting tag is not followed by any space char !
So :
WRONG
(?-si)(?<=<p class="best">)\K\h+|\G((?!</p>).)*?\K(\h+(?=</p>)|\h\K\h+)
RIGHT
(?-si)(?<=<p class="best">)\K\h*|\G((?!</p>).)*?\K(\h+(?=</p>)|\h\K\h+)
Indeed, because of the \G syntax and the fact that the dot . does not match EOL chars, once the regex engine has processed the possible blank characters before </p>, the only way for the S/R process to go on and move to the next line is to match, first, the first alternative of the regex, i.e. the new beginning part (?<=<p class="best">)\K\h*, which allows for a possible lack of blank chars ;-))
Then, due to the \G feature, further blank characters on this next line can be deleted !
So, from the text :
<p class="best"> WORKS FINE </p>
<p class="best"> WORKS FINE </p>
<p class="best">Why are you so beauty? </p>
<p class="best">I go home. </p>
This time, we get, as expected :
<p class="best">WORKS FINE</p>
<p class="best">WORKS FINE</p>
<p class="best">Why are you so beauty?</p>
<p class="best">I go home.</p>
BR
guy038
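The difference between \h+ and \h* in the first alternative can be reproduced in a few lines of Python. This is only a rough illustration, assuming that [ \t] stands in for \h and that the \G anchor can be emulated by calling Pattern.match() at the end of the previous match (Python's re module has no \G or \K). The helper trailing_blank_span and the variable names are hypothetical.

```python
import re

TAG = '<p class="best">'
line = TAG + 'I go home. </p>'   # no blank right after the opening tag

# First alternative of the S/R, in its two versions.
first_plus = re.compile(r'(?<=<p class="best">)[ \t]+')   # WRONG version
first_star = re.compile(r'(?<=<p class="best">)[ \t]*')   # RIGHT version

# Trailing-blank branch of the second alternative, without its \G anchor.
rest = re.compile(r'((?!</p>).)*?([ \t]+(?=</p>))')


def trailing_blank_span(first: re.Pattern):
    """Span of the blanks before </p>, or None if the chain of matches never starts."""
    start = first.search(line)
    if start is None:
        return None                       # nothing for the \G-style step to continue from
    m = rest.match(line, start.end())     # anchored at the previous match end, like \G
    return m.span(2) if m else None


print(trailing_blank_span(first_plus))   # None
print(trailing_blank_span(first_star))   # (26, 27)
```

With the + quantifier there is no match right after the opening tag, so the anchored step has no starting point on that line; with * an empty match at that position is enough to let the trailing blank be found.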
-
@guy038 said in Loop or "batch command" - Search and Replace (especially regex) , in order to run multiple times the operation:
(?-si)(?<=<p class="best">)\K\h*|\G((?!</p>).)*?\K(\h+(?=</p>)|\h\K\h+)
Now, this is the best solution ! Probably, if I hadn’t tested other cases, it would not have been complete.
Thank you very much, @guy038.