Unable to use Replace in files
-
Hello, @dr-ramaanand, @peterjones, @alan-kilborn and All,
-
First, @dr-ramaanand, I don’t understand why you insist on using very complicated regexes, with backtracking verbs ?! There are certainly simpler ways to get the job done ! To my mind, if you are able to express your needs in natural language, you’ll probably find the correct regex more easily !
-
Secondly, as usual, you must test on a single file to begin with and verify that the replacement is exactly what you expect !
-
Thirdly, if you are to run a replacement on a lot of files in one go, please, back up these files first : One never knows !
-
Fourthly, it could be that one or several of the files needing replacement are too large or too complicated, and lead to the message :
The complexity of matching the regular expression exceeded predefined bounds. Try refactoring the regular expression to make each choice made by the state machine unambiguous.
In that case, you could probably try the replacement on subsets of your files.
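For the backup and "subsets" advice, here is a minimal Python sketch, run outside Notepad++. The function names `backup_files` and `in_batches` are hypothetical, and it assumes that all the files sit in one flat folder, so that no two of them share the same name :

```python
import shutil
from pathlib import Path

def backup_files(paths, backup_dir):
    """Copy each file into backup_dir before any replacement touches it."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copies = []
    for path in paths:
        target = backup_dir / Path(path).name
        shutil.copy2(path, target)  # copy2 also preserves the timestamps
        copies.append(target)
    return copies

def in_batches(items, size):
    """Yield successive lists of at most `size` items."""
    items = list(items)
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each batch could then be copied to its own working folder and processed with Replace in Files separately, so a failure only concerns a handful of files.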
Regarding your regex :
It contains two parts :
-
The regex
<code\s*style="background-color:\s*transparent;"
which represents the true searched string
-
The regex
((?:<p[^>]*?color: black.*?>[\S\s\n]*?<\/p>\s*<span[^>]*>)|(?:<span[^>]*?color: black.*?>[\S\s\n]*?<code))
which is followed by the backtracking verbs (*SKIP)(*F). Note that this is the first alternative of the regex and, as soon as this part is matched, the text it matched is skipped and the search process fails at this location. Thus, whatever this first alternative matches is ignored, and the regex engine resumes after it, where it can try the second alternative ( our searched string )
-
Of course, in the case that this first alternative cannot be found at all, the regex engine simply tries the second alternative !
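To illustrate the idea outside Notepad++, here is a minimal Python sketch. Python's `re` module has no backtracking verbs, so the sketch swaps in a different but related technique : the unwanted context is matched first, and only the wanted string is captured in a group. The two patterns are simplified stand-ins ( my assumption, not the exact regexes of this thread ) and `find_wanted` is a hypothetical helper name :

```python
import re

# Simplified stand-ins (assumptions): an unwanted context that swallows a
# <code tag, and the <code tag that is actually searched for.
UNWANTED = r'<span[^>]*?color: black[^>]*>\s*<code[^>]*>'
WANTED = r'<code\s*style="background-color:\s*transparent;"'

# No (*SKIP)(*F) in Python's re: the unwanted context is consumed by the
# first alternative, and only the second alternative fills group 1.
PATTERN = re.compile(UNWANTED + r'|(' + WANTED + r')')

def find_wanted(text):
    """Return the start offsets of wanted matches outside unwanted contexts."""
    return [m.start(1) for m in PATTERN.finditer(text) if m.group(1) is not None]

sample = (
    '<span style="color: black;"><code style="background-color: transparent;">a</code></span>\n'
    '<span style="color: cyan;"><code style="background-color: transparent;">b</code></span>\n'
)

# Plain count of the searched string, then the count outside unwanted contexts
print(len(re.findall(WANTED, sample)), len(find_wanted(sample)))  # prints: 2 1
```

The first `<code` tag is swallowed by the unwanted alternative, so only the one on the second line is reported, which is the same effect as skipping and failing on the first alternative.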
@dr-ramaanand, given your example, below :
-
Mark the part of text that you are looking for, so the regex
(?s)<code\s*style="background-color:\s*transparent;"
: You should get 6 occurrences
-
Now, move to the Find dialog and search for the first alternative, without the part (*SKIP)(*F). So the regex :
(?s)((?:<p[^>]*?color: black.*?>[\S\s\n]*?<\/p>\s*<span[^>]*>)|(?:<span[^>]*?color: black.*?>[\S\s\n]*?<code))
-
As you can see, the first match spans two lines and the first red mark is embedded in it, so it will not be considered. Thus, the first red mark to be matched is the one near the end of the second line. The one on the third, short line is matched as well.
-
Now, after clicking on the Find Next button, note that the second occurrence of the first alternative OVERLAPS the fourth red mark, so the searched string cannot be found at this point and the next one is only found at the very end of the fifth line. Then, the sixth red mark is also matched, on the sixth non-empty line !
You can verify, using the whole regex, that, indeed, only four zones, out of the 6, are matched !
<html>
<p style="font-family: "verdana"; font-size: 18px; color: black; line-height: 18px; text-align: justify; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: cyan;"><span style="font-size: 13.5pt; font-family: "Verdana","sans-serif";"><code style="background-color: transparent;"><b>some text here</b></code></span></p>
<span><span style="font-size: 13.5pt; font-family: "Verdana","sans-serif"; background-color: cyan;"><code style="background-color: transparent;"><b>some text here</b></code></span>
<code style="background-color: transparent;">
<p style="font-family: "verdana"; font-size: 18px; color: cyan; line-height: 18px; text-align: justify; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: cyan;"><span style="color: black; font-size: 13.5pt; font-family: "Verdana","sans-serif";"><code style="background-color: transparent;"><b>some text here</b></code></span></p>
<span><span style="font-size: 13.5pt; font-family: "Verdana","sans-serif"; background-color: cyan;"><code style="background-color: transparent;"><b>some text here</b></code></span>
<p style="font-family: "verdana"; font-size: 18px; color: cyan; line-height: 18px; text-align: justify; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: navy;"><span style="font-size: 13.5pt; font-family: "Verdana","sans-serif";"><code style="background-color: transparent;"><b>some text here</b></code></span></p>
</html>
However, I could not understand exactly in which situations you need to do a replacement nor, of course, what the replacement itself should contain !
So, read this post first ! After a 2-hour lunch break, try to be online : I’ll post my e-mail address. Note it down as soon as possible, as it will only be displayed for a short time ! I’ll probably delete that short message afterwards !
Best Regards,
guy038
-
-
@guy038 My Regular expression helped skip finding
<code\s*style="background-color:\s*transparent;"[^>]*>
if it was preceded by <p.......color: black...>(any white spaces, including if they were on the next line)<span.......>, or by <span.......color: black...>, and found the other occurrences of
<code\s*style="background-color:\s*transparent;"[^>]*>
in the current, open file, but not in all the files of a folder (when I selected “Find in files” and clicked on “Replace in files”), probably because it is getting into a continuous loop. How can I keep it from getting into a loop? I get an “Invalid Regular expression” error if I try to find or replace this in multiple files of a folder; then, on clicking the […] icon next to where it shows the “Invalid Regular expression” error, I get this message: “The complexity of matching the regular expression exceeded predefined bounds. Try refactoring the regular expression to make each choice made by the state machine unambiguous. This exception is thrown to prevent eternal matches that take an indefinite period time to locate.” I will manage any replacements on my own. Thank you very much!
-
This post is deleted! -
-
@guy038 No, I believe that you posted it after I slept and deleted it before I woke up. In the meantime, I worked on that RegEx and arrived at this Regular expression as a solution:
(?s)(?:<p[^>]*?color:\s*black[^>]*>\s*<span[^>]*>\s*<code|<span[^>]*?color:\s*black[^>]*>\s*<code)(*SKIP)(*F)|<code\s*style="background-color:\s*transparent;">
- that can be shortened to
(?s)(?:<p[^>]*?(color:\s*black[^>]*>\s*)<span[^>]*>\s*<code|<span[^>]*?$1<code)(*SKIP)(*F)|<code\s*style="background-color:\s*transparent;">
-
Hello, @dr-ramaanand,
You said in your last post :
I worked on that RegEx and arrived at this Regular expression as a solution:
(?s)(?:<p[^>]*?color:\s*black[^>]*>\s*<span[^>]*>\s*<code|<span[^>]*?color:\s*black[^>]*>\s*<code)(*SKIP)(*F)|<code\s*style="background-color:\s*transparent;">
- that can be shortened to
(?s)(?:<p[^>]*?(color:\s*black[^>]*>\s*)<span[^>]*>\s*<code|<span[^>]*?$1<code)(*SKIP)(*F)|<code\s*style="background-color:\s*transparent;">
No, you’re wrong ! If I apply the first regex against your tiny example text :
(?s)(?:<p[^>]*?color:\s*black[^>]*>\s*<span[^>]*>\s*<code|<span[^>]*?color:\s*black[^>]*>\s*<code)(*SKIP)(*F)|<code\s*style="background-color:\s*transparent;">
It returns the message :
Mark: 4 matches in entire file
But, if I run the second regex :
(?s)(?:<p[^>]*?(color:\s*black[^>]*>\s*)<span[^>]*>\s*<code|<span[^>]*?$1<code)(*SKIP)(*F)|<code\s*style="background-color:\s*transparent;">
It returns the message :
Mark: 5 matches in entire file
Thus, they are not identical !
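A likely reason for the difference : inside the search part, `$1` is not a reference to group 1. The `$` is the end-of-line assertion and the `1` is a literal digit, so that alternative can hardly ever match. Even a true back-reference, `\1`, only repeats the text that group 1 captured and does not re-apply its sub-pattern; moreover, group 1 lives in the other alternative. The small sketch below shows these three points with Python's `re` module, which is an assumption on my side : Notepad++ uses the Boost regex engine, which, as far as I know, behaves similarly here.

```python
import re

text = "ab-ab cd-ab"

# \1 repeats the exact text captured by group 1, so only "ab-ab" matches
with_backref = re.findall(r'(ab|cd)-\1', text)

# $1 is an end-of-string assertion followed by a literal "1": never matches here
with_dollar = re.findall(r'(ab|cd)-$1', text)

# Group 1 belongs to the first alternative, so it is unset when the second
# alternative is tried and \1 fails: "yab" is not matched
across_alternatives = re.findall(r'x(ab)|y\1', "xab yab")

print(with_backref, with_dollar, across_alternatives)  # prints: ['ab'] [] ['ab']
```

So, in the "shortened" regex, the `<span ... color: black ...>` case is no longer skipped, which would explain the extra match.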
Again, here is my E-mail address :
See you later, by e-mail !
Best Regards,
guy038
-
@guy038 Okay, got it, thank you very much. Merci beaucoup
-
@guy038 said:
it could be that one or several files, needing replacement, are too important in size or complicated and lead to the message : The complexity of matching the regular expression exceeded predefined bounds. Try refactoring the regular expression to make each choice made by the state machine unambiguous.
This made me wonder what happens when this problem occurs during Replace in Files.
Say this occurs in a file that isn’t opened by the user in Notepad++; does the file then get opened, leaving the user looking at it (to know in which file the problem occurred)? If not, what happens to prior replacements already made in that file (that didn’t hit a complexity problem); are they then saved in the disk file? Does the search/replace then continue with the next file in the sequence, or is the whole replacement operation cancelled (probably)?
This whole thing doesn’t sound like a good situation…
I’d say it is probably advisable that every replacement operation of this nature should be preceded by the search operation (executed by the user), just to avoid the possibility of raising these issues – if the search yields a complexity problem, then (obviously!) don’t attempt the replacement.
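As a rough illustration of the "search first, replace later" habit, here is a small Python sketch, run outside Notepad++, that only counts matches per file and writes nothing. The function name `dry_run` and the `*.html` filter are assumptions of mine. Note that it uses Python's `re` engine, not the Boost engine of Notepad++ : it does not raise the complexity error, and it does not accept the (*SKIP)(*F) verbs, so it shows the workflow only.

```python
import re
from pathlib import Path

def dry_run(pattern, folder, glob="*.html"):
    """Count the matches of `pattern` in each file of `folder`, changing nothing."""
    regex = re.compile(pattern)
    report = {}
    for path in sorted(Path(folder).glob(glob)):
        text = path.read_text(encoding="utf-8")
        # finditer counts matches correctly even if the pattern has groups
        report[path.name] = sum(1 for _ in regex.finditer(text))
    return report
```

If the counts look right for every file, the replacement itself can then be attempted; if the search already fails on some file, that file is known before anything has been modified.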
-
This Regular expression helped me replace in files (multiple files) only what I wanted:
(?:<p(?!\w)[^>]*?color\s*:\s*(?:(black))[^>]*?>(?(1)(?:\s*<span(?!\w)[^>]*?>)?)|<span\b[^>]*?color\s*:\s*black[^>]*?>|<li\b[^>]*?style[^>]*?color\s*:\s*black[^>]*?>\s*<span\b[^>]*?>)(?s)\s*<code\b(?:".*?"|'.*?'|[^>]*?)+>(*SKIP)(*FAIL)|<code\b[^>]*?style[^>]*?background-color\s*:\s*transparent[^>]*?>
as asked right at the top, in my first question of this thread, that is, it helps find <code style="background-color: transparent;"> if it is not preceded by <p.......color: black...>(any white spaces, including a new line)<span.......> or <span.......color: black...>
-
Take the time to read my last e-mail to you, where I explained the differences between two simple regexes that each contain the (*SKIP)(*F) syntax !
BR
guy038