Deleting a group of characters in lines with same beginning and ending, but different in between (re-post)
-
Fellow Notepad++ Users,
Could you please help me with the following search-and-replace problem I am having?
I want to delete the [/b] in lines with the same beginning and ending, but different characters in between.
Here is the data I currently have (“before” data):
[m3][c #0A5D00]▸[i] You’d get up early[/i][/b][/c][/m3]
[m3][c #0A5D00]▸[i] We prefer cheese[/i][/b][/c][/m3]
[m3][c #0A5D55]▸[b][i] charity fund[/i][/b][/c][/m3]
[m3][c #0A5D00]▸[i] They never came[/i][/b][/c][/m3]
[m3][c #0A5D55]▸[b][i] board of charity[/i][/b][/c][/m3]
Here is how I would like that data to look (“after” data):
[m3][c #0A5D00]▸[i] You’d get up early[/i][/c][/m3]
[m3][c #0A5D00]▸[i] We prefer cheese[/i][/c][/m3]
[m3][c #0A5D55]▸[b][i] charity fund[/i][/b][/c][/m3]
[m3][c #0A5D00]▸[i] They never came[/i][/c][/m3]
[m3][c #0A5D55]▸[b][i] board of charity[/i][/b][/c][/m3]
To accomplish this, I have tried using the following Find/Replace expressions and settings:
• Find What =[m3][c #0A5D00]▸[i]*[/i][/b][/c][/m3]
• Replace With =[m3][c #0A5D00]▸[i]*[/i][/c][/m3]
• Search Mode = all three, one after another (REGULAR EXPRESSION, then NORMAL, then EXTENDED)
• Dot Matches Newline = NOT CHECKED
I tried the Find What function first, but it didn’t work, and I’m not sure why.
Could you please help me understand what went wrong and help me find the solution?
Thank you.
-
Hi. You were sort of getting there, but you’re missing a few techniques:
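- \Q…\E to force special characters (like square brackets) to be treated as literal
- * is not the simple wild card you may be used to: in a regex it repeats whatever precedes it, so “any text” is written .* (or .*? to stop at the first opportunity)
- \K to throw away everything matched so far, so that only what follows it gets replaced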
This should do it:
\Q[m3][c #0A5D00]▸[i]\E.*?\K\Q[/b]\E
-
@Neil-Schipper And, since the match is only on what we want removed, we keep “Replace with” empty.
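For anyone who wants to try the same idea outside Notepad++, here is a minimal sketch in Python. It is an adaptation, not the regex above: Python's `re` module has neither `\Q…\E` nor `\K`, so `re.escape` stands in for the former and a capture group that is written back stands in for the latter. The helper name `strip_close_bold` is made up for this illustration. The sketch also shows why the original Find What expression found nothing in regex mode.

```python
import re

lines = [
    "[m3][c #0A5D00]▸[i] We prefer cheese[/i][/b][/c][/m3]",
    "[m3][c #0A5D55]▸[b][i] charity fund[/i][/b][/c][/m3]",
]

# The original attempt, read as a regex: "[m3]" is a character class
# (one character, either 'm' or '3'), not the literal text "[m3]".
naive = re.compile(r"[m3][c #0A5D00]▸[i]*[/i][/b][/c][/m3]")
naive_hits = [bool(naive.search(s)) for s in lines]

# re.escape plays the role of \Q...\E (assumption: Python stand-in).
prefix = re.escape("[m3][c #0A5D00]▸[i]")
# The capture group plays the role of \K: it is matched, then put back.
pattern = re.compile("(" + prefix + ".*?)" + re.escape("[/b]"))


def strip_close_bold(line: str) -> str:
    """Remove the first [/b] that follows the fixed #0A5D00 prefix."""
    return pattern.sub(r"\1", line, count=1)


fixed = [strip_close_bold(s) for s in lines]
for s in fixed:
    print(s)
```

The lazy `.*?` stops at the first `[/b]` after the prefix, which is the same constraint the Notepad++ expression relies on.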
-
Hello, @polar-bear, @neil-schipper and All,
This simple regex should work, too :
- SEARCH (?-si)^(.+#0A5D00.+)\[/b\]
- REPLACE $1
- Tick, preferably, the Wrap around option
- Select the Regular expression search mode
- Click either once on the Replace All button, or several times on the Replace button
Notes :
- The modifiers (?-is) ensure that the search behaves as if the “. matches newline” option is not checked and the “Match case” option is checked
- Then, after the beginning of line ( ^ ), the part .+#0A5D00.+ matches all standard characters… up to and including the string #0A5D00, and then another non-null range of standard characters till…
- …the literal string [/b] ( Note that the square brackets [ and ] have a special meaning in regexes. So, they must be escaped, as \[ and \], in order to search for these characters literally )
- As the part .+#0A5D00.+ is embedded in parentheses, it is stored as group 1 and can be re-used, in the replacement regex, with the $1 or \1 syntax. So, only the part \[/b\] is not rewritten !
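The capture-and-rewrite idea in these notes can be checked quickly in Python. This is only a sketch under a few assumptions: Python's `re` does not accept a global `(?-si)` prefix, so it is dropped (case-sensitive matching and a dot that stops at line ends are already the defaults), and `re.MULTILINE` is added so that `^` matches at the start of every line.

```python
import re

before = "\n".join([
    "[m3][c #0A5D00]▸[i] You’d get up early[/i][/b][/c][/m3]",
    "[m3][c #0A5D00]▸[i] We prefer cheese[/i][/b][/c][/m3]",
    "[m3][c #0A5D55]▸[b][i] charity fund[/i][/b][/c][/m3]",
    "[m3][c #0A5D00]▸[i] They never came[/i][/b][/c][/m3]",
    "[m3][c #0A5D55]▸[b][i] board of charity[/i][/b][/c][/m3]",
])

# Group 1 holds everything up to, but not including, the [/b] to drop.
pattern = re.compile(r"^(.+#0A5D00.+)\[/b\]", re.MULTILINE)

# Writing back only group 1 removes the [/b] that followed it.
after = pattern.sub(r"\1", before)
print(after)
```

Lines carrying #0A5D55 never match, because the pattern requires the literal #0A5D00 somewhere before the [/b].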
Best Regards,
guy038
-
-
aaaaaaaaand I forgot to say it’s a regex
aaaaaaaaand I forgot to say: you may click Find to satisfy yourself it’s matching the text to remove; when you want to apply the changes to the whole file, use Replace All. This is a class of regex for which a single Replace operation does not work, for reasons I don’t understand.
-
@polar-bear Note that @guy038’s solution, which does not rely on \K, allows you to do single Replace operations in case you had a need to interactively check each instance before replacing.
It’s also very tolerant of unspecified text both before and after #0A5D00, while mine uses more rigid constraints.
-
Hi, @polar-bear, @neil-schipper and All,
Neil, you’re right about the general behavior of my regex. Yours is more robust, of course.
However, the example provided by the OP is really minimalist : we don’t know about possible other #0A.... strings, different from #0A5D00 and #0A5D55. We don’t know about the context of these lines, and so on… So, I just rely on the changes of the #0A.... part !
Now, if @polar-bear wants to search, for example, for the strings #0A5D00, #0A5C00, #0A5E50 and #0A5FFF, simultaneously, my regex would become :
SEARCH (?-si)^(.+#0A(?:5D00|5C00|5E50|5FFF).+)\[/b\]
REPLACE $1
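As a rough check of the alternation, here is the same pattern tried in Python. The assumptions are the same as for any Python port of this regex: the `(?-si)` prefix is dropped because Python's `re` rejects it in that form, and `re.MULTILINE` makes `^` work per line. The sample lines and the `colors` list name are invented for the illustration.

```python
import re

# Colour suffixes to treat alike (the four mentioned in this post).
colors = ["5D00", "5C00", "5E50", "5FFF"]

# Builds ^(.+#0A(?:5D00|5C00|5E50|5FFF).+)\[/b\]
pattern = re.compile(
    r"^(.+#0A(?:" + "|".join(colors) + r").+)\[/b\]",
    re.MULTILINE,
)

# Made-up sample lines, one per case.
sample = "\n".join([
    "[m3][c #0A5C00]▸[i] sample one[/i][/b][/c][/m3]",
    "[m3][c #0A5FFF]▸[i] sample two[/i][/b][/c][/m3]",
    "[m3][c #0A5D55]▸[b][i] charity fund[/i][/b][/c][/m3]",
])

cleaned = pattern.sub(r"\1", sample)
print(cleaned)
```

The non-capturing group `(?:…)` keeps the alternation from shifting the group numbering, so the replacement still refers to group 1.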
BR
guy038
P.S.:
Oh, I just saw, in the OP’s example, that the suppression of [/b] must occur only in lines which do not have the string [b] before the random text. So, maybe his challenge could be expressed, in plain language, as :
“How to delete the [/b] string in any line which does not contain a [b] string before” ???
Wait and see !
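If the rule really is “delete [/b] where no [b] comes before it on the line”, one possible way to express it is sketched below in Python. This pattern does not appear anywhere in the thread; it is an assumption about how that reformulation could be written, using a negative lookahead that refuses to step over an opening [b]. The sample lines are partly invented, and the colour code plays no role here.

```python
import re

# From the line start, consume characters one at a time, but never
# across an opening "[b]"; stop at the first "[/b]" and drop it.
orphan_close = re.compile(r"^((?:(?!\[b\]).)*?)\[/b\]", re.MULTILINE)

sample = "\n".join([
    "[m3][c #0A5D00]▸[i] We prefer cheese[/i][/b][/c][/m3]",
    "[m3][c #0A1234]▸[i] no opening tag[/i][/b][/c][/m3]",
    "[m3][c #0A5D55]▸[b][i] charity fund[/i][/b][/c][/m3]",
])

balanced = orphan_close.sub(r"\1", sample)
print(balanced)
```

A line that does contain [b] before its [/b] cannot match, since the lookahead blocks the scan at the [b] and the pattern is anchored to the line start.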
-
@guy038 said in Deleting a group of characters in lines with same beginning and ending, but different in between (re-post):
“How to delete the [/b] string in any line which does not contain a [b] string before”
I am just realizing, late in the game, his spec could have been very concisely stated, “delete [/b] from all uncharitable lines”! (You may not give up until you find the pun.)
@polar-bear Note that the two solutions presented to you differ in another interesting way:
#1 satisfies a “before-text” precondition, then matches only the “to-remove-text” which is replaced with nil; it “takes away”
#2 both “before-text” and “to-remove-text” are matched but the “before-text” is stored in its own numbered basket (group 1), and that is what replaces the total match; it “replaces a whole with a part”.
This gives you an idea of the power and flexibility of this rather unpretty programming language.
-
I’ve been able to get the job done using the suggestion by guy038, which seems the simplest to me.
(
SEARCH (?-si)^(.+#0A5D00.+)\[/b\]
REPLACE $1
… )
Anyhow, thank you both for taking the trouble to help.
With best wishes