search and replace / regEx
-
Hi community.
I have a corrupted iCalendar file and want to use search and replace to delete the junk between two tags:
END:VCALENDAR
3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDAR
Everything between END:VCALENDAR and BEGIN:VCALENDAR should be deleted.
What search pattern should I use?
Thanks, and best regards,
Frank -
Using regular expression search mode in the Find dialog,
Find what is
(?s)(?<=BEGIN:VCALENDAR).*?(?=\REND:VCALENDAR)
and Replace with stays empty; then press Replace All.
Note that, because of a bug, you cannot use Replace to step through the file
and see what gets replaced.
Cheers
Claudia -
Did you get BEGIN and END mixed up? I found that your regexp didn’t work on the sample data, but this one seems to (the biggest change is to swap BEGIN and END):
(?s-i)(?<=END:VCALENDAR).*?(?=BEGIN:VCALENDAR)
Perhaps the OP would like to be able to cycle through the matches individually. The above solutions don’t allow that, but this one does:
Find what zone:
(?s-i)END:VCALENDAR.*?BEGIN:VCALENDAR
Replace with zone:
END:VCALENDARBEGIN:VCALENDAR
Search mode: Regular expression -
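For anyone who wants to check the two approaches outside Notepad++, here is a minimal Python sketch. It is only an approximation: the sample text and variable names are made up for illustration, re.DOTALL plays the role of (?s), and Python matching is case-sensitive by default, like -i. It suggests that the lookaround version with an empty replacement and the plain version that writes the tags back should give the same result.

```python
import re

# Hypothetical sample: two calendar objects with junk bytes between them.
text = (
    "BEGIN:VCALENDAR\nVERSION:2.0\nEND:VCALENDAR\n"
    "3F68636A.ics junk bytes BEGIN:VCALENDAR\nVERSION:2.0\nEND:VCALENDAR\n"
)

# Variant 1: the lookarounds match only the junk, so the replacement is empty.
lookaround = re.compile(r"(?<=END:VCALENDAR).*?(?=BEGIN:VCALENDAR)", re.DOTALL)

# Variant 2: the tags are part of the match and are written back.
plain = re.compile(r"END:VCALENDAR.*?BEGIN:VCALENDAR", re.DOTALL)

cleaned_a = lookaround.sub("", text)
cleaned_b = plain.sub("END:VCALENDARBEGIN:VCALENDAR", text)

print(cleaned_a == cleaned_b)
print(cleaned_a)
```

In variant 2 the match covers both tags, which is why an editor can highlight and replace one match at a time.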
Hi Scott,
thanks for the heads-up. Yes, I did: I was just thinking BEGIN comes before END, but I guess that is where the OP's issue comes from.
But the regex itself, with the END/BEGIN terms switched, should work, shouldn’t it?
(?s)(?<=END:VCALENDAR).*?(?=\RBEGIN:VCALENDAR)
does it for me.
Cheers
Claudia -
Starting with your original regexp, I added the -i because the OP’s spec was definitely uppercase, and I deleted your \R because there didn’t seem to be a requirement that a line-ending occur before the BEGIN; indeed, in the sample data there doesn’t appear to be one. Thus, copying the OP’s sample data to a N++ tab and trying your original regexp on it yielded no matches for me…hmmm… -
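The effect of the \R requirement can be reproduced with a small Python sketch. Python's re module has no \R, so the EOL fragment below is an assumed stand-in that covers only CRLF, LF and CR; the two sample strings and the helper name are also made up for illustration.

```python
import re

# Stand-in for \R, which Python's re module lacks: CRLF, LF, or CR only.
EOL = r"(?:\r\n|\n|\r)"

# Lookahead that insists on a line break directly before BEGIN.
with_eol = re.compile(
    r"(?<=END:VCALENDAR).*?(?=" + EOL + r"BEGIN:VCALENDAR)", re.DOTALL
)
# Lookahead without that requirement.
without_eol = re.compile(r"(?<=END:VCALENDAR).*?(?=BEGIN:VCALENDAR)", re.DOTALL)

# Hypothetical data shaped like the OP's sample: no line break before BEGIN.
fused = "END:VCALENDAR\njunk bytes BEGIN:VCALENDAR"
# Same data, but with BEGIN on its own line.
separated = "END:VCALENDAR\njunk bytes\nBEGIN:VCALENDAR"

def count_matches(pattern, text):
    """Return how many times the pattern matches in the text."""
    return len(pattern.findall(text))

for name, data in (("fused", fused), ("separated", separated)):
    print(name, count_matches(with_eol, data), count_matches(without_eol, data))
```

The version that demands a line break finds nothing in the fused data, while the version without it matches both layouts.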
You are right: when copying the sample data there seems to be no EOL, but from what is displayed I assumed there was one.
But even without \R it seems to work for me.
This cannot be a Linux/Windows issue, can it?
Note: I shortened the line to fit on the screen, but it works with the original data as well.
Cheers
Claudia -
Ah, now I see, I guess: you used the original data without the EOL, but with my regex, which included the EOL.
Yes, that makes sense; it does not work.
Cheers
Claudia -
Hello, @frank-kirschner, @claudia-frank, @scott-sumner and All,
I thought about a third regex which, in addition, checks whether:
- The END:VCALENDAR string is preceded by a line-break
- The BEGIN:VCALENDAR string is followed by a line-break
and, in the replacement, this regex S/R adds a line-break, if not initially present, in the O.P.'s text
So, assuming the four possible cases below:
blah blah bla bla blaEND:VCALENDAR
3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDARblah blah blah bla bla...

blah blah bla bla blaEND:VCALENDAR
3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDAR
blah blah blah... bla bla...

blah blah bla bla bla
END:VCALENDAR
3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDARblah blah blah bla bla...

blah blah bla bla bla
END:VCALENDAR
3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDAR
blah blah blah bla bla...

then the regex S/R :
SEARCH
(?s-i)((\R)?END:VCALENDAR).*?(BEGIN:VCALENDAR(\R)?)
REPLACE
(?2:\r\n)\1\r\n\3(?4:\r\n)
would give the following text (four identical blocks of text):
blah blah bla bla bla
END:VCALENDAR
BEGIN:VCALENDAR
blah blah blah bla bla...

blah blah bla bla bla
END:VCALENDAR
BEGIN:VCALENDAR
blah blah blah... bla bla...

blah blah bla bla bla
END:VCALENDAR
BEGIN:VCALENDAR
blah blah blah bla bla...

blah blah bla bla bla
END:VCALENDAR
BEGIN:VCALENDAR
blah blah blah bla bla...

Et voilà !
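The same S/R can be approximated in Python to check the four cases. This is only a sketch under stated assumptions: Python's re module supports neither \R nor the conditional replacement syntax (?2:...), so \R is replaced by an assumed stand-in (CRLF, LF or CR only) and the conditional parts are emulated with a replacement function. The head, tail and JUNK strings, as well as the function names, are hypothetical.

```python
import re

# Stand-in for \R (Python's re module has no \R): CRLF, LF, or CR only.
EOL = r"(\r\n|\n|\r)"

# Same group layout as the SEARCH regex:
# 1 = END part, 2 = optional EOL before it,
# 3 = BEGIN part, 4 = optional EOL after it.
PATTERN = re.compile(
    r"(" + EOL + r"?END:VCALENDAR).*?(BEGIN:VCALENDAR" + EOL + r"?)",
    re.DOTALL,
)

def replacement(match):
    # Emulates (?2:\r\n)\1\r\n\3(?4:\r\n):
    # a line break is added only where none was captured.
    before = "" if match.group(2) else "\r\n"
    after = "" if match.group(4) else "\r\n"
    return before + match.group(1) + "\r\n" + match.group(3) + after

def clean(text):
    """Remove the junk between END:VCALENDAR and BEGIN:VCALENDAR."""
    return PATTERN.sub(replacement, text)

JUNK = "\r\n3F68.ics junk bytes "  # hypothetical stand-in for the corrupted data

cases = [
    "headEND:VCALENDAR" + JUNK + "BEGIN:VCALENDARtail",
    "headEND:VCALENDAR" + JUNK + "BEGIN:VCALENDAR\r\ntail",
    "head\r\nEND:VCALENDAR" + JUNK + "BEGIN:VCALENDARtail",
    "head\r\nEND:VCALENDAR" + JUNK + "BEGIN:VCALENDAR\r\ntail",
]

results = [clean(case) for case in cases]
print(len(set(results)))  # number of distinct outputs across the four cases
print(repr(results[0]))
```

If the emulation is faithful, all four cases should collapse to a single output, with each tag on its own line.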
Notes:
- You may either click several times on the Replace button or click once on the Replace All button
- In search:
  - First, the (?s-i) modifiers force:
    - The search to be case-sensitive (NOT case-insensitive!)
    - The special dot character . to match any single character, even an End of Line character
  - Then group 1 contains the string END:VCALENDAR, possibly preceded by a line-break
  - The part (\R)? (identical to the form (\R){0,1}) represents an optional line-break (group 2)
  - Now, the .*? part (identical to .{0,}?) stands for the smallest range of any characters between the two strings END:VCALENDAR and BEGIN:VCALENDAR
  - Group 3 contains the string BEGIN:VCALENDAR, possibly followed by a line-break
  - Finally, the part (\R)? (identical to the form (\R){0,1}) represents an optional line-break (group 4)
- In replacement:
  - The conditional replacement (?2:\r\n) writes a line-break before the string END:VCALENDAR only if group 2 (\R) does not exist
  - The block \1\r\n\3 writes groups 1 and 3, that is, the strings END:VCALENDAR and BEGIN:VCALENDAR together with any line-break they captured, separated by a line-break (\r\n)
  - The conditional replacement (?4:\r\n) writes a line-break after the string BEGIN:VCALENDAR only if group 4 (\R) does not exist
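The notes above can be checked with a short Python sketch. It assumes the same stand-in for \R as before (CRLF, LF or CR only, since Python's re module has no \R), and the sample string is made up. It prints what each group captures and compares the short quantifier forms with their long equivalents.

```python
import re

# Stand-in for \R (not available in Python's re module).
EOL = r"(\r\n|\n|\r)"

# Short quantifier forms: ? and .*?
short_form = re.compile(
    r"(" + EOL + r"?END:VCALENDAR).*?(BEGIN:VCALENDAR" + EOL + r"?)",
    re.DOTALL,
)
# Equivalent long forms: {0,1} and .{0,}?
long_form = re.compile(
    r"(" + EOL + r"{0,1}END:VCALENDAR).{0,}?(BEGIN:VCALENDAR" + EOL + r"{0,1})",
    re.DOTALL,
)

# Hypothetical sample: line break before END, none after BEGIN.
sample = "head\r\nEND:VCALENDAR junk BEGIN:VCALENDARtail"

m_short = short_form.search(sample)
m_long = long_form.search(sample)

# Groups 1 to 4; a group that did not take part in the match is None.
print(m_short.groups())
print(m_short.groups() == m_long.groups())

# Python is case-sensitive by default, like the -i modifier.
print(short_form.search("end:vcalendar junk begin:vcalendar"))
```

Group 1 includes the line break captured by group 2, which is why the replacement only needs to add one when group 2 is empty.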
Cheers,
guy038
-