    search and replace / regEx

    • Frank Kirschner

      Hi community.
      I have a corrupted iCalendar file and want to use search and replace to delete the junk between two tags:

      END:VCALENDAR
      3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDAR

      Everything between END:VCALENDAR and BEGIN:VCALENDAR should be deleted.
      What search pattern do I have to use?

      Thanks, and best regards,
      Frank

      • Claudia Frank @Frank Kirschner

        @Frank-Kirschner

        Using the Regular expression search mode in the Find dialog,
        the Find what field is

        (?s)(?<=BEGIN:VCALENDAR).*?(?=\REND:VCALENDAR)
        

        and the Replace with field stays empty; then press Replace All.
        Note that, because of a bug, you cannot use the Replace button to step through the file
        and see what gets replaced.

        Cheers
        Claudia

        • Scott Sumner @Claudia Frank

          @Claudia-Frank

          Did you get BEGIN and END mixed up? I found that your regexp didn’t work on the sample data, but this one seems to (the biggest change is to swap BEGIN and END):

          (?s-i)(?<=END:VCALENDAR).*?(?=BEGIN:VCALENDAR)
          

          Perhaps the OP would like to be able to individually cycle thru the matches. The above solutions don’t allow that but this one does:

          Find what zone: (?s-i)END:VCALENDAR.*?BEGIN:VCALENDAR
          Replace with zone: END:VCALENDARBEGIN:VCALENDAR
          Search mode: Regular expression
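
For anyone who would rather script the cleanup than run it in the editor, here is a minimal Python sketch of the two variants above. It is only an illustration under a few assumptions: the sample string is made up, Python's `re` module is not the regex engine Notepad++ uses, and `re.DOTALL` stands in for the `(?s)` modifier (matching is case-sensitive by default, so `-i` needs no equivalent).

```python
import re

# Made-up sample: junk sits between one calendar's END tag and the next BEGIN tag.
sample = (
    "BEGIN:VCALENDAR\nVERSION:2.0\nEND:VCALENDAR\n"
    "3F68.ics junk bytes BEGIN:VCALENDAR\nVERSION:2.0\nEND:VCALENDAR\n"
)

# Lookaround form: only the junk itself is matched, so it is replaced with nothing.
lookaround = re.compile(r"(?<=END:VCALENDAR).*?(?=BEGIN:VCALENDAR)", re.DOTALL)

# Consuming form: both tags are part of the match, so they are written back.
consuming = re.compile(r"END:VCALENDAR.*?BEGIN:VCALENDAR", re.DOTALL)

cleaned_a = lookaround.sub("", sample)
cleaned_b = consuming.sub("END:VCALENDARBEGIN:VCALENDAR", sample)

# The consuming form also makes it easy to inspect each match before replacing it.
matches = [m.group(0) for m in consuming.finditer(sample)]
```

Both substitutions should give the same cleaned text; the consuming form is the one that corresponds to stepping through the matches one at a time.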

          • Claudia Frank @Scott Sumner

            @Scott-Sumner

            Hi Scott,
            thanks for the heads-up. Yes, I did: I was just thinking that BEGIN comes before END, but I guess that is where the OP's issue comes from.

            But the regex itself, with the END/BEGIN terms switched, should work, shouldn’t it?

            (?s)(?<=END:VCALENDAR).*?(?=\RBEGIN:VCALENDAR)
            

            does it for me.

            Cheers
            Claudia

            • Scott Sumner @Claudia Frank

              @Claudia-Frank

              Starting with your original regexp, I added the -i because the OP’s spec was definitely uppercase and I deleted your \R because there didn’t seem to be a requirement that a line-ending occurred before the BEGIN, and indeed in the sample data there doesn’t appear to be one? Thus, copying the OP’s sample data to a N++ tab and trying your original regexp on it yielded no matches for me…hmmm…
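
The effect of the extra line-ending requirement can be reproduced outside the editor. The sketch below is a rough Python illustration, not the Notepad++ engine: Python's `re` module has no `\R`, so the hypothetical `LINE_BREAK` alternation stands in for it (it covers only `\r\n`, `\n` and `\r`), and both sample strings are made up.

```python
import re

# Rough stand-in for \R, which Python's re module does not support.
LINE_BREAK = r"(?:\r\n|\n|\r)"

# As in the posted sample, the junk runs straight into BEGIN:VCALENDAR.
no_break_sample = "END:VCALENDAR\n3F68.ics junk bytes BEGIN:VCALENDAR\n"
# Variant where the junk is followed by a line break before BEGIN:VCALENDAR.
break_sample = "END:VCALENDAR\n3F68.ics junk bytes\nBEGIN:VCALENDAR\n"

with_break = re.compile(
    r"(?<=END:VCALENDAR).*?(?=" + LINE_BREAK + r"BEGIN:VCALENDAR)", re.DOTALL
)
without_break = re.compile(r"(?<=END:VCALENDAR).*?(?=BEGIN:VCALENDAR)", re.DOTALL)

# Requiring the line break fails on the first sample and succeeds on the second.
strict_on_no_break = with_break.search(no_break_sample) is not None
strict_on_break = with_break.search(break_sample) is not None
# Dropping the requirement matches both samples.
loose_on_no_break = without_break.search(no_break_sample) is not None
loose_on_break = without_break.search(break_sample) is not None
```

Under these assumptions, the pattern that demands a line break finds nothing when the junk abuts BEGIN:VCALENDAR directly, which is consistent with the "no matches" result described above.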

              • Claudia Frank @Scott Sumner

                @Scott-Sumner

                You are right: when copying the sample data there seems to be no EOL, but from what is displayed I assumed there was one.
                Even without \R, though, it seems to work for me.

                This cannot be a Linux/Windows issue, can it?

                Note: I shortened the line to fit on the screen, but it works with the original data as well.

                Cheers
                Claudia

                • Claudia Frank

                  Ah, now I see, I guess: you used the original data, which has no EOL, with my regex, which requires one.
                  Yes, that makes sense; it cannot match.

                  Cheers
                  Claudia

                  • guy038

                    Hello, @frank-kirschner, @claudia-frank, @scott-sumner and All,

                    I thought about a third regex which, in addition, checks whether :

                    • The END:VCALENDAR string is preceded by a line-break

                    • The BEGIN:VCALENDAR string is followed by a line-break

                    and which, in the replacement, adds a line-break wherever one is not already present in the O.P.'s text

                    So, assuming the four possible cases below :

                    blah blah
                    bla bla blaEND:VCALENDAR
                    3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDARblah blah blah
                    bla bla...
                    
                    
                    blah blah
                    bla bla blaEND:VCALENDAR
                    3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDAR
                    blah blah blah...
                    bla bla...
                    
                    blah blah
                    bla bla bla
                    END:VCALENDAR
                    3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDARblah blah blah
                    bla bla...
                    
                    
                    blah blah
                    bla bla bla
                    END:VCALENDAR
                    3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDAR
                    blah blah blah
                    bla bla...
                    

                    then the regex S/R :

                    SEARCH (?s-i)((\R)?END:VCALENDAR).*?(BEGIN:VCALENDAR(\R)?)

                    REPLACE (?2:\r\n)\1\r\n\3(?4:\r\n)

                    would give the following text ( four identical blocks of text ) :

                    blah blah
                    bla bla bla
                    END:VCALENDAR
                    BEGIN:VCALENDAR
                    blah blah blah
                    bla bla...
                    
                    
                    blah blah
                    bla bla bla
                    END:VCALENDAR
                    BEGIN:VCALENDAR
                    blah blah blah...
                    bla bla...
                    
                    blah blah
                    bla bla bla
                    END:VCALENDAR
                    BEGIN:VCALENDAR
                    blah blah blah
                    bla bla...
                    
                    
                    blah blah
                    bla bla bla
                    END:VCALENDAR
                    BEGIN:VCALENDAR
                    blah blah blah
                    bla bla...
                    

                    Et voilà !


                    Notes :

                    • You may either click the Replace button several times, or click the Replace All button once

                    • In search :

                      • First, the (?s-i) modifiers force :

                        • The search to be case-sensitive

                        • The special dot character . to match any single character, including End of Line characters

                      • Then group 1 contains the string END:VCALENDAR, possibly preceded by a line-break

                      • The part (\R)? ( identical to the form (\R){0,1} ) represents that optional line-break ( group 2, nested inside group 1 )

                      • Now, the .*? part ( identical to .{0,}? ) stands for the shortest possible range of any characters between the two strings END:VCALENDAR and BEGIN:VCALENDAR

                      • Group 3 contains the string BEGIN:VCALENDAR, possibly followed by a line-break

                      • Finally, the part (\R)? ( identical to the form (\R){0,1} ) represents that optional line-break ( group 4, nested inside group 3 )

                    • In replacement :

                      • The conditional replacement (?2:\r\n) writes a line-break only if group 2 ( \R ) did not match, i.e. there was no line-break before the string END:VCALENDAR

                      • The block \1\r\n\3 rewrites groups 1 and 3, i.e. the strings END:VCALENDAR and BEGIN:VCALENDAR together with any line-break they captured, separated by a line-break ( \r\n )

                      • The conditional replacement (?4:\r\n) writes a line-break only if group 4 ( \R ) did not match, i.e. there was no line-break after the string BEGIN:VCALENDAR
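
The same search and conditional replacement can be approximated in a script. What follows is a minimal Python sketch under stated assumptions, not a port of the Notepad++ behaviour: Python's `re` module supports neither `\R` nor the `(?2:...)` replacement syntax, so an explicit alternation and a replacement function are used instead, and the names `PATTERN`, `rebuild` and `clean` are hypothetical.

```python
import re

# Python's re module has no \R, so an explicit alternation stands in for it.
PATTERN = re.compile(
    r"((\r\n|\n|\r)?END:VCALENDAR).*?(BEGIN:VCALENDAR(\r\n|\n|\r)?)",
    re.DOTALL,
)

def rebuild(match):
    # Write a line break only where the optional group did not take part in
    # the match, mirroring the (?2:\r\n) and (?4:\r\n) conditionals.
    before = "" if match.group(2) else "\r\n"
    after = "" if match.group(4) else "\r\n"
    return before + match.group(1) + "\r\n" + match.group(3) + after

def clean(text):
    # Replace every END...BEGIN span with the two tags on separate lines.
    return PATTERN.sub(rebuild, text)

# Made-up miniature versions of the four cases discussed above.
junk = "\r\n3F68.ics junk bytes "
cases = [
    "bla blaEND:VCALENDAR" + junk + "BEGIN:VCALENDARblah\r\n",
    "bla blaEND:VCALENDAR" + junk + "BEGIN:VCALENDAR\r\nblah\r\n",
    "bla bla\r\nEND:VCALENDAR" + junk + "BEGIN:VCALENDARblah\r\n",
    "bla bla\r\nEND:VCALENDAR" + junk + "BEGIN:VCALENDAR\r\nblah\r\n",
]
results = [clean(case) for case in cases]
```

With these inputs, all four cases should come out as the same text, with each tag on its own line. Note that the replacement always writes `\r\n`, as in the regex S/R above, so a file using other line endings would need that literal adjusted.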

                    Cheers,

                    guy038

                    The Community of users of the Notepad++ text editor.