    search and replace / regEx

    • Frank Kirschner

      Hi community.
      I have a corrupted iCalendar file and want to use search and replace to delete the junk between two tags:

      END:VCALENDAR
      3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDAR

      Everything between END:VCALENDAR and BEGIN:VCALENDAR should be deleted.
      What search pattern do I need?

      Thanks, and best regards,
      Frank

      • Claudia Frank @Frank Kirschner

        @Frank-Kirschner

        Use the Regular expression search mode in the Find dialog.
        Find what is

        (?s)(?<=BEGIN:VCALENDAR).*?(?=\REND:VCALENDAR)
        

        Leave Replace with empty, then press Replace All.
        Note that you cannot use Replace to step through the file
        and see what gets replaced (a bug).

        Cheers
        Claudia

        • Scott Sumner @Claudia Frank

          @Claudia-Frank

          Did you get BEGIN and END mixed up? I found that your regexp didn’t work on the sample data, but this one seems to (the biggest change is to swap BEGIN and END):

          (?s-i)(?<=END:VCALENDAR).*?(?=BEGIN:VCALENDAR)
          

          Perhaps the OP would like to be able to individually cycle thru the matches. The above solutions don’t allow that but this one does:

          Find what zone: (?s-i)END:VCALENDAR.*?BEGIN:VCALENDAR
          Replace with zone: END:VCALENDARBEGIN:VCALENDAR
          Search mode: Regular expression
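
For anyone who wants to try the two variants outside Notepad++, here is a minimal sketch in Python. It is only an approximation: Python's re module is not the Boost engine that Notepad++ uses, re.DOTALL stands in for (?s), and the sample string is made up for illustration rather than taken from the OP's file.

```python
import re

# Hypothetical sample: two calendars with junk between END and BEGIN.
sample = (
    "BEGIN:VCALENDAR\nVERSION:2.0\nEND:VCALENDAR\n"
    "3F68.ics junk bytes BEGIN:VCALENDAR\nVERSION:2.0\nEND:VCALENDAR\n"
)

# Lookaround form: only the junk is matched, the two tags are left untouched.
# Python's re is case-sensitive by default, which corresponds to (?-i).
junk_only = re.compile(r"(?<=END:VCALENDAR).*?(?=BEGIN:VCALENDAR)", re.DOTALL)
cleaned_lookaround = junk_only.sub("", sample)

# Consuming form: the tags are part of the match and are written back.
tags_and_junk = re.compile(r"END:VCALENDAR.*?BEGIN:VCALENDAR", re.DOTALL)
cleaned_consuming = tags_and_junk.sub("END:VCALENDARBEGIN:VCALENDAR", sample)

print(cleaned_lookaround == cleaned_consuming)
```

Both forms should leave the same text; the difference is only in what the match itself covers, which is what matters when stepping through matches one at a time.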

          • Claudia Frank @Scott Sumner

            @Scott-Sumner

            Hi Scott,
            thanks for the heads-up. Yes, I did. I was just thinking BEGIN comes before END, but I guess that is where the OP's issue comes from.

            But the regex itself should work once the end/begin terms are switched, shouldn't it?

            (?s)(?<=END:VCALENDAR).*?(?=\RBEGIN:VCALENDAR)
            

            does it for me.

            Cheers
            Claudia

            • Scott Sumner @Claudia Frank

              @Claudia-Frank

              Starting with your original regexp, I added the -i because the OP’s spec was definitely uppercase and I deleted your \R because there didn’t seem to be a requirement that a line-ending occurred before the BEGIN, and indeed in the sample data there doesn’t appear to be one? Thus, copying the OP’s sample data to a N++ tab and trying your original regexp on it yielded no matches for me…hmmm…
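
The effect of the \R requirement can be reproduced with a small Python sketch. Two assumptions to note: Python's re has no \R, so the line break is spelled out as an explicit alternation (covering CR LF, LF and CR only), and both test strings are invented for illustration.

```python
import re

# Stand-in for \R (assumption: only CR LF, LF and CR are considered).
EOL = r"(?:\r\n|\n|\r)"

# Variant that demands a line break right before BEGIN:VCALENDAR.
with_eol = re.compile(
    r"(?<=END:VCALENDAR).*?(?=" + EOL + r"BEGIN:VCALENDAR)", re.DOTALL
)
# Variant without that requirement.
without_eol = re.compile(r"(?<=END:VCALENDAR).*?(?=BEGIN:VCALENDAR)", re.DOTALL)

# Hypothetical data: the junk runs straight into BEGIN, no line break before it.
glued = "END:VCALENDAR\njunk BEGIN:VCALENDAR"
# Hypothetical data: BEGIN starts on its own line.
separate = "END:VCALENDAR\njunk\nBEGIN:VCALENDAR"

match_glued_with_eol = with_eol.search(glued)        # no line break before BEGIN
match_glued_without_eol = without_eol.search(glued)
match_separate_with_eol = with_eol.search(separate)

print(match_glued_with_eol, match_glued_without_eol, match_separate_with_eol)
```

On the glued string only the variant without the line-break requirement finds anything, which fits the "no matches" result described above.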

              • Claudia Frank @Scott Sumner

                @Scott-Sumner

                You are right: when copying the sample data there seems to be no EOL, but from what is displayed I assumed there was one.
                But even without \R it seems to work for me.

                This cannot be a Linux/Windows issue, can it?

                Note: I shortened the line to fit on the screen, but it works with the original data as well.

                Cheers
                Claudia

                • Claudia Frank

                  Ah, now I see, I guess: you used the original data without the EOL, but with my regex, which requires the EOL.
                  Yes, that makes sense; it does not work.

                  Cheers
                  Claudia

                  • guy038

                    Hello, @frank-kirschner, @claudia-frank, @scott-sumner and All,

                    I thought of a third regex which, in addition, checks whether:

                    • The END:VCALENDAR string is preceded by a line-break

                    • The BEGIN:VCALENDAR string is followed by a line-break

                    and, in the replacement, this regex S/R adds a line-break to the OP's text wherever one is not already present.

                    So, assuming the four possible cases below:

                    blah blah
                    bla bla blaEND:VCALENDAR
                    3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDARblah blah blah
                    bla bla...
                    
                    
                    blah blah
                    bla bla blaEND:VCALENDAR
                    3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDAR
                    blah blah blah...
                    bla bla...
                    
                    blah blah
                    bla bla bla
                    END:VCALENDAR
                    3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDARblah blah blah
                    bla bla...
                    
                    
                    blah blah
                    bla bla bla
                    END:VCALENDAR
                    3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDAR
                    blah blah blah
                    bla bla...
                    

                    then the regex S/R:

                    SEARCH (?s-i)((\R)?END:VCALENDAR).*?(BEGIN:VCALENDAR(\R)?)

                    REPLACE (?2:\r\n)\1\r\n\3(?4:\r\n)

                    gives the following text (four identical blocks of text):

                    blah blah
                    bla bla bla
                    END:VCALENDAR
                    BEGIN:VCALENDAR
                    blah blah blah
                    bla bla...
                    
                    
                    blah blah
                    bla bla bla
                    END:VCALENDAR
                    BEGIN:VCALENDAR
                    blah blah blah...
                    bla bla...
                    
                    blah blah
                    bla bla bla
                    END:VCALENDAR
                    BEGIN:VCALENDAR
                    blah blah blah
                    bla bla...
                    
                    
                    blah blah
                    bla bla bla
                    END:VCALENDAR
                    BEGIN:VCALENDAR
                    blah blah blah
                    bla bla...
                    

                    Et voilà !


                    Notes :

                    • You may either click several times on the Replace button or click once on the Replace All button

                    • In search :

                      • First, the (?s-i) modifiers force:

                        • The search to be case-sensitive ( NOT insensitive ! )

                        • The special dot character . to match any single character, even an End of Line one

                      • Then group 1 contains the string END:VCALENDAR, possibly preceded by a line-break

                      • The part (\R)? ( identical to the form (\R){0,1} ) represents that optional line-break ( group 2 )

                      • Now, the .*? part ( identical to .{0,}? ) stands for the shortest possible run of any characters between the two strings END:VCALENDAR and BEGIN:VCALENDAR

                      • Group 3 contains the string BEGIN:VCALENDAR, possibly followed by a line-break

                      • Finally, the part (\R)? ( identical to the form (\R){0,1} ) represents that optional line-break ( group 4 )

                    • In replacement :

                      • The conditional replacement (?2:\r\n) writes a line-break before the string END:VCALENDAR, only if group 2 ( \R ) did not match

                      • The block \1\r\n\3 rewrites groups 1 and 3, that is the strings END:VCALENDAR and BEGIN:VCALENDAR together with any line-break they captured, separated by a line-break ( \r\n )

                      • The conditional replacement (?4:\r\n) writes a line-break after the string BEGIN:VCALENDAR, only if group 4 ( \R ) did not match
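
The same logic can be sketched in Python for testing outside Notepad++. This is an emulation under stated assumptions, not the Boost syntax itself: Python's re has neither \R nor the (?N:...) conditional replacement, so the line break is written as an explicit alternation and the conditions are handled in a replacement function. The function name normalise and the four sample strings are hypothetical.

```python
import re

# Same four groups as the SEARCH regex, with \R spelled out as an alternation.
pattern = re.compile(
    r"((\r\n|\n|\r)?END:VCALENDAR).*?(BEGIN:VCALENDAR(\r\n|\n|\r)?)",
    re.DOTALL,
)

def normalise(m):
    # Emulates (?2:\r\n): add a line break only if group 2 did not match.
    before = "" if m.group(2) else "\r\n"
    # Emulates (?4:\r\n): add a line break only if group 4 did not match.
    after = "" if m.group(4) else "\r\n"
    # Emulates \1\r\n\3 between the two conditionals.
    return before + m.group(1) + "\r\n" + m.group(3) + after

# The four cases: line break missing or present before END and after BEGIN.
cases = [
    "bla blaEND:VCALENDAR\r\njunkBEGIN:VCALENDARblah\r\n",
    "bla blaEND:VCALENDAR\r\njunkBEGIN:VCALENDAR\r\nblah\r\n",
    "bla bla\r\nEND:VCALENDAR\r\njunkBEGIN:VCALENDARblah\r\n",
    "bla bla\r\nEND:VCALENDAR\r\njunkBEGIN:VCALENDAR\r\nblah\r\n",
]
results = [pattern.sub(normalise, text) for text in cases]

for r in results:
    print(repr(r))
```

All four inputs should come out as the same text, with END:VCALENDAR and BEGIN:VCALENDAR each on a line of their own.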

                    Cheers,

                    guy038

                    The Community of users of the Notepad++ text editor.