search and replace / regEx

  • F
    Frank Kirschner
    last edited by May 3, 2018, 9:41 AM

    Hi community.
    I have a corrupted iCalendar file and want to use search and replace to delete the junk between two tags:

    END:VCALENDAR
    3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDAR

    Everything between END:VCALENDAR and BEGIN:VCALENDAR should be deleted.
    What search pattern do I have to use?

    Thanks, and best regards,
    Frank

    • C
      Claudia Frank @Frank Kirschner
      last edited by May 3, 2018, 11:12 AM

      @Frank-Kirschner

      Using the Regular expression search mode in the Find dialog,
      Find what is

      (?s)(?<=BEGIN:VCALENDAR).*?(?=\REND:VCALENDAR)
      

      and Replace with stays empty; then press Replace All.
      Note that, because of a bug, you cannot use the Replace button to step through the file
      and see what gets replaced.

      Cheers
      Claudia

      • S
        Scott Sumner @Claudia Frank
        last edited by May 3, 2018, 11:59 AM

        @Claudia-Frank

        Did you get BEGIN and END mixed up? I found that your regexp didn’t work on the sample data, but this one seems to (the biggest change is to swap BEGIN and END):

        (?s-i)(?<=END:VCALENDAR).*?(?=BEGIN:VCALENDAR)
        

        Perhaps the OP would like to be able to individually cycle thru the matches. The above solutions don’t allow that but this one does:

        Find what zone: (?s-i)END:VCALENDAR.*?BEGIN:VCALENDAR
        Replace with zone: END:VCALENDARBEGIN:VCALENDAR
        Search mode: Regular expression
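
The two approaches above can be compared outside the editor. The following is a minimal sketch in Python, not Notepad++ itself: the `sample` string is a shortened, made-up stand-in for the OP's data, and `(?s-i)` is written as `(?s)` because Python's `re` is case-sensitive by default and does not accept `-i` in a global inline flag.

```python
import re

# Hypothetical, shortened stand-in for the OP's data: junk sits between
# each END:VCALENDAR and the following BEGIN:VCALENDAR.
sample = (
    "END:VCALENDAR\njunk-1 junk-2 BEGIN:VCALENDAR\n"
    "END:VCALENDAR\njunk-3BEGIN:VCALENDAR\n"
)

# Variant 1: lookbehind/lookahead. Only the junk is matched, so the
# replacement is the empty string.
lookaround = re.compile(r"(?s)(?<=END:VCALENDAR).*?(?=BEGIN:VCALENDAR)")

# Variant 2: the markers are part of the match, so they are written back
# in the replacement.
consuming = re.compile(r"(?s)END:VCALENDAR.*?BEGIN:VCALENDAR")

cleaned_a = lookaround.sub("", sample)
cleaned_b = consuming.sub("END:VCALENDARBEGIN:VCALENDAR", sample)

print(cleaned_a == cleaned_b)
```

Both variants should leave the same text here; the difference discussed in this thread is only how each one behaves when stepping through matches one at a time in the Notepad++ Replace dialog.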

        • C
          Claudia Frank @Scott Sumner
          last edited by May 3, 2018, 12:10 PM

          @Scott-Sumner

          Hi Scott,
          thanks for the heads-up - yes, I did - I was just thinking BEGIN comes before END, but I guess that is where the OP's issue comes from.

          But the regex itself, when switching the END/BEGIN terms, should work, shouldn’t it?

          (?s)(?<=END:VCALENDAR).*?(?=\RBEGIN:VCALENDAR)
          

          does it for me.

          Cheers
          Claudia

          • S
            Scott Sumner @Claudia Frank
            last edited by May 3, 2018, 12:22 PM

            @Claudia-Frank

            Starting with your original regexp, I added the -i because the OP’s spec was definitely uppercase and I deleted your \R because there didn’t seem to be a requirement that a line-ending occurred before the BEGIN, and indeed in the sample data there doesn’t appear to be one? Thus, copying the OP’s sample data to a N++ tab and trying your original regexp on it yielded no matches for me…hmmm…
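
The effect of requiring a line-ending before BEGIN can be reproduced with a small Python sketch. This is an illustration under assumptions: Python's `re` has no `\R`, so the hypothetical `EOL` fragment stands in for it, and both test strings are made-up, shortened versions of the sample data.

```python
import re

# Rough stand-in for \R (assumption: Python's re module lacks \R).
EOL = r"(?:\r\n|\n|\r)"

# Junk runs straight into BEGIN with no line break, as in the pasted sample.
no_eol = "END:VCALENDAR\njunk BEGIN:VCALENDAR\n"
# Same data, but with a line break right before BEGIN.
with_eol = "END:VCALENDAR\njunk\nBEGIN:VCALENDAR\n"

# Strict: the lookahead insists on a line break before BEGIN:VCALENDAR.
strict = re.compile(r"(?s)(?<=END:VCALENDAR).*?(?=" + EOL + r"BEGIN:VCALENDAR)")
# Relaxed: no line-break requirement.
relaxed = re.compile(r"(?s)(?<=END:VCALENDAR).*?(?=BEGIN:VCALENDAR)")

strict_on_no_eol = strict.search(no_eol)       # no match expected
strict_on_with_eol = strict.search(with_eol)   # matches the junk only
relaxed_on_no_eol = relaxed.search(no_eol)     # matches the junk only

print(strict_on_no_eol, strict_on_with_eol, relaxed_on_no_eol)
```

The strict pattern finds nothing when the junk is glued to BEGIN:VCALENDAR, which is consistent with the "no matches" result reported above.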

            • C
              Claudia Frank @Scott Sumner
              last edited by Claudia Frank May 3, 2018, 12:36 PM May 3, 2018, 12:35 PM

              @Scott-Sumner

              You are right: when copying the sample data there seems to be no EOL, but from what is displayed I assumed there was one.
              But even without \R it seems to work for me.

              This cannot be a Linux/Windows issue, can it?

              Note, I shortened the line to fit on the screen, but it works with the original data as well.

              Cheers
              Claudia

              • C
                Claudia Frank
                last edited by Claudia Frank May 3, 2018, 12:40 PM May 3, 2018, 12:40 PM

                Ah, now I see, I guess: you used the original data without the EOL, but with my regex, which included the EOL.
                Yes, that makes sense; it does not work.

                Cheers
                Claudia

                • G
                  guy038
                  last edited by guy038 May 3, 2018, 7:35 PM May 3, 2018, 7:13 PM

                  Hello, @frank-kirschner, @claudia-frank, @scott-sumner and All,

                  I thought about a third regex which, in addition, checks whether:

                  • The END:VCALENDAR string is preceded by a line-break

                  • The BEGIN:VCALENDAR string is followed by a line-break

                  and, in the replacement, this regex S/R adds a line-break wherever one was not initially present in the OP's text

                  So, assuming the four possible cases below:

                  blah blah
                  bla bla blaEND:VCALENDAR
                  3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDARblah blah blah
                  bla bla...
                  
                  
                  blah blah
                  bla bla blaEND:VCALENDAR
                  3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDAR
                  blah blah blah...
                  bla bla...
                  
                  blah blah
                  bla bla bla
                  END:VCALENDAR
                  3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDARblah blah blah
                  bla bla...
                  
                  
                  blah blah
                  bla bla bla
                  END:VCALENDAR
                  3F68636A-A88D-4B6D-95C7-DC5B65910335.ics ZÔúƒ4504b552cef6ac7c1141ef12fba9a94a ²VEVENT ZÚ p ZÚˆ3F68636A-A88D-4B6D-95C7-DC5B65910335€ $ (È„ P€ „ Í`õ ¢BEGIN:VCALENDAR
                  blah blah blah
                  bla bla...
                  

                  then the regex S/R :

                  SEARCH (?s-i)((\R)?END:VCALENDAR).*?(BEGIN:VCALENDAR(\R)?)

                  REPLACE (?2:\r\n)\1\r\n\3(?4:\r\n)

                  gives the following text ( four identical blocks of text ) :

                  blah blah
                  bla bla bla
                  END:VCALENDAR
                  BEGIN:VCALENDAR
                  blah blah blah
                  bla bla...
                  
                  
                  blah blah
                  bla bla bla
                  END:VCALENDAR
                  BEGIN:VCALENDAR
                  blah blah blah...
                  bla bla...
                  
                  blah blah
                  bla bla bla
                  END:VCALENDAR
                  BEGIN:VCALENDAR
                  blah blah blah
                  bla bla...
                  
                  
                  blah blah
                  bla bla bla
                  END:VCALENDAR
                  BEGIN:VCALENDAR
                  blah blah blah
                  bla bla...
                  

                  Et voilà !


                  Notes :

                  • You may either click several times on the Replace button or click once on the Replace All button

                  • In search :

                    • First, the (?s-i) modifiers force :

                      • The search to be performed in a case-sensitive way ( -i switches case-insensitivity off )

                      • The special dot character . to match any single character, even an End of Line one

                    • Then group 1 contains the string END:VCALENDAR, possibly preceded by a line-break

                    • The part (\R)? ( identical to the form (\R){0,1} ) represents that optional line-break ( group 2 )

                    • Now, the .*? part ( identical to .{0,}? ) stands for the shortest possible range of any characters between the two strings END:VCALENDAR and BEGIN:VCALENDAR

                    • Group 3 contains the string BEGIN:VCALENDAR, possibly followed by a line-break

                    • Finally, the part (\R)? ( identical to the form (\R){0,1} ) represents that optional line-break ( group 4 )

                  • In replacement :

                    • The conditional replacement feature (?2:\r\n) writes a line-break only if group 2 ( \R ) does not exist, before the string END:VCALENDAR

                    • The block \1\r\n\3 writes back groups 1 and 3, that is the strings END:VCALENDAR and BEGIN:VCALENDAR together with any line-break they captured, separated by a line-break ( \r\n )

                    • The conditional replacement feature (?4:\r\n) writes a line-break only if group 4 ( \R ) does not exist, after the string BEGIN:VCALENDAR
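
For readers who want to try the same logic outside Notepad++, here is a minimal Python sketch of the search and replacement described in the notes above. It is an approximation, not the Notepad++ engine: Python's `re` supports neither `\R` nor the `(?N:...)` conditional replacement, so the hypothetical `EOL` fragment stands in for `\R` and the `rebuild` callback (also hypothetical) plays the role of the conditional parts. The test strings are made-up, shortened versions of the four cases.

```python
import re

# Rough stand-in for \R (assumption: Python's re module lacks \R).
EOL = r"(?:\r\n|\n|\r)"

# Same group layout as the SEARCH regex: group 1 holds END plus an optional
# leading line break (group 2); group 3 holds BEGIN plus an optional
# trailing line break (group 4).
pattern = re.compile(
    r"(?s)((" + EOL + r")?END:VCALENDAR).*?(BEGIN:VCALENDAR(" + EOL + r")?)"
)

def rebuild(m):
    # Emulates (?2:\r\n) and (?4:\r\n): add a line break only when the
    # corresponding optional group did not take part in the match.
    lead = "" if m.group(2) else "\r\n"
    trail = "" if m.group(4) else "\r\n"
    return lead + m.group(1) + "\r\n" + m.group(3) + trail

def repair(text):
    return pattern.sub(rebuild, text)

# The four cases: line break before END absent/present, after BEGIN absent/present.
cases = [
    "bla blaEND:VCALENDAR\r\njunkBEGIN:VCALENDARblah\r\n",
    "bla blaEND:VCALENDAR\r\njunkBEGIN:VCALENDAR\r\nblah\r\n",
    "bla bla\r\nEND:VCALENDAR\r\njunkBEGIN:VCALENDARblah\r\n",
    "bla bla\r\nEND:VCALENDAR\r\njunkBEGIN:VCALENDAR\r\nblah\r\n",
]
results = [repair(c) for c in cases]
print(len(set(results)))  # number of distinct outputs across the four cases
```

With Windows line endings in the input, all four cases should collapse to the same repaired text, mirroring the four identical blocks shown earlier in this post.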

                  Cheers,

                  guy038

                  The Community of users of the Notepad++ text editor.