    .ics file selection problem

    Help wanted
    • Marcin Jewiarz
      Marcin Jewiarz last edited by Marcin Jewiarz

      Hi,
      I’m wondering whether it is possible in Notepad++ to select (and copy) all segments from iCalendar files that have a defined string in the SUMMARY: section. The selection would run from BEGIN:VEVENT to END:VEVENT.
      I must extract important dates from many .ics files. I tried to make a macro, but the language is not clear to me, as I’m not a programmer at all. I only made a scheme to color BEGIN, END and SUMMARY.
      Maybe someone has tried to solve a similar problem.
      Example:

      ...
      BEGIN:VEVENT
      DTSTART:20201002T133000
      DTEND:20201002T150000
      DTSTAMP:20201002T133000
      UID:880985+880986+880987+880988+880989+880990
      CLASS:PUBLIC
      DESCRIPTION:Sala: 
      LOCATION:(A 123) 50.079202,19.866041
      SEQUENCE:0
      STATUS:CONFIRMED
      SUMMARY:bla bla bla
      TRANSP:OPAQUE
      COLOR:#34B41F
      INTERNALID:880985+880986+880987+880988+880989+880990
      END:VEVENT
      ...
      
      • Terry R
        Terry R last edited by

        @Marcin-Jewiarz said in .ics file selection problem:

        would be from BEGIN:VEVENT to END:VEVENT

        Yes, it is possible. Given that this file appears to be text based, Notepad++ can open it. I suggest you use the “Mark” function with bookmarking, which allows easy removal or copying once the marking has completed.

        So using the Mark function (under Search menu)
        Find What:(?s-i)BEGIN:VEVENT.+?END:VEVENT
        Make sure “bookmark lines” is ticked and the search mode is set to “regular expression”. Copy the regex above into the Find What field and press “Mark All”. You can now close the window; the main tab in Notepad++ should show many lines marked with a blue circle at the start of each line.

        At this point, use Copy (or Cut) Bookmarked Lines, which is under Search, then Bookmark.
        Open another tab, paste them, then save the file.
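
For readers who would rather script this step than do it in the editor, the same extraction can be sketched in Python. This is only an illustration and not part of Notepad++: the function name `extract_vevents` and the sample text are made up, and `re.DOTALL` stands in for the `(?s)` modifier.

```python
import re

# Same pattern as the Mark step: from BEGIN:VEVENT to the nearest END:VEVENT.
# re.DOTALL plays the role of (?s), so "." also matches line breaks.
VEVENT_RE = re.compile(r"BEGIN:VEVENT.+?END:VEVENT", re.DOTALL)

def extract_vevents(text):
    """Return every BEGIN:VEVENT ... END:VEVENT block found in text."""
    return VEVENT_RE.findall(text)

# Hypothetical sample data, shortened to the lines that matter here.
sample = (
    "BEGIN:VCALENDAR\n"
    "BEGIN:VEVENT\nSUMMARY:bla bla bla\nEND:VEVENT\n"
    "BEGIN:VEVENT\nSUMMARY:something else\nEND:VEVENT\n"
    "END:VCALENDAR\n"
)

blocks = extract_vevents(sample)
print(len(blocks))
```

Like the Mark step, this collects every event regardless of its SUMMARY, so it can return more than is wanted.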

        Terry

        • Marcin Jewiarz
          Marcin Jewiarz @Terry R last edited by

          @Terry-R
          Thanks, but I need one more criterion: SUMMARY: should also contain a defined string, in this example “bla bla bla”.

          • PeterJones
            PeterJones @Marcin Jewiarz last edited by

            @Marcin-Jewiarz said,

            files that have defined string in SUMMARY: section

            @Terry-R’s solution shows how to do it for any contents of BEGIN:VEVENT to END:VEVENT, but I get the impression that you actually want a decision on whether a block should be marked based on whether SUMMARY was populated or not, in which case, it would be different.

            What does an empty/missing SUMMARY look like? Is it SUMMARY: with nothing (just a newline) after the colon, or is there just no SUMMARY line at all?

            Also, I grabbed a random .ics file that I had lying around, and it sometimes has things like SUMMARY;ENCODING=QUOTED-PRINTABLE:blah blah blah – should the syntax we come up with be able to handle extra parameters on the SUMMARY, or should it always assume SUMMARY:?

            I’ve never knowingly encountered a multi-event .ics file – but my iCalendar file experience is limited (until today, I had never opened one up and seen that it was text; I had always assumed it was a binary format). But you also said,

            I must extract important dates from many .ics.

            So, to clarify: Can your .ics files have more than one event in them, or is each file a separate event? And do you expect this macro (or in Terry’s suggestion, the single regex) to just copy from one input .ics at a time, or do you want the equivalent of a Find In Files which finds all matches in a list of files in one go?

            • PeterJones
              PeterJones @PeterJones last edited by

              @PeterJones said in .ics file selection problem:

              What does an empty/missing SUMMARY look like?

              … or did I misinterpret, and you want a specific bla bla bla, not just empty/populated

              • Terry R
                Terry R @Marcin Jewiarz last edited by

                @Marcin-Jewiarz said in .ics file selection problem:

                SUMMARY: should have also defined string

                As @PeterJones said we need a bit more info.
                There is the possibility that my solution has copied TOO much. That’s not a problem, as we could then define an additional regex (regular expression) that works on JUST the lines we extracted. That might be a bit simpler than attempting to be more exact with the first regex I supplied, especially if the criterion is not easily defined.

                Terry

                • Marcin Jewiarz
                  Marcin Jewiarz @PeterJones last edited by

                  @PeterJones
                  Thank you. In these files there is always something after SUMMARY:
                  The @Terry-R idea with RegEx seems fair, as I have 8 files to check, each with up to 50 BEGIN:VEVENT to END:VEVENT blocks. If it would be tuned to find in this block SUMMARY: bla bla bla this would be more than enough.

                    • PeterJones
                      PeterJones @Marcin Jewiarz last edited by

                      @Marcin-Jewiarz ,

                      This is one of the many times when my standard advice of “show both data that matches and data that does not” would be really helpful.

                      Have I interpreted correctly: given the data in my text box, you would like to copy what I’ve shown selected in the image, but not the other sections. Am I correct?

                      BEGIN:VEVENT
                      ...
                      SUMMARY:dont include me
                      ...
                      END:VEVENT
                      BEGIN:VEVENT
                      ...
                      SUMMARY:bla bla bla
                      ...
                      END:VEVENT
                      BEGIN:VEVENT
                      ...
                      SUMMARY:dont include me
                      ...
                      END:VEVENT
                      BEGIN:VEVENT
                      ...
                      SUMMARY:bla bla bla
                      ...
                      END:VEVENT
                      BEGIN:VEVENT
                      ...
                      SUMMARY:dont include me
                      ...
                      END:VEVENT
                      

                      c7d8102d-2620-4ce2-8d3f-6c5278fb6a9a-image.png

                      • PeterJones
                        PeterJones @Marcin Jewiarz last edited by

                        @Marcin-Jewiarz

                        You also didn’t answer my question about whether ;ENCODING=... can modify the SUMMARY or not. In general it can; what I need to know is whether it can occur in your data.

                        And the long form of my advice follows, since it hasn’t been in this thread yet:

                        ----

                        Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All example text should be marked as plain text using the </> toolbar button or manual Markdown syntax. Screenshots can be pasted from the clipboard to your post using Ctrl+V to show graphical items, but any text should be included as literal text in your post so we can easily copy/paste your data. Show the data you have and the text you want to get from that data; include examples of things that should match and be transformed, and things that don’t match and should be left alone; show edge cases and make sure your examples are as varied as your real data. Show the regex you already tried, and why you thought it should work; tell us what’s wrong with what you do get… Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ. If you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.

                        • Marcin Jewiarz
                          Marcin Jewiarz @PeterJones last edited by

                          @PeterJones said in .ics file selection problem:

                          Have I interpreted correctly: given the data in my t

                          Yes that’s what I’m looking for.

                          • Terry R
                            Terry R @Marcin Jewiarz last edited by

                            @Marcin-Jewiarz said in .ics file selection problem:

                            If it would be tuned to find in this block SUMMARY: bla bla bla this would be more than enough

                            So the steps to be performed on the data already extracted are:

                            1. Convert each “record set” into 1 line
                            2. Mark those lines with “bla bla bla” in them
                            3. Remove non-marked lines
                            4. Convert the single line records back to normal

                            1: We will be using the Replace function.
                            Find What:(?s)\R(?!BEGIN)
                            Replace With:@#@
                            Search Mode must be regular expression, with wrap around ticked. Click on the “Replace All” button. All record sets should now be on single lines.

                            2: Using the Mark function we have
                            Find What:(?i-s)SUMMARY.+?\Qbla bla bla\E
                            Have “bookmark lines” ticked. Replace the bla bla bla in the line above with the “literal” text you want to look for. You will see it is enclosed within the \Q and \E metacharacters. This lets you safely use any character within this area without worrying that some might have special meaning in the regex. Click on the “Mark All” button. Close the window once completed; some lines should now be marked.

                            3: Under Search, Bookmark, use the “Remove unmarked Lines”. So at this point ONLY those with “bla bla bla” should remain.

                            4: Return the lines to normal. Use the Replace function
                            Find What:@#@
                            Replace With:\r\n
                            All sections of each record set should be on their own line now.
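
The four steps above can also be sketched in Python, for anyone who wants to repeat them over several files. This is a minimal sketch under assumptions, not a Notepad++ feature: the function name `keep_matching_records` and the sample data are hypothetical, `\r?\n` is used because Python's `re` has no `\R`, and `re.escape` replaces `\Q...\E`.

```python
import re

SEP = "@#@"  # same placeholder as in the steps above; it must not occur in the data

def keep_matching_records(text, wanted):
    """Keep only the records whose SUMMARY is followed by the wanted text."""
    # Step 1: join each record into one line by replacing every line break
    # that is not followed by BEGIN with the placeholder.
    joined = re.sub(r"\r?\n(?!BEGIN)", SEP, text)
    # Steps 2 and 3: keep only the lines where SUMMARY is followed by the wanted text.
    # re.escape treats the wanted text literally, as \Q...\E does.
    marker = re.compile(r"SUMMARY.+?" + re.escape(wanted), re.IGNORECASE)
    kept = [line for line in joined.splitlines() if marker.search(line)]
    # Step 4: turn the placeholder back into line breaks.
    return "\n".join(kept).replace(SEP, "\n")

# Hypothetical sample: two already-extracted records.
sample = (
    "BEGIN:VEVENT\nDTSTART:20201002T133000\nSUMMARY:bla bla bla\nEND:VEVENT\n"
    "BEGIN:VEVENT\nDTSTART:20201003T090000\nSUMMARY:dont include me\nEND:VEVENT\n"
)

result = keep_matching_records(sample, "bla bla bla")
print(result)
```

As in the manual procedure, the input is assumed to contain only the extracted BEGIN:VEVENT to END:VEVENT blocks.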

                            I hope this helps.

                            Terry

                            • Marcin Jewiarz
                              Marcin Jewiarz @Terry R last edited by

                              @Terry-R said in .ics file selection problem:

                              (?i-s)SUMMARY.+?\Qbla bla bla\E

                              Thank you very much. This is great; I’ll certainly try to learn more about RegEx, as this is the second time this week I’ve used it.
                              The first was a simple code found in one of the communities to extract important data from a service register from laboratory equipment. Now, this. I can make a macro and use it on other files, with modifications for different SUMMARY: parameters.
                              Once again, thank you @Terry-R !

                              • guy038
                                guy038 last edited by guy038

                                Hello, @marcin-jewiarz, @Terry-r, @peterjones and All,

                                We may solve the problem in a more simple way, with these two other solutions :

                                • First solution :

                                  • Use the Mark regex (?xs-i) BEGIN:VEVENT ((?!BEGIN:).)*? \Qbla bla bla\E .*? END:VEVENT\R?

                                  • Then, run the menu option Search > Bookmark > Remove Unmarked Lines

                                • Second solution :

                                  • Use the regex S/R, below, with a negative look-ahead :

                                    • SEARCH (?xs-i) BEGIN:VEVENT \R ((?!BEGIN:|SUMMARY:\Qbla bla bla\E).)+? END:VEVENT \R?

                                    • REPLACE Leave EMPTY

                                See an updated version of these regexes at the end of this post :

                                https://community.notepad-plus-plus.org/post/58092

                                For instance, given this text :

                                BEGIN:VEVENT
                                ...
                                SUMMARY:dont include me
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:dont include me
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                   SUMMARY:bla bla bla
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:bla bla bla
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                               SUMMARY:dont include me
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:dont include me
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:bla bla bla
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:dont include me
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:dont include me
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:dont include me
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:bla bla bla
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:bla bla bla
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:dont include me
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:bla bla bla
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:dont include me
                                ...
                                END:VEVENT
                                

                                After running this S/R, we get our expected results :

                                BEGIN:VEVENT
                                ...
                                   SUMMARY:bla bla bla
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:bla bla bla
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:bla bla bla
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:bla bla bla
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:bla bla bla
                                ...
                                END:VEVENT
                                BEGIN:VEVENT
                                ...
                                SUMMARY:bla bla bla
                                ...
                                END:VEVENT
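
The second solution can be tried outside Notepad++ with a short Python sketch. It is an approximation under stated assumptions: the function name `drop_other_events` and the sample data are hypothetical, `\R` is written as `\r?\n`, and `\Q...\E` is replaced by `re.escape`, since Python's `re` module supports neither.

```python
import re

def drop_other_events(text, wanted):
    """Delete every VEVENT block that has no SUMMARY: line starting with wanted."""
    pattern = re.compile(
        r"BEGIN:VEVENT\r?\n"
        # Tempered dot: consume one character at a time, but never step onto
        # a new BEGIN: or onto the wanted SUMMARY line.
        r"(?:(?!BEGIN:|SUMMARY:" + re.escape(wanted) + r").)+?"
        r"END:VEVENT(?:\r?\n)?",
        re.DOTALL,
    )
    return pattern.sub("", text)

# Hypothetical sample: only the middle event has the wanted SUMMARY.
sample = (
    "BEGIN:VEVENT\nUID:1\nSUMMARY:dont include me\nEND:VEVENT\n"
    "BEGIN:VEVENT\nUID:2\nSUMMARY:bla bla bla\nEND:VEVENT\n"
    "BEGIN:VEVENT\nUID:3\nSUMMARY:dont include me\nEND:VEVENT\n"
)

result = drop_other_events(sample, "bla bla bla")
print(result)
```

A block with the wanted SUMMARY can never be matched in full, because the lookahead stops the match at that line, so only the other blocks are deleted.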
                                

                                We may use the negative look-ahead feature of the second regex to force conditions on several lines, too ! For instance, let’s suppose that each BEGIN:........END: block contains :

                                • A line containing Line_<Letter> and that you want to keep the lines Line_A, Line_B and Line_C, only

                                • A line containing Expression_<Letter> and that you want to keep the lines Expression_X, Expression_Y and Expression_Z, only

                                Then, given this sample :

                                BEGIN:
                                ...
                                Line_C
                                ...
                                test    Expression_X
                                ...
                                END:
                                BEGIN:
                                ...
                                Expression_PTEST
                                ...
                                    Line_B
                                ...
                                END:
                                     BEGIN:
                                ...
                                Line_E
                                ...
                                Expression_X
                                ...
                                END:
                                BEGIN:
                                ...
                                Expression_M
                                ...
                                  Line_ATEST
                                ...
                                     END:
                                BEGIN:
                                ...
                                Line_B   Expression_H
                                ...
                                ...
                                END:
                                    BEGIN:
                                ...
                                Expression_X
                                ...
                                    Line_K
                                ...
                                     END:
                                BEGIN:
                                ...
                                Line_C
                                ...
                                test    Expression_U
                                ...
                                   END:
                                BEGIN:
                                ...
                                Test   Line_E
                                ...
                                Expression_Q
                                ...
                                END:
                                BEGIN:
                                ...
                                   Expression_X
                                ...
                                TEST_Line_A
                                ...
                                    END:
                                    BEGIN:
                                ...
                                Expression_Y_TEST
                                ...
                                   Line_E
                                ...
                                END:
                                   BEGIN:
                                ...
                                Line_A
                                ...
                                   __Expression_Y__
                                ...
                                    END:
                                BEGIN:
                                ...
                                    TESTLine_M_TEST_Expression_ZTest
                                ...
                                END:
                                BEGIN:
                                ...
                                123456789Expression_Y
                                ...
                                Line_B_OK
                                ...
                                END:
                                BEGIN:
                                ...
                                Line_MTEST
                                ...
                                   Expression_J
                                ...
                                END:
                                     BEGIN:
                                ...
                                Expression_H   Line_L
                                ...
                                END:
                                BEGIN:
                                ...
                                Expression_Z
                                ...
                                    Line_G
                                ...
                                      END:
                                

                                The following regex S/R deletes any block which does not contain the expression Line_A, Line_B or Line_C :

                                • SEARCH (?xs-i) ^\h* BEGIN: ((?!BEGIN:|Line_A|Line_B|Line_C).)+? END: .*?$ \R?

                                • REPLACE Leave EMPTY

                                We get :

                                BEGIN:
                                ...
                                Line_C
                                ...
                                test    Expression_X
                                ...
                                END:
                                BEGIN:
                                ...
                                Expression_PTEST
                                ...
                                    Line_B
                                ...
                                END:
                                BEGIN:
                                ...
                                Expression_M
                                ...
                                  Line_ATEST
                                ...
                                     END:
                                BEGIN:
                                ...
                                Line_B   Expression_H
                                ...
                                ...
                                END:
                                BEGIN:
                                ...
                                Line_C
                                ...
                                test    Expression_U
                                ...
                                   END:
                                BEGIN:
                                ...
                                   Expression_X
                                ...
                                TEST_Line_A
                                ...
                                    END:
                                   BEGIN:
                                ...
                                Line_A
                                ...
                                   __Expression_Y__
                                ...
                                    END:
                                BEGIN:
                                ...
                                123456789Expression_Y
                                ...
                                Line_B_OK
                                ...
                                END:
                                

                                This last regex S/R deletes any block which does not contain the expression Expression_X, Expression_Y or Expression_Z :

                                • SEARCH (?xs-i) ^\h* BEGIN: ((?!BEGIN:|Expression_X|Expression_Y|Expression_Z).)+? END: .*?$ \R?

                                • REPLACE Leave EMPTY

                                Nice ! Now, each remaining block, below, has both :

                                • A line containing Line_A, Line_B or Line_C

                                • A line containing Expression_X, Expression_Y or Expression_Z

                                BEGIN:
                                ...
                                Line_C
                                ...
                                test    Expression_X
                                ...
                                END:
                                BEGIN:
                                ...
                                   Expression_X
                                ...
                                TEST_Line_A
                                ...
                                    END:
                                   BEGIN:
                                ...
                                Line_A
                                ...
                                   __Expression_Y__
                                ...
                                    END:
                                BEGIN:
                                ...
                                123456789Expression_Y
                                ...
                                Line_B_OK
                                ...
                                     END:
                                

                                Notes :

                                • The strings BEGIN: and END: may be preceded by some blank characters

                                • You may add characters after the strings BEGIN: and END:

                                • The expressions to exclude may occur at any location, within a block
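
The two successive S/R passes above can be sketched as one reusable Python function that is called twice. This is only a sketch under assumptions: the name `drop_blocks_without` and the sample data are hypothetical, `\h*` is written as `[ \t]*`, and `.*?$ \R?` is written as `[^\r\n]*(?:\r?\n)?`, because Python's `re` has neither `\h` nor `\R`.

```python
import re

def drop_blocks_without(text, keep_words):
    """Delete every BEGIN: ... END: block that contains none of keep_words."""
    alternatives = "|".join(re.escape(word) for word in keep_words)
    pattern = re.compile(
        r"^[ \t]*BEGIN:"                             # optional leading blanks
        r"(?:(?!BEGIN:|" + alternatives + r").)+?"   # tempered dot
        r"END:[^\r\n]*(?:\r?\n)?",                   # rest of the END: line and its break
        re.DOTALL | re.MULTILINE,
    )
    return pattern.sub("", text)

# Hypothetical sample: four blocks, only the first satisfies both conditions.
sample = (
    "BEGIN:\nLine_C\nExpression_X\nEND:\n"
    "BEGIN:\nLine_B\nExpression_P\nEND:\n"
    "  BEGIN:\nLine_E\nExpression_X\n  END:\n"
    "BEGIN:\nLine_K\nExpression_Q\nEND:\n"
)

after_lines = drop_blocks_without(sample, ["Line_A", "Line_B", "Line_C"])
after_both = drop_blocks_without(after_lines, ["Expression_X", "Expression_Y", "Expression_Z"])
print(after_both)
```

Running the passes one after the other gives the "both conditions" result, since each pass only removes blocks that fail its own condition.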

                                Best Regards,

                                guy038

                                • Terry R
                                  Terry R @guy038 last edited by

                                  @guy038 said in .ics file selection problem:

                                  We may solve the problem in a more simple way

                                  I like it very much. You were probably seeing the issue I had trying to LOOK for the bla bla bla, whereas your idea is that we should look for any that DON’T have the bla bla bla in them, hence the negative lookahead.

                                  Might I just add 2 sentences for the benefit of @Marcin-Jewiarz, just in case he didn’t notice.

                                  1. When you say to use the “Mark” regex (First solution) you forgot to mention the requirement to tick the “bookmark lines”. Obviously without it there are no lines bookmarked and the next step will therefore remove ALL lines.
                                  2. In (?xs-i), the x option marks the regex that follows as being of a “free form nature”: the spaces shown are NOT part of the pattern, but exist ONLY to make it easier to read. This, along with the \Q and \E regex functions, isn’t used much but perhaps should be, especially when OPs come to us with words like “bla bla bla” and we have to say “insert your text in this position”. Without knowing what the actual text is, it can sometimes cause issues when one or more of its characters is actually a metacharacter.
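
Both points in item 2 have close counterparts in Python, which makes them easy to try out. The sketch below is illustrative only: `re.VERBOSE` is Python's equivalent of `(?x)`, `re.escape` is used in place of `\Q...\E` (which Python's `re` does not support), and the sample strings are made up.

```python
import re

# Free-spacing mode: with re.VERBOSE, unescaped whitespace in the pattern is
# ignored, so these two patterns match exactly the same text.
compact = re.compile(r"BEGIN:VEVENT.+?END:VEVENT", re.DOTALL)
spaced = re.compile(r"BEGIN:VEVENT  .+?  END:VEVENT", re.DOTALL | re.VERBOSE)

text = "BEGIN:VEVENT\nSUMMARY:x\nEND:VEVENT"
same = compact.findall(text) == spaced.findall(text)

# Literal text: a search string with metacharacters does not match itself
# unless it is escaped first.
literal = "Lab (A 123) 1+1=2?"
unescaped_hit = re.search(literal, literal) is not None
escaped_hit = re.search(re.escape(literal), literal) is not None

print(same, unescaped_hit, escaped_hit)
```

The unescaped search fails because the parentheses are read as a group and `+` and `?` as quantifiers, which is the kind of surprise `\Q...\E` avoids.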

                                  Cheers
                                  Terry

                                  • Terry R
                                    Terry R @guy038 last edited by Terry R

                                    @guy038 said in .ics file selection problem:

                                    We may solve the problem in a more simple way

                                    @guy038, as your 2nd regex (which removes the non “bla bla bla” record sets) intrigued me, I wondered if a slight alteration might allow the whole process to be carried out with 1 regex. So do a (book)mark with a single regex, then use “Remove Unmarked Lines”.

                                    I think I may have cracked it. I’m still a bit hesitant to put it forward as a solution as it’s quite complicated and dare I say it, not something I’d expect anybody to readily adapt to any future need. It was really just an exercise to satisfy my curiosity.

                                    So the regex is:
                                    (?s-i)BEGIN:VEVENT\R((?=SUMMARY:\Qbla bla bla\E).|(?!SUMMARY|BEGIN:).)+?END:VEVENT\R?
                                    After running this regex with bookmarking, all record sets we want to keep will be marked. So we’re back with the positive look-ahead (at least in part), which allows us to remove all the extraneous data not of the BEGIN:VEVENT…END:VEVENT type and the non “bla bla bla” sets in one step.

                                    I’d value your input on the validity of this. It appears to work on some demo data which includes some without the “bla bla bla” text so from that point of view it is a success.

                                    Terry

                                    • Terry R
                                      Terry R last edited by Terry R

                                      To all who are interested in my synopsis:

                                      I actually fell onto this quite by chance. I’d edited @guy038’s regex to try the positive lookahead again. My regex was picking up all the BEGIN:VEVENT…END:VEVENT sets again. On a whim I added ?!SUMMARY in front of ?!BEGIN as an alternation and suddenly it seemed to work. Several tests later it was still working.

                                      I’ve now been pulling my regex apart trying to better understand HOW it works, I suppose not quite believing it. It does seem contrary to both have a positive lookahead and then also a negative using the same characters. So if I understand it correctly:

                                      1. We start processing a record set starting with the BEGIN:VEVENT
                                      2. Several lines later we approach the SUMMARY line where we want to find the bla bla bla string. This is the lookahead.
                                      3. For a record set not containing bla bla bla we fail this positive lookahead (?=SUMMARY:\Qbla bla bla\E).
                                      4. As step 3 failed, we use the alternation option. At this point it becomes a bit difficult to understand. As alternation works from left to right, we first assert we don’t want SUMMARY. As we do currently have this, that part of the negative lookahead fails immediately. The other part asserts we don’t want BEGIN:, which we don’t have, and here I would have thought it would continue, but it appears to fail. At least that record set is NOT bookmarked and we start all over again. Actually, a glimmer of light: is it because, at the start of the SUMMARY line, the positive lookahead fails and ?!SUMMARY fails as well, so neither alternative can consume that character? The regex then cannot get past the SUMMARY line to reach END:VEVENT, hence it fails. Thus the regex won’t bookmark a non bla bla bla set.
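The steps above can be probed one position at a time. This is a rough Python sketch, not the Notepad++ engine: the two alternatives are compiled separately, re.escape() stands in for \Q...\E, and the helper name which_alternative and both record bodies are made up for illustration.

```python
import re

WANTED = re.escape("bla bla bla")  # assumed stand-in for \Qbla bla bla\E

# The two alternatives of the group, compiled separately (re.DOTALL = (?s)).
first_alt = re.compile(rf"(?=SUMMARY:{WANTED}).", re.DOTALL)
second_alt = re.compile(r"(?!SUMMARY|BEGIN:).", re.DOTALL)


def which_alternative(text, pos):
    """Name the alternative able to consume text[pos], or None if neither can."""
    if first_alt.match(text, pos):
        return "first"
    if second_alt.match(text, pos):
        return "second"
    return None


# Hypothetical record bodies, for illustration only.
wanted_body = "DTSTART:20201002T133000\nSUMMARY:bla bla bla\n"
other_body = "DTSTART:20201003T090000\nSUMMARY:something else\n"

at_wanted = which_alternative(wanted_body, wanted_body.index("SUMMARY"))
at_other = which_alternative(other_body, other_body.index("SUMMARY"))
elsewhere = which_alternative(other_body, 0)
print(at_wanted, at_other, elsewhere)
```

At the start of a non-matching SUMMARY line neither alternative can consume the character, which is why the repeated group cannot advance to END:VEVENT for that record set.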

                                      Whew, have I actually understood it!

                                      Terry

                                      • Terry R
                                        Terry R last edited by Terry R

                                        Further testing has given me another revised regex, shorter than before.

                                        I think this one is very easy to understand and could serve as the final solution.

                                        (?s-i)BEGIN:VEVENT\R((?=SUMMARY:\Qbla bla bla\E).|(?!SUMMARY:).)+?END:VEVENT\R?

                                        1. We want a set that contains the BEGIN and END lines and contains “SUMMARY:bla bla bla”.
                                        2. If the SUMMARY line does not carry that text, the second alternative says we CANNOT have SUMMARY: anywhere within these boundaries. As that WILL fail (unless there is no SUMMARY line at all), the regex fails and thus non bla bla bla record sets are NOT bookmarked.

                                        So the proviso is that the record set MUST contain valid start and end points, i.e. BEGIN:VEVENT and END:VEVENT (which we have always assumed throughout these posts), and it MUST contain a line starting with SUMMARY:. What is between the \Q and \E points in the regex determines which record sets are marked and which are NOT.
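To illustrate why that proviso matters, here is a small Python sketch of the shorter regex. The assumptions are the same as for any Python translation: (?:\r\n|\n|\r) approximates \R, re.escape() approximates \Q...\E, and the helper shorter_regex and the three-event demo are hypothetical.

```python
import re

NL = r"(?:\r\n|\n|\r)"  # assumed stand-in for Notepad++'s \R


def shorter_regex(summary_text):
    """Build a Python approximation of the shorter Mark regex for a given text."""
    wanted = re.escape(summary_text)  # plays the role of \Q...\E
    return re.compile(
        rf"BEGIN:VEVENT{NL}"
        rf"(?:(?=SUMMARY:{wanted}).|(?!SUMMARY:).)+?"
        rf"END:VEVENT{NL}?",
        re.DOTALL,
    )


# Hypothetical demo: wanted summary, other summary, and no SUMMARY line at all.
demo = (
    "BEGIN:VEVENT\nSUMMARY:bla bla bla\nEND:VEVENT\n"
    "BEGIN:VEVENT\nSUMMARY:something else\nEND:VEVENT\n"
    "BEGIN:VEVENT\nDTSTART:20201004T100000\nEND:VEVENT\n"
)

marked = [m.group(0) for m in shorter_regex("bla bla bla").finditer(demo)]
marked_other = [m.group(0) for m in shorter_regex("something else").finditer(demo)]
print(len(marked), len(marked_other))
```

In this sketch the record set without any SUMMARY line is matched whatever text is searched for, so the result is only reliable when every record set has a SUMMARY line.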

                                        At this point I think I’ve spent enough time on it, my curiosity is now satiated.

                                        Terry

                                        • guy038
                                          guy038 last edited by guy038

                                          Hi, @Terry-r and All,

                                          In this post, you said :

                                          I wondered if a slight alteration might allow the whole process to be carried out with 1 regex

                                          I’m sorry, but the two solutions given at the beginning of my post are totally independent! So, to solve @marcin-jewiarz’s problem, you need to run:

                                          • The first Mark regex, with the Bookmark line option ticked, then use Search > Bookmark > Remove Unmarked Lines

                                          OR

                                          • The second regex S/R only

                                          So, we do not have to try to mix them up ;-))


                                          Then you asked my opinion about your regex :

                                          (?s-i)BEGIN:VEVENT\R((?=SUMMARY:\Qbla bla bla\E).|(?!SUMMARY|BEGIN:).)+?END:VEVENT\R?

                                          Well, just look at the second alternative (?!SUMMARY|BEGIN:). on its own. It means that, between the expressions BEGIN:VEVENT\R and END:VEVENT\R?, the expression SUMMARY or BEGIN: should never occur at any location!

                                          So, with this regex, between the expressions BEGIN:VEVENT\R and END:VEVENT\R? :

                                          • When the regex engine is at any location of the block which is not the beginning of a string SUMMARY or BEGIN:, this second alternative matches and catches the single character .

                                          • When the regex engine is exactly at the beginning of a line SUMMARY:bla bla bla, the first alternative (?=SUMMARY:\Qbla bla bla\E). does match and catches the single character ., too!

                                          • When the regex engine is at the beginning of any other SUMMARY line, neither alternative matches, so the engine can never reach END:VEVENT and the block is not matched

                                          So, in short, it matches any char of all blocks containing the expression SUMMARY:bla bla bla (and of blocks without any SUMMARY at all)

                                          Now let’s imagine that you slightly change your regex as below :

                                          (?s-i)BEGIN:VEVENT\R((?=SUMMARY:\Qbla bla bla\E).|(?!SUMMARY:\Qbla bla bla\E|BEGIN:).)+?END:VEVENT\R?

                                          This time, the two alternatives are mutually exclusive and complementary regarding the SUMMARY:bla bla bla string: at any location other than the start of BEGIN:, exactly one of them applies! So the whole regex just matches any multi-line block BEGIN:VEVENT.........END:VEVENT !
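The difference between the two variants can be checked with a short Python sketch. As before this is an approximation, not the Boost engine: (?:\r\n|\n|\r) replaces \R, re.escape() replaces \Q...\E, and the helper build and the demo data are invented for illustration.

```python
import re

NL = r"(?:\r\n|\n|\r)"             # assumed stand-in for \R
WANTED = re.escape("bla bla bla")  # assumed stand-in for \Qbla bla bla\E


def build(negative_part):
    """Build the Mark regex with a given body for the negative look-ahead."""
    return re.compile(
        rf"BEGIN:VEVENT{NL}"
        rf"(?:(?=SUMMARY:{WANTED}).|(?!{negative_part}).)+?"
        rf"END:VEVENT{NL}?",
        re.DOTALL,
    )


original = build("SUMMARY|BEGIN:")
modified = build(rf"SUMMARY:{WANTED}|BEGIN:")

# Hypothetical demo: one wanted record set, one unwanted.
demo = (
    "BEGIN:VEVENT\nSUMMARY:bla bla bla\nEND:VEVENT\n"
    "BEGIN:VEVENT\nSUMMARY:something else\nEND:VEVENT\n"
)

original_count = len(original.findall(demo))
modified_count = len(modified.findall(demo))
print(original_count, modified_count)
```

With the modified look-ahead nothing blocks the engine at an unwanted SUMMARY line, so both record sets are matched and the filter is lost.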


                                          Now, in your last post, you said :

                                          Further testing has given me another revised regex, shorter than before

                                          As your final regex does not contain the alternative BEGIN: in the negative look-ahead, I support this point ;-)) Indeed, looking back at my second solution, this part is not needed! I certainly needed it at one moment during my tests, but it seems useless in my final attempt ;-))

                                          So, in summary, the two solutions of my previous post should be updated, without the free-spacing mode, as below :

                                          • First solution :

                                            • Use the Mark regex (?s-i)BEGIN:VEVENT((?!BEGIN:).)*?\Qbla bla bla\E.*?END:VEVENT\R?    with the Bookmark line option ticked

                                            • Then, run the menu option Search > Bookmark > Remove Unmarked Lines

                                          • Second solution :

                                            • Use the regex S/R, below, with a negative look-ahead :

                                              • SEARCH (?s-i)BEGIN:VEVENT\R((?!SUMMARY:\Qbla bla bla\E).)+?END:VEVENT\R?

                                              • REPLACE Leave EMPTY
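As a rough cross-check that the two independent solutions end up with the same record sets, here is a Python sketch. It rests on several assumptions: (?:\r\n|\n|\r) approximates \R, re.escape() approximates \Q...\E, joining the matched text is used as a stand-in for Bookmark plus Remove Unmarked Lines, and the three-event demo is invented.

```python
import re

NL = r"(?:\r\n|\n|\r)"             # assumed stand-in for \R
WANTED = re.escape("bla bla bla")  # assumed stand-in for \Qbla bla bla\E

# First solution: mark the wanted blocks, then keep only the marked lines.
mark_regex = re.compile(
    rf"BEGIN:VEVENT(?:(?!BEGIN:).)*?{WANTED}.*?END:VEVENT{NL}?", re.DOTALL
)
# Second solution: search for the unwanted blocks and replace them with nothing.
delete_regex = re.compile(
    rf"BEGIN:VEVENT{NL}(?:(?!SUMMARY:{WANTED}).)+?END:VEVENT{NL}?", re.DOTALL
)

# Hypothetical record sets, for illustration only.
wanted = "BEGIN:VEVENT\nDTSTART:20201002T133000\nSUMMARY:bla bla bla\nEND:VEVENT\n"
other = "BEGIN:VEVENT\nDTSTART:20201003T090000\nSUMMARY:something else\nEND:VEVENT\n"
no_summary = "BEGIN:VEVENT\nDTSTART:20201004T100000\nEND:VEVENT\n"
demo = wanted + other + no_summary

via_mark = "".join(m.group(0) for m in mark_regex.finditer(demo))
via_replace = delete_regex.sub("", demo)
print(via_mark == via_replace)
```

The two results agree here because the demo contains nothing but VEVENT blocks; lines outside any block would be dropped by the first approach and kept by the second.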

                                          Remark : In the first solution, we still need the regex ((?!BEGIN:).)*? instead of the .+? one, to restrict the match to a single block. Indeed, the simple regex .*? can match a line END:VEVENT and the line BEGIN:VEVENT of the next block !
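A small Python sketch of that remark, under the usual assumptions ((?:\r\n|\n|\r) for \R, re.escape() for \Q...\E, invented demo data): the simple .*? runs from an unwanted block into the next one, while ((?!BEGIN:).)*? stays inside one block.

```python
import re

NL = r"(?:\r\n|\n|\r)"             # assumed stand-in for \R
WANTED = re.escape("bla bla bla")  # assumed stand-in for \Qbla bla bla\E

# Simple version: nothing stops .*? from crossing END:VEVENT / BEGIN:VEVENT.
naive = re.compile(rf"BEGIN:VEVENT.*?{WANTED}.*?END:VEVENT{NL}?", re.DOTALL)
# Restricted version: the look-ahead forbids stepping onto a new BEGIN:.
restricted = re.compile(
    rf"BEGIN:VEVENT(?:(?!BEGIN:).)*?{WANTED}.*?END:VEVENT{NL}?", re.DOTALL
)

# Hypothetical demo: an unwanted block followed by a wanted one.
other = "BEGIN:VEVENT\nSUMMARY:something else\nEND:VEVENT\n"
wanted = "BEGIN:VEVENT\nSUMMARY:bla bla bla\nEND:VEVENT\n"
demo = other + wanted

naive_hits = naive.findall(demo)
restricted_hits = restricted.findall(demo)
print(len(naive_hits[0].splitlines()), len(restricted_hits[0].splitlines()))
```

The naive match swallows both blocks as one, so the unwanted record set would be bookmarked along with the wanted one.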

                                          Best Regards,

                                          guy038

                                          P.S. :

                                          I’ve verified that my updated second solution does match, as expected, a BEGIN:VEVENT....END:VEVENT block which does not contain any SUMMARY:........ line, like :

                                          BEGIN:VEVENT
                                          ...
                                          ...
                                          END:VEVENT
                                          
                                          • Terry R
                                            Terry R last edited by Terry R

                                            @guy038 said in .ics file selection problem:

                                            at beginning of my post are totally independent !

                                            Firstly, my apologies. I got fixated on the concept of using a positive lookahead after looking at both of your solutions. For some reason, later on I did mix them together, thinking there were 2 steps.

                                            Perhaps in my defence, I’ve just come to realise my reasoning all the way through was that there would be extraneous lines between the END:VEVENT and BEGIN:VEVENT lines, that is, between the record sets. I’ve just googled a typical ics file and, whilst that isn’t true, there are additional lines before AND after (header and footer info) the sets we were identifying with the regexes. I’ve taken a longish one and reduced the size so you can see what shows in the file.

                                            BEGIN:VCALENDAR
                                            PRODID:-//Google Inc//Google Calendar 70.9054//EN
                                            VERSION:2.0
                                            CALSCALE:GREGORIAN
                                            METHOD:PUBLISH
                                            X-WR-CALNAME:ECML PKDD 2015
                                            X-WR-TIMEZONE:Europe/Lisbon
                                            X-WR-CALDESC:The European Conference on Machine Learning and Principles and
                                              Practice of\nKnowledge Discovery in Databases (ECMLPKDD) will take place i
                                             n Porto\,\nPortugal\, from September 7th to 11th\, 2015 (http://www.ecmlpkd
                                             d2015.org).\n\nThis event is the leading European scientific event on machi
                                             ne learning and\ndata mining and builds upon a very successful series of 25
                                              ECML and 18 PKDD\nconferences\, which have been jointly organized for the 
                                             past 14 years.
                                            BEGIN:VTIMEZONE
                                            TZID:Europe/Lisbon
                                            X-LIC-LOCATION:Europe/Lisbon
                                            BEGIN:STANDARD
                                            TZOFFSETFROM:+0100
                                            TZOFFSETTO:+0000
                                            TZNAME:WET
                                            DTSTART:19701025T020000
                                            RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
                                            END:STANDARD
                                            BEGIN:DAYLIGHT
                                            TZOFFSETFROM:+0000
                                            TZOFFSETTO:+0100
                                            TZNAME:WEST
                                            DTSTART:19700329T010000
                                            RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
                                            END:DAYLIGHT
                                            END:VTIMEZONE
                                            BEGIN:VEVENT
                                            DTSTART:20180907T083000Z
                                            ...
                                            SUMMARY:Ex. Ep. Especial: IP/PROGI 
                                            TRANSP:OPAQUE
                                            END:VEVENT
                                            BEGIN:VEVENT
                                            DTSTART;VALUE=DATE:20150803
                                            ...
                                            SUMMARY:Workshops - Camera Ready
                                            TRANSP:TRANSPARENT
                                            END:VEVENT
                                            BEGIN:VEVENT
                                            DTSTART;VALUE=DATE:20150901
                                            ...
                                            SUMMARY:Tutorials - Tutorials Material
                                            TRANSP:TRANSPARENT
                                            END:VEVENT
                                            END:VCALENDAR
                                            

                                            So although the OP never showed this, I had made the assumption that I couldn’t guarantee there weren’t other lines; nor did I think to ask.
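To see how such header and footer lines are treated by the two approaches, here is a Python sketch on a heavily trimmed version of the file above. Everything in it is an assumption made for illustration: the calendar is cut down to two events, (?:\r\n|\n|\r) approximates \R, re.escape() approximates \Q...\E, and keeping only the matched text stands in for Bookmark plus Remove Unmarked Lines.

```python
import re

NL = r"(?:\r\n|\n|\r)"                          # assumed stand-in for \R
WANTED = re.escape("Workshops - Camera Ready")  # assumed stand-in for \Q...\E

# Trimmed, hypothetical calendar: header, two events, footer.
header = "BEGIN:VCALENDAR\nVERSION:2.0\n"
event1 = (
    "BEGIN:VEVENT\nDTSTART;VALUE=DATE:20150803\n"
    "SUMMARY:Workshops - Camera Ready\nTRANSP:TRANSPARENT\nEND:VEVENT\n"
)
event2 = (
    "BEGIN:VEVENT\nDTSTART;VALUE=DATE:20150901\n"
    "SUMMARY:Tutorials - Tutorials Material\nTRANSP:TRANSPARENT\nEND:VEVENT\n"
)
footer = "END:VCALENDAR\n"
calendar = header + event1 + event2 + footer

# Mark approach (shorter single regex): only the marked lines survive.
mark_regex = re.compile(
    rf"BEGIN:VEVENT{NL}(?:(?=SUMMARY:{WANTED}).|(?!SUMMARY:).)+?END:VEVENT{NL}?",
    re.DOTALL,
)
marked_only = "".join(m.group(0) for m in mark_regex.finditer(calendar))

# S/R approach: unwanted events are deleted, everything else is left in place.
delete_regex = re.compile(
    rf"BEGIN:VEVENT{NL}(?:(?!SUMMARY:{WANTED}).)+?END:VEVENT{NL}?", re.DOTALL
)
after_replace = delete_regex.sub("", calendar)

print("VCALENDAR" in marked_only, "VCALENDAR" in after_replace)
```

In this sketch the mark approach yields the bare event, while the search and replace leaves a complete calendar around it, which may matter if the result has to be imported somewhere as a valid .ics file.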

                                            Thanks for critiquing my regexes. I had made a discovery and couldn’t quite believe I hadn’t considered it before. There have been lots of instances where I wanted to find a data set with a specific string using the lookahead, only to see it continue through other sets UNTIL it found the correct one. The realisation that I had the power to stop it upon a failed string search within the 1 data set was (dare I say it) overwhelming. It was like a light had suddenly switched on, learning a new ability with regexes.

                                            Cheers
                                            Terry
