Community
    • Login

    Can RegEx do the work for me?

    Scheduled Pinned Locked Moved Help wanted · · · – – – · · ·
    17 Posts 4 Posters 4.7k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • EkopalypseE
      Ekopalypse @Thomas Karlsson
      last edited by

      @Thomas-Karlsson

      if I understood correctly, what about
      find what:(?s)(BEGIN.*?\REND\R)(BEGIN.*?\REND\R)
      replace with: $1

      Alan KilbornA 1 Reply Last reply Reply Quote 4
      • Alan KilbornA
        Alan Kilborn @Ekopalypse
        last edited by

        This is one of those cases where I had to see the solution to understand what the problem really was! :)

        EkopalypseE 1 Reply Last reply Reply Quote 3
        • EkopalypseE
          Ekopalypse @Alan Kilborn
          last edited by

          @Alan-Kilborn

          :-D well, then please explain to me because I’m not sure if this is really correct what I’ve posted?

          Alan KilbornA 1 Reply Last reply Reply Quote 0
          • Alan KilbornA
            Alan Kilborn @Ekopalypse
            last edited by Alan Kilborn

            @Ekopalypse

            I think you are correct, at least, but it sure would be better for posters to say:

            This is what I have:

            foo
            foo
            foo
            

            And this is what I need when I’m done:

            bar
            bar
            bar
            

            And BTW you don’t have to come up with a regex for that.

            1 Reply Last reply Reply Quote 2
            • Thomas KarlssonT
              Thomas Karlsson
              last edited by

              Thanks for your answers, appreciated.

              Not sure if you are trying to help or if you just making fun of my question?

              So, regex or not. Can I somehow read through 1000 and 1000 rows of these 2 blocks and bookmark/exclude/delete Block 2 using Notepad++

              Thanks

              Alan KilbornA EkopalypseE 2 Replies Last reply Reply Quote 1
              • Alan KilbornA
                Alan Kilborn @Thomas Karlsson
                last edited by

                @Thomas-Karlsson

                Nobody’s making fun.

                And Yup, just use Eko’s regex replacement.

                1 Reply Last reply Reply Quote 3
                • EkopalypseE
                  Ekopalypse @Thomas Karlsson
                  last edited by

                  @Thomas-Karlsson

                  did you try my regex? I thought it should do what you want. It is just not 100% clear if we understood your request correctly in first place.

                  1 Reply Last reply Reply Quote 1
                  • Thomas KarlssonT
                    Thomas Karlsson
                    last edited by

                    Thanks a lot @Ekopalypse & @Alan Kilborn

                    I’ll see. Sorry for misinterpreted your answers. My bad!

                    Tried your regex Eko. But it seems to just delete the word BEGIN in both Block1 and Block 2.

                    EkopalypseE 1 Reply Last reply Reply Quote 0
                    • EkopalypseE
                      Ekopalypse @Thomas Karlsson
                      last edited by Ekopalypse

                      @Thomas-Karlsson

                      I assume that the block of data looks like this.
                      (by the way this is formatted by using ~~~ data ~~~)

                      BEGIN|JUNE_CUST_123|1|UPC|SW|PREFIX0000123456|NS
                      INLINE| 123456|||||||||1573||||||||||||
                      INLINE|2019-06-23|$_HolderAcctID|1994|Client_No 1134||1134
                      INLINE|PRE19007801|NSP070|ZZ||||0
                      INLINE|NSP123|1234||||
                      INLINE|NSP123|MI| 123456|9|1.000||||||||||||||||||||
                      INLINE|NSP123| 123456|1677
                      INLINE|1234|5678
                      END
                      BEGIN|JUNE_CUST_123|2|END|SW|PREFIX0000123456|NS
                      INLINE|2019-06-23|$_HolderAcctID|1994|Client_No 1134||1134
                      INLINE|ZZ|||||
                      INLINE|||||123456789| |00002|00002| |||||||
                      INLINE|1688|1134
                      END
                      BEGIN| ...
                      

                      Is this the case? Or is [Block X] really part of the data?

                      1 Reply Last reply Reply Quote 2
                      • PeterJonesP
                        PeterJones
                        last edited by

                        @Thomas-Karlsson ,

                        Joining the conversation: as the others have hinted at, it helps if:

                        1. example data is succinct, without extraneous information, but long enough to show what you want.
                        2. you give us distinct blocks of what you want before and after the search/replace
                        3. you format the data in a way that it renders properly in the forum. for more on this, see my boilerplate below

                        -----
                        FYI: here is some helpful information for finding out more about regular expressions, and for formatting posts in this forum (especially quoting data) so that we can fully understand what you’re trying to ask:

                        This forum is formatted using Markdown, with a help link buried on the little grey ? in the COMPOSE window/pane when writing your post. For more about how to use Markdown in this forum, please see @Scott-Sumner’s post in the “how to markdown code on this forum” topic, and my updates near the end. It is very important that you use these formatting tips – using single backtick marks around small snippets, and using code-quoting for pasting multiple lines from your example data files – because otherwise, the forum will change normal quotes ("") to curly “smart” quotes (“”), will change hyphens to dashes, will sometimes hide asterisks (or if your text is c:\folder\*.txt, it will show up as c:\folder*.txt, missing the backslash). If you want to clearly communicate your text data to us, you need to properly format it.

                        If you have further search-and-replace (“matching”, “marking”, “bookmarking”, regular expression, “regex”) needs, study this FAQ and the documentation it points to. Before asking a new regex question, understand that for future requests, many of us will expect you to show what data you have (exactly), what data you want (exactly), what regex you already tried (to show that you’re showing effort), why you thought that regex would work (to prove it wasn’t just something randomly typed), and what data you’re getting with an explanation of why that result is wrong. When you show that effort, you’ll see us bend over backward to get things working for you. If you need help formatting, see the paragraph above.

                        Please note that for all regex and related queries, it is best if you are explicit about what needs to match, and what shouldn’t match, and have multiple examples of both in your example dataset. Often, what shouldn’t match helps define the regular expression as much or more than what should match.

                        1 Reply Last reply Reply Quote 2
                        • Thomas KarlssonT
                          Thomas Karlsson
                          last edited by

                          Hi guys,

                          Thanks for all you help and tips in this matters - appreciated.

                          Yes @Ekopalypse, the data blocks looks like this:

                          Block one

                          BEGIN|JUNE_CUST_123|1|UPC|SW|PREFIX0000123456|NS
                          INLINE|   123456|||||||||1573||||||||||||
                          INLINE|2019-06-23|$_HolderAcctID|1994|Client_No 1134||1134
                          INLINE|PRE19007801|NSP070|ZZ||||0
                          INLINE|NSP123|1234||||
                          INLINE|NSP123|MI|   123456|9|1.000||||||||||||||||||||
                          INLINE|NSP123|   123456|1677
                          INLINE|1234|5678
                          END
                          

                          Block two

                          BEGIN|JUNE_CUST_123|2|END|SW|PREFIX0000123456|NS
                          INLINE|2019-06-23|$_HolderAcctID|1994|Client_No 1134||1134
                          INLINE|ZZ|||||
                          INLINE|||||123456789| |00002|00002| |||||||
                          INLINE|1688|1134
                          END
                          

                          The two Blocks are very similar in it’s structure. But one with more data/rows between the BEGIN and END segments. This Block is the one I call Block One. The other one, what I describe as Block Two has also this BEGIN and END segment but less data/row in between.

                          The Blocks, one and two, then repeats it self (filled with different data of course) with up to 10.000 times. The goal here is to create some kind of logic that can mark-up/exclude/delete the data in Block Two.
                          Thinking if the first row in each block could be used e.g

                          if row begins with:

                          BEGIN|JUNE_CUST_123|1|UPC|.... 
                          

                          read, regardless, until the word END (last word in the block) and keep the block.

                          if row begins with:

                          BEGIN|JUNE_CUST_123|2|END|SW|.... 
                          

                          read, regardless, until the word END (last word in the block) and markup/exclude/delete the block.

                          Not sure if any of this make any sense.

                          Thanks in advance.

                          EkopalypseE 1 Reply Last reply Reply Quote 2
                          • EkopalypseE
                            Ekopalypse @Thomas Karlsson
                            last edited by

                            @Thomas-Karlsson

                            but this is what my regex does.

                            Sure you used the exact regex I’ve posted?

                            1 Reply Last reply Reply Quote 3
                            • PeterJonesP
                              PeterJones
                              last edited by PeterJones

                              @Thomas-Karlsson,

                              Thanks for adding more detail, and adding the formatting… That really helps your post be more understandable.

                              @Ekopalypse said:

                              but this is what my regex does.

                              Maybe my interpretation is slightly different than yours. You seem to think that @Thomas-Karlsson always expects blocks one and two to be alternating, and that block one will always come immediately before block two. If that’s so, then yours would work.

                              I looked more into the data, and decided that it’s the |1| vs |2| that determines whether it’s block one or block two, so I would search for (?s)BEGIN\|JUNE_CUST_123\|2\|END.*?END$\R* and replace with nothing. This shows what it looks like with “mark” rather than search/replace:

                              (Actually, that image was taken before I added the \R* on the end, because when I switched to Replace mode, it left a blank line, which the \R* at the end gets rid of.)

                              EkopalypseE 1 Reply Last reply Reply Quote 3
                              • EkopalypseE
                                Ekopalypse @PeterJones
                                last edited by

                                @PeterJones

                                You seem to think that @Thomas-Karlsson always expects blocks one and two to be alternating,

                                yes, that’s exactly what I thought is the case :-)

                                1 Reply Last reply Reply Quote 1
                                • PeterJonesP
                                  PeterJones
                                  last edited by

                                  @Ekopalypse said:

                                  yes, that’s exactly what I thought is the case :-)

                                  Easily understandable; that’s probably the more natural interpretation. On this forum (and similar), I just tend to assume that the OP isn’t clear in requirements, so I often try to look for more subtle clues, like the |1| vs |2|, in case the OP has accidentally implied more than intended. Sometimes, this means that my solution does not work for the OP… but sometimes it means mine does work when other, more literal interpretations, do not.

                                  Since the OP implied yours wasn’t working (though hasn’t answered your direct question), maybe my interpretation will be the lucky one this time. :-)

                                  If not, @Thomas-Karlsson will have to be more explicit about whether the blocks always alternate, or whether there are ever non-block rows between the two, or other such things.

                                  1 Reply Last reply Reply Quote 2
                                  • Thomas KarlssonT
                                    Thomas Karlsson
                                    last edited by

                                    @Ekopalypse and @PeterJones - a big BIG thanks for all your help and support.

                                    Eko - your regex was marked up each block, both one and tow so it did the job but at the same time not.

                                    PeterJones - your regex just nailed it and I’m soooo happy!! So impressed. You have know idea how much time I will save now using your regex. Previously I have search and deleted all of theses 1000 and 1000 rows manually.

                                    You have made my day, week, year :) Thanks and thanks in million!

                                    Cheers from happy Scandinavia guy :)

                                    1 Reply Last reply Reply Quote 5
                                    • First post
                                      Last post
                                    The Community of users of the Notepad++ text editor.
                                    Powered by NodeBB | Contributors