Find and remove everything else

17 Posts 4 Posters 2.9k Views
  • A
    Anthony Noriega
    last edited by Nov 19, 2020, 5:54 PM

    Could someone help me with a find-and-remove function? I want to find words that are enclosed by: /.keyword./

    and remove everything else around them.
    There are 15,000 lines, and this /.keyword./ may appear multiple times in one line. All I want are the keywords between this /.* .*/

    • A
      Alan Kilborn @Anthony Noriega
      last edited by Nov 19, 2020, 6:36 PM

      @Anthony-Noriega

      There’s probably a way to do what you need, but your need isn’t very clear. I see “italics” in your data which means that you’ve probably used a * in composing your post, but it was consumed by the site thinking that it was markup.

      • A
        Alan Kilborn
        last edited by Nov 19, 2020, 7:02 PM

        I’m not one for guessing at people’s problem statement, but to show the general technique, let’s say that the data you want to keep is inside pairs of /.

        So, some data:

        After a weekend of emotional honesty at an Esalen-style retreat, Los Angeles
        sophisticates /Bob/ and /Carol/ Sanders (Robert Culp and Natalie Wood) return
        home determined to embrace complete openness. They share their enthusiasm
        and excitement over their new-found philosophy with their more conservative
        friends Ted and Alice Henderson (/Elliott/ Gould and Dyan Cannon), who remain
        doubtful. Soon after, filmmaker Bob has an affair with a young production
        assistant on a film shoot in San Francisco. When he gets home he admits his
        liaison to Carol, describing the event as a purely physical act, not an
        emotional one. To Bob's surprise, Carol is completely accepting of his
        extramarital behavior. Later, Carol gleefully reveals the affair to /Ted/ and
        /Alice/ as they are leaving a dinner party. Disturbed by Bob's infidelity and
        Carol's candor, Alice becomes physically ill on the drive home. She and Ted
        have a difficult time coping with the news in bed that night. But as time
        passes they grow to accept that Bob and Carol really are fine with the
        affair. Later, Ted admits to Bob that he was tempted to have an affair once,
        but didn't go through with it; Bob tells Ted he should, rationalizing:
        "You've got the guilt anyway. /Don't waste it/."
        

        and a replacement,

        find: (?s).*?/(?-s)(.*?)/|(?s).*\z
        repl: ?1${1}\r\n
        (regular expression search mode)

        will yield:

        Bob
        Carol
        Elliott
        Ted
        Alice
        Don't waste it
        

        This technique has its roots in THIS THREAD.
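For readers who want to try the same keep-only-the-delimited-text idea outside Notepad++, here is a minimal Python sketch. It is an adaptation rather than a literal port: Python's re module has no ?1 conditional replacement, so a small function stands in for it, \z is spelled \Z, and the scoped form (?s:...) replaces the (?s)/(?-s) switches. The name keep_delimited is made up for illustration, and the sketch writes \n rather than \r\n.

```python
import re

# Adapted from the find expression above:
#   (?s).*?/(?-s)(.*?)/|(?s).*\z
# (?s:...) lets only the "skip" parts cross line breaks; the captured
# keyword itself has to stay on a single line.
PATTERN = re.compile(r"(?s:.*?)/(.*?)/|(?s:.*)\Z")


def keep_delimited(text):
    """Return each /keyword/ on its own line and drop everything else."""

    def repl(match):
        # Stand-in for the conditional replacement ?1${1}\r\n:
        # emit the keyword plus a line break only when group 1 took part.
        if match.group(1) is not None:
            return match.group(1) + "\n"
        return ""

    return PATTERN.sub(repl, text)


sample = (
    "sophisticates /Bob/ and /Carol/ Sanders (Robert Culp) return\n"
    "friends Ted and Alice Henderson (/Elliott/ Gould), who remain\n"
)
print(keep_delimited(sample), end="")
```

With this sample the sketch prints Bob, Carol and Elliott on three separate lines; the text after the last keyword falls to the second alternative and is deleted.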

        • A
          Anthony Noriega @Alan Kilborn
          last edited by Nov 23, 2020, 6:14 PM

          @Alan-Kilborn You were correct, the asterisk/star was consumed by the formatter. The pattern is: slash dot star exampleKeyword dot star slash
          Essentially I only want the keyword in between that pattern and delete everything else.

          • A
            Alan Kilborn @Anthony Noriega
            last edited by Alan Kilborn Nov 23, 2020, 6:19 PM Nov 23, 2020, 6:17 PM

            @Anthony-Noriega

            So I’ll show it because you apparently can’t:

            /.*mykeyword*./

            :-)

            Please confirm that is correct.
            Can you make the adjustments to what I’ve already shown as an example, to make it work?
            It might be tricky…

            • A
              Alan Kilborn @Anthony Noriega
              last edited by Alan Kilborn Nov 23, 2020, 6:31 PM Nov 23, 2020, 6:30 PM

              @Anthony-Noriega

              Well, it really is a bit tricky. :-)

              If we change my earlier text to this (which is more of what I think you have):

              After a weekend of emotional honesty at an Esalen-style retreat, Los Angeles
              sophisticates /.*Bob*./ and /.*Carol*./ Sanders (Robert Culp and Natalie Wood) return
              home determined to embrace complete openness. They share their enthusiasm
              and excitement over their new-found philosophy with their more conservative
              friends Ted and Alice Henderson (/.*Elliott*./ Gould and Dyan Cannon), who remain
              doubtful. Soon after, filmmaker Bob has an affair with a young production
              assistant on a film shoot in San Francisco. When he gets home he admits his
              liaison to Carol, describing the event as a purely physical act, not an
              emotional one. To Bob's surprise, Carol is completely accepting of his
              extramarital behavior. Later, Carol gleefully reveals the affair to /.*Ted*./ and
              /.*Alice*./ as they are leaving a dinner party. Disturbed by Bob's infidelity and
              Carol's candor, Alice becomes physically ill on the drive home. She and Ted
              have a difficult time coping with the news in bed that night. But as time
              passes they grow to accept that Bob and Carol really are fine with the
              affair. Later, Ted admits to Bob that he was tempted to have an affair once,
              but didn't go through with it; Bob tells Ted he should, rationalizing:
              "You've got the guilt anyway. /.*Don't waste it*./."
              

              If we then try this replacement:

              find: (?s).*?/\Q.*\E((?-s).*?)\Q*.\E/|(?s).*\z
              repl: ?1${1}\r\n
              (regular expression search mode)

              We’ll (again) obtain:

              Bob
              Carol
              Elliott
              Ted
              Alice
              Don't waste it
              

              I used the \Q and \E constructs to avoid leaning-toothpick-syndrome, somewhat.
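Python's re module has no \Q...\E, but re.escape does the same job of switching off metacharacters. A minimal sketch, assuming the delimiters are passed in as plain strings (the helper name extract_between is hypothetical, not part of any library):

```python
import re


def extract_between(text, opener, closer):
    """Return the text found between each literal opener/closer pair."""
    # re.escape plays the role of \Q...\E here: every character of the
    # delimiter is treated literally, with no hand-written backslashes.
    pattern = re.compile(re.escape(opener) + r"(.*?)" + re.escape(closer))
    return pattern.findall(text)


sample = "sophisticates /.*Bob*./ and /.*Carol*./ Sanders"
print(extract_between(sample, "/.*", "*./"))  # ['Bob', 'Carol']
```

Because the delimiters are arguments, a different closing sequence only changes the call, not the pattern-building code.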

              • A
                Anthony Noriega
                last edited by Nov 23, 2020, 6:40 PM

                @Alan-Kilborn said in Find and remove everything else:

                ?1${1}\r\n

                Close, but the pattern you have is off…on the end, you have the star next to the keyword, and it should be the dot, as in my example.

                /.*mykeyword*./
                

                It should be:

                /.*mykeyword.*/
                
                • T
                  Terry R
                  last edited by Terry R Nov 23, 2020, 6:46 PM Nov 23, 2020, 6:45 PM

                  @Anthony-Noriega said in Find and remove everything else:

                  Close, but the pattern you have is off…on the end

                  My solution was:
                  Find What:(?s)\G/\.\*([^.]+)\.\*/|.+?(?=\z|/\.\*)
                  Replace With:?1\1\r\n
                  again a regular expression so search mode is regular expression.

                  Where (again) leaning toothpicks are all around.

                  Cheers
                  Terry

                  PS I should add there will likely be a last empty line, just a side effect of how the regex works. Should be easy enough to remove that afterwards.
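A rough Python adaptation of the same alternation, for anyone scripting this instead. Two assumptions to flag: Python's re has no \G, so it is simply left out (re.sub already moves from one match straight to the next), and \z becomes \Z. The function name keywords_only is invented for the example, and the final rstrip is one way of handling the trailing empty line mentioned in the PS.

```python
import re

# First alternative: a whole /.*keyword.*/ unit, keyword captured in group 1.
# Second alternative: any run of other text, up to the next "/.*" or the end.
PATTERN = re.compile(r"/\.\*([^.]+)\.\*/|.+?(?=\Z|/\.\*)", re.DOTALL)


def keywords_only(text):
    """Keep one keyword per line; drop everything around the keywords."""

    def repl(match):
        # Equivalent of the conditional replacement ?1\1\r\n (with \n here).
        if match.group(1) is not None:
            return match.group(1) + "\n"
        return ""

    result = PATTERN.sub(repl, text)
    # Every keyword is followed by a line break, so the output ends with an
    # empty last line; strip it.
    return result.rstrip("\n")


print(keywords_only("foo /.*Bob.*/ and /.*Carol.*/ bar"))
```

Note that [^.]+ means a keyword containing a dot would not be captured by this pattern.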

                  • A
                    Anthony Noriega
                    last edited by Nov 23, 2020, 6:49 PM

                    @Terry-R said in Find and remove everything else:

                    (?s)\G/\.\*([^.]+)\.\*/|.+?(?=\z|/\.\*)
                    That fixed it, thank you all for your help.

                    • A
                      Alan Kilborn @Anthony Noriega
                      last edited by Alan Kilborn Nov 23, 2020, 6:57 PM Nov 23, 2020, 6:57 PM

                      @Anthony-Noriega said in Find and remove everything else:

                      Close, but the pattern you have is off…

                      Yes, my bad on that. :-(

                      Too bad we couldn’t have seen this from the very beginning:
                      Imgur

                      • A
                        Anthony Noriega @Alan Kilborn
                        last edited by Nov 23, 2020, 8:28 PM

                        @Alan-Kilborn Rookie mistake… I didn’t realize the formatter was gonna make me look like a bonehead.

                        • A
                          Alan Kilborn @Anthony Noriega
                          last edited by Alan Kilborn Nov 23, 2020, 8:30 PM Nov 23, 2020, 8:30 PM

                          @Anthony-Noriega said in Find and remove everything else:

                          look like a bonehead.

                          No worries.
                          We see that kind of thing CONSTANTLY here!
                          :-)
                          The important part is we are marking your problem SOLVED!

                          • guy038G
                            guy038
                            last edited by guy038 Nov 25, 2020, 12:41 AM Nov 24, 2020, 11:49 PM

                            Hello, @Anthony-Noriega, @alan-kilborn, @terry-r and All,

                            I know, I’m a bit late :-) Here is my solution !

                            Assuming that the exact syntax is :

                            /.*keyword.*/
                            

                            SEARCH (?s).+?/\.\*(.+?)\.\*/|.+

                            REPLACE ?1\1\r\n

                            Notes :

                            • First, the (?s) syntax means that the regex . char will match any single character, even an EOL one

                            • Then, in two parts of the search expression, the regex syntax .+? represents the shortest non-empty range of characters up to either the string /.* or the string .*/

                            • Because * and . are regex metacharacters, they must be escaped with a backslash, hence the form \.\*

                            • As the second .+? syntax is embedded between parentheses, the second range of chars ( each keyword ) is stored as group 1

                            • Finally, when no more keywords exist, the second alternative .+ matches the longest non-empty range of characters up to the very end of the file

                            • In replacement, the conditional structure ?1\1\r\n means that if the group 1 exists, it is rewritten \1, followed with a line break \r\n. When the second alternative of the search occurs, no group is involved. So nothing occurs, and the last range of text, after the last keyword, is simply deleted
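The notes above can be traced in a short Python sketch. Python's re module has no conditional replacement, so a callback emulates ?1\1\r\n; the names SEARCH, replace_conditional and isolate_keywords are made up for this illustration.

```python
import re

# Same search expression as above; (?s) makes . match line breaks too.
SEARCH = re.compile(r"(?s).+?/\.\*(.+?)\.\*/|.+")


def replace_conditional(match):
    """Emulate ?1\\1\\r\\n: rewrite group 1 plus a line break, else nothing."""
    if match.group(1) is not None:
        # First alternative matched: keep only the keyword.
        return match.group(1) + "\r\n"
    # Second alternative matched: the leftover text is deleted.
    return ""


def isolate_keywords(text):
    return SEARCH.sub(replace_conditional, text)


print(repr(isolate_keywords("foo /.*Bob.*/ and /.*Carol.*/ bar")))
# 'Bob\r\nCarol\r\n'
```

One thing the sketch makes visible: the leading .+? needs at least one character before an opening /.*, so in this Python version a keyword sitting at the very start of the text is skipped.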

                            Best Regards,

                            guy038

                            • A
                              Alan Kilborn @guy038
                              last edited by Nov 24, 2020, 11:58 PM

                              @guy038

                              But really, Guy, there isn’t anything new here over what you posted HERE – with the removal of the ^ as discussed a bit later in that thread – it’s just an application of the other posting’s idea to slightly different data.

                              We probably should stop solving the specific problems and just point people to the already-derived general solutions.

                              • guy038G
                                guy038
                                last edited by Nov 25, 2020, 1:02 AM

                                Hi, @alan-kilborn,

                                Yes, I agree that it looks like a redundant piece of information! In fact, I was thinking of this old post, where I proposed a general method for isolating literal strings, or expressions matched by a given regex, and rewriting them on separate lines:

                                https://notepad-plus-plus.org/community/topic/12710/marked-text-manipulation/8

                                That’s the reason why, in my previous post, I preferred to focus on explaining the regexes, thinking it could be useful to the OP anyway!

                                But, Alan, you’re right : my post wasn’t really needed ;-))

                                Cheers,

                                guy038

                                • A
                                  Alan Kilborn @guy038
                                  last edited by Nov 25, 2020, 1:36 PM

                                  @guy038

                                  I had a further thought:

                                  The thread I linked to earlier, and referred to in my post just above, is entitled “Marked Text Manipulation”.

                                  That relates to the current thread because a typical desire after marking some text is to copy only that text to another location, which is very similar to the topic of this “Find and remove everything else” thread.
                                  In both cases you obtain the same effective result.

                                  The new thought is that, at the time of the “Marked Text Manipulation” thread’s main discussion, there was no way to copy marked text without resorting to scripting. Now (7.9.1-ish) there is:

                                  [image: screenshot with the Copy Marked Text button indicated]

                                  Just press the indicated button after you already have marked some text.

                                  I will put a similar note in that other thread as well.

                                  • guy038G
                                    guy038
                                    last edited by Nov 25, 2020, 6:08 PM

                                    Hi, @anthony-noriega, @alan-kilborn, @terry-r and All,

                                    Oh, yes, Alan. You’re right! Of course, I already downloaded the portable v7.9.1 version, but I’m still “stuck” with the v7.8.5 version, which explains why I didn’t notice this recent enhancement!

                                    So, thanks to @scott-sumner, we just have to use the (?-s)/\.\*\K(.+?)(?=\.\*/) regex, click on the Mark All button to get all the keywords and, then, click on the Copy Marked Text button and paste the results on a new document. Nice !
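Outside Notepad++, Mark All followed by Copy Marked Text amounts to collecting every match of the regex. A rough Python equivalent is sketched below, with one substitution stated plainly: Python's re module does not support \K, so a fixed-width lookbehind stands in for it. The name marked_text is hypothetical.

```python
import re

# (?<=/\.\*) replaces the /\.\*\K part: the opening delimiter must precede
# the match but is not part of it. Without the DOTALL flag, . does not
# cross line breaks, which mirrors (?-s).
MARK = re.compile(r"(?<=/\.\*).+?(?=\.\*/)")


def marked_text(text):
    """Return every keyword, in order, as a list of strings."""
    return [match.group(0) for match in MARK.finditer(text)]


print(marked_text("a /.*Bob.*/ b /.*Don't waste it.*/"))
# ['Bob', "Don't waste it"]
```

Joining the list with line breaks gives the same one-keyword-per-line result as pasting the copied marks into a new document.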

                                    BR

                                    guy038

                                    The Community of users of the Notepad++ text editor.