Find and Replace: Multiple Replacements in Part of a String

    A Former User
    last edited by Nov 9, 2020, 8:42 PM

    Hi,

    I’m using the built-in find and replace tool (CTRL+H) with case sensitivity turned on and regular expressions. I’m limited to using vanilla Notepad++ (no plugins, no Python etc.).

    I have a file with sets of IDs and strings separated by a comma, structured like so:

    ID, String
    12345-01, A+A*2B+B*+A
    12345-02, A+AB+B*+AA

    I want to make the following replacements but only in the string (after the comma).
    +A* --> 1
    +A --> a
    +B* --> 2
    +B --> b
    2 --> X

    As example, “12345-01, A+A*2B+B*+A” should be changed to “12345-01, A1XB2a”.

    Now if the ID was not there the following works like a charm:
    Search for: (\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)
    Replace with: (?{1}1)(?{2}a)(?{3}2)(?{4}b)(?{5}X)

    However, when the ID is present I cannot seem to find a solution that will leave the ID unchanged while making all the replacements in the string.
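    To make the problem concrete, here is a minimal Python sketch (Python is used only for illustration; it is not Notepad++, and the names `PATTERN`, `REPLACEMENTS` and `replace_everywhere` are hypothetical). It emulates the conditional replacement with a callback and shows that the unguarded search also rewrites the 2 inside the ID:

```python
import re

# Same alternation as the "Search for" expression above.
PATTERN = re.compile(r"(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)")

# One replacement per capture group, standing in for the (?{n}...) conditionals.
REPLACEMENTS = {1: "1", 2: "a", 3: "2", 4: "b", 5: "X"}

def replace_everywhere(line: str) -> str:
    # m.lastindex is the number of the single group that took part in the match.
    return PATTERN.sub(lambda m: REPLACEMENTS[m.lastindex], line)

string_only = replace_everywhere("A+A*2B+B*+A")
with_id = replace_everywhere("12345-01, A+A*2B+B*+A")
print(string_only)  # A1XB2a
print(with_id)      # 1X345-01, A1XB2a  (the ID has been damaged)
```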

    Do you have any suggestions?

      Terry R @A Former User
      last edited by Nov 9, 2020, 8:49 PM

      @Anos said in Find and Replace: Multiple Replacements in Part of a String:

      that will leave the ID unchanged

      A quick “fix” might be to use:
      (\+A\*)|(\+A)|(\+B\*)|(\+B)|(\b2\b)
      So here the 2 must be at a "boundary". For the 2 example lines provided it does work, however 2 examples does NOT a book make! It will depend on whether the “2” in the rest of the expression is surrounded by non-word characters on both sides.

      Terry

        Terry R @Terry R
        last edited by Nov 9, 2020, 8:56 PM

        @Terry-R said in Find and Replace: Multiple Replacements in Part of a String:

        So here the 2 must be at a "boundary"

        Sorry, jumped the gun slightly, it did work on 2nd example, missed that it didn’t work on the first examples. Yes it IS a bit of a poser. It will involve a bit more thought.

        Terry

          Terry R
          last edited by Terry R Nov 9, 2020, 9:02 PM Nov 9, 2020, 9:01 PM

          @Terry-R said in Find and Replace: Multiple Replacements in Part of a String:

          It will involve a bit more thought.

          Sorry, about that false start, I think I now have it. We have
          FW:(?-s)((\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*?,)
          RW:(?{2}1)(?{3}a)(?{4}2)(?{5}b)(?{6}X)

          Because I wrapped the alternatives in an extra pair of brackets in order to attach the negative lookahead, the bracket numbering all shifted by one, hence the new Replace With code as well.
          So basically, whenever it finds one of the target sequences, it will be changed so long as there is no , after it on the same line. As the ID is before the , nothing there should be changed.
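
          The lookahead idea can be sketched in Python as follows. This is an illustration under assumptions, not the Notepad++ engine: the outer group is written as non-capturing so the inner groups stay numbered 1 to 5, the conditional replacement is emulated with a callback, and the names `PATTERN`, `REPLACEMENTS` and `replace_after_comma` are hypothetical.

```python
import re

# Alternation followed by a negative lookahead: reject the match if a comma
# appears later on the same line. Python's "." stops at line ends by default,
# so no (?-s) is needed here.
PATTERN = re.compile(r"(?:(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*?,)")

# One replacement per capture group, standing in for the (?{n}...) conditionals.
REPLACEMENTS = {1: "1", 2: "a", 3: "2", 4: "b", 5: "X"}

def replace_after_comma(text: str) -> str:
    # m.lastindex is the number of the single group that took part in the match.
    return PATTERN.sub(lambda m: REPLACEMENTS[m.lastindex], text)

sample = "12345-01, A+A*2B+B*+A\n12345-02, A+AB+B*+AA"
result = replace_after_comma(sample)
print(result)
```

          The 2 inside each ID is followed by a comma on its line, so the lookahead rejects it and the ID survives.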

          Terry

          PS should have paid more attention to your statement
          I want to make the following replacements but only in the string (after the comma).

            Alan Kilborn
            last edited by Nov 9, 2020, 9:05 PM

            @Anos

            I came up with this; seems to work but maybe has holes:

            find: (^[^,]+,)|(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)
            repl: (?{1}\1)(?{2}1)(?{3}a)(?{4}2)(?{5}b)(?{6}X)

            The result of the replacement with it:

            12345-01, A1XB2a
            12345-02, AaB2aA
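
            The "match the prefix and put it back unchanged" technique can be sketched in Python like this. It is only an illustration: Python has no `(?{n}...)` replacement syntax, so a callback stands in for it, and the names `PATTERN`, `REPLACEMENTS` and `replace_keep_prefix` are hypothetical.

```python
import re

# Group 1 swallows everything up to and including the first comma on a line;
# groups 2 to 6 are the sequences to rewrite. MULTILINE lets ^ match per line.
PATTERN = re.compile(r"(^[^,]+,)|(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)", re.MULTILINE)

REPLACEMENTS = {2: "1", 3: "a", 4: "2", 5: "b", 6: "X"}

def replace_keep_prefix(text: str) -> str:
    def substitute(m: re.Match) -> str:
        if m.group(1) is not None:
            # The ID prefix was consumed as one match: write it back verbatim.
            return m.group(1)
        return REPLACEMENTS[m.lastindex]
    return PATTERN.sub(substitute, text)

sample = "12345-01, A+A*2B+B*+A\n12345-02, A+AB+B*+AA"
result = replace_keep_prefix(sample)
print(result)
```

            Because the ID is consumed in a single match at the start of each line, the engine never gets a chance to match the 2 inside it.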
            
              PeterJones @Terry R
              last edited by Nov 9, 2020, 9:06 PM

              @Terry-R said in Find and Replace: Multiple Replacements in Part of a String:

              So as I had to add a negative lookahead the bracket numbering all changed

              You could have made the wrapping parentheses a non-capturing group: (?-s)(?:(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*?,), to avoid the renumbering in the replacement.

              TIMTOWTDI
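
              The effect on group numbering can be checked with a short Python sketch (Python's `re` module numbers groups the same way for these two expressions; the variable names are hypothetical):

```python
import re

# Outer group capturing: it becomes group 1 and pushes the others to 2..6.
capturing = re.compile(r"((\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*?,)")

# Outer group non-capturing: the inner groups keep their numbers 1..5.
non_capturing = re.compile(r"(?:(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*?,)")

capturing_count = capturing.groups
non_capturing_count = non_capturing.groups
print(capturing_count, non_capturing_count)

# The literal 2 is group 6 in the first form and group 5 in the second.
m_capturing = capturing.search("x, 2")
m_non_capturing = non_capturing.search("x, 2")
print(m_capturing.group(6), m_non_capturing.group(5))
```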

                Terry R @Alan Kilborn
                last edited by Nov 9, 2020, 9:17 PM

                @Alan-Kilborn said in Find and Replace: Multiple Replacements in Part of a String:

                find: (^[^,]+,)|(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)
                repl: (?{1}\1)(?{2}1)(?{3}a)(?{4}2)(?{5}b)(?{6}X)

                I vote for yours. As an interesting aside, using regex101.com and inputting the 2 example lines and the Find What code, my code took twice as long as @Alan-Kilborn's to process. It’s obvious the lookahead is where the extra time is spent.

                For a small file it may not mean a lot, but sometimes efficiency in coding can be an advantage, hence my vote for @Alan-Kilborn's code.

                Terry

                  A Former User @Terry R
                  last edited by Nov 9, 2020, 9:18 PM

                  @Terry-R This does seem to work as intended, at least with my limited testing. Thank you very much for your quick replies. I have never really familiarized myself with lookaheads, they certainly look useful though.

                    Terry R
                    last edited by Nov 9, 2020, 9:21 PM

                    @Anos said in Find and Replace: Multiple Replacements in Part of a String:

                    I have never really familiarized myself with lookaheads

                    There are LOTS of wonderful things to try and remember, as @PeterJones just reminded me. I should have made that a non-capture group, then it would not have required a rejig of the replace with code.

                    As I always say
                    “The day you stop learning is the day you die”

                    Terry

                      A Former User @Alan Kilborn
                      last edited by Nov 9, 2020, 9:24 PM

                      @Alan-Kilborn Thank you for this solution. This also gets the job done, and as @Terry-R points out it seems to be more efficient.

                        guy038
                        last edited by guy038 Nov 10, 2020, 12:01 PM Nov 9, 2020, 10:20 PM

                        Hello, @anos, @terry-r, @alan-kilborn, @peterjones and All,

                        And here is my solution !

                        If we use the FREE-SPACING mode (?x), for the SEARCH part :
                        
                        SEARCH  (?x-s)  (?: ( \+A (\*)? ) | ( \+B (\*)? ) | (2) )  (?!.*,)
                        Groups -->      No  1     2         3     4         5     Look-Ahead  
                        
                        REPLACE (?1(?{2}1:a))(?3(?{4}2:b))?5X
                        
                        BEWARE that, in the REPLACE part, the FREE-SPACING mode is FORBIDDEN. So, ONLY for INFO :
                        
                        REPLACE ( ?1 ( ?{2} 1 : a ) )  ( ?3 ( ?{4} 2 : b ) ) ?5 X
                        

                        and given the data :

                        12345-01, A+A*2B+B*+A
                        12345-02, A+AB+B*+AA
                        

                        it would return :

                        12345-01, A1XB2a
                        12345-02, AaB2aA
                        

                        Notes :

                        • The first part (?x-s) of the regex search means that :

                          • The free-spacing mode is set ( Spaces are not taken into account, except for the [ ] syntax or an escaped space char )

                          • Due to (?-s) syntax, the dot regex symbol matches a single standard char only ( not an EOL char )

                        • Then, the (?:......) syntax defines a non-capturing group

                        • Now, in this non-capturing group, we have 3 alternatives and the first two contain an optional inner group (\*)? ( Remember that the ? is another form of the {0,1} quantifier )

                        • To end, everything matched so far is accepted ONLY IF the final negative look-ahead structure (?!.*,) is verified, that is to say, if there is no comma anywhere between the current position of the regex engine and the end of the current line

                        • Now, in the replacement regex :

                          • The (?1(?{2}1:a)) syntax means that if group 1 exists, then if group 2 exists, then write 1 else write a

                          • The (?3(?{4}2:b)) syntax means that if group 3 exists, then if group 4 exists, then write 2 else write b

                          • Finally, the ?5X means that if group 5 exists, then write an X ( The parentheses are not mandatory as this part ends the regex )

                          • Note also that it’s not necessary to surround the groups 1, 3 and 5 with braces as these groups are not immediately followed by a digit !
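
                        The nested "if group exists" logic described in the notes can be sketched in Python as follows. This is only an emulation under assumptions: Python's `re` has no conditional replacement syntax, so a callback plays that role, `re.VERBOSE` stands in for the `(?x)` free-spacing mode, and the names `PATTERN`, `substitute` and `result` are hypothetical.

```python
import re

PATTERN = re.compile(
    r"""
    (?: ( \+A (\*)? )   # group 1: +A, group 2: optional star
      | ( \+B (\*)? )   # group 3: +B, group 4: optional star
      | (2)             # group 5: literal 2
    )
    (?!.*,)             # no comma later on the same line
    """,
    re.VERBOSE,
)

def substitute(m: re.Match) -> str:
    if m.group(1) is not None:            # like (?1 ... )
        return "1" if m.group(2) else "a"  # like (?{2}1:a)
    if m.group(3) is not None:            # like (?3 ... )
        return "2" if m.group(4) else "b"  # like (?{4}2:b)
    return "X"                            # like ?5X

sample = "12345-01, A+A*2B+B*+A\n12345-02, A+AB+B*+AA"
result = PATTERN.sub(substitute, sample)
print(result)
```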

                        Best Regards,

                        guy038
