    Find and Replace Iteration...

    Help wanted
    • ganesan govindarajan
      ganesan govindarajan last edited by

      Hi Guys,

      Thanks for the help in advance,

      I have data in XML and notepad as follows.

      notepad
      fig001 fig0001
      fig0013 fig0002
      sup-0001 sup-0002
      sup-0002 sup-0014
      etc…

      XML
      These values appear in several places in the XML (some of them more than once, and every occurrence needs to be changed):
      fig001
      fig0013
      sup-0001
      sup-0002
      etc…

      I have 200+ items in the notepad file (two columns), each of which needs to be replaced in the XML with its respective value.

      Please let me know of any regex I could use for this task.

      Thanks
      Ganesan. G

      • Ekopalypse
        Ekopalypse @ganesan govindarajan last edited by

        @ganesan-govindarajan

        you need to do something like
        find what: \b(?:(fig001)|(fig0013)|(sup-0001))\b
        replace with: (?1fig0001)(?2fig0002)(?3sup-0002)
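
For readers who want to try the same idea outside Notepad++, here is a minimal Python sketch. Python's `re` module has no `(?1...)` conditional replacement, so the sketch swaps in a callback that picks the replacement from the index of the group that matched. The names `PAIRS`, `PATTERN` and `replace_all` are made up for illustration.

```python
import re

# Hypothetical old/new pairs, taken from the example values in this thread.
PAIRS = [
    ("fig001", "fig0001"),
    ("fig0013", "fig0002"),
    ("sup-0001", "sup-0002"),
]

# One capturing group per old value, wrapped in word boundaries,
# mirroring \b(?:(fig001)|(fig0013)|(sup-0001))\b
PATTERN = re.compile(
    r"\b(?:" + "|".join("(" + re.escape(old) + ")" for old, _ in PAIRS) + r")\b"
)

def replace_all(text):
    # m.lastindex is the number of the group that matched (1-based),
    # so it selects the matching pair, like (?1...)(?2...)(?3...) does.
    return PATTERN.sub(lambda m: PAIRS[m.lastindex - 1][1], text)

print(replace_all("see fig001 and fig0013, sup-0001"))
```

Because everything happens in a single pass, a value that has just been written (for example `sup-0002`) is not picked up again by a later pair.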

        • Alan Kilborn
          Alan Kilborn @Ekopalypse last edited by

          @Ekopalypse

          That is a good technique but the user has a lot of items. They are probably looking for a table-lookup based replacement. Fortunately we have dealt with that before a few times, one of which is here:

          https://notepad-plus-plus.org/community/topic/14548/help-replacing

          • ganesan govindarajan
            ganesan govindarajan last edited by

            @Ekopalypse

            Thanks for the reply. But this technique does not really work for 200+ items; the result got messed up.

            So I need a more specific regex here. The link provided by Alan does not match my requirement.

            Please advise.
            ganesang

            • Ekopalypse
              Ekopalypse @ganesan govindarajan last edited by

              @ganesan-govindarajan

              But without knowing your data and what needs to be done exactly how should we know what to offer?
              I mean, by using regex it is essential to know
              a) what the data looks like and
              b) what is needed to be done.
              to be able to find a pattern which fulfills the requirement. The slightest offset will most likely break the regex.

              • ganesan govindarajan
                ganesan govindarajan last edited by ganesan govindarajan

                Hi @Ekopalypse

                The data in Excel looks like this:

                seq-0031 seq-XX0001
                seq-0032 seq-XX0002
                seq-0001 seq-XX0003
                seq-0002 seq-XX0004
                seq-0003 seq-XX0005
                seq-0005 seq-XX0006
                seq-0028 seq-XX0007
                seq-0007 seq-XX0008
                seq-0008 seq-XX0009
                seq-0010 seq-XX0010

                The first-column values are found anywhere inside the XML, between other tags (sometimes more than once, in no particular order), and each needs to be changed to the respective value from the second column.

                • Ekopalypse
                  Ekopalypse @ganesan govindarajan last edited by

                  @ganesan-govindarajan

                  sorry, but looks like I’m not the one who can solve this puzzle.

                  • Alan Kilborn
                    Alan Kilborn @ganesan govindarajan last edited by

                    @ganesan-govindarajan

                    So as time and postings go on, the data is getting more of a pattern to it (which is a good thing for possible solutions). Is the data always as “patterned” as your most recent example shows? Meaning you want to replace seq- followed by four digits with something else (don’t really care what the something-else part is, doesn’t have to follow a pattern)?

                    @Ekopalypse

                    I feel your pain.

                    • ganesan govindarajan
                      ganesan govindarajan last edited by

                      Hi @Alan

                      The pattern of the data may vary, as follows:

                      Sup-0010, or fig-0011, or anything else made of three letters, a hyphen, and four digits at the end.

                      They need to be changed to the format mentioned above (sup-XX0010 or fig-XX0011 etc… )

                      • Alan Kilborn
                        Alan Kilborn @ganesan govindarajan last edited by

                        @ganesan-govindarajan

                        The problem statement keeps changing. Sorry but like @Ekopalypse it appears I am not the one to solve this puzzle either. Good luck to you.

                        • guy038
                          guy038 last edited by guy038

                          Hello, @ganesan-govindarajan and All,

                          From your posts, I understand that your initial values, to be changed, begin with 3 letters ( let’s say 3 word characters ), followed by a hyphen - and end with 4 digits

                          I also understand that any of these values can occur several times, in different locations of your XML file, and that you want to modify all the occurrences of each value

                          Finally, the new value for each initial value is taken from the second column of an Excel table, and I assume that each initial value, in the first column of this table, is unique

                          Am I right so far? If so, here is a solution :


                          • Open a copy of your XML file in Notepad++

                          • At the very end of the file, add a separator line of at least 4 equal signs ( ==== )

                          • After this line, append the contents of your Excel table, containing the list of “Old - New” pairs

                          So, let’s suppose that we get, for instance, the following XML text, containing some seq-... values, followed by the ==== line and the two-column table :

                          ....
                          ....
                          first value seq-0005
                          ....
                          bla seq-0031
                          ....
                          ....
                          Test seq-0001 Test
                          Test seq-0028 Test
                          ....
                          ....
                          ....
                          
                          Foo bar seq-0031
                          ....
                          Foo bar seq-0002
                          ....
                          seq-0005 Test
                          seq-0001 Test
                          ...
                          seq-0028 Test
                          ....
                              seq-0008
                          ....
                          seq-0008
                          ....
                          ....
                          ====
                          seq-0031 seq-XX0001
                          seq-0032 seq-XX0002
                          seq-0001 seq-XX0003
                          seq-0002 seq-XX0004
                          seq-0003 seq-XX0005
                          seq-0005 seq-XX0006
                          seq-0028 seq-XX0007
                          seq-0007 seq-XX0008
                          seq-0008 seq-XX0009
                          seq-0010 seq-XX0010
                          

                          Then :

                          • Open the Replace dialog ( CTRL + H )

                          • Tick the Match case and Wrap around options

                          • Select the Regular expression search mode

                          • SEARCH (?s)(\w{3}-\d{4})(?=.+^====.+\R\1\h+(?-s)(.+))|^====.+

                          • REPLACE \2

                          • Click once on the Replace all button or several times on the Replace button

                          => Any value of the first column of the ending table, found in your XML file, should be replaced with the appropriate value, of the second column of the table

                          => Then, when no more initial value can be found in the XML file, the regex S/R will delete the appended Excel table, too !

                          And you’ll obtain your expected text :

                          ....
                          ....
                          first value seq-XX0006
                          ....
                          bla seq-XX0001
                          ....
                          ....
                          Test seq-XX0003 Test
                          Test seq-XX0007 Test
                          ....
                          ....
                          ....
                          
                          Foo bar seq-XX0001
                          ....
                          Foo bar seq-XX0004
                          ....
                          seq-XX0006 Test
                          seq-XX0003 Test
                          ...
                          seq-XX0007 Test
                          ....
                              seq-XX0009
                          ....
                          seq-XX0009
                          ....
                          ....
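
If the same table-lookup replacement has to be done outside Notepad++, a dictionary-based script is one possible alternative to the regex S/R above. The following is only a sketch under the assumptions stated in this thread (values of the form 3 word characters, hyphen, 4 digits; a two-column, whitespace-separated table with unique first-column entries). The function name `replace_from_table` and the sample strings are invented for illustration.

```python
import re

def replace_from_table(xml_text, table_text):
    """Replace every old value found in xml_text with its new value."""
    # Build the old -> new lookup from the two-column table.
    mapping = {}
    for line in table_text.splitlines():
        parts = line.split()
        if len(parts) == 2:          # skip blank or malformed rows
            old, new = parts
            mapping[old] = new

    # Assumed value shape: 3 word characters, a hyphen, 4 digits.
    value = re.compile(r"\w{3}-\d{4}")

    # Values missing from the table are left unchanged.
    return value.sub(lambda m: mapping.get(m.group(0), m.group(0)), xml_text)

xml = "first value seq-0005\nbla seq-0031\nkeep seq-9999\n"
table = "seq-0031 seq-XX0001\nseq-0005 seq-XX0006\n"
print(replace_from_table(xml, table))
```

Here the table stays in its own string (or file), so nothing has to be appended to the XML and deleted again afterwards.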
                          

                          Let me know, if any problem !

                          Best regards,

                          guy038

                          • ganesan govindarajan
                            ganesan govindarajan last edited by

                            Hi @guy038

                            You Awesome!!!

                            This is what i expected.

                            Thanks much!!!

                            • ganesan govindarajan
                              ganesan govindarajan last edited by

                              Hi @guy038

                              Can you please explain what the regex actually does?

                              It will be helpful for my further regex work.

                              Thanks!!
                              Ganesan. G

                              • guy038
                                guy038 last edited by

                                Hi, @ganesan-govindarajan,

                                I am not free at the moment, but I can give you an explanation in about 10 hours !

                                Cheers,

                                guy038

                                • guy038
                                  guy038 last edited by guy038

                                  Hello, @ganesan-govindarajan,

                                  Sorry for the delay ! I also slightly changed the search regex because, with my previous version, the first line after the line of equal signs would never have been matched if your XML file contained only \n or only \r EOL chars ( the case of Unix or Mac files )


                                  So, what does this second version of my regex S/R mean ?

                                  SEARCH (?s)(\w{3}-\d{4})(?=.+^====.*^\1\h+(?-s)(.+))|^====.+

                                  REPLACE \2

                                  Firstly, the search regex can be split up in the two alternatives :

                                  • The regex (?s)(\w{3}-\d{4})(?=.+^====.*^\1\h+(?-s)(.+)), which searches for any expression contained in the first column of the Excel table and replaces it with the corresponding value from the second column

                                  • The regex ^====.+, which searches for and deletes all text from the line of equal signs to the very end of your XML file

                                  • Now, the first alternative can be divided in 3 parts :

                                    • (?s) is an in-line modifier which means that the dot symbol . matches any single character, including an EOL one

                                    • (\w{3}-\d{4}) represents the string to search for in your XML file : 3 word characters, followed by a hyphen, followed by 4 digits ( a word character is any single Unicode character considered a letter or a digit, or the underscore symbol _ ). As it is enclosed in parentheses, this match is stored as group 1

                                    • Then, the (?=.+^====.*^\1\h+(?-s)(.+)) structure is called a positive look-ahead, i.e. a condition which must be true at the current location ( so, right after (\w{3}-\d{4}) ) for the first alternative to match. In other words, the following must exist after the searched expression (\w{3}-\d{4}) :

                                      • A range, possibly multi-line, of any characters, including EOL ones ( .+ ), up to a line beginning with 4 equal signs ( ^==== )

                                      • Followed by a range, possibly empty, of any characters, including EOL ones, up to group 1 ( the searched string ) at the beginning of a line ( .*^\1 )

                                      • Followed by a non-empty range of horizontal blank characters, mostly space and tabulation chars ( \h+ ), which is the gap between columns 1 and 2 of an Excel row

                                      • Then followed by a non-empty range of standard characters of the current line only, due to the (?-s) modifier, stored as group 2 because of the pair of parentheses ( (.+) ). This is the corresponding expression which replaces the searched string

                                  • At the end, when no more strings matching the first alternative \w{3}-\d{4} can be found, the regex engine tries the second alternative ^====.+, which matches the string ==== at the beginning of a line ( ^==== ), followed by any range of characters, including EOL ones ( .+ ), up to the very end of the XML file

                                  • During replacement :

                                    • When a corresponding value ( group 2 ) is found, the searched expression is replaced with this value \2 ( case of the first alternative )

                                    • When the second alternative is used, the whole multi-line block of pairs ( OLD-NEW values ) at the end of the XML file, including the line of equal signs ( ==== ), is selected and, as the second alternative does not define group 2, this multi-line block is replaced with nothing, i.e. deleted
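
To experiment with the parts described above, the search regex can be approximated with Python's `re` module. This is a rough port, not the Boost regex used by Notepad++: `\h` is written as `[ \t]`, the in-line `(?-s)` becomes the scoped group `(?-s:...)` (Python 3.6 or later), and `re.MULTILINE` is needed so that `^` matches at line starts. The names `SEARCH` and `run_sr` and the sample text are assumptions made for this sketch.

```python
import re

# Approximate Python port of:
#   (?s)(\w{3}-\d{4})(?=.+^====.*^\1\h+(?-s)(.+))|^====.+
SEARCH = re.compile(
    r"(?s)(\w{3}-\d{4})(?=.+^====.*^\1[ \t]+(?-s:(.+)))|^====.+",
    re.MULTILINE,
)

def run_sr(text):
    # Group 2 is captured inside the look-ahead (first alternative).
    # For the second alternative group 2 is unset, and Python (3.5+)
    # substitutes an empty string, which deletes the appended table.
    return SEARCH.sub(r"\2", text)

sample = (
    "first value seq-0005\n"
    "bla seq-0031\n"
    "====\n"
    "seq-0031 seq-XX0001\n"
    "seq-0005 seq-XX0006\n"
)
print(run_sr(sample))
```

Note that the sample uses \n line endings only, so the first table row is reached through `.*^\1`, which is the case the second version of the regex was changed to handle.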


                                  I hope this helps a bit ! Refer to the link below for further information about regexes :

                                  https://notepad-plus-plus.org/community/topic/15765/faq-desk-where-to-find-regex-documentation/1

                                  BR

                                  guy038

                                  • ganesan govindarajan
                                    ganesan govindarajan last edited by

                                    Thank you so much @guy038

                                    You Genius!! in this field…

                                    Thanks
                                    Ganesan. G
