Search and replace between 2 files

8 Posts 5 Posters 657 Views
    Enzo Turatti
    last edited by Sep 25, 2020, 6:43 AM

    Dear experts,

    I have two different files and I have to search and replace strings in this way:

    In file A, I have to check whether a row contains the string [NO_CODE].
    If it does, I have to select the text between ; and [ in that row
    (for example, the selection will be 5384578).
    The text may be of variable length.

    Sample file A
    XXX_020_G30;00;M9990940382;00;3;
    XXX_020_G30;00;5384578 [NO_CODE];00;3;
    XXX_020_G30;00;1214_020_341;00;2;
    XXX_020_G30;00;M9990940381;00;8;
    XXX_020_G30;00;1214_048_G37;00;1;

    In file B, I have to search for the string 5384578. In the row where it is found, I have to copy the string before the ; and substitute it for 5384578 [NO_CODE] in file A. If file B contains no matching string, return to file A and skip to the next row.

    Sample file B
    M0100940001;6012602;
    M0100940002;6012606;
    M0100940003;6012605;
    M0220580002;5384578;
    M0220580004;5940029;
    M0220580005;5940030;
    M0220580007;5940111;
    M0220780001;5952013;

    Final file will be:

    XXX_020_G30;00;M9990940382;00;3;
    XXX_020_G30;00;M0220580002;00;3;
    XXX_020_G30;00;1214_020_341;00;2;
    XXX_020_G30;00;M9990940381;00;8;
    XXX_020_G30;00;1214_048_G37;00;1;

    Can anyone help me?

      Terry R
      last edited by Terry R Sep 25, 2020, 10:42 AM Sep 25, 2020, 10:40 AM

      @Enzo-Turatti said in Search and replace between 2 files:

      Dear experts,
      I have two different files and I have to search and replace strings in this way

      I believe it is possible, however it will require a few steps. In broad terms:

      1. Bring the “key” to the front of the lines in both files. The two files could then be combined and sorted, which would put any possible pairs together.
      2. For each pair, adjust the line from file A, replacing the character sequence with what the line from file B has.
      3. Remove all of the lines from file B and remove the key from the front of each remaining line.

      Sounds relatively straightforward. It only remains to be seen if in reality it works that easily.
      One question: is the line order in file A important? That can be catered for, but it does complicate things somewhat.
      Terry
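The three steps above can be sketched outside the editor in plain Python. This is only an illustration of the sort-and-pair idea, not a Notepad++ procedure: the names `MARKER`, `key_of_a` and `replace_by_sorting` are made up for the example, and it assumes every useful line of file B has the form `name;code;`. Each record remembers its original position in file A, so the line order of file A is preserved.

```python
MARKER = " [NO_CODE]"  # assumed literal marker, taken from the sample data


def key_of_a(line):
    """Return the code in front of the marker, or None if the line has no marker."""
    for field in line.split(";"):
        if field.endswith(MARKER):
            return field[:-len(MARKER)]
    return None


def replace_by_sorting(a_lines, b_lines):
    result = list(a_lines)
    records = []
    # Step 1: put the key in front of every record (origin 0 = file B, 1 = file A).
    for line in b_lines:
        parts = line.split(";")
        if len(parts) >= 2:
            records.append((parts[1], 0, -1, parts[0]))
    for pos, line in enumerate(a_lines):
        code = key_of_a(line)
        if code is not None:
            records.append((code, 1, pos, line))
    # Sorting brings each file B record directly in front of its file A partners.
    records.sort()
    # Step 2: walk the sorted records and fix the file A lines that found a partner.
    current_code, current_name = None, None
    for code, origin, pos, payload in records:
        if origin == 0:
            current_code, current_name = code, payload
        elif code == current_code:
            result[pos] = payload.replace(code + MARKER, current_name)
    # Step 3: the file B records and the keys are simply dropped.
    return result


file_a = [
    "XXX_020_G30;00;M9990940382;00;3;",
    "XXX_020_G30;00;5384578 [NO_CODE];00;3;",
    "XXX_020_G30;00;1214_020_341;00;2;",
]
file_b = ["M0100940001;6012602;", "M0220580002;5384578;"]
fixed = replace_by_sorting(file_a, file_b)
print("\n".join(fixed))
```

Lines of file A whose code has no partner in file B are left untouched, which matches the "skip to the next row" requirement.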

        guy038
        last edited by guy038 Sep 25, 2020, 11:24 AM Sep 25, 2020, 11:12 AM

        Hello @enzo-turatti and All,

        Here is a regex solution which works ONLY IF the number of lines in files A and B is not too large !

        • First copy the contents of file A into a new file C

        • Add a dummy line of at least three = signs

        • Append the contents of file B under the line ======

        => So, the contents of file C should be :

        XXX_020_G30;00;M9990940382;00;3;
        XXX_020_G30;00;5384578 [NO_CODE];00;3;
        XXX_020_G30;00;1214_020_341;00;2;
        XXX_020_G30;00;M9990940381;00;8;
        XXX_020_G30;00;1214_048_G37;00;1;
        ======
        M0100940001;6012602;
        M0100940002;6012606;
        M0100940003;6012605;
        M0220580002;5384578;
        M0220580004;5940029;
        M0220580005;5940030;
        M0220580007;5940111;
        M0220780001;5952013;
        
        • Run the following regex S/R, against file C :

          • SEARCH (?s-i)([^;]+) \[NO_CODE\](?=.+===.+^(.+?);\1;)|^===.+

          • REPLACE \2

        You should obtain your expected text :

        XXX_020_G30;00;M9990940382;00;3;
        XXX_020_G30;00;M0220580002;00;3;
        XXX_020_G30;00;1214_020_341;00;2;
        XXX_020_G30;00;M9990940381;00;8;
        XXX_020_G30;00;1214_048_G37;00;1;
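
For readers who want to try the search/replace outside Notepad++, here is a rough port to Python's `re` module. It is only a sketch: Python's engine is not the Boost engine used by Notepad++, so the leading `(?s-i)` is replaced by the `re.S` and `re.M` flags (Python is case-sensitive by default, and `^` needs `re.M` to match at line starts). It also relies on Python 3.5 or later, where a group that did not take part in the match is replaced by an empty string.

```python
import re

# Contents of file C: file A, the separator line, then file B.
FILE_C = """\
XXX_020_G30;00;M9990940382;00;3;
XXX_020_G30;00;5384578 [NO_CODE];00;3;
XXX_020_G30;00;1214_020_341;00;2;
XXX_020_G30;00;M9990940381;00;8;
XXX_020_G30;00;1214_048_G37;00;1;
======
M0100940001;6012602;
M0100940002;6012606;
M0100940003;6012605;
M0220580002;5384578;
M0220580004;5940029;
M0220580005;5940030;
M0220580007;5940111;
M0220780001;5952013;
"""

# Group 1 is the code in front of " [NO_CODE]"; the look-ahead searches the
# part after the separator for a line "<group 2>;<group 1>;".
# The second alternative removes the separator line and everything after it.
PATTERN = re.compile(
    r"([^;]+) \[NO_CODE\](?=.+===.+^(.+?);\1;)|^===.+",
    re.S | re.M,
)

result = PATTERN.sub(r"\2", FILE_C)
print(result)
```

The same caveat about file size applies here: the look-ahead rescans the rest of the text for every candidate, so this is meant for small inputs only.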
        

        As said above, the behaviour of this regex highly depends on the size of the files ! It may fail and even leave file C empty !

        In that case, we’ll have to find another way, which will probably use sorting !
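
One possible "other way" that avoids both the big regex and sorting is a plain dictionary lookup in Python. This is a different technique from the one described in this post, shown only as a sketch: `MARKER`, `build_lookup` and `replace_no_code` are hypothetical names, and file B is assumed to hold lines of the form `name;code;`. Each file is read once, so the running time grows linearly with the number of lines.

```python
MARKER = " [NO_CODE]"  # assumed literal marker, taken from the sample data


def build_lookup(b_lines):
    """Map each code (second field of file B) to its name (first field)."""
    lookup = {}
    for line in b_lines:
        parts = line.split(";")
        if len(parts) >= 2 and parts[1]:
            # Keep the first name seen if a code appears more than once.
            lookup.setdefault(parts[1], parts[0])
    return lookup


def replace_no_code(a_lines, lookup):
    """Replace '<code> [NO_CODE]' fields whose code is known; keep the rest."""
    out = []
    for line in a_lines:
        fields = line.split(";")
        for i, field in enumerate(fields):
            if field.endswith(MARKER):
                code = field[:-len(MARKER)]
                if code in lookup:
                    fields[i] = lookup[code]
        out.append(";".join(fields))
    return out


file_a = [
    "XXX_020_G30;00;M9990940382;00;3;",
    "XXX_020_G30;00;5384578 [NO_CODE];00;3;",
    "XXX_020_G30;00;1214_020_341;00;2;",
]
file_b = ["M0100940003;6012605;", "M0220580002;5384578;"]
fixed = replace_no_code(file_a, build_lookup(file_b))
print("\n".join(fixed))
```

A row whose code is missing from file B keeps its `[NO_CODE]` text, so it can still be found and handled by hand afterwards.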

        See you later,

        Best regards,

        guy038

          Alan Kilborn @guy038
          last edited by Sep 25, 2020, 1:16 PM

          @guy038 said in Search and replace between 2 files:

          As said above, the behaviour of this regex highly depends on the size of the files ! It may fail and even leave file C empty !

          This failure comes up a lot in such solutions.

          @guy038 Have you ever switched to the following (as an experiment) when such a failure has occurred?:

          In Pythonscript console, do the following:

          editor.rereplace(r' search_expr ', r' replace_expr ')

          Specifically for the case above, it would be:

          editor.rereplace(r'(?s-i)([^;]+) [NO_CODE](?=.+===.+^(.+?);\1;)|^===.+', r'\2')

          I’m just wondering if this technique would work when the native N++ technique fails.

            guy038
            last edited by guy038 Sep 25, 2020, 6:37 PM Sep 25, 2020, 6:24 PM

            Hi, @enzo-turatti, @alan-kilborn and All,

            Alan, I’ve just changed your line :

            editor.rereplace(r'(?s-i)([^;]+)\x20[NO_CODE](?=.+===.+^(.+?);\1;)|^===.+', r'\2')

            to

            editor.rereplace(r'(?s-i)([^;]+)\x20\[NO_CODE\](?=.+===.+^(.+?);\1;)|^===.+', r'\2')

            so that the regex works !! The square brackets must be escaped, and the missing backslashes seem to be a problem with the NodeBB forum software ! In order to show \[ you must write \\[


            For test, I used this example :

            XXX_020_G30;00;M9990940382;00;3;
            XXX_020_G30;00;5384578 [NO_CODE];00;3;    // Line to be MODIFIED
            XXX_020_G30;00;1214_020_341;00;2;
            XXX_020_G30;00;M9990940381;00;8;
            XXX_020_G30;00;1214_048_G37;00;1;
            .....
            XXX_020_G30;00;M9990940382;00;3;    \
            XXX_020_G30;00;1214_020_341;00;2;   |     // repeated 66,433 times 
            XXX_020_G30;00;M9990940381;00;8;    |
            XXX_020_G30;00;1214_048_G37;00;1;   /
            .....
            XXX_020_G30;00;M9990940382;00;3;
            XXX_020_G30;00;1214_020_341;00;2;         // ONCE more
            XXX_020_G30;00;M9990940381;00;8;
            XXX_020_G30;00;1214_048_G37;00;1;
            ======
            M0100940001;6012602;
            M0100940002;6012606;
            M0100940003;6012605;
            M0220580002;5384578;
            M0220580004;5940029;
            M0220580005;5940030;
            M0220580007;5940111;
            M0220780001;5952013;
            

            This gives a total of 5 + 4 x 66,434 + 9 = 265,750 lines !
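
A test file of this shape can be generated with a few lines of Python rather than by copy and paste. The sketch below is an assumption about how such a file could be built, not the way it was actually produced; `build_test_lines` is a hypothetical helper, and the `.....` lines of the listing are treated as ellipses, not as real lines.

```python
HEAD = [
    "XXX_020_G30;00;M9990940382;00;3;",
    "XXX_020_G30;00;5384578 [NO_CODE];00;3;",  # the only line to be modified
    "XXX_020_G30;00;1214_020_341;00;2;",
    "XXX_020_G30;00;M9990940381;00;8;",
    "XXX_020_G30;00;1214_048_G37;00;1;",
]
BLOCK = [
    "XXX_020_G30;00;M9990940382;00;3;",
    "XXX_020_G30;00;1214_020_341;00;2;",
    "XXX_020_G30;00;M9990940381;00;8;",
    "XXX_020_G30;00;1214_048_G37;00;1;",
]
TAIL = [
    "======",
    "M0100940001;6012602;",
    "M0100940002;6012606;",
    "M0100940003;6012605;",
    "M0220580002;5384578;",
    "M0220580004;5940029;",
    "M0220580005;5940030;",
    "M0220580007;5940111;",
    "M0220780001;5952013;",
]


def build_test_lines(block_repeats=66434):
    """Return the head, the four-line block repeated, then separator and file B."""
    return HEAD + BLOCK * block_repeats + TAIL


lines = build_test_lines()
print(len(lines))  # 5 + 4 * 66434 + 9 lines
```

Writing the result with `"\n".join(lines)` to a file gives the text to load into Notepad++; passing a larger `block_repeats` adds further four-line blocks.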

            • Running either the regex from within N++ or the one-line Python script above works in both cases, but the script is faster ( 13s instead of 20s for the regex, on my old Windows XP SP3 machine ! )

            • Unfortunately, when adding 48 lines only ( 12 blocks of four lines ) for a total of 265,798 lines :

              • The regex fails and deletes all the contents, leaving two lines only !

              • The python script also fails to do the replacement and reports this error :

            Traceback (most recent call last):
              File "D:\@@\785\plugins\Config\PythonScript\scripts\Test_Alan.py", line 2, in <module>
                editor.rereplace(r'(?s-i)([^;]+)\x20\[NO_CODE\](?=.+===.+^(.+?);\1;)|^===.+', r'\2')
            RuntimeError: The complexity of matching the regular expression exceeded predefined bounds.  Try refactoring the regular expression to make each choice made by the state machine unambiguous.  This exception is thrown to prevent "eternal" matches that take an indefinite period time to locate.
            

            Remarks :

            • The Python script’s behaviour is nicer because, in case of a problem, it does not change the file at all and just outputs an error :-))

            • The Python script seems significantly faster than the native N++ regex engine !

            • Finally, as the lines of @enzo-turatti’s example are rather short, both the regex S/R and the equivalent Python script seem to work with a 9 MB file ! However, I only created a test case with a single match ! As always, working with real data will put an end to our questions !
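
Regarding the first remark, a script can catch the `RuntimeError` shown in the traceback and report it instead of stopping. The sketch below is only an illustration: `safe_rereplace` is a hypothetical helper, and `FakeEditor` is a stand-in that always fails so that the example can be run outside Notepad++. Inside PythonScript the real `editor` object would be passed instead.

```python
def safe_rereplace(editor, search, replace):
    """Call editor.rereplace and return False (instead of raising) on failure."""
    try:
        editor.rereplace(search, replace)
    except RuntimeError as err:
        # Per the remark above, the document is left unchanged in this case.
        print("Replacement skipped: %s" % err)
        return False
    return True


class FakeEditor(object):
    """Hypothetical stand-in for the PythonScript editor object; always fails."""

    def rereplace(self, search, replace):
        raise RuntimeError(
            "The complexity of matching the regular expression "
            "exceeded predefined bounds."
        )


ok = safe_rereplace(
    FakeEditor(),
    r'(?s-i)([^;]+)\x20\[NO_CODE\](?=.+===.+^(.+?);\1;)|^===.+',
    r'\2',
)
print(ok)
```

A caller can then fall back to another method, such as a sort-based or lookup-based approach, when the function returns `False`.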

            Cheers,

            guy038

              Enzo Turatti
              last edited by Sep 28, 2020, 10:32 AM

              Thank you very much.
              This week I should have time to test your suggestions.

              Best regards!

                Makwana Prahlad Banned
                last edited by Sep 30, 2020, 3:43 AM

                Hello,@Enzo-Turatti
                Please follow this Step,To Search and replace between 2 files

                The Find in Files tab (Search > Find in Files or Ctrl+Shift+F) allows you to search and replace in multiple files with one action.

                Step 1: press Ctrl+F and then from the Find in Files tab options.
                Step 2: Put the string in the regex format of the Find What:
                ^. (PeopleSleptWith).*$*
                The string will go between the “()” parenthesis just as shown above in #1
                Step 3: Put the 5 spaces and then the Replace with: PeopleSleptWith 7 string
                Step 4: Put the Filters: as . or *.txt or whatever you are replacing file type wise
                Step 5: Put the Directory: where you want it to be
                Step 6: Check the Regular expression option
                Step 7: Select Replace in Files then Check the file(s) and all should be correct now

                I hope this information will be useful.
                Thank you.

                  Alan Kilborn @Makwana Prahlad
                  last edited by Sep 30, 2020, 11:39 AM

                  @Makwana-Prahlad said in Search and replace between 2 files:

                  PeopleSleptWith

                  ???

                  I think it is time for “banning” this individual.

                  Grounds: He pollutes threads with useless and unrelated posts, which will only confuse future readers seeking a solution to the problem originally posed.

                  Need more evidence: Look at his previous postings.

                  The Community of users of the Notepad++ text editor.