How to remove the content from file1 or list1 (e.g. 4000 usernames) from file2 or list2 (e.g. 50000 usernames)

  • A
    Armin Irger
    last edited by Aug 4, 2018, 2:19 PM

    Hello,

    How can the content of file1 or list1 (e.g. 4000 usernames) be removed from file2 or list2 (e.g. 50000 usernames)?

    file1 or list1 looks like:
    user3
    user9
    user10
    user2
    user30

    file2 or list2 looks like:
    [role]
    rid=2000
    rdesc=Members
    user=user100;user70;user4030;user10;user300;user33;user9;user2;user35;user290;user30;user158;user3;user893;user10;

    First, I tried to modify file1 with:
    Find what: \r\n
    Replace with: -
    Search Mode: Regular expression

    Then file1 looks like:
    user3-user9-user10-user2-user30

    Then change it with:
    Find what: -
    Replace with: ;)|(
    Search mode: Normal

    Then file1 looks like:
    user3;)|(user9;)|(user10;)|(user2;)|(user30

    Then I add a “(” at the start and a “;)” at the end.

    Then file1 looks like:
    (user3;)|(user9;)|(user10;)|(user2;)|(user30;)
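
The manual steps above can also be done programmatically. Here is a minimal Python sketch, assuming file1 holds one username per line. The function name `build_alternation` and the sample `members` line are hypothetical and not part of the original post.

```python
import re

def build_alternation(usernames):
    """Join usernames into the (user3;)|(user9;)|... form built by hand above."""
    # re.escape guards against names that contain regex metacharacters.
    return "|".join("({};)".format(re.escape(name)) for name in usernames)

file1_lines = ["user3", "user9", "user10", "user2", "user30"]
pattern = build_alternation(file1_lines)
print(pattern)

# Applying the pattern as a regular expression to a shortened sample line.
members = "user=user100;user10;user33;user3;"
cleaned = re.sub(pattern, "", members)
print(cleaned)
```

Because each alternative ends in `;`, `user3;` does not match inside `user33;`. The pattern has no anchor on the left, though, so a name such as `superuser3;` would lose its tail.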

    I open file2 in Notepad++ and copy and paste the file1 content into the “Find what:” field.
    Find what: (user3;)|(user9;)|(user10;)|(user2;)|(user30;)
    Replace with: <empty>
    Search mode: Regular expression

    Then file2 looks like:
    [role]
    rid=2000
    rdesc=Members
    user=user100;user70;user4030;user300;user33;user35;user290;user158;user893;user10;

    Is there another way to solve this?

    Best regards
    Armin

    • G
      guy038
      last edited by guy038 Aug 4, 2018, 8:18 PM Aug 4, 2018, 8:11 PM

      Hello, @armir-irger and All,

      Indeed, there is another way to manage this task, using just one regex S/R :-))

      As usual, though obvious, just work on copies of your two files File1 and File2 !

      To simulate a real case, I simply split your set of members in File2 into two parts, with some “rubbish” text in between!

      Regarding your list of members in File1, I assume that each string user#### that must disappear from File2:

      • Begins a line, or is preceded with a semicolon ( ; )

      • Ends a line or is followed with a semicolon ( ; )

      Note that the File1 contents must be added at the end of File2, after a separation line, not used in File2. I personally chose the ### string but any string with, preferably, non-regex characters would be OK !

      So, following your example, we get the initial sample text, below :

      [role]
      rid=2000
      rdesc=Members
      user=user100;user70;user4030;user10;user300;user33;user9;
      
      bla
      bla blah
      
      blaaah
      bla bla blah
      
      [role]
      rid=1000
      rdesc=Members
      user=user2;user35;user290;user30;user158;user3;user893;user10;
      bla blah
      
      blaaah
      
      ###
      
      user3
      user9;user10;user2
      user30
      

      Now, we run the following regex S/R :

      SEARCH (user\d+);(?=(?s).*###.*(^|;)\1(;|\R))|(?s)###.+

      REPLACE Leave EMPTY

      And we get the expected text below: the user#### members present in File1 are removed from File2, along with the File1 contents that were temporarily appended. Voilà!

      [role]
      rid=2000
      rdesc=Members
      user=user100;user70;user4030;user300;user33;
      
      bla
      bla blah
      
      blaaah
      bla bla blah
      
      [role]
      rid=1000
      rdesc=Members
      user=user35;user290;user158;user893;
      bla blah
      
      blaaah
      

      Best Regards,

      guy038

      P.S. :

      Note that, in File2, the last user10 member is also removed, because it was already present, twice, in File2 !
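
For readers who want to try the same idea outside Notepad++, here is a rough Python adaptation of the search above. It is a sketch under assumptions, not the original regex. Python's `re` module has no `\R` and rejects a mid-pattern `(?s)`, so this version uses scoped `(?s:...)` groups, `\r?\n`, and `re.MULTILINE`. The function name `remove_listed_users` and the `sep` parameter are hypothetical.

```python
import re

def remove_listed_users(file2_text, file1_text, sep="###"):
    """Append file1 after a separator line, then run a single substitution."""
    combined = file2_text + "\n" + sep + "\n" + file1_text
    s = re.escape(sep)
    # Alternative 1: "userN;" that reappears as a whole entry after the separator.
    in_list = (r"(user\d+);(?=(?s:.*)" + s +
               r"(?s:.*)(?:^|;)\1(?:;|\r?\n|\Z))")
    # Alternative 2: the separator line and everything after it.
    tail = r"\n" + s + r"\n(?s:.*)"
    pattern = re.compile(in_list + "|" + tail, re.MULTILINE)
    return pattern.sub("", combined)

file2 = "user=user100;user10;user33;user9;\nbla\nuser=user2;user35;user3;user10;\n"
file1 = "user3\nuser9;user10;user2\nuser30\n"
result = remove_listed_users(file2, file1)
print(result)
```

As in the Notepad++ version, every occurrence of a listed user is removed, including a duplicated `user10;`. The lookahead rescans the rest of the text for each candidate, so this may become slow on very large inputs.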

      • A
        Alan Kilborn
        last edited by Aug 5, 2018, 6:10 PM

        This seems a common request. Maybe it is time to script it?

        • S
          Scott Sumner
          last edited by guy038 Aug 7, 2018, 10:14 AM Aug 6, 2018, 2:45 AM

          This question comes up rather frequently, and it is always tackled with a regular-expression solution. This is okay… but sometimes people have trouble with that, so how about this time we throw down a PythonScript solution? [Thanks, Alan, for the hint…]

          A variant of this question is “How do I replace a list of words (and corresponding replacement values) in one document and have the replace act upon another document?”. The question in this thread is just a special case of that: Deleting is simply where a replacement value is zero length.

          So I propose that the word list should have the following form, and be present in the clipboard when the script is run, with the data file to be operated on in the active Notepad++ editor tab:

          DELIMITERsearchtextDELIMITERreplacementtext
          where replacementtext can be empty in order to do a deletion

          Here’s an example:

          Text to be copied to clipboard prior to running the script:

          :silver:golden
          @silently@sqwalkingly
          .sqwalkingly .
          

          I purposefully used a different delimiter character on each line to show that that is possible…hmmm, maybe this just confuses things? Oh, well…

          Text to be operated on, all by itself in a fresh editor tab:

          Six silver swans swam silently seaward

          After running the script with that editor tab active, its text should be changed to:

          Six golden swans swam seaward

          Note that in the second line of the list, silently is changed to sqwalkingly. But…in the third line, sqwalkingly followed by a space is deleted (no text follows the final delimiter, meaning change the search text to nothing, i.e., delete it).

          Hopefully the reader can follow the progression of replacements in this case.
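
To make the progression concrete, here is a small standalone Python sketch of the same rule format applied to a plain string. It does not use the Notepad++ `editor` object. The plain `str.replace` call stands in for the in-editor replace, whose matching options may differ, and the function name `apply_rules` is hypothetical.

```python
def apply_rules(text, rules):
    """Apply each DELIMITER-search-DELIMITER-replacement rule, in order."""
    for line in rules.splitlines():
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        delimiter, body = line[0], line[1:]
        parts = body.split(delimiter)
        if len(parts) != 2 or not parts[0]:
            continue  # skip malformed lines or an empty search text
        search_text, replace_text = parts
        text = text.replace(search_text, replace_text)
    return text

rules = ":silver:golden\n@silently@sqwalkingly\n.sqwalkingly .\n"
swans = apply_rules("Six silver swans swam silently seaward", rules)
print(swans)  # Six golden swans swam seaward
```

The rules run in order, so a later rule can act on text produced by an earlier one. That is how the third line deletes the word the second line introduced.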

          So…to perform the OP’s original deletion of data, one would create the word list as follows and copy it to the clipboard:

          !user3;!
          !user9;!
          !user10;!
          !user2;!
          !user30;!
          

          Remember, the format is: DELIMITERsearchtextDELIMITERreplacementtext, where replacementtext can be empty in order to do a deletion (as we are doing here). This time I have arbitrarily used the ! character as the delimiter, and I was consistent about it in each line, as one usually would be.
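
With thousands of usernames, typing such a list by hand is impractical. Here is a minimal Python sketch for generating it, assuming the usernames are available as a list of strings. The function name `build_deletion_rules` and its parameters are hypothetical.

```python
def build_deletion_rules(usernames, delimiter="!", suffix=";"):
    """Return one DELIMITER<name><suffix>DELIMITER line per username."""
    lines = []
    for name in usernames:
        name = name.strip()
        if not name:
            continue  # ignore blank entries
        if delimiter in name:
            # The delimiter must not occur inside the search text.
            raise ValueError("delimiter occurs in search text: " + name)
        lines.append(delimiter + name + suffix + delimiter)
    return "\n".join(lines)

word_list = build_deletion_rules(["user3", "user9", "user10", "user2", "user30"])
print(word_list)
```

The output can then be copied to the clipboard before running the script. Including the trailing `;` in the search text keeps `user3` from matching inside `user33;`.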

          Then, running the script in the file of the original data:

          [role]
          rid=2000
          rdesc=Members
          user=user100;user70;user4030;user10;user300;user33;user9;user2;user35;user290;user30;user158;user3;user893;user10;
          

          One obtains, with the desired users eliminated:

          [role]
          rid=2000
          rdesc=Members
          user=user100;user70;user4030;user300;user33;user35;user290;user158;user893;
          

          Here’s the script code for ReplaceUsingListInClipboard.py :

          def RULIC__main():
              if not editor.canPaste(): return
              cp = editor.getCurrentPos()
              editor.setSelection(cp, cp)  # cancel any active selection(s)
              doc_orig_len = editor.getTextLength()
              editor.paste()  # paste so we can get easy access to the clipboard text
              cp = editor.getCurrentPos()  # this has moved because of the paste
              clipboard_lines_list = editor.getTextRange(cp - editor.getTextLength() + doc_orig_len, cp).splitlines()
              editor.undo()  # revert the paste action, but sadly, this puts it in the undo buffer...so it can be redone
              editor.beginUndoAction()
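              for line in clipboard_lines_list:
                  # line[0] is the delimiter; the rest must split into exactly (search, replacement)
                  try: (search_text, replace_text) = line.rstrip('\n\r')[1:].split(line[0])
                  except (ValueError, IndexError): continue  # skip blank or malformed lines
                  editor.replace(search_text, replace_text)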
              editor.endUndoAction()
          
          RULIC__main()
          