    How to remove the content from file1 or list1 (e.g. 4000 usernames) from file2 or list2 (e.g. 50000 usernames)

    • Armin Irger

      Hello,

      how can the content of file1 or list1 (e.g. 4000 usernames) be removed from file2 or list2 (e.g. 50000 usernames)?

      file1 or list1 looks like:
      user3
      user9
      user10
      user2
      user30

      file2 or list2 looks like:
      [role]
      rid=2000
      rdesc=Members
      user=user100;user70;user4030;user10;user300;user33;user9;user2;user35;user290;user30;user158;user3;user893;user10;

      First, I tried to modify file1 with:
      Find what: \r\n
      Replace with: -
      Search Mode: Regular expression

      Then file1 looks like:
      user3-user9-user10-user2-user30

      Then change it with:
      Find what: -
      Replace with: ;)|(
      Search mode: Normal

      Then file1 looks like:
      user3;)|(user9;)|(user10;)|(user2;)|(user30

      Then I add a “(” at the start and a “;)” at the end.

      Then file1 looks like:
      (user3;)|(user9;)|(user10;)|(user2;)|(user30;)

      I open file2 in Notepad++ and copy and paste the file1 content into the “Find what:” field.
      Find what: (user3;)|(user9;)|(user10;)|(user2;)|(user30;)
      Replace with: <empty>
      Search mode: Regular expression

      Then file2 looks like:
      [role]
      rid=2000
      rdesc=Members
      user=user100;user70;user4030;user300;user33;user35;user290;user158;user893;user10;

      Is there another way to solve this?

      Best regards
      Armin
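For readers who would rather script the steps described above, here is a minimal Python sketch of the same idea: turn each name from file1 into an alternative of the form (name;) and delete every match from the member line. The function names build_alternation and remove_users are made up for this illustration, and the sketch assumes that every entry in file2 is terminated by a semicolon, as in the sample.

```python
import re


def build_alternation(usernames):
    """Build a pattern like (user3;)|(user9;) from a list of names.

    re.escape guards against names that contain regex metacharacters.
    """
    return "|".join("(" + re.escape(name) + ";)" for name in usernames)


def remove_users(member_line, usernames):
    """Delete every 'name;' entry of member_line whose name is listed."""
    return re.sub(build_alternation(usernames), "", member_line)


file1_names = ["user3", "user9", "user10", "user2", "user30"]
file2_line = ("user=user100;user70;user4030;user10;user300;user33;user9;"
              "user2;user35;user290;user30;user158;user3;user893;user10;")

print(build_alternation(file1_names))
print(remove_users(file2_line, file1_names))
```

Note that a regex replacement of this kind removes every occurrence of a listed name, so both copies of user10; disappear from the sample line.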

      • guy038

        Hello, @armir-irger and All,

        Indeed, there is another way to manage this task, using just one regex S/R :-))

        As usual, though obvious, just work on copies of your two files File1 and File2 !

        In order to simulate a real case, I simply split your set of members in File2 into two parts, with some “rubbish” text in between !

        Regarding your list of members in File1, I suppose that each string user####, which must disappear from File2 :

        • Begins a line, or is preceded with a semicolon ( ; )

        • Ends a line or is followed with a semicolon ( ; )

        Note that the File1 contents must be added at the end of File2, after a separation line that is not used in File2. I personally chose the ### string, but any string made of, preferably, non-regex characters would be OK !

        So, following your example, we get the initial sample text, below :

        [role]
        rid=2000
        rdesc=Members
        user=user100;user70;user4030;user10;user300;user33;user9;
        
        bla
        bla blah
        
        blaaah
        bla bla blah
        
        [role]
        rid=1000
        rdesc=Members
        user=user2;user35;user290;user30;user158;user3;user893;user10;
        bla blah
        
        blaaah
        
        ###
        
        user3
        user9;user10;user2
        user30
        

        Now, we run the following regex S/R :

        SEARCH (user\d+);(?=(?s).*###.*(^|;)\1(;|\R))|(?s)###.+

        REPLACE Leave EMPTY

        And we get the expected text below ( the user#### members present in File1 are removed from File2, as well as all the File1 contents, which had been temporarily added ). Voilà !

        [role]
        rid=2000
        rdesc=Members
        user=user100;user70;user4030;user300;user33;
        
        bla
        bla blah
        
        blaaah
        bla bla blah
        
        [role]
        rid=1000
        rdesc=Members
        user=user35;user290;user158;user893;
        bla blah
        
        blaaah
        

        Best Regards,

        guy038

        P.S. :

        Note that, in File2, the last user10 member is also removed, because user10 was present twice in File2 !
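To try the same search outside Notepad++, the regex can be adapted to Python's re module. This is only a sketch under a few assumptions: Python's re has no \R, so \r?\n is used instead; recent Python versions reject an inline (?s) in the middle of a pattern, so the scoped form (?s:...) is used; and ^ needs re.MULTILINE to match at line starts. The helper name remove_listed_users is hypothetical, and the separator ### is assumed not to occur in File2.

```python
import re

SEPARATOR = "###"  # assumed not to occur anywhere in File2

# Python adaptation of the Notepad++ regex shown above:
#   (user\d+);(?=(?s).*###.*(^|;)\1(;|\R))|(?s)###.+
PATTERN = re.compile(
    r"(user\d+);(?=(?s:.*)" + SEPARATOR + r"(?s:.*)(?:^|;)\1(?:;|\r?\n))"
    r"|(?s:" + SEPARATOR + r".+)",
    re.MULTILINE,
)


def remove_listed_users(file2_text, file1_text):
    """Append File1 after the separator, then run the single search/replace.

    The first alternative deletes 'userN;' when userN appears in the
    appended list; the second deletes the separator and the list itself.
    """
    combined = file2_text + SEPARATOR + "\n" + file1_text + "\n"
    return PATTERN.sub("", combined)


file2 = ("user=user100;user10;user300;user9;\n"
         "rid=1\n"
         "user=user2;user35;user30;user3;user10;\n")
file1 = "user3\nuser9;user10;user2\nuser30"

print(remove_listed_users(file2, file1))
```

Because the look-ahead rescans the rest of the text for every candidate, this approach may become slow on very large inputs; building a set of names first would likely scale better.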

          • Alan Kilborn

          This seems a common request. Maybe it is time to script it?

            • Scott Sumner

            This question comes up rather frequently, and it is always tackled with a regular-expression solution. This is okay… but sometimes people have trouble with that, so how about this time we throw down a PythonScript solution? [Thanks, Alan, for the hint…]

            A variant of this question is “How do I take a list of words (and corresponding replacement values) from one document and have the replacements act upon another document?”. The question in this thread is just a special case of that: deleting is simply the case where the replacement value has zero length.

            So I propose that the word list should have the following form, and be present in the clipboard when the script is run, with the data file to be operated on in the active Notepad++ editor tab:

            DELIMITERsearchtextDELIMITERreplacementtext
            where replacementtext can be empty in order to do a deletion

            Here’s an example:

            Text to be copied to clipboard prior to running the script:

            :silver:golden
            @silently@sqwalkingly
            .sqwalkingly .
            

            I purposefully used a different delimiter character on each line to show that that is possible…hmmm, maybe this just confuses things? Oh, well…

            Text to be operated on, all by itself in a fresh editor tab:

            Six silver swans swam silently seaward

            After running the script with that editor tab active, its text should be changed to:

            Six golden swans swam seaward

            Note that in the second line of the list, silently is changed to sqwalkingly. But…in the third line, sqwalkingly followed by a space is deleted (no text follows the final delimiter, meaning change the search text to nothing, i.e., delete it).

            Hopefully the reader can follow the progression of replacements in this case.
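The progression can also be traced outside Notepad++ with a small stand-alone Python sketch. It is not the script itself: parse_rule and apply_rules are names invented for this illustration, and str.replace stands in for the editor's plain-text replace, which is an assumption about equivalent behavior.

```python
def parse_rule(line):
    """Split 'DsearchDreplacement' on its first character D.

    Returns (search_text, replace_text), or None for lines that are empty,
    have an empty search text, or do not split into exactly two fields.
    """
    line = line.rstrip("\r\n")
    if not line:
        return None
    fields = line[1:].split(line[0])
    if len(fields) != 2 or not fields[0]:
        return None
    return fields[0], fields[1]


def apply_rules(text, rule_lines):
    """Apply each rule in order; later rules see earlier replacements."""
    for line in rule_lines:
        rule = parse_rule(line)
        if rule is None:
            continue  # skip malformed lines
        search_text, replace_text = rule
        text = text.replace(search_text, replace_text)
    return text


rules = [":silver:golden", "@silently@sqwalkingly", ".sqwalkingly ."]
print(apply_rules("Six silver swans swam silently seaward", rules))
# prints: Six golden swans swam seaward
```

The rules are applied one after another to the whole text, which is why the third rule can delete the word that the second rule has just introduced.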

            So…to perform the OP’s original deletion of data, one would create the word list as follows and copy it to the clipboard:

            !user3;!
            !user9;!
            !user10;!
            !user2;!
            !user30;!
            

            Remember, the format is: DELIMITERsearchtextDELIMITERreplacementtext where replacementtext can be empty in order to do a deletion (as we are doing here). This time I have arbitrarily used the ! character as the delimiter, and I was consistent about it in each line, as one usually would be.

            Then, running the script in the file of the original data:

            [role]
            rid=2000
            rdesc=Members
            user=user100;user70;user4030;user10;user300;user33;user9;user2;user35;user290;user30;user158;user3;user893;user10;
            

            One obtains, with the desired users eliminated:

            [role]
            rid=2000
            rdesc=Members
            user=user100;user70;user4030;user300;user33;user35;user290;user158;user893;
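With thousands of usernames, typing the word list by hand is not practical, so it could be generated from file1. The following is a rough Python sketch; build_deletion_list is a hypothetical helper, and the ! delimiter and ; terminator are simply the choices made in the example above.

```python
def build_deletion_list(usernames, delimiter="!", terminator=";"):
    """Return word-list text with one 'DnameTD' line per username.

    The replacement part after the second delimiter is left empty,
    which means "delete the search text".
    """
    lines = []
    for name in usernames:
        name = name.strip()
        if not name:
            continue  # ignore blank lines in file1
        if delimiter in name:
            raise ValueError("delimiter %r occurs in %r" % (delimiter, name))
        lines.append(delimiter + name + terminator + delimiter)
    return "\n".join(lines) + "\n"


file1_text = "user3\nuser9\nuser10\nuser2\nuser30\n"
print(build_deletion_list(file1_text.splitlines()))
```

One caveat, assuming the search is a plain substring match: a search text such as bob; would also hit the tail of jimbob;. The sample names happen not to collide in that way, but real username lists might.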
            

            Here’s the script code for ReplaceUsingListInClipboard.py:

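            # Runs inside the PythonScript plugin, which provides the 'editor' object.
            # Each clipboard line has the form DELIMITERsearchtextDELIMITERreplacementtext;
            # an empty replacementtext deletes the search text.
            def RULIC__main():
                if not editor.canPaste(): return
                cp = editor.getCurrentPos()
                editor.setSelection(cp, cp)  # cancel any active selection(s)
                doc_orig_len = editor.getTextLength()
                editor.paste()  # paste so we can get easy access to the clipboard text
                cp = editor.getCurrentPos()  # this has moved because of the paste
                # the pasted text spans from (cp - pasted length) to cp
                clipboard_lines_list = editor.getTextRange(cp - editor.getTextLength() + doc_orig_len, cp).splitlines()
                editor.undo()  # revert the paste action, but sadly, this puts it in the undo buffer...so it can be redone
                editor.beginUndoAction()  # group all replacements into a single undo step
                for line in clipboard_lines_list:
                    # the first character of each line is its delimiter; empty lines and lines
                    # that do not split into exactly two fields are skipped
                    try: (search_text, replace_text) = line.rstrip('\n\r')[1:].split(line[0])
                    except (ValueError, IndexError): continue
                    editor.replace(search_text, replace_text)
                editor.endUndoAction()
            
            RULIC__main()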
            