How to remove the content from file1 or list1 (e.g. 4000 usernames) from file2 or list2 (e.g. 50000 usernames)
-
Hello,
How can the content of file1 or list1 (e.g. 4000 usernames) be removed from file2 or list2 (e.g. 50000 usernames)?
file1 or list1 looks like:
user3
user9
user10
user2
user30
file2 or list2 looks like:
[role]
rid=2000
rdesc=Members
user=user100;user70;user4030;user10;user300;user33;user9;user2;user35;user290;user30;user158;user3;user893;user10;
First I tried to modify file1 with
Find what: \r\n
Replace with: -
Search Mode: Regular expression
Then file1 looks like:
user3-user9-user10-user2-user30
Then change it with:
Find what: -
Replace with: ;)|(
Search mode: Normal
Then file1 looks like:
user3;)|(user9;)|(user10;)|(user2;)|(user30
Then I add at the start a “(” and at the end a “;)”
Then file1 looks like:
(user3;)|(user9;)|(user10;)|(user2;)|(user30;)
I open file2 in Notepad++ and copy’n’paste the file1 content to the “Find what:” field.
Find what: (user3;)|(user9;)|(user10;)|(user2;)|(user30;)
Replace with: <empty>
Search mode: Normal
The file2 looks like
[role]
rid=2000
rdesc=Members
user=user100;user70;user4030;user300;user33;user35;user290;user158;user893;user10;
Is there another possibility to solve this?
Best regards
Armin
-
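The manual steps in the question above (join the names from file1 into one alternation, then replace every match in file2 with nothing) can be sketched in plain Python, outside Notepad++. This is only an illustration: the function names `build_pattern` and `remove_users` are made up for this sketch, and the lookbehind is an addition that is not part of the original steps. Unlike the result shown in the question, it removes every occurrence of a listed name, including a repeated one.

```python
import re

def build_pattern(names):
    """Join the names into one alternation, as done by hand in the question."""
    # re.escape guards against regex metacharacters inside a name.
    alternatives = "|".join(re.escape(name) for name in names)
    # Assumption (not in the original steps): the lookbehind requires "=" or
    # ";" right before the name, so a name cannot match in the middle of a
    # longer token.
    return re.compile(r"(?<=[=;])(?:" + alternatives + r");")

def remove_users(file2_text, names):
    """Delete every listed 'name;' token from the text."""
    return build_pattern(names).sub("", file2_text)

file1 = "user3\nuser9\nuser10\nuser2\nuser30\n"
file2 = (
    "[role]\nrid=2000\nrdesc=Members\n"
    "user=user100;user70;user4030;user10;user300;user33;user9;user2;"
    "user35;user290;user30;user158;user3;user893;user10;\n"
)

# One name per line in file1; blank lines are skipped.
names = [line.strip() for line in file1.splitlines() if line.strip()]
result = remove_users(file2, names)
print(result)
```

Because the trailing `;` is part of the match, `user3` cannot eat into `user33;` or `user300;`.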
Hello, @armir-irger and All,
Indeed, there is another way to manage this task, using just one regex S/R :-))
As usual, though obvious, just work on copies of your two files, File1 and File2!
In order to simulate a real case, I simply split your set of members, in File2, into two parts, with some “rubbish” text in between!
Regarding your list of members, in File1, I suppose that each string user####, which must disappear from File2:
- Begins a line, or is preceded by a semicolon (;)
- Ends a line, or is followed by a semicolon (;)
Note that the File1 contents must be added at the end of File2, after a separation line that is not used in File2. I personally chose the ### string, but any string with, preferably, non-regex characters would be OK!
So, following your example, we get the initial sample text, below:
[role]
rid=2000
rdesc=Members
user=user100;user70;user4030;user10;user300;user33;user9;
bla bla blah blaaah bla bla blah
[role]
rid=1000
rdesc=Members
user=user2;user35;user290;user30;user158;user3;user893;user10;
bla blah blaaah
###
user3
user9;user10;user2
user30
Now, we run the following regex S/R :
SEARCH
(user\d+);(?=(?s).*###.*(^|;)\1(;|\R))|(?s)###.+
REPLACE
Leave EMPTY
And we get the expected text, below (the user#### members present in File1 are removed from File2, as well as all the File1 contents, which had been temporarily added). Voilà!
[role]
rid=2000
rdesc=Members
user=user100;user70;user4030;user300;user33;
bla bla blah blaaah bla bla blah
[role]
rid=1000
rdesc=Members
user=user35;user290;user158;user893;
bla blah blaaah
Best Regards,
guy038
P.S.:
Note that, in File2, the last user10 member is also removed, because it was present twice in File2!
-
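For readers who want to check the logic of the post above outside Notepad++, here is a rough Python equivalent of the "append the list after ###, then delete" idea. It does not reuse the Notepad++ regex itself (Python's `re` module has no `\R`); it is a sketch under simplifying assumptions, and `remove_listed_members` is a made-up name. In particular, it collects every `user` plus digits string after the separator, which is slightly looser than the line-start/semicolon conditions described in the post.

```python
import re

SEPARATOR = "###"  # same separation line as chosen in the post

def remove_listed_members(combined_text):
    """Drop each 'userN;' found before the separator if userN is listed after it."""
    body, _, listing = combined_text.partition(SEPARATOR)
    # Names after the separator may be split by line breaks or semicolons.
    listed = set(re.findall(r"user\d+", listing))
    # Remove a member together with its trailing semicolon, but only if listed.
    return re.sub(
        r"(user\d+);",
        lambda m: "" if m.group(1) in listed else m.group(0),
        body,
    )

combined = (
    "[role]\nrid=2000\nrdesc=Members\n"
    "user=user100;user70;user4030;user10;user300;user33;user9;\n"
    "bla bla blah blaaah bla bla blah\n"
    "[role]\nrid=1000\nrdesc=Members\n"
    "user=user2;user35;user290;user30;user158;user3;user893;user10;\n"
    "bla blah blaaah\n"
    "###\nuser3\nuser9;user10;user2\nuser30\n"
)

cleaned = remove_listed_members(combined)
print(cleaned)
```

As in the P.S., both occurrences of user10 disappear, and everything from the separator onward is dropped.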
-
This seems a common request. Maybe it is time to script it?
-
This question comes up rather frequently, and it is always tackled with a regular-expression solution. This is okay… but sometimes people have trouble with that, so how about this time we throw down a Pythonscript solution? [Thanks, Alan, for the hint…]
A variant of this question is “How do I replace a list of words (and corresponding replacement values) in one document and have the replace act upon another document?”. The question in this thread is just a special case of that: Deleting is simply where a replacement value is zero length.
So I propose that the word list should have the following form, and be present in the clipboard when the script is run, with the data file to be operated on in the active Notepad++ editor tab:
DELIMITERsearchtextDELIMITERreplacementtext
where replacementtext can be empty in order to do a deletion.
Here’s an example:
Text to be copied to clipboard prior to running the script:
:silver:golden
@silently@sqwalkingly
.sqwalkingly .
I purposefully used a different delimiter character on each line to show that that is possible…hmmm, maybe this just confuses things? Oh, well…
Text to be operated on, all by itself in a fresh editor tab:
Six silver swans swam silently seaward
After running the script with that editor tab active, its text should be changed to:
Six golden swans swam seaward
Note that in the second line of the list, silently is changed to sqwalkingly. But…in the third line, sqwalkingly followed by a space is deleted (no text follows the final delimiter, meaning change the search text to nothing, i.e., delete it).
Hopefully the reader can follow the progression of replacements in this case.
So…to perform the OP’s original deletion of data, one would create the word list as follows and copy it to the clipboard:
!user3;!
!user9;!
!user10;!
!user2;!
!user30;!
Remember, the format is:
DELIMITERsearchtextDELIMITERreplacementtext
where replacementtext can be empty in order to do a deletion (as we are doing here). This time I have arbitrarily used the ! character as the delimiter, and I was consistent about it in each line, as one usually would be.
Then, running the script in the file of the original data:
[role]
rid=2000
rdesc=Members
user=user100;user70;user4030;user10;user300;user33;user9;user2;user35;user290;user30;user158;user3;user893;user10;
One obtains, with the desired users eliminated:
[role]
rid=2000
rdesc=Members
user=user100;user70;user4030;user300;user33;user35;user290;user158;user893;
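With 4000 usernames, building that word list by hand is not realistic. One possible way to generate it from the one-name-per-line file is sketched below in plain Python; the function name `make_deletion_rules` and its parameters are hypothetical, and the `!` delimiter and trailing `;` simply follow the example above.

```python
def make_deletion_rules(names, delimiter="!", terminator=";"):
    """Build one 'DELIMITERname;DELIMITER' deletion rule per non-empty name."""
    rules = []
    for name in names:
        name = name.strip()
        if not name:
            continue  # skip blank lines
        # The delimiter must not occur in the search text, or the rule
        # would split into more than two parts.
        if delimiter in name + terminator:
            raise ValueError("delimiter %r occurs in %r" % (delimiter, name))
        # Empty replacement text after the second delimiter means "delete".
        rules.append(delimiter + name + terminator + delimiter)
    return "\n".join(rules)

file1_text = "user3\nuser9\nuser10\nuser2\nuser30\n"
rules_text = make_deletion_rules(file1_text.splitlines())
print(rules_text)
```

The printed text can then be copied to the clipboard before running the script.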
Here’s the script code for ReplaceUsingListInClipboard.py:
def RULIC__main():
    if not editor.canPaste(): return
    cp = editor.getCurrentPos()
    editor.setSelection(cp, cp)  # cancel any active selection(s)
    doc_orig_len = editor.getTextLength()
    editor.paste()  # paste so we can get easy access to the clipboard text
    cp = editor.getCurrentPos()  # this has moved because of the paste
    clipboard_lines_list = editor.getTextRange(cp - editor.getTextLength() + doc_orig_len, cp).splitlines()
    editor.undo()  # revert the paste action, but sadly, this puts it in the undo buffer...so it can be redone
    editor.beginUndoAction()
    for line in clipboard_lines_list:
        try:
            (search_text, replace_text) = line.rstrip('\n\r')[1:].split(line[0])
        except (ValueError, IndexError):
            continue
        editor.replace(search_text, replace_text)
    editor.endUndoAction()

RULIC__main()
-