deleting duplicate names this Coordination
-
I have a txt file with names in the following format:
(sherin|sherif|shehab|shawky|shaker|shahin||shawky|shahd|shaaban|gaber)
But there are duplicate names.
Please send me a method for deleting duplicate names.
-
This post is deleted! -
Hello @sabry-farg, @thomas-knoefel and All,
I’ll use a different way than @thomas-knoefel, with native N++ features only !
First, we’ll change this one-line list into a one-word list. For example, given this INPUT list, pasted in a new tab :
Table|glass|fork|knife|Table|glass|spoon|fork|spoon|fork|Table|spoon|knife|Table|glass|table|glass|fork|spoon
- Open the Replace dialog
- Uncheck all box options
- FIND: |
- REPLACE: \r\n
- Select the Wrap around option
- Choose the Regular expression search mode
- Click on the Replace All button
You should get this temporary text, below :
Table
glass
fork
knife
Table
glass
spoon
fork
spoon
fork
Table
spoon
knife
Table
glass
table
glass
fork
spoon
- Now, we’ll run the Edit > Line Operations > Remove Duplicate Lines operation
Which should return this temporary text :
Table
glass
fork
knife
spoon
table
Note : there are still two entries, Table and table, because they do not have the same case !
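If Table and table should count as the same word, the comparison has to ignore case. Outside of Notepad++, that idea can be sketched in Python as below. This is only an illustration: the function name `dedupe_ignore_case` is made up for this example, and keeping the first spelling encountered is an assumption about what you would want.

```python
def dedupe_ignore_case(lines):
    """Keep the first occurrence of each entry, comparing case-insensitively."""
    seen = set()
    result = []
    for line in lines:
        key = line.casefold()  # case-insensitive comparison key
        if key not in seen:
            seen.add(key)
            result.append(line)  # keep the spelling that appeared first
    return result


words = ["Table", "glass", "fork", "knife", "spoon", "table"]
print(dedupe_ignore_case(words))
```

With this list, the lowercase table is dropped because Table was seen first.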
Finally :
- Open the Replace dialog
- Uncheck all box options
- FIND: (?-s)(?<=.)\R(?=.)
- REPLACE: |
- Select the Wrap around option
- Choose the Regular expression search mode
- Click on the Replace All button
And here is your expected OUTPUT text :
Table|glass|fork|knife|spoon|table
Best Regards,
guy038
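For readers who want to do the same thing in a script, the three steps above (split on the pipe, remove duplicate lines, join again) can be sketched in Python. This is a minimal sketch, not part of the Notepad++ method: the function name `dedupe_pipe_list` is hypothetical, the pipe is escaped because it is a regex metacharacter, plain `\n` is used as the line break instead of `\r\n`, and it assumes the first occurrence of each duplicate is the one to keep.

```python
import re


def dedupe_pipe_list(text):
    """Mirror the three steps: split on '|', drop duplicate lines, rejoin."""
    # Step 1: one entry per line (the pipe must be escaped in a regex).
    one_per_line = re.sub(r"\|", "\n", text)
    # Step 2: drop repeated lines, keeping the first occurrence (case-sensitive).
    unique_lines = list(dict.fromkeys(one_per_line.split("\n")))
    # Step 3: replace each line break that sits between two non-empty lines
    # with a pipe, as the second search/replace does.
    return re.sub(r"(?<=.)\n(?=.)", "|", "\n".join(unique_lines))


source = ("Table|glass|fork|knife|Table|glass|spoon|fork|spoon|fork|"
          "Table|spoon|knife|Table|glass|table|glass|fork|spoon")
print(dedupe_pipe_list(source))  # Table|glass|fork|knife|spoon|table
```

As in the manual method, Table and table both survive because the comparison is case-sensitive.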
-
-
@Thomas-Knoefel I didn’t understand that method
-
This post is deleted! -
This really isn’t a job for a plugin, as Notepad++ can handily do it, as @guy038 shows.
-
@guy038 said in deleting duplicate names this Coordination:
First, we’ll change this one-line list into a one-word list. For example, given this INPUT list, pasted in a new tab :
Table|glass|fork|knife|Table|glass|spoon|fork|spoon|fork|Table|spoon|knife|Table|glass|table|glass|fork|spoon
- Open the Replace dialog
- Uncheck all box options
- FIND: |
- REPLACE: \r\n
- Select the Wrap around option
- Choose the Regular expression search mode
- Click on the Replace All button
The method is not effective.
See what happened after implementation.
-
@sabry-farg said in deleting duplicate names this Coordination:
The method is not effective
I think you have done something wrong, or the character set is not as shown. If the character in your Find What field, |, is the same as the character between the names (see image below), then after the Replace function has been used there should be none of these characters left between the names. And since every character in each of the names is on its own line, that would suggest there is a | between every character, but that doesn’t explain how the | character remains between the names.
How about providing the name list by inserting the actual names, then selecting them and clicking on the code icon above (the </>)? This will allow us to use exactly the text you say you are working with.
Terry
-
@guy038,
Actually, I think your first step omitted the escape character for the pipe character, and that’s why it puts each character on its own line when you hit Replace All. Changing the S/R to this fixes it:
- Open the Replace dialog
- Uncheck all box options
- FIND: \|
- REPLACE: \r\n
- Select the Wrap around option
- Choose the Regular expression search mode
- Click on the Replace All button
This will leave you with a result like this:
(sherin
sherif
shehab
shawky
shaker
shahin

shawky
shahd
shaaban
gaber)
As you’ll notice, that is what the original text he gave us looks like after that regex, because he has an extra pipe character in there, whether on purpose or by accident. After your second step to remove the duplicate lines, it would look like this:
(sherin
sherif
shehab
shawky
shaker
shahin

shahd
shaaban
gaber)
Notice the duplicate shawky has been removed.
Now it’s a matter of dealing with the parentheses and the missing word between pipe characters, and then it can be worked with as a clean list. I just wanted to point out that little oversight before it gets worse; otherwise, your solution works.
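The remaining cleanup (the surrounding parentheses and the empty entry left by the doubled pipe) is easy to express in a script. Here is a rough Python sketch of one way to handle it. The function name `clean_name_list` is invented for this example, and it assumes that empty entries should simply be dropped and that the parentheses should be put back around the result.

```python
def clean_name_list(text):
    """Dedupe a '(a|b||c)' style list, dropping empty entries."""
    inner = text.strip()
    # Remember whether the list was wrapped in parentheses so we can restore them.
    wrapped = inner.startswith("(") and inner.endswith(")")
    if wrapped:
        inner = inner[1:-1]
    # Split on the literal pipe and discard the empty entry caused by '||'.
    names = [name for name in inner.split("|") if name]
    # dict.fromkeys keeps the first occurrence of each name, in order.
    joined = "|".join(dict.fromkeys(names))
    return f"({joined})" if wrapped else joined


original = "(sherin|sherif|shehab|shawky|shaker|shahin||shawky|shahd|shaaban|gaber)"
print(clean_name_list(original))
# (sherin|sherif|shehab|shawky|shaker|shahin|shahd|shaaban|gaber)
```

Because `str.split` treats the pipe as a plain character, no escaping is needed here, unlike in the regex search.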
-
Hello, @sabry-farg, @thomas-knoefel, @terry-r, @lycan-thrope and All,
Oh…, @lycan-thrope, you’re perfectly right about it ! It’s a typo !
So, @sabry-farg, I apologize for my mistake !
The correct regex is, indeed :
- FIND: \|
- REPLACE: \r\n
BR
guy038
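To see why the backslash matters: in regular expression mode an unescaped | is the alternation operator, so a bare | means "nothing or nothing" and matches the empty string at every position, while the pipes themselves are left in place. The small Python sketch below shows the effect. Python's `re` module is not the Boost engine that Notepad++ uses, so take this only as an illustration of the principle; the sample string is made up.

```python
import re

sample = "ab|cd"

# Unescaped: '|' is alternation between two empty patterns, so it matches
# the empty string at every position and a line break is inserted everywhere.
unescaped = re.sub(r"|", "\n", sample)

# Escaped: '\|' matches only the literal pipe character.
escaped = re.sub(r"\|", "\n", sample)

print(repr(unescaped))  # '\na\nb\n|\nc\nd\n'
print(repr(escaped))    # 'ab\ncd'
```

The first result has every character on its own line with the pipe still present, which is the symptom reported earlier in the thread.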
-