
    deleting duplicate names this Coordination

    10 Posts 6 Posters 254 Views
    • sabry farg

      I have a txt file with names in the following format:

      (sherin|sherif|shehab|shawky|shaker|shahin||shawky|shahd|shaaban|gaber)

      But there are duplicate names.

      Please send me a method for deleting duplicate names.

      • Thomas Knoefel @sabry farg

        This post is deleted!
          • guy038

          Hello @sabry-farg, @thomas-knoefel and All,

          I’ll use a different way than @thomas-knoefel, with native N++ features only !


          First, we’ll change this one-line list into a one-word list. For example, given this INPUT list, pasted in a new tab :

          Table|glass|fork|knife|Table|glass|spoon|fork|spoon|fork|Table|spoon|knife|Table|glass|table|glass|fork|spoon
          
          • Open the Replace dialog

          • Uncheck all box options

          • FIND |

          • REPLACE \r\n

          • Select the Wrap around option

          • Choose the Regular expression search mode

          • Click on the Replace All button

          You should get this temporary text, below :

          Table
          glass
          fork
          knife
          Table
          glass
          spoon
          fork
          spoon
          fork
          Table
          spoon
          knife
          Table
          glass
          table
          glass
          fork
          spoon
          

          • Now, we’ll run the Edit > Line Operations > Remove Duplicate Lines operation

          Which should return this temporary text :

          Table
          glass
          fork
          knife
          spoon
          table
          

          Note : both Table and table remain, because the comparison is case-sensitive and they do not have the same case !


          Finally :

          • Open the Replace dialog

          • Uncheck all box options

          • FIND (?-s)(?<=.)\R(?=.)

          • REPLACE |

          • Select the Wrap around option

          • Choose the Regular expression search mode

          • Click on the Replace All button

          And here is your expected OUTPUT text :

          Table|glass|fork|knife|spoon|table
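
For readers who prefer to script this outside Notepad++, here is a minimal Python sketch of the same three steps (split on the pipe, drop duplicate items, join with the pipe again). It is an illustration, not part of the procedure above: the function name `dedupe_pipe_list` is hypothetical, and it assumes that duplicates are compared case-sensitively and that the first occurrence is kept, which matches the output shown.

```python
def dedupe_pipe_list(text: str) -> str:
    """Remove duplicate items from a pipe-separated, one-line list."""
    # Step 1: split the single line into one item per entry.
    items = text.split("|")
    # Step 2: drop duplicates, keeping the first occurrence of each item.
    # dict preserves insertion order, and the comparison is case-sensitive.
    unique_items = list(dict.fromkeys(items))
    # Step 3: rebuild the one-line list.
    return "|".join(unique_items)


source = (
    "Table|glass|fork|knife|Table|glass|spoon|fork|spoon|fork|"
    "Table|spoon|knife|Table|glass|table|glass|fork|spoon"
)
print(dedupe_pipe_list(source))  # Table|glass|fork|knife|spoon|table
```

As in the Notepad++ result, Table and table are both kept because they differ in case.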
          

          Best Regards,

          guy038

            • sabry farg @Thomas Knoefel

            @Thomas-Knoefel I didn’t understand that method

              • Thomas Knoefel @sabry farg

              This post is deleted!
                • Alan Kilborn

                This really isn’t a job for a plugin, as Notepad++ can handily do it, as @guy038 shows.

                  • sabry farg @guy038

                  @guy038 said in deleting duplicate names this Coordination:

                  First, we’ll change this one-line list into a one-word list. For example, given this INPUT list, pasted in a new tab :

                  Table|glass|fork|knife|Table|glass|spoon|fork|spoon|fork|Table|spoon|knife|Table|glass|table|glass|fork|spoon

                  Open the Replace dialog
                  
                  Uncheck all box options
                  
                  FIND |
                  
                  REPLACE \r\n
                  
                  Select the Wrap around option
                  
                  Choose the Regular expression search mode
                  
                  Click on the Replace All button
                  

                  The method is not effective.

                  See what happened after implementation.
                  A_001.jpg

                    • Terry R @sabry farg

                    @sabry-farg said in deleting duplicate names this Coordination:

                    The method is not effective

                    I think you have done something wrong, or the character set is not as shown. If the character in your Find what field, |, is the same as the character between the names (see image below), then after Replace All there should be none of these characters left between the names. And since every character of each name is on its own line, that would suggest there is a | between every character, but that doesn’t explain how the | character remains between the names.

                    NPP26711.JPG

                    How about providing the name list by inserting the actual names, then select them and click on the code icon above, see the </>. This will allow us to actually use exactly the text you say you are working with.

                    Terry

                      • Lycan Thrope @guy038

                      @guy038 ,
                      Actually @guy038, I think your first step omitted the escape character for the pipe character, and that’s why it puts each character on a line when you hit replace all. Changing the S/R to this fixes it:

                      • Open the Replace dialog
                      • Uncheck all box options
                      • FIND: \|
                      • REPLACE: \r\n
                      • Select the Wrap around option
                      • Choose the Regular expression search mode
                      • Click on the Replace All button

                      This will leave you with a result like this:

                      (sherin
                      sherif
                      shehab
                      shawky
                      shaker
                      shahin
                      
                      shawky
                      shahd
                      shaaban
                      gaber)
                      

                      As you’ll notice, this is what the original text he gave us looks like after that regex, because it has an extra pipe character in it. Whether that was on purpose or accidental, that’s what his list looks like after the first step with the escaped pipe character, and it looks like this after your second step, which removes the duplicate lines:

                      (sherin
                      sherif
                      shehab
                      shawky
                      shaker
                      shahin
                      
                      shahd
                      shaaban
                      gaber)
                      

                      Notice the duplicate shawky has been removed.

                      Now it’s a matter of dealing with the parentheses and the missing word between the two consecutive pipe characters; then it can be worked with as a clean list. I just wanted to point out that little oversight before it gets worse; otherwise, your solution works.
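
One possible way to handle those leftovers in a script is sketched below in Python. This is only an assumption about what a "clean list" should look like: the hypothetical function `clean_name_list` strips the surrounding parentheses, drops the empty entry produced by the double pipe, removes case-sensitive duplicates, and then puts the parentheses back.

```python
def clean_name_list(text: str) -> str:
    """Deduplicate a pipe-separated list that may be wrapped in parentheses."""
    inner = text.strip()
    # Remember whether the list was wrapped, so the wrapper can be restored.
    wrapped = inner.startswith("(") and inner.endswith(")")
    if wrapped:
        inner = inner[1:-1]
    # Splitting "a||b" on "|" yields an empty string; filter those out.
    names = [name for name in inner.split("|") if name]
    # Keep the first occurrence of each name (case-sensitive comparison).
    unique_names = list(dict.fromkeys(names))
    joined = "|".join(unique_names)
    return f"({joined})" if wrapped else joined


original = "(sherin|sherif|shehab|shawky|shaker|shahin||shawky|shahd|shaaban|gaber)"
print(clean_name_list(original))
# (sherin|sherif|shehab|shawky|shaker|shahin|shahd|shaaban|gaber)
```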

                        • guy038

                        Hello, @sabry-farg, @thomas-knoefel, @terry-r, @lycan-thrope and All,

                        Oh…, @lycan-thrope, you’re perfectly right about it ! It’s a typo !

                        So, @sabry-farg, I apologize for my mistake !

                        The correct regex is, indeed :

                        • FIND \|

                        • REPLACE \r\n
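
To see why the backslash matters, the difference can be reproduced with Python's `re` module, used here only as a stand-in for the Notepad++ regex engine (the two engines may differ in details). In a regex, an unescaped `|` is the alternation operator between two empty alternatives, so it matches the empty string at every position and never consumes the literal pipe. The sample string is made up for illustration.

```python
import re

sample = "ab|cd"

# Unescaped: "|" means "empty OR empty", which matches at every position,
# so a line break is inserted around every character and the pipe stays.
unescaped = re.sub(r"|", "\n", sample)

# Escaped: "\|" matches the literal pipe character only.
escaped = re.sub(r"\|", "\n", sample)

print(unescaped.split("\n"))  # ['', 'a', 'b', '|', 'c', 'd', '']
print(escaped.split("\n"))    # ['ab', 'cd']
```

The unescaped case gives one character per line with the pipes still present, which is consistent with the behaviour described earlier in the thread.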

                        BR

                        guy038
