    How to remove duplicates words?

    • Alan KilbornA
      Alan Kilborn @Bahaa0013
      last edited by

      @Bahaa-Eddin-ツ

      It’s a bit of a tough one.
      Perhaps as a starting point, try fiddling with this:

      Find: (?-s)^(.*?\b(\w+)\b.+?) \2\+?
      Replace: ${1}
      Search mode: Regular expression

      You’d have to run it several times, until no more replacements are made.

      And I just tried it quickly, so I’m sure some holes can be shot into it. :-)
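For anyone who would rather script the "run it until no more replacements are made" step than press Replace All by hand, here is a minimal sketch in Python. It is only an approximation of the Notepad++ behaviour: the function name `remove_repeated_words` and the `max_passes` safety limit are made up for illustration, and the leading `(?-s)` is dropped because Python's `re` accepts flag removal only in the scoped form `(?-s:...)` and `.` already excludes newlines by default.

```python
import re

# The Find pattern from the post above, minus the leading (?-s).
# re.MULTILINE makes ^ match at the start of every line.
REPEATED_WORD = re.compile(r"^(.*?\b(\w+)\b.+?) \2\+?", re.MULTILINE)


def remove_repeated_words(text, max_passes=1000):
    """Apply the replacement repeatedly until a pass changes nothing.

    max_passes is a hypothetical safety limit, not part of the original tip.
    """
    for _ in range(max_passes):
        # Each pass removes at most one repeated word per line.
        text, count = REPEATED_WORD.subn(r"\1", text)
        if count == 0:
            break
    return text


print(remove_repeated_words("one two one three two"))
```

Like the original regex, this has no word boundary after `\2`, so it can also strip the start of a longer word that begins with an earlier word; test it on a copy of your data first.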

      • Bahaa0013B
        Bahaa0013 @Alan Kilborn
        last edited by

        @Alan-Kilborn
        Thank you, I guess it works…
        But I guess I have to run it at least 500 times to remove all the duplicated words xD

        But no problem, I will use it; it’s much easier. Thanks!

        • PeterJonesP
          PeterJones @Bahaa0013
          last edited by

          @Bahaa-Eddin-ツ said in How to remove duplicates words?:

          I guess I have to run it at least 500 times

          Record the search/replace as a macro, then use Macro > Run a Macro Multiple Times... to run it 500 times (or however many times are necessary).

          • Alan KilbornA
            Alan Kilborn @Bahaa0013
            last edited by Alan Kilborn

            @Bahaa-Eddin-ツ said in How to remove duplicates words?:

            Thank you, I guess it works…

            Don’t guess…be sure…your data is important.

            I have to run it at least 500 times to remove all the duplicated words xD

            Hold down the keyboard accelerator for Replace All until the Replace window’s status bar indicates no more replacements were made?

            • Bahaa0013B
              Bahaa0013 @Alan Kilborn
              last edited by

              @Alan-Kilborn
              I guess it doesn’t work as I wanted…
              because I didn’t give the right example.

              This is what I want.
              Example:

              [math part1+ Biology part1+ biology part3+ History part1+ math part1+ Biology part3+ history part1]

              Output:

              [Biology part1+ History part1+ math part1+ Biology part3]
              
              • Alan KilbornA
                Alan Kilborn @Bahaa0013
                last edited by

                @Bahaa-Eddin-ツ

                I’d say, start from my kickstart attempt, and go from there. Good luck.

                • guy038G
                  guy038
                  last edited by guy038

                  Hello, @Bahaa-Eddin-ツ, @alan-kilborn, @peterjones and All,

                  @Bahaa-Eddin-ツ, I suppose that you were already successful with the @alan-kilborn solution !

                  However, here is a solution which just needs one Replace All action !

                  • Open the Replace dialog ( Ctrl + H )

                  • Untick all box options

                  • SEARCH (?xi-s) (?: \[ | \+ ) \x20* ( [^+\r\n]+ ) (?= \x20* \+ .+ \1 )

                  • REPLACE Leave EMPTY

                  • Check the Wrap around option

                  • Select the Regular expression search mode

                  • Click once only on the Replace All button ( or several times on the Replace button )


                  So, for instance, from the INPUT text :

                  [math part1+ Biology part1+ biology part3+ History part1+ Test N°1+ math part1+ Biology part3+ history part1+ Biology part3+ Biology part1+ test number 2+ math part1+ History part1]
                  

                  You should get this OUTPUT text :

                  + Test N°1+ Biology part3+ Biology part1+ test number 2+ math part1+ History part1]
                  

                  Finally, just change the beginning of each section with this obvious regex S/R :

                  SEARCH (?x) ^ \+ \x20*

                  REPLACE [

                  Best Regards

                  guy038
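To check the two steps above outside Notepad++, they can be sketched in Python as follows. This is an assumption-laden port, not what Notepad++ runs internally: the names `DUPLICATE_ITEM`, `LEADING_PLUS` and `remove_duplicate_items` are invented for the example, and the inline `(?xi-s)` / `(?x)` modifiers are replaced by `re.VERBOSE`, `re.IGNORECASE` and `re.MULTILINE`, since Python's `re` does not accept a global `-s` and leaves dot-matches-newline off by default.

```python
import re

# Step 1: an item (after "[" or "+") that occurs again later on the same line.
DUPLICATE_ITEM = re.compile(
    r"""
    (?: \[ | \+ )          # opening bracket or plus separator
    \x20*                  # optional spaces
    ( [^+\r\n]+ )          # group 1: the item text
    (?= \x20* \+ .+ \1 )   # followed, later on the line, by the same item
    """,
    re.VERBOSE | re.IGNORECASE,
)

# Step 2: a line that now starts with "+" gets its opening bracket back.
LEADING_PLUS = re.compile(r"^\+\x20*", re.MULTILINE)


def remove_duplicate_items(text):
    """Keep only the last occurrence of each item, compared case-insensitively."""
    stripped = DUPLICATE_ITEM.sub("", text)
    return LEADING_PLUS.sub("[", stripped)


sample = (
    "[math part1+ Biology part1+ biology part3+ History part1+ Test N°1"
    "+ math part1+ Biology part3+ history part1+ Biology part3"
    "+ Biology part1+ test number 2+ math part1+ History part1]"
)
print(remove_duplicate_items(sample))
# [Test N°1+ Biology part3+ Biology part1+ test number 2+ math part1+ History part1]
```

Because the lookahead only checks whether the item text appears somewhere later on the line, an item that happens to be contained in a longer later item (for example `part1` inside `math part1`) would also be removed.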

                  • Alan KilbornA
                    Alan Kilborn @guy038
                    last edited by Alan Kilborn

                    @guy038 said:

                    SEARCH (?xi-s) (?: [ | + ) \x20* ( [^+\r\n]+ ) (?= \x20* + .+ \1 )

                    It looks suspiciously like the first [ is a victim of this site losing the leading escape??

                    • guy038G
                      guy038
                      last edited by guy038

                      Hello, @alan-kilborn and All,

                      Sorry for the confusion !

                      So I have restored my search regex to its initial state.

                      And here is the right syntax that should be used :

                      • SEARCH (?xi-s) (?: \[ | \+ ) \x20* ( [^+\r\n]+ ) (?= \x20* \+ .+ \1 )

                      BR

                      guy038

                      So, Alan, you can delete the EDIT part of your last post !

                      • Alan KilbornA
                        Alan Kilborn @guy038
                        last edited by

                        @guy038 said in How to remove duplicates words?:

                        So, Alan, you can delete the EDIT part of your last post !

                        It ALREADY never happened! :-)

                        • guy038G
                          guy038
                          last edited by guy038

                          Hello, @alan-kilborn and All,

                          I’ve found out an interesting thing about posts which contain a literal [ character in search regexes :

                          \\[
                          

                          If you must edit one of these posts in order to change any other part, you’ll need to repeat the special modifications, regarding the regexes, by using, again, the syntax :

                          \\\[
                          

                          BR

                          guy038

                          The Community of users of the Notepad++ text editor.