Remove duplicate strings with comma separator

    Saki Soulimenas
    last edited by Jan 16, 2018, 4:44 PM

    Hi all,
    I would like to find and remove duplicate records of some archives I have in a text file.
    The text file looks like this:
    AA,
    OA,PA,PC,TA,TB,TC,TG,
    OA,PA,PB,PC,RK,TA,TB,TC,TG,X0,X1,
    AA,ED,OA,RK,PA,PB,PT,PC,TA,TB,TC,TG,
    AA,OA,RK,PA,PB,PC,TA,TB,TC,TG,X0,X1,
    AA,ZD,
    AA,
    AA,
    HA,
    AA,HA,
    HA,
    RA,RB,RD,RE,RF,RG,RH,RI,Y1,Y2,Y3,
    RA,RB,RD,RE,RF,RG,RH,RI,RL,RK,X0,X1,Y1,Y2,Y3,
    AA,RA,RB,RD,RE,RF,RG,RH,RI,RK,Y1,Y2,Y3,YA,
    RA,RB,RD,RE,RF,RG,RH,RI,RK,Y1,Y2,Y3,
    CA,CB,EA,EB,EC,ED,PB,VA,
    CA,CB,AA,K1,EA,EB,EC,ED,VA,X0,X1,
    AA,CA,CB,RK,EA,EB,EC,ED,PB,VA,
    AA,CA,CB,RK,EA,EB,EC,ED,VA,X5,X6,
    FA,FB,
    K1,CA,RA,
    AA,FA,FB,K1,CA,RA,
    FA,FB,

    I haven't found a solution in the existing posts, so I'm starting a new one.
    Maybe someone can help.

    Preferably I would like to remove the duplicate archives, AA for example, and keep only the first or last occurrence.
    It would also be nice to delete the commas and maybe place all remaining records on separate lines.
    Like this:
    AA
    FA
    FB
    The last two are only luxury wishes, so not very important.

    Thanks in advance

      Scott Sumner @Saki Soulimenas
      last edited by Jan 16, 2018, 5:51 PM

      @Saki-Soulimenas

      There are many postings in this community about removing duplicate lines from unsorted files. Here’s one … see my posting that starts out “I’m glad you have a solution…”. Run a Regular Expression replace using the Find what expression given there, with an EMPTY Replace with box, and it should do what you want.

      For the second part (your luxury wish), you can turn this:

      K1,CA,RA,
      

      into

      K1
      CA
      RA
      

      with this replace operation:

      Find what zone: ,
      Replace with zone: \r\n
      Search mode: Regular expression
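For anyone who would rather script this step outside Notepad++, here is a minimal Python sketch of the same replacement. Python and the function name `split_records` are assumptions made for illustration; the thread itself only uses the Replace dialog.

```python
import re


def split_records(text):
    """Replace every comma with a Windows line break.

    Mirrors the dialog settings above: Find what ",", Replace with "\\r\\n".
    """
    return re.sub(",", "\r\n", text)


# The sample line from the post above.
print(split_records("K1,CA,RA,"))
```

Note that in the full file every line already ends with a comma followed by a line break, so this replacement leaves an empty line after each original line. Those empty lines are duplicates of each other, so a later duplicate-line removal collapses them to one.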

        Saki Soulimenas
        last edited by Jan 17, 2018, 7:41 AM

        Hi Scott,
        thanks very much for your answer.
        I did find a solution like yours for removing duplicate lines.
        But as I said, I would like to remove duplicate records of my archives.

        I'll point out the duplicates of one archive from my example:
        AA,
        AA,
        HA,
        AA,HA,
        HA,
        RA,RB,RD,RE,RF,RG,RH,RI,Y1,Y2,Y3,
        RA,RB,RD,RE,RF,RG,RH,RI,RL,RK,X0,X1,Y1,Y2,Y3,
        AA,RA,RB,RD,RE,RF,RG,RH,RI,RK,Y1,Y2,Y3,YA,
        RA,RB,RD,RE,RF,RG,RH,RI,RK,Y1,Y2,Y3,
        CA,CB,EA,EB,EC,ED,PB,VA,
        CA,CB,AA,K1,EA,EB,EC,ED,VA,X0,X1,
        AA,CA,CB,RK,EA,EB,EC,ED,PB,VA,
        AA,CA,CB,RK,EA,EB,EC,ED,VA,X5,X6,
        FA,FB,
        K1,CA,RA,
        AA,FA,FB,K1,CA,RA,
        FA,FB,

        As you can see, the lines that contain the archive AA are not all the same.
        Some of the duplicates are not even in the first position on a line.

        So I would need an expression that finds duplicates of exactly 2 letters without considering the commas.
        That makes it a bit more difficult than the other problems.

        Thank you also for your other tip.
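One possible shape for such an expression, sketched in Python: match a two-character record plus its comma only when the same record appears again later in the text, and delete it. This keeps the last occurrence of each record. The pattern and the name `drop_earlier_duplicates` are assumptions for illustration, not something posted in this thread. The character class allows digits because the sample data contains records such as X0 and K1. It has only been checked with Python's `re` module; Notepad++ uses a different regex engine, so treat it as untested there.

```python
import re

# A two-character record followed by its comma, matched only if the same
# record (with its comma) occurs somewhere later in the text.
DUPLICATE = re.compile(r"\b([A-Z0-9]{2}),(?=[\s\S]*\b\1,)")


def drop_earlier_duplicates(text):
    """Delete every occurrence of a record except the last one."""
    return DUPLICATE.sub("", text)


sample = "AA,\nAA,HA,\nHA,\n"
print(drop_earlier_duplicates(sample))
```

The lookahead rescans the rest of the text for every record, which is fine for a file of this size but would be slow on very large inputs.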

          Saki Soulimenas
          last edited by Jan 17, 2018, 8:42 AM

          OK, problem solved… I needed to wake up first. :)
          I’ll use your comma replacer first, and then I can remove the duplicate lines.

          Perfect! Thank you very much.
          Have a nice day
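The two-step approach described here (split on commas, then remove duplicate lines) can be sketched in Python as follows. This is only an illustration under stated assumptions: the thread does both steps inside Notepad++, and the name `dedupe_records` is hypothetical. The sketch keeps the first occurrence of each record and drops the empty lines that the comma replacement leaves behind.

```python
def dedupe_records(text):
    """Return the unique records in order of first appearance."""
    # Step 1: the comma replacement puts every record on its own line.
    lines = text.replace(",", "\n").splitlines()
    # Step 2: skip empty lines, then keep only the first copy of each record.
    # dict.fromkeys preserves insertion order, so the original order survives.
    records = [line.strip() for line in lines if line.strip()]
    return list(dict.fromkeys(records))


# The last four lines of the sample file from the first post.
sample = "FA,FB,\nK1,CA,RA,\nAA,FA,FB,K1,CA,RA,\nFA,FB,\n"
print("\n".join(dedupe_records(sample)))
```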

          The Community of users of the Notepad++ text editor.