    Merge 2 text files with exact same line and removing duplicates

      Devin Rusty

      I have 2 files :

      FILE A :

      $ BEGIN STRING

      $ CONTEXT: Actors/1/description/ < UNTRANSLATED
      I walk through a number of battlefield, mercenary veteran who has survived. usually
      But good-natured, but once turn into berserk if Hajimare a fight.
      $ END STRING
      $ BEGIN STRING

      FILE B :

      $ BEGIN STRING
      数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
      温厚だが、ひとたび戦いが始まれば狂戦士と化す。
      $ CONTEXT: Actors/1/description/ < UNTRANSLATED

      $ END STRING

      How do I merge those 2 files and end up like this :

      Merged :

      $ BEGIN STRING
      数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
      温厚だが、ひとたび戦いが始まれば狂戦士と化す。
      $ CONTEXT: Actors/1/description/ < UNTRANSLATED
      I walk through a number of battlefield, mercenary veteran who has survived. usually
      But good-natured, but once turn into berserk if Hajimare a fight.
      $ END STRING

        guy038

        Hello, @devin-rusty, and All,

        It seems that the link between your two files is the line $ CONTEXT: Actors/1/description/ < UNTRANSLATED

        So, I’m going to use the same principle as the one used at the end of this post :

        https://notepad-plus-plus.org/community/topic/16446/is-there-a-way-to-hide-commands/13


        To test it, I created a sample of your File B, below, containing 3 records whose $ CONTEXT: lines differ only by the number 1, 2 or 3

        ----------------------- File B ----------------------------------
        
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        
        $ END STRING
        

        Note : The Japanese text is identical in these 3 records !

        Then, I created a sample of your File A, below, containing 3 different blocks $ CONTEXT:...........$ END STRING

        ----------------------------------- File A ---------------------------------------------
        $ BEGIN STRING
        
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        I walk through a number of battlefield, mercenary veteran who has survived. usually
        But good-natured, but once turn into berserk if Hajimare a fight.
        $ END STRING
        
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        It is a simple try
        with any text
        $ END STRING
        
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        Here is the last bunch
        of text to test my solution
        $ END STRING
        

        Note : I did not add the last line of your File A, as I supposed it was just the beginning of the next record !


        Now, here is the method used to solve your problem :

        • Paste all the File B contents into a new N++ tab

        • Add a new line of equal signs, as, for instance, =================

        • Paste all the File A contents, after this line

        => We end up with this text :

        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        
        $ END STRING
        ====================================================
        $ BEGIN STRING
        
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        I walk through a number of battlefield, mercenary veteran who has survived. usually
        But good-natured, but once turn into berserk if Hajimare a fight.
        $ END STRING
        
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        It is a simple try
        with any text
        $ END STRING
        
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        Here is the last bunch
        of text to test my solution
        $ END STRING
        

        Now, using the menu command Edit > Line Operations > Remove Empty Lines ( Containing Blank characters), we get rid of all the blank lines, which gives the text below :

        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        $ END STRING
        ====================================================
        $ BEGIN STRING
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        I walk through a number of battlefield, mercenary veteran who has survived. usually
        But good-natured, but once turn into berserk if Hajimare a fight.
        $ END STRING
        $ BEGIN STRING
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        It is a simple try
        with any text
        $ END STRING
        $ BEGIN STRING
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        Here is the last bunch
        of text to test my solution
        $ END STRING
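Outside Notepad++, the blank-line removal can be approximated with a few lines of Python. This is only a sketch: `remove_blank_lines` is a hypothetical helper, and it treats any line made only of whitespace as blank, which may be slightly broader than what the menu command does.

```python
def remove_blank_lines(text: str) -> str:
    """Drop every line that is empty or contains only whitespace."""
    kept = [line for line in text.splitlines() if line.strip()]
    # Re-join with LF and keep a final newline, unless nothing is left.
    return ("\n".join(kept) + "\n") if kept else ""
```

Removing the blank lines first is what lets the later regex assume that the translated text directly follows its $ CONTEXT: line.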
        
        • Finally, open the N++ Replace dialog ( Ctrl + H )

        • SEARCH (?-is)^(\$ CONTEXT:.+\R)(?=(?s).+\R\1(.+?)^\$ END STRING)|(?s)^=+.+

        • REPLACE \1\2

        • Set the Wrap around option

        • Select the Regular expression search mode

        • Click on the Replace All button

        Nice :-)) We get the expected text !

        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        I walk through a number of battlefield, mercenary veteran who has survived. usually
        But good-natured, but once turn into berserk if Hajimare a fight.
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        It is a simple try
        with any text
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        Here is the last bunch
        of text to test my solution
        $ END STRING
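As a cross-check of the expected text, the same result can be obtained without any regex, by indexing the File A blocks by their $ CONTEXT: line and then walking through File B. This is a different technique from the Notepad++ replacement above, and only a sketch: `merge_by_context` is a hypothetical name, and it assumes the simple layout of the samples, with one $ CONTEXT: line per record.

```python
def merge_by_context(file_b_text: str, file_a_text: str) -> str:
    """Insert each File A translation after the matching CONTEXT line of File B."""
    # Index File A: CONTEXT line -> non-blank lines that follow it, up to END STRING.
    translations = {}
    current = None
    for line in file_a_text.splitlines():
        if line.startswith("$ CONTEXT:"):
            current = line
            translations[current] = []
        elif line.startswith("$ END STRING"):
            current = None
        elif current is not None and line.strip():
            translations[current].append(line)

    # Walk File B, skipping blank lines and adding the translation after its key.
    out = []
    for line in file_b_text.splitlines():
        if not line.strip():
            continue
        out.append(line)
        if line.startswith("$ CONTEXT:"):
            out.extend(translations.get(line, []))
    return "\n".join(out) + "\n"
```

A $ CONTEXT: line of File B with no counterpart in File A is simply left as it is, which mirrors what the regex does when its look-ahead fails.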
        

        Notes : Overall, the search regex :

        • Matches every $ CONTEXT: line, with its EOL chars, in the File B part ( stored as group 1 ), ONLY IF an identical line is found further on, in the File A part, after the line of equal signs. The look-ahead also grabs all the text after that identical line, up to the nearest $ END STRING ( stored as group 2 )

        • When NO more $ CONTEXT: line can be matched in the File B part, the second alternative matches from the line of equal signs ======= till the very end of the file

        • In replacement, any complete $ CONTEXT: line, matched in the File B part, is rewritten as itself ( \1 ), followed by the block found in the File A part, after the identical $ CONTEXT: line ( \2 )

        • Finally, all the text starting from the ========= line is simply deleted as, this time, groups 1 and 2 are not defined !
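To try the same search and replacement outside Notepad++, here is a hedged Python adaptation. It is not the regex from the post verbatim: Python's `re` module has no `\R` and does not accept a `(?s)` switch in the middle of a pattern, so the sketch assumes LF line endings (`\n`) and uses scoped `(?s:...)` groups instead. `MERGE_RE` and `merge` are illustrative names.

```python
import re

# Adaptation of the Notepad++ regex for Python's re module.
# Assumptions: LF line endings, and blank lines already removed.
MERGE_RE = re.compile(
    r"^(\$ CONTEXT:.+\n)"          # group 1: a CONTEXT line, with its line break
    r"(?=(?s:.+)\n\1"              # look-ahead: the same line occurs again further on
    r"((?s:.+?))^\$ END STRING)"   # group 2: text up to the nearest END STRING
    r"|^=+(?s:.+)",                # or: the separator line and everything after it
    re.MULTILINE,
)


def merge(combined: str) -> str:
    """Apply the replacement \\1\\2 over the combined File B + File A text."""
    # Groups 1 and 2 are None when the second alternative matched,
    # so the separator and the whole File A part are replaced by nothing.
    return MERGE_RE.sub(lambda m: (m.group(1) or "") + (m.group(2) or ""), combined)
```

Because the first `.+` in the look-ahead is greedy, it finds the last identical $ CONTEXT: line, which is the one in the File A part. On large files this backtracking can be slow, so this is better suited to small samples.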

        Best Regards,

        guy038

          Devin Rusty

          @guy038 said:

          (?-is)^(\$ CONTEXT:.+\R)(?=(?s).+\R\1(.+?)^\$ END STRING)|(?s)^=+.+

          Hey, thank you for your reply. Unfortunately, when I followed all of the steps, it just deleted all the text under the ====== line. My bad for not providing the ‘real’ document. Here is the real document, by the way :

          > RPGMAKER TRANS PATCH FILE VERSION 3.2
          > BEGIN STRING
          エリック
          > CONTEXT: Actors/1/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
          温厚だが、ひとたび戦いが始まれば狂戦士と化す。
          > CONTEXT: Actors/1/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          銀の死神
          > CONTEXT: Actors/1/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          ナタリー
          > CONTEXT: Actors/2/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          暗殺拳の達人を祖父にもつ少女。幼少のころからその技の
          すべてを叩き込まれている格闘術のエキスパート。
          > CONTEXT: Actors/2/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          紅蓮の迅雷
          > CONTEXT: Actors/2/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          テレンス
          > CONTEXT: Actors/3/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          謀略により地位を剥奪された聖騎士。真の騎士道を極めるため
          各地をさまよい修練を重ねている。
          > CONTEXT: Actors/3/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          流浪の聖騎士
          > CONTEXT: Actors/3/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          アーネスト
          > CONTEXT: Actors/4/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          師匠の仇を探して旅を続ける剣士。剣に魔力を宿らせる技
          「魔法剣」を体得している。
          > CONTEXT: Actors/4/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          魔剣を継ぐ者
          > CONTEXT: Actors/4/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          リョーマ
          > CONTEXT: Actors/5/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          東国では無双の剛剣と称された、桜花一刀流の使い手。
          流れるような動きから繰り出される一閃は、重く、鋭い。
          > CONTEXT: Actors/5/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          暁の剛剣
          > CONTEXT: Actors/5/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          ブレンダ
          > CONTEXT: Actors/6/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          森の精霊に育てられた少女。自然を愛し、森の平穏を乱す者を
          許さない。都会での生活にちょっとだけ憧れている。
          > CONTEXT: Actors/6/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          深緑の護り手
          > CONTEXT: Actors/6/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          リック
          > CONTEXT: Actors/7/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          束縛されることが嫌いな自称義賊の青年。軽口ばかり叩くが
          仲間のためなら命も張れる熱血漢。
          > CONTEXT: Actors/7/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          見えざる疾風
          > CONTEXT: Actors/7/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          アリス
          > CONTEXT: Actors/8/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          神託により聖女となることを運命づけられた女性。慈愛に満ち
          その愛情は敵に対しても等しく与えられる。
          > CONTEXT: Actors/8/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          救世の聖女
          > CONTEXT: Actors/8/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          イザベル
          > CONTEXT: Actors/9/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          永きに渡り人類に恐怖を与えてきた魔女。転生術の失敗により
          記憶の大半を失っているが、キレると本性が出る。
          > CONTEXT: Actors/9/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          優雅なる悪夢
          > CONTEXT: Actors/9/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          ノア
          > CONTEXT: Actors/10/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          俗世との交わりを避け、山奥に隠れ住む賢者。凶星の正体を
          調べるため、伝説にある「最果ての書庫」を探す旅に出る。
          > CONTEXT: Actors/10/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          星を見る者
          > CONTEXT: Actors/10/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          クラリス
          > CONTEXT: Actors/15/name/ < UNTRANSLATED
          > CONTEXT: Actors/20/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          マリー
          > CONTEXT: Actors/16/name/ < UNTRANSLATED
          > CONTEXT: Actors/21/name/ < UNTRANSLATED
          
          > END STRING
          ============================================================
          > BEGIN STRING
          
          > CONTEXT: Actors / 1/name/ < UNTRANSLATED
          Eric
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/1/description/ < UNTRANSLATED
          I walk through a number of battlefield, mercenary veteran who has survived. usually
          But good-natured, but once turn into berserk if Hajimare a fight.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/1/nickname/ < UNTRANSLATED
          Death of silver
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/2/name/ < UNTRANSLATED
          Natalie
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/2/description/ < UNTRANSLATED
          The girl with the grandfather a master of assassination fist. Of the skills from childhood
          Expert of fighting surgery that has been hammered all.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/2/nickname/ < UNTRANSLATED
          Thunderclap of Guren
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/3/name/ < UNTRANSLATED
          Terence
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/3/description/ < UNTRANSLATED
          St. knight that has been stripped of his position by the conspiracy. In order to master the true chivalry
          It has repeatedly wander training the country.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/3/nickname/ < UNTRANSLATED
          Exile of the Holy Knight
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/4/name/ < UNTRANSLATED
          Ernest
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/4/description/ < UNTRANSLATED
          Swordsman to continue the journey looking for the revenge of the teacher. Technique to dwell magic to sword
          It has mastered the "magic sword".
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/4/nickname/ < UNTRANSLATED
          The Inheritors magic sword
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/5/name/ < UNTRANSLATED
          Ryoma
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/5/description/ < UNTRANSLATED
          In the eastern provinces it was called Tsuyoshi sword of Muso, cherry blossoms ittō-ryū consumer of.
          Issen fed from flowing motion are heavy, sharp.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/5/nickname/ < UNTRANSLATED
          Akatsuki of Tsuyoshiken
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/6/name/ < UNTRANSLATED
          Brenda
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/6/description/ < UNTRANSLATED
          Girl who was brought up in the spirit of the forest. Love nature, those who disturb the peace of the forest
          unforgivable. Are longing only a little to the life in the city.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/6/nickname/ < UNTRANSLATED
          Dark green be safety hand
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/7/name/ < UNTRANSLATED
          Rick
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/7/description/ < UNTRANSLATED
          Youth of hate self-styled gentleman thief is to be bound. Hit just joke but
          Life also Harel dashing if it is for the fellow.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/7/nickname/ < UNTRANSLATED
          Invisible Gale
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/8/name/ < UNTRANSLATED
          Alice
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/8/description/ < UNTRANSLATED
          Woman destined to be a saint by the oracle. Benevolent
          The love is given equally to the enemy.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/8/nickname/ < UNTRANSLATED
          Salvation of the saint
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/9/name/ < UNTRANSLATED
          Isabel
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/9/description/ < UNTRANSLATED
          Witch has given fear to mankind over the eternal. By the failure of the reincarnation surgery
          I have lost the majority of memory, but leave expires and nature.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/9/nickname/ < UNTRANSLATED
          Elegance Naru nightmare
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/10/name/ < UNTRANSLATED
          Noah
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/10/description/ < UNTRANSLATED
          Avoid fellowship with worldly, live hidden deep in the mountains wise man. The identity of the evil stars
          Investigate, go on a journey to find the "farthest reaches of the archive" in the legend.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/10/nickname/ < UNTRANSLATED
          Those who see the stars
          > END STRING
          
          > BEGIN STRING
          > CONTEXT: Actors/15/name/ < UNTRANSLATED
          Claris
          > CONTEXT: Actors/20/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          > CONTEXT: Actors/16/name/ < UNTRANSLATED
          Marie
          > CONTEXT: Actors/21/name/ < UNTRANSLATED
          
          > END STRING
          

          Hopefully you can help.

            Devin Rusty @guy038

            @guy038 Hey, sorry for my reply above. I can’t edit or delete it. It seems that all the steps you provided work really well. As for my case above, I replace all the > with $ because > kinda screw things up in Regular Expression. Thanks a lot.

              Scott Sumner @Devin Rusty

              @Devin-Rusty

              I replace all the > with $ because > kinda screw things up in Regular Expression

              I think you mean to say:

              I replace all the $ with > because $ kinda screw things up in Regular Expression

              If that’s truly what you meant, then yes, $ is a special character in regular expressions. You can still use it literally, but you have to write it as a combination of two characters (\$) instead of the single character $.
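To illustrate the point about escaping, here is a small Python sketch. Python's `re` module is used purely as an example engine (Notepad++ uses a different one, but `$` is an end-of-line anchor in both), and `end_line_pattern` is a hypothetical helper.

```python
import re

line = "$ END STRING"

# Unescaped, $ is an end-of-line anchor, so this cannot match the literal text.
unescaped = re.search(r"^$ END STRING", line)

# Escaped as \$, it matches a literal dollar sign.
escaped = re.search(r"^\$ END STRING", line)


def end_line_pattern(marker: str) -> re.Pattern:
    """Build the END STRING pattern for any marker, such as "$" or ">"."""
    # re.escape adds the backslash only where the character is special.
    return re.compile("^" + re.escape(marker) + " END STRING", re.MULTILINE)
```

With a helper like this, the same pattern can be built for files that use > as the marker and for files that use $, without editing the data first.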

              The Community of users of the Notepad++ text editor.