    Merge 2 text files with exact same line and removing duplicates

    • Devin Rusty

      I have 2 files :

      FILE A :

      $ BEGIN STRING

      $ CONTEXT: Actors/1/description/ < UNTRANSLATED
      I walk through a number of battlefield, mercenary veteran who has survived. usually
      But good-natured, but once turn into berserk if Hajimare a fight.
      $ END STRING
      $ BEGIN STRING

      FILE B :

      $ BEGIN STRING
      数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
      温厚だが、ひとたび戦いが始まれば狂戦士と化す。
      $ CONTEXT: Actors/1/description/ < UNTRANSLATED

      $ END STRING

      How do I merge those 2 files so that I end up with this :

      Merged :

      $ BEGIN STRING
      数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
      温厚だが、ひとたび戦いが始まれば狂戦士と化す。
      $ CONTEXT: Actors/1/description/ < UNTRANSLATED
      I walk through a number of battlefield, mercenary veteran who has survived. usually
      But good-natured, but once turn into berserk if Hajimare a fight.
      $ END STRING

      • guy038

        Hello, @devin-rusty, and All,

        Seemingly, the link between your two files is the line $ CONTEXT: Actors/1/description/ < UNTRANSLATED

        So, I’m going to use the same principle as the one used at the end of this post :

        https://notepad-plus-plus.org/community/topic/16446/is-there-a-way-to-hide-commands/13


        To test it, I created a sample of your File B, below, containing 3 records whose $ CONTEXT: lines differ only by the number 1, 2 or 3

        ----------------------- File B ----------------------------------
        
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        
        $ END STRING
        

        Note : The Japanese text is identical in these 3 records !

        Then, I created a sample of your File A, below, containing 3 different blocks of the form $ CONTEXT: ... $ END STRING

        ----------------------------------- File A ---------------------------------------------
        $ BEGIN STRING
        
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        I walk through a number of battlefield, mercenary veteran who has survived. usually
        But good-natured, but once turn into berserk if Hajimare a fight.
        $ END STRING
        
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        It is a simple try
        with any text
        $ END STRING
        
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        Here is the last bunch
        of text to test my solution
        $ END STRING
        

        Note : I did not add the last line of your File A, as I supposed it was just the beginning of the next record !


        Now, here is the method used to solve your problem :

        • Paste the entire contents of File B into a new N++ tab

        • Add a separator line of equal signs, for instance =================

        • Paste the entire contents of File A after this line

        => We end up with this text :

        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        
        $ END STRING
        ====================================================
        $ BEGIN STRING
        
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        I walk through a number of battlefield, mercenary veteran who has survived. usually
        But good-natured, but once turn into berserk if Hajimare a fight.
        $ END STRING
        
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        It is a simple try
        with any text
        $ END STRING
        
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        Here is the last bunch
        of text to test my solution
        $ END STRING
        

        Now, using the menu command Edit > Line Operations > Remove Empty Lines (Containing Blank characters), we get rid of all the blank lines, which gives the text below :

        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        $ END STRING
        ====================================================
        $ BEGIN STRING
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        I walk through a number of battlefield, mercenary veteran who has survived. usually
        But good-natured, but once turn into berserk if Hajimare a fight.
        $ END STRING
        $ BEGIN STRING
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        It is a simple try
        with any text
        $ END STRING
        $ BEGIN STRING
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        Here is the last bunch
        of text to test my solution
        $ END STRING
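
For readers who would rather script the two preparation steps above (combining File B, a separator line and File A, then removing blank lines), here is a minimal Python sketch. The function names `combine` and `remove_blank_lines` are hypothetical, and the sketch assumes both files are already loaded as strings.

```python
def combine(file_b: str, file_a: str, separator: str = "=" * 20) -> str:
    """Return File B, then a line of equal signs, then File A."""
    return file_b.rstrip("\n") + "\n" + separator + "\n" + file_a


def remove_blank_lines(text: str) -> str:
    """Drop every line that is empty or contains only whitespace."""
    kept = [line for line in text.splitlines() if line.strip()]
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    file_b = "$ BEGIN STRING\nJP text\n$ CONTEXT: X\n\n$ END STRING\n"
    file_a = "$ BEGIN STRING\n\n$ CONTEXT: X\nEN text\n$ END STRING\n"
    print(remove_blank_lines(combine(file_b, file_a)))
```

The result is the kind of text that the Replace dialog step expects: no blank lines, with the File B part first and the File A part after the separator.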
        
        • Finally, open the N++ Replace dialog ( Ctrl + H )

        • SEARCH (?-is)^(\$ CONTEXT:.+\R)(?=(?s).+\R\1(.+?)^\$ END STRING)|(?s)^=+.+

        • REPLACE \1\2

        • Set the Wrap around option

        • Select the Regular expression search mode

        • Click on the Replace All button

        Nice :-)) We get the expected text !

        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/1/description/ < UNTRANSLATED
        I walk through a number of battlefield, mercenary veteran who has survived. usually
        But good-natured, but once turn into berserk if Hajimare a fight.
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/2/description/ < UNTRANSLATED
        It is a simple try
        with any text
        $ END STRING
        $ BEGIN STRING
        数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
        温厚だが、ひとたび戦いが始まれば狂戦士と化す。
        $ CONTEXT: Actors/3/description/ < UNTRANSLATED
        Here is the last bunch
        of text to test my solution
        $ END STRING
        

        Notes : Overall, the search regex works as follows :

        • It matches every $ CONTEXT: line, with its EOL chars, in the File B part ( stored as group 1 ), ONLY IF an identical line is found further on, in the File A part, after the line of equal signs. The look-ahead also grabs all the text after that identical line, up to the nearest $ END STRING ( stored as group 2 )

        • When NO more $ CONTEXT: line can be matched in the File B part, it matches from the line of equal signs ======= to the very end of the file

        • In the replacement, each complete $ CONTEXT: line found in the File B part is replaced by itself ( \1 ), followed by the block found in the File A part after the identical $ CONTEXT: line ( \2 )

        • Finally, all the text starting at the ========= line is simply deleted because, for that match, groups 1 and 2 are not defined !
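
The behaviour described in these notes can be tried outside Notepad++ with an adaptation of the regex for Python's `re` module. This is only a sketch under assumptions: Python has no `\R`, so `\n` line endings are assumed, and the mid-pattern `(?s)` is rewritten as scoped `(?s:...)` groups. The names `MERGE_RE` and `merge` are hypothetical.

```python
import re

# Adapted from the Notepad++ search regex (assumption: "\n" line endings).
MERGE_RE = re.compile(
    r"^(\$ CONTEXT:.+\n)"                        # group 1: a CONTEXT line and its newline
    r"(?=(?s:.+)\n\1((?s:.+?))^\$ END STRING)"   # same line further on; group 2: text up to END STRING
    r"|^=+(?s:.+)",                              # or: the separator line and everything after it
    re.MULTILINE,
)


def merge(combined: str) -> str:
    """Apply the equivalent of 'Replace All' with the replacement \\1\\2."""
    # Groups 1 and 2 are None for the separator alternative, hence the `or ""`.
    return MERGE_RE.sub(lambda m: (m.group(1) or "") + (m.group(2) or ""), combined)


if __name__ == "__main__":
    sample = (
        "$ BEGIN STRING\nJP text\n$ CONTEXT: Actors/1/description/ < UNTRANSLATED\n$ END STRING\n"
        "==========\n"
        "$ BEGIN STRING\n$ CONTEXT: Actors/1/description/ < UNTRANSLATED\nEN text\n$ END STRING\n"
    )
    print(merge(sample))
```

As in the Notepad++ version, the input is expected to be the combined text with blank lines already removed.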

        Best Regards,

        guy038

        • Devin Rusty

          @guy038 said:

          (?-is)^(\$ CONTEXT:.+\R)(?=(?s).+\R\1(.+?)^\$ END STRING)|(?s)^=+.+

          Hey, thank you for your reply. Unfortunately, when I followed all of the steps, it just deleted all the text under the ====== line. My bad for not providing the ‘real’ document. Here is the real document, btw :

          > RPGMAKER TRANS PATCH FILE VERSION 3.2
          > BEGIN STRING
          エリック
          > CONTEXT: Actors/1/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
          温厚だが、ひとたび戦いが始まれば狂戦士と化す。
          > CONTEXT: Actors/1/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          銀の死神
          > CONTEXT: Actors/1/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          ナタリー
          > CONTEXT: Actors/2/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          暗殺拳の達人を祖父にもつ少女。幼少のころからその技の
          すべてを叩き込まれている格闘術のエキスパート。
          > CONTEXT: Actors/2/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          紅蓮の迅雷
          > CONTEXT: Actors/2/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          テレンス
          > CONTEXT: Actors/3/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          謀略により地位を剥奪された聖騎士。真の騎士道を極めるため
          各地をさまよい修練を重ねている。
          > CONTEXT: Actors/3/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          流浪の聖騎士
          > CONTEXT: Actors/3/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          アーネスト
          > CONTEXT: Actors/4/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          師匠の仇を探して旅を続ける剣士。剣に魔力を宿らせる技
          「魔法剣」を体得している。
          > CONTEXT: Actors/4/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          魔剣を継ぐ者
          > CONTEXT: Actors/4/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          リョーマ
          > CONTEXT: Actors/5/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          東国では無双の剛剣と称された、桜花一刀流の使い手。
          流れるような動きから繰り出される一閃は、重く、鋭い。
          > CONTEXT: Actors/5/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          暁の剛剣
          > CONTEXT: Actors/5/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          ブレンダ
          > CONTEXT: Actors/6/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          森の精霊に育てられた少女。自然を愛し、森の平穏を乱す者を
          許さない。都会での生活にちょっとだけ憧れている。
          > CONTEXT: Actors/6/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          深緑の護り手
          > CONTEXT: Actors/6/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          リック
          > CONTEXT: Actors/7/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          束縛されることが嫌いな自称義賊の青年。軽口ばかり叩くが
          仲間のためなら命も張れる熱血漢。
          > CONTEXT: Actors/7/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          見えざる疾風
          > CONTEXT: Actors/7/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          アリス
          > CONTEXT: Actors/8/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          神託により聖女となることを運命づけられた女性。慈愛に満ち
          その愛情は敵に対しても等しく与えられる。
          > CONTEXT: Actors/8/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          救世の聖女
          > CONTEXT: Actors/8/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          イザベル
          > CONTEXT: Actors/9/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          永きに渡り人類に恐怖を与えてきた魔女。転生術の失敗により
          記憶の大半を失っているが、キレると本性が出る。
          > CONTEXT: Actors/9/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          優雅なる悪夢
          > CONTEXT: Actors/9/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          ノア
          > CONTEXT: Actors/10/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          俗世との交わりを避け、山奥に隠れ住む賢者。凶星の正体を
          調べるため、伝説にある「最果ての書庫」を探す旅に出る。
          > CONTEXT: Actors/10/description/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          星を見る者
          > CONTEXT: Actors/10/nickname/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          クラリス
          > CONTEXT: Actors/15/name/ < UNTRANSLATED
          > CONTEXT: Actors/20/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          マリー
          > CONTEXT: Actors/16/name/ < UNTRANSLATED
          > CONTEXT: Actors/21/name/ < UNTRANSLATED
          
          > END STRING
          ============================================================
          > BEGIN STRING
          
          > CONTEXT: Actors / 1/name/ < UNTRANSLATED
          Eric
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/1/description/ < UNTRANSLATED
          I walk through a number of battlefield, mercenary veteran who has survived. usually
          But good-natured, but once turn into berserk if Hajimare a fight.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/1/nickname/ < UNTRANSLATED
          Death of silver
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/2/name/ < UNTRANSLATED
          Natalie
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/2/description/ < UNTRANSLATED
          The girl with the grandfather a master of assassination fist. Of the skills from childhood
          Expert of fighting surgery that has been hammered all.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/2/nickname/ < UNTRANSLATED
          Thunderclap of Guren
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/3/name/ < UNTRANSLATED
          Terence
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/3/description/ < UNTRANSLATED
          St. knight that has been stripped of his position by the conspiracy. In order to master the true chivalry
          It has repeatedly wander training the country.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/3/nickname/ < UNTRANSLATED
          Exile of the Holy Knight
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/4/name/ < UNTRANSLATED
          Ernest
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/4/description/ < UNTRANSLATED
          Swordsman to continue the journey looking for the revenge of the teacher. Technique to dwell magic to sword
          It has mastered the "magic sword".
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/4/nickname/ < UNTRANSLATED
          The Inheritors magic sword
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/5/name/ < UNTRANSLATED
          Ryoma
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/5/description/ < UNTRANSLATED
          In the eastern provinces it was called Tsuyoshi sword of Muso, cherry blossoms ittō-ryū consumer of.
          Issen fed from flowing motion are heavy, sharp.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/5/nickname/ < UNTRANSLATED
          Akatsuki of Tsuyoshiken
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/6/name/ < UNTRANSLATED
          Brenda
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/6/description/ < UNTRANSLATED
          Girl who was brought up in the spirit of the forest. Love nature, those who disturb the peace of the forest
          unforgivable. Are longing only a little to the life in the city.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/6/nickname/ < UNTRANSLATED
          Dark green be safety hand
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/7/name/ < UNTRANSLATED
          Rick
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/7/description/ < UNTRANSLATED
          Youth of hate self-styled gentleman thief is to be bound. Hit just joke but
          Life also Harel dashing if it is for the fellow.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/7/nickname/ < UNTRANSLATED
          Invisible Gale
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/8/name/ < UNTRANSLATED
          Alice
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/8/description/ < UNTRANSLATED
          Woman destined to be a saint by the oracle. Benevolent
          The love is given equally to the enemy.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/8/nickname/ < UNTRANSLATED
          Salvation of the saint
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/9/name/ < UNTRANSLATED
          Isabel
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/9/description/ < UNTRANSLATED
          Witch has given fear to mankind over the eternal. By the failure of the reincarnation surgery
          I have lost the majority of memory, but leave expires and nature.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/9/nickname/ < UNTRANSLATED
          Elegance Naru nightmare
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/10/name/ < UNTRANSLATED
          Noah
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/10/description/ < UNTRANSLATED
          Avoid fellowship with worldly, live hidden deep in the mountains wise man. The identity of the evil stars
          Investigate, go on a journey to find the "farthest reaches of the archive" in the legend.
          > END STRING
          
          > BEGIN STRING
          
          > CONTEXT: Actors/10/nickname/ < UNTRANSLATED
          Those who see the stars
          > END STRING
          
          > BEGIN STRING
          > CONTEXT: Actors/15/name/ < UNTRANSLATED
          Claris
          > CONTEXT: Actors/20/name/ < UNTRANSLATED
          
          > END STRING
          
          > BEGIN STRING
          > CONTEXT: Actors/16/name/ < UNTRANSLATED
          Marie
          > CONTEXT: Actors/21/name/ < UNTRANSLATED
          
          > END STRING
          

          Hopefully you can help.

          • Devin Rusty @guy038

            @guy038 Hey, sorry for my reply above. I can’t edit or delete it. It seems that all the steps you provided work really well. As for my case above, I replaced all the > with $ because > kinda screws things up in regular expressions. Thanks a lot.
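
For files that keep the > marker, a line-by-line merge avoids swapping marker characters altogether. The sketch below is a different technique from the regex in this thread, not guy038's method: it builds a dictionary keyed by each CONTEXT line of File A, then copies File B and inserts the matching translation. The function name `merge_by_context` and its `marker` parameter are hypothetical.

```python
def merge_by_context(file_b: str, file_a: str, marker: str = ">") -> str:
    """Insert File A's translated lines after the identical CONTEXT lines of File B."""
    context = f"{marker} CONTEXT:"
    end = f"{marker} END STRING"

    # Pass 1: map each CONTEXT line of File A to the non-blank lines that follow it.
    translations = {}
    current = None
    for line in file_a.splitlines():
        if line.startswith(context):
            current = translations.setdefault(line, [])
        elif line.startswith(end):
            current = None
        elif current is not None and line.strip():
            current.append(line)

    # Pass 2: copy File B, adding the translation after each matching CONTEXT line.
    merged = []
    for line in file_b.splitlines():
        if not line.strip():
            continue  # blank lines are dropped, as in the Notepad++ method
        merged.append(line)
        if line.startswith(context):
            merged.extend(translations.get(line, []))
    return "\n".join(merged) + "\n"


if __name__ == "__main__":
    file_b = "> BEGIN STRING\nJP name\n> CONTEXT: Actors/1/name/ < UNTRANSLATED\n\n> END STRING\n"
    file_a = "> BEGIN STRING\n\n> CONTEXT: Actors/1/name/ < UNTRANSLATED\nEric\n> END STRING\n"
    print(merge_by_context(file_b, file_a))
```

The dictionary keys are compared exactly, so a CONTEXT line in File A that differs from its File B counterpart, even by a single space, leaves that record untranslated.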

            • Scott Sumner @Devin Rusty

              @Devin-Rusty

              I replace all the > with $ because > kinda screw things up in Regular Expression

              I think you mean to say:

              I replace all the $ with > because $ kinda screw things up in Regular Expression

              If that’s truly what you meant, then yes, $ is a special character in regular expressions. You can still use it literally, but you have to write it as a combination of two characters (\$) instead of the single character $.
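
The effect of escaping can be checked quickly in Python's `re` module, which treats `$` as an anchor in the same way. This is a small illustrative sketch; the variable names are arbitrary.

```python
import re

line = "$ END STRING"

# Unescaped, "$" is an end-of-line anchor, so the literal text is never matched.
unescaped = re.search(r"^$ END STRING", line)

# Escaped by hand, "\$" matches a literal dollar sign.
escaped = re.search(r"^\$ END STRING", line)

# re.escape produces an escaped form of the literal text automatically.
auto = re.search("^" + re.escape("$ END STRING"), line)

print(unescaped is None, escaped is not None, auto is not None)
```

The > character, by contrast, has no special meaning on its own in a regex, so it needs no escaping.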

              The Community of users of the Notepad++ text editor.