Merge 2 text files that share the exact same line, removing duplicates



  • I have 2 files :

    FILE A :

    $ BEGIN STRING

    $ CONTEXT: Actors/1/description/ < UNTRANSLATED
    I walk through a number of battlefield, mercenary veteran who has survived. usually
    But good-natured, but once turn into berserk if Hajimare a fight.
    $ END STRING
    $ BEGIN STRING

    FILE B :

    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/1/description/ < UNTRANSLATED

    $ END STRING

    How do I merge those 2 files and end up like this :

    Merged :

    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/1/description/ < UNTRANSLATED
    I walk through a number of battlefield, mercenary veteran who has survived. usually
    But good-natured, but once turn into berserk if Hajimare a fight.
    $ END STRING



  • Hello, @devin-rusty, and All,

    Seemingly, the link between your two files is the line $ CONTEXT: Actors/1/description/ < UNTRANSLATED

    So, I’m going to use the same principle as the one used at the end of this post :

    https://notepad-plus-plus.org/community/topic/16446/is-there-a-way-to-hide-commands/13


    To test it, I created a sample of your File B, below, containing 3 records whose $ CONTEXT: lines differ only by the number 1, 2 or 3

    ----------------------- File B ----------------------------------
    
    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/1/description/ < UNTRANSLATED
    
    $ END STRING
    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/2/description/ < UNTRANSLATED
    
    $ END STRING
    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/3/description/ < UNTRANSLATED
    
    $ END STRING
    

    Note : The Japanese text is identical in these 3 records !

    Then, I created a sample of your File A, below, containing 3 different blocks $ CONTEXT:...........$ END STRING

    ----------------------------------- File A ---------------------------------------------
    $ BEGIN STRING
    
    $ CONTEXT: Actors/1/description/ < UNTRANSLATED
    I walk through a number of battlefield, mercenary veteran who has survived. usually
    But good-natured, but once turn into berserk if Hajimare a fight.
    $ END STRING
    $ BEGIN STRING
    
    $ CONTEXT: Actors/2/description/ < UNTRANSLATED
    It is a simple try
    with any text
    $ END STRING
    $ BEGIN STRING
    
    $ CONTEXT: Actors/3/description/ < UNTRANSLATED
    Here is the last bunch
    of text to test my solution
    $ END STRING
    

    Note : I did not add the last line of your File A, as I suppose it was just the beginning of the next record !


    Now, here is the method used to solve your problem :

    • Paste all the File B contents into a new N++ tab

    • Add a new line of equal signs, for instance, =================

    • Paste all the File A contents, after this line

    => We end up with this text :

    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/1/description/ < UNTRANSLATED
    
    $ END STRING
    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/2/description/ < UNTRANSLATED
    
    $ END STRING
    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/3/description/ < UNTRANSLATED
    
    $ END STRING
    ====================================================
    $ BEGIN STRING
    
    $ CONTEXT: Actors/1/description/ < UNTRANSLATED
    I walk through a number of battlefield, mercenary veteran who has survived. usually
    But good-natured, but once turn into berserk if Hajimare a fight.
    $ END STRING
    $ BEGIN STRING
    
    $ CONTEXT: Actors/2/description/ < UNTRANSLATED
    It is a simple try
    with any text
    $ END STRING
    $ BEGIN STRING
    
    $ CONTEXT: Actors/3/description/ < UNTRANSLATED
    Here is the last bunch
    of text to test my solution
    $ END STRING
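
For anyone who prefers to build this combined text with a script rather than by pasting, here is a minimal Python sketch. The names `combine` and `SEPARATOR` are made up for illustration, and the only assumption is that the separator line cannot occur in either file.

```python
SEPARATOR = "=" * 40  # hypothetical separator: any line that occurs in neither file


def combine(file_b_text: str, file_a_text: str, separator: str = SEPARATOR) -> str:
    """Return the File B text, one separator line, then the File A text."""
    # Make sure File B ends with a newline so the separator sits on its own line.
    if not file_b_text.endswith("\n"):
        file_b_text += "\n"
    return file_b_text + separator + "\n" + file_a_text


combined = combine(
    "$ BEGIN STRING\n$ END STRING",      # stand-in for File B
    "$ BEGIN STRING\n$ END STRING\n",    # stand-in for File A
)
print(combined)
```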
    

    Now, using the menu command Edit > Line Operations > Remove Empty Lines (Containing Blank characters), we get rid of all the blank lines, giving the text below :

    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/1/description/ < UNTRANSLATED
    $ END STRING
    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/2/description/ < UNTRANSLATED
    $ END STRING
    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/3/description/ < UNTRANSLATED
    $ END STRING
    ====================================================
    $ BEGIN STRING
    $ CONTEXT: Actors/1/description/ < UNTRANSLATED
    I walk through a number of battlefield, mercenary veteran who has survived. usually
    But good-natured, but once turn into berserk if Hajimare a fight.
    $ END STRING
    $ BEGIN STRING
    $ CONTEXT: Actors/2/description/ < UNTRANSLATED
    It is a simple try
    with any text
    $ END STRING
    $ BEGIN STRING
    $ CONTEXT: Actors/3/description/ < UNTRANSLATED
    Here is the last bunch
    of text to test my solution
    $ END STRING
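
Outside Notepad++, the blank-line removal step can be approximated in a few lines of Python. This is only a sketch of the same idea, not what the menu command does internally: the function name `remove_blank_lines` is hypothetical, and line endings are normalized to `\n`.

```python
def remove_blank_lines(text: str) -> str:
    """Drop every line that is empty or contains only whitespace."""
    kept = [line for line in text.splitlines() if line.strip()]
    # Re-join with "\n" and keep a single trailing newline when anything is left.
    return ("\n".join(kept) + "\n") if kept else ""


sample = "$ BEGIN STRING\n    \n$ CONTEXT: Actors/1/description/ < UNTRANSLATED\n\n$ END STRING\n"
print(remove_blank_lines(sample))
```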
    
    • Finally, open the N++ Replace dialog ( Ctrl + H )

    • SEARCH (?-is)^(\$ CONTEXT:.+\R)(?=(?s).+\R\1(.+?)^\$ END STRING)|(?s)^=+.+

    • REPLACE \1\2

    • Set the Wrap around option

    • Select the Regular expression search mode

    • Click on the Replace All button

    Nice :-)) We get the expected text !

    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/1/description/ < UNTRANSLATED
    I walk through a number of battlefield, mercenary veteran who has survived. usually
    But good-natured, but once turn into berserk if Hajimare a fight.
    $ END STRING
    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/2/description/ < UNTRANSLATED
    It is a simple try
    with any text
    $ END STRING
    $ BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    $ CONTEXT: Actors/3/description/ < UNTRANSLATED
    Here is the last bunch
    of text to test my solution
    $ END STRING
    

    Notes : Globally, the search regex works as follows :

    • The first alternative matches every $ CONTEXT: line, with its EOL chars, in the File B part ( stored as group 1 ), ONLY IF an identical line is found further on, in the File A part, after the line of equal signs. The look-ahead also grabs all the text that follows this second line, up to the nearest $ END STRING ( stored as group 2 )

    • When NO more $ CONTEXT: line can be matched in the File B part, the second alternative matches from the line of equal signs ======= till the very end of the file

    • In replacement, any complete $ CONTEXT: line, found in the File B part, is rewritten as itself ( \1 ), followed by the block found in the File A part, after the identical $ CONTEXT: line ( \2 )

    • Then, all the text starting with the ========= line is simply deleted as, this time, groups 1 and 2 are not defined !
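
The same search and replace can be tried outside Notepad++. Below is a rough Python translation, offered as a sketch only: Python's `re` module has no `\R`, so `\n` is used instead (the text is assumed to use `\n` line endings), and the mid-pattern `(?s)` is written as scoped `(?s:...)` groups. The names `MERGE_RE`, `merge_combined` and `SAMPLE` are made up for illustration.

```python
import re

# Approximate Python translation of the Notepad++ search regex:
#   \R      -> \n        (assumes "\n" line endings)
#   (?s)... -> (?s:...)  (scoped "dot matches newline")
MERGE_RE = re.compile(
    r"^(\$ CONTEXT:.+\n)"                       # group 1: a CONTEXT line in the File B part
    r"(?=(?s:.+)\n\1((?s:.+?))^\$ END STRING)"  # group 2: text after the same line, further on
    r"|^=+(?s:.+)",                             # or: the separator line and all that follows
    re.MULTILINE,
)


def merge_combined(text: str) -> str:
    """Replace each match with group 1 followed by group 2."""
    # For the separator alternative both groups are None, so that match is deleted.
    return MERGE_RE.sub(lambda m: (m.group(1) or "") + (m.group(2) or ""), text)


SAMPLE = (
    "$ BEGIN STRING\n"
    "JP line 1\n"
    "$ CONTEXT: Actors/1/description/ < UNTRANSLATED\n"
    "$ END STRING\n"
    "$ BEGIN STRING\n"
    "JP line 2\n"
    "$ CONTEXT: Actors/2/description/ < UNTRANSLATED\n"
    "$ END STRING\n"
    "====================\n"
    "$ BEGIN STRING\n"
    "$ CONTEXT: Actors/1/description/ < UNTRANSLATED\n"
    "English 1\n"
    "$ END STRING\n"
    "$ BEGIN STRING\n"
    "$ CONTEXT: Actors/2/description/ < UNTRANSLATED\n"
    "English 2a\n"
    "English 2b\n"
    "$ END STRING\n"
)

merged = merge_combined(SAMPLE)
print(merged)
```

As in the Notepad++ version, the text must already be free of blank lines and must contain the File B part, the line of equal signs, then the File A part, in that order.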

    Best Regards,

    guy038



  • @guy038 said:

    (?-is)^(\$ CONTEXT:.+\R)(?=(?s).+\R\1(.+?)^\$ END STRING)|(?s)^=+.+

    Hey, thank you for your reply. Unfortunately, when I did all of the steps, it just deleted all the text under the ====== line. My bad for not providing the ‘real’ document. Here is the real document, btw :

    > RPGMAKER TRANS PATCH FILE VERSION 3.2
    > BEGIN STRING
    エリック
    > CONTEXT: Actors/1/name/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    数々の戦場を渡り歩き、生き延びてきた歴戦の傭兵。普段は
    温厚だが、ひとたび戦いが始まれば狂戦士と化す。
    > CONTEXT: Actors/1/description/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    銀の死神
    > CONTEXT: Actors/1/nickname/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    ナタリー
    > CONTEXT: Actors/2/name/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    暗殺拳の達人を祖父にもつ少女。幼少のころからその技の
    すべてを叩き込まれている格闘術のエキスパート。
    > CONTEXT: Actors/2/description/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    紅蓮の迅雷
    > CONTEXT: Actors/2/nickname/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    テレンス
    > CONTEXT: Actors/3/name/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    謀略により地位を剥奪された聖騎士。真の騎士道を極めるため
    各地をさまよい修練を重ねている。
    > CONTEXT: Actors/3/description/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    流浪の聖騎士
    > CONTEXT: Actors/3/nickname/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    アーネスト
    > CONTEXT: Actors/4/name/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    師匠の仇を探して旅を続ける剣士。剣に魔力を宿らせる技
    「魔法剣」を体得している。
    > CONTEXT: Actors/4/description/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    魔剣を継ぐ者
    > CONTEXT: Actors/4/nickname/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    リョーマ
    > CONTEXT: Actors/5/name/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    東国では無双の剛剣と称された、桜花一刀流の使い手。
    流れるような動きから繰り出される一閃は、重く、鋭い。
    > CONTEXT: Actors/5/description/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    暁の剛剣
    > CONTEXT: Actors/5/nickname/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    ブレンダ
    > CONTEXT: Actors/6/name/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    森の精霊に育てられた少女。自然を愛し、森の平穏を乱す者を
    許さない。都会での生活にちょっとだけ憧れている。
    > CONTEXT: Actors/6/description/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    深緑の護り手
    > CONTEXT: Actors/6/nickname/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    リック
    > CONTEXT: Actors/7/name/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    束縛されることが嫌いな自称義賊の青年。軽口ばかり叩くが
    仲間のためなら命も張れる熱血漢。
    > CONTEXT: Actors/7/description/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    見えざる疾風
    > CONTEXT: Actors/7/nickname/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    アリス
    > CONTEXT: Actors/8/name/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    神託により聖女となることを運命づけられた女性。慈愛に満ち
    その愛情は敵に対しても等しく与えられる。
    > CONTEXT: Actors/8/description/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    救世の聖女
    > CONTEXT: Actors/8/nickname/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    イザベル
    > CONTEXT: Actors/9/name/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    永きに渡り人類に恐怖を与えてきた魔女。転生術の失敗により
    記憶の大半を失っているが、キレると本性が出る。
    > CONTEXT: Actors/9/description/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    優雅なる悪夢
    > CONTEXT: Actors/9/nickname/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    ノア
    > CONTEXT: Actors/10/name/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    俗世との交わりを避け、山奥に隠れ住む賢者。凶星の正体を
    調べるため、伝説にある「最果ての書庫」を探す旅に出る。
    > CONTEXT: Actors/10/description/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    星を見る者
    > CONTEXT: Actors/10/nickname/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    クラリス
    > CONTEXT: Actors/15/name/ < UNTRANSLATED
    > CONTEXT: Actors/20/name/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    マリー
    > CONTEXT: Actors/16/name/ < UNTRANSLATED
    > CONTEXT: Actors/21/name/ < UNTRANSLATED
    
    > END STRING
    ============================================================
    > BEGIN STRING
    
    > CONTEXT: Actors / 1/name/ < UNTRANSLATED
    Eric
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/1/description/ < UNTRANSLATED
    I walk through a number of battlefield, mercenary veteran who has survived. usually
    But good-natured, but once turn into berserk if Hajimare a fight.
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/1/nickname/ < UNTRANSLATED
    Death of silver
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/2/name/ < UNTRANSLATED
    Natalie
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/2/description/ < UNTRANSLATED
    The girl with the grandfather a master of assassination fist. Of the skills from childhood
    Expert of fighting surgery that has been hammered all.
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/2/nickname/ < UNTRANSLATED
    Thunderclap of Guren
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/3/name/ < UNTRANSLATED
    Terence
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/3/description/ < UNTRANSLATED
    St. knight that has been stripped of his position by the conspiracy. In order to master the true chivalry
    It has repeatedly wander training the country.
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/3/nickname/ < UNTRANSLATED
    Exile of the Holy Knight
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/4/name/ < UNTRANSLATED
    Ernest
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/4/description/ < UNTRANSLATED
    Swordsman to continue the journey looking for the revenge of the teacher. Technique to dwell magic to sword
    It has mastered the "magic sword".
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/4/nickname/ < UNTRANSLATED
    The Inheritors magic sword
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/5/name/ < UNTRANSLATED
    Ryoma
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/5/description/ < UNTRANSLATED
    In the eastern provinces it was called Tsuyoshi sword of Muso, cherry blossoms ittō-ryū consumer of.
    Issen fed from flowing motion are heavy, sharp.
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/5/nickname/ < UNTRANSLATED
    Akatsuki of Tsuyoshiken
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/6/name/ < UNTRANSLATED
    Brenda
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/6/description/ < UNTRANSLATED
    Girl who was brought up in the spirit of the forest. Love nature, those who disturb the peace of the forest
    unforgivable. Are longing only a little to the life in the city.
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/6/nickname/ < UNTRANSLATED
    Dark green be safety hand
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/7/name/ < UNTRANSLATED
    Rick
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/7/description/ < UNTRANSLATED
    Youth of hate self-styled gentleman thief is to be bound. Hit just joke but
    Life also Harel dashing if it is for the fellow.
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/7/nickname/ < UNTRANSLATED
    Invisible Gale
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/8/name/ < UNTRANSLATED
    Alice
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/8/description/ < UNTRANSLATED
    Woman destined to be a saint by the oracle. Benevolent
    The love is given equally to the enemy.
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/8/nickname/ < UNTRANSLATED
    Salvation of the saint
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/9/name/ < UNTRANSLATED
    Isabel
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/9/description/ < UNTRANSLATED
    Witch has given fear to mankind over the eternal. By the failure of the reincarnation surgery
    I have lost the majority of memory, but leave expires and nature.
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/9/nickname/ < UNTRANSLATED
    Elegance Naru nightmare
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/10/name/ < UNTRANSLATED
    Noah
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/10/description/ < UNTRANSLATED
    Avoid fellowship with worldly, live hidden deep in the mountains wise man. The identity of the evil stars
    Investigate, go on a journey to find the "farthest reaches of the archive" in the legend.
    > END STRING
    
    > BEGIN STRING
    
    > CONTEXT: Actors/10/nickname/ < UNTRANSLATED
    Those who see the stars
    > END STRING
    
    > BEGIN STRING
    > CONTEXT: Actors/15/name/ < UNTRANSLATED
    Claris
    > CONTEXT: Actors/20/name/ < UNTRANSLATED
    
    > END STRING
    
    > BEGIN STRING
    > CONTEXT: Actors/16/name/ < UNTRANSLATED
    Marie
    > CONTEXT: Actors/21/name/ < UNTRANSLATED
    
    > END STRING
    

    Hopefully you can help.



  • @guy038 Hey, sorry for my reply above. I can’t edit or delete it. It seems all the steps you provided work really well. As for my case above, I replace all the > with $ because > kinda screw things up in Regular Expression. Thanks a lot.



  • @Devin-Rusty

    I replace all the > with $ because > kinda screw things up in Regular Expression

    I think you mean to say:

    I replace all the $ with > because $ kinda screw things up in Regular Expression

    If that’s truly what you meant, then yes, $ is a special character to regular expressions. You can still use it literally, but you have to do it as a combination of two characters (\$) instead of the single character $.
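
To see the difference between `$` and `\$`, here is a small Python sketch (Python's `re` treats `$` as an end-of-line anchor in the same way). The helper `context_pattern` is hypothetical. It shows one way to build the CONTEXT pattern for any literal marker, `$` or `>`, so that the file itself would not need to be edited.

```python
import re

line = "$ CONTEXT: Actors/1/name/ < UNTRANSLATED"

# Unescaped, $ is an end-of-line anchor, so this never matches the literal "$".
unescaped_hit = re.search(r"^$ CONTEXT:", line)

# Escaped, \$ matches the literal dollar sign.
escaped_hit = re.search(r"^\$ CONTEXT:", line)


def context_pattern(marker: str) -> re.Pattern:
    """Build a CONTEXT-line pattern for a literal marker such as "$" or ">"."""
    # re.escape adds the backslash wherever the marker needs one.
    return re.compile("^" + re.escape(marker) + r" CONTEXT:.+$", re.MULTILINE)


print(unescaped_hit)             # no match object
print(escaped_hit is not None)   # a match was found
print(context_pattern(">").search("> CONTEXT: Actors/1/name/ < UNTRANSLATED") is not None)
```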