    Copy, search and replace between 2 HTML files

    • HienTwi

      Hi @guy038 and all,

      It definitely works perfectly with @guy038's smart solution. Many, many thanks for your solution, which saves me a lot of time. It would be really nice if you could explain the regex syntax when you have some free time!

      In addition, I want to split file A into 895 files based on “KOSMOS”. Could you please do me a further favor? For instance:

      file 1: from the very beginning of file A to the first KOSMOS, not including it.
      file 2: from the 1st KOSMOS to the 2nd KOSMOS (not including the 2nd).
      file 3, …, file 895: the same pattern as file 2. The last KOSMOS (the 895th) will be excluded.

      Bests,
      Kosmos

      • HienTwi @astrosofista

        @astrosofista Many thanks for your comments. The problem is solved with @guy038's solution.

        • astrosofista @HienTwi

          @HienTwi

          Good to know. Thank you for getting back to me.

          Best Regards.

          • guy038

            Hello, @hientwi, @astrosofista and All,

            I’m quite confused, because I don’t see exactly what the connection is between your previous goal and your new one.

            Indeed, once your file A has been modified with our previous process, it no longer contains any KOSMOS line, as they have all been replaced with a specific line from file B. So it would be more difficult to determine each section that has to be saved in the 895 files!

            On the other hand, if you decide to split the initial contents of file A into 895 files first, then you’ll have to replace the first KOSMOS line of each file with the appropriate line of file B, which seems more difficult than with my previous method!

            Could you please enlighten us?

            Best Regards,

            guy038

            • HienTwi @guy038

              Hi @guy038 and all,

              Sorry for confusing you and the others. I have another purpose, which is totally different from my previous question. That is, I have two copies of file A: one I want to split into multiple files based on “KOSMOS”, and the other is used for my previous question. They are two entirely separate questions.

              Best regards,
              Kosmos

              • guy038

                Hello, @hientwi, @astrosofista and All,

                Sorry for the late reply! OK, so these are two completely different tasks!

                Well, since you want to create files, regexes are not the right tool for such a task. Personally, I would use the gawk application. So, if you do not have this program yet:

                • Create a new folder

                • Download the gawk-5.0.1-w32-bin-zip archive from https://sourceforge.net/projects/ezwinports/files/

                • Double-click on the gawk-5.0.1-w32-bin-zip archive

                • Double-click on the bin folder

                • Extract only the 5 files gawk.exe, libgmp-10.dll, libmpfr-4.dll, libncurses5.dll and libreadline6.dll into the new folder

                • Copy your file A into that folder and rename it File_A.txt

                • With N++, just add a line KOSMOS at the very beginning of File_A.txt

                • Open a DOS cmd window

                • Type in and run the following command:

                  • gawk "BEGIN {n=0} $0!=\"KOSMOS\" {print > \"File_\"n\".txt\"} $0==\"KOSMOS\" {n++}" File_A.txt
                • Wait a few moments … …

                Et voilà ! You should see, in this new folder, 895 files from File_1.txt to File_895.txt ;-))
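
For readers who find the one-liner dense, the same idea can be spread over several lines with comments. This is only a sketch under a few assumptions: a POSIX-style shell (hence the single quotes instead of the escaped double quotes used under cmd), a plain `awk` on the PATH (gawk should behave the same here), and hypothetical names `Demo_A.txt` and `Demo_<n>.txt` so that no real file is overwritten. The `close()` call and the parentheses around the output name are additions that are not in the command above; they keep only one output file open at a time and make the name concatenation unambiguous in other awk implementations.

```shell
# Tiny sample that already starts with a KOSMOS line (hypothetical data).
printf '%s\n' KOSMOS a b KOSMOS c KOSMOS d e > Demo_A.txt

awk '
    BEGIN { n = 0 }                      # n = number of KOSMOS lines seen so far

    $0 == "KOSMOS" {                     # a marker line starts a new section
        close("Demo_" n ".txt")          # finish the previous output file, if any
        n++
        next                             # the marker itself is not written out
    }

    { print > ("Demo_" n ".txt") }       # every other line goes to the current file
' Demo_A.txt
```

With this sample, `Demo_1.txt` should hold `a` and `b`, `Demo_2.txt` should hold `c`, and `Demo_3.txt` should hold `d` and `e`. As in the original command, the KOSMOS lines themselves are dropped from the output files.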


                Another possibility would be:

                • With N++, just add a line KOSMOS, at the very beginning of File_A.txt

                • Change, in your File_A.txt, each KOSMOS line into a pure empty line, with the regex :

                  • SEARCH (?-i)^KOSMOS(?=\R)

                  • REPLACE Leave EMPTY

                • Then, in your DOS window, you would run the following command :

                  • gawk "BEGIN {n=0} NF {print > \"File_\"n\".txt\"} !NF {n++}" File_A.txt

                That’s all! Powerful, isn’t it?

                Remark: I assume that your file did not initially contain any truly empty lines! (You can search for them with the regex ^\R.)
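
Since the second method treats every empty line as a separator, it may be worth counting markers and empty lines before replacing anything. The following is a minimal sketch, assuming a POSIX-style shell and a plain `awk`; the file name `Check_A.txt` and the variable `report` are hypothetical. Note that `!NF` also matches lines that contain only spaces or tabs, which is the same test the split command relies on.

```shell
# Small sample with two KOSMOS lines and one stray empty line (hypothetical data).
printf '%s\n' KOSMOS a '' b KOSMOS c > Check_A.txt

report=$(awk '
    !NF            { blank++ }           # empty or whitespace-only line
    $0 == "KOSMOS" { marks++ }           # marker line
    END { printf "KOSMOS lines: %d, blank lines: %d\n", marks, blank }
' Check_A.txt)

echo "$report"
```

If the blank-line count is not zero on the untouched file, those lines would start extra output files once the KOSMOS lines have been turned into empty lines, so the first method is probably the safer choice in that case.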


                For more information, you can download the latest PDF manual (gawk v5.0) from https://www.gnu.org/software/gawk/manual/

                Best Regards

                guy038

                P.S. :

                To select each zone of your File_A.txt, from a KOSMOS line up to, but excluding, the next KOSMOS line, simply use the regex:

                SEARCH (?-i)(KOSMOS)?(?s).+?(?=^KOSMOS\R|\z)

                • HienTwi @guy038

                  Dear @guy038 and all,

                  I am so sorry for responding so late. It seems that everything can be solved with your help. Many thanks in advance; I will let you know how it goes later on.

                  Stay healthy and best regards,
                  Kosmos

                  • Kosmos Huynh @guy038

                    Dear @guy038, dear all

                    Today I tried your first solution (File_B.txt, which contains KOSMOS) and got the following error:
                    792ca86d-4ebc-4a8f-a63a-5d400ced3af3-image.png

                    The same happens with your second solution (File_A.txt with blank lines):
                    c09891c8-7465-4129-b716-d5e991427523-image.png

                    Could you please help me with this?

                    Many thanks in advance!
                    Bests,
                    Kosmos

                    • Kosmos Huynh @guy038

                      Dear @guy038 ,

                      I found the solution by correcting the quotation marks as follows:

                      gawk 'BEGIN {n=0} NF {print > "File_"n".txt"} !NF {n++}' File_A.txt
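
One way to sidestep shell quoting differences altogether is to keep the awk program in its own file and load it with the standard `-f` option. The sketch below assumes a POSIX-style shell for the here-document (under cmd, the same three program lines could simply be saved with a text editor); the names `split.awk`, `Blank_A.txt` and `Blank_<n>.txt` are hypothetical, and the `close()` call and parentheses are additions to the command above.

```shell
# Write the awk program to a file, so no shell quoting is involved at run time.
cat > split.awk <<'EOF'
BEGIN { n = 0 }
NF    { print > ("Blank_" n ".txt") }        # non-empty line: write to current file
!NF   { close("Blank_" n ".txt"); n++ }      # empty line: move on to the next file
EOF

# Sample in the "second method" layout: empty lines where KOSMOS used to be.
printf '%s\n' '' a b '' c > Blank_A.txt

awk -f split.awk Blank_A.txt
```

With this sample, `Blank_1.txt` should contain `a` and `b`, and `Blank_2.txt` should contain `c`.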

                      Best regards,
                      Kosmos.
