    How to copy html text content from a section of several pages to the section of other several different pages (with the tags of the text too)

    • guy038G
      guy038
      last edited by

      Hello, @robin-cruise, @peterjones and all,

      We still can simplify your problem !

      If I understand you properly, each of the 3,000 files, or so, located in a specific folder, contains one or several sections :

              <!-- * * * * * START HERE * * * * * -->
      ....
      ....
      ....
              <!-- * * * * * END HERE * * * * * -->
      

      which are useless and which should be replaced with this exact comment section, below :

              <!-- * * * * * START HERE * * * * * -->
      
              <p class="TATA"><em>At the mobile site I put as the header location in case the device isn't mobile? And then it executes the php code you gave stack..</em></p>
              <p class="MAMA">Simply check if the referrer is coming from within your site. If they are, they have already seen a page and have chosen where they want to go next</p>
      
              <!-- * * * * * END HERE * * * * * -->
      

      If this assumption is correct, here is a method which would still need a scripting plugin (Python or Lua), but which is doable from within Notepad++ !

      • First, do a backup of your 3,000 files ( Important )

      • Execute a Replace in Files operation, with the following regex S/R :

        • SEARCH (?s)^\h*\Q<!-- * * * * * START HERE * * * * * -->\E.+?\Q<!-- * * * * * END HERE * * * * * -->\E

        • REPLACE <!-- Replacement Point -->

      => This S/R will replace any multi-line section <!-- START HERE............END -->, in your 3,000 files, with a single line <!-- Replacement Point -->

      • Copy the section below, which must be inserted in all your files, to the clipboard with a simple Ctrl + C action
              <!-- * * * * * START HERE * * * * * -->
      
              <p class="TATA"><em>At the mobile site I put as the header location in case the device isn't mobile? And then it executes the php code you gave stack..</em></p>
              <p class="MAMA">Simply check if the referrer is coming from within your site. If they are, they have already seen a page and have chosen where they want to go next</p>
      
              <!-- * * * * * END HERE * * * * * -->
      
      • Now, with the help of a scripting language, the further steps are :

        • For each scanned file :

          • Bookmark all the lines <!-- Replacement Point -->

          • Perform a Search > Bookmark > Paste to (Replace) Bookmarked Lines action

      The last command should replace any bookmarked line, of each file, with the contents of the clipboard :-))
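
      For readers who would rather script the whole replacement outside Notepad++, the same idea can be sketched in plain Python. This is a swapped-in technique, not the bookmark method described above: the name replace_sections is hypothetical, and because Python's re module has no \h, the class [ \t] stands in for it.

```python
import re

START = "<!-- * * * * * START HERE * * * * * -->"
END = "<!-- * * * * * END HERE * * * * * -->"

# Same shape as the S/R above: optional leading blanks at the start of a line,
# the literal START marker, the shortest possible run of any characters
# (including line breaks), then the literal END marker.
SECTION = re.compile(
    r"(?ms)^[ \t]*" + re.escape(START) + r".+?" + re.escape(END)
)


def replace_sections(text: str, new_section: str) -> str:
    """Replace every START..END section in text with new_section.

    Hypothetical helper for illustration only.
    """
    # A function is used as the replacement so that backslashes inside
    # new_section are inserted literally instead of being interpreted.
    return SECTION.sub(lambda match: new_section, text)
```

      Applied to the text of each file in turn, this does in one pass what the Replacement Point line and the paste step do in two. As with the S/R, work on a backup first.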

      Best Regards,

      guy038

      • Robin CruiseR
        Robin Cruise
        last edited by Robin Cruise

        Good day, @guy038. First, the text content is not the same: it is different in every HTML page. That was just an example. Yes, both folders' files have the same <!-- * * * * * START HERE * * * * * --> and <!-- * * * * * END HERE * * * * * --> markers.

        I understand the regex, but where must I use it? There are 2 different folders: one with the old web design, and one with the new design (into which I have to add the text content). I must copy from one folder to the other, from one set of HTML files to another set of HTML files.

        • Robin CruiseR
          Robin Cruise
          last edited by Robin Cruise

          In fact, I have to change the design of a site, but I have to keep thousands of articles. I can’t copy every text article into the new template, page by page. I must use something quicker.

          • Terry RT
            Terry R
            last edited by

            @Robin-Cruise said in How to copy html text content from a section of several pages to the section of other several different pages (with the tags of the text too):

            I can’t copy every text article into the new template, page by page. I must use something quicker.

            This problem is so complex I think it would need multiple steps, each building on the previous one and possibly using a different process. Of course as @PeterJones states using a programming language would work, but you’d need to learn that and I suppose time is of the essence.

            My idea would likely build on some abilities you already know (or at least know of) and should be able to accomplish with little effort.

            The steps would be:

            1. Copy both folders elsewhere as it would be very important to do all this in offline copies and then test the results and proof read some of the files to confirm the results.
            2. For the first file which provides the text to be inserted into the second file use a regex to remove ALL but the lines that will be copied. This could be accomplished with a regex using the Find in Files function.
            3. Add what remains of the first file to the end of the second file. It may require an additional line to delimit the addition, say a line of hashes (######…) between the current file content and the additional new lines from file 1; this might help in the next step.
            4. Again using the Find in Files function, swap out the old data for the additional lines at the bottom of each file from step #3.
            5. Proof the new files to confirm data changed as required.
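
            As a rough illustration of what steps 2 to 4 achieve for one pair of files, here is a minimal Python sketch. It is not the Find in Files procedure itself: it works on two strings in memory instead of appending text after a line of hashes. The name transplant is hypothetical, and the sketch assumes each file has one marked section.

```python
import re

START = "<!-- * * * * * START HERE * * * * * -->"
END = "<!-- * * * * * END HERE * * * * * -->"

# Shortest span from the START marker to the END marker, across line breaks.
SECTION = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)


def transplant(donor_html: str, target_html: str) -> str:
    """Return target_html with its marked section replaced by the donor's.

    Hypothetical helper: assumes one START..END section per file.
    """
    found = SECTION.search(donor_html)
    if found is None:
        raise ValueError("donor has no START/END section")
    if SECTION.search(target_html) is None:
        raise ValueError("target has no START/END section")
    donor_section = found.group(0)
    # count=1: only the first marked section of the target is replaced.
    # The lambda keeps backslashes in the donor text literal.
    return SECTION.sub(lambda match: donor_section, target_html, count=1)
```

            Raising an error when either file lacks the markers is a deliberate choice here, so that a missing section shows up during the proofing in step 5 instead of passing silently.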

            Notice there is a big gap in all this: how do you determine the file name of the donor file and its replacement file in the new structure? You haven’t mentioned that at any point, and I think it is going to be the biggest hurdle, unless some naming convention was used to make it easier to pair the files, such as “design1file0001.html” and its pair in the new structure, “file0001structure2.html”. If something like this was used, then a sort and regex process within Notepad++ might get the 2 files paired relatively easily, and then enable you to create a “bat” (MS-DOS batch) file which would do step #3.
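
            To make the pairing idea concrete, here is a small Python sketch that maps a donor name to its partner under the purely illustrative convention used above (“design1file0001.html” paired with “file0001structure2.html”). The pattern and the name new_name_for are hypothetical; a real site would need its own rule.

```python
import re
from typing import Optional

# Illustrative convention only: "design1" + stem + ".html" in the old
# structure pairs with stem + "structure2.html" in the new one.
OLD_NAME = re.compile(r"^design1(file\d+)\.html$")


def new_name_for(old_name: str) -> Optional[str]:
    """Return the paired file name, or None if old_name does not fit."""
    match = OLD_NAME.match(old_name)
    if match is None:
        return None
    return match.group(1) + "structure2.html"
```

            Returning None for names that do not fit the convention leaves it to the caller to list the unpaired files for manual review.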

            Terry

            • Robin CruiseR
              Robin Cruise
              last edited by Robin Cruise

              @Terry-R yes, this was my solution from the beginning: delete everything between <body> and <!-- * * * * * START HERE * * * * * -->, and delete everything between <!-- * * * * * END HERE * * * * * --> and </body>, so as to keep the text content (and the meta tags at the beginning of the HTML).

              Then replace those two deleted sections with the format style (HTML code) of the new web template. I can use the TextCrawler software to do the Search and Replace on larger blocks of code.

              Yes, I wish there was a safer way than that. Regex would have been much better, if it could be used, even in more steps.

              • Terry RT
                Terry R
                last edited by

                @Robin-Cruise said in How to copy html text content from a section of several pages to the section of other several different pages (with the tags of the text too):

                Regex would have been much better, if it could be used, even in more steps.

                I think you are still missing the point: how do you pair the files? Regex has NO concept of files. It is tasked with editing or finding characters within other characters. In this case it is presented either with a tab (within NPP) or with text from a file if using the Find in Files function. It doesn’t know where the text came from, nor where it goes after the regex is finished. It is just a step in the process, which NPP handles from start to finish.

                Your biggest job is as I say, pairing the files together, the other steps are fairly simple to create.

                Terry

                • Robin CruiseR
                  Robin Cruise
                  last edited by Robin Cruise

                  Actually, @guy038 has a great idea with Bookmark, except that Notepad++ does not yet have the possibility of multiple bookmarks, nor of inserting them into the files of a specific folder. The names of the HTML files are identical in both folders; only the HTML body is different, and I must put the text where I indicated.

                  I could use copy bookmark in one folder files, and paste it into another folder.

                  • Terry RT
                    Terry R
                    last edited by

                    @Robin-Cruise said in How to copy html text content from a section of several pages to the section of other several different pages (with the tags of the text too):

                    I could use copy bookmark in one folder files, and paste it into another folder.

                    I think at this point your “original idea”, backed up by some input from this forum (my idea, which seems to correlate with yours, and the upvotes of it), should tell you that it is the RIGHT solution. Don’t go looking for things you know don’t exist, and hoping.

                    The idea presented is right, easy to understand and should be “safe” as you put it. As you have identical filenames in both structures my main concern has now evaporated (how to pair the files).

                    So, to get a “bat” file created, you need the filenames in 2 lists. You would use a “DIR” command at the command prompt with some parameters which leave the list in a bare state (hint /B) and possibly sorted (hint again /ON).
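
                    The two lists can also be built and matched without a “bat” file. The Python sketch below is an alternative to the DIR route, not a replacement for it: it collects bare, sorted file names from each folder and pairs the ones that are identical. The function names and the *.html pattern are assumptions, and name matching here is case-sensitive, which is a simplification on Windows.

```python
from pathlib import Path
from typing import List, Tuple


def bare_sorted_names(folder: Path, pattern: str = "*.html") -> List[str]:
    """File names only, sorted by name (roughly what a bare, sorted DIR gives)."""
    return sorted(p.name for p in folder.glob(pattern) if p.is_file())


def pair_by_name(
    old_dir: Path, new_dir: Path
) -> Tuple[List[Tuple[Path, Path]], List[str]]:
    """Pair files with identical names; also report old files with no partner."""
    old_names = bare_sorted_names(old_dir)
    new_names = set(bare_sorted_names(new_dir))
    pairs = [(old_dir / n, new_dir / n) for n in old_names if n in new_names]
    unpaired = [n for n in old_names if n not in new_names]
    return pairs, unpaired
```

                    The unpaired list is returned separately so that files without a partner can be checked by hand instead of being skipped silently.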

                    Terry

                    • Robin CruiseR
                      Robin Cruise
                      last edited by Robin Cruise

                      Maybe a new Multiple Bookmark feature should memorize the file names and the selected content in a temporary txt file before making a replacement in other files. It could then replace the content in order, by file name from A to Z. It could also skip the content of any file that does not have a paired name. Something like that.

                      • Terry RT
                        Terry R
                        last edited by

                        @Robin-Cruise said in How to copy html text content from a section of several pages to the section of other several different pages (with the tags of the text too):

                        Maybe a new Multiple Bookmark feature should memorize the file names and

                        I will say this once only. Forget what doesn’t exist. If you are serious about fixing your immediate problem, continuing to hope is pointless. You have the answer from the forum. We can help with portions such as helping create the regex to remove text not needed, and insert other text. We can also help with creating the BAT file using regex.

                        But if you continue down this road of “hoping” you will get nowhere and others here will also likely dismiss your requests as you don’t seem to be overly concerned about solving it either.

                        Terry

                        • andrecool-68A
                          andrecool-68
                          last edited by andrecool-68

                          Laying out a website in pure HTML with over 3000 pages is complete nonsense!
                          Use a CMS for the site)))

                          • Robin CruiseR
                            Robin Cruise
                            last edited by

                            THIS IS THE ANSWER !

                            A GREAT ANSWER for this problem, but using PowerShell in Windows. Very simple !!

                            https://superuser.com/questions/1620195/parsing-how-to-copy-html-text-content-from-a-section-of-several-pages-to-the-se
