    replace / move numbers from one row to another with regular expression (in parentheses)

    15 Posts 6 Posters 846 Views
    • Robin Cruise

      @Terry-R said in replace / move numbers from one row to another with regular expression (in parentheses):

      So from your example it appears that file 1 isn’t changed, only file 2.
      And I can see that file 2 lines have different wording. How do you determine which line in file 2 gets which number? Is it that the number in line 1 of file 1 gets copied to file 2 line 1? Or do you have some other rule used to determine which. To make a solution we need a rule to follow, something that can be translated into either a regular expression or some other function if a scripted solution.

      I have an idea. First, I will select all the numbers in parentheses (including the parentheses):

      SEARCH: .*?(\(.*?\)).*

      REPLACE BY: \1

      So, I will get:

      (22)
      (18)
      (23)
      
      (2)
      (12)
      (10)
      

      Now, all I have to do is copy the first 3 rows over the next 3 rows to get:

      (22)
      (18)
      (23)
      
      (22)
      (18)
      (23)
      

      The problem is, how do I copy the parentheses and numbers back to their place between tags?
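The extraction step described above can also be tried outside Notepad++. Here is a minimal Python sketch that applies the same search pattern with `\1` as the replacement. The sample lines and the function name `extract_parenthesised` are assumptions made for illustration, and Python's `re` module is not the regex engine Notepad++ uses, so this only approximates the editor's behaviour.

```python
import re

# Same idea as the SEARCH pattern in the post: keep only the first
# parenthesised group on each line. "." does not match a newline here,
# so the substitution works line by line on a multi-line string.
PAREN_RE = re.compile(r".*?(\(.*?\)).*")


def extract_parenthesised(text: str) -> str:
    """Return the text with each matching line reduced to its "(...)" part."""
    return PAREN_RE.sub(r"\1", text)


# Hypothetical sample data, modelled on the page-##.html lines in the thread.
sample = "\n".join([
    '<li><a href="page-1.html" title="Page 1">Page 1 (22)</a></li>',
    '<li><a href="page-2.html" title="Page 2">Page 2 (18)</a></li>',
    '<li><a href="page-3.html" title="Page 3">Page 3 (23)</a></li>',
])

print(extract_parenthesised(sample))
```

Lines that contain no parentheses are left unchanged by this sketch, which may or may not be what is wanted for the real files.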

      • Terry R

        @Robin-Cruise said in replace / move numbers from one row to another with regular expression (in parentheses):

        The problem is, how do I copy the parentheses and numbers back to their place between tags?

        Not sure where you are heading but you still haven’t answered my question.

        I’m going to take a punt since it does seem a bit obvious from the examples:

        1. both files contain the same number of lines and are in the same order, line 1 of file 1 corresponds with line 1 of file 2.
        2. I’d prefix each line in each file with an ascending number (1, 2, 3, ...). I’d also add (after the number) an “a” for file 1 and a “b” for file 2.
        3. I’d then combine both files and sort lexicographically.
        4. I’d use a regex to copy the number from #a line and replace the equivalent number in the #b line, at the same time removing the #a line (since we don’t need the file 1 line anymore).
        5. I’d then remove the prefix from the lines (#b) leaving the changed file 2 content.
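The five steps above could be sketched in Python roughly as follows. This is only one possible reading of the outline: the function names `tag_lines` and `copy_numbers`, the zero-padded prefix format, and the sample lines in the usage note are all assumptions, and the sketch assumes each relevant line holds exactly one `(number)`.

```python
import re


def tag_lines(lines, tag, width):
    """Step 2: prefix each line with a zero-padded line number and a file tag."""
    return [f"{i:0{width}d}{tag} {line}" for i, line in enumerate(lines, start=1)]


def copy_numbers(file1_lines, file2_lines):
    """Copy the (number) of each file 1 line into the same line of file 2."""
    # Step 1: the approach relies on both files having the same line count.
    if len(file1_lines) != len(file2_lines):
        raise ValueError("both files must have the same number of lines")
    if not file1_lines:
        return []

    # Step 3: combine and sort, so every "a" line sits directly above its "b" line.
    width = len(str(len(file1_lines)))
    merged = sorted(tag_lines(file1_lines, "a", width) + tag_lines(file2_lines, "b", width))
    text = "\n".join(merged)

    # Step 4: take the number from the "a" line, put it into the "b" line,
    # and drop the "a" line in the same substitution.
    pair = re.compile(
        r"^(\d+)a [^\n]*?\((\d+)\)[^\n]*\n"
        r"\1b ([^\n]*?)\(\d+\)([^\n]*)$",
        re.M,
    )
    text = pair.sub(r"\g<1>b \3(\2)\4", text)

    # Remove "a" lines that had no number, then (step 5) strip the "b" prefixes.
    text = re.sub(r"^\d+a [^\n]*\n", "", text, flags=re.M)
    text = re.sub(r"^\d+b ", "", text, flags=re.M)
    return text.split("\n")
```

For example, with the hypothetical inputs `["<li>Page 1 (22)</li>", "bla"]` and `["<li>Pagina 1 (2)</li>", "bla bla"]`, the function returns the file 2 wording with the file 1 number. The zero padding matters: without it, a lexicographic sort would place line 10 before line 2.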

        Terry

        • Robin Cruise

          OK, I made a regex, and it is kind of a step forward. I managed to move the numbers and parentheses from the first 3 rows to the next 3, but not in the correct order.

          SEARCH: (?s)(<li><a href=)(.*?)(\(\d+\))(<\/a><\/li>).*?\K(\w+)

          REPLACE BY: \3

          If I could do that, it’s a sign that it’s somehow possible, but I am not very good at regex. Maybe @guy038 will improve my regex. :)

          • guy038

            Hi, @robin-cruise, @terry-r and All,

            Sorry, but a lot of information is missing for a good understanding of your goal:

            • How many lines of the form <li><a href="page-##.html" title="Page ##">Page ## (##)</a></li> does your File 1.html file contain?

            • Are all these lines consecutive?

            • Even if some of these lines are consecutive, are there other similar sections containing this same type of line?

            • Does the File 2.html file contain the same number of lines <li><a href="page-##.html" title="Page ##">Page ## (##)</a></li> as the File 1.html file?

            • If the File 2.html file also contains some sections of these lines, is the layout identical between the two files?


            In short, could you provide a larger part of your files, to give a more precise idea of the changes needed?

            Best Regards,

            guy038

            • Robin Cruise

              • Both files, File 1.html and File 2.html, contain 40 lines. Both files have the same structure, except for the numbers in parentheses.

              • The numbers (in parentheses) are different in the two files, but I want them to be the same. Right now, they are not consecutive numbers, but random ones.

              • page-1.html" title=“Page 1”>Page 1 is a short version, just an example.

              The real pages are like: <li><a href="Love-Master-A-Manga-Volume.html title=“Love Master A Manga Volume”>Love Master A Manga Volume (22)</a></li>

              Basically, it’s a menu on a website translated into 2 languages, and the number in parentheses, e.g. (22), is the number of articles in that section. The numbers should be the same in both languages.

              • Terry R

                @Robin-Cruise said in replace / move numbers from one row to another with regular expression (in parentheses):

                both of them, File 1.html just like File 2.html contains 40 lines.

                Did you ever read my questions? Specifically I asked whether the same number of lines in each. Also is line 1 in file 1 equal to line 1 in file 2, thus the line 1 number (##) copied across to line 1 in file 2.

                If so then I have already provided the solution in words (just needs translating into code), which I wrote out for you. Since you seem to have a good idea on how to create regexes, did you not try to follow my instructions?

                Terry

                • astrosofista @Robin Cruise

                  @Robin-Cruise said in replace / move numbers from one row to another with regular expression (in parentheses):

                  • both of them, File 1.html just like File 2.html contains 40 lines. both files have the same structure, except the numbers in parentheses.
                    […]
                    The real pages are like: <li><a href="Love-Master-A-Manga-Volume.html title=“Love Master A Manga Volume”>Love Master A Manga Volume (22)</a></li>

                  Given these conditions, and if you are allowed to install the LuaScript plugin, there is a script that can select and copy the numbers in parentheses from one page and paste them into the other.

                  Please confirm and I will provide further details.

                  Take care and have fun!

                  • carypt

                    Sorry for my objection, but it looks easier to just change the wording of file 2 to make it fit file 1 and keep the numbers untouched. So file 1 would get replaced and translated by file 2, which is the right one concerning the numbers. idk

                    • Robin Cruise

                      this solved the problem.

                      In short, I copy all the text before the parentheses from File 1 into column A in Excel. Then I copy the parentheses with numbers from File 2 into column B, and put the rest after the parentheses into column C. Then I copy all the Excel columns back into File 2, replacing the original lines. Now the parentheses are identical.

                      Step 1 - Use regex to select all parentheses (with numbers), then copy them to Excel in column 2

                      SEARCH: .*?(\(.*?\)).* REPLACE BY: \1

                      Step 2 - Use regex to keep everything on each line before the parentheses:

                      SEARCH: \(.*\).* REPLACE BY: (leave empty)

                      Step 3 - Copy the resulting lines to the Excel file in column 1

                      Step 4 - Copy directly to column 3 of Excel what comes after the parentheses: </a></li>

                      or use regex to obtain this result </a></li> by deleting everything up to and including the round brackets: SEARCH: ^.*\) REPLACE BY: (leave empty)

                      Step 5 - Copy all excel content to a new notepad ++ file.

                      If there are too many empty spaces, search and replace 2 spaces with one space
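The spreadsheet steps above amount to splitting each line into three columns and recombining them. A rough Python sketch of that idea follows, with no spreadsheet involved. The names `split_columns` and `rebuild` are hypothetical, and the pattern uses `\(\d+\)` rather than the post's `\(.*?\)`, on the assumption that the parentheses of interest always hold digits.

```python
import re

# Three "columns": text before the parentheses, the parentheses with
# their number, and whatever follows on the same line.
ROW_RE = re.compile(r"^(.*?)(\(\d+\))(.*)$")


def split_columns(line):
    """Return (before, parens, after) for a line, or None if it has no (number)."""
    match = ROW_RE.match(line)
    return match.groups() if match else None


def rebuild(text_lines, number_lines):
    """Keep the wording of text_lines but take each (number) from number_lines."""
    if len(text_lines) != len(number_lines):
        raise ValueError("both inputs must have the same number of lines")
    result = []
    for text_line, number_line in zip(text_lines, number_lines):
        text_cols = split_columns(text_line)
        number_cols = split_columns(number_line)
        if text_cols is None or number_cols is None:
            # No number on one side: leave the line as it is.
            result.append(text_line)
        else:
            result.append(text_cols[0] + number_cols[1] + text_cols[2])
    return result
```

Because the three parts are joined back without separators, this avoids the extra tabs or spaces that a round trip through a spreadsheet can introduce.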

                      • guy038

                        Hello, @robin-cruise, @terry-R, @astrosofista, @carypt and All,

                        Sorry to be very late, as I have answered many posts recently!

                        Here is my method :

                        • Open your File 1.html and File 2.html files in N++

                        • At the end of the File 1.html contents, insert, for instance, a new line =====

                        • Append the File 2.html contents, right after that new line


                        Note that we’ll need two specific characters, which are not used yet in your HTML files :

                        • One char to separate the contents of the two files, in File 1.html. I chose the = sign. Hence the line of five = signs

                        • One char used by the regex S/R in order to mark the numbers between parentheses already processed. I chose the # character

                        • Of course, you may choose any character for these two specific chars. Just modify the regex, accordingly

                        • Preferably, avoid the true regex symbols    \    ^    $    .    |    ?    *    +    (    )    [    ]    {    }


                        • For instance, after merging the File 2.html contents into File 1.html, we would obtain this tiny text, with the ===== separation
                        <li><a href="Love-Master-A-Manga-Volume.html title=“Love Master A Manga Volume”>Love Master A Manga Volume (22)</a></li>
                        bla bla
                        bla bla
                        bla bla
                        bla bla
                        <li><a href="Love-Master-A-Manga-Volume.html title=“Love Master A Manga Volume”>Love Master A Manga Volume (18)</a></li>
                        bla bla
                        <li><a href="Love-Master-A-Manga-Volume.html title=“Love Master A Manga Volume”>Love Master A Manga Volume (23)</a></li>
                        =====
                        <li><a href="Love-Master-A-Manga-Volume.html title=“Love Master A Manga Volume”>Love Master A Manga Volume (2)</a></li>
                        bla bla
                        <li><a href="Love-Master-A-Manga-Volume.html title=“Love Master A Manga Volume”>Love Master A Manga Volume (12)</a></li>
                        bla bla
                        bla bla
                        bla bla
                        bla bla
                        <li><a href="Love-Master-A-Manga-Volume.html title=“Love Master A Manga Volume”>Love Master A Manga Volume (10)</a></li>
                        
                        • Move to the very beginning of File 1.html

                        • Open the Replace dialog ( Ctrl + H )

                          • SEARCH (?s)\((\d+)\)(.+=====.+?)\((\d+)\)|^=====.+|#(?!.*^===)

                          • REPLACE ?1\(\3#\)\2\(\3#\)

                          • Untick, if necessary, the Wrap around option

                          • Select the Regular expression search mode

                        • Now, keeping the Replace dialog opened, click on the Replace All button ( or preferably hit the Alt + A shortcut ) repeatedly, until the message Replace All: 0 occurrences were replaced from caret to end-of-file is displayed !

                        And you’ll get the expected File 1.html contents :

                        <li><a href="Love-Master-A-Manga-Volume.html title=“Love Master A Manga Volume”>Love Master A Manga Volume (2)</a></li>
                        bla bla
                        bla bla
                        bla bla
                        bla bla
                        <li><a href="Love-Master-A-Manga-Volume.html title=“Love Master A Manga Volume”>Love Master A Manga Volume (12)</a></li>
                        bla bla
                        <li><a href="Love-Master-A-Manga-Volume.html title=“Love Master A Manga Volume”>Love Master A Manga Volume (10)</a></li>
                        
                        • Save the new File 1.html contents, with all the updated numbers between parentheses !

                        Notes :

                        • For N Replace All operations processed in total:

                          • The first N - 2 operations:

                            • Replace the numbers of File 1.html with the corresponding numbers of File 2.html, located after the ===== line

                            • Add a # marker to the two numbers processed

                          • Operation N - 1 deletes everything from the ===== line to the very end of the file, in order to remove the temporarily appended part

                          • Operation N deletes all the remaining # markers in File 1.html

                        Best Regards,

                        guy038

                        • Robin Cruise

                          @guy038 said in replace / move numbers from one row to another with regular expression (in parentheses):

                          =====

                          brilliant. You really are very good @guy038

                          THANK YOU !

                          • Robin Cruise

                            @guy038 said in replace / move numbers from one row to another with regular expression (in parentheses):

                            ?1\(\3#\)\2\(\3#\)

                            By the way, in the replacement, what does ?1\(\3#\)\2\(\3#\) mean (step by step, please)?

                            • Alan Kilborn @Robin Cruise

                              @Robin-Cruise said in replace / move numbers from one row to another with regular expression (in parentheses):

                              by the way, on replace, what does it mean ?1\(\3#\)\2\(\3#\) (step by step, please) ?

                              It’s fairly “easy”: :-)

                              ?1 controls the rest of it: If capture group #1 was NOT matched, the replacement is “nothing” (aka deletion)

                              If capture group #1 WAS matched, then the replacement consists of:

                              • opening parens: (
                              • what was matched with capture group #3
                              • a literal #
                              • closing parens: )
                              • what was matched with capture group #2
                              • opening parens: (
                              • what was matched with capture group #3
                              • a literal #
                              • closing parens: )
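The breakdown above can be mirrored piece by piece in Python. Python's `re` module does not support the `?1` conditional in replacement strings, so in this sketch a function (hypothetically named `expand`) assembles the same nine pieces by hand; the two short test strings are made up for the illustration.

```python
import re

PATTERN = re.compile(
    r"\((\d+)\)(.+=====.+?)\((\d+)\)|^=====.+|#(?!.*^===)",
    re.S | re.M,
)


def expand(match):
    """Build the replacement the way the step-by-step list describes it."""
    if match.group(1) is None:
        return ""                  # group 1 not matched: delete the match
    pieces = [
        "(",                       # opening paren
        match.group(3),            # what capture group 3 matched
        "#",                       # a literal #
        ")",                       # closing paren
        match.group(2),            # what capture group 2 matched
        "(",                       # opening paren
        match.group(3),            # what capture group 3 matched
        "#",                       # a literal #
        ")",                       # closing paren
    ]
    return "".join(pieces)


# Group 1 matched: both numbers become the second one, each marked with #.
swapped = PATTERN.sub(expand, "x (22) mid ===== y (2) z")

# Group 1 not matched: the ===== line and everything after it is deleted.
trimmed = PATTERN.sub(expand, "keep\n=====\ntail")

print(swapped)
print(repr(trimmed))
```

Note that the value of group 1 itself never appears in the output; it only serves as the condition, while group 3 supplies the number written in both places.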