    regex: compare lines and find out different numbers

    • guy038

      Hello, Vasile Caraus and All,

      I am certainly missing something obvious, because the regexes to achieve what you want do not seem that difficult ;-))

      For instance, the regex :

      (?-s)^.{4}\((?!(22|21|13))..\).+ would match the whole contents of any line which has :

      • An opening round parenthesis, at position 5

      • Then a two-digit number, different from 22, 21 and 13

      • And, finally, an ending round parenthesis, at position 8
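As a quick check, here is a minimal Python sketch of that whole-line match. It assumes the two characters between the parentheses are consumed with `..`, as in the number-only regex, and it drops `(?-s)` because in Python's `re` module the dot already stops at line breaks by default. The sample lines and the name `LINE_RE` are made up for illustration.

```python
import re

# Opening parenthesis at position 5, two characters that are not 22, 21 or 13,
# closing parenthesis at position 8, then the rest of the line.
LINE_RE = re.compile(r"^.{4}\((?!(?:22|21|13))..\).+", re.MULTILINE)

sample = "\n".join([
    "abcd(22) default value",
    "abcd(57) unexpected value",
    "abcd(13) default value",
    "abcd(00) unexpected value",
])

# Collect the full text of every line that matched.
matches = [m.group(0) for m in LINE_RE.finditer(sample)]
print(matches)
```

Only the lines carrying 57 and 00 are reported, because the negative lookahead rejects 22 and 13 right after the opening parenthesis.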


      On the other hand, the regex :

      (?-s)^.{4}\((?!(22|21|13))\K..(?=\)) would match any two-digit number, enclosed in parentheses, which is different from the values 22, 21 and 13
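For anyone trying this outside Notepad++, a rough Python equivalent follows. Python's `re` module does not support `\K`, so this sketch captures the two characters in a group instead. The sample text and the name `NUMBER_RE` are assumptions for illustration only.

```python
import re

# No \K in Python's re: capture the two characters in group 1 instead.
NUMBER_RE = re.compile(r"^.{4}\((?!(?:22|21|13))(..)(?=\))", re.MULTILINE)

sample = "abcd(22) x\nabcd(57) y\nabcd(21) z\nabcd(00) w"

# With a single capturing group, findall returns the captured strings.
numbers = NUMBER_RE.findall(sample)
print(numbers)
```

The list contains only the numbers that are not among the three default values.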

      Best Regards,

      guy038

      • Vasile Caraus

        Hello Guy, but what would the formula be if I have something like this?

        Please see this link; Akismet does not let me post the code:

        https://regex101.com/r/vRXKWj/3/

        So the order of the numbers must be taken into account, and the regex should match only where a number differs from its default number.

        • guy038

          Hi, Vasile Caraus and All,

          I don’t understand exactly what you need ! Please provide a better explanation ! So, just one assumption :

          Given the text, below :

          
          # A BLOCK of 6 lines, with 6 DEFAULT values ( 22, 21, 13, 23, 24 and 15 )
          
          <li><a href="xxx.html" title="xxx">xxx (22)</a></li>
          <li><a href="yyy.html" title="yyy">yyy (21)</a></li>
          <li><a href="zzz.html" title="zzz">zzz (13)</a></li>
          <li><a href="xxx.html" title="xxx">xxx (23)</a></li>
          <li><a href="yyy.html" title="yyy">yyy (24)</a></li>
          <li><a href="zzz.html" title="zzz">zzz (15)</a></li>
          ...
          ...
          ...
          # Then an other BLOCK of 6 lines, downwards :
          ...
          <li><a href="ccc.html" title="ccc">ccc (22)</a></li>
          <li><a href="ddd.html" title="ddd">ddd (21)</a></li>
          <li><a href="eee.html" title="eee">eee (00)</a></li>  <==  A
          <li><a href="fff.html" title="fff">fff (23)</a></li>
          <li><a href="ggg.html" title="ggg">ggg (24)</a></li>
          <li><a href="hhh.html" title="hhh">hhh (57)</a></li>  <==  B
          ...
          ...
          ...
          And a last BLOCK of 6 lines, downwards :
          ...
          <li><a href="iii.html" title="iii">iii (20)</a></li>  <==  C
          <li><a href="jjj.html" title="jjj">jjj (21)</a></li>
          <li><a href="kkk.html" title="kkk">kkk (13)</a></li>
          <li><a href="lll.html" title="lll">lll (33)</a></li>  <==  D
          <li><a href="mmm.html" title="mmm">mmm (34)</a></li>  <==  E
          <li><a href="nnn.html" title="nnn">nnn (15)</a></li>
          ...
          ...
          ...
          

          You would like the 5 lines, from A to E, to be matched by the regex engine, because their numbers do not correspond to the default numbers at the same position in the block ! Wouldn’t you ?

          See you later

          Cheers,

          guy038

          • Vasile Caraus

            Yes, that’s right.

            • guy038

              Hi, @vasile-caraus,

              OK ! But what do you expect to happen once the 5 lines, from A to E, are found ?

              Do you want a regex S/R to replace the erroneous values of the second and third blocks with the corresponding default values of the first block of 6 lines ?

              BR

              guy038

              • Vasile Caraus

                If the numbers are different, the regex should match only those lines. Otherwise, there is nothing to find.

                • guy038

                  Hello, @vasile-caraus, and All,

                  I have already done numerous tests, but I’m still not satisfied ! Let’s carry on our discussion :

                  Assuming the text, below :

                  ....
                  
                  # A BLOCK of 6 lines, with 6 DEFAULT values ( 22, 21, 13, 23, 24 and 15 )
                  ...
                  <li><a href="xxx.html" title="xxx">xxx (22)</a></li>
                  <li><a href="yyy.html" title="yyy">yyy (21)</a></li>
                  <li><a href="zzz.html" title="zzz">zzz (13)</a></li>
                  <li><a href="xxx.html" title="xxx">xxx (23)</a></li>
                  <li><a href="yyy.html" title="yyy">yyy (24)</a></li>
                  <li><a href="zzz.html" title="zzz">zzz (15)</a></li>
                  ...
                  # Then, an other BLOCK of 6 lines, downwards :
                  ...
                  <li><a href="ccc.html" title="ccc">ccc (22)</a></li>
                  <li><a href="ddd.html" title="ddd">ddd (21)</a></li>
                  <li><a href="eee.html" title="eee">eee (00)</a></li>  <==  F
                  <li><a href="fff.html" title="fff">fff (23)</a></li>
                  <li><a href="ggg.html" title="ggg">ggg (24)</a></li>
                  <li><a href="hhh.html" title="hhh">hhh (21)</a></li>  <==  G
                  ...
                  

                  Do you want the regex to match the whole line contents for :

                  • Only case F, where the value is different from all of the 6 default values

                  • Both cases F and G, where G has the value 21, which corresponds to the second line of the default block above, and not the sixth !

                  BR

                  guy038

                  • Vasile Caraus

                    Both, F and G.

                    Something like this, with the default numbers: <li><a href=*.html" title=.*(?!\b(22|21|13|23|24|25\b).)* And if the numbers at F and G are not the same as my default numbers, the regex should match those lines.

                    • guy038

                      Hello, @vasile-caraus, and All;

                      Unfortunately, I could not find an automatic way, because you need both a condition on values and a condition on locations, which would preferably need a Python or Lua script
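To give an idea of what such a script could look like, here is a minimal Python sketch. It is not part of the work-around that follows. It assumes that every block is a run of consecutive `<li>` lines, that any other line ends the current block, and that the number always sits in the last pair of parentheses. The names `DEFAULTS`, `ITEM_RE` and `find_mismatches` are hypothetical.

```python
import re

DEFAULTS = ["22", "21", "13", "23", "24", "15"]  # default values, in block order

# Assumed line shape: <li> ... (NN)</a></li>
ITEM_RE = re.compile(r"^<li>.*\((\d+)\)</a></li>\s*$")


def find_mismatches(lines, defaults=DEFAULTS):
    """Return (line_number, found, expected) for each <li> line whose number
    differs from the default at the same position in its block."""
    mismatches = []
    position = 0  # index of the current <li> line inside its block
    for line_number, line in enumerate(lines, start=1):
        match = ITEM_RE.match(line)
        if match is None:
            position = 0  # any non-<li> line ends the current block
            continue
        expected = defaults[position % len(defaults)]
        if match.group(1) != expected:
            mismatches.append((line_number, match.group(1), expected))
        position += 1
    return mismatches


text = """\
<li><a href="ccc.html" title="ccc">ccc (22)</a></li>
<li><a href="ddd.html" title="ddd">ddd (21)</a></li>
<li><a href="eee.html" title="eee">eee (00)</a></li>
<li><a href="fff.html" title="fff">fff (23)</a></li>
<li><a href="ggg.html" title="ggg">ggg (24)</a></li>
<li><a href="hhh.html" title="hhh">hhh (21)</a></li>
..."""

result = find_mismatches(text.splitlines())
print(result)
```

Lines 3 and 6 are reported: the value 21 on line 6 is one of the defaults, but not the one expected at the sixth position.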

                      However, here is, below, a possible work-around, which produces correct results !

                      So, assuming the original sample text, below :

                       This is the CORRECT block of 6 lines, with 6 DEFAULT values ( 22, 21, 13, 23, 24 and 15 )
                      
                      <li><a href="xxx.html" title="xxx">xxx (22)</a></li>
                      <li><a href="yyy.html" title="yyy">yyy (21)</a></li>
                      <li><a href="zzz.html" title="zzz">zzz (13)</a></li>
                      <li><a href="xxx.html" title="xxx">xxx (23)</a></li>
                      <li><a href="yyy.html" title="yyy">yyy (24)</a></li>
                      <li><a href="zzz.html" title="zzz">zzz (15)</a></li>
                      ...
                       A 2nd BLOCK of 6 lines, downwards :
                      ...
                      <li><a href="ccc.html" title="ccc">ccc (22)</a></li>
                      <li><a href="ddd.html" title="ddd">ddd (21)</a></li>
                      <li><a href="eee.html" title="eee">eee (00)</a></li>
                      <li><a href="fff.html" title="fff">fff (23)</a></li>
                      <li><a href="ggg.html" title="ggg">ggg (24)</a></li>
                      <li><a href="hhh.html" title="hhh">hhh (57)</a></li>
                      ...
                       A 3rd BLOCK of 6 lines, downwards :
                      ...
                      <li><a href="iii.html" title="iii">iii (20)</a></li>
                      <li><a href="jjj.html" title="jjj">jjj (21)</a></li>
                      <li><a href="kkk.html" title="kkk">kkk (13)</a></li>
                      <li><a href="lll.html" title="lll">lll (21)</a></li>
                      <li><a href="mmm.html" title="mmm">mmm (34)</a></li>
                      <li><a href="nnn.html" title="nnn">nnn (15)</a></li>
                      ...
                       A 4th BLOCK of 6 lines, downwards :
                      ...
                      <li><a href="ooo.html" title="ooo">ooo (22)</a></li>
                      <li><a href="ppp.html" title="ppp">ppp (99)</a></li>
                      <li><a href="qqq.html" title="qqq">qqq (15)</a></li>
                      <li><a href="rrr.html" title="rrr">rrr (23)</a></li>
                      <li><a href="sss.html" title="sss">sss (24)</a></li>
                      <li><a href="ttt.html" title="ttt">ttt (15)</a></li>
                      ...
                      A 5th BLOCK of 6 lines, downwards :
                      ...
                      <li><a href="uuu.html" title="uuu">uuu (07)</a></li>
                      <li><a href="vvv.html" title="vvv">vvv (13)</a></li>
                      <li><a href="www.html" title="www">www (21)</a></li>
                      <li><a href="xxx.html" title="xxx">xxx (15)</a></li>
                      <li><a href="yyy.html" title="yyy">yyy (23)</a></li>
                      <li><a href="zzz.html" title="zzz">zzz (15)</a></li>
                      ...
                      

                      To begin with, I thought of prefixing each line of these 6-line blocks with its corresponding default value, using a regex S/R ( I used the # symbol as a separator, which, I hope, does not already exist in your file ! )

                      SEARCH (?-s)^(<li>.+\R)(<li>.+\R)(<li>.+\R)(<li>.+\R)(<li>.+\R)(<li>.+\R)

                      REPLACE 22#${1}21#${2}13#${3}23#${4}24#${5}15#${6}

                      So, we get the following text :

                       This is the CORRECT block of 6 lines, with 6 DEFAULT values ( 22, 21, 13, 23, 24 and 15 )
                      
                      22#<li><a href="xxx.html" title="xxx">xxx (22)</a></li>
                      21#<li><a href="yyy.html" title="yyy">yyy (21)</a></li>
                      13#<li><a href="zzz.html" title="zzz">zzz (13)</a></li>
                      23#<li><a href="xxx.html" title="xxx">xxx (23)</a></li>
                      24#<li><a href="yyy.html" title="yyy">yyy (24)</a></li>
                      15#<li><a href="zzz.html" title="zzz">zzz (15)</a></li>
                      ...
                       A 2nd BLOCK of 6 lines, downwards :
                      ...
                      22#<li><a href="ccc.html" title="ccc">ccc (22)</a></li>
                      21#<li><a href="ddd.html" title="ddd">ddd (21)</a></li>
                      13#<li><a href="eee.html" title="eee">eee (00)</a></li>
                      23#<li><a href="fff.html" title="fff">fff (23)</a></li>
                      24#<li><a href="ggg.html" title="ggg">ggg (24)</a></li>
                      15#<li><a href="hhh.html" title="hhh">hhh (57)</a></li>
                      ...
                       A 3rd BLOCK of 6 lines, downwards :
                      ...
                      22#<li><a href="iii.html" title="iii">iii (20)</a></li>
                      21#<li><a href="jjj.html" title="jjj">jjj (21)</a></li>
                      13#<li><a href="kkk.html" title="kkk">kkk (13)</a></li>
                      23#<li><a href="lll.html" title="lll">lll (21)</a></li>
                      24#<li><a href="mmm.html" title="mmm">mmm (34)</a></li>
                      15#<li><a href="nnn.html" title="nnn">nnn (15)</a></li>
                      ...
                       A 4th BLOCK of 6 lines, downwards :
                      ...
                      22#<li><a href="ooo.html" title="ooo">ooo (22)</a></li>
                      21#<li><a href="ppp.html" title="ppp">ppp (99)</a></li>
                      13#<li><a href="qqq.html" title="qqq">qqq (15)</a></li>
                      23#<li><a href="rrr.html" title="rrr">rrr (23)</a></li>
                      24#<li><a href="sss.html" title="sss">sss (24)</a></li>
                      15#<li><a href="ttt.html" title="ttt">ttt (15)</a></li>
                      ...
                      A 5th BLOCK of 6 lines, downwards :
                      ...
                      22#<li><a href="uuu.html" title="uuu">uuu (07)</a></li>
                      21#<li><a href="vvv.html" title="vvv">vvv (13)</a></li>
                      13#<li><a href="www.html" title="www">www (21)</a></li>
                      23#<li><a href="xxx.html" title="xxx">xxx (15)</a></li>
                      24#<li><a href="yyy.html" title="yyy">yyy (23)</a></li>
                      15#<li><a href="zzz.html" title="zzz">zzz (15)</a></li>
                      ...
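The same prefixing step can be sketched in Python, for readers who want to test it outside Notepad++. Python's `re` module has no `\R`, so this sketch assumes Unix `\n` line endings. The names `DEFAULTS`, `BLOCK_RE` and `prefix_defaults` are made up for illustration.

```python
import re

DEFAULTS = ["22", "21", "13", "23", "24", "15"]

# Six consecutive <li> lines, each captured together with its line break,
# mirroring the six groups of the SEARCH regex.
BLOCK_RE = re.compile(r"^(<li>.+\n)" + r"(<li>.+\n)" * 5, re.MULTILINE)


def prefix_defaults(text):
    """Prefix each line of every 6-line block with its default value and '#'."""
    def add_prefixes(match):
        return "".join(
            f"{value}#{line}" for value, line in zip(DEFAULTS, match.groups())
        )
    return BLOCK_RE.sub(add_prefixes, text)


block = "".join(
    f'<li><a href="p{i}.html" title="p{i}">p{i} ({n})</a></li>\n'
    for i, n in enumerate(["22", "21", "00", "23", "24", "57"])
)
prefixed = prefix_defaults("...\n" + block + "...\n")
print(prefixed)
```

A callback is used instead of a replacement string only to avoid spelling out the six group references by hand.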
                      

                      Now, it’s obvious that the simple regex ^(.+)#(?!.+\(\1\)).+ will match any line whose number between parentheses differs from the number at the beginning of the current line, before the # separator !
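A small Python check of this detection regex, run on a few prefixed lines taken from the sample above. The pattern is unchanged; only the `re.MULTILINE` flag and the name `MISMATCH_RE` are additions for this sketch.

```python
import re

# Group 1 is the default value before '#'; the negative lookahead rejects
# lines whose parenthesised number equals that default.
MISMATCH_RE = re.compile(r"^(.+)#(?!.+\(\1\)).+", re.MULTILINE)

prefixed = "\n".join([
    '22#<li><a href="ccc.html" title="ccc">ccc (22)</a></li>',
    '21#<li><a href="ddd.html" title="ddd">ddd (21)</a></li>',
    '13#<li><a href="eee.html" title="eee">eee (00)</a></li>',
    '15#<li><a href="hhh.html" title="hhh">hhh (57)</a></li>',
])

# Keep the full text of each line whose value differs from its prefix.
bad_lines = [m.group(0) for m in MISMATCH_RE.finditer(prefixed)]
for line in bad_lines:
    print(line)
```

Only the `eee` and `hhh` lines are printed, since their values (00 and 57) differ from the prefixes 13 and 15.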


                      If you prefer to replace all the erroneous values with the right ones, you may use the following regex S/R

                      SEARCH ^(.+)#(?!.+\(\1\))(.+\().+(\).+)|^.+#

                      REPLACE \2\1\3

                      And, of course, you’ll get all the blocks with identical default values between parentheses :

                       This is the CORRECT block of 6 lines, with 6 DEFAULT values ( 22, 21, 13, 23, 24 and 15 )
                      
                      <li><a href="xxx.html" title="xxx">xxx (22)</a></li>
                      <li><a href="yyy.html" title="yyy">yyy (21)</a></li>
                      <li><a href="zzz.html" title="zzz">zzz (13)</a></li>
                      <li><a href="xxx.html" title="xxx">xxx (23)</a></li>
                      <li><a href="yyy.html" title="yyy">yyy (24)</a></li>
                      <li><a href="zzz.html" title="zzz">zzz (15)</a></li>
                      ...
                       A 2nd BLOCK of 6 lines, downwards :
                      ...
                      <li><a href="ccc.html" title="ccc">ccc (22)</a></li>
                      <li><a href="ddd.html" title="ddd">ddd (21)</a></li>
                      <li><a href="eee.html" title="eee">eee (13)</a></li>
                      <li><a href="fff.html" title="fff">fff (23)</a></li>
                      <li><a href="ggg.html" title="ggg">ggg (24)</a></li>
                      <li><a href="hhh.html" title="hhh">hhh (15)</a></li>
                      ...
                       A 3rd BLOCK of 6 lines, downwards :
                      ...
                      <li><a href="iii.html" title="iii">iii (22)</a></li>
                      <li><a href="jjj.html" title="jjj">jjj (21)</a></li>
                      <li><a href="kkk.html" title="kkk">kkk (13)</a></li>
                      <li><a href="lll.html" title="lll">lll (23)</a></li>
                      <li><a href="mmm.html" title="mmm">mmm (24)</a></li>
                      <li><a href="nnn.html" title="nnn">nnn (15)</a></li>
                      ...
                       A 4th BLOCK of 6 lines, downwards :
                      ...
                      <li><a href="ooo.html" title="ooo">ooo (22)</a></li>
                      <li><a href="ppp.html" title="ppp">ppp (21)</a></li>
                      <li><a href="qqq.html" title="qqq">qqq (13)</a></li>
                      <li><a href="rrr.html" title="rrr">rrr (23)</a></li>
                      <li><a href="sss.html" title="sss">sss (24)</a></li>
                      <li><a href="ttt.html" title="ttt">ttt (15)</a></li>
                      ...
                      A 5th BLOCK of 6 lines, downwards :
                      ...
                      <li><a href="uuu.html" title="uuu">uuu (22)</a></li>
                      <li><a href="vvv.html" title="vvv">vvv (21)</a></li>
                      <li><a href="www.html" title="www">www (13)</a></li>
                      <li><a href="xxx.html" title="xxx">xxx (23)</a></li>
                      <li><a href="yyy.html" title="yyy">yyy (24)</a></li>
                      <li><a href="zzz.html" title="zzz">zzz (15)</a></li>
                      ...
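The replacement step can also be tried in Python with the same SEARCH and REPLACE expressions. This sketch relies on Python 3.5 or later, where groups that did not take part in the match are replaced with an empty string, so the second alternative simply deletes the prefix. The names `FIX_RE` and `restore_defaults` are hypothetical.

```python
import re

# First alternative: a mismatching line, split around the erroneous number.
# Second alternative: just the "NN#" prefix of a line that is already correct.
FIX_RE = re.compile(r"^(.+)#(?!.+\(\1\))(.+\().+(\).+)|^.+#", re.MULTILINE)


def restore_defaults(prefixed_text):
    """Write the default value into mismatching lines and drop every prefix."""
    return FIX_RE.sub(r"\2\1\3", prefixed_text)


prefixed = (
    '22#<li><a href="ccc.html" title="ccc">ccc (22)</a></li>\n'
    '13#<li><a href="eee.html" title="eee">eee (00)</a></li>\n'
    '...\n'
)
restored = restore_defaults(prefixed)
print(restored)
```

The line that was already correct only loses its prefix, the erroneous 00 becomes 13, and lines without a `#` are left alone.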
                      

                      Cheers,

                      guy038

                      • Vasile Caraus

                        Thanks Guy, your solution is OK, but complex. I just found another solution:

                        <li><a href=".*\.html" title=".*">.* (?:(?!\b(22|9|15|23|4|15)\b).)*<\/a><\/li>$

                        Check this out: https://regex101.com/r/vRXKWj/4/

                        The Community of users of the Notepad++ text editor.