
    Newbie needs helps

    5 Posts 3 Posters 5.2k Views
    • Daniel May

      I’m searching a document in which every line ends with “1.4,1,1)” (without the quotation marks). Out of 20,000 lines, one is missing the closing parenthesis. Does anyone know how I can get NP++ to find the line with the missing parenthesis?

      Thanks in advance,

      Dan

      • jonandr

        Open find, check “Regular expression”, search for this:

        1\.4,1,1[^\)]$
        

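        The dot has a special meaning in regular expressions, so it must be escaped with a backslash. The same goes for parentheses, although inside a character class the backslash is optional, so [^\)] and [^)] are equivalent.
        [^)] is a so-called negated character class and means “any character except a right parenthesis”.
        $ means “match at the end of the line”.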
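
For anyone doing the same check outside Notepad++, here is a minimal Python sketch. It takes the opposite angle from the regex above: rather than looking for a wrong character after 1.4,1,1, it lists every line that does not end with 1.4,1,1). The function name and the sample lines are made up for illustration; only the expected ending comes from the question.

```python
import re

# Every good line is expected to end with the literal text "1.4,1,1)".
# The dot and the closing parenthesis are escaped, as explained above.
GOOD_ENDING = re.compile(r"1\.4,1,1\)$")

def lines_with_bad_ending(lines):
    """Return (line_number, text) pairs for lines that do not end with 1.4,1,1)."""
    bad = []
    for number, text in enumerate(lines, start=1):
        stripped = text.rstrip("\r\n")  # ignore the line break, if any
        if not GOOD_ENDING.search(stripped):
            bad.append((number, stripped))
    return bad

# Hypothetical sample data: the second line lacks the closing parenthesis.
sample = [
    "g(a,xyh(1,2,3),1.4,1,1)\r\n",
    "g(a,xyh(4,5,6),1.4,1,1\r\n",
    "g(a,xyh(7,8,9),1.4,1,1)\r\n",
]
print(lines_with_bad_ending(sample))
```

Line numbers start at 1 so they can be compared directly with the line numbers shown in the editor.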

        • Daniel May

          Wow, thanks for the fast response. Unfortunately, we’ve found that the missing parenthesis is not in that specific string of characters but somewhere else in the file, and now I’ve really gone cross-eyed.

          Here’s a snippet from the file:

          g(ccreekm1,xyh(9358.6227238,9897.5418358,673.91697223),dt(3,23,7,32,52,45),l3grd,xyh(-0.58809487458,0.80686194848,0.055840975887),1.4,1,1)
          g(ccreekm1,xyh(9357.6569034,9898.7712656,673.98295962),dt(3,23,7,32,53,48),l3grd,xyh(-0.59079549957,0.80517318715,0.051544314686),1.4,1,1)
          g(ccreekm1,xyh(9356.7105651,9900.0235405,674.01411243),dt(3,23,7,32,54,49),l3grd,xyh(-0.60135170686,0.79870482939,0.021135755628),1.4,1,1)
          g(ccreekm1,xyh(9355.7366367,9901.5794355,674.00311243),dt(3,23,7,32,55,54),l3grd,xyh(-0.53407133705,0.84541817075,-0.0059936220705),1.4,1,1)
          g(ccreekm1,xyh(9355.0445335,9903.0878211,674.04228009),dt(3,23,7,32,56,46),l3grd,xyh(-0.48097880186,0.87669403229,0.008183270413),1.4,1,1)
          g(ccreekm1,xyh(9354.7094565,9904.8417723,674.03229694),dt(3,23,7,32,57,47),l3grd,xyh(-0.30673979981,0.95175419176,0.0086402362665),1.4,1,1)
          g(ccreekm1,xyh(9353.94977,9906.3023111,674.01629694),dt(3,23,7,32,58,50),l3grd,xyh(-0.45809271816,0.8888512565,-0.0097213881814),1.4,1,1)
          g(ccreekm1,xyh(9353.568582,9907.9593727,674.01729694),dt(3,23,7,32,59,51),l3grd,xyh(-0.2294316975,0.97332458606,0.00058851543608),1.4,1,1)
          g(ccreekm1,xyh(9353.065188,9909.7433868,673.92329694),dt(3,23,7,33,0,54),l3grd,xyh(-0.27037942404,0.96142079994,-0.050645952426),1.4,1,1)

          That times 100,000, and somewhere in there a right parenthesis is missing. Can you help?

          Dan

          • jonandr

            Maybe it’s possible to build a regexp which finds the line in one go, but personally I would just do it in several steps.

            The insight here is that if you remove everything except parentheses, then all good lines will look the same. Since they all look the same, they can be replaced with nothing. That leaves only bad lines.

            1. Replace
            [a-z0-9\.,-]

            with nothing.

            2. Replace
            ^\(\(\)\(\)\(\)\)$

            with nothing.

            3. Search for ‘(’ or ‘)’ to find the wrong line (note that line numbers stay the same, so the hit can be correlated with the original file).
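
The three steps above can also be sketched in Python, which avoids modifying the file at all. This is only an illustration under the assumption that every good line reduces to the skeleton (()()()), as in the snippet posted; the function name and the shortened sample lines are hypothetical.

```python
import re

# Step 1: the characters to throw away (same class as in the post).
NOISE = re.compile(r"[a-z0-9.,-]")
# Step 2: what every good line looks like once only parentheses remain.
GOOD_SKELETON = "(()()())"

def find_bad_lines(lines):
    """Return (line_number, skeleton) for lines whose parentheses look wrong."""
    bad = []
    for number, text in enumerate(lines, start=1):
        skeleton = NOISE.sub("", text.strip())
        # Step 3: a line is suspicious if parentheses are left over
        # after the good skeleton has been discarded.
        if skeleton != GOOD_SKELETON and ("(" in skeleton or ")" in skeleton):
            bad.append((number, skeleton))
    return bad

# Hypothetical, shortened sample: line 2 lacks the ")" that closes dt(...).
sample = [
    "g(ccreekm1,xyh(1.5,2.5,3.5),dt(3,23,7,32,52,45),l3grd,xyh(-0.5,0.8,0.05),1.4,1,1)",
    "g(ccreekm1,xyh(1.5,2.5,3.5),dt(3,23,7,32,52,45,l3grd,xyh(-0.5,0.8,0.05),1.4,1,1)",
]
print(find_bad_lines(sample))
```

Returning the leftover skeleton next to the line number gives a hint about where in the line the parenthesis went missing.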
            • guy038

              Hello Daniel and Andreas, and All,

              As soon as you’re looking for pairs such as (), [], {}, <>… in a text that may contain those pairs nested to an unlimited depth, you absolutely need a very powerful feature of PCRE regexes: recursive patterns. Without them, it is quite impossible to handle an arbitrary nesting depth!

              Generally speaking, you create a recursive pattern when you place a call to a group INSIDE the very sub-pattern that the call refers to.

              For instance, take the general regex ....(....(?n)....).... and suppose that the group shown is group n. Then the form (?n), located inside group n, is a recursive call to that group n.


              Concerning your problem, Daniel, the solution is the regex, below :

              (^([^()\r\n])*(\(((?2)|(?3))*\))(?2)*\r?\n)+

              You may use the PCRE_EXTENDED option (?x) to get the regex :

              (?x) ( ^ ([^()\r\n])* ( \( ( (?2) | (?3) )* \) ) (?2)* \r?\n )+

              These regexes match any run of consecutive entire lines, including their End of Line character(s), where each line :

              • Is of the general form ....(.......).....

              • Contains well-balanced nested pairs () inside the top-level block (....)

              So, Daniel, if you replace every match with nothing (leave the replace field empty), ONLY the lines that the regex cannot match remain : lines whose round brackets are not balanced, and lines without any round bracket at all ! A line with several top-level blocks, such as (a)(b), is left behind as well.


              For instance, the regexes above don’t match any of the following lines :

              abcdef
              abc(def
              a(b)def)ghi
              a(bc(((d))ef)g

              But they do match, in one go, the block of the following six lines, each with well-balanced pairs of () :

              abc(de)f
              (a(bdef)ghi)
              a(bc(((d))e)f)g
              a()bc
              ((ab(cde((fgh)ij)kl))mno)pqr
              ab(c(de(fgh(ijk))lm)((()))n()()op)qrs
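
Python’s standard re module has no recursive patterns, so the sketch below swaps in a small hand-written recursive matcher that mirrors the structure of the regex: one function for the plain characters (group 2) and one that calls itself for nested blocks (group 3). It is an illustration of the idea, not a replacement for the regex; all function names are made up.

```python
def skip_plain(text, i):
    """Advance past characters that are neither parentheses nor line breaks."""
    while i < len(text) and text[i] not in "()\r\n":
        i += 1
    return i

def match_block(text, i):
    """Match one balanced (...) block starting at index i.

    Returns the index just past the closing parenthesis, or -1 on failure.
    """
    if i >= len(text) or text[i] != "(":
        return -1
    i += 1
    while True:
        i = skip_plain(text, i)
        if i < len(text) and text[i] == "(":
            i = match_block(text, i)  # recursion: a block nested in a block
            if i == -1:
                return -1
        else:
            break
    if i < len(text) and text[i] == ")":
        return i + 1
    return -1

def is_good_line(line):
    """True if the line is: plain text, ONE balanced block, plain text."""
    i = skip_plain(line, 0)
    i = match_block(line, i)
    if i == -1:
        return False
    return skip_plain(line, i) == len(line)

bad_lines = ["abcdef", "abc(def", "a(b)def)ghi", "a(bc(((d))ef)g"]
good_lines = [
    "abc(de)f",
    "(a(bdef)ghi)",
    "a(bc(((d))e)f)g",
    "a()bc",
    "((ab(cde((fgh)ij)kl))mno)pqr",
    "ab(c(de(fgh(ijk))lm)((()))n()()op)qrs",
]
print([is_good_line(s) for s in bad_lines])
print([is_good_line(s) for s in good_lines])
```

The two lists are the example lines from this post; the first should come out as all False and the second as all True.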


              In short :

              • The form (?2) is a NON recursive call to the sub-pattern [^()\r\n]

              • The form (?3) is a recursive call to the sub-pattern \( ( (?2) | (?3) )* \)

              • The anchor ^ at the beginning and \r?\n at the end make each match cover an entire line, which can be repeated thanks to the final + sign applied to group 1

              • The opening and closing round brackets need to be escaped as \( and \). Just notice that escaping round brackets inside the character class [....], at the beginning of the regex, is not mandatory !

              • Inside the block \(....\), the regex looks for any sequence, even empty, of :

                • (?2) Characters different from round brackets and from End of Line characters OR
                • (?3) Other nested blocks of round brackets (....)

              and so on…

              I’ll gladly give further information about the recursion concept if anyone needs it !

              Best regards,

              guy038

              P.S.,

              To finish, here is another regex, with a recursive pattern (?2), which matches the general case of the string ....(.........)............(..)...(....)...........

              So the regex below matches the longest sequence of characters, even across several lines, that contains as many opening round brackets as closing ones, with well-nested and/or adjacent blocks (....) :

              ([^()]*(\(([^()]|(?2))*\))[^()]*)+

              With the PCRE_EXTENDED option (?x), we get the regex :

              (?x) ( [^()]* ( \( ( [^()] | (?2) )* \) ) [^()]* )+

              And if you don’t intend to use group 1 in the replacement part, via the backreference \1, you may turn it into a non-capturing group with the syntax ?:, which gives :

              (?:[^()]*(\(([^()]|(?1))*\))[^()]*)+

              (?x) (?: [^()]* ( \( ( [^()] | (?1) )* \) ) [^()]* )+

              Of course, because the first group is now non-capturing, the old recursive group 2 becomes the recursive group 1.
