    Filter lines

    • Jose Emilio Osorio

      How can I filter lines with fewer or more than “X” characters? Thanks.

      • Terry R

        Use Search > Mark from the menu, select the “Regular expression” search mode, and type the following into the “Find what” field:

        ^(?|.{1,37}|.{39,200})\R

        This example marks every line that isn’t 38 characters long. To adapt it to your own number x, replace 37 with x - 1 and 39 with x + 1. You may also need to raise the upper bound of 200 if a line might exceed it.

        I’ve tried to find a different method that uses the number x only once in the expression, but so far it eludes me, so treat this one as a good second choice.
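
As a rough sketch of how the x - 1 / x + 1 substitution could be automated, the Python helper below builds the “Find what” text from a single value of x. The names `not_length_pattern` and `upper` are made up for this illustration. The resulting string is meant to be pasted into the Mark dialog; Python's own `re` module supports neither `(?|...)` nor `\R`, so it is not run here.

```python
def not_length_pattern(x: int, upper: int = 200) -> str:
    """Return the Mark-dialog regex for lines whose length is not x.

    Hypothetical helper: x is the target length, upper is the longest
    line length expected in the file (200 in the post above).
    """
    if x < 2 or upper <= x:
        raise ValueError("need 2 <= x < upper")
    # Shorter lines: 1 to x-1 characters; longer lines: x+1 to upper.
    return r"^(?|.{1,%d}|.{%d,%d})\R" % (x - 1, x + 1, upper)


print(not_length_pattern(38))  # prints ^(?|.{1,37}|.{39,200})\R
```

Because the lower bound of the first alternative is 1, the generated pattern, like the hand-written one, does not mark empty lines.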

        Terry

        • Terry R

          My previous example will NOT grab the last line in a file unless it is followed by another carriage return/line feed. A revised regex would be:

          ^(?|.{1,37}|.{39,200})(?|\R|\z)

          So the ‘\z’ takes care of that last line if it’s needed.
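
The effect of the extra `\z` alternative can be reproduced in Python, as a sketch rather than an exact port. Python's `re` has no `\R`, so `\n` stands in for it, and `\Z` is Python's spelling of "very end of the text"; a non-capturing group replaces the branch reset group. The sample text and the target length of 3 are made up for brevity.

```python
import re

# Three lines; the last one has no trailing line break.
text = "abc\nab\nabcd"

# Lines whose length is not 3, requiring a line break after each line.
needs_break = re.compile(r"^(.{1,2}|.{4,200})\n", re.MULTILINE)
# Same, but also accepting the very end of the text.
break_or_end = re.compile(r"^(.{1,2}|.{4,200})(?:\n|\Z)", re.MULTILINE)

print(needs_break.findall(text))   # ['ab']
print(break_or_end.findall(text))  # ['ab', 'abcd']
```

The first pattern silently skips `abcd` because nothing follows it; the second one picks it up.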

          I also have another option, better than this but still not exactly what I was searching for. You could use:

          ^(.{38})(?|\R|\z)

          to mark the lines that DO meet the criterion. Replace 38 with x, your number, and tick the ‘Bookmark line’ box. Once the lines are marked, close the dialog and go back to the Search menu, where you will find the Bookmark submenu. It offers ‘Inverse Bookmark’, which de-selects the lines that DO meet the criterion and bookmarks those that do NOT. From the same submenu you can then remove the bookmarked lines, or cut them for pasting elsewhere.
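
The mark-then-invert idea can be sketched outside the editor as well. The Python below is only an illustration of the logic, not of what Notepad++ does internally; `split_by_length` and the sample text are hypothetical.

```python
import re


def split_by_length(text: str, x: int):
    """Return (marked, inverse): lines of exactly x characters, and all others."""
    exact = re.compile(r".{%d}" % x)
    lines = text.splitlines()
    # "Mark" step: lines that are exactly x characters long.
    marked = [ln for ln in lines if exact.fullmatch(ln)]
    # "Inverse bookmark" step: everything that was not marked.
    inverse = [ln for ln in lines if not exact.fullmatch(ln)]
    return marked, inverse


marked, inverse = split_by_length("abc\nab\n\nabcd\nxyz", 3)
print(marked)   # ['abc', 'xyz']
print(inverse)  # ['ab', '', 'abcd']
```

Note that the inverted set includes the empty line, which the `.{1,37}` alternative of the earlier regex would not mark.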

          Terry

          • Jose Emilio Osorio

            Thank you very much.

            • guy038

              Hi, @terry-r and All,

              In your last post, Terry, the (?|\R|\z) syntax is a branch reset group. However, as stated here:

              If you don’t use any alternation or capturing groups inside the branch reset group, then its special function doesn’t come into play. It then acts as a non-capturing group.

              So I don’t think the branch reset group syntax was necessary in that specific case; a plain non-capturing group, (?:\R|\z), would behave the same ;-))


              And for everybody, to show clearly the difference between a capturing group containing a list of alternatives and a branch reset group, let’s consider the following two regexes, each of which matches any one of the 5-character strings axyzw, apqrw or atuvw:

              • A) With a list of consecutive alternatives, in a capturing group:

              (a)(x(y)z|(p(q)r)|(t)u(v))(w)

                • When the regex matches axyzw, group 1 = a, group 2 = xyz, group 3 = y and group 8 = w

                • When the regex matches apqrw, group 1 = a, group 2 = pqr, group 4 = pqr, group 5 = q and group 8 = w

                • When the regex matches atuvw, group 1 = a, group 2 = tuv, group 6 = t, group 7 = v and group 8 = w

              • B) With a branch reset group (NOT a capturing group itself!):

              (a)(?|x(y)z|(p(q)r)|(t)u(v))(w)

                • When the regex matches axyzw, group 1 = a, group 2 = y, group 3 is undefined and group 4 = w

                • When the regex matches apqrw, group 1 = a, group 2 = pqr, group 3 = q and group 4 = w

                • When the regex matches atuvw, group 1 = a, group 2 = t, group 3 = v and group 4 = w
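
The group numbering of regex A can be checked with a short Python sketch. Only regex A is run here: Python's standard `re` module does not support the `(?|...)` branch reset syntax, so regex B has to be tried in an engine that does, such as the one in Notepad++. The helper name `defined_groups` is made up for this example.

```python
import re

# Regex A from the post: every opening parenthesis gets its own number.
REGEX_A = re.compile(r"(a)(x(y)z|(p(q)r)|(t)u(v))(w)")


def defined_groups(subject: str) -> dict:
    """Map group number -> captured text, for groups that took part in the match."""
    match = REGEX_A.fullmatch(subject)
    return {n: g for n, g in enumerate(match.groups(), start=1) if g is not None}


for subject in ("axyzw", "apqrw", "atuvw"):
    print(subject, defined_groups(subject))
# axyzw {1: 'a', 2: 'xyz', 3: 'y', 8: 'w'}
# apqrw {1: 'a', 2: 'pqr', 4: 'pqr', 5: 'q', 8: 'w'}
# atuvw {1: 'a', 2: 'tuv', 6: 't', 7: 'v', 8: 'w'}
```

Whichever alternative matches, the closing w always lands in group 8, because the numbers of the unused alternatives are still reserved.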


              Another example. Given this text:

              abcdefg
              hijklmn
              opqrstu
              

              The regex search/replace (S/R):

              SEARCH (abcdefg)|(hijklmn)|(opqrstu)

              REPLACE ==\1\2\3== or, equivalently, ==$0==

              would change the text to:

              ==abcdefg==
              ==hijklmn==
              ==opqrstu==
              

              Note that, in the syntax ==\1\2\3==, only one group is defined at a time; the other two are just “empty” groups!
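
The same behaviour can be observed in Python, as a sketch assuming Python 3.5 or later, where groups that did not take part in a match are replaced by an empty string. Python writes the whole match as `\g<0>` rather than `$0`.

```python
import re

text = "abcdefg\nhijklmn\nopqrstu\n"
search = r"(abcdefg)|(hijklmn)|(opqrstu)"

# Groups that did not take part in the match are replaced by nothing.
by_groups = re.sub(search, r"==\1\2\3==", text)
# \g<0> is Python's spelling of the whole match ($0 in the post).
by_whole = re.sub(search, r"==\g<0>==", text)

print(by_groups == by_whole)  # True
print(by_groups, end="")
```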

              Now, with the same initial text, the regex S/R below:

              SEARCH ab(cde)fg|hi(jkl)mn|op(qrs)tu

              REPLACE ==\1\2\3==

              gives :

              ==cde==
              ==jkl==
              ==qrs==
              

              whereas the following regex S/R, with a branch reset group and only group 1 in the replacement:

              SEARCH (?|ab(cde)fg|hi(jkl)mn|op(qrs)tu)

              REPLACE ==\1==

              would produce the same results.

              …and the regex S/R:

              SEARCH (?|ab(cde)fg|hi(jkl)mn|op(qrs)tu)

              REPLACE ==\1\1\1==

              would give :

              ==cdecdecde==
              ==jkljkljkl==
              ==qrsqrsqrs==
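
Engines without branch reset groups can approximate this last S/R. The Python sketch below is an emulation, not the same mechanism: the three captures keep their separate numbers 1, 2 and 3, and `match.lastindex` is used to find the one that actually matched. The function name `repeat_capture` is hypothetical.

```python
import re

text = "abcdefg\nhijklmn\nopqrstu\n"
# Without a branch reset group the three captures are numbered 1, 2 and 3.
search = re.compile(r"ab(cde)fg|hi(jkl)mn|op(qrs)tu")


def repeat_capture(match) -> str:
    # lastindex is the number of the single group that matched in this
    # alternative, so it plays the role of \1 in a branch reset group.
    captured = match.group(match.lastindex)
    return "==" + captured * 3 + "=="


print(search.sub(repeat_capture, text), end="")
# ==cdecdecde==
# ==jkljkljkl==
# ==qrsqrsqrs==
```

The branch reset group gets the same result with the plain replacement string ==\1\1\1==, which is the convenience it offers.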
              

              Best Regards,

              guy038

              The Community of users of the Notepad++ text editor.