Filter lines

  • J
    Jose Emilio Osorio
    last edited by Aug 11, 2018, 1:38 AM

    How can I filter lines that have fewer or more than “X” characters? Thanks.

    • T
      Terry R
      last edited by Aug 11, 2018, 5:04 AM

      Use Search > Mark… from the menu, select the Regular expression search mode, and type the following into the Find what field:

      ^(?|.{1,37}|.{39,200})\R

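      This example marks every line that isn’t exactly 38 characters long. For your own length x, replace 37 with x - 1 and 39 with x + 1. You may also need to increase 200 if a line might be longer than that. Note that empty lines are not marked, because the first alternative requires at least one character (change {1,37} to {0,37} to include them).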
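To make the x - 1 / x + 1 arithmetic concrete, here is a minimal Python sketch that builds the same kind of pattern from a single number. It is only an illustration, not something to paste into Notepad++: Python's `re` module supports neither `(?|...)` nor `\R`, so a non-capturing group and `$` with `re.MULTILINE` are used instead. The helper name `not_exactly` and the sample text are hypothetical.

```python
import re

def not_exactly(x: int, upper: int = 200) -> re.Pattern:
    """Pattern for lines whose length is 1..x-1 or x+1..upper (assumes 2 <= x < upper)."""
    # For x = 38 this yields ^(?:.{1,37}|.{39,200})$
    return re.compile(rf"^(?:.{{1,{x - 1}}}|.{{{x + 1},{upper}}})$", re.MULTILINE)

# Three sample lines of 37, 38 and 39 characters
text = "a" * 37 + "\n" + "b" * 38 + "\n" + "c" * 39 + "\n"

marked = [m.group(0) for m in not_exactly(38).finditer(text)]
print([len(line) for line in marked])  # [37, 39]
```

Only the 38-character line is left unmarked, which mirrors what the Mark dialog does with the regex above.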

      I tried to find a way to use the number x only once in the expression, but so far that eludes me, so treat this as a good second choice.

      Terry

      • T
        Terry R
        last edited by Terry R Aug 11, 2018, 5:27 AM

        My previous example will NOT match the last line of a file unless it is followed by a line break (carriage return/line feed). A revised regex would be:

        ^(?|.{1,37}|.{39,200})(?|\R|\z)

        The ‘\z’ matches at the very end of the file, so it takes care of that last line when needed.
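The last-line problem is easy to reproduce outside the editor. The following is a small Python sketch under a few assumptions: Python spells the end-of-text anchor `\Z` (the counterpart of `\z` here), `\n` stands in for `\R`, and the sample text and the 1 to 5 character range are made up for the demonstration.

```python
import re

# The final line "last" has no line break after it
text = "short\na much longer line\nlast"

needs_newline = re.compile(r"^.{1,5}\n", re.MULTILINE)
newline_or_end = re.compile(r"^.{1,5}(?:\n|\Z)", re.MULTILINE)

without_end = [m.group(0).rstrip("\n") for m in needs_newline.finditer(text)]
with_end = [m.group(0).rstrip("\n") for m in newline_or_end.finditer(text)]

print(without_end)  # ['short']
print(with_end)     # ['short', 'last']
```

Requiring a line break misses the unterminated final line; allowing the end of the text as an alternative picks it up.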

        I also have another option, better than this but still not exactly what I was searching for. You could use:

        ^(.{38})(?|\R|\z)

        to mark the lines that DO meet the criterion. Replace 38 with x, your number, and tick the ‘Bookmark line’ box before marking. Once that is done, close the dialog and go to Search > Bookmark. There you will find ‘Inverse Bookmark’, which removes the bookmarks from the lines that DO meet the criterion and bookmarks those that do NOT. From the same menu you can then remove the bookmarked lines, or cut them for pasting elsewhere.
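The mark-then-invert idea can be sketched in Python as well: match the lines that are exactly x characters long, then take everything else. This is only an analogy to the bookmark workflow, and the function name `split_by_length` and the sample text are hypothetical.

```python
import re

def split_by_length(text: str, x: int):
    """Return (lines of exactly x characters, all other lines)."""
    exact = re.compile(rf".{{{x}}}")  # for x = 38 this is .{38}
    lines = text.splitlines()
    bookmarked = [line for line in lines if exact.fullmatch(line)]
    inverted = [line for line in lines if not exact.fullmatch(line)]
    return bookmarked, inverted

bookmarked, inverted = split_by_length("abc\nde\n\nfghi\nxyz", 3)
print(bookmarked)  # ['abc', 'xyz']
print(inverted)    # ['de', '', 'fghi']
```

The number x appears only once, and empty lines end up in the inverted set, just as they would after ‘Inverse Bookmark’.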

        Terry

        • J
          Jose Emilio Osorio
          last edited by Aug 13, 2018, 5:55 PM

          Thank you very much.

          • G
            guy038
            last edited by Aug 15, 2018, 3:21 PM

            Hi, @terry-r and All,

            In your last post, Terry, the (?|\R|\z) regex syntax is, indeed, a branch reset group. However, as stated here:

            If you don’t use any alternation or capturing groups inside the branch reset group, then its special function doesn’t come into play. It then acts as a non-capturing group.

            So, I don’t think that, in that specific case, the branch reset group syntax was necessary ;-))


            And, for everybody, to clearly show the difference between a capturing group containing alternatives and a branch reset group, let’s consider the two following regexes, each of which matches any one of the 5-character strings axyzw, apqrw or atuvw:

            • A) with a list of consecutive alternatives, in a capturing group :

            (a)(x(y)z|(p(q)r)|(t)u(v))(w)

              • When the regex matches axyzw, group 1 = a, group 2 = xyz, group 3 = y and group 8 = w

              • When the regex matches apqrw, group 1 = a, group 2 = pqr, group 4 = pqr, group 5 = q and group 8 = w

              • When the regex matches atuvw, group 1 = a, group 2 = tuv, group 6 = t, group 7 = v and group 8 = w

            • B) with a branch reset group ( NOT a capturing group itself ! ) :

            (a)(?|x(y)z|(p(q)r)|(t)u(v))(w)

              • When the regex matches axyzw, group 1 = a, group 2 = y, group 3 is undefined and group 4 = w

              • When the regex matches apqrw, group 1 = a, group 2 = pqr, group 3 = q and group 4 = w

              • When the regex matches atuvw, group 1 = a, group 2 = t, group 3 = v and group 4 = w
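The group numbering of case A can be checked with a short Python sketch. Python's standard `re` module does not support the `(?|...)` branch reset syntax, so case B is not reproduced here; the `results` dictionary is just a hypothetical helper for collecting the output.

```python
import re

# Case A: the alternatives sit inside an ordinary capturing group (group 2)
pattern = re.compile(r"(a)(x(y)z|(p(q)r)|(t)u(v))(w)")

results = {}
for s in ("axyzw", "apqrw", "atuvw"):
    m = pattern.fullmatch(s)
    # Keep only the groups that took part in the match
    results[s] = {i: g for i, g in enumerate(m.groups(), start=1) if g is not None}
    print(s, results[s])

# axyzw {1: 'a', 2: 'xyz', 3: 'y', 8: 'w'}
# apqrw {1: 'a', 2: 'pqr', 4: 'pqr', 5: 'q', 8: 'w'}
# atuvw {1: 'a', 2: 'tuv', 6: 't', 7: 'v', 8: 'w'}
```

Groups are numbered by the position of their opening parenthesis across all alternatives, which is why the final (w) is always group 8 in case A, while the branch reset version restarts the numbering in each alternative and leaves (w) as group 4.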


            Another example. Given this text:

            abcdefg
            hijklmn
            opqrstu
            

            The regex S/R :

            SEARCH (abcdefg)|(hijklmn)|(opqrstu)

            REPLACE ==\1\2\3== or, alternatively, ==$0==

            would change the text to:

            ==abcdefg==
            ==hijklmn==
            ==opqrstu==
            

            Note that, with the syntax ==\1\2\3==, only one group is defined at a time; the two others are just “empty” groups!

            Now, with the same initial text, the regex S/R, below :

            SEARCH ab(cde)fg|hi(jkl)mn|op(qrs)tu

            REPLACE ==\1\2\3==

            gives :

            ==cde==
            ==jkl==
            ==qrs==
            

            whereas the following regex S/R, with a branch reset group and only group 1 in the replacement:

            SEARCH (?|ab(cde)fg|hi(jkl)mn|op(qrs)tu)

            REPLACE ==\1==

            would produce the same results.

            …and the regex S/R :

            SEARCH (?|ab(cde)fg|hi(jkl)mn|op(qrs)tu)

            REPLACE ==\1\1\1==

            would give :

            ==cdecdecde==
            ==jkljkljkl==
            ==qrsqrsqrs==
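For readers who want to experiment outside Notepad++, these replacements can be approximated in Python. This is a sketch under stated assumptions: Python's `re` module has no branch reset group, so the single-group behaviour is emulated with `m.lastindex` (the number of the last group that took part in the match), and the helper name `only_group` is hypothetical. Since Python 3.5, groups that did not take part in a match are replaced with an empty string, like the “empty” groups described above.

```python
import re

text = "abcdefg\nhijklmn\nopqrstu\n"
search = r"ab(cde)fg|hi(jkl)mn|op(qrs)tu"

# Three separate groups: the two that did not match expand to nothing
three_groups = re.sub(search, r"==\1\2\3==", text)

def only_group(m: re.Match) -> str:
    # Exactly one group participates per match, so lastindex identifies it
    return m.group(m.lastindex)

# Emulation of the branch reset replacements ==\1== and ==\1\1\1==
emulated_single = re.sub(search, lambda m: f"=={only_group(m)}==", text)
emulated_triple = re.sub(search, lambda m: f"=={only_group(m) * 3}==", text)

print(three_groups == emulated_single)  # True
print(emulated_triple)
# ==cdecdecde==
# ==jkljkljkl==
# ==qrsqrsqrs==
```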
            

            Best Regards,

            guy038
