    Regex to find any lines that do NOT have a specific number of a character

      dinkumoil @Mark Yorkovich
      last edited by dinkumoil

      @Mark-Yorkovich

      Using @Ekopalypse's test data, I generated a file of 146545 lines and did what I suggested above; I got the expected result.

      Be sure that the pipe character in your file is really a pipe character (code 124). There is another one (code 166 in Windows-1252 character encoding) which looks nearly identical:

      Pipe character: |
      The other one: ¦
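A quick way to check which of the two characters a file really contains is to look at code points. The following is only a minimal Python sketch; the helper name `count_bars` and the sample string are made up for illustration.

```python
PIPE = "|"             # U+007C VERTICAL LINE, code 124
BROKEN_BAR = "\u00a6"  # U+00A6 BROKEN BAR, code 166 in Windows-1252

def count_bars(text: str) -> dict:
    """Count real pipes and look-alike broken bars in a string."""
    return {"pipe": text.count(PIPE), "broken_bar": text.count(BROKEN_BAR)}

print(ord(PIPE), ord(BROKEN_BAR))   # code points of the two characters
print(BROKEN_BAR.encode("cp1252"))  # the single byte used in Windows-1252
print(count_bars("A|B\u00a6C|D"))   # made-up line mixing both characters
```

If `broken_bar` comes back non-zero for your data, a regex that looks for `\|` will miss those separators.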

        Mark Yorkovich @dinkumoil
        last edited by

        @dinkumoil said:

        @Mark-Yorkovich

        Using @Ekopalypse's test data, I generated a file of 146545 lines and did what I suggested above; I got the expected result.

        Be sure that the pipe character in your file is really a pipe character (code 124). There is another one (code 166 in Windows-1252 character encoding) which looks nearly identical:

        Pipe character: |
        The other one: ¦

        Yup - they’re pipes.

        Here is a good sample of what I’m working with. Lines 1, 9, 10, 11, 16 through 20, 36 and 37 are single-line records with 9 pipes and 10 columns. Lines 2 through 8 form one record and together have 9 pipes/10 columns. Similarly, lines 12 through 15 are a single record, and lines 21 through 35 are a single record.

        LOREM120|8 |3 |1 |1 |0 |0 |||INST020
        LOREM120|9 |1 |1 |0 |0 |0 ||Lorem Ipsum Dolor]
        LOREM: BS/BP

        LOREM IPSUM:
        Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.|
        IPSUM16|1 |1 |1 |1 |0 |0 |||3001479
        IPSUM16|1 |2 |1 |1 |0 |0 |||3003077
        IPSUM16|11 |0 |1 |0 |0 |0 |||
        IPSUM16|13 |0 |1 |0 |0 |0 ||Lorem ipsum dolor sit amet
        consectetur adipiscing elit,
        sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

        DOLOR53 1 1 1 2 0 0 3003084
        DOLOR53 2 3 1 1 0 0 Lorem ipsum
        DOLOR53 2 4 1 1 0 0 Lorem ipsum
        LOREM56 8 1 1 1 0 0 Lorem ipsum
        LOREM56 8 2 1 1 0 0 Lorem ipsum
        LOREM56 9 1 1 0 0 0 Lorem ipsum dolor sit amet

        consectetur adipiscing elit

        consectetur adipiscing elit
        consectetur adipiscing elit

        consectetur adipiscing elit
        Lorem ipsum dolor sit amet
        sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

        Lorem ipsum dolor sit amet
        consectetur adipiscing elit
        Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.|
        DOLOR19|1 |2 |1 |1 |0 |0 |||3003124
        LOREM01|1 |1 |1 |1 |1 |0 |||3003024

        Your suggested regex ^(?>.+?\|){9}(?!.+?\|) isn’t finding any matches on that.

          Ekopalypse @Mark Yorkovich
          last edited by

          @Mark-Yorkovich

          because it was assumed that all columns contain data

          find: ^(?>.*?\|){9}(?!.*?\|) does not make that assumption.
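The same "exactly nine pipes" test can be tried outside Notepad++. This is a rough Python sketch, not the expression above verbatim: Python's `re` module only supports atomic groups `(?>...)` from version 3.11 on, so the sketch swaps in a negated character class `[^|\n]*`, which cannot run across a pipe and therefore needs no atomic group. The function name `has_nine_pipes` and the 10-pipe line are made up for illustration.

```python
import re

# Nine repetitions of "anything but a pipe, then a pipe". Empty columns are
# allowed because of the *. The lookahead rejects lines that hold a tenth pipe.
EXACTLY_NINE = re.compile(r'^(?:[^|\n]*\|){9}(?!.*\|)')

def has_nine_pipes(line: str) -> bool:
    return EXACTLY_NINE.match(line) is not None

samples = [
    "LOREM120|8 |3 |1 |1 |0 |0 |||INST020",            # 9 pipes, two empty columns
    "LOREM120|9 |1 |1 |0 |0 |0 ||Lorem Ipsum Dolor]",  # 8 pipes, record continues
    "a|b|c|d|e|f|g|h|i|j|k",                           # 10 pipes (made-up line)
]
for s in samples:
    print(has_nine_pipes(s), s.count("|"))
```

Only the first sample should be reported as a complete single-line record; comparing against `s.count("|")` is a simple cross-check.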

            Mark Yorkovich @Ekopalypse
            last edited by

            @Ekopalypse said:

            @Mark-Yorkovich
            because it was assumed that all columns contain data

            My bad. I didn’t give you all of the details of what I’m working with.

            find: ^(?>.*?\|){9}(?!.*?\|) does not make that assumption.

            This works.

            So at this point what I’d need to do, ideally, is to do a Find/Replace, finding all of the new line/line feed characters - only in those now-bookmarked lines - and replace them with some other character (spaces, dummy chars, whatever) to get each of those records to be on one line. Can I do a find/replace on just the bookmarked lines? Or perhaps, instead of the multi-step approach, is there a way to do this on the Replace tab, entering a regex in the Find what box that finds those lines and just replace the new line characters with dummy characters in one step?
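For reference, the one-step idea (join the physical lines of a record until it holds 9 pipes) could be sketched in a script outside Notepad++. This is only an illustration under assumptions: the function name `join_records`, the `filler` parameter and the shortened sample text are made up, and it assumes a multi-line field is never the last column, so every record ends on the line that contains its ninth pipe.

```python
def join_records(lines, pipes_per_record=9, filler=" "):
    """Merge physical lines until a record holds the expected pipe count."""
    records, buffer, seen = [], [], 0
    for line in lines:
        buffer.append(line.rstrip("\r\n"))
        seen += line.count("|")
        if seen >= pipes_per_record:
            # Record complete: replace the inner line breaks with the filler.
            records.append(filler.join(buffer))
            buffer, seen = [], 0
    if buffer:
        # Incomplete trailing record: keep it rather than silently drop data.
        records.append(filler.join(buffer))
    return records

sample = [
    "LOREM120|8 |3 |1 |1 |0 |0 |||INST020\n",
    "LOREM120|9 |1 |1 |0 |0 |0 ||Lorem Ipsum Dolor]\n",
    "LOREM: BS/BP\n",
    "\n",
    "LOREM IPSUM:\n",
    "Lorem ipsum dolor sit amet.|\n",  # shortened version of the sample text
    "IPSUM16|1 |1 |1 |1 |0 |0 |||3001479\n",
]
for record in join_records(sample):
    print(record)
```

A record with more than 9 pipes would need extra handling; the sketch simply closes a record as soon as the count is reached.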

              Alan Kilborn @Mark Yorkovich
              last edited by

              @Mark-Yorkovich said:

              Alan’s exp doesn’t match anything in my file

              Well, if I copy and paste your “lorem ipsum” data (above) into a new tab and then run my regex (above) on it, I get lines with exactly 9 pipes redmarked, which I thought was the goal (or the inverse of the goal):

              [screenshot on Imgur]

              So…I really don’t know where the disconnect is…

                Alan Kilborn @Mark Yorkovich
                last edited by

                @Mark-Yorkovich said:

                …finding all of the new line/line feed characters - only in those now-bookmarked lines - and replace them with some other character (spaces, dummy chars, whatever) to get each of those records to be on one line

                Didn’t we do all this the other day?

                  Allen Bai
                  last edited by

                  (.|){9}.

                  how about this?

                    Ekopalypse @Allen Bai
                    last edited by

                    @Allen-Bai

                    I assume you meant (.\|){9}.
                    This matches 9 and more pipe delimited lines.

                        Allen Bai @Ekopalypse
                        last edited by Allen Bai

                        @Ekopalypse said:

                        @Allen-Bai

                        I assume you meant (.\|){9}.
                        This matches 9 and more pipe delimited lines.

                        in fact, I mean…

                        (。\|){9}。*

                        but it can’t show correctly, and I don’t know how to put screenshot

                          PeterJones
                          last edited by PeterJones

                          @Allen-Bai said:

                          it can’t show correctly,

                          To quote my boilerplate:

                          This forum is formatted using Markdown, with a help link buried on the little grey ? in the COMPOSE window/pane when writing your post. For more about how to use Markdown in this forum, please see @Scott-Sumner’s post in the “how to markdown code on this forum” topic, and my updates near the end. It is very important that you use these formatting tips – using single backtick marks around small snippets, and using code-quoting for pasting multiple lines from your example data files – because otherwise, the forum will change normal quotes ("") to curly “smart” quotes (“”), will change hyphens to dashes, will sometimes hide asterisks (or if your text is c:\folder\*.txt, it will show up as c:\folder*.txt, missing the backslash).

                          For images: upload image to imgur. embed images with the syntax ![](http://i.imgur.com/QTHZysa.png). (please use imgur’s “direct link” with i.imgur.com as the hostname and the appropriate .png or .gif extension, rather than the “image” link, which really links to the HTML-wrapper, and will not embed in the forum)

                            Allen Bai @Allen Bai
                            last edited by

                            @Allen-Bai said:

                            (.|){9}.

                            how about this?

                            in fact, I mean
                            (。*\|){9}。*

                              PeterJones
                              last edited by PeterJones

                              @Allen-Bai said:

                              in fact, I mean
                              (。*\|){9}。*

                              Then why not put it in tick marks? Both the help I linked to and the “how to use markdown code” post explained how to do that, as did my boilerplate text itself.

                              • `(.*\|){9}.*`

                              renders as

                              • (.*\|){9}.*
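As a side note on what that expression does once it displays correctly: `(.*\|){9}.*` accepts any line with nine or more pipes, because `.` can itself match a pipe. A small Python sketch (the name `at_least_nine` and the two `a|b|...` lines are made up for illustration):

```python
import re

# .* may swallow pipes, so this is "at least nine", not "exactly nine".
AT_LEAST_NINE = re.compile(r'(.*\|){9}.*')

def at_least_nine(line: str) -> bool:
    return AT_LEAST_NINE.fullmatch(line) is not None

lines = [
    "a|b|c|d|e|f|g|h|i",                     # 8 pipes (made-up line)
    "LOREM120|8 |3 |1 |1 |0 |0 |||INST020",  # 9 pipes
    "a|b|c|d|e|f|g|h|i|j|k",                 # 10 pipes (made-up line)
]
for line in lines:
    print(at_least_nine(line), line.count("|"))
```

So on its own it cannot separate 9-pipe lines from lines with more pipes; that is what the negative lookahead in the `^(?>.*?\|){9}(?!.*?\|)` expression is for.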
                                Allen Bai
                                last edited by

                                ah…

                                understand now, thank you so much

                                  guy038
                                  last edited by

                                  Hi, @mark-yorkovich, and All,

                                  See my very late regex solution, below:

                                  https://community.notepad-plus-plus.org/post/47905

                                  Best Regards,

                                  guy038

                                  The Community of users of the Notepad++ text editor.