    Find NULL Lines with RegEx

    • Jerry Goedert

      Hi, I have found numerous files that
      contain a NULL line, no spaces, no CR\LF, no tab, just nothing.
      Example:
      Birth Date 1 Jan 1947
      Event Type: Residence
      Household Identifier 963991122

      “United States Public Records, 1970-2009”, database, FamilySearch
      The lines between are NULL. I don’t know what to use to find them
      and remove most of them.
      How can I detect the NULL line?
      Jerry

      • Alan Kilborn @Jerry Goedert

        @Jerry-Goedert

        Search in Regular Expression mode for ^$.

        Wait… what does this mean?:

        no CR\LF

        If you don’t have that, you don’t have a “line”, so…
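For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch. The sample text is made up (two empty lines placed between the OP's lines), and the helper names are only illustrative. One assumption worth stating: in Python's `re` module, `$` in multiline mode only matches just before a `\n`, so on CRLF files a plain `^$` does not behave the way it does in Notepad++. The sketch therefore uses `splitlines()` to find empty lines and `^\r?\n` to remove them.

```python
import re

# Hypothetical sample: two truly empty lines between the OP's two text lines.
sample = (
    "Household Identifier 963991122\r\n"
    "\r\n"
    "\r\n"
    '"United States Public Records, 1970-2009", database, FamilySearch\r\n'
)


def empty_line_numbers(text):
    """Return the 1-based numbers of lines that contain no characters."""
    return [n for n, line in enumerate(text.splitlines(), start=1) if line == ""]


def drop_empty_lines(text):
    """Remove every empty line, whether it ends in CRLF or a bare LF."""
    return re.sub(r"(?m)^\r?\n", "", text)


print(empty_line_numbers(sample))
print(repr(drop_empty_lines(sample)))
```

If only some of the empty lines should go, the list from `empty_line_numbers` can be used to decide which ones to keep.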

        • Terry R @Jerry Goedert

          @Jerry-Goedert said in Find NULL Lines with RegEx:

          contain a NULL line, no spaces, no CR\LF, no tab, just nothing.

          Can you turn on “Show All Characters”? It is under the View menu, in the “Show Symbol” submenu.

          I would think it should look like this:

          8a70e9a7-c009-4661-b7b0-82e954d4ace4-image.png

          By turning on this feature you should see that the “NULL” line (as you describe it) actually is a line, just with no characters on it. So it has a CR/LF combination.

          If yours doesn’t look like this after turning on that feature, show us a screenshot like I did.

          Terry

          PS: If it’s a line, then it WILL have a line number.
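The same check can be done without the editor by counting the raw bytes. This is only a sketch: the function name and the sample bytes are hypothetical, and it assumes the file is small enough to read into memory in one go.

```python
def describe_line_endings(raw: bytes) -> dict:
    """Count CRLF pairs, bare LF, bare CR, and NUL bytes in raw file content."""
    crlf = raw.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": raw.count(b"\n") - crlf,  # LF bytes that are not part of a CRLF
        "CR": raw.count(b"\r") - crlf,  # CR bytes that are not part of a CRLF
        "NUL": raw.count(b"\x00"),
    }


# Hypothetical content: an "empty line" is simply two line endings in a row.
raw = b"Household Identifier 963991122\r\n\r\nFamilySearch\r\n"
print(describe_line_endings(raw))
```

For a real file, the bytes would come from something like `open(path, "rb").read()`. A non-zero NUL count suggests the "nothing" between the lines is actually NUL characters rather than empty lines.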

          • rdipardo @Jerry Goedert

            A NULL character can be matched by searching the code point U+0000.

            • Ctrl + F
            • Find what: \x{0000} [^1]
            • Search Mode: Regular Expression
            • Click “Find All in Current Document” [^2]

            match_null.png

            You can recreate the sample text shown above using Python 3:

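            import re
            
            # Sample records; every whitespace character in this string
            # (spaces, indentation, and newlines) is replaced with NUL below.
            data = """
                Birth Date 1 Jan 1947
                Event Type: Residence
                Household Identifier 963991122
                """
            
            # Swap each whitespace character for U+0000 and encode as ASCII bytes.
            text_with_nulls = bytes(re.sub(r'\s', '\x00', data), 'ascii')
            
            # Write in binary mode so the NUL bytes reach the file unchanged.
            with open('text_with_nulls.txt', 'wb') as file:
                file.write(text_with_nulls)
            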
            

            I’m guessing the file that @Jerry-Goedert described was generated by a government database using some ancient 7-bit collation. Empty record fields containing NULL in the database are probably showing up as single-byte character strings: "\0".
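If that guess is right, the NUL bytes can also be located and replaced with a short script. The following is a rough sketch, not a tested tool: the function names and sample bytes are made up, and replacing each NUL with a space is just one possible choice. Note that Python's `re` module writes the escape as `\x00` rather than the `\x{0000}` form used in the Notepad++ search above.

```python
import re


def nul_offsets(raw: bytes) -> list:
    """Return the byte offset of every NUL (0x00) in the data."""
    return [m.start() for m in re.finditer(rb"\x00", raw)]


def replace_nuls(raw: bytes, replacement: bytes = b" ") -> bytes:
    """Return a copy of the data with each NUL byte replaced."""
    return raw.replace(b"\x00", replacement)


# Hypothetical sample: NUL bytes where separators would be expected.
raw = b"Birth Date\x001 Jan 1947\x00\x00Event Type"
print(nul_offsets(raw))
print(replace_nuls(raw))
```

Passing `b""` as the replacement removes the NUL bytes outright; passing `b"\r\n"` would turn each one into a line break instead.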


            [^1]: The Boost regex engine supports this syntax
            [^2]: Since there’s only one true “line” in the text shown above, you have to de-select the one result per line option to exactly reproduce my example:

            • Settings
            • Preferences
            • Searching
            • Uncheck “Search result Window: show only one entry per found line”

            no_single_search.png

            • Alan Kilborn @rdipardo

              @rdipardo

              I think your guess as to the OP’s data may have hit the nail squarely on the head.

              It would have been abundantly clear earlier if the OP had posted a screenshot of what he was working with.


              The Community of users of the Notepad++ text editor.