Removal of Blank Lines in a large number of files

8 Posts 5 Posters 13.8k Views
  • J
    John Fairweather
    last edited by Nov 24, 2015, 9:33 PM

    It is possible to remove blank lines from a single file, but I wonder whether it can also be done in bulk (i.e. across a large number of files) in one go.

    • A
      AdrianHHH
      last edited by Nov 25, 2015, 9:32 AM

      The “Find in Files” dialog has a “Replace” facility, so you could try a regular expression replacement of something like the following (note that there is a space before the “\t”):
      \r\n[ \t\r\n]*\r\n
      with
      \r\n
      (But adjust the “\r\n” parts to reflect the line endings in your files if they are not Windows files.)

      The regular expression looks for one newline, then as many spaces, tabs and newlines as can be found, then one more newline. The replacement is a single newline.
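To see what this replacement does outside Notepad++, here is a minimal Python sketch that applies the same pattern to an in-memory string. The function name `collapse_blank_lines` is made up for illustration, and the sketch assumes Windows (\r\n) line endings, as in the post above.

```python
import re

# Same pattern as in the post: one newline, then any run of spaces, tabs
# and newlines, then one more newline.
BLANK_RUN = re.compile(r"\r\n[ \t\r\n]*\r\n")


def collapse_blank_lines(text: str) -> str:
    """Replace each run of blank lines with a single Windows newline."""
    return BLANK_RUN.sub("\r\n", text)


sample = "first\r\n\r\n  \t\r\nsecond\r\n"
result = collapse_blank_lines(sample)
print(repr(result))
```

A single blank line at the very top of a file has no newline before it, so this pattern leaves it in place.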

      WARNING
      Be very careful with the “Replace” facility of “Find in files”. If you use the wrong search or replacement you can mess up a lot of files. Try with some unimportant files (or copies) first.

      • G
        guy038
        last edited by Nov 25, 2015, 9:14 PM

        Hi John,

        Here is a very simple regex, ^\R+, which will delete any truly empty line, whatever its End of Line character(s).

        But above all, AdrianHHH’s warning is quite sensible, so I would suggest copying a couple of files into a new folder and testing on those first!

        Then:

        • Open the Find in Files dialog (CTRL + SHIFT + F)

        • Type the regex ^\R+ in the Find what field

        • Leave the Replace with field EMPTY

        • Fill in the Filters and the Directory fields

        • Select the Regular expression search mode

        • Click on the Replace in Files button

        • Confirm the Are you sure dialog

        In one pass, every truly empty line is removed from your test files. Et voilà!
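For anyone who would rather script the bulk operation, the steps above can be sketched in Python roughly as follows. This is an illustration, not a Notepad++ feature: the function names, the `*.txt` default filter, and the byte-level handling are assumptions. Python's `re` module has no \R, so the pattern spells out the three common EOL styles. The sketch also assumes single-byte or UTF-8 encoded files; it would corrupt UTF-16 files.

```python
import re
from pathlib import Path

# Rough stand-in for \R (bytes pattern). "\r(?!\n)" stops a lone-CR match
# from splitting a Windows "\r\n" pair.
EOL = rb"(?:\r\n|\n|\r(?!\n))"
# Rough stand-in for "^": start of data, or the position just after an EOL.
LINE_START = rb"(?:\A|(?<=\n)|(?<=\r)(?!\n))"
EMPTY_LINES = re.compile(LINE_START + EOL + rb"+")


def remove_empty_lines(data: bytes) -> bytes:
    """Delete every truly empty line, whatever its EOL style."""
    return EMPTY_LINES.sub(b"", data)


def remove_empty_lines_in_dir(directory: str, pattern: str = "*.txt") -> list:
    """Rewrite matching files in place; return the paths that changed."""
    changed = []
    for path in sorted(Path(directory).glob(pattern)):
        original = path.read_bytes()
        cleaned = remove_empty_lines(original)
        if cleaned != original:
            path.write_bytes(cleaned)
            changed.append(path)
    return changed
```

As with the dialog, it is safest to call `remove_empty_lines_in_dir` on a folder of copies first.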


        Notes:

        • The advantage of the \R syntax is that it matches any kind of EOL ( \r\n in a Windows file, \n in a Unix/OSX file and \r in an old Mac file ). Strictly speaking, \R = \r\n|[\n\v\f\r\x{2028}\x{2029}], but in practice it is, most of the time, identical to \r\n|\n|\r ( the order of the alternatives is important! )

        • If, in addition, you would like to delete lines containing ONLY blank characters, use the search regex ^(\h*\R)+. Again, the Replace with field stays empty. The syntax \h represents any horizontal blank character, that is to say, either the Space character ( \x20 ), the Tabulation character ( \x09 ) or the No-Break Space character ( \xA0 ).

        • If you would like to delete any surplus blank lines ( in other words, keep ONLY ONE blank line as a paragraph separator ), just change the search regex to \R\R\K\R+. However, because of the \K in this regex, step-by-step replacement with the Replace button in the Replace dialog won’t work. Use ONLY the Replace All button!
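The three notes above can be tried out in a minimal Python sketch. Python's `re` module supports none of \R, \h or \K, so the stand-ins below are approximations chosen for illustration: `EOL` covers only the three common line endings, `HSPACE` only the three characters listed above, and a capture group replaces \K. All names are hypothetical.

```python
import re

# Alternation order matters: with "\r\n" first, a Windows EOL stays one unit.
crlf_first = re.findall(r"\r\n|\n|\r", "a\r\nb")
cr_first = re.findall(r"\r|\n|\r\n", "a\r\nb")
print(crlf_first)  # ['\r\n']
print(cr_first)    # ['\r', '\n']

# Stand-ins for Notepad++ syntax that Python's re module lacks.
EOL = r"(?:\r\n|\n|\r(?!\n))"                 # rough \R; never splits "\r\n"
HSPACE = r"[ \t\xa0]"                         # rough \h
LINE_START = r"(?:\A|(?<=\n)|(?<=\r)(?!\n))"  # rough ^

# ^(\h*\R)+ : delete empty lines and whitespace-only lines.
BLANK_LINES = re.compile(rf"{LINE_START}(?:{HSPACE}*{EOL})+")

# \R\R\K\R+ : no \K here, so capture the part to keep and put it back.
SURPLUS_EOLS = re.compile(rf"({EOL}{{2}}){EOL}+")


def remove_blank_lines(text: str) -> str:
    """Delete lines that are empty or hold only horizontal whitespace."""
    return BLANK_LINES.sub("", text)


def keep_one_blank_line(text: str) -> str:
    """Shrink any run of blank lines down to a single blank line."""
    return SURPLUS_EOLS.sub(r"\1", text)
```

The `(?!\n)` guard in `EOL` matters for the last pattern: without it, backtracking could match the `\r` of a `\r\n` pair on its own and break a Windows line ending apart.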

        Best Regards,

        guy038

        • J
          John Fairweather
          last edited by Nov 25, 2015, 10:05 PM

          Thanks for all your replies.
          However, I should have mentioned that these files with blank lines are TXT files containing output data derived from a number of EXCEL spreadsheets of astronomical data, one TXT file per spreadsheet. Each TXT file contains 6003 lines (by default), with only the first ten lines (or so) containing any data, so 5999 (or so) blank lines have to be removed. The person who wrote the code assumed that the maximum output would be 6003 lines.

          • J
            John Fairweather
            last edited by Dec 8, 2015, 3:19 PM

            Forgot to say, the above solution solved my problem - Thanks.

            • N
              Nguyễn Huy Hải @guy038
              last edited by Jan 26, 2018, 8:27 AM

              Hi!
              The answer above from guy038 seems excellent and should work, but somehow I can’t make it work with my files.

              I’m running the latest version of Notepad++ (7.5.4). I have very limited knowledge of regex, so I’d really appreciate it if someone could point out what I might be doing wrong.

              Thank you!

              • S
                Scott Sumner @Nguyễn Huy Hải
                last edited by Jan 26, 2018, 1:18 PM

                @Nguyễn-Huy-Hải

                Just a guess but maybe your lines are blank but not empty, the difference being that a blank line would contain only whitespace (spaces, tabs, …) and a truly empty line would contain, well, nothing but the line-ending. Without turning on whitespace visibility it would be difficult to see what you have.

                Maybe try turning on this option: View (menu) -> Show Symbol -> Show White Space and TAB
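The same empty-versus-blank distinction can also be checked with a small script. The following is a minimal Python sketch; the function name `classify_lines` is made up for illustration, and it treats only spaces and tabs as whitespace.

```python
import re


def classify_lines(text: str) -> dict:
    """Count truly empty lines versus lines holding only spaces or tabs."""
    counts = {"empty": 0, "blank": 0, "other": 0}
    # Split on the three common EOL styles; "\r\n" is listed first on purpose.
    lines = re.split(r"\r\n|\n|\r", text)
    if lines and lines[-1] == "":
        lines.pop()  # text ended with an EOL, so there is no extra line to count
    for line in lines:
        if line == "":
            counts["empty"] += 1
        elif line.strip(" \t") == "":
            counts["blank"] += 1
        else:
            counts["other"] += 1
    return counts


print(classify_lines("a\r\n\r\n \t\r\nb\r\n"))
```

A non-zero "blank" count would suggest that ^\R+ alone is not enough and that a whitespace-aware pattern is needed.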

                • N
                  Nguyễn Huy Hải @Scott Sumner
                  last edited by Jan 27, 2018, 4:25 AM

                  Hi @Scott-Sumner
                  Thanks for your reply!

                  I had my doubts, so I turned on Show White Space and TAB, but nothing was shown (see the picture below):
                  https://cdn.discordapp.com/attachments/311547963883388938/406663528154529823/unknown.png

                  I used the echo command to add the last line to the text file, but that command also generates another empty line after it. That’s why I need to remove it.

                  After a bit of googling, I found that [\n\r]+$ works. I’m happy, but still curious about the differences between the various regexes.
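One likely explanation, assuming the file simply ends with a line break after the last line of text: that final EOL terminates the last line and does not sit at the start of a line, so ^\R+ has nothing to match there, whereas [\n\r]+$ matches EOL characters that precede an end of line or the end of the file. The Python sketch below illustrates only that end-of-file case. The patterns are approximations (Python's `re` module has no \R, and its `$` does not behave exactly like the one in Notepad++), and the sample text is hypothetical.

```python
import re

# Rough Python stand-in for ^\R+.
EOL = r"(?:\r\n|\n|\r(?!\n))"
LINE_START = r"(?:\A|(?<=\n)|(?<=\r)(?!\n))"
EMPTY_LINES = re.compile(rf"{LINE_START}{EOL}+")

# Stand-in for [\n\r]+$ restricted to the very end of the text (\Z).
TRAILING_EOLS = re.compile(r"[\n\r]+\Z")

# Hypothetical file content that ends with an EOL, as echo would produce.
text = "first line\r\nlast line\r\n"

after_empty_lines = EMPTY_LINES.sub("", text)   # unchanged: no empty line exists
after_trailing = TRAILING_EOLS.sub("", text)    # final "\r\n" is removed

print(repr(after_empty_lines))
print(repr(after_trailing))
```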
