Some CRLFs not being removed when using 'Remove Empty Lines' or 'Remove Empty Lines (Containing Blank Characters)'
-
It works for me:
What line ending has Notepad++ recognized (as shown near the lower-right of the status bar)? If it’s not CRLF, then maybe that’s what’s confusing it. I doubt that’s it, but it’s worth checking.
What are your View > Show Symbol settings?
Perhaps you don’t have the non-printing characters or control-characters visible, and lines that you think are empty actually have invisible/zero-width characters.
For example, I was able to replicate your results by using Ctrl+Shift+E to put an ENQ character on lines 4 and 8. If I have Show Control Characters & Unicode EOL unchecked, I can replicate your second image after the remove-empty-lines:
This is because View > Show Symbol has
If I change that setting and checkmark the option for showing control characters, then I see them:
-
@PeterJones
Thank you very much for your help!
OK, first thing: I was using a pretty old version of Notepad++ that didn’t allow me to see some of those CRLF characters correctly, so that was my first problem. Here is my initial screen in the newest version:
Now, when I do the ‘Remove Empty Lines (Containing Blank Characters)’, I am left with:
So, it is the ‘ZWNBSP’ character that is my problem here, I would think.
Now, armed with that information, I am seeing some talk in the NP++ forums about just that subject…so I am hoping that maybe I can find some more information about the issue.
-
BTW: the ZWNBSP U+FEFF was used in early Unicode as a Zero Width Non-Breaking Space, but its use as such was deprecated more than 20 years ago (recommending using WORD JOINER U+2060 instead), because the same sequence of bytes as U+FEFF was also defined as the BOM at the beginning of the file. It should not have been in the middle of a file since Unicode 3.2 was released in 2002.
So, my guess as to what’s happened is that whatever you are copy/pasting from is either a bunch of small files and you’re accidentally copying the BOM from the beginning, or it’s one big file that was joined together outside your control with the BOMs incorrectly being left in the middle rather than being stripped out when the smaller files were merged.
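As a rough illustration of that second guess, here is a minimal Python sketch (Python is used only for demonstration; the ticket text is made up, and how the real ticketing system merges its data is unknown). It shows how joining two BOM-prefixed UTF-8 chunks byte-for-byte leaves a stray U+FEFF in the middle of the text:

```python
# Hypothetical ticket fragments; the real source data is unknown.
# The "utf-8-sig" codec prepends the BOM bytes EF BB BF when encoding.
part_one = "Ticket 1001\r\n".encode("utf-8-sig")
part_two = "Ticket 1002\r\n".encode("utf-8-sig")

# Naive byte-level concatenation keeps the second chunk's BOM in the middle.
merged_bytes = part_one + part_two

# Decoding with "utf-8-sig" strips only a BOM at the very start;
# the inner one survives as the character U+FEFF (ZWNBSP).
merged_text = merged_bytes.decode("utf-8-sig")

print(repr(merged_text))
print("stray U+FEFF count:", merged_text.count("\ufeff"))
```

A merge step that decoded each chunk separately before joining would not leave that character behind.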
If you cannot solve the problem at the source, then to get rid of that character in Notepad++, even if it’s hidden, you can use Regular Expression mode in the Replace dialog and replace \x{feff} with nothing.
-
Again…THANK YOU for the help!
I did a Find & Replace on ‘\x{feff}’ and replaced it with nothing, and that did get rid of the ZWNBSP characters… so that, combined with the ‘Remove Empty Lines’, gets rid of all that garbage :)
I wish I could change the source of that text that I am copying, but it comes from our ticketing system: I just bring up a window that has all of the ticket number information, then copy all of the text, and paste that into my NP++.
I am trying to make myself as many shortcuts as possible in NP++ to make my ticket handling easier and quicker, and removing those spaces was one of those. I created a right-click context menu entry for the ‘Remove Empty Lines’ command that is found in Edit | Line Operations | Remove Empty Lines, but it couldn’t get rid of those ZWNBSP characters. Now I will try to learn myself a way to combine both of those commands and then add THAT to my context menu.
Thanks again for your help!
-
@James-McBride said in Some CRLFs not being removed when using 'Remove Empty Lines' or 'Remove Empty Lines (Containing Blank Characters)':
Now I will try to learn myself a way to combine both of those commands and then add THAT to my context menu.
If you are at all conversant with regular expressions you could create a macro using the Replace function. The details would be:
Find What: ^\x{feff}*\R
Replace With: (leave this field empty)
So the Find What field looks for any line that starts with zero or more ZWNBSP characters and ends with an EOL (in your case the CRLF sequence). Provided the line has ONLY those characters, the replacement (nothing) will remove it.
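For anyone who wants to try the idea outside Notepad++, here is a minimal Python sketch of the same search. It is only an approximation: Python’s `re` module has no `\R` or `\x{feff}` syntax, so the line ending is spelled out and the character is written as `\ufeff`. The function name and sample text are hypothetical.

```python
import re

# Rough Python equivalent of ^\x{feff}*\R (an approximation, not the
# Notepad++ engine): zero or more U+FEFF characters followed by a line ending.
ZWNBSP_ONLY_LINE = re.compile(r"^\ufeff*(?:\r\n|\r|\n)", re.MULTILINE)


def remove_zwnbsp_only_lines(text: str) -> str:
    """Delete lines that are empty or contain nothing but U+FEFF."""
    return ZWNBSP_ONLY_LINE.sub("", text)


# Made-up sample: two lines of content separated by junk lines.
sample = "Ticket 1001\r\n\ufeff\r\n\r\nTicket 1002\r\n\ufeff\ufeff\r\nNotes\r\n"
cleaned = remove_zwnbsp_only_lines(sample)
print(repr(cleaned))
```

A line that has real text after a leading U+FEFF is left alone, since the pattern only matches when nothing else sits between the start of the line and the line ending.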
If you are not sure how a macro is created, you should read the online manual reference here.
Terry
-
Hello, @james-mcbride, @peterjones, @terry-r and All,
Another solution would be to use this regex S/R:
SEARCH
^[\h\x{FEFF}]*\R
REPLACE
Leave EMPTY
And, of course, you could store this S/R as a macro which could be triggered by a keyboard shortcut!
Best Regards,
guy038
-
@guy038 said:
SEARCH
^[\h\x{FEFF}]*\R
This regex solves the OP’s problem, but I was expecting to see something more all-encompassing, meaning something that would handle more types of normally invisible “horizontal whitespace”, e.g. ZWSP, THSP, etc.
Now, there’s some argument about whether or not those types of things should be removed from a file (they might be there for a good reason), but…
Perhaps something more comprehensive could then serve as a replacement for Remove Empty Lines (Containing Blank characters), which, as we’ve seen, only handles normal space and tab characters.
I guess I’m thinking along the lines of what you derived for vertical whitespace (line-endings) here: https://community.notepad-plus-plus.org/post/91359
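To get a feel for which characters are at stake, here is a small Python sketch that looks up how the Unicode database classifies a few of them. It uses Python’s standard `unicodedata` module purely as a lookup tool; the selection of characters is my own, and the results say nothing about how Notepad++’s regex engine treats them.

```python
import unicodedata

# A hand-picked (not exhaustive) set of normally invisible characters.
candidates = ["\u0020", "\u0009", "\u00a0", "\u2009", "\u200b", "\ufeff"]

for ch in candidates:
    # Control characters such as TAB have no Unicode name, hence the default.
    name = unicodedata.name(ch, "<no name>")
    category = unicodedata.category(ch)
    print(f"U+{ord(ch):04X}  {category}  {name}")
```

In the Unicode database, THIN SPACE falls in the Space Separator category (Zs), while ZERO WIDTH SPACE and ZERO WIDTH NO-BREAK SPACE are Format characters (Cf), so a rule based only on "space" categories will not catch the zero-width ones.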
-
@Alan-Kilborn and everyone else!
I did eventually create a Macro to do all of that, and it works very nicely.
Now I am on to solving other ticketing issues to make my life easier.
-
@Alan-Kilborn said in Some CRLFs not being removed when using 'Remove Empty Lines' or 'Remove Empty Lines (Containing Blank Characters)':
I was expecting to see something more all-encompassing, meaning something that would handle more types of normally invisible “horizontal whitespace”, e.g. ZWSP, THSP, etc.
I came up with:
FIND: (?![\r\n\t\x20])(?:\s|\x{feff})
REPLACE WITH: nothing
This converts the following file:
into:
I just looked up the Unicode “Space Separator” category and put them all in that document, plus ZWNBSP, which is not categorized as a “Space Separator” and so has to be special-cased in the regex.
-
@Mark-Olson said:
I came up with…
Nice one.
So then a unicode-ready replacement for the native command Remove Empty Lines (Containing Blank Characters) can be a macro, recorded as:
Find: ^((?![\n\r])[\s\x{FEFF}])*\R
Replace: nothing
Search mode: Regular expression
In selection: Checkmarked
Action: Replace All
I checkmark In selection because the original command can run on either the active selection or the entire file; for the macro there is a small difference, because to run on the entire file you’d have to Select All (Ctrl+A) first, but that’s not effort-intensive.
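For testing the pattern outside Notepad++, here is a minimal Python sketch of the same idea. It is an approximation under stated assumptions: Python’s `re` has no `\R` or `\x{FEFF}`, so those are rewritten, and Python’s notion of `\s` may not match the Notepad++ engine’s exactly. The function name and sample text are hypothetical.

```python
import re

# Approximation of ^((?![\n\r])[\s\x{FEFF}])*\R for Python's re module:
# any run of non-newline whitespace or U+FEFF, followed by a line ending.
BLANK_LINE = re.compile(
    r"^(?:(?![\r\n])[\s\ufeff])*(?:\r\n|\r|\n)",
    re.MULTILINE,
)


def remove_blank_lines(text: str) -> str:
    """Delete lines holding only whitespace and/or U+FEFF characters."""
    return BLANK_LINE.sub("", text)


# Made-up sample with space+tab, NBSP+thin space, ZWNBSP and empty lines.
sample = (
    "Ticket 1001\r\n"
    " \t\r\n"
    "\u00a0\u2009\r\n"
    "\ufeff\r\n"
    "Ticket 1002\r\n"
    "\r\n"
    "Done\r\n"
)
cleaned = remove_blank_lines(sample)
print(repr(cleaned))
```

The sketch always works on the whole string it is given; restricting it to a selection would simply mean passing only the selected text.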