    Some CRLFs not being removed when using 'Remove Empty Lines' or 'Remove Empty Lines (Containing Blank Characters)'

    • James McBride

      When I am copying text into a NP++ document, I start with this text:
      ebf767dd-482b-4e38-9d45-b7cedfc73c85-image.png

      and then I run ‘Remove Empty Lines’ or ‘Remove Empty Lines (Containing Blank Characters)’
      and end up with this:

      090ebe5e-f31d-4fb1-a903-f697a6dece60-image.png

      Why does the ‘Remove Empty Lines’/‘Remove Empty Lines (Containing Blank Characters)’
      not get rid of those remaining CR|LFs in the text (the remaining empty lines)?

      How would I go about getting rid of those as well?

      • PeterJones @James McBride

        @James-McBride ,

        It works for me:

        db32f3ad-57ad-4cc1-af6c-97006d2182e3-image.png

        0c362ab4-d8f3-4af6-9ee2-affe153c4478-image.png

        What line ending has Notepad++ recognized (as shown near the lower-right of the status bar)? If it’s not CRLF, then maybe that’s what is confusing it. I doubt that’s it, but it’s worth checking.

        What are your View > Show Symbol settings?
        7478f36b-3353-4691-9a46-7a66d0f1447c-image.png

        Perhaps you don’t have the non-printing characters or control-characters visible, and lines that you think are empty actually have invisible/zero-width characters.

        For example, I was able to replicate your results by using Ctrl+Shift+E to put an ENQ character on lines 4 and 8. If I have Show Control Characters & Unicode EOL unchecked, I can replicate your second image after the remove-empty-lines:
        145cc46f-27bf-4834-9027-bb6e794cc8d8-image.png

        This is because View > Show Symbol has
        3a863075-98a5-42b6-9bdc-d34eed3fcabd-image.png

        If I checkmark the option to show control characters, then I see them:
        4bb713af-e2df-40a7-a28f-de001631da13-image.png
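For anyone who wants to check pasted text outside Notepad++, here is a minimal sketch, assuming Python 3 is available. It lists characters that are easy to miss on lines that look empty. The helper name `find_invisible` and the sample text are hypothetical, made up for illustration.

```python
import unicodedata

def find_invisible(text):
    """Yield (line, column, code point, name) for control, format, and
    separator characters other than ordinary space and tab."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in " \t":
                continue  # ordinary blanks are not reported
            category = unicodedata.category(ch)
            # Cc = control (e.g. ENQ), Cf = format (e.g. U+FEFF), Z* = separators
            if category in ("Cc", "Cf") or category.startswith("Z"):
                # Control characters have no Unicode name, so supply a default
                name = unicodedata.name(ch, "<unnamed control>")
                yield line_no, col, f"U+{ord(ch):04X}", name

# Hypothetical sample: line 2 holds an ENQ, line 4 holds a ZWNBSP
sample = "line one\r\n\x05\r\nline three\r\n\ufeff\r\n"
for hit in find_invisible(sample):
    print(hit)
```

Lines 2 and 4 of the sample look empty in most editors, but each is reported with the code point it actually contains.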

        • James McBride @PeterJones

          @PeterJones
          Thank you very much for your help!

          OK, first thing: I was using a pretty old version of Notepad++ that didn’t allow me to see some of those CRLF characters correctly. That was my first problem. Here is the newest version of my initial screen:
          4e558820-9c5d-4d55-841b-03b043d2e246-image.png

          Now, when I do the ‘Remove Empty Lines (Containing Blank Characters)’, I am left with:

          90cec44e-018a-41ed-b6c4-f3fb959e2197-image.png

          So, it is the ‘ZWNBSP’ character that is my problem here, I would think.

          Now, armed with that information, I am seeing some talk in the NP++ forums about just that subject…so I am hoping that maybe I can find some more information about the issue.

          • PeterJones @James McBride

            @James-McBride ,

            BTW: the ZWNBSP U+FEFF was used in early Unicode as a Zero Width Non-Breaking Space, but its use as such was deprecated more than 20 years ago (recommending using WORD JOINER U+2060 instead), because the same sequence of bytes as U+FEFF was also defined as the BOM at the beginning of the file. It should not have been in the middle of a file since Unicode 3.2 was released in 2002.

            So, my guess as to what’s happened is that whatever you are copy/pasting from is either a bunch of small files and you’re accidentally copying the BOM from the beginning, or it’s one big file that was joined together outside your control with the BOMs incorrectly being left in the middle rather than being stripped out when the smaller files were merged.

            If you cannot solve the problem at the source, then to get rid of that character in Notepad++, even if it’s hidden, you can use Regular Expression mode in the Replace dialog and replace \x{feff} with nothing.
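If the cleanup has to happen outside Notepad++, the same replacement can be sketched in Python. This is only an illustration under that assumption: `strip_zwnbsp` is a hypothetical helper name and the ticket text is invented sample data.

```python
ZWNBSP = "\ufeff"  # same code point that serves as the byte order mark

def strip_zwnbsp(text):
    """Remove every U+FEFF, wherever it appears in the text."""
    return text.replace(ZWNBSP, "")

# Hypothetical pasted text with stray U+FEFF characters
pasted = "\ufeffTicket 1\r\n\ufeff\r\nTicket 2\r\n"
cleaned = strip_zwnbsp(pasted)
print(repr(cleaned))
```

As in the Notepad++ replacement, the line that held only a U+FEFF becomes a truly empty line, which still has to be removed in a separate step.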

            • James McBride @PeterJones

              @PeterJones

              Again…THANK YOU for the help!
              I did a Find & Replace on ‘\x{feff}’ and replaced it with nothing, and that did get rid of the ZWNBSP characters. So that, combined with ‘Remove Empty Lines’, gets rid of all that garbage :)

              I wish I could change the source of that text that I am copying, but it comes from our ticketing system and I just bring up a window that has all of the ticket number information

              39588c48-31da-4f55-8f12-b332b8b6d123-image.png

              then copy all of the text, and paste that into my NP++.

              I am trying to make myself as many shortcuts as possible in NP++ to make my ticket handling easier and quicker, and removing those spaces was one of those. I created a right-click context menu entry for the ‘Remove Empty Lines’ command that is found in Edit | Line Operations | Remove Empty Lines, but it couldn’t get rid of those ZWNBSP characters. Now I will try to learn myself a way to combine both of those commands and then add THAT to my context menu.

              Thanks again for your help!

              • Terry R @James McBride

                @James-McBride said in Some CRLFs not being removed when using 'Remove Empty Lines' or 'Remove Empty Lines (Containing Blank Characters)':

                Now I will try to learn myself a way to combine both of those commands and then add THAT to my context menu.

                If you are at all conversant with regular expressions you could create a macro using the Replace function. The details would be:
                Find What: ^\x{feff}*\R
                Replace With: (leave this field empty)

                So the Find What field looks for any line that starts with zero or more ZWNBSP characters and ends with an EOL (in your case the CRLF sequence). Provided the line has ONLY those characters, the replacement (nothing) will remove the line.
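To see the "ONLY those characters" condition in action, here is a rough Python translation of that search. Python's `re` module has no `\R` or `\x{...}`, so the pattern below is an approximation rather than the Notepad++ regex itself. The function name and sample text are hypothetical.

```python
import re

# Approximate equivalent of ^\x{feff}*\R:
# \R is spelled out as CRLF, LF, or CR, and \x{feff} becomes \ufeff.
# Note: with re.MULTILINE, ^ matches only at the start of the text and
# after "\n", so text that uses bare "\r" line endings is not handled.
ZWNBSP_ONLY_LINE = re.compile(r"^\ufeff*(?:\r\n|\n|\r)", re.MULTILINE)

def remove_zwnbsp_lines(text):
    """Delete lines that contain nothing except U+FEFF characters."""
    return ZWNBSP_ONLY_LINE.sub("", text)

# Hypothetical sample: one ZWNBSP-only line, one empty line, and one
# line where a ZWNBSP precedes real content
sample = "first\r\n\ufeff\r\n\r\nsecond\r\n\ufeffthird\r\n"
result = remove_zwnbsp_lines(sample)
print(repr(result))
```

The last line of the sample keeps its leading U+FEFF because the line also holds other text, so the pattern does not match it.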

                If you are not sure how a macro is created, you should read the online manual reference here.

                Terry

                • guy038

                  Hello, @james-mcbride, @peterjones, @terry-r and All,

                  Another solution would be to use this regex S/R:

                  SEARCH ^[\h\x{FEFF}]*\R

                  REPLACE Leave EMPTY

                  And, of course, you could store this S/R as a macro which could be triggered by a keyboard shortcut!

                  Best Regards,

                  guy038

                  • Alan Kilborn @guy038

                    @guy038 said:

                    SEARCH ^[\h\x{FEFF}]*\R

                    This regex solves the OP’s problem, but I was expecting to see something more all-encompassing, meaning something that would handle more types of normally invisible “horizontal whitespace”, e.g. ZWSP, THSP, etc.

                    Now, there’s some argument about whether or not those types of things should be removed from a file (they might be there for a good reason), but…

                    Perhaps something more comprehensive could then serve as a replacement for Remove Empty Lines (Containing Blank characters), which, as we’ve seen, only handles normal space and tab characters.

                    I guess I’m thinking along the lines of what you derived for vertical whitespace (line-endings) here: https://community.notepad-plus-plus.org/post/91359

                    • James McBride @Alan Kilborn

                      @Alan-Kilborn and everyone else!

                      I did eventually create a Macro to do all of that, and it works very nicely.

                      Now I am on to solving other ticketing issues to make my life easier.

                      • Mark Olson @Alan Kilborn

                        @Alan-Kilborn said in Some CRLFs not being removed when using 'Remove Empty Lines' or 'Remove Empty Lines (Containing Blank Characters)':

                        I was expecting to see something more all-encompassing, meaning something that would handle more types of normally invisible “horizontal whitespace”, e.g. ZWSP, THSP, etc.

                        I came up with:
                        FIND: (?![\r\n\t\x20])(?:\s|\x{feff})
                        REPLACE WITH: nothing

                        This converts the following file:
                        54277d97-765e-4849-a351-9f74adf5d64e-image.png
                        into:
                        65d3126c-7830-4184-bbef-5c1b2ea061ec-image.png

                        I just looked up the Unicode “Space Separator” category and put them all in that document, plus ZWNBSP, which is not categorized as a “Space Separator” and so has to be special-cased in the regex.
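The same idea can be tried in Python as a sketch, with one caveat: Python's `\s` is not guaranteed to cover the same set of characters as `\s` in Notepad++, so the results may differ from the screenshots above. The function name and sample text are hypothetical.

```python
import re

# Any whitespace (or U+FEFF) that is NOT CR, LF, tab, or a plain space
UNUSUAL_SPACE = re.compile(r"(?![\r\n\t\x20])(?:\s|\ufeff)")

def strip_unusual_spaces(text):
    """Remove unusual space characters but keep normal blanks and EOLs."""
    return UNUSUAL_SPACE.sub("", text)

# Hypothetical sample: NBSP, THIN SPACE, IDEOGRAPHIC SPACE, ZWNBSP,
# followed by an ordinary space, a tab, and a CRLF that must survive
sample = "a\u00a0b\u2009c\u3000d\ufeffe f\tg\r\n"
result = strip_unusual_spaces(sample)
print(repr(result))
```

In Python, ZWSP (U+200B) is a format character rather than whitespace, so `\s` does not match it. It would need to be special-cased the same way U+FEFF is.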

                        • Alan Kilborn

                          @Mark-Olson said:

                          I came up with…

                          Nice one.


                          So then a unicode-ready replacement for the native command Remove Empty Lines (Containing Blank Characters) can be a macro, recorded as:

                          Find: ^((?![\n\r])[\s\x{FEFF}])*\R
                          Replace: nothing
                          Search mode: Regular expression
                          In selection: Checkmarked
                          Action: Replace All

                          I checkmark In selection because the original command can run on either the active selection or the entire file. For the macro there is a small difference: to run on the entire file you’d have to Select All (Ctrl+A) first, but that’s not effort-intensive.
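For comparison, here is a hedged Python sketch of the same Find expression, applied to a whole string rather than a selection. It is an approximate translation: `\R` is spelled out, `\x{FEFF}` becomes `\ufeff`, and Python's `\s` may not match exactly the same characters as Notepad++'s. The function name and sample text are hypothetical.

```python
import re

# Approximate equivalent of ^((?![\n\r])[\s\x{FEFF}])*\R
# A line matches when it holds only non-EOL whitespace and/or U+FEFF.
# With re.MULTILINE, ^ matches only at the start of the text and after
# "\n", so bare "\r" line endings are not handled.
BLANK_LINE = re.compile(
    r"^(?:(?![\r\n])[\s\ufeff])*(?:\r\n|\n|\r)", re.MULTILINE
)

def remove_blank_lines(text):
    """Delete lines that are empty or contain only blank characters."""
    return BLANK_LINE.sub("", text)

# Hypothetical sample: a space+tab line, an EM SPACE + ZWNBSP line, a
# truly empty line, then two lines with real content that must be kept
sample = "one\r\n \t\r\n\u2003\ufeff\r\n\r\ntwo \r\n\u00a0three\r\n"
result = remove_blank_lines(sample)
print(repr(result))
```

Only whole blank lines are removed. Trailing or leading blanks on lines that carry other text are left alone, which matches what the native command is described as doing.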
