    Sort Lines Lexicographically did not work

      • aarocaesA
        aarocaes
        last edited by

        Maybe this will be helpful for someone else. I had this issue, and the problem was related to the line-break codes.

        Even if you see separate lines, it is possible that only one of the line-break codes (\n or \r) is being used. If that’s the case, sorting won’t work, because it treats all the text as one single line. You can check this by enabling the “Show all characters” option:

        2020-10-01 23_00_30-Window.png

        LF is the code for Line Feed (\n), but the CR (Carriage Return, \r) code is missing. Unlike on Unix systems, the standard line termination on Windows and in many Internet protocols is \r\n.

        So if you replace these line-termination codes by activating the Extended search mode (replacing \n with \r\n), it will work.

        2020-10-01 22_58_49-Window.png
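
Outside the editor, the same check can be scripted. The following is a minimal Python sketch, not something from the thread: the function name `count_line_endings` and the sample bytes are made up for illustration. It counts how many CRLF, lone CR, and lone LF sequences a chunk of raw bytes contains, which is roughly what “Show all characters” lets you see by eye.

```python
import re


def count_line_endings(data: bytes) -> dict:
    """Count CRLF, lone CR, and lone LF sequences in raw bytes."""
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    # Order matters: try \r\n first so a CRLF pair is not counted as CR + LF.
    for match in re.finditer(rb"\r\n|\r|\n", data):
        eol = match.group()
        if eol == b"\r\n":
            counts["CRLF"] += 1
        elif eol == b"\r":
            counts["CR"] += 1
        else:
            counts["LF"] += 1
    return counts


# Hypothetical sample: two Windows endings and one bare LF.
sample = b"pear\r\napple\nfig\r\n"
print(count_line_endings(sample))
```

If more than one of the three counts is non-zero, the data has mixed line endings. When reading a real file for this, open it in binary mode so Python does not translate the endings first.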

        • PeterJonesP
          PeterJones @aarocaes
          last edited by

          @aarocaes ,

          Interesting. It sorts correctly for me.
          5cdccf20-469c-44e6-a765-074bf01d15ac-image.png

          I thought maybe it happens if you have mixed line endings, where some don’t match the current line-endings mode:
          94dd079f-50dd-4ff3-ae3a-a124003337e8-image.png => 9cc139ab-cde8-41c8-89d1-10cb24fb14fa-image.png
          Yeah, that’s it: correctly\nsorted and these\nlines were treated as single lines.

          So, as long as you don’t have mixed line endings, and your line endings match the defined line endings setting shown in the status bar, it will work. But if there’s a mix or discrepancy, the mismatched sequence won’t be treated as a line ending.
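
The observed behavior can be modeled in a few lines of Python. This is a simplified sketch under the assumption that the sort splits only on the EOL sequence configured for the document; it is not Notepad++'s actual implementation, and `sort_by_configured_eol` is a hypothetical name.

```python
def sort_by_configured_eol(text: str, eol: str) -> str:
    """Sort the 'lines' obtained by splitting ONLY on the configured EOL."""
    had_trailing = text.endswith(eol)
    body = text[: -len(eol)] if had_trailing else text
    lines = body.split(eol)  # a bare \n is NOT a separator when eol is \r\n
    lines.sort()
    return eol.join(lines) + (eol if had_trailing else "")


# Document mode is CRLF, but two of the breaks are bare LF.
mixed = "these\nlines\r\ncorrectly\nsorted\r\nabc\r\n"
print(repr(sort_by_configured_eol(mixed, "\r\n")))
# 'abc\r\ncorrectly\nsorted\r\nthese\nlines\r\n'
```

In this model, "correctly\nsorted" and "these\nlines" each travel through the sort as one unit, which matches what the screenshots show.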

          • gitberryG
            gitberry @PeterJones
            last edited by

            Thank you for these posts - I love everything about NotePad++ EXCEPT line sorting because I’ve never understood why NotePad++ was inconsistent with line sorting until I read this. Now I can use it with more confidence.

            On reflection, this seems like an unnecessary pain for those who work in all 3 line-ending environments, and I am wondering whether we could add something to the settings along these lines:
            [ X ] SORT Line Ending Agnostic (will treat CR, CR-LF and LF equally as line endings when sorting)
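
For illustration only, here is what such a line-ending-agnostic sort could look like as a Python sketch. The name `sort_lines_agnostic` and the choice to write every line back with a single `out_eol` are assumptions of this example, not a description of any existing or planned Notepad++ feature.

```python
import re

# CRLF first, so a Windows pair is treated as ONE line ending.
ANY_EOL = re.compile(r"\r\n|\r|\n")


def sort_lines_agnostic(text: str, out_eol: str = "\r\n") -> str:
    """Sort lines, treating CR, CRLF, and LF equally as line endings."""
    had_trailing = text.endswith(("\r", "\n"))
    lines = ANY_EOL.split(text)
    if had_trailing:
        lines.pop()  # drop the empty piece after the final line ending
    lines.sort()
    # Assumption: all lines are re-joined with one uniform ending.
    return out_eol.join(lines) + (out_eol if had_trailing else "")


print(repr(sort_lines_agnostic("pear\napple\r\nfig\r")))
# 'apple\r\nfig\r\npear\r\n'
```

One design question such a feature would have to answer is what to write back: this sketch unifies the endings, whereas keeping each line's original ending would let the endings move around with the sorted lines.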

            • Alan KilbornA
              Alan Kilborn @gitberry
              last edited by

              @gitberry

              What is a good example of why someone would need to “mix” line-ending types within one file? I don’t think I’ve ever run across a good reason for this. I welcome being enlightened.

              • Terry RT
                Terry R
                last edited by

                @gitberry said in Sort Lines Lexicographically did not work:

                Line Ending Agnostic (will treat CR, CR-LF and LF equally as line endings when sorting)

                If you read @PeterJones post his last remarks were:
                So, as long as you don’t have mixed line endings, and your line endings match the defined line endings setting shown in the status bar, it will work.

                So if a file ONLY contains 1 type of line ending and that matches the status bar then sorting will work as expected. This seems to me to be a reasonable assumption by the developers.

                Maybe I’m naive when it comes to the various line endings (Unix, Mac vs Windows), but I don’t see why a file would contain a mixture of the 3 types, unless there had been an error in encoding or reading of the file. So AFAIK a file would ONLY ever contain 1 type of line ending, and that depends on the use/environment the file is being used in.

                Notepad++ also has a function under the Edit menu, called EOL Conversion, where the file is “totally” converted from one type to another. Maybe this should be used before a sort operation if you are at all unsure whether there is a mixture of types.

                Terry

                • gitberryG
                  gitberry @Alan Kilborn
                  last edited by

                  @Alan-Kilborn So true! I don’t think there is a good reason. When it happens (i.e., received from an uncaring/uncareful source, etc.) and the sort doesn’t work…

                  • Alan KilbornA
                    Alan Kilborn @gitberry
                    last edited by Alan Kilborn

                    @gitberry said in Sort Lines Lexicographically did not work:

                    So true! I don’t think there is a good reason. When it happens (i.e., received from an uncaring/uncareful source, etc.) and the sort doesn’t work…

                    If you are likely to get files of that nature from another source, suggest you “sanitize” them before beginning to work with them.

                    For example, do a line-ending conversion in Notepad++, which will unify the line-endings all to one type (whichever type you desire). After that, do your sort, or whatever other data manipulations you need to do.

                    (I guess Terry already said the same thing; sorry, didn’t see that first before crafting this reply)

                    • Alan KilbornA
                      Alan Kilborn
                      last edited by

                      @Terry-R said in Sort Lines Lexicographically did not work:

                      I don’t see why a file would contain a mixture of the 3 types, unless there had been an error in encoding or reading of the file. So AFAIK a file would ONLY ever contain 1 type of line ending, and that depends on the use/environment the file is being used in.

                      Amen, brother, amen.

                      However, Notepad++ (Scintilla) doesn’t enforce this.
                      And, by default, it doesn’t let you know that you have “screwed up” files when this situation happens to occur.

                      One way for it to occur is the aforementioned reception of files from another source.

                      Another way for it to happen is a regex replacement where users think that \n works to match a line-ending of any type. It does NOT; \R should be used instead for this purpose. But, again, Notepad++ lets you do it, so line-ending weirdness can happen from this.
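
The difference is easy to reproduce outside Notepad++. The sketch below uses Python, whose built-in `re` module has no `\R` escape, so the alternation `\r\n|\r|\n` stands in for it here. That substitution is an assumption of this example; `\R` in Notepad++'s regex engine may also match other break characters.

```python
import re

text = "one\r\ntwo\rthree\nfour"

# "\n" matches only the LF character: the lone CR after "two" is missed,
# and the CR of the CRLF pair is left stuck to the end of "one".
lf_pieces = re.split(r"\n", text)
print(lf_pieces)   # ['one\r', 'two\rthree', 'four']

# Stand-in for \R: CRLF first, then lone CR, then lone LF.
ANY_EOL = r"\r\n|\r|\n"
any_pieces = re.split(ANY_EOL, text)
print(any_pieces)  # ['one', 'two', 'three', 'four']
```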

                      A good way to “set it and forget it” to avoid this type of problem is by using the EditorConfig plugin. With that, you specify your desired line-ending type, and when files are saved in Notepad++, the plugin steps in and corrects any improper line-endings to your desired type.

                      An alternative way to monitor the situation is to turn on visible line-endings and then hope you notice a mismatch. However, looking at the “heavy” line-ending character representation is too visually overwhelming for me. YMMV.

                      • Alan KilbornA
                        Alan Kilborn @gitberry
                        last edited by

                        @gitberry said in Sort Lines Lexicographically did not work:

                        [ X ] SORT Line Ending Agnostic (will treat CR, CR-LF and LF equally as line endings when sorting)

                        There’s an open issue on the ISSUE-TRACKER for this; perhaps you wanna add your voice there so it can be heard by developers?

                        Personally, I don’t think this needs a setting, I think it should ignore line-endings when sorting.

                        But, for myself, I use the EditorConfig plugin so that I just don’t get into a situation where a sorting problem (or any other problem that could occur from this) happens.

                        • mathlete2M
                          mathlete2 @Alan Kilborn
                          last edited by

                          @Alan-Kilborn said in Sort Lines Lexicographically did not work:

                          For example, do a line-ending conversion in Notepad++, which will unify the line-endings all to one type (whichever type you desire). After that, do your sort, or whatever other data manipulations you need to do.

                          actually, if you open this menu on a file with a mixture of line endings, the original selection is greyed out; NP++ thinks everything is still unified. perhaps this is a bug?

                          96805736-7d1d-47e7-85b6-e6b2593cfe43-image.png

                          • PeterJonesP
                            PeterJones @mathlete2
                            last edited by

                            @mathlete2 said in Sort Lines Lexicographically did not work:

                            actually, if you open this menu on a file with a mixture of line endings, the original selection is greyed out; NP++ thinks everything is still unified. perhaps this is a bug?

                            No. It just picked one (probably based on the line 1 ending). To unify, you need to trigger at least one conversion, so pick the wrong one (like LF), and then convert back to the right one (CRLF). This is why Alan phrased it as “do a line-ending conversion”, not just “pick the line ending you want”.

                            • mathlete2M
                              mathlete2
                              last edited by

                              also, FWIW, you can get yourself into these situations if you do a RegEx replacement similar to the one below to separate objects into separate lines. visually, this gets you to the sortable state you want, but the EOL codes interfere with the actual sorting.

                              ff7074db-b897-4d4e-930f-fe89c4432cc0-image.png

                              • Alan KilbornA
                                Alan Kilborn @mathlete2
                                last edited by Alan Kilborn

                                @mathlete2 said in Sort Lines Lexicographically did not work:

                                also, FWIW, you can get yourself into these situations if you do a RegEx replacement similar to the one below to separate objects into separate lines. visually, this gets you to the sortable state you want, but the EOL codes interfere with the actual sorting.

                                I’m guessing from your screenshot that you have the belief (like others have in the past) that using \n in your replacement will get you \r\n in files that have “Windows (CR LF)” type. It is NOT true. You get what you ask for; in this case you will get exactly \n…another way to end up with a mishmash of line-ending types in your file, as maybe you found out…and yes, as the rest of this thread indicates, that will affect sorting.
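
The “you get what you ask for” behavior is not specific to Notepad++. As a small Python sketch with made-up sample data (the `;` separator is a hypothetical stand-in for whatever the screenshot's regex matched), a replacement containing a bare `\n` puts exactly `\n` into text whose real line endings are `\r\n`:

```python
import re

# Hypothetical Windows-style text with ';' separating the objects.
crlf_text = "a;b;c\r\nd;e\r\n"

# Replacing ';' with a bare LF gives a mix of LF and CRLF endings.
naive = re.sub(";", "\n", crlf_text)
print(repr(naive))    # 'a\nb\nc\r\nd\ne\r\n'

# Spelling out the document's own ending keeps the result uniform.
uniform = re.sub(";", "\r\n", crlf_text)
print(repr(uniform))  # 'a\r\nb\r\nc\r\nd\r\ne\r\n'
```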

                                It is never really a good idea to change a file’s line-ending type with a regular expression replacement. Best to use the status-bar menu already discussed.

                                Or, here’s a super-secret hack to unify line-ending characters:
                                Paste some data into a file with mixed line-endings using the Edit menu’s Paste command and you will observe that all line-endings become the same! Note that using Ctrl+v to paste does NOT get you the same effect!

                                • mathlete2M
                                  mathlete2 @PeterJones
                                  last edited by

                                  @PeterJones well, if you select the “wrong” one that matches the other “wrong” ones already there, NP++ doesn’t add a second EOL character to those lines. So, “self conversions” are already possible to a certain extent, just not for the active selection

                                  • mathlete2M
                                    mathlete2 @Alan Kilborn
                                    last edited by

                                    @Alan-Kilborn said in Sort Lines Lexicographically did not work:

                                    I’m guessing from your screenshot that you have the belief (like others have in the past) that using \n in your replacement will get you \r\n in files that have “Windows (CR LF)” type

                                    nope, just wasn’t aware that Windows used a weird EOL character coding until I came across this thread ;)

                                    • guy038G
                                      guy038
                                      last edited by guy038

                                      Hello, @mathlete2 and All,

                                      • Regarding your regex, the trick is to capture the line-ending of each line in a group and place it, twice, in the replacement. So, this new regex S/R:

                                      SEARCH (\w+(\\[w+\\])*)(\R)

                                      REPLACE \1\3\3
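
The idea of capturing each line's own ending and reusing it can be sketched in Python as well. The pattern below is a simplified stand-in, not the SEARCH expression above: it has only two groups (so the ending is `\2` rather than `\3`), and `\r\n|\r|\n` replaces `\R`, which Python's `re` module does not support. The function name is hypothetical.

```python
import re


def double_line_endings(text: str) -> str:
    """Repeat each line's ending once, using whatever ending that line has."""
    # Group 1: the line content (no EOL characters).
    # Group 2: that line's own ending, reused twice in the replacement.
    return re.sub(r"([^\r\n]+)(\r\n|\r|\n)", r"\1\2\2", text)


print(repr(double_line_endings("a\r\nb\nc")))
# 'a\r\n\r\nb\n\nc'
```

Because the ending is copied from the match instead of being typed into the replacement, no new line-ending type is introduced into the text.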


                                      • A simple way to get uniform line-endings, with a regex, is to run :

                                        • SEARCH \R    ( OK, whatever the effective line-ending of each line )

                                        • REPLACE \r\n    Case of Windows line-endings wanted

                                        • REPLACE \n    Case of Unix line-endings wanted

                                        • REPLACE \r    Case of Macintosh line-endings wanted

                                      Of course :

                                      • Tick the Wrap around option

                                      • Click on the Replace All button
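
The same normalization can be done in a script before the file ever reaches the editor. This is a minimal Python sketch under stated assumptions: `normalize_eol` is a hypothetical helper, and `\r\n|\r|\n` stands in for `\R`, which Python's `re` module lacks.

```python
import re

# CRLF must come first so a Windows pair is replaced once, not twice.
ANY_EOL = re.compile(r"\r\n|\r|\n")


def normalize_eol(text: str, eol: str = "\r\n") -> str:
    """Replace every CRLF, lone CR, or lone LF with the requested ending."""
    # A function replacement avoids any escape processing of `eol`.
    return ANY_EOL.sub(lambda _match: eol, text)


mixed = "a\rb\nc\r\n"
print(repr(normalize_eol(mixed)))        # 'a\r\nb\r\nc\r\n'  (Windows)
print(repr(normalize_eol(mixed, "\n")))  # 'a\nb\nc\n'        (Unix)
```

When applying this to a real file in Python, open it with `newline=""` for both reading and writing so that Python's own newline translation does not interfere.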

                                      Best Regards,

                                      guy038

                                      • Alan KilbornA
                                        Alan Kilborn @mathlete2
                                        last edited by Alan Kilborn

                                        @mathlete2 said in Sort Lines Lexicographically did not work:

                                        …just wasn’t aware that Windows used a weird EOL character coding

                                        LOL. I’m guessing again: You’re a young person! :-)
                                        Us “old timers” stopped wondering about the weirdness of line-endings a long time ago.

                                        • Alan KilbornA
                                          Alan Kilborn @mathlete2
                                          last edited by Alan Kilborn

                                          @mathlete2 said in Sort Lines Lexicographically did not work:

                                          well, if you select the “wrong” one that matches the other “wrong” ones already there, NP++ doesn’t add a second EOL character to those lines. So, “self conversions” are already possible to a certain extent, just not for the active selection

                                          It certainly “knows” about which characters are possible line-ending characters, i.e., \n and \r, so yes, when doing a conversion it considers a line to be a group of characters followed by any conglomeration of line-ending characters. Meaning that it knows how to strip off everything there before applying fresh line-ends.

                                          But to your larger point (I think), it certainly would be possible to have all 3 of those choices enabled at all times (if the developers so decided), so that if you have a mishmash, and N++ thinks you have a “Windows” file, and you want to clear up the mishmash and indeed end up with a “Windows” file, you really shouldn’t have to do TWO conversions.

                                          If you find you have to do this often, perhaps making a macro out of @guy038 's suggested operation(s) is a wise course of action (going against my advice to not use regex for this).

                                          Or, better yet, check out the EditorConfig plugin, set it up for what you want, and (after a save of your file), never think about these “weird” line endings again.

                                          • Terry RT
                                            Terry R @mathlete2
                                            last edited by

                                            @mathlete2 said in Sort Lines Lexicographically did not work:

                                            nope, just wasn’t aware that Windows used a weird EOL character coding until I came across this thread ;)

                                            Us Windows users might take exception to that description, LOL!

                                            If you’re of similar age to me you will know about the manual typewriter which had a “carriage return lever” on the left which performed the \r and \n functions in one movement.

                                            Terry

                                            PS actually I’m not quite as old as the one in the picture, but I did get a hold of one similar that I intend on restoring sometime.

                                            7ec62ada-cd8c-4f94-94e3-299afb447d19-image.png

                                            The Community of users of the Notepad++ text editor.