File sorting

  • G
    guy038
    last edited by guy038 Jul 24, 2019, 6:21 PM Jul 24, 2019, 6:21 PM

    @dave-pruce, @alan-kilborn,

    Yes, Alan, your new attempt is the right solution when working with UTF-8 encoded files, which may contain multi-byte encoded characters!

    As for me, I was thinking about the opposite solution: converting the UTF-8 files to ANSI. However, with this approach, some characters may turn into question marks or be replaced by an approximate character, because they do not belong to the corresponding ANSI table of 256 characters!

    For instance, in my previous list of rivers, the Turkish Kızılırmak river, which contains the Latin lowercase dotless letter ı (code point \x{0131}), is changed into the approximate name Kizilirmak after conversion to ANSI!
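
A minimal Python sketch of why the two encodings behave differently here. It assumes Windows-1252 (`cp1252`) as the ANSI code page, which is only an assumption about the poster's system. Python's `errors="replace"` substitutes a question mark; it does not reproduce the approximate mapping to `i` described above.

```python
# The river name, with the dotless i written as an explicit escape (U+0131).
name = "K\u0131z\u0131l\u0131rmak"

# Character count versus UTF-8 byte count: each dotless i takes two bytes.
char_count = len(name)
byte_count = len(name.encode("utf-8"))

# Assumption: the ANSI code page is Windows-1252, which has no slot for U+0131.
ansi_text = name.encode("cp1252", errors="replace").decode("cp1252")

print(char_count, byte_count, ansi_text)
```

The gap between the character count and the byte count is why a length-based sort has to be clear about which of the two it measures.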

    Anyway, we just did our best to solve the OP’s problem ;-))

    BR

    guy038

    • F
      freezer2022 @Dave Pruce
      last edited by freezer2022 Oct 23, 2023, 5:46 PM Oct 23, 2023, 5:13 PM

      @Dave-Pruce said:

      Is it possible to sort a file by line length??

      Yes, though not natively: there is a Notepad++ plugin for it, Linesort v1.1 (only for 32-bit Notepad++):

      https://webarchive.org/web/20200207125518/http://www.scout-soft.com/linesort/
      

      linesort.png

        • M
          mkupper @guy038
          last edited by Oct 24, 2023, 1:54 PM

          @guy038 You essentially did “programming” with a human computer doing the evaluations and flow control. :-)

          That reminds me of the stories about the first computers: "computer" was originally a human job title for people who did the calculations and also had to handle the flow control! As ways were found to do parts of the job, first by mechanical and then by electronic means, the resulting machines came to be known as computers.

          • G
            guy038
            last edited by guy038 Oct 25, 2023, 2:20 AM Oct 25, 2023, 2:04 AM

            Hello, All,

            Thanks to @mkupper, who recently posted a comment, and exactly three years later, I am going to simplify the way to sort lines by length and, secondarily, by line contents too!

            Like in my previous post, I will use this list of rivers, below :

            https://en.wikipedia.org/wiki/List_of_rivers_by_length

            After removing some duplicates, we get an INPUT text of 238 river names:

            Nile
            White Nile
            Kagera
            Nyabarongo
            Mwogo
            Rukarara
            Amazon
            Ucayali
            Tambo
            Ene
            Mantaro
            Yangtze
            Mississippi
            Missouri
            Jefferson
            Beaverhead
            Red Rock
            Hell Roaring
            Yenisei
            Angara
            Selenge
            Ider
            Yellow River
            Ob
            Irtysh
            Río de la Plata
            Paraná
            Congo
            Chambeshi
            Amur
            Argun
            Kherlen
            Lena
            Mekong
            Mackenzie
            Slave
            Peace
            Finlay
            Niger
            Brahmaputra
            Tsangpo
            Murray
            Darling
            Culgoa
            Balonne
            Condamine
            Tocantins
            Araguaia
            Volga
            Indus
            Sênggê Zangbo
            Shatt al-Arab
            Euphrates
            Murat
            Madeira
            Mamoré
            Caine
            Rocha
            Purús
            Yukon
            São Francisco
            Syr Darya
            Naryn
            Salween
            Saint Lawrence
            Niagara
            Detroit
            Saint Clair
            Saint Marys
            Saint Louis
            North
            Nizhnyaya Tunguska
            Danube
            Breg
            Zambezi
            Vilyuy
            Ganges
            Hooghly
            Padma
            Amu Darya
            Panj
            Japurá
            Nelson
            Saskatchewan
            Paraguay
            Kolyma
            Pilcomayo
            Biya
            Katun
            Ishim
            Juruá
            Ural
            Arkansas
            Colorado
            Olenyok
            Dnieper
            Aldan
            Ubangi
            Uele
            Negro
            Columbia
            Zhujiang
            Red
            Ayeyarwady
            Kasai
            Ohio
            Allegheny
            Orinoco
            Tarim
            Xingu
            Orange
            Salado
            Vitim
            Tigris
            Songhua
            Tapajós
            Don
            Podkamennaya Tunguska
            Pechora
            Kama
            Limpopo
            Chulym
            Guaporé
            Indigirka
            Snake
            Senegal
            Uruguay
            Blue Nile
            Churchill
            Khatanga
            Okavango
            Volta
            Beni
            Platte
            Tobol
            Alazeya
            Jubba
            Shebelle
            Içá
            Magdalena
            Han
            Kura
            Oka
            Guaviare
            Pecos
            Murrumbidgee
            Godavari
            Río Grande
            Belaya
            Cooper
            Barcoo
            Marañón
            Dniester
            Benue
            Ili
            Warburton
            Georgina
            Sutlej
            Yamuna
            Vyatka
            Fraser
            Brazos
            Liao
            Lachlan
            Yalong
            Iguaçu
            Olyokma
            Northern Dvina
            Sukhona
            Krishna
            Iriri
            Narmada
            Lomami
            Ottawa
            Lerma
            Grande de Santiago
            Elbe
            Vltava
            Zeya
            Juruena
            Rhine
            Athabasca
            Canadian
            North Saskatchewan
            Vistula
            Bug
            Vaal
            Shire
            Ogooué
            Nen
            Kızılırmak
            Markha
            Green
            Milk
            Chindwin
            Sankuru
            Wu
            James
            Kapuas
            Desna
            Helmand
            Madre de Dios
            Tietê
            Vychegda
            Sepik
            Cimarron
            Anadyr
            Paraíba do Sul
            Jialing
            Liard
            Cumberland
            White
            Huallaga
            Kwango
            Draa
            Gambia
            Tyung
            Chenab
            Yellowstone
            Ghaghara
            Huai
            Aras
            Chu
            Seversky Donets
            Bermejo
            Fly
            Kuskokwim
            Tennessee
            Oder
            Warta
            Aruwimi
            Daugava
            Gila
            Loire
            Essequibo
            Khoper
            Tagus
            Flinders
            
            • At the end of the first line, we add space characters up to column 100

            • Then, with a zero-length column selection at column 100 spanning all the lines, we insert an exclamation mark ( ! ) at the end of every line of the list:

            => We get this temporary text ( I just listed the first lines and the last lines ) :

            Nile                                                                                               !
            White Nile                                                                                         !
            Kagera                                                                                             !
            Nyabarongo                                                                                         !
            Mwogo                                                                                              !
            Rukarara                                                                                           !
            Amazon                                                                                             !
            Ucayali                                                                                            !
            Tambo                                                                                              !
            Ene                                                                                                !
            Mantaro                                                                                            !
            Yangtze                                                                                            !
            Mississippi                                                                                        !
            Missouri                                                                                           !
            ......                                                                                             !
            ......                                                                                             !
            ......                                                                                             !
            ......                                                                                             !
            Seversky Donets                                                                                    !
            Bermejo                                                                                            !
            Fly                                                                                                !
            Kuskokwim                                                                                          !
            Tennessee                                                                                          !
            Oder                                                                                               !
            Warta                                                                                              !
            Aruwimi                                                                                            !
            Daugava                                                                                            !
            Gila                                                                                               !
            Loire                                                                                              !
            Essequibo                                                                                          !
            Khoper                                                                                             !
            Tagus                                                                                              !
            Flinders                                                                                           !
            
            
            • Now, we perform this regex S/R :

              • SEARCH ^([\w -]+?)(\x20+)(?=!)

              • REPLACE \2\1
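
The effect of this search and replace can be sketched in Python. The `WIDTH` value and the two sample names are illustrative assumptions, and Python's `re` module is only a stand-in for the Notepad++ regex engine, although this particular pattern behaves the same way in both.

```python
import re

WIDTH = 100  # assumed padding width, for illustration only

# Rebuild two padded lines as in the previous step: name, spaces, then "!".
padded = "\n".join(name.ljust(WIDTH) + "!" for name in ["Nile", "White Nile"])

# Group 1 is the name (lazy), group 2 the run of padding spaces before "!".
# Swapping them as \2\1 right-aligns the name against the "!".
swapped = re.sub(r"^([\w -]+?)(\x20+)(?=!)", r"\2\1", padded, flags=re.MULTILINE)

lines = swapped.splitlines()
for line in lines:
    print(repr(line))
```

Note that the character class `[\w -]` only covers word characters, spaces and hyphens, so a line containing anything else (an apostrophe, for instance) would not match and would stay left-aligned.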

            => Again, we get this temporary text ( I just listed the first lines and the last lines ) :

                                                                                                           Nile!
                                                                                                     White Nile!
                                                                                                         Kagera!
                                                                                                     Nyabarongo!
                                                                                                          Mwogo!
                                                                                                       Rukarara!
                                                                                                         Amazon!
                                                                                                        Ucayali!
                                                                                                          Tambo!
                                                                                                            Ene!
                                                                                                        Mantaro!
                                                                                                        Yangtze!
                                                                                                    Mississippi!
                                                                                                       Missouri!
                                                                                                         ......!
                                                                                                         ......!
                                                                                                         ......!
                                                                                                         ......!
                                                                                                Seversky Donets!
                                                                                                        Bermejo!
                                                                                                            Fly!
                                                                                                      Kuskokwim!
                                                                                                      Tennessee!
                                                                                                           Oder!
                                                                                                          Warta!
                                                                                                        Aruwimi!
                                                                                                        Daugava!
                                                                                                           Gila!
                                                                                                          Loire!
                                                                                                      Essequibo!
                                                                                                         Khoper!
                                                                                                          Tagus!
                                                                                                       Flinders!
            
            • Then, we run the Edit > Line Operations > Sort Lines Lexicographically Ascending option

            ==> Here is our sorted text ( I just listed the first lines and the last lines ) :

                                                                                                             Ob!
                                                                                                             Wu!
                                                                                                            Bug!
                                                                                                            Chu!
                                                                                                            Don!
                                                                                                            Ene!
                                                                                                            Fly!
                                                                                                            Han!
                                                                                                            Ili!
                                                                                                            Içá!
                                                                                                            Nen!
                                                                                                            Oka!
                                                                                                            Red!
                                                                                                           Amur!
                                                                                                           Aras!
                                                                                                         ......!
                                                                                                         ......!
                                                                                                         ......!
                                                                                                         ......!
                                                                                                   Saskatchewan!
                                                                                                   Yellow River!
                                                                                                  Madre de Dios!
                                                                                                  Shatt al-Arab!
                                                                                                  São Francisco!
                                                                                                  Sênggê Zangbo!
                                                                                                 Northern Dvina!
                                                                                                 Paraíba do Sul!
                                                                                                 Saint Lawrence!
                                                                                                Río de la Plata!
                                                                                                Seversky Donets!
                                                                                             Grande de Santiago!
                                                                                             Nizhnyaya Tunguska!
                                                                                             North Saskatchewan!
                                                                                          Podkamennaya Tunguska!
            
            • Finally, let’s run this last regex S/R

              • SEARCH ^\x20+|!$

              • REPLACE Leave EMPTY
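
A short Python sketch of this clean-up step, again using `re` as a stand-in for the Notepad++ engine. The two sample lines and their padding widths are made up for the illustration.

```python
import re

# Two right-aligned lines as produced by the sort step (widths are illustrative).
sorted_text = " " * 98 + "Ob!\n" + " " * 97 + "Bug!"

# First alternative removes the leading spaces, second removes the trailing "!".
# re.MULTILINE makes ^ and $ apply to every line, not just the whole text.
cleaned = re.sub(r"^\x20+|!$", "", sorted_text, flags=re.MULTILINE)

print(cleaned)
```

Spaces inside a name, as in White Nile, are untouched because only the run of spaces at the very start of a line is matched.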

            => What remains is our expected OUTPUT text, sorted by line length and then by line contents:

            Ob
            Wu
            Bug
            Chu
            Don
            Ene
            Fly
            Han
            Ili
            Içá
            Nen
            Oka
            Red
            Amur
            Aras
            Beni
            Biya
            Breg
            Draa
            Elbe
            Gila
            Huai
            Ider
            Kama
            Kura
            Lena
            Liao
            Milk
            Nile
            Oder
            Ohio
            Panj
            Uele
            Ural
            Vaal
            Zeya
            Aldan
            Argun
            Benue
            Caine
            Congo
            Desna
            Green
            Indus
            Iriri
            Ishim
            James
            Jubba
            Juruá
            Kasai
            Katun
            Lerma
            Liard
            Loire
            Murat
            Mwogo
            Naryn
            Negro
            Niger
            North
            Padma
            Peace
            Pecos
            Purús
            Rhine
            Rocha
            Sepik
            Shire
            Slave
            Snake
            Tagus
            Tambo
            Tarim
            Tietê
            Tobol
            Tyung
            Vitim
            Volga
            Volta
            Warta
            White
            Xingu
            Yukon
            Amazon
            Anadyr
            Angara
            Barcoo
            Belaya
            Brazos
            Chenab
            Chulym
            Cooper
            Culgoa
            Danube
            Finlay
            Fraser
            Gambia
            Ganges
            Iguaçu
            Irtysh
            Japurá
            Kagera
            Kapuas
            Khoper
            Kolyma
            Kwango
            Lomami
            Mamoré
            Markha
            Mekong
            Murray
            Nelson
            Ogooué
            Orange
            Ottawa
            Paraná
            Platte
            Salado
            Sutlej
            Tigris
            Ubangi
            Vilyuy
            Vltava
            Vyatka
            Yalong
            Yamuna
            Alazeya
            Aruwimi
            Balonne
            Bermejo
            Darling
            Daugava
            Detroit
            Dnieper
            Guaporé
            Helmand
            Hooghly
            Jialing
            Juruena
            Kherlen
            Krishna
            Lachlan
            Limpopo
            Madeira
            Mantaro
            Marañón
            Narmada
            Niagara
            Olenyok
            Olyokma
            Orinoco
            Pechora
            Salween
            Sankuru
            Selenge
            Senegal
            Songhua
            Sukhona
            Tapajós
            Tsangpo
            Ucayali
            Uruguay
            Vistula
            Yangtze
            Yenisei
            Zambezi
            Araguaia
            Arkansas
            Canadian
            Chindwin
            Cimarron
            Colorado
            Columbia
            Dniester
            Flinders
            Georgina
            Ghaghara
            Godavari
            Guaviare
            Huallaga
            Khatanga
            Missouri
            Okavango
            Paraguay
            Red Rock
            Rukarara
            Shebelle
            Vychegda
            Zhujiang
            Allegheny
            Amu Darya
            Athabasca
            Blue Nile
            Chambeshi
            Churchill
            Condamine
            Essequibo
            Euphrates
            Indigirka
            Jefferson
            Kuskokwim
            Mackenzie
            Magdalena
            Pilcomayo
            Syr Darya
            Tennessee
            Tocantins
            Warburton
            Ayeyarwady
            Beaverhead
            Cumberland
            Kızılırmak
            Nyabarongo
            Río Grande
            White Nile
            Brahmaputra
            Mississippi
            Saint Clair
            Saint Louis
            Saint Marys
            Yellowstone
            Hell Roaring
            Murrumbidgee
            Saskatchewan
            Yellow River
            Madre de Dios
            Shatt al-Arab
            São Francisco
            Sênggê Zangbo
            Northern Dvina
            Paraíba do Sul
            Saint Lawrence
            Río de la Plata
            Seversky Donets
            Grande de Santiago
            Nizhnyaya Tunguska
            North Saskatchewan
            Podkamennaya Tunguska
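
The right-alignment trick works because a space sorts before any letter: among lines of equal total width, a shorter name has more leading spaces and therefore comes first, and names of the same length are compared character by character. For comparison, the same ordering can be written directly outside Notepad++. This is a minimal Python sketch, not part of the method above; `sort_by_length` is a hypothetical helper, and it assumes ties are broken by plain code point order, which matches the sample output (Ili before Içá).

```python
def sort_by_length(lines):
    """Sort by character count first, then by code point order within a length."""
    return sorted(lines, key=lambda s: (len(s), s))

# A few names from the list; the accented name is written with escapes.
sample = ["Nile", "Ob", "I\u00e7\u00e1", "Ili", "Wu", "Amur", "Bug"]
result = sort_by_length(sample)

print(result)
```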
            

            That’s all ! Neat, isn’t it ?

            Best Regards,

            guy038

            • C
              Coises
              last edited by Coises Dec 13, 2023, 5:47 PM Dec 13, 2023, 5:39 PM

              @Thomas-Knoefel

              I received a feature request related to this post. It doesn’t quite feel like a good fit for Columns++ to me, but I think your MultiReplace plugin can assist in making this possible in a reasonable number of steps.

              I believe multi-replace can be set up to find ^.*$ and replace with set(string.len(MATCH).." "..MATCH).

              Then Edit | Line operations | Sort Lines As Integers Ascending will sort the lines in order by length, and then ^\d+\x20 replaced with nothing would remove the lengths.
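
The three steps can be sketched in Python as follows. This is only an illustration of the idea, not the MultiReplace plugin itself: `sort_with_length_prefix` is a hypothetical helper, the stable sort on the leading integer is an assumption about how ties behave, and Python's `len` counts characters whereas Lua's `string.len` counts bytes, so results may differ for non-ASCII lines.

```python
import re

def sort_with_length_prefix(text):
    # Step 1: prefix every line with its length and a space.
    prefixed = [f"{len(line)} {line}" for line in text.split("\n")]
    # Step 2: sort on the leading integer (stand-in for the integer sort).
    prefixed.sort(key=lambda line: int(line.split(" ", 1)[0]))
    # Step 3: strip the prefix again with the regex ^\d+\x20.
    return "\n".join(re.sub(r"^\d+\x20", "", line) for line in prefixed)

result = sort_with_length_prefix("Amazon\nOb\nNile\nWu")
print(result)
```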

              • M
                Mark Olson
                last edited by Mark Olson Dec 13, 2023, 11:18 PM Dec 13, 2023, 11:14 PM

                With JsonTools v6.0 or higher: open the tree view for the document, go to REGEX mode, and enter the query @ = s_join(`\r\n`, sort_by(s_split(@, `\r\n`), s_len(@)))
                Hopefully the syntax is reasonably easy to understand: split the file by \r\n, sort the list of lines by string length, then set the document’s text (@) to the result of string-joining the list back together with \r\n.

                This converts

                abcdefg
                ab
                abcdefgh
                a
                abcdefghi
                abcde
                abcd
                abc
                

                into

                a
                ab
                abc
                abcd
                abcde
                abcdefg
                abcdefgh
                abcdefghi
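
For readers who do not know the JsonTools query language, the same split, sort and join sequence looks like this in Python. It is a sketch of the idea only, not the plugin's implementation; `sort_lines_by_length` is a hypothetical name and the `eol` parameter is an assumption mirroring the `\r\n` used in the query.

```python
def sort_lines_by_length(text, eol="\r\n"):
    # Split on the line ending, sort by length, and join back together.
    lines = text.split(eol)
    return eol.join(sorted(lines, key=len))

before = "\r\n".join(
    ["abcdefg", "ab", "abcdefgh", "a", "abcdefghi", "abcde", "abcd", "abc"]
)
after = sort_lines_by_length(before)

print(after.split("\r\n"))
```

In this sketch, a text that ends with a line ending produces one empty final element, which sorts to the top as a zero-length line.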
                
                • C
                  Coises
                  last edited by Dec 14, 2023, 9:30 PM

                  In case anyone comes across this topic looking for a way to sort lines by length, Columns++ release 1.0.1 can do this.

                  Select Sort… from the Columns++ menu and let it enclose the entire document in a rectangular selection (or make your own selection first).

                  Use Whole lines, Ascending or Descending as desired, and Width. You can then sort on Entire column, unless you wish to use one of the other options.

                  The sort is based on the on-screen width of text in the current font. Columns++ is meant to deal with data in columns using tabs, including elastic tabstops and proportionally-spaced fonts; I found that using the width, rather than a count of characters, was the most consistent way to deal with all the variations in a way that makes intuitive sense for users. For files using monospaced fonts and no tabs, the results are the same as counting characters.

                  • M
                    Mahmoud Madkour @Mark Olson
                    last edited by Jun 3, 2024, 7:55 AM

                    @Mark-Olson, your proposed solution seems easy, but can you please elaborate a little more:
                    1- how do I open the file in the tree view?
                    2- how do I go to REGEX mode to enter the query?

                    Many thanks

                    • M
                      Mark Olson @Mahmoud Madkour
                      last edited by Jun 3, 2024, 2:59 PM

                      @Mahmoud-Madkour
                      To open a tree view for a file in REGEX mode, just use the Regex search to JSON command from the JsonTools plugin menu.
                      Once the tree view is open, you can paste the query into the text box at the top right corner of the tree view, and click the Submit query button next to the text box.
