Need help with custom sort operations
-
Hey guys,
I would like to sort lines in a specific way and haven’t found out how to do that yet. The sort task is pretty simple in this example. I want to sort only specific lines (include) and exclude the other lines from the sort operation, but they should be moved along too. Below is a small example…
BBBBBBBBBBBBBBBBBBB
something here
something anywhere
something there

CCCCCCCCCCCCCCCCCCC
something anywhere
something here
something there

AAAAAAAAAAAAAAAAAAA
something here
something there
something anywhere
…the lines above (only first line of each block) should get sorted to this…
AAAAAAAAAAAAAAAAAAA
something here
something there
something anywhere

BBBBBBBBBBBBBBBBBBB
something here
something anywhere
something there

CCCCCCCCCCCCCCCCCCC
something anywhere
something here
something there
…so as you can see, only the first line of each block was sorted (ascending) and the 4 lines below it (which belong to the block) were just moved along. That means I need to find a way to sort every fifth line (1, 6 & 11) and move all the lines in between (I call it a block move), but I don’t know how to manage that operation. So I need some kind of mask where I can enter the line distance (the minimum would be 2, for every second line) to include those lines in the sort operation, while the lines in between are just moved with them (in the case of 2, the first line gets sorted and the second line moves with the first line, etc.). I would also like to have that mask for a rectangle selection, to select text from somewhere inside the lines rather than from the start. As I said, I have found no way yet to do such a specific custom sort, and “Alan Kilborn” said I should ask you guys here on this forum. Maybe you have some idea whether it is doable in npp (any existing function or RegEx or any plugin I could use etc) or not. Thank you.
-
@Dean-Corso said in Need help with custom sort operations:
Maybe you have some idea whether it is doable in npp (any existing function or RegEx or any plugin I could use etc) or not.
I think this post might have the answer you seek.
It involved joining all the lines of each section into one line, sorting the new, longer lines, and then reversing the first step.
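As a rough illustration of the join, sort, split idea outside Notepad++, here is a minimal Python sketch. It assumes every block has the same number of lines; the function name and the NUL separator are made up for this example and are not taken from the linked post.

```python
def sort_blocks_via_joined_lines(text, lines_per_block, sep="\x00"):
    """Join each block into one long line, sort those lines, then split again.

    Assumptions for this sketch: blocks are fixed-size, lines end with "\n",
    and `sep` never occurs in the text.
    """
    lines = text.split("\n")
    # Step 1: collapse every block of N lines into a single "long line".
    joined = [
        sep.join(lines[i:i + lines_per_block])
        for i in range(0, len(lines), lines_per_block)
    ]
    # Step 2: an ordinary line sort now orders whole blocks by their first line.
    joined.sort()
    # Step 3: reverse step 1 by splitting the long lines back into real lines.
    return "\n".join(line for block in joined for line in block.split(sep))


sample = (
    "BBB\nsomething here\nsomething anywhere\nsomething there\n\n"
    "CCC\nsomething anywhere\nsomething here\nsomething there\n\n"
    "AAA\nsomething here\nsomething there\nsomething anywhere\n"
)
print(sort_blocks_via_joined_lines(sample, 5))
```

The separator is chosen to sort below every printable character, so a shorter header still sorts before a longer one that starts with the same letters.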
Have a read first of that thread, then come back to this and add any additional requirements if that doesn’t meet your need…
Terry
-
Thanks for your answer so far. I did read the thread you mentioned and got stuck already at the first part…
"displays all header strings in consecutive lines "
…where is this option?
Otherwise this RegEx code uses static values for Unicode symbols, “\R(?=\x{2013}>)” <-- U+2013, which is an en dash that the other lines start with in the other example. In my case the lines are unique and I can’t use any marker to check for. The good thing in my case is that the distances between the lines are the same for all blocks, which means I only need to enter some value X to sort every Xth line. Still no clue how to manage that. I also found no advanced sort plugin or anything like that.
-
@Dean-Corso said in Need help with custom sort operations:
"displays all header strings in consecutive lines "
That just means that the blocks were made into single lines. Upon further inspection that post has the basis of what you need, but as you don’t have a specific character at the start of each sub-block line it will need adjustment.
You say there are 5 lines between blocks. That is correct for EVERY block? Or since you showed a blank (truly blank or does it have some spaces in it) line, is this also the case for EVERY block? There needs to be a sure fire method of specifying where each block ends.
Terry
-
@Terry-R said in Need help with custom sort operations:
That just means that the blocks were made into single lines.
Yes, you are right, the example you mentioned works now (it didn’t before, maybe some copy-paste issue), but I don’t have any marker to check for in my lines.
In my example I have 3 blocks with 5 lines. Yes, in this case my blocks are static using same line counts. Otherwise you can also just use 2 lines for every block etc. What matters is which lines should get sorted; all the lines below them should be moved along.
Example: Just 2 lines for a block.
CCCCCCCCCCCCCCCCCCC
3
BBBBBBBBBBBBBBBBBBB
2
AAAAAAAAAAAAAAAAAAA
1

to

AAAAAAAAAAAAAAAAAAA
1
BBBBBBBBBBBBBBBBBBB
2
CCCCCCCCCCCCCCCCCCC
3
Sort every second line (1, 3, 5) and move the line after each one (2, 4, 6) along with it.
Example 2: Sort lines 1 & 2 of each block & move line 3 of the block.
CCCCCCCCCCCCCCCCCCC
AAAAAAAAAAAAAAAAAAA
3
BBBBBBBBBBBBBBBBBBB
AAAAAAAAAAAAAAAAAAA
2
CCCCCCCCCCCCCCCCCCC
BBBBBBBBBBBBBBBBBBB
1

to

AAAAAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBBBBB
2
AAAAAAAAAAAAAAAAAAA
CCCCCCCCCCCCCCCCCCC
3
BBBBBBBBBBBBBBBBBBB
CCCCCCCCCCCCCCCCCCC
1
Sort (1,2 | 4,5 | 7,8) & move lines 3, 6 & 9. It’s more specific in this case. But the good and simple thing is that the block structure is the same for all blocks every time, which means they have the same line counts to sort & move = X & Y. Just need to have some function / dialog where I can enter those parameters…
Enter which every line to sort: 2
Enter how many lines to sort: 1
This means 1 line is left to move for each block. Or, for example 2:
Enter which every line to sort: 3
Enter how many lines to sort: 2
This means 1 line is left to move for each block… and the lines in between just get moved along with their block. Something like that, you know.
-
@Dean-Corso said in Need help with custom sort operations:
Yes, in this case my blocks are static using same line counts
Since I’m having a bit of difficulty in reading your posts I gather that English may not be your first language? It sometimes helps to know that, since some of the terminology can be difficult to understand for non-native English speakers.
So you seem to suggest you have different lengths in your blocks, from block heading plus 1 other line through to block heading plus 5 other lines where the 5th line is blank.
Further since you have this variability you were hoping (I gather) to have a custom method to just input the number of lines in each block and have it all sorted for you.
I doubt there is a plugin available that does that (although I haven’t used every plugin so cannot confirm) and certainly the sorting system in Notepad++ isn’t that sophisticated. That’s why the other post uses a regex to alter the blocks so the sorting function can do its work, then reverses that block alteration to return the blocks to the original view.
So if you wanted to use the method proposed in the other post (make each block a long line) we can provide you with an adjusted method.
Terry
EDIT: just noticed example 2 appears to show 2 block heading lines. You show the replacement as the 2 same block heading lines in reverse order, but still maintaining the block with line 3. That puts a further complication on matters. Currently I cannot see a method of doing that.
-
Yes, English isn’t my main language and I sometimes need to translate a little or just try to speak more freely, you know.
OK, it sounds bad that npp can’t do that at the moment and that you don’t know of any plugin which could handle it the way I’m looking for. So the principle of the sort operation I want is pretty simple (npp already has the sort function I want to use / line operations); I only need that specific line-select / line-move thing. Maybe it would be a good idea to add another function for that to npp itself in a future release. Otherwise I also don’t have a clue how to do the sort operations I want in npp.
MfG
-
@Dean-Corso said in Need help with custom sort operations:
So the principle of the sort operation I want is pretty simple
Well, sorry to say but what you need isn’t simple. If you had every line sorting alphabetically or numerically, ascending or descending then that is what I would call simple. And NPP covers those bases very well.
Beyond that there are so many different ideas (yours being just 1 of thousands) I doubt the developer will consider expanding the NPP sorting functions to cover yours or anyone else’s.
At present since you may have multiple block heading lines (in a single block) which need sorting first, before the new first block heading line becomes the line to sort the order of the blocks (your example 2) I can only think it will require some code (such as in PythonScript) to do what you need.
Unfortunately we aren’t a code creating forum as such, just a general forum supporting NPP users. Someone here might possibly consider it a worthwhile exercise to create such a program for you, but don’t expect it.
I will continue to ponder your request (I’m not a code creator) just in case there might be another way using just a (series of) regex.
Good luck
Terry
-
This is a great opportunity to try out my new script for sorting multi-line blocks!
Just use the regex
(?s-i)(^[A-Z]+$).*?(?=^[A-Z]+$|\Z)
and the options
Group 1 transformation: [ ]Integer(DEFAULT) [ ]Decimal [x]String [ ]Ignorecase-cultural
Sort order: [ ]Ascending(DEFAULT) [ ]Descending
Other: [ ]Sort-only-if-all-lines-have-group-1-match(DEFAULT) [ ]Remove-non-matching-lines [ ]Sort-non-matching-lines-to-bottom [x]Multi-line-blocks
EXAMPLE:
DDDDDDDDDDDDD onro roeri AAAAAAAAAAAAAAAAAAA FFFFFFFFFFFFFFFFFFF foo EEEEEEEEEEEEEEEEEEEEE oroieoj roernern reroo BBBBBBBBBBBBBBBBBBB ccc furr CCCCCCCCCCCCCCCCC
WILL BE SORTED TO
AAAAAAAAAAAAAAAAAAA BBBBBBBBBBBBBBBBBBB ccc furr CCCCCCCCCCCCCCCCC DDDDDDDDDDDDD onro roeri EEEEEEEEEEEEEEEEEEEEE oroieoj roernern reroo FFFFFFFFFFFFFFFFFFF foo
-
I did just see your “other” post where you showed the new script and wondered if it would be any good for this.
I have a problem with your example as you show the first block as just having the DDDDD… line but when it’s sorted it suddenly gains the CCCCCC… line which was at the bottom in the before example. Was that a typo?
Or maybe I am just not reading the before and after examples right.
Terry
PS I was working on a poor man’s method which was a bunch of regexes and using some of the more advanced caret moves to expand selections prior to sorting. The macros created would “run until end of file”.
-
@Terry-R said in Need help with custom sort operations:
Or maybe I am just not reading the before and after examples right.
I was in the headspace of working the solution based on 2 block header lines per block. When I saw your example it seemed like you were doing the same. Re-reading it, it suggests your example doesn’t have a defined block line length. I suppose that is good in that it can cater for a “loose” structure. But how does it know where each block starts and ends? (I’m not a code reader).
Terry
-
@Terry-R said in Need help with custom sort operations:
Or maybe I am just not reading the before and after examples right.
Yeah, you’re just reading them wrong. My block-sorter just removed a trailing newline after the DDDDDDDDDDDD block.
-
@Terry-R said in Need help with custom sort operations:
But how does it know where each block starts and ends? (I’m not a code reader).
That is a good question, which the documentation in my script should definitely address.
The answer is that your regex must match the entire block and have a single capture group within the portion to be matched.
Thus,
(?s-i)(^[A-Z]+$).*?(?=^[A-Z]+$|\Z)
is the appropriate regex to use, because the first capture group (^[A-Z]+$) contains the header line, and the regex as a whole matches everything up to the next header.
-
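To make the "match the whole block, capture the sort key" idea concrete, here is a small sketch using Python's standard `re` module rather than the script itself. The function name is hypothetical, and the flags are passed to `re.compile` instead of using the `(?s-i)` prefix, since Notepad++ uses a different regex engine than Python. Text before the first header is simply dropped in this sketch.

```python
import re

# Group 1 is the header line (the sort key); the match as a whole runs
# lazily up to the next header line or the end of the text.
BLOCK_RE = re.compile(r"(^[A-Z]+$).*?(?=^[A-Z]+$|\Z)", re.MULTILINE | re.DOTALL)


def sort_blocks_by_header(text):
    """Sort variable-length blocks by their all-caps header line."""
    blocks = [(m.group(1), m.group(0)) for m in BLOCK_RE.finditer(text)]
    blocks.sort(key=lambda pair: pair[0])
    # Make sure every block ends with a newline so blocks do not run together.
    return "".join(b if b.endswith("\n") else b + "\n" for _, b in blocks)


print(sort_blocks_by_header("DDD\nonro\nroeri\nAAA\nFFF\nfoo\nBBB\nccc\n"))
```

Because the block boundary comes from the lookahead and not from a line count, blocks may have different lengths, including a header with no lines below it.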
@Mark-Olson said in Need help with custom sort operations:
My block-sorter just removed a trailing newline after the DDDDDDDDDDDD block.
If you look at the first post the example there showed a “trailing” blank/empty line. They wanted this kept.
From all @Dean-Corso posts so far I think they want the ability to dictate 2 parameters before running the process.
- How many lines in each block. They state they are well defined, so ALL the same number of lines.
- How many of the lines (from start of block down) will be sorted first within the block.
Lastly another sort is performed on the file using ONLY the first header line of each block (keeping each block intact).
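The two parameters and the final sort described above could be sketched in plain Python roughly like this. It is only an illustration that works on a list of lines outside Notepad++; the function and parameter names are invented for the example.

```python
def sort_blocks(lines, lines_per_block, lines_in_header=1):
    """Sort each block's header lines, then sort the blocks by those headers.

    `lines_per_block` is the fixed block size and `lines_in_header` is how
    many lines at the top of each block take part in the sort.
    """
    if lines_per_block < 1 or not 1 <= lines_in_header <= lines_per_block:
        raise ValueError("need 1 <= lines_in_header <= lines_per_block")
    blocks = []
    for start in range(0, len(lines), lines_per_block):
        block = lines[start:start + lines_per_block]
        # First sort: only the header lines inside this block.
        header = sorted(block[:lines_in_header])
        blocks.append((header, header + block[lines_in_header:]))
    # Second sort: whole blocks, keyed by their (already sorted) header.
    blocks.sort(key=lambda pair: pair[0])
    return [line for _, block in blocks for line in block]


example2 = ["CCC", "AAA", "3", "BBB", "AAA", "2", "CCC", "BBB", "1"]
print("\n".join(sort_blocks(example2, lines_per_block=3, lines_in_header=2)))
```

With the second example from this thread (3 lines per block, 2 header lines) the non-header line of each block stays attached to its block while the blocks change places.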
Currently I don’t think your script will work that correctly, certainly as you seem to be removing “blank” lines that won’t work for them.
Terry
-
@Terry-R said in Need help with custom sort operations:
If you look at the first post the example there showed a “trailing” blank/empty line. They wanted this kept.
Yep, that’s a limitation of my script. I’m not going to change it, because every way I’ve tried to get around that limitation makes things worse.
Fortunately, it is quite easy to make a regex-replace that creates a trailing blank/empty line after each block. Just replace
(?<!\A)(^[A-Z]+$)
with
\r\n{0}
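For comparison, the same "put an empty line in front of every header except the first" step could look like this in Python. This is a sketch under assumptions: it uses Python's `re` syntax (a `(?!\A)` lookahead and a `\1` backreference) instead of the Notepad++ expressions above, it assumes `\n` line endings, and the function name is made up.

```python
import re


def add_blank_line_before_headers(text):
    """Insert an empty line before each all-caps header except the very first."""
    # (?!\A) rejects a header at the very start of the text, so no empty
    # line is added at the top of the document.
    return re.sub(r"(?!\A)^([A-Z]+)$", r"\n\1", text, flags=re.MULTILINE)


print(add_blank_line_before_headers("AAA\nfoo\nBBB\nbar\n"))
```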
-
@Dean-Corso said in Need help with custom sort operations:
Just need to have some function / dialog where I can enter those paramters…
Enter which every line to sort: 2
Enter how many lines to sort: 1
Done. Just use the following script in PythonScript
# -*- coding: utf-8 -*-
#########################################
#
# sort_line_blocks_and_headers
#
#########################################

# references:
#  https://community.notepad-plus-plus.org/topic/24742/need-help-with-custom-sort-operations

# HOW IT WORKS:
# 1. Divide the document into blocks of N lines, of which the top M lines are the header.
# 2. Sort the header of each block
# 3. Sort the blocks by the sorted headers.
# EXAMPLE INPUT:
# CCCCCCCCCCCCCCCCCCC
# AAAAAAAAAAAAAAAAAAA
# 3
# BBBBBBBBBBBBBBBBBBB
# AAAAAAAAAAAAAAAAAAA
# 2
# CCCCCCCCCCCCCCCCCCC
# BBBBBBBBBBBBBBBBBBB
# 1
# RESPOND TO THE 4 PROMPTS AS FOLLOWS: 3, yes, 2, yes
# RESULT:
# AAAAAAAAAAAAAAAAAAA
# BBBBBBBBBBBBBBBBBBB
# 2
# AAAAAAAAAAAAAAAAAAA
# CCCCCCCCCCCCCCCCCCC
# 3
# BBBBBBBBBBBBBBBBBBB
# CCCCCCCCCCCCCCCCCCC
# 1

from __future__ import print_function
from Npp import *

def main():
    eol = [ '\r\n', '\r', '\n' ][ editor.getEOLMode() ]
    while True:
        lines_per_block_str = notepad.prompt('number of lines per block', 'sort blocks', '3')
        if lines_per_block_str is None:
            return
        try:
            lines_per_block = int(lines_per_block_str)
            if lines_per_block < 1:
                raise ValueError
        except:
            notepad.messageBox('lines per block must be integer >= 1')
            continue
        sort_blocks_ascending_str = notepad.prompt('sort blocks least to greatest (yes/no)', 'sort blocks', 'yes')
        if sort_blocks_ascending_str is None:
            return
        reverse_blocks = sort_blocks_ascending_str != 'yes'
        lines_in_header_str = notepad.prompt('number of lines in header', 'sort blocks', '1')
        if lines_in_header_str is None:
            return
        sort_header_ascending_str = notepad.prompt('sort header least to greatest (yes/no)', 'sort blocks', 'yes')
        if sort_header_ascending_str is None:
            return
        reverse_header = sort_header_ascending_str != 'yes'
        try:
            lines_in_header = int(lines_in_header_str)
            if lines_in_header < 1 or lines_in_header > lines_per_block:
                raise ValueError
        except:
            notepad.messageBox('lines in header must be integer <= lines in block and >= 1')
            continue
        break
    lines_not_in_header = lines_per_block - lines_in_header
    find_regex = '(?-s)((?:.*\R){%s})' % lines_in_header
    if lines_not_in_header:
        find_regex += '((?:.*(?:\R|\Z)){%s})' % lines_not_in_header
    # print(find_regex)
    header_block_list = []
    doc_len = editor.getLength()
    def on_match(m):
        sorted_header = eol.join(sorted(m.group(1).splitlines(), reverse=reverse_header))
        edited_block = sorted_header + eol + m.group(2)
        if m.span()[1] < doc_len:
            edited_block = edited_block[:-len(eol)]
        header_block_list.append((sorted_header, edited_block))
    editor.research(find_regex, on_match)
    header_block_list_sorted = [x[1] for x in sorted(header_block_list, reverse=reverse_blocks)]
    # print(header_block_list)
    editor.setText(eol.join(header_block_list_sorted))

if __name__ == '__main__':
    main()
-
@Mark-Olson
My initial script had some bugs. Use this instead

# -*- coding: utf-8 -*-
#########################################
#
# sort_line_blocks_and_headers
#
#########################################

# references:
#  https://community.notepad-plus-plus.org/topic/24742/need-help-with-custom-sort-operations

# HOW IT WORKS:
# 1. Divide the document into blocks of N lines, of which the top M lines are the header.
# 2. Sort the header of each block
# 3. Sort the blocks by the sorted headers.
# EXAMPLE INPUT:
# CCCCCCCCCCCCCCCCCCC
# AAAAAAAAAAAAAAAAAAA
# 3
# BBBBBBBBBBBBBBBBBBB
# AAAAAAAAAAAAAAAAAAA
# 2
# CCCCCCCCCCCCCCCCCCC
# BBBBBBBBBBBBBBBBBBB
# 1
# RESPOND TO THE 4 PROMPTS AS FOLLOWS: 3, yes, 2, yes
# RESULT:
# AAAAAAAAAAAAAAAAAAA
# BBBBBBBBBBBBBBBBBBB
# 2
# AAAAAAAAAAAAAAAAAAA
# CCCCCCCCCCCCCCCCCCC
# 3
# BBBBBBBBBBBBBBBBBBB
# CCCCCCCCCCCCCCCCCCC
# 1

from __future__ import print_function
from Npp import *

def main():
    eol = [ '\r\n', '\r', '\n' ][ editor.getEOLMode() ]
    while True:
        lines_per_block_str = notepad.prompt('number of lines per block', 'sort blocks', '3')
        if lines_per_block_str is None:
            return
        try:
            lines_per_block = int(lines_per_block_str)
            if lines_per_block < 1:
                raise ValueError
        except:
            notepad.messageBox('lines per block must be integer >= 1')
            continue
        sort_blocks_ascending_str = notepad.prompt('sort blocks least to greatest (yes/no)', 'sort blocks', 'yes')
        if sort_blocks_ascending_str is None:
            return
        reverse_blocks = sort_blocks_ascending_str != 'yes'
        lines_in_header_str = notepad.prompt('number of lines in header', 'sort blocks', '1')
        if lines_in_header_str is None:
            return
        try:
            lines_in_header = int(lines_in_header_str)
            if lines_in_header < 1 or lines_in_header > lines_per_block:
                raise ValueError
        except:
            notepad.messageBox('lines in header must be integer <= lines in block and >= 1')
            continue
        if lines_in_header > 1:
            sort_header_ascending_str = notepad.prompt('sort header least to greatest (yes/no)', 'sort blocks', 'yes')
            if sort_header_ascending_str is None:
                return
            reverse_header = sort_header_ascending_str != 'yes'
        else:
            reverse_header = False
        break
    lines_not_in_header = lines_per_block - lines_in_header
    if lines_not_in_header:
        find_regex = '(?-s)((?:.*\R){%s})((?:.*(?:\R|\Z)){%s})' % (lines_in_header, lines_not_in_header)
    else:
        find_regex = '(?-s)((?:.*(?:\R|\Z)){%s})' % lines_in_header
    # print(find_regex)
    header_block_list = []
    doc_len = editor.getLength()
    def on_match(m):
        is_last_match = m.span()[1] == doc_len
        group1 = m.group(1)
        if not group1:
            return
        sorted_header = eol.join(sorted(group1.splitlines(), reverse=reverse_header))
        if lines_not_in_header:
            edited_block = sorted_header + eol + m.group(2)
        else:
            edited_block = sorted_header
        if not is_last_match and edited_block.endswith(eol):
            edited_block = edited_block[:-len(eol)]
        header_block_list.append((sorted_header, edited_block))
    editor.research(find_regex, on_match)
    header_block_list_sorted = [x[1] for x in sorted(header_block_list, reverse=reverse_blocks)]
    # print(header_block_list)
    editor.setText(eol.join(header_block_list_sorted))

if __name__ == '__main__':
    main()
-
Thank you guys for trying to help me (and maybe others too) find a solution for my sorting problems. It sounds great that you were able to create a script for that task. The only problem at the moment is that I don’t know yet how to use it, so maybe you could give me a little crash course on what plugin to install and what parameters I have to execute (python + your script + my npp tab or file) etc. I have Python installed on my computer already. My question is also whether I can execute your py script in npp for the open / focused tab itself (also when I have just written text in real time without a saved file). As I said, I’m not sure what plugin to install for py, if I have to (by the way, the PyNPP plugin is not listed in the plugin manager). It would be nice if you could tell me and show one or two examples of how to execute your script with a file which is open in Notepad++ or, if possible, also using a tab directly (no file) in pipe mode, if this is supported. Thank you.
-
@Dean-Corso said in Need help with custom sort operations:
little crash course on what plugin to install and what parameters I have to execute
-
Thank you for that link & info. OK, I think I got it working now and can execute your script @Mark-Olson, and in my first tests it seems to work as I wanted! Coolio! I will do some more tests but it looks pretty good already.
Is this sort operation, which now works for complete lines, also doable for a rectangle selection (maybe for later)? Let’s say I have a few vertical text blocks and I want to select one of them via rectangle selection and sort all lines by that selection. Could that be possible too?
Just as info: I found out that if I want to sort by specific rectangle-selected text (middle to end of line), it only works when all the selected lines start at the same vertical position, and I also need to change all blank chars like TABs to spaces before I do the sort to make it work. I just don’t understand why, but I have to.
Example: 2 blocks / tabs between / sort ascending, ignoring case.
66666 qqqqqqq
55555555 zzzzzzz
777 aaaaaaa
88888888888 xxxxxxx
I just select the second block via rectangle select, then I do the line sort with ignore case, and I get this result.
55555555 zzzzzzz
88888888888 xxxxxxx
777 aaaaaaa
66666 qqqqqqq
=? When I do the same but first call the blank operation “Tab to Space”, then select the 2nd block and call the sort function, I get this…
777 aaaaaaa
66666 qqqqqqq
88888888888 xxxxxxx
55555555 zzzzzzz
which is correct now. I just don’t understand why I have to change the tabs to spaces before the selection; normally what comes before the selection should play no role at this point, you know. Is this a bug or do I understand it wrong? You know what I mean?
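One possible explanation, not verified against how Notepad++ actually implements its column sort, is that a tab is a single character that occupies a variable number of screen columns, so the same visual column can sit at different character offsets on different lines. The sketch below only illustrates that idea in plain Python: it expands tabs before cutting out the sort key. The function name, the tab width of 8 and the sample lines (shortened so that the second column lines up) are assumptions for this example.

```python
def sort_lines_by_column(text, start_col, end_col=None, tabsize=8, ignore_case=True):
    """Sort whole lines, using only the text between two visual columns as the key."""
    def key(line):
        # Expand tabs first so that column numbers refer to what is seen on screen.
        field = line.expandtabs(tabsize)[start_col:end_col]
        return field.lower() if ignore_case else field

    return "\n".join(sorted(text.splitlines(), key=key))


sample = "66666\tqqqqqqq\n55555\tzzzzzzz\n777\taaaaaaa\n8888\txxxxxxx"
print(sort_lines_by_column(sample, start_col=8))
```

Without the `expandtabs` call, a slice starting at character 8 would begin in the middle of the second word on some lines and past its end on others, which would explain a seemingly random order.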
Thank you very much so far guys.