    How i can sort number from largest to smallest on notepad++

    • Mark Olson @Mark Olson

      @Mark-Olson
      I discovered that my previous implementation of normalized case-insensitive text wasn’t quite right.

      I can’t edit my last post, but the function ignorecase_culture_invariant should be replaced with this:

      def ignorecase_culture_invariant(x):
          return unicodedata.normalize('NFD', x.upper())
      

      The previous implementation (using x.lower()) doesn’t work for German ß, and there are probably some other places where it fails.
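
To illustrate the difference, here is a small stand-alone sketch. It assumes plain Python 3 run outside Notepad++, and the sample words and the `key_lower` helper are made up for illustration (the post only says the earlier version used x.lower()). In Python 3, upper() maps ß to SS while lower() leaves it alone, so only the upper()-based key treats the three spellings as equal. Case mapping under the Python 2.7 that PythonScript 2 ships with may behave differently.

```python
import unicodedata

def key_upper(x):
    # the corrected key from the post: uppercase first, then NFD-normalize
    return unicodedata.normalize('NFD', x.upper())

def key_lower(x):
    # hypothetical reconstruction of the earlier, lower()-based key
    return unicodedata.normalize('NFD', x.lower())

words = ['Straße', 'STRASSE', 'strasse']
upper_keys = {key_upper(w) for w in words}   # one distinct key
lower_keys = {key_lower(w) for w in words}   # two distinct keys

# NFD also makes composed and decomposed accents compare equal
composed = '\u00e9'      # e with acute accent as a single code point
decomposed = 'e\u0301'   # plain e followed by a combining acute accent

print(upper_keys, lower_keys)
print(key_upper(composed) == key_upper(decomposed))
```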

      • Mark Olson

        Well, I said I would wait to post again until I had a significant improvement, and I believe I do.

        I have now updated the script so it can sort multi-line blocks, for example, sorting a bunch of multi-line XML elements by some attribute.

        Normally the only way to sort multi-line blocks is to do the following:

        1. Use a regex to replace all the newlines in each block with some delimiter that is not present in the file, while preserving newlines outside the block.
        2. Sort the lines of the document using the above script or one of the built-in sorts.
        3. Replace the delimiter with newlines.
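
The three steps above can be sketched in plain Python, outside Notepad++. This is only an illustration under assumptions: `sort_blocks`, `ITEM_RE` and the sample XML are hypothetical names and data, Python's `re` module stands in for Notepad++'s regex engine, and text that is not part of any matched block is simply dropped.

```python
import re

def sort_blocks(text, block_regex, key_transform=str, delim='\x01', eol='\n'):
    """Sort multi-line blocks matched by block_regex; group 1 is the sort key."""
    if delim in text:
        raise ValueError('delimiter already occurs in the text; pick another')
    flattened = []
    for m in re.finditer(block_regex, text, re.MULTILINE):
        # step 1: collapse the block onto one line by swapping newlines for delim
        one_line = m.group(0).rstrip('\r\n').replace(eol, delim)
        flattened.append((key_transform(m.group(1)), one_line))
    # step 2: sort the collapsed lines by their key
    flattened.sort()
    # step 3: turn the delimiter back into newlines
    return eol.join(line.replace(delim, eol) for _, line in flattened)

sample = (
    '<item id="20">\n  twenty\n</item>\n'
    '<item id="3">\n  three\n</item>\n'
)
ITEM_RE = r'^<item id="(\d+)">\n(?:.*\n)*?</item>\n?'
result = sort_blocks(sample, ITEM_RE, key_transform=int)
print(result)
```

With `key_transform=int` the block with id 3 sorts ahead of the one with id 20; with the default `str` it would sort after it.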

        This process is annoying, so I decided to automate it.

        Behold!

        # -*- coding: utf-8 -*-
        from __future__ import print_function
        
        #########################################
        #
        #  SortLinesWithRegexGroup1AsSortKey (SLWRG1ASK)
        #
        #########################################
        
        # references:
        #  https://community.notepad-plus-plus.org/topic/24325
        
        #-------------------------------------------------------------------------------
        
        from Npp import *
        import inspect
        import os
        import re
        import threading
        import unicodedata
        import SendKeys as sk
        
        #-------------------------------------------------------------------------------
        
        NUMBER_RE = r'([-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?\d+)?)'
        INT_RE = r'([+-]?[0-9]+)'
        
        def ignorecase_culture_invariant(x):
            return unicodedata.normalize('NFD', x.upper())
        
        class SLWRG1ASK(object):
        
            def __init__(self):
        
                self.this_script_name = inspect.getframeinfo(inspect.currentframe()).filename.split(os.sep)[-1].rsplit('.', 1)[0]
        
                self.key_and_line_content_tup_list = []
                self.matched_line_numbers = set()
                self.key_did_not_match_specified_type__ERROR = False
                self.user_did_not_specify_group1__ERROR = False
                self.not_all_lines_matched__ERROR = False
                self.last_matched_line_num = None
                self.user_line_num_of_error = None
                self.zero_matches_for_regex = True
                self.key_transform = None
                self.do_descending_sort = False
                self.each_line_must_have_regex_group1_key_match = True
                self.remove_lines_that_dont_match_regex = False
                self.sort_lines_that_dont_match_to_bottom = False
                self.multiline = False
                self.multiline_block_delim = ''
                self.eol = [ '\r\n', '\r', '\n' ][ editor.getEOLMode() ]
        
                rect_sel_mode = editor.getSelectionMode() in [ SELECTIONMODE.RECTANGLE, SELECTIONMODE.THIN ]
                if rect_sel_mode or editor.getSelections() > 1:
                    self.mb('Cannot operate on a column selection or multiple selections; aborting!')
                    return
        
                if len(editor.getSelText()) == 0:
                    if not self.mb_ok_cancel('No selection active; confirm desire to sort entire file by pressing OK.'):
                        return
        
                (search_start_line, search_end_line) = editor.getUserLineSelection()
                if editor.positionFromLine(search_end_line) == editor.getSelectionEnd(): search_end_line -= 1
                num_lines_in_user_sel = search_end_line - search_start_line + 1
                if num_lines_in_user_sel < 2: return  # if selection only on 1 line, nothing to do; already sorted!
        
                user_regex_with_group_1 = ''
                used_DECIMAL_RE = False
                while True:
                    user_regex_with_group_1 = self.prompt('Enter regex with group 1 being the sort key.\r\nIf you wish, you can use (INT) to capture an integer or (DECIMAL) to capture a decimal', user_regex_with_group_1)
                    if user_regex_with_group_1 is None: return  # user cancel
                    if len(user_regex_with_group_1.strip()) == 0:
                        self.mb('Cannot specify an empty regex!  Try again.')
                        continue
                    if '(' not in user_regex_with_group_1:
                        self.mb('Need to specify capture group 1 (as the sort key) in the regex!  Try again.')
                        continue
                    regex_err_msg = self.search_regex_is_invalid_error_msg(user_regex_with_group_1)
                    if len(regex_err_msg) > 0:
                        self.mb('Bad regular expression!\r\n\r\n{e}\r\n\r\nTry again.'.format(e=regex_err_msg))
                        continue
                    used_DECIMAL_RE = '(DECIMAL)' in user_regex_with_group_1
                    user_regex_with_group_1 = user_regex_with_group_1.replace('(INT)', INT_RE).replace('(DECIMAL)', NUMBER_RE)
                    break
        
                while True:
                    threading.Timer(0.25, lambda : sk.SendKeys("{RIGHT}")).start()  # clear auto-selection of prompt text when box appears
                    user_options = self.prompt(
                        '    Regex specified: {r}\r\n'
                        'Choose options by placing 1 x in each group (no need to x to get DEFAULT choice)'.format(r=user_regex_with_group_1),
                        '\r\n'.join([
                            'Group 1 transformation:  [   ]Integer(DEFAULT)      [  {0}]Decimal      [   ]String      [   ]Ignorecase-cultural'.format(' x'[used_DECIMAL_RE]),
                            'Sort order:  [   ]Ascending(DEFAULT)    [   ]Descending',
                            'Other:',
                            '[   ]Sort-only-if-all-lines-have-group-1-match(DEFAULT)',
                            '[   ]Remove-non-matching-lines',
                            '[   ]Sort-non-matching-lines-to-bottom',
                            '[   ]Multi-line-blocks',
                        ])
                        )
                    if user_options is None: return  # user cancel
                    if len(user_options.strip()) == 0:
                        self.mb('Cannot specify empty options!  Try again.')
                        continue
                    if self.option_check(user_options, 'String'): self.key_transform = str
                    elif self.option_check(user_options, 'Decimal'): self.key_transform = float
                    elif self.option_check(user_options, 'Ignorecase-cultural'): self.key_transform = ignorecase_culture_invariant
                    else: self.key_transform = int
                    self.do_descending_sort = self.option_check(user_options, 'Descending')
                    self.each_line_must_have_regex_group1_key_match = self.option_check(user_options, 'Sort-only-if-all-lines-have-group-1-match')
                    self.remove_lines_that_dont_match_regex = self.option_check(user_options, 'Remove-non-matching-lines')
                    self.sort_lines_that_dont_match_to_bottom = self.option_check(user_options, 'Sort-non-matching-lines-to-bottom')
                    self.multiline = self.option_check(user_options, 'Multi-line')
                    if not (self.remove_lines_that_dont_match_regex or self.sort_lines_that_dont_match_to_bottom):
                        self.each_line_must_have_regex_group1_key_match = True
                    if self.each_line_must_have_regex_group1_key_match + self.remove_lines_that_dont_match_regex + self.sort_lines_that_dont_match_to_bottom > 1:
                        self.mb('Can only specify one way of handling unmatched lines! Try again!')
                        continue
                    break
        
                search_start_pos = editor.positionFromLine(search_start_line)
                search_end_pos = editor.getLineEndPosition(search_end_line)
                
                if self.multiline:
                    # find a character (or multi-char sequence) that is not in the document
                    delim = 1
                    delim_len = 1
                    # getTextRange takes (start, end); getRangePointer takes (start, length)
                    allText = editor.getTextRange(search_start_pos, search_end_pos)
                    while True:
                        # consider only SOH, STX, ETX, EOT, ENQ, ACK, BEL (NUL is skipped below)
                        delim = delim % 8
                        if delim == 0:
                            delim += 1
                        delim_str = chr(delim) * delim_len
                        if delim_str not in allText:
                            self.multiline_block_delim = delim_str
                            break
                        else:
                            delim += 1
                            delim_len += 1
                else:
                    # enforce single-line matching and only get the FIRST match on each line:
                    user_regex_with_group_1 = '(?-s)' + user_regex_with_group_1 + '.*'
        
        
                # special "seeding", in case the very first line of the selection isn't hit:
                self.last_matched_line_num = search_start_line - 1
        
                editor.research(user_regex_with_group_1, lambda m: self.match_found(m), 0, search_start_pos, search_end_pos)
        
                if self.zero_matches_for_regex:
                    self.mb('No lines matched the regex; aborting!')
                    return
        
                if self.user_did_not_specify_group1__ERROR:
                    self.mb('No sort key specified (via using capture group 1) in the regex; aborting!')
                    return
        
                if self.key_did_not_match_specified_type__ERROR:
                    self.mb('Key data did not match specified type ({t}) on line {L}; aborting!'.format(L=self.user_line_num_of_error,
                        t={ int : 'Integer', float : 'Decimal', str : 'String', ignorecase_culture_invariant: 'String' }[self.key_transform]))
                    return
        
                if self.each_line_must_have_regex_group1_key_match and self.not_all_lines_matched__ERROR:
                    self.mb('Not every line in the source data matched the specified regex (first non-matching line was {L}) OR there were multiple matches in at least one line; aborting!'.format(L=self.user_line_num_of_error))
                    return
        
                if len(self.key_and_line_content_tup_list) < 2:
                    self.mb('Nothing (reasonable) found to sort; aborting!')
                    return
                
        
                sorted_line_list = [tup[1] for tup in sorted(self.key_and_line_content_tup_list, reverse=self.do_descending_sort)]
                
                if self.multiline:
                    sorted_line_list = [x.replace(self.multiline_block_delim, self.eol) for x in sorted_line_list]
                elif self.matched_line_numbers:
                    first_unmatched_line = search_start_line + len(self.matched_line_numbers) + 1
                    self.mb('Line numbers {0}-{1} in sorted text were NOT matched.'.format(first_unmatched_line, search_end_line + 1),
                        title='not all lines matched')
                    for line_num in range(search_start_line, search_end_line + 1):
                        if line_num not in self.matched_line_numbers:
                            unmatched_line = editor.getLine(line_num)
                            sorted_line_list.append(unmatched_line.rstrip('\r\n'))
        
                doc_len_before = editor.getLength()
        
                editor.setSelection(search_end_pos, search_start_pos)
                editor.replaceSel(self.eol.join(sorted_line_list))
        
                doc_len_delta = editor.getLength() - doc_len_before
        
                # leave sorted text selected:
                editor.setSelection(search_end_pos + doc_len_delta, search_start_pos)
        
            def match_found(self, m):
        
                self.zero_matches_for_regex = False
        
                # we're already corrupt; don't bother processing further matches after the one where we found the problem
                if self.user_did_not_specify_group1__ERROR: return
                if self.key_did_not_match_specified_type__ERROR: return
                if self.not_all_lines_matched__ERROR: return
        
                (start_pos_of_match, end_pos_of_match) = m.span(0)
                line_num_of_match = editor.lineFromPosition(start_pos_of_match)
                
                try:
                    g1_str = m.group(1)
                except IndexError:
                    self.user_did_not_specify_group1__ERROR = True
                    return
        
                try:
                    key_with_correct_type = self.key_transform(g1_str)
                except ValueError:
                    self.key_did_not_match_specified_type__ERROR = True
                    self.user_line_num_of_error = line_num_of_match + 1
                    return
        
                if self.multiline:
                    line_content = m.group(0).rstrip('\n\r').replace(self.eol, self.multiline_block_delim)
                else:
                    if line_num_of_match != self.last_matched_line_num + 1:
                        if self.each_line_must_have_regex_group1_key_match:
                            self.not_all_lines_matched__ERROR = True
                            system_line_num_of_err = self.last_matched_line_num + 1
                            self.user_line_num_of_error = system_line_num_of_err + 1
                            return
                    self.last_matched_line_num = line_num_of_match
                    if self.sort_lines_that_dont_match_to_bottom:
                        self.matched_line_numbers.add(line_num_of_match)
                    line_content = editor.getLine(line_num_of_match).rstrip('\n\r')
        
                tup = (key_with_correct_type, line_content)
                self.key_and_line_content_tup_list.append(tup)
        
            def option_check(self, input_text, option_text):
                # match a "[ x ]Option" checkbox; group 1 is whatever is between the brackets
                m = re.search(r'\[([^]]+)\] ?{opt}'.format(opt=option_text), input_text)
                retval = True if m and m.group(1) != ' ' * len(m.group(1)) else False
                return retval
        
            def search_regex_is_invalid_error_msg(self, test_regex):
                try:
                    # test test_regex for validity
                    editor.research(test_regex, lambda _: None)
                except RuntimeError as r:
                    return str(r)
                return ''
        
            def mb(self, msg, flags=0, title=''):  # a message-box function
                return notepad.messageBox(msg, title if title else self.this_script_name, flags)
        
            def mb_ok_cancel(self, msg, title=''):  # returns True(OK) or False(Cancel)
                okay = notepad.messageBox(msg, title if title else self.this_script_name, MESSAGEBOXFLAGS.OKCANCEL) == MESSAGEBOXFLAGS.RESULTOK
                return okay
        
            def prompt(self, prompt_text, default_text=''):
                if '\n' not in prompt_text: prompt_text = '\r\n' + prompt_text
                prompt_text += ':'
                return notepad.prompt(prompt_text, self.this_script_name, default_text)
        
        #-------------------------------------------------------------------------------
        
        if __name__ == '__main__': SLWRG1ASK()
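
For readers who only need the numeric case from the thread title, the placeholder expansion and the descending sort can be tried outside Notepad++ with the sketch below. It reuses `NUMBER_RE` and `INT_RE` from the script; `expand_placeholders`, `sort_lines_by_key` and the sample lines are hypothetical, and Python's `re` module is assumed to be close enough to the editor's regex engine for these simple patterns.

```python
import re

NUMBER_RE = r'([-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?\d+)?)'
INT_RE = r'([+-]?[0-9]+)'

def expand_placeholders(user_regex):
    # same substitution the script applies to the regex the user types in
    return user_regex.replace('(INT)', INT_RE).replace('(DECIMAL)', NUMBER_RE)

def sort_lines_by_key(lines, user_regex, key_transform, descending=False):
    pattern = re.compile(expand_placeholders(user_regex))
    keyed = []
    for line in lines:
        m = pattern.search(line)
        if m is None:
            raise ValueError('line does not match: {0!r}'.format(line))
        # pair the typed key with the whole line, like the script's tuple list
        keyed.append((key_transform(m.group(1)), line))
    return [line for _, line in sorted(keyed, reverse=descending)]

int_lines = ['apples: 7', 'pears: 42', 'plums: -3']
largest_first = sort_lines_by_key(int_lines, r': (INT)', int, descending=True)

dec_lines = ['x=1.5e2', 'x=20', 'x=.5']
smallest_first = sort_lines_by_key(dec_lines, r'x=(DECIMAL)', float)

print(largest_first)
print(smallest_first)
```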
        

        EXAMPLE: sorting the functions in a Python file

        def foo(a) -> ruon:
            fdnfoen
            eorjerjeon
            if neoren:
                eorneor
            for orneor in reiroewh:
                orneroej
                def xjorneo(zser):
                    ornero("defoorne")
                
        def zargothrax(rinonrururr: int, bozo: float, 
        defrnore: rourno) -> ziiron:
            oerjjeorno
        def bar(b,
        c,
        d,
        e):
            oerneorn
            erjelrkererei
            o33n4o3jo3443i3i
            043304 343kj3kj
            def noer(x):
                fnoerno
        def uonreorn (x, b, z): ornepronp
        
        def barru(rnx):
            pass
            
        def ournoerno
          (fnorne
         ):
            pass
        

        can have its functions sorted using the following regex: (?-si)^def\h+(\w+)\s*\([\w:\s,]+?\)\s*(?:->[^:]+)?:(?:.+$|\h*\R(?:\x20+.*\R?)+) and the following options selected:

        Group 1 transformation:  [   ]Integer(DEFAULT)      [   ]Decimal      [   ]String      [   x]Ignorecase-cultural
        Sort order:  [   ]Ascending(DEFAULT)    [   ]Descending
        Other:
        [   ]Sort-only-if-all-lines-have-group-1-match(DEFAULT)
        [   ]Remove-non-matching-lines
        [   ]Sort-non-matching-lines-to-bottom
        [  x ]Multi-line-blocks
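
As a rough check of how the options text above gets interpreted, here is a stand-alone sketch of the checkbox parsing. It is modelled on the script's option_check method but written as a free function, and it uses re.escape, which the script does not. A box counts as ticked when anything other than spaces sits between the brackets.

```python
import re

def option_check(input_text, option_text):
    # find "[...]" directly before the option name (at most one space between)
    m = re.search(r'\[([^]]+)\] ?' + re.escape(option_text), input_text)
    # ticked means the bracket contents are not all spaces
    return bool(m) and m.group(1).strip(' ') != ''

options = '\r\n'.join([
    'Group 1 transformation:  [   ]Integer(DEFAULT)      [   ]Decimal      [   ]String      [   x]Ignorecase-cultural',
    'Sort order:  [   ]Ascending(DEFAULT)    [   ]Descending',
    'Other:',
    '[   ]Sort-only-if-all-lines-have-group-1-match(DEFAULT)',
    '[   ]Remove-non-matching-lines',
    '[   ]Sort-non-matching-lines-to-bottom',
    '[  x ]Multi-line-blocks',
])

for name in ('String', 'Decimal', 'Ignorecase-cultural', 'Descending', 'Multi-line'):
    print(name, option_check(options, name))
```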
        
        • Paul Wormer @Mark Olson

          @Mark-Olson said in How i can sort number from largest to smallest on notepad++:

          (?-si)^def\h+(\w+)\s*\([\w:\s,]+?\)\s*(?:->[^:]+)?:(?:.+$|\h*\R(?:\x20+.*\R?)+)

          I see that you use \h, \s and \x20 for a space in one and the same regex. Is there a deeper reason for that? Or is it didactic in that you want to point out that there are more possibilities for a space?

          • Mark Olson @Paul Wormer

            @Paul-Wormer said in How i can sort number from largest to smallest on notepad++:

            I see that you use \h, \s and \x20 for a space in one and the same regex. Is there a deeper reason for that?

            Yes, there are reasons for all of those decisions:

            1. \h between def and the name of the function because they must be on the same line but (I believe) there can be arbitrary spacing between them
            2. \s in the region containing the arguments because the arguments can be separated by arbitrary whitespace
            3. \x20 because I’m assuming that the function is indented by spaces.
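
The difference between the three classes can be checked with a short sketch. Python's `re` module has no \h, so the character class `[ \t]` is used here as a rough stand-in (an assumption: the \h in Notepad++'s regex engine covers more horizontal whitespace than just space and tab). The names `H`, `samples` and `table` are made up for the demo.

```python
import re

H = r'[ \t]'  # rough stand-in for \h: horizontal whitespace only

samples = {'space': ' ', 'tab': '\t', 'newline': '\n'}

# for each sample character, record which of the three patterns matches it
table = {
    name: {
        'h': bool(re.fullmatch(H, ch)),
        's': bool(re.fullmatch(r'\s', ch)),
        'x20': bool(re.fullmatch(r'\x20', ch)),
    }
    for name, ch in samples.items()
}

for name, row in table.items():
    print(name, row)
```

Only \s crosses line boundaries, which is why it suits the argument list, while the horizontal-only class keeps def and the function name on one line.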
            • Paul Wormer @Mark Olson

              @Mark-Olson OK, that is clear. Thank you for the explanation.

              • Alan Kilborn

                If you’ve used a script in this thread, you might want to double check your copy of it for a bug I’ve discovered.
                Look to previous postings in this topic thread where the script has been changed – find the text moderator edit (2024-Jan-14).
                There’s a link there that describes the bug in more detail, and shows what needs to be changed in an old copy (or you can simply grab a copy of the current version).
