Sort by the largest number of the same results (frequency)



  • Hello

    I have a question about sorting.
    I’m currently using Notepad++, but it is difficult to sort the way I want.

    For example,
    I have a few sample lines:


    10AndrewJNVR
    10Andrpt1Pf7
    10Anuiot18g2
    10Andrew1H54
    10Andrew17yb
    10Andre614uw
    10Andr321EUb
    10And48n1Q2K
    10And1in15tA
    10Andrew13gG
    10An987ppKeP
    10Andrew1Dom


    How can I sort text in Notepad++ (or with an add-on) as shown below?
    After sorting, the lines whose names repeat most often (or are closest to each other) should be at the very top,
    followed by the next most frequent, and so on, as in the example below.

    So, how do I sort it this way:


    10Andrew1Dom
    10Andrew1H54
    10Andrew13gG
    10Andrew17yb
    10AndrewJNVR
    10Andre614uw
    10Andr321EUb
    10Andrpt1Pf7
    10And1in15tA
    10And48n1Q2K
    10Ana87ppKeP
    10Anuiot18g2


    please help



  • That’s less of a general purpose sorting function, so it’s not going to be native to Notepad++, and probably not already in any major plugin.

    In general, such tasks should be solved in a programming language of your choice. Using one of Notepad++’s scripting plugins, like PythonScript, you could implement the algorithm so that it works on the active file in Notepad++… but it’s still a general-purpose coding question, without much that’s specific to Notepad++.



  • To give you a leg up, here’s the Notepad++/PythonScript-plugin-specific aspect. You will just have to replace the indicated two lines from the script with the python algorithm that actually implements your specific sort. My example just sorts alphabetically, which I know isn’t what you want.

    # encoding=utf-8
    """in response to https://notepad-plus-plus.org/community/topic/19099/
    
    This does not solve the problem.  This just gives the Notepad++ and PythonScript specific parts of the answer
    
    The actual implementation of the sorting algorithm is a general Python programming exercise,
    and whether or not Notepad++ exists has no bearing on that part of the coding (thus not a question for this forum)
    """
    
    # step 0: assume data is active file in editor1 (main/left view)
    # debug: console.clear()
    
    # step 1: grab all the data from the editor1; keep the newline sequence, since I'll be printing it out later
    contentsArray = []
    def grabContentsArray(contents, lineNumber, totalLines):
        contentsArray.append(contents)
    
    editor1.forEachLine( grabContentsArray )
    
    # step 2: define a function that implements _your_ sort algorithm;
    #   it's a generic programming exercise, nothing Notepad++ or PythonScript-plugin specific,
    #   so left for you to implement
    def sortTheContents( inputArray ):
        # these next two lines should be replaced by the real algorithm
        returnArray = list(inputArray)  # this will have to be replaced by your actual algorithm
        returnArray.sort() # in-place alphabetical sort
        # once the algorithm is done, return the result here
        return returnArray
    
    sortedContents = sortTheContents( contentsArray )
    
    # step 3: replace the entire file's contents with the sorted data
    editor1.beginUndoAction()
    editor1.clearAll()
    # debug: console.show()
    for s in sortedContents:
        # debug: console.write(s)
        editor1.addText(s)
    
    editor1.endUndoAction()
    


  • If we’re just providing “a leg up”, then I’ll donate this bit, which I had started when Peter’s reply showed up:

    # -*- coding: utf-8 -*-
    
    def custom_sort_function(line_content):
        #
        # do your custom logic here
        # to return a pseudo-key
        # that will be used to determine
        # one line's placement relative
        # to another
        #
        return line_content  # <--------- this example merely returns original line, so we'll get a normal alpha sort
    
    lines_list = editor.getText().splitlines()
    
    lines_list.sort(key=custom_sort_function)
    
    eol = [ '\r\n', '\r', '\n' ][editor.getEOLMode()]  # Scintilla EOL modes: 0 = CRLF, 1 = CR, 2 = LF
    
    editor.beginUndoAction()
    editor.setText(eol.join(lines_list) + eol)
    editor.endUndoAction()
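
    As a minimal sketch of what such a key function could look like: the requested order was never fully specified, so the code below assumes one possible reading, namely that a line ranks higher the longer the prefix it shares with at least one other line, with ties broken alphabetically. The name `make_shared_prefix_key` is hypothetical. It is plain Python and does not depend on Notepad++.

```python
import os


def make_shared_prefix_key(lines):
    """Return a key function for list.sort()/sorted().

    Assumed rule (one reading of the question, not a confirmed spec):
    lines sharing a longer prefix with some other line come first;
    ties are broken alphabetically.
    """
    ordered = sorted(lines)
    best = {}
    for i, line in enumerate(ordered):
        # In a sorted list, the longest prefix a line shares with any
        # other line is always shared with one of its direct neighbours.
        neighbours = ordered[max(i - 1, 0):i] + ordered[i + 1:i + 2]
        shared = [len(os.path.commonprefix([line, n])) for n in neighbours]
        best[line] = max(shared or [0])

    def key(line):
        # Negative length so that longer shared prefixes sort first.
        return (-best[line], line)

    return key
```

    In the script above, this would replace the plain sort with `lines_list.sort(key=make_shared_prefix_key(lines_list))`. On the twelve sample lines it puts the four `10Andrew1…` lines first, then `10AndrewJNVR`, `10Andre614uw`, the two `10Andr…` lines, the two `10And…` lines, and finally the two `10An…` lines. That is close to, but not identical with, the order shown in the question.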
    


  • @Alan-Kilborn

    I’m sorry, but I don’t understand much of this.
    I inserted your code and executed it, but it doesn’t work.
    Could you walk me through it,
    step by step, with my example?



  • @Martin-X said in Sort by the largest number of the same results (frequency):

    I inserted your code and executed it but it doesn’t work

    We told you it wouldn’t. This forum isn’t a generic code-writing service. We gave you two ways to interface between the script and Notepad++. But you (or someone you hire – and this isn’t a jobs-board, either) will have to write the code that does the specific sort you want. Your sort algorithm does not exist in any prepackaged sort function that I know of. And that algorithm has nothing to do with Notepad++; once you get the data into the programming language (those two examples both got it into python), the problem is a general programming exercise, and nothing specific to this editor, so it’s off topic for the forum.

    Honestly, if I were programming this,

    1. I would skip the Notepad++ interface, and just use the programming language’s file I/O, because that’s often simpler
    2. I would force you to give a much better definition of your requirements, and of what is and isn’t present in the data set.

    But the algorithm you somewhat defined does not exist in Notepad++ or any of its plugins that I know of, and isn’t likely to be needed in the general Notepad++ usage. This is a perfect task for a programming language, and not something that is the topic of this forum.
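
    Point 1 above (plain file I/O instead of the Notepad++ interface) could look roughly like the sketch below. The function name `sort_file`, the UTF-8 encoding, and the default alphabetical order are all assumptions for illustration; the real sort rule would be passed in as `key`.

```python
import io


def sort_file(in_path, out_path, key=None):
    """Read in_path, sort its lines, and write them to out_path.

    key is an optional function mapping a line to its sort key;
    with key=None the lines are sorted alphabetically.
    UTF-8 is assumed here; adjust the encoding to match the data.
    """
    with io.open(in_path, "r", encoding="utf-8") as src:
        lines = src.read().splitlines()

    lines.sort(key=key)

    with io.open(out_path, "w", encoding="utf-8") as dst:
        # End the file with a newline unless it is empty.
        dst.write(u"\n".join(lines) + (u"\n" if lines else u""))
```

    A call such as `sort_file("input.txt", "sorted.txt")` (file names hypothetical) leaves the original untouched, so the result can be checked before it replaces anything.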



  • Hello, @martin-x, @alan-kilborn, @peterjones and All,

    In addition to the points made by @peterjones and @alan-kilborn, I don’t clearly understand your sort algorithm or your output list, shown below with some space characters added for readability:

    10Andrew 1Dom
    10Andrew 1H54
    10Andrew 13gG
    10Andrew 17yb
    10Andrew JNVR
    
    10Andr e614uw
    10Andr 321EUb
    10Andr pt1Pf7
    
    10And 1in15tA
    10And 48n1Q2K
    
    10An 987ppKeP      ( and NOT 10An a87ppKeP ! ) 
    10An uiot18g2
    

    The varying parts at the ends of the lines do not seem to follow either your initial order or any particular order!

    For instance, if you had sorted the ending parts alphabetically, you would get the following list:

    10Andrew 13gG
    10Andrew 17yb
    10Andrew 1Dom
    10Andrew 1H54
    10Andrew JNVR
    
    10Andr 321EUb
    10Andr e614uw
    10Andr pt1Pf7
    
    10And 1in15tA
    10And 48n1Q2K
    
    10An 987ppKeP
    10An uiot18g2
    

    Secondly, if I try to get closer to your sort algorithm, your list should be, strictly speaking:

    10Andrew1 3gG
    10Andrew1 7yb
    10Andrew1 Dom
    10Andrew1 H54
    
    10Andre 614uw
    10Andre wJNVR
    
    10Andr 321EUb
    10Andr pt1Pf7
    
    10And 1in15tA
    10And 48n1Q2K
    
    10An 987ppKeP
    10An uiot18g2
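
    One way to read this last grouping, as a sketch only: repeatedly take the longest prefix shared by at least two of the remaining lines, emit every remaining line that starts with it, and continue with what is left. This greedy rule is an assumption inferred from the list above, and the function name `group_by_longest_shared_prefix` is hypothetical.

```python
import os


def group_by_longest_shared_prefix(lines):
    """Greedily split lines into (prefix, group) pairs.

    Each round picks the longest prefix shared by two remaining lines
    and removes every remaining line that starts with that prefix.
    Lines inside a group are kept in alphabetical order.
    """
    remaining = sorted(lines)
    groups = []
    while len(remaining) > 1:
        # In a sorted list, the longest prefix shared by any two lines
        # is shared by some pair of adjacent lines.
        prefix = max(
            (os.path.commonprefix([a, b])
             for a, b in zip(remaining, remaining[1:])),
            key=len,
        )
        groups.append((prefix, [s for s in remaining if s.startswith(prefix)]))
        remaining = [s for s in remaining if not s.startswith(prefix)]
    if remaining:
        # A single leftover line forms its own group.
        groups.append((remaining[0], remaining))
    return groups
```

    Applied to the twelve original lines, this yields the prefixes `10Andrew1`, `10Andre`, `10Andr`, `10And` and `10An`, in that order, with the same lines per group as in the list above.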
    

    Finally, I tried to think of a solution involving regular expressions, without any useful result so far :-((

    Best Regards,

    guy038



  • Hi, @martin-x,

    I managed to come up with a process which, however, needs several regex searches and regex replacements, and is not easy to summarize in a post at first sight :-(

    So, if your data are not confidential or personal, can you e-mail me an average-sized example at: tguy.038@gmail.com

    Please also include the output you expect from that example. Moreover, could you describe, as accurately as possible, the custom sort algorithm you use?

    Best Regards,

    guy038



  • @guy038
    Nice :)
    OK, I will send you an example.
    Thanks

