Column-aligning jagged data



  • @PeterJones

    I think I can’t make it shorter ;-)

    editor.setText('\n'.join(['{:<20} {}'.format(*x.split()) for x in editor.getText().splitlines()]))
    

    where 20 is assumed to be the length of the longest string in the first column.

    Cheers
    Claudia
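The one-liner above hard-codes the width 20. As a minimal sketch in plain Python 3 (outside PythonScript, so `editor` is not used), the width can instead be measured from the data. The function name `align_two_columns` and the `gap` parameter are hypothetical, chosen only for illustration.

```python
def align_two_columns(text, gap=1):
    """Left-align the second field of every line just past the longest first field."""
    # Split each line into its first field and the remainder (if any).
    rows = [line.split(None, 1) for line in text.splitlines()]
    # Width of the longest first field; 0 when every line is blank.
    width = max((len(row[0]) for row in rows if row), default=0)
    out = []
    for row in rows:
        if len(row) == 2:
            out.append(row[0].ljust(width + gap) + row[1].rstrip())
        else:
            # Blank lines and single-field lines are kept as they are.
            out.append(row[0] if row else "")
    return "\n".join(out)
```

Inside PythonScript, the result could then be handed to `editor.setText(...)` in the same way as the one-liner does.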



  • Hi, @alan-kilborn, @claudia-frank, @peterjones, and All,

    Nevertheless, it’s quite simple, indeed !! Here are 3 different regex S/R :

    SEARCH ^.{12}\K + , with a space before the plus sign

    REPLACE EMPTY

    or

    SEARCH (?<=^.{12}) + , with a space before the plus sign

    REPLACE EMPTY

    or

    SEARCH ^(.{12}) + , with a space before the plus sign

    REPLACE \1

    Notes :

    • For the first two S/R, you must use the Replace All button only ( The step by step replacement does NOT work, due to the \K syntax or the look-behind )

    • The last S/R works with the Replace button, too !

    • Note that these regexes require the blank characters to be, exclusively, space characters !
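For readers who want to try the idea outside Notepad++, here is a minimal sketch using Python's `re` module. Python's `re` does not support `\K`, so the sketch uses the third form (capture group plus `\1`). The function name `trim_after_column` is hypothetical.

```python
import re

def trim_after_column(text, column=12):
    """Delete the run of spaces that follows the first `column` characters of each line."""
    # '.' does not match a newline here, and re.MULTILINE makes '^' match
    # at the start of every line.
    pattern = re.compile(r"^(.{%d}) +" % column, re.MULTILINE)
    return pattern.sub(r"\1", text)
```

Lines shorter than `column` characters, or with a non-space character right after position `column`, are left untouched, because the pattern simply does not match them.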


    Now, Alan, let’s try something more tricky : I simply copy all your list again, on the right, using the column mode !

    ----|----1----|----2----|----3----|----4----|----5----|----6----|----7----|----8----|----9----|----A----|----B
    
    trade                               Ground           trade                               Ground
    list                                Cry              list                                Cry
    free                       print                     free                       print
    Told                                Supply           Told                                Supply
    square              stood                            square              stood
    metal                 do                             metal                 do
    held                    shine                        held                    shine
    large                              boy               large                              boy
    map                 table                            map                 table
    book                                car              book                                car
    process               also                           process               also
    thank                        young                   thank                        young
    held                             if                  held                             if
    ship                       atom                      ship                       atom
    Have                         game                    Have                         game
    thousand                          strong             thousand                          strong
    case              most                               case              most
    head                      Tube                       head                      Tube
    those                          wait                  those                          wait
    sudden            triangle                           sudden            triangle
    while                                feed            while                                feed
    human                            order               human                            order
    paint                   sight                        paint                   sight
    mouth                            rope                mouth                            rope
    Hair                     suffix                      Hair                     suffix
    want                        this                     want                        this
    hot                           salt                   hot                           salt
    call                            house                call                            house
    similar                  experiment                  similar                  experiment
    count                      rub                       count                      rub
    quite            won't                               quite            won't
    opposite                      no                     opposite                      no
    note              low                                note              low
    process                       term                   process                       term
    to                              Fine                 to                              Fine
    Solution                       Season                Solution                       Season
    band                         block                   band                         block
    among                            direct              among                            direct
    who               These                              who               These
    between                  sugar                       between                  sugar
    ice                              leg                 ice                              leg
    took                                symbol           took                                symbol
    between                 Leg                          between                 Leg
    Design                Share                          Design                Share
    quotient               segment                       quotient               segment
    

    Then :

    • Place your cursor just under the ruler and before the first item trade

    • Open the Replace dialog

    • Leave the Replace with: zone EMPTY

    • Type, in the Find what: zone, the regex (?-s)^.{12}\K + , with a space before the plus sign

    • Click on the Replace All button

    => The second column is aligned :-)) Of course, the third and fourth ones are not aligned

    • Now, change the number 12 to 27, in the Find what: zone

    • Click, again, on the Replace All button

    => The third column is now aligned :-))

    • Now, change the number 27 to 43, in the Find what: zone

    • Click, a last time, on the Replace All button

    => All the columns are well aligned…, as below. Et voilà ! Note that the columns begin at positions 12+1, 27+1 and 43+1

    ----|----1----|----2----|----3----|----4----|----5----|----6----|----7----|----8----|----9----|----A----|----B
    
    trade       Ground         trade           Ground
    list        Cry            list            Cry
    free        print          free            print
    Told        Supply         Told            Supply
    square      stood          square          stood
    metal       do             metal           do
    held        shine          held            shine
    large       boy            large           boy
    map         table          map             table
    book        car            book            car
    process     also           process         also
    thank       young          thank           young
    held        if             held            if
    ship        atom           ship            atom
    Have        game           Have            game
    thousand    strong         thousand        strong
    case        most           case            most
    head        Tube           head            Tube
    those       wait           those           wait
    sudden      triangle       sudden          triangle
    while       feed           while           feed
    human       order          human           order
    paint       sight          paint           sight
    mouth       rope           mouth           rope
    Hair        suffix         Hair            suffix
    want        this           want            this
    hot         salt           hot             salt
    call        house          call            house
    similar     experiment     similar         experiment
    count       rub            count           rub
    quite       won't          quite           won't
    opposite    no             opposite        no
    note        low            note            low
    process     term           process         term
    to          Fine           to              Fine
    Solution    Season         Solution        Season
    band        block          band            block
    among       direct         among           direct
    who         These          who             These
    between     sugar          between         sugar
    ice         leg            ice             leg
    took        symbol         took            symbol
    between     Leg            between         Leg
    Design      Share          Design          Share
    quotient    segment        quotient        segment
    

    Of course, I just estimated, roughly, at each step, where the next column should begin, according to the longest string of the previous column. I don’t know, Alan, whether you consider this method to involve a lot of pre-calculation steps !!

    Cheers,

    guy038
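The three successive Replace All passes (12, then 27, then 43) can be sketched as a loop in plain Python 3. This is an illustration only: the function name `align_at_stops` is hypothetical, and the capture-group form of the regex is used because Python's `re` has no `\K`.

```python
import re

def align_at_stops(text, stops=(12, 27, 43)):
    """Run one 'Replace All' pass per stop, from the leftmost stop to the rightmost."""
    for stop in stops:
        # Keep the first `stop` characters of each line and delete the spaces after them.
        text = re.sub(r"^(.{%d}) +" % stop, r"\1", text, flags=re.MULTILINE)
    return text
```

Each pass only deletes spaces, so every column must already start to the right of its stop, and the stops must be applied in ascending order.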



  • I will play reverse-golf and make @Claudia-Frank 's version longer but IMO better…and still one line:

    editor.setText(['\r\n', '\r', '\n'][notepad.getFormatType()].join([('{:<' + str(editor.getColumn(editor.getCurrentPos())-1) + '} {}').format(*x.split()) for x in editor.getText().splitlines()]))
    

    Two changes:

    • do correct line-endings, not Linux–sorry Claudia!–line-endings
    • start the aligned data in the column the caret is in when the script is run (be sure to leave the caret in a column greater than the longest entry in the leftmost data “column”!)
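The two changes can also be sketched in plain Python 3, without the PythonScript `editor` and `notepad` objects. In this hypothetical `align_keep_eol`, `column` is assumed to be the 0-based column where the second field should start, and each line simply keeps whatever line ending it already had.

```python
def align_keep_eol(text, column):
    """Start the second field at 0-based `column`, keeping every line's own ending."""
    out = []
    for line in text.splitlines(True):  # True keeps the line endings
        body = line.rstrip("\r\n")
        eol = line[len(body):]
        fields = body.split()
        # Only lines with exactly two fields are realigned; others pass through unchanged.
        if len(fields) == 2:
            body = fields[0].ljust(column - 1) + " " + fields[1]
        out.append(body + eol)
    return "".join(out)
```

As with the one-liner, `column` has to be greater than the length of the longest entry in the leftmost column, otherwise the long entries push their second field further right.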


  • I deserve what I get because I didn’t quite ask in the right way. I was sort of looking for the solution to the general case. But in presenting example text I got specific answers to solve that specific thing (2 columns, whole file). Don’t get me wrong, the answers I got were awesome!–thanks to responders! Good ideas, all!

    Of the answers I think Scott’s (put caret in column…and then run script) starts getting at the interactivity I was hoping for. Another clarifying situation might be what if I want this to only affect certain lines, or only after a certain column point on specific lines…

    So I guess the main answer is that something like this is best served by scripting, although in the end I did like Guy’s regexes (although I did try to head off his enthusiasm for them with my earlier post).



  • Hi, @alan-kilborn,

    “Another clarifying situation might be what if I want this to only affect certain lines, or only after a certain column point on specific lines…”

    • To change text only after a specific column point c, simply use the regex ^.{c+ε}\K\x20+

    • To restrict the changes to a specific block of lines, first make a normal selection of that range of lines. Then, when you open the Replace dialog, the In selection option is automatically ticked, and the Replace All operation is performed on the selection only :-))

    Cheers,

    guy038
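Both restrictions (after a given column, and only within a block of lines) can be sketched together in plain Python 3. The function name `trim_in_lines` and its 1-based, inclusive line range are assumptions made for this example. The capture-group form replaces `\K`, which Python's `re` does not support.

```python
import re

def trim_in_lines(text, column, first, last):
    """Delete the spaces after `column` characters, but only on lines first..last."""
    pattern = re.compile(r"^(.{%d})\x20+" % column)
    lines = text.split("\n")
    # Line numbers are 1-based and inclusive, like a selection of whole lines.
    for i in range(first - 1, min(last, len(lines))):
        lines[i] = pattern.sub(r"\1", lines[i])
    return "\n".join(lines)
```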



  • @PeterJones

    The old man wasn’t invited to the tournament. Nevertheless, he ambled over to the tee box and took a swing with an ancient wooden driver that has been meticulously maintained for more than 40 years:

    gawk "{printf \"%-256s%s\n\",$1,$2}" $(FULL_CURRENT_PATH)
    

    :-)



  • Somewhat tangential, but a possible solution is the Elastic Tabstops plugin. It would only require a single tab between columns, but has the disadvantage of only working within Notepad++ itself.



  • Neither was the simpleton invited to the tournament but he stumbled up to the tee and out from his bag fell a TextFX plugin and hideous python script that would make a crow blush:

    # coding: iso-8859-1
    selected = editor.getSelText()
    selStart = editor.getSelectionStart()
    #replace any existing commas with a weird char
    selected = selected.replace(",", chr(174))
    #collapse runs of spaces into single spaces (also when the run starts at position 0)
    while "  " in selected:
        selected = selected.replace("  ", " ")
    #replace the spaces with commas since our 'line up' function uses commas
    selected = selected.replace(" ", ",")
    selEnd = len(selected)
    editor.replaceSel(selected)
    #re-select the selection
    editor.setSelectionStart(selStart)
    editor.setSelectionEnd(selStart + selEnd)
    notepad.runMenuCommand("TextFX Edit", "Line up multiple lines by (,)")
    notepad.runMenuCommand("TextFX Edit", "E:Line up multiple lines by (,)")
    selected = editor.getSelText()
    #take out the lineup commas
    selected = selected.replace(",", " ")
    #put back any original commas
    selected = selected.replace(chr(174), ",")
    editor.replaceSel(selected)
    

    This works for any number of columns, and only on lines in the current selection. It makes the columns as narrow as possible. I’m not really sure how you would line up things after a certain column point though.
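For comparison, the same behaviour (any number of columns, each as narrow as possible) can be sketched in plain Python 3 without TextFX. The function name `align_columns` and the `gap` parameter are hypothetical. The text passed in would be whatever is currently selected.

```python
def align_columns(text, gap=1):
    """Align whitespace-separated fields, making every column as narrow as possible."""
    rows = [line.split() for line in text.splitlines()]
    # widths[i] ends up as the length of the longest field in column i.
    widths = []
    for row in rows:
        for i, field in enumerate(row):
            if i == len(widths):
                widths.append(0)
            widths[i] = max(widths[i], len(field))
    out = []
    for row in rows:
        padded = [field.ljust(widths[i]) for i, field in enumerate(row)]
        # rstrip() drops the padding added to the last field of each line.
        out.append((" " * gap).join(padded).rstrip())
    return "\n".join(out)
```

Because the split is on whitespace rather than on commas, no placeholder character is needed for commas in the data.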



  • Hello, @cipher-1024, and All,

    I’m thinking about another solution, which still uses the TextFX plugin but avoids this [ hideous :-D ] Python Script !

    • First, use the following regex S/R :

    SEARCH \x20+

    REPLACE \x60

    Note : I specifically chose the Grave Accent character ( U+0060 ) as a dummy character, because it is both rarely used in programming languages ( AFAIK ! ) and part of all character encodings, since it belongs to the ASCII range ( from U+0000 to U+007F )

    • Copy a single ` ( Grave Accent ) to the clipboard, with the Ctrl + C shortcut ( IMPORTANT )

    • Now, do a normal selection of the text, which is to be aligned

    • Click on the menu choice TextFX > TextFX Edit > Line up multiple lines by (Clipboard Character)

    • Finally, use the regex, below, to delete the dummy Grave Accent character ` and add some space characters between columns, with a possible delimiter character !

    SEARCH \x60

    REPLACE \x20\x20\x20

    OR, for instance :

    SEARCH \x60

    REPLACE \x20\x20|\x20\x20

    Cheers,

    guy038



  • Other than “rarely used in programming languages,” I like that answer.

    Perl uses a pair of Grave Accents (aka “backticks”) as an often-used alternate for the qx// quote-like syntax for running a shell command and placing the command’s output in a string.

    SQL (in MySQL’s dialect, for example) uses backticks for denoting identifiers, such as field names.

    Markdown uses it for embedding inline fixed width text, like:

    embedding `inline` fixed width text
    

    But if you know your text has no backticks, then it’s a great choice.

    If your data might have backticks, I would use U+001C (\x1c), the Field Separator FS character, which is a control code found in ASCII. (I won’t make the claim that it’s “rarely used” in text files or programming language source code… but I’ve never seen it intentionally used in such. :-) )

    I think this style of solution meets the original requirements of not requiring complicated S/R regex or precomputing, which is nice.
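The "use it only if the text does not already contain it" check can be sketched in plain Python 3. The function names `pick_delimiter` and `mark_columns` are hypothetical, and the candidate order (backtick first, then `\x1c`) is just the preference expressed in this thread.

```python
import re

def pick_delimiter(text, candidates=("`", "\x1c")):
    """Return the first candidate character that does not occur in `text`."""
    for ch in candidates:
        if ch not in text:
            return ch
    raise ValueError("every candidate delimiter already occurs in the text")

def mark_columns(text):
    """Replace each run of spaces with a delimiter that is safe for this text."""
    delim = pick_delimiter(text)
    return delim, re.sub(r"\x20+", delim, text)
```

The returned delimiter is the character to copy to the clipboard before running the TextFX line-up command, and the one to search for in the final clean-up S/R.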



  • Hi, @PeterJones and All,

    So, I sincerely apologize ! My programming skills are weaker than those of most N++ users :-D.

    BTW, Peter, just have a look at the link below :

    https://en.wikipedia.org/wiki/C0_and_C1_control_codes

    It seems that the C0 control character ( \x1C ) rather refers to the File Separator control character ! Anyway, your idea about using a control character is great ! And, if we follow the description notes, it would be logical to prefer the US ( Unit Separator ) control character \x1F :-D

    Cheers,

    guy038

