Delete columns using Notepad++



  • Hello everyone, I would appreciate your help. I have a file of approximately 35k in the following format:

    col1:col2:col3:col4:col5

    My question is whether the columns can be combined in pairs, as follows:

    col1:col2
    col1:col3
    col1:col4
    col1:col5

    col2:col3
    col2:col4
    col2:col5

    col3:col4
    col3:col5

    col4:col5

    How can I do this using Notepad++?
    Thanks.



  • @favio-rodi

    It might be possible. First, though, I'm curious how that kind of data manipulation is useful. Can you explain the real-world use case?



  • @favio-rodi,

    It appears to me that you are trying to list all the combinations of those 5 columns, taken 2 at a time, preserving the original column order. Personally, for data combinatorics, I would use a full programming language. I would use Perl; many others would use Python, Lua, or some other personal favorite. A benefit of the latter two is that Notepad++ plugins are available that embed the language in Notepad++ and give you direct access to the currently open files, so you don't have to handle the file I/O portion of the code yourself.

    Assuming it’s literally 5 columns, you can do it rather straightforwardly using a regular expression:

    • Find what: ^([^:\r\n]+):([^:\r\n]+):([^:\r\n]+):([^:\r\n]+):([^:\r\n]+)$
    • Replace with: $1:$2\r\n$1:$3\r\n$1:$4\r\n$1:$5\r\n\r\n$2:$3\r\n$2:$4\r\n$2:$5\r\n\r\n$3:$4\r\n$3:$5\r\n\r\n$4:$5\r\n
    • Search Mode = Regular Expression

    This finds 5 groups separated by colons, all on a single line, and replaces them with the combinations of columns, separated by Windows newlines (CRLF).

    If you wanted an expression that would handle pairs of columns taken from 6 original columns, you would need one extra term in the Find what, and you would need nCr = C(n,r) = C(6,2) = 6!/4!/2! = 15 terms in the Replace with. If you want to handle an arbitrary number n of columns, you would probably want to do it in a programming language, or at least generate the regular expression in a program (because manually writing the Find what and Replace with was tedious with 5 columns, and would get rather annoying pretty quickly beyond that).
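
    As a rough illustration of the "generate the regular expression in a program" idea, here is a minimal Python sketch that builds the Find what and Replace with strings for n columns taken 2 at a time. The function name build_find_replace and its parameters are made up for this example, and it assumes the separator is a single character with no special meaning in a regular expression.

```python
from itertools import combinations


def build_find_replace(n, sep=":"):
    """Return (find, replace) strings for pairing n sep-delimited columns.

    Hypothetical helper: assumes sep is one character that needs no
    escaping inside a regular expression (true for ':').
    """
    # One capture group per column: one or more characters that are
    # neither the separator nor a line break.
    group = r"([^%s\r\n]+)" % sep
    find = "^" + sep.join([group] * n) + "$"

    lines = []
    prev_first = None
    for a, b in combinations(range(1, n + 1), 2):
        # Start a new block (blank line) whenever the first column changes.
        if prev_first is not None and a != prev_first:
            lines.append("")
        lines.append(f"${a}{sep}${b}")
        prev_first = a

    # Join with a literal backslash-r backslash-n, as typed in the dialog.
    replace = r"\r\n".join(lines) + r"\r\n"
    return find, replace


if __name__ == "__main__":
    find, replace = build_find_replace(6)
    print("Find what:   ", find)
    print("Replace with:", replace)
```

    For n = 5 this reproduces the two strings given earlier. For 10 or more columns, a reference such as $10 may need to be written as ${10} in the Replace with field; that is untested here, so check it before relying on the output.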

    With the 5-column regular expression above, if I start with

    col1:col2:col3:col4:col5
    one:two:doesnt match because five 'columns' split across:
    multiple:lines
    a:b:c:d:e
    

    I end up with

    col1:col2
    col1:col3
    col1:col4
    col1:col5

    col2:col3
    col2:col4
    col2:col5

    col3:col4
    col3:col5

    col4:col5

    one:two:doesnt match because five 'columns' split across:
    multiple:lines
    a:b
    a:c
    a:d
    a:e

    b:c
    b:d
    b:e

    c:d
    c:e

    d:e
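
    For comparison, the programming-language route mentioned earlier could look roughly like the Python sketch below. It is not the regular expression itself, just an approximation of the same behavior: lines with exactly n non-empty fields are expanded into pairs, and every other line is left unchanged. The name pair_columns is invented for this example, and the sketch joins lines with "\n" rather than CRLF.

```python
from itertools import combinations


def pair_columns(text, n=5, sep=":"):
    """Expand each line of n sep-delimited fields into all 2-column pairs.

    Hypothetical helper; lines that do not have exactly n non-empty
    fields are copied through unchanged.
    """
    out = []
    for line in text.splitlines():
        cols = line.split(sep)
        if len(cols) != n or not all(cols):
            out.append(line)  # not a full record: keep as-is
            continue
        prev_first = None
        for i, j in combinations(range(n), 2):
            # Blank line between blocks that share the same first column.
            if prev_first is not None and i != prev_first:
                out.append("")
            out.append(cols[i] + sep + cols[j])
            prev_first = i
        out.append("")  # blank line after each expanded record
    return "\n".join(out)
```

    Calling pair_columns on the four-line sample input should give the same listing as shown above. Because it uses itertools.combinations, changing n is enough to handle a different column count.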



  • @PeterJones, thank you very much, Mr. PeterJones.

