Delete columns using Notepad++
-
Hello everyone, I'm hoping you can help me. I have a file of approximately 35k in the following format:
col1:col2:col3:col4:col5
My question is whether the columns can be combined in pairs, as follows:
col1:col2
col1:col3
col1:col4
col1:col5

col2:col3
col2:col4
col2:col5

col3:col4
col3:col5

col4:col5
How can I do this using Notepad++?
Thanks
-
It might be possible. First, though, I'm curious how that kind of data manipulation is useful. Can you explain the real-world use case?
-
It appears that you are trying to list all the combinations of those 5 columns, taken 2 at a time, preserving the original column order. Personally, for data combinatorics, I would use a full programming language. I would use Perl; many others would use Python, Lua, or some other personal favorite. A benefit of the latter two is that there are Notepad++ plugins which embed the language in Notepad++ and give you direct access to the currently open files, so you don't have to handle the file I/O portion of the code yourself.
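To illustrate the programming-language route, here is a minimal Python sketch that works on plain strings. The function names `pair_lines` and `expand` are made up for this example, and hooking it into Notepad++ through a plugin is not shown.

```python
from itertools import combinations

def pair_lines(line, sep=":"):
    """Return every 2-column combination of one record, in original column order."""
    fields = line.rstrip("\r\n").split(sep)
    # combinations() keeps the input order, so col1 always precedes col2, etc.
    return [sep.join(pair) for pair in combinations(fields, 2)]

def expand(text, sep=":"):
    """Expand every non-empty line of text into its column pairs."""
    out = []
    for line in text.splitlines():
        if line.strip():
            out.extend(pair_lines(line, sep))
    return "\n".join(out)

print(expand("col1:col2:col3:col4:col5"))
```

Unlike the regular expression that follows, this handles any number of columns per line, because `combinations` does the enumeration.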
Assuming it’s literally 5 columns, you can do it rather straightforwardly using a regular expression:
- Find what:
^([^:\r\n]+):([^:\r\n]+):([^:\r\n]+):([^:\r\n]+):([^:\r\n]+)$
- Replace with:
$1:$2\r\n$1:$3\r\n$1:$4\r\n$1:$5\r\n\r\n$2:$3\r\n$2:$4\r\n$2:$5\r\n\r\n$3:$4\r\n$3:$5\r\n\r\n$4:$5\r\n
- Search Mode = Regular Expression
This finds 5 groups separated by colons, all on a single line, and replaces them with the pairwise column combinations, separated by Windows newlines (CRLF).
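If you want to try the expression outside Notepad++, the sketch below applies the same pattern with Python's `re` module. Two things differ and are assumptions of this sketch: Python writes group references as `\1` rather than `$1`, and its `$` does not treat CRLF as a single line ending, so the text is processed one line at a time. `expand_line` is a hypothetical helper name.

```python
import re

# Same pattern as the Find what field above.
FIND = r"^([^:\r\n]+):([^:\r\n]+):([^:\r\n]+):([^:\r\n]+):([^:\r\n]+)$"

# Same replacement as the Replace with field, with $N rewritten as \N for Python.
REPLACE = (
    r"\1:\2\r\n\1:\3\r\n\1:\4\r\n\1:\5\r\n\r\n"
    r"\2:\3\r\n\2:\4\r\n\2:\5\r\n\r\n"
    r"\3:\4\r\n\3:\5\r\n\r\n"
    r"\4:\5\r\n"
)

def expand_line(line):
    """Expand one line (without its line ending); non-matching lines come back unchanged."""
    return re.sub(FIND, REPLACE, line)

for line in ["a:b:c:d:e", "multiple:lines"]:
    print(repr(expand_line(line)))
```

A line that does not consist of exactly five non-empty, colon-separated fields is left alone, which is the same behavior the regular expression has in Notepad++.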
If you wanted an expression that would handle pairs of columns taken from 6 original columns, you would need one extra term in the Find What, and you would need
nCr = C(n,r) = C(6,2) = 6!/4!/2! = 15
terms in the Replace With. If you want to be able to handle an arbitrary number n of columns, you would probably want to do it in a programming language, or at least generate the regular expression in a program (because manually generating the Find what and Replace with was tedious with 5 columns, and would get rather annoying pretty quickly beyond that).
With the expression above, if I start with
col1:col2:col3:col4:col5
one:two:doesnt match because five 'columns' split across:
multiple:lines
a:b:c:d:e
I end up with
col1:col2
col1:col3
col1:col4
col1:col5

col2:col3
col2:col4
col2:col5

col3:col4
col3:col5

col4:col5
one:two:doesnt match because five 'columns' split across:
multiple:lines
a:b
a:c
a:d
a:e

b:c
b:d
b:e

c:d
c:e

d:e
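As for generating the expression in a program, a rough Python sketch could build the Find what and Replace with strings for any column count n. The helper names `build_find` and `build_replace` are invented here, and for n of 10 or more the `$10`-style references are untested and may need a different form in Notepad++.

```python
from math import comb  # requires Python 3.8 or newer

def build_find(n):
    """Find what: n colon-separated capture groups on one line."""
    return "^" + ":".join([r"([^:\r\n]+)"] * n) + "$"

def build_replace(n):
    """Replace with: all pairs $i:$j with i < j, one blank line between groups."""
    groups = []
    for first in range(1, n):
        groups.append(
            "".join(f"${first}:${second}\\r\\n" for second in range(first + 1, n + 1))
        )
    # The literal text \r\n between groups produces the blank separator line.
    return "\\r\\n".join(groups)

n = 6
print(build_find(n))
print(build_replace(n))
print(comb(n, 2), "terms")  # number of pairs, C(n,2)
```

For n = 5 this reproduces the two strings given earlier, and for n = 6 the replacement contains the 15 terms computed above.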
-
@PeterJones, thank you very much, Mr. PeterJones