Organizing data

Hello, I have data that is in long rows, but I need to make different columns while keeping the rows in the certain order.
For example, I have this:
ospht421 4.52E02 1.20E02 2.98E03 3.19E02 1.93E02 6.87E02 4.16E02 1.71E02 2087 2.95E02And I need it to look like this:
ospht421
4.52E02 1.20E02 2.98E03
3.19E02 1.93E02 6.87E02
4.16E02 1.71E02 2082.95E02I have a lot of data and it takes forever to do it by hand. Any advice would be appreciated!
Thanks,
Kim 
Hello Kimberly Plomp,
Once more, this kind of modification of text can be easily achieved with regular expressions !
At first sight, your text seems to contain a list of numbers, in the classical E scientific notation ! Am I right ?
I supposed so. Therefore, I, first, tried to build a regex, in order to match such numbers. Taking in account, all the different syntaxes for these numbers, with an example below :
+9.9999998E+03 +1.23e07 5.4625e+04 6.258E00 5.789E+02 1.008E06 +9.E+03 +1.E07 5.e+04 6.E00 5.E+02 1.e06 +9E+03 +1e07 5E+04 6e00 5E+02 1E06
I got this regex :
[+]?\d+(\.(\d+)?)?[Ee][+]\d\d

The part
[+]?
matches an optional sign ( remain that the regex quantifier?
means 0 or 1 time ) 
The part
\d+
matches the mantissa part, before the dot ( The regex quantifier+
means 1 or more time(s) ) 
The part
(\.(\d+)?)?
represents the optional fractional part of a number, itself composed of a dot\.
( which must be escaped as it’s a meta character in regexes ) and an optional digits(\d+)?
, after the dot 
The part
[Ee]
matches the E notation letter, whatever the case 
The part
[+]
matches the mandatory sign 
The part
\d\d
represents the two digits of the exponent
Right. Now, from your text, below :
ospht421 4.52E02 1.20E02 2.98E03 3.19E02 1.93E02 6.87E02 4.16E02 1.71E02 2082.95E02
We notice that any E notation number is preceded by a space. So, as you expect to split on several lines, every 3 numbers, we can build the final search regex
( [+]?\d+(\.(\d+)?)?[Ee][+]\d\d){3}
, which will matches 3 numbers
Enclosed in round brackets, it begins with a space and ends with the
{3}
quantifier 
The replacement part
\r\n$0
will replace this range of numbers by an Windows End of Line characters\r\n
, followed by the entire regex matched$0

Don’t forget, in the Replace dialog, to set the Regular expression search mode !
So , from the test example, below
ospht421 4.52E02 1.20E02 2.98E03 3.19E02 1.93E02 6.87E02 4.16E02 1.71E02 2082.95E02 ospht000 +9.9999998E+03 +1.23e07 5.4625e+04 6.258E00 5.789E+02 1.008E06 +9.E+03 +1.E07 5.e+04 ospht999 6.E00 5.E+02 1.e06 +9E+03 +1e07 5E+04 6e00 5E+02 1E06
After clicking on the Replace All button, we get the new text :
ospht421 4.52E02 1.20E02 2.98E03 3.19E02 1.93E02 6.87E02 4.16E02 1.71E02 2082.95E02 ospht000 +9.9999998E+03 +1.23e07 5.4625e+04 6.258E00 5.789E+02 1.008E06 +9.E+03 +1.E07 5.e+04 ospht999 6.E00 5.E+02 1.e06 +9E+03 +1e07 5E+04 6e00 5E+02 1E06
Possibly, to get rid of the leading space(s), just use the simple search regex
^ +
and leave the replacement zone emptyEt voilà !
Best Regards,
guy038


Thanks, guy038!
I have never used Notepad++ before, so I am not familiar with how to use it. I have tried your suggestion in the Replace box but it say there are 0 replacements.
Find what: ([±]?\d+(.(\d+)?)?[Ee][±]\d\d){3}
Replace with: \r\n$0
And the Regular Expression box is ticked.I am sure it is just a silly mistake on my part, but I can’t seem to get it working.
Thanks!
Kim 
Hi Kim,
From your reply, it happens that one SPACE should be present, in the search regex, between the opening round bracket and the block
[+]
!Well, I just realized that your numbers could be separated by more than one space characters ! Moreover, your file may, also, contain some tabulation characters. In that case, prefer the search regex below ( The replacement regex is unchanged ):
(\h+[+]?\d+(\.(\d+)?)?[Ee][+]\d\d){3}
Notes :

The
\h
regex syntax matches one horizontal blank character, that is to say :
The normal space character (
\x20
) 
The tabulation character (
\x09
) 
The NoBreak space ( abbreviated NBSP ) (
\xA0
)


As usual, the
+
quantifier means one or more times the previous character
So given the test example below, where I put, between numbers, several spaces, in the first line, several tabulations in the second line and a mix of these two, in the third line :
ospht421 4.52E02 1.20E02 2.98E03 3.19E02 1.93E02 6.87E02 4.16E02 1.71E02 2082.95E02 ospht000 +9.9999998E+03 +1.23e07 5.4625e+04 6.258E00 5.789E+02 1.008E06 +9.E+03 +1.E07 5.e+04 ospht999 6.E00 5.E+02 1.e06 +9E+03 +1e07 5E+04 6e00 5E+02 1E06
After clicking on the Replace All button, we get the following text :
ospht421 4.52E02 1.20E02 2.98E03 3.19E02 1.93E02 6.87E02 4.16E02 1.71E02 2082.95E02 ospht000 +9.9999998E+03 +1.23e07 5.4625e+04 6.258E00 5.789E+02 1.008E06 +9.E+03 +1.E07 5.e+04 ospht999 6.E00 5.E+02 1.e06 +9E+03 +1e07 5E+04 6e00 5E+02 1E06
Now, we have to get rid of all the possible blanks characters, at the beginning of the lines and just keep one space character between the different numbers. The following S/R will achieve it, nicely :
Find what :
^\h+(\h+)
Replace with :
(?1 )
, with a space after the digit 1Notes :

Due to the alternative regex symbol

, this search regex looks :
for any non null range of blank characters, that begins the line ( the
^
symbol represents the zero length assertion beginning of line ) 
for any subsequent range of these blank characters, between the numbers


The second part
(\h+)
, ( the subsequent blanks ), is enclosed in round brackets, and represents the group 1 
In replacement, we use the conditional form
(?n....)
:
which replace the searched regex with the string located after digit
n
, till the ending round bracket, if the groupn
exists 
which does NO replacement if the group
n
does NOT exist

So :

When the first part, of the search regex, matches possible blank characters, at beginning of a line, group 1, does not exist. Therefore, there’s NO replacement

When the second part, of the search regex, matches blank characters, before the numbers, group 1 exists. Then they are replaced with the single space, located between the digit 1 and the ending round bracket, of the replacement regex
This time, after performing this second S/R, we obtain the final text below :
ospht421 4.52E02 1.20E02 2.98E03 3.19E02 1.93E02 6.87E02 4.16E02 1.71E02 2082.95E02 ospht000 +9.9999998E+03 +1.23e07 5.4625e+04 6.258E00 5.789E+02 1.008E06 +9.E+03 +1.E07 5.e+04 ospht999 6.E00 5.E+02 1.e06 +9E+03 +1e07 5E+04 6e00 5E+02 1E06
Cheers,
guy038
