removing whitespace between columns in a .txt data file

Too Glad

Hi everyone,

Please I need a help. I have a data file saved in notepad but it has too much space between columns? Is there a way of removing these spaces. This is how my original data looks:

15 20 33 boy white
20 11 28 girl yellow

I want to remove the whitespace between the column so it will look like this:

15 20 33 boy white
20 11 28 girl yellow

Any suggestion will be greatly appreciated.

Thanks

Scott Sumner

@Too-Glad

To my eyes it appears you are done before you start: BEFORE = AFTER

:-D

Too Glad

No I’m not. Sadly the original file in my example is not showing well when I type. Basically, I want to remove the whitespace. The spaces between columns are too wide, so I want to reduce the space. Please ignore my initial post because the original data format and the desired format seems to be same. In fact, the original data format has too much space in between and the desired format is the one I posted in my initial post. Thanks

Too Glad

@Scott-Sumner

The BEFORE has too much space in between as in example below. The ‘>>>>’ is just showing the whitespace

15>>>> 20>>>> 33>>>> boy>>>> white
20>>>> 11>>>> 28>>>> girl>>>> yellow

The desired format should be like this:
15 20 33 boy white
20 11 28 girl yellow

Thanks

Scott Sumner

@Too-Glad

So, it’s like this:

15    20    33    boy    white
20    11    28    girl    yellow

So it sure seems like this would solve your problem:

Find what zone: \h+
Replace with zone: put a single space character here
Search mode: Regular expression
Action: Press Replace or Replace All as desired

Herb Martin

@To-Glad, @Scott-Sumner gave you a working method, and the basic strategy.

However if you wish to keep the columns aligned and you have only spaces (not tabs) or predicatable tabbing you can (sometimes) ‘work backwards’ to remove a fixed number of spaces “before a word character”.

Maybe this:

Find what: \s{3}<
Replace with (NOTHING LEAVE EMPTY):

You might wish to mark a selection if other text is in the file. And you could use this Find also:
Find what: \s{3}\b

Translation:
\s = space (or \h for horizontal whitespace but that can present complications with tabs)
{3} = EXACTLY 3 (you want a fix amount if you are going to keep the columns, but change to any number you know will work (probably leaving at least enough for the longest item in the ‘previous column’.

< = require the match at the BEGINNING of a work
\b = or on a word boundary, which in the case of the space pre-match is always going to be the beginning of a work and this regex patter is found in far more regex engines than < (NotePad++ has particularly good regexes.)

You MAY need to work the “biggest” gaps first if you have varying amounts of white space, perhaps 6 extra characters for one column and 20 another.

You many also have some challenges if the longest word in one column is very long, while there is a big gap in other columns.

If it is bad enough just use column selection and cut them out manually – Shift-Alt while dragging the mouse or using the arrow keys will mark a column which you can then delete (or replace by typing).

Too Glad

@Scott-Sumner

Thank you so much Scott. That works for me, sorted.

Cheers

Too Glad

@Herb-Martin

Many thanks Herb-Martin for your suggestion. Much appreciated.