Adding line breaks to fixed width table
-
we have an app that exports data in a .DBF file type, but aside from some header code, it’s just a space-separated, fixed-width table, but it has NO line breaks. so i can open it in Notepad++ and with word wrap, if i stretch the window to the right size, it is a perfectly aligned table. i can’t import it into Excel though, or else it will treat it like one enormous line. If it were a small table, it would be no big deal to just go to the end of each line and hit enter, but this is several thousand lines. i tried making a macro, but it seems like when i record a macro, the down arrow takes me to the next line as i see it on the screen, but when i play it back, the down arrow goes to the next carriage return, which is the end of the document. Am i doing something wrong? Or, even better, is there a simpler way? Thanks in advance.
-
It can probably be done, but you share no data so I can’t say for sure.
-
Yes, i apologize about that, it is company data and i can’t really just post it publicly.
i’ll see if i scrub several lines and post that. -
I’d say definitely publish a representative version of the “some header code” as well as the other scrubbed lines.
-
Here is a data sample. I have to stretch my notepad++ window to almost full scree to get it to line up as a table. i believe it was originally formatted to print on an old landscape-format dot matrix printer. as for the header code, it doesn’t really come through in notepad like XML or some other recognizable language. it is mostly empty squares and maybe assembly symbols? not really familiar, but i didn’t think that would help, but i can try to copy that if you think it will help.
CPIXXXXXXXXXT 2RTXXXXXXFT CPI RXXXXSTREH65 EA EA XXXXX .XXX .XXX .XXX .X N DMXXXXXXXXXXXXXXXXNO UPDATE CPXXXXXXXXXX0RE 3RMXXXXXXXXXE CPI XXXXXXXXX200 EA EA 1XXXX50 1XXXX50 1XXXX60 1XXXXX0 .X N DMXXXXXXXXXXXXXX ACXXXXXXX THXXXXXXXXXTTEN CPXXXXXXXXXXXRE 4RXXXXXXXXXRE CPI FXXXXXXXXXXX EA EA 1XXXXXX 1XXX350 1XXXXX0 1XXXX50 .0 N DXXXXXXXXXXXXXXXXXXXXXXXXXXXTXXNXLXSXRXTXEN CPIXXXXXXXXXXXX 5RMXXXXXXXXXE CPI FUXXXXXXXX03 EA EA 8XXX60 123XXXX 8XXXX0 1XXXX70 .0 N XXXXXXXXXXXXXXXXXXXXXXXXMXRXTXAX XIXXXXXTEN KAXXXXXXXX 6KXXXXXX KAT LXXXXXXXXXXM EA EA 2XXX70 3XXX00 27XXXX 30XXX0 .0 N DXXXXXXXXXXXXXXXXXXXXXX XXRXTXAX XXSXXIXTXN MSXXXXXXXXXXXXX 7XXXXXXXXXX16 MSE LXXXXXXXXXXP EA EA 1XXXX40 19XXX00 XXX00 XXXX .0 N SXXXXXXXXXXXDXSXAXPXIXEXXAXNXXXXXXTE MSXXXXXXXXXXXX 8XXXXXXXXX16 MSE PXXXXXXXX00 EA EA XXXX00 X1XX.X0X XXXXX0 XXXXXX0 .0 N SXXXXXXX XXNX XWXPXRXXEXVAXXXXXXXXTE XXXXXXXXXXXX 91XXXXXXX6 XER XXXXXXXXSER EA EA 13XXXX0 XXXXXX80 XXXXX XXXX XX N XYXNXXXX XEXDXSWXPXXICX XXXXOXUXXXTE MIXXXXXXXXX 10XXXXXXX0 MII GXXXXXXXXXPD BX BX XXXXXX20 XXXXXX20 XXXXX00 XXXX0 .0 N PXIXEXVXRXOXUXXXXXXXX MIXXXXXXXXX 11XXXXXXX1 MII GXXXXXXXXXX BX BX XXXX20 XXXXX20 XXXXX00 XXXX .0 N XXXXXXXXXXXXXXXXTE MIXXXXXXXXX 12XXXXXXX2 MII GXXXXXXXXX BX BX XXXXXX0 XXXXXX0 XXXXX0 XXXXX .0 N PXIXEXVXXNX XXXAXX MIXXXXXXXXX 13XXXXXXX3 MII GXXXXXXXXXXD BX BX XXXXXX XXXXXX XXXX XXX0 .0 N XXXXX VXXXX XXXXXXX MIIPRO31764 1XXXXXXXX4 MII GXXXXXXXXXXD BX BX 4XXXXX 3XXXX0 XXX0 XXXXX0 .0 N XXXXXXXXXXXXXXXXXXXXXXXXXXX -
@Alan-Kilborn Also, i forgot to mention that i just delete all the symbols in the “header code” because i assume they are either instructions to whatever printer this went to once upon a time, or instructions to the app itself for reading the file. My job is just to try to pull data.
-
So… the header’s gone, so no longer a concern.
Really now you just need to add a line break after a certain number of columns.
The way you pasted your sample data doesn’t really help; a before/after would have helped.
So anyway, you can break your data into constant fixed length lines by:
- invoking Replace by pressing Ctrl+h
- selecting Regular expression for the Search mode
- ticking the Wrap around checkbox
- putting
.{n}
in the Find what box (n
is whatever you need for columns) - putting
$0\r\n
in the Replace with box - pressing the Replace All button
-
@Stinson-s-Stationers said in Adding line breaks to fixed width table:
it doesn’t really come through in notepad like XML or some other recognizable language. it is mostly empty squares and maybe assembly symbols?
The reason is that the so called “header code” is in reality the start of a binary file, information that is just for the original application to understand. Notepad++ isn’t designed to edit binary files, even if most of the data is just plain text.
Whilst this process of massaging the data into multiple lines will probably work I’d be cautious about doing so. My preference would be to get a 'freeware" DBF viewer that can export the data in multiple formats, such as is available here. The first one called DBF Viewer Plus even gives a TXT export function. It will likely either provide fixed width columns and/or comma delimited data.
my 2c worth
Terry -
@Terry-R It didn’t even occur to me that DBF would be a standard file type i could find a specific viewer for. I was told it was a file format specific to this app and just accepted it. Thanks for this, i will definitely be checking this out.
-
@Alan-Kilborn Sorry, yes, i def should have added the after data. here it is:
CPIXXXXXXXXXT 2RTXXXXXXFT CPI RXXXXSTREH65 EA EA XXXXX .XXX .XXX .XXX .X N DMXXXXXXXXXXXXXXXXNO UPDATE
CPXXXXXXXXXX0RE 3RMXXXXXXXXXE CPI XXXXXXXXX200 EA EA 1XXXX50 1XXXX50 1XXXX60 1XXXXX0 .X N DMXXXXXXXXXXXXXX ACXXXXXXX THXXXXXXXXXTTEN
CPXXXXXXXXXXXRE 4RXXXXXXXXXRE CPI FXXXXXXXXXXX EA EA 1XXXXXX 1XXX350 1XXXXX0 1XXXX50 .0 N DXXXXXXXXXXXXXXXXXXXXXXXXXXXTXXNXLXSXRXTXEN
CPIXXXXXXXXXXXX 5RMXXXXXXXXXE CPI FUXXXXXXXX03 EA EA 8XXX60 123XXXX 8XXXX0 1XXXX70 .0 N XXXXXXXXXXXXXXXXXXXXXXXXMXRXTXAX XIXXXXXTEN
KAXXXXXXXX 6KXXXXXX KAT LXXXXXXXXXXM EA EA 2XXX70 3XXX00 27XXXX 30XXX0 .0 N DXXXXXXXXXXXXXXXXXXXXXX XXRXTXAX XXSXXIXTXN
MSXXXXXXXXXXXXX 7XXXXXXXXXX16 MSE LXXXXXXXXXXP EA EA 1XXXX40 19XXX00 XXX00 XXXX .0 N SXXXXXXXXXXXDXSXAXPXIXEXXAXNXXXXXXTE
MSXXXXXXXXXXXX 8XXXXXXXXX16 MSE PXXXXXXXX00 EA EA XXXX00 X1XX.X0X XXXXX0 XXXXXX0 .0 N SXXXXXXX XXNX XWXPXRXXEXVAXXXXXXXXTE
XXXXXXXXXXXX 91XXXXXXX6 XER XXXXXXXXSER EA EA 13XXXX0 XXXXXX80 XXXXX XXXX XX N XYXNXXXX XEXDXSWXPXXICX XXXXOXUXXXTE
MIXXXXXXXXX 10XXXXXXX0 MII GXXXXXXXXXPD BX BX XXXXXX20 XXXXXX20 XXXXX00 XXXX0 .0 N PXIXEXVXRXOXUXXXXXXXX
MIXXXXXXXXX 11XXXXXXX1 MII GXXXXXXXXXX BX BX XXXX20 XXXXX20 XXXXX00 XXXX .0 N XXXXXXXXXXXXXXXXTE
MIXXXXXXXXX 12XXXXXXX2 MII GXXXXXXXXX BX BX XXXXXX0 XXXXXX0 XXXXX0 XXXXX .0 N PXIXEXVXXNX XXXAXX
MIXXXXXXXXX 13XXXXXXX3 MII GXXXXXXXXXXD BX BX XXXXXX XXXXXX XXXX XXX0 .0 N XXXXX VXXXX XXXXXXX
MIIPRO31764 1XXXXXXXX4 MII GXXXXXXXXXXD BX BX 4XXXXX 3XXXX0 XXX0 XXXXX0 .0 N XXXXXXXXXXXXXXXXXXXXXXXXXXX -
@Stinson-s-Stationers I realize now this is not as usefel as i’d hoped, because all excess spaces are removed. so you can’t see the space separated column allignment. Sorry about that, i just noticed.
-
Use this when composing and it will put your data in a black box and preserve spacing and other important details
If you’ve done it right it will appear like this:
If you've done it right it will appear like this:
-
@Stinson-s-Stationers said in Adding line breaks to fixed width table:
It didn’t even occur to me that DBF would be a standard file type i could find a specific viewer for.
Yes, in fact it is one of the oldest file types, first created in in the early 1980s, see here. It is still possible that it doesn’t actually correspond to the acceptable format known, but certainly worth a try using one of the free viewers.
File extensions don’t restrict developers to an acceptable format but most will do so. I just feel that using Notepad++ for a task it is not designed for (editing what is essentially a binary file) is looking for trouble. 99% of the time it will work (as this binary file is mostly just the text records with a binary header), but just once you will find your transformation will produce bad data. If you are not doing extensive data checks every transformation, it could have dire consequences.
As you say this is company data, so you need to be certain the output is valid. Using the right tools for the job is central to achieving this.
Terry
PS creating a robust process means it can be easily handled by anyone who can click few buttons,so less chance in the future something could go wrong