Converting paginated TXT file to Table layout
-
Greetings All
The paginated TXT report comes from a server; and it does not matter what report I download (there are many), but at page 154 things start to change with the delimiters.
The bottom line, I have multiple TABS (as delimiters) that offset the columnar data and make it difficult to turn into a proper data-table.
Question. What is the easiest approach to aligning columns? Also, how can one remove all redundant delimiters leaving just the one to separate the data?
Thanks for your advise.
-
If you just want to change multiple tabs into a single tab, it’s easy
Find =
\t+
Replace =\t
search mode = regular expressionIf you want Notepad++ to find meaning in the tabs, and be able to figure out what’s supposed to have multiple tabs and what’s not, that’s not going to happen.
As far as getting things to visually line up in Notepad++ when there are single tabs, a plugin like Elastic Tabstops might help.
PS: please note that your file sounds like it might be a tab-separated value file, which falls under the generic character-separated value (CSV) format, which are often intended to be read in a spreadsheet program rather than a text editor.
Good luck
-
@PeterJones - That is interesting. Can one replace the 1st delimited TAB with another delimiter-type, and allow the remaining TABS as is?
I have some null data in my columns that I need to replace with “something” so when I use your suggestion, the column position will not be impacted - because of the TAB only.
-
@PeterJones - The suggested syntax works perfectly; however, let me rephrase my last comment. I have columnar data where at some rows there is “no data” just a TAB.
The problem. I have to First, replace that missing data with something line a “space-TAB” combination, so there is now at least a space at every row. Then I could use your suggestion to remove all unnecessary TABS.
If I do not do the previous step first, then the rows with no data (just the TAB) may be part of the “\t+ \t”, if there was a TAB proceeding the “no data” row.
Thank you for your guidance.
-
Sorry for the delay: I hadn’t fully understood your followup, and was hoping someone else would chime in.
It would be so much easier if you provided before and after data.
But if I have read correctly and you want to change the first instance of two tabs (which is what an empty field looks like) with some placeholder (maybe a smiley, since that’s likely not in your data): FIND =
(?-s)\t(\t.*)$
, REPLACE =\t☺$1
----
Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All example text should be marked as literal text using the
</>
toolbar button or manual Markdown syntax. To makeregex in red
(and so they keep their special characters like *), use backticks, like`^.*?blah.*?\z`
. Screenshots can be pasted from the clipboard to your post usingCtrl+V
to show graphical items, but any text should be included as literal text in your post so we can easily copy/paste your data. Show the data you have and the text you want to get from that data; include examples of things that should match and be transformed, and things that don’t match and should be left alone; show edge cases and make sure you examples are as varied as your real data. Show the regex you already tried, and why you thought it should work; tell us what’s wrong with what you do get. Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ. If you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries. -
@PeterJones - Agreed some images are required. I will start with the big picture.
The paginated report, for whatever reason puts TABS between data-columns after the report aligns the report layout for printing. Your first suggestion (Find = \t+ and Replace = \t) performs just fine.
The problem is shown by the second snip at the column level.
As I mentioned previously; but realize a picture is worth a 1,000 words, “I have to First, replace that missing data with something line a “space-TAB” combination, so there is now at least a space at every row. Then I could use your suggestion to remove all unnecessary TABS.”
The snips should help to understand how I am trying to cope with this paginated report. I appreciate your insight, thank you.
-
@Robert-J-Melascaglia said in Converting paginated TXT file to Table layout:
As I mentioned previously; but realize a picture is worth a 1,000 words, “I have to First, replace that missing data with something line a “space-TAB” combination, so there is now at least a space at every row. Then I could use your suggestion to remove all unnecessary TABS.”
And as I mentioned, you can do the smiley-face replacement to accomplish that. (and the implication of that suggestion was after doing the smiley transform, you would then do the tab reduction from my earlier post, and then you would turn smilies back into empty fields by just deleting all smiley characters.)
If that is not sufficient, you’ll need to be more explicit as to why. Because your pictures have done nothing to convince me my suggestion won’t work.
(But your fundamental problem is that you are trying to take a report that was made for printing in a specific format to a specific sheet of paper – probably an old 70s/80s-style wide dot-matrix printer with that paper with the alternating green/white rows – and use that as the input for further data processing or data reformatting. It would be so much better if you got the database to output another report that was formatted the way you needed it.)
-
@PeterJones - Understand completely. I did not say your follow-up suggestion did not or could not work; just responding to your suggestion, “It would be so much easier if you provided before and after data.” so I provided the graphics to complete the picture.
In any event, I am proceeding with your suggestions and thank you.