paste time stamp to each word of a line
-
I have a data like below as thousands of lines,
9:40:10 SGOC VERB SBG FAMI OSAT
10:08:09 AHPI RCAT
I want to add the starting timestamp of a line to each word in the same line, like below,
9:40:10 SGOC 9:40:10 VERB 9:40:10 SBG 9:40:10 FAMI 9:40:10 OSAT
10:08:09 AHPI 10:08:09 RCAT
I am new to regex, I need to every day data. Could someone help me on this
Thanks
-
@paramesh-palanisamy said in paste time stamp to each word of a line:
I am new to regex, I need to every day data. Could someone help me on this
Notepad++ does not have a native “insert date/time” option. So anything to accomplish this would need to be built as some program code. Read this old post for some ideas.
Terry
-
@paramesh-palanisamy said in paste time stamp to each word of a line:
I am new to regex, I need to every day data. Could someone help me on this
Sorry, just re-read your post and now understand you already have the time at the start of the line and want to copy that before each word on the line. That is possible with regex. Someone (or I) should be able to create a regex for you to do that.
Terry
-
@Terry-R Thanks for the response! Maybe I didn’t explain properly. As you said, I want to copy the timestamp (which is the first word on every line) and paste it before every word on that line. As you can see, each line has a different number of words: some lines have 2 words, some have 6.
-
@paramesh-palanisamy ,
Though you might not immediately recognize it as such, this is very similar to the problem that @Terry-R and @guy038 solved in two steps or one complicated step in the split inside curly brackets thread. Yours doesn’t have curly brackets, obviously, but the general idea is the same: for each line, take some token that’s at the start of the line, and replicate it before each word/token.
I’m going to simplify each step further, and do it in 3 steps.
- move the timestamp to the end of the line so that we can use lookaheads
  - FIND = `(?-s)^(\S+\h*)(.*)$`
  - REPLACE = `$2\t$1`
- insert the timestamp before each word that doesn’t already have the timestamp; if it’s not the first token on the line, also add a space before the timestamp
  - FIND = `(?-s)(^)?(\S+)\h*(?=.*\t(\S+\h*)$)`
  - REPLACE = `(?1: )$3$2`
- remove the extra timestamp at the end
  - FIND = `(?-s)\t(\S+\h*)$`
  - REPLACE = leave empty
I chose to do three separate regex because that makes it slightly easier to understand (and it’s how I would break up such a problem in my mind); with clever alternations, you can reduce it to two or one step, but IMO that just gets in the way of the core regex concepts that you need to learn to do activities like this. And this solution is complicated enough as it is.
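For anyone who wants to check the logic of the three passes outside Notepad++, here is a minimal Python sketch of the same idea. It is an illustration only, not something from the Notepad++ docs: Python’s `re` module has no `\h` and no conditional `(?1: )` replacement, so `[ \t]` and a small replacement function stand in for them. The function name `stamp_each_word` is made up, and the sketch assumes `\n` line endings.

```python
import re

HSPACE = r"[ \t]"  # assumption: stand-in for \h, which Python's re lacks


def stamp_each_word(text):
    """Copy each line's leading timestamp in front of every word on that line."""
    # Step 1: move the leading token (the timestamp) to the end, after a tab.
    text = re.sub(rf"(?m)^(\S+{HSPACE}*)(.*)$", r"\2\t\1", text)

    # Step 2: put the timestamp before every word that is still followed by
    # the tab-marked copy; add a space first unless the word starts the line.
    def insert_stamp(m):
        lead = "" if m.group(1) is not None else " "
        return lead + m.group(3) + m.group(2)

    text = re.sub(
        rf"(?m)(^)?(\S+){HSPACE}*(?=.*\t(\S+{HSPACE}*)$)", insert_stamp, text
    )

    # Step 3: drop the tab-marked copy left at the end of each line.
    return re.sub(rf"(?m)\t\S+{HSPACE}*$", "", text)


sample = "9:40:10 SGOC VERB SBG FAMI OSAT\n10:08:09 AHPI RCAT"
print(stamp_each_word(sample))
# 9:40:10 SGOC 9:40:10 VERB 9:40:10 SBG 9:40:10 FAMI 9:40:10 OSAT
# 10:08:09 AHPI 10:08:09 RCAT
```

The tab works as a marker: step 2 only touches words that still have a tab somewhere after them on the same line, so the relocated timestamp itself is never matched.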
Read the docs referenced below to learn more about regex.
----
Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All example text should be marked as literal text using the </> toolbar button or manual Markdown syntax. To make regex in red (and so they keep their special characters like *), use backticks, like `^.*?blah.*?\z`. Screenshots can be pasted from the clipboard to your post using Ctrl+V to show graphical items, but any text should be included as literal text in your post so we can easily copy/paste your data. Show the data you have and the text you want to get from that data; include examples of things that should match and be transformed, and things that don’t match and should be left alone; show edge cases and make sure your examples are as varied as your real data. Show the regex you already tried, and why you thought it should work; tell us what’s wrong with what you do get. Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ. If you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.
-
Hi @PeterJones, This is working thank you!
Actually my original data was like below,
SGOC VERB SBG FAMI OSAT 9:40:10
AHPI RCAT 10:08:09
I was trying some regex and it didn’t work. I made the data as posted initially in the thread :)
My final data set should be each line should have a one-word and it’s timestamp like below,
SGOC 9:40:10
VERB 9:40:10
SBG 9:40:10
FAMI 9:40:10
OSAT 9:40:10
AHPI 10:08:09
RCAT 10:08:09
I have been trying to modify the regex you posted but couldn’t achieve the above. could you suggest the way to do it
Thanks in Advance
-
@params16 said in paste time stamp to each word of a line:
My final data set should be each line should have a one-word and it’s timestamp like below,
I don’t know how @PeterJones feels about your sudden change of input data nor the change in output but if I’d spent some valuable time creating a solution for you only to have it essentially all thrown out the window I wouldn’t be happy.
Please be aware this isn’t a free bus service. We are happy to help those who need it, so long as they don’t waste our time, which is what you are doing.
When you want help, give us the correct data to work with and show us exactly what you want the first time!
Terry
-
@Terry-R @PeterJones I am sorry! I am new to this community! I was overwhelmed with my multiple failures, created an account and created the post with the last data I had. I wasn’t thinking it would be easier as @PeterJones suggested. I got my learning and I wouldn’t do this again. I was trying to achieve this from the other thread @PeterJones pointed to but still didn’t figure out the right solution.
-
@params16 said in paste time stamp to each word of a line:
could you suggest the way to do it
Put in the effort, experiment some?
So your data is basically the result of step 1 (except I had a tab and you had a space). I was going to say just skip that step. But that makes step 3 harder. So for step 1, replace `(?-s)\h(?=\S+\s*$)` with `\t`, to make it a tab before the timestamp.
Step 2’s search is then still what you need.
Step 2’s replace is almost what you need, but you will need a newline before each word instead of a space (except the start-of-line one, which is the same logic as my space-based logic), and the order of token vs timestamp needs to swap, with a space between:
`(?1:\r\n)$2 $3`
Step 3 will work as before
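To see how the revised steps fit together, here is a rough Python sketch of the same three passes applied to the trailing-timestamp data. As before, this is only an approximation of the Notepad++ behaviour: `[ \t]` replaces `\h` (and the `\s*` in step 1), a replacement function replaces the conditional `(?1:\r\n)`, and `\n` is used instead of `\r\n`. The name `one_word_per_line` is hypothetical.

```python
import re


def one_word_per_line(text):
    """Turn 'W1 W2 ... TIMESTAMP' lines into one 'word timestamp' pair per line."""
    # Step 1 (revised): turn the blank before the trailing timestamp into a tab.
    text = re.sub(r"(?m)[ \t](?=\S+[ \t]*$)", "\t", text)

    # Step 2: emit "word timestamp"; start a new line first unless the word
    # is already the first token on its line.
    def pair_up(m):
        lead = "" if m.group(1) is not None else "\n"  # assumption: \n, not \r\n
        return lead + m.group(2) + " " + m.group(3)

    text = re.sub(r"(?m)(^)?(\S+)[ \t]*(?=.*\t(\S+[ \t]*)$)", pair_up, text)

    # Step 3: remove the tab-marked timestamp left at the end of the last pair.
    return re.sub(r"(?m)\t\S+[ \t]*$", "", text)


sample = "SGOC VERB SBG FAMI OSAT 9:40:10\nAHPI RCAT 10:08:09"
print(one_word_per_line(sample))
# SGOC 9:40:10
# VERB 9:40:10
# SBG 9:40:10
# FAMI 9:40:10
# OSAT 9:40:10
# AHPI 10:08:09
# RCAT 10:08:09
```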
----
Please note: This Community Forum is not a data transformation service; you should not expect to be able to always say “I have data like X and want it to look like Y” and have us do all the work for you. If you are new to the Forum, and new to regular expressions, we will often give help on the first one or two data-transformation questions, especially if they are well-asked and you show a willingness to learn; and we will point you to the documentation where you can learn how to do the data transformations for yourself in the future. But if you repeatedly ask us to do your work for you, you will find that the patience of usually-helpful Community members wears thin. The best way to learn regular expressions is by experimenting with them yourself, and getting a feel for how they work; having us spoon-feed you the answers without you putting in the effort doesn’t help you in the long term and is uninteresting and annoying for us.
-
@PeterJones Got the result as expected. Thank you!