Append third row of text to a find replace?
-
I have 1k+ files that I need to find foo in and make it foo&3rdrow. Least complex way to make this happen without hand massaging each file? Thanks!
-
Hello @Matthew-suhajda,
I’m trying to guess what you want, but a literal example would be welcome!
For instance, from the initial text below:
Line foo 1
Line 2
Line 3
Line 4 foo
Line 5
Line 6
Line 7 foo
are you expecting the following text?
Line foo&Line3 1
Line 2
Line 3
Line 4 foo&Line3
Line 5
Line 6
Line 7 foo&Line3
If so, I will be able to post a solution, next time, which does the job with two consecutive regex search/replacements!
See you later,
Best Regards,
guy038
-
All 1k files have a date and time stamp at line 3. I need to append that stamp to several places in each file before proceeding to clean up the rest of the data and combine all the files. So more like
Line 1
Line 2
Line 3 1/1/2008 14:53:36
Line 4 foo
Line 5 dsafoj
Line 6 adsaf foo
Line 7
Line 8 12341234 foo sdfsd
I think your example has it correct, but hopefully the above illustrates the case a tad better.
-
Hi, @Matthew-suhajda,
OK for the time stamp, in line 3 of your files, but you didn’t say what text you expect after the replacement!
I mean: should the generic foo expression, located, in your example, in lines 4, 6 and 8, be replaced with:
- A : foo&1/1/2008 14:53:36, with a literal & between?
- B : foo1/1/2008 14:53:36, simply attached?
- C : foo 1/1/2008 14:53:36, with a space separator?
- D : Other case?
Cheers,
guy038
-
-
A space between would be perfect.
-
Note: Pasted as an image because of the stupid spam filter!! :-D
-
Hello, @Matthew-suhajda, and All,
Well! My idea consists of three steps:
- Firstly, copy the time stamp of the 3rd line (which I suppose to be different for each file!) to the very end of each file scanned, on a new line of its own. This new line won’t be followed by any line-break
- Secondly, search for any occurrence of the foo expression, with a look-ahead (always true, here) which stores the very last line (the time stamp line), added during the previous S/R step, as group 1
- Thirdly, delete, in each file scanned, the very last line, temporarily added
The first point is realized with a first regex S/R. The second and third ones are done, all together, by a second regex S/R.
Note that it’s necessary to copy the time stamp line to the very end because, once the regex engine position is after line 3, looking for some foo occurrences, it cannot remember that specific line: it is no longer part of the later matches!
So, let’s imagine the text below, with the time stamp in line 3 and the foo word in lines 2, 6, 8 and 10:
This is a small example
foo of
1/1/2008 14:53:36
text for testing
the Matthew's goal !
foo
It doesn't mean anything and
foo it's created to test
the search/replacement
foo
That's the end.
Then, the first regex S/R :
SEARCH
(?-s)^(?:.*\R){2}(.+)(?s).+
REPLACE
$0\r\n\1
would give the following text, with a last line (a copy of the 3rd one) added:
This is a small example
foo of
1/1/2008 14:53:36
text for testing
the Matthew's goal !
foo
It doesn't mean anything and
foo it's created to test
the search/replacement
foo
That's the end.
1/1/2008 14:53:36
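For readers who want to try this first step outside Notepad++, here is a minimal Python sketch of it. It is only an approximation: Python's `re` module has no `\R` escape, so the hypothetical `NL` fragment stands in for it, and `[^\r\n]` replaces the dot, because Python's dot would also match `\r`. The function name `append_third_line` and the sample text are made up for the illustration.

```python
import re

# Stand-in for \R, which Python's re module does not support.
NL = r"(?:\r\n|\n|\r)"

# Approximate counterpart of (?-s)^(?:.*\R){2}(.+)(?s).+
# Scoped inline flags such as (?s:...) need Python 3.6 or later.
FIRST_PASS = re.compile(rf"\A(?:[^\r\n]*{NL}){{2}}([^\r\n]+)(?s:.+)")


def append_third_line(text: str) -> str:
    """Return text with a copy of its 3rd line appended after a Windows line break."""
    # Same effect as the replacement $0\r\n\1: whole match, line break, group 1.
    return FIRST_PASS.sub(lambda m: m.group(0) + "\r\n" + m.group(1), text, count=1)


sample = "Line 1\r\nLine 2 foo\r\n1/1/2008 14:53:36\r\nLine 4 foo\r\nLine 5"
result = append_third_line(sample)
print(repr(result))
```

A text with fewer than three lines does not match at all and is returned unchanged.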
Now, the second regex S/R :
SEARCH
(?i)foo(?s)(?=.*\R(.+)\z)|(?-s)\R.+\z
REPLACE
?1foo\x20\1
gives the expected text, below:
This is a small example
foo 1/1/2008 14:53:36 of
1/1/2008 14:53:36
text for testing
the Matthew's goal !
foo 1/1/2008 14:53:36
It doesn't mean anything and
foo 1/1/2008 14:53:36 it's created to test
the search/replacement
foo 1/1/2008 14:53:36
That's the end.
I supposed that the search is insensitive to case, so the words FOO, Foo, … would match. If you prefer a case-sensitive search, just change the first regex part (?i) to the (?-i) syntax.
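The second S/R can be approximated in Python too, as a rough sketch. Python's `re` has neither `\R`, nor `\z`, nor the `?1...` conditional replacement, so this version swaps in an `NL` fragment, `\Z` (which, in Python, matches only at the very end of the string) and a replacement function that tests group 1. The names `SECOND_PASS` and `stamp_and_clean` are hypothetical.

```python
import re

NL = r"(?:\r\n|\n|\r)"  # stand-in for \R

SECOND_PASS = re.compile(
    rf"(?i:foo)(?=(?s:.*){NL}([^\r\n]+)\Z)"  # foo, with the last line stored as group 1
    rf"|{NL}[^\r\n]+\Z"                      # or the temporary last line itself
)


def stamp_and_clean(text: str) -> str:
    """Append the last line to every foo, then drop that last line."""
    def repl(m: re.Match) -> str:
        if m.group(1) is not None:
            # First alternative: literal foo, a space, then the time stamp.
            return "foo " + m.group(1)
        # Second alternative: no group 1, so the temporary line is deleted.
        return ""
    return SECOND_PASS.sub(repl, text)


after_first_pass = (
    "Intro\r\nfoo of\r\n1/1/2008 14:53:36\r\nFOO here\r\nThe end.\r\n1/1/2008 14:53:36"
)
final_text = stamp_and_clean(after_first_pass)
print(repr(final_text))
```

As in the Notepad++ replacement, the function writes a literal lowercase foo, so the matched FOO of the sample comes back as foo.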
Practically, Matthew, follow these few steps :
- First, BACKUP all the files concerned by these S/R (IMPORTANT)
- Start Notepad++ and open the Find in Files dialog
- Type in (?-s)^(?:.*\R){2}(.+)(?s).+ in the Find what: zone
- Type in $0\r\n\1 in the Replace with: zone
- Enter the right extension of your files (for instance *.txt, *.html, …) in the Filters: zone
- Add the full pathname of the folder containing all your files in the Directory: zone
- Select the Regular expression search mode (IMPORTANT)
- Click on the Replace in Files button
- Click on the OK button of the confirmation dialog
At that time, all the files scanned should have a new line, at their end, identical to their 3rd line! Now:
- Change the Find what: zone to the regex (?i)foo(?s)(?=.*\R(.+)\z)|(?-s)\R.+\z
- Change the Replace with: zone to the regex ?1foo\x20\1
- Click, again, on the Replace in Files button
- Click on the OK button of the confirmation dialog
Et voilà! Any occurrence of foo, in each scanned file, should be followed, after a space separator, by the appropriate time stamp of each file ;-))
Best Regards,
guy038
P.S. :
If you want to, I’ll give you, next time, some explanations about these regexes !!
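For anyone who would rather script the batch than run it through Find in Files, here is a minimal Python sketch under stated assumptions. It is not the two-step regex method above: a script can simply keep line 3 in a variable, so no temporary last line is needed. The names `stamp_file` and `stamp_folder`, the `.bak` backup suffix, the `*.txt` filter and the UTF-8 encoding are all assumptions to adapt.

```python
import re
import shutil
from pathlib import Path

FOO = re.compile(r"foo", re.IGNORECASE)


def stamp_file(path: Path, backup_suffix: str = ".bak") -> bool:
    """Append the file's 3rd line to every foo; return True if the file changed."""
    # Bytes in, bytes out, so that the original line endings are left alone.
    raw = path.read_bytes().decode("utf-8")  # assumption: files are UTF-8
    lines = raw.splitlines()
    if len(lines) < 3 or not lines[2].strip():
        return False  # no usable time stamp on line 3
    stamp = lines[2]
    # m.group(0) keeps the case of the matched word (FOO stays FOO).
    new = FOO.sub(lambda m: m.group(0) + " " + stamp, raw)
    if new == raw:
        return False
    # Keep a copy of the original next to it before overwriting.
    shutil.copy2(path, path.with_name(path.name + backup_suffix))
    path.write_bytes(new.encode("utf-8"))
    return True


def stamp_folder(folder: str, pattern: str = "*.txt") -> list:
    """Process every matching file of the folder; return the changed paths."""
    return [p for p in sorted(Path(folder).glob(pattern)) if stamp_file(p)]
```

A call such as `stamp_folder(r"C:\data\logs")`, where the path is a placeholder, would process the folder once. Running it a second time would append the stamp again, so it is best tried on a copy of the files first.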
-
-
@guy038 said:
?1foo\x20\1
Exquisite. Worked perfectly. Hopefully this is the only time I’ll need to do something like this, but I will totally ask for more direction in the future if it comes up again. There’s always a new puzzle when dealing with shitty data! lol
Thank you so very much.
-
Hello, @Matthew-suhajda, and All,
Pleased to hear that it worked fine ! Just for information :
Regarding the first S/R :
SEARCH
(?-s)^(?:.*\R){2}(.+)(?s).+
REPLACE
$0\r\n\1
- The modifier (?-s) means that further dots will match any single standard character only, and not line-break characters
- Then, the part ^(?:.*\R){2} looks for the first two lines, with their EOL chars, in a non-capturing group
- Now, the part (.+) stores, as group 1, the next line, the 3rd one, without its End of Line characters
- Finally, the (?s).+ syntax, where dots now match line-break characters too, catches all the remaining text, from the End of Line characters of line 3 to the very end of the file
- In replacement, due to the $0 syntax, it rewrites, first, the entire matched text (= file contents), followed by a Windows line break (\r\n) and, finally, by group 1 (= the 3rd line = time stamp)
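The effect of the (?-s) and (?s) modifiers on the dot can be checked with a few lines of Python, as a small sketch. One difference to keep in mind: without the s flag, Python's dot excludes only `\n`, so the sample text below deliberately uses plain `\n` line endings.

```python
import re

text = "Line 1\nLine 2\nLine 3"

# Without the s flag (the (?-s) behaviour), the dot stops at the line break.
first_line = re.match(r".+", text).group(0)

# With the s flag (the (?s) behaviour), the dot crosses line breaks too.
everything = re.match(r"(?s).+", text).group(0)

# Both in one regex, as in the first S/R: group 1 is a single line,
# then the scoped (?s:...) part swallows all the rest.
m = re.match(r"(.+)(?s:.+)", text)

print(repr(first_line))   # 'Line 1'
print(repr(everything))   # 'Line 1\nLine 2\nLine 3'
print(repr(m.group(1)))   # 'Line 1'
```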
Regarding the second S/R :
SEARCH
(?i)foo(?s)(?=.*\R(.+)\z)|(?-s)\R.+\z
REPLACE
?1foo\x20\1
- The search regex is made of two alternatives, separated by the alternation special character |:
  - (?i)foo(?s)(?=.*\R(.+)\z)
  - (?-s)\R.+\z
- In the first alternative, the part (?i)foo tries, first, to match the foo word, in any case
- Then, the (?s)(?=.*\R(.+)\z) syntax is a look-ahead, (?=......), always true here, which checks, without consuming it, all the text after the foo word up to the line-break ending the second-to-last line (.*\R), then the last line (the copy of the 3rd one), without any line-break ((.+)\z), which is stored as group 1
- Near the end of each file, the second alternative, (?-s)\R.+\z, looks for the line-break and the contents of the very last line (the copy of the 3rd one), till the very end of each file (\z)
- In replacement, the ?1foo\x20\1 syntax means:
  - If group 1 exists, it rewrites the literal string foo (so a matched FOO or Foo is written back in lowercase), followed by a space character and the time stamp (last) line (\1)
  - If group 1 does not exist (case of the second alternative), the very last line, temporarily added, is simply deleted, as no ELSE part is present in the conditional replacement ?1..... !
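To see that the look-ahead stores the last line without consuming any text, here is a small Python sketch on a made-up three-line string. `\Z` stands in for `\z`, a plain `\n` stands in for `\R`, and STAMP is a placeholder for the time stamp.

```python
import re

text = "foo here\nfoo there\nSTAMP"

# foo, followed by a look-ahead that captures the last line as group 1.
pattern = re.compile(r"foo(?=(?s:.*)\n(.+)\Z)")

# For each match: the matched text, its span, and the captured group 1.
matches = [(m.group(0), m.span(), m.group(1)) for m in pattern.finditer(text)]

for matched, span, stamp in matches:
    print(matched, span, stamp)
```

Each match covers only the three characters of foo, at spans (0, 3) and (9, 12), yet group 1 holds STAMP both times. That is why the replacement can reuse the last line for every occurrence.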
-
Best Regards,
guy038
-