Regex: Can I delete the content of files that don't contain some words?
-
Good day, everyone. Just a question. I have these words in many files, but not in all files. For example:
++++++++++++++++±
text text
my baby goes away
text text
++++++++++++++++±
I want to delete all the contents of those files that don’t contain these unique words.
I tried something, but it doesn’t work too well.
I checked ". matches newline" and searched for:
^(?!.*\s(my baby goes away)\s).*
Any suggestion?
-
Hello @robin-cruise,
First of all, it would be better to back up all the files concerned by the search/replacement ;-))
Now, if all these files are located in a specific folder :
- Open the Find in Files dialog ( Ctrl + Shift + F )
- In the Find what: zone, type
  (?s).*\s(my baby goes away)\s?.*|.+
- In the Replace with: zone, type
  ?1$0
- In the Filters zone, enter *.txt or else…
- In the Directory zone, specify the folder containing all the concerned files
- If necessary, select the Match case option, if the string to search for must have this exact case
- Select, of course, the Regular expression search mode
- Click on the Replace in Files button
- Please verify, one more time, that the FOUR zones, Find what:, Replace with:, Filters: and Directory:, are correctly filled !
- Click on the Yes button of the Are you sure? dialog
Et voilà !
=> All the contents of the files that do NOT contain the string my baby goes away ( not embedded in a larger word ) are deleted
Notes :
- The (?s) syntax, at the very beginning of the search regex, ensures that the regex engine considers the dot regex symbol as matching any single character ( standard or EOL character )
- Then, the remainder is an alternative between :
  - .*\s(my baby goes away)\s?.* : All the contents of the current file scanned, containing at least one string my baby goes away, not glued in a larger expression. So, the last string my baby goes away is stored as group 1
  - .+ : All the contents of the current file scanned, which do NOT contain the string my baby goes away
- In replacement, the syntax ?1$0, strictly (?1$0), is a conditional replacement that means :
  - If group 1 exists ( your specific string found ), all the contents of the current file are replaced with the entire searched string ( $0 ), that is to say all the contents matched, so the file is left unchanged !
  - If group 1 does not exist ( NO specific string found ), the replacement string is empty => All the contents of the current file are, simply, deleted
- A question mark ?, after the final syntax \s, is necessary for the unique case where the string my baby goes away ends the current file, without any final line break !
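If you prefer to check the logic outside Notepad++ first, the notes above can be sketched in Python. This is only a minimal sketch, not the Notepad++ implementation: Python's re module has no (?1$0) replacement syntax, so the condition on group 1 is emulated with a function. The names keep_or_empty and process_folder, the *.txt filter and the UTF-8 encoding are assumptions, for illustration only.

```python
import re
from pathlib import Path

# Same search regex as above: (?s) lets the dot match EOL characters too.
PATTERN = re.compile(r"(?s).*\s(my baby goes away)\s?.*|.+")


def keep_or_empty(text: str) -> str:
    """Emulate the conditional replacement ?1$0 on the contents of one file."""

    def repl(match: re.Match) -> str:
        # Group 1 defined  -> rewrite the whole match ($0), file unchanged.
        # Group 1 undefined -> empty replacement, contents deleted.
        return match.group(0) if match.group(1) is not None else ""

    return PATTERN.sub(repl, text)


def process_folder(folder: str, file_filter: str = "*.txt") -> None:
    """Hypothetical helper: apply keep_or_empty to each matching file.

    Assumes UTF-8 files; back up the folder before running it.
    """
    for path in Path(folder).glob(file_filter):
        original = path.read_text(encoding="utf-8")
        updated = keep_or_empty(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")


if __name__ == "__main__":
    print(repr(keep_or_empty("text text\nmy baby goes away\ntext text\n")))
    print(repr(keep_or_empty("text text\nsomething else\n")))
```

Note that, as the regex requires a \s character before the string, a file which begins directly with my baby goes away, on its very first line, is not recognized by the first alternative and would be emptied too.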
Best Regards,
guy038
P.S. :
As described above, sometimes, it’s easier to use the general template of a list of alternatives :
(NOT This|NOT That|.....)|(This)|(That)......
- All the alternatives to EXCLUDE are re-written, with the syntax \1, in the replacement part
- All the alternatives to INCLUDE are replaced, thanks to each syntax (?#....), in the replacement part (# > 1), OR deleted if this syntax is absent
Consider, for instance, the original text, below :
Jane said to Tarzan : "Tarzan" is a very strong person, much more than "Jane" is ! "Tarzan and Jane" or "Jane and Tarzan"
And suppose that we would like to convert, in uppercase, the first names Tarzan and Jane, ONLY IF they are NOT surrounded by double quotes !
Then, we could use the simple S/R :
SEARCH
("Tarzan"|"Jane")|(Tarzan)|(Jane)
REPLACE
\1(?2TARZAN)(?3JANE)
As the replacement action is identical for each first name, we could also use :
SEARCH
("Tarzan"|"Jane")|(Tarzan|Jane)
REPLACE
\1(?2\U\2)
Note that when group 2 is defined, group 1 is NOT defined. Then, in replacement, the form \1 stands for an empty string !
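The "exclude first, include after" template can also be tried outside Notepad++. Here is a minimal sketch in Python, under the assumption that the standard re module is used: it supports neither (?2...) nor \U in the replacement, so both are emulated with a function. The name uppercase_unquoted is hypothetical.

```python
import re

TEXT = ('Jane said to Tarzan : "Tarzan" is a very strong person, much more '
        'than "Jane" is ! "Tarzan and Jane" or "Jane and Tarzan"')

# Group 1 = alternatives to EXCLUDE (quoted names), group 2 = names to INCLUDE.
PATTERN = re.compile(r'("Tarzan"|"Jane")|(Tarzan|Jane)')


def uppercase_unquoted(text: str) -> str:
    """Emulate the replacement \\1(?2\\U\\2) with a replacement function."""

    def repl(match: re.Match) -> str:
        if match.group(1) is not None:
            # Excluded alternative: re-written as is (the \1 part).
            return match.group(1)
        # Included alternative: group 1 is undefined, group 2 is uppercased.
        return match.group(2).upper()

    return PATTERN.sub(repl, text)


if __name__ == "__main__":
    print(uppercase_unquoted(TEXT))
```

The order of the alternatives matters: the quoted forms come first, so that the regex engine consumes "Tarzan" or "Jane" as a whole before the bare names get a chance to match inside them.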
Of course, the two following S/R, more complicated, may be used and produce the same replacements :
SEARCH
(?<!")(Tarzan|Jane)|(Tarzan|Jane)(?!")
REPLACE
\U\1\2
or
SEARCH
(?<!")(Tarzan|Jane)|((?1))(?!")
REPLACE
\U\1\2
After replacement, we get, in all cases, the new text, below :
JANE said to TARZAN : "Tarzan" is a very strong person, much more than "Jane" is ! "TARZAN and JANE" or "JANE and TARZAN"
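The look-around variant can be checked the same way. Below is a minimal sketch in Python, assuming the standard re module: it does not support the (?1) subroutine call nor \U, so only the first look-around regex is used, and the uppercase conversion is done in a function. The name uppercase_with_lookarounds is hypothetical.

```python
import re

TEXT = ('Jane said to Tarzan : "Tarzan" is a very strong person, much more '
        'than "Jane" is ! "Tarzan and Jane" or "Jane and Tarzan"')

# A name is matched if it is NOT preceded by a double quote (group 1)
# OR if it is NOT followed by a double quote (group 2).
PATTERN = re.compile(r'(?<!")(Tarzan|Jane)|(Tarzan|Jane)(?!")')


def uppercase_with_lookarounds(text: str) -> str:
    """Emulate the replacement \\U\\1\\2 : only one of the two groups is defined."""

    def repl(match: re.Match) -> str:
        name = match.group(1) if match.group(1) is not None else match.group(2)
        return name.upper()

    return PATTERN.sub(repl, text)


if __name__ == "__main__":
    print(uppercase_with_lookarounds(TEXT))
```

Here, a name stays untouched only when it has a double quote on BOTH sides, which is why "Tarzan and Jane" is still converted.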
For newcomers to regular expression concepts and syntax, begin with that article, in the N++ Wiki :
http://docs.notepad-plus-plus.org/index.php/Regular_Expressions
In addition, you’ll find good documentation about the new Boost C++ Regex library, v1.55.0 ( similar to the PERL Regular Common Expressions, v1.48.0 ), used by Notepad++ since its 6.0 version, at the TWO addresses below :
http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html
http://www.boost.org/doc/libs/1_48_0/libs/regex/doc/html/boost_regex/format/boost_format_syntax.html
- The FIRST link explains the syntax of regular expressions in the SEARCH part
- The SECOND link explains the syntax of regular expressions in the REPLACEMENT part
You may also look for valuable information on the sites below :
http://www.regular-expressions.info
http://perldoc.perl.org/perlre.html
Be aware that, as with any documentation, it may contain some errors ! Anyway, if you detect one, that’s good news : you’re improving ;-))