Mark all .pdf files except zusammenfassung.pdf using RegEx
-
Hi there,
I have exported a really large content file from Joomla with articles and fields using J2XML.
Now I have to delete some unused stuff, which works really well with Notepad++. But I do not know much about RegEx, and I now have to mark all lines where a .pdf file is used, except the lines where the filename zusammenfassung.pdf is used. Is there a way to do this with RegEx?
Any help appreciated.
Greetings from Germany and a happy weekend to all of you,
Andreas (Andy)
-
Hello, @andreas-soraru and All,
From your title, I would say :
- Open your file in N++
- Open the Mark dialog (Ctrl + M)
- Untick all box options
- SEARCH (?xi-s) ^ (?! .* zusammenfassung \. pdf ) .* \b \S+ \. pdf \b
- Check the Bookmark line and the Wrap around options
- Select the Regular expression search mode
- Click on the Mark All button
- Then, either:
  - Click on the Copy Marked Text button
  - Click on Search > Bookmark > Copy Bookmarked Lines
- Open a new tab
- Paste the .pdf filenames or the complete lines in this new tab
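For anyone who wants to check the logic outside Notepad++, the line filter can be sketched in Python. This is only an approximation under stated assumptions: Python's re module is not the Boost engine that Notepad++ uses, the helper name lines_to_mark and the sample lines are made up for illustration, and the dot before pdf inside the look-ahead is escaped here.

```python
import re

# Same idea as the Mark regex: reject any line that mentions
# zusammenfassung.pdf, then require some .pdf filename on the line.
# re.VERBOSE plays the role of (?x), re.IGNORECASE the role of (?i);
# without re.DOTALL the dot never crosses a line break, like (?-s).
LINE_WITH_OTHER_PDF = re.compile(
    r"""
    ^ (?! .* zusammenfassung \. pdf )   # excluded file must not appear on the line
    .* \b \S+ \. pdf \b                 # at least one .pdf filename
    """,
    re.VERBOSE | re.IGNORECASE,
)


def lines_to_mark(text):
    """Return the lines the regex selects (hypothetical helper)."""
    return [line for line in text.splitlines() if LINE_WITH_OTHER_PDF.search(line)]


sample = "\n".join([
    "see report.pdf for details",
    "see zusammenfassung.pdf for details",
    "no attachment here",
    "Both a.PDF and Zusammenfassung.pdf",
])

print(lines_to_mark(sample))
```

With this sample only the first line is returned: the second and fourth lines contain the excluded name, and the third has no .pdf at all.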
Best Regards
guy038
-
-
@guy038 Thanks a lot … that was exactly what I was looking for … without such good knowledge of RegEx I would never have got that working. Thanks a lot, greetings from Germany and a good day,
Andreas (Andy)
-
My first thought was this would be a good application of this technique:
What I DON’T want(*SKIP)(*F)|What I DO want
This was discussed HERE and probably some other places, too.
Anyway, for the current problem, try this:
Find what:
(?-s)(?:zusammenfassung\.pdf(*SKIP)(*F)|\w+?\.pdf)
Search mode: Regular expression
-
Hi, @andreas-soraru, @alan-kilborn and All,
Ah… yes @alan-kilborn. Excellent example of the power of backtracking control verbs ((*SKIP), (*F), ...)! And I think that a final version could be:
SEARCH (?xi) zusammenfassung \. pdf (?= \s ) (*SKIP) (*F) | \S+ \. pdf (?= \s )
Notes:
- No need for the No single line modifier (?-s), as this regex does not contain any regex dot character!
- I preferred to add the (?=\s) look-ahead structure, after the string pdf, to be sure that we search for true PDF files
You may test this new version against the text below :
This is Test.PDF and TEST2.pdf files
Example.pdf : This one is OK
This line does NOT contain portable file name
NOT searched : bad.pdf---123
A file zusammenfassunge.pdf
The zusaMMENFASSung.pdf file
This is the Test.PDFTEST2.pdf file
The zusammenfassung.pdf---456 file
This is my last file.pdf
A special 123zusammenfassung.pdf456 file
Last.PDF
True PDF file
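Python's built-in re module has no backtracking control verbs, so (*SKIP)(*F) cannot be tried there directly. A rough stand-in, assuming the common capture-group variant of the same trick, is to let the unwanted alternative consume its text and keep only the matches where the wanted alternative's group took part. The function name wanted_pdfs and the short sample string are illustrative, not from the thread.

```python
import re

# Left alternative: the file to ignore, consumed so it cannot be re-matched.
# Right alternative: any other run of non-space characters ending in .pdf,
# captured in group 1. Both must be followed by whitespace, as in the
# (?= \s ) look-ahead of the Notepad++ version.
PDF_OR_EXCLUDED = re.compile(
    r"zusammenfassung\.pdf(?=\s)|(\S+\.pdf)(?=\s)",
    re.IGNORECASE,
)


def wanted_pdfs(text):
    """Return .pdf names other than zusammenfassung.pdf (hypothetical helper)."""
    return [m.group(1) for m in PDF_OR_EXCLUDED.finditer(text) if m.group(1)]


sample = (
    "This is Test.PDF and TEST2.pdf files\n"
    "The zusaMMENFASSung.pdf file\n"
    "NOT searched : bad.pdf---123\n"
    "This is my last file.pdf\n"
)

print(wanted_pdfs(sample))  # ['Test.PDF', 'TEST2.pdf', 'file.pdf']
```

The excluded name still matches, but through the left alternative only, so its group 1 is empty and the match is dropped; bad.pdf---123 never matches because no whitespace follows pdf.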
BR
guy038
-
-
@guy038 said in Mark all .pdf files except zusammenfassung.pdf using RegEx:
No need for the No single line modifier (?-s)
This is true.
It is in mine because at first I did have a non-escaped . in it… but I changed where I was heading with it.
-
Hi, @andreas-soraru, @alan-kilborn and All,
In my previous regex, I used the \S syntax to include the literal dot as a possible character in the filename. However, to be rigorous, I should have used this syntax:
SEARCH (?xi) zusammenfassung \. pdf (?= \s ) (*SKIP) (*F) | [!#$%&'()+,-.;=@\[\]^`{}~\w]+ \. pdf (?= \s )
As some characters are forbidden in Windows filenames: \ / : * ? " < > |
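The same restriction can also be written the other way round, by excluding the forbidden characters instead of listing the allowed ones. The Python sketch below assumes that formulation; it is a swapped-in variant rather than the exact class above, so it also accepts characters (for example non-ASCII punctuation) that an explicit list would reject. The names FORBIDDEN, NAME and PDF_NAME and the sample string are illustrative.

```python
import re

# Characters Windows does not allow in filenames: \ / : * ? " < > |
FORBIDDEN = '\\/:*?"<>|'

# A filename is a run of characters that are neither whitespace nor forbidden.
# re.escape keeps the backslash and other specials literal inside the class.
NAME = r"[^\s" + re.escape(FORBIDDEN) + r"]+"

PDF_NAME = re.compile(NAME + r"\.pdf(?=\s)", re.IGNORECASE)

sample = "C:\\docs\\plan.pdf and a|b.pdf and ok-name.pdf\n"

print(PDF_NAME.findall(sample))  # ['plan.pdf', 'b.pdf', 'ok-name.pdf']
```

With \S+ the first hit would have been the whole path C:\docs\plan.pdf; with the restricted class the match starts after the last forbidden character.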
BR
guy038
-