    Mark all .pdf files except zusammenfassung.pdf using RegEx

    • Andreas Soraru

      Hi there,

      I have exported a really large content file from Joomla, with articles and fields, using J2XML.
      Now I have to delete some unused content, which works really well with Notepad++. But I do not know much about RegEx, and I now have to mark all lines where a .pdf file is used, except the lines where the filename zusammenfassung.pdf is used.

      Is there a way to do this with RegEx?

      Any help appreciated.

      Greetings from Germany and a happy weekend to all of you,

      Andreas (Andy)

      • guy038

        Hello, @andreas-soraru and All,

        From your title, I would say :

        • Open your file in N++

        • Open the Mark dialog ( Ctrl + M )

        • Untick all box options

        • SEARCH (?xi-s) ^ (?! .* zusammenfassung \. pdf ) .* \b \S+ \. pdf \b

        • Check the Bookmark line and the Wrap around options

        • Select the Regular expression search mode

        • Click on the Mark All button

        • Then, either :

          • Click on the Copy Marked Text button, or

          • Run the Search > Bookmark > Copy Bookmarked Lines command

        • Open a new tab

        • Paste the .pdf filenames or the complete lines in this new tab
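
For anyone who wants to apply the same line filter outside Notepad++, here is a minimal Python sketch of the regex above. The function name `bookmarked_lines` and the sample lines are made up for illustration. Python's `re` is not the Boost engine that Notepad++ uses, so treat this as an approximation rather than an exact equivalent.

```python
import re

# Rough port of the Mark-dialog regex. The dot before "pdf" is escaped so
# that it only matches a literal dot.
LINE_RE = re.compile(
    r"""
    ^                                # start of the line
    (?! .* zusammenfassung \. pdf )  # reject lines naming the excluded file
    .* \b \S+ \. pdf \b              # the line must still contain some *.pdf
    """,
    re.VERBOSE | re.IGNORECASE,
)


def bookmarked_lines(text):
    """Return the lines that the Mark dialog would bookmark (hypothetical helper)."""
    return [line for line in text.splitlines() if LINE_RE.search(line)]


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the real export.
    sample = "\n".join([
        'href="docs/report.pdf"',
        "see zusammenfassung.pdf and other.pdf",
        "no attachment here",
        "manual.PDF",
    ])
    for line in bookmarked_lines(sample):
        print(line)  # prints the first and the last sample line
```

Note that, like the original regex, this drops a whole line as soon as it mentions zusammenfassung.pdf, even if the same line also names another PDF.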

        Best Regards

        guy038

        • Andreas Soraru, replying to @guy038

          @guy038 Thanks a lot … that was exactly what I was looking for … without such good knowledge of RegEx I would never have got that working. Thanks a lot, greetings from Germany and have a good day,

          Andreas (Andy)

          • Alan Kilborn

            My first thought was this would be a good application of this technique:

            What I DON’T want(*SKIP)(*F)|What I DO want

            This was discussed HERE and probably some other places, too.

            Anyway, for the current problem, try this:

            Find what: (?-s)(?:zusammenfassung\.pdf(*SKIP)(*F)|\w+?\.pdf)
            Search mode: Regular expression
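
Outside Notepad++ the same idea can be approximated without backtracking control verbs. Python's standard `re` module does not support `(*SKIP)(*F)`, so the sketch below swaps in a different but related trick: let the unwanted text match without a capture, capture the wanted text in a named group, and keep only the matches where that group took part. The helper name `find_wanted` and the sample string are assumptions for illustration.

```python
import re


def find_wanted(text, unwanted, wanted, flags=0):
    """Emulate  unwanted(*SKIP)(*F)|wanted  with a plain alternation.

    The unwanted branch consumes its text so the wanted branch can never
    start inside it; only matches of the named group are returned.
    """
    pattern = re.compile(f"(?:{unwanted})|(?P<wanted>{wanted})", flags)
    return [
        m.group("wanted")
        for m in pattern.finditer(text)
        if m.group("wanted") is not None
    ]


if __name__ == "__main__":
    # Hypothetical sample text.
    hits = find_wanted(
        "a.pdf zusammenfassung.pdf b.pdf",
        unwanted=r"zusammenfassung\.pdf",
        wanted=r"\w+?\.pdf",
    )
    print(hits)  # ['a.pdf', 'b.pdf']
```

The ordering of the alternation matters: the unwanted branch has to come first so that it wins whenever both branches could match at the same position.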

            • guy038

              Hi, @andreas-soraru, @alan-kilborn and All,

              Ah… yes @alan-kilborn. Excellent example of the power of backtracking control verbs ( (*SKIP) , (*F), ... ) !

              And I think that a final version could be :

              SEARCH (?xi) zusammenfassung \. pdf (?= \s ) (*SKIP) (*F) | \S+ \. pdf (?= \s )


              Notes :

              • No need for the "no single line" modifier (?-s), as this regex does not contain any unescaped dot !

              • I added the (?=\s) look-ahead after each pdf string, to be sure that we only match names that really end in .pdf


              You may test this new version against the text below :

              This is Test.PDF and TEST2.pdf files
              Example.pdf : This one is OK
              This line does NOT contain portable file name
              NOT searched : bad.pdf---123
              A file zusammenfassunge.pdf
              The zusaMMENFASSung.pdf file
              This is the Test.PDFTEST2.pdf file
              The zusammenfassung.pdf---456 file
              This is my last file.pdf
              A special 123zusammenfassung.pdf456 file
              Last.PDF True PDF file
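
To check the expected behaviour on this test text outside Notepad++, here is a hedged Python sketch. Python's `re` has no `(*SKIP)(*F)`, so the sketch emulates it: the excluded name is matched without a capture group, every other name is captured in group 1, and only group 1 hits are kept. This is an approximation of the Notepad++ regex, not the regex itself.

```python
import re

SAMPLE = """\
This is Test.PDF and TEST2.pdf files
Example.pdf : This one is OK
This line does NOT contain portable file name
NOT searched : bad.pdf---123
A file zusammenfassunge.pdf
The zusaMMENFASSung.pdf file
This is the Test.PDFTEST2.pdf file
The zusammenfassung.pdf---456 file
This is my last file.pdf
A special 123zusammenfassung.pdf456 file
Last.PDF True PDF file
"""

PATTERN = re.compile(
    r"""
      zusammenfassung \. pdf (?= \s )   # excluded name: consumed, not captured
    | ( \S+ \. pdf (?= \s ) )           # any other name: captured in group 1
    """,
    re.VERBOSE | re.IGNORECASE,
)

# Keep only the matches where the capturing branch took part.
marked = [m.group(1) for m in PATTERN.finditer(SAMPLE) if m.group(1)]

if __name__ == "__main__":
    for name in marked:
        print(name)
```

With this emulation, seven names are reported: Test.PDF, TEST2.pdf, Example.pdf, zusammenfassunge.pdf, Test.PDFTEST2.pdf, file.pdf and Last.PDF. The lines containing bad.pdf---123, zusammenfassung.pdf---456 and 123zusammenfassung.pdf456 yield nothing, because the look-ahead requires whitespace right after pdf.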
              

              BR

              guy038

              • Alan Kilborn, replying to @guy038

                @guy038 said in Mark all .pdf files except zusammenfassung.pdf using RegEx:

                No need for the No single line modifier (?-s)

                This is true.
                It is in mine because at first I did have a non-escaped . in it…but I changed where I was heading with it.

                • guy038

                  Hi, @andreas-soraru, @alan-kilborn and All,

                  In my previous regex, I used the \S syntax to include the literal dot as a possible character in the filename. However, to be rigorous, I should have used this syntax :

                  SEARCH  (?xi) zusammenfassung \. pdf (?= \s ) (*SKIP) (*F) | [!#$%&'()+,-.;=@\[\]^`{}~\w]+ \. pdf (?= \s )
                  

                  As some characters are forbidden in Windows filenames : \ / : * ? " < > |
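
The effect of the stricter character class can be sketched in Python as follows. The names `NAME_CHARS`, `STRICT` and `LOOSE` and the sample line are assumptions for illustration, and the `(*SKIP)(*F)` part is left out here so that only the difference between `\S+` and the explicit class is shown.

```python
import re

# Characters accepted in a filename by the stricter regex. The hyphen and
# the square brackets are escaped so they stay literal inside the class.
NAME_CHARS = r"[!#$%&'()+,\-.;=@\[\]^`{}~\w]"

# Characters that Windows forbids in filenames, as listed in the post.
FORBIDDEN = '\\/:*?"<>|'

STRICT = re.compile(NAME_CHARS + r"+\.pdf(?=\s)", re.IGNORECASE)
LOOSE = re.compile(r"\S+\.pdf(?=\s)", re.IGNORECASE)

if __name__ == "__main__":
    # Hypothetical line containing a full Windows path.
    line = r"see C:\docs\report.pdf now"
    print(STRICT.search(line).group(0))  # report.pdf
    print(LOOSE.search(line).group(0))   # C:\docs\report.pdf
    # None of the forbidden characters is accepted by the class.
    print(any(re.fullmatch(NAME_CHARS, c) for c in FORBIDDEN))  # False
```

So the explicit class marks only the filename itself, whereas `\S+` also swallows the drive letter and folder names in front of it.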

                  BR

                  guy038
