    Mark all .pdf files except zusammenfassung.pdf using RegEx

    • Andreas Soraru

      Hi there,

      I have exported a really large content file from Joomla, with articles and fields, using J2XML.
      Now I have to delete some unused stuff, which works really well with Notepad++. But I do not know much about RegEx, and I now have to mark all lines where a .pdf file is used, but not the lines where the filename zusammenfassung.pdf is used.

      Is there a way to do this with RegEx?

      Any help appreciated.

      Greets from Germany and a happy weekend to all of you,

      Andreas (Andy)

      • guy038

        Hello, @andreas-soraru and All,

        From your title, I would say :

        • Open your file in N++

        • Open the Mark dialog ( Ctrl + M )

        • Untick all box options

        • SEARCH (?xi-s) ^ (?! .* zusammenfassung \. pdf ) .* \b \S+ \. pdf \b

        • Check the Bookmark line and the Wrap around options

        • Select the Regular expression search mode

        • Click on the Mark All button

        • Then, either :

          • Click on the Copy Marked Text button, to copy only the marked text

          • Or run the Search > Bookmark > Copy Bookmarked Lines command, to copy the complete lines

        • Open a new tab

        • Paste the .pdf filenames or the complete lines in this new tab
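
For anyone who wants to check the line-based regex outside Notepad++, here is a minimal Python sketch of the same idea. It is only an approximation: Python's `re` module is not the Boost engine that Notepad++ uses, `re.VERBOSE` and `re.IGNORECASE` stand in for `(?x)` and `(?i)`, and the names `BOOKMARK_RE`, `lines_to_bookmark` and `SAMPLE` are hypothetical, made up for this illustration.

```python
import re

# Python counterpart of the Mark-dialog regex. The dot never crosses a line
# break here because re.DOTALL is not set, which mirrors (?-s).
BOOKMARK_RE = re.compile(
    r"""
    ^ (?! .* zusammenfassung \. pdf )   # reject lines naming the excluded file
    .* \b \S+ \. pdf \b                 # require at least one .pdf name
    """,
    re.VERBOSE | re.IGNORECASE,
)


def lines_to_bookmark(text):
    """Return the 1-based numbers of the lines the regex would bookmark."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if BOOKMARK_RE.match(line)
    ]


# Hypothetical sample lines, not taken from the real Joomla export.
SAMPLE = "\n".join([
    '<a href="files/report.pdf">Report</a>',
    '<a href="files/zusammenfassung.pdf">Summary</a>',
    "<p>No attachment here</p>",
    "Both a.pdf and Zusammenfassung.PDF",
    "broken.pdfx is not a PDF",
    "Manual.PDF",
])

print(lines_to_bookmark(SAMPLE))  # [1, 6]
```

Note that a line mentioning both zusammenfassung.pdf and another PDF (line 4 of the sample) is rejected as a whole, because the negative look-ahead tests the complete line.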

        Best Regards

        guy038

        • Andreas Soraru @guy038

          @guy038 Thanks a lot … that was exactly what I was looking for … without such good knowledge of RegEx I would never have gotten that working. Thanks a lot, greetings from Germany and a good day,

          Andreas (Andy)

          • Alan Kilborn

            My first thought was this would be a good application of this technique:

            What I DON’T want(*SKIP)(*F)|What I DO want

            This was discussed HERE and probably some other places, too.

            Anyway, for the current problem, try this:

            Find what: (?-s)(?:zusammenfassung\.pdf(*SKIP)(*F)|\w+?\.pdf)
            Search mode: Regular expression

            • guy038

              Hi, @andreas-soraru, @alan-kilborn and All,

              Ah… yes @alan-kilborn. Excellent example of the power of backtracking control verbs ( (*SKIP) , (*F), ... ) !

              And I think that a final version could be :

              SEARCH (?xi) zusammenfassung \. pdf (?= \s ) (*SKIP) (*F) | \S+ \. pdf (?= \s )


              Notes :

              • No need for the "no single line" modifier (?-s), as this regex does not contain any unescaped dot ( . ) metacharacter !

              • I preferred to add the (?=\s) look-ahead structure, after the string pdf, to be sure that we only match true PDF filenames, i.e. names where pdf is immediately followed by a whitespace character


              You may test this new version against the text below :

              This is Test.PDF and TEST2.pdf files
              Example.pdf : This one is OK
              This line does NOT contain portable file name
              NOT searched : bad.pdf---123
              A file zusammenfassunge.pdf
              The zusaMMENFASSung.pdf file
              This is the Test.PDFTEST2.pdf file
              The zusammenfassung.pdf---456 file
              This is my last file.pdf
              A special 123zusammenfassung.pdf456 file
              Last.PDF True PDF file
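
To see which names come out of the test text above without opening Notepad++, a rough Python sketch follows. Python's standard `re` module does not support backtracking control verbs such as `(*SKIP)(*F)`, so this swaps in a different technique: the unwanted name is matched first with nothing captured, and only the wanted alternative fills group 1. For this test text that should select the same names, but it is an emulation, not the Notepad++ regex itself. `WANTED_RE`, `wanted_pdf_names` and `TEST_TEXT` are hypothetical names.

```python
import re

# Emulation of "unwanted(*SKIP)(*F)|wanted" for engines without control verbs:
# the unwanted alternative consumes its text, and only group 1 is kept.
WANTED_RE = re.compile(
    r"""
    zusammenfassung \. pdf (?= \s )    # unwanted name: matched, nothing captured
    | ( \S+ \. pdf (?= \s ) )          # wanted name: captured in group 1
    """,
    re.VERBOSE | re.IGNORECASE,
)


def wanted_pdf_names(text):
    """Return every .pdf name in text except a standalone zusammenfassung.pdf."""
    return [m.group(1) for m in WANTED_RE.finditer(text) if m.group(1)]


TEST_TEXT = """\
This is Test.PDF and TEST2.pdf files
Example.pdf : This one is OK
This line does NOT contain portable file name
NOT searched : bad.pdf---123
A file zusammenfassunge.pdf
The zusaMMENFASSung.pdf file
This is the Test.PDFTEST2.pdf file
The zusammenfassung.pdf---456 file
This is my last file.pdf
A special 123zusammenfassung.pdf456 file
Last.PDF True PDF file
"""

for name in wanted_pdf_names(TEST_TEXT):
    print(name)
```

The third-party `regex` package for Python may be an option if the verbs themselves are needed, but that is not used here.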
              

              BR

              guy038

              • Alan Kilborn @guy038

                @guy038 said in Mark all .pdf files except zusammenfassung.pdf using RegEx:

                No need for the No single line modifier (?-s)

                This is true.
                It is in mine because at first I did have a non-escaped . in it…but I changed where I was heading with it.

                • guy038

                  Hi, @andreas-soraru, @alan-kilborn and All,

                  In my previous regex, I used the \S syntax to include the literal dot as a possible character in the filename. However, to be rigorous, I should have used this syntax :

                  SEARCH  (?xi) zusammenfassung \. pdf (?= \s ) (*SKIP) (*F) | [!#$%&'()+,-.;=@\[\]^`{}~\w]+ \. pdf (?= \s )
                  

                  As some characters are forbidden in Windows filenames : \ / : * ? " < > |
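
A small Python sketch of the difference between `\S+` and the restricted character class, under the assumption that the text may contain full Windows paths. The names `LOOSE_RE` and `STRICT_RE` are hypothetical, the exclusion of zusammenfassung.pdf is left out to isolate the effect of the class, and the hyphen is written as `\-` instead of the `,-.` range, which covers the same three characters.

```python
import re

# \S+ accepts any non-whitespace character, including \ and :
LOOSE_RE = re.compile(r"\S+\.pdf(?=\s)", re.IGNORECASE)

# Restricted class: word characters plus punctuation that Windows allows in
# file names. The space is deliberately left out, as in the post above.
STRICT_RE = re.compile(
    r"[!#$%&'()+,\-.;=@\[\]^`{}~\w]+\.pdf(?=\s)",
    re.IGNORECASE,
)

text = "Open C:\\docs\\guide.pdf now\n"

print(LOOSE_RE.findall(text))   # the whole path, up to and including guide.pdf
print(STRICT_RE.findall(text))  # only the file name guide.pdf
```

So the restricted class stops at the path separators, while `\S+` swallows the drive and folder names as part of the match.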

                  BR

                  guy038
