Mark all .pdf files except zusammenfassung.pdf using RegEx

  • A
    Andreas Soraru
    last edited by Andreas Soraru Feb 11, 2023, 1:25 PM Feb 11, 2023, 1:09 PM

    Hi there,

    I have exported a really large content file from Joomla, with articles and fields, using J2XML.
    Now I have to delete some unused stuff, which works really well with Notepad++. But I do not know much about RegEx, and I now have to mark all lines where a .pdf file is used, except the lines where the filename zusammenfassung.pdf is used.

    Is there a way to do this with RegEx?

    Any help appreciated.

    Greets from Germany and a happy weekend to all of you,

    Andreas (Andy)

    • G
      guy038
      last edited by guy038 Feb 11, 2023, 8:12 PM Feb 11, 2023, 8:10 PM

      Hello, @andreas-soraru and All,

      From your title, I would say :

      • Open your file in N++

      • Open the Mark dialog ( Ctrl + M )

      • Untick all box options

      • SEARCH (?xi-s) ^ (?! .* zusammenfassung \. pdf ) .* \b \S+ \. pdf \b

      • Check the Bookmark line and the Wrap around options

      • Select the Regular expression search mode

      • Click on the Mark All button

      • Then, either :

        • Click on the Copy Marked Text button, or

        • Use the Search > Bookmark > Copy Bookmarked Lines menu option

      • Open a new tab

      • Paste the .pdf filenames or the complete lines in this new tab
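For anyone who wants to check the logic outside Notepad++, the line filter above can be sketched in Python. This is only an approximation under stated assumptions: Python's standard re module does not accept the (?-s) switch (its dot already stops at line ends by default), the function name bookmark_lines is hypothetical, and the sample lines are made up for illustration.

```python
import re

# Same idea as the Mark dialog regex: reject any line that contains
# zusammenfassung.pdf, then require some other "name.pdf" on the line.
# (?-s) is dropped: in Python "." does not match newlines by default.
LINE_RE = re.compile(
    r"^ (?! .* zusammenfassung \. pdf ) .* \b \S+ \. pdf \b",
    re.VERBOSE | re.IGNORECASE,
)

def bookmark_lines(text):
    """Return the lines that would be bookmarked (hypothetical helper)."""
    return [line for line in text.splitlines() if LINE_RE.search(line)]

# Made-up sample lines, not taken from the original export.
sample = """\
Intro text with report.pdf attached
Link to zusammenfassung.pdf only
No document here
Both zusammenfassung.pdf and other.pdf
Upper case NAME.PDF works too
"""

for line in bookmark_lines(sample):
    print(line)
```

Note that a line mentioning both zusammenfassung.pdf and another .pdf file is skipped entirely, because the negative look-ahead rejects the whole line.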

      Best Regards

      guy038

      • A
        Andreas Soraru @guy038
        last edited by Feb 12, 2023, 1:39 PM

        @guy038 Thanks a lot … that was exactly what I was looking for … without such good knowledge of RegEx I would never have got that working. Thanks a lot, greetings from Germany and have a good day,

        Andreas (Andy)

        • A
          Alan Kilborn
          last edited by Feb 12, 2023, 9:23 PM

          My first thought was this would be a good application of this technique:

          What I DON’T want(*SKIP)(*F)|What I DO want

          This was discussed HERE and probably some other places, too.

          Anyway, for the current problem, try this:

          Find what: (?-s)(?:zusammenfassung\.pdf(*SKIP)(*F)|\w+?\.pdf)
          Search mode: Regular expression
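As a side note for readers testing outside Notepad++: Python's standard re module has no backtracking control verbs, so (*SKIP)(*F) cannot be used there as written. A common substitute, sketched below, is to match the unwanted alternative first without capturing it and to capture only the wanted alternative. The function name wanted_pdfs and the sample string are hypothetical, and the sketch is case-sensitive, which may differ from the Match case setting in the dialog.

```python
import re

# Swapped-in technique (not the N++ regex itself):
#   unwanted | (wanted)
# The unwanted alternative consumes zusammenfassung.pdf so that the
# second alternative never gets a chance to match inside it.
PATTERN = re.compile(r"zusammenfassung\.pdf|(\w+?\.pdf)")

def wanted_pdfs(text):
    """Return every matched .pdf name except zusammenfassung.pdf."""
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1) is not None]

# Made-up sample text for illustration.
sample = "see zusammenfassung.pdf, report.pdf and notes.pdf"
print(wanted_pdfs(sample))
```

Matches where group 1 is empty correspond to the "What I DON'T want" branch and are simply dropped.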

          • G
            guy038
            last edited by guy038 Feb 12, 2023, 10:52 PM Feb 12, 2023, 9:58 PM

            Hi, @andreas-soraru, @alan-kilborn and All,

            Ah… yes @alan-kilborn. Excellent example of the power of backtracking control verbs ( (*SKIP) , (*F), ... ) !

            And I think that a final version could be :

            SEARCH (?xi) zusammenfassung \. pdf (?= \s ) (*SKIP) (*F) | \S+ \. pdf (?= \s )


            Notes :

            • No need for the No single line modifier (?-s), as this regex does not contain any regex dot character !

            • I preferred to add the (?=\s) look-ahead structure after the string pdf, to be sure that we only match true PDF file names, i.e. names where the extension is followed by a whitespace character


            You may test this new version against the text below :

            This is Test.PDF and TEST2.pdf files
            Example.pdf : This one is OK
            This line does NOT contain portable file name
            NOT searched : bad.pdf---123
            A file zusammenfassunge.pdf
            The zusaMMENFASSung.pdf file
            This is the Test.PDFTEST2.pdf file
            The zusammenfassung.pdf---456 file
            This is my last file.pdf
            A special 123zusammenfassung.pdf456 file
            Last.PDF True PDF file
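The test text above can also be run through a small Python sketch. Since Python's standard re module does not support (*SKIP)(*F), the sketch swaps in the capture-group idiom (unwanted alternative first, wanted alternative in group 1), which should behave the same way for this pattern. The names TEST_TEXT and find_pdfs are hypothetical.

```python
import re

# Emulation of:
#   (?xi) zusammenfassung \. pdf (?= \s ) (*SKIP) (*F) | \S+ \. pdf (?= \s )
# The first alternative swallows the excluded name; only group 1 is kept.
PDF_RE = re.compile(
    r"zusammenfassung \. pdf (?= \s ) | ( \S+ \. pdf (?= \s ) )",
    re.VERBOSE | re.IGNORECASE,
)

TEST_TEXT = """\
This is Test.PDF and TEST2.pdf files
Example.pdf : This one is OK
This line does NOT contain portable file name
NOT searched : bad.pdf---123
A file zusammenfassunge.pdf
The zusaMMENFASSung.pdf file
This is the Test.PDFTEST2.pdf file
The zusammenfassung.pdf---456 file
This is my last file.pdf
A special 123zusammenfassung.pdf456 file
Last.PDF True PDF file
"""

def find_pdfs(text):
    """Return the file names the emulated regex would mark."""
    return [m.group(1) for m in PDF_RE.finditer(text) if m.group(1)]

for name in find_pdfs(TEST_TEXT):
    print(name)
```

Under this emulation the names picked up are Test.PDF, TEST2.pdf, Example.pdf, zusammenfassunge.pdf, Test.PDFTEST2.pdf, file.pdf and Last.PDF. The names followed by --- or digits are left alone because of the (?=\s) look-ahead.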
            

            BR

            guy038

            • A
              Alan Kilborn @guy038
              last edited by Feb 12, 2023, 11:07 PM

              @guy038 said in Mark all .pdf files except zusammenfassung.pdf using RegEx:

              No need for the No single line modifier (?-s)

              This is true.
              It is in mine because at first I did have a non-escaped . in it…but I changed where I was heading with it.

              • G
                guy038
                last edited by guy038 Feb 13, 2023, 9:53 AM Feb 12, 2023, 11:28 PM

                Hi, @andreas-soraru, @alan-kilborn and All,

                In my previous regex, I used the \S syntax to include the literal dot as a possible character in the filename. However, to be rigorous, I should have used this syntax :

                SEARCH  (?xi) zusammenfassung \. pdf (?= \s ) (*SKIP) (*F) | [!#$%&'()+,-.;=@\[\]^`{}~\w]+ \. pdf (?= \s )
                

                As some characters are forbidden in Windows filenames : \ / : * ? " < > |
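A quick way to see what that character class accepts is to spell it out in Python. This is a rough sketch, not the Notepad++ regex itself: the hyphen and the square brackets are escaped for Python's re module, and the names NAME_CHARS and is_plausible_name, as well as the sample file names, are hypothetical.

```python
import re

# Characters forbidden in Windows file names, as listed in the post.
FORBIDDEN = '\\/:*?"<>|'

# Python spelling of the allowed-character class from the regex.
NAME_CHARS = re.compile(r"[!#$%&'()+,\-.;=@\[\]^`{}~\w]+")

def is_plausible_name(name):
    """True if every character of name belongs to the allowed class."""
    return NAME_CHARS.fullmatch(name) is not None

print(is_plausible_name("Bericht_(2023)-v1.pdf"))
print(is_plausible_name("a/b.pdf"))
print(any(is_plausible_name(c) for c in FORBIDDEN))
```

The class also leaves out the space character, so a name containing spaces is not matched as a single file name, just as with the earlier \S version.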

                BR

                guy038

                • D
                  datatraveller1 @guy038
                  last edited by Feb 13, 2023, 7:28 PM

