    Replace string, but maintain substring (convert Markdown to HTML image)

    • Jeroen Borgman

      I have 500 Markdown files where I need to replace a complete line with another line. This surely can be done using Regex, but I have no clue how.

      I want to replace this line (last words of line differ in each file):

      ![](..\pregit\uploads\1kln-2hdw-4h50/media/image1.jpeg){width="5.1722222222222225in" height="2.5868055555555554in"} + Variable text
      

      with:

      <img src="/uploads/1kln-2hdw-4h50/image1.jpeg" height="300">
      

      Note that the ‘1kln-2hdw-4h50’ is reused in the new line.

      • guy038

        Hello, @jeroen-borgman, and All,

        Again, a regex S/R is the solution !

        I assumed that:

        • The image names may be different

        • The part 1kln-2hdw-4h50 may be different


        So, given the input sample text, below :

        ![](..\pregit\uploads\1kln-2hdw-4h50/media/image1.jpeg){width="5.1722222222222225in" height="2.5868055555555554in"}    bla bla blah
        
        ![](..\pregit\uploads\9ftu-7abc-10h27/media/My Image.jpeg){width="9.2333in" height="3.555556in"}    Variable text
        
        ![](..\pregit\uploads\5gya-0hgz-2h36/media/image45678.png){width="10.0in" height="5.0in"}   This a test : ( .../media/... ) bla blah !
        

        If you use the following regex S/R :

        SEARCH (?x-is) \Q![](..\pregit\uploads\\E (.+?) /media (/.+?) \) .+

        REPLACE <img src="/uploads/\1\2" height="300">

        You’ll get the modified text :

        <img src="/uploads/1kln-2hdw-4h50/image1.jpeg" height="300">
        
        <img src="/uploads/9ftu-7abc-10h27/My Image.jpeg" height="300">
        
        <img src="/uploads/5gya-0hgz-2h36/image45678.png" height="300">
        

        Hope that it’s your expected output text ;-))
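For anyone who prefers to script the change, here is a minimal Python sketch of the same S/R. It is an adaptation, not the Notepad++ regex itself: Python's re module has no \Q...\E syntax, so the literal prefix is escaped with re.escape() instead, and the helper name convert_line is made up for this illustration.

```python
import re

# Python's re module has no \Q...\E, so the literal prefix is escaped with re.escape().
PREFIX = re.escape("![](..\\pregit\\uploads\\")

# Same structure as the search regex: group 1 = folder part, group 2 = "/" + image name.
# Without re.DOTALL / re.IGNORECASE this behaves like (?-is).
PATTERN = re.compile(PREFIX + r"(.+?)/media(/.+?)\).+")
REPLACEMENT = r'<img src="/uploads/\1\2" height="300">'


def convert_line(line: str) -> str:
    """Hypothetical helper: rewrite one Markdown image line as an HTML <img> tag."""
    return PATTERN.sub(REPLACEMENT, line)


sample = r'![](..\pregit\uploads\1kln-2hdw-4h50/media/image1.jpeg){width="5.1722222222222225in" height="2.5868055555555554in"}    bla bla blah'
print(convert_line(sample))
```

Lines that do not start with the expected prefix are returned unchanged, so the helper could be applied to every line of a file.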

        Notes :

        • First, the in-line modifiers (?x-is) mean:

          • Any space char in the regex is NOT taken into account by the regex engine and just helps the user to better identify the different sections of the search regex ( (?x) ). In this mode, if you need to search for a space character, use one of the following syntaxes:

            • \x20 ( The escaped form )

            • [ ] ( A space char, in a character class )

            • A \ symbol, followed by a Space char

          • Any dot regex char ( . ) matches a single standard character only, and not an EOL one ( (?-s) )

          • The search engine carries out the search in a case-sensitive way ( (?-i) )

        • Then, the part \Q![](..\pregit\uploads\\E simply delimits a literal string, between the \Q and \E syntaxes, to be matched with that exact case

        • Then, the part (.+?) matches the shortest string of any characters before the /media string, stored as group 1 because of the embedded parentheses

        • Now, the part /media matches the literal string /media, with that exact case

        • And the following part (/.+?) looks for a slash symbol / followed by the shortest string of any characters before an ending parenthesis \), stored as group 2 because of the embedded parentheses

        • Then, the part \) matches a literal ending parenthesis

        • And, finally the part .+ matches all the remaining standard characters of current line

        • In replacement, all the current line contents are replaced with :

          • The part <img src="/uploads/, which rewrites this exact expression, first

          • The part \1\2, which rewrites the contents of groups 1, then 2

          • The part " height="300">, which rewrites this exact expression
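The notes on the (?x) modifier can be checked outside Notepad++ as well. The sketch below uses Python's re module, whose verbose mode also ignores plain spaces in the pattern and accepts the same three ways of writing a real space. The test string "My Image.jpeg" is borrowed from the sample text; the variable names are only illustrative.

```python
import re

text = "My Image.jpeg"

# In verbose mode the plain space in the pattern is ignored,
# so this pattern actually looks for "MyImage" and finds nothing.
ignored = re.search(r"(?x) My Image", text)

# Three ways to match a real space while verbose mode is on:
# the escaped form, a character class, and a backslash followed by a space.
forms = [r"(?x) My \x20 Image", r"(?x) My [ ] Image", r"(?x) My \  Image"]
matches = [re.search(pattern, text) is not None for pattern in forms]

print(ignored)
print(matches)
```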

        Best Regards,

        guy038

        P.S. :

        You must be aware of a fundamental difference in regex syntaxes containing variable quantifiers, like *, +, ?, {n,} and {n,m}

        • You may use the quantifier by itself ( greedy form : it matches as many characters as possible )

        • You may add the ? symbol, right after the quantifier ( lazy form : it matches as few characters as possible )

        For instance, the regex abc.+xyz may not match the same expressions as the regex abc.+?xyz will !

        Against the text - abcdefghijklmnopqrstuvwxyz - abcdefghijklmnopqrstuvwxyz - :

        • The regex abc.+xyz would match the string abcdefghijklmnopqrstuvwxyz - abcdefghijklmnopqrstuvwxyz, i.e. the longest string between the abc and the xyz strings

        • Whereas the regex abc.+?xyz would match the string abcdefghijklmnopqrstuvwxyz, i.e. the shortest string between the abc and the xyz strings
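The two matches described above can be reproduced with a few lines of Python. This is only a sketch with Python's re module, which treats the + and +? quantifiers the same way for this example.

```python
import re

text = "- abcdefghijklmnopqrstuvwxyz - abcdefghijklmnopqrstuvwxyz -"

# Greedy quantifier: runs from the first "abc" to the LAST "xyz".
greedy = re.search(r"abc.+xyz", text).group(0)

# Lazy quantifier: stops at the FIRST "xyz" after "abc".
lazy = re.search(r"abc.+?xyz", text).group(0)

print(greedy)
print(lazy)
```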


        Jeroen, just try removing one or both of the ? symbols in the search regex above. You will see that the third line of the sample text is then wrongly replaced :-((
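The same experiment can be sketched in Python on the third sample line. As before, re.escape() stands in for \Q...\E, which Python does not support, and the variable names are only illustrative. With both ? symbols removed, group 1 runs on to the second /media further down the line, so the text in braces ends up inside the src attribute.

```python
import re

PREFIX = re.escape("![](..\\pregit\\uploads\\")
REPLACEMENT = r'<img src="/uploads/\1\2" height="300">'

line = r'![](..\pregit\uploads\5gya-0hgz-2h36/media/image45678.png){width="10.0in" height="5.0in"}   This a test : ( .../media/... ) bla blah !'

# Lazy groups, as in the original search regex: stops at the first /media and the first ")".
lazy = re.sub(PREFIX + r"(.+?)/media(/.+?)\).+", REPLACEMENT, line)

# Both ? symbols removed: group 1 now extends to the LAST /media of the line.
greedy = re.sub(PREFIX + r"(.+)/media(/.+)\).+", REPLACEMENT, line)

print(lazy)
print(greedy)
```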

        • Jeroen Borgman

          @guy038 said in Replace string, but maintain substring (convert Markdown to HTML image):

          <img src="/uploads/\1\2" height="300">

          WoW! pure magic!
          Thanks for this Guy. I not only like the solution, but also the explanation.

          • guy038

            Hi, @jeroen-borgman, and All,

            Thanks for your comment !

            The Free Spacing regex mode also allows you to place the different parts of your regex in consecutive lines, with possible comments after a # character, as below :

            (?x)                          # FREE SPACING regex mode
            (?-is)                        # DOT regex char = 1 STANDARD char and search SENSITIVE to case
            
            ^                             # START of CURRENT line boundary ( Added to be more RIGOROUS ! )
            \Q![](..\pregit\uploads\\E    # LITERAL string ![](..\pregit\uploads\
            (.+?)                         # Part BETWEEN uploads\ and /media ( Group 1 )
            /media                        # LITERAL string /media
            (/.+?)                        # Image NAME ( Group 2 )
            \)                            # LITERAL string ) The ESCAPED form is necessary as PARENTHESES are REGEX chars !
            .+                            # REMAINING chars of CURRENT line scanned
            

            Just select all these lines and paste them in the Find what: field of the Find dialog ;-))

            Note that if your regex must contain a # char, just use the escaped syntax \# or the character class [#]
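A commented, multi-line layout like this also works in a script. The following is a rough Python equivalent, not the Notepad++ regex itself: the literal prefix is escaped by hand because Python has no \Q...\E, re.MULTILINE is added so that ^ matches at the start of every line, and the names PATTERN and REPLACEMENT are chosen only for this sketch.

```python
import re

PATTERN = re.compile(r"""
    ^                                 # start of the current line
    !\[\]\(\.\.\\pregit\\uploads\\    # literal Markdown prefix, escaped by hand
    (.+?)                             # part between the uploads folder and /media (group 1)
    /media                            # literal string /media
    (/.+?)                            # image name (group 2)
    \)                                # literal closing parenthesis
    .+                                # remaining characters of the current line
""", re.VERBOSE | re.MULTILINE)

REPLACEMENT = r'<img src="/uploads/\1\2" height="300">'

text = "\n".join([
    r'![](..\pregit\uploads\1kln-2hdw-4h50/media/image1.jpeg){width="5.17in" height="2.58in"}    bla bla blah',
    r'![](..\pregit\uploads\9ftu-7abc-10h27/media/My Image.jpeg){width="9.2333in" height="3.555556in"}    Variable text',
])
result = PATTERN.sub(REPLACEMENT, text)
print(result)

# A literal # needs the escaped form or a character class in verbose mode.
hash_escaped = re.search(r"(?x) image \# \d+", "image#12") is not None
hash_in_class = re.search(r"(?x) image [#] \d+", "image#12") is not None
```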

            Cheers,

            guy038

            The Community of users of the Notepad++ text editor.