
    Regex to match spaces in a URL

    3 Posts 3 Posters 1.3k Views
    • progonkpa

      I want to remove spaces from URLs in src attributes using a regex, e.g.
      src="http :// jos / bos " (remove the spaces in the string between the quotes)

      I found this regex to select the string between the quotes on regexr.com.
      (?![src="]).*(?=")

      When I use this regex in Notepad++, it doesn’t match.

      It also fails if I replace .* with a space, but that is probably due to my poor regex skills.

      Any help appreciated, kr.

          • guy038

          Hello, @progonkpa and All,

          I suppose that the following regex search/replace should do the job!

          SEARCH (?-is)(?:<img\x20+src="\K|\G)(?:\x20*([^"\x20\r\n]+)|\x20+)

          REPLACE \1

          Remarks :

          • Because of the \G syntax, the initial cursor location must not be immediately followed by space characters, since \G also matches at the position where the search starts

          • Tick the Wrap around option

          • Select the Regular expression search mode

          • Because of the \K syntax, use the Replace All button exclusively (do not use the Replace button!)
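
For anyone who needs the same clean-up outside Notepad++, here is a minimal Python sketch. It is not the regex above: Python's built-in `re` module has no `\K` or `\G`, so this version captures the whole quoted value and strips the spaces in a callback instead. The names `SRC_ATTR` and `strip_src_spaces` are made up for this illustration, and, like the search above, it only handles the exact lowercase form `<img src="`.

```python
import re

# Group 1: the literal <img src=" prefix (case-sensitive, one or more spaces allowed).
# Group 2: everything up to the closing quote, without crossing a line break.
SRC_ATTR = re.compile(r'(<img\x20+src=")([^"\r\n]*)')


def strip_src_spaces(html: str) -> str:
    """Remove every space character inside each <img src="..."> value."""
    return SRC_ATTR.sub(
        lambda m: m.group(1) + m.group(2).replace(" ", ""),
        html,
    )


if __name__ == "__main__":
    print(strip_src_spaces('<img src="http :// jos / bos ">'))
```

Spaces outside the quoted value are left alone, so the rest of the tag and the surrounding text are unchanged.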

          Notes :

          • If we use the Free-spacing mode (?x), this regex can be expressed as :
          (?x-is) (?: <img \x20+ src="  \K  | \G )  (?: \x20*  ( [^"\x20\r\n]+ )  |  \x20+ )
          
          • So the regex engine first looks for the string <img src=", with this exact case (one or more spaces are allowed between <img and src)

          • Then, due to the \K syntax, everything matched so far is discarded from the match, and the regex engine now searches for either:

            • A non-null range of non-space characters, possibly preceded by space chars

            • A non-null range of space characters

          • Then, due to the \G syntax, it searches for the same ranges as above, right after the location of the last match. If no match can be found, it tries again to find another string <img src=" and so on…

          • Note that the (?:...........) structures are non-capturing groups, which do not store anything for further use in either the search or the replacement!

          • In replacement, we only rewrite the non-space characters [^"\x20\r\n]+ stored in group 1
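
The walk-through in the notes above can be mimicked with an explicit loop, which may make the roles of `\K` and `\G` easier to see. This is only a rough Python emulation under stated assumptions, not what Notepad++ does internally: `START`, `PIECE` and `replace_all` are hypothetical names, `Pattern.match(text, pos)` stands in for `\G` (it only matches exactly at `pos`), and the special case where `\G` also matches at the initial cursor position is deliberately left out.

```python
import re

START = re.compile(r'<img\x20+src="')              # the part in front of \K
PIECE = re.compile(r'\x20*([^"\x20\r\n]+)|\x20+')  # the part after \K or \G


def replace_all(text: str) -> str:
    """Emulate Replace All with replacement \\1 for the search regex above."""
    out = []
    pos = 0
    while True:
        start = START.search(text, pos)
        if start is None:
            out.append(text[pos:])       # no further <img src=" : copy the rest
            return "".join(out)
        # \K: the text up to and including src=" is not part of the match,
        # so it is copied through unchanged.
        out.append(text[pos:start.end()])
        pos = start.end()
        # \G: keep matching only while each piece starts exactly where the
        # previous one ended; the chain stops at the closing quote.
        while True:
            piece = PIECE.match(text, pos)
            if piece is None:
                break
            out.append(piece.group(1) or "")  # group 1 is empty for a spaces-only piece
            pos = piece.end()


if __name__ == "__main__":
    print(replace_all('a <img src="http :// jos / bos "> b c'))
```

Each pass of the inner loop corresponds to one match in Notepad++: a run of non-space characters (rewritten via group 1) or a run of spaces (rewritten as nothing).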

          Best Regards,

          guy038
