Regex to match spaces in a URL



  • I want to replace spaces in URLs in src attributes using a regex e.g.
    src="http :// jos / bos " (remove spaces in string between the quotes)

    I found this regex to select the string between the quotes on regexr.com.
    (?![src="]).*(?=")

    When I use this regex in Notepad++, it doesn’t match.

    It also fails to match if I replace .* with a space, but that may just be due to my poor regex skills.

    Any help appreciated, kr.



  • Hello, @progonkpa and All,

    I suppose that the following regex S/R should do the job!

    SEARCH (?-is)(?:<img\x20+src="\K|\G)(?:\x20*([^"\x20\r\n]+)|\x20+)

    REPLACE \1

    Remarks :

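    • Because of the \G syntax, the initial cursor location must not be immediately followed by space characters ( \G also matches at the position where the search starts, so those spaces would be deleted as well )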

    • Tick the Wrap around option

    • Select the Regular expression search mode

    • Because of the \K syntax, click on the Replace All button, exclusively. ( Do not use the Replace button ! )

    Notes :

    • If we use the Free-spacing mode (?x), this regex can be expressed as :
    (?x-is) (?: <img \x20+ src="  \K  | \G )  (?: \x20*  ( [^"\x20\r\n]+ )  |  \x20+ )
    
    • So the regex engine first looks for the string <img src=" ( with one or more spaces after <img ), in this exact case, because of the (?-i) modifier

    • Then, due to the \K syntax, everything matched so far is discarded from the final match, and the regex engine now searches for either :

      • A non-null range of non-space characters, possibly preceded by space chars

      • A non-null range of space characters

    • Then, due to the \G syntax, it searches for the same ranges right after the end of the previous match. If no match is found there, it looks for another string <img src=", and so on…

    • Note that the (?:...........) structures are non-capturing groups, which do not store anything for later use in either the search or the replacement !

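    • In the replacement, we rewrite only the non-space characters [^"\x20\r\n]+ stored in group 1. When the second alternative \x20+ matches instead, group 1 is empty, so those spaces are simply deleted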
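
    For anyone who wants to get the same result outside Notepad++, here is a minimal Python sketch. Python's built-in re module does not support \K or \G, so this is not the S/R above but a different technique : it captures the whole quoted value and strips the spaces in a replacement callback. The names SRC_ATTR and strip_src_spaces are made up for this illustration, and the pattern assumes the same <img src="..."> layout as above.

```python
import re

# Group 1: the literal prefix  <img src="  (case-sensitive, one or more spaces after <img)
# Group 2: everything up to the closing quote, on a single line
# Group 3: the closing quote
SRC_ATTR = re.compile(r'(<img\x20+src=")([^"\r\n]*)(")')


def strip_src_spaces(text: str) -> str:
    """Remove space characters inside the value of <img src="...">."""

    def fix(match: re.Match) -> str:
        # Keep the prefix and the closing quote, drop spaces from the value only.
        return match.group(1) + match.group(2).replace(" ", "") + match.group(3)

    return SRC_ATTR.sub(fix, text)


if __name__ == "__main__":
    print(strip_src_spaces('<img src="http :// jos / bos ">'))
```

    Text outside the quotes is left alone, because only group 2 is modified. Other whitespace such as tabs is not touched, in line with the \x20 used in the regex above.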

    Best Regards,

    guy038

