    Regex to match spaces in a URL

    progonkpa

      I want to remove the spaces from URLs in src attributes using a regex, e.g.
      src="http :// jos / bos " (remove the spaces in the string between the quotes)

      I found this regex to select the string between the quotes on regexr.com.
      (?![src=“]).*(?=”)

      When I use this regex in Notepad++, it doesn’t match.

      It also fails to match if I replace .* with a space, but that is probably due to my poor regex skills.

      Any help appreciated, kr.

          guy038

          Hello, @progonkpa and All,

          I suppose that the following regex S/R should do the job!

          SEARCH (?-is)(?:<img\x20+src="\K|\G)(?:\x20*([^"\x20\r\n]+)|\x20+)

          REPLACE \1

          Remarks :

          • Due to the \G syntax, the initial location of the cursor must not be followed by space characters

          • Tick the Wrap around option

          • Select the Regular expression search mode

          • Because of the \K syntax, use the Replace All button only (do not use the Replace button!)
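For readers who want to try the same clean-up outside Notepad++, here is a minimal Python sketch. It is not a port of the regex above: Python's built-in `re` module has no `\K` or `\G`, so this swaps in a different technique, capturing the whole attribute value and stripping its spaces in a callback. The helper name `strip_spaces_in_src` and the sample string are made up for illustration.

```python
import re

# Group 1: the literal prefix  <img src="  (case-sensitive, like (?-i) above).
# Group 2: everything up to the closing quote or the end of the line.
SRC_ATTR = re.compile(r'(<img\x20+src=")([^"\r\n]*)')

def strip_spaces_in_src(html):
    """Remove every space character inside the src value of <img> tags."""
    def fix(m):
        # Keep the prefix untouched, drop the spaces from the value only.
        return m.group(1) + m.group(2).replace(" ", "")
    return SRC_ATTR.sub(fix, html)

sample = '<img src="http :// jos / bos "> and some text'
print(strip_spaces_in_src(sample))
# prints: <img src="http://jos/bos"> and some text
```

Text outside the src value, including spaces elsewhere on the line, is left alone because only group 2 is rewritten.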

          Notes :

          • If we use the Free-spacing mode (?x), this regex can be expressed as :
          (?x-is) (?: <img \x20+ src="  \K  | \G )  (?: \x20*  ( [^"\x20\r\n]+ )  |  \x20+ )
          
          • So the regex engine first looks for the string <img src=", with this exact case

          • Then, due to the \K syntax, everything matched so far is discarded from the match, and the regex engine now searches for either:

            • A non-null range of non-space characters, possibly preceded by space chars

            • A non-null range of space characters

          • Then, due to the \G syntax, it searches for the same ranges as above, right after the end of the previous match. If no match can be found there, it tries again to find another string <img src=", and so on…

          • Note that the (?:...........) structures are non-capturing groups, which do not store anything for later use in either the search or the replacement!

          • In replacement, we only rewrite the non-space characters [^"\x20\r\n]+ stored in group 1
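The chain of matches described in these notes can be traced with a small Python sketch. It only emulates the behaviour, under stated assumptions: `ANCHOR.search()` stands in for the part before `\K`, and `STEP.match(text, pos)`, which must match exactly at `pos`, plays the role of `\G`. The names `ANCHOR`, `STEP` and `trace_matches` are hypothetical, and the sketch ignores the case where a search starts in the middle of a line.

```python
import re

ANCHOR = re.compile(r'<img\x20+src="')            # the part before \K
STEP = re.compile(r'\x20*([^"\x20\r\n]+)|\x20+')  # the part after \K or \G

def trace_matches(text):
    """Return (matched_text, replacement) pairs in the order they are found."""
    steps = []
    pos = 0
    while True:
        anchor = ANCHOR.search(text, pos)
        if anchor is None:
            break
        pos = anchor.end()             # like \K: drop the prefix, keep the position
        while True:
            m = STEP.match(text, pos)  # like \G: must start where the last match ended
            if m is None:
                break                  # closing quote or line end reached
            # Group 1 is empty when only spaces were matched, so they vanish.
            steps.append((m.group(0), m.group(1) or ""))
            pos = m.end()
    return steps

for matched, replacement in trace_matches('<img src="http :// jos / bos ">'):
    print(repr(matched), "->", repr(replacement))
```

For the sample line this lists six steps: 'http', ' ://', ' jos', ' /' and ' bos' are each rewritten without their leading space, and the final lone space before the closing quote is replaced by nothing.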

          Best Regards,

          guy038


          The Community of users of the Notepad++ text editor.