Regex pattern needed
-
Hello,
I am looking for a way to replace all lines of the form:
https://www.server.example/path/something%20with%20spaces
with:
<a href="https://www.server.example/path/something%20with%20spaces">something with spaces</a>
Using a single search/replace operation. Currently I do it with one operation that transforms the link to an <a href...> format, and another one to replace a %20 with a space after the closing angle bracket, which has to be repeated several times until all the %20 instances are replaced.
Please assist.
-
I would approach it as a two-step process.
1. convert
https://www.server.example/path/something%20with%20spaces
to
<a href="https://www.server.example/path/something%20with%20spaces">something%20with%20spaces</a>
because that’s a pretty easy regex
2. convert the
>something%20with%20spaces</a>
to
>something with spaces</a>
I would do this because I assume that some of your URLs might have one %20, some might have two %20, and some might have more (or none). And coding a regex for all those edge cases is fragile. OTOH, if I can just search for a URL and break it into two pieces, that’s easy.
Given this test data:
https://www.server.example/path/something%20with%20spaces
https://www.different.example/path/one%20space
https://www.third.example/path/spaceless
1. For the first step:
* FIND = (?-s)^(https?://\S*/)([^"\s/]*)$
* REPLACE = <a href="$1$2">$2</a>
* MODE = Regular expression
That turns the test data into:
<a href="https://www.server.example/path/something%20with%20spaces">something%20with%20spaces</a>
<a href="https://www.different.example/path/one%20space">one%20space</a>
<a href="https://www.third.example/path/spaceless">spaceless</a>
2. For this one, I would use @guy038’s generic “change data, but only between start and end markers” regex from this post:
* Generic = (?-i:BSR|(?!\A)\G)(?s:(?!ESR).)*?\K(?-i:FR)
* BSR = > (for the end of the <a href="...">)
* ESR = </a>
* FR = %20
* RR = \x20 (or a literal space)
* => FIND = (?-i:>|(?!\A)\G)(?s:(?!</a>).)*?\K(?-i:%20)
* REPLACE = \x20 (or a literal space)
Unfortunately, when I did that, my test data became
<a href="https://www.server.example/path/something%20with%20spaces">something with spaces</a>
<a href="https://www.different.example/path/one space">one space</a>
<a href="https://www.third.example/path/spaceless">spaceless</a>
… and you can see that it replaced a %20 that was inside the href portion. I think that is because I used such a small BSR expression. Unfortunately, my attempt at fixing it with BSR = <a[^\s>]*>, to be more specific, said it couldn’t find it at all. And unfortunately, I have to focus on my day job a bit more today, so I cannot continue debugging. But this is the path I’d follow.
Maybe @guy038 will have time to tell us what I did wrong, or come up with a better BSR to keep the find-region out of the href value. Or maybe I will find some time this evening.
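For anyone who wants to see why the narrow BSR misbehaves, here is a minimal Python sketch. It is not the Notepad++ regex itself: Python's built-in re module has no \G or \K, so the helper replace_between (a made-up name, not from the thread) only emulates the "between start and end markers" idea with a callback. The input is the step 1 output shown above. One likely explanation, which the sketch reproduces, is that the > closing each </a> also qualifies as a start marker, so the href on the following line ends up inside a region.

```python
import re


def replace_between(text, start, end, find, repl):
    """Replace `find` with `repl` only in regions that begin right after a
    match of the regex `start` and stop just before the literal `end`.

    Hypothetical helper: an emulation of the marker-based regex, not a port.
    """
    region = re.compile(
        f"(?P<start>{start})(?P<body>(?:(?!{re.escape(end)}).)*)",
        re.DOTALL,  # let a region run across line breaks, like (?s:...)
    )
    # Keep the start marker untouched and rewrite only the region body.
    return region.sub(
        lambda m: m.group("start") + m.group("body").replace(find, repl),
        text,
    )


# Output of step 1 for the three test URLs.
step1_output = "\n".join([
    '<a href="https://www.server.example/path/something%20with%20spaces">something%20with%20spaces</a>',
    '<a href="https://www.different.example/path/one%20space">one%20space</a>',
    '<a href="https://www.third.example/path/spaceless">spaceless</a>',
])

# With ">" as the start marker, the ">" at the end of each </a> opens a new
# region that runs through the next line's href value.
too_wide = replace_between(step1_output, ">", "</a>", "%20", " ")
print(too_wide)
```

The first href is left alone because no > comes before it, while the second one sits after the > of the first </a> and gets rewritten.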
-
@PeterJones said in Regex pattern needed:
Or maybe I will find some time this evening.
Well, it was the next day, but…
My mistake in yesterday’s modified BSR = <a[^\s>]*> was including \s in the complement character class, which meant it had to be <a...> without any spaces, which obviously cannot match <a href="...">. Once I realized that, it was easy to fix.
- BSR = <a[^>]*>
- ESR = </a>
- FR = %20
- RR = \x20 (or literal space)
- FIND = (?-i:<a[^>]*>|(?!\A)\G)(?s:(?!</a>).)*?\K(?-i:%20)
- REPLACE = \x20
- Final transformation of my previous data:
<a href="https://www.server.example/path/something%20with%20spaces">something with spaces</a>
<a href="https://www.different.example/path/one%20space">one space</a>
<a href="https://www.third.example/path/spaceless">spaceless</a>
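The whole two-step process can also be tried outside the editor. The Python below is a sketch under assumptions, not what Notepad++ runs: the name linkify is made up here, the leading (?-s) from the step 1 FIND is dropped because Python's re does not accept a global flag-off group (and the pattern contains no dot anyway), and step 2 uses a callback on the link text instead of @guy038's \G and \K regex, which Python's re does not support.

```python
import re

# Step 1: the FIND from the thread, with ^ and $ matching at line boundaries.
STEP1 = re.compile(r'^(https?://\S*/)([^"\s/]*)$', re.MULTILINE)

# Step 2: opening tag, link text, closing tag. Only group 2 gets rewritten,
# so the href value inside group 1 keeps its %20 sequences.
STEP2 = re.compile(r'(<a[^>]*>)(.*?)(</a>)', re.DOTALL)


def linkify(text: str) -> str:
    """Wrap each bare URL line in an <a> tag, then turn %20 into spaces
    in the visible link text only."""
    wrapped = STEP1.sub(r'<a href="\1\2">\2</a>', text)
    return STEP2.sub(
        lambda m: m.group(1) + m.group(2).replace("%20", " ") + m.group(3),
        wrapped,
    )


urls = "\n".join([
    "https://www.server.example/path/something%20with%20spaces",
    "https://www.different.example/path/one%20space",
    "https://www.third.example/path/spaceless",
])

result = linkify(urls)
print(result)
```

Because the callback handles any number of %20 occurrences in one pass, this behaves like a single operation from the caller's point of view, even though it is still two substitutions internally.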