Replacing word between specific characters
-
Hello, I need some help
I have a url like this
https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ
What I want is to replace each directory in turn with a specific keyword, e.g. XYZ:
https://developers.apple.com/XYZ/profile/ShadowDES/Boy?view=32
https://developers.apple.com/forums/XYZ/ShadowDES/Boy?view=32
https://developers.apple.com/forums/profile/XYZ/Boy?view=32
That means if there are 3 directories, the output should be 3 URLs.
I have a list of URLs. Is there any way I can do this?
-
In a single step? Not easily.
Basically, what you described would make a good program, written in your favorite language, which would search for URLs and then do all the logic in a way that you could understand and maintain in the future. And this isn’t a programming forum or a code-writing service.
I was curious, and no one had replied yet, so I ran a few experiments. Using some really complicated regex, I was able to come up with something that worked for up to three directory levels; but every level requires more logic and more nested capture groups.
So if you never have more than 3 directories (and if you define a directory’s valid characters the same way I do), then this will work:
- FIND =
(?-s)(https?://[^\s/]+/)([^\s/]+/)(((?2))?(((?2))?(.*)))
- REPLACE =
(?2${1}XYZ/${3})(?4\r\n${1}${2}XYZ/${5})(?6\r\n${1}${2}${4}XYZ/${7})
- SEARCH MODE = regular expression
With the example URL, that produces:
https://developers.apple.com/XYZ/profile/ShadowDES/Boy?view=FUZZ
https://developers.apple.com/forums/XYZ/ShadowDES/Boy?view=FUZZ
https://developers.apple.com/forums/profile/XYZ/Boy?view=FUZZ
No, I am not going to explain it in full detail. Basically, it uses capture groups to store various pieces of the URL in numbered groups, uses the numbered subexpression control flow to repeat earlier conditions later in the match, and nests groups so that you can reference the various layers. For the replacement, it uses conditional substitutions to decide how many output lines there will be and what they will look like.
But I do not recommend this. Using regex you don’t understand is dangerous, and this is complex enough that you should only use it if you are willing to read the document I linked, back up your data, and keep experimenting with it until you do understand it.
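To see which piece of the URL lands in which group, here is a rough Python sketch of the same idea. It is not what Notepad++ runs: Python's `re` module has no `(?2)` subexpression calls and no conditional replacement syntax, so the calls are written out by hand and the conditionals are emulated in a function. The names `FIND` and `expand` are made up for this illustration.

```python
import re

# Same structure as the FIND above, with each (?2) call written out as
# [^\s/]+/ because Python's re has no subexpression calls.
# Group numbers are unchanged: 1 = host, 2 = first dir, 4 = second dir,
# 6 = third dir; 3, 5 and 7 hold whatever follows the first, second and
# third dir respectively.
FIND = re.compile(
    r"(https?://[^\s/]+/)([^\s/]+/)(([^\s/]+/)?(([^\s/]+/)?(.*)))"
)

def expand(m):
    """Emulate the conditional REPLACE: one output line per directory found."""
    lines = [m.group(1) + "XYZ/" + m.group(3)]
    if m.group(4):  # like (?4...): a second directory was captured
        lines.append(m.group(1) + m.group(2) + "XYZ/" + m.group(5))
    if m.group(6):  # like (?6...): a third directory was captured
        lines.append(m.group(1) + m.group(2) + m.group(4) + "XYZ/" + m.group(7))
    return "\n".join(lines)  # "\n" here instead of the "\r\n" used above

url = "https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ"
print(FIND.sub(expand, url))
```

Printing `m.groups()` for a match is a quick way to check your own understanding of the nesting before trusting the regex on real data.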
An alternate way, which is slightly easier to understand, would still use conditionals, but the sequence is less complex (no nesting), so you can make it work for more terms by just copying the last term in the search and copying the pattern in the replacement.
Step one of this would be to run a regular expression that makes N copies of the URL and adds an indicator at the beginning (I chose N ∙ characters) to indicate which level of directory should be replaced on each line.
Starting with data
https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ
https://developers.apple.com/one/two/three/four/five/six/seven/end.html
the end of first step will be
∙https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ
∙∙https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ
∙∙∙https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ
∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
∙∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
∙∙∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
∙∙∙∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
∙∙∙∙∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
∙∙∙∙∙∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
∙∙∙∙∙∙∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
The regex to do this would be
- FIND =
(?x-s)(https?://[^\s/]+/) ([^\s/]+/)? ([^\s/]+/)? ([^\s/]+/)? ([^\s/]+/)? ([^\s/]+/)? ([^\s/]+/)? ([^\s/]+/)? [^\s]*
- REPLACE =
(?{2}∙$0)(?{3}\r\n∙∙$0)(?{4}\r\n∙∙∙$0)(?{5}\r\n∙∙∙∙$0)(?{6}\r\n∙∙∙∙∙$0)(?{7}\r\n∙∙∙∙∙∙$0)(?{8}\r\n∙∙∙∙∙∙∙$0)
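For anyone who would rather see this first step as code, the following is a minimal Python sketch of the same marking logic, not a drop-in for Notepad++. It assumes the same definition of a directory as the FIND above; `MARK`, `MAX_DEPTH`, `FIND` and `mark_copies` are names invented for the example.

```python
import re

MARK = "∙"       # the marker character chosen above
MAX_DEPTH = 7    # number of optional directory groups, as in the FIND above

# Host in group 1, then MAX_DEPTH optional directory groups (2..8),
# then the rest of the URL up to the next whitespace.
FIND = re.compile(
    r"(https?://[^\s/]+/)" + r"([^\s/]+/)?" * MAX_DEPTH + r"[^\s]*"
)

def mark_copies(m):
    """Return one marked copy of the whole URL per directory that matched."""
    lines = []
    for n in range(1, MAX_DEPTH + 1):
        if m.group(n + 1) is None:   # directory n is absent, so stop here
            break
        lines.append(MARK * n + m.group(0))
    return "\n".join(lines)

data = (
    "https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ\n"
    "https://developers.apple.com/one/two/three/four/five/six/seven/end.html"
)
print(FIND.sub(mark_copies, data))
```

In this sketch, going deeper only means raising `MAX_DEPTH`, because the pattern and the loop are both built from it.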
If you need more than seven directories deep, add another
([^\s/]+/)?
to the sequence of those in the FIND, and another
(?{ℕ}\r\n∙∙∙∙∙∙∙$0)
at the end of the REPLACE, where ℕ is the next group number and the run of ∙ is one character longer.
The second and following steps would look for 7 ∙ symbols followed by a URL, then 6, then 5, and so on down, replacing the right piece in each one:
- FIND =
(?x-s) ∙{7} (https?://[^\s/]+/) (([^\s/]+/){6}) (?3) (.*$)
- On each pass, both numbers in the curly braces in the FIND decrease by 1: 7 and 6 for the seventh subdir, 6 and 5 for the sixth subdir, and so on down. (I did confirm that {0} will properly match 0 instances of the group, so it even works on the final pair of 1 and 0.)
- REPLACE =
${1}${2}XYZ/${4}
After all of those, I had:
https://developers.apple.com/XYZ/profile/ShadowDES/Boy?view=FUZZ
https://developers.apple.com/forums/XYZ/ShadowDES/Boy?view=FUZZ
https://developers.apple.com/forums/profile/XYZ/Boy?view=FUZZ
https://developers.apple.com/XYZ/two/three/four/five/six/seven/end.html
https://developers.apple.com/one/XYZ/three/four/five/six/seven/end.html
https://developers.apple.com/one/two/XYZ/four/five/six/seven/end.html
https://developers.apple.com/one/two/three/XYZ/five/six/seven/end.html
https://developers.apple.com/one/two/three/four/XYZ/six/seven/end.html
https://developers.apple.com/one/two/three/four/five/XYZ/seven/end.html
https://developers.apple.com/one/two/three/four/five/six/XYZ/end.html
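The countdown passes over the marked lines can be sketched in Python as a loop, which may make the "7 and 6, 6 and 5, ... 1 and 0" pattern easier to follow. This is only an illustration under the same assumptions as the FIND above. Python's `re` has no `(?3)` call, so that piece is written out as `[^\s/]+/`; the group numbers stay the same. `replace_marked`, `MARK` and `MAX_DEPTH` are hypothetical names.

```python
import re

MARK = "∙"       # the marker character chosen above
MAX_DEPTH = 7    # deepest marker count to look for

def replace_marked(text, keyword="XYZ"):
    """Run the passes from MAX_DEPTH marks down to 1 mark."""
    # Descending order matters: a pattern for 6 marks would otherwise
    # match the last 6 marks of a 7-mark line.
    for n in range(MAX_DEPTH, 0, -1):
        find = re.compile(
            MARK + "{%d}" % n                  # exactly n marker characters
            + r"(https?://[^\s/]+/)"           # group 1: scheme and host
            + r"(([^\s/]+/){%d})" % (n - 1)    # group 2: directories to keep
            + r"[^\s/]+/"                      # the directory being replaced
            + r"(.*$)",                        # group 4: rest of the line
            re.MULTILINE,
        )
        text = find.sub(
            lambda m: m.group(1) + m.group(2) + keyword + "/" + m.group(4),
            text,
        )
    return text

url = "https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ"
marked = "\n".join(MARK * n + url for n in (1, 2, 3))
print(replace_marked(marked))
```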
I give you this solution because it’s easier to extend to work with more subdirectories if you’ve got huge URLs. But it’s still a multistep process. I think this has less room for error than the previous one, but still back up your data and study the resources given, and see if you can understand each piece.
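For comparison, the kind of standalone program mentioned at the start could look roughly like the Python sketch below. It is an assumption-laden illustration rather than a tested tool: it treats every path segment except the last as a directory, skips empty segments, and the function name `directory_variants` is made up.

```python
from urllib.parse import urlsplit, urlunsplit

def directory_variants(url, keyword="XYZ"):
    """Return one URL per directory, with that directory replaced by keyword."""
    parts = urlsplit(url)
    # "/forums/profile/ShadowDES/Boy" -> ["", "forums", "profile", "ShadowDES", "Boy"]
    segments = parts.path.split("/")
    variants = []
    # Skip the leading empty segment and the final segment (the file name).
    for i in range(1, len(segments) - 1):
        if not segments[i]:          # ignore empty segments such as "//"
            continue
        changed = segments[:i] + [keyword] + segments[i + 1:]
        variants.append(urlunsplit(parts._replace(path="/".join(changed))))
    return variants

for line in directory_variants(
    "https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ"
):
    print(line)
```

There is no depth limit here, since the loop simply follows however many segments the path has.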
caveat emptor
The suggestions given seemed to work for me, based on my understanding of your issue, and are published here to help you learn how to do this. I make no guarantees or warranties as to the functionality for you. You are responsible for saving and backing up all data before and after running anything contained herein. If you want to use it long term, I recommend investing time in adding error checking, verifying with edge cases, and making sure you understand every piece of any of the suggestions.