Replacing word between specific characters



  • Hello, I need some help.
    I have a URL like this:
    https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ

    What I want is to replace each directory in turn with a specific keyword, e.g. XYZ:

    https://developers.apple.com/XYZ/profile/ShadowDES/Boy?view=32
    https://developers.apple.com/forums/XYZ/ShadowDES/Boy?view=32
    https://developers.apple.com/forums/profile/XYZ/Boy?view=32

    That is, if there are 3 directories, the output should be 3 URLs.

    I have a list of URLs. Is there any way I can do this?



  • @shpock-boss ,

    In a single step? Not easily.

    Basically, what you gave would make a good program, written in your favorite language, which would search for URLs and then do all the logic, written in a way that you could understand and maintain in the future. And this isn’t a programming forum or a codewriting service.

    I was curious, and no one had replied yet, so I ran a few experiments. Using some really complicated regex, I was able to come up with something that worked for up to three directory levels; but every level requires more logic and more nested capture groups.

    So if you never have more than 3 directories (and if you define a directory’s valid characters the same way I do), then this will work:

    • FIND = (?-s)(https?://[^\s/]+/)([^\s/]+/)(((?2))?(((?2))?(.*)))
    • REPLACE = (?2${1}XYZ/${3})(?4\r\n${1}${2}XYZ/${5})(?6\r\n${1}${2}${4}XYZ/${7})
    • SEARCH MODE = regular expression

    With the example URL, that gives:

    https://developers.apple.com/XYZ/profile/ShadowDES/Boy?view=FUZZ
    https://developers.apple.com/forums/XYZ/ShadowDES/Boy?view=FUZZ
    https://developers.apple.com/forums/profile/XYZ/Boy?view=FUZZ
    

    No, I am not going to explain it in full detail. Basically, it uses capture groups to store various pieces in numbered groups, uses the numbered subexpression control flow to repeat previous conditions later in the match, and nests groups so that you can reference various layers. For the replacement, it uses conditional substitution expressions to decide how many output lines there will be, and what they will look like.
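
    To make that logic easier to follow, here is a minimal sketch of the same idea in Python rather than in Notepad++. It is an illustration, not the regex above: Python's `re` module has no `(?2)` subexpression calls and no conditional replacements, so the directory pattern is repeated by hand and a replacement function decides how many lines to emit. The names `SEG`, `URL_RE`, `expand`, and `replace_each_directory` are made up for this example, and it has the same three-directory limit.

```python
import re

# One directory segment: non-space, non-slash characters plus a trailing slash.
SEG = r"[^\s/]+/"

# Host, one required directory, up to two optional ones, then the rest.
# The segment pattern is repeated because Python's re has no (?2) calls.
URL_RE = re.compile(rf"(https?://[^\s/]+/)({SEG})({SEG})?({SEG})?(\S*)")


def expand(match, keyword="XYZ"):
    """Build one output line per directory that actually matched."""
    host, rest = match.group(1), match.group(5)
    dirs = [d for d in match.group(2, 3, 4) if d]  # drop unmatched groups
    lines = []
    for i in range(len(dirs)):
        swapped = dirs[:i] + [keyword + "/"] + dirs[i + 1:]
        lines.append(host + "".join(swapped) + rest)
    return "\n".join(lines)


def replace_each_directory(text):
    """Replace every URL in text with its expanded set of lines."""
    return URL_RE.sub(expand, text)


if __name__ == "__main__":
    print(replace_each_directory(
        "https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ"
    ))
```

    The list comprehension over groups 2 to 4 plays the role of the `(?2...)`, `(?4...)`, `(?6...)` conditionals: a line is only produced for a directory level that was actually captured.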

    But I do not recommend this. Using regex you don’t understand is dangerous, and this is complex enough that you should only use it if you understand it, and are willing to read the document I linked, back up your data, and keep experimenting with it until you do understand it.

    An alternate way, which is slightly easier to understand, would still use conditionals, but the sequence is less complex (no nesting), so you can make it work for more terms by just copying the last term in the search, and copying the last term in the replacement.

    Step one of this would be to run a regular expression that would make N copies of the URL, and would add an indicator at the beginning (I chose N ∙ characters) to indicate which level of directory should be replaced on each line.

    Starting with this data:

    https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ
    https://developers.apple.com/one/two/three/four/five/six/seven/end.html
    

    the result of the first step will be:

    ∙https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ
    ∙∙https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ
    ∙∙∙https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ
    ∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
    ∙∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
    ∙∙∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
    ∙∙∙∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
    ∙∙∙∙∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
    ∙∙∙∙∙∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
    ∙∙∙∙∙∙∙https://developers.apple.com/one/two/three/four/five/six/seven/end.html
    

    The regex to do this would be

    • FIND = (?x-s)(https?://[^\s/]+/) ([^\s/]+/)? ([^\s/]+/)? ([^\s/]+/)? ([^\s/]+/)? ([^\s/]+/)? ([^\s/]+/)? ([^\s/]+/)? [^\s]*
    • REPLACE = (?{2}∙$0)(?{3}\r\n∙∙$0)(?{4}\r\n∙∙∙$0)(?{5}\r\n∙∙∙∙$0)(?{6}\r\n∙∙∙∙∙$0)(?{7}\r\n∙∙∙∙∙∙$0)(?{8}\r\n∙∙∙∙∙∙∙$0)

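    If you need more than seven directories deep, add another ([^\s/]+/)? to the sequence of those in the FIND, and another (?{ℕ}\r\n∙∙∙∙∙∙∙$0) at the end of the REPLACE, where ℕ is the next group number and the run of ∙ is one character longer than in the term before it.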
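
    If it helps to see step one outside of Notepad++, the following is a rough Python sketch of the same marking pass. It is not the regex above: instead of seven optional groups it captures all the directories in one group and counts the slashes. `MARK`, `MAX_DEPTH`, `STEP_ONE`, `mark_copies`, and `step_one` are names invented for this example, and leaving a URL with no directories untouched is a choice made for the sketch.

```python
import re

MARK = "∙"      # marker character, copied from the example output above
MAX_DEPTH = 7   # assumed limit, matching the seven optional groups

SEG = r"[^\s/]+/"
# Group 1 is the host, group 2 is every directory up to MAX_DEPTH levels.
STEP_ONE = re.compile(rf"(https?://[^\s/]+/)((?:{SEG}){{0,{MAX_DEPTH}}})\S*")


def mark_copies(match):
    """Return N copies of the URL, prefixed with 1..N marker characters."""
    url = match.group(0)
    depth = match.group(2).count("/")  # one slash per captured directory
    if depth == 0:
        return url  # no directories: leave the URL as it is
    return "\n".join(MARK * n + url for n in range(1, depth + 1))


def step_one(text):
    return STEP_ONE.sub(mark_copies, text)


if __name__ == "__main__":
    print(step_one(
        "https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ\n"
        "https://developers.apple.com/one/two/three/four/five/six/seven/end.html"
    ))
```

    Raising `MAX_DEPTH` here corresponds to adding more optional groups to the FIND and more conditional terms to the REPLACE.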

    The second and following steps would look for 7 symbols followed by a URL, then 6, then 5, then … on down, replacing the right piece in each one:

    • FIND = (?x-s) ∙{7} (https?://[^\s/]+/) (([^\s/]+/){6}) (?3) (.*$)
      • for each time, the numbers in the curly braces in the FIND would each decrease by 1, so it’s 7 and 6 for the seventh subdir, 6 and 5 for the sixth subdir, and so on down. (I did confirm that {0} will properly match 0 instances of the match, so it even works on the final pair of 1 and 0.)
    • REPLACE = ${1}${2}XYZ/${4}
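
    The countdown from 7 to 1 is the part that is easiest to get wrong by hand, so here is a small Python sketch of that loop, assuming the input already carries the ∙ markers from step one. `MARK`, `MAX_DEPTH`, and `replace_marked` are hypothetical names for this example; the pattern mirrors the FIND above but is written in Python's `re` syntax without the `(?3)` call.

```python
import re

MARK = "∙"      # marker character, copied from the step-one output
MAX_DEPTH = 7   # assumed limit, same as in step one
SEG = r"[^\s/]+/"


def replace_marked(text, keyword="XYZ"):
    """Swap the Nth directory on every line that starts with N markers."""
    # Go from the longest marker run to the shortest. Running the pass for
    # 1 first would also match the last marker of a longer run.
    for n in range(MAX_DEPTH, 0, -1):
        pattern = re.compile(
            rf"{MARK}{{{n}}}"              # exactly n markers
            rf"(https?://[^\s/]+/)"        # group 1: scheme and host
            rf"((?:{SEG}){{{n - 1}}})"     # group 2: the n-1 directories kept
            rf"{SEG}"                      # the directory being replaced
            rf"(.*)"                       # group 3: rest of the line
        )
        text = pattern.sub(
            lambda m: m.group(1) + m.group(2) + keyword + "/" + m.group(3),
            text,
        )
    return text


if __name__ == "__main__":
    url = "https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ"
    marked = "\n".join(MARK * n + url for n in (1, 2, 3))
    print(replace_marked(marked))
```

    As in the Notepad++ version, the final pass uses a repeat count of 0 for the kept directories, which simply matches nothing.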

    After all of those, I had:

    https://developers.apple.com/XYZ/profile/ShadowDES/Boy?view=FUZZ
    https://developers.apple.com/forums/XYZ/ShadowDES/Boy?view=FUZZ
    https://developers.apple.com/forums/profile/XYZ/Boy?view=FUZZ
    https://developers.apple.com/XYZ/two/three/four/five/six/seven/end.html
    https://developers.apple.com/one/XYZ/three/four/five/six/seven/end.html
    https://developers.apple.com/one/two/XYZ/four/five/six/seven/end.html
    https://developers.apple.com/one/two/three/XYZ/five/six/seven/end.html
    https://developers.apple.com/one/two/three/four/XYZ/six/seven/end.html
    https://developers.apple.com/one/two/three/four/five/XYZ/seven/end.html
    https://developers.apple.com/one/two/three/four/five/six/XYZ/end.html
    

    I give you this solution because it’s easier to extend to work with more subdirectories if you’ve got huge URLs. But it’s still a multistep process. I think this has less room for error than the previous one, but still back up your data and study the resources given, and see if you can understand each piece.
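
    For comparison, the "program in your favorite language" route mentioned at the top needs no regex at all and has no depth limit. This is only a sketch, assuming Python and its standard `urllib.parse` module; the function name `directory_variants` is invented here, and it treats everything between the host and the final path component as a directory, which may or may not match your definition.

```python
from urllib.parse import urlsplit, urlunsplit


def directory_variants(url, keyword="XYZ"):
    """Return one URL per directory, with that directory replaced by keyword."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # segments[0] is "" because the path starts with "/", and the last item
    # is the final component (file name), so directories are segments[1:-1].
    variants = []
    for i in range(1, len(segments) - 1):
        swapped = segments[:i] + [keyword] + segments[i + 1:]
        variants.append(urlunsplit(parts._replace(path="/".join(swapped))))
    return variants


if __name__ == "__main__":
    urls = [
        "https://developers.apple.com/forums/profile/ShadowDES/Boy?view=FUZZ",
        "https://developers.apple.com/one/two/three/four/five/six/seven/end.html",
    ]
    for url in urls:
        for variant in directory_variants(url):
            print(variant)
```

    Reading the URLs from a file, one per line, instead of from the hard-coded list is a small change to the loop at the bottom.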

    caveat emptor

    The suggestions given here seemed to work for me, based on my understanding of your issue, and are published here to help you learn how to do this. I make no guarantees or warranties as to the functionality for you. You are responsible for saving and backing up all data before and after running anything contained herein. If you want to use it long term, I recommend investing time in adding error checking and verifying with edge cases, and making sure you understand every piece of any of the suggestions.

