regex to replace any value other than between 2 special characters



  • Hello Everyone,

    I'm looking to replace everything except the value between two special characters. Please see the example below for reference.

    I'm working on a number of URLs, and the format of the URLs is something like this:

    www.abc.com/xyz-pqr-acs-mnq123?somerandomvalues

    So what I need is to retrieve the value between “-” and “?” (mnq123).

    I have tried this:
    FIND-> .*?(-(.*?)\?)
    REPLACE-> $1\n

    But the output I'm getting is not accurate: “-pqr-acs-mnq123?”

    Looking for some help.

    NOTE: sometimes I also need to retrieve data between other special characters in the same format, so it would be helpful if someone could help with a generic regexp.

    Thanks in Advance



  • Hello @rajat-kumar and All,

    First, I was a bit disappointed because your link does not seem to work ! And…, after a while, I just understood that this link was YOUR example ;-))


    So, what about these two simple regexes :

    Regex A -\w+\?

    Regex B -[^?-]+\?


    NOTES on regex A :

    • The first part - and the last part \? are quite obvious. Note that the ? is a regex symbol and must be escaped \?, to be interpreted as a literal

    • The middle part \w+ is a non-null range of word characters, i.e. roughly, characters which belong to the [A-Za-z0-9_] character class

    NOTES on regex B :

    • The first part - and last part \? are identical to the previous case

    • The middle part [^?-]+ is a negated character class ( [^.....] ), repeated one or more times, which matches any character, even EOL chars, different from any char located inside that class. So, different from :

      • A ? symbol, which does not need to be escaped, as it is enclosed in a character class [....]

      • A - symbol, which is placed at the end of the character class [....], because of its special meaning ( range operator ) between two characters of a class


    Try these two regexes against the sample test below :

    01 www.abc.com/xyz-pqr-acs-mnq123?somerandomvalues
    02 www.abc.com/xyz-pqr-acs-mnq 123?somerandomvalues
    03 www.abc.com/xyz-pqr-acs-mnq	123?somerandomvalues
    04 www.abc.com/xyz-pqr-acs-mnq+123?somerandomvalues
    05 www.abc.com/xyz-pqr-acs-mnq:123?somerandomvalues
    06 www.abc.com/xyz-pqr-acs-mnq/123?somerandomvalues
    07 www.abc.com/xyz-pqr-acs-mnq?123?somerandomvalues
    08 www.abc.com/xyz-pqr-acs-mnq-123?somerandomvalues
    09 www.abc.com/xyz-pqr-acs-mnq123?somerandomvalues
    

    Note that line 02 contains a space char and line 03 a tabulation char, between the strings mnq and 123
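
    For anyone who wants to check the two regexes outside the editor, here is a minimal sketch in Python. It is only an illustration: Python's re module is not the engine used by the editor, the leading 01..09 numbers are dropped from the sample, and the names SAMPLE, REGEX_A, REGEX_B and first_match are made up for this example.

```python
import re

# Sample lines from the post, without the leading "01".."09" numbers.
# The third line contains a tab between "mnq" and "123".
SAMPLE = [
    "www.abc.com/xyz-pqr-acs-mnq123?somerandomvalues",
    "www.abc.com/xyz-pqr-acs-mnq 123?somerandomvalues",
    "www.abc.com/xyz-pqr-acs-mnq\t123?somerandomvalues",
    "www.abc.com/xyz-pqr-acs-mnq+123?somerandomvalues",
    "www.abc.com/xyz-pqr-acs-mnq:123?somerandomvalues",
    "www.abc.com/xyz-pqr-acs-mnq/123?somerandomvalues",
    "www.abc.com/xyz-pqr-acs-mnq?123?somerandomvalues",
    "www.abc.com/xyz-pqr-acs-mnq-123?somerandomvalues",
    "www.abc.com/xyz-pqr-acs-mnq123?somerandomvalues",
]

REGEX_A = re.compile(r"-\w+\?")      # dash, word characters, literal ?
REGEX_B = re.compile(r"-[^?-]+\?")   # dash, anything but ? or -, literal ?


def first_match(pattern, line):
    """Return the first text matched by pattern in line, or None."""
    m = pattern.search(line)
    return m.group(0) if m else None


matches_a = [first_match(REGEX_A, line) for line in SAMPLE]
matches_b = [first_match(REGEX_B, line) for line in SAMPLE]

for line, a, b in zip(SAMPLE, matches_a, matches_b):
    print(repr(line), "A:", repr(a), "B:", repr(b))
```

    In this sketch, regex A finds nothing in the lines where a space, tab, +, : or / sits between mnq and 123, while regex B matches every line. Both stop at the first ? in the mnq?123 line and skip past the extra dash in the mnq-123 line.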

    Best Regards

    guy038



  • @Rajat-Kumar
    Hi Rajat,

    does that work for you?

    BG,
    Heike

    [attached image]



  • @guy038 said in regex to replace any value other than between 2 special characters:

    -[^?-]+\?

    Hello @guy038,

    I really appreciate your time and effort.

    Both of the regexes you created seem to be working perfectly, and they do identify the unique values within my source URLs.

    But could you please look into my issue once again? My desired result is to retrieve these unique codes from the URLs, i.e. I need to replace everything other than the unique code, which in this case is “mnq123”.

    I have tried this:
    FIND WHAT -> .*?(-[^?-]+\?)
    REPLACE WITH -> $1\n

    www.abc.com/xyz-pqr-acs-mnq123?somerandomvalues
    www.abc.com/xyz-pqr-acs-mnq 123?somerandomvalues
    www.abc.com/xyz-pqr-acs-mnq 123?somerandomvalues
    www.abc.com/xyz-pqr-acs-mnq+123?somerandomvalues
    www.abc.com/xyz-pqr-acs-mnq:123?somerandomvalues
    www.abc.com/xyz-pqr-acs-mnq/123?somerandomvalues
    www.abc.com/xyz-pqr-acs-mnq?123?somerandomvalues
    www.abc.com/xyz-pqr-acs-mnq-123?somerandomvalues
    www.abc.com/xyz-pqr-acs-mnq123?somerandomvalues

    AND THE RESULT WAS:

    -mnq123?
    -mnq 123?
    -mnq 123?
    -mnq+123?
    -mnq:123?
    -mnq/123?
    -mnq?
    -123?
    -mnq123?
    somerandomvalues

    AND MY DESIRED RESULT IS:

    mnq123
    mnq 123
    mnq 123
    mnq+123
    mnq:123
    mnq/123
    mnq?123
    mnq123

    Hopefully I haven’t made this too confusing to understand.

    Thanks!
    Rajat



  • @Ninon_1977

    Hello @Ninon_1977,

    I really appreciate your work, but the character length is not fixed in my case, so it will not work for me.

    Thanks!
    Rajat



  • @guy038

    Quick update: the values are only combinations of letters and numbers, and they are not going to contain any special characters or spaces in between.

    They will only ever look like “-mnq123?”
    and never like “-mnq:123?” or “-mnq?123?”



  • Hello @rajat-kumar, @ninon_1977 and All,

    Oh… sorry, I just forgot the replacement phase :-(( So, the zones to keep are made of word characters. Then, from the input text, below :

    01 www.abc.com/xyz-pqr-acs-mnq123?somerandomvalues
    02 www.abc.com/xyz-pqr-acs-mnq 123?somerandomvalues
    03 www.abc.com/xyz-pqr-acs-mnq	123?somerandomvalues
    04 www.abc.com/xyz-pqr-acs-mnq+123?somerandomvalues
    05 www.abc.com/xyz-pqr-acs-mnq:123?somerandomvalues
    06 www.abc.com/xyz-pqr-acs-mnq/123?somerandomvalues
    07 www.abc.com/xyz-pqr-acs-mnq?123?somerandomvalues
    08 www.abc.com/xyz-pqr-acs-mnq-123?somerandomvalues
    09 www.abc.com/xyz-pqr-acs-mnq123?somerandomvalues
    

    Using the following regex S/R :

    SEARCH (?-s)^.*-(\w+)\?.*

    REPLACE $1

    You should get the expected output text :

    mnq123
    02 www.abc.com/xyz-pqr-acs-mnq 123?somerandomvalues
    03 www.abc.com/xyz-pqr-acs-mnq	123?somerandomvalues
    04 www.abc.com/xyz-pqr-acs-mnq+123?somerandomvalues
    05 www.abc.com/xyz-pqr-acs-mnq:123?somerandomvalues
    06 www.abc.com/xyz-pqr-acs-mnq/123?somerandomvalues
    mnq
    123
    mnq123
    

    Remark :

    • Only lines 01, 07, 08 and 09 matched and are replaced. Other lines, containing a non-word symbol, between the strings mnq and 123, are unchanged !

    Notes :

    • First, the in-line modifier (?-s) means that any dot regex char . represents a single standard character ( and not EOL chars like \r or \n )

    • Then the part ^.*- matches, from beginning of line ( ^ ) the greatest range, even null, of standard chars ( .* ), followed by a dash symbol ( - )

    • Now, the part (\w+) matches any non-null range of word characters, stored as group 1, due to the enclosing parentheses

    • Finally the part \?.* looks for a literal question mark ( \? ) followed by any range, even null, of standard characters, till the end of current line

    • In replacement, the entire contents of current line, before the line-break, are replaced with the group 1 contents, only !
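
    The same search and replace can be sketched in Python for anyone who wants to test it in a script. This is an approximation under a few assumptions: Python's re module stands in for the editor's engine, the (?-s) modifier is dropped because the dot in Python does not match line-breaks by default, and re.MULTILINE is added so that ^ anchors at every line start. The helper extract_between is hypothetical, a possible starting point for the "generic" case with other delimiters.

```python
import re

# Python equivalent of the post's search  (?-s)^.*-(\w+)\?.*
# "." already excludes "\n" in Python, so (?-s) is not needed here.
SEARCH = re.compile(r"^.*-(\w+)\?.*", re.MULTILINE)

text = "\n".join([
    "01 www.abc.com/xyz-pqr-acs-mnq123?somerandomvalues",
    "02 www.abc.com/xyz-pqr-acs-mnq 123?somerandomvalues",
    "03 www.abc.com/xyz-pqr-acs-mnq\t123?somerandomvalues",
    "04 www.abc.com/xyz-pqr-acs-mnq+123?somerandomvalues",
    "05 www.abc.com/xyz-pqr-acs-mnq:123?somerandomvalues",
    "06 www.abc.com/xyz-pqr-acs-mnq/123?somerandomvalues",
    "07 www.abc.com/xyz-pqr-acs-mnq?123?somerandomvalues",
    "08 www.abc.com/xyz-pqr-acs-mnq-123?somerandomvalues",
    "09 www.abc.com/xyz-pqr-acs-mnq123?somerandomvalues",
])

# Replace each matching line with the contents of group 1 only.
result = SEARCH.sub(r"\1", text)
print(result)


def extract_between(line, start, end, body=r"\w+"):
    """Hypothetical helper: return the last run matching `body` that sits
    between the literal delimiters `start` and `end`, or None."""
    pattern = re.compile(
        r"^.*" + re.escape(start) + "(" + body + ")" + re.escape(end)
    )
    m = pattern.search(line)
    return m.group(1) if m else None


print(extract_between("www.abc.com/xyz-pqr-acs-mnq123?somerandomvalues", "-", "?"))
print(extract_between("id=abc42&x", "=", "&"))
```

    re.escape is used so that delimiters such as ? or + are taken literally, whatever characters are passed in. The body pattern defaults to \w+, matching the case where the value contains only letters and digits.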

    Cheers,

    guy038



  • @guy038

    Wow, it's working absolutely perfectly. Thanks for the help!

