Hello, @robin-cruise and All,
In the search regex (?s)(https://xxx.com/en/)([^"]+)".+?\1(?!\2").+?" :
The regex part https://xxx.com/en/ looks for the literal string https://xxx.com/en/, stored as group 1
The regex part ([^"]+)" matches the remainder of the internet address ( for instance, the string page-AAA.html ), followed by a double-quote. Indeed, [^"]+ is a non-null range of consecutive chars, all different from ", which is stored as group 2. The closing double-quote itself is outside the group
Now, the part .+? stands for the shortest non-null range of any char, line-breaks included because of the (?s) modifier, till…
The contents of group 1 ( \1 ), so another string https://xxx.com/en/
Which must be followed by .+?", which represents the shortest non-null range of any char before a double-quote…
But ONLY IF the text right after \1 is different from \2" ( i.e. different, for instance, from the string page-AAA.html followed by a " char ), as required by the negative look-ahead (?!\2")
Note also that the [^"]+" syntax, without the parentheses, is more restrictive than .+?", as it can never run past a double-quote, and should be preferred because of the negative look-ahead (?!\2")
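To see the behaviour described above outside the editor, here is a minimal sketch in Python. It assumes that Python's re module handles this particular regex the same way as the editor's regex engine, which should hold for the constructs used here. The two sample strings and the helper name first_mismatch are hypothetical, made up for the illustration only.

```python
import re

# The regex from the post, copied unchanged (dots left unescaped, as in the original).
PATTERN = re.compile(r'(?s)(https://xxx.com/en/)([^"]+)".+?\1(?!\2").+?"')

def first_mismatch(text):
    """Return the page name stored in group 2 when a later address differs, else None."""
    m = PATTERN.search(text)
    return m.group(2) if m else None

# Hypothetical sample data: two addresses that differ, then two that agree.
different = ('<link href="https://xxx.com/en/page-AAA.html">\n'
             '<a href="https://xxx.com/en/page-BBB.html">')
identical = ('<link href="https://xxx.com/en/page-AAA.html">\n'
             '<a href="https://xxx.com/en/page-AAA.html">')

print(first_mismatch(different))   # a match is found: the second address differs
print(first_mismatch(identical))   # no match: the look-ahead rejects the identical address
```

With the first sample, the regex matches from the first address up to the double-quote closing the second one. With the second sample, the negative look-ahead fails at the only other occurrence of https://xxx.com/en/, so no match is found at all.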
Best Regards,
guy038