Hello, @pbarney and All,
I’ll try to explain to you why your initial regex ^.*?(SID=\d+)?.* cannot work !
To begin with, let’s consider the first part of your regex :
^.*?(SID=\d+)?
If you try this regex, against your text :
Lorem ipsum dolor sit amet, libero turpis non cras ligula, id commodo, aenean
est in volutpat amet sodales, porttitor bibendum facilisi suspendisse, aliquam
ipsum ante morbi sed ipsum SID=324221815251191 mollis. Sollicitudin viverra, vel
varius eget sit mollis. Commodo enim aliquam suspendisse tortor cum diam, commodo
facilisis, rutrum et duis nisl porttitor, vel eleifend odio ultricies ut, orci in
SID=32422181753241& adipiscing felis velit nibh. Consectetuer porttitor feugiat
vestibulum sit feugiat, voluptates dui eros libero. Etiam vestibulum at lectus.
Donec vivamus. Vel donec et scelerisque vestibulum. Condimentum SID=324221819525920
aliquam, mollit magna velit nec, SID=324221821424161 tempor cursus vitae sit
You’ll note that it always matches a zero-length string, except on the 6th line, which begins with the SID=.... string. Why ?
Well, as you decided to use a lazy quantifier ( *? or, equivalently, {0,}? ), the regex engine first tries the shortest possible match, i.e. the empty string, at the beginning of the line and, of course, cannot see the string SID=... at that position. But it does not matter, as the SID=... group is optional. So, the regex engine considers this zero-length match a correct match for the current line ! And so on, until…
The 6th line, where the SID=... string does begin the line. There, the optional group can match right at the beginning of the line, so the regex engine takes this string as the match for the 6th line. And so on…
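If you want to check this behaviour outside Notepad++, here is a minimal sketch in Python. It assumes that Python's re module treats this particular pattern like the Boost engine used by Notepad++, and the three short lines are a made-up stand-in for your text, where only the last one begins with SID=. The pattern is tried once at the start of each line, with re.match :

```python
import re

# Hypothetical shortened sample: only the third line begins with SID=
lines = [
    "lorem ipsum dolor",
    "ante morbi SID=111 mollis",
    "SID=222& adipiscing",
]

# First part of the initial regex: lazy .*? followed by an OPTIONAL group
first_part = re.compile(r"^.*?(SID=\d+)?")

# re.match attempts the pattern once, at the beginning of each line
results = [(m.group(0), m.group(1)) for m in map(first_part.match, lines)]

for line, (whole, sid) in zip(lines, results):
    print(repr(line), "->", "match:", repr(whole), "group 1:", repr(sid))
```

The first two lines give a zero-length match with group 1 undefined ( None ), and only the third line gives the match SID=222.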
Now, when you add the final part .*, then, at the beginning of each line, due to the lazy quantifier, your regex is equivalent to :
^.*?.* ( in other words, equivalent to .* ) if the SID=... string is not at the beginning of the current line. Thus, as group 1 is not defined, the regex engine simply replaces the current line, without its line-break, with nothing, resulting in an empty line
(SID=\d+).* if the SID=... string begins the current line. In this case, group 1 is defined and the regex engine replaces the whole contents of the current line with the string SID=.....
Finally, note that your second regex ^.*?(SID=\d+).* matches ONLY the lines containing a SID=... string. Thus, it’s obvious that the other lines remain untouched !
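To see both regexes side by side, here is a small Python sketch. It is only an approximation of the Notepad++ S/R: it assumes the replacement is \1, that Python 3.5 or later is used ( so that an undefined group is replaced with nothing ), and that the re.M flag plays the role of Notepad++'s line-by-line ^ anchor. The sample text is a made-up, shortened version of yours :

```python
import re

# Hypothetical shortened sample: only the third line begins with SID=
text = (
    "lorem ipsum dolor\n"
    "ante morbi SID=111 mollis\n"
    "SID=222& adipiscing\n"
    "donec vivamus"
)

# Initial regex, OPTIONAL group: every line matches, but group 1 is only
# defined when SID=... begins the line
optional_result = re.sub(r"^.*?(SID=\d+)?.*", r"\1", text, flags=re.M)

# Second regex, MANDATORY group: only the lines containing SID=... match
mandatory_result = re.sub(r"^.*?(SID=\d+).*", r"\1", text, flags=re.M)

print(repr(optional_result))   # '\n\nSID=222\n'
print(repr(mandatory_result))  # 'lorem ipsum dolor\nSID=111\nSID=222\ndonec vivamus'
```

With the optional group, every line is emptied except the one beginning with SID=. With the mandatory group, the lines without any SID=... string are left as they are.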
Nevertheless, it was easy to solve your problem. You ( and I ) could have thought of this regex S/R !
SEARCH (?-s)^.*(SID=\d+).*|.+\R
REPLACE \1
When a line contains the SID=.... string, it just rewrites that string ( group 1 )
When a line does not contain a SID=.... string, the second alternative of the regex, .+\R, grabs all contents of the current line WITH its line-break. But, as this second alternative does not refer to group 1 at all, nothing is rewritten during the replacement, and these lines are just deleted
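For those who prefer to test this S/R in a script, here is a rough Python equivalent. Note the assumptions : Python's re module has no \R, so the sketch assumes Unix line-breaks and writes \n instead, and the (?-s) modifier is dropped because, in Python, the dot does not match a line-break by default. The sample text is, again, a made-up, shortened one :

```python
import re

# Hypothetical shortened sample, with Unix (\n) line-breaks
text = (
    "lorem ipsum dolor\n"
    "ante morbi SID=111 mollis\n"
    "SID=222& adipiscing\n"
    "donec vivamus\n"
    "tempor SID=333 cursus"
)

# Alternative 1: a line containing SID=... (group 1 holds that string)
# Alternative 2: any other non-empty line, WITH its line-break (\n stands in for \R)
pattern = re.compile(r"^.*(SID=\d+).*|.+\n", flags=re.M)

# An undefined group 1 is replaced with nothing (Python 3.5 or later)
result = pattern.sub(r"\1", text)

print(result)
```

On this sample, only the three SID=... strings remain, one per line. Note that the greedy .* before the group means that, if a line held several SID=... strings, only the last one would be kept.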
Best Regards,
guy038