Find text and replace at the top of file

Srirup Das

Hi,
Is there a way to find text, in this case <!DOCTYPE HTML>, in all JSP files and move it to the top of the file? Like, cut and paste at the top. If this is not possible, I would like to find all files which don’t have <!DOCTYPE HTML> at the beginning.

Alan Kilborn

@Srirup-Das

A technique for matching your data only at top-of-file would be searching for this regular expression: (?<!\x0A)^<!DOCTYPE HTML>

I had bookmarked the origin of this technique, and it is here if you want to read more.
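
To see why the lookbehind pins the match to the very start of the file, here is a minimal sketch in Python rather than in Notepad++ itself. It assumes that Python's re module with re.MULTILINE is a fair stand-in for how ^ behaves in a Notepad++ regex search; the two sample strings are made up for illustration.

```python
import re

# (?<!\x0A) rejects any position preceded by a line feed, so with MULTILINE
# the only line start that survives is the first one in the text.
TOP_DOCTYPE = re.compile(r"(?<!\x0A)^<!DOCTYPE HTML>", re.MULTILINE)

# Made-up sample contents, not taken from any real JSP file.
at_top = "<!DOCTYPE HTML>\n<html>\n</html>\n"
lower_down = "<%@ page language=\"java\" %>\n<!DOCTYPE HTML>\n<html>\n"

print(TOP_DOCTYPE.search(at_top) is not None)      # True
print(TOP_DOCTYPE.search(lower_down) is not None)  # False
```

Without the lookbehind, ^ would also match at the start of line 2 in the second sample, and the DOCTYPE there would be found.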

Srirup Das

@Alan-Kilborn Thanks for your response. What would be the regular expression to find files which don’t have <!DOCTYPE HTML> at the top of the file?

Alan Kilborn

@Srirup-Das

For that, I’d be tempted to try (?<!\x0A)^(?!<!DOCTYPE HTML>).

Note that this one produces only a zero-length match (it selects no text) at the very start of a file when the special string is NOT at the top of that file.

But my eyes are swimming… the sequence ?<! (a negative lookbehind) versus the sequence ?!<! (a negative lookahead, ?!, followed by the literal characters <!) mean two totally different and unrelated things. :-)
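
A small Python sketch of the same pattern may help separate the two pieces. It assumes Python's re module is close enough to the Notepad++ engine for this regex, and the sample strings are invented.

```python
import re

# (?<!\x0A)^   -> only the first line start in the text
# (?!<!DOCTYPE HTML>) -> succeed only if the DOCTYPE does NOT follow
NOT_AT_TOP = re.compile(r"(?<!\x0A)^(?!<!DOCTYPE HTML>)", re.MULTILINE)

# Made-up sample contents.
no_doctype = "<html>\n</html>\n"
doctype_first = "<!DOCTYPE HTML>\n<html>\n"

match = NOT_AT_TOP.search(no_doctype)
print(match.span())                      # (0, 0): the pattern is assertions only
print(NOT_AT_TOP.search(doctype_first))  # None
```

Because the pattern consists purely of assertions, a hit marks a position (the start of the file) rather than any text.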

Srirup Das

@Alan-Kilborn Seems like (?<!\x0A)^(?!<!DOCTYPE HTML>) is working. Thanks for your help.

Srirup Das

@Alan-Kilborn One more thing I found is that (?<!\x0A)^(?!<!DOCTYPE HTML>) also lists the files which don’t have <!DOCTYPE HTML> anywhere in the file. I don’t need those files. I need only the files which contain <!DOCTYPE HTML> but not at the top.

Alan Kilborn

@Srirup-Das

Okay, okay, one more freebie, but after that, unless I start charging you for consulting, the others here (and really, me, too) are gonna get all over me/us. :-)

(?<!\x0A)^(?!<!DOCTYPE HTML>)(?s).*?<!DOCTYPE HTML>

NOW, what you owe ME is to interpret this and reply back with how it works (assuming it does)!
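
For anyone who wants to try the same idea outside Notepad++, here is a rough Python sketch. Several things in it are assumptions rather than part of the thread: the mid-pattern (?s) is rewritten as the scoped group (?s:.*?) because recent Python versions reject a global flag in the middle of a pattern, the helper names has_misplaced_doctype and find_misplaced are hypothetical, and the files are assumed to be UTF-8 (with or without a BOM).

```python
import re
from pathlib import Path

# Python adaptation of the regex above; (?s:.*?) lets the lazy gap cross
# line breaks, standing in for the mid-pattern (?s).
MISPLACED = re.compile(
    r"(?<!\x0A)^(?!<!DOCTYPE HTML>)(?s:.*?)<!DOCTYPE HTML>", re.MULTILINE
)

def has_misplaced_doctype(text):
    """True when the text does not start with the DOCTYPE but contains it later."""
    return MISPLACED.search(text) is not None

def find_misplaced(root):
    """Hypothetical helper: list .jsp files under root with a misplaced DOCTYPE."""
    hits = []
    for path in sorted(Path(root).rglob("*.jsp")):
        # Text mode turns CRLF into LF; utf-8-sig drops a leading BOM if present.
        text = path.read_text(encoding="utf-8-sig", errors="replace")
        if has_misplaced_doctype(text):
            hits.append(path)
    return hits

print(has_misplaced_doctype("<%@ page %>\n<!DOCTYPE HTML>\n<html>\n"))  # True
print(has_misplaced_doctype("<!DOCTYPE HTML>\n<html>\n"))               # False
print(has_misplaced_doctype("<html>\n</html>\n"))                       # False
```

Calling find_misplaced on a project folder would return the paths worth opening; the three print lines cover the misplaced, at-top and missing cases.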

guy038

Hello, @srirup-das, @alan-kilborn and All,

I do not get much credit for intervening at the end of the discussion to announce that the negative look-ahead (?!<!DOCTYPE HTML>) seems useless in the last version of the regex!

Indeed, this shorter syntax (?<!\x0A)^(?s).+<!DOCTYPE HTML> is enough to get the job done ;-)) Like the former regex, it even detects files whose first line is, let’s say, ABC<!DOCTYPE HTML> (so, not exactly at the very beginning of the current file)!
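
A quick way to compare the shorter pattern with the longer one, and to see what the + contributes, is the Python sketch below. It is an approximation, not Notepad++ itself: the (?s) is rewritten as a scoped (?s:...) group because recent Python versions reject it mid-pattern, the WITH_STAR variant is added only for comparison, and every sample string is invented.

```python
import re

WITH_PLUS = re.compile(r"(?<!\x0A)^(?s:.+)<!DOCTYPE HTML>", re.MULTILINE)
# Comparison-only variant: * allows an empty gap before the DOCTYPE.
WITH_STAR = re.compile(r"(?<!\x0A)^(?s:.*)<!DOCTYPE HTML>", re.MULTILINE)
# The longer regex from earlier in the thread, adapted the same way.
LONGER = re.compile(
    r"(?<!\x0A)^(?!<!DOCTYPE HTML>)(?s:.*?)<!DOCTYPE HTML>", re.MULTILINE
)

# Made-up sample contents keyed by a descriptive name.
samples = {
    "at_top": "<!DOCTYPE HTML>\n<html></html>\n",
    "after_text": "ABC<!DOCTYPE HTML>\n<html></html>\n",
    "second_line": "<%-- header --%>\n<!DOCTYPE HTML>\n",
    "missing": "<html></html>\n",
    "top_and_repeated": "<!DOCTYPE HTML>\n<html>\n<!DOCTYPE HTML>\n",
}

def hits(pattern):
    """Names of the samples in which the pattern finds a match."""
    return sorted(name for name, text in samples.items() if pattern.search(text))

print(hits(WITH_PLUS))  # ['after_text', 'second_line', 'top_and_repeated']
print(hits(WITH_STAR))  # ['after_text', 'at_top', 'second_line', 'top_and_repeated']
print(hits(LONGER))     # ['after_text', 'second_line']
```

With +, at least one character has to sit between the start of the file and the DOCTYPE, so a file that simply begins with it is not reported; with * that file would be. In this sketch the two regexes differ in one corner case: a file that starts with the DOCTYPE and repeats it further down is reported by the shorter pattern but not by the longer one.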

Best Regards,

guy038

Alan Kilborn

@guy038

True enough, although these discussions tend to evolve to “build up a regex” by just adding on to what was said last. That’s what I did in this case.

Throwing most of it out might have been more confusing to the OP.

But I’d still like the OP to answer my question, as well as a second one now: why is the + in @guy038’s regex the absolute key to the whole thing?

Srirup Das

Thanks @Alan-Kilborn and @guy038. Both of them did what I was looking for. I’m still new to regular expressions, but I will try to dig deeper into both of your solutions.