RegEx: Anchor for beginning of file
Is there a RegEx symbol for the beginning of the file? Some places on the Web say it’s “\A” or “`”, but that doesn’t work. The Wiki page doesn’t mention beginning or end of file, but says that “\A, ’” matches the start of the matching string. Is there a way to make the whole file be the matched string?
What I want to do is find all files that start with “<?php”. Is there a way to do that?
I originally started using the info in the link you show, now I’ve moved onto a better website. This one explains in more detail, has better examples and I think has a more logical approach.
Of course your website is only meant as a primer, so what it does is good, but I quickly got beyond its ability to show and guide.
In the website link I provided, look for the `DOTALL` concept. This might allow you to select (match) the entire content of the file.
I started thinking that looking for the `<?php` should be simple, but realised most regexes are designed to succeed, so they continue looking until the expression is true. What I think you need to research is a method to have it fail as soon as it moves 1 character into the file. Again, the website I listed has examples of how the regex can be made to try and fail. So the concept would be (in plain English): look for the string `<?php` where there are absolutely no characters before it, not even spaces or CR/LF.
I will continue trying to come up with a regex, but thought you’d maybe like to know what I have considered thus far.
I’ve been playing with this and my solution is
The problem I had was that the `?` are special characters and although I tried using the delimiter on them it didn’t work. Also the `\R` caused the same problem. In the end I used the hex code for these characters.
`(?<!\x0A)` means I do not want a line feed. This may need amending in your environment depending on the character code set used. I assumed it is a CR followed by a LF.
`x0A` is the LF only.
Then we want a `^`, start of line.
Then we want the characters
You mentioned wanting to select the entire file contents if the characters you seek are at the immediate start, so to add to my regex we would then have
This is only half the battle. Now, I think if you are looking across multiple files using some automatic process, this regex needs incorporating into that somehow. Unfortunately that is outside my abilities at present.
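For anyone who wants to try the idea outside Notepad++, here is a minimal Python sketch. It assumes Python's `re` module rather than the regex engine Notepad++ uses, and the pattern is only a reconstruction of the description above (no line feed before, then start of line, then the literal characters), not necessarily the exact expression from this post. The names `START_ONLY`, `WHOLE_FILE` and `starts_with_php_tag` are hypothetical.

```python
import re

# Assumed reconstruction: a line start (^ in MULTILINE mode) that is NOT
# preceded by a line feed can only be the very first position of the text.
# \x3F is the hex code for "?", which avoids escaping the question mark.
START_ONLY = re.compile(r"(?<!\x0A)^<\x3Fphp", re.MULTILINE)

# With DOTALL, "." also matches line breaks, so ".*" runs to the end of the text.
WHOLE_FILE = re.compile(r"(?<!\x0A)^<\x3Fphp.*", re.MULTILINE | re.DOTALL)


def starts_with_php_tag(text: str) -> bool:
    """Hypothetical helper: True only if the text begins with the PHP open tag."""
    return START_ONLY.search(text) is not None
```

The second pattern corresponds to the "select the entire file contents" variant: it matches nothing unless the tag sits at the very start, and otherwise it matches everything.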
Indeed, in some cases, the `\A` assertion, standing for the beginning of file, when followed with a specific regex, may not match the location right before the very first character of the current file.
However, in your case, @ennexmb, as you’re searching for a string `<?php`, made up of literal characters only, it should be OK !
So if you scan a bunch of files, using the Find in Files dialog, the regex `\A<\?php` should identify any PHP file, whose first line begins with the
Just note that the question mark must be escaped, as `\?`, to be interpreted as a literal char !
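The same search can be sketched as a small script, for cases where the scan has to run outside the editor. This is only an illustration under stated assumptions: it uses Python's `re` module (not the engine behind Find in Files), the helper name `files_starting_with_php` and the `*.php` glob are hypothetical, and files are assumed to be UTF-8, with a leading BOM ignored.

```python
import re
from pathlib import Path

# \A anchors at the very start of the text; \? is a literal question mark.
PHP_OPEN = re.compile(r"\A<\?php")


def files_starting_with_php(root: str, pattern: str = "*.php") -> list[Path]:
    """Hypothetical helper: files under root whose content starts with <?php."""
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        # Only the first few characters matter, so read a short prefix.
        # Assumption: UTF-8 text; "utf-8-sig" drops a leading BOM if present.
        with path.open("r", encoding="utf-8-sig", errors="replace") as handle:
            head = handle.read(16)
        if PHP_OPEN.search(head):
            hits.append(path)
    return hits
```

A file that has a blank line or a space before the tag is not reported, which is the point of anchoring with `\A` instead of `^`.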
Oh geez, my stupidity!
I just ran the search again using `\A<\?php` as suggested by guy038 and it does indeed work. I’m guessing that when I tried it before I neglected to escape the
Terry, thanks for your valiant efforts. I tried your search string as well and it also works. That’s some pretty hardcore hacking, which would have been a great solution had the `\A` actually not been working.
And I’ve also learned from this not to trust the Notepad++ Wiki as documentation. I wanted to use that instead of a reference like the one suggested by Terry because it’s specific to Notepad++. Since there are different flavors of RegEx, I wanted a doc specific to the flavor in Notepad++. But it doesn’t help much if it’s wrong!
Thanks Terry and guy for your help.
I’ve played around a bit to find the absolute start of file in the current version of Notepad++ (7.8.6), and I’ve found that ^\A works beautifully.
Also, I don’t know if it’s documented anywhere, but another item I’ve found in Notepad++ is that \Z functions as ‘end of a blank line’ rather than the absolute end of file, whereas \z (lower case Z) acts to only detect the absolute end of the file.
> (7.8.6), I’ve found that ^\A works beautifully.
Have you seen the recent discussions with @guy038 and @Alan-Kilborn? There might be more tests to run described in there. (I cannot remember which topic, and forum searches don’t work to find punctuation. One of them should be able to link you to the more-detailed discussion of the
> Also, I don’t know if it’s documented anywhere
> is that \Z functions as ‘end of a blank line’ rather than the absolute end of file, whereas \z (lower case Z) acts to only detect the absolute end of the file.
You are right that `\z` only matches absolute end-of-file, or the end of the string if ☑ In selection is enabled.
`\Z` does not match ‘end of a blank line’. As the documentation says, it matches `\z` preceded by 0 or more blank lines. You can see this by pasting the below file into Notepad++ and searching for `\Z`: it will match in two locations: at the end of line H (which is `\z` preceded by a single newline), and the beginning of the empty line after (called I), which is `\z` without any newlines.
If your description were right, it would have matched on B, D, E, G, and I; but it does not.
A. Not blank

C. Two (B) was blank


F. Four and five (D,E) were blank

H. Next line (I) will be blank and end of file
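The difference can also be reproduced outside Notepad++ with a short Python sketch. Note the assumptions: Python's `re` module is not the Notepad++ engine, and in Python `\Z` matches only at the absolute end (what Notepad++ calls `\z`). The Notepad++ `\Z` is therefore emulated here with a lookahead for "0 or more line feeds, then the end", following the description above. The names `SAMPLE`, `ABSOLUTE_END` and `END_AFTER_NEWLINES` are hypothetical.

```python
import re

# Sample text: line H ends with a line break, so an empty line I follows.
SAMPLE = (
    "A. Not blank\n"
    "\n"
    "C. Two (B) was blank\n"
    "\n"
    "\n"
    "F. Four and five (D,E) were blank\n"
    "\n"
    "H. Next line (I) will be blank and end of file\n"
)

# Python's \Z matches only at the absolute end, like \z in Notepad++.
ABSOLUTE_END = re.compile(r"\Z")

# Assumed emulation of Notepad++'s \Z: the end of the text,
# optionally preceded by line feeds.
END_AFTER_NEWLINES = re.compile(r"(?=\n*\Z)")

absolute = [m.start() for m in ABSOLUTE_END.finditer(SAMPLE)]
relaxed = [m.start() for m in END_AFTER_NEWLINES.finditer(SAMPLE)]

print("absolute end only:", absolute)
print("end after optional line feeds:", relaxed)
```

The first list holds a single offset (the end of the text), while the second holds two: the end of line H and the start of the empty line I. None of the blank lines B, D, E or G is reported.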
Either of these threads may be what @PeterJones is referencing?:
These links may also contain related info:
Peter, when I first pasted your sample text, I stupidly forgot the EOL chars of line H, and obviously, I just got one match, using the
So, to anyone trying to reproduce @peterjones’ manipulations, beware that the line H must end with a line-break, in order to create an empty line I, without any EOL char !
To be short, any of these three syntaxes
is a work-around syntax to replace the buggy `\A` legal form !
It works correctly, assuming that none of the scanned files begins with true empty line(s)
For the particular cases, simply refer to @alan-kilborn’s links, in the previous post
Do you know how I could maintain some spacing between the three regexes :
Which are usually rewritten :
in the legal code text, of my previous post ?
Well, the trick is to use No Break Space character(s) (`\xA0`), instead of the usual
So, use the Alt + 160 input method, from the numeric keypad, to insert a No Break Space character, at current cursor location !