How to get the Filenames after running "Find in Files" ?
-
Nice simplification. I was just trying to be quick-n-dirty, figuring you would provide the link to the “bloody post”, but you simplified it too much: you should have left
shits?
in. Not too often do you get to use cool text like that in searches! :-)
-
Yup, that regular expression did not work for me because it assumes that the file contains only text. In my case it also contains binary data.
However, it successfully removes the “hits” at the end of the lines.
-
Since there seem to be no easy solutions to this problem that are INDEPENDENT OF FILE CONTENTS, I’d like to ask: what is the best way to submit a feature request to the authors of Notepad++?
Adding a check box “Output Filenames Only” to the “Search in Files” tab should take them 15 minutes, since it does not require creating a new algorithm or new data. It is just a trimming down of the data currently output to the Find results window.
If they care about the speed of the search, they could also terminate searching the current file once the first hit is made …and go on to searching the next file.
This way, if a 100MB file contains 9999 matching strings and the 1st hit is located near the beginning of the file, only a small portion of it would need to be searched before the filename could be output. Statistically, that would save a lot of CPU cycles and disk I/O when searching a set of big files.
Alternately, fixing the Ctrl-C vs. RightClick-Copy behavior to be synonymous, and to copy only file names when the results are collapsed (+) in the Results window, would be an elegant solution to this problem.
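The early-termination idea above (stop reading a file as soon as the first hit is found) can be sketched outside Notepad++. This is only an illustration in Python, not how Notepad++ itself searches; the names first_hit_offset and files_with_hit and the 1 MiB chunk size are assumptions made for the example. It reads bytes rather than text, so it does not care whether a file is text or binary.

```python
import io


def first_hit_offset(stream, needle: bytes, chunk_size: int = 1 << 20):
    """Return the byte offset of the first occurrence of needle, or None.

    Reading stops as soon as a hit is found, so the rest of a large
    stream is never touched.
    """
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1   # bytes kept so a hit may straddle two chunks
    tail = b""
    consumed = 0                # bytes read from the stream before this chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return None         # end of stream, no hit
        buf = tail + chunk
        pos = buf.find(needle)
        if pos != -1:
            return consumed - len(tail) + pos
        consumed += len(chunk)
        tail = buf[-overlap:] if overlap else b""


def files_with_hit(paths, needle: bytes):
    """Return only the paths whose contents include needle (hypothetical helper)."""
    hits = []
    for path in paths:
        with open(path, "rb") as fh:
            if first_hit_offset(fh, needle) is not None:
                hits.append(path)
    return hits


if __name__ == "__main__":
    demo = io.BytesIO(b"\x00\x01binary junk the rest is never read")
    print(first_hit_offset(demo, b"the"))
```

Something like files_with_hit(list_of_paths, b"the") would then give the bare list of filenames, which is what the requested “Output Filenames Only” check box would produce.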
Getting rid of the (hits xxx) text at the ends of the paths would remain, but that seems to be solvable with a regex workaround, etc…
-
…assumes that the file contains only text. In my case it also contains binary data…
…no easy solutions to this problem, which are INDEPENDENT OF FILE CONTENTS…
Well, this is rather a grey area (but not really, to me)… Notepad++ is a text editor, and while you can do some things with it regarding non-text (binary) data, it is not really advisable. I think the regular expressions above stand as good solutions for normal use.
Searching is a frequently complained-about area of Notepad++; heck, I do it myself. :-) But the replies tend to be, especially for Find in Files, that Notepad++ isn’t the right tool for intensive searches, and that you should “use the right tool for the right job”. I’m not saying I agree with that, necessarily, just filling you in from what I’ve observed here in the past.
Regarding “terminate searching the current file once the first hit”, you might be interested in this related thread.
Your best option is to make a feature request or a change-in-functionality request here by opening a “New issue”. However, please spend a little time searching to see if your specific request already exists, in which case you can “up-vote” that Open Issue so that in theory it gets more attention. You may also add to an already-open Issue any new information you think relevant.
-
Hi, @robinson-george, @scott-sumner and All,
Yeah ! At first glance at your regex
....\(\d+\shits?\))
, I thought that you had just made a mistake :-))
@robinson-george, you said :
Adding a check box “Output Filenames Only” to the “Search in Files” tab
I do agree, that would be a valuable option !
to copy only file names when the results are collapsed (+) in the Results window
Again, I agree with that possible improvement !
fixing the Ctrl-C vs. RightClick-Copy behavior to be synonymous
Personally, once you’re aware of that particularity, I think that it’s worth having two ways to copy Find result text :
- All the selected text, when hitting the Ctrl + C shortcut
- The matched lines, ONLY, when using the context menu Copy option
If they care about the speed of the search, they could also terminate searching the current file once the first hit is made …and go on to searching the next file.
Of course, I do agree that this would be the obvious optimization to add to the code ! But, ah, ah ! Thinking about it, I found a nice way to simulate both a rapid search and a short text report ;-))
By that means, if you simply want to know which files match a specific string, text or regex, there is no need to filter the Find result window anymore !
So, the miracle regex is :
(String|Text|Regex)to Search(?s).*\K
, where you replace the (String|Text|Regex) part with your own text.
For instance, I’ve presently got 34 opened tabs, in the two views of N++ v7.5.5, including two copies of notepad++.exe that I previously renamed as xxx.txt and Small text to see.txt !!
Then I decided to search for the simple word the in all these files. Hence, the regex
the(?s).*\K
One second later, though my anti-virus was simultaneously performing a scan, I got the Find result window, which contained two lines per file :
- The absolute pathname of the file
- The last line of the file, which may be virtual, when the last physical line ends with a line break !
Then, as in my previous post :
- Select all the Find result window and copy its contents with Ctrl + C
- Open a new tab with Ctrl + N and paste the clipboard with Ctrl + V
- Finally, using this other regex S/R, below, on the new tab contents, you’ll get the complete list of files matching the word the at least one time ;-))
SEARCH
(?-is)^\x20\x20(.+)\x20\(\d+\x20hits?\)\R[^\r\n]+|^Search.+\R
REPLACE
?1\1
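For anyone who prefers to post-process the pasted Find result outside Notepad++, the same cleanup can be approximated in Python. This is a rough sketch under some assumptions: Python’s re module has neither \R nor the Boost conditional replacement ?1\1, so an explicit line-break alternation and a replacement function stand in for them, and the function name paths_only is made up for the example.

```python
import re

EOL = r"(?:\r\n|\r|\n)"   # stand-in for Boost's \R

# Python is case-sensitive and keeps "." off line feeds by default,
# which is roughly what the leading (?-is) asks for in Notepad++.
CLEANUP = re.compile(
    r"^  ([^\r\n]+) \(\d+ hits?\)" + EOL + r"[^\r\n]+"   # path line + the result line after it
    r"|^Search[^\r\n]+" + EOL,                            # the "Search ..." header line
    re.MULTILINE,
)


def paths_only(find_result: str) -> str:
    """Keep group 1 (the path) when it took part in the match, else drop the match."""
    return CLEANUP.sub(lambda m: m.group(1) or "", find_result)
```

The lambda plays the role of ?1\1 : the path is written back when group 1 matched, and the header line is replaced with nothing.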
Cheers,
guy038
-
-
I have verified that your workaround works even with files that are not all text, e.g. all of the *.exe files in C:\Windows directory.
The final cleanup regex successfully cleans up even the binary garbage that is interlaced between the lines with paths.
The (hits xxx) text is also correctly removed from the end of the paths. I did not test how it behaves if the path accidentally contains, e.g., a string like “(hits from the 1960s)”.
So, I must write “good job” overall, and I hope the N++ authors are reading this to see how many steps we poor users have to go through to extract the paths of files containing the matching string.
P.S.
Could you explain how the (?s).*\K regex works ?
Also, if it is not too much to ask: What would be the “cleanup regex” that outputs just the hit numbers in the first column in front of the paths? e.g.:
243 C:\Windows\explorer.exe
-
…or a path like:
C:\MyMusic\The Best of ABBA (1970 hits)\Playlist.m3u
-
@Robinson-George said:
C:\MyMusic\The Best of ABBA (1970 hits)\Playlist.m3u
This won’t cause a problem because the regular expression looks for hits) (or hit)) at the very end of a line.
I think one of the B’s in ABBA is backwards…but no matter…
-
Hello, @robinson-george,
No problem , George !
A) :
What does the generic
(String|Text|Regex)(?s).*\K
regex match ? Well, once you replace the (String|Text|Regex) part with your actual text :
- First, the regex just matches your own text
- Then the (?s) modifier tells the regex engine that, from now on, any dot . will match any single character ( standard or EOL one )
- Thus, the .* syntax matches all text, after the first occurrence of your text, till the very end of each file
- Finally, the \K structure discards everything matched so far from the reported match. So the final overall match is, simply, the zero-length string at the very end of each file
- So, the Find result simply displays the last line ( real or virtual ), where this empty string has been found !
Et voilà !
You may give it a try with the regex
(?s).*\K
, which should display, in the Find result window, the list of all the opened documents, loaded in the current N++ session, or all the files, involved in a Find in Files search, followed with the unique last line containing that zero-length string !
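To see why only the last line is reported, here is a small emulation in Python. It is only a sketch: Python’s built-in re module does not support \K, so the code matches your text followed by everything up to the end, then treats the end position of that match as the empty match that \K would leave behind. The function name last_line_if_found is hypothetical.

```python
import re


def last_line_if_found(text: str, pattern: str):
    r"""Emulate searching for  pattern(?s).*\K  in one file's text.

    Returns (line_number, last_line) if pattern occurs at least once,
    otherwise None.
    """
    # (?s:.*) lets the dot cross line breaks, like (?s) in the original regex
    m = re.search(f"(?:{pattern})(?s:.*)", text)
    if m is None:
        return None
    end = m.end()                                # always the very end of the text
    line_number = text.count("\n", 0, end) + 1   # line holding the empty match
    last_line = text[text.rfind("\n") + 1:]      # empty if the text ends with a line break
    return line_number, last_line
```

With a text that ends with a line break, the reported line is the empty “virtual” line after it, which is the same behaviour as described for the Find result window.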
B)
Now, supposing that all your pathnames are of the form
Letter Drive:\....\.....\FileName.Extension
or, possibly, new ## ( but NOT “network paths” using the Universal Naming Convention syntax ), I then changed the regex to match some variants of paths collected in the new tab !
Thus, the global syntax, with possible space characters in paths :
^[Any range of characters]Upper Letter:\... ....\...\... ... ....\File Name.ext[ (1 hit)]$
is, now, supported by the new search regex. So, given the sample text, below, in a new tab, which could be the results of the
the(?s).*\K
search regex :
Search "the(?s).*\K" (12 hits in 12 files)
  C:\MyMusic\The Best of ABBA (1970 hits)\Playlist.m3u (1 hit)
    Line 596:
Any Char new 3 (1 hit)
    Line 16:
  C:\_755\license.txt (1 hit)
    Line 116:
  C:\Program Files\Notepad++\Tests\NativeLang.xml (1 hit)
    Line 596:
  C:\MyMusic\The Best of ABBA (1970 hits)\Nice song.m3u
    Line 345:
  C:\Program Files\Notepad++\Tests\MySecond File.txt (1 hit)
    Line 100:
  C:\Program Files\Notepad++\Lettres\George_3.txt (1 hit)
    Line 97:
  C:\_755\xxx.txt (1 hit)
    Line 12538: .¤B"¸?5ø{v\^ê× ª™ý=ú}ˆÑz¤4²GÏXð™°#B±
  C:\Program Files\Notepad++\Lettres\RegexDocum.txt
    Line 12894:
  C:\Program Files\Notepad++\Tests\My Third File.txt
    Line 100:
1234 C:\Program Files\Notepad++\Tests\MyFile.txt (1 hit)
    Line 856:
1234 C:\MyMusic\The Best of ABBA (1970 hits)\Test.m3u (1 hit)
    Line 123:
The new regex S/R, below :
SEARCH
(?-is)^.*?(([A-Z]:\\|new).+?)(\x20\(1\x20hit\)$)?\R[^\r\n]+|^Search.+\R
REPLACE
?1\1
Would give the following 12 absolute paths, below :
C:\MyMusic\The Best of ABBA (1970 hits)\Playlist.m3u
new 3
C:\_755\license.txt
C:\Program Files\Notepad++\Tests\NativeLang.xml
C:\MyMusic\The Best of ABBA (1970 hits)\Nice song.m3u
C:\Program Files\Notepad++\Tests\MySecond File.txt
C:\Program Files\Notepad++\Lettres\George_3.txt
C:\_755\xxx.txt
C:\Program Files\Notepad++\Lettres\RegexDocum.txt
C:\Program Files\Notepad++\Tests\My Third File.txt
C:\Program Files\Notepad++\Tests\MyFile.txt
C:\MyMusic\The Best of ABBA (1970 hits)\Test.m3u
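The same extraction can be tried in Python on a few lines taken from the sample above. This is a hedged translation, not the Notepad++ regex itself: \R is spelled out as a line-break alternation, the $ is dropped because the line break that follows already anchors (1 hit) to the end of the line, and a replacement function stands in for ?1\1. The name extract_paths is an assumption for the example.

```python
import re

EOL = r"(?:\r\n|\r|\n)"   # stand-in for Boost's \R

PATHS = re.compile(
    # optional junk, then a drive path or "new ...", then an optional " (1 hit)"
    r"^[^\r\n]*?((?:[A-Z]:\\|new)[^\r\n]+?)(?: \(1 hit\))?" + EOL + r"[^\r\n]+"
    r"|^Search[^\r\n]+" + EOL,   # the "Search ..." header line
    re.MULTILINE,
)


def extract_paths(find_result: str) -> list:
    """Return the path of every file listed in a pasted Find result."""
    cleaned = PATHS.sub(lambda m: m.group(1) or "", find_result)
    return cleaned.splitlines()
```

Like the original, it relies on every path line being followed by exactly one result line, so it only suits the output of the (String|Text|Regex)(?s).*\K search.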
IMPORTANT :
Beware that the regex
(?-is)^.*?(([A-Z]:\\|new).+?)(\x20\(1\x20hit\)$)?\R[^\r\n]+|^Search.+\R
expects both :
- A hit count that, when present, is exactly the string (1 hit)
- A single line of results after the absolute pathname line
So, this regex must be performed ONLY on the results of the previous regex search
(String|Text|Regex)(?s).*\K
on multiple files !
Cheers,
guy038
P.S. : As usual, things are harder to explain than to execute ;-))
For newcomers to regular expression concepts and syntax, begin with this article in the N++ Wiki :
http://docs.notepad-plus-plus.org/index.php/Regular_Expressions
In addition, you’ll find good documentation about the Boost C++ Regex library, v1.55.0 ( similar to Perl regular expressions, v5.8 ), used by Notepad++ since its 6.0 version, at the two addresses below :
http://www.boost.org/doc/libs/1_55_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html
http://www.boost.org/doc/libs/1_55_0/libs/regex/doc/html/boost_regex/format/boost_format_syntax.html
- The FIRST link explains the syntax of regular expressions in the SEARCH part
- The SECOND link explains the syntax of regular expressions in the REPLACEMENT part
You may, also, look for valuable information, on the sites, below :
http://www.regular-expressions.info
http://perldoc.perl.org/perlre.html
Be aware that, like any documentation, it may contain some errors ! Anyway, if you detect one, that’s good news : you’re improving ;-))
-
-
@Scott-Sumner this works. thanks!!
-