Find content in files and rename/delete files not containing content



  • Hello friends,

    I have a folder with 5600 text files (*.txt). I want to keep only the files that contain a special phrase. A search gives 1800 hits in 399 files. Does anyone have an idea how to delete or rename the files that are not needed, or how to rename/mark the files that are needed?
    Many thanks for your help. Samos007



  • This isn’t really the right place for the question, because Notepad++ probably isn’t the right solution. Personally, I would search on superuser.com, and if you cannot find an answer already, ask a question, something like “in Windows, how do I delete any *.txt files in a given folder that contain a special phrase” – making sure to tag the question as Windows (and be prepared to be mocked for using windows; there’s a reason I don’t hang out in the Stack Exchange family of forums).

    It could be done in PythonScript or LuaScript, but then it becomes more of a Python or Lua question than Notepad++ question, because it’s not to do with the specifics of how PythonScript or LuaScript interacts with Notepad++, but instead how Python or Lua interacts with the windows filesystem. Any scripting language (Python, Perl, Lua) could do it in a few lines of code. But this isn’t a Python/Perl/Lua forum.
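
    As a rough illustration of the "few lines of code" idea, here is a minimal plain-Python sketch that targets the files that do not contain the phrase, which is what the original question asked for. The function names, the UTF-8 decoding, and the dry-run default are illustrative assumptions, not anything provided by Notepad++ or PythonScript.

```python
from pathlib import Path


def files_without_phrase(folder, phrase, pattern="*.txt"):
    """Return the files in `folder` matching `pattern` that do NOT contain `phrase`."""
    unneeded = []
    for path in sorted(Path(folder).glob(pattern)):
        if not path.is_file():
            continue
        # Assumption: files are (roughly) UTF-8; undecodable bytes are replaced, not fatal.
        text = path.read_text(encoding="utf-8", errors="replace")
        if phrase not in text:  # plain substring test, no regex
            unneeded.append(path)
    return unneeded


def delete_files(paths, dry_run=True):
    """Delete `paths`; with dry_run=True only print what would be deleted."""
    for path in paths:
        if dry_run:
            print(f"would delete {path}")
        else:
            path.unlink()


# Hypothetical usage (folder and phrase are placeholders):
# delete_files(files_without_phrase(r"C:\some\folder", "special phrase"), dry_run=True)
```

    Leaving dry_run=True for the first pass shows the list of files that would go before anything is actually removed.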

    A cmd.exe batch script, or powershell script, could probably also do it, but I couldn’t even give hints without more research.

    There are plenty of good unix/linux utilities, which you can get for Windows using MSYS2 or GnuWin32 (I personally use the latter). There would be a good way to do it using a combination of find and grep at the command line… and while I’m more familiar with those, it would still take some consulting time, for something that’s not really relevant to the forum.

    But except for being able to integrate a PythonScript or LuaScript into the Notepad++ environment, Notepad++ really isn’t the right tool for the job. Sorry.

    edits: fixed typos



  • I couldn’t stop thinking about it, so you get a freebie:

    If you grab grep from gnuwin32, and install it in your PATH, then you can run something like

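    FOR /F "delims=" %G IN ('grep -lP "\Qspecial phrase\E" *.txt') DO @DEL "%G"
    
    • the FOR /F syntax loops through each line that the grep command outputs; "delims=" and the quotes around %G keep filenames with spaces in one piece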
    • the -l tells grep to only output the filename of files that match.
    • the -P tells grep to use perl-style regular expressions, which are needed for …
    • the \Q and \E wrapped around the text make sure that characters between will be treated literally (so you can use periods and question marks with impunity)
    • replace special phrase as needed

    Alternately,

    FOR %G IN (*.txt) DO @(grep -qP "\Qspecial phrase\E" "%G" && DEL "%G")
    

    which loops through all *.txt, then greps each file individually.

    • -q will make grep “quiet”, so it outputs nothing… but it still sets the return code correctly, so that if it matches, the && ... will fire; if it doesn’t match, it will just proceed to the next file
    • && del %G: if the grep on that file matched, it will execute the DEL command on that file

    (If you run either of these in a batch file, make sure to use %%G instead of %G: a quirk of batch vs command line.)
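
    For the "rename/mark the files needed" half of the original question, the same per-file test can drive a rename instead of a delete. This is only a sketch under assumptions: the KEEP_ prefix, the function name, and the UTF-8 decoding are made up for illustration, and the phrase is matched as a plain substring (the same intent as the \Q...\E wrapping above).

```python
from pathlib import Path


def mark_files_with_phrase(folder, phrase, prefix="KEEP_", dry_run=True):
    """Rename every *.txt in `folder` that contains `phrase` by prepending `prefix`.

    Returns the list of (old_path, new_path) pairs.
    With dry_run=True nothing is renamed; the pairs are only reported.
    """
    planned = []
    # sorted() materialises the listing first, so renamed files are not visited again.
    for path in sorted(Path(folder).glob("*.txt")):
        if path.name.startswith(prefix):
            continue  # already marked on an earlier run
        text = path.read_text(encoding="utf-8", errors="replace")
        if phrase in text:  # literal substring match, no regex metacharacters
            planned.append((path, path.with_name(prefix + path.name)))
    for old, new in planned:
        if not dry_run:
            old.rename(new)
    return planned


# Hypothetical usage (folder and phrase are placeholders):
# for old, new in mark_files_with_phrase(r"C:\some\folder", "special phrase"):
#     print(old.name, "->", new.name)
```

    Sorting the folder by name afterwards groups the marked files together, and nothing has been deleted if the phrase turns out to be wrong.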

    ! WARNING ! NO WARRANTY ! NO GUARANTEE !

    I tested these with “echo del” instead of “del”. I make no promises these will 100% work for you. Always make sure you have a backup of critical files before trying any command you find on the internet, especially ones that explicitly delete files.



  • @Samos007

    With a bit of technical savvy, one can do it without any special tools. The method would be:

    • do the Find in Files in Notepad++
    • copy the results out of the Find result panel and paste into a new Notepad++ editing tab
    • use a regular expression replacement to eliminate all lines except ones representing hit filenames
    • use another regex replace to clean up the remaining lines (e.g. remove the “hits” info)
    • use yet another regex to add a copy command to the beginning of each line
    • use yet another regex to add a . (the current folder, as the copy destination) to the end of each line
    • save the file with a .bat or .cmd extension in an empty folder
    • run the resulting file

    This should result in all the needed files being copied to the new folder, which has the same effect as deleting all the unneeded files. Of course, this puts all of the files that might have been from a tree structure into one folder, with the inherent problem of name collisions, but since I don’t know the OP’s exact situation I can’t know if this is a problem. If so, one could change the copy command to something more involved, or potentially use xcopy.
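
    The same steps can be sketched in Python instead of regex replacements plus a batch file. The layout of the copied Find result text is an assumption here (file lines indented and ending in "(N hits)", hit lines starting with "Line N:"); check it against your own Notepad++ output. The function names and the _1, _2 suffix scheme for name collisions are hypothetical as well.

```python
import re
import shutil
from pathlib import Path

# Assumed layout of text copied from the Find result panel:
#   Search "phrase" (1800 hits in 399 files of 5600 searched)
#     C:\folder\a.txt (3 hits)
#   <TAB>Line 12: text of the matching line
FILE_LINE = re.compile(r"^\s+(?P<path>.+?) \(\d+ hits?\)\s*$")
HIT_LINE = re.compile(r"^\s+Line \d+:")


def hit_files(find_result_text):
    """Extract the file paths from pasted Find result text."""
    paths = []
    for line in find_result_text.splitlines():
        if HIT_LINE.match(line):
            continue  # an individual hit, not a file header
        m = FILE_LINE.match(line)
        if m:
            paths.append(Path(m.group("path")))
    return paths


def copy_without_collisions(paths, dest):
    """Copy `paths` into `dest`, adding _1, _2, ... when a name is already taken."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in paths:
        target = dest / src.name
        n = 1
        while target.exists():
            target = dest / f"{src.stem}_{n}{src.suffix}"
            n += 1
        shutil.copy2(src, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied
```

    Because the originals are only copied, nothing is lost if the parsing assumption turns out to be wrong for a particular Notepad++ version.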

    Doable? YES. Super-easy? NOPE.



  • Hello experts,

    Thank you so much for spending time to resolve this. A workaround is quite easy.
    Replace “abcdefg” with “abcdefg1234”. Save. Replace “abcdefg1234” with “abcdefg”.
    Now all relevant files have a new date. Not perfect, but that’s all I need. Again, many thanks.



  • @Samos007

    Haha…well if you’d said all you needed to do was “touch” (update the timestamp on) the files in your original posting…we wouldn’t have gone down a long and winding road…to the wrong solution…
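
    For reference, that kind of “touch” can also be done without editing the file contents twice. A minimal Python sketch, assuming UTF-8-ish text files and a plain substring match; the function name and the optional `when` parameter are made up for illustration.

```python
import os
import time
from pathlib import Path


def touch_files_with_phrase(folder, phrase, when=None):
    """Set the timestamps of every *.txt in `folder` that contains `phrase`.

    `when` is seconds since the epoch; it defaults to the current time.
    Returns the list of touched paths.
    """
    when = time.time() if when is None else when
    touched = []
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if phrase in text:
            os.utime(path, (when, when))  # (access time, modification time)
            touched.append(path)
    return touched
```

    Sorting the folder by date modified then separates the touched files from the rest, with the file contents left untouched.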



  • I doubt @Samos007 will come back to confirm, but I am assuming Samos actually used File Explorer to delete the files after updating their timestamps and sorting by date.

    (And I’m happy that Samos didn’t just expect us to solve the problem, but put more effort into it; it’s encouraging that there are still random users who put effort into their own issues.)

    If Samos does return, Scott’s “without any special tools” challenged me: using only Windows builtin commands, two one-liner alternatives:

    for %F IN (*.txt) DO @(FIND /C "special text" "%F" > NUL && DEL "%F")
    
    for %F IN (*.txt) DO @(FINDSTR /L /C:"special text" "%F" > NUL && DEL "%F")
    

    The builtin FIND command, using FIND /C, will print out a count of the number of matches, and also return an %ErrorLevel% of 0 to the system if it matches, or 1 if there’s no match; since I don’t care about the printing, I redirect it to NUL. I then use the Windows && to cause the next command to only run if the previous %ErrorLevel% was 0 (a match). DEL will thus be run only on files that match. The quotes around %F keep filenames with spaces in one piece.

    The builtin FINDSTR command, using FINDSTR /L, works similarly. The /C:"..." form matters when the phrase contains a space: without it, FINDSTR treats the space-separated words as separate search strings and matches any one of them.

    Someday, I’ll maybe think “FINDSTR” or “FIND” instead of gnuwin32’s “grep” for matching text in Windows files… but probably not before MS has discontinued cmd.exe completely and forced me to learn PowerShell.

