Reply to Find corrupt files on Mon, 26 Aug 2019 16:02:42 GMT

PeterJones — Mon, 26 Aug 2019 16:02:42 GMT

I want to find in all my 4.000 text files those that are corrupt

Assuming you knew some text that should be in every good file, and which you could guarantee was not in a corrupt file, you could use the Notepad++ find-in-files feature to look for files that did have it. But that would find the non-corrupt files, not the corrupt ones.
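If such a marker text exists, the inverse search (files that lack it) is easy to script outside of Notepad++. Here is a minimal sketch in Python; the function name `files_missing_marker`, the `*.txt` pattern, and the idea that a single marker string identifies a good file are all assumptions for illustration, not something from the original question.

```python
from pathlib import Path


def files_missing_marker(root, marker, pattern="*.txt"):
    """Return the paths under root whose contents do not include marker.

    Hypothetical helper: assumes every good file contains `marker`
    and no corrupt file does.
    """
    # Compare as bytes so that undecodable (possibly corrupt) files
    # cannot raise a decoding error while being read.
    needle = marker.encode("utf-8")
    missing = []
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file() and needle not in path.read_bytes():
            missing.append(path)
    return missing
```

Calling something like `files_missing_marker(r"C:\my\texts", "END-OF-REPORT")` (path and marker are placeholders) would list the candidates for "corrupt" under that assumption.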

Assuming you knew that all your good files were pure ASCII, and that every corrupt file was guaranteed to have something outside the normal ASCII range (i.e., in code points 128-255), you could use find-in-files in regular expression mode to search for something like [\x80-\xFF].
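The same byte-range check can be run over a whole folder with a short script. This is only a sketch under the same assumption (good files are pure ASCII); the name `files_with_non_ascii` and the `*.txt` pattern are made up for the example.

```python
import re
from pathlib import Path

# Same character class as the find-in-files regex, applied to raw bytes.
NON_ASCII = re.compile(rb"[\x80-\xFF]")


def files_with_non_ascii(root, pattern="*.txt"):
    """Return (path, offset) for each file containing a byte in 128-255.

    Hypothetical helper: only meaningful if good files are pure ASCII.
    """
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        match = NON_ASCII.search(path.read_bytes())
        if match:
            # Record where the first offending byte sits in the file.
            hits.append((path, match.start()))
    return hits
```

The offset of the first non-ASCII byte is reported so you can jump to that spot and judge whether it really is corruption.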

If your “good” files are in a full 8-bit encoding (a single-code-page European or Asian encoding), or if your good files use a Unicode encoding like UTF-8, it would be very difficult to tell “corrupt” from “good”, because 8-bit and multi-byte encodings both treat most or all byte values from 128 to 255 as valid somewhere in a sequence.
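For the UTF-8 case there is one partial check that might still help: a strict decode fails on byte sequences that are not valid UTF-8. The sketch below assumes the good files are supposed to be UTF-8; `files_failing_utf8` is a hypothetical name. It will not catch corruption that happens to leave valid UTF-8 behind, and it does not apply to single-code-page encodings, where nearly every byte value is valid.

```python
from pathlib import Path


def files_failing_utf8(root, pattern="*.txt"):
    """Return (path, offset) for each file that is not valid UTF-8.

    Hypothetical helper: assumes good files are meant to be UTF-8.
    A file that passes is not proven good, only not provably bad.
    """
    failures = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        try:
            # Strict decoding raises on the first invalid byte sequence.
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError as err:
            failures.append((path, err.start))
    return failures
```

Treat the result as a shortlist to inspect by hand rather than a definitive list of corrupt files.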

If you need more help than that, you’ll have to provide more information, like defining what you mean by corrupt, and whether there are certain similarities between all good files or between all corrupt files.