Speeding up searches
I have large log files (100 MB or more) that I need to search, and searching for specific text can sometimes take a long time (30 seconds or more). It’s strange, since clicking the “Count” button, which must search the whole file anyway, is sometimes instant and never takes more than a few seconds.
It seems to have something to do with updating the display. Is this a bug? Maybe there is some problem in my settings, but I haven’t been able to find anything.
sometimes searching for specific text can take a long time (30 seconds or more).
Obviously, some functions within Notepad++ work much faster than others. That’s likely down to internal coding I’m not familiar with; I think you just have to accept it for what it is.
There are many environmental factors which can affect the processing speed:
- Periodic backup, if you have it enabled (Settings > Preferences > Backup). The default interval is 7 seconds, so as you go about replacing/deleting text in a large file, Notepad++ is also trying to back up the latest version of that file every 7 seconds.
- Undo history. Notepad++ keeps a list of edits so it can undo them; after many edits, maintaining that list can again cause (slight) delays.
- Plugins, possibly, as each plugin might also be trying to keep up with keyboard input or changing file contents.
- Lexers, depending on what type of file it is, as they interpret the file content and attempt to format/colour text based on rules.
I’d say, though, that with a 100 MB file a 30-second wait is rather normal. In fact, if you look at some old posts, several posters had problems with much smaller files than yours, so you might be lucky. See this one:
I’ve done several tests using 5–20 MB files, and often the wait is much longer, depending on the complexity of the search regex.
Last time I checked, Notepad++ search had a simple design with poor performance. That’s OK for SW developers and other users who deal with text files of a few MB that they wrote themselves, but not as good for log files, which are auto-generated.
Notepad++ loads every file as if it is going to be opened.
This allows Notepad++ to detect the file’s encoding.
This requires Notepad++ to allocate enough memory to fit the entire file, then free that memory only to reallocate it for the next file. Not a big problem for files of a few KB, which is typical for SW developers.
This is single-threaded and synchronous: first the whole file is loaded into memory using I/O while the CPU is mostly idle, then the file is searched in memory using the CPU while I/O is idle.
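The load-then-search pattern described above can be sketched roughly like this (a toy illustration, not Notepad++’s actual code; the file name and pattern are made up, and a tiny demo file stands in for a large log):

```python
import os
import re
import tempfile

def load_then_search(path, pattern):
    # Phase 1: I/O only -- read the entire file into memory (CPU mostly idle).
    with open(path, "rb") as f:
        data = f.read()
    # Phase 2: CPU only -- scan the in-memory buffer (disk idle).
    return [m.start() for m in re.finditer(pattern, data)]

# Tiny demo file standing in for a 100 MB log.
demo = os.path.join(tempfile.mkdtemp(), "demo.log")
with open(demo, "wb") as f:
    f.write(b"INFO ok\nERROR disk full\nINFO ok\nERROR timeout\n")

hits = load_then_search(demo, rb"ERROR")
print(hits)  # byte offsets of the matches: [8, 32]
```

The two phases never overlap, which is why the disk and the CPU each sit idle for roughly half the wall-clock time.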
I’m not sure whether syntax-highlighting lexers are applied as well, which would be a complete waste of CPU cycles.
I actually never use Notepad++ search in files. I use `grep` from the console and then use TagLEET to jump to the locations.
Someone should implement, within Notepad++ or as a plugin, an alternate search that assumes all files are UTF-8 or ANSI, which I’d guess covers about 80% of users. Such a feature could scan files using a few relatively small buffers and asynchronous I/O.
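A minimal sketch of that buffered-scanning idea, assuming a plain byte-literal search over an ANSI/UTF-8 file (synchronous reads here for simplicity; real asynchronous I/O would overlap the reads with the scanning). A small tail of `len(needle) - 1` bytes is carried between chunks so matches straddling a chunk boundary are not missed:

```python
import os
import tempfile

def chunked_search(path, needle: bytes, chunk_size: int = 1 << 20):
    """Stream `path` through a small rolling buffer instead of loading
    it whole. Plain byte-literal search only -- no regex, no encoding
    detection -- per the simplifying assumptions above."""
    hits = []
    overlap = len(needle) - 1   # bytes a boundary-straddling match may need
    tail = b""
    pos = 0                     # file offset where `tail` begins
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            buf = tail + chunk
            i = buf.find(needle)
            while i != -1:
                hits.append(pos + i)
                i = buf.find(needle, i + 1)
            # Carry only the bytes that could start a straddling match.
            # No complete match fits inside this tail (it is shorter than
            # the needle), so nothing is ever counted twice.
            tail = buf[len(buf) - overlap:]
            pos += len(buf) - len(tail)
    return hits

# Demo with a deliberately tiny chunk size to exercise the boundary logic.
demo = os.path.join(tempfile.mkdtemp(), "demo.log")
with open(demo, "wb") as f:
    f.write(b"xxERRORxx" * 3)
print(chunked_search(demo, b"ERROR", chunk_size=4))  # → [2, 11, 20]
```

Memory use stays at roughly one chunk regardless of file size, and with true async I/O the next chunk could be read while the current one is being scanned.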
I believe a 5× speedup is a modest expectation.
30 seconds for a 100MB file is way too long for modern disks.
And what explains the difference between the first search and a second search, given that we’d expect the file to be in the OS disk cache by then?
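One way to see the cache effect is to time the same read twice (a rough sketch; the file here is a ~10 MB scratch stand-in for a real log, and absolute timings will vary by machine and disk):

```python
import os
import tempfile
import time

# Write a ~10 MB scratch file standing in for a large log.
path = os.path.join(tempfile.mkdtemp(), "scratch.log")
with open(path, "wb") as f:
    f.write(b"ERROR something went wrong\n" * 400_000)

def timed_count(p):
    """Read the whole file and count matches, returning (count, seconds)."""
    t0 = time.perf_counter()
    with open(p, "rb") as f:
        n = f.read().count(b"ERROR")
    return n, time.perf_counter() - t0

cold = timed_count(path)   # may actually hit the disk
warm = timed_count(path)   # typically served from the OS page cache
print(f"cold: {cold[1]:.4f}s, warm: {warm[1]:.4f}s, matches: {cold[0]}")
```

If the search itself dominated, both runs would take about as long; a large gap between the cold and warm runs points at raw I/O rather than the search algorithm.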
People tend to get “wound up” about Notepad++ searching being, well, less than ideal.
Sure, it does a great job on simple, generic searches.
But…if you go beyond this, the answer is to not expect great searching from Notepad++, but to use a different tool (one that excels at, well, searching).
If it is a good enough tool, it will have an option to open the files and jump to the hits in an editor of your choice – and that choice could be Notepad++.